Constraints Manual

SmartHLS accepts user-provided constraints that impact the automatically generated hardware. These constraints can be specified using the SmartHLS IDE and are stored in the Tcl configuration file config.tcl in your project directory. This reference section explains the constraints available for SmartHLS.

The main constraints available from the SmartHLS IDE are described under Commonly Used Constraints.

A few debugging constraints are also available from the SmartHLS IDE; these are described under Debugging Constraints.


Commonly Used Constraints

CLOCK_PERIOD

This is a widely used constraint that allows the user to set the target clock period for a design. The clock period is specified in nanoseconds.

It has a significant impact on scheduling: the scheduler will schedule operators into clock cycles using delay estimates for each operator, such that the specified clock period is honored. In other words, operators will be chained together combinationally to the extent allowed by the value of the CLOCK_PERIOD parameter.
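The effect of chaining can be pictured with a toy model. The sketch below is not SmartHLS's scheduler; `cycles_for_chain` is a hypothetical helper that greedily packs a chain of dependent operators into clock cycles, assuming each operator has a fixed delay estimate in nanoseconds.

```cpp
#include <cassert>
#include <vector>

// Toy model only (not the SmartHLS scheduler): packs a chain of dependent
// operators into clock cycles. Operators are chained combinationally within
// one cycle while their summed delay estimates still fit in the clock period.
// An operator slower than the period simply gets a cycle of its own here.
int cycles_for_chain(const std::vector<double>& delays_ns, double clock_period_ns) {
    assert(clock_period_ns > 0.0);
    int cycles = 0;
    double used_ns = 0.0;  // delay already accumulated in the current cycle
    for (double d : delays_ns) {
        if (cycles == 0 || used_ns + d > clock_period_ns) {
            ++cycles;      // start a new clock cycle with this operator
            used_ns = d;
        } else {
            used_ns += d;  // chain onto the previous operator
        }
    }
    return cycles;
}
```

In this model, four dependent 4 ns operators fit in one cycle with a 16 ns period, need two cycles with a 10 ns period, and need four cycles with a 5 ns period.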

SmartHLS has a default CLOCK_PERIOD value for each device family that is supported (see table in SmartHLS Constraints).

Category

HLS Constraints

Value Type

Integer representing a value in nanoseconds

Valid Values

Integer

Default Value

Depends on the target device

Dependencies

None

Applicable Flows

All devices and flows

Test Status

Actively in-use

Examples

set_parameter CLOCK_PERIOD 15


set_custom_top_level_module

This Tcl command specifies the top-level C/C++ function. The top-level function and all of its descendant functions will be compiled to hardware. The top-level function can also be specified using a pragma (see Set Custom Top-Level Function), but it cannot be specified using both the Tcl command and the pragma.

Category

HLS Constraints

Value Type

String

Dependencies

None

Applicable Flows

All devices and flows

Test Status

Actively in-use

Examples

set_custom_top_level_module "foo"


set_custom_test_bench_module

This Tcl command specifies the name of the user-provided testbench module to be used for RTL simulation. The testbench file must also be specified with set_custom_test_bench_file.

Category

Simulation

Value Type

String

Dependencies

set_custom_test_bench_file user_tb.v

Applicable Flows

All devices and flows

Test Status

Actively in-use

Examples

set_custom_test_bench_module "user_tb"


set_custom_test_bench_file

This Tcl command specifies the user-provided file that defines the custom testbench module, which is set via set_custom_test_bench_module. This is not needed for SW/HW co-simulation.

Category

Simulation

Value Type

String

Dependencies

set_custom_test_bench_module "user_tb"

Applicable Flows

All devices and flows

Test Status

Actively in-use

Examples

set_custom_test_bench_file user_tb.v


set_synthesis_top_module

This Tcl command specifies the name of the Verilog module that will be set as the top-level module when creating a Libero project for synthesis, place and route. By default, the top-level function (see Specifying the Top-level Function) is set as the top-level module for the Libero project. However, the user may want to provide a wrapper HDL module that instantiates the SmartHLS-generated top-level module. In this case, this Tcl command can be used to give the name of the wrapper module.

Category

Libero

Value Type

String

Dependencies

None

Applicable Flows

All devices and flows

Test Status

Actively in-use

Examples

set_synthesis_top_module "wrapper_top"


set_synthesis_top_module_file

When set_synthesis_top_module is used to set a different wrapper module as the top-level for synthesis, place & route, use this command to specify the file that defines the wrapper module.

Category

Libero

Value Type

String

Dependencies

set_synthesis_top_module "wrapper_top"

Applicable Flows

All devices and flows

Test Status

Actively in-use

Examples

set_synthesis_top_module_file custom_synthesis_top.v


set_resource_constraint

This Tcl command constrains the resources allocated by SmartHLS. For instance, to have only a single divider in the entire circuit, the user can specify: set_resource_constraint divide 1. This makes SmartHLS instantiate at most one divider in the circuit; if multiple division operations are required, they will share the same divider.

Note: A constraint on “divide” will apply to:
  • signed_divide_8
  • signed_divide_16
  • signed_divide_32
  • signed_divide_64
  • unsigned_divide_8
  • unsigned_divide_16
  • unsigned_divide_32
  • unsigned_divide_64
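As an illustration of divider sharing, consider a function with two independent divisions. The function name `ratio_sum` is hypothetical, and the mapping of 32-bit signed division onto the signed_divide_32 operation is an assumption based on the operation names listed above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical HLS source function containing two independent 32-bit signed
// divisions. With "set_resource_constraint divide 1" in config.tcl, both
// divisions would be implemented on one shared divider rather than two.
int32_t ratio_sum(int32_t a, int32_t b, int32_t c, int32_t d) {
    assert(b != 0 && d != 0);  // division by zero is undefined in C/C++
    int32_t q0 = a / b;        // first division (assumed signed_divide_32)
    int32_t q1 = c / d;        // second division (assumed signed_divide_32)
    return q0 + q1;
}
```

The function's result is the same with or without the constraint; only the number of divider instances in the generated hardware changes. Sharing a single divider generally saves area at the cost of not being able to perform both divisions at the same time.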

It can also be used to constrain the number of memory ports. To make all memories single-ported, specify: set_resource_constraint memory_port 1. For memory ports, only 1 and 2 are valid values, as FPGA RAMs have up to 2 ports.

This Tcl command should only be used by advanced SmartHLS users.

Category

HLS Constraints

Value Type

set_resource_constraint <operation> <constraint>
<operation> is a string
<constraint> is an integer

Valid Values

See Default Values and Examples.
Note: the operator name should match the device family operation database file: boards/PolarFire/PolarFire.tcl

Default Values

memory_port 2
divide 1
modulus 1
multiply 2
fp_add 1
fp_subtract 1
fp_multiply 1
fp_divide 1
fp 1

Location Where Default is Specified

examples/legup.tcl

Dependencies

None

Applicable Flows

All devices and flows

Test Status

Actively in-use

Examples

set_resource_constraint signed_divide_16 3

set_resource_constraint signed_divide 2

set_resource_constraint divide 1

Note

When implementing an integer multiply operation, SmartHLS’s default behaviour is to optimize for timing (Fmax) by mapping the multiply operation to the target DSPs. When a multiply operation is wider than a DSP can support, SmartHLS splits the multiply across multiple DSPs and automatically inserts registers so that the RTL synthesis tool can make better use of the registers inside the DSPs, and hence achieve better timing.

However, this split-multiply feature cannot support sharing multipliers. When set_resource_constraint multiply X is specified by the user, SmartHLS turns off the split-multiply feature and uses a generic multiplier that can be shared by more than one multiply operation. SmartHLS also prints the following warning:

Warning: Detected set_resource_constraint setting for integer multiplier. However, multiplier sharing is incompatible with the multiply splitting feature. Disabling multiply splitting feature and the normal multiplier modules will be used.

Similarly, set_operation_latency multiply X will trigger the same behaviour because the split-multiply does not support user-configured multiply latency.


set_operation_latency

This Tcl command sets the latency of a given operation. Latency refers to the number of clock cycles required to complete the computation: an operation with latency one requires one cycle, while zero-latency operations are completely combinational, meaning multiple such operations can be chained together in a single clock cycle. With this command, every operation of the given type is scheduled to take the specified number of cycles.
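A rough way to reason about these latencies is to add them up along a chain of dependent operations. The sketch below is a simplified model under that assumption, not how SmartHLS computes its schedule, and `chain_latency` is a hypothetical helper.

```cpp
#include <cassert>
#include <vector>

// Toy model: cycles taken by a chain of dependent operations, where each
// operation starts only after the previous one has finished.
// Zero-latency (combinational) operations add no cycles.
int chain_latency(const std::vector<int>& op_latencies) {
    int total = 0;
    for (int latency : op_latencies) {
        assert(latency >= 0);  // a latency is a non-negative cycle count
        total += latency;
    }
    return total;
}
```

For example, using the default values listed in this section (fp_multiply 11, fp_add 14), a floating-point multiply feeding a floating-point add would take about 11 + 14 = 25 cycles in this model.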

This Tcl command should only be used by advanced SmartHLS users.

Category

HLS Constraints

Value Type

set_operation_latency <operation> <constraint>
<operation> is a string
<constraint> is an integer

Valid Values

See Default Values and Examples.
Note: the operator name should match the operation database file: boards/PolarFire/PolarFire.tcl or boards/set_operation_latency.tcl

Default Values

fp_add 14
fp_subtract 14
fp_multiply 11
fp_divide_32 33
fp_divide_64 61
fp_truncate_64 3
fp_extend_32 2
fp_fptosi 6
fp_sitofp 6
signed_comp_o 1
signed_comp_u 1
reg 2
memory_port 2
local_memory_port 1
multiply 1

Location Where Default is Specified

examples/legup.tcl

Dependencies

None

Applicable Flows

All devices and flows

Test Status

Actively in-use

Examples

# Set memory operations to take 3 cycles
set_operation_latency memory_port 3

Note

When implementing an integer multiply operation, SmartHLS’s default behaviour is to optimize for timing (Fmax) by mapping the multiply operation to the target DSPs. When a multiply operation is wider than a DSP can support, SmartHLS splits the multiply across multiple DSPs and automatically inserts registers so that the RTL synthesis tool can make better use of the registers inside the DSPs, and hence achieve better timing.

However, this split-multiply feature does not support a user-configurable latency. When set_operation_latency multiply X is specified by the user, SmartHLS turns off the split-multiply feature and uses a generic multiplier that adapts to the user-specified latency. SmartHLS also prints the following warning:

Warning: Detected set_operation_latency setting for integer multiplier. However, configurable latency is incompatible with the multiply splitting feature. Disabling multiply splitting feature and the normal multiplier modules will be used.

Similarly, set_resource_constraint multiply X will trigger the same behaviour because the split-multiply does not support sharing the multipliers.


Debugging Constraints

KEEP_SIGNALS_WITH_NO_FANOUT

If this parameter is enabled, all signals will be printed to the output Verilog file, even if they don’t drive any outputs.

Category

HLS Constraints

Value Type

Integer

Valid Values

0, 1

Default Value

unset (0)

Location Where Default is Specified

examples/legup.tcl

Dependencies

None

Applicable Flows

All devices and flows

Test Status

Actively in-use

Examples

set_parameter KEEP_SIGNALS_WITH_NO_FANOUT 1


VSIM_ASSERT

When set to 1, this constraint causes assertions to be inserted in the Verilog produced by SmartHLS. This is useful for debugging the circuit to see where invalid values (X’s) are being assigned.

Category

Simulation

Value Type

Integer

Valid Values

0, 1

Default Value

0

Location Where Default is Specified

examples/legup.tcl

Dependencies

None

Applicable Flows

All devices and flows

Test Status

Actively in-use

Examples

set_parameter VSIM_ASSERT 1


Advanced Constraints

These are not available from the SmartHLS GUI.

LATENCY_REDUCTION

The LATENCY_REDUCTION settings control SmartHLS’s expression balancing optimization, whose objective is to reduce circuit latency. The related settings are listed below.

Parameter Name

Default Value

Description

LATENCY_REDUCTION

1

The main switch that enables or disables expression balancing. Setting to 0 disables all expression balancing optimizations.

LATENCY_REDUCTION_ALLOW_FP_REORDERING

0

By default, expression balancing does not re-order floating-point operations, to prevent loss of precision.
Setting to 1 allows floating-point operations to be re-ordered if doing so reduces the circuit latency.

LATENCY_REDUCTION_REDUCE_FP_CONVERSIONS

0

Setting to 1 allows SmartHLS to cancel out back-and-forth conversions between floating-point and integer types, which can change the resulting numerical values.
For example, the following conversions can be cancelled when this setting is 1.
int   a = (int)(float)(3);    // a == 3.
float b = (float)(int)(1.2);  // b == 1.2 instead of 1.

LATENCY_REDUCTION_BALANCE_MULTI_USE_NODE

0

By default, expression balancing does not optimize intermediate operations that have multiple uses, to avoid a potential increase in resource usage.
Setting to 1 allows intermediate operations with multiple uses to be re-ordered, which may achieve further latency reduction.
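The ideas behind these settings can be sketched in plain C++. The function names below are hypothetical and the code only illustrates standard C++ arithmetic, not the transformations SmartHLS actually applies: a balanced sum has a shorter dependency chain than a linear one, floating-point re-ordering can change a result, and a float-to-int-to-float round trip normally truncates.

```cpp
#include <cassert>

// Linear form: three dependent additions, so the adder chain is 3 deep.
int sum_linear(int a, int b, int c, int d) { return ((a + b) + c) + d; }

// Balanced form: (a + b) and (c + d) are independent, so the depth is 2.
int sum_balanced(int a, int b, int c, int d) { return (a + b) + (c + d); }

// Floating-point addition is not associative, so the two orderings below
// can produce different results for the same inputs.
float fp_linear(float a, float b, float c) { return (a + b) + c; }
float fp_reordered(float a, float b, float c) { return a + (b + c); }

// Standard C/C++ semantics: converting to int truncates the fraction.
// Cancelling the two conversions would return x unchanged instead.
float round_trip(float x) { return (float)(int)x; }

// Usage sketch: these checks hold under standard C++ semantics.
void check_examples() {
    assert(sum_linear(1, 2, 3, 4) == sum_balanced(1, 2, 3, 4));
    assert(fp_linear(1e20f, -1e20f, 1.0f) == 1.0f);     // (0) + 1
    assert(fp_reordered(1e20f, -1e20f, 1.0f) == 0.0f);  // the 1 is absorbed
    assert(round_trip(1.2f) == 1.0f);
}
```

The integer case shows why balancing is safe by default, while the floating-point and conversion cases show why LATENCY_REDUCTION_ALLOW_FP_REORDERING and LATENCY_REDUCTION_REDUCE_FP_CONVERSIONS default to 0: enabling them can change numerical results.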

Category

HLS Constraints

Value Type

Integer

Valid Values

0, 1

Default Value

As listed in the table above.

Location Where Default is Specified

examples/legup.tcl

Dependencies

None

Applicable Flows

All devices and flows

Test Status

Actively in-use

Examples

set_parameter LATENCY_REDUCTION 1

set_parameter LATENCY_REDUCTION_ALLOW_FP_REORDERING 0

set_parameter LATENCY_REDUCTION_REDUCE_FP_CONVERSIONS 0

set_parameter LATENCY_REDUCTION_BALANCE_MULTI_USE_NODE 0


STRENGTH_REDUCTION

Strength reduction is an optimization that converts multiply-by-constant into shifts and additions:

Info: StrengthReduction: Replacing multiply by constant (i26 33038) with 3 adders:
        - (1 << 1) + (1 << 4) + (1 << 8) + (1 << 15)
Info: StrengthReduction: Replacing multiply by constant (i26 6416) with 3 adders:
        + (1 << 4) + (1 << 8) + (1 << 11) + (1 << 12)
Info: StrengthReduction: Replacing multiply by constant (i26 28784) with 3 adders:
        - (1 << 4) + (1 << 7) - (1 << 12) + (1 << 15)
Info: StrengthReduction: Replacing multiply by constant (i26 4680) with 3 adders:
        + (1 << 3) + (1 << 6) + (1 << 9) + (1 << 12)
Info: StrengthReduction: Replacing multiply by constant (i26 33024) with 1 adder:
        + (1 << 8) + (1 << 15)

This optimization saves DSP blocks on the FPGA but can also increase LUT usage in the design.
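The decompositions reported in the log can be checked in software. The following sketch uses hypothetical function names and unsigned 32-bit arithmetic (rather than the 26-bit type in the log) to confirm that the shift-and-add forms for the constants 33038 and 33024 produce the same product as a multiplication.

```cpp
#include <cassert>
#include <cstdint>

// Multiply by the constant 33038 using a multiplier.
uint32_t mul_33038(uint32_t x) { return x * 33038u; }

// Same product from four shifted copies of x combined by three
// adders/subtractors: 33038 = -(1 << 1) + (1 << 4) + (1 << 8) + (1 << 15).
uint32_t shift_add_33038(uint32_t x) {
    return (x << 4) + (x << 8) + (x << 15) - (x << 1);
}

// 33024 = (1 << 8) + (1 << 15): two shifted copies and one adder.
uint32_t shift_add_33024(uint32_t x) { return (x << 8) + (x << 15); }

// Usage sketch: both forms agree (modulo 2^32) for every input tried.
void check_examples() {
    for (uint32_t x = 0; x < 1000; ++x) {
        assert(shift_add_33038(x) == mul_33038(x));
        assert(shift_add_33024(x) == x * 33024u);
    }
}
```

In hardware, shifts by a constant are just wiring, so the cost of the replacement is the adders, which is why the log reports the adder count for each constant.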

You can tune the number of adders allowed per multiplier with the STRENGTH_REDUCTION_ADDERS_ALLOWED_PER_MULTIPLIER constraint.

Category

HLS Constraints

Value Type

Integer

Valid Values

0, 1

Default Value

1

Location Where Default is Specified

examples/legup.tcl

Dependencies

None

Applicable Flows

All devices and flows

Test Status

Actively in-use

Examples

set_parameter STRENGTH_REDUCTION 1


STRENGTH_REDUCTION_ADDERS_ALLOWED_PER_MULTIPLIER

Strength reduction is an optimization that converts multiply-by-constant into shifts and additions:

Info: StrengthReduction: Replacing multiply by constant (i26 33038) with 3 adders:
        - (1 << 1) + (1 << 4) + (1 << 8) + (1 << 15)

The STRENGTH_REDUCTION_ADDERS_ALLOWED_PER_MULTIPLIER constraint allows you to tune the number of adders allowed per multiplier (default is 3). Strength reduction for multiply-by-constants will not be performed if this requires more adders than allowed:

i26 16828 is composed of 4 adders:
- (1 << 2) + (1 << 6) + (1 << 7) + (1 << 8) + (1 << 14)
Skipping conversion otherwise would need too many additions.

In this example, we would need 4 adders which is more than the default of 3, meaning strength reduction will not occur and SmartHLS will keep the multiplier.
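The adder-count check described above can be mimicked with a small model. This is an illustrative sketch, not SmartHLS's implementation: `Term`, `value_of`, `adders_needed`, and `reduce_allowed` are hypothetical names, and the decompositions are copied from the log messages in this section rather than computed.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One signed power-of-two term: sign * (1 << shift).
struct Term {
    int sign;   // +1 or -1
    int shift;  // bit position
};

// Constant represented by a list of terms.
int64_t value_of(const std::vector<Term>& terms) {
    int64_t v = 0;
    for (const Term& t : terms) v += t.sign * (int64_t{1} << t.shift);
    return v;
}

// Combining N terms takes N - 1 adders (or subtractors).
int adders_needed(const std::vector<Term>& terms) {
    assert(!terms.empty());
    return static_cast<int>(terms.size()) - 1;
}

// Hypothetical decision rule mirroring the behaviour described in the text.
bool reduce_allowed(const std::vector<Term>& terms, int adders_allowed) {
    return adders_needed(terms) <= adders_allowed;
}

// Decompositions taken from the log messages in this section.
std::vector<Term> example_33038() { return {{-1, 1}, {1, 4}, {1, 8}, {1, 15}}; }
std::vector<Term> example_16828() { return {{-1, 2}, {1, 6}, {1, 7}, {1, 8}, {1, 14}}; }
```

With the default limit of 3, the four-term decomposition of 33038 is accepted (3 adders), while the five-term decomposition of 16828 is rejected (4 adders). Raising the limit to 4 would accept both in this model.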

Category

HLS Constraints

Value Type

Integer

Valid Values

Positive Integer

Default Value

3

Location Where Default is Specified

examples/legup.tcl

Dependencies

STRENGTH_REDUCTION must be on

Applicable Flows

All devices and flows

Test Status

Actively in-use

Examples

set_parameter STRENGTH_REDUCTION_ADDERS_ALLOWED_PER_MULTIPLIER 3


USE_FIFO_FOR_PIPELINE_REG

In a pipeline circuit where multiple stages of the circuit are concurrently active and processing different loop iterations (or function calls), pipeline registers are used to retain and propagate a variable value from the value-producing stage to the value-use stage. The pipeline registers are essentially a chain of shift registers with additional control logic. When the chain of pipeline registers is long, it may be more resource-efficient to implement the pipeline registers as a block-RAM FIFO rather than shift registers.

When this parameter is enabled, SmartHLS will examine each chain of pipeline registers and use the implementation (FIFO or shift register) that is estimated to be more resource-efficient.
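The two implementations are functionally interchangeable, which a small behavioural model can show. The classes below are hypothetical software stand-ins that ignore the additional control logic mentioned above: one shifts every register each cycle, the other writes a single entry of a circular buffer, as a RAM-based FIFO would.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Delay line built as a chain of registers: every cycle, all values shift by one.
class ShiftRegisterDelay {
public:
    explicit ShiftRegisterDelay(std::size_t depth) : regs_(depth, 0) { assert(depth > 0); }
    // Clock in one value; returns the value written `depth` cycles earlier.
    int step(int in) {
        int out = regs_.back();
        for (std::size_t i = regs_.size() - 1; i > 0; --i) regs_[i] = regs_[i - 1];
        regs_[0] = in;
        return out;
    }
private:
    std::vector<int> regs_;
};

// Same delay built as a circular buffer: only one entry is written per cycle.
class FifoDelay {
public:
    explicit FifoDelay(std::size_t depth) : mem_(depth, 0), pos_(0) { assert(depth > 0); }
    int step(int in) {
        int out = mem_[pos_];             // oldest entry
        mem_[pos_] = in;                  // overwrite it with the newest value
        pos_ = (pos_ + 1) % mem_.size();  // advance the read/write pointer
        return out;
    }
private:
    std::vector<int> mem_;
    std::size_t pos_;
};

// Returns true if both implementations produce the same output stream.
bool same_behaviour(std::size_t depth, int cycles) {
    ShiftRegisterDelay sr(depth);
    FifoDelay fifo(depth);
    for (int i = 0; i < cycles; ++i)
        if (sr.step(i + 1) != fifo.step(i + 1)) return false;
    return true;
}
```

Because the outputs match cycle for cycle, the choice between the two comes down to resource cost: the shift register uses one register per stage per bit, whereas the FIFO stores the values in memory and only needs pointer logic around it.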

Category

HLS Constraints

Value Type

Integer

Valid Values

0, 1

Default Value

unset (0)

Location Where Default is Specified

examples/legup.tcl

Dependencies

None

Applicable Flows

All devices and flows

Test Status

Actively in-use

Examples

set_parameter USE_FIFO_FOR_PIPELINE_REG 1