Constraints Manual¶
SmartHLS accepts user-provided constraints that impact the automatically generated
hardware. These constraints can be specified using the SmartHLS IDE and are
stored in the Tcl configuration file config.tcl
in your project directory.
This reference section explains the constraints available for SmartHLS.
The main constraints available from the SmartHLS IDE are:
Set target clock period: CLOCK_PERIOD
Set top-level function: set_custom_top_level_module
Set test bench module: set_custom_test_bench_module
Set test bench file: set_custom_test_bench_file
Set FPGA synthesis top-level module: set_synthesis_top_module
Set FPGA synthesis top-level file: set_synthesis_top_module_file
Set resource constraint: set_resource_constraint
Set operation latency: set_operation_latency
A few debugging constraints are available from the SmartHLS IDE:
Keep signals with no fanout: KEEP_SIGNALS_WITH_NO_FANOUT
Insert simulation assertions (for debugging): VSIM_ASSERT
Commonly Used Constraints¶
CLOCK_PERIOD¶
This is a widely used constraint that allows the user to set the target clock period for a design. The clock period is specified in nanoseconds.
It has a significant impact on scheduling: the scheduler will schedule operators into clock cycles using delay estimates for each operator, such that the specified clock period is honored. In other words, operators will be chained together combinationally to the extent allowed by the value of the CLOCK_PERIOD parameter.
SmartHLS has a default CLOCK_PERIOD value for each device family that is supported (see table in SmartHLS Constraints).
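To make the chaining behaviour concrete, the following C++ sketch models a linear chain of dependent operators, each with an estimated combinational delay, and counts how many clock cycles are needed for a given clock period. It is a toy model for illustration only: the function name cycles_for_chain and the delay values are hypothetical, and SmartHLS's actual scheduler and delay estimates are more involved.

```cpp
#include <cassert>
#include <vector>

// Toy model, not SmartHLS's scheduler: counts the clock cycles needed for a
// linear chain of dependent operators. Operators are chained combinationally
// in the current cycle until adding the next one would exceed the clock period.
int cycles_for_chain(const std::vector<double>& delays_ns, double clock_period_ns) {
    int cycles = 0;
    double used_ns = 0.0;  // delay already accumulated in the current cycle
    for (double d : delays_ns) {
        if (cycles == 0 || used_ns + d > clock_period_ns) {
            ++cycles;      // start a new cycle (register boundary)
            used_ns = d;
        } else {
            used_ns += d;  // chain combinationally within the same cycle
        }
    }
    return cycles;
}
```

With four operators of 4 ns each, this model needs 2 cycles at a 15 ns clock period and 4 cycles at a 5 ns period, which illustrates why a longer CLOCK_PERIOD tends to reduce latency measured in cycles.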
Category
HLS Constraints
Value Type
Integer representing a value in nanoseconds
Valid Values
Integer
Default Value
Depends on the target device
Dependencies
Applicable Flows
All devices and flows
Test Status
Actively in-use
Examples
set_parameter CLOCK_PERIOD 15
set_custom_top_level_module¶
This Tcl command specifies the top-level C/C++ function. The top-level function and all of its descendant functions will be compiled to hardware. The top-level function can also be specified using a pragma (see Set Custom Top-Level Function), but it cannot be specified using both the Tcl command and the pragma.
Category
HLS Constraints
Value Type
String
Dependencies
NONE
Applicable Flows
All devices and flows
Test Status
Actively in-use
Examples
set_custom_top_level_module "foo"
set_custom_test_bench_module¶
This Tcl command specifies the name of the user-provided testbench module to use for RTL simulation. The testbench file must also be specified with set_custom_test_bench_file.
Category
Simulation
Value Type
String
Dependencies
set_custom_test_bench_file user_tb.v
Applicable Flows
All devices and flows
Test Status
Actively in-use
Examples
set_custom_test_bench_module "user_tb"
set_custom_test_bench_file¶
This Tcl command specifies the user-provided testbench file that defines the custom testbench module set via set_custom_test_bench_module.
This is not needed for SW/HW co-simulation.
Category
Simulation
Value Type
String
Dependencies
set_custom_test_bench_module "user_tb"
Applicable Flows
All devices and flows
Test Status
Actively in-use
Examples
set_custom_test_bench_file user_tb.v
set_synthesis_top_module¶
This Tcl command specifies the name of the Verilog module that will be set as the top-level module when creating a Libero project for synthesis, place and route. By default, the top-level function (see Specifying the Top-level Function) is set as the top-level module for the Libero project. However, the user may want to provide a wrapper HDL module that instantiates the SmartHLS-generated top-level module. In this case, this Tcl command can be used to give the name of the wrapper module.
Category
Libero
Value Type
String
Dependencies
NONE
Applicable Flows
All devices and flows
Test Status
Actively in-use
Examples
set_synthesis_top_module "wrapper_top"
set_synthesis_top_module_file¶
When set_synthesis_top_module
is used to set a different wrapper module as
the top-level for synthesis, place & route, use this command to specify the
file that defines the wrapper module.
Category
Libero
Value Type
String
Dependencies
set_synthesis_top_module "wrapper_top"
Applicable Flows
All devices and flows
Test Status
Actively in-use
Examples
set_synthesis_top_module_file custom_synthesis_top.v
set_resource_constraint¶
This Tcl command constrains the resources allocated by SmartHLS.
For instance, to have only a single divider in the entire circuit, the user can specify: set_resource_constraint divide 1
This makes SmartHLS instantiate at most one divider in the circuit; if multiple division operations are required, they will share the same divider.
- Note: A constraint on “divide” will apply to:
signed_divide_8
signed_divide_16
signed_divide_32
signed_divide_64
unsigned_divide_8
unsigned_divide_16
unsigned_divide_32
unsigned_divide_64
It can also be used to constrain the number of memory ports. To make all memories single-ported: set_resource_constraint memory_port 1
For memory ports, only 1 and 2 are valid values, as FPGA RAMs have up to 2 ports.
This Tcl command should only be used by advanced SmartHLS users.
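As a rough intuition for what sharing costs, the C++ sketch below computes a lower bound on the number of issue slots needed when several independent operations compete for a limited number of shared units. The function name min_issue_slots and the one-operation-per-unit-per-slot assumption are made up for this illustration and are not documented SmartHLS behaviour.

```cpp
#include <cassert>

// Lower bound on the number of issue slots needed to start `num_ops`
// independent operations on `num_units` shared functional units, assuming
// each unit can start at most one operation per slot (an assumption of this
// sketch). Requires num_ops >= 0 and num_units >= 1.
int min_issue_slots(int num_ops, int num_units) {
    return (num_ops + num_units - 1) / num_units;  // ceiling division
}
```

Under this model, two divisions sharing one divider need at least two slots, whereas four memory accesses on a dual-ported memory need at least two and on a single-ported memory at least four.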
Category
HLS Constraints
Value Type
set_resource_constraint <operation> <constraint>
<operation> is a string
<constraint> is an integer
Valid Values
See Default and Examples
Note: operator name should match the device family operation database file: boards/PolarFire/PolarFire.tcl
Default Values
memory_port 2
divide 1
modulus 1
multiply 2
fp_add 1
fp_subtract 1
fp_multiply 1
fp_divide 1
fp 1
Location Where Default is Specified
examples/legup.tcl
Dependencies
None
Applicable Flows
All devices and flows
Test Status
Actively in-use
Examples
set_resource_constraint signed_divide_16 3
set_resource_constraint signed_divide 2
set_resource_constraint divide 1
Note
When implementing an integer multiply operation, SmartHLS’s default behaviour is to optimize for timing (Fmax) by mapping the multiply operation to DSPs on the target device. When a multiply operation is wider than a DSP can support, SmartHLS splits the multiply across multiple DSPs and automatically inserts registers to help the RTL synthesis tool make better use of the registers in the DSPs, and hence achieve better timing.
However, this split-multiply feature does not support sharing multipliers.
When set_resource_constraint multiply X is specified by the user, SmartHLS turns off the split-multiply feature and uses a generic multiplier that can be shared by more than one multiply operation.
SmartHLS also prints the following warning:
Warning: Detected set_resource_constraint setting for integer multiplier. However, multiplier sharing is incompatible with the multiply splitting feature. Disabling multiply splitting feature and the normal multiplier modules will be used.
Similarly, set_operation_latency multiply X triggers the same behaviour, because split-multiply does not support a user-configured multiply latency.
set_operation_latency¶
This Tcl command sets the latency of a given operation. Latency refers to the number of clock cycles required to complete the computation: an operation with latency one requires one cycle, while zero-latency operations are completely combinational, meaning multiple such operations can be chained together in a single clock cycle. Every operation of the given type is then scheduled to take the specified number of cycles.
This Tcl command should only be used by advanced SmartHLS users.
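The effect of operation latencies on a dependent chain can be approximated by simply adding them up. The C++ sketch below does this; the function name chain_latency_cycles is hypothetical, and the model ignores everything else the scheduler does (memory accesses, control flow, pipelining).

```cpp
#include <cassert>
#include <vector>

// Sums the latencies (in clock cycles) of operations that depend on each
// other in sequence. Zero-latency operations are combinational and add no
// cycles. This is a simplified model, not SmartHLS's scheduling algorithm.
int chain_latency_cycles(const std::vector<int>& op_latencies) {
    int total = 0;
    for (int latency : op_latencies) {
        total += latency;
    }
    return total;
}
```

For example, with the default latencies listed in this section (fp_multiply 11 and fp_add 14), a floating-point multiply followed by a dependent add contributes 25 cycles of operator latency in this model.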
Category
HLS Constraints
Value Type
set_operation_latency <operation> <constraint>
<operation> is a string
<constraint> is an integer
Valid Values
See Default and Examples
Note: operator name should match the operation database file: boards/PolarFire/PolarFire.tcl or boards/set_operation_latency.tcl
Default Values
fp_add 14
fp_subtract 14
fp_multiply 11
fp_divide_32 33
fp_divide_64 61
fp_truncate_64 3
fp_extend_32 2
fp_fptosi 6
fp_sitofp 6
signed_comp_o 1
signed_comp_u 1
reg 2
memory_port 2
local_memory_port 1
multiply 1
Location Where Default is Specified
examples/legup.tcl
Dependencies
None
Applicable Flows
All devices and flows
Test Status
Actively in-use
Examples
# Set memory operations to take 3 cycles
set_operation_latency memory_port 3
Note
When implementing an integer multiply operation, SmartHLS’s default behaviour is to optimize for timing (Fmax) by mapping the multiply operation to DSPs on the target device. When a multiply operation is wider than a DSP can support, SmartHLS splits the multiply across multiple DSPs and automatically inserts registers to help the RTL synthesis tool make better use of the registers in the DSPs, and hence achieve better timing.
However, this split-multiply feature does not support a user-configurable latency.
When set_operation_latency multiply X is specified by the user, SmartHLS turns off the split-multiply feature and uses a generic multiplier that adapts to the user-specified latency.
SmartHLS also prints the following warning:
Warning: Detected set_operation_latency setting for integer multiplier. However, configurable latency is incompatible with the multiply splitting feature. Disabling multiply splitting feature and the normal multiplier modules will be used.
Similarly, set_resource_constraint multiply X triggers the same behaviour, because split-multiply does not support sharing multipliers.
Debugging Constraints¶
KEEP_SIGNALS_WITH_NO_FANOUT¶
If this parameter is enabled, all signals will be printed to the output Verilog file, even if they don’t drive any outputs.
Category
HLS Constraints
Value Type
Integer
Valid Values
0, 1
Default Value
unset (0)
Location Where Default is Specified
examples/legup.tcl
Dependencies
None
Applicable Flows
All devices and flows
Test Status
Actively in-use
Examples
set_parameter KEEP_SIGNALS_WITH_NO_FANOUT 1
VSIM_ASSERT¶
When set to 1, this constraint causes assertions to be inserted in the Verilog produced by SmartHLS. This is useful for debugging the circuit to see where invalid values (X’s) are being assigned.
Category
Simulation
Value Type
Integer
Valid Values
0, 1
Default Value
0
Location Where Default is Specified
examples/legup.tcl
Dependencies
None
Applicable Flows
All devices and flows
Test Status
Actively in-use
Examples
set_parameter VSIM_ASSERT 1
Advanced Constraints¶
These are not available from the SmartHLS GUI.
LATENCY_REDUCTION¶
The LATENCY_REDUCTION settings control SmartHLS’s expression balancing optimization, whose objective is to reduce circuit latency. The related settings are:
Parameter Name | Default Value | Description
---|---|---
LATENCY_REDUCTION | 1 | The main switch that enables or disables expression balancing. Setting it to 0 disables all expression balancing optimizations.
LATENCY_REDUCTION_ALLOW_FP_REORDERING | 0 | By default, expression balancing does not reorder floating-point operations, to prevent loss of precision. Setting it to 1 allows floating-point operations to be reordered if the circuit latency can be reduced.
LATENCY_REDUCTION_REDUCE_FP_CONVERSIONS | 0 | Setting it to 1 allows SmartHLS to cancel out back-and-forth conversions between floating-point and integer, with potential variations in numerical values. For example, the following conversions can be cancelled when this setting is 1: int a = (int)(float)(3); // a == 3 and float b = (float)(int)(1.2); // b == 1.2 instead of 1
LATENCY_REDUCTION_BALANCE_MULTI_USE_NODE | 0 | By default, expression balancing does not optimize intermediate operations that have multiple uses, to avoid a potential increase in resource usage. Setting it to 1 allows intermediate operations with multiple uses to be reordered, which may achieve further latency reduction.
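The C++ sketch below illustrates what these settings trade off. It is plain software, not SmartHLS output: the function names are hypothetical, and the floating-point results assume IEEE 754 single-precision arithmetic with default rounding. The chain and balanced forms show the kind of rewrite expression balancing performs, and the float versions show why reordering and conversion cancelling are off by default.

```cpp
#include <cassert>

// Unbalanced form: three dependent additions (adder depth 3).
int sum_chain(int a, int b, int c, int d) { return ((a + b) + c) + d; }

// Balanced form: (a + b) and (c + d) are independent (adder depth 2).
// For integers without overflow, both forms give the same result.
int sum_balanced(int a, int b, int c, int d) { return (a + b) + (c + d); }

// The same two shapes for float. Floating-point addition is not associative,
// so the two orderings can produce different results.
float fsum_chain(float a, float b, float c, float d) {
    float t0 = a + b;
    float t1 = t0 + c;
    return t1 + d;
}

float fsum_balanced(float a, float b, float c, float d) {
    float left = a + b;
    float right = c + d;
    return left + right;
}

// float -> int -> float round trip as standard C++ evaluates it: the
// fractional part is truncated. Cancelling the two conversions would instead
// return x unchanged.
float round_trip(float x) { return (float)(int)x; }
```

With the inputs 1e8f, 1.0f, -1e8f, 1.0f, the chained float sum evaluates to 1.0f while the balanced one evaluates to 0.0f, and round_trip(1.2f) returns 1.0f rather than 1.2f.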
Category
HLS Constraints
Value Type
Integer
Valid Values
0, 1
Default Value
As listed in the table above.
Location Where Default is Specified
examples/legup.tcl
Dependencies
None
Applicable Flows
All devices and flows
Test Status
Actively in-use
Examples
set_parameter LATENCY_REDUCTION 1
set_parameter LATENCY_REDUCTION_ALLOW_FP_REORDERING 0
set_parameter LATENCY_REDUCTION_REDUCE_FP_CONVERSIONS 0
set_parameter LATENCY_REDUCTION_BALANCE_MULTI_USE_NODE 0
STRENGTH_REDUCTION¶
Strength reduction is an optimization that converts multiply-by-constant into shifts and additions:
Info: StrengthReduction: Replacing multiply by constant (i26 33038) with 3 adders:
- (1 << 1) + (1 << 4) + (1 << 8) + (1 << 15)
Info: StrengthReduction: Replacing multiply by constant (i26 6416) with 3 adders:
+ (1 << 4) + (1 << 8) + (1 << 11) + (1 << 12)
Info: StrengthReduction: Replacing multiply by constant (i26 28784) with 3 adders:
- (1 << 4) + (1 << 7) - (1 << 12) + (1 << 15)
Info: StrengthReduction: Replacing multiply by constant (i26 4680) with 3 adders:
+ (1 << 3) + (1 << 6) + (1 << 9) + (1 << 12)
Info: StrengthReduction: Replacing multiply by constant (i26 33024) with 1 adder:
+ (1 << 8) + (1 << 15)
This optimization saves DSP blocks on the FPGA but can also increase LUT usage in the design.
You can tune the number of adders allowed per multiplier with the constraint: STRENGTH_REDUCTION_ADDERS_ALLOWED_PER_MULTIPLIER
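The decompositions in the log above can be checked in software. The C++ sketch below rewrites three of the listed constants as shifts and additions or subtractions and is meant only to show the arithmetic identity; the function names are hypothetical, and the hardware SmartHLS generates is not shown here.

```cpp
#include <cassert>
#include <cstdint>

// Unsigned arithmetic is used so that shifts and overflow wrap around
// (mod 2^32) instead of being undefined behaviour.

// x * 33038 with 3 adders/subtractors:
// - (1 << 1) + (1 << 4) + (1 << 8) + (1 << 15)
uint32_t mul_33038(uint32_t x) {
    return (x << 4) + (x << 8) + (x << 15) - (x << 1);
}

// x * 28784 with 3 adders/subtractors:
// - (1 << 4) + (1 << 7) - (1 << 12) + (1 << 15)
uint32_t mul_28784(uint32_t x) {
    return (x << 7) + (x << 15) - (x << 4) - (x << 12);
}

// x * 33024 with 1 adder: (1 << 8) + (1 << 15)
uint32_t mul_33024(uint32_t x) {
    return (x << 8) + (x << 15);
}
```

Each function returns the same value as the corresponding multiplication for every 32-bit input, which is why the multiplier (and its DSP block) can be replaced by adders built from LUTs.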
Category
HLS Constraints
Value Type
Integer
Valid Values
0, 1
Default Value
1
Location Where Default is Specified
examples/legup.tcl
Dependencies
None
Applicable Flows
All devices and flows
Test Status
Actively in-use
Examples
set_parameter STRENGTH_REDUCTION 1
STRENGTH_REDUCTION_ADDERS_ALLOWED_PER_MULTIPLIER¶
Strength reduction is an optimization that converts multiply-by-constant into shifts and additions:
Info: StrengthReduction: Replacing multiply by constant (i26 33038) with 3 adders:
- (1 << 1) + (1 << 4) + (1 << 8) + (1 << 15)
The STRENGTH_REDUCTION_ADDERS_ALLOWED_PER_MULTIPLIER constraint allows you to tune the number of adders allowed per multiplier (default is 3). Strength reduction for multiply-by-constants will not be performed if this requires more adders than allowed:
i26 16828 is composed of 4 adders:
- (1 << 2) + (1 << 6) + (1 << 7) + (1 << 8) + (1 << 14)
Skipping conversion otherwise would need too many additions.
In this example, we would need 4 adders which is more than the default of 3, meaning strength reduction will not occur and SmartHLS will keep the multiplier.
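The check described above can be sketched in C++ as follows. The decomposition is supplied by hand, taken from the log messages; how SmartHLS derives it is not covered here, and the type and function names (ShiftTerm, constant_of, adders_needed, would_replace_multiply) are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One term of a shift-add decomposition: sign * (1 << shift).
struct ShiftTerm {
    int sign;        // +1 or -1
    unsigned shift;  // shift amount
};

// Reconstructs the constant that a decomposition represents.
int64_t constant_of(const std::vector<ShiftTerm>& terms) {
    int64_t value = 0;
    for (const ShiftTerm& t : terms) {
        value += t.sign * (int64_t{1} << t.shift);
    }
    return value;
}

// Combining N terms takes N - 1 adders/subtractors, matching the log counts.
int adders_needed(const std::vector<ShiftTerm>& terms) {
    return terms.empty() ? 0 : static_cast<int>(terms.size()) - 1;
}

// Strength reduction is applied only if the adder count is within the limit.
bool would_replace_multiply(const std::vector<ShiftTerm>& terms, int adders_allowed) {
    return adders_needed(terms) <= adders_allowed;
}
```

For the constant 16828, the five-term decomposition from the log needs 4 adders, so it is rejected with a limit of 3 and would be accepted with a limit of 4.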
Category
HLS Constraints
Value Type
Integer
Valid Values
Positive Integer
Default Value
3
Location Where Default is Specified
examples/legup.tcl
Dependencies
STRENGTH_REDUCTION must be on
Applicable Flows
All devices and flows
Test Status
Actively in-use
Examples
set_parameter STRENGTH_REDUCTION_ADDERS_ALLOWED_PER_MULTIPLIER 3
USE_FIFO_FOR_PIPELINE_REG¶
In a pipeline circuit where multiple stages of the circuit are concurrently active and processing different loop iterations (or function calls), pipeline registers are used to retain and propagate a variable value from the value-producing stage to the value-use stage. The pipeline registers are essentially a chain of shift registers with additional control logic. When the chain of pipeline registers is long, it may be more resource-efficient to implement the pipeline registers as a block-RAM FIFO rather than shift registers.
When this parameter is enabled, SmartHLS will examine each chain of pipeline registers and use the implementation (FIFO or shift register) that is estimated to be more resource-efficient.
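The two implementations are functionally interchangeable because both simply delay a value by a fixed number of steps. The C++ sketch below is a behavioural software model of that equivalence; the class names are hypothetical, and it says nothing about SmartHLS's resource estimates or the control logic it generates.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Delay line as a shift register: every push moves all entries by one slot.
// Requires depth >= 1.
class ShiftRegisterDelay {
public:
    explicit ShiftRegisterDelay(std::size_t depth) : regs_(depth, 0) {}
    int push(int in) {
        int out = regs_.back();  // oldest value leaves the chain
        for (std::size_t i = regs_.size() - 1; i > 0; --i) {
            regs_[i] = regs_[i - 1];
        }
        regs_[0] = in;
        return out;
    }
private:
    std::vector<int> regs_;
};

// Same delay as a RAM-style FIFO (circular buffer): each push reads one entry
// and writes one entry, and only the position advances. Requires depth >= 1.
class FifoDelay {
public:
    explicit FifoDelay(std::size_t depth) : mem_(depth, 0), pos_(0) {}
    int push(int in) {
        int out = mem_[pos_];  // oldest value sits at the current position
        mem_[pos_] = in;
        pos_ = (pos_ + 1) % mem_.size();
        return out;
    }
private:
    std::vector<int> mem_;
    std::size_t pos_;
};
```

In the shift-register version every stored value moves on each step, so the cost grows with one register per stage. The FIFO version touches a single memory location per step, which is why a block RAM can be cheaper for long chains.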
Category
HLS Constraints
Value Type
Integer
Valid Values
0, 1
Default Value
unset (0)
Location Where Default is Specified
examples/legup.tcl
Dependencies
None
Applicable Flows
All devices and flows
Test Status
Actively in-use
Examples
set_parameter USE_FIFO_FOR_PIPELINE_REG 1