SmartHLS Pragmas Manual¶
SmartHLS accepts pragma directives in the source code to guide the hardware generation. This reference section explains all of the pragmas available for SmartHLS.
Pragmas use the following syntax:
#pragma HLS <category> <feature> <parameter>(<value>)
The category refers to the general usage class of the pragma. Each pragma has one of the following categories:
- function: configure a hardware function.
- loop: configure loop optimizations.
- interface: configure hardware interfaces (arguments / global variables).
- memory: configure hardware memory implementation.
Each category can have different configurable features. Some categories / features have parameters to be passed to the pragma. A parameter can be optional, with a default behaviour if it is not specified.
The value of a parameter can be an integer, a boolean (true|false), a name (variable / argument), or one of a set of pre-specified values.
Note
For integer parameters, the user is allowed to use constants (or expressions of constants) defined using the #define directive. For example, this is allowed:
#define N 10
void fun() {
#pragma HLS loop unroll factor(N+1)
for (int i = 0; i < 100; i++)
...
}
The pragma position is not arbitrary: placing a pragma in an incorrect position causes an error. Each pragma has one of the following positions:
- At the beginning of function definition block before any other statements.
- Before global / local variable declaration.
- Before loop block.
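The three positions can be seen together in one small file. The following is a minimal sketch: the names `DOT_LEN`, `coeff`, and `dot` are hypothetical, and the pragma choices are only there to show placement. A standard C++ compiler ignores the unknown pragmas, so the function also runs as plain software.

```cpp
#define DOT_LEN 8

// Position: before a global variable declaration.
#pragma HLS memory partition variable(coeff) type(complete)
int coeff[DOT_LEN] = {1, 2, 3, 4, 5, 6, 7, 8};

int dot(int in[DOT_LEN]) {
// Position: at the beginning of the function definition block,
// before any other statements.
#pragma HLS function top
    int acc = 0;
// Position: immediately before the loop block.
#pragma HLS loop unroll factor(2)
    for (int i = 0; i < DOT_LEN; i++) {
        acc += in[i] * coeff[i];
    }
    return acc;
}
```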
Set Top-Level Function¶
Syntax
#pragma HLS function top
Description
This pragma specifies the top-level C/C++ function. The top-level function and all of its descendant functions will be compiled to hardware.
Position
At the beginning of the function definition block.
Examples
int sum(int *a) {
#pragma HLS function top
...
}
Pipeline Function¶
Syntax
#pragma HLS function pipeline II(<int>)
Description
This pragma enables pipelining for a given function in the code. Function pipelining allows a new invocation of a function to begin before the current one finishes, achieving higher throughput.
Parameters
Parameter | Value | Optional | Default | Description |
---|---|---|---|---|
II | Integer | Yes | 1 | Pipeline initiation interval |
Position
At the beginning of the function definition block.
Examples
int sum(int *a) {
#pragma HLS function pipeline
...
}
int conv(int a[], int b[]) {
#pragma HLS function pipeline II(3)
...
}
Inline Function¶
Syntax
#pragma HLS function inline
Description
This pragma forces a given function to be inlined.
Position
At the beginning of the function definition block.
Examples
int sum(int *a) {
#pragma HLS function inline
...
}
Noinline Function¶
Syntax
#pragma HLS function noinline
Description
This pragma prevents a given function from being inlined.
Position
At the beginning of the function definition block.
Examples
int sum(int *a) {
#pragma HLS function noinline
...
}
Flatten Function¶
Syntax
#pragma HLS function flatten branchless(true|false)
Description
This pragma unrolls all loops and inlines all subfunctions for a given function.
If the branchless option is set to true, all branches (e.g., if-else, switch) in the specified function will also be flattened to allow more parallelism between operations, specifically between the operations that are under different-yet-independent conditions.
Parameters
Parameter | Value | Optional | Default | Description |
---|---|---|---|---|
branchless | true or false | Yes | false | true to flatten branch statements |
Position
At the beginning of the function definition block.
Examples
int sum(int *a) {
#pragma HLS function flatten branchless(true)
...
}
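To picture what flattening branches means, the sketch below contrasts a branching function with a hand-written branchless form in which both sides are computed and a select picks the result. It only illustrates the idea and does not claim to show the transformation SmartHLS actually applies; `pick_branchy` and `pick_branchless` are hypothetical names.

```cpp
// Original form: the addition and the subtraction sit under different,
// independent conditions.
int pick_branchy(int a, int b, int sel) {
#pragma HLS function flatten branchless(true)
    int r;
    if (sel > 0) {
        r = a + b;
    } else {
        r = a - b;
    }
    return r;
}

// Hand-written equivalent: both operations are evaluated unconditionally
// (so they can run in parallel) and a select chooses the result.
int pick_branchless(int a, int b, int sel) {
    int sum = a + b;
    int diff = a - b;
    return (sel > 0) ? sum : diff;
}
```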
Replicate Function¶
Syntax
#pragma HLS function replicate
Description
This pragma specifies a function to be replicated every time it is called. By default, when the circuit is not pipelined, SmartHLS creates a single instance for each function which is shared across multiple calls to the function. When using this pragma on the function, SmartHLS will create a new instance of the function for every function call.
Position
At the beginning of the function definition block.
Examples
int sum(int *a) {
#pragma HLS function replicate
...
}
Pipeline Loop¶
Syntax
#pragma HLS loop pipeline II(<int>)
Description
This pragma enables pipelining for a given loop in the code. Loop pipelining allows a new iteration of the loop to begin before the current one has finished, achieving higher throughput. It can be specified to pipeline a single loop or a nested loop. If specified on a single loop or an inner loop of a nested loop, that loop will be pipelined. If specified on the outer loop of a nested loop, the outer loop will be pipelined, and all of its inner loops will be automatically unrolled.
Parameters
Parameter | Value | Optional | Default | Description |
---|---|---|---|---|
II | Integer | Yes | 1 | Pipeline initiation interval |
Position
Before the beginning of the loop. If there is a loop label, the pragma should be placed after the label.
Examples
#pragma HLS loop pipeline II(2)
for (int i = 0; i < 10; i++) {
...
}
LOOP_LABEL:
#pragma HLS loop pipeline
while (i < 10) {
...
}
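The nested-loop case can be sketched as follows. The function `row_sums` and the sizes `ROWS` and `COLS` are hypothetical; the pragma is placed on the outer loop, so, according to the description above, the outer loop is pipelined and the inner loop is unrolled automatically. Compiled as ordinary C++, the pragma is ignored and the function behaves the same.

```cpp
#define ROWS 4
#define COLS 3

// Sums each row of a ROWS x COLS matrix into out[].
void row_sums(int m[ROWS][COLS], int out[ROWS]) {
// Pragma on the outer loop of the nest.
#pragma HLS loop pipeline II(1)
    for (int r = 0; r < ROWS; r++) {
        int acc = 0;
        // Inner loop: has a fixed trip count (COLS), no pragma needed here.
        for (int c = 0; c < COLS; c++) {
            acc += m[r][c];
        }
        out[r] = acc;
    }
}
```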
Unroll Loop¶
Syntax
#pragma HLS loop unroll factor(<int>)
Description
Specifies a loop to be unrolled.
Parameters
The factor indicates how many times to unroll the loop. If it is not specified, or specified as N (the total number of loop iterations), the loop will be fully unrolled. If it is specified as 2, the loop will be unrolled 2 times, where the number of loop iterations will be halved and the loop body will be replicated twice. If it is specified as 1, the loop will NOT be unrolled.
Parameter | Value | Optional | Default | Description |
---|---|---|---|---|
factor | Integer | Yes | N (fully unroll) | Unroll count |
Position
Before the beginning of the loop.
Note
If there is a loop label, the pragma should be placed after the label.
Examples
Fully unroll a loop.
#pragma HLS loop unroll
for (int i = 0; i < 10; i++) {
...
}
Unroll the loop by a factor of 2 only.
LOOP_LABEL:
#pragma HLS loop unroll factor(2)
while (i < 10) {
...
}
Small loops may be unrolled even without the unroll pragma. Setting factor(1) makes sure the loop is not unrolled.
#pragma HLS loop unroll factor(1)
for (int i = 0; i < 10; i++) {
...
}
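As a rough picture of what factor(2) means, the sketch below pairs a loop carrying the pragma with a hand-unrolled version: half as many iterations, with the body replicated twice. The names `sum_rolled`, `sum_unrolled_by_2`, and `LEN` are hypothetical, and the hand-written form is only an illustration of the description above, not the code SmartHLS generates.

```cpp
#define LEN 10

// Loop as written by the user, with the unroll pragma.
int sum_rolled(const int a[LEN]) {
    int acc = 0;
#pragma HLS loop unroll factor(2)
    for (int i = 0; i < LEN; i++) {
        acc += a[i];
    }
    return acc;
}

// Hand-written picture of the same loop unrolled by 2:
// 5 iterations instead of 10, each containing two copies of the body.
int sum_unrolled_by_2(const int a[LEN]) {
    int acc = 0;
    for (int i = 0; i < LEN; i += 2) {
        acc += a[i];
        acc += a[i + 1];
    }
    return acc;
}
```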
Module Control Interface¶
Syntax
#pragma HLS interface control type(<simple|axi_target>)
Description
This pragma configures the Module Control Interface.
This pragma is ignored if the enclosing function is not specified as the top-level.
Parameters
Parameter | Type | Optional | Default | Description |
---|---|---|---|---|
type | simple or axi_target | No | simple | Interface type |
Position
At the beginning of the function definition block.
Examples
int fun() {
#pragma HLS function top
#pragma HLS interface control type(simple)
...
}
Scalar Argument Interface¶
Syntax
#pragma HLS interface argument(<arg_name>) type(<simple|axi_target>) stable(<false|true>)
Description
This pragma configures the RTL interface for a Scalar Argument.
This pragma is ignored if the enclosing function is not specified as the top-level.
Parameters
Parameter | Type | Optional | Default | Description |
---|---|---|---|---|
argument | String | No | | Argument name |
type | simple or axi_target | No | simple | Interface type |
stable | true or false | Yes | false | Only available for simple type; true if the argument is stable. |
Position
At the beginning of the function definition block.
Examples
int fun(int a) {
#pragma HLS function top
#pragma HLS interface argument(a) type(simple) stable(true)
...
}
Memory Interface for Pointer Argument¶
Syntax
#pragma HLS interface argument(<arg_name>) type(memory) num_elements(<int>)
Description
This pragma specifies the memory interface type for a pointer (including array, struct, and class types) argument. See the Memory Interface section for more details.
This pragma is ignored if the enclosing function is not specified as the top-level.
Parameters
Parameter | Type | Optional | Default | Description |
---|---|---|---|---|
argument | String | No | | Argument name |
type | memory | No | | Interface type |
num_elements | Integer | Yes | | Specifies the number of elements of the argument array. Can override the array size in the argument. |
Position
At the beginning of the function definition block.
Examples
int fun(int a[], int b[]) {
#pragma HLS function top
#pragma HLS interface argument(a) type(memory) num_elements(100)
#pragma HLS interface argument(b) type(memory)
...
}
Memory Interface for Global Variable¶
Syntax
#pragma HLS interface variable(<var_name>) type(memory) num_elements(<int>)
Description
This pragma specifies the memory interface type for a shared global variable. See the Memory Interface section for more details.
This pragma is ignored if the enclosing function is not specified as the top-level.
Parameters
Parameter | Type | Optional | Default | Description |
---|---|---|---|---|
variable | String | No | | Variable name |
type | memory | No | | Interface type |
num_elements | Integer | Yes | | Specifies the number of elements of the variable array. Can override the array size of the variable. |
Position
Before the global variable declaration.
Examples
#pragma HLS interface variable(b) type(memory) num_elements(100)
int b[100];
int fun() {
...
}
AXI4 Initiator Interface for Pointer Argument¶
Syntax
#pragma HLS interface argument(<arg_name>) type(axi_initiator) \
ptr_addr_interface(<simple|axi_target>) num_elements(<int>) max_burst_len(<int>)
Description
This pragma specifies the AXI4 initiator interface type for a pointer (including array, struct, and class types) argument. See the AXI4 Initiator Interface section for more details.
This pragma is ignored if the enclosing function is not specified as the top-level.
Parameters
Parameter | Type | Optional | Default | Description |
---|---|---|---|---|
argument | String | No | | Argument name |
type | axi_initiator | No | | Interface type |
ptr_addr_interface | simple or axi_target | Yes | simple | Specifies the interface type for setting the base address of the accessed memory. The default type is simple but changes to axi_target if Default All Interface to Use AXI4 Target is set. |
num_elements | Integer | Yes | | Specifies the number of elements of the argument array. Can override the array size in the argument. Only needed by the SW/HW Co-Simulation feature and does not affect HLS-generated RTL. |
max_burst_len | Integer | Yes | 16 | Specifies the maximum burst length for AXI Initiator burst transactions using this pointer; see AXI4 Initiator Interface. Permitted values: 1 - 256. |
Position
At the beginning of the function definition block.
Examples
int fun(int a[]) {
#pragma HLS function top
#pragma HLS interface argument(a) type(axi_initiator) ptr_addr_interface(axi_target) num_elements(100)
...
}
AXI4 Target Interface for Pointer Argument¶
Syntax
#pragma HLS interface argument(<arg_name>) type(axi_target) \
num_elements(<int>) dma(true|false) requires_copy_in(true|false)
Description
This pragma specifies the AXI4 target interface type for a pointer (including array, struct, and class types) argument. See the AXI4 Target Interface section for more details.
This pragma is ignored if the enclosing function is not specified as the top-level.
Parameters
Parameter | Type | Optional | Default | Description |
---|---|---|---|---|
argument | String | No | | Argument name |
type | axi_target | No | | Interface type |
num_elements | Integer | Yes | | Specifies the number of elements of the argument array. Can override the array size in the argument. |
dma | true or false | Yes | false | Specifies the transfer method and copy-in behaviour in the top-level driver function. See Top-level Driver Options in Pointer Arguments’ AXI4 Target Interface Pragma |
requires_copy_in | true or false | Yes | | |
Position
At the beginning of the function definition block.
Examples
int fun(int a[], int b[101]) {
#pragma HLS function top
#pragma HLS interface argument(a) type(axi_target) num_elements(100) dma(true) requires_copy_in(false)
#pragma HLS interface argument(b) type(axi_target)
...
}
Legacy AXI4 Slave Interface for Global Variable¶
Syntax
#pragma HLS interface variable(<var_name>) type(axi_slave) concurrent_access(true|false)
Description
This pragma specifies the legacy AXI4 slave interface for a global struct.
When the concurrent_access option is set to true (the default is false), external logic can read/write the AXI4 slave interface while the SmartHLS module is running. However, concurrent access will reduce the SmartHLS module’s throughput when accessing the memory.
See the Legacy AXI4 Slave Interface section for more details.
This pragma is ignored if the enclosing function is not specified as the top-level.
Parameters
Parameter | Value | Optional | Default | Description |
---|---|---|---|---|
variable | String | No | | Variable name |
type | axi_slave | No | | Interface type |
concurrent_access | true or false | Yes | false | Enable/disable concurrent access |
Position
Before the global variable declaration.
Examples
#pragma HLS interface variable(b) type(axi_slave) concurrent_access(true)
int b[SIZE];
Default All Interface to Use AXI4 Target¶
Syntax
#pragma HLS interface default type(axi_target)
Description
This pragma sets the default interface to AXI4 target for all arguments and the module control.
This pragma is ignored if the enclosing function is not specified as the top-level.
Parameters
Parameter | Type | Optional | Default | Description |
---|---|---|---|---|
type | axi_target | No | | Interface type |
Position
At the beginning of the function definition block.
Examples
// The following two functions have the same interface configurations.
// Without using default interface pragma:
int fun(int a, int b[10], int c[20], int d[30]) {
#pragma HLS function top
#pragma HLS interface control type(axi_target)
#pragma HLS interface argument(a) type(axi_target)
#pragma HLS interface argument(b) type(axi_target)
#pragma HLS interface argument(c) type(axi_target) dma(true)
#pragma HLS interface argument(d) type(axi_initiator) ptr_addr_interface(axi_target)
...
}
// Use default interface pragma:
int fun(int a, int b[10], int c[20], int d[30]) {
#pragma HLS function top
#pragma HLS interface default type(axi_target)
#pragma HLS interface argument(c) type(axi_target) dma(true)
// Note that 'ptr_addr_interface(axi_target)' can be omitted when default interface is set to axi_target.
#pragma HLS interface argument(d) type(axi_initiator)
...
}
Partition Memory¶
Syntax
#pragma HLS memory partition variable(<var_name>) type(block|cyclic|complete|struct_fields|none) dim(<int>) factor(<int>)
Description
This pragma specifies a variable to be partitioned. Dimension 1 corresponds to the left-most dimension of an array and higher dimensions correspond to right-ward dimensions. The dim parameter is only applicable for block|cyclic|complete types. If dim is 0, the specified partitioning will be applied to all dimensions. The factor parameter is only applicable for block|cyclic types to specify the number of partitions. factor must be larger than 1.
See User-Specified Memory Partitioning for more details about the pragma options.
Parameters
Parameter | Value | Optional | Default | Description |
---|---|---|---|---|
variable | String | No | | Variable name |
type | block, cyclic, complete, struct_fields, none | Yes | complete | Partition type |
dim | Integer | Yes | 0 | Partition dimension |
factor | Integer | Yes | | Number of partitions |
Position
Before the global / local variable declaration.
Examples
#pragma HLS memory partition variable(b) type(none)
int b[100];
int fun(int *a) {
...
#pragma HLS memory partition variable(c) type(block) dim(1) factor(2)
int c[100][100];
...
}
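The difference between block and cyclic partitioning can be sketched as an index mapping. This assumes the conventional definitions (block keeps contiguous chunks together, cyclic interleaves elements round-robin); the User-Specified Memory Partitioning section remains the authority on SmartHLS behaviour. `block_partition` and `cyclic_partition` are hypothetical helpers, not part of any SmartHLS API.

```cpp
// Which partition holds element `index` of a one-dimensional array with
// `size` elements split into `factor` partitions?

// Block: consecutive elements stay together in chunks of ceil(size / factor).
int block_partition(int index, int size, int factor) {
    int chunk = (size + factor - 1) / factor;
    return index / chunk;
}

// Cyclic: elements are dealt out round-robin across the partitions.
int cyclic_partition(int index, int factor) {
    return index % factor;
}
```

Under these assumptions, a 100-element array with factor(2) is split by block into elements 0 to 49 and 50 to 99, while cyclic places even indices in one partition and odd indices in the other.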
Partition Top-Level Interface¶
Syntax
#pragma HLS memory partition argument(<arg_name>) type(block|cyclic|complete|struct_fields|none) dim(<int>) factor(<int>)
Description
This pragma specifies a top-level argument to be partitioned. Dimension 1 corresponds to the left-most dimension of an array and higher dimensions correspond to right-ward dimensions. The dim parameter is only applicable for block|cyclic|complete types. If dim is 0, the specified partitioning will be applied to all dimensions. The factor parameter is only applicable for block|cyclic types to specify the number of partitions. factor must be larger than 1.
See User-Specified Memory Partitioning for more details about the pragma options.
Parameters
Parameter | Value | Optional | Default | Description |
---|---|---|---|---|
argument | String | No | | Argument name |
type | block, cyclic, complete, struct_fields, none | Yes | complete | Partition type |
dim | Integer | Yes | 0 | Partition dimension |
factor | Integer | Yes | | Number of partitions |
Position
At the beginning of the function definition block.
Examples
int sum(int *a, int *b) {
#pragma HLS function top
#pragma HLS memory partition argument(a) type(cyclic) dim(2) factor(4)
#pragma HLS memory partition argument(b)
}
Struct Variable Packing¶
Syntax
#pragma HLS memory impl variable(<var_name>) pack(bit|byte) byte_enable(true|false)
Description
This pragma packs a global interface / local memory variable of struct type. There are two packing modes, bit and byte: bit mode packs the struct fields using their exact bit-widths, and byte mode packs the fields with 8-bit alignment. When set to true, the byte_enable option creates an interface / memory with byte enable signals for writing individual fields. Note that byte_enable is only valid with byte packing.
Parameters
Parameter | Value | Optional | Default | Description |
---|---|---|---|---|
variable | String | No | | Variable name |
pack | bit or byte | No | | Packing mode |
byte_enable | true or false | Yes | false | Use byte-enable to write struct fields |
Position
Before the global / local variable declaration.
Examples
#pragma HLS memory impl variable(s) pack(bit)
struct S s[100];
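The size difference between the two packing modes can be estimated with a small calculation. This is a sketch under the assumption that byte mode rounds every field, including the last one, up to a multiple of 8 bits; the helper names are hypothetical and the exact layout SmartHLS produces is not specified here.

```cpp
// Total packed width in bits, given each field's bit-width.

// bit packing: fields are placed back to back at their exact widths.
int packed_bits_bit_mode(const int *widths, int count) {
    int total = 0;
    for (int i = 0; i < count; i++) {
        total += widths[i];
    }
    return total;
}

// byte packing: each field is rounded up to a multiple of 8 bits
// (assumption: padding applies to every field).
int packed_bits_byte_mode(const int *widths, int count) {
    int total = 0;
    for (int i = 0; i < count; i++) {
        total += ((widths[i] + 7) / 8) * 8;
    }
    return total;
}
```

For a struct with fields of 5, 10, and 1 bits, this gives 16 bits in bit mode and 32 bits in byte mode.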
Struct Argument Packing¶
Syntax
#pragma HLS memory impl argument(<arg_name>) pack(bit|byte) byte_enable(true|false)
Description
This pragma packs a top-level argument of struct type. There are two packing modes, bit and byte: bit mode packs the struct fields using their exact bit-widths, and byte mode packs the fields with 8-bit alignment. When set to true, the byte_enable option creates an interface with byte enable signals for writing individual fields. Note that byte_enable is only valid with byte packing.
Parameters
Parameter | Value | Optional | Default | Description |
---|---|---|---|---|
argument | String | No | | Argument name |
pack | bit or byte | No | | Packing mode |
byte_enable | true or false | Yes | false | Use byte-enable to write struct fields |
Position
At the beginning of the function definition block.
Examples
int sum(struct S &s) {
#pragma HLS function top
#pragma HLS memory impl argument(s) pack(byte) byte_enable(true)
...
}
Replicate ROM¶
Syntax
#pragma HLS memory replicate_rom variable(<rom_var_name>) max_replicas(<int>)
Description
This pragma can be used to replicate constant memory (i.e., arrays) to achieve better throughput (shorter cycle latency) at the expense of extra resources (e.g., block RAM). Typically, when an array is implemented in block RAMs, there are up to two RAM ports, allowing a maximum of two reads per clock cycle. To allow more parallel read accesses in each clock cycle, constant read-only memories (ROMs) can be replicated by using this pragma.
The optional max_replicas parameter controls the maximum number of replicas. If a max_replicas of N is specified, SmartHLS will use no more than N replicas of the ROM in the generated circuit; the generated circuit may use fewer than N replicas when the throughput cannot be further improved with more replicas. When max_replicas is unspecified or set to 0, the number of replicas is unlimited and SmartHLS will use as many replicas as it needs to maximize throughput. A max_replicas of 1 means only one copy is allowed, hence no replication, which is equivalent to not having the pragma.
Parameters
Parameter | Value | Optional | Default | Description |
---|---|---|---|---|
max_replicas | Integer | Yes | 0 | The maximum number of replicas allowed |
Position
Before the global / local variable declaration.
Examples
#pragma HLS memory replicate_rom variable(my_rom) max_replicas(10)
const int my_rom[100];
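A case where replication could help is a function that reads the same ROM more than twice in one computation. The following is a minimal sketch; `lut` and `sum4` are hypothetical names, the table contents are arbitrary, and whether SmartHLS actually uses both replicas depends on the schedule it produces.

```cpp
// Hypothetical 8-entry lookup table (squares of 0..7).
// Position: the pragma goes before the variable declaration.
#pragma HLS memory replicate_rom variable(lut) max_replicas(2)
const int lut[8] = {0, 1, 4, 9, 16, 25, 36, 49};

// Four independent ROM reads: more than the two ports of a single
// block RAM can serve in one clock cycle. Indices must be in 0..7.
int sum4(int i0, int i1, int i2, int i3) {
    return lut[i0] + lut[i1] + lut[i2] + lut[i3];
}
```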
Contention-Free Memory Access¶
Syntax
#pragma HLS memory impl variable(<var_name>) contention_free(true|false)
Description
This pragma is used for variables accessed by parallel functions (hls::thread) so that SmartHLS does not create arbiters for the specified variable.
The specified variable can still be accessed by multiple concurrently running functions, but without contention.
It is the user’s responsibility to ensure that at most one function accesses the shared variable in any clock cycle. If the pragma is not specified, SmartHLS by default creates arbiters for variables that are accessed by parallel functions.
Parameters
Parameter | Value | Optional | Default | Description |
---|---|---|---|---|
variable | String | No | | Variable name |
contention_free | true or false | Yes | false | true for contention-free access |
Position
Before the global / local variable declaration.
Examples
#pragma HLS memory impl variable(b) contention_free(true)
int b[100];