Lab13 Interface_streaming_2
using_axi_stream_no_side_channel_data
An AXI4-Stream is used without side-channels when the function argument, ap_axis
or ap_axiu
data type, does not contain any AXI4 side-channel elements (that is, when the WUser
, WId
, and WDest
parameters are set to 0). In the following example, both interfaces are implemented using an AXI4-Stream:
example.h
#include "ap_axi_sdata.h"
#include "hls_stream.h"
#define DWIDTH 32
#define type ap_int<DWIDTH>
typedef ap_axiu<32, 0, 0, 0> pkt;
example.cpp
#include "example.h"
//Read data from the stream A and add 5 to the data, then write to the stream B
//Input: stream A and num
//Output: stream B
void example(hls::stream<pkt>& A, hls::stream<pkt>& B,int num) {
#pragma HLS INTERFACE axis port = A
#pragma HLS INTERFACE axis port = B
#pragma HLS INTERFACE ap_ctrl_none port=return
#pragma HLS INTERFACE mode=s_axilite port=num
#pragma HLS PIPELINE style=flp II=1
static int counter=0;
pkt tmp;
pkt t1;
tmp.keep=-1;
tmp.last=0;
t1.keep=-1;
A.read(tmp);
t1.data = tmp.data + 5;
if(counter==num-1)
{
t1.last=1;
counter=0;
}
else
{
t1.last=0;
counter++;
}
B.write(t1);
}
After synthesis, both arguments are implemented with a data port (TDATA) and the standard AXI4-Stream protocol ports, TVALID, TREADY, TKEEP, TLAST, and TSTRB, as shown in the following figure.
And run cosimulation
and click Wave Viewer
to see the wave to enable the last
signal is right as shown below.
example_tb.h
#include "example.h"
#include <iostream>
using namespace std;
void example(hls::stream<pkt>& A, hls::stream<pkt>& B,int num);
int main() {
int num = 100;
hls::stream<pkt> A, B;
pkt tmp1, tmp2;
for(int i=0;i<100;i++)
{
tmp1.data = i;
A.write(tmp1);
example(A, B,num);
}
for(int j=0;j<100;j++)
{
B.read(tmp2);
printf("tmp2[%d] is %d\r\n",j,tmp2.data);
}
}
Create the Vivado project
The configure block design can use reference materials here. And we need to choose the number of the DMA according to the number of the interface.
Run synthesis, Implementation, and generate bitstream
It may show some errors about I/O Ports, please fix them.
Download the bitstream file to PYNQ
import numpy as np
import pynq
from pynq import MMIO
hw = pynq.Overlay('design_1.bit')
We can use the hw?
to check the IP dictionary
s2mm = hw.axi_dma_0.sendchannel
mm2s = hw.axi_dma_0.recvchannel
Allocate DMA memory address size
top=hw.example_0
N = 100
oBuf_0 = allocate(shape=(N,), dtype = np.int32)
iBuf_0 = allocate(shape=(N,), dtype = np.int32)
for i in range(N):
oBuf_0[i]= i
top_ip.register_map
#by the s_axilite to define the number of the transfer data
top.register_map.num=100
#begin the DMA transfer
s2mm.transfer(oBuf_0)
mm2s.transfer(iBuf_0)
s2mm.wait()
mm2s.wait()
We will see:
using_axi_stream_with_custom_side_channel_data_1
An instance of an AXI4-Stream without side channels (for example hls::stream<ap_axiu<32, 0, 0, 0>>
) can result in the following RTL signals on the interface:
TVALID
TREADY
TDATA[ ]
TLAST[ ]
TKEEP[ ]
TSTRB[ ]
Some of these signals (TLAST
, TKEEP
, and TSTRB
) can be unnecessary or unwanted in your design. In addition, there are use cases where TDATA
is not required as the modeler uses TUSER
to move data.Ref
In addition to letting you enable or disable specific side-channel signals, the set of macros was expanded to allow the specification of all
or none
of the signals. This allowed for the definition of the special hls::axis_data<>
and hls::axis_user<>
helper classes, making it easier to pick and choose the exact signal specification.
Example C++ code for implementing an AXI stream interface in Vitis HLS using only one side channel.(TDATA
, TLAST
only)
example.h
#pragma once
#include "ap_axi_sdata.h"
#include "hls_stream.h"
#include <iostream>
// Only TDATA and TLAST
typedef hls::axis_data<int, AXIS_ENABLE_LAST> packet;
#define SIZE 5
void example(hls::stream<packet>& A, hls::stream<packet>& B);
example.cpp
#include "example.h"
void example(hls::stream<packet>& A, hls::stream<packet>& B) {
#pragma HLS INTERFACE axis port = A
#pragma HLS INTERFACE axis port = B
#pragma HLS INTERFACE ap_ctrl_none port=return
#pragma HLS PIPELINE style=flp II=1
static int counter=0;
packet tmp = A.read();
tmp.data = tmp.data + 5;
if(counter==SIZE-1)
{
tmp.last=1;
counter=0;
}
else
{
tmp.last=0;
counter++;
}
B.write(tmp);
}
The synthesis report is shown below:
example_tb.h
#include "example.h"
using namespace std;
int main() {
hls::stream<packet> A, B;
packet tmp1, tmp2;
for (int j = 0; j < SIZE; j++) {
tmp1.data = j;
if (j == 99) {
tmp1.last = 1;
} else {
tmp1.last = 0;
}
A.write(tmp1);
example(A, B);
B.read(tmp2);
if (tmp1.get_data() + 5 != tmp2.get_data()) {
cout << "ERROR: results mismatch" << endl;
cout << "tmp1.data=" << tmp1.data;
cout << " != ";
cout << "tmp2.data=" << tmp2.data << endl;
return 1;
}
}
cout << "Success: results match" << endl;
return 0;
}
using_axi_stream_with_custom_side_channel_data_2
Example C++ code for implementing an AXI stream interface in Vitis HLS using only TUSER
and TLAST
channels with no TDATA
.
example.h
#pragma once
#include "ap_axi_sdata.h"
#include "hls_stream.h"
#include <iostream>
typedef hls::axis_user<13, (AXIS_ENABLE_USER | AXIS_ENABLE_LAST)> packet;
#define SIZE 5
void example(hls::stream<packet>& A, hls::stream<packet>& B);
example.cpp
#include "example.h"
void example(hls::stream<packet>& A, hls::stream<packet>& B) {
#pragma HLS INTERFACE axis port = A
#pragma HLS INTERFACE axis port = B
packet tmp = A.read();
tmp.set_user(tmp.get_user() + 5);
B.write(tmp);
}
The synthesis report is shown below:
example_tb.h
#include "example.h"
using namespace std;
int main() {
hls::stream<packet> A, B;
packet tmp1, tmp2;
for (int j = 0; j < SIZE; j++) {
tmp1.user = j;
if (j == 99) {
tmp1.last = 1;
} else {
tmp1.last = 0;
}
A.write(tmp1);
example(A, B);
B.read(tmp2);
if (tmp1.get_user() + 5 != tmp2.get_user()) {
cout << "ERROR: results mismatch" << endl;
cout << "tmp1.user=" << tmp1.get_user();
cout << " != ";
cout << "tmp2.user=" << tmp2.get_user() << endl;
return 1;
}
}
cout << "Success: results match" << endl;
return 0;
}
using_axi_stream_with_side_channel_data
The following example shows how the side channels can be used directly in the C/C++ code and implemented on the interface. The code uses #include “ap_axi_sdata.h” to provide an API to handle the side channels of the AXI4-Stream interface. In the following example a signed 32-bit data type is used:
example.h
#pragma once
#include "ap_axi_sdata.h"
#include "hls_stream.h"
#include <iostream>
typedef ap_axis<32, 2, 5, 6> packet;
#define SIZE 5
void example(hls::stream<packet>& A, hls::stream<packet>& B);
example.cpp
#include "example.h"
void example(hls::stream<packet>& A, hls::stream<packet>& B) {
#pragma HLS INTERFACE axis port = A
#pragma HLS INTERFACE axis port = B
#pragma HLS INTERFACE ap_ctrl_none port=return
#pragma HLS PIPELINE style=flp II=1
static int counter=0;
packet tmp = A.read();
tmp.keep=-1;
tmp.data = tmp.data + 5;
if(counter==SIZE-1)
{
tmp.last=1;
counter=0;
}
else
{
tmp.last=0;
counter++;
}
B.write(tmp);
}
The synthesis report is shown below:
example_tb.h
#include "example.h"
using namespace std;
int main() {
hls::stream<packet> A, B;
packet tmp1, tmp2;
for (int j = 0; j < SIZE; j++) {
tmp1.data = j;
tmp1.keep = -1;
tmp1.strb = 1;
tmp1.user = 1;
if (j == 99) {
tmp1.last = 1;
} else {
tmp1.last = 0;
}
A.write(tmp1);
example(A, B);
B.read(tmp2);
if (tmp1.data.to_int() + 5 != tmp2.data.to_int()) {
cout << "ERROR: results mismatch" << endl;
cout << "tmp1.data=" << tmp1.data;
cout << " != ";
cout << "tmp2.data=" << tmp2.data << endl;
return 1;
}
}
cout << "Success: results match" << endl;
return 0;
}
After Run the cosimulation, we can see the wave viewer to confirm the last
signal as shown below:
The export IP is shown below.
Create the Vivado project
The configure block design can use reference materials here. And we need to choose the number of the DMA according to the number of the interface.
Run synthesis, Implementation, and generate bitstream
It may show some errors about I/O Ports, please fix them.
Download the bitstream file to PYNQ
import numpy as np
import pynq
from pynq import MMIO
hw = pynq.Overlay('design_1.bit')
s2mm = hw.axi_dma_0.sendchannel
mm2s = hw.axi_dma_0.recvchannel
Allocate DMA memory address size
N = 5
oBuf_0 = allocate(shape=(N,), dtype = np.int32)
iBuf_0 = allocate(shape=(N,), dtype = np.int32)
for i in range(N):
oBuf_0[i]= i
#begin the DMA transfer
s2mm.transfer(oBuf_0)
mm2s.transfer(iBuf_0)
s2mm.wait()
mm2s.wait()
We will see:
using_axi_stream_with_struct
example.h
#include "ap_axi_sdata.h"
#include "hls_stream.h"
#include <complex>
#define SIZE 5
typedef hls::axis<std::complex<short int>, 0, 0, 0> data_t;
typedef hls::stream<data_t> mystream;
extern void example(mystream& A, mystream& B);
example.cpp
#include "example.h"
void example(mystream& A, mystream& B) {
#pragma HLS INTERFACE axis port = A
#pragma HLS INTERFACE axis port = B
#pragma HLS INTERFACE ap_ctrl_none port=return
data_t tmp_a;
static int counter=0;
tmp_a.keep=-1;
do {
tmp_a = A.read();
data_t tmp_b;
tmp_b.keep=-1;
tmp_b.data.real(tmp_a.data.real() + 5);
tmp_b.data.imag(tmp_a.data.imag() + 1);
if(counter==SIZE-1)
{
tmp_b.last=1;
counter=0;
}
else
{
tmp_b.last=0;
counter++;
}
B.write(tmp_b);
} while (!tmp_a.last);
}
The synthesis report is shown below:
example_tb.h
#include "example.h"
#include <stdio.h>
int main() {
short int i;
mystream A, B, C;
printf("HLS AXI-Stream no side-channel data example\n");
// Put data into A Stream
for (i = 0; i < SIZE; i++) {
data_t tmp;
tmp.data = {i, i};
tmp.last = (i == (SIZE - 1)) ? 1 : 0;
A.write(tmp);
}
// Call the hardware function
example(A, B);
// Run a software version of the hardware function to validate results
for (i = 0; i < SIZE; i++) {
data_t tmp_c;
tmp_c.data.real(i + 5);
tmp_c.data.imag(i + 1);
C.write(tmp_c);
}
// Compare the results
for (i = 0; i < SIZE; i++) {
data_t tmp_b = B.read();
data_t tmp_c = C.read();
if (tmp_b.data != tmp_c.data) {
printf("ERROR HW and SW results mismatch\n");
return 1;
}
}
printf("Success HW and SW results match\n");
return 0;
}
After Running the cosimulation, we can see the wave viewer to confirm the last
signal as shown below:
The export IP is shown below.
Create the Vivado project
The configure block design can use reference materials here. And we need to choose the number of the DMA according to the number of the interface.
Run synthesis, Implementation, and generate bitstream
It may show some errors about I/O Ports, please fix them.
Download the bitstream file to PYNQ
import numpy as np
import pynq
from pynq import MMIO
hw = pynq.Overlay('design_1.bit')
s2mm = hw.axi_dma_0.sendchannel
mm2s = hw.axi_dma_0.recvchannel
Allocate DMA memory address size
N = 5
oBuf_0 = allocate(shape=(N,), dtype = np.int32)
iBuf_0 = allocate(shape=(N,), dtype = np.int32)
for i in range(1):
oBuf_0[0]= 0
oBuf_0[1]= 65537
oBuf_0[2]= 131074
oBuf_0[3]= 196611
oBuf_0[4]= 262148
#begin the DMA transfer
s2mm.transfer(oBuf_0)
mm2s.transfer(iBuf_0)
s2mm.wait()
mm2s.wait()
We will see:
using_axis_array_stream_no_side_channel_data
Example C++ code for implementing an AXI array to stream with no side channels.
example.cpp
void example(int A[50], int B[50]) {
#pragma HLS INTERFACE axis port = A
#pragma HLS INTERFACE axis port = B
#pragma HLS INTERFACE ap_ctrl_none port=return
int i;
for (i = 0; i < 50; i++) {
B[i] = A[i] + 5;
}
}
The synthesis report is shown below:
example_tb.h
#include <stdio.h>
void example(int A[50], int B[50]);
int main() {
int i;
int A[50];
int B[50];
int C[50];
printf("HLS AXI-Stream no side-channel data example\n");
// Put data into A
for (i = 0; i < 50; i++) {
A[i] = i;
}
// Call the hardware function
example(A, B);
// Run a software version of the hardware function to validate results
for (i = 0; i < 50; i++) {
C[i] = A[i] + 5;
}
// Compare results
for (i = 0; i < 50; i++) {
if (B[i] != C[i]) {
printf("ERROR HW and SW results mismatch\n");
return 1;
}
}
printf("Success HW and SW results match\n");
return 0;
}
Create the Vivado project
The configure block design can use reference materials here. We also need to choose the number of the DMA according to the number of the interface. We also need middle_data IP to provide the last and keep the signal. Ref
Run synthesis, Implementation, and generate bitstream
It may show some errors about I/O Ports, please fix them.
Download the bitstream file to PYNQ
import numpy as np
import pynq
from pynq import MMIO
hw = pynq.Overlay('design_1.bit')
hw?
s2mm = hw.axi_dma_0.sendchannel
mm2s = hw.axi_dma_0.recvchannel
Allocate DMA memory address size
top=hw.example_0
N = 50
oBuf_0 = allocate(shape=(N,), dtype = np.int32)
iBuf_0 = allocate(shape=(N,), dtype = np.int32)
for i in range(N):
oBuf_0[i]= i
#begin the DMA transfer
s2mm.transfer(oBuf_0)
mm2s.transfer(iBuf_0)
s2mm.wait()
mm2s.wait()
We will see:
Demonstrate
Please create an HLS component that uses axi_stream_no_side_channel_data
interface and axi_stream_with_side_channel_data
interface and axi_stream_with_struct
interface and axi_array_strema_no_side_channel_data
separately and implement them on the PYNQ-Z2 board, and you can refer to the examples of the tutorial.