******************
Using the TIDL API
******************

This example illustrates using the TIDL API to offload deep learning network processing from a Linux application to the C66x DSPs or DLAs on AM57x devices.

Step 1
======

Determine if there are any TIDL capable devices on the AM57x SoC:

.. code-block:: c++

    uint32_t num_dla = Executor::GetNumDevices(DeviceType::DLA);
    uint32_t num_dsp = Executor::GetNumDevices(DeviceType::DSP);

Step 2
======

Create a Configuration object by reading it from a file or by initializing it directly. The example below parses a configuration file and initializes the Configuration object. See ``examples/test/testvecs/config/infer`` for examples of configuration files.

.. code-block:: c++

    Configuration configuration;
    bool status = configuration.ReadFromFile(config_file);

.. note::
    Refer to the TIDL Translation Tool documentation for creating TIDL network and parameter binary files from TensorFlow and Caffe.

Step 3
======

Create an Executor with the appropriate device type, set of devices, and a configuration. In the snippet below, an Executor is created on 2 DLAs.

.. code-block:: c++

    DeviceIds ids = {DeviceId::ID0, DeviceId::ID1};
    Executor executor(DeviceType::DLA, ids, configuration);

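
Steps 1 and 3 can be connected: rather than hard-coding two DLAs, an application might build the device set from the counts returned in Step 1. The sketch below is a minimal illustration under assumptions. ``DeviceType``, ``DeviceId`` and ``DeviceIds`` are local stand-ins declared so the snippet compiles without the TIDL headers (the real definitions come from the API), and ``DeviceSelection`` and ``SelectDevices`` are hypothetical helpers that are not part of the API. Preferring DLAs over DSPs is an arbitrary policy chosen for the example.

.. code-block:: c++

    #include <cstdint>
    #include <set>
    #include <stdexcept>

    // Local stand-ins so the sketch compiles without the TIDL headers.
    // In a real application these types come from the TIDL API.
    enum class DeviceType { DSP, DLA };
    enum class DeviceId : int { ID0 = 0, ID1, ID2, ID3 };
    using DeviceIds = std::set<DeviceId>;

    // Hypothetical helper type: the device type to use and which devices.
    struct DeviceSelection
    {
        DeviceType type;
        DeviceIds  ids;
    };

    // Hypothetical helper: prefer DLAs when present, otherwise use DSPs.
    // num_dla and num_dsp are the counts obtained in Step 1.
    DeviceSelection SelectDevices(uint32_t num_dla, uint32_t num_dsp)
    {
        if (num_dla == 0 && num_dsp == 0)
            throw std::runtime_error("No TIDL capable devices found");

        DeviceSelection sel;
        sel.type = (num_dla > 0) ? DeviceType::DLA : DeviceType::DSP;
        uint32_t count = (num_dla > 0) ? num_dla : num_dsp;

        // The stand-in enum names only four ids, so clamp the count.
        if (count > 4) count = 4;

        for (uint32_t i = 0; i < count; i++)
            sel.ids.insert(static_cast<DeviceId>(i));

        return sel;
    }

With the real headers, the result would be handed to the constructor shown above, for example ``Executor executor(sel.type, sel.ids, configuration);``.
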
Step 4
======

Get the set of available ExecutionObjects and allocate input and output buffers for each ExecutionObject.

.. code-block:: c++

    const ExecutionObjects& execution_objects = executor.GetExecutionObjects();
    int num_eos = execution_objects.size();

    // Allocate input and output buffers for each execution object.
    // frame_sz is the buffer size in bytes, defined elsewhere by the application.
    std::vector<void *> buffers;
    for (auto &eo : execution_objects)
    {
        ArgInfo in  = { ArgInfo(malloc(frame_sz), frame_sz)};
        ArgInfo out = { ArgInfo(malloc(frame_sz), frame_sz)};
        eo->SetInputOutputBuffer(in, out);

        // Keep the raw pointers so the buffers can be freed later
        buffers.push_back(in.ptr());
        buffers.push_back(out.ptr());
    }

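
The buffers above are allocated with ``malloc`` and their pointers are collected in ``buffers``, so the application is responsible for releasing them once processing is complete. One possible way to make that release automatic is sketched below. ``FreeDeleter``, ``MallocBuffer`` and ``AllocateBuffers`` are hypothetical names, not part of the TIDL API, and the sketch assumes the ExecutionObjects only borrow the raw pointers they are given.

.. code-block:: c++

    #include <cstddef>
    #include <cstdlib>
    #include <memory>
    #include <new>
    #include <vector>

    // Deleter that releases memory obtained from malloc.
    struct FreeDeleter
    {
        void operator()(void* p) const { std::free(p); }
    };

    // Hypothetical owning handle for a malloc'ed frame buffer.
    using MallocBuffer = std::unique_ptr<void, FreeDeleter>;

    // Hypothetical helper: allocate `count` buffers of `frame_sz` bytes each.
    // Throws std::bad_alloc if an allocation fails; buffers allocated so far
    // are freed automatically when the vector is destroyed.
    std::vector<MallocBuffer> AllocateBuffers(std::size_t count,
                                              std::size_t frame_sz)
    {
        std::vector<MallocBuffer> buffers;
        buffers.reserve(count);

        for (std::size_t i = 0; i < count; i++)
        {
            void* p = std::malloc(frame_sz);
            if (p == nullptr)
                throw std::bad_alloc();
            buffers.emplace_back(p);
        }

        return buffers;
    }

In the loop above, ``buffers[i].get()`` would take the place of the raw ``malloc`` result passed to ``ArgInfo``. Two buffers (input and output) are needed per ExecutionObject, and the vector must outlive the Executor's use of them.
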
Step 5
======

Run the network on each input frame. Frames are distributed across the available ExecutionObjects in a pipelined manner, and the loop runs ``num_eos`` additional iterations to flush the pipeline (epilogue).

.. code-block:: c++

    for (int frame_idx = 0; frame_idx < configuration.numFrames + num_eos; frame_idx++)
    {
        ExecutionObject* eo = execution_objects[frame_idx % num_eos].get();

        // Wait for previous frame on the same eo to finish processing
        if (eo->ProcessFrameWait())
            WriteFrame(*eo, output_data_file);

        // Read a frame and start processing it with current eo
        if (ReadFrame(*eo, frame_idx, configuration, input_data_file))
            eo->ProcessFrameStartAsync();
    }

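
To see why the loop runs ``num_eos`` extra iterations, it helps to trace which frame each ExecutionObject holds at every iteration. The following self-contained sketch simulates the schedule without any TIDL calls. ``FakeEO`` is a hypothetical stand-in for an ExecutionObject that only remembers the frame it was last given, and ``SimulatePipeline`` is a hypothetical driver that mirrors the loop structure above. The sketch assumes, as the ``if`` in the loop suggests, that waiting on an object with no frame in flight reports false.

.. code-block:: c++

    #include <vector>

    // Hypothetical stand-in for an ExecutionObject: tracks the frame in flight.
    struct FakeEO
    {
        int  frame  = -1;
        bool active = false;

        // Mirrors ProcessFrameWait: true only if a frame was in flight.
        bool Wait()
        {
            bool had_frame = active;
            active = false;
            return had_frame;
        }

        // Mirrors ProcessFrameStartAsync: begin work on frame f.
        void Start(int f)
        {
            frame  = f;
            active = true;
        }
    };

    // Returns the frame indices in the order they would be written out.
    std::vector<int> SimulatePipeline(int num_frames, int num_eos)
    {
        std::vector<FakeEO> eos(num_eos);
        std::vector<int> completed;

        for (int frame_idx = 0; frame_idx < num_frames + num_eos; frame_idx++)
        {
            FakeEO& eo = eos[frame_idx % num_eos];

            // Collect the result of the frame this object processed earlier
            if (eo.Wait())
                completed.push_back(eo.frame);

            // During the epilogue there are no frames left to start
            if (frame_idx < num_frames)
                eo.Start(frame_idx);
        }

        return completed;
    }

For ``SimulatePipeline(5, 2)``, the first two iterations only start frames, the last two only collect results, and all five frames are written out in order. Without the extra ``num_eos`` iterations, the frames still in flight when the input runs out would never be collected.
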
Putting it together
===================

The code snippet :ref:`tidl_main` illustrates using the API to offload a network.

.. literalinclude:: ../../examples/test/main.cpp
    :name: tidl_main
    :caption: examples/test/main.cpp
    :lines: 155-189,208-213,215-220
    :linenos:

For a complete example of using the API, refer to ``examples/test/main.cpp`` on the EVM filesystem.