parent 1: c6aeb3e
author: Ajay Jayaraj <ajayj@ti.com>, Thu, 15 Nov 2018 22:33:55 +0000 (16:33 -0600)
committer: Ajay Jayaraj <ajayj@ti.com>, Thu, 15 Nov 2018 22:33:55 +0000 (16:33 -0600)
(MCT-1086)
docs/source/changelog.rst
docs/source/example.rst
docs/source/using_api.rst
index 8eb92315572913b8a029485bb199b8df2425249d..fbc18ab4b56cfe12ae0d64aaf276aa64beebcf47 100644 (file)
These methods were replaced by a timestamp based approach because they were
no longer accurate with multiple ExecutionObject contexts and pipelining.
+ Refer to :ref:`execution-graph` for details.
1.1.0 [Processor Linux SDK 5.1]
===============================
index 28486ccb4d4d33690cdfcc744313504d2f7408e6..608c728d4ea94f78623ca9a6e9460b397f21414e 100644 (file)
--- a/docs/source/example.rst
+++ b/docs/source/example.rst
- OpenCV used to read input image from file or capture from camera.
* - classification
- Classification example, called from the Matrix GUI.
- -
+ - EVE or C66x
- OpenCV used to read input image from file or capture from camera.
+ * - mcbench
+ - Used to benchmark supported networks. Refer to ``mcbench/scripts`` for command-line options.
+ - EVE or C66x
+ - Pre-processed image read from file.
* - layer_output
- Illustrates using TIDL APIs to access output buffers of intermediate :term:`layers<Layer>` in the network.
- EVE or C66x
index 724824b5fc5305d813d26e8d978a0e7f28b2d2a7..47c0d9c6db38636089083b1d80d5d7c2e3f721f0 100644 (file)
* Create :term:`Executor` objects - one to manage overall execution on the EVEs, the other for C66x DSPs.
* Use the :term:`Execution Objects<EO>` (EO) created by the Executor to process :term:`frames<Frame>`. There are two approaches to processing frames using Execution Objects:
- #. Each EO processes a single frame.
- #. Split processing frame across multiple EOs using an :term:`ExecutionObjectPipeline`.
+.. list-table:: TIDL API Use Cases
+ :header-rows: 1
+ :widths: 30 50 20
+
+ * - Use Case
+ - Application/Network characteristic
+ - Examples
+ * - Each :term:`EO` processes a single frame. The network consists of a single :term:`Layer Group` and the entire Layer Group is processed on a single EO. See :ref:`use-case-1`.
+ -
+
+ * The latency to run the network on a single frame meets application requirements
+ * If frame processing time is fairly similar across EVE and C66x, multiple EOs can be used to increase throughput
+ * The latency to read an input frame can be hidden by using 2 :term:`EOPs<EOP>` with the same :term:`EO`.
+ - one_eo_per_frame, imagenet, segmentation
+
+ * - Split processing a single frame across multiple :term:`EOs<EO>` using an :term:`ExecutionObjectPipeline`. The network consists of 2 or more :term:`Layer Groups<Layer Group>`. See :ref:`use-case-2` and :ref:`use-case-3`.
+ -
+
+ * Network execution must be split across EVE and C66x to meet single frame latency requirements.
+ * Split processing a single frame across multiple :term:`EOs<EO>` using an :term:`ExecutionObjectPipeline`.
+ * Splitting can lower the overall latency because memory bound :term:`layers<Layer>` tend to run faster on the C66x.
+ - two_eo_per_frame, two_eo_per_frame_opt, ssd_multibox
+
Refer Section :ref:`api-documentation` for API documentation.
@@ -43,10 +64,10 @@ In this approach, the :term:`network<Network>` is set up as a single :term:`Laye
.. literalinclude:: ../../examples/one_eo_per_frame/main.cpp
:language: c++
- :lines: 92-94
+ :lines: 97-99
:linenos:
-#. Create Executor on C66x and EVE. In this example, all available C66x and EVE cores are used (lines 1-2 and :ref:`CreateExecutor`).
+#. Create Executor on C66x and EVE. In this example, all available C66x and EVE cores are used (lines 2-3 and :ref:`CreateExecutor`).
#. Create a vector of available ExecutionObjects from both Executors (lines 7-8 and :ref:`CollectEOs`).
#. Allocate input and output buffers for each ExecutionObject (:ref:`AllocateMemory`)
#. Run the network on each input frame. The frames are processed with available execution objects in a pipelined manner. The additional num_eos iterations are required to flush the pipeline (lines 15-26).
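The pipelined loop described in the last step can be illustrated with a minimal, self-contained sketch. ``FakeEO`` and ``RunPipelined`` are hypothetical stand-ins written for this illustration; they only borrow the start/wait naming and are not the TIDL classes, so treat the sketch as an outline of the control flow rather than the code in ``main.cpp``.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for an execution object. It is NOT the TIDL class;
// it only remembers which frame is "in flight".
struct FakeEO {
    int pending_frame = -1;  // -1 means the EO is idle

    // Returns true (and reports the frame) if a started frame has finished.
    bool ProcessFrameWait(int& done_frame) {
        if (pending_frame < 0) return false;
        done_frame = pending_frame;
        pending_frame = -1;
        return true;
    }

    // Marks a frame as started on this EO.
    void ProcessFrameStartAsync(int frame) { pending_frame = frame; }
};

// Feeds num_frames frames to num_eos EOs in round-robin order and returns
// the frame indices in the order they complete.
std::vector<int> RunPipelined(int num_frames, int num_eos) {
    assert(num_eos > 0);
    std::vector<FakeEO> eos(num_eos);
    std::vector<int> completed;

    // The extra num_eos iterations start no new work; they only collect
    // the frames that are still in flight, which flushes the pipeline.
    for (int idx = 0; idx < num_frames + num_eos; ++idx) {
        FakeEO& eo = eos[idx % num_eos];
        int done = -1;
        if (eo.ProcessFrameWait(done)) completed.push_back(done);
        if (idx < num_frames) eo.ProcessFrameStartAsync(idx);
    }
    return completed;
}
```

Without the trailing ``num_eos`` iterations, the last frame started on each EO would never be waited on, so its output would be lost.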
@@ -56,26 +77,26 @@ In this approach, the :term:`network<Network>` is set up as a single :term:`Laye
.. literalinclude:: ../../examples/one_eo_per_frame/main.cpp
:language: c++
- :lines: 108-127,129,133-139
+ :lines: 113-132,134,138-142
:linenos:
.. literalinclude:: ../../examples/one_eo_per_frame/main.cpp
:language: c++
- :lines: 154-163
+ :lines: 159-168
:linenos:
:caption: CreateExecutor
:name: CreateExecutor
.. literalinclude:: ../../examples/one_eo_per_frame/main.cpp
:language: c++
- :lines: 166-172
+ :lines: 171-177
:linenos:
:caption: CollectEOs
:name: CollectEOs
.. literalinclude:: ../../examples/common/utils.cpp
:language: c++
- :lines: 197-212
+ :lines: 176-191
:linenos:
:caption: AllocateMemory
:name: AllocateMemory
The complete example is available at ``/usr/share/ti/tidl/examples/one_eo_per_frame/main.cpp``.
.. note::
- The double buffering technique described in :ref:`use-case-3` can be used with a single :term:`ExecutionObject` to overlap reading a frame with the processing of the previous frame.
+ The double buffering technique described in :ref:`use-case-3` can be used with a single :term:`ExecutionObject` to overlap reading an input frame with the processing of the previous input frame. Refer to ``examples/imagenet/main.cpp``.
.. _use-case-2:
@@ -121,7 +142,7 @@ The network consists of 2 :term:`Layer Groups<Layer Group>`. :term:`Execution Ob
.. literalinclude:: ../../examples/one_eo_per_frame/main.cpp
:language: c++
- :lines: 92-94
+ :lines: 97-99
:linenos:
#. Update the default layer group index assignment. Pooling (layer 12), InnerProduct (layer 13) and SoftMax (layer 14) are added to a second layer group. Refer :ref:`layer-group-override` for details.
@@ -143,12 +164,12 @@ The network consists of 2 :term:`Layer Groups<Layer Group>`. :term:`Execution Ob
.. literalinclude:: ../../examples/two_eo_per_frame/main.cpp
:language: c++
- :lines: 110-138,140,147-153
+ :lines: 132-139,144-149
:linenos:
.. literalinclude:: ../../examples/common/utils.cpp
:language: c++
- :lines: 225-240
+ :lines: 204-219
:linenos:
:caption: AllocateMemory
:name: AllocateMemory2
@@ -176,15 +197,18 @@ The only change in the code compared to :ref:`use-case-2` is to create an additi
.. literalinclude:: ../../examples/two_eo_per_frame_opt/main.cpp
:language: c++
- :lines: 117-129
+ :lines: 122-134
:linenos:
:caption: Setting up EOPs for double buffering
:name: test-code
.. note::
EOP1 in :numref:`frame-across-eos-opt` -> EOPs[0] in :numref:`test-code`.
+
EOP2 in :numref:`frame-across-eos-opt` -> EOPs[1] in :numref:`test-code`.
+
EOP3 in :numref:`frame-across-eos-opt` -> EOPs[2] in :numref:`test-code`.
+
EOP4 in :numref:`frame-across-eos-opt` -> EOPs[3] in :numref:`test-code`.
The complete example is available at ``/usr/share/ti/tidl/examples/two_eo_per_frame_opt/main.cpp``.