parent: 686f055
author    Ajay Jayaraj <ajayj@ti.com>  Thu, 10 May 2018 19:22:13 +0000 (14:22 -0500)
committer Ajay Jayaraj <ajayj@ti.com>  Mon, 14 May 2018 16:32:50 +0000 (11:32 -0500)
(MCT-981)
docs/Makefile
docs/source/api.rst
docs/source/conf.py
docs/source/example.rst
docs/source/images/tidl-api.png [new file with mode: 0755]
docs/source/images/tidl-development-flow.png [new file with mode: 0755]
docs/source/images/tinn_api.png [deleted file]
docs/source/index.rst
docs/source/intro.rst
docs/source/using_api.rst [new file with mode: 0644]
docs/source/viewer.rst [new file with mode: 0644]
diff --git a/docs/Makefile b/docs/Makefile
index a8ae35e956e853ed9d4b23771ac292408e1e80a4..028b747b0133c073e94980532d74189c73d82adb 100644 (file)
--- a/docs/Makefile
+++ b/docs/Makefile
clean:
rm -rf $(BUILDDIR)/*
-html:
+html: api-xml
$(SPHINXBUILD) -b html $(ALLSPHINXOPTS) $(BUILDDIR)
@echo
@echo "Build finished. The HTML pages are in $(BUILDDIR)."
artifacts:
$(MAKE) -c artifacts
+
+api-xml:
+ (cd ../tinn_api/doxygen; doxygen)
diff --git a/docs/source/api.rst b/docs/source/api.rst
index e6acc37e6fb9dd3c20cf0e265913652330bebadb..d27202780ddb91a5c5889cc5f2b84597fa4b1bdc 100644 (file)
--- a/docs/source/api.rst
+++ b/docs/source/api.rst
******************
-TINN API Reference
+TIDL API Reference
******************
.. doxygennamespace:: tinn
- :project: TINN
+ :project: TIDL
:members:
diff --git a/docs/source/conf.py b/docs/source/conf.py
index 85379f73e500fee5b222e353795cc2b6f266b1ac..ab901cf49e3cd72c7c8ce533c54530001d76e83a 100644 (file)
--- a/docs/source/conf.py
+++ b/docs/source/conf.py
master_doc = 'index'
# General information about the project.
-project = u'TI Neural Network API'
+project = u'TI Deep Learning'
copyright = u'2018, Texas Instruments Incorporated'
# The version info for the project you're documenting, acts as replacement for
# The name for this set of Sphinx documents. If None, it defaults to
# "<project> v<release> documentation".
-html_title = 'TINN API User\'s Guide'
+html_title = 'TIDL API User\'s Guide'
# A shorter title for the navigation bar. Default is the same as html_title.
#html_short_title = None
html_search_scorer = ''
# Output file base name for HTML help builder.
-htmlhelp_basename = 'TINNDoc'
+htmlhelp_basename = 'TIDLDoc'
# -- Options for LaTeX output ---------------------------------------------
# (source start file, target name, title,
# author, documentclass [howto, manual, or own class]).
latex_documents = [
- (master_doc, 'TINN.tex', u'TI Neural Network API Documentation',
+ (master_doc, 'TIDL.tex', u'TI Deep Learning API Documentation',
u'Texas Instruments Incorporated', u'manual'),
]
# -- Breathe extension to integrate doxygen output --
breathe_projects = {
-"TINN":"../../tinn_api/doxygen/xml/",
+"TIDL":"../../tinn_api/doxygen/xml/",
}
diff --git a/docs/source/example.rst b/docs/source/example.rst
index 09711a74c9b24e44f15a247a07b96a5e031df6a9..316a6ae316f3cf6e5c1f7d8ac5e6b28aa73b364c 100644 (file)
--- a/docs/source/example.rst
+++ b/docs/source/example.rst
-******************
-Using the TINN API
-******************
+********
+Examples
+********
-This example illustrates using the TI Neural Network (TINN) API to offload deep learning network processing from a Linux application to the C66x DSPs or DLAs on AM57x devices.
+Imagenet
+--------
-Step 1
-======
+Segmentation
+------------
-Determine if there are any TINN capable devices on the AM57x SoC:
-
-.. code-block:: c++
-
- uint32_t num_dla = Executor::GetNumDevices(DeviceType::DLA);
- uint32_t num_dsp = Executor::GetNumDevices(DeviceType::DSP);
-
-Step 2
-======
-Create a Configuration object by reading it from a file or by initializing it directly. The example below parses a configuration file and initializes the Configuration object. See ``tidl/testvecs/config/infer`` for examples of configuration files.
-
-.. code::
-
- Configuration configuration;
- bool status = configuration.ReadFromFile(config_file);
-
-.. note::
- Refer TINN Translation Tool documentation for creating TINN network and parameter binary files from TensorFlow and Caffe.
-
-Step 3
-======
-Create an Executor with the approriate device type, set of devices and a configuration. In the snippet below, an Executor is created on 2 DLAs.
-
-.. code-block:: c++
-
- DeviceIds ids = {DeviceId::ID0, DeviceId::ID1};
- Executor executor(DeviceType::DLA, ids, configuration);
-
-Step 4
-======
-Get the set of available ExecutionObjects and allocate input and output buffers for each ExecutionObject.
-
-.. code-block:: c++
-
- const ExecutionObjects& execution_objects = executor.GetExecutionObjects();
- int num_eos = execution_objects.size();
-
- // Allocate input and output buffers for each execution object
- std::vector<void *> buffers;
- for (auto &eo : execution_objects)
- {
- ArgInfo in = { ArgInfo(malloc(frame_sz), frame_sz)};
- ArgInfo out = { ArgInfo(malloc(frame_sz), frame_sz)};
- eo->SetInputOutputBuffer(in, out);
-
- buffers.push_back(in.ptr());
- buffers.push_back(out.ptr());
- }
-
-
-
-Step 5
-======
-Run the network on each input frame. The frames are processed with available execution objects in a pipelined manner with additional num_eos iterations to flush the pipeline (epilogue).
-
-.. code-block:: c++
-
- for (int frame_idx = 0; frame_idx < configuration.numFrames + num_eos; frame_idx++)
- {
- ExecutionObject* eo = execution_objects[frame_idx % num_eos].get();
-
- // Wait for previous frame on the same eo to finish processing
- if (eo->ProcessFrameWait())
- WriteFrame(*eo, output_data_file);
-
- // Read a frame and start processing it with current eo
- if (ReadFrame(*eo, frame_idx, configuration, input_data_file))
- eo->ProcessFrameStartAsync();
- }
-
-
-
-Putting it together
-===================
-The code snippet :ref:`tidl_main` illustrates using the API to offload a network.
-
-.. literalinclude:: ../../examples/test/main.cpp
- :name: tidl_main
- :caption: examples/test/main.cpp
- :lines: 161-195,213-218,220-225
- :linenos:
-
-For a complete example of using the API, refer ``tinn/examples/opencl/tidl/main.cpp`` on the EVM filesystem.
+SSD
+---
diff --git a/docs/source/images/tidl-api.png b/docs/source/images/tidl-api.png
new file mode 100755 (executable)
index 0000000..26a8a7c
Binary files /dev/null and b/docs/source/images/tidl-api.png differ
diff --git a/docs/source/images/tidl-development-flow.png b/docs/source/images/tidl-development-flow.png
new file mode 100755 (executable)
index 0000000..051767b
Binary files /dev/null and b/docs/source/images/tidl-development-flow.png differ
diff --git a/docs/source/images/tinn_api.png b/docs/source/images/tinn_api.png
deleted file mode 100644 (file)
index 04520bf..0000000
Binary files a/docs/source/images/tinn_api.png and /dev/null differ
diff --git a/docs/source/index.rst b/docs/source/index.rst
index 1e49db60dfaef415a4fdeb53736deac860dcb088..ce05adca18f8f9415f807653bbb7a2f2cc15466a 100644 (file)
--- a/docs/source/index.rst
+++ b/docs/source/index.rst
-.. _tinn-home:
+.. _tidl-home:
##################################
-TI Neural Network API User's Guide
+TI Deep Learning API User's Guide
##################################
.. toctree::
:maxdepth: 3
intro
+ using_api
+ viewer
example
api
notice
disclaimer
-.. image:: images/platform_red.png
+.. .. image:: images/platform_red.png
diff --git a/docs/source/intro.rst b/docs/source/intro.rst
index 46a775d68409710c4a52d2939fe6ff22fbdc468f..c973aa71286c31db1e8d583dc4787c4407db2dd4 100644 (file)
--- a/docs/source/intro.rst
+++ b/docs/source/intro.rst
Introduction
************
-The TI Neural Network (TINN) API is a C++ API to abstract lower level OpenCL host APIs for custom devices. The TINN API enables AM57x applications to leverage DLAs or DSPs for deep learning. The API:
+The TI Deep Learning (TIDL) API brings deep learning to the edge and enables applications to leverage TI's proprietary CNN/DNN implementation on Deep Learning Accelerators (DLAs) and C66x DSPs. TIDL will initially target vision/2D use cases.
-* Enables easy integration of TINN into other frameworks such as OpenCV
-* Is low overhead - OpenCL APIs account for ~1.5% of overall per frame processing time (224x224 frame with 3 channels)
-* Provides an example of using the OpenCL DLAs custom device kernels
-* Provides a common abstraction for running networks on DLAs or C66x DSPs
+TIDL leverages the following techniques to optimize the performance of CNNs/DNNs on TI's AM57x SoCs. For details on these techniques, refer to [TODO: add link]
-OpenCL v1.2 added support for custom devices. The OpenCL runtime for a custom device implements the standard OpenCL host API functions. However, a custom device does not support OpenCL-C programs. Host programs can invoke a fixed set of kernels built into the runtime. The DLAs on AM57x SoCs are modeled as OpenCL custom devices with a fixed set of built-in kernels.
+* Complexity reductions
+* Sparse convolutions
+* Quantization to fixed point (8-bit and 16-bit)
+
+TIDL consists of the following components:
+
+* A C++ API on ARM/Linux
+* Translator tool to convert TensorFlow/Caffe networks to TIDL network format
+* A graphviz based visualizer for TIDL network graphs
+
+.. note:: Only certain Caffe/TensorFlow models can be translated to TIDL. There are constraints on supported layers that must be met. This is not a universal translator.
-The figure below describes the relationship between TINN APIs, the user's application and OpenCL host APIs.
+Key Features
+------------
+Ease of use
++++++++++++
+* Easily integrate TIDL APIs into other frameworks such as OpenCV
+* Provides a common host abstraction for user applications across multiple compute engines (DLAs and C66x DSPs)
-.. figure:: images/tinn_api.png
+Low overhead
++++++++++++++
+The execution time of the TIDL APIs on the host is a small percentage of the overall per-frame execution time. For example, with the jseg21 network at a 1024x512x3 frame size, the APIs account for ~1.5% of the overall per-frame processing time of 320ms (roughly 4.8ms on the host).
-The API consistes of 3 classes with simple user interfaces:
+Software Architecture
+---------------------
+The API consists of 3 classes with simple user interfaces:
* Configuration
* Executor
* ExecutionObject
-.. note::
- DLA: TI Deep Learning Accelerator, also known as EVE.
+The figure below shows the TIDL API software architecture.
+
+.. image:: images/tidl-api.png
+ :align: center
+
+
+[TODO: Add text]
+
+OpenCL v1.2 added support for custom devices. The OpenCL runtime for a custom device implements the standard OpenCL host API functions. However, a custom device does not support OpenCL-C programs. Host programs can invoke a fixed set of kernels built into the runtime. The DLAs on AM57x SoCs are modeled as OpenCL custom devices with a fixed set of built-in kernels.
+
+Supported Layers
+----------------
+
+* Convolution
+* Pooling (Average, Max)
+* ReLU (including PReLU and ReLU6)
+* ElementWise (Add, Max, Product)
+* Inner Product (Fully Connected)
+* SoftMax
+* Bias
+* Deconvolution
+* Concatenate
+* ArgMax
+* Scale
+* Batch Normalization
+* Crop
+* Slice
+* Flatten
+* Split
+
+.. note:: There are constraints on usage of these layers. See documentation for details. [TODO: add link]
+
+Development Flow
+----------------
+
+.. image:: images/tidl-development-flow.png
+ :align: center
diff --git a/docs/source/using_api.rst b/docs/source/using_api.rst
--- /dev/null
@@ -0,0 +1,92 @@
+******************
+Using the TIDL API
+******************
+
+This section illustrates using the TIDL API to offload deep learning network processing from a Linux application to the C66x DSPs or DLAs on AM57x devices.
+
+Step 1
+======
+
+Determine if there are any TIDL capable devices on the AM57x SoC:
+
+.. code-block:: c++
+
+ uint32_t num_dla = Executor::GetNumDevices(DeviceType::DLA);
+ uint32_t num_dsp = Executor::GetNumDevices(DeviceType::DSP);
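The step above only queries device counts; a small helper (hypothetical, not part of the TIDL API) can turn those counts into a device choice, preferring DLAs and falling back to the C66x DSPs:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical selection helper: the counts would come from
// Executor::GetNumDevices() as shown above. DeviceKind is a stand-in
// enum, distinct from the API's DeviceType.
enum class DeviceKind { DLA, DSP, None };

DeviceKind PickDevice(uint32_t num_dla, uint32_t num_dsp)
{
    if (num_dla > 0) return DeviceKind::DLA;  // prefer DLAs when present
    if (num_dsp > 0) return DeviceKind::DSP;  // otherwise fall back to DSPs
    return DeviceKind::None;                  // no TIDL-capable device
}
```

An application could then construct its Executor with the corresponding real DeviceType value.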
+
+Step 2
+======
+Create a Configuration object by reading it from a file or by initializing it directly. The example below parses a configuration file and initializes the Configuration object. See ``tidl/testvecs/config/infer`` for examples of configuration files.
+
+.. code::
+
+ Configuration configuration;
+ bool status = configuration.ReadFromFile(config_file);
+
+.. note::
+ Refer to the TIDL Translation Tool documentation for creating TIDL network and parameter binary files from TensorFlow and Caffe.
+
+Step 3
+======
+Create an Executor with the appropriate device type, a set of devices, and a configuration. In the snippet below, an Executor is created on two DLAs.
+
+.. code-block:: c++
+
+ DeviceIds ids = {DeviceId::ID0, DeviceId::ID1};
+ Executor executor(DeviceType::DLA, ids, configuration);
+
+Step 4
+======
+Get the set of available ExecutionObjects and allocate input and output buffers for each ExecutionObject.
+
+.. code-block:: c++
+
+ const ExecutionObjects& execution_objects = executor.GetExecutionObjects();
+ int num_eos = execution_objects.size();
+
+ // Allocate input and output buffers for each execution object
+ std::vector<void *> buffers;
+ for (auto &eo : execution_objects)
+ {
+ ArgInfo in = { ArgInfo(malloc(frame_sz), frame_sz)};
+ ArgInfo out = { ArgInfo(malloc(frame_sz), frame_sz)};
+ eo->SetInputOutputBuffer(in, out);
+
+ buffers.push_back(in.ptr());
+ buffers.push_back(out.ptr());
+ }
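The snippet above collects the raw ``malloc``'d pointers in ``buffers`` so they can be released once processing completes. A minimal cleanup sketch (assuming the application retains ownership of these buffers, as the snippet suggests) would be:

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

// Release the input/output buffers gathered during Step 4. This assumes
// no ExecutionObject is still processing a frame that uses them.
void FreeBuffers(std::vector<void*>& buffers)
{
    for (void* p : buffers)
        free(p);        // each entry was allocated with malloc(frame_sz)
    buffers.clear();
}
```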
+
+
+
+Step 5
+======
+Run the network on each input frame. The frames are processed by the available execution objects in a pipelined manner, with an additional num_eos iterations to flush the pipeline (epilogue).
+
+.. code-block:: c++
+
+ for (int frame_idx = 0; frame_idx < configuration.numFrames + num_eos; frame_idx++)
+ {
+ ExecutionObject* eo = execution_objects[frame_idx % num_eos].get();
+
+ // Wait for previous frame on the same eo to finish processing
+ if (eo->ProcessFrameWait())
+ WriteFrame(*eo, output_data_file);
+
+ // Read a frame and start processing it with current eo
+ if (ReadFrame(*eo, frame_idx, configuration, input_data_file))
+ eo->ProcessFrameStartAsync();
+ }
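The round-robin scheduling in the loop above can be checked in isolation. The sketch below replaces the TIDL calls with counters (a stand-in, not the real API) to show that each of the ``num_eos`` execution objects handles every ``num_eos``-th frame, with the extra iterations serving only to drain frames still in flight:

```cpp
#include <cassert>
#include <vector>

// Simulate the pipelined loop above: frames are assigned round-robin to
// num_eos execution objects; indices past num_frames model the epilogue
// iterations, where ReadFrame() would return false and no new frame starts.
std::vector<int> SimulatePipeline(int num_frames, int num_eos)
{
    std::vector<int> frames_per_eo(num_eos, 0);
    for (int frame_idx = 0; frame_idx < num_frames + num_eos; frame_idx++)
    {
        int eo = frame_idx % num_eos;   // same selection as the real loop
        if (frame_idx < num_frames)     // only real frames start processing
            frames_per_eo[eo]++;
    }
    return frames_per_eo;
}
```

For example, with 10 frames and 2 execution objects, each stand-in object processes 5 frames.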
+
+
+
+Putting it together
+===================
+The code snippet :ref:`tidl_main` illustrates using the API to offload a network.
+
+.. literalinclude:: ../../examples/test/main.cpp
+ :name: tidl_main
+ :caption: examples/test/main.cpp
+ :lines: 161-195,213-218,220-225
+ :linenos:
+
+For a complete example of using the API, refer to ``tinn/examples/opencl/tidl/main.cpp`` on the EVM filesystem.
diff --git a/docs/source/viewer.rst b/docs/source/viewer.rst
--- /dev/null
+++ b/docs/source/viewer.rst
@@ -0,0 +1,3 @@
+**************
+Network Viewer
+**************