PLSDK-2986: Calibrate TIDL models for TF with correct raw image (preproc type 2).
Start hotfix v01.03.02
Merge branch 'release/v01.03.01'
TIDL-API 1.3.1 for Processor SDK 6.1
More example sizes adjustment with new TIDL lib
- MCT-1217
Update changelog for version 1.3.1
Update TIDL-API manifest for version 1.3.1
Update squeeze net reference output
- 6 out of 1000 outputs changed with the newly imported network
PLSDK-2986: update TIDL models for mobilenetV1, inceptionNetV1, squeezeNetV1.
Adjust example heap sizes with new TIDL library
- The latest TIDL library increased memory requirement slightly.
Adjust heap sizes in the examples accordingly.
- MCT-1217
Print out imagenet object index in imagenet example
- The object index is helpful information, in addition to text label
that has already been printed out.
- Offset TensorFlow model output by 1 to remove background index
- MCT-1216
Control heap size and alloc opt using env vars
- TIDL_PARAM_HEAP_SIZE_EVE, TIDL_PARAM_HEAP_SIZE_DSP,
TIDL_NETWORK_HEAP_SIZE_EVE, TIDL_NETWORK_HEAP_SIZE_DSP,
TIDL_EXTMEM_ALLOC_OPT_EVE, TIDL_EXTMEM_ALLOC_OPT_DSP
are provided to override the heap sizes and heap allocation optimization
level (1 or 2) that are specified by default or by the application.
- MCT-1215
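A minimal sketch of how such an override might be read on the host. The
helper name HeapSizeFromEnv is hypothetical and not part of TIDL-API; only
the variable names come from this commit. It assumes the value is a plain
decimal byte count, which the commit message does not state.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>

// Hypothetical helper (not part of TIDL-API): return the value of the
// environment variable `name` parsed as a decimal byte count, or `fallback`
// when the variable is unset, empty, or not a plain non-negative integer.
std::size_t HeapSizeFromEnv(const char* name, std::size_t fallback)
{
    const char* text = std::getenv(name);
    if (text == nullptr ||
        !std::isdigit(static_cast<unsigned char>(text[0])))
        return fallback;

    char* end = nullptr;
    const unsigned long long value = std::strtoull(text, &end, 10);
    if (*end != '\0') return fallback;   // trailing characters: ignore it
    return static_cast<std::size_t>(value);
}
```

An application could call HeapSizeFromEnv("TIDL_PARAM_HEAP_SIZE_EVE", size)
with its configured size as the fallback, so that an unset or malformed
variable leaves the configured value untouched.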
Change develop version to 1.3.1 for patch release
Update network binary in TIDL-API to new format
- New network format corresponds to the network data structure update,
where strideOffsetMethod field moved from sTIDL_Network_t to sTIDL_Layer_t.
Old format is 483364 bytes, new format is 484384 bytes.
- Relates to: commit 49401e64374a4f0999479245dcd01eab38bec304, MCT-1136
- MCT-1203
Add ssd_multibox_fullnet example
- To demonstrate running jdetnet/jdetnet_voc on a single core,
without partitioning the network. This is useful for situations
where the SoC has only C66x cores and no EVE cores.
- MCT-1202
Dump dataQ/minValue/maxValue for TIDL trace
- MCT-1201
Fix g++ 8.3.0 compilation error
- Fix a syntax allowed in g++ 7.2.1 (PSDK5.3) but not in 8.3.0 (PSDK6.0)
- Make should report error from loop
- MCT-1199
Add imagenet python example
- Show how to interface with EO/EOP's input/output buffer in python.
- Show how to use OpenCV to read and transform image,
and how to process imagenet's output data.
- Fix EOP construction in examples
- MCT-1197
Fix unique_ptr that holds an allocated array
- Customer reported this problem. unique_ptr that holds an allocated
array was created as "unique_ptr<char>", which will call "delete"
at destruction. However, the array was created with "new char[]".
The proper way should be "unique_ptr<char[]>", so that "delete []"
will be called at destruction.
- One minor trace message update so that we know which type of device
is being dispatched to.
- MCT-1196
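The mismatch described above can be illustrated with a short sketch. The
function name MakeBuffer is hypothetical and is not taken from the TIDL-API
sources; the point is only the pairing of "new char[]" with the array form
of unique_ptr.

```cpp
#include <cstddef>
#include <cstring>
#include <memory>

// Allocate a zero-filled buffer owned by a unique_ptr whose deleter matches
// the allocation: unique_ptr<char[]> calls "delete []", which pairs with
// "new char[]". unique_ptr<char> would call plain "delete" instead, which
// is undefined behavior for memory obtained from "new char[]".
std::unique_ptr<char[]> MakeBuffer(std::size_t size)
{
    std::unique_ptr<char[]> buffer(new char[size]);
    std::memset(buffer.get(), 0, size);
    return buffer;
}
```

The array specialization also provides operator[], so callers can index the
buffer directly without going through get().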
Use DSP Built-in Kernels in TIDL-API
- Replace previously used kernel wrappers
- MCT-1143, MCT-1154
Merge tag 'v01.03.00' into develop
TIDL-API 01.03.00 for Processor SDK 5.3
Merge branch 'release/v01.03.00'
Update TIDL network data structure
- To be in sync with TIDL library and TIDL import utility
- strideOffsetMethod field moved from sTIDL_Network_t to sTIDL_Layer_t
- Add ReadNetworkBinary util that can read both network formats,
so that TIDL-API can be compatible with both old and new formats
- Update reference network output due to updated TIDL library
- MCT-1136
[segmentation] Add video clip autorewind
Signed-off-by: Djordje Senicic <x0157990@ti.com>
[segmentation] Add sample clip with traffic scenes (from pixabay)
Signed-off-by: Djordje Senicic <x0157990@ti.com>
Update documentation for TIDL-API 1.3.0
- MCT-1136
Clean up ssd_multibox changes
- PLSDK-2597
[ssd_multibox] Addressing review comments
Signed-off-by: Djordje Senicic <x0157990@ti.com>
Bump develop branch version to 1.4.0
PLSDK-2597
- SSD_Multibox: updated to include slider for run-time probability modification
- SSD_Multibox: skip grabbing frame input multiple times, as real-time would vary based on multicore configuration and network complexity
- SSD_Multibox: resize and central cropping added; instead of showing rectangles in original image, network input is presented
- Classification: Toydogs configuration added including models
Signed-off-by: Djordje Senicic <x0157990@ti.com>
Enable DSP out-of-order execution in TIDL-API
- MCT-1108
Enable MNIST example on DSP
- It turns out DSP implementation of InnerProduct layer in TIDL library
requires input size to be multiple of 8, because it is doing
aligned 8-byte loads.
- Original LeNet network used in the MNIST example has a second InnerProduct
layer of size 500, which is not a multiple of 8. Change the size to 504,
re-train the network, re-import into TIDL format. Now the MNIST example
works correctly on DSP as well.
- MCT-1105
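The size change from 500 to 504 is a round-up to the next multiple of 8. A
small sketch of that arithmetic follows; the helper name RoundUpToMultiple
is illustrative and does not come from the TIDL library.

```cpp
#include <cassert>
#include <cstddef>

// Round `size` up to the nearest multiple of `multiple` (which must be > 0).
constexpr std::size_t RoundUpToMultiple(std::size_t size, std::size_t multiple)
{
    return ((size + multiple - 1) / multiple) * multiple;
}

// The LeNet case from this commit: 500 is not a multiple of 8, 504 is.
static_assert(RoundUpToMultiple(500, 8) == 504, "500 rounds up to 504");
```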
Merge tag 'v01.02.02' into develop
Merge branch 'hotfix/v01.02.02'
Fix memory leak in classification example
- MCT-1101
classification: Modify configuration structure to runFullNet before calling constructor for Execution Object
Signed-off-by: Djordje Senicic <x0157990@ti.com>
Updated patch version to 2 (TIDL API 1.2.2)
Merge tag 'v01.02.01' into develop
Merge branch 'hotfix/v01.02.01'
tidl-viewer: Remove executor.h inclusion in utils.cpp
One of the files in the tidl-viewer build, utils.cpp, was updated to include
executor.h. This header in turn includes a file from OpenCL, custom.h.
The yocto build of tidl-viewer is a native recipe and hence cannot
include opencl recipes as a dependency to obtain custom.h.
This commit updates utils.cpp to remove the include of executor.h and
custom.h.
(MCT-1100)
Merge tag 'v01.02.00' into develop
Merge branch 'release/v01.02.00'
Add jdetnet_voc network and make it the default
- jdetnet_voc is trained with more object categories than original
jdetnet. Make jdetnet_voc the default in the ssd example. User can
still use command line options to run the original jdetnet network.
- MCT-1091
Update imagenet example with new softmax output
- MCT-1089
Added Python variant of mnist example
Also fix one_eo_per_frame.py to avoid creating an EVE executor if there are
no EVEs available.
(MCT-1088)
Add link in changelog for mnist example
- MCT-1083
Add MNIST LeNet network model and test input
- Constrained to EVE only for now.
- Add documentation for mnist example.
- MCT-1083
Add mnist example with low compute
- Show that TIDL API with multiple contexts and pipelined computation
offers low overhead for small networks as well.
- MCT-1083
Update reference output for unit tests
A defect fix in the softmax layer necessitated updates to the reference
output of networks using the softmax layer.
(MCT-1087)
Documentation - update 'Using the API' chapter
(MCT-1086)
Initialize EO::current_frame_idx_m in constructor
Initialize ExecutionObject::current_frame_idx_m array to 0 in the
ExecutionObject constructor to prevent out of range entries when
recording trace data.
In a pipelined processing loop, the application executes
ExecutionObject::ProcessFrameWait() on the first frame before it calls
ExecutionObject::ProcessFrameStartAsync. The side effect is that the
current_frame_idx_m is not initialized. This can result in negative
frame indices when writing trace data using ReportTrace or UpdateTrace
leading to memory errors.
Setting ExecutionObject::current_frame_idx_m to 0 in the constructor
avoids this scenario.
(MCT-1085)
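A simplified sketch of the fix pattern, not the actual ExecutionObject code:
the class name FrameIndexTracker, its methods, and the number of contexts
(4) are all assumptions made for illustration. Only the member name
current_frame_idx_m comes from the commit.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the trace bookkeeping in ExecutionObject.
class FrameIndexTracker
{
  public:
    static constexpr std::size_t kNumContexts = 4;   // assumed value

    // Value-initialize the array so every entry starts at 0, even when
    // IndexFor() is called before the first Start().
    FrameIndexTracker() : current_frame_idx_m{} {}

    // Record the frame index being processed in the given context.
    void Start(std::size_t context, int frame_idx)
    {
        assert(context < kNumContexts);
        current_frame_idx_m[context] = frame_idx;
    }

    // Frame index to use when writing trace data for the given context.
    int IndexFor(std::size_t context) const
    {
        assert(context < kNumContexts);
        return current_frame_idx_m[context];
    }

  private:
    std::array<int, kNumContexts> current_frame_idx_m;
};
```

Without the empty-brace initializer the array of int would hold
indeterminate values until Start() is called, which matches the failure
mode described in the message.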
Updated version on develop to 1.3.0
Updated parameter description in doxygen comments
(MCT-1084)
Build Python bindings library by default
(MCT-1069)
Updated manifest for v1.2.0
(MCT-1060)
Add option to specify object classes list file
- so that user can specify a different object classes list file
without re-compiling the application.
- MCT-1081
mcbench: image preprocessing, handle layergroups=1
- Add image preprocessing for types 1 and 2.
- if layer groups is 1, force all layers to be in the same group
(MCT-1075)
classification: Support different network models
jacinto11, mobilenet and inceptionnet models can be used with this example.
infer: Add configuration files for inceptionnet and mobilenet that can run as two layer groups
Signed-off-by: Djordje Senicic <x0157990@ti.com>
tidl_models: Add mobilenet and inceptionnet models, trained on ImageNet
Signed-off-by: Djordje Senicic <x0157990@ti.com>
mcbench: Multicore benchmark with minimal overhead
- Add required models, input test vectors and platform specific scripts
- Add inference configuration files for multicore benchmarking
- Rename input files to indicate multiple frames and add more inference
configurations, covered in regression scripts
(MCT-1075)
examples: Add layers group command line parameter
Signed-off-by: Djordje Senicic <x0157990@ti.com>
Updated Python bindings to reflect API changes
(MCT-1080)
Add contextSize: hide OCL_TIDL_CACHE_ALIGN on host
- MCT-1059
Added graph viewer for TIDL API timestamp data
1. Created a python matplotlib based utility for viewing timestamp data
generated from the TIDL API (viewer/execution_graph.py)
2. Minor updates to API internals to add ExecutionObject type, index to
timestamp output
(MCT-1073)
Simplify API for multiple contexts
1. Simplify context API in ExecutionObject. Replace the context_id variants of
multiple existing APIs with these two APIs:
bool AcquireAndRunContext(uint32_t& context_idx,
int frame_idx,
const IODeviceArgInfo& in,
const IODeviceArgInfo& out);
bool WaitAndReleaseContext(uint32_t context_idx);
2. The timing methods for host execution in EOPs and EOs:
* GetProcessTimeInMilliSeconds()
* GetHostProcessTimeInMilliSeconds()
are no longer accurate with multiple contexts and pipelining.
Replace these methods with a generic timestamp
based approach. There is a single API call to enable time stamps in an
application:
//! Enable time stamp generation for TIDL API events
bool EnableTimeStamps(const std::string& file = "timestamp.log", size_t
num_frames=32);
If this method is called before TIDL API frame processing, the API will
generate timestamps for events corresponding to each frame (e.g.
EOP::ProcessFrameStartAsync, EOP::ProcessFrameWait, etc.). These
timestamps are then written to file when the user's application
completes.
A separate script is used for post-processing the time stamps and
generating data for the user.
(MCT-1073, MCT-1074)
Enqueue multiple frames at device side
- Previous implementation won't send/enqueue next frame to device
until the host has received completion message for current frame.
The improvement is to create multiple sets/contexts of internal
TIDL input/output buffers at device side, and to send/enqueue next
frame using a different set/context of internal TIDL input/output
buffers to device while device is still processing the current frame.
When device finishes current frame, it can immediately read
its messageQ and start processing the next frame, without waiting
for the completion message to reach the host and the host to send
the next frame.
- In pipelined processing of multiple frames, this optimization can
effectively hide the round-trip communication between host and device.
- Removed deprecated enableInternalInput feature
- MCT-1059
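A host-side sketch of the bookkeeping this implies, under the assumption
that each context is simply marked busy while a frame is in flight. The
class ContextPool is hypothetical and is not the TIDL-API implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical tracker for N device-side sets of input/output buffers.
class ContextPool
{
  public:
    explicit ContextPool(std::uint32_t num_contexts)
        : busy_m(num_contexts, false)
    {
        assert(num_contexts > 0);
    }

    // Claim the lowest-numbered free context. Returns false when every
    // context already has a frame in flight, so the caller must wait.
    bool Acquire(std::uint32_t& context_idx)
    {
        for (std::uint32_t i = 0; i < busy_m.size(); ++i)
        {
            if (!busy_m[i])
            {
                busy_m[i]   = true;
                context_idx = i;
                return true;
            }
        }
        return false;
    }

    // Mark a context free again once its frame has completed.
    void Release(std::uint32_t context_idx)
    {
        assert(context_idx < busy_m.size() && busy_m[context_idx]);
        busy_m[context_idx] = false;
    }

  private:
    std::vector<bool> busy_m;
};
```

With two contexts the host can enqueue frame N+1 while the device is still
working on frame N, and only blocks when a third frame arrives before
either has completed.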
Quantization history configuration parameters
Added the following parameters to Configuration:
* quantHistoryParam1
* quantHistoryParam2
* quantMargin
These parameters can be specified in the configuration file or set
directly in the code.
(MCT-1062)
Removed unused API functionality, added changelog
(MCT-1062)
Refactor imgutils::PreProcImage function
- Renamed to imgutils::PreprocessImage
- Remove alloc/memcpy of data buffer, work off split OpenCV::Mat(s)
- Use Configuration to extract width, height, #channels etc.
- Number of ROIs is always 1 for TIDL API, remove from parameter list
(MCT-1063)
Adding pybind11 v2.2.4 to repo
- https://github.com/pybind/pybind11.git, branch v2.2
- commit sha: 9a19306fbf30642ca331d0ec88e7da54a96860f9
(MCT-1009)
Added Python 3 bindings for TIDL API
* Using pybind11 v2.2 to add Python 3 bindings to TIDL API classes/methods
https://pybind11.readthedocs.io/en/stable/index.html
https://github.com/pybind/pybind11/tree/v2.2
* Leveraging the Python buffer protocol to expose input/output buffers
from ExecutionObject/ExecutionObjectPipeline to Python application
code. This eliminates copies between the Python application and the TIDL
API library. (see examples/pybind/one_eo_per_frame.py).
* Methods renamed to follow Python style guide (PEP8)
* Bindings split across multiple pybind_* source files to reduce compile
time
* tidl_api/Makefile builds a shared object - tidl.so. Add the directory
containing tidl.so to PYTHONPATH to make the tidl module available to the
Python interpreter.
>>> import tidl
>>> help (tidl)
* See examples/pybind for examples of using the Python bindings
(MCT-1009)
Restore version to 1.2.0.0
Merge tag 'v01.01.00.01' into develop
For PSDK 5.1 release
Merge branch 'hotfix/v01.01.00.01'
Optimize classification perf, report loop avg_fps
- Double buffer EOPs to overlap host pre/post-processing
and device processing. When EOP contains more than one EO,
pipeline at EO level rather than at EOP level.
- Compute average FPS across a sliding window of frames
using host loop iteration/frame time.
- MCT-1049
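One way to compute a sliding-window average FPS from per-iteration host
times is sketched below. The class name SlidingFps and the use of
milliseconds are assumptions; the classification example may do this
differently.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical helper: average FPS over the most recent `window` frames.
class SlidingFps
{
  public:
    explicit SlidingFps(std::size_t window) : window_m(window)
    {
        assert(window > 0);
    }

    // Record one host loop iteration time in milliseconds and return the
    // average FPS across the frames currently inside the window.
    double Update(double frame_ms)
    {
        times_m.push_back(frame_ms);
        sum_ms_m += frame_ms;
        if (times_m.size() > window_m)
        {
            sum_ms_m -= times_m.front();   // drop the oldest frame
            times_m.pop_front();
        }
        return sum_ms_m > 0.0 ? 1000.0 * times_m.size() / sum_ms_m : 0.0;
    }

  private:
    std::size_t        window_m;
    std::deque<double> times_m;
    double             sum_ms_m = 0.0;
};
```

Averaging over a window smooths out per-frame jitter from host pre/post
processing while still tracking changes in throughput.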
Start hotfix v01.01.00.01
Merge tag 'v01.01.00.00' into develop
Merge branch 'release/v01.01.00.00'
Updated TIDL API manifest for v1.1
(MCT-1050)
examples:classification: Use configuration.numFrames instead of hard coded big value
Signed-off-by: Djordje Senicic <x0157990@ti.com>
examples:classification: Increase main window and update picture of TIDL SW stack
- MCT-1049
Signed-off-by: Djordje Senicic <x0157990@ti.com>
Updated version to 1.2.0
Classification example code refactor and clean up
- MCT-1049
Optimize examples with EOP double buffering
- Improve overall loop performance for imagenet and segmentation
- Update documentation on performance
- MCT-1039
Updates to User's Guide and related examples
Changes:
* Overview chapter, includes a Terminology section.
* Section on different use cases in the "Using the API" chapter.
* Updated the Examples chapter to reflect new examples and AM5749
benchmarking.
* Added the two_eo_per_frame_opt example to illustrate double buffering.
(MCT-1043)
Wall cleanup, optimize ssd_multibox
- Fix -Wall errors
- Optimize pipeline execution for ssd_multibox
- MCT-1015
Video input option and document update
- mp4/avi/mov as pre-recorded video input
- camera as live video input, let user choose video input port #
- refactor examples code
- bookkeep each EO's device/host time inside EOP since EO could be shared
- documentation update on 650MHz EVE
- documentation on video inputs and output
- MCT-1015
Added example to illustrate pipelining across EOs
two_eo_per_frame is a simple example to illustrate using
ExecutionObjectPipeline to split processing a single frame across EVE
and DSP.
(MCT-1048)
Refactor examples - test, one_eo_per_frame
- Remove code duplication across test/main.cpp,
test/multiple_executors.cpp and one_eo_per_frame/main.cpp
- Moved common code into common/utils.h, common/utils.cpp
(MCT-1047)
classification - fixed zero size image clip
(PLSDK-2250)
classification - adjust example for API updates
- Enable operation up to 36fps on AM5749 with EVEs at 650MHz
- DisplayHelp update for number of cores
(PLSDK-2250)
Added an example to illustrate 1 EO per frame
(MCT-1043)
Documentation - refactoring and updates
Changes:
- Added a release notes section with notes for v1.0 and v1.1
- Reworked the intro section
- Added an overview section, changed the API software picture to show
more detail
- Removed duplicate documentation for tidl::Configuration in the rst
file, moved documentation to doxygen comments in configuration.h
- Moved "building from source" to FAQs
(MCT-1043)
Update imagenet to take mp4 input
- clean up command line options
- MCT-1015, MCT-1039
Report memory usage when device allocation fails
TIDL API creates 2 device side heaps:
1. Parameter heap
2. Network heap
The sizes of these heaps are specified in the Configuration object, via
PARAM_HEAP_SIZE and NETWORK_HEAP_SIZE.
Existing behavior: If the heaps are not large enough, allocation on the
device triggers an assertion failure with no indication of how large the
heaps need to be for successful allocation.
To improve the usability of the API, provide feedback to the user on the
heap sizes required to satisfy device side allocations when any
allocation fails.
Also added `-Wall -Werror` when building examples and fixed failures.
(MCT-1035)
ExecutionObjectPipeline for executing layersGroups
- Add top level ExecutionObjectPipeline class to execute multiple
layersGroups.
- An ExecutionObjectPipeline is constructed from multiple
ExecutionObjects, each ExecutionObject executes one layersGroup
in the network, together they execute consecutive layersGroups.
- Same look and feel as ExecutionObject, e.g. ProcessFrameStartAsync,
ProcessFrameWait, GetInputBufferPointer, GetOutputBufferPointer
- MCT-1017, MCT-1029
Modified IODeviceArgInfo to enable pipelining EOs
(MCT-1030)
Remove implementation details from ArgInfo
Implementation details such as argument kind and PipeInfo should not be
a part of the user facing ArgInfo class. Also, PipeInfo is relevant only
for input/output arguments.
Moved implementation details out of ArgInfo and created 2 new classes:
DeviceArgInfo and IODeviceArgInfo.
DeviceArgInfo inherits from ArgInfo and adds an
argument kind (buffer, local or scalar). IODeviceArgInfo consists of
DeviceArgInfo and PipeInfo.
(MCT-1030)
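The relationship between the three classes can be sketched as follows. The
class names come from this commit; every member, accessor, and the contents
of PipeInfo are placeholders chosen for illustration and may not match the
real headers.

```cpp
#include <cstddef>

// User-facing description of a memory region (members are illustrative).
class ArgInfo
{
  public:
    ArgInfo(void* ptr, std::size_t size) : ptr_m(ptr), size_m(size) {}
    void*       ptr()  const { return ptr_m; }
    std::size_t size() const { return size_m; }

  private:
    void*       ptr_m;
    std::size_t size_m;
};

// Implementation-side type: an ArgInfo plus the kind of kernel argument.
class DeviceArgInfo : public ArgInfo
{
  public:
    enum class Kind { BUFFER, LOCAL, SCALAR };

    DeviceArgInfo(void* ptr, std::size_t size, Kind kind)
        : ArgInfo(ptr, size), kind_m(kind) {}
    Kind kind() const { return kind_m; }

  private:
    Kind kind_m;
};

// Placeholder: the real PipeInfo contents are not described in the commit.
struct PipeInfo
{
    int placeholder = 0;   // hypothetical field
};

// Input/output argument: composed of a DeviceArgInfo and a PipeInfo.
class IODeviceArgInfo
{
  public:
    explicit IODeviceArgInfo(const DeviceArgInfo& arg) : arg_m(arg) {}
    const DeviceArgInfo& GetArg() const { return arg_m; }
    PipeInfo&            GetPipe()      { return pipe_m; }

  private:
    DeviceArgInfo arg_m;
    PipeInfo      pipe_m;
};
```

Using inheritance for the argument kind and composition for PipeInfo keeps
ArgInfo free of implementation details while limiting PipeInfo to the
input/output arguments that need it.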
Remove input, output buffers from process kernel
Input and output OpenCL buffers do not have to be passed into the process
kernel. The host will directly update input and output in the buffers
allocated by the TIDL library via the HostWriteNetInput and
HostReadNetOutput methods.
(MCT-1030)
classification: Update static images, synthetic video clip
(MCT-1031)
Signed-off-by: Djordje Senicic <x0157990@ti.com>