Merge branch 'hotfix/v01.05.01'
Add TIDL_SUBGRAPH_NUM_EVES env var
- The current subgraph implementation initializes and uses all available
EVEs and DSPs, with streaming/batch inputs in mind. There are cases
where only 1 EVE and 1 DSP are needed, for example, demonstrating
subgraph offloading on a single input.
This commit adds an environment variable, TIDL_SUBGRAPH_NUM_EVES,
to specify the number of EVEs used for subgraph inferencing.
- MCT-1243
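A minimal Python sketch of how a caller might honor TIDL_SUBGRAPH_NUM_EVES. The function name, the fallback to all available EVEs on an unset or unparsable value, and the clamping are assumptions for illustration; the log does not describe how TIDL-API itself treats such values.

```python
import os

def subgraph_num_eves(available_eves, env=None):
    """Return how many EVEs to use, honoring TIDL_SUBGRAPH_NUM_EVES if set."""
    env = os.environ if env is None else env
    if available_eves < 1:
        return 0                       # nothing to choose from
    raw = env.get("TIDL_SUBGRAPH_NUM_EVES")
    if raw is None:
        return available_eves          # default: use every available EVE
    try:
        requested = int(raw)
    except ValueError:
        return available_eves          # assumption: ignore unparsable values
    # assumption: clamp the request to the range 1..available_eves
    return max(1, min(requested, available_eves))
```

With the variable set to 1 on a 4-EVE device, this sketch selects a single EVE, which matches the single-input demonstration case described in the commit.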
Start hotfix v01.05.01
Merge branch 'release/v01.05.00'
Update version to 1.5.0 in manifest and docs
Clean up required subgraph cfg file entries
- Added environment variable TIDL_SUBGRAPH_DIR for locating the
subgraph config files.
- Updated documentation for subgraph runtime.
- MCT-1227
replace 2 dsp + 2 group layer use cases with 1 dsp
reference to PLSDK-3189.
The BBAI only has enough CMEM for 4 EVEs, 1 DSP, and 2 group
layers. For all of our networks, the difference between
1 and 2 DSPs is essentially nonexistent.
The following benchmarks were run side by side:
CMDLINE: ./mcbench -g 1 -d 2 -e 4 -c ../test/testvecs/config/ CMDLINE: ./mcbench -g 1 -d 2 -e 4 -c ../test/testvecs/config/
Input: ../test/testvecs/input/preproc_0_224x224_multi.y frame Input: ../test/testvecs/input/preproc_0_224x224_multi.y frame
Loop total time: 1189ms Loop total time: 1189ms
FPS:42.06 FPS:42.06
mcbench PASSED mcbench PASSED
CMDLINE: ./mcbench -g 1 -d 2 -e 4 -c ../test/testvecs/config/ CMDLINE: ./mcbench -g 1 -d 2 -e 4 -c ../test/testvecs/config/
Input: ../test/testvecs/input/preproc_0_224x224_multi.y frame Input: ../test/testvecs/input/preproc_0_224x224_multi.y frame
Loop total time: 3066ms Loop total time: 3066ms
FPS:16.31 FPS:16.31
mcbench PASSED mcbench PASSED
CMDLINE: ./mcbench -g 2 -d 1 -e 4 -c ../test/testvecs/config/ | CMDLINE: ./mcbench -g 2 -d 2 -e 4 -c ../test/testvecs/config/
Input: ../test/testvecs/input/preproc_2_224x224_multi.y frame Input: ../test/testvecs/input/preproc_2_224x224_multi.y frame
Loop total time: 1822ms | Loop total time: 1835ms
FPS:27.44 | FPS:27.24
mcbench PASSED mcbench PASSED
CMDLINE: ./mcbench -g 2 -d 1 -e 4 -c ../test/testvecs/config/ | CMDLINE: ./mcbench -g 2 -d 2 -e 4 -c ../test/testvecs/config/
Input: ../test/testvecs/input/preproc_2_224x224_multi.y frame Input: ../test/testvecs/input/preproc_2_224x224_multi.y frame
Loop total time: 1823ms | Loop total time: 1841ms
FPS:27.42 | FPS:27.16
mcbench PASSED mcbench PASSED
CMDLINE: ./mcbench -g 2 -d 1 -e 4 -c ../test/testvecs/config/ | CMDLINE: ./mcbench -g 2 -d 2 -e 4 -c ../test/testvecs/config/
Input: ../test/testvecs/input/preproc_2_224x224_multi.y frame Input: ../test/testvecs/input/preproc_2_224x224_multi.y frame
Loop total time: 1793ms | Loop total time: 1817ms
FPS:27.89 | FPS:27.52
mcbench PASSED mcbench PASSED
CMDLINE: ./mcbench -g 2 -d 1 -e 4 -c ../test/testvecs/config/ | CMDLINE: ./mcbench -g 2 -d 2 -e 4 -c ../test/testvecs/config/
Input: ../test/testvecs/input/preproc_0_224x224_multi.y frame Input: ../test/testvecs/input/preproc_0_224x224_multi.y frame
Loop total time: 4269ms | Loop total time: 4285ms
FPS:11.71 | FPS:11.67
mcbench PASSED mcbench PASSED
CMDLINE: ./mcbench -g 2 -d 1 -e 4 -c ../test/testvecs/config/ | CMDLINE: ./mcbench -g 2 -d 2 -e 4 -c ../test/testvecs/config/
Input: ../test/testvecs/input/preproc_0_224x224_multi.y frame Input: ../test/testvecs/input/preproc_0_224x224_multi.y frame
Loop total time: 892.9ms | Loop total time: 915ms
FPS:55.99 | FPS:54.64
mcbench PASSED mcbench PASSED
CMDLINE: ./mcbench -g 2 -d 1 -e 4 -c ../test/testvecs/config/ | CMDLINE: ./mcbench -g 2 -d 2 -e 4 -c ../test/testvecs/config/
Input: ../test/testvecs/input/preproc_0_224x224_multi.y frame Input: ../test/testvecs/input/preproc_0_224x224_multi.y frame
Loop total time: 2008ms | Loop total time: 2014ms
FPS:24.9 | FPS:24.82
mcbench PASSED mcbench PASSED
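To put a number on "essentially nonexistent", the FPS pairs from the six "-g 2" runs in the log above can be compared directly. This is a small arithmetic sketch; the variable and function names are illustrative, and the values are copied from the log.

```python
# (FPS with -d 1, FPS with -d 2) pairs copied from the "-g 2" runs in the log
FPS_PAIRS = [
    (27.44, 27.24),
    (27.42, 27.16),
    (27.89, 27.52),
    (11.71, 11.67),
    (55.99, 54.64),
    (24.9, 24.82),
]

def relative_gap_percent(fps_one_dsp, fps_two_dsp):
    """Gap between the two runs, as a percentage of the 2-DSP figure."""
    return abs(fps_one_dsp - fps_two_dsp) / fps_two_dsp * 100.0

gaps = [relative_gap_percent(one, two) for one, two in FPS_PAIRS]
largest_gap = max(gaps)
```

In every pair the 1-DSP run is marginally faster, and the largest gap stays under 3%.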
Minor version update in docs
Fix versioning for shared library
- by putting proper SONAME in built shared libraries.
- MCT-1231
Merge tag 'v01.04.00' into develop
TIDL-API v01.04.00 for Processor SDK 6.2
Merge branch 'release/v01.04.00'
Update changelog for v01.04.00 release
mcbench: Adjust network heap sizes, so that all test cases can fit into CMEM of 384MB
Signed-off-by: Djordje Senicic <x0157990@ti.com>
examples:classification: Detect number of EVEs, DSPs and CMEM size on SoC
Signed-off-by: Djordje Senicic <x0157990@ti.com>
mcbench: Add test cases for AM5729
- Add a one-line comment in each script to indicate the SoC that the script targets
- Add all_5729.sh, script with benchmarking test cases for AM5729 device, 2xDSP+4xEVE
- PLSDK-3140
Signed-off-by: Djordje Senicic <x0157990@ti.com>
Bump up develop branch version to 1.5.0
Subgraph: use Layer2Group map in config file
- If Layer2Group map exists in subgraph config file, use it.
Otherwise, try to derive the map from network layer types.
- Added TidlFreeSubgraph() for subgraph resource de-allocation
- Code changes based on review comments.
- MCT-1223
Subgraph example: multi-threaded batch processing
- Compared different batch sizes in subgraph execution example
- Compared async/future implementation vs thread pool implementation,
async/future has slightly worse (~1%) performance,
but it is much easier to program
- Recommended inference is multi-threaded batch processing, where
batch_size can be obtained from TidlGetPreferredBatchSize(),
number of threads can be set to 2.
- MCT-1223
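The recommended pattern (batches processed by 2 threads) can be sketched in Python as follows. This is only an illustration: run_batch() is a hypothetical stand-in for one TidlRunSubgraph() call on a batch, the batch size is passed in rather than queried from TidlGetPreferredBatchSize(), and the thread pool here is Python's, not the C++ implementations compared in the commit.

```python
from concurrent.futures import ThreadPoolExecutor

def split_into_batches(frames, batch_size):
    """Cut the frame list into consecutive batches (the last may be short)."""
    return [frames[i:i + batch_size] for i in range(0, len(frames), batch_size)]

def run_batch(batch):
    """Hypothetical stand-in for one TidlRunSubgraph() call on a batch."""
    return [frame * 2 for frame in batch]

def infer_all(frames, batch_size, num_threads=2):
    """Run every batch on a small thread pool and return results in order."""
    batches = split_into_batches(frames, batch_size)
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # map() returns results in submission order, even though
        # the batches themselves may execute concurrently.
        results = list(pool.map(run_batch, batches))
    return [out for batch_out in results for out in batch_out]
```

Two worker threads keep one batch in flight while the next is being prepared, which is the overlap the commit's recommendation relies on.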
Parse data conversion info from subgraph config
- MCT-1224
Subgraph: support batch processing
- MCT-1223
Subgraph: add a mobilenet v1 example
- Using the TidlRunSubgraph() interface
Subgraph: add top level API TidlRunSubgraph
- TidlRunSubgraph() should be the interface function that TVM/TFLite
calls to offload subgraph to TIDL
- MCT-1222
Subgraph data conversion at boundaries
- Data layout: NCHW <-> NHWC
- Data type: 8-bit quantized <-> float
- MCT-1222
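The two boundary conversions can be sketched in plain Python on flat lists. The function names are illustrative, and the single-scale unsigned 8-bit quantization shown here is an assumption; the log does not state which quantization scheme the subgraph runtime uses.

```python
def nchw_to_nhwc(data, n, c, h, w):
    """Reorder a flat NCHW list into flat NHWC order."""
    out = [0] * (n * c * h * w)
    for ni in range(n):
        for ci in range(c):
            for hi in range(h):
                for wi in range(w):
                    src = ((ni * c + ci) * h + hi) * w + wi
                    dst = ((ni * h + hi) * w + wi) * c + ci
                    out[dst] = data[src]
    return out

def nhwc_to_nchw(data, n, c, h, w):
    """Reorder a flat NHWC list into flat NCHW order."""
    out = [0] * (n * c * h * w)
    for ni in range(n):
        for hi in range(h):
            for wi in range(w):
                for ci in range(c):
                    src = ((ni * h + hi) * w + wi) * c + ci
                    dst = ((ni * c + ci) * h + hi) * w + wi
                    out[dst] = data[src]
    return out

def quantize_u8(values, scale):
    """Float -> unsigned 8-bit, assuming q = round(x * scale), clamped to 0..255."""
    return [max(0, min(255, round(v * scale))) for v in values]

def dequantize_u8(quantized, scale):
    """Unsigned 8-bit -> float, the inverse of quantize_u8 up to rounding."""
    return [q / scale for q in quantized]
```

For a 1x2x2x3 tensor, nchw_to_nhwc interleaves the two channel planes pixel by pixel, and nhwc_to_nchw restores the original order.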
Subgraph offloading to TIDL: first commit
- ResM class provides top level encapsulation
- All allocation of core resources and buffers, and all creation of
Executor, ExecutionObject, ExecutionObjectPipeline are encapsulated.
- Auto-partition last few layers to DSP if profitable, also encapsulated.
- MCT-1223, MCT-1224
Merge tag 'v01.03.03' into develop
TIDL-API 1.3.3 for PSDK 6.1
Merge branch 'hotfix/v01.03.03'
Update the changelog
Revert "Adjust example heap sizes with new TIDL library"
This reverts commit c3786ddb01e187a983811d1cd3e08f6dfa20dd2e.
Revert "More example sizes adjustment with new TIDL lib"
This reverts commit 268aecd993dec4faec1a414d6aac4c43b0c059ed.
mcbench: Add MobileNetV2 test cases
- PLSDK-3078
Signed-off-by: Djordje Senicic <x0157990@ti.com>
Start hotfix v01.03.03
Merge tag 'v01.03.02' into develop
TIDL-API 1.3.2 for Processor SDK 6.1
Merge branch 'hotfix/v01.03.02'
TIDL-API 1.3.2 for Processor SDK 6.1
Fix min OpenCL version to 1.1.19.00
- MCT-1221
Added double quotes "" to input data file
PLSDK-2956: add MobileNetV2 model (.bin files) and inference config files
Fix classification example for tensorflow models
- Copy original image to show image before pre-processing, because
pre-processing will change BGR to RGB for tensorflow models
- Subtract 1 from output object class index, because tensorflow outputs
1001 bytes and uses index-0 for background. Regular imagenet labels
only have 1000 entries.
- Fix path to inceptionnet net and params binaries in the config file.
- MCT-1221
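The class index offset described above can be sketched as follows. The function name and the (-1, None) return value for a background hit are illustrative choices, not taken from the classification example itself.

```python
def top_label(scores, labels):
    """Return (label_index, label) for the highest score.

    If the model emits one more score than there are labels, index 0 is
    treated as the background class and the index is shifted down by one.
    """
    best = max(range(len(scores)), key=lambda i: scores[i])
    if len(scores) == len(labels) + 1:
        best -= 1                      # drop the background slot
    if best < 0:
        return -1, None                # background scored highest
    return best, labels[best]
```

With 1001 scores and 1000 imagenet labels, a top score at output position 282 maps to label index 281.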
PLSDK-2986: Calibrate TIDL models for TF with correct raw image (preproc type 2).
Start hotfix v01.03.02
Merge tag 'v01.03.01' into develop
TIDL-API 1.3.1 for Processor SDK 6.1
Merge branch 'release/v01.03.01'
TIDL-API 1.3.1 for Processor SDK 6.1
More example sizes adjustment with new TIDL lib
- MCT-1217
Update changelog for version 1.3.1
Update TIDL-API manifest for version 1.3.1
Update squeeze net reference output
- 6 out of 1000 outputs changed with the newly imported network
PLSDK-2986: update TIDL models for mobilenetV1, inceptionNetV1, squeezeNetV1.
Adjust example heap sizes with new TIDL library
- The latest TIDL library increased memory requirement slightly.
Adjust heap sizes in the examples accordingly.
- MCT-1217
Bump develop branch to version 1.4.0
Print out imagenet object index in imagenet example
- The object index is helpful information, in addition to text label
that has already been printed out.
- Offset tensorflow model output by 1 to remove background index
- MCT-1216
Control heap size and alloc opt using env vars
- TIDL_PARAM_HEAP_SIZE_EVE, TIDL_PARAM_HEAP_SIZE_DSP,
TIDL_NETWORK_HEAP_SIZE_EVE, TIDL_NETWORK_HEAP_SIZE_DSP,
TIDL_EXTMEM_ALLOC_OPT_EVE, TIDL_EXTMEM_ALLOC_OPT_DSP
are provided to override the heap sizes and heap allocation optimization
level (1 or 2) that are specified by default or by the application.
- MCT-1215
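A table-driven sketch of how these six variables could override application settings. The environment variable names come from the commit; the configuration field names and the plain-integer parsing are assumptions, since the log does not state the units or the corresponding Configuration members.

```python
import os

# Environment variable -> hypothetical configuration field name
ENV_TO_FIELD = {
    "TIDL_PARAM_HEAP_SIZE_EVE": "param_heap_size_eve",
    "TIDL_PARAM_HEAP_SIZE_DSP": "param_heap_size_dsp",
    "TIDL_NETWORK_HEAP_SIZE_EVE": "network_heap_size_eve",
    "TIDL_NETWORK_HEAP_SIZE_DSP": "network_heap_size_dsp",
    "TIDL_EXTMEM_ALLOC_OPT_EVE": "extmem_alloc_opt_eve",
    "TIDL_EXTMEM_ALLOC_OPT_DSP": "extmem_alloc_opt_dsp",
}

def apply_env_overrides(config, env=None):
    """Return a copy of config with any set environment variables applied."""
    env = os.environ if env is None else env
    merged = dict(config)
    for var, field in ENV_TO_FIELD.items():
        raw = env.get(var)
        if raw is not None:
            merged[field] = int(raw)   # assumption: values are plain integers
    return merged
```

Values set by the application survive unless the matching variable is present, which mirrors the "default or application, then environment" precedence the commit describes.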
Change develop version to 1.3.1 for patch release
Update network binary in TIDL-API to new format
- New network format corresponds to the network data structure update,
where strideOffsetMethod field moved from sTIDL_Network_t to sTIDL_Layer_t.
Old format is 483364 bytes, new format is 484384 bytes.
- Relates to: commit 49401e64374a4f0999479245dcd01eab38bec304, MCT-1136
- MCT-1203
Add ssd_multibox_fullnet example
- To demonstrate running jdetnet/jdetnet_voc on a single core,
without partitioning the network. This is useful for situations
where the SoC only has C66x cores but no EVE cores.
- MCT-1202
Dump dataQ/minValue/maxValue for TIDL trace
- MCT-1201
Fix g++ 8.3.0 compilation error
- Fix a syntax allowed in g++ 7.2.1 (PSDK5.3) but not in 8.3.0 (PSDK6.0)
- Make should report error from loop
- MCT-1199
Add imagenet python example
- Show how to interface with EO/EOP's input/output buffer in python.
- Show how to use OpenCV to read and transform image,
and how to process imagenet's output data.
- Fix EOP construction in examples
- MCT-1197
Fix unique_ptr that holds an allocated array
- Customer reported this problem. A unique_ptr that holds an allocated
array was created as "unique_ptr<char>", which calls "delete"
at destruction. However, the array was created with "new char[]".
The proper type is "unique_ptr<char[]>", so that "delete []"
is called at destruction.
- One minor trace message update so that we know which type of device
is being dispatched to.
- MCT-1196
Use DSP Built-in Kernels in TIDL-API
- Replace previously used kernel wrappers
- MCT-1143, MCT-1154
Merge tag 'v01.03.00' into develop
TIDL-API 01.03.00 for Processor SDK 5.3
Merge branch 'release/v01.03.00'
Update TIDL network data structure
- To be in sync with TIDL library and TIDL import utility
- strideOffsetMethod field moved from sTIDL_Network_t to sTIDL_Layer_t
- Add ReadNetworkBinary util that can read both network formats,
so that TIDL-API can be compatible with both old and new formats
- Update reference network output due to updated TIDL library
- MCT-1136
[segmentation] Add video clip autorewind
Signed-off-by: Djordje Senicic <x0157990@ti.com>
[segmentation] Add sample clip with traffic scenes (from pixabay)
Signed-off-by: Djordje Senicic <x0157990@ti.com>
Update documentation for TIDL-API 1.3.0
- MCT-1136
Clean up ssd_multibox changes
- PLSDK-2597
[ssd_multibox] Addressing review comments
Signed-off-by: Djordje Senicic <x0157990@ti.com>
Bump develop branch version to 1.4.0
PLSDK-2597
- SSD_Multibox: updated to include slider for run-time probability modification
- SSD_Multibox: skip grabbing frame input multiple times, as real-time would vary based on multicore configuration and network complexity
- SSD_Multibox: resize and central cropping added; instead of showing rectangles in original image, network input is presented
- Classification: Toydogs configuration added including models
Signed-off-by: Djordje Senicic <x0157990@ti.com>
Enable DSP out-of-order execution in TIDL-API
- MCT-1108
Enable MNIST example on DSP
- It turns out DSP implementation of InnerProduct layer in TIDL library
requires input size to be multiple of 8, because it is doing
aligned 8-byte loads.
- Original LeNet network used in the MNIST example has a second InnerProduct
layer of size 500, which is not a multiple of 8. Change the size to 504,
re-train the network, re-import into TIDL format. Now the MNIST example
works correctly on DSP as well.
- MCT-1105
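The size change from 500 to 504 is a round-up to the next multiple of 8. A one-function sketch (the helper name is illustrative):

```python
def round_up_to_multiple(size, multiple=8):
    """Smallest value >= size that is divisible by multiple."""
    return ((size + multiple - 1) // multiple) * multiple
```

Applied to the original InnerProduct size of 500, this yields 504, the value used for the re-trained network.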
Merge tag 'v01.02.02' into develop
Merge branch 'hotfix/v01.02.02'
Fix memory leak in classification example
- MCT-1101
classification: Modify configuration structure to runFullNet before calling constructor for Execution Object
Signed-off-by: Djordje Senicic <x0157990@ti.com>
Updated patch version to 2 (TIDL API 1.2.2)
Merge tag 'v01.02.01' into develop
Merge branch 'hotfix/v01.02.01'
tidl-viewer: Remove executor.h inclusion in utils.cpp
One of the files in the tidl-viewer build, utils.cpp, was updated to include
executor.h. This header in turn includes a file from OpenCL, custom.h.
The yocto build of tidl-viewer is a native recipe and hence cannot
include opencl recipes as a dependency to obtain custom.h.
This commit updates utils.cpp to remove the include of executor.h and
custom.h.
(MCT-1100)
Merge tag 'v01.02.00' into develop
Merge branch 'release/v01.02.00'
Add jdetnet_voc network and make it the default
- jdetnet_voc is trained with more object categories than original
jdetnet. Make jdetnet_voc the default in the ssd example. User can
still use command line options to run the original jdetnet network.
- MCT-1091
Update imagenet example with new softmax output
- MCT-1089
Added Python variant of mnist example
Also fix one_eo_per_frame.py to avoid creating an EVE executor if there are
no EVEs available.
(MCT-1088)
Add link in changelog for mnist example
- MCT-1083
Add MNIST LeNet network model and test input
- Constrained to EVE only for now.
- Add documentation for mnist example.
- MCT-1083
Add mnist example with low compute
- Show that TIDL API with multiple contexts and pipelined computation
offers low overhead for small networks as well.
- MCT-1083
Update reference output for unit tests
A defect fix in the softmax layer necessitated updates to the reference
output of networks using the softmax layer.
(MCT-1087)
Documentation - update 'Using the API' chapter
(MCT-1086)
Initialize EO::current_frame_idx_m in constructor
Initialize ExecutionObject::current_frame_idx_m array to 0 in the
ExecutionObject constructor to prevent out of range entries when
recording trace data.
In a pipelined processing loop, the application executes
ExecutionObject::ProcessFrameWait() on the first frame before it calls
ExecutionObject::ProcessFrameStartAsync. The side effect is that the
current_frame_idx_m is not initialized. This can result in negative
frame indices when writing trace data using ReportTrace or UpdateTrace
leading to memory errors.
Setting ExecutionObject::current_frame_idx_m to 0 in the constructor
avoids this scenario.
(MCT-1085)
Updated version on develop to 1.3.0
Updated parameter description in doxygen comments
(MCT-1084)
Build Python bindings library by default
(MCT-1069)
Updated manifest for v1.2.0
(MCT-1060)
Add option to specify object classes list file
- so that user can specify a different object classes list file
without re-compiling the application.
- MCT-1081
mcbench: image preprocessing, handle layergroups=1
- Add image preprocessing for types 1 and 2.
- if layer groups is 1, force all layers to be in the same group
(MCT-1075)
classification: Support different network models
jacinto11, mobilenet and inceptionnet models can be used with this example.
infer: Add configuration files for inceptionnet and mobilenet that can run as two layer groups
Signed-off-by: Djordje Senicic <x0157990@ti.com>
tidl_models: Add mobilenet and inceptionnet models, trained on ImageNet
Signed-off-by: Djordje Senicic <x0157990@ti.com>
mcbench: Multicore benchmark with minimal overhead
- Add required models, input test vectors and platform specific scripts
- Add inference configuration files for multicore benchmarking
- Rename input files to indicate multiple frames and add more inference
configurations, covered in regression scripts
(MCT-1075)
examples: Add layers group command line parameter
Signed-off-by: Djordje Senicic <x0157990@ti.com>