raw | patch | inline | side by side (parent: 6b03a3b)
author    Yuan Zhao <yuanzhao@ti.com>
          Thu, 23 Aug 2018 21:53:58 +0000 (16:53 -0500)
committer Yuan Zhao <yuanzhao@ti.com>
          Wed, 5 Sep 2018 16:31:05 +0000 (11:31 -0500)
- mp4/avi/mov as pre-recorded video input
- camera as live video input, let user choose video input port #
- refactor examples code
- bookkeep each EO's device/host time inside EOP since EO could be shared
- documentation update on 650MHz EVE
- documentation on video inputs and output
- MCT-1015
15 files changed:
index b72d6a0eec5b05a3097d496b9f5bef6eb493312e..2ebbdac6402b4b6228d5e9362bccfac915a6be7f 100644 (file)
--- a/docs/source/example.rst
+++ b/docs/source/example.rst
``imagenet`` and ``segmentation`` can run on AM57x processors with either EVE or C66x cores.
``ssd_multibox`` requires AM57x processors with both EVE and C66x. The performance
numbers that we present here were obtained on an AM5729 EVM, which
-includes 2 Arm Cortex-A15 cores running at 1.5GHz, 4 EVE cores at 535MHz, and
+includes 2 Arm Cortex-A15 cores running at 1.5GHz, 2 EVE cores at 650MHz, and
2 DSP cores at 750MHz.
For each example, we report device processing time, host processing time,
**Host processing time** is measured on the host, from the moment
``ProcessFrameStartAsync()`` is called until ``ProcessFrameWait()`` returns
in the user application. It includes the TIDL API overhead, the OpenCL runtime
-overhead, and the time to copy user input data into padded TIDL internal
-buffers.
+overhead, and the time to copy between user input/output data and
+the padded TIDL internal buffers.
Imagenet
--------
====================== ==================== ============
Device Processing Time Host Processing Time API Overhead
====================== ==================== ============
- EVE: 123.1 ms 124.7 ms 1.34 %
+ EVE: 103.5 ms 104.8 ms 1.21 %
**OR**
- DSP: 117.9 ms 119.3 ms 1.14 %
+ DSP: 117.4 ms 118.4 ms 0.827 %
====================== ==================== ============
The particular network that we ran in this category, jacintonet11v2,
-has 14 layers. User can specify whether to run the network on EVE or DSP
-for acceleration. We can see that EVE time is slightly higher than DSP time.
-We can also see that the overall overhead is less than 1.5%.
+has 14 layers. Input to the network is a 224x224 RGB image.
+The user can specify whether to run the network on EVE or DSP
+for acceleration. We can see that the EVE time is slightly lower than the
+DSP time, and that the overall API overhead is less than 1.3%.
.. note::
The predictions reported here are based on the output of the softmax
:width: 600
The network we ran in this category is jsegnet21v2, which has 26 layers.
+Input to the network is a 1024x512 RGB image. The output is 1024x512
+values, each indicating which pre-trained category the corresponding pixel
+belongs to. The example takes the network output, creates an overlay,
+and blends the overlay onto the original input image to create an output image.
From the reported time in the following table, we can see that this network
-runs significantly faster on EVE than on DSP.
+runs significantly faster on EVE than on DSP. The API overhead is less than
+1.1%.
.. table::
====================== ==================== ============
Device Processing Time Host Processing Time API Overhead
====================== ==================== ============
- EVE: 296.5 ms 303.3 ms 2.26 %
+ EVE: 248.7 ms 251.3 ms 1.02 %
**OR**
- DSP: 812.0 ms 818.4 ms 0.79 %
+ DSP: 813.2 ms 815.5 ms 0.281 %
====================== ==================== ============
.. _ssd-example:
.. image:: images/pexels-photo-378570-ssd.jpg
:width: 600
+The network we ran in this category is jdetnet_ssd, which has 43 layers.
+Input to the network is a 768x320 RGB image. Output is a list of up to 20
+boxes; each box carries its coordinates and the pre-trained category of
+the object inside it.
+The example takes the network output, draws the boxes accordingly,
+and creates an output image.
The network can be run entirely on either EVE or DSP. But the best
-performance comes with running the first 30 layers on EVE and the
-next 13 layers on DSP, for this particular jdetnet_ssd network.
+performance comes with running the first 30 layers as a group on EVE
+and the next 13 layers as another group on DSP.
Note the **AND** in the following table for the reported time.
+The overall API overhead is about 1.61%.
Our end-to-end example shows how easy it is to assign a layers group id
-to an *Executor* and how easy it is to connect the output from one
-*ExecutionObject* to the input to another *ExecutionObject*.
+to an *Executor* and how easy it is to construct an *ExecutionObjectPipeline*
+to connect the output of one *Executor*'s *ExecutionObject*
+to the input of another *Executor*'s *ExecutionObject*.
.. table::
====================== ==================== ============
Device Processing Time Host Processing Time API Overhead
====================== ==================== ============
- EVE: 175.2 ms 179.1 ms 2.14 %
+ EVE: 148.0 ms 150.1 ms 1.33 %
**AND**
- DSP: 21.1 ms 22.3 ms 5.62 %
+ DSP: 22.27 ms 23.06 ms 3.44 %
+ **TOTAL**
+ EVE+DSP: 170.3 ms 173.1 ms 1.61 %
====================== ==================== ============
Test
The following code section shows how to run the examples and how to run
the test program that exercises all supported TIDL network configs.
-.. code:: shell
+.. code-block:: shell
root@am57xx-evm:~# cd /usr/share/ti/tidl-api/examples/imagenet/
root@am57xx-evm:/usr/share/ti/tidl-api/examples/imagenet# make -j4
frame[0]: Time on device: 960ms, host: 961.1ms API overhead: 0.116 %
squeeze1_1 : PASSED
tidl PASSED
+
+Image input
+^^^^^^^^^^^
+
+The image input option, ``-i <image>``, takes an image file as input.
+You can supply an image file in any format that OpenCV can read, since
+we use OpenCV for image pre/post-processing. When the ``-f <number>`` option
+is used, the same image is processed repeatedly.
+
+Camera (live video) input
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The input option, ``-i camera<number>``, enables live frame input
+from a camera. ``<number>`` is the video input port number
+of your camera in Linux; use the following command to list the video
+input ports. The number defaults to ``1`` for the TMDSCM572X camera module
+used on AM57x EVMs. Use ``-f <number>`` to specify the number
+of frames you want to process.
+
+.. code-block:: shell
+
+ root@am57xx-evm:~# v4l2-ctl --list-devices
+ omapwb-cap (platform:omapwb-cap):
+ /dev/video11
+
+ omapwb-m2m (platform:omapwb-m2m):
+ /dev/video10
+
+ vip (platform:vip):
+ /dev/video1
+
+ vpe (platform:vpe):
+ /dev/video0
+
+
+Pre-recorded video (mp4/mov/avi) input
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The input option, ``-i <name>.{mp4,mov,avi}``, enables frame input from a
+pre-recorded video file in mp4, mov or avi format. If you have a video in
+a different OpenCV-supported format/suffix, you can simply create a softlink
+with one of the mp4, mov or avi suffixes and feed it into the example.
+Again, use ``-f <number>`` to specify the number of frames you want to process.
+
+Displaying video output
+^^^^^^^^^^^^^^^^^^^^^^^
+
+When using video input, live or pre-recorded, the example will display
+the output in a window using OpenCV. If you have an LCD screen attached
+to the EVM, you will need to stop ``matrix-gui`` first in order to
+see the example display window, as shown in the following example.
+
+.. code-block:: shell
+
+ root@am57xx-evm:/usr/share/ti/tidl/examples/ssd_multibox# /etc/init.d/matrix-gui-2.0 stop
+ Stopping Matrix GUI application.
+ root@am57xx-evm:/usr/share/ti/tidl/examples/ssd_multibox# ./ssd_multibox -i camera -f 100
+ Input: camera
+ init done
+ Using Wayland-EGL
+ wlpvr: PVR Services Initialised
+ Using the 'xdg-shell-v5' shell integration
+ ... ...
+ root@am57xx-evm:/usr/share/ti/tidl/examples/ssd_multibox# /etc/init.d/matrix-gui-2.0 start
+ /usr/share/ti/tidl/examples/ssd_multibox
+ Removing stale PID file /var/run/matrix-gui-2.0.pid.
+ Starting Matrix GUI application.
index c45acc6b04f3701b0df0ddc92e55b3d86679f800..bc0ab3e88bf95f959859ca6b7aa3fcdc6a26fe1b 100644 (file)
using std::string;
using std::istream;
using std::ostream;
+using std::vector;
static bool read_frame_helper(char* ptr, size_t size, istream& input_file);
return buffer;
}
+
+// Allocate input and output memory for each EO
+void AllocateMemory(const vector<ExecutionObject *>& eos)
+{
+ // Allocate input and output buffers for each execution object
+ for (auto eo : eos)
+ {
+ size_t in_size = eo->GetInputBufferSizeInBytes();
+ size_t out_size = eo->GetOutputBufferSizeInBytes();
+ void* in_ptr = malloc(in_size);
+ void* out_ptr = malloc(out_size);
+ assert(in_ptr != nullptr && out_ptr != nullptr);
+
+ ArgInfo in = { ArgInfo(in_ptr, in_size)};
+ ArgInfo out = { ArgInfo(out_ptr, out_size)};
+ eo->SetInputOutputBuffer(in, out);
+ }
+}
+
+// Free the input and output memory associated with each EO
+void FreeMemory(const vector<ExecutionObject *>& eos)
+{
+ for (auto eo : eos)
+ {
+ free(eo->GetInputBufferPtr());
+ free(eo->GetOutputBufferPtr());
+ }
+}
+
+// Allocate input and output memory for each EOP
+void AllocateMemory(const vector<ExecutionObjectPipeline *>& eops)
+{
+ // Allocate input and output buffers for each execution object
+ for (auto eop : eops)
+ {
+ size_t in_size = eop->GetInputBufferSizeInBytes();
+ size_t out_size = eop->GetOutputBufferSizeInBytes();
+ void* in_ptr = malloc(in_size);
+ void* out_ptr = malloc(out_size);
+ assert(in_ptr != nullptr && out_ptr != nullptr);
+
+ ArgInfo in = { ArgInfo(in_ptr, in_size)};
+ ArgInfo out = { ArgInfo(out_ptr, out_size)};
+ eop->SetInputOutputBuffer(in, out);
+ }
+}
+
+// Free the input and output memory associated with each EOP
+void FreeMemory(const vector<ExecutionObjectPipeline *>& eops)
+{
+ for (auto eop : eops)
+ {
+ free(eop->GetInputBufferPtr());
+ free(eop->GetOutputBufferPtr());
+ }
+}
+
index 732b8af63d243d7bc32f4b10ca90e83afeebe1f0..deef59a4a7d2fbea95ef84281985fe9e5d6919ca 100644 (file)
--- a/examples/common/utils.h
+++ b/examples/common/utils.h
#include <string>
#include <iostream>
#include <fstream>
+#include <vector>
#include "executor.h"
#include "execution_object.h"
#include "execution_object_pipeline.h"
bool CheckFrame(const ExecutionObjectPipeline *eop, const char *ref_output);
const char* ReadReferenceOutput(const std::string& name);
+
+void AllocateMemory(const std::vector<ExecutionObject *>& eos);
+void FreeMemory(const std::vector<ExecutionObject *>& eos);
+void AllocateMemory(const std::vector<ExecutionObjectPipeline *>& eops);
+void FreeMemory(const std::vector<ExecutionObjectPipeline *>& eops);
diff --git a/examples/common/video_utils.cpp b/examples/common/video_utils.cpp
--- /dev/null
@@ -0,0 +1,145 @@
+/******************************************************************************
+ * Copyright (c) 2018, Texas Instruments Incorporated - http://www.ti.com/
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution.
+ * * Neither the name of Texas Instruments Incorporated nor the
+ * names of its contributors may be used to endorse or promote products
+ * derived from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+ * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+ * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF
+ * THE POSSIBILITY OF SUCH DAMAGE.
+ *****************************************************************************/
+
+#include <getopt.h>
+#include <cassert>
+#include "video_utils.h"
+
+using namespace std;
+using namespace tidl;
+
+
+bool ProcessArgs(int argc, char *argv[], cmdline_opts_t& opts)
+{
+ opts.num_frames = 0;
+ opts.output_width = 0;
+ opts.verbose = false;
+ opts.is_camera_input = false;
+ opts.is_video_input = false;
+
+ const struct option long_options[] =
+ {
+ {"config", required_argument, 0, 'c'},
+ {"num_dsps", required_argument, 0, 'd'},
+ {"num_eves", required_argument, 0, 'e'},
+ {"num_frames", required_argument, 0, 'f'},
+ {"input_file", required_argument, 0, 'i'},
+ {"output_width", required_argument, 0, 'w'},
+ {"help", no_argument, 0, 'h'},
+ {"verbose", no_argument, 0, 'v'},
+ {0, 0, 0, 0}
+ };
+
+ int option_index = 0;
+
+ while (true)
+ {
+ int c = getopt_long(argc, argv, "c:d:e:f:i:w:hv", long_options,
+ &option_index);
+
+ if (c == -1)
+ break;
+
+ switch (c)
+ {
+ case 'c': opts.config = optarg;
+ break;
+
+ case 'd': opts.num_dsps = atoi(optarg);
+ assert(opts.num_dsps >= 0 && opts.num_dsps <=
+ Executor::GetNumDevices(DeviceType::DSP));
+ break;
+
+ case 'e': opts.num_eves = atoi(optarg);
+ assert(opts.num_eves >= 0 && opts.num_eves <=
+ Executor::GetNumDevices(DeviceType::EVE));
+ break;
+
+ case 'f': opts.num_frames = atoi(optarg);
+ assert (opts.num_frames > 0);
+ break;
+
+ case 'i': opts.input_file = optarg;
+ break;
+
+ case 'w': opts.output_width = atoi(optarg);
+ assert (opts.output_width > 0);
+ break;
+
+ case 'v': opts.verbose = true;
+ break;
+
+ case 'h': return false;
+ break;
+
+ case '?': // Error in getopt_long
+ exit(EXIT_FAILURE);
+ break;
+
+ default:
+ cerr << "Unsupported option: " << c << endl;
+ return false;
+ break;
+ }
+ }
+
+ opts.is_camera_input = (opts.input_file.size() > 5 &&
+ opts.input_file.substr(0, 6) == "camera");
+ if (opts.input_file.size() > 4)
+ {
+ string suffix = opts.input_file.substr(opts.input_file.size() - 4, 4);
+ opts.is_video_input = (suffix == ".mp4") || (suffix == ".avi") ||
+ (suffix == ".mov");
+    }
+
+    return true;
+}
+
+// Set Video Input and Output
+bool SetVideoInputOutput(VideoCapture &cap, const cmdline_opts_t& opts,
+ const char* window_name)
+{
+ if (opts.is_camera_input || opts.is_video_input)
+ {
+ if (opts.is_camera_input)
+ {
+ int port_num = 1; // if TMDSCM572X camera module on AM57x EVM
+ if (opts.input_file.size() > 6) // "camera#"
+ port_num = stoi(opts.input_file.substr(6,
+ opts.input_file.size() - 6));
+ cap = VideoCapture(port_num);
+ }
+ else
+ cap = VideoCapture(opts.input_file);
+ if (! cap.isOpened())
+ {
+ cerr << "Cannot open video input: " << opts.input_file << endl;
+ return false;
+ }
+ namedWindow(window_name, WINDOW_AUTOSIZE | CV_GUI_NORMAL);
+    }
+
+    return true;
+}
+
diff --git a/examples/common/video_utils.h b/examples/common/video_utils.h
--- /dev/null
@@ -0,0 +1,53 @@
+/******************************************************************************
+ * Copyright (c) 2018, Texas Instruments Incorporated - http://www.ti.com/
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution.
+ * * Neither the name of Texas Instruments Incorporated nor the
+ * names of its contributors may be used to endorse or promote products
+ * derived from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+ * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+ * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF
+ * THE POSSIBILITY OF SUCH DAMAGE.
+ *****************************************************************************/
+#pragma once
+
+#include "utils.h"
+#include "opencv2/core.hpp"
+#include "opencv2/imgproc.hpp"
+#include "opencv2/highgui.hpp"
+#include "opencv2/videoio.hpp"
+
+using namespace cv;
+
+typedef struct cmdline_opts_t_ {
+ std::string config;
+ uint32_t num_dsps;
+ uint32_t num_eves;
+ uint32_t num_frames;
+ std::string input_file;
+ uint32_t output_width;
+ bool verbose;
+ bool is_camera_input;
+ bool is_video_input;
+} cmdline_opts_t;
+
+
+bool ProcessArgs(int argc, char *argv[], cmdline_opts_t& opts);
+bool SetVideoInputOutput(VideoCapture &cap, const cmdline_opts_t& opts,
+ const char* window_name);
index cfb3ba849df205058678d73181f054778c566dea..637cd07e940077a232f6bcfb486753f56834a160 100644 (file)
LIBS += -lopencv_highgui -lopencv_imgcodecs -lopencv_videoio\
-lopencv_imgproc -lopencv_core
-SOURCES = main.cpp imagenet_classes.cpp
+SOURCES = main.cpp imagenet_classes.cpp ../common/utils.cpp \
+ ../common/video_utils.cpp
$(EXE): $(TIDL_API_LIB) $(TIDL_API_LIB_IMGUTIL) $(HEADERS) $(SOURCES)
$(CXX) $(CXXFLAGS) $(SOURCES) $(TIDL_API_LIB) $(TIDL_API_LIB_IMGUTIL) \
index ec6c77fe943233e39305a469329fe977b848c414..8504f1132b0b714706309a9f28b01481fd4ce00d 100644 (file)
* THE POSSIBILITY OF SUCH DAMAGE.
*****************************************************************************/
#include <signal.h>
-#include <getopt.h>
#include <iostream>
#include <iomanip>
#include <fstream>
#include "configuration.h"
#include "imagenet_classes.h"
#include "imgutil.h"
+#include "../common/video_utils.h"
#include "opencv2/core.hpp"
#include "opencv2/imgproc.hpp"
#include "opencv2/highgui.hpp"
#include "opencv2/videoio.hpp"
-
-bool __TI_show_debug_ = false;
-
+using namespace std;
using namespace tidl;
using namespace tidl::imgutil;
using namespace cv;
#define NUM_VIDEO_FRAMES 300
+#define DEFAULT_CONFIG "j11_v2"
#define NUM_DEFAULT_INPUTS 1
const char *default_inputs[NUM_DEFAULT_INPUTS] =
{
"../test/testvecs/input/objects/cat-pet-animal-domestic-104827.jpeg"
};
-bool is_camera_input = false;
-bool is_video_input = false;
-
-bool RunConfiguration(const std::string& config_file,
- uint32_t num_dsps, uint32_t num_eves,
- std::string& input_file, int num_frames);
-bool RunAllConfigurations(int32_t num_devices, DeviceType device_type);
-
-void ReportTime(ExecutionObject& eo);
-bool ReadFrame(ExecutionObject& eo, int frame_idx,
- const Configuration& configuration, int num_frames,
- const std::string& input_file, VideoCapture &cap);
+Executor* CreateExecutor(DeviceType dt, int num, const Configuration& c);
+bool RunConfiguration(cmdline_opts_t& opts);
+bool ReadFrame(ExecutionObject& eo, int frame_idx, const Configuration& c,
+ const cmdline_opts_t& opts, VideoCapture &cap);
bool WriteFrameOutput(const ExecutionObject &eo);
-
-static
-void ProcessArgs(int argc, char *argv[], std::string& config,
- uint32_t& num_dsps, uint32_t& num_eves,
- std::string& input_file, int &num_frames);
-bool IsCameraOrVideoInput(const std::string& s);
-
-static void DisplayHelp();
+void DisplayHelp();
int main(int argc, char *argv[])
uint32_t num_dsps = Executor::GetNumDevices(DeviceType::DSP);
if (num_eves == 0 && num_dsps == 0)
{
- std::cout << "TI DL not supported on this SoC." << std::endl;
+ cout << "TI DL not supported on this SoC." << endl;
return EXIT_SUCCESS;
}
// Process arguments
- std::string config = "j11_v2";
- std::string input_file;
- if (num_eves != 0) { num_eves = 1; num_dsps = 0; }
- else { num_eves = 0; num_dsps = 1; }
- int num_frames = 1;
- ProcessArgs(argc, argv, config, num_dsps, num_eves, input_file, num_frames);
-
- assert(num_dsps != 0 || num_eves != 0);
-
- if (IsCameraOrVideoInput(input_file) && num_frames == 1)
- num_frames = NUM_VIDEO_FRAMES;
- if (input_file.empty())
- std::cout << "Input: " << default_inputs[0] << std::endl;
+ cmdline_opts_t opts;
+ opts.config = DEFAULT_CONFIG;
+ if (num_eves != 0) { opts.num_eves = 1; opts.num_dsps = 0; }
+ else { opts.num_eves = 0; opts.num_dsps = 1; }
+ if (! ProcessArgs(argc, argv, opts))
+ {
+ DisplayHelp();
+ exit(EXIT_SUCCESS);
+ }
+ assert(opts.num_dsps != 0 || opts.num_eves != 0);
+ if (opts.num_frames == 0)
+ opts.num_frames = (opts.is_camera_input || opts.is_video_input) ?
+ NUM_VIDEO_FRAMES : 1;
+ if (opts.input_file.empty())
+ cout << "Input: " << default_inputs[0] << endl;
else
- std::cout << "Input: " << input_file << std::endl;
+ cout << "Input: " << opts.input_file << endl;
- std::string config_file = "../test/testvecs/config/infer/tidl_config_"
- + config + ".txt";
- bool status = RunConfiguration(config_file, num_dsps, num_eves,
- input_file, num_frames);
+ // Run network
+ bool status = RunConfiguration(opts);
if (!status)
{
- std::cout << "imagenet FAILED" << std::endl;
+ cout << "imagenet FAILED" << endl;
return EXIT_FAILURE;
}
- std::cout << "imagenet PASSED" << std::endl;
+ cout << "imagenet PASSED" << endl;
return EXIT_SUCCESS;
}
-bool IsCameraOrVideoInput(const std::string& s)
-{
- is_camera_input = (s == "camera");
- is_video_input = (s.size() > 4 && s.substr(s.size() - 4, 4) == ".mp4");
- return is_camera_input || is_video_input;
-}
-
-bool RunConfiguration(const std::string& config_file,
- uint32_t num_dsps, uint32_t num_eves,
- std::string& input_file, int num_frames)
+bool RunConfiguration(cmdline_opts_t& opts)
{
- DeviceIds dsp_ids, eve_ids;
- for (uint32_t i = 0; i < num_dsps; i++)
- dsp_ids.insert(static_cast<DeviceId>(i));
- for (uint32_t i = 0; i < num_eves; i++)
- eve_ids.insert(static_cast<DeviceId>(i));
-
// Read the TI DL configuration file
- Configuration configuration;
- bool status = configuration.ReadFromFile(config_file);
+ Configuration c;
+ string config_file = "../test/testvecs/config/infer/tidl_config_"
+ + opts.config + ".txt";
+ bool status = c.ReadFromFile(config_file);
if (!status)
{
- std::cerr << "Error in configuration file: " << config_file
- << std::endl;
+ cerr << "Error in configuration file: " << config_file << endl;
return false;
}
+ c.enableApiTrace = opts.verbose;
- // setup input
+ // setup camera/video input/output
VideoCapture cap;
- if (is_camera_input || is_video_input)
- {
- if (is_camera_input)
- cap = VideoCapture(1);
- else
- cap = VideoCapture(input_file);
- if (! cap.isOpened())
- {
- std::cerr << "Cannot open video input: " << input_file << std::endl;
- return false;
- }
- namedWindow("ImageNet", WINDOW_AUTOSIZE | CV_GUI_NORMAL);
- }
+ if (! SetVideoInputOutput(cap, opts, "ImageNet")) return false;
try
{
// Create Executors with the approriate core type, number of cores
// and configuration specified
- Executor *e_eve = (num_eves == 0) ? nullptr :
- new Executor(DeviceType::EVE, eve_ids, configuration);
- Executor *e_dsp = (num_dsps == 0) ? nullptr :
- new Executor(DeviceType::DSP, dsp_ids, configuration);
+ Executor* e_eve = CreateExecutor(DeviceType::EVE, opts.num_eves, c);
+ Executor* e_dsp = CreateExecutor(DeviceType::DSP, opts.num_dsps, c);
// Get ExecutionObjects from Executors
- std::vector<ExecutionObject*> eos;
- for (uint32_t i = 0; i < num_eves; i++) eos.push_back((*e_eve)[i]);
- for (uint32_t i = 0; i < num_dsps; i++) eos.push_back((*e_dsp)[i]);
- int num_eos = eos.size();
+ vector<ExecutionObject*> eos;
+ for (uint32_t i = 0; i < opts.num_eves; i++) eos.push_back((*e_eve)[i]);
+ for (uint32_t i = 0; i < opts.num_dsps; i++) eos.push_back((*e_dsp)[i]);
+ uint32_t num_eos = eos.size();
// Allocate input and output buffers for each ExecutionObject
- std::vector<void *> buffers;
- for (auto eo : eos)
- {
- size_t in_size = eo->GetInputBufferSizeInBytes();
- size_t out_size = eo->GetOutputBufferSizeInBytes();
- void* in_ptr = malloc(in_size);
- void* out_ptr = malloc(out_size);
- assert(in_ptr != nullptr && out_ptr != nullptr);
- buffers.push_back(in_ptr);
- buffers.push_back(out_ptr);
-
- ArgInfo in = { ArgInfo(in_ptr, in_size)};
- ArgInfo out = { ArgInfo(out_ptr, out_size)};
- eo->SetInputOutputBuffer(in, out);
- }
+ AllocateMemory(eos);
- std::chrono::time_point<std::chrono::steady_clock> tloop0, tloop1;
- tloop0 = std::chrono::steady_clock::now();
+ chrono::time_point<chrono::steady_clock> tloop0, tloop1;
+ tloop0 = chrono::steady_clock::now();
- // Process frames with available execution objects in a pipelined manner
+ // Process frames with available eos in a pipelined manner
// additional num_eos iterations to flush the pipeline (epilogue)
- for (int frame_idx = 0;
- frame_idx < num_frames + num_eos; frame_idx++)
+ for (uint32_t frame_idx = 0;
+ frame_idx < opts.num_frames + num_eos; frame_idx++)
{
ExecutionObject* eo = eos[frame_idx % num_eos];
// Wait for previous frame on the same eo to finish processing
if (eo->ProcessFrameWait())
{
- ReportTime(*eo);
+ ReportTime(eo);
WriteFrameOutput(*eo);
}
// Read a frame and start processing it with current eo
- if (ReadFrame(*eo, frame_idx, configuration, num_frames,
- input_file, cap))
- {
+ if (ReadFrame(*eo, frame_idx, c, opts, cap))
eo->ProcessFrameStartAsync();
- }
}
- tloop1 = std::chrono::steady_clock::now();
- std::chrono::duration<float> elapsed = tloop1 - tloop0;
- std::cout << "Loop total time (including read/write/opencv/print/etc): "
- << std::setw(6) << std::setprecision(4)
- << (elapsed.count() * 1000) << "ms" << std::endl;
+ tloop1 = chrono::steady_clock::now();
+ chrono::duration<float> elapsed = tloop1 - tloop0;
+ cout << "Loop total time (including read/write/opencv/print/etc): "
+ << setw(6) << setprecision(4)
+ << (elapsed.count() * 1000) << "ms" << endl;
- for (auto b : buffers)
- free(b);
+ FreeMemory(eos);
delete e_eve;
delete e_dsp;
}
catch (tidl::Exception &e)
{
- std::cerr << e.what() << std::endl;
+ cerr << e.what() << endl;
status = false;
}
return status;
}
-void ReportTime(ExecutionObject& eo)
+// Create an Executor with the specified type and number of EOs
+Executor* CreateExecutor(DeviceType dt, int num, const Configuration& c)
{
- double elapsed_host = eo.GetHostProcessTimeInMilliSeconds();
- double elapsed_device = eo.GetProcessTimeInMilliSeconds();
- double overhead = 100 - (elapsed_device/elapsed_host*100);
-
- std::cout << "frame[" << eo.GetFrameIndex() << "]: "
- << "Time on " << eo.GetDeviceName() << ": "
- << std::setw(6) << std::setprecision(4)
- << elapsed_device << "ms, "
- << "host: "
- << std::setw(6) << std::setprecision(4)
- << elapsed_host << "ms ";
- std::cout << "API overhead: "
- << std::setw(6) << std::setprecision(3)
- << overhead << " %" << std::endl;
+ if (num == 0) return nullptr;
+
+ DeviceIds ids;
+    for (int i = 0; i < num; i++)
+ ids.insert(static_cast<DeviceId>(i));
+
+ return new Executor(dt, ids, c);
}
-bool ReadFrame(ExecutionObject &eo, int frame_idx,
- const Configuration& configuration, int num_frames,
- const std::string& input_file, VideoCapture &cap)
+bool ReadFrame(ExecutionObject &eo, int frame_idx, const Configuration& c,
+ const cmdline_opts_t& opts, VideoCapture &cap)
{
- if (frame_idx >= num_frames)
+ if (frame_idx >= opts.num_frames)
return false;
eo.SetFrameIndex(frame_idx);
assert (frame_buffer != nullptr);
Mat image;
- if (! is_camera_input && ! is_video_input)
+ if (! opts.is_camera_input && ! opts.is_video_input)
{
- if (input_file.empty())
+ if (opts.input_file.empty())
image = cv::imread(default_inputs[frame_idx % NUM_DEFAULT_INPUTS],
CV_LOAD_IMAGE_COLOR);
else
- image = cv::imread(input_file, CV_LOAD_IMAGE_COLOR);
+ image = cv::imread(opts.input_file, CV_LOAD_IMAGE_COLOR);
if (image.empty())
{
- std::cerr << "Unable to read input image" << std::endl;
+ cerr << "Unable to read input image" << endl;
return false;
}
}
Mat v_image;
if (! cap.grab()) return false;
if (! cap.retrieve(v_image)) return false;
- if (is_camera_input)
- // Crop 640x480 camera input to center 256x256 input
- image = Mat(v_image, Rect(192, 112, 256, 256));
+ int orig_width = v_image.cols;
+ int orig_height = v_image.rows;
+ // Crop camera/video input to center 256x256 input
+ if (orig_width > 256 && orig_height > 256)
+ {
+ image = Mat(v_image, Rect((orig_width-256)/2, (orig_height-256)/2,
+ 256, 256));
+ }
else
image = v_image;
cv::imshow("ImageNet", image);
}
// TI DL image preprocessing, into frame_buffer
- return PreProcImage(image, frame_buffer, 1, 3,
- configuration.inWidth, configuration.inHeight,
- configuration.inWidth,
- configuration.inWidth * configuration.inHeight,
- 1, configuration.preProcType);
+ return PreProcImage(image, frame_buffer, 1, 3, c.inWidth, c.inHeight,
+ c.inWidth, c.inWidth * c.inHeight, 1, c.preProcType);
}
// Display top 5 classified imagenet classes with probabilities
int out_size = eo.GetOutputBufferSizeInBytes();
// sort and get k largest values and corresponding indices
- typedef std::pair<unsigned char, int> val_index;
+ typedef pair<unsigned char, int> val_index;
auto constexpr cmp = [](val_index &left, val_index &right)
{ return left.first > right.first; };
- std::priority_queue<val_index, std::vector<val_index>, decltype(cmp)>
- queue(cmp);
+ priority_queue<val_index, vector<val_index>, decltype(cmp)> queue(cmp);
// initialize priority queue with smallest value on top
for (int i = 0; i < k; i++)
queue.push(val_index(out[i], i));
}
// output top k values in reverse order: largest val first
- std::vector<val_index> sorted;
+ vector<val_index> sorted;
while (! queue.empty())
{
sorted.push_back(queue.top());
}
for (int i = k - 1; i >= 0; i--)
- {
- std::cout << k-i << ": " << imagenet_classes[sorted[i].second]
- << std::endl;
- }
+ cout << k-i << ": " << imagenet_classes[sorted[i].second] << endl;
return true;
}
-
-void ProcessArgs(int argc, char *argv[], std::string& config,
- uint32_t& num_dsps, uint32_t& num_eves,
- std::string& input_file, int &num_frames)
-{
- const struct option long_options[] =
- {
- {"config", required_argument, 0, 'c'},
- {"num_devices", required_argument, 0, 'n'},
- {"device_type", required_argument, 0, 't'},
- {"image_file", required_argument, 0, 'i'},
- {"num_frames", required_argument, 0, 'f'},
- {"help", no_argument, 0, 'h'},
- {"verbose", no_argument, 0, 'v'},
- {0, 0, 0, 0}
- };
-
- int option_index = 0;
-
- while (true)
- {
- int c = getopt_long(argc, argv, "c:d:e:i:f:hv", long_options, &option_index);
-
- if (c == -1)
- break;
-
- switch (c)
- {
- case 'c': config = optarg;
- break;
-
- case 'd': num_dsps = atoi(optarg);
- assert(num_dsps >= 0 && num_dsps <=
- Executor::GetNumDevices(DeviceType::DSP));
- break;
-
- case 'e': num_eves = atoi(optarg);
- assert(num_eves >= 0 && num_eves <=
- Executor::GetNumDevices(DeviceType::EVE));
- break;
-
- case 'i': input_file = optarg;
- break;
-
- case 'f': num_frames = atoi(optarg);
- assert (num_frames > 0);
- break;
-
- case 'v': __TI_show_debug_ = true;
- break;
-
- case 'h': DisplayHelp();
- exit(EXIT_SUCCESS);
- break;
-
- case '?': // Error in getopt_long
- exit(EXIT_FAILURE);
- break;
-
- default:
- std::cerr << "Unsupported option: " << c << std::endl;
- break;
- }
- }
-}
-
void DisplayHelp()
{
- std::cout << "Usage: imagenet\n"
- " Will run imagenet network to predict top 5 object"
- " classes for the input.\n Use -c to run a"
- " different imagenet network. Default is j11_v2.\n"
- "Optional arguments:\n"
- " -c <config> Valid configs: j11_bn, j11_prelu, j11_v2\n"
- " -d <number> Number of dsp cores to use\n"
- " -e <number> Number of eve cores to use\n"
- " -i <image> Path to the image file\n"
- " -i camera Use camera as input\n"
- " -i *.mp4 Use video file as input\n"
- " -f <number> Number of frames to process\n"
- " -v Verbose output during execution\n"
- " -h Help\n";
+ cout <<
+ "Usage: imagenet\n"
+ " Will run imagenet network to predict top 5 object"
+ " classes for the input.\n Use -c to run a"
+ " different imagenet network. Default is j11_v2.\n"
+ "Optional arguments:\n"
+ " -c <config> Valid configs: j11_bn, j11_prelu, j11_v2\n"
+ " -d <number> Number of dsp cores to use\n"
+ " -e <number> Number of eve cores to use\n"
+ " -i <image> Path to the image file as input\n"
+ " -i camera<number> Use camera as input\n"
+ " video input port: /dev/video<number>\n"
+ " -i <name>.{mp4,mov,avi} Use video file as input\n"
+ " -f <number> Number of frames to process\n"
+ " -v Verbose output during execution\n"
+ " -h Help\n";
}
index 009efdd8f0e4231d37780a849fb21030603efbf4..36197c82905040efc85dbb0eaeba9e817abdfd1b 100644 (file)
Executor* CreateExecutor(DeviceType dt, int num, const Configuration& c);
void CollectEOs(const Executor *e, vector<ExecutionObject *>& EOs);
-void AllocateMemory(const vector<ExecutionObject *>& EOs);
-void FreeMemory (const vector<ExecutionObject *>& EOs);
-
int main(int argc, char *argv[])
{
EOs.push_back((*e)[i]);
}
-// Allocate input and output memory for each EO
-void AllocateMemory(const vector<ExecutionObject *>& EOs)
-{
- // Allocate input and output buffers for each execution object
- for (auto eo : EOs)
- {
- size_t in_size = eo->GetInputBufferSizeInBytes();
- size_t out_size = eo->GetOutputBufferSizeInBytes();
- ArgInfo in = { ArgInfo(malloc(in_size), in_size)};
- ArgInfo out = { ArgInfo(malloc(out_size), out_size)};
- eo->SetInputOutputBuffer(in, out);
- }
-}
-
-// Free the input and output memory associated with each EO
-void FreeMemory(const vector<ExecutionObject *>& EOs)
-{
- for (auto eo : EOs)
- {
- free(eo->GetInputBufferPtr());
- free(eo->GetOutputBufferPtr());
- }
-
-}
index ac22366af10d0ba402cfe3439791591a6b4a8183..50f3fe38cf00672c24584443bee3ac5232cbe024 100644 (file)
LIBS += -lopencv_highgui -lopencv_imgcodecs -lopencv_videoio\
-lopencv_imgproc -lopencv_core
-SOURCES = main.cpp object_classes.cpp
+SOURCES = main.cpp object_classes.cpp ../common/utils.cpp \
+ ../common/video_utils.cpp
$(EXE): $(TIDL_API_LIB) $(HEADERS) $(SOURCES)
$(CXX) $(CXXFLAGS) $(SOURCES) $(TIDL_API_LIB) \
index 67f1a4acaeff3e80138096f1240efc50fe5fe04d..91d68f48615abaef9156af5f1c8b0c22bd32c0b0 100644 (file)
* THE POSSIBILITY OF SUCH DAMAGE.
*****************************************************************************/
#include <signal.h>
-#include <getopt.h>
#include <iostream>
#include <iomanip>
#include <fstream>
#include <queue>
#include <vector>
#include <cstdio>
+#include <chrono>
#include "executor.h"
#include "execution_object.h"
#include "configuration.h"
#include "object_classes.h"
+#include "../common/utils.h"
+#include "../common/video_utils.h"
-#include "opencv2/core.hpp"
-#include "opencv2/imgproc.hpp"
-#include "opencv2/highgui.hpp"
-#include "opencv2/videoio.hpp"
-
-#define NUM_VIDEO_FRAMES 100
-#define DEFAULT_CONFIG "jseg21_tiscapes"
-#define DEFAULT_INPUT "../test/testvecs/input/000100_1024x512_bgr.y"
-
-bool __TI_show_debug_ = false;
-bool is_default_input = false;
-bool is_preprocessed_input = false;
-bool is_camera_input = false;
-int orig_width;
-int orig_height;
-object_class_table_t *object_class_table;
-
+using namespace std;
using namespace tidl;
using namespace cv;
-bool RunConfiguration(const std::string& config_file, int num_devices,
- DeviceType device_type, std::string& input_file);
-bool RunAllConfigurations(int32_t num_devices, DeviceType device_type);
-
-bool ReadFrame(ExecutionObject& eo, int frame_idx,
- const Configuration& configuration, int num_frames,
- std::string& image_file, VideoCapture &cap);
-
-bool WriteFrameOutput(const ExecutionObject &eo,
- const Configuration& configuration);
+#define NUM_VIDEO_FRAMES 300
+#define DEFAULT_CONFIG "jseg21_tiscapes"
+#define DEFAULT_INPUT "../test/testvecs/input/000100_1024x512_bgr.y"
+#define DEFAULT_INPUT_FRAMES (9)
-static void ProcessArgs(int argc, char *argv[],
- std::string& config,
- int& num_devices,
- DeviceType& device_type,
- std::string& input_file);
+object_class_table_t *object_class_table;
+uint32_t orig_width;
+uint32_t orig_height;
-static void DisplayHelp();
-static double ms_diff(struct timespec &t0, struct timespec &t1)
-{ return (t1.tv_sec - t0.tv_sec) * 1e3 + (t1.tv_nsec - t0.tv_nsec) / 1e6; }
+bool RunConfiguration(const cmdline_opts_t& opts);
+Executor* CreateExecutor(DeviceType dt, int num, const Configuration& c);
+bool ReadFrame(ExecutionObject& eo, int frame_idx, const Configuration& c,
+ const cmdline_opts_t& opts, VideoCapture &cap);
+bool WriteFrameOutput(const ExecutionObject &eo, const Configuration& c,
+ const cmdline_opts_t& opts);
+void DisplayHelp();
int main(int argc, char *argv[])
signal(SIGTERM, exit);
// If there are no devices capable of offloading TIDL on the SoC, exit
- uint32_t num_eve = Executor::GetNumDevices(DeviceType::EVE);
- uint32_t num_dsp = Executor::GetNumDevices(DeviceType::DSP);
- if (num_eve == 0 && num_dsp == 0)
+ uint32_t num_eves = Executor::GetNumDevices(DeviceType::EVE);
+ uint32_t num_dsps = Executor::GetNumDevices(DeviceType::DSP);
+ if (num_eves == 0 && num_dsps == 0)
{
- std::cout << "TI DL not supported on this SoC." << std::endl;
+ cout << "TI DL not supported on this SoC." << endl;
return EXIT_SUCCESS;
}
// Process arguments
- std::string config = DEFAULT_CONFIG;
- std::string input_file = DEFAULT_INPUT;
- int num_devices = 1;
- DeviceType device_type = (num_eve > 0 ? DeviceType::EVE:DeviceType::DSP);
- ProcessArgs(argc, argv, config, num_devices, device_type, input_file);
+ cmdline_opts_t opts;
+ opts.config = DEFAULT_CONFIG;
+ if (num_eves != 0) { opts.num_eves = 1; opts.num_dsps = 0; }
+ else { opts.num_eves = 0; opts.num_dsps = 1; }
+ if (! ProcessArgs(argc, argv, opts))
+ {
+ DisplayHelp();
+ exit(EXIT_SUCCESS);
+ }
+ assert(opts.num_dsps != 0 || opts.num_eves != 0);
+ if (opts.num_frames == 0)
+ opts.num_frames = (opts.is_camera_input || opts.is_video_input) ?
+ NUM_VIDEO_FRAMES :
+ (opts.input_file.empty() ? DEFAULT_INPUT_FRAMES : 1);
+ if (opts.input_file.empty())
+ cout << "Input: " << DEFAULT_INPUT << endl;
+ else
+ cout << "Input: " << opts.input_file << endl;
- if ((object_class_table = GetObjectClassTable(config)) == nullptr)
+ // Get object class table
+ if ((object_class_table = GetObjectClassTable(opts.config)) == nullptr)
{
- std::cout << "No object classes defined for this config." << std::endl;
+ cout << "No object classes defined for this config." << endl;
return EXIT_FAILURE;
}
- if (input_file == DEFAULT_INPUT) is_default_input = true;
- if (input_file == "camera") is_camera_input = true;
- if (input_file.length() > 2 &&
- input_file.compare(input_file.length() - 2, 2, ".y") == 0)
- is_preprocessed_input = true;
- std::cout << "Input: " << input_file << std::endl;
- std::string config_file = "../test/testvecs/config/infer/tidl_config_"
- + config + ".txt";
- bool status = RunConfiguration(config_file, num_devices, device_type,
- input_file);
-
+ // Run network
+ bool status = RunConfiguration(opts);
if (!status)
{
- std::cout << "segmentation FAILED" << std::endl;
+ cout << "segmentation FAILED" << endl;
return EXIT_FAILURE;
}
- std::cout << "segmentation PASSED" << std::endl;
+ cout << "segmentation PASSED" << endl;
return EXIT_SUCCESS;
}
-bool RunConfiguration(const std::string& config_file, int num_devices,
- DeviceType device_type, std::string& input_file)
+bool RunConfiguration(const cmdline_opts_t& opts)
{
- DeviceIds ids;
- for (int i = 0; i < num_devices; i++)
- ids.insert(static_cast<DeviceId>(i));
-
// Read the TI DL configuration file
- Configuration configuration;
- bool status = configuration.ReadFromFile(config_file);
+ Configuration c;
+ std::string config_file = "../test/testvecs/config/infer/tidl_config_"
+ + opts.config + ".txt";
+ bool status = c.ReadFromFile(config_file);
if (!status)
{
- std::cerr << "Error in configuration file: " << config_file
- << std::endl;
+ cerr << "Error in configuration file: " << config_file << endl;
return false;
}
+ c.enableApiTrace = opts.verbose;
- // setup input
- int num_frames = is_default_input ? 3 : 1;
+ // setup camera/video input/output
VideoCapture cap;
- std::string image_file;
- if (is_camera_input)
- {
- cap = VideoCapture(1); // cap = VideoCapture("test.mp4");
- if (! cap.isOpened())
- {
- std::cerr << "Cannot open camera input." << std::endl;
- return false;
- }
- num_frames = NUM_VIDEO_FRAMES;
- namedWindow("Segmentation", WINDOW_AUTOSIZE | CV_GUI_NORMAL);
- }
- else
- {
- image_file = input_file;
- }
+ if (! SetVideoInputOutput(cap, opts, "Segmentation")) return false;
try
{
- // Create a executor with the approriate core type, number of cores
+        // Create Executors with the appropriate core type, number of cores
// and configuration specified
- Executor executor(device_type, ids, configuration);
+ Executor* e_eve = CreateExecutor(DeviceType::EVE, opts.num_eves, c);
+ Executor* e_dsp = CreateExecutor(DeviceType::DSP, opts.num_dsps, c);
- // Query Executor for set of ExecutionObjects created
- const ExecutionObjects& execution_objects =
- executor.GetExecutionObjects();
- int num_eos = execution_objects.size();
+ // Get ExecutionObjects from Executors
+ vector<ExecutionObject*> eos;
+ for (uint32_t i = 0; i < opts.num_eves; i++) eos.push_back((*e_eve)[i]);
+ for (uint32_t i = 0; i < opts.num_dsps; i++) eos.push_back((*e_dsp)[i]);
+ uint32_t num_eos = eos.size();
// Allocate input and output buffers for each execution object
- std::vector<void *> buffers;
- for (auto &eo : execution_objects)
- {
- size_t in_size = eo->GetInputBufferSizeInBytes();
- size_t out_size = eo->GetOutputBufferSizeInBytes();
- ArgInfo in = { ArgInfo(malloc(in_size), in_size)};
- ArgInfo out = { ArgInfo(malloc(out_size), out_size)};
- eo->SetInputOutputBuffer(in, out);
-
- buffers.push_back(in.ptr());
- buffers.push_back(out.ptr());
- }
+ AllocateMemory(eos);
- #define MAX_NUM_EOS 4
- struct timespec t0[MAX_NUM_EOS], t1;
+ chrono::time_point<chrono::steady_clock> tloop0, tloop1;
+ tloop0 = chrono::steady_clock::now();
- // Process frames with available execution objects in a pipelined manner
+ // Process frames with available eos in a pipelined manner
// additional num_eos iterations to flush the pipeline (epilogue)
- for (int frame_idx = 0;
- frame_idx < num_frames + num_eos; frame_idx++)
+ for (uint32_t frame_idx = 0;
+ frame_idx < opts.num_frames + num_eos; frame_idx++)
{
- ExecutionObject* eo = execution_objects[frame_idx % num_eos].get();
+ ExecutionObject* eo = eos[frame_idx % num_eos];
// Wait for previous frame on the same eo to finish processing
if (eo->ProcessFrameWait())
{
- clock_gettime(CLOCK_MONOTONIC, &t1);
- double elapsed_host =
- ms_diff(t0[eo->GetFrameIndex() % num_eos], t1);
- double elapsed_device = eo->GetProcessTimeInMilliSeconds();
- double overhead = 100 - (elapsed_device/elapsed_host*100);
-
- std::cout << "frame[" << eo->GetFrameIndex() << "]: "
- << "Time on device: "
- << std::setw(6) << std::setprecision(4)
- << elapsed_device << "ms, "
- << "host: "
- << std::setw(6) << std::setprecision(4)
- << elapsed_host << "ms ";
- std::cout << "API overhead: "
- << std::setw(6) << std::setprecision(3)
- << overhead << " %" << std::endl;
-
- WriteFrameOutput(*eo, configuration);
+ ReportTime(eo);
+ WriteFrameOutput(*eo, c, opts);
}
// Read a frame and start processing it with current eo
- if (ReadFrame(*eo, frame_idx, configuration, num_frames,
- image_file, cap))
- {
- clock_gettime(CLOCK_MONOTONIC, &t0[frame_idx % num_eos]);
+ if (ReadFrame(*eo, frame_idx, c, opts, cap))
eo->ProcessFrameStartAsync();
- }
}
- for (auto b : buffers)
- free(b);
+ tloop1 = chrono::steady_clock::now();
+ chrono::duration<float> elapsed = tloop1 - tloop0;
+ cout << "Loop total time (including read/write/opencv/print/etc): "
+ << setw(6) << setprecision(4)
+ << (elapsed.count() * 1000) << "ms" << endl;
+ FreeMemory(eos);
+ delete e_eve;
+ delete e_dsp;
}
catch (tidl::Exception &e)
{
- std::cerr << e.what() << std::endl;
+ cerr << e.what() << endl;
status = false;
}
return status;
}
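The processing loop above cycles frames through the execution objects round-robin (`eos[frame_idx % num_eos]`) and runs `num_eos` extra iterations so every in-flight frame gets one final `ProcessFrameWait()`. A device-free sketch of that schedule, with a hypothetical `PipelineSchedule` helper standing in for the real wait/read/start calls:

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Device-free model of the round-robin pipeline: frame_idx selects
// eos[frame_idx % num_eos]; iterating num_frames + num_eos times covers
// the epilogue that flushes frames still in flight. Returns the
// (frame_idx, eo_idx) issue order; the epilogue iterations issue nothing.
std::vector<std::pair<int, int>> PipelineSchedule(int num_frames, int num_eos)
{
    std::vector<std::pair<int, int>> schedule;
    for (int frame_idx = 0; frame_idx < num_frames + num_eos; frame_idx++)
    {
        int eo_idx = frame_idx % num_eos;
        // Real loop: eo->ProcessFrameWait() flushes the previous frame on
        // this EO here, then ReadFrame()/ProcessFrameStartAsync() issues
        // the next one (skipped once frame_idx >= num_frames).
        if (frame_idx < num_frames)
            schedule.push_back(std::make_pair(frame_idx, eo_idx));
    }
    return schedule;
}
```

With 2 EOs, frames 0, 2, 4, ... land on EO 0 and frames 1, 3, 5, ... on EO 1, so the two devices process alternating frames concurrently.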
+// Create an Executor with the specified type and number of EOs
+Executor* CreateExecutor(DeviceType dt, int num, const Configuration& c)
+{
+ if (num == 0) return nullptr;
+
+ DeviceIds ids;
+    for (int i = 0; i < num; i++)
+ ids.insert(static_cast<DeviceId>(i));
+
+ return new Executor(dt, ids, c);
+}
-bool ReadFrame(ExecutionObject &eo, int frame_idx,
- const Configuration& configuration, int num_frames,
- std::string& image_file, VideoCapture &cap)
+bool ReadFrame(ExecutionObject &eo, int frame_idx, const Configuration& c,
+ const cmdline_opts_t& opts, VideoCapture &cap)
{
- if (frame_idx >= num_frames)
+ if (frame_idx >= opts.num_frames)
return false;
eo.SetFrameIndex(frame_idx);
char* frame_buffer = eo.GetInputBufferPtr();
assert (frame_buffer != nullptr);
- int channel_size = configuration.inWidth * configuration.inHeight;
+ int channel_size = c.inWidth * c.inHeight;
Mat image;
- if (! image_file.empty())
+ if (! opts.is_camera_input && ! opts.is_video_input)
{
- if (is_preprocessed_input)
+ if (opts.input_file.empty())
{
- std::ifstream ifs(image_file, std::ios::binary);
- ifs.seekg(frame_idx * channel_size * 3);
+ ifstream ifs(DEFAULT_INPUT, ios::binary);
+ ifs.seekg((frame_idx % DEFAULT_INPUT_FRAMES) * channel_size * 3);
ifs.read(frame_buffer, channel_size * 3);
bool ifs_status = ifs.good();
ifs.close();
- orig_width = configuration.inWidth;
- orig_height = configuration.inHeight;
+ orig_width = c.inWidth;
+ orig_height = c.inHeight;
return ifs_status; // already PreProc-ed
}
else
{
- image = cv::imread(image_file, CV_LOAD_IMAGE_COLOR);
+ image = cv::imread(opts.input_file, CV_LOAD_IMAGE_COLOR);
if (image.empty())
{
- std::cerr << "Unable to read from: " << image_file << std::endl;
+ cerr << "Unable to read from: " << opts.input_file << endl;
return false;
}
}
Mat s_image, bgr_frames[3];
orig_width = image.cols;
orig_height = image.rows;
- cv::resize(image, s_image,
- Size(configuration.inWidth, configuration.inHeight),
+ cv::resize(image, s_image, Size(c.inWidth, c.inHeight),
0, 0, cv::INTER_AREA);
cv::split(s_image, bgr_frames);
memcpy(frame_buffer, bgr_frames[0].ptr(), channel_size);
}
// Create frame overlayed with pixel-level segmentation
-bool WriteFrameOutput(const ExecutionObject &eo,
- const Configuration& configuration)
+bool WriteFrameOutput(const ExecutionObject &eo, const Configuration& c,
+ const cmdline_opts_t& opts)
{
unsigned char *out = (unsigned char *) eo.GetOutputBufferPtr();
- int width = configuration.inWidth;
- int height = configuration.inHeight;
+ int width = c.inWidth;
+ int height = c.inHeight;
int channel_size = width * height;
Mat mask, frame, blend, r_blend, bgr[3];
// Create overlayed frame
cv::addWeighted(frame, 0.7, mask, 0.3, 0.0, blend);
- cv::resize(blend, r_blend, Size(orig_width, orig_height));
- if (is_camera_input)
+ // Resize to output width/height, keep aspect ratio
+ uint32_t output_width = opts.output_width;
+ if (output_width == 0) output_width = orig_width;
+ uint32_t output_height = (output_width*1.0f) / orig_width * orig_height;
+ cv::resize(blend, r_blend, Size(output_width, output_height));
+
+ if (opts.is_camera_input || opts.is_video_input)
{
cv::imshow("Segmentation", r_blend);
waitKey(1);
{
int frame_index = eo.GetFrameIndex();
char outfile_name[64];
- if (is_preprocessed_input)
+ if (opts.input_file.empty())
{
snprintf(outfile_name, 64, "frame_%d.png", frame_index);
cv::imwrite(outfile_name, frame);
return true;
}
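The output sizing above keeps the aspect ratio: if no `-w` width was given, the original width is used, otherwise the height is scaled by the same factor as the width. The arithmetic can be isolated as a small helper (the `OutputHeight` name is illustrative, not from the examples):

```cpp
#include <cassert>
#include <cstdint>

// Aspect-ratio-preserving output height: opt_width == 0 means "no -w
// option given", in which case the original dimensions are kept.
uint32_t OutputHeight(uint32_t opt_width, uint32_t orig_w, uint32_t orig_h)
{
    uint32_t output_width = (opt_width == 0) ? orig_w : opt_width;
    return (uint32_t)((output_width * 1.0f) / orig_w * orig_h);
}
```

The float multiply before the divide mirrors the diff's `(output_width*1.0f) / orig_width * orig_height`, avoiding integer truncation when the scale factor is fractional.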
-
-void ProcessArgs(int argc, char *argv[], std::string& config,
- int& num_devices, DeviceType& device_type,
- std::string& input_file)
-{
- const struct option long_options[] =
- {
- {"config", required_argument, 0, 'c'},
- {"num_devices", required_argument, 0, 'n'},
- {"device_type", required_argument, 0, 't'},
- {"image_file", required_argument, 0, 'i'},
- {"help", no_argument, 0, 'h'},
- {"verbose", no_argument, 0, 'v'},
- {0, 0, 0, 0}
- };
-
- int option_index = 0;
-
- while (true)
- {
- int c = getopt_long(argc, argv, "c:n:t:i:hv", long_options, &option_index);
-
- if (c == -1)
- break;
-
- switch (c)
- {
- case 'c': config = optarg;
- break;
-
- case 'n': num_devices = atoi(optarg);
- assert (num_devices > 0 && num_devices <= 4);
- break;
-
- case 't': if (*optarg == 'e')
- device_type = DeviceType::EVE;
- else if (*optarg == 'd')
- device_type = DeviceType::DSP;
- else
- {
- std::cerr << "Invalid argument to -t, only e or d"
- " allowed" << std::endl;
- exit(EXIT_FAILURE);
- }
- break;
-
- case 'i': input_file = optarg;
- break;
-
- case 'v': __TI_show_debug_ = true;
- break;
-
- case 'h': DisplayHelp();
- exit(EXIT_SUCCESS);
- break;
-
- case '?': // Error in getopt_long
- exit(EXIT_FAILURE);
- break;
-
- default:
- std::cerr << "Unsupported option: " << c << std::endl;
- break;
- }
- }
-}
-
void DisplayHelp()
{
- std::cout << "Usage: segmentation\n"
- " Will run segmentation network to perform pixel-level"
- " classification.\n Use -c to run a different"
- " segmentation network. Default is jseg21_tiscapes.\n"
- "Optional arguments:\n"
- " -c <config> Valid configs: jseg21_tiscapes, jseg21\n"
- " -n <number of cores> Number of cores to use (1 - 4)\n"
- " -t <d|e> Type of core. d -> DSP, e -> EVE\n"
- " -i <image> Path to the image file\n"
- " Default are 3 frames in testvecs\n"
- " -i camera Use camera as input\n"
- " -v Verbose output during execution\n"
- " -h Help\n";
+ std::cout <<
+ "Usage: segmentation\n"
+ " Will run segmentation network to perform pixel-level"
+ " classification.\n Use -c to run a different"
+ " segmentation network. Default is jseg21_tiscapes.\n"
+ "Optional arguments:\n"
+ " -c <config> Valid configs: jseg21_tiscapes, jseg21\n"
+ " -d <number> Number of dsp cores to use\n"
+ " -e <number> Number of eve cores to use\n"
+ " -i <image> Path to the image file as input\n"
+ " Default are 9 frames in testvecs\n"
+ " -i camera<number> Use camera as input\n"
+ " video input port: /dev/video<number>\n"
+ " -i <name>.{mp4,mov,avi} Use video file as input\n"
+ " -f <number> Number of frames to process\n"
+ " -w <number> Output image/video width\n"
+ " -v Verbose output during execution\n"
+ " -h Help\n";
}
index f427f1c948b63d4750ebbedab8cc74aba42be933..3e67ba383c04c317801c1659b4e9b35d748419f0 100644 (file)
LIBS += -lopencv_highgui -lopencv_imgcodecs -lopencv_videoio\
-lopencv_imgproc -lopencv_core
-SOURCES = main.cpp ../segmentation/object_classes.cpp
+SOURCES = main.cpp ../segmentation/object_classes.cpp ../common/utils.cpp \
+ ../common/video_utils.cpp
$(EXE): $(TIDL_API_LIB) $(HEADERS) $(SOURCES)
$(CXX) $(CXXFLAGS) $(SOURCES) $(TIDL_API_LIB) $(LDFLAGS) $(LIBS) -o $@
index 70cda0663384d2388751bd1a9a663cfb8edf55a9..073a6efa0198596d8e03c488ed594090ae5dd7ec 100644 (file)
* THE POSSIBILITY OF SUCH DAMAGE.
*****************************************************************************/
#include <signal.h>
-#include <getopt.h>
#include <iostream>
#include <iomanip>
#include <fstream>
#include <queue>
#include <vector>
#include <cstdio>
+#include <chrono>
#include "executor.h"
#include "execution_object.h"
#include "execution_object_pipeline.h"
#include "configuration.h"
#include "../segmentation/object_classes.h"
+#include "../common/utils.h"
+#include "../common/video_utils.h"
+
+using namespace std;
+using namespace tidl;
+using namespace cv;
-#include "opencv2/core.hpp"
-#include "opencv2/imgproc.hpp"
-#include "opencv2/highgui.hpp"
-#include "opencv2/videoio.hpp"
#define NUM_VIDEO_FRAMES 100
#define DEFAULT_CONFIG "jdetnet"
#define DEFAULT_INPUT "../test/testvecs/input/preproc_0_768x320.y"
+#define DEFAULT_INPUT_FRAMES (1)
-bool __TI_show_debug_ = false;
-bool is_default_input = false;
-bool is_preprocessed_input = false;
-bool is_camera_input = false;
-int orig_width;
-int orig_height;
object_class_table_t *object_class_table;
-
-using namespace tidl;
-using namespace cv;
+uint32_t orig_width;
+uint32_t orig_height;
-bool RunConfiguration(const std::string& config_file,
- uint32_t num_dsps, uint32_t num_eves,
- DeviceType device_type, std::string& input_file);
+bool RunConfiguration(const cmdline_opts_t& opts);
+Executor* CreateExecutor(DeviceType dt, int num, const Configuration& c,
+ int layers_group_id);
bool ReadFrame(ExecutionObjectPipeline& eop, int frame_idx,
- const Configuration& configuration, int num_frames,
- std::string& image_file, VideoCapture &cap);
+ const Configuration& c, const cmdline_opts_t& opts,
+ VideoCapture &cap);
bool WriteFrameOutput(const ExecutionObjectPipeline& eop,
- const Configuration& configuration);
-
-void ReportTime(int frame_index, std::string device_name, double elapsed_host,
- double elapsed_device);
-
-static void ProcessArgs(int argc, char *argv[],
- std::string& config,
- uint32_t& num_dsps,
- uint32_t& num_eves,
- DeviceType& device_type,
- std::string& input_file);
-
+ const Configuration& c, const cmdline_opts_t& opts);
static void DisplayHelp();
-static double ms_diff(struct timespec &t0, struct timespec &t1)
-{ return (t1.tv_sec - t0.tv_sec) * 1e3 + (t1.tv_nsec - t0.tv_nsec) / 1e6; }
-
int main(int argc, char *argv[])
{
signal(SIGTERM, exit);
// If there are no devices capable of offloading TIDL on the SoC, exit
- uint32_t num_eve = Executor::GetNumDevices(DeviceType::EVE);
- uint32_t num_dsp = Executor::GetNumDevices(DeviceType::DSP);
- if (num_eve == 0 || num_dsp == 0)
+ uint32_t num_eves = Executor::GetNumDevices(DeviceType::EVE);
+ uint32_t num_dsps = Executor::GetNumDevices(DeviceType::DSP);
+ if (num_eves == 0 || num_dsps == 0)
{
- std::cout << "ssd_multibox requires both EVE and DSP for execution."
- << std::endl;
+ cout << "ssd_multibox requires both EVE and DSP for execution." << endl;
return EXIT_SUCCESS;
}
// Process arguments
- std::string config = DEFAULT_CONFIG;
- std::string input_file = DEFAULT_INPUT;
- uint32_t num_dsps = 1;
- uint32_t num_eves = 1;
- DeviceType device_type = DeviceType::EVE;
- ProcessArgs(argc, argv, config, num_dsps, num_eves,
- device_type, input_file);
-
- if ((object_class_table = GetObjectClassTable(config)) == nullptr)
+ cmdline_opts_t opts;
+ opts.config = DEFAULT_CONFIG;
+ opts.num_eves = 1;
+ opts.num_dsps = 1;
+ if (! ProcessArgs(argc, argv, opts))
{
- std::cout << "No object classes defined for this config." << std::endl;
- return EXIT_FAILURE;
+ DisplayHelp();
+ exit(EXIT_SUCCESS);
}
+ assert(opts.num_dsps != 0 && opts.num_eves != 0);
+ if (opts.num_frames == 0)
+ opts.num_frames = (opts.is_camera_input || opts.is_video_input) ?
+ NUM_VIDEO_FRAMES :
+ (opts.input_file.empty() ? DEFAULT_INPUT_FRAMES : 1);
+ if (opts.input_file.empty())
+ cout << "Input: " << DEFAULT_INPUT << endl;
+ else
+ cout << "Input: " << opts.input_file << endl;
- if (input_file == DEFAULT_INPUT) is_default_input = true;
- if (input_file == "camera") is_camera_input = true;
- if (input_file.length() > 2 &&
- input_file.compare(input_file.length() - 2, 2, ".y") == 0)
- is_preprocessed_input = true;
- std::cout << "Input: " << input_file << std::endl;
- std::string config_file = "../test/testvecs/config/infer/tidl_config_"
- + config + ".txt";
- bool status = RunConfiguration(config_file, num_dsps, num_eves,
- device_type, input_file);
+ // Get object class table
+ if ((object_class_table = GetObjectClassTable(opts.config)) == nullptr)
+ {
+ cout << "No object classes defined for this config." << endl;
+ return EXIT_FAILURE;
+ }
+ // Run network
+ bool status = RunConfiguration(opts);
if (!status)
{
- std::cout << "ssd_multibox FAILED" << std::endl;
+ cout << "ssd_multibox FAILED" << endl;
return EXIT_FAILURE;
}
- std::cout << "ssd_multibox PASSED" << std::endl;
+ cout << "ssd_multibox PASSED" << endl;
return EXIT_SUCCESS;
}
-bool RunConfiguration(const std::string& config_file,
- uint32_t num_dsps, uint32_t num_eves,
- DeviceType device_type, std::string& input_file)
+bool RunConfiguration(const cmdline_opts_t& opts)
{
- DeviceIds ids_eve, ids_dsp;
- for (unsigned int i = 0; i < num_eves; i++)
- ids_eve.insert(static_cast<DeviceId>(i));
- for (unsigned int i = 0; i < num_dsps; i++)
- ids_dsp.insert(static_cast<DeviceId>(i));
-
// Read the TI DL configuration file
- Configuration configuration;
- bool status = configuration.ReadFromFile(config_file);
+ Configuration c;
+ std::string config_file = "../test/testvecs/config/infer/tidl_config_"
+ + opts.config + ".txt";
+ bool status = c.ReadFromFile(config_file);
if (!status)
{
- std::cerr << "Error in configuration file: " << config_file
- << std::endl;
+ cerr << "Error in configuration file: " << config_file << endl;
return false;
}
+ c.enableApiTrace = opts.verbose;
- // setup input
- int num_frames = is_default_input ? 9 : 9;
+ // setup camera/video input
VideoCapture cap;
- std::string image_file;
- if (is_camera_input)
- {
- cap = VideoCapture(1); // cap = VideoCapture("test.mp4");
- if (! cap.isOpened())
- {
- std::cerr << "Cannot open camera input." << std::endl;
- return false;
- }
- num_frames = NUM_VIDEO_FRAMES;
- namedWindow("SSD_Multibox", WINDOW_AUTOSIZE | CV_GUI_NORMAL);
- }
- else
- {
- image_file = input_file;
- }
+ if (! SetVideoInputOutput(cap, opts, "SSD_Multibox")) return false;
try
{
- // Create a executor with the approriate core type, number of cores
+        // Create Executors with the appropriate core type, number of cores
// and configuration specified
// EVE will run layersGroupId 1 in the network, while
// DSP will run layersGroupId 2 in the network
- Executor exe_eve(DeviceType::EVE, ids_eve, configuration, 1);
- Executor exe_dsp(DeviceType::DSP, ids_dsp, configuration, 2);
+ Executor* e_eve = CreateExecutor(DeviceType::EVE, opts.num_eves, c, 1);
+ Executor* e_dsp = CreateExecutor(DeviceType::DSP, opts.num_dsps, c, 2);
// Construct ExecutionObjectPipeline that utilizes multiple
// ExecutionObjects to process a single frame, each ExecutionObject
// processes one layerGroup of the network
- int num_eops = std::max(num_eves, num_dsps);
- std::vector<ExecutionObjectPipeline *> eops;
- for (int i = 0; i < num_eops; i++)
- eops.push_back(new ExecutionObjectPipeline({exe_eve[i%num_eves],
- exe_dsp[i%num_dsps]}));
+ vector<ExecutionObjectPipeline *> eops;
+ for (uint32_t i = 0; i < max(opts.num_eves, opts.num_dsps); i++)
+ eops.push_back(new ExecutionObjectPipeline(
+ {(*e_eve)[i%opts.num_eves], (*e_dsp)[i%opts.num_dsps]}));
+ uint32_t num_eops = eops.size();
// Allocate input/output memory for each EOP
- std::vector<void *> buffers;
- for (auto eop : eops)
- {
- size_t in_size = eop->GetInputBufferSizeInBytes();
- size_t out_size = eop->GetOutputBufferSizeInBytes();
- void* in_ptr = malloc(in_size);
- void* out_ptr = malloc(out_size);
- assert(in_ptr != nullptr && out_ptr != nullptr);
- buffers.push_back(in_ptr);
- buffers.push_back(out_ptr);
-
- ArgInfo in(in_ptr, in_size);
- ArgInfo out(out_ptr, out_size);
- eop->SetInputOutputBuffer(in, out);
- }
+ AllocateMemory(eops);
- struct timespec tloop0, tloop1;
- clock_gettime(CLOCK_MONOTONIC, &tloop0);
+ chrono::time_point<chrono::steady_clock> tloop0, tloop1;
+ tloop0 = chrono::steady_clock::now();
- // Process frames with ExecutionObjectPipelines in a pipelined manner
+ // Process frames with available eops in a pipelined manner
// additional num_eops iterations to flush pipeline (epilogue)
- for (int frame_idx = 0; frame_idx < num_frames + num_eops; frame_idx++)
+ for (uint32_t frame_idx = 0;
+ frame_idx < opts.num_frames + num_eops; frame_idx++)
{
ExecutionObjectPipeline* eop = eops[frame_idx % num_eops];
// Wait for previous frame on the same eop to finish processing
if (eop->ProcessFrameWait())
{
- ReportTime(eop->GetFrameIndex(), eop->GetDeviceName(),
- eop->GetHostProcessTimeInMilliSeconds(),
- eop->GetProcessTimeInMilliSeconds());
- WriteFrameOutput(*eop, configuration);
+ ReportTime(eop);
+ WriteFrameOutput(*eop, c, opts);
}
// Read a frame and start processing it with current eo
- if (ReadFrame(*eop, frame_idx, configuration, num_frames,
- image_file, cap))
- {
+ if (ReadFrame(*eop, frame_idx, c, opts, cap))
eop->ProcessFrameStartAsync();
- }
}
- clock_gettime(CLOCK_MONOTONIC, &tloop1);
- std::cout << "Loop total time (including read/write/print/etc): "
- << std::setw(6) << std::setprecision(4)
- << ms_diff(tloop0, tloop1) << "ms" << std::endl;
+ tloop1 = chrono::steady_clock::now();
+ chrono::duration<float> elapsed = tloop1 - tloop0;
+ cout << "Loop total time (including read/write/opencv/print/etc): "
+ << setw(6) << setprecision(4)
+ << (elapsed.count() * 1000) << "ms" << endl;
- for (auto eop : eops)
- delete eop;
- for (auto b : buffers)
- free(b);
+ FreeMemory(eops);
+ for (auto eop : eops) delete eop;
+ delete e_eve;
+ delete e_dsp;
}
catch (tidl::Exception &e)
{
- std::cerr << e.what() << std::endl;
+ cerr << e.what() << endl;
status = false;
}
return status;
}
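The EOP construction above builds `max(num_eves, num_dsps)` pipelines and pairs EVE `i % num_eves` (layersGroupId 1) with DSP `i % num_dsps` (layersGroupId 2), so the scarcer core type is shared across pipelines. A sketch of just that index pairing, with a hypothetical `PairEveDsp` helper returning (eve_idx, dsp_idx) per pipeline:

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Index pairing used when constructing ExecutionObjectPipelines: pipeline
// i draws its EVE and DSP round-robin, so e.g. 4 EVEs and 2 DSPs yield 4
// pipelines with each DSP shared by two EVEs.
std::vector<std::pair<int, int>> PairEveDsp(int num_eves, int num_dsps)
{
    std::vector<std::pair<int, int>> pairs;
    int num_eops = std::max(num_eves, num_dsps);
    for (int i = 0; i < num_eops; i++)
        pairs.push_back(std::make_pair(i % num_eves, i % num_dsps));
    return pairs;
}
```

Because a DSP execution object can appear in more than one pipeline, per-EO device/host times are bookkept inside the EOP, as the commit message notes.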
-void ReportTime(int frame_index, std::string device_name, double elapsed_host,
- double elapsed_device)
+// Create an Executor with the specified type and number of EOs
+Executor* CreateExecutor(DeviceType dt, int num, const Configuration& c,
+ int layers_group_id)
{
- double overhead = 100 - (elapsed_device/elapsed_host*100);
- std::cout << "frame[" << frame_index << "]: "
- << "Time on " << device_name << ": "
- << std::setw(6) << std::setprecision(4)
- << elapsed_device << "ms, "
- << "host: "
- << std::setw(6) << std::setprecision(4)
- << elapsed_host << "ms ";
- std::cout << "API overhead: "
- << std::setw(6) << std::setprecision(3)
- << overhead << " %" << std::endl;
-}
+ if (num == 0) return nullptr;
+ DeviceIds ids;
+    for (int i = 0; i < num; i++)
+ ids.insert(static_cast<DeviceId>(i));
+
+ return new Executor(dt, ids, c, layers_group_id);
+}
bool ReadFrame(ExecutionObjectPipeline& eop, int frame_idx,
- const Configuration& configuration, int num_frames,
- std::string& image_file, VideoCapture &cap)
+ const Configuration& c, const cmdline_opts_t& opts,
+ VideoCapture &cap)
{
- if (frame_idx >= num_frames)
+ if (frame_idx >= opts.num_frames)
return false;
eop.SetFrameIndex(frame_idx);
char* frame_buffer = eop.GetInputBufferPtr();
assert (frame_buffer != nullptr);
- int channel_size = configuration.inWidth * configuration.inHeight;
+ int channel_size = c.inWidth * c.inHeight;
Mat image;
- if (! image_file.empty())
+ if (!opts.is_camera_input && !opts.is_video_input)
{
- if (is_preprocessed_input)
+ if (opts.input_file.empty())
{
- std::ifstream ifs(image_file, std::ios::binary);
- //ifs.seekg(frame_idx * channel_size * 3);
+ ifstream ifs(DEFAULT_INPUT, ios::binary);
+ ifs.seekg((frame_idx % DEFAULT_INPUT_FRAMES) * channel_size * 3);
ifs.read(frame_buffer, channel_size * 3);
bool ifs_status = ifs.good();
ifs.close();
- orig_width = configuration.inWidth;
- orig_height = configuration.inHeight;
+ orig_width = c.inWidth;
+ orig_height = c.inHeight;
return ifs_status; // already PreProc-ed
}
else
{
- image = cv::imread(image_file, CV_LOAD_IMAGE_COLOR);
+ image = cv::imread(opts.input_file, CV_LOAD_IMAGE_COLOR);
if (image.empty())
{
- std::cerr << "Unable to read from: " << image_file << std::endl;
+ cerr << "Unable to read from: " << opts.input_file << endl;
return false;
}
}
Mat s_image, bgr_frames[3];
orig_width = image.cols;
orig_height = image.rows;
- cv::resize(image, s_image,
- Size(configuration.inWidth, configuration.inHeight),
+ cv::resize(image, s_image, Size(c.inWidth, c.inHeight),
0, 0, cv::INTER_AREA);
cv::split(s_image, bgr_frames);
memcpy(frame_buffer, bgr_frames[0].ptr(), channel_size);
// Create frame with boxes drawn around classified objects
bool WriteFrameOutput(const ExecutionObjectPipeline& eop,
- const Configuration& configuration)
+ const Configuration& c, const cmdline_opts_t& opts)
{
    // Assemble the original frame
- int width = configuration.inWidth;
- int height = configuration.inHeight;
+ int width = c.inWidth;
+ int height = c.inHeight;
int channel_size = width * height;
Mat frame, r_frame, bgr[3];
int frame_index = eop.GetFrameIndex();
char outfile_name[64];
- if (! is_camera_input && is_preprocessed_input)
+ if (opts.input_file.empty())
{
snprintf(outfile_name, 64, "frame_%d.png", frame_index);
cv::imwrite(outfile_name, frame);
object_class->color.red), 2);
}
- // output
- cv::resize(frame, r_frame, Size(orig_width, orig_height));
- if (is_camera_input)
+ // Resize to output width/height, keep aspect ratio
+ uint32_t output_width = opts.output_width;
+ if (output_width == 0) output_width = orig_width;
+ uint32_t output_height = (output_width*1.0f) / orig_width * orig_height;
+ cv::resize(frame, r_frame, Size(output_width, output_height));
+
+ if (opts.is_camera_input || opts.is_video_input)
{
cv::imshow("SSD_Multibox", r_frame);
waitKey(1);
return true;
}
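The resize above preserves the aspect ratio when a `-w` output width is given: the output height is derived from the original frame's proportions. A standalone sketch of just that arithmetic (the `OutputHeight` helper name is ours, not part of the example code):

```cpp
#include <cstdint>

// Hypothetical helper mirroring the resize math in WriteFrameOutput:
// scale the requested output width by the original frame's aspect ratio.
static uint32_t OutputHeight(uint32_t output_width,
                             uint32_t orig_width, uint32_t orig_height)
{
    return static_cast<uint32_t>((output_width * 1.0f) / orig_width
                                 * orig_height);
}
```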
-
-void ProcessArgs(int argc, char *argv[], std::string& config,
- uint32_t& num_dsps, uint32_t& num_eves,
- DeviceType& device_type, std::string& input_file)
-{
- const struct option long_options[] =
- {
- {"config", required_argument, 0, 'c'},
- {"num_dsps", required_argument, 0, 'd'},
- {"num_eves", required_argument, 0, 'e'},
- {"image_file", required_argument, 0, 'i'},
- {"help", no_argument, 0, 'h'},
- {"verbose", no_argument, 0, 'v'},
- {0, 0, 0, 0}
- };
-
- int option_index = 0;
-
- while (true)
- {
- int c = getopt_long(argc, argv, "c:d:e:i:hv", long_options,
- &option_index);
-
- if (c == -1)
- break;
-
- switch (c)
- {
- case 'c': config = optarg;
- break;
-
- case 'd': num_dsps = atoi(optarg);
- assert (num_dsps > 0 && num_dsps <=
- Executor::GetNumDevices(DeviceType::DSP));
- break;
-
- case 'e': num_eves = atoi(optarg);
- assert (num_eves > 0 && num_eves <=
- Executor::GetNumDevices(DeviceType::EVE));
- break;
-
- case 'i': input_file = optarg;
- break;
-
- case 'v': __TI_show_debug_ = true;
- break;
-
- case 'h': DisplayHelp();
- exit(EXIT_SUCCESS);
- break;
-
- case '?': // Error in getopt_long
- exit(EXIT_FAILURE);
- break;
-
- default:
- std::cerr << "Unsupported option: " << c << std::endl;
- break;
- }
- }
-}
-
void DisplayHelp()
{
- std::cout << "Usage: ssd_multibox\n"
- " Will run partitioned ssd_multibox network to perform "
- "multi-objects detection\n"
- " and classification. First part of network "
- "(layersGroupId 1) runs on EVE,\n"
- " second part (layersGroupId 2) runs on DSP.\n"
- " Use -c to run a different segmentation network. "
- "Default is jdetnet.\n"
- "Optional arguments:\n"
- " -c <config> Valid configs: jdetnet \n"
- " -d <number> Number of dsp cores to use\n"
- " -e <number> Number of eve cores to use\n"
- " -i <image> Path to the image file\n"
- " Default is 1 frame in testvecs\n"
- " -i camera Use camera as input\n"
- " -v Verbose output during execution\n"
- " -h Help\n";
+ std::cout <<
+ "Usage: ssd_multibox\n"
+ " Will run the partitioned ssd_multibox network to perform "
+ "multi-object detection\n"
+ " and classification. First part of network "
+ "(layersGroupId 1) runs on EVE,\n"
+ " second part (layersGroupId 2) runs on DSP.\n"
+ " Use -c to run a different detection network. Default is jdetnet.\n"
+ "Optional arguments:\n"
+ " -c <config> Valid configs: jdetnet \n"
+ " -d <number> Number of dsp cores to use\n"
+ " -e <number> Number of eve cores to use\n"
+ " -i <image> Path to the image file as input\n"
+ " Default is 9 frames in testvecs\n"
+ " -i camera<number> Use camera as input\n"
+ " video input port: /dev/video<number>\n"
+ " -i <name>.{mp4,mov,avi} Use video file as input\n"
+ " -f <number> Number of frames to process\n"
+ " -w <number> Output image/video width\n"
+ " -v Verbose output during execution\n"
+ " -h Help\n";
}
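The help text above distinguishes three `-i` input kinds: `camera<number>` (live capture from `/dev/video<number>`), a `.mp4`/`.mov`/`.avi` file (pre-recorded video), and anything else as an image file. A minimal sketch of that classification, using a hypothetical `ClassifyInput` helper (the real example code does this inside its command-line parsing; defaulting to port 0 when no number follows `camera` is our assumption):

```cpp
#include <string>

enum class InputKind { Image, Camera, Video };

struct InputChoice
{
    InputKind kind;
    int       camera_port;  // only meaningful for InputKind::Camera
};

// Hypothetical helper: classify the -i argument as the help text describes.
InputChoice ClassifyInput(const std::string& arg)
{
    // "camera<number>" -> live input from /dev/video<number>
    if (arg.compare(0, 6, "camera") == 0)
    {
        int port = (arg.size() > 6) ? std::stoi(arg.substr(6)) : 0;
        return { InputKind::Camera, port };
    }

    // A recognized video extension -> pre-recorded video input
    std::string::size_type dot = arg.rfind('.');
    if (dot != std::string::npos)
    {
        std::string ext = arg.substr(dot + 1);
        if (ext == "mp4" || ext == "mov" || ext == "avi")
            return { InputKind::Video, -1 };
    }

    // Anything else is treated as an image file path
    return { InputKind::Image, -1 };
}
```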
index a289507d4b050a9d9418b8424de26397e16b7f87..2291d97f9ab543aed19d147b7464b2742cd937fc 100644 (file)
Executor* CreateExecutor(DeviceType dt, int num, const Configuration& c,
int layer_group_id);
-void AllocateMemory(const vector<EOP *>& EOPs);
-void FreeMemory (const vector<EOP *>& EOPs);
-
int main(int argc, char *argv[])
{
return new Executor(dt, ids, c, layer_group_id);
}
-// Allocate input and output memory for each EO
-void AllocateMemory(const vector<EOP *>& EOPs)
-{
- // Allocate input and output buffers for each execution object
- for (auto eop : EOPs)
- {
- size_t in_size = eop->GetInputBufferSizeInBytes();
- size_t out_size = eop->GetOutputBufferSizeInBytes();
- ArgInfo in = { ArgInfo(malloc(in_size), in_size)};
- ArgInfo out = { ArgInfo(malloc(out_size), out_size)};
- eop->SetInputOutputBuffer(in, out);
- }
-}
-
-// Free the input and output memory associated with each EO
-void FreeMemory(const vector<EOP *>& EOPs)
-{
- for (auto eop : EOPs)
- {
- free(eop->GetInputBufferPtr());
- free(eop->GetOutputBufferPtr());
- }
-
-}
index aaa6cf020fddfc0381a9782fa89b1b4fbfb9dd83..b31fec6ab94882222e0cc660912394b715960b00 100644 (file)
//! @brief Tear down an ExecutionObjectPipeline and free used resources
~ExecutionObjectPipeline();
+ //! Returns the number of ExecutionObjects associated with the
+ //! ExecutionObjectPipeline
+ uint32_t GetNumExecutionObjects() const;
+
//! Specify the input and output buffers used by the EOP
//! @param in buffer used for input.
//! @param out buffer used for output.
//! @return Number of milliseconds to process a frame on the device.
float GetProcessTimeInMilliSeconds() const override;
+ //! @brief return the number of milliseconds taken *on the device*
+ //! to process a layersGroup by a component ExecutionObject
+ //! @return Number of milliseconds to process a layersGroup on the
+ //! device by a component ExecutionObject.
+ float GetProcessTimeInMilliSeconds(uint32_t eo_index) const;
+
//! @brief return the number of milliseconds taken *on the host* to
//! execute the process call
//! @return Number of milliseconds to process a frame on the host.
float GetHostProcessTimeInMilliSeconds() const override;
+ //! @brief return the number of milliseconds taken *on the host*
+ //! to process a layersGroup by a component ExecutionObject
+ //! @return Number of milliseconds to process a layersGroup on the
+ //! host by a component ExecutionObject.
+ float GetHostProcessTimeInMilliSeconds(uint32_t eo_index) const;
+
//! Return the combined device names that this pipeline runs on
const std::string& GetDeviceName() const override;
diff --git a/tidl_api/src/execution_object_pipeline.cpp b/tidl_api/src/execution_object_pipeline.cpp
index ff84255b832b9a2681a65220dbcebfa329027078..1998da3a4daa34dd7a76bd98f05418ee7ff45801 100644 (file)
//! for pipelined execution
std::vector<ExecutionObject*> eos_m;
std::vector<IODeviceArgInfo*> iobufs_m;
+ std::vector<float> eo_device_time_m;
+ std::vector<float> eo_host_time_m;
std::string device_name_m;
//! current execution object index
uint32_t curr_eo_idx_m;
- // host time tracking: pipeline start to finish
+ // device and host time tracking: pipeline start to finish
+ float device_time_m;
float host_time_m;
private:
return static_cast<char *>(pimpl_m->iobufs_m.front()->GetArg().ptr());
}
+uint32_t ExecutionObjectPipeline::GetNumExecutionObjects() const
+{
+ return pimpl_m->eos_m.size();
+}
+
size_t ExecutionObjectPipeline::GetInputBufferSizeInBytes() const
{
return pimpl_m->eos_m.front()->GetInputBufferSizeInBytes();
float ExecutionObjectPipeline::GetProcessTimeInMilliSeconds() const
{
- float total = 0.0f;
- for (auto eo : pimpl_m->eos_m)
- total += eo->GetProcessTimeInMilliSeconds();
- return total;
+ return pimpl_m->device_time_m;
}
float ExecutionObjectPipeline::GetHostProcessTimeInMilliSeconds() const
return pimpl_m->host_time_m;
}
+float ExecutionObjectPipeline::GetProcessTimeInMilliSeconds(
+ uint32_t eo_index) const
+{
+ assert(eo_index < pimpl_m->eos_m.size());
+ return pimpl_m->eo_device_time_m[eo_index];
+}
+
+float ExecutionObjectPipeline::GetHostProcessTimeInMilliSeconds(
+ uint32_t eo_index) const
+{
+ assert(eo_index < pimpl_m->eos_m.size());
+ return pimpl_m->eo_host_time_m[eo_index];
+}
+
const std::string& ExecutionObjectPipeline::GetDeviceName() const
{
return pimpl_m->device_name_m;
ArgInfo out(ptr, size);
iobufs_m.push_back(new IODeviceArgInfo(out));
}
+
+ // Record each EO's device time and host time separately,
+ // because an EO could be shared by another EOP
+ eo_device_time_m.resize(eos_m.size());
+ eo_host_time_m.resize(eos_m.size());
}
ExecutionObjectPipeline::Impl::~Impl()
bool ExecutionObjectPipeline::Impl::RunAsyncStart()
{
- start_m = std::chrono::steady_clock::now();
has_work_m = true;
is_processed_m = false;
+ device_time_m = 0.0f;
host_time_m = 0.0f;
curr_eo_idx_m = 0;
eos_m[0]->AcquireLock();
+ start_m = std::chrono::steady_clock::now();
eos_m[0]->SetInputOutputBuffer(iobufs_m[0], iobufs_m[1]);
return eos_m[0]->ProcessFrameStartAsync();
}
bool ExecutionObjectPipeline::Impl::RunAsyncNext()
{
eos_m[curr_eo_idx_m]->ProcessFrameWait();
+ // Need to capture the EO's device/host time before releasing its lock
+ eo_device_time_m[curr_eo_idx_m] = eos_m[curr_eo_idx_m]->
+ GetProcessTimeInMilliSeconds();
+ eo_host_time_m[curr_eo_idx_m] = eos_m[curr_eo_idx_m]->
+ GetHostProcessTimeInMilliSeconds();
+ device_time_m += eo_device_time_m[curr_eo_idx_m];
eos_m[curr_eo_idx_m]->ReleaseLock();
curr_eo_idx_m += 1;
if (curr_eo_idx_m < eos_m.size())
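The `RunAsyncNext` change above copies each EO's device/host time into per-EOP vectors before `ReleaseLock()`, since a shared EO's counters are overwritten by whichever EOP runs it next. A self-contained mock of that bookkeeping pattern (`Stage` and `Pipeline` are stand-ins, not the real TIDL classes):

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Mock stage standing in for an ExecutionObject: its timer is overwritten
// on every run, so a pipeline must snapshot it before releasing the stage.
struct Stage { float device_ms = 0.0f; };

struct Pipeline
{
    std::vector<Stage*> stages;
    std::vector<float>  stage_device_ms;  // per-pipeline copy of stage times
    float device_ms_total = 0.0f;

    explicit Pipeline(std::vector<Stage*> s)
        : stages(std::move(s)), stage_device_ms(stages.size()) {}

    // Called when stage i finishes: snapshot its time first; only then
    // could the stage safely be released for reuse by another pipeline.
    void OnStageDone(std::size_t i)
    {
        stage_device_ms[i] = stages[i]->device_ms;
        device_ms_total   += stage_device_ms[i];
    }
};
```

If the snapshot happened after the lock release instead, a second pipeline could re-run the shared stage and clobber its counters before the first pipeline read them.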