Channel: OpenCV Q&A Forum - RSS feed

OpenCV 3.0 parallel_for_

Hello,

I've implemented a parallel image descriptor extractor using the OpenCV 3.0 parallel_for_ and the BRISK descriptor extractor. I split the image into horizontal stripes and compute the descriptors for the stripes in parallel.

This is the outer call:

    _extractor.init(img, featureCount);
    cv::parallel_for_(cv::Range(0, _processorCount), _extractor); // _processorCount == 4
    _extractor.buildFinal(keyPoints, features);

Then, for each thread, I call

    _featureExtractor->detectAndCompute(_image, mask, keyPoints, features);

with a properly initialized horizontal mask stripe.

However, this implementation doesn't run faster than the serial implementation. It is about 2x slower (23 ms for serial and 40+ ms for parallel). I did some debugging and saw that OpenCV uses the Microsoft concurrency framework. Moreover, when printing out the called range and getThreadNum(), I get this:

    range[0,1] thread [2]
    range[3,4] thread [2]
    range[2,3] thread [2]
    range[1,2] thread [2]

That means my inner code runs on a single thread, four times, just like a serial implementation.

Do you know what's wrong with my approach?

Thank you,
Alin
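
For reference, the stripe-splitting and thread-logging idea described in the question can be sketched without OpenCV. This is only an illustration under stated assumptions: it uses plain std::thread in place of cv::parallel_for_, the names StripeLog, stripeRows and runStripes are hypothetical and do not come from the original code, and the per-stripe work is left as a comment. It shows how an image height can be divided into row ranges and how to record which thread handled each range, which is the same check the printed range/thread output performs.

```cpp
#include <algorithm>
#include <cassert>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical record of one processed stripe: its rows and the thread that ran it.
struct StripeLog {
    int firstRow;            // inclusive
    int lastRow;             // exclusive
    std::thread::id worker;  // id of the thread that processed the stripe
};

// Rows [first, last) of stripe `index` when `height` rows are split into
// `stripeCount` nearly equal horizontal stripes.
std::pair<int, int> stripeRows(int height, int stripeCount, int index) {
    assert(height >= 0 && stripeCount > 0 && index >= 0 && index < stripeCount);
    const long long h = height;
    const int first = static_cast<int>(h * index / stripeCount);
    const int last = static_cast<int>(h * (index + 1) / stripeCount);
    return {first, last};
}

// Starts one std::thread per stripe (an assumption of this sketch, not what
// cv::parallel_for_ does internally) and returns the log sorted by first row.
std::vector<StripeLog> runStripes(int height, int stripeCount) {
    std::vector<StripeLog> log;
    std::mutex logMutex;
    std::vector<std::thread> workers;
    for (int i = 0; i < stripeCount; ++i) {
        workers.emplace_back([&, i] {
            const std::pair<int, int> rows = stripeRows(height, stripeCount, i);
            // Real per-stripe work (for example descriptor extraction) would go here.
            std::lock_guard<std::mutex> lock(logMutex);
            log.push_back({rows.first, rows.second, std::this_thread::get_id()});
        });
    }
    for (std::thread& w : workers) {
        w.join();
    }
    std::sort(log.begin(), log.end(),
              [](const StripeLog& a, const StripeLog& b) { return a.firstRow < b.firstRow; });
    return log;
}
```

Calling runStripes(480, 4) would yield four entries covering rows 0 to 480 in contiguous blocks of 120 rows. Comparing the recorded worker ids across entries is one way to see whether the stripes were actually handled by different threads.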
