Channel: OpenCV Q&A Forum - RSS feed

Thread-creation with parallel_for

Hey! I'm trying to speed up an application and want to see if parallel_for can help me. As a first step, I wrote a fancy program that finds the maximum value per column in an image. It looks like this:

    class Parallel_markMax : public cv::ParallelLoopBody
    {
    private:
        cv::Mat &img_rgb_;
        cv::Mat &img_;

    public:
        Parallel_markMax(Mat& img, Mat& img_rgb) : img_rgb_(img_rgb), img_(img) {}

        virtual void operator()(const Range& range) const
        {
            int h = img_.rows;
            cout << "hell " << range.start << " " << range.size() << endl;
            for (int x = range.start; x < range.end; ++x) {
                uchar max_val = 0;
                int pos = 0;
                // scan column x from top to bottom and remember the row of the maximum
                for (int y = 0; y < h; ++y) {
                    if (img_.at<uchar>(y,x) > max_val) {
                        max_val = img_.at<uchar>(y,x);
                        pos = y;
                    }
                }
                cv::circle(img_rgb_, cv::Point(x,pos), 3, cv::Scalar(255,0,0), 1);
            }
        }
    };

I call it like this:

    parallel_for_(Range(0,w), Parallel_markMax(img, img_col));

I have 8 threads (the result of getNumThreads), so I expected eight calls, each covering 1/8 of the range. Instead I get a huge number of calls to operator(), each with a range size of only 1 to 10. So rather than giving each thread one bigger task, some thread manager hands out very small tasks, which probably causes a lot of overhead. In my example I only get a speedup of 3 with 8 cores, which is rather poor for a perfectly parallelizable task.

Is there a parameter I am missing?
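To make the expected partition concrete, here is a minimal sketch in plain C++ (no OpenCV) of the even split described above: the column range is cut into one contiguous stripe per worker, and the loop body is called once per stripe. The names `split_range` and `max_row_per_column` are hypothetical helpers made up for this illustration, and the image is assumed to be a row-major 8-bit buffer rather than a `cv::Mat`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper (not part of OpenCV): split [begin, end) into at most
// `parts` contiguous stripes whose sizes differ by at most one element.
std::vector<std::pair<int, int>> split_range(int begin, int end, int parts) {
    assert(parts > 0 && begin <= end);
    const int total = end - begin;
    const int base = total / parts;   // minimum stripe size
    const int extra = total % parts;  // the first `extra` stripes get one more
    std::vector<std::pair<int, int>> stripes;
    int start = begin;
    for (int i = 0; i < parts; ++i) {
        const int size = base + (i < extra ? 1 : 0);
        if (size == 0) break;  // more workers than elements
        stripes.emplace_back(start, start + size);
        start += size;
    }
    return stripes;
}

// Stand-in for the loop body in the question: for each column x in
// [x_begin, x_end), store the row index of the first maximum value.
// Assumption: `img` is a row-major 8-bit image with rows * cols pixels.
void max_row_per_column(const std::vector<std::uint8_t>& img, int rows, int cols,
                        int x_begin, int x_end, std::vector<int>& pos) {
    for (int x = x_begin; x < x_end; ++x) {
        std::uint8_t max_val = 0;
        int best = 0;
        for (int y = 0; y < rows; ++y) {
            const std::uint8_t v = img[static_cast<std::size_t>(y) * cols + x];
            if (v > max_val) {
                max_val = v;
                best = y;
            }
        }
        pos[x] = best;  // each stripe writes a disjoint part of `pos`
    }
}
```

With `split_range(0, w, 8)` the body would run eight times, each over roughly `w / 8` columns. The stripes write to disjoint entries of `pos`, so each call could be handed to its own thread. On the OpenCV side, `cv::parallel_for_` accepts an optional third argument, `nstripes`; passing the desired number of stripes there may be worth trying, although how the stripes are then scheduled depends on the parallel backend OpenCV was built with.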
