Channel: OpenCV Q&A Forum - RSS feed

GPU/CUDA CascadeClassifier detectMultiScale Slow Performance

I was trying to improve face detection performance (Haar frontal face cascade), so I switched from the regular CascadeClassifier::detectMultiScale API to the one in the gpu (CUDA) module. The result was the opposite of what I expected.

The video source is a remote HTTP MJPEG stream at 4K resolution. cudacodec::createVideoReader doesn't seem to support remote protocols, so I had to use the regular VideoCapture API to read each frame into a Mat object and then upload it to a GpuMat object. With the usual parameters (scale factor = 1.1, minNeighbors = 3), my quad-core 3 GHz i7 CPU processes 4.5 fps, but my GTX 760 only manages 1.8 fps.

I remember that in my HOG detection tests, the CUDA API ran much faster than the CPU one with the same setup. I understand that CUDA is not faster than the CPU for every algorithm, but I am still surprised by such a big difference for CascadeClassifier.

So my questions are: is the GPU CascadeClassifier still being optimized, is there something that needs to be set up before using this API, or is it simply not possible for a GTX 760 to beat an i7 CPU on this algorithm? My CUDA version is 7.0 and the NVIDIA driver is 346.87. Any advice will be appreciated.
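To get a feel for how much work a 4K frame with scale factor 1.1 implies, here is a rough back-of-the-envelope sketch in plain C++ (no OpenCV needed). It is only a simplified model, not what OpenCV does internally: it assumes the detector scans one pyramid level per scale step, that the cascade's training window is 24x24 pixels (an assumption, check the cascade file you actually load), and that levels which could only find objects smaller than a chosen minimum size are skipped. The names PyramidCost and estimatePyramidCost are made up for this illustration.

```cpp
#include <cassert>

// Hypothetical helper type: number of pyramid levels scanned and the
// total number of pixels contained in those levels.
struct PyramidCost {
    int levels;
    double totalPixels;
};

// Simplified model (an assumption, not OpenCV's actual implementation):
// the frame is shrunk by 1/scaleFactor per level until it is smaller than
// the training window. A level at a given factor detects objects of about
// window * factor pixels, so levels below minObject are skipped.
PyramidCost estimatePyramidCost(int width, int height, double scaleFactor,
                                int window, int minObject) {
    assert(scaleFactor > 1.0);  // otherwise the loop below would never end
    PyramidCost cost{0, 0.0};
    for (double factor = 1.0;; factor *= scaleFactor) {
        const double w = width / factor;
        const double h = height / factor;
        if (w < window || h < window) {
            break;  // scaled frame no longer fits a single window
        }
        if (window * factor < minObject) {
            continue;  // this level only finds objects that are too small
        }
        cost.levels += 1;
        cost.totalPixels += w * h;
    }
    return cost;
}
```

Under these assumptions, estimatePyramidCost(3840, 2160, 1.1, 24, 0) gives 48 levels covering close to 48 million pixels per frame, while passing a minimum object size of 96 drops that to 33 levels and skips the largest (most expensive) levels. So on either the CPU or the GPU path, a minimum object size or a downscaled input may matter as much as the choice of API.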
