I'm combining two codes for real-time number-plate recognition. It mostly runs fine, but sometimes an error occurs at the line `cv2.drawContours(imCopy, screenCnt, -1, ...)`. This happens mainly when no contours are being picked up (I believe). Apologies if there is unnecessary garbage in the code; I've been struggling to amalgamate the two codes. The error states: returned NULL without setting an error.
The code is as follows:
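A minimal guard for this failure, assuming `screenCnt` comes from a `cv2.findContours`/`cv2.approxPolyDP` candidate loop as in the common plate-detection tutorials (file name and variable names are illustrative, not the original code):

import cv2

img = cv2.imread('frame.png')                          # hypothetical frame from the video loop
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
edged = cv2.Canny(cv2.bilateralFilter(gray, 11, 17, 17), 30, 200)

contours, _ = cv2.findContours(edged, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)   # OpenCV 4.x signature
screenCnt = None
for c in sorted(contours, key=cv2.contourArea, reverse=True)[:10]:
    approx = cv2.approxPolyDP(c, 0.018 * cv2.arcLength(c, True), True)
    if len(approx) == 4:                               # a 4-point contour is the plate candidate
        screenCnt = approx
        break

if screenCnt is None:
    print("No plate contour detected in this frame")   # skip drawing instead of crashing
else:
    cv2.drawContours(img, [screenCnt], -1, (0, 255, 0), 3)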
↧
cv2 drawContours issue - Python for Raspberry Pi
↧
how to solve (-215:Assertion failed) !empty() in function 'cv::CascadeClassifier::detectMultiScale'
Please help me, I get this error:
---------------------------------------------------------------------------
error                                     Traceback (most recent call last)
<ipython-input-...> in <module>
     19
     20 rerect_size = cv2.resize(im, (im.shape[1] // rect_size, im.shape[0] // rect_size))
---> 21 faces = haarcascade.detectMultiScale(rerect_size)
     22 for f in faces:
     23     (x, y, w, h) = [v * rect_size for v in f]
error: OpenCV(4.4.0) C:\Users\appveyor\AppData\Local\Temp\1\pip-req-build-9d_dfo3_\opencv\modules\objdetect\src\cascadedetect.cpp:1689: error: (-215:Assertion failed) !empty() in function 'cv::CascadeClassifier::detectMultiScale'
This is my code:
import cv2
import numpy as np
from tensorflow.keras.models import load_model

model = load_model("E:\Data Science\Mask detector\model2-008.model")
results = {0: 'without mask', 1: 'mask'}
GR_dict = {0: (0, 0, 255), 1: (0, 255, 0)}
rect_size = 4
cap = cv2.VideoCapture(0)
haarcascade = cv2.CascadeClassifier('/home/user_name/.local/lib/python3.6/site-packages/cv2/data/haarcascade_frontalface_default.xml')

while True:
    (rval, im) = cap.read()
    im = cv2.flip(im, 1, 1)
    rerect_size = cv2.resize(im, (im.shape[1] // rect_size, im.shape[0] // rect_size))
    faces = haarcascade.detectMultiScale(rerect_size)
    for f in faces:
        (x, y, w, h) = [v * rect_size for v in f]
        face_img = im[y:y+h, x:x+w]
        rerect_sized = cv2.resize(face_img, (150, 150))
        normalized = rerect_sized / 255.0
        reshaped = np.reshape(normalized, (1, 150, 150, 3))
        reshaped = np.vstack([reshaped])
        result = model.predict(reshaped)
        label = np.argmax(result, axis=1)[0]
        cv2.rectangle(im, (x, y), (x+w, y+h), GR_dict[label], 2)
        cv2.rectangle(im, (x, y-40), (x+w, y), GR_dict[label], -1)
        cv2.putText(im, results[label], (x, y-10), cv2.FONT_HERSHEY_SIMPLEX, 0.8, (255, 255, 255), 2)
    cv2.imshow('LIVE', im)
    key = cv2.waitKey(10)
    if key == 27:
        break

cap.release()
cv2.destroyAllWindows()
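The assertion in the traceback means the cascade XML failed to load (the classifier is empty), which usually comes down to the hard-coded path not existing on the machine running the code. A sketch of a more portable way to load it, plus a guard for failed frame grabs, assuming the pip opencv-python package (which exposes the bundled cascades via cv2.data):

import cv2

# Load the cascade from the location bundled with the opencv-python wheel
# instead of a hard-coded absolute path.
cascade_path = cv2.data.haarcascades + 'haarcascade_frontalface_default.xml'
haarcascade = cv2.CascadeClassifier(cascade_path)
if haarcascade.empty():
    raise IOError('Failed to load cascade from ' + cascade_path)

cap = cv2.VideoCapture(0)
while True:
    rval, im = cap.read()
    if not rval:                 # the camera returned no frame; stop instead of crashing
        break
    # ... detection and drawing as in the loop above ...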
↧
ArUco Thresholding Window Size
As stated in the ArUco [guide](https://docs.opencv.org/master/d5/dae/tutorial_aruco_detection.html):
> The simplest case is using the same value for adaptiveThreshWinSizeMin and adaptiveThreshWinSizeMax, which produces a single thresholding step.
For example:
parm = aruco.DetectorParameters_create()
constant = 7
parm.adaptiveThreshWinSizeMin = constant
parm.adaptiveThreshWinSizeMax = constant
parm.adaptiveThreshWinSizeStep = constant
Looping through values of `constant`, I found that as the window size increases, the number of markers detected in a video stream decreases.
However, using two window sizes that each perform poorly on their own (for example 21 and 25, based off [this](https://github.com/opencv/opencv_contrib/blob/74d0117176138235648f95c742665f8754eb78dd/modules/aruco/src/aruco.cpp#L351) function) yielded extremely accurate results.
parm = aruco.DetectorParameters_create()
parm.adaptiveThreshWinSizeMin = 21
parm.adaptiveThreshWinSizeMax = 25
parm.adaptiveThreshWinSizeStep = 4
Can anyone please explain how two individually poor-performing windows combine to create an accurate one? Statistically I can understand an increase, but not of this magnitude. Am I missing something (e.g. an incorrect calculation of the windows)? Here is an [example image](/upfiles/1600969008423275.png).
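For reference, the linked detector code appears to enumerate the window sizes as adaptiveThreshWinSizeMin, adaptiveThreshWinSizeMin + adaptiveThreshWinSizeStep, ... up to adaptiveThreshWinSizeMax, so the second configuration really does run exactly two thresholding passes (21 and 25) whose marker candidates are then merged. A small sketch mirroring that enumeration (my reading of the C++ loop, not a call into it):

def aruco_window_sizes(win_min, win_max, win_step):
    # Mirrors the scale loop in _detectInitialCandidates(): count the scales,
    # then generate each adaptive-threshold window size.
    n_scales = (win_max - win_min) // win_step + 1
    return [win_min + i * win_step for i in range(n_scales)]

print(aruco_window_sizes(21, 25, 4))   # -> [21, 25]: two thresholding passes
print(aruco_window_sizes(7, 7, 7))     # -> [7]: a single pass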
↧
How to separate objects which have the same color
Hi, for an internship I need to detect some pigs. I can separate them from the background, but not from one another when they are too close together. I'm just starting to learn image processing, so maybe I'm going about it the wrong way. Do you have any advice for me?
The first picture.
Without background 
Outline with canny detector 
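Not code from the original post, but the usual starting point for splitting touching blobs of the same colour is a distance transform followed by watershed; a minimal sketch (the file name and threshold factor are placeholders that would need tuning):

import cv2
import numpy as np

img = cv2.imread('pigs.png')                       # hypothetical input image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
_, fg_mask = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Peaks of the distance transform give one seed per animal even when they touch.
dist = cv2.distanceTransform(fg_mask, cv2.DIST_L2, 5)
_, seeds = cv2.threshold(dist, 0.5 * dist.max(), 255, cv2.THRESH_BINARY)
seeds = seeds.astype(np.uint8)
_, markers = cv2.connectedComponents(seeds)

# Watershed grows each seed back out to the blob boundary.
markers = markers + 1                              # keep 0 free for "unknown"
markers[(fg_mask > 0) & (seeds == 0)] = 0          # foreground without a seed = unknown
markers = cv2.watershed(img, markers)              # separating ridges are labelled -1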
↧
imencode not working with ".jpg" extension (Access violation reading location)
I updated my OpenCV from 3.4.5 to 3.4.11, and since then cv::imencode does not work with ".jpg" files, while ".png" and ".bmp" work fine.
vector<int> params;
vector<uchar> buffer;   // assumed to be declared like this; not shown in the original snippet
string defaultFormat = ".jpg";
params.push_back(CV_IMWRITE_JPEG_QUALITY);
params.push_back(100);
if (!image.empty())
{
    imencode(defaultFormat, image, buffer, params);
}
The error message: Access violation reading location 0x00000014.
I've tried both release and debug mode.
Additional Dependencies:
IlmImf.lib
ippicvmt.lib
ippiw.lib
ittnotify.lib
libjasper.lib
libjpeg-turbo.lib
libpng.lib
libprotobuf.lib
libtiff.lib
libwebp.lib
opencv_world3411.lib
quirc.lib
zlib.lib
I'm on Windows 10 x64.
↧
Cannot decode server-sent uchar vector on client side: Empty Mat results.
Hello, thank you in advance for any help that may be provided here:
I am trying to send a Mat image from a server to a client over TCP/IP with Winsock (between two different PCs). Like many others on these forums I am also new to socket programming. I have followed this very good thread, "https://answers.opencv.org/question/197414/sendreceive-vector-mat-over-socket-cc/", which I have basically adapted for my own solution. Thank you therefore to the contributors (pwk3, berak & opalmirror).
I have "successfully" been able to send a uchar vector, (converted from cv::Mat via imencode(..)) which appears to be exactly the same size when received on the client-side in comparison to the server-side prior to sending.
Unfortunately however when I try to decode it on the client side it results in an empty matrix. According to the docs: "If the buffer is too short or contains invalid data, the function returns an empty matrix ( Mat::data==NULL )."
Therefore I am assuming that, although the exact number of bytes was received, there is still some corruption somewhere.
If I decode the encoded buffer on the server side before transmission the original Mat is directly recoverable with the same code. Hence the issue has to occur in the transmission between PC's somehow.
I have attached an image showing the following: What I can see on the server (the sending side), and what appears on the client (receipt side).[C:\fakepath\Client_Receive_Server_Send_Output.jpg](/upfiles/16009911501490604.jpg).
If someone could please assist in perhaps explaining why I am able to decode on the server side without obtaining an empty Mat but not on the client that would be much appreciated. I feel the data must somehow be corrupted during transmission (even though it looks as though all data was received). Any tips would be much appreciated.
Thank you in advance.
↧
peculiar docker problem
Hi there,
I am seeing some strange issues when using opencv (opencv-python==4.3.0.36) in a docker container.
First the base image is python:3-slim
I have a very simple python file that does something like this:
import cv2
def main():
    ....
    cap = cv2.VideoCapture(video_src)
    # for the purpose of checking
    if cap.isOpened():
        print("open")
    else:
        print("closed")
So if I run my container, which calls the script (the call looking like **python python_file_name.py -f path_to_video_file**),
I get the following error:
**OpenCV(4.3.0) /tmp/pip-req-build-6amqbhlx/opencv/modules/videoio/src/cap_images.cpp:253: error: (-5:Bad argument) CAP_IMAGES: can't find starting number (in the name of file): /sample_video/hkk_video_demo.avi in function 'icvExtractPattern'**
and the **cap.isOpened()** returns false
I have researched this error, and many have resolved it by installing some sort of ffmpeg codec.
However this is where (for me at least) it gets strange.
When I **docker exec -it imageID /bin/bash** into the container and run the same command above, it runs fine and **cap.isOpened()** returns true.
I have tried restructuring the calling mechanism in a few different ways, but nothing seems to work when calling from the Dockerfile.
I created a script to be called at [ENTRYPOINT] in the Dockerfile with the shebang set to /bin/bash and thought that might work, but seeing as I am writing this, you can guess it did not.
If anyone has any ideas I would appreciate some feedback.
Thanks in advance,
wil
↧
Determine new coordinates of pixel after calibration + remap
I'm trying to calibrate a camera, and I need to be able to determine where each pixel of the original image moves.
As far as I understand, initUndistortRectifyMap creates horizontal and vertical maps (map1 and map2) for remap as the result of some mathematical functions, but I honestly did not quite get [all the math behind it](https://docs.opencv.org/master/d9/d0c/group__calib3d.html#ga7dfb72c9cf9780a347fbe3d1c47e5d5a).
I think I get the idea of how x'' and y'' are calculated (also, can you check if I got it right in this screenshot from Maple?), but I can't understand what's going on with s and x''' & y'''.
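For reference, my reading of the initUndistortRectifyMap documentation is that with only k1, k2, p1, p2, k3 non-zero the per-pixel mapping reduces to the following, where (x', y') are the normalized coordinates obtained from the destination pixel (u, v) via the (here identity) rectification R and the inverse of the new camera matrix:

$$
\begin{aligned}
r^2 &= x'^2 + y'^2 \\
x'' &= x'\,(1 + k_1 r^2 + k_2 r^4 + k_3 r^6) + 2 p_1 x' y' + p_2 (r^2 + 2 x'^2) \\
y'' &= y'\,(1 + k_1 r^2 + k_2 r^4 + k_3 r^6) + p_1 (r^2 + 2 y'^2) + 2 p_2 x' y' \\
\mathrm{map}_x(u, v) &= x'' f_x + c_x, \qquad \mathrm{map}_y(u, v) = y'' f_y + c_y
\end{aligned}
$$

The s and (x''', y''') terms in the full formula belong to the tilted-sensor part of the model (the tauX, tauY coefficients); when those are zero, that extra matrix is the identity, so s = 1 and (x''', y''') = (x'', y''), and they can be ignored for the five-coefficient case above.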

Here's the code. As far as I understand, in my case R is the identity matrix and all distortion coefficients apart from k1, k2, p1, p2, k3 are zero.
auto result = calibrateCamera(objectPoints, // the 3D points
imagePoints, // the image points
imageSize, // image size
cameraMatrix, // output camera matrix
distCoeffs, // output distortion matrix
rvecs, tvecs, // Rs, Ts
flag); // set options
// . . .
cv::initUndistortRectifyMap(
cameraMatrix, // computed camera matrix
distCoeffs, // computed distortion matrix
cv::Mat(), // optional rectification (none)
cv::Mat(), // camera matrix to generate undistorted
cv::Size(640, 480), // size of undistorted
CV_32FC1, // type of output map
map1, map2); // the x and y mapping functions
↧
How to update UMat when Mat is updated?
I am writing an application in which I'll have two Mat objects whose data will be updated every 50 ms. Is there any way to create a UMat for each Mat object so that once the Mat data updates, the UMat data is also updated?
Thank you
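As far as I know there is no automatic synchronisation between a Mat and a UMat; the usual pattern is to re-upload explicitly whenever the Mat changes (in C++ that would be something like mat.copyTo(umat)). A minimal sketch of the same idea with the Python bindings, purely as an illustration:

import cv2
import numpy as np

frame = np.zeros((480, 640, 3), np.uint8)   # stands in for a Mat that is rewritten every 50 ms

def refresh_device_copy(mat):
    # cv2.UMat(ndarray) copies the current contents into UMat-backed memory;
    # call it again after each update so the UMat stays in sync.
    return cv2.UMat(mat)

u_frame = refresh_device_copy(frame)
# ... frame gets modified by the capture/update loop ...
u_frame = refresh_device_copy(frame)        # re-upload after the update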
↧
Preform separation and image segmentation
Hi everyone!
I'm beginner at OpenCV and Computer Vision.
----------
I need to solve the problem of separating connected preforms. I wrote code that handles a single preform fairly well, but if there is more than one of them, problems appear, which you can see in the picture.
How can I solve this problem? I read about the watershed algorithm, but applying it did not give the desired result.


[Source code](https://pastebin.com/bv91TCaE)
If you want, you can get the source images here: [click](https://drive.google.com/drive/folders/1nCSwTxkRNMQ3yHaVbLHc6zhdQ0tvZivY?usp=sharing)
#include "mainwindow.h"
using namespace std;
using namespace cv;
// Find the equation of a line
void line_equation(Point src1, Point src2, double& k, int& b) {
if ((src1.x != src2.x) && (src1.y != src2.y)) {
k = (double) (src1.y - src2.y) / (double) (src1.x - src2.x);
b = src1.y - (k * src1.x);
} else {
k = 0;
}
}
void find_parallel(Point src, vector<Point>& dst, double k, int rows) {
int b = src.y - k * src.x;
dst.push_back(Point(-b/k, 0));
dst.push_back(Point((rows - b) / k, rows));
}
// Find the intersection points of a line and a contour
void line_interseption(vector<Point> src_line, Point p, Point q, vector<Point>& dst_point) {
int d = 0;
bool first_point_flag = false;
bool second_point_flag = false;
for(size_t i = 1; i < src_line.size(); i++) {
d = abs((p.y - q.y)*src_line[i].x - (p.x - q.x)*src_line[i].y + p.x * q.y - p.y * q.x)
/sqrt((p.y - q.y) * (p.y - q.y) + (p.x - q.x) * (p.x - q.x));
if(p.x != q.x) {
if((d == 0) && !first_point_flag && (p.x > src_line[i].x)) {
dst_point.push_back(src_line[i]);
first_point_flag = true;
}
if((d == 0) && !second_point_flag && (p.x < src_line[i].x)) {
dst_point.push_back(src_line[i]);
second_point_flag = true;
}
} else {
if((d == 0) && !first_point_flag && (p.y > src_line[i].y)) {
dst_point.push_back(src_line[i]);
first_point_flag = true;
}
if((d == 0) && !second_point_flag && (p.y < src_line[i].y)) {
dst_point.push_back(src_line[i]);
second_point_flag = true;
}
}
}
}
// Find the intersection point of two lines
void intersection_points(Point flp1, Point flp2, Point slp1, Point slp2, Point& dst_point) {
double k1 = (double)(flp2.y - flp1.y) / (flp2.x - flp1.x);
double k2 = (double)(slp2.y - slp1.y) / (slp2.x - slp1.x);
double b1 = k1 * flp1.x - flp1.y;
double b2 = k2 * slp1.x - slp1.y;
dst_point.x = -(b1 - b2) / (k2 - k1);
dst_point.y = -(k2 * b1 - k1 * b2) / (k2 - k1);
}
// Find the point farthest from the line
void find_max(vector<Point> sepCont, Point p, Point q, Point &dst) {
dst = sepCont[0];
int d = 0;
int d_max = abs((p.y - q.y)*sepCont[0].x - (p.x - q.x)*sepCont[0].y + p.x * q.y - p.y * q.x)
/sqrt((p.y - q.y) * (p.y - q.y) + (p.x - q.x) * (p.x - q.x));
for(size_t i = 1; i < sepCont.size(); i++) {
d = abs((p.y - q.y)*sepCont[i].x - (p.x - q.x)*sepCont[i].y + p.x * q.y - p.y * q.x)
/sqrt((p.y - q.y) * (p.y - q.y) + (p.x - q.x) * (p.x - q.x));
if(d > d_max) {
dst = sepCont[i];
d_max = d;
}
}
}
// Find the distance between two points
int find_dist(Point p1, Point p2) {
return sqrt((p2.x - p1.x) * (p2.x - p1.x) + (p2.y - p1.y) * (p2.y - p1.y));
}
// Find a point on the line
void point_on_line(Point src1, Point src2, double k, int b, Point& dst) {
double dif = 0;
if(src1.y - src2.y != 0) {
dif = (double)abs(src1.x - src2.x) / (double)abs(src1.y - src2.y);
} else {
dif = 2;
}
int dist = find_dist(src1, src2);
if(dif > 1) {
if(src1.x > src2.x) {
dst.y = k * (src1.x - dist/10) + b;
dst.x = src1.x - dist/10;
} else {
dst.y = k * (src1.x + dist/10) + b;
dst.x = src1.x + dist/10;
}
} else {
if(src1.y > src2.y) {
dst.y = src1.y - dist/10;
dst.x = (dst.y - b) / k;
} else {
dst.y = src1.y + dist/10;
dst.x = (dst.y - b) / k;
}
}
}
void point_on_line(vector<Point> src, double k, int b, vector<Point>& dst) {
double dif = 0;
if(src[0].y - src[1].y != 0) {
dif = (double)abs(src[0].x - src[1].x) / (double)abs(src[0].y - src[1].y);
} else {
dif = 2;
}
int dist = find_dist(src[0], src[1]);
if(dif > 1) {
if(src[0].x > src[1].x) {
dst[0].y = k * (src[0].x - dist/10) + b;
dst[0].x = src[0].x - dist/10;
dst[1].y = k * (src[1].x + dist/10) + b;
dst[1].x = src[1].x + dist/10;
} else {
dst[0].y = k * (src[0].x + dist/10) + b;
dst[0].x = src[0].x + dist/10;
dst[1].y = k * (src[1].x - dist/10) + b;
dst[1].x = src[1].x - dist/10;
}
} else {
if(src[0].y > src[1].y) {
dst[0].y = src[0].y - dist/10;
dst[0].x = (dst[0].y - b) / k;
dst[1].y = src[1].y + dist/10;
dst[1].x = (dst[1].y - b) / k;
} else {
dst[0].y = src[0].y + dist/10;
dst[0].x = (dst[0].y - b) / k;
dst[1].y = src[1].y - dist/10;
dst[1].x = (dst[1].y - b) / k;
}
}
}
// Extract a rectangle given 4 points
void extractRect(Mat src, Mat dst, int min_y, int min_x, int max_y, int max_x,
int b1, int b2, int b3, int b4, double k1, double k2) {
while(min_y <= max_y) {
int x1 = (min_y - b3) / k1;
int x2 = (min_y - b1) / k2;
if(x1 < min_x)
x1 = (min_y - b2) / k1;
if(x2 > max_x)
x2 = (min_y - b4) / k2;
if(x1 > x2) {
int temp = x2;
x2 = x1;
x1 = temp;
}
for(int i = x1; i <= x2; i++) {
dst.at<uchar>(Point(i, min_y)) = src.at<uchar>(Point(i, min_y));
}
min_y++;
}
}
void separeteContour(vector<vector<Point>> src, vector<Point>& top, vector<Point>& low, double k, int b) {
for(size_t i = 0; i < src.size(); i++) {
for(size_t j = 0; j < src[i].size(); j++) {
int y_cont = k * src[i][j].x + b;
if(src[i][j].y < y_cont) {
low.push_back(src[i][j]);
} else {
top.push_back(src[i][j]);
}
}
}
}
int main()
{
Mat fg, bg, blured, preform, bin, ROI, erosion_dst, cont;
vector<vector<Point>> contours, true_cont;
Point2d vertices[4];
int b_main, b1, b2, b3, b4;
double k_main, k_peak;
bg = imread("C:\\2\\bg.bmp", IMREAD_GRAYSCALE);
fg = imread("C:\\2\\pf1.bmp", IMREAD_GRAYSCALE);
resize(fg, fg, Size(), 0.3, 0.3, INTER_AREA);
resize(bg, bg, Size(), 0.3, 0.3, INTER_AREA);
absdiff(fg, bg, preform);
threshold(preform, bin, 10, 255, THRESH_BINARY);
findContours(bin, contours, RETR_EXTERNAL, CHAIN_APPROX_NONE);
uint32_t resolution = fg.rows * fg.cols;
uint32_t kernelSize = resolution < 1280 * 1280 ? 7:
resolution < 2000 * 2000 ? 15:
resolution < 3000 * 3000 ? 27:
45;
GaussianBlur(bin, blured, Size(kernelSize, kernelSize), 9, 9);
imshow("bin", bin);
imshow("pf", preform);
for(size_t i = 0; i < contours.size(); i++) {
if(contours[i].size() > 1000)
true_cont.push_back(contours[i]);
}
vector<Vec3f> circles;
HoughCircles(blured, circles, HOUGH_GRADIENT, 1, fg.rows/16, 100, 30, 1, 100);
Moments mnt = moments(true_cont[0]);
Point center_mass(mnt.m10/mnt.m00, mnt.m01/mnt.m00);
Point circle_center(cvRound(circles[0][0]), cvRound(circles[0][1]));
line_equation(center_mass, circle_center, k_main, b_main);
int main_dist = find_dist(center_mass, circle_center);
Point peak1, peak2;
vector<Point> top_cont, low_cont;
separeteContour(true_cont, top_cont, low_cont, k_main, b_main);
find_max(top_cont, center_mass, circle_center, peak1);
find_max(low_cont, center_mass, circle_center, peak2);
line_equation(peak1, peak2, k_peak, b1);
Point neck_point;
intersection_points(center_mass, circle_center, peak1, peak2, neck_point);
point_on_line(neck_point, center_mass, k_main, b_main, neck_point);
line(fg, Point(-b_main/k_main, 0), Point((fg.rows-b_main)/k_main, fg.rows), 255, 1);
vector<Point> neck_parallel;
vector<Point> circle_parallel;
vector<Point> circle_int;
vector<Point> circle_int_par1;
vector<Point> circle_int_par2;
find_parallel(neck_point, neck_parallel, k_peak, fg.rows);
find_parallel(circle_center, circle_parallel, k_peak, fg.rows);
line_interseption(top_cont, circle_parallel[0], circle_parallel[1], circle_int);
line_interseption(low_cont, circle_parallel[0], circle_parallel[1], circle_int);
b2 = circle_int[0].y - k_peak * circle_int[0].x;
point_on_line(circle_int, k_peak, b2, circle_int);
find_parallel(circle_int[0], circle_int_par1, k_main, fg.rows);
find_parallel(circle_int[1], circle_int_par2, k_main, fg.rows);
Point neck_int1, neck_int2;
intersection_points(circle_int_par1[0], circle_int_par1[1], neck_parallel[0], neck_parallel[1], neck_int1);
intersection_points(circle_int_par2[0], circle_int_par2[1], neck_parallel[0], neck_parallel[1], neck_int2);
vertices[0] = circle_int[0];
vertices[1] = circle_int[1];
vertices[2] = neck_int1;
vertices[3] = neck_int2;
circle(fg, neck_point, 2, 0, 2);
circle(fg, peak1, 2, 100, 2);
circle(fg, peak2, 2, 100, 2);
circle(fg, vertices[0], 2, 0, 2);
circle(fg, vertices[1], 2, 0, 2);
circle(fg, vertices[2], 2, 0, 2);
circle(fg, vertices[3], 2, 0, 2);
line(fg, neck_parallel[1], neck_parallel[0], 255, 1);
line(fg, circle_parallel[1], circle_parallel[0], 255, 1);
line(fg, circle_int_par1[1], circle_int_par1[0], 255, 1);
line(fg, circle_int_par2[1], circle_int_par2[0], 255, 1);
imshow("fg", fg);
imshow("pf", preform);
imshow("bl", blured);
waitKey();
return EXIT_SUCCESS;
}
↧
How to Apply warpPerspective only for part of Image?
In my application I am doing warpPerspective, and after that I am cropping with an ROI (example: cv::Rect(45, 233, 403, 182)). Is there any way to apply warpPerspective only to the ROI so that I can reduce the time?

I need to apply warpPerspective only to the part marked with the yellow rectangle.
Thanks!
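One common way to avoid warping the full frame is to pre-multiply the homography with a translation that moves the ROI's top-left corner to the origin, then warp into an output that is only ROI-sized; this is equivalent to the full warp followed by the crop. A sketch of the idea (Python for brevity; the image and homography here are placeholders for the ones in the question):

import cv2
import numpy as np

src = cv2.imread('input.png')             # hypothetical input image
H = np.eye(3)                             # stands in for the real perspective matrix
x, y, w, h = 45, 233, 403, 182            # the cv::Rect from the question

# Shift the output so that warped pixel (x, y) lands at (0, 0),
# then only compute a w x h image.
T = np.array([[1.0, 0.0, -x],
              [0.0, 1.0, -y],
              [0.0, 0.0, 1.0]])
roi_only = cv2.warpPerspective(src, T @ H, (w, h))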
↧
Disable file system cache during imwrite
I am trying to build an application with OpenCV and C++ in a Linux environment to capture and save 10000 images. But after saving about 800 images the system buff/cache keeps growing and the application runs slowly (the system hangs). Is there any way to disable the file system cache, clear the cache after imwrite, or save the image without using imwrite? (I want to store in .bmp format.)
Any suggestion will be helpful. Thank you.
↧
What is the length of the tape/strip from the image?
Problem explanation:
1. The tape can be in any position in the image.
2. The tape can be oriented vertically or horizontally.
Condition: this must be done without a deep learning model, only with the help of OpenCV.
My approach is given below:
1. First convert the image to grayscale.
2. Threshold to differentiate between foreground and background.
3. Apply morphology to reduce the noise in the image.
4. After that, try to find the ROI, but I am unable to detect the tape.
Please suggest how to proceed; any guidance would be very helpful.
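A rough sketch of one way to continue the pipeline described above: after thresholding and morphology, take the largest contour and measure the longer side of its rotated bounding box, which handles vertical, horizontal, and tilted tape alike. File name, threshold choice, and kernel size are placeholders:

import cv2
import numpy as np

img = cv2.imread('tape.png')                        # hypothetical input image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
_, mask = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))

contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
tape = max(contours, key=cv2.contourArea)           # assume the tape is the largest blob

# The longer side of the minimum-area rectangle is the tape length in pixels.
(cx, cy), (w, h), angle = cv2.minAreaRect(tape)
length_px = max(w, h)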
↧
Hi, I have started working in the ML and CV field recently. I want to participate in GSoC 2021. Can someone please guide me and suggest what I should do to contribute?
Can someone help me with participating in GSoC 2021? As this is my first time, I want to be part of GSoC 2021.
Thank you
↧
Calibration grid larger than FOV - which method?
I have a custom circle-grid calibration routine that detects three donut markers to establish the world coordinate system. With this method our pattern can be partially visible, which helps getting good coverage in the corners. My homegrown circle detection isn't very good, though, and I'd like to use the existing methods in OpenCV to get better grid locations. Previously we had to use the vendor-specific circle grid with the donut markers but now have the freedom to choose checkerboards.
Questions:
1) What's the recommended method, charuco or findChessboardCornersSB()?
2) Is there any documentation on the marker detection algorithm and requirements for the "Radon" chessboard shown here: https://docs.opencv.org/master/checkerboard_radon.png?
Thanks,
Andrew Queisser
hp
↧
Fastest way to convert BGR <-> RGB! Aka: Do NOT use Numpy magic "tricks".
----------
**IMPORTANT:** This article is **very** long. Remember to click the `(more)` at the bottom of this post to read the whole article!
----------
*I was reading this question: https://answers.opencv.org/question/188664/colour-conversion-from-bgr-to-rgb-in-cv2-is-slower-than-on-python/ and it didn't explain things very well at all. So here's a deep examination and explanation for everyone's future reference!*
**Converting RGB to BGR, and vice versa, is one of the most important operations you can do in OpenCV if you're interoperating with other libraries, raw system memory, etc. And each imaging library depends on its own special channel order.**
There are many ways to achieve the conversion, and `cv2.cvtColor()` is often frowned upon because there are "much faster" ways to do it via numpy "view" manipulation.
Whenever you attempt to convert colors in OpenCV, you actually invoke a huge machinery:
https://github.com/opencv/opencv/blob/8c0b0714e76efef4a8ca2a7c410c60e55c5e9829/modules/imgproc/src/color.cpp#L20-L25
https://github.com/opencv/opencv/blob/8b541e450b511fde9dd363fa55a30fbb6fc0ace6/modules/imgproc/src/color_rgb.dispatch.cpp#L426-L437
As you can see, internally, OpenCV creates an "OpenCL Kernel" with the instructions for the data transformation, and then runs it. This creates brand new (re-arranged) image data in memory, which is of course a pretty slow operation, involving new memory allocation and data-copying.
However, there is another way to flip between RGB and BGR channel orders, which is very popular - and very bad (as you'll find out soon). And that is: Using numpy's built-in methods for manipulating the array data.
Note that there are two ways to manipulate data in Numpy:
- One of the ways, the bad way, just changes the "view" of the Numpy array and is therefore instant (`O(1)`), but does NOT transform the underlying `img.data` in RAM/memory. This means that the raw memory does NOT contain the new channel order, and Numpy instead "fakes" it by creating a "view" that simply says "when we read this data from RAM, view it as R=B, G=G, B=R" basically... (Technically speaking, it changes the ".strides" property of the Numpy object, which instead of saying "read R then G then B" (stride "1" aka going forwards in RAM when reading the color channels) changes it to say "read B, then G, then R" (stride "-1" aka going backwards in RAM when reading the color channels)).
- The second way, which is totally fine, is to *always ensure* that we arrange the pixel data properly in memory too, which is a lot slower but is *almost always necessary*, depending on what library/API your data is intended to be sent to!
To determine whether a numpy array manipulation has also changed the underlying MEMORY, you can look at the `img.flags['C_CONTIGUOUS']` value. If `True` it means that the data in RAM is in the correct order (that's great!). If `False` it means that the data in RAM is in the wrong order and that we are "cheating" via a numpy View instead (that's BAD!).
Whenever you use the "View-based" methods to flip channels in an ndarray (such as `RGB -> BGR`), its `C_CONTIGUOUS` becomes `False`. If you then flip the image's channels again (such as `BGR -> back to RGB`), its `C_CONTIGUOUS` becomes `True` again. So, the "view" is able to be transformed multiple times, and the "Contiguous" flag only says `True` whenever the view happens to match the actual RAM data's layout.
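A quick way to see that flag flipping in practice (a tiny standalone demonstration, not from the original benchmarks):

import numpy as np

x = np.zeros((4, 4, 3), np.uint8)
print(x.flags['C_CONTIGUOUS'])     # True: the view matches the real RAM order
y = x[..., ::-1]                   # "channel flip" as a view only
print(y.flags['C_CONTIGUOUS'])     # False: the strides now read the channels backwards
z = y[..., ::-1]                   # flip the view back again
print(z.flags['C_CONTIGUOUS'])     # True again: the view matches the RAM layout once more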
So... in what situations do you need the data to ALWAYS be contiguous? Well, it varies based on API...
- OpenCV APIs *all* need data in **contiguous** order. If you give it non-contiguous data from Python, the Python-to-OpenCV wrapper layer *internally* makes a COPY of the BADLY FORMATTED image you gave it, and then converts the color channel order, and THEN finally passes the COPIED-AND-CONTIGUOUS image to the internal OpenCV C++ function. This is of course very wasteful!
- Matplotlib APIs do not need contiguous data, because they have stride-handling code. But all of their calls are slowed down if given non-contiguous data, [as seen here](https://github.com/scivision/pymap3d/issues/30#issuecomment-537663693).
- Other libraries: Depends on the library. Some of them do something like "take the `img.data` memory address and give it to a raw Windows API via a COM call" in which case YES the RAM-data MUST be contiguous too.
**What type of data do YOU need?**
If you want the SAFEST possible data that is 100% sure to work in ANY API ANYWHERE, you should always make CONTIGUOUS pixel data. It doesn't take long to do the conversion up-front, since we're still talking about very fast operations!
There are probably situations where non-contiguous data is fine, such as if you are doing all image manipulations purely in Numpy math without any library APIs (in which case there's no real reason to convert the data layout to contiguous in RAM). But as soon as you invoke various library APIs, you should *pretty much always* have contiguous data, otherwise you'll create huge performance issues (or even completely incorrect results).
I'll explain those performance issues further down, but first we'll look at the various "conversion techniques" people use in Python.
**Techniques**
Without further ado, here are all the ways that people use in Python whenever they want to convert back/forth between RGB and BGR. These benchmarks are on a 4K image (3840x2160):
- Always Contiguous: No. Method: `x = x[...,::-1]`. Speed: `237 nsec (aka 0.237 usec aka 0.000237 msec) per call`
- Always Contiguous: Yes. Method: `x = x[...,::-1].copy()`. Speed: `37.5 msec per call`
- Always Contiguous: No. Method: `x = x[:, :, [2, 1, 0]]`. Speed: `12.6 msec per call`
- Always Contiguous: Yes. Method: `x = cv2.cvtColor(x, cv2.COLOR_RGB2BGR)`. Speed: `5.39 msec per call`
- Always Contiguous: No. Method: `x = np.fliplr(x.reshape(-1,3)).reshape(x.shape)`. Speed: `1.62 usec (aka 0.00162 msec) per call`
- Always Contiguous: Yes. Method: `x = np.fliplr(x.reshape(-1,3)).reshape(x.shape).copy()`. Speed: `37.4 msec per call`
- Always Contiguous: No. Method: `x = np.flip(x, axis=2)`. Speed: `2.74 usec (aka 0.00274 msec) per call`
- Always Contiguous: Yes. Method: `x = np.flip(x, axis=2).copy()`. Speed: `37.5 msec per call`
- Always Contiguous: Yes. Method: `r = x[..., 0].copy(); x[..., 0] = x[..., 2]; x[..., 2] = r`. Speed: `21.8 msec per call`
- Always Contiguous: Yes. Method: `x[:, :, [0, 2]] = x[:, :, [2, 0]]`. Speed: `21.7 msec per call`
- Always Contiguous: Yes. Method: `x[..., [0, 2]] = x[..., [2, 0]]`. Speed: `21.8 msec per call`
- Always Contiguous: Yes. Method: `x[:, :, [0, 1, 2]] = x[:, :, [2, 1, 0]]`. Speed: `33.1 msec per call`
- Always Contiguous: Yes. Method: `x[:, :] = x[:, :, [2, 1, 0]]`. Speed: `49.3 msec per call`
- Always Contiguous: Yes. Method: `foo = x.copy()`. Speed: `11.8 msec per call` (This example doesn't change the RGB/BGR channel order, and is just included here as a reference, to show how slow Numpy is at doing a *super simple* copy of an already-contiguous chunk of RAM. As you can see, even when the data is already in the proper order, Numpy is very slow... And if "x" had been non-contiguous here, it would be even slower, as shown in the `x = x[...,::-1].copy()` (equivalent to saying `bar = x[...,::-1]; foo = bar.copy()`) example near the top of the list, which took `37.5 msec` and demonstrates Numpy copying non-contiguous RAM (from numpy "views" marked as "read in reverse order" via "stride = -1") into contiguous RAM...
PS: Whenever we want contiguous data from numpy, we're mostly using `x.copy()` to tell Numpy to allocate new RAM and copy all data to it in the correct (contiguous) order. There's also a `np.ascontiguousarray(x)` API but it does the *exact* same thing (it copies too, "but only when the Numpy data isn't already contiguous") and requires much more typing. ;-) And in a *few* of the examples we're using special indexing (such as `x[:, :, [0, 1, 2]] = x[:, :, [2, 1, 0]]`) to overwrite the memory directly, which always creates contiguous memory with correct "strides", and is faster than telling Numpy to do a `.copy()`, but is still *extremely* slow compared to `cv2.cvtColor()`.
Docs for the various Numpy functions: [copy](https://docs.scipy.org/doc/numpy/reference/generated/numpy.copy.html), [ascontiguousarray](https://docs.scipy.org/doc/numpy/reference/generated/numpy.ascontiguousarray.html), [fliplr](https://docs.scipy.org/doc/numpy/reference/generated/numpy.fliplr.html#numpy.fliplr), [flip](https://docs.scipy.org/doc/numpy/reference/generated/numpy.flip.html#numpy.flip), [reshape](https://docs.scipy.org/doc/numpy/reference/generated/numpy.reshape.html)
Here's the benchmark that was used:
`python -m timeit -s "import numpy as np; import cv2; x = np.zeros([2160,3840,3], np.uint8); x[:,:,2] = 255; x[:,:,1] = 100" "ALGORITHM HERE"`
Replace the `"ALGORITHM HERE"` part with the algorithm above, such as `"x = np.flip(x, axis=2).copy()"`.
**People's Misunderstandings of those Benchmarks**
Alright, so we're finally getting to the whole purpose of this article!
When people see the benchmarks above, they usually think "Oh my god, `x = x[...,::-1]` executes in `0.000237` milliseconds, and `x = cv2.cvtColor(x, cv2.COLOR_RGB2BGR)` executes in `5.39` milliseconds, which is **23798 times slower!!**" And then they decide to always use "Numpy view manipulation" to do their channel conversions.
That's a huge mistake. And here's why:
- When you call an OpenCV API from Python, and pass it a `numpy.ndarray` image object, there's a process which prepares that data for internal usage within OpenCV (since OpenCV itself doesn't use `ndarray` internally; it uses `cv::Mat`).
- First, your Python object (which is coming from the `PyOpenCV` module) goes into the appropriate `pyopencv_to()` function, whose purpose is to convert raw Python objects (such as numbers, strings, `ndarray`, etc), into something usable by OpenCV internally in C++.
- Your Python object first enters the "full ArgInfo converter" code at https://github.com/opencv/opencv/blob/778f42ad34559451d62ac9ba585717aec77fb23a/modules/python/src2/cv2.cpp#L249
- That code, in turn, looks at the object and determines if it's a number, a float, or a tuple... If it's any of those, it does the appropriate conversion. Otherwise it assumes it's a Numpy array, at this line: https://github.com/opencv/opencv/blob/778f42ad34559451d62ac9ba585717aec77fb23a/modules/python/src2/cv2.cpp#L292
- Next, it begins to analyze the Numpy array to determine how to use the data internally. It wants to determine "do we need to copy the data or can we use it as-is? do we need to cast the data?", see here: https://github.com/opencv/opencv/blob/778f42ad34559451d62ac9ba585717aec77fb23a/modules/python/src2/cv2.cpp#L300
- It first does some simple checks to see if the number-type in the Numpy array is legal or not. (If illegal type, it marks the data as "needs copy" *and* "needs cast").
- Next, it retrieves the "strides" information from the Numpy array, which is those simple numbers such as "-1" which determine how to read a numpy array (such as backwards, in the case of our "fast" numpy-based "channel flipping" code earlier): https://github.com/opencv/opencv/blob/778f42ad34559451d62ac9ba585717aec77fb23a/modules/python/src2/cv2.cpp#L341-L342
- Then it analyzes the Strides for all dimensions of the Numpy array, and if it finds a non-contiguous stride (our "screwed up" data layout caused by doing those so-called "fast" Numpy view manipulations), then it marks the data as "needs copy": https://github.com/opencv/opencv/blob/778f42ad34559451d62ac9ba585717aec77fb23a/modules/python/src2/cv2.cpp#L344-L357
- Next, if "needs copy" is true, it does this horrible thing: https://github.com/opencv/opencv/blob/778f42ad34559451d62ac9ba585717aec77fb23a/modules/python/src2/cv2.cpp#L367-L374
- As you can see, it calls `PyArray_Cast()` (if casting was needed too) or `PyArray_GETCONTIGUOUS()` (if we only need to make sure the data is contiguous). *Both* of those functions, no matter which one is called, then generates a brand-new Python Numpy Object, with all data COPIED by Numpy into brand-new memory, and re-arranged into proper Contiguous ordering. That's extremely wasteful! I'll explain more soon, after this walkthrough of what the code does...
- Finally, the code proceeds to create a `cv::Mat` object whose data pointer points at the internal byte-array (RAM) of the Numpy object, ie. the RAM address that you can easily see in Python by typing `img.data`. That's an incredibly fast operation because it is just a pointer which says `Use the existing RAM data owned by Numpy at RAM address XYZ`: https://github.com/opencv/opencv/blob/778f42ad34559451d62ac9ba585717aec77fb23a/modules/python/src2/cv2.cpp#L415-L416
So, can you spot the problem yet?
When you pass a contiguous Numpy array, the conversion into OpenCV is pretty much INSTANT: "This data looks fine! Just give its RAM address to `cv::Mat` and voila!".
But when you instead insist on using those so-called "fast" channel transformations, where you "tweak" the Numpy array's view and stride values, then you are giving OpenCV a Numpy array with non-contiguous RAM and bad "strides". The PyOpenCV layer (the wrapper between OpenCV and Python) detects this problem, and creates a BRAND NEW, COPIED, RE-ARRANGED (CONTIGUOUS) NUMPY ARRAY. This is VERY VERY VERY VERY SLOW.
In other words, if you've used those dumb Numpy "view" manipulation tricks, *EVERY* CALL TO OPENCV APIS IS CAUSING a HUGE memory copy (images are large, especially 1080p+ screenshots/video frames), a lot of math inside `PyArray_GETCONTIGUOUS / PyArray_Cast` to create that new object while respecting your tweaked "strides", etc.
Your code won't be faster at all. It will be SLOWER!
**Demonstration of the Slowness**
Let's use a random OpenCV API to demonstrate the slowdown caused by all of those conversions. We'll use `cv2.imshow` here, but *any* OpenCV API call will always be doing the same "Python to OpenCV" conversions of the numpy data, so the exact API doesn't matter. They will *all* have this overhead.
Here's the example code:
import cv2
import numpy as np
import time

#img1 = cv2.imread("yourimage.png") # If you want to test with an image.
img1 = np.zeros([2160,3840,3], np.uint8) # Create a 4K image.
img1[:,:,2] = 255; img1[:,:,1] = 100 # Fill the channels with different values.
img2 = img1[...,::-1] # Make a "channel flipped view" of the Numpy data.

print("img1 contiguous", img1.flags['C_CONTIGUOUS'], "img2 contiguous", img2.flags['C_CONTIGUOUS'])
print("img1 strides", img1.strides, "img2 strides", img2.strides)

wnd = ""  # window name; namedWindow() returns None, so keep the name itself in a variable
cv2.namedWindow(wnd, cv2.WINDOW_NORMAL)

def show1():
    cv2.imshow(wnd, img1)

def show2():
    cv2.imshow(wnd, img2)

iterations = 20

start = time.perf_counter()
for i in range(0, iterations):
    show1()
elapsed1 = (time.perf_counter() - start) * 1000
elapsed1_percall = elapsed1 / iterations

start = time.perf_counter()
for i in range(0, iterations):
    show2()
elapsed2 = (time.perf_counter() - start) * 1000
elapsed2_percall = elapsed2 / iterations

# We know that the contiguous (img1) data does not need conversion,
# which tells us that the runtime of the contiguous data is the
# "internal work of the imshow" function. We only want to measure
# the conversion time for non-contiguous data. So we'll subtract
# the first image's (contiguous) runtime from the non-contiguous time.
noncontiguous_overhead_per_call = elapsed2_percall - elapsed1_percall
print("Extra time taken per OpenCV call when given non-contiguous data (in ms):", noncontiguous_overhead_per_call, "ms")
The results:
img1 contiguous True img2 contiguous False
img1 strides (11520, 3, 1) img2 strides (11520, 3, -1)
Extra time taken per OpenCV call when given non-contiguous data (in ms): 39.45334999999999 ms
As you can see, the extra time added to the OpenCV calls when copy-conversion is needed (`39.45 ms`), is pretty much the same as when you call Numpy's own `img.copy()` on a "flipped view" inside Python itself (as seen in the earlier benchmark for "Method: `x = x[...,::-1].copy()`. Speed: `37.5 msec per call`").
So yes, *every* time you call OpenCV with a non-contiguous Numpy array as its argument, you are causing a Numpy `.copy()` to happen internally!
PS: If we repeat the same test above with 1920x1080 test data instead of 4K test data, we get `Extra time taken per OpenCV call when given non-contiguous data (in ms): 9.972125 ms` which means that at the world's most popular image resolution (1080p) you're still adding around 10 milliseconds of overhead to *all* of your OpenCV calls.
**Numpy "tricks" will cause subtle Bugs too!**
Using those Numpy "tricks" isn't *just* extremely slow. It will cause *very subtle bugs* in your code, *too*.
Look at this code and see if you can figure out the bug yourself before you run this example:
import cv2
import numpy as np
img1 = np.zeros([200,200,3], np.uint8) # Create a 200x200 image. (Is Contiguous)
img2 = img1[...,::-1] # Make a "channel flipped view" of the Numpy data. (A Non-Contiguous View)
print("img1 contiguous", img1.flags['C_CONTIGUOUS'], "img2 contiguous", img2.flags['C_CONTIGUOUS'])
print("img1 strides", img1.strides, "img2 strides", img2.strides)
cv2.rectangle(img2, (80,80), (120,120), (255,255,255), 2)
cv2.imshow("", img2)
What do you think the result will be when running this program? Logically, you expect to see a black image with a white rectangle in the middle... But instead, you see *nothing* except a black image. Why?
Well, it's simple... think about what was explained earlier about **how** `PyOpenCV` converts every incoming `numpy.ndarray` object into an internal C++ `cv::Mat` object. In this example, we're giving a *non-contiguous* `ndarray` as an argument to `cv2.rectangle()`, which causes `PyOpenCV` to "fix" the data by making a temporary, internal, *contiguous* `.copy()` of the image data, and then it wraps the *copy*'s memory address in a `cv::Mat`. Next, it passes that `cv::Mat` object to the internal C++ "draw rectangle" function, which dutifully draws a rectangle onto the memory pointed to by the `cv::Mat` object... which is... the memory of the temporary internal *copy* of your input array, since a copy had to be created...
So, OpenCV happily writes a rectangle to the temporary object *copy*. And then when execution returns to Python, you're of course seeing NO RECTANGLE, since *nothing* was drawn to *your* actual `ndarray` data in RAM (since its memory storage was non-contiguous and therefore *not usable as-is* by OpenCV).
If you want to see what the code above *should* be doing, simply add `img2 = img2.copy()` immediately above the `cv2.rectangle` call, to cause the `img2` ndarray object to become contiguous memory so that OpenCV won't need to make a copy of it (and will be able to use *that* exact object's memory internally, as intended)... After that tweak, you'll see OpenCV properly drawing the rectangle to the image...
This is the kind of subtle bug that is very easy to cause when you're playing around with faked Numpy "views" rather than *real* contiguous memory.
**Bonus: A note about Numpy "slices"**
Numpy allows you to efficiently "slice" arrays, to extract a "partial view" of the data. This is very useful for images, since you can do something such as extracting a 100:100 pixel square from the middle of an image. The slicing syntax is `img_sliced = img[y1:y2,x1:x2]`. This generates a full Numpy object which points at the data of the original image (they share each other's memory), but which only points at the sub-range you wanted. Therefore it's super fast (since the slice is just a small object that points at and says how to interpret a small range of the original array's data).
So it basically becomes a fully usable "Numpy array" object which you can use in any context you would pass an image. Such as to an OpenCV function, which would then only operate on the sliced segment of RAM. That's really useful!
However, be aware that the Numpy slices inherit the `strides` and `contiguous` flag of the original object / data they were sliced from! So if you're slicing from a non-contiguous array, you'll generate a non-contiguous slice object too, which is horrible and has *all* the issues of non-contiguous objects.
It's *only* safe to make partial views/slices (like `img[0:100, 0:100]`) when `img` itself is already PROVEN to be FULLY contiguous (with no "Numpy tricks" applied to it). In that case, feel free to pass your contiguous, partial image slices to OpenCV functions. You won't invoke any copy-mechanics in that case!
Alternatively, if you already have a non-contiguous image array and you want to slice it, it's faster to *slice first* and *then* make the *slice* contiguous, since that means less data copying (for example, a 100x100 slice of a 4K image will need much less copying "to make contiguous" than the whole image would have needed). By slicing first and then making a contiguous copy of the slice, you will ensure that your slice is contiguous/safe to use with OpenCV. As an example, let's say that `xyz` is a non-contiguous image; in that case, the technique would look as follows: `slice = xyz[0:100, 0:100].copy()` (create a non-contiguous slice "view" of a non-contiguous image, and then force that to become copied which creates a new contiguous array based on the slice's view). Alternatively, if you *don't know* if the image that you're slicing from is already contiguous or not, then you can use `slice = np.ascontiguousarray(xyz[0:100, 0:100])` (creates a slice "view", and then instantly uses that fast view as-is if already contiguous, else copies the data to a new contiguous array and returns that instead).
**Bonus: What to do when you get a non-contiguous ndarray from a library?**
As an example, the very cool `D3DShot` [library](https://github.com/SerpentAI/D3DShot) has an optional `numpy` mode where it retrieves the screenshots as `ndarray` objects. The problem is that it generates them from RAM data laid out in a different order, so it tweaks the `ndarray` strides etc to give us an object of the proper "shape" `(height, width, 3 color channels in RGB order)`. Its `.flags` property shows that Contiguous is FALSE.
So what do you do? If you try to pass that directly to OpenCV, you'll invoke the heavy `PyOpenCV` copy-mechanics described earlier.
Well, you have two options. In this example case, the colors are in RGB order, and you want them to be BGR for usage in OpenCV. So you should be invoking `cv2.cvtColor` which internally will trigger the Numpy `.copy()` for you (just like all OpenCV APIs do when given non-contiguous data), and then changes the color order in RAM for you.
The second option is when you have Numpy data that is already in the correct color order (such as BGR), but whose RAM is non-contiguous. In that case, you should *directly* invoke `img = img.copy()` to tell Numpy to make a contiguous copy of the array, to fix it. Then you're welcome to use that contiguous copy for everything. Also note that you can use `img = np.ascontiguousarray(img)` instead, if you're not sure if your library always returns non-contiguous data; this method automatically returns the same array if it was already contiguous, or does a `.copy` if it was non-contiguous.
Alright, so let's look at the `D3DShot` example:
import cv2
import d3dshot
import time
d = d3dshot.create(capture_output="numpy", frame_buffer_size=60)
img1 = d.screenshot()
img2 = d.screenshot()
print(img1.strides, img1.flags)
print(img2.strides, img2.flags)
print("-------------")
start = time.perf_counter()
img1_justcopy = img1.copy() # copy RGB image to new, contiguous RAM
elapsed = (time.perf_counter() - start) * 1000
print(img1_justcopy.strides, img1_justcopy.flags)
print("justcopy milliseconds:", elapsed)
print("-------------")
start = time.perf_counter()
img1 = img1.copy()
img1 = cv2.cvtColor(img1, cv2.COLOR_RGB2BGR) # flip RGB -> BGR
elapsed = (time.perf_counter() - start) * 1000
print(img1.strides, img1.flags)
print("copy+cvtColor milliseconds:", elapsed)
print("-------------")
start = time.perf_counter()
img2 = cv2.cvtColor(img2, cv2.COLOR_RGB2BGR) # flip RGB -> BGR
elapsed = (time.perf_counter() - start) * 1000
print(img2.strides, img2.flags)
print("cvtColor milliseconds:", elapsed)
Output:
(1920, 1, 2073600) C_CONTIGUOUS : False
F_CONTIGUOUS : False
OWNDATA : False
WRITEABLE : True
ALIGNED : True
WRITEBACKIFCOPY : False
UPDATEIFCOPY : False
(1920, 1, 2073600) C_CONTIGUOUS : False
F_CONTIGUOUS : False
OWNDATA : False
WRITEABLE : True
ALIGNED : True
WRITEBACKIFCOPY : False
UPDATEIFCOPY : False
-------------
(5760, 3, 1) C_CONTIGUOUS : True
F_CONTIGUOUS : False
OWNDATA : True
WRITEABLE : True
ALIGNED : True
WRITEBACKIFCOPY : False
UPDATEIFCOPY : False
justcopy milliseconds: 9.122899999999989
-------------
(5760, 3, 1) C_CONTIGUOUS : True
F_CONTIGUOUS : False
OWNDATA : True
WRITEABLE : True
ALIGNED : True
WRITEBACKIFCOPY : False
UPDATEIFCOPY : False
copy+cvtColor milliseconds: 12.177900000000019
-------------
(5760, 3, 1) C_CONTIGUOUS : True
F_CONTIGUOUS : False
OWNDATA : True
WRITEABLE : True
ALIGNED : True
WRITEBACKIFCOPY : False
UPDATEIFCOPY : False
cvtColor milliseconds: 11.461500000000013
These examples are all on my 1920x1080 screen, so they're not directly comparable to the 4K resolution times we saw in earlier benchmarks.
Anyway, what we can see here, is first of all that the two captured images (img1 and img2) coming straight from the `D3DShot` library have very strange `strides` values, and `C_CONTIGUOUS : False`. That's because they are raw RAM given to D3DShot by Windows and then just packaged into a ndarray with custom strides to make it read the raw RAM data in the desired order.
Next, we see that just doing `img1_justcopy = img1.copy()` (which copies the RGB-channeled, non-contiguous RAM into new, contiguous RAM, but does not change the channel order (the image will still be RGB)), takes `9.12 ms`, which is indeed how slow Numpy is at copying non-contiguous `ndarray` data into new, contiguous RAM. Basically, internally, Numpy has to do a ton of looping to read the data byte-by-byte while writing each byte into the correct order in the new, contiguous RAM.
So, the PyArray (Numpy) copying of non-contiguous to contiguous is always the slowest operation. That's why we want to avoid having non-contiguous RAM.
Alright, we also demonstrated how to make a "copy AND fix the colors from RGB to BGR" in two different ways. Doing `img1 = img1.copy(); img1 = cv2.cvtColor(img1, cv2.COLOR_RGB2BGR)` takes `11.83 ms`, and letting `cvtColor` trigger the Numpy `.copy` internally via directly calling `img2 = cv2.cvtColor(img2, cv2.COLOR_RGB2BGR)` takes `10.61 ms`. The reason for the slight difference is of course that there's slightly more work involved when we're doing 2 separate function calls, than when we let OpenCV do the Numpy copying in its single call.
In both cases, a PyArray (Numpy) copy operation happens internally, to give us a straight, contiguous RAM location. And then we pass that fixed, contiguous ndarray to `cvtColor` which fixes the color channel order.
That gives you the following guidelines for dealing with image data from libraries:
- If your Numpy data is always non-contiguous but is already in the correct channel order (you don't want to convert RGB to/from BGR, etc): Use `img = img.copy()` to force Numpy to make a contiguous copy of the data, which is then usable in all OpenCV calls without any bugs and without causing any slow internal, temporary copying.
- If your Numpy data is SOMETIMES non-contiguous but is already in the correct channel order: Use `img = np.ascontiguousarray(img)`, which automatically copies the array to make it contiguous if necessary, or otherwise returns the exact same array (if it was already contiguous).
- If your Numpy data has the wrong color channel order (and is either contiguous or non-contiguous; it doesn't matter which): Use `img = cv2.cvtColor(img, cv2.COLOR_)`, which will internally do the `.copy` (only does it if necessary) slightly more efficiently than if you had used two separate Python statements. And it will do the color conversion very rapidly with OpenCL accelerated code.
All of these techniques will result in giving you fast, contiguous RAM, in the color arrangement of your choice!
**Conclusions**
Stop using Numpy view manipulations and "tricks". They are not "cool". They lead to SUBTLE BUGS and they are EXTREMELY SLOW. You are slowing down all of your OpenCV API calls by about 40 milliseconds PER CALL (at 4K resolution) or 10 milliseconds PER CALL (at 1920x1080 resolution), since your "cool" data has to be converted by OpenCV internally to PROPER CONTIGUOUS RAM.
Those `39.45 ms (@ 4K) or 9.97 ms (@ 1920x1080)` are wasted on EVERY OpenCV call whenever you give OpenCV a non-contiguous image. So if you're (as people often do) passing the image to multiple OpenCV APIs to analyze it in multiple ways, then you are causing extreme slowdowns in your code.
Use `cv2.cvtColor()` instead, which does a super fast one-time conversion to the proper format, using accelerated OpenCL code. You are guaranteed to get contiguous data which works as-is for EVERY OpenCV call with no memory copying/conversion needed. And OpenCV's color converter is WAY FASTER than Numpy's internal data copier/converter.
Let's end this by imagining a scenario where you're using some Python library to capture an RGB 4K screenshot as a numpy array, and you need to use that data with OpenCV. So you're thinking you're clever and you write `img = img[...,::-1]` to "turn the RGB data into BGR (which OpenCV needs)", and you're thinking "Wow, my code is so fast! That RGB-to-BGR operation only took `0.000237 ms`!"... And *then* you're calling five different OpenCV functions to analyze that screenshot-image in various ways. Well, since you're causing one internal Numpy copy-conversion-to-contiguous PER CALL, you're now causing `5 * 39.45 = 197.25 ms` of total conversion overhead, just to get your "stupid" Numpy view into a proper contiguous memory stream.
Does it still sound "slow" to just do a single, one-time `5.39 ms (@ 4K) or 1.53 ms (@ 1920x1080)` conversion via `cv2.cvtColor()`? ;-)
Stop. Using. Numpy. Tricks!
Enjoy! ;-)
↧
How to convert stereo camera distances, pixels to mm
Hello.
I calibrated and stereo-rectified the stereo camera. I would like to know how to find the coefficient to convert from pixels to mm.
From the Q matrix obtained by stereo rectification, Tx: baseline and f: focal length were determined.
As you know, the depth z is found by the following formula
z = Bf/(x-x') z:[pixels]
However, the units of z obtained from this formula are pixels.
A factor k[mm/pixel] is needed to convert it to mm.
Then the equation is as follows
z = (Bf/(x-x'))*k z:[mm]
Is there a smart way to find this coefficient k?
Please suggest something other than actually measuring z several times to derive k, or looking up the pixel pitch of the camera's image sensor.
In the case of HALCON, I was able to find it by a function instead of using the method described above.
Couldn't it be calculated from the camera and Q matrices obtained from calibration and stereo rectification?
Thank you.
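For what it's worth, if the stereo calibration was done with the checkerboard square size (and hence the baseline in T and Q) expressed in millimetres, then z = B·f/(x-x') with f and the disparity in pixels already comes out in mm, since the pixel units cancel. A hedged sketch of getting metric depth directly from the Q matrix (the Q and disparity values here are placeholders):

import cv2
import numpy as np

# Q would come from cv2.stereoRectify; disparity (float32, in pixels) from e.g.
# StereoSGBM's compute() divided by 16. The placeholders below are only for shape.
Q = np.eye(4, dtype=np.float64)
disparity = np.full((480, 640), 16.0, np.float32)

points_3d = cv2.reprojectImageTo3D(disparity, Q)   # per-pixel (X, Y, Z)
depth = points_3d[:, :, 2]
# Z is in the same unit as the baseline / square size used during calibration
# (mm if the square size was given in mm), so no extra k factor should be needed.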


↧
AttributeError: 'Voxels' object has no attribute 'faces'
Hi, I used the binvox_rw library for working with OBJ files as follows:
import binvox_rw
with open('fox.binvox', 'rb') as f:
    fox1 = binvox_rw.read_as_3d_array(f)
and wrote some lines
...
...
..
but in the code below:
for face in fox1.faces:
    face_vertices = face[0]
    points = np.array([vertices[vertex - 1] for vertex in face_vertices])
    points = np.dot(points, scale_matrix)
Python gave this error:
AttributeError: 'Voxels' object has no attribute 'faces'
↧
can't find QuasiDenseStereo in opencv_contrib_python
Hi everyone,
It's very possible I'm overlooking something, but I can't seem to find the Quasi Dense Stereo implementation in opencv_contrib for python. It's part of the `stereo` module in contrib.
Using python3.6 on 64-bit linux, I just recompiled the whl file to make sure that quasi_dense_stereo.{cpp,hpp} are getting processed, which they are.
But I can't find it in the python module. I've made sure the `import cv2` picks the correct version, the one that I just compiled, but there's no QuasiDenseStereo class anywhere in the module.
Any thoughts? Thanks!
↧
How to color the pixels in an image depending on whether they are inside/outside a zone/circle?
I am converting an input ROS image to OpenCV, drawing a circle on the image, and removing points outside the circle. I want to be able to color the pixels in the image depending on whether they fall *outside* or *inside* the circle (like a zone). I'd like the pixels inside the circle to be red, and the pixels ouside the circle to be green. [This image][1] clearly shows what I mean: I am currently getting the images on the left, but would like to get the images on the right.
I am attaching my code below. I have played around with the cv2.circle and cv2.rectangle functions, but have had trouble getting these to work. Thank you for your help!
depth_image = self.bridge.imgmsg_to_cv2(self.myImage, desired_encoding="passthrough")
depth_image_noNaN = np.nan_to_num(depth_image,0)
self.oldDepthArray = np.array(depth_image_noNaN)
cv2.circle(self.oldDepthArray, (self.myImage.width/2, self.myImage.height/2), 130, (255,0,0), 4)
self.oldDepthArray.resize((self.myImage.height, self.myImage.width))
boundary_circle = np.zeros((self.myImage.height, self.myImage.width), np.uint8)
cv2.circle(boundary_circle, (self.myImage.width/2, self.myImage.height/2), zone_radius, (170,0,0), -1)
self.circleImage = self.bridge.cv2_to_imgmsg(self.oldDepthArray, encoding="passthrough")
self.pubCircleImage.publish(self.circleImage)
self.newDepthArray = cv2.bitwise_and(self.oldDepthArray, self.oldDepthArray, mask = boundary_circle)
self.newImage = self.bridge.cv2_to_imgmsg(self.newDepthArray, encoding="passthrough")
self.pubNewImage.publish(self.newImage)
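Not the poster's code, but one way to get the requested colouring is to build a filled circle mask and use it to index a 3-channel visualisation image; a minimal sketch (image size and radius are illustrative):

import cv2
import numpy as np

h, w = 480, 640                                            # illustrative image size
zone_radius = 130

mask = np.zeros((h, w), np.uint8)
cv2.circle(mask, (w // 2, h // 2), zone_radius, 255, -1)   # filled circle = "inside the zone"

vis = np.zeros((h, w, 3), np.uint8)
vis[mask > 0] = (0, 0, 255)        # pixels inside the zone -> red (BGR)
vis[mask == 0] = (0, 255, 0)       # pixels outside the zone -> green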

[1]: https://i.stack.imgur.com/X0ugq.png
↧