Preface
I analyze that the first step of my work is to split the images. I follow this tutorial to learn color spaces..The example and method is wonderful and allow me to fully understand how to choose the best color spaces. You can get the source code from my Github Account.
Four Color Spaces
Four color spaces are introduced. They are BGR, HSV, LAB and YCrCb respectively. There are certain formulas used to transform one color space into another. You can access them from this website. We choose them depending on their properties. So let us focus on their properties first.
The RGB Color Space:
1. The three channels(B, G, R) are correlated by the amount of light hitting the surface. In RGB color space the color information is separated into three channels but the same three channels also encode brightness information.
2. significant perceptual non-uniformity(Even with our eyes we can see the clear difference between indoor and outdoor B, G, R channels)
3. mixing of chrominance ( Color related information ) and luminance ( Intensity related information ) data
The LAB Color Space:
1. Different from RGB Color Space, LAB only have two channels encoding color. Let me introduce the meaning of different channels first:
- L-Lightness(Intensity)
- A-color component ranging from Green to Magenta
- B-color component ranging from Blue to Yellow
So, the L channel is independent of color information and encodes brightness only. The other two channels encode color.
2. Perceptually uniform color space which approximates how we perceive color.(We can not use our eyes to see clear difference between indoor and outdoor A, B channels which encodes color.)
3. It is pretty clear from the figure that the change in illumination has mostly affected the L component.
The YCrCb Color Space:
1. The YCrCb color space is derived from the RGB color space and has the following three components
- Y-Luminance or Luma component obtained from RGB after gamma correction.
- Cr = R – Y ( how far is the red component from Luma ).
- Cb = B – Y ( how far is the blue component from Luma ).
2. Similar observations as LAB can be made for Intensity and color components with regard to Illumination changes
3. White has undergone change in all 3 components
The HSV Color Space:
1. The HSV color space has the following three components
- H-Hue ( Dominant Wavelength )
- Saturation ( Purity / shades of the color )
- Value ( Intensity )
2. The H Component is very similar in both the images which indicates the color information is intact even under illumination changes.
3. The S component is also very similar in both images.
4. The V Component captures the amount of light falling on it thus it changes due to illumination changes.
How to Choose Color Spaces
I run the code written by Satya Mallick. He built a interactive GUI to detect the color of the pixels in the images. You can obtain the values of different colors through this simple GUI. The next step is to extract certain colors from the cube. Still, he wrote a simple software which has Trackbars on the windows. Since each color space can be split into three particular channels, the author set up thresholds(the maximum and minimum values of the channels) for each channel and used the functions inRange and bitwise_and to extract the specific color. You can see the description of the inRange function here and bitwise_and function here. I got stuck here for a while since I ignore the fact that the value is 8 bit rather than 1 bit which means that it ranges from 00000000 to 11111111. While doing & and ^ operation, every 8 bit joins in the operation.
This method seems very great. But it is still hard to find the suitable thresholds manually. Still, there is a problem that even if you have found the best thresholds with all the pictures you have, it still be possible to fail while dealing with another image. So the author used another approach which is better. Let us analyze the code first.
The code block of interactive GUI:
Let us go into the main function first.
int main( int argc, const char** argv )
{
// filename
// Read the input image
int image_number = 0;
int nImages = 10;
if(argc > 1)
nImages = atoi(argv[1]);
char filename[20];
sprintf_s(filename,"images/rub%02d.jpg",image_number%nImages);
img = imread(filename);
// Resize the image to 400x400
Size rsize(400,400);
resize(img,img,rsize);
if(img.empty())
{
return -1;
}
// Create an empty window
namedWindow("PRESS P for Previous, N for Next Image", WINDOW_AUTOSIZE);
// Create a callback function for any event on the mouse
setMouseCallback( "PRESS P for Previous, N for Next Image", onMouse );
imshow( "PRESS P for Previous, N for Next Image", img );
while(1)
{
imshow("PRESS P for Previous, N for Next Image", img);
char k = waitKey(30) & 0xFF;
if (k == 27)
break;
//Check next image in the folder
if (k =='n')
{
image_number++;
sprintf_s(filename,"images/rub%02d.jpg",image_number%nImages);
img = imread(filename);
resize(img,img,rsize);
}
//Check previous image in he folder
else if (k =='p')
{
image_number--;
sprintf_s(filename,"images/rub%02d.jpg",image_number%nImages);
img = imread(filename);
resize(img,img,rsize);
}
}
return 0;
}
This block of codes mainly implement three effects.
1. Load the images and resize them.
2. Show the image.
3. Press P for the previous image, N for the next image and Esc to quit.
Let separate the codes into three parts according to their effects.
Load the images and resize them
int image_number = 0;
int nImages = 10;
char filename[20];
sprintf_s(filename,"images/rub%02d.jpg",image_number%nImages);
img = imread(filename);
// Resize the image to 400x400
Size rsize(400,400);
resize(img,img,rsize);
There is only one line needed to explain.
sprintf_s(filename,"images/rub%02d.jpg",image_number%nImages);
The function sprintf_s is similar to the function sprintf. You can learn the parameters of this function from this website.. I also have an example from another post. You have better understand it from this example.%02d is derived from %d which you must be familiar with. So the value of %02d is replaced by the value of the third parameter(image_number%nImages). You may be confused about the % in the third parameter, but actually the % is the Modulus Operator. It returns the remainder of an integer division. For example, 1%10 = 1, 3%10 =3, 5%2 = 1.
Show the image
namedWindow("PRESS P for Previous, N for Next Image", WINDOW_AUTOSIZE);
imshow( "PRESS P for Previous, N for Next Image", img );
The format is stable. Just remember it.
Press P for the previous image, N for the next image and Esc to quit
while(1)
{
imshow("PRESS P for Previous, N for Next Image", img);
char k = waitKey(30) & 0xFF;
if (k == 27)
break;
//Check next image in the folder
if (k =='n')
{
image_number++;
sprintf_s(filename,"images/rub%02d.jpg",image_number%nImages);
img = imread(filename);
resize(img,img,rsize);
}
//Check previous image in he folder
else if (k =='p')
{
image_number--;
sprintf_s(filename,"images/rub%02d.jpg",image_number%nImages);
img = imread(filename);
resize(img,img,rsize);
}
}
There is only one line needed to explain.
char k = waitKey(30) & 0xFF
You can see the definition of waitKey here. However, you may be still confused about this line of code. You can see the pictures below as a reference.
Then is the setMouseCallback function
void onMouse( int event, int x, int y, int flags, void* userdata )
{
if( event == EVENT_MOUSEMOVE )
{
Vec3b bgrPixel(img.at<Vec3b>(y, x));
Mat3b hsv,ycb,lab;
// Create Mat object from vector since cvtColor accepts a Mat object
Mat3b bgr (bgrPixel);
//Convert the single pixel BGR Mat to other formats
cvtColor(bgr, ycb, COLOR_BGR2YCrCb);
cvtColor(bgr, hsv, COLOR_BGR2HSV);
cvtColor(bgr, lab, COLOR_BGR2Lab);
//Get back the vector from Mat
Vec3b hsvPixel(hsv.at<Vec3b>(0,0));
Vec3b ycbPixel(ycb.at<Vec3b>(0,0));
Vec3b labPixel(lab.at<Vec3b>(0,0));
// Create an empty placeholder for displaying the values
placeholder = Mat::zeros(img.rows,400,CV_8UC3);
//fill the placeholder with the values of color spaces
putText(placeholder, format("BGR [%d, %d, %d]",bgrPixel[0],bgrPixel[1],bgrPixel[2]), Point(20, 70), FONT_HERSHEY_COMPLEX, .9, Scalar(255,255,255), 1);
putText(placeholder, format("HSV [%d, %d, %d]",hsvPixel[0],hsvPixel[1],hsvPixel[2]), Point(20, 140), FONT_HERSHEY_COMPLEX, .9, Scalar(255,255,255), 1);
putText(placeholder, format("YCrCb [%d, %d, %d]",ycbPixel[0],ycbPixel[1],ycbPixel[2]), Point(20, 210), FONT_HERSHEY_COMPLEX, .9, Scalar(255,255,255), 1);
putText(placeholder, format("LAB [%d, %d, %d]",labPixel[0],labPixel[1],labPixel[2]), Point(20, 280), FONT_HERSHEY_COMPLEX, .9, Scalar(255,255,255), 1);
Size sz1 = img.size();
Size sz2 = placeholder.size();
//Combine the two results to show side by side in a single image
Mat combinedResult(sz1.height, sz1.width+sz2.width, CV_8UC3);
Mat left(combinedResult, Rect(0, 0, sz1.width, sz1.height));
img.copyTo(left);
Mat right(combinedResult, Rect(sz1.width, 0, sz2.width, sz2.height));
placeholder.copyTo(right);
imshow("PRESS P for Previous, N for Next Image", combinedResult);
}
}
This block of codes mainly implement three effects.
1. Obtain the value of pixel and convert it into different color spaces.
2. Add and show the text.
3. Combine and show the image.
Obtain the value of pixel and convert it into different color spaces.
Vec3b bgrPixel(img.at<Vec3b>(y, x));
Mat3b hsv,ycb,lab;
// Create Mat object from vector since cvtColor accepts a Mat object
Mat3b bgr (bgrPixel);
//Convert the single pixel BGR Mat to other formats
cvtColor(bgr, ycb, COLOR_BGR2YCrCb);
cvtColor(bgr, hsv, COLOR_BGR2HSV);
cvtColor(bgr, lab, COLOR_BGR2Lab);
//Get back the vector from Mat
Vec3b hsvPixel(hsv.at<Vec3b>(0,0));
Vec3b ycbPixel(ycb.at<Vec3b>(0,0));
Vec3b labPixel(lab.at<Vec3b>(0,0));
Vec3b intensity = img.at
1. uchar blue = intensity.val[0];
2. uchar green = intensity.val[1];
3. uchar red = intensity.val[2];
If you want to go deep into the source with F12 and understand the syntax of this code and Mat3b, you need to know about some basic grammar of C++, such as Class and Templates.
cvtColor(bgr, ycb, COLOR_BGR2YCrCb);
cvtColor(bgr, hsv, COLOR_BGR2HSV);
cvtColor(bgr, lab, COLOR_BGR2Lab);
You can use function cvtColor to convert RGB image or pixel into HSV, YCrCb, LAB one.
Add and show the text.
// Create an empty placeholder for displaying the values
placeholder = Mat::zeros(img.rows,400,CV_8UC3);
//fill the placeholder with the values of color spaces
putText(placeholder, format("BGR [%d, %d, %d]",bgrPixel[0],bgrPixel[1],bgrPixel[2]), Point(20, 70), FONT_HERSHEY_COMPLEX, .9, Scalar(255,255,255), 1);
putText(placeholder, format("HSV [%d, %d, %d]",hsvPixel[0],hsvPixel[1],hsvPixel[2]), Point(20, 140), FONT_HERSHEY_COMPLEX, .9, Scalar(255,255,255), 1);
putText(placeholder, format("YCrCb [%d, %d, %d]",ycbPixel[0],ycbPixel[1],ycbPixel[2]), Point(20, 210), FONT_HERSHEY_COMPLEX, .9, Scalar(255,255,255), 1);
putText(placeholder, format("LAB [%d, %d, %d]",labPixel[0],labPixel[1],labPixel[2]), Point(20, 280), FONT_HERSHEY_COMPLEX, .9, Scalar(255,255,255), 1);
What I want to mention in this part is CV_8UC3. The common syntax of this is {U|S|F}C(
Combine and show the image.
Size sz1 = img.size();
Size sz2 = placeholder.size();
//Combine the two results to show side by side in a single image
Mat combinedResult(sz1.height, sz1.width+sz2.width, CV_8UC3);
Mat left(combinedResult, Rect(0, 0, sz1.width, sz1.height));
img.copyTo(left);
Mat right(combinedResult, Rect(sz1.width, 0, sz2.width, sz2.height));
placeholder.copyTo(right);
imshow("PRESS P for Previous, N for Next Image", combinedResult);
Just merge two images into one. If you successfully run the codes, you will get a interactive GUI like this.
The code block of setting thresholds:
In this block of code, there is nothing I should explain in detail since the functions inRange and bitwise_and are mentioned already.
The code block of DataAnalysis:
This block of code are written in python. I will choose some specific functions to introduce.
Load and convert the images
color = 'pieces/yellow'
files = glob.glob(color + '*.jpg')
files.sort()
for fi in files[:]:
# BGR
im = cv2.imread(fi)
b = im[:,:,0]
b = b.reshape(b.shape[0]*b.shape[1])
g = im[:,:,1]
g = g.reshape(g.shape[0]*g.shape[1])
r = im[:,:,2]
r = r.reshape(r.shape[0]*r.shape[1])
B = np.append(B,b)
G = np.append(G,g)
R = np.append(R,r)
# HSV
hsv = cv2.cvtColor(im,cv2.COLOR_BGR2HSV)
h = hsv[:,:,0]
h = h.reshape(h.shape[0]*h.shape[1])
s = hsv[:,:,1]
s = s.reshape(s.shape[0]*s.shape[1])
v = hsv[:,:,2]
v = v.reshape(v.shape[0]*v.shape[1])
H = np.append(H,h)
S = np.append(S,s)
V = np.append(V,v)
# YCrCb
ycb = cv2.cvtColor(im,cv2.COLOR_BGR2YCrCb)
y = ycb[:,:,0]
y = y.reshape(y.shape[0]*y.shape[1])
cr = ycb[:,:,1]
cr = cr.reshape(cr.shape[0]*cr.shape[1])
cb = ycb[:,:,2]
cb = cb.reshape(cb.shape[0]*cb.shape[1])
Y = np.append(Y,y)
Cr = np.append(Cr,cr)
Cb = np.append(Cb,cb)
# Lab
lab = cv2.cvtColor(im,cv2.COLOR_BGR2LAB)
ll = lab[:,:,0]
ll = ll.reshape(ll.shape[0]*ll.shape[1])
la = lab[:,:,1]
la = la.reshape(la.shape[0]*la.shape[1])
lb = lab[:,:,2]
lb = lb.reshape(lb.shape[0]*lb.shape[1])
LL = np.append(LL,ll)
LA = np.append(LA,la)
LB = np.append(LB,lb)
glob.glob returns a possibly-empty list of path names that match pathname, which must be a string containing a path specification.
sort() method can return a sorted list. You can better understand it from this post.
If you want to split an image in a certain color space into three channels. You can use codes like this:
b = im[:,:,0]
g = im[:,:,1]
r = im[:,:,2]
np.append appends values to the end of an array. You can better understand it from this website. So in this case, since the code is in a for loop, the blue channels from different images are extracted and combined together into one matrix B. So are the other channels.
Make a density plot
nbins = 11
plt.figure(figsize=[20,10])
plt.subplot(2,3,1)
plt.hist2d(B, G, bins=nbins, norm=LogNorm())
plt.xlabel('B')
plt.ylabel('G')
plt.title('RGB')
if not zoom:
plt.xlim([0,255])
plt.ylim([0,255])
plt.colorbar()
plt.subplot(2,3,2)
plt.hist2d(B, R, bins=nbins, norm=LogNorm())
plt.colorbar()
plt.xlabel('B')
plt.ylabel('R')
plt.title('RGB')
if not zoom:
plt.xlim([0,255])
plt.ylim([0,255])
plt.subplot(2,3,3)
plt.hist2d(R, G, bins=nbins, norm=LogNorm())
plt.colorbar()
plt.xlabel('R')
plt.ylabel('G')
plt.title('RGB')
if not zoom:
plt.xlim([0,255])
plt.ylim([0,255])
plt.subplot(2,3,4)
plt.hist2d(H, S, bins=nbins, norm=LogNorm())
plt.colorbar()
plt.xlabel('H')
plt.ylabel('S')
plt.title('HSV')
if not zoom:
plt.xlim([0,180])
plt.ylim([0,255])
plt.subplot(2,3,5)
plt.hist2d(Cr, Cb, bins=nbins, norm=LogNorm())
plt.colorbar()
plt.xlabel('Cr')
plt.ylabel('Cb')
plt.title('YCrCb')
if not zoom:
plt.xlim([0,255])
plt.ylim([0,255])
plt.subplot(2,3,6)
plt.hist2d(LA, LB, bins=nbins, norm=LogNorm())
plt.colorbar()
plt.xlabel('A')
plt.ylabel('B')
plt.title('LAB')
if not zoom:
plt.xlim([0,255])
plt.ylim([0,255])
plt.savefig(color + '.png',bbox_inches='tight')
else:
plt.savefig(color + '-zoom.png',bbox_inches='tight')
plt.show()
plt.figure(figsize=[20,10]) make a background whose size is 20*10
plt.subplot(2,3,1) separate the whole figure into a 23 region and the third parameter indicates the location of the picture. According to a 23 region, six pictures can be showed.
plt.hist2d is a little difficult to understand. You can grasp it from this post. It is easy for us to observe the degree of the density of the channels through the image plotted by plt.hist2d.
Also you can learn about the usage of “if not” syntax here.
Conclusion
As the Illumination changes by a large amount, we can see that:
1. Ideally, we want to work with a color space with the most compact / concentrated density plot for color channels.(because when the color value is concentrated in a small region, we can use a small range of thresholds to extract it. And it is robust to other conditions as well.)
2. The density plots for RGB blow up drastically. This means that the variation in the values of the channels is very high and fixing a threshold is a big problem.
3. In HSV, since only the H component contains information about the absolute color. Thus, it becomes my first choice of color space since I can tweak just one knob ( H ) to specify a color as compared to 2 knobs in YCrCb ( Cr and Cb ) and LAB ( A and B ).(And as you can see, the value of H channel is concentrated in a small range, it is also an important feature of HSV)
4. Comparing the plots of YCrCb and LAB shows a higher level of compactness in case of LAB. So, next best choice for me becomes the LAB color space.