szproxy's Research

Tuesday, 26 April 2011

How do we understand the image?

Are we allowed to modify the input image data? Please stare at the + for one minute and you will see magic.

Are we using an a priori knowledge model?

Are we using our experience?

Are we using some comparison algorithm? (Points A and B are the same color!)

Do we draw conclusions based on continuity? (see the big illusion images)

Is it based on logical experience?

What actually is movement?

Thursday, 3 March 2011

Wednesday, 8 December 2010

First Success in HOG descriptor training

My own detector, trained using the Dalal HOG method.

detection time = 873.865ms

OpenCV 2.1 has peopledetect.cpp, which comes with Dalal's default detector. It does not provide details on how the detector can be trained.

In this blog post, I present the training method I have used so far. The result is shown on the left.

My request to readers: I still do not know how the DET graph and the performance measurement can be implemented. Any help will be appreciated.
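
One possible starting point for the DET graph, offered as a rough sketch rather than a definitive answer: a DET curve for this kind of detector is usually drawn as miss rate against false positives per window, both on logarithmic axes, with one point per threshold. The function name det_points and the scores below are hypothetical; the real scores would come from running the trained SVM on the positive and negative test windows.

```python
def det_points(pos_scores, neg_scores):
    """Return (threshold, false_pos_per_window, miss_rate) triples.

    pos_scores: SVM scores of windows that contain a person.
    neg_scores: SVM scores of windows that do not.
    In this sketch a window is a detection when its score is >= threshold.
    """
    points = []
    for t in sorted(set(pos_scores) | set(neg_scores)):
        missed = sum(1 for s in pos_scores if s < t)
        false_pos = sum(1 for s in neg_scores if s >= t)
        points.append((t, false_pos / len(neg_scores), missed / len(pos_scores)))
    return points


# Hypothetical scores, only to show the shape of the output.
pos = [1.2, 0.8, 0.3, -0.1]
neg = [-1.5, -0.9, -0.2, 0.4]
for t, fppw, miss in det_points(pos, neg):
    print(f"t={t:+.1f}  FPPW={fppw:.2f}  miss={miss:.2f}")
```

Plotting the second column against the third on log-log axes gives the curve; a lower curve means a better detector.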

Theory: The Dalal and Triggs HOG descriptor must be computed first. After that, we use an SVM to classify people/non-people. The false positives are then used to retrain the system once more.

Result: SVM
Accuracy on test set: 98.90% (14447 correct, 161 incorrect, 14608 total)

Precision/recall on test set: 99.04%/94.25%
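
The accuracy figure follows directly from the reported counts, as the small sketch below shows. Precision and recall need the split into true positives, false positives and false negatives, which the output above does not list, so the counts in the second call are hypothetical placeholders that only illustrate the formulas.

```python
def accuracy(correct, total):
    """Fraction of test windows classified correctly."""
    return correct / total


def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    return tp / (tp + fp), tp / (tp + fn)


# Counts reported by the SVM on the test set.
acc = accuracy(14447, 14608)
print(f"accuracy = {100 * acc:.2f}%")  # prints "accuracy = 98.90%"

# Hypothetical counts, NOT taken from the experiment.
p, r = precision_recall(tp=90, fp=10, fn=30)
print(f"precision = {p:.2f}, recall = {r:.2f}")
```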

Implementation: With OpenCV's HOGDescriptor::compute, one can compute the HOG of a given image. The SVM must first be trained with the Train/negative and Train/positive image sets. After that, test the SVM model on Test/negative. Any detection there is, by definition, a false positive. Add the false positives to the training set and retrain.

The trained SVM model is a file containing the support vectors. With the support vectors, one can predict the people/non-people classification. OpenCV, however, uses only a single vector to detect people.

This vector is the weight vector, shown as w in the picture above.
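
The retraining loop described above can be sketched in a few lines. This is only an illustration under stated assumptions: train_fn and score_fn are hypothetical callables standing in for the real SVM trainer and classifier, and the toy trainer used here is a simple mean-difference classifier, not an SVM. The feature vectors are made-up 2-D points instead of HOG descriptors.

```python
def retrain_with_hard_negatives(train_fn, score_fn, pos, neg, neg_pool, threshold=0.0):
    """Train once, collect false positives from neg_pool, then retrain with them."""
    model = train_fn(pos, neg)
    # Every window from a person-free pool that scores above the threshold
    # is a false positive, i.e. a "hard" negative.
    hard = [x for x in neg_pool if score_fn(model, x) > threshold]
    return train_fn(pos, neg + hard), hard


def mean(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def train_mean_difference(pos, neg):
    """Stand-in linear classifier (NOT an SVM): w = mean(pos) - mean(neg),
    with the bias chosen so the boundary lies halfway between the class means."""
    mp, mn = mean(pos), mean(neg)
    w = [a - b for a, b in zip(mp, mn)]
    mid = [(a + b) / 2 for a, b in zip(mp, mn)]
    bias = -sum(wi * mi for wi, mi in zip(w, mid))
    return w, bias


def score(model, x):
    w, bias = model
    return sum(wi * xi for wi, xi in zip(w, x)) + bias


# Made-up 2-D feature vectors in place of real HOG descriptors.
pos = [[2.0, 2.0], [3.0, 3.0]]
neg = [[0.0, 0.0], [-1.0, -1.0]]
neg_pool = [[1.8, 1.8], [-2.0, 0.0]]

model, hard = retrain_with_hard_negatives(train_mean_difference, score, pos, neg, neg_pool)
print("hard negatives:", hard)
```

With the real pipeline, train_fn would write the HOG vectors to a file and call the SVM trainer, and neg_pool would be the windows taken from the person-free images.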
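
For a linear kernel, the support vectors can be collapsed into that single vector: w is the sum of the support vectors, each multiplied by its coefficient (alpha_i times the label y_i). A minimal sketch, assuming a linear kernel and assuming the coefficients have already been read from the model file; the function names and the numbers are hypothetical.

```python
def weight_vector(support_vectors, coefficients):
    """w = sum_i c_i * x_i, where c_i = alpha_i * y_i (linear kernel only)."""
    dim = len(support_vectors[0])
    w = [0.0] * dim
    for c, x in zip(coefficients, support_vectors):
        for j in range(dim):
            w[j] += c * x[j]
    return w


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


# Hypothetical model with two support vectors in a 2-D feature space.
svs = [[1.0, 0.0], [0.0, 2.0]]
coeffs = [0.5, -0.25]
w = weight_vector(svs, coeffs)

# Scoring with w gives the same value as summing over all support vectors.
x = [2.0, 3.0]
via_support_vectors = sum(c * dot(sv, x) for c, sv in zip(coeffs, svs))
print(w, dot(w, x), via_support_vectors)
```

This is why one vector is enough at detection time: one dot product per window replaces one dot product per support vector.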

Implementation steps:
1. Use the following directories of the INRIA person dataset as the positive training set, the negative training set and the negative test set:
- INRIAPerson/96X160H96/Train/pos
- INRIAPerson/Train/neg
- INRIAPerson/Test/neg
2. Prepare 64x128 images from those pictures. For negative images, the 64x128 windows must be randomly cropped from the original image.

3. HOG computation parameters. We compute the HOG using OpenCV's HOGDescriptor::compute with the following parameters:
- winSize = (64,128)
- blockSize = (16,16)
- blockStride = (8,8)
- cellSize = (8,8)
- nbins = 9
- derivAperture = 0
- winSigma = -1
- histogramNormType = 0 (0 means L2Hys)
- L2HysThreshold = 0.2
- gammaCorrection = 0

4. SVM learning and classification. SVMLight is used. Use libsvm/tools/checkdata.py to make sure that the input file for the SVM is correctly formatted.

5. Compute the HOG, train the SVM, run the trained model on the negative test set, and add the false positives to the training set to retrain (this is called hard training).

6. Compute the weight vector of the hard-trained model and use it as the vector<float> returned by getDefaultPeopleDetector(). The trick is that the threshold value from the model must be appended as the last element of the vector.

7. In the detectMultiScale function, change the threshold values (group threshold and hit threshold) until you get a good result.
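
Some of the bookkeeping in the steps above can be sketched without OpenCV. The helpers below are hypothetical names, not part of any library: one checks the descriptor length implied by the window, block, stride, cell and bin settings, one picks a random 64x128 crop box for a negative image, one formats a feature vector as a line of SVM input, and one appends the threshold to the weight vector. The sign used when appending the threshold is an assumption: it presumes the model's decision value is w·x - b while the detector adds its last element to w·x, so check this against your own model.

```python
import random


def hog_descriptor_length(win=(64, 128), block=(16, 16), stride=(8, 8),
                          cell=(8, 8), nbins=9):
    """Number of floats in one HOG descriptor for the given layout."""
    blocks_x = (win[0] - block[0]) // stride[0] + 1
    blocks_y = (win[1] - block[1]) // stride[1] + 1
    cells_per_block = (block[0] // cell[0]) * (block[1] // cell[1])
    return blocks_x * blocks_y * cells_per_block * nbins


def random_crop_box(img_w, img_h, rng, win_w=64, win_h=128):
    """Pick a random (x, y, w, h) window that lies fully inside the image."""
    if img_w < win_w or img_h < win_h:
        raise ValueError("image is smaller than the crop window")
    x = rng.randint(0, img_w - win_w)  # randint includes both ends
    y = rng.randint(0, img_h - win_h)
    return x, y, win_w, win_h


def svmlight_line(label, features):
    """Format one example as '<label> 1:<v1> 2:<v2> ...' (indices start at 1)."""
    pairs = " ".join(f"{i}:{v:g}" for i, v in enumerate(features, start=1))
    return f"{label:+d} {pairs}"


def detector_vector(w, b):
    """Weight vector with the threshold appended (sign is an ASSUMPTION, see text)."""
    return list(w) + [-b]


print(hog_descriptor_length())                     # 7 * 15 * 4 * 9 = 3780
print(random_crop_box(320, 240, random.Random(0)))
print(svmlight_line(+1, [0.5, 0.25]))              # +1 1:0.5 2:0.25
print(detector_vector([0.5, -0.5], 0.25))
```

The descriptor length is a useful sanity check: the vector used for detection should have that many weights plus one extra element for the threshold.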

Discussion: Manually changing the threshold values does not seem to be "computer vision"; it is human vision. How can I change the threshold values automatically and still get a good result?

Download: Here I created a project to host all of the above training code. License: It is free for any use.
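
One possible way to take the human out of the loop, given here only as a sketch and not as an established answer: score a held-out set of person-free windows and pick the threshold that lets through at most a chosen fraction of them. The function name and the scores are hypothetical, and the sketch assumes a window counts as a detection only when its score is strictly above the threshold.

```python
def threshold_for_false_positive_rate(neg_scores, target_rate):
    """Return a threshold t such that the fraction of negative scores
    strictly above t is at most target_rate."""
    ranked = sorted(neg_scores, reverse=True)
    # Number of negative windows we are willing to let through.
    allowed = min(int(target_rate * len(ranked)), len(ranked) - 1)
    return ranked[allowed]


# Hypothetical scores of eight person-free windows.
neg_scores = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5]
t = threshold_for_false_positive_rate(neg_scores, 0.25)
passed = sum(1 for s in neg_scores if s > t)
print(t, passed / len(neg_scores))
```

The target rate is still a human choice, but it is stated in terms of the result wanted rather than found by trial and error.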
