Object detection using dlib, OpenCV and Python


Object detection is a technique for identifying objects in an image and locating them within it. Autonomous vehicles use it to detect pedestrians walking or jogging on the street so that accidents can be avoided. Here is an image with 3 pedestrians correctly detected by object detection and enclosed in green rectangles.

Amazon Prime Video uses object detection to detect faces in streaming video. This is used in the X-Ray tab, where the user can see more information about the actors in the current scene.

dlib is a popular machine learning library used for object detection. It was developed by Davis King and is a highly optimized C++ library used in image processing.


For object detection, we first need a training dataset consisting of images and the bounding rectangle coordinates associated with them. One such dataset is Caltech 101, available from Caltech, which was developed by Prof. Fei-Fei Li and colleagues. It consists of images in 101 categories such as lily, stop sign, bass and chair, and each category comes with annotations that give the bounding rectangle of the object within each image. You can download the dataset from here.
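To make the idea of a bounding rectangle annotation concrete, here is a minimal sketch of turning one annotation into corner coordinates. It assumes each annotation stores its rectangle in a `box_coord` field ordered as top, bottom, left, right. That ordering, the helper name `box_coord_to_ltrb` and the sample numbers are assumptions of this sketch, so check them against the files you actually download.

```python
from typing import Sequence, Tuple


def box_coord_to_ltrb(box_coord: Sequence[float]) -> Tuple[int, int, int, int]:
    """Convert an annotation box to (left, top, right, bottom) pixel corners.

    Assumes box_coord is ordered (top, bottom, left, right); verify this
    against your copy of the dataset before relying on it.
    """
    if len(box_coord) != 4:
        raise ValueError("expected four numbers: top, bottom, left, right")
    top, bottom, left, right = (int(v) for v in box_coord)
    if right <= left or bottom <= top:
        raise ValueError("box has non-positive width or height")
    return left, top, right, bottom


# Hypothetical annotation values, used only to show the reordering.
print(box_coord_to_ltrb([30, 137, 49, 349]))
```

In practice the annotation would typically be read from a MATLAB `.mat` file first, for example with `scipy.io.loadmat`, and the resulting four numbers passed to a helper like this one.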


When we feed this dataset to dlib, it generates features from the images by computing a 3780-dimensional HOG (Histogram of Oriented Gradients) descriptor and then trains a model using the Support Vector Machine (SVM) algorithm. The model is saved to disk. You can then download random images of the trained category from Google and ask the model to detect the object. The workflow is described below.
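The figure of 3780 can be reproduced with a little arithmetic. The sketch below assumes the classic HOG layout: a 64x128 pixel detection window, 8x8 pixel cells, blocks of 2x2 cells moved one cell at a time, and 9 orientation bins per cell. The post does not state these parameters, so they are an assumption here, and dlib's own feature extractor may be configured differently.

```python
def hog_descriptor_length(window_w=64, window_h=128, cell=8,
                          block_cells=2, stride_cells=1, bins=9):
    """Number of values in a HOG descriptor for one detection window."""
    cells_x = window_w // cell            # cells across the window
    cells_y = window_h // cell            # cells down the window
    # Blocks overlap: each block is block_cells wide and slides by stride_cells.
    blocks_x = (cells_x - block_cells) // stride_cells + 1
    blocks_y = (cells_y - block_cells) // stride_cells + 1
    values_per_block = block_cells * block_cells * bins
    return blocks_x * blocks_y * values_per_block


# 7 x 15 blocks, each holding 2 * 2 * 9 = 36 values
print(hog_descriptor_length())
```

With these assumed defaults the window holds 7 x 15 = 105 overlapping blocks of 36 values each, which gives 3780.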

I chose the cougar face category and trained a model for it.

Here is what a HOG descriptor looks like. It is a compact representation of gradient changes, which makes computing the model parameters very efficient.

If you would like to know how to generate this HOG descriptor, refer to my YouTube video below.

Here is the code for training the model:

https://gist.github.com/evergreenllc2020/dd4feca0b3f7393222935dbcacebaa57
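For readers who want a feel for the training step without opening the gist, here is a minimal sketch built on dlib's simple object detector API. It is not the code from the gist: the function name `train_detector`, the output path `cougar_face.svm` and the option values are illustrative assumptions.

```python
from typing import List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def train_detector(images: Sequence, boxes_per_image: Sequence[List[Box]],
                   output_path: str = "cougar_face.svm"):
    """Train a HOG + SVM detector with dlib and save it to disk.

    images: numpy image arrays, one per training image.
    boxes_per_image: for each image, the boxes of the objects in it.
    """
    if len(images) != len(boxes_per_image):
        raise ValueError("need exactly one list of boxes per image")

    # Imported inside the function so dlib is only needed when training runs.
    import dlib

    rects = [[dlib.rectangle(left=l, top=t, right=r, bottom=b)
              for (l, t, r, b) in boxes]
             for boxes in boxes_per_image]

    options = dlib.simple_object_detector_training_options()
    options.add_left_right_image_flips = True  # assumes a roughly symmetric object
    options.C = 5              # SVM regularisation; illustrative, tune for your data
    options.be_verbose = True  # print progress while training

    detector = dlib.train_simple_object_detector(list(images), rects, options)
    detector.save(output_path)  # hypothetical file name
    return detector
```

The left-right flip option suits a roughly symmetric object such as a face seen from the front. For objects with a clear left or right orientation it is probably better switched off.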


After training the model, I downloaded a random cougar image from Google and asked the model to detect the cougar face in it. Here is the result.

Here is the code for detecting the object:

https://gist.github.com/evergreenllc2020/12f1484c1a2466f8fe42573d019a31ed
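As a rough sketch of the detection step, the code below loads a saved dlib detector, runs it on one image and draws green rectangles around the detections with OpenCV. It is not the code from the gist: the function names, the model path `cougar_face.svm` and the output path `detections.jpg` are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def rects_to_boxes(rects) -> List[Box]:
    """Turn dlib-style rectangles into plain (left, top, right, bottom) tuples."""
    return [(r.left(), r.top(), r.right(), r.bottom()) for r in rects]


def detect_and_draw(image_path: str, model_path: str = "cougar_face.svm",
                    output_path: str = "detections.jpg") -> List[Box]:
    """Run a saved dlib detector on one image and draw green rectangles."""
    # Imported here so rects_to_boxes can be used without these installed.
    import cv2
    import dlib

    detector = dlib.simple_object_detector(model_path)
    image_bgr = cv2.imread(image_path)
    if image_bgr is None:
        raise FileNotFoundError(f"could not read image: {image_path}")

    # OpenCV loads images as BGR; dlib expects RGB.
    image_rgb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB)
    boxes = rects_to_boxes(detector(image_rgb))

    for left, top, right, bottom in boxes:
        cv2.rectangle(image_bgr, (left, top), (right, bottom), (0, 255, 0), 2)
    cv2.imwrite(output_path, image_bgr)
    return boxes
```

Returning plain tuples rather than dlib rectangle objects keeps the result easy to log or compare against the ground-truth annotations.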


Citation:

Huge thanks to Dr. Adrian Rosebrock and Dr. Satya Mallick for helping me understand the entire process of object detection with example code.

If you would like to learn the entire process step by step, along with the fundamentals of the techniques covered in this blog, please sign up for my course by clicking here.


About Author Evergreen Technologies: