Object detection is a computer vision technique for locating instances of objects within images or videos.
It is a key technology behind applications such as surveillance systems, image retrieval systems, and advanced driver assistance systems.
These systems involve not only recognizing and classifying every object in an image, but also localizing each one by drawing an appropriate bounding box around it.
YOLOv3 is a real-time, single-stage object detection model that builds on YOLOv2 with several improvements.
Improvements include a new backbone network, Darknet-53, which uses residual connections (or, in the words of the authors, “that newfangled residual network stuff”), refinements to the bounding box prediction step, and the use of three different scales from which to extract features, similar to a feature pyramid network (FPN).
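To make the three-scale idea concrete, the sketch below computes the size of the prediction grid at each scale. It is illustrative only and assumes the commonly used configuration, none of which is stated above: a 416x416 input, strides of 32, 16 and 8, three anchor boxes per scale, and the 80 COCO classes. The names `ScaleOutput` and `yolo_output_shapes` are hypothetical helpers, not part of any YOLO codebase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScaleOutput:
    stride: int    # downsampling factor between the input and this feature map
    grid: int      # the feature map is grid x grid cells
    channels: int  # values predicted per cell
    boxes: int     # candidate boxes produced at this scale


def yolo_output_shapes(input_size=416, strides=(32, 16, 8),
                       anchors_per_scale=3, num_classes=80):
    """Return the prediction-grid layout for each detection scale.

    All defaults are assumptions (typical COCO setup), not values
    taken from the text above.
    """
    if any(input_size % s for s in strides):
        raise ValueError("input_size must be divisible by every stride")

    outputs = []
    for stride in strides:
        grid = input_size // stride
        # Each anchor predicts 4 box offsets, 1 objectness score,
        # and one score per class.
        channels = anchors_per_scale * (4 + 1 + num_classes)
        boxes = grid * grid * anchors_per_scale
        outputs.append(ScaleOutput(stride, grid, channels, boxes))
    return outputs


if __name__ == "__main__":
    shapes = yolo_output_shapes()
    for out in shapes:
        print(f"stride {out.stride:>2}: {out.grid}x{out.grid} grid, "
              f"{out.channels} channels, {out.boxes} boxes")
    print("total candidate boxes:", sum(out.boxes for out in shapes))
```

Under these assumptions the coarse grid (stride 32) mainly covers large objects, while the finer grids (strides 16 and 8) add many more candidate boxes for medium and small objects.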