Design of practical computer vision system with real-time object detection capability

Chen, Guanyu

Files

Chen_Guanyu.pdf (8.84 MB)

Date

2024-03-07

Authors

Chen, Guanyu

Advisor

Ho, Pinhan

Publisher

University of Waterloo

Abstract

Computer vision today relies heavily on machine learning to extract useful information from images and video. Object detection is one such technique: it identifies and locates objects in images, and it is of great interest for its potential use in fields such as product inspection, analysis, and security. Object recognition, a related technique that only identifies the objects in an image, was accomplished earlier; classic models including LeNet and VGG16 had already adopted convolutional neural network (CNN) architectures. In comparison, an object detection model not only identifies objects but also localizes each detected object with a bounding box. Given ground-truth labels for both object class and bounding box coordinates, object detection models can be trained in the usual supervised manner to make both predictions. Several families of object detection models are as follows. In the R-CNN family, a Region Proposal Network (RPN), introduced with Faster R-CNN, produces region proposals: rectangular regions of the image in which a target object is likely to be present. YOLO divides the input image into a grid and, for each grid cell, predicts bounding boxes and class confidence simultaneously. SSD is similar to YOLO but improves accuracy by using features at different scales. As a result of improved hardware performance and innovative network architectures in recent years, real-time object detection has become possible with both satisfactory speed and accuracy. The goal of this thesis is to implement a real-time object detection system based on several previously published models, with the Proposal Connection Network (PCN) discussed in more detail. In simple terms, PCN is a two-stage, anchor-free object detection model with unique advantages. The demonstration of system design and setup is followed by the training and experimental processes, which focus primarily on performance analysis and comparison among models.
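
The grid-based assignment that the abstract attributes to YOLO can be illustrated with a minimal sketch. It is not taken from the thesis: the function names are hypothetical, and the default grid size of 7 is an assumption borrowed from the original YOLO paper. The sketch shows only how a ground-truth box center might be mapped to the grid cell responsible for predicting it, and how the center could be expressed relative to that cell.

```python
def responsible_cell(cx, cy, img_w, img_h, s=7):
    """Return (row, col) of the grid cell that contains the box center.

    cx, cy: box center in pixels; img_w, img_h: image size in pixels;
    s: number of cells per side (assumed value, not from the thesis).
    """
    if not (0 <= cx < img_w and 0 <= cy < img_h):
        raise ValueError("box center must lie inside the image")
    col = int(cx * s / img_w)  # horizontal cell index
    row = int(cy * s / img_h)  # vertical cell index
    return row, col


def cell_relative_offset(cx, cy, img_w, img_h, s=7):
    """Return the box center as (x, y) offsets in [0, 1) within its cell."""
    row, col = responsible_cell(cx, cy, img_w, img_h, s)
    x_off = cx * s / img_w - col
    y_off = cy * s / img_h - row
    return x_off, y_off


if __name__ == "__main__":
    # A box centered at (224, 100) in a 448x448 image, with a 7x7 grid.
    print(responsible_cell(224, 100, 448, 448))
    print(cell_relative_offset(224, 100, 448, 448))
```

In a detector of this style, each cell would additionally predict box width, height, and class confidence; those parts are omitted here because the abstract does not specify them.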

URI

http://hdl.handle.net/10012/20385

Collections

Theses
Electrical and Computer Engineering
