Modern Object and Visual Relationship Detection in Images from a Critical, Cognitive and Data Perspective

Abou Chacra, David

Files

AbouChacra_David.pdf (15.75 MB)

Date

2023-04-27

Authors

Abou Chacra, David

Advisor

Zelek, John

Publisher

University of Waterloo

Abstract

Deep learning has dominated the landscape of computer vision for the past decade. Deep learning networks are the top performers on a slew of computer vision challenges (e.g., object detection or image segmentation) and on the most popular datasets. They outperform other approaches by a large margin, each armed with its own tricks to improve upon its predecessors. However, recent research highlights several shortcomings of deep learning approaches, from poor generalization performance to the difficulty of understanding the rationale behind the decisions they make. More nuanced and human-like tasks, such as visual relationship detection, also still prove difficult for deep learning networks. In this thesis, we tackle the problem of scene graph generation: the task of generating a directed graph that describes the relationships between detected objects in an image. We empirically identify, highlight, and discuss the shortcomings of modern deep learning approaches to this task, along with the reasoning behind these failures. Scene graph generation relies on both object detection and visual relationship detection. Our experiments first tackle object detection (through its more advanced task of instance segmentation) in isolation, then explore visual relationship detection, starting with its data and moving on to its deep-learning-based approaches. Finally, we propose and implement Topological Relationship Fields, a novel approach that allows for representing and grounding relationships purely visually. We utilize this representation for a scene graph generation approach that builds upon our findings and tackles the problem in a radically different way from the current standard approaches.

Keywords

artificial intelligence, machine learning, computer vision, visual relationship detection, scene graphs, human cognition, dataset understanding, adversarial attacks, deep learning, model explainability, statistical modelling, instance segmentation, object detection, network generalization

URI

http://hdl.handle.net/10012/19350

Collections

Theses
Systems Design Engineering
