Spatial and Channel Attention-based 3D Object Classification Research for 3D Point Clouds

Tang, Xikai

dc.contributor.author	Tang, Xikai
dc.date.accessioned	2023-01-16 20:43:07 (GMT)
dc.date.available	2023-01-16 20:43:07 (GMT)
dc.date.issued	2023-01-16
dc.date.submitted	2023-01-01
dc.identifier.uri	http://hdl.handle.net/10012/19066
dc.description.abstract	Deep learning has been widely used in two-dimensional (2D) computer vision, and machine learning techniques have become one of the key directions for future scientific research. In 2D computer vision, CNN[49], RNN[34], SENet[40], Transformer[89], and many other algorithms show strong results on 2D data. With the accelerating development of computer vision technologies, exploiting 2D data alone is insufficient for machine learning research, and researchers are considering transferring 2D computer vision algorithms to the three-dimensional (3D) domain. Point clouds are an important representation of 3D data. Because 3D point cloud data carries more detailed information than 2D data, research on it has accelerated in recent years, leading to significant breakthroughs in artificial intelligence, deep learning, autonomous driving, tracking, and other domains. A large number of deep learning methods based on point clouds have recently been proposed. PointNet[72], P4Transformer[21], and SampleNet[47] show significant success in the 3D domain. The disorder and sparsity of point clouds make it challenging to design deep neural networks for point cloud processing. In chapter one, we introduce the background of point clouds, the existing public datasets and evaluation metrics, and then investigate and analyze deep learning methods for point cloud classification. In chapter two, we introduce the generation of point clouds and analyze the existing methods for classification and segmentation. Furthermore, we investigate the attention mechanism in computer vision, including its background and evolution, spatial and channel attention in vision, and point cloud-based attention models in deep learning. 
Based on the analysis and investigation in chapters one and two, we found that, although this data type provides depth information, its point sparsity and disorder make it challenging to design appropriate deep neural networks to process it, and exploring local relationships in point cloud data remains difficult. In chapter three, to better extract features and obtain geometric information, we therefore propose a point attention (PointAT) model and an attention value (AT value) model for feature fusion that applies the geometric relationship to the data. We then propose a new spatial and channel attention-based network (SCA). SCA is the overall structure of the network; its main purpose is to connect the PointAT and AT value models and to capture meaningful geometric information by applying the geometric relationship between point cloud patches to the model. We also propose an auto pooling framework to extract global features. In this work, we concentrate on learning the geometric relationship between point cloud data. For this purpose, we introduce a point attention model based on spatial and channel attention to learn the geometric relationship between point clouds, and further combine the geometric relationship with the point cloud data through the AT value model. Finally, we introduce an adaptive downsampling structure, Autopooling. This downsampling structure considers each point’s importance weight and picks key points adaptively, and it can be used with convolutional networks. Extensive experiments conducted on two benchmark datasets (ModelNet40[96] and ShapeNet[11]) demonstrate the effectiveness of our SCA and SCA-Auto (SCAA with Auto pooling) methods. Finally, in chapter four, we summarize our contributions, the significance of the study findings, and the limitations of the proposed methods. We then identify future research directions based on our analysis and investigation.	en
dc.language.iso	en	en
dc.publisher	University of Waterloo	en
dc.relation.uri	ModelNet40	en
dc.relation.uri	ShapeNet	en
dc.subject	3D object classification	en
dc.subject	point clouds	en
dc.subject	attention mechanism	en
dc.title	Spatial and Channel Attention-based 3D Object Classification Research for 3D Point Clouds	en
dc.type	Master Thesis	en
dc.pending	false
uws-etd.degree.department	Electrical and Computer Engineering	en
uws-etd.degree.discipline	Electrical and Computer Engineering	en
uws-etd.degree.grantor	University of Waterloo	en
uws-etd.degree	Master of Applied Science	en
uws-etd.embargo.terms	0	en
uws.contributor.advisor	Ban, Dayan
uws.contributor.advisor	Wang, Zhou
uws.contributor.affiliation1	Faculty of Engineering	en
uws.published.city	Waterloo	en
uws.published.country	Canada	en
uws.published.province	Ontario	en
uws.typeOfResource	Text	en
uws.peerReviewStatus	Unreviewed	en
uws.scholarLevel	Graduate	en

Files in this item

Name:: Xikai_Tang.pdf
Size:: 4.948Mb
Format:: PDF
