Efficient Hardware Realization of Convolutional Neural Networks using Intra-Kernel Regular Pruning

dc.contributor.advisor: Gaudet, Vincent
dc.contributor.author: Yang, Maurice
dc.date.accessioned: 2019-05-24T18:47:01Z
dc.date.available: 2019-05-24T18:47:01Z
dc.date.issued: 2019-05-24
dc.date.submitted: 2019-05-17
dc.description.abstract: Convolutional neural networks (CNNs) have proven their success in a wide range of applications. While CNNs boast remarkable performance, they require significant computational and memory resources for operation. As research strives toward higher classification accuracy, CNN topologies have increased in depth, complexity and size. In response, algorithmic-level optimizations have been proposed to reduce the size of CNNs while retaining classification accuracy. While these advances promise savings in theory, they often underperform in practice, especially when adopted into hardware. To achieve practical savings, algorithmic changes must be considered from the perspective of hardware, thus necessitating a software-hardware codesign philosophy. We propose an Intra-Kernel Regular (IKR) pruning scheme to reduce the size and computational complexity of CNNs by removing redundant weights at a fine-grained level without loss in classification accuracy. Unlike other pruning methods such as Fine-Grained pruning, IKR pruning maintains regular kernel structures and employs data compression techniques that translate well into hardware. At the hardware level, we propose an FPGA design framework targeting IKR-pruned CNNs. The organisational structure of the design enables potential for high parallelism and efficient utilization of on-chip resources. Experimental results in software demonstrate up to 10× reduction in weights and 7× reduction in computation at a cost of less than 1% degradation in accuracy versus the un-pruned case. Evaluation of the accelerator indicates computational speeds up to 77.7 GOP/S (effectively 403 GOP/S) with each DSP effectively performing 0.53 GOP/S. [en]
dc.identifier.uri: http://hdl.handle.net/10012/14716
dc.language.iso: en [en]
dc.pending: false
dc.publisher: University of Waterloo [en]
dc.subject: neural network [en]
dc.subject: FPGA [en]
dc.subject: software-hardware codesign [en]
dc.title: Efficient Hardware Realization of Convolutional Neural Networks using Intra-Kernel Regular Pruning [en]
dc.type: Master Thesis [en]
uws-etd.degree: Master of Applied Science [en]
uws-etd.degree.department: Electrical and Computer Engineering [en]
uws-etd.degree.discipline: Electrical and Computer Engineering [en]
uws-etd.degree.grantor: University of Waterloo [en]
uws.contributor.advisor: Gaudet, Vincent
uws.contributor.affiliation1: Faculty of Engineering [en]
uws.peerReviewStatus: Unreviewed [en]
uws.published.city: Waterloo [en]
uws.published.country: Canada [en]
uws.published.province: Ontario [en]
uws.scholarLevel: Graduate [en]
uws.typeOfResource: Text [en]

Files

Original bundle

Name: Yang_Maurice.pdf
Size: 2.08 MB
Format: Adobe Portable Document Format
Description: Thesis

License bundle

Name: license.txt
Size: 6.08 KB
Format: Item-specific license agreed upon to submission