Hardware Implementation of Fixed-Point Decoder for Low-Density Lattice Codes

Date

2022-04-29

Authors

Srivastava, Rachna

Advisor

Gaudet, Vincent
Mitran, Patrick

Publisher

University of Waterloo

Abstract

Low-density lattice codes (LDLCs) are a special class of lattice codes that can be decoded efficiently using iterative decoding and that approach the capacity of the additive white Gaussian noise (AWGN) channel. Their construction and intended applications differ substantially from those of more familiar error-correcting codes such as low-density parity-check (LDPC), polar, and turbo codes. Lattice codes in general have shown great theoretical promise for mitigating interference, possibly leading to significantly higher rates between users in multi-user networks. Research on LDLCs has concentrated on demonstrating their theoretically achievable performance limits, and until now there has been no reported hardware implementation, mainly because of the complexity of message passing in LDLC decoding. This thesis contributes to the hardware implementation of LDLC decoding. We present several fixed-point decoder implementations, covering different parts of the architectural design space, on a field-programmable gate array (FPGA) device. We first present the FPGA implementation of a fixed-point LDLC decoder in which the Gaussian mixture messages exchanged during iterative decoding are approximated by a single Gaussian. A detailed quantization study is performed to find the minimum number of bits required for the fixed-point decoder to attain a frame-error-rate (FER) performance similar to that of floating point. Efficient numerical methods are used to approximate the non-linear functions required in the decoder. A two-node serial LDLC decoder is implemented on an Intel Arria 10 FPGA as a hardware proof of concept, attaining a throughput of 440 Ksymbols/sec at high signal-to-noise ratio (SNR). This throughput is obtained at a clock frequency of 125 MHz and for a block length of 1000.
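As a rough illustration of the two ideas above (collapsing a Gaussian mixture message to a single Gaussian, and representing values in fixed point), the following Python sketch may help. It assumes moment matching as the collapsing rule and a signed fixed-point format with saturation; the abstract does not state which rule or word lengths the thesis uses, and the function names and bit widths here are hypothetical.

```python
def moment_match(weights, means, variances):
    """Collapse a Gaussian mixture to one Gaussian with the same mean and variance.

    Assumption: moment matching; the thesis may use a different rule.
    """
    total = sum(weights)
    w = [x / total for x in weights]  # normalize the mixing weights
    mean = sum(wi * mi for wi, mi in zip(w, means))
    second_moment = sum(wi * (vi + mi * mi) for wi, mi, vi in zip(w, means, variances))
    return mean, second_moment - mean * mean


def quantize(x, frac_bits, total_bits):
    """Round x to a signed fixed-point grid and saturate at the word limits.

    frac_bits and total_bits are illustrative parameters, not values from the thesis.
    """
    scale = 1 << frac_bits
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    q = max(lo, min(hi, round(x * scale)))  # integer code word
    return q / scale
```

For example, a symmetric two-component mixture with means -1 and 1 and component variance 0.25 collapses to a zero-mean Gaussian with variance 1.25, and `quantize(0.3, 4, 8)` snaps 0.3 to the nearest multiple of 1/16.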
By exploiting the inherent parallelism of iterative decoding, several parallel message processing blocks are then used to improve the throughput by a factor of 13. Finally, for the single-Gaussian decoder, we propose a pipelined architecture that achieves a throughput of 10.5 Msymbols/sec, a ~24x improvement over the serial decoder. Next, we implement a multi-Gaussian decoder in which the Gaussian mixture messages exchanged during decoding have two components. We develop efficient techniques to reduce the decoder complexity for hardware implementation, e.g., selecting the strongest component of the Gaussian mixture as the final decision in iterative decoding, and a simplified method for coefficient computation during the product operation at the variable nodes. Using a thorough quantization analysis and the methods devised to approximate the non-linear functions, we design the multi-Gaussian decoders in fixed-point arithmetic. We first implement a serial architecture with a single check node and a single variable node. Then, a partially parallel architecture with a single check node and a variable node message processing block with two-stage pipelining is implemented to achieve an effective parallelism of 5 variable nodes. The pipelined architecture achieves an improvement of ~0.75 dB in decoding performance over the single-Gaussian decoder of degree 3, with an overall design throughput of 550 Ksymbols/sec. In the final part of the thesis, we further explore the design space and develop complex LDLC decoder designs for higher degrees. The improved decoding performance of these designs, however, comes at additional hardware cost and reduced design throughput. We characterize the decoding performance of these decoders and present the design throughputs for different architectures on the target FPGA. Based on these results, we provide insights that will help users select the most suitable LDLC decoder for a particular application.
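The variable-node product operation and the strongest-component decision mentioned above can be sketched in Python as follows. This uses the standard closed form for the product of two Gaussian densities, not the simplified coefficient computation developed in the thesis (which the abstract does not describe). The function names and the `(weight, mean, variance)` tuple layout are hypothetical.

```python
import math


def gaussian_product(m1, v1, m2, v2):
    """Multiply N(m1, v1) by N(m2, v2).

    Returns the mean and variance of the resulting Gaussian, plus the scale
    coefficient c = N(m1; m2, v1 + v2) that weights it inside a mixture.
    """
    v = 1.0 / (1.0 / v1 + 1.0 / v2)   # precisions add
    m = v * (m1 / v1 + m2 / v2)       # precision-weighted mean
    s = v1 + v2
    c = math.exp(-((m1 - m2) ** 2) / (2.0 * s)) / math.sqrt(2.0 * math.pi * s)
    return m, v, c


def strongest_component(components):
    """Pick the (weight, mean, variance) tuple with the largest weight."""
    return max(components, key=lambda comp: comp[0])
```

With two-component messages, each pairwise product yields one weighted component, and `strongest_component` then keeps only the dominant one as the decision.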
A single-Gaussian decoder of degree 7 achieves an FER improvement of 0.75 dB over a single-Gaussian decoder of degree 3, with a throughput of 3.03 Msymbols/sec. The multi-Gaussian decoder of degree 7 (with two components in the Gaussian mixture) attains a 1.75 dB improvement in FER over the multi-Gaussian decoder of degree 3, and its overall design throughput is ~84 Ksymbols/sec. From a broader perspective, LDLC decoders with higher degrees and larger mixture messages provide a significant improvement in decoding performance. For ultra-reliable applications, a multi-Gaussian decoder of degree 7 is most suitable, while for a very high throughput requirement a single-Gaussian decoder of degree 3 is the best choice. We also characterize the performance of multi-Gaussian decoders in which the Gaussian mixture messages contain more than two components. Based on the results, the multi-Gaussian decoder with mixture messages that contain 5 components gains approximately 0.1 to 0.2 dB (for degrees 3 and 7) and approximately 0.3 dB (for degree 5) over the multi-Gaussian decoder whose mixture messages have only two components.

Keywords

Low-density lattice codes, Gaussian mixture, fixed-point arithmetic, serial and parallel FPGA architecture, hardware architecture