Srivastava, Rachna
Gaudet, Vincent C.
Mitran, Patrick
2023-10-31
2023-10-31
2022-01-31
https://doi.org/10.1007/s11265-021-01735-2
http://hdl.handle.net/10012/20077
This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
This paper describes a field-programmable gate array (FPGA) implementation of a fixed-point low-density lattice code (LDLC) decoder where the Gaussian mixture messages that are exchanged during the iterative decoding process are approximated by a single Gaussian. A detailed quantization study is first performed to find the minimum number of bits required for the fixed-point decoder to attain a frame error rate (FER) performance similar to floating-point. Then, efficient numerical methods are devised to approximate the required non-linear functions. Finally, the paper presents a comparison of the performance of the different decoder architectures as well as a detailed analysis of the resource requirements and throughput trade-offs of the primary design blocks for the different architectures. A novel pipelined LDLC decoder architecture is proposed where resource re-utilization along with pipelining allows for a parallelism equivalent to 50 variable nodes on the target FPGA device.
The pipelined architecture attains a throughput of 10.5 Msymbols/sec at a distance of 5 dB from capacity, which is a 1.8× improvement in throughput compared to an implementation with 20 parallel variable nodes without pipelining. This implementation also achieves a 24× improvement in throughput over a baseline serial decoder.
en
Attribution 4.0 International
low-density lattice codes
Gaussian mixture
fixed-point arithmetic
serial and parallel FPGA architecture
hardware architecture
pipelining
Hardware Implementation of a Fixed-Point Decoder for Low-Density Lattice Codes
Article
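The abstract refers to two operations: approximating a Gaussian mixture message by a single Gaussian, and representing values in fixed point. The record does not say how either is done, so the following is only an illustrative sketch under stated assumptions: the mixture is collapsed by moment matching (one common choice, not necessarily the paper's), and the function names and bit widths are hypothetical.

```python
def collapse_mixture(weights, means, variances):
    """Moment-match a 1-D Gaussian mixture to a single Gaussian.

    Assumption: moment matching is used here for illustration only;
    the source does not state which approximation the decoder uses.
    """
    total = sum(weights)
    w = [wi / total for wi in weights]  # normalize the mixing weights
    mean = sum(wi * mi for wi, mi in zip(w, means))
    # Law of total variance: average component variance plus the
    # weighted spread of the component means around the overall mean.
    var = sum(wi * (vi + (mi - mean) ** 2)
              for wi, mi, vi in zip(w, means, variances))
    return mean, var


def quantize(x, int_bits, frac_bits):
    """Round x to signed fixed point with saturation.

    int_bits and frac_bits are hypothetical widths (sign bit excluded);
    the paper's quantization study determines the actual values.
    """
    scale = 1 << frac_bits
    lo = -(1 << (int_bits + frac_bits))      # most negative code
    hi = (1 << (int_bits + frac_bits)) - 1   # most positive code
    q = max(lo, min(hi, round(x * scale)))   # round, then saturate
    return q / scale


if __name__ == "__main__":
    m, v = collapse_mixture([0.5, 0.5], [-1.0, 1.0], [0.25, 0.25])
    print(m, v)
    print(quantize(v, 3, 4))
```

In this sketch, a symmetric two-component mixture collapses to a zero-mean Gaussian whose variance includes the spread of the component means, and the quantizer shows how a limited fractional width rounds that variance to the nearest representable step.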