Block Turbo Codes (BTC) are promising forward error correction (FEC) codes that provide close-to-optimal coding gain at rather high coding rates (R > 0.7) and are less subject to an error floor than Convolutional Turbo Codes (CTC). Thanks to its good convergence properties, the Fang-Buda algorithm (FBA) decodes BTCs efficiently in far fewer iterations than traditional soft-decoding algorithms such as Chase's algorithm. Moreover, it can handle BTC inner codes with a higher minimum distance, which in turn improves coding performance. However, the data-intensive character of the FBA and its very complex control structure are severe bottlenecks for a low-power, high-throughput implementation. Therefore, currently available BTC decoders are based on variants of the Chase algorithm and can only handle simple BTC inner codes. To enable high-performance BTCs without sacrificing throughput or energy, we have analyzed and optimized the FBA by applying a systematic methodology that improves its data transfer and storage characteristics. This paper details the algorithm transformation steps and the resulting memory architecture. The latter, when mapped to a typical 0.18 μm technology and clocked at 200 MHz, enables BTC decoding at a maximum throughput of up to 134 Mbps. The memory energy consumption, which is dominant for such a data-dominated application, is estimated after optimization at 16 nJ/bit, while the memory area is estimated at 3.5 mm² per FBA module in the BTC pipeline.