Patent application title:

Parallel LDPC Decoder

Publication number:

US20110173510A1

Publication date:

2011-07-14

Application number:

13/069,105

Filed date:

2011-03-22

Abstract:

An LDPC decoder that implements an iterative message-passing algorithm, where the improvement includes a pipeline architecture such that the decoder accumulates results for row operations during column operations, such that additional time and memory are not required to store results from the row operations beyond that required for the column operations.

Inventors:

Alexander Andreev, San Jose, CA, United States
Sergey Gribok, Santa Clara, CA, United States
Igor Vikhliantsev, San Jose, CA, United States

Assignee:

LSI CORPORATION, Milpitas, CA, United States

Classification:

H03M13/116 » CPC main

Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes; Error detection or forward error correction by redundancy in data representation, i.e. code words containing more digits than the source words using block codes, i.e. a predetermined number of check bits joined to a predetermined number of information bits using multiple parity bits; Codes on graphs and decoding on graphs, e.g. low-density parity check [LDPC] codes; Structural properties of the code parity-check or generator matrix Quasi-cyclic LDPC [QC-LDPC] codes, i.e. the parity-check matrix being composed of permutation or circulant sub-matrices

H03M13/1137 » CPC further

Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes; Error detection or forward error correction by redundancy in data representation, i.e. code words containing more digits than the source words using block codes, i.e. a predetermined number of check bits joined to a predetermined number of information bits using multiple parity bits; Codes on graphs and decoding on graphs, e.g. low-density parity check [LDPC] codes; Decoding; Scheduling of bit node or check node processing Partly parallel processing, i.e. sub-blocks or sub-groups of nodes being processed in parallel

H03M13/114 » CPC further

Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes; Error detection or forward error correction by redundancy in data representation, i.e. code words containing more digits than the source words using block codes, i.e. a predetermined number of check bits joined to a predetermined number of information bits using multiple parity bits; Codes on graphs and decoding on graphs, e.g. low-density parity check [LDPC] codes; Decoding; Scheduling of bit node or check node processing Shuffled, staggered, layered or turbo decoding schedules

H03M13/11 IPC

G06F11/10 IPC

Error detection; Error correction; Monitoring; Responding to the occurrence of a fault, e.g. fault tolerance; Error detection or correction by redundancy in data representation, e.g. by using checking codes Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's

Description

FIELD

This patent application is a continuation of and claims all rights and priority on prior pending U.S. patent application Ser. No. 11/565,670 filed 2006.12.01. This invention relates to the field of integrated circuit fabrication. More particularly, this invention relates to an efficient, parallel low-density parity-check (LDPC) decoder for a special class of parity check matrices that reduces the amount of memory and time that are required for the necessary calculations.

BACKGROUND

LDPC code is typically a linear stream of data in a self-correcting format that can be represented by an (m,n)-matrix with a relatively small, fixed number of ones (nonzero for arbitrary GF(q)) in each row and column, where m is the number of check bits and n is the code length in bits.

The most famous algorithm for decoding LDPC codes is called the iterative message-passing algorithm. Each iteration of this algorithm consists of two stages. In stage 1 (the row operations), the algorithm computes messages for all of the check nodes (the rows). In stage 2 (the column operations), the algorithm computes messages for all of the bit nodes (the columns), and sends them back to the check nodes associated with the given bit nodes. There are many different implementations of this message-passing algorithm, but all of them use two-stage operations. Further, in each of these implementations, the second step starts only after all of the messages for all of the rows have been calculated.

As with all information processing operations, it is desirable for the procedure to operate as quickly as possible, while consuming as few resources as possible. Unfortunately, LDPC codes such as those described above typically require a relatively significant overhead in terms of the time and the memory required for them to operate.

What is needed is an LDPC code that operates in a more efficient manner, such as by reducing the amount of time or the amount of memory that is required by the operation.

SUMMARY

The above and other needs are met by an LDPC decoder that implements an iterative message-passing algorithm, where the improvement includes a pipeline architecture such that the decoder accumulates results for row operations during column operations, such that additional time and memory are not required to store results from the row operations beyond that required for the column operations.

BRIEF DESCRIPTION OF THE DRAWINGS

Further advantages of the invention are apparent by reference to the detailed description when considered in conjunction with the figures, which are not to scale so as to more clearly show the details, wherein like reference numbers indicate like elements throughout the several views, and wherein:

FIG. 1 is a functional block diagram of an LDPC decoder according to an embodiment of the present invention.

FIG. 2 is an example of an LDPC encoding matrix according to an embodiment of the present invention.

DETAILED DESCRIPTION

The LDPC algorithm described herein accumulates results for the row operations during the column operations, so that additional time and memory are not required to store the results from the row operations while the column operations are conducted. One embodiment of the method according to the present invention is presented below for the purpose of example. The method is described in reference to a hardware embodiment of the invention, as given in FIG. 1.

Initialization Step

For each parity bit w and code bit v, calculate:

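$$\mathrm{md\_m}[v] = \frac{P_v(0)}{P_v(1)},$$

$$\mathrm{md\_g}[v][w] = \mathrm{md\_m}[v], \text{ and}$$

$$\mathrm{md\_R}[w] = f^{-1}\Bigl(\prod_{v \in O(w)} f(\mathrm{md\_m}[v])\Bigr),$$

where P_v(0) and P_v(1) are the probabilities (from the Viterbi decoder) that bit v is equal to either 0 or 1, respectively, and f is the Gallager function defined in the first step of the iteration process. O(v) denotes the set of all parity bits w that include code bit v.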
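
The initialization step can be sketched in Python as follows. This is a minimal illustration, not the hardware embodiment: the function name `init_messages`, the row-list representation of the parity check matrix `H`, and the dictionary layout of md_g are assumptions made for the example. Seeding md_R by applying calculation (4) to the initial md_g values is likewise an assumption of the sketch; it uses the Gallager function f given in the first step of the iteration process.

```python
from math import prod

def f(x):
    # Gallager function as given in the text; undefined at x == 1.
    return (1 + x) / (1 - x)

def f_inv(y):
    # Algebraic inverse of f: solve y = (1 + x) / (1 - x) for x.
    return (y - 1) / (y + 1)

def init_messages(H, p0, p1):
    """H: list of m rows with n 0/1 entries; p0[v], p1[v]: soft inputs for bit v."""
    m, n = len(H), len(H[0])
    md_m = [p0[v] / p1[v] for v in range(n)]
    # One md_g entry per (code bit v, parity bit w) pair with H[w][v] == 1.
    md_g = {(v, w): md_m[v] for w in range(m) for v in range(n) if H[w][v]}
    # Assumption: md_R[w] is seeded from the initial md_g values via calculation (4).
    md_R = [f_inv(prod(f(md_g[v, w]) for v in range(n) if H[w][v]))
            for w in range(m)]
    return md_m, md_g, md_R

# Toy parity check matrix with 2 parity bits and 3 code bits (hypothetical values).
H = [[1, 1, 0],
     [0, 1, 1]]
md_m, md_g, md_R = init_messages(H, p0=[0.9, 0.2, 0.6], p1=[0.1, 0.8, 0.4])
```

The toy inputs avoid ratios equal to one, where f as written is undefined.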

First Step of Iteration Process

Compute the following:

$$S[v] = \Bigl(\prod_{w \in O(v)} \frac{\mathrm{md\_R}[w]}{\mathrm{md\_g}[v][w]}\Bigr) \cdot \mathrm{md\_m}[v] \tag{1}$$

$$\mathrm{loc\_item}[v][w] = \frac{\mathrm{md\_R}[w]}{\mathrm{md\_g}[v][w]} \tag{2}$$

$$\mathrm{md\_g\_new}[v][w] = \frac{S[v]}{\mathrm{loc\_item}[v][w]} \tag{3}$$

$$\mathrm{md\_R\_new}[w] = f^{-1}\Bigl(\prod_{v \in O(w)} f(\mathrm{md\_g\_new}[v][w])\Bigr), \tag{4}$$

where

$$f(x) = \frac{1 + x}{1 - x}$$

(the Gallager function), O(w) is all of the code bits from the parity bit w, and O(v) is all of the parity bits w that include the code bit v.
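
Calculations (1) through (4) can be transcribed almost literally into Python. The sketch below is illustrative only and works in the ratio domain exactly as the formulas are written. The function name `one_iteration`, the row-list form of `H`, the dictionaries keyed by (v, w), and the toy input values are assumptions for the example; no pipelining or parallelism is modeled.

```python
from math import prod

def f(x):
    # Gallager function as given in the text; undefined at x == 1.
    return (1 + x) / (1 - x)

def f_inv(y):
    # Algebraic inverse of f.
    return (y - 1) / (y + 1)

def one_iteration(H, md_m, md_g, md_R):
    m, n = len(H), len(H[0])
    O_v = [[w for w in range(m) if H[w][v]] for v in range(n)]  # parity bits per code bit
    O_w = [[v for v in range(n) if H[w][v]] for w in range(m)]  # code bits per parity bit
    # Calculation (2): one quotient per (v, w) pair.
    loc_item = {(v, w): md_R[w] / md_g[v, w] for v in range(n) for w in O_v[v]}
    # Calculation (1): product of the quotients times the channel value.
    S = [prod(loc_item[v, w] for w in O_v[v]) * md_m[v] for v in range(n)]
    # Calculation (3): divide the matching quotient back out of S[v].
    md_g_new = {(v, w): S[v] / loc_item[v, w] for (v, w) in loc_item}
    # Calculation (4): combine the new bit messages per parity bit.
    md_R_new = [f_inv(prod(f(md_g_new[v, w]) for v in O_w[w])) for w in range(m)]
    return S, loc_item, md_g_new, md_R_new

H = [[1, 1, 0],
     [0, 1, 1]]
md_m = [2.0, 0.5, 4.0]
md_g = {(0, 0): 2.0, (1, 0): 0.5, (1, 1): 0.5, (2, 1): 4.0}
md_R = [3.0, 0.25]
S, loc_item, md_g_new, md_R_new = one_iteration(H, md_m, md_g, md_R)
```

Once every code bit has been processed, md_g_new and md_R_new take the place of md_g and md_R for the next iteration.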

Calculations (1) and (2) above are performed for v=i, and calculations (3) and (4) are performed for v=i−1. Then, calculations (1) and (2) are performed for v=i+1, and calculations (3) and (4) are performed for v=i, and so on, through a pipeline architecture in the arithmetic unit, depicted in FIG. 1. When all of the code bits v have been processed, the values are assigned as given below,

md_g[v][w]=md_g_new[v][w] (6)

md_R[w]=md_R_new[w] (7)

for each message bit v and parity bit w. A single iteration as described above is performed. The so-called “hard decision” for each code bit v is performed during this single iteration, where:

$$\mathrm{hard\_decision}[v] = 0 \quad \text{if} \quad \operatorname{sign}\Bigl(\prod_{w \in O(v)} \mathrm{loc\_item}[v][w]\Bigr) = 1 \tag{8}$$

and

$$\mathrm{hard\_decision}[v] = 1 \quad \text{if} \quad \operatorname{sign}\Bigl(\prod_{w \in O(v)} \mathrm{loc\_item}[v][w]\Bigr) = -1 \tag{9}$$

Products for the formulas (8) and (9) were already calculated during calculation (1) for S[v]. Preferably, the calculations are performed in the logarithmic domain, so all products are replaced by sums as implemented in the arithmetic unit.
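
A log-domain version of the hard decision might look like the sketch below. It assumes that each log-domain loc_item value is the logarithm of the corresponding ratio and that the sign test of (8) and (9) is applied to the resulting sum. That reading, the tie-breaking toward 0, the function name `hard_decisions`, and the data layout are assumptions of the example rather than details taken from the text.

```python
import math

def hard_decisions(loc_item_log, O_v):
    """loc_item_log[(v, w)]: log-domain loc_item; O_v[v]: parity bits that include bit v."""
    decisions = []
    for v, checks in enumerate(O_v):
        total = sum(loc_item_log[v, w] for w in checks)  # product replaced by a sum
        decisions.append(0 if total >= 0 else 1)         # assumption: a zero sum maps to 0
    return decisions

# Hypothetical ratio-domain values, converted to the log domain.
O_v = [[0], [0, 1], [1]]
ratios = {(0, 0): 3.0, (1, 0): 0.5, (1, 1): 1.2, (2, 1): 0.1}
loc_item_log = {key: math.log(value) for key, value in ratios.items()}
decisions = hard_decisions(loc_item_log, O_v)
```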

Parallel Architecture

One embodiment of the LDPC decoder as described herein includes a controller, an input fifo (first-in, first-out buffer, from the Viterbi decoder), an output fifo (first-in, first-out buffer for the final hard decision, or to another process, such as a Reed-Solomon computation), a pipeline, two interleavers, and t functional units of two types: Bit units and Par units, all as depicted in FIG. 1. The Bit units calculate data on bit nodes, and the Par units calculate data on check nodes.

Each Par unit preferably contains pipelined memories for storing values of md_R[w] and md_R_new[w]. Each Bit unit preferably contains pipelined memories for storing values of S[v], md_m[v], and loc_item. Each unit is preferably pipelined, meaning that it can store data for a few different nodes at the same time. In the embodiment depicted the arithmetic unit is separated for simplification and to show all the more relevant connections. However, the present invention is applicable to a wide variety of arithmetic unit architectures that are capable of performing calculations (1) through (9) above. Also, in the embodiment as depicted in FIG. 1, memories are embedded into the arithmetic unit, but in other embodiments they could be separate from the arithmetic unit.

A special parity check is used for (m,n) matrices H for LDPC-codes, which parity check can be represented by a matrix (M,N) from permutation (r,r) cell H_i,j, where m=M·r, n=N·r, and r(mod t)=0. An example of the matrix H is given in FIG. 2, where M=3, N=7, r=8, m=24, and n=56. The permutation matrix contains exactly one value of one in each sub row and sub column. To reduce the number of operations per circuit gate, circulant permutation matrices are used in one embodiment, which matrices are determined by formula:

p(j)=p(0)+j(mod r)

where p(j) is the index of the column with a value of one in the j-th row. For example, p(0)=2 for the upper left cell in FIG. 2 (where counting of both rows and columns starts with zero). Thus, we can use the initial index p(0) of the one in the first row to determine each circulant permutation matrix. Similarly, the function c(j) returns the index of the row with a value of one in the j-th column.
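
The circulant structure can be illustrated with a short Python sketch. The helper names (`p`, `c`, `circulant_cell`, `build_H`) are chosen for the example. The expression used for c follows from inverting the formula for p. The offsets below are hypothetical, apart from the value 2 for the upper left cell; they do not reproduce FIG. 2.

```python
def p(j, p0, r):
    # Column index of the one in row j of a circulant cell: p(j) = p(0) + j (mod r).
    return (p0 + j) % r

def c(j, p0, r):
    # Row index of the one in column j; the inverse of p, so p(c(j)) == j.
    return (j - p0) % r

def circulant_cell(p0, r):
    # r x r permutation matrix determined entirely by p(0).
    return [[1 if col == p(row, p0, r) else 0 for col in range(r)]
            for row in range(r)]

def build_H(offsets, r):
    """offsets[i][j] is p(0) for cell H_i,j; returns the (M*r) x (N*r) matrix."""
    H = []
    for cell_row in offsets:
        cells = [circulant_cell(p0, r) for p0 in cell_row]
        for row in range(r):
            H.append([bit for cell in cells for bit in cell[row]])
    return H

# Hypothetical offsets for M=3, N=7, r=8.
offsets = [[2, 0, 5, 1, 7, 3, 4],
           [6, 3, 1, 0, 2, 5, 7],
           [4, 7, 2, 6, 1, 0, 3]]
H = build_H(offsets, r=8)
```

Because every cell is a permutation matrix, each row of this H contains N ones and each column contains M ones.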

Groups of t columns from the matrix H are logically divided into stripes. Assume that we already have a value of md_g[v][w] for each pair (v,w), where w ∈ O(v), and a value of md_R[w] for each parity bit w. Starting from the first stripe, the following operations are performed in one embodiment. Calculate the addresses for the t memories that contain md_R for all of the check nodes, according to the following formula:

address(w)=cell_index(H_ij)(mod M)·(r/t)+c(v)/t (10)

where c(v) is the row index of the value one in the column for v from Rd and cell_index(H_ij)=i+j·M.
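
Formula (10) can be sketched in Python as shown below. The sketch assumes that both divisions in the formula are integer divisions, which is consistent with the requirement r(mod t)=0 but is not stated explicitly. The function names and argument order are illustrative.

```python
def cell_index(i, j, M):
    # Index of cell H_i,j as defined in the text.
    return i + j * M

def address(i, j, M, r, t, c_v):
    """Address within a memory for the check node at row index c_v of cell H_i,j."""
    assert r % t == 0, "the text requires r (mod t) = 0"
    # cell_index (mod M) reduces to the cell row i; each cell row spans r // t addresses.
    return (cell_index(i, j, M) % M) * (r // t) + c_v // t

# Example with M=3 and r=8 as in FIG. 2, and a hypothetical t=4.
example = address(i=1, j=2, M=3, r=8, t=4, c_v=5)
largest = address(i=2, j=6, M=3, r=8, t=4, c_v=7)
```

Under these assumptions the addresses produced by the formula range from 0 to M·(r/t)−1.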

The value of md_R[w] for the t memories is input on the reverse interleaver that computes the permutation, according to the function:

π(i)=c(i)(mod t) for given H_i,j. (11)

Then, all of the values of md_R[w] are input to the right-most Bit unit to produce the sum S[v]. The method then continues with the same stripe in H_i+1,j, H_i+2,j, and so on.

For the second and subsequent stripes, we calculate the value loc_item and accumulate the sum S[v] for the current bits as described above, and retain the previously computed values of S[v] and loc_item for the bits from the previous stripe in the pipeline in the Bit unit. Then the values of S[v] and loc_item are retrieved from the pipeline and rearranged through the direct interleaver, which computes the permutation τ according to the function:

τ(π(i))=i, where π, τ ∈ H_i,j. (12)

and then calculates the values md_g_new and md_R_new according to formulas (3) and (4) for both v and w from the pipeline. When all the stripes have been processed in this manner, the values of md_g_new and md_R_new are used to replace the values of md_g and md_R as given in equations (6) and (7), and one cycle of the iteration is completed.
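
The relationship between the reverse interleaver (permutation π) and the direct interleaver (permutation τ) can be illustrated with a small Python sketch. It assumes a circulant cell, so that c(j) = (j − p(0)) mod r, and uses hypothetical values t=4 and p(0)=2. The function names and the string placeholders for md_R values are inventions of the example.

```python
def reverse_permutation(c_values, t):
    """pi(i) = c(i) mod t for the t columns of the current stripe in one cell."""
    pi = [c_v % t for c_v in c_values]
    assert sorted(pi) == list(range(t)), "pi must be a permutation of 0..t-1"
    return pi

def direct_permutation(pi):
    """tau undoes pi, so tau[pi[i]] == i for every i."""
    tau = [0] * len(pi)
    for i, target in enumerate(pi):
        tau[target] = i
    return tau

r, t, p0 = 8, 4, 2
columns = range(4, 8)                        # second stripe of the cell
c_values = [(j - p0) % r for j in columns]   # c(j) for a circulant cell
pi = reverse_permutation(c_values, t)
tau = direct_permutation(pi)

par_values = ["R0", "R1", "R2", "R3"]                # one md_R value per memory
bit_inputs = [par_values[pi[i]] for i in range(t)]   # reverse interleaver
restored = [bit_inputs[tau[i]] for i in range(t)]    # direct interleaver
```

Routing through π and then through τ returns every value to the position it started from, which is the property expressed by (12).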

Block-Schema of Algorithm

1. Start with the k-th stripe and a cell H_i,j with index s.
2. Calculate AR_BIT[i].md_m=md_m[v[k·t+i]], where i=0, . . . , t−1.
3. Calculate the addresses for w^s ∈ O(v) for v from cell H_i,j with index s according to formula (10).
4. Calculate AR_BIT[i].md_R=AR_PAR[π^s(i)].md_R[w^s], where π^s is the reverse permutation for the cell with index s.
5. Calculate AR_BIT[i].md_g=md_g[v[k·t+i]][w^s].
6. Calculate AR_BIT[i].loc_item[v[k·t+i]][w^s] according to formula (2).
7. Calculate AR_BIT[i].S[v[k·t+i]] according to formula (1).
8. Calculate AR_PAR[i].loc_item[v[(k−1)·t+i]][w^(s−M)]=AR_BIT[τ^(s−M)[i]].loc_item[v[(k−1)·t+i]][w^(s−M)] and AR_PAR[i].S[v[(k−1)·t+i]]=AR_BIT[τ^(s−M)[i]].S[v[(k−1)·t+i]], where τ^(s−M) is the direct permutation for the cell with an index of s−M.
9. Calculate AR_BIT[i].md_g_new[v[(k−1)·t+i]][w^(s−M)] according to formula (3).
10. Calculate AR_BIT[i].md_R_new[w^(s−M)] according to formula (4).
11. Go to the next cell, with an index of s+1.
12. If (s+1) (mod M)=0, then go to the next stripe (k+1).
13. If all cells pass step 12 above, then assign the values as given in equations (6) and (7), and start a new iteration for the 0-th stripe and the 0-th cell.
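
One possible reading of the traversal order in the steps above is sketched below in Python. It assumes that the index s runs sequentially over the cells, so that stripe k covers the M cells with indices k·M through k·M+M−1, and that the second pipeline stage (steps 8 through 10) trails the first stage (steps 2 through 7) by exactly M cells. The function name `cell_schedule` and the tuple layout are inventions of the example, and the arithmetic of the individual steps is not modeled.

```python
def cell_schedule(num_stripes, M):
    """List (front, back) pairs, one per time step.

    front: (stripe, cell_row) handled by steps 2-7, or None once the input is exhausted.
    back:  (stripe, cell_row) handled by steps 8-10, or None while the pipeline fills.
    """
    total = num_stripes * M
    steps = []
    for s in range(total + M):                       # M extra steps drain the pipeline
        front = divmod(s, M) if s < total else None
        back = divmod(s - M, M) if s >= M else None  # cell with index s - M
        steps.append((front, back))
    return steps

steps = cell_schedule(num_stripes=2, M=3)
```

Under this reading, processing all cells takes num_stripes·M + M time steps, rather than the 2·num_stripes·M steps that two fully separate stages would need.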

The foregoing description of preferred embodiments for this invention has been presented for purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise form disclosed. Obvious modifications or variations are possible in light of the above teachings. The embodiments are chosen and described in an effort to provide the best illustrations of the principles of the invention and its practical application, and to thereby enable one of ordinary skill in the art to utilize the invention in various embodiments and with various modifications as are suited to the particular use contemplated. All such modifications and variations are within the scope of the invention as determined by the appended claims when interpreted in accordance with the breadth to which they are fairly, legally, and equitably entitled.

Claims

What is claimed is:

1. In an LDPC decoder implementing an iterative message-passing algorithm, the improvement comprising a pipeline architecture such that the decoder accumulates results for row operations during column operations, such that additional time and memory are not required to store results from the row operations beyond that required for the column operations.

Resources

Sources:

United States Patent and Trademark Office (USPTO)
