An Erasure-Resilient and Compute-Efficient Coding Scheme

Total Page:16

File Type:pdf, Size:1020Kb

An Erasure-Resilient and Compute-Efficient Coding Scheme An Erasure-Resilient and Compute-Efficient Coding Scheme for Storage Applications Dissertation zur Erlangung des Doktorgrades der Naturwissenschaften vorgelegt beim Fachbereich Informatik und Mathematik der Johann Wolfgang Goethe-Universität in Frankfurt am Main von Sebastian Kalcher aus Lahn-Gießen Frankfurt 2013 (D30) vom Fachbereich Informatik und Mathematik der Johann Wolfgang Goethe-Universität als Dissertation angenommen. Dekan: Prof. Dr. Thorsten Theobald Gutachter: Prof. Dr. Volker Lindenstruth Prof. Dr. Udo Kebschull Datum der Disputation: 31.10.2013 Abstract Driven by rapid technological advancements, the amount of data that is created, captured, communica- ted, and stored worldwide has grown exponentially over the past decades. Along with this development it has become critical for many disciplines of science and business to being able to gather and analyze large amounts of data. The sheer volume of the data often exceeds the capabilities of classical storage systems, with the result that current large-scale storage systems are highly distributed and are comprised of a high number of individual storage components. As with any other electronic device, the reliability of storage hardware is governed by certain probability distributions, which in turn are influenced by the physical processes utilized to store the information. The traditional way to deal with the inherent unre- liability of combined storage systems is to replicate the data several times. Another popular approach to achieve failure tolerance is to calculate the block-wise parity in one or more dimensions. With better understanding of the different failure modes of storage components, it has become evident that so- phisticated high-level error detection and correction techniques are indispensable for the ever-growing distributed systems. The utilization of powerful cyclic error-correcting codes, however, comes with a high computational penalty, since the required operations over finite fields do not map very well onto current commodity processors. This thesis introduces a versatile coding scheme with fully adjustable fault-tolerance that is tailored specifically to modern processor architectures. To reduce stress on the memory subsystem the conventional table-based algorithm for multiplication over finite fields has been replaced with a polynomial version. This arithmetically intense algorithm is better suited to the wide SIMD units of the currently available general purpose processors, but also displays significant bene- fits when used with modern many-core accelerator devices (for instance the popular general purpose graphics processing units). A CPU implementation using SSE and a GPU version using CUDA are presented. The performance of the multiplication depends on the distribution of the polynomial coeffi- cients in the finite field elements. This property has been used to create suitable matrices that generate a linear systematic erasure-correcting code which shows a significantly increased multiplication perfor- mance for the relevant matrix elements. Several approaches to obtain the optimized generator matrices are elaborated and their implications are discussed. A Monte-Carlo-based construction method allows it to influence the specific shape of the generator matrices and thus to adapt them to special storage and archiving workloads. Extensive benchmarks on CPU and GPU demonstrate the superior performance and the future application scenarios of this novel erasure-resilient coding scheme. i Zusammenfassung Die sogenannte digitale Revolution, beginnend am Ende des 20. Jahrhunderts, hat unseren Zugang zu Informationen umfassend verändert. Viele Bereiche des gesellschaftlichen Lebens, der Wirtschaft und der Wissenschaft sind mittlerweile stark auf Informationstechnologie angewiesen. Zusammen mit der rasanten Entwicklung der Mikroprozessoren war die Evolution der magnetischen Speichertechnolo- gien einer der Schlüsselfaktoren für den Übergang in eine Informationsgesellschaft. Kostengünstige Datenspeicher mit hoher Kapazität bilden heute die Basis für die große Familie von allgegenwärtigen Geräten, die in die Klasse der personal computing devices eingeordnet werden, beispielsweise klassi- sche Arbeitsplatzrechner, tragbare Laptops, Mobiltelefone, Tablet-Computer, sowie Foto- und Video- kameras. Besonders der Durchbruch der Halbleiter-Speicher hat die Entwicklung immer kleinerer und robusterer Geräte aus dieser Klasse begünstigt. Aber auch die Entwicklung der professionellen Hoch- leistungscomputer und deren Speichersysteme wird durch diese Schlüsseltechnologien angetrieben. Besonders diese Art von Computern hat enorme Bedeutung für den wissenschaftlichen Erkenntnis- gewinn, aber auch ein breites wirtschaftliches Anwendungsspektrum. Das gewaltige Wachstum der Menge an Informationen, die weltweit erzeugt (oder gewonnen), übertragen und gespeichert wird ist daher wenig überraschend. Schätzungen zu Folge stieg die Gesamtmenge aller gespeicherten Infor- mationen (auf allen gängigen analogen und digitalen Medien) von 2,6 Exabytes1 in 1986 auf 295 Exabytes in 2007 [1]. Bemerkenswerterweise trugen die digitalen Speichertechnologien erst ab dem Jahr 2000 signifikant zur gesamten Speicherkapazität bei. Nur drei Jahre später wurden geschätzte 90% aller neuen Daten auf magnetischen Medien gespeichert [2] und im Jahr 2007 überholten die digitalen Speicher die analogen endgültig: 52% aller existenten Informationen waren auf Festplatten, 28% auf optischen Medien und 11% auf magnetischen Bändern gespeichert. Neuere Studien aus dem Jahr 2011 quantifizierten die Gesamtmenge aller vom Menschen jemals erzeugten und gespeicherten Informationen auf gewaltige 1,8 Zettabytes [3][4]. Die datenintensiven Wissenschaften gelten neben dem Experiment, der Theorie und der Simulation mittlerweile als vierte essentielle wissenschaftliche Disziplin. Das prominenteste Beispiel kommt aus Gebiet der Teilchenphysik: Der Large Hadron Col- lider (LHC) der Europäische Organisation für Kernforschung (CERN) in Genf [5] beheimatet sechs unterschiedlich große Experimente die zusammen einige zehn Petabyte an zu speichernden Daten pro Jahr aufnehmen. Aber nicht nur derartige Großforschungsprojekte stellen hohe Anforderungen an die Datenverarbeitungs- und Speichersysteme: Die Entwicklung bildgebenden Sensoren mit Auflösungen Gigapixel-Bereich und immer höher auflösenden Teleskopen stellt die astronomische Forschung vor ähnliche Herausforderungen. Sogar die relativ kompakten DNA-Sequenzierungsgeräte aktueller Bau- art produzieren bereits Rohdatenmengen im Bereich einiger Terabyte pro Tag. Mit den immer größer 1Ein Exabyte steht für 1018 Bytes. iii werdenden Datenmengen müssen natürlich auch die Hochleistungsrechner mit immer größeren Spei- chersystemen ausgestattet werden, um die Analyse der gewonnen Informationen zu ermöglichen. Doch die Verarbeitung von großen Datenmengen hat nicht nur im wissenschaftlichen Umfeld enorm an Be- deutung gewonnen. Besonders das Geschäftsfeld der Internet Suchmaschinen wäre ohne die Fähigkeit enorme Datenmengen zu sammeln, zu analysieren und zu speichern heute undenkbar. Weiter Beispiele sind elektronische Handelsplattformen und weltumspannende soziale Netzwerke. Der größte Teil all dieser Daten wird zum Online-Zugriff auf großen Sammlungen von Festplatten vorgehalten, deren Verwaltung und Pflege die Besitzer vor große Herausforderungen stellt: Obwohl die Zuverlässigkeit einer einzelnen Festplatte durchaus beachtlich ist, ergeben sich bei Systemen aus Zehn- tausenden Festplatten hohe Raten an Fehlerereignissen. Große wissenschaftliche Experimente können aus Kostengründen oft nicht beliebig wiederholt werden und schon eine kurze Unerreichbarkeit im In- ternethandel kann enorme Verluste erzeugen. Typischerweise werden daher Kopien der Daten erstellt und diese auf (geographisch) verschiedene Rechenzentren verteilt. Da diese Strategien der mehrfachen Replikation relativ kostenintensiv sind, wurden effizientere Verfahren entwickelt. In ihrem berühmten Artikel aus dem Jahre 1988 beschreiben Patterson, Gibson und Katz [6], wie man die Zuverlässigkeit von Sammlungen von Festplatten mit verschiedenen Techniken erhöhen kann. Die Berechnung redun- danter horizontaler Paritätsinformationen zur Erzeugung einer einfachen Fehlertoleranz hat sich daraus zum Standard für die folgenden Jahrzehnte entwickelt (RAID). Später kamen diverse Erweiterungen hinzu, die sogar den Ausfall von zwei Festplatten tolerieren können. All diese Verfahren stellen jedoch nur die Spezialfälle der umfassenden mathematischen Theorie der linearen fehlerkorrigierenden Codes dar. Im Rahmen dieser Theorie nutzt man die Werkzeuge der linearen Algebra um das Kodieren und das Dekodieren zu formalisieren. Als besonders leistungsstark haben sich die zyklischen linearen Co- des über endliche Körper erwiesen. Die weitverbreiteten Reed-Solomon-Codes [7] erlauben es, zu einer Menge an n Nachrichtensymbolen eine frei wählbare Menge an k Redundanzsymbolen zu berechnen. Die original Nachricht kann auch dann noch rekonstruiert werden, wenn bis zu k Symbole verloren ge- gangen sind. Eine Variante dieser Codes eignet sich besonders für den Einsatz in Speichersystemen [8]. Für die n Nachrichtensymbole d = d0;d1;:::;dn−1, müssen k Redundanzsymbole c = c1;c2;:::;ck−1 berechnet werden. Mit Hilfe einer speziellen Generatormatrix G = (gi; j), die von einer Vandermonde- Matrix abgeleitet wird, lässt sich die Kodierung durch eine Matrix-Vektor-Multiplikation darstellen: 0 1 d0 B d C B 1 C 0 1 B . C d0 B . C B C ! B d1 C Bd C d B C B n−1C GB . C = B
Recommended publications
  • Introduction to Coding Theory
    Introduction Why Coding Theory ? Introduction Info Info Noise ! Info »?# Sink to Source Heh ! Sink Heh ! Coding Theory Encoder Channel Decoder Samuel J. Lomonaco, Jr. Dept. of Comp. Sci. & Electrical Engineering Noisy Channels University of Maryland Baltimore County Baltimore, MD 21250 • Computer communication line Email: [email protected] WebPage: http://www.csee.umbc.edu/~lomonaco • Computer memory • Space channel • Telephone communication • Teacher/student channel Error Detecting Codes Applied to Memories Redundancy Input Encode Storage Detect Output 1 1 1 1 0 0 0 0 1 1 Error 1 Erasure 1 1 31=16+5 Code 1 0 ? 0 0 0 0 EVN THOUG LTTRS AR MSSNG 1 16 Info Bits 1 1 1 0 5 Red. Bits 0 0 0 0 0 0 0 FRM TH WRDS N THS SNTNCE 1 1 1 1 0 0 0 0 1 1 1 Double 1 IT CN B NDRSTD 0 0 0 0 1 Error 1 Error 1 Erasure 1 1 Detecting 1 0 ? 0 0 0 0 1 1 1 1 Error Control Coding 0 0 1 1 0 0 31% Redundancy 1 1 1 1 Error Correcting Codes Applied to Memories Input Encode Storage Correct Output Space Channel 1 1 1 1 0 0 0 0 Mariner Space Probe – Years B.C. ( ≤ 1964) 1 1 Error 1 1 • 1 31=16+5 Code 1 0 1 0 0 0 0 Eb/N0 Pe 16 Info Bits 1 1 1 1 -3 0 5 Red. Bits 0 0 0 6.8 db 10 0 0 0 0 -5 1 1 1 1 9.8 db 10 0 0 0 0 1 1 1 Single 1 0 0 0 0 Error 1 1 • Mariner Space Probe – Years A.C.
    [Show full text]
  • Reed-Solomon Encoding and Decoding
    Bachelor's Thesis Degree Programme in Information Technology 2011 León van de Pavert REED-SOLOMON ENCODING AND DECODING A Visual Representation i Bachelor's Thesis | Abstract Turku University of Applied Sciences Degree Programme in Information Technology Spring 2011 | 37 pages Instructor: Hazem Al-Bermanei León van de Pavert REED-SOLOMON ENCODING AND DECODING The capacity of a binary channel is increased by adding extra bits to this data. This improves the quality of digital data. The process of adding redundant bits is known as channel encod- ing. In many situations, errors are not distributed at random but occur in bursts. For example, scratches, dust or fingerprints on a compact disc (CD) introduce errors on neighbouring data bits. Cross-interleaved Reed-Solomon codes (CIRC) are particularly well-suited for detection and correction of burst errors and erasures. Interleaving redistributes the data over many blocks of code. The double encoding has the first code declaring erasures. The second code corrects them. The purpose of this thesis is to present Reed-Solomon error correction codes in relation to burst errors. In particular, this thesis visualises the mechanism of cross-interleaving and its ability to allow for detection and correction of burst errors. KEYWORDS: Coding theory, Reed-Solomon code, burst errors, cross-interleaving, compact disc ii ACKNOWLEDGEMENTS It is a pleasure to thank those who supported me making this thesis possible. I am thankful to my supervisor, Hazem Al-Bermanei, whose intricate know- ledge of coding theory inspired me, and whose lectures, encouragement, and support enabled me to develop an understanding of this subject.
    [Show full text]
  • The Binary Golay Code and the Leech Lattice
    The binary Golay code and the Leech lattice Recall from previous talks: Def 1: (linear code) A code C over a field F is called linear if the code contains any linear combinations of its codewords A k-dimensional linear code of length n with minimal Hamming distance d is said to be an [n, k, d]-code. Why are linear codes interesting? ● Error-correcting codes have a wide range of applications in telecommunication. ● A field where transmissions are particularly important is space probes, due to a combination of a harsh environment and cost restrictions. ● Linear codes were used for space-probes because they allowed for just-in-time encoding, as memory was error-prone and heavy. Space-probe example The Hamming weight enumerator Def 2: (weight of a codeword) The weight w(u) of a codeword u is the number of its nonzero coordinates. Def 3: (Hamming weight enumerator) The Hamming weight enumerator of C is the polynomial: n n−i i W C (X ,Y )=∑ Ai X Y i=0 where Ai is the number of codeword of weight i. Example (Example 2.1, [8]) For the binary Hamming code of length 7 the weight enumerator is given by: 7 4 3 3 4 7 W H (X ,Y )= X +7 X Y +7 X Y +Y Dual and doubly even codes Def 4: (dual code) For a code C we define the dual code C˚ to be the linear code of codewords orthogonal to all of C. Def 5: (doubly even code) A binary code C is called doubly even if the weights of all its codewords are divisible by 4.
    [Show full text]
  • Cyclic Codes
    Cyclic Codes Saravanan Vijayakumaran [email protected] Department of Electrical Engineering Indian Institute of Technology Bombay August 26, 2014 1 / 25 Cyclic Codes Definition A cyclic shift of a vector v0 v1 ··· vn−2 vn−1 is the vector vn−1 v0 v1 ··· vn−3 vn−2 . Definition An (n; k) linear block code C is a cyclic code if every cyclic shift of a codeword in C is also a codeword. Example Consider the (7; 4) code C with generator matrix 21 0 0 0 1 1 03 60 1 0 0 0 1 17 G = 6 7 40 0 1 0 1 1 15 0 0 0 1 1 0 1 2 / 25 Polynomial Representation of Vectors For every vector v = v0 v1 ··· vn−2 vn−1 there is a polynomial 2 n−1 v(X) = v0 + v1X + v2X + ··· + vn−1X Let v(i) be the vector resulting from i cyclic shifts on v (i) i−1 i n−1 v (X) = vn−i +vn−i+1X +···+vn−1X +v0X +···+vn−i−1X Example v = 1 0 0 1 1 0 1, v(X) = 1 + X 3 + X 4 + X 6 v(1) = 1 1 0 0 1 1 0, v(1)(X) = 1 + X + X 4 + X 5 v(2) = 0 1 1 0 0 1 1, v(2)(X) = X + X 2 + X 5 + X 6 3 / 25 Polynomial Representation of Vectors • Consider v(X) and v(1)(X) 2 n−1 v(X) = v0 + v1X + v2X + ··· + vn−1X (1) 2 3 n−2 v (X) = vn−1 + v0X + v1X + v2X + ··· + vn−2X h 2 n−2i = vn−1 + X v0 + v1X + v2X + ··· + vn−2X n h n−2 n−1i = vn−1(1 + X ) + X v0 + ··· + vn−2X + vn−1X n = vn−1(1 + X ) + Xv(X) • In general, v(X) and v(i)(X) are related by X i v(X) = v(i)(X) + q(X)(X n + 1) i−1 where q(X) = vn−i + vn−i+1X + ··· + vn−1X • v(i)(X) is the remainder when X i v(X) is divided by X n + 1 4 / 25 Hamming Code of Length 7 Codeword Code Polynomial 0000000 0 1000110 1 + X 4 + X 5 0100011 X + X 5 + X 6 1100101 1 + X + X 4 + X 6 0010111 X 2 + X 4 + X 5 + X 6 1010001 1 + X 2 + X 6 0110100 X + X 2 + X 4 1110010 1 + X + X 2 + X 5 0001101 X 3 + X 4 + X 6 1001011 1 + X 3 + X 5 + X 6 0101110 X + X 3 + X 4 + X 5 1101000 1 + X + X 3 0011010 X 2 + X 3 + X 5 1011100 1 + X 2 + X 3 + X 4 0111001 X + X 2 + X 3 + X 6 1111111 1 + X + X 2 + X 3 + X 4 + X 5 + X 6 5 / 25 Properties of Cyclic Codes (1) Theorem The nonzero code polynomial of minimum degree in a linear block code is unique.
    [Show full text]
  • ABSTRACT IDEMPOTENTS in CYCLIC CODES by Benjamin Brame April 9, 2012
    ABSTRACT IDEMPOTENTS IN CYCLIC CODES by Benjamin Brame April 9, 2012 Chair: Dr. Zachary Robinson Major Department: Mathematics Cyclic codes give us the most probable method by which we may detect and correct data transmission errors. These codes depend on the development of advanced math- ematical concepts. It is shown that cyclic codes, when viewed as vector subspaces of a vector space of some dimension n over some finite field F, can be approached as polynomials in a ring. This approach is made possible by the assumption that the set of codewords is invariant under cyclic shifts, which are linear transformations. Developing these codes seems to be equivalent to factoring the polynomial xn −x over F. Each factor then gives us a cyclic code of some dimension k over F. Constructing factorizations of xn − 1 is accomplished by using cyclotomic polyno- mials and idempotents of the code algebra. The use of these two concepts together allows us to find cyclic codes in Fn. Hence, the development of cyclic codes is a journey from codewords and codes to fields and rings and back to codes and codewords. IDEMPOTENTS IN CYCLIC CODES A Thesis Presented to The Faculty of the Department of Mathematics East Carolina University In Partial Fulfillment of the Requirements for the Degree Master of Arts in Mathematics by Benjamin Brame April 9, 2012 Copyright 2012, Benjamin Brame IDEMPOTENTS IN CYCLIC CODES by Benjamin Brame APPROVED BY: DIRECTOR OF THESIS: Dr. Zachary Robinson COMMITTEE MEMBER: Dr. Chris Jantzen COMMITTEE MEMBER: Dr. Heather Ries COMMITTEE MEMBER: Dr. David Pravica CHAIR OF THE DEPARTMENT OF MATHEMATICS: Dr.
    [Show full text]
  • Cyclic Codes, BCH Codes, RS Codes
    Classical Channel Coding, II Hamming Codes, Cyclic Codes, BCH Codes, RS Codes John MacLaren Walsh, Ph.D. ECET 602, Winter Quarter, 2013 1 References • Error Control Coding, 2nd Ed. Shu Lin and Daniel J. Costello, Jr. Pearson Prentice Hall, 2004. • Error Control Systems for Digital Communication and Storage, Prentice Hall, S. B. Wicker, 1995. 2 Example of Syndrome Decoder { Hamming Codes Recall that a (n; k) Hamming code is a perfect code with a m × 2m − 1 parity check matrix whose columns are all 2m −1 of the non-zero binary words of length m. This yields an especially simple syndrome decoder look up table for arg min wt(e) (1) ejeHT =rHT In particular, one selects the error vector to be the jth column of the identity matrix, where j is the column of the parity check matrix which is equal to the computed syndome s = rHT . 3 Cyclic Codes Cyclic codes are a subclass of linear block codes over GF (q) which have the property that every cyclic shift of a codeword is itself also a codeword. The ith cyclic shift of a vector v = (v0; v1; : : : ; vn−1) is (i) the vector v = (vn−i; vn−i+1; : : : ; vn−1; v0; v1; : : : ; vn−i−1). If we think of the elements of the vector as n−1 coefficients in the polynomial v(x) = v0 + v1x + ··· + vn−1x we observe that i i i+1 n−1 n n+i−1 x v(x) = v0x + v1x + ··· + vn−i−1x + vn−ix + ··· + vn+i−1x i−1 i n−1 = vn−i + vn−i+1x + ··· + vn−1x + v0x + ··· + vn−i−1x n n i−1 n +vn−i(x − 1) + vn−i+1x(x − 1) + ··· + vn+i−1x (x − 1) Hence, i n i−1 i i+1 n−1 (i) x v(x)mod(x − 1) = vn−i + vn−i+1x + ··· + vn−1x + v0x + v1x + ··· + vn−i−1x = v (x) (2) That is, the ith cyclic shift can alternatively be thought of as the operation v(i)(x) = xiv(x)mod(xn − 1).
    [Show full text]
  • Finding Cyclic Redundancy Check Polynomials for Multilevel Systems James A
    View metadata, citation and similar papers at core.ac.uk brought to you by CORE provided by University of Richmond University of Richmond UR Scholarship Repository Math and Computer Science Faculty Publications Math and Computer Science 10-1998 Finding Cyclic Redundancy Check Polynomials for Multilevel Systems James A. Davis University of Richmond, [email protected] Miranda Mowbray Simon Crouch Follow this and additional works at: http://scholarship.richmond.edu/mathcs-faculty-publications Part of the Mathematics Commons Recommended Citation Davis, James A., Miranda Mowbray, and Simon Crouch. "Finding Cyclic Redundancy Check Polynomials for Multilevel Systems." IEEE Transactions on Communications 46, no. 10 (October 1998): 1250-253. This Article is brought to you for free and open access by the Math and Computer Science at UR Scholarship Repository. It has been accepted for inclusion in Math and Computer Science Faculty Publications by an authorized administrator of UR Scholarship Repository. For more information, please contact [email protected]. 1250 IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 46, NO. 10, OCTOBER 1998 Finding Cyclic Redundancy Check Polynomials for ,Multilevel Systems James A. Davis, Miranda Mowbray, and Simon Crouch Abstract-This letter describes a technique for finding cyclic for using all nine levels rather than, effectively, just eight, redundancy check polynomials for systems for transmission over in that with nine levels there are more possible choices for symmetric channels which encode information in multiple voltage code words which do not encode data, but are special signals levels, so that the resulting redundancy check gives good error protection and is efficient to implement. The codes which we used for synchronization information or to signal the end of construct have a Hamming distance of 3 or 4.
    [Show full text]
  • Cyclic Redundancy Code Theory
    Appendix A CYCLIC REDUNDANCY CODE THEORY “It is hard to establish . who did what first. E. N. Gilbert of the Bell Telephone Laboratories derived much of linear theory a year or so earlier than either Zierler, Welch, or myself, . the first investigation of linear recurrence relations modulo p goes back as far as Lagrange, in the eighteenth century, and an excellent modern treatment was given ( . purely mathematical . ) by Marshall Hall in 1937.” — Solomon W. Golomb [264]. Cyclic redundancy code theory is described in this appendix because it is im- portant for built-in self-testing (BIST), and for Memory ROM testing. Cyclic re- dundancy codes have been extensively used for decades in the computer industry to guarantee correct recording of blocks on disk and magnetic tape drives. They also are extensively used for encoding and decoding communication channel burst errors, where a transient fault causes several adjacent data errors. Preliminaries. We express the binary vector as the polynomial As an example, 10111 is written as [36, 37]. The degree of the polynomial is the superscript of the highest non-zero term. Polynomial addition and subtraction arithmetic is performed similarly to integer arithmetic, except that the arithmetic is modulo 2. In this case, addition and subtraction will have the same effect. As an example, let (degree 3) and (degree 2.) We can describe the remainders of polynomials modulus a polynomial. Two polynomials r(x) and s(x) will be congruent modulus polynomial n(x), written 616 Appendix A. CYCLIC REDUNDANCY CODE THEORY as mod n(x) if a polynomial q(x) such that We find the residue (the remainder) by dividing r(x) by n(x).
    [Show full text]
  • Error-Correcting Codes
    Error-Correcting Codes Matej Boguszak Contents 1 Introduction 2 1.1 Why Coding? . 2 1.2 Error Detection and Correction . 3 1.3 Efficiency Considerations . 8 2 Linear Codes 11 2.1 Basic Concepts . 11 2.2 Encoding . 13 2.3 Decoding . 15 2.4 Hamming Codes . 20 3 Cyclic Codes 23 3.1 Polynomial Code Representation . 23 3.2 Encoding . 25 3.3 Decoding . 27 3.4 The Binary Golay Code . 30 3.5 Burst Errors and Fire Codes . 31 4 Code Performance 36 4.1 Implementation Issues . 36 4.2 Test Description and Results . 38 4.3 Conclusions . 43 A Glossary 46 B References 49 1 2 1 Introduction Error-correcting codes have been around for over 50 years now, yet many people might be surprised just how widespread their use is today. Most of the present data storage and transmission technologies would not be conceiv- able without them. But what exactly are error-correcting codes? This first chapter answers that question and explains why they are so useful. We also explain how they do what they are designed to do: correct errors. The last section explores the relationships between different types of codes as well the issue of why some codes are generally better than others. Many mathemat- ical details have been omitted in this chapter in order to focus on concepts; a more rigorous treatment of codes follows shortly in chapter 2. 1.1 Why Coding? Imagine Alice wants to send her friend Bob a message. Cryptography was developed to ensure that her message remains private and secure, even when it is sent over a non-secure communication channel, such as the internet.
    [Show full text]
  • Open Problems on Cyclic Codes∗∗
    Open problems on cyclic codes∗∗ Pascale Charpin ∗ Contents 1 Introduction 3 2 Different kinds of cyclic codes. 4 2.1 Notation . 5 2.2 Definitions . 6 2.3 Primitive and non primitive cyclic codes . 17 2.4 Affine-Invariant codes . 21 2.4.1 The poset of affine-invariant codes . 22 2.4.2 Affine-invariant codes as ideals of A . 28 3 On parameters of cyclic codes. 34 3.1 Codewords and Newton identities . 35 3.2 Special locator polynomials . 47 3.3 On the minimum distance of BCH codes . 59 3.4 On the weight enumerators . 65 3.4.1 The Reed-Muller codes . 67 3.4.2 On cyclic codes with two zeros . 71 3.4.3 On irreducible cyclic codes . 83 3.5 Automorphism groups of cyclic codes . 88 3.6 Are all cyclic codes asymptotically bad ? . 90 ∗INRIA, Projet CODES, Domaine de Voluceau, Rocquencourt BP 105, 78153 Le Ches- nay Cedex, FRANCE. e-mail: [email protected] 1 4 Related problems. 90 4.1 Some problems in cryptography . 91 4.2 Cyclic codes and Goppa codes . 96 4.3 On the weight enumerator of Preparata codes . 104 5 Conclusion 117 ∗∗ \Handbook of Coding Theory", Part 1: Algebraic Coding, chapter 11, V. S. Pless, W. C. Huffman, editors, R. A. Brualdi, assistant editor. Warning. The Handbook of Coding Theory was published in 1998. Some research problems, presented as Open problems in this chapter, are solved, or partially solved, today. 2 1 Introduction We do not intend to give an exhaustive account of the research problems on cyclic codes.
    [Show full text]
  • The Impact of Hamming Code and Cyclic Code on MPSK and MQAM Systems Over AWGN Channel: Performance Analysis
    Universal Journal of Electrical and Electronic Engineering 8(1): 9-15, 2021 http://www.hrpub.org DOI: 10.13189/ujeee.2021.080102 The Impact of Hamming Code and Cyclic Code on MPSK and MQAM Systems over AWGN Channel: Performance Analysis Amjad Abu-Baker∗, Khaled Bani-Hani, Firas Khasawneh, Abdullah Jaradat Department of Telecommunications Engineering, Yarmouk University, Irbid, Jordan Received January 10, 2021; Revised February 19, 2021; Accepted March 23, 2021 Cite This Paper in the following Citation Styles (a): [1] Amjad Abu-Baker, Khaled Bani-Hani, Firas Khasawneh, Abdullah Jaradat, ”The Impact of Hamming Code and Cyclic Code on MPSK and MQAM Systems over AWGN Channel: Performance Analysis,” Universal Journal of Electrical and Electronic Engineering, Vol.8, No.1, pp. 9-15, 2021. DOI: 10.13189/ujeee.2021.080102. (b): Amjad Abu-Baker, Khaled Bani-Hani, Firas Khasawneh, Abdullah Jaradat, (2021). The Impact of Hamming Code and Cyclic Code on MPSK and MQAM Systems over AWGN Channel: Performance Analysis. Universal Journal of Electrical and Electronic Engineering, 8(1), 9-15. DOI: 10.13189/ujeee.2021.080102. Copyright ©2021 by authors, all rights reserved. Authors agree that this article remains permanently open access under the terms of the Creative Commons Attribution License 4.0 International License Abstract Designing efficient and reliable data transmission 1 Introduction systems is an attractive area of research as it has become an essential requirement for the cutting-edge technologies. Error Recently, we have witnessed a great demand on the appli- Control Coding (ECC) is a method used to control errors in cations of the internet of things (IoT) where providing reliable a message to achieve reliable delivery through unreliable or and fast communication systems become essential requirement noisy channel.
    [Show full text]
  • Part III Cyclic Codes
    Part III Cyclic codes CHAPTER 3: CYCLIC CODES, CHANNEL CODING, LIST DECODING Cyclic codes are very special linear codes. They are of large interest and importance for several reasons: They posses a rich algebraic structure that can be utilized in a variety of ways. They have extremely concise specifications. Their encodings can be efficiently implemented using simple machinery - shift registers. Many of the practically very important codes are cyclic. Channel codes are used to encode streams of data (bits). Some of them, as Concatenated codes and Turbo codes, reach theoretical Shannon bound concerning efficiency, and are currently used very often in practice. List decoding is a new decoding mode capable to deal, in an approximate way, with cases of many errors, and in such a case to perform better than classical unique decoding. Locally decodable codes can be seen as theoretical extreme of coding theeory with deep theoretical implications. prof. Jozef Gruska IV054 3. Cyclic codes 2/82 IMPORTANT NOTE In order to specify a non-linear binary code with 2k codewords of length n one may need to write down 2k codewords of length n. In order to specify a linear binary code of the dimension k with 2k codewords of length n it is sufficient to write down k codewords of length n. In order to specify a binary cyclic code with 2k codewords of length n it is sufficient to write down 1 codeword of length n. prof. Jozef Gruska IV054 3. Cyclic codes 3/82 BASIC DEFINITION AND EXAMPLES Definition A code C is cyclic if (i) C is a linear code; (ii) any cyclic shift of a codeword is also a codeword, i.e.
    [Show full text]