Theory and Practice of Codes with Locality


Theory and Practice of Codes with Locality: Introduction

July 10, 2016


Information Era

• We live in the Information Era

• Big Data players: Facebook, Google, MSFT, Amazon, Alibaba, Dropbox, etc.

• Node failures are the norm

(Figure: a cluster of machines running Hadoop at Yahoo!)


State-of-the-Art Coding technique

RAID: Redundant Array of Independent Disks

RAID 1 – Replication (typically 3x)

• Simple implementation!
• High availability of the information
• Can tolerate any 2 disk failures
• Widely used in Hadoop and many other systems
• Storage overhead of 200%!

RAID 6

• MDS codes with two parities

[n, k] MDS codes

• Can tolerate any n − k disk failures
• FB uses a (14, 10) RS code
• Poor handling of single disk failures (the Repair Problem)
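To put these options side by side, here is a small illustrative Python sketch (the helper functions are ad hoc, not tied to any particular system) comparing the storage overhead and worst-case erasure tolerance of 3x replication against an [n, k] MDS code such as the (14, 10) RS code mentioned above.

```python
# Storage overhead and erasure tolerance: 3x replication vs. an [n, k] MDS code.
# Illustrative sketch only; the numbers follow the definitions on this slide.

def replication(copies: int):
    overhead = (copies - 1) * 100      # extra storage as % of the data size
    tolerance = copies - 1             # survives any (copies - 1) disk failures
    return overhead, tolerance

def mds(n: int, k: int):
    overhead = (n - k) / k * 100       # extra storage as % of the data size
    tolerance = n - k                  # survives any (n - k) disk failures
    return overhead, tolerance

print("3x replication: %d%% overhead, tolerates any %d failures" % replication(3))
print("(14,10) MDS/RS: %.0f%% overhead, tolerates any %d failures" % mds(14, 10))
# 3x replication: 200% overhead, tolerates any 2 failures
# (14,10) MDS/RS: 40% overhead, tolerates any 4 failures
```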


Limitations of Reed-Solomon codes

Example: [14, 10] RS code

• A disk is lost; a repair job starts to rebuild it

• Transmit information from 10 disks to recover the one lost disk

• Generates 10x more traffic than replication for the recovery of one disk

• If a large portion of the data is RS-coded ⇒ saturation of the network

• Goal: Construct efficient codes with a “good” repair process

(Figure: the (14, 10) RS code running at FB, with data blocks 1–10 and parity blocks P1–P4.)
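To see why repairing a single symbol is expensive, the sketch below (an illustrative toy, not Facebook's implementation) encodes k = 10 symbols with a Reed-Solomon code over a small prime field and repairs one erased symbol by Lagrange interpolation, which requires reading k = 10 surviving symbols, versus the single read a replica would need.

```python
# Repairing one lost symbol of an [n, k] Reed-Solomon code (illustrative sketch).
# Work over the prime field GF(p); p = 17 gives more than n = 14 evaluation points.

p, n, k = 17, 14, 10

def poly_eval(coeffs, x):
    """Evaluate a polynomial (coefficients, lowest degree first) at x mod p."""
    y = 0
    for c in reversed(coeffs):
        y = (y * x + c) % p
    return y

def interpolate_at(points, x0):
    """Lagrange-interpolate the degree < k polynomial through `points`, evaluated at x0."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if j != i:
                num = num * (x0 - xj) % p
                den = den * (xi - xj) % p
        total = (total + yi * num * pow(den, p - 2, p)) % p   # inverse via Fermat
    return total

data = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]              # k = 10 message symbols
xs = list(range(1, n + 1))                          # n = 14 distinct evaluation points
codeword = [poly_eval(data, x) for x in xs]         # one symbol per disk

lost = 6                                            # say disk 7 (index 6) fails
helpers = [(xs[i], codeword[i]) for i in range(n) if i != lost][:k]   # read k = 10 symbols
assert interpolate_at(helpers, xs[lost]) == codeword[lost]
print("repaired one symbol by reading", len(helpers), "surviving symbols")
```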

How “good” is the repair process?

• Total bandwidth during the repair
  • Regenerating Codes - seminal paper by Dimakis, Godfrey, Wu, Wainwright and Ramchandran
  • Model allows connecting “many” nodes

• Total number of disks participating in the repair
  • The straggler problem
  • Locally Recoverable (LRC) Codes


Codes with locality: Plan

• LRC codes - Basics (Tamo)

• LRC codes - Constructions (Barg)

• LRC codes in practice (Barg)

• Break

• LRC codes - Bounds (Barg)

• Maximally Recoverable Codes (Tamo)

• LRC codes on graphs (Tamo)


Locally Recoverable Codes - Definition

(n, k, r) LRC Code

• Takes k blocks (symbols) → produces n blocks

• An erasure has occurred

• Any symbol i has a recovery set Ri of r other symbols, r ≪ k

• Clearly 1 ≤ r ≤ k

(Figure: a codeword of n blocks labeled 1, ..., k − 1, k, k + 1, k + 2, ..., n; braces mark the k information blocks and a recovery set of size r.)


Early references on LRC codes

• J. Han and L. Lastras-Montano, ISIT 2007;

• C. Huang, M. Chen, and J. Li, Symp. Networks App. 2007;

• F. Oggier and A. Datta 2013 - Survey on codes for distributed storage systems;

• P. Gopalan, C. Huang, H. Simitci, and S. Yekhanin, IEEE Trans. Inf. Theory, Nov. 2012.


Parameters of LRC codes

Let C be an (n, k, r) LRC code

• Assume r|k and (r + 1)|n

• What are the best possible rate and minimum distance?

• The rate is bounded by k/n ≤ r/(r + 1)

Proof:

• There exist at most nr/(r + 1) coordinates that determine the codeword completely

• This follows by iterating the following steps:

1. Cost: expose the values of the coordinates in a recovery set Ri, |Ri| ≤ r

2. Free: the value of the i-th coordinate

3. After exposing at most nr/(r + 1) coordinates, the entire codeword is determined

• Hence k ≤ nr/(r + 1), i.e., k/n ≤ r/(r + 1)


• The bound is tight (even over F2): partition the k information bits into k/r sets of size r and add a parity-check bit to each set (see the sketch below)


• The minimum distance is bounded by

d ≤ n − k − ⌈k/r⌉ + 2

[Gopalan, Huang, Simitci, Yekhanin 12] [Papailiopoulos, Dimakis 12]

Remarks:

• Smaller locality ⇒ lower failure resilience

• This generalizes the Singleton bound (recovered at r = k)

• An optimal (n, k, r) LRC code achieves the bound with equality

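As promised, here is a minimal Python sketch of the binary code that meets the rate bound: k bits split into k/r groups of size r, with one parity bit per group. The parameters k and r below are arbitrary illustrative choices; the sketch checks the rate r/(r + 1), repairs every single erasure locally, and confirms that its minimum distance 2 also meets the bound d ≤ n − k − ⌈k/r⌉ + 2.

```python
# Rate-optimal binary LRC from the slide: k/r groups of r data bits, one parity per group.
# Illustrative sketch only; k and r are arbitrary choices with r | k.
from itertools import product

def encode(bits, r):
    """Append one parity bit (XOR) to every group of r data bits."""
    assert len(bits) % r == 0
    out = []
    for g in range(0, len(bits), r):
        group = bits[g:g + r]
        out.extend(group + [sum(group) % 2])
    return out

def repair(codeword, pos, r):
    """Recover the erased symbol at `pos` from the r other symbols of its group."""
    g = (pos // (r + 1)) * (r + 1)
    group = [codeword[i] for i in range(g, g + r + 1) if i != pos]
    return sum(group) % 2                   # each group symbol is the XOR of the others

k, r = 4, 2
n = k + k // r
print("rate =", k / n, "  bound r/(r+1) =", r / (r + 1))

# every single erasure is locally repairable
for bits in product([0, 1], repeat=k):
    cw = encode(list(bits), r)
    assert all(repair(cw, i, r) == cw[i] for i in range(n))

# minimum distance matches d <= n - k - ceil(k/r) + 2 (= 2 here)
weights = [sum(encode(list(b), r)) for b in product([0, 1], repeat=k) if any(b)]
print("min distance =", min(weights), "  bound =", n - k - (k + r - 1) // r + 2)
```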

Constructing optimal LRC codes

• A carefully constructed random generator matrix gives an optimal LRC code

1. Not explicit

2. Field size is superpolynomial in the length (is it truly necessary?)

• Optimal ((r + 1)⌈k/r⌉, k, r) LRC codes [Prakash, Kamath, Lalitha, and Kumar 12]

• Explicit constructions [Rawat, Koyluoglu, Silberstein, Vishwanath 14; Gopalan, Huang, Jenkins, Yekhanin 14; Tamo, Papailiopoulos, Dimakis 14]

1. Any n, k, r

2. Field size is superpolynomial


Optimal LRC codes - Easy cases

• r = k

1. d ≤ n − k + 1

2. An (n, k) RS code is an optimal (n, k, k) LRC code

3. |F| = O(n)

• r = 1

1. d ≤ 2(n/2 − k + 1) (checked against the general bound below)

2. Duplicating an (n/2, k) RS code gives an optimal (n, k, 1) LRC code

3. |F| = O(n)

• Q: What happens for 1 < r < k?

• Q: Can the optimal codes for r = 1 and r = k be generalized to arbitrary r?
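For the r = 1 case, the distance bound from the previous slide specializes as follows (a one-line check; the duplicated code attains it because every nonzero codeword of the (n/2, k) RS code has weight at least n/2 − k + 1 and each symbol appears twice):

```latex
% r = 1: the bound d <= n - k - ceil(k/r) + 2 becomes
\[
  d \;\le\; n - k - \left\lceil \frac{k}{1} \right\rceil + 2
    \;=\; n - 2k + 2
    \;=\; 2\left(\frac{n}{2} - k + 1\right),
\]
% which is exactly twice the minimum distance of the underlying (n/2, k) RS code.
```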


Intuition behind the LRC construction

(a0, a1, a2, a3) → f(x) = a0 + a1 x + a2 x^2 + a3 x^3 → (f(x1), f(x2), ..., f(xn))

Only two points suffice to recover the lost point. (A cubic is only determined by four points in general; the construction on the next slides arranges the evaluation points in small groups on which f restricts to a polynomial of degree at most r − 1, a line when r = 2, so two points within the group suffice.)

(Figure: the evaluations f(x1), f(x2), ..., f(xn) plotted above the points x1, x2, ..., xn.)


Optimal (n, k, r) LRC code construction

Ingredients:

1. R1, ..., R_{n/(r+1)} are disjoint subsets of the field F, s.t. |Ri| = r + 1

2. g(x) ∈ F[x] is a polynomial s.t.

2.1 deg(g(x)) = r + 1

2.2 g(x) is constant on each subset Ri: g(α) = g(β) for α, β ∈ Ri

Encoding: Given k information symbols a_{i,j}, i = 0, ..., r − 1, j = 0, ..., k/r − 1

1. Define the polynomial

f(x) = ∑_{i=0}^{r−1} x^i ∑_{j=0}^{k/r−1} a_{i,j} g(x)^j

2. Store the length-n vector (f(α) : α ∈ ∪i Ri)

Theorem: This is an optimal (n, k, r) LRC code over F

I. Tamo and A. Barg, "A Family of Optimal Locally Recoverable Codes," IEEE Trans. Inf. Theory, 2014
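To make the recipe concrete, here is a minimal Python sketch of the construction for one small toy instance of my own choosing (illustrative parameters, not prescribed by the slide): F = GF(13), r = 2, k = 4, g(x) = x^3, with the recovery sets Ri taken as the multiplicative cosets of {1, 3, 9}, on which x^3 is constant.

```python
# Toy instance of the LRC construction on this slide (illustrative parameters only):
# F = GF(13), r = 2, k = 4, g(x) = x^3.
p, r, k = 13, 2, 4

# Recovery sets: multiplicative cosets of {1, 3, 9} (the cube roots of unity mod 13).
# g(x) = x^3 is constant on each coset, and |Ri| = r + 1 = 3.
recovery_sets = [[1, 3, 9], [2, 6, 5], [4, 12, 10], [7, 8, 11]]
n = sum(len(R) for R in recovery_sets)          # n = 12

def g(x):
    return pow(x, r + 1, p)                     # deg g = r + 1, constant on each Ri

def f(a, x):
    """Encoding polynomial f(x) = sum_i x^i sum_j a[i][j] g(x)^j (mod p)."""
    return sum(pow(x, i, p) * a[i][j] * pow(g(x), j, p)
               for i in range(r) for j in range(k // r)) % p

# k = 4 information symbols arranged as a[i][j], i = 0..r-1, j = 0..k/r-1
a = [[5, 7],
     [1, 11]]

codeword = [f(a, alpha) for R in recovery_sets for alpha in R]
print("codeword of length", len(codeword), ":", codeword)

# sanity check: g really is constant on every recovery set
assert all(len({g(alpha) for alpha in R}) == 1 for R in recovery_sets)
```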


Optimal (n, k, r) LRC code construction - Cont’d

• g(x) is constant on the sets Ri, |Ri| = r + 1

• f(x) = ∑_{i=0}^{r−1} x^i ∑_{j=0}^{k/r−1} a_{i,j} g(x)^j = ∑_{i=0}^{r−1} x^i fi(x)

Locality: Recover f(α) = ? for α ∈ R1

1. Define fi(x) = ∑_{j=0}^{k/r−1} a_{i,j} g(x)^j

2. fi(x) is constant on the sets Rj

3. Define δ(x) = ∑_{i=0}^{r−1} x^i fi(R1)

Claim: f(β) = δ(β) for all β ∈ R1 (in particular f(α) = δ(α))

Proof: f(β) = ∑_{i=0}^{r−1} β^i fi(β) = ∑_{i=0}^{r−1} β^i fi(R1) = δ(β)

1. deg(δ(x)) ≤ r − 1

2. r points on δ(x) suffice to recover δ(x)

3. Read the r values {δ(β) = f(β) : β ∈ R1 \ {α}}
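Continuing the toy instance from the previous sketch (F = GF(13), r = 2, k = 4, g(x) = x^3, R1 = {1, 3, 9}; all parameters are my illustrative choices), the following self-contained Python sketch erases one evaluation f(α) with α ∈ R1 and recovers it by interpolating the degree ≤ r − 1 polynomial δ through the r surviving points of R1.

```python
# Local repair for the toy (12, 4, 2) LRC from the previous sketch: recover f(alpha),
# alpha in R1, from the r = 2 other evaluations in R1 via the degree-(r-1) polynomial
# delta(x). Illustrative only.
p, r, k = 13, 2, 4
R1 = [1, 3, 9]                                   # one recovery set (|R1| = r + 1)

def f(x):                                        # same toy encoding polynomial as before,
    a = [[5, 7], [1, 11]]                        # with g(x) = x^3
    return sum(pow(x, i, p) * a[i][j] * pow(x, 3 * j, p)
               for i in range(r) for j in range(k // r)) % p

def delta_at(points, x0):
    """Lagrange interpolation of the degree <= r-1 polynomial through `points`, at x0."""
    total = 0
    for xi, yi in points:
        term = yi
        for xj, _ in points:
            if xj != xi:
                term = term * (x0 - xj) * pow(xi - xj, p - 2, p) % p
        total = (total + term) % p
    return total

alpha = 3                                        # the erased coordinate, alpha in R1
survivors = [(b, f(b)) for b in R1 if b != alpha]    # read only r = 2 local symbols
assert delta_at(survivors, alpha) == f(alpha)
print("recovered f(%d) = %d from %d local symbols" % (alpha, f(alpha), len(survivors)))
```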

Theory and Practice of Codes with Locality July 10, 2016 14 / 22

Optimal (n, k, r) LRC code construction - Cont’d

• g(x) is constant on the sets Ri, |Ri| = r + 1

• f (x) =∑r−1

i=0 xi∑ kr−1j=0 ai,jg(x)j

=∑r−1

i=0 xifi(x)

Locality: Recover f (α) =? for α ∈ R1

1. Define fi(x) =∑ k

r−1j=0 ai,jg(x)j

2. fi(x) is constant on the sets Rj

3. Define δ(x) =∑r−1

i=0 xifi(R1)

Claim:

f (β) = δ(β) for all β ∈ R1 (In particular f (α) = δ(α))

Proof:

f (β) =∑r−1

i=0 βifi(β) =

∑r−1i=0 β

ifi(R1) = δ(β)

1. deg(δ(x)) ≤ r − 1

2. r points on δ(x) will suffice to recover δ(x)

3. Read the r values {δ(β) = f (β) : β ∈ R1\α}

Theory and Practice of Codes with Locality July 10, 2016 14 / 22

Optimal (n, k, r) LRC code construction - Cont’d

• g(x) is constant on the sets Ri, |Ri| = r + 1

• f (x) =∑r−1

i=0 xi∑ kr−1j=0 ai,jg(x)j

=∑r−1

i=0 xifi(x)

Locality: Recover f (α) =? for α ∈ R1

1. Define fi(x) =∑ k

r−1j=0 ai,jg(x)j

2. fi(x) is constant on the sets Rj

3. Define δ(x) =∑r−1

i=0 xifi(R1)

Claim:

f (β) = δ(β) for all β ∈ R1 (In particular f (α) = δ(α))

Proof:

f (β) =∑r−1

i=0 βifi(β) =

∑r−1i=0 β

ifi(R1) = δ(β)

1. deg(δ(x)) ≤ r − 1

2. r points on δ(x) will suffice to recover δ(x)

3. Read the r values {δ(β) = f (β) : β ∈ R1\α}

Theory and Practice of Codes with Locality July 10, 2016 14 / 22

Optimal (n, k, r) LRC code construction - Cont’d

• g(x) is constant on the sets Ri, |Ri| = r + 1

• f (x) =∑r−1

i=0 xi∑ kr−1j=0 ai,jg(x)j

=∑r−1

i=0 xifi(x)

Locality: Recover f (α) =? for α ∈ R1

1. Define fi(x) =∑ k

r−1j=0 ai,jg(x)j

2. fi(x) is constant on the sets Rj

3. Define δ(x) =∑r−1

i=0 xifi(R1)

Claim: f (β) = δ(β) for all β ∈ R1 (In particular f (α) = δ(α))

Proof:

f (β) =∑r−1

i=0 βifi(β) =

∑r−1i=0 β

ifi(R1) = δ(β)

1. deg(δ(x)) ≤ r − 1

2. r points on δ(x) will suffice to recover δ(x)

3. Read the r values {δ(β) = f (β) : β ∈ R1\α}

Theory and Practice of Codes with Locality July 10, 2016 14 / 22

Optimal (n, k, r) LRC code construction - Cont’d

• g(x) is constant on the sets Ri, |Ri| = r + 1

• f (x) =∑r−1

i=0 xi∑ kr−1j=0 ai,jg(x)j

=∑r−1

i=0 xifi(x)

Locality: Recover f (α) =? for α ∈ R1

1. Define fi(x) =∑ k

r−1j=0 ai,jg(x)j

2. fi(x) is constant on the sets Rj

3. Define δ(x) =∑r−1

i=0 xifi(R1)

Claim: f (β) = δ(β) for all β ∈ R1 (In particular f (α) = δ(α))

Proof:

f (β) =∑r−1

i=0 βifi(β) =

∑r−1i=0 β

ifi(R1) = δ(β)

1. deg(δ(x)) ≤ r − 1

2. r points on δ(x) will suffice to recover δ(x)

3. Read the r values {δ(β) = f (β) : β ∈ R1\α}

Theory and Practice of Codes with Locality July 10, 2016 14 / 22


Optimal (n, k, r) LRC code construction - Cont’d

• g(x) is constant on the sets Ri, |Ri| = r + 1
• f(x) = ∑_{i=0}^{r−1} x^i ∑_{j=0}^{k/r−1} a_{i,j} g(x)^j

Minimum Distance:

1. Minimum distance = minimum weight
2. deg(f(x)) ≤ r − 1 + (r + 1)(k/r − 1) = k + k/r − 2
3. ⟹ d ≥ n − (k + k/r − 2)
4. Equality follows since the code is an (n, k, r) LRC code

Theory and Practice of Codes with Locality July 10, 2016 15 / 22



Example: (9, 4, 2) LRC over F13

• R1 = {1, 3, 9}, R2 = {2, 6, 5}, R3 = {4, 12, 10}
• The polynomial g(x) = x^3 is constant on each set Ri
• (a_{0,0}, a_{0,1}, a_{1,0}, a_{1,1}) = (1, 1, 1, 1) → f(x) = ∑_{i=0}^{1} x^i ∑_{j=0}^{1} a_{i,j}(x^3)^j = 1 + x + x^3 + x^4
• Store the evaluation of f(x) at ∪_i Ri:
  (f(1), f(3), f(9), f(2), f(6), f(5), f(4), f(12), f(10)) = (4, 8, 7, 1, 11, 2, 0, 0, 0)
• Local recovery of f(1):
  • deg δ(x) ≤ 1
  • Read f(3) = δ(3) = 8 and f(9) = δ(9) = 7
  • δ(x) = 2x + 2, so f(1) = δ(1) = 4

Theory and Practice of Codes with Locality July 10, 2016 16 / 22
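To make the repair step concrete, here is a minimal Python sketch (mine, not part of the slides; it only re-derives the numbers above): it encodes the message (1, 1, 1, 1) into f(x) = 1 + x + x^3 + x^4, stores the nine evaluations, and recovers the erased value f(1) from the other two symbols of R1 by fitting the degree-1 polynomial δ(x).

```python
# Sketch of the (9, 4, 2) LRC over F13 from the example above.
p = 13
R = [[1, 3, 9], [2, 6, 5], [4, 12, 10]]           # recovery sets
a = {(0, 0): 1, (0, 1): 1, (1, 0): 1, (1, 1): 1}  # message a_{i,j}

def f(x):
    # f(x) = sum_i x^i * sum_j a_{i,j} * (x^3)^j  =  1 + x + x^3 + x^4
    return sum(a[(i, j)] * pow(x, i + 3 * j, p) for i in range(2) for j in range(2)) % p

codeword = {x: f(x) for Ri in R for x in Ri}
print(codeword)          # the values (4, 8, 7, 1, 11, 2, 0, 0, 0), as on the slide

# Local recovery of f(1): read f(3), f(9), interpolate the line delta(x)
# through (3, f(3)) and (9, f(9)), and evaluate it at x = 1.
x1, x2 = 3, 9
y1, y2 = codeword[x1], codeword[x2]
slope = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p      # modular inverse needs Python >= 3.8
delta = lambda x: (y1 + slope * (x - x1)) % p
print(delta(1), f(1))    # 4 4 -> the erased symbol is recovered
```

The same two reads repair any symbol of R1; symbols of R2 and R3 are repaired within their own sets.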

Optimal (n, k, r) LRC code construction

Ingredients:

1. R1, . . . , R_{n/(r+1)} are disjoint subsets of the field F, s.t. |Ri| = r + 1
2. A polynomial g(x) of degree r + 1
3. g(x) is constant on each subset Ri: g(α) = g(β) for α, β ∈ Ri

Claim: Let H be a subgroup of F* or F+; then the annihilator polynomial of H

g(x) = ∏_{h∈H} (x − h)

is constant on each coset of H

Theory and Practice of Codes with Locality July 10, 2016 17 / 22
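A quick numerical illustration of the claim (a sketch of mine, not from the slides): take the order-3 subgroup H = {1, 3, 9} of F13*, form its annihilator g(x) = (x − 1)(x − 3)(x − 9) = x^3 − 1, and check that g takes a single value on each coset of H.

```python
p = 13
H = [1, 3, 9]                          # multiplicative subgroup of order 3 in F13*

def g(x):
    # annihilator polynomial of H, which equals x^3 - 1 over F13
    out = 1
    for h in H:
        out = out * (x - h) % p
    return out

cosets = [[c * h % p for h in H] for c in (1, 2, 4)]   # {1,3,9}, {2,6,5}, {4,12,10}
for coset in cosets:
    print(sorted(coset), {g(x) for x in coset})        # one value per coset: {0}, {7}, {11}
```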


Availability problem

“Hot data” accessed simultaneously by a very large number of users

Theory and Practice of Codes with Locality July 10, 2016 18 / 22


Availability problem - Cont’d

• Main advantage of replication - High availability for hot data

• Goal: A code with high availability and small overhead

• Solution: LRC code with multiple disjoint recovery sets (codes with availability)

Theory and Practice of Codes with Locality July 10, 2016 19 / 22



Idea behind constructing codes with multiple recovery sets

[Figure: a subspace of polynomials that provide locality r1 and a subspace of polynomials that provide locality r2; codes with two recovery sets come from polynomials lying in both. Ex: the space of polynomials {a0 + a1x + a3x^3 + a4x^4 : ai ∈ F13}]

Theory and Practice of Codes with Locality July 10, 2016 20 / 22


LRC code with 2 recovery sets - Example

• Example: (12, 6, {2, 3}) over F13

1. Encoding polynomial: (a0, a1, a4, a6, a9, a10) ↦ f(x) = a0 + a1x + a4x^4 + a6x^6 + a9x^9 + a10x^10
2. Evaluated points: F13*

• (18, 6, {1, 1}) 3x Replication: overhead of 200%; can tolerate any 2 disk failures; recovery sets are the two other copies of each symbol (locality {1, 1})
• (12, 6, {2, 3}) LRC: overhead of 100%; can tolerate any 3 disk failures; each symbol has two disjoint recovery sets, one of size 2 and one of size 3

Theory and Practice of Codes with Locality July 10, 2016 21 / 22
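Here is a sketch of how both repairs could go (my own illustration; the slides do not spell out the recovery sets, so I take them to be the cosets through the erased point of the order-3 subgroup {1, 3, 9} and the order-4 subgroup {1, 5, 12, 8} of F13*, which is consistent with the exponent set {0, 1, 4, 6, 9, 10} avoiding 2 mod 3 and 3 mod 4):

```python
p = 13
exps = [0, 1, 4, 6, 9, 10]                   # exponents of the encoding polynomial
msg = dict(zip(exps, [3, 1, 4, 1, 5, 9]))    # an arbitrary message (a0, a1, a4, a6, a9, a10)

def f(x):
    return sum(c * pow(x, e, p) for e, c in msg.items()) % p

code = {x: f(x) for x in range(1, 13)}       # evaluations at F13*

def lagrange(points, x0):
    # Lagrange interpolation over F13, evaluated at x0
    total = 0
    for xi, yi in points:
        li = 1
        for xj, _ in points:
            if xj != xi:
                li = li * (x0 - xj) % p * pow((xi - xj) % p, -1, p) % p
        total = (total + yi * li) % p
    return total

erased = 2                                   # pretend the symbol f(2) is lost
set1 = [erased * h % p for h in (3, 9)]      # coset of {1,3,9} minus the point: size 2
set2 = [erased * h % p for h in (5, 12, 8)]  # coset of {1,5,12,8} minus the point: size 3
r1 = lagrange([(x, code[x]) for x in set1], erased)   # degree-1 repair
r2 = lagrange([(x, code[x]) for x in set2], erased)   # degree-2 repair
print(code[erased], r1, r2)                  # the two disjoint repairs agree with the truth
```

Restricted to a coset of {1, 3, 9}, f has degree ≤ 1 in x (x^3 is constant there and no exponent is ≡ 2 mod 3), and on a coset of {1, 5, 12, 8} it has degree ≤ 2, which is exactly what the two interpolations use.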

Codes with locality: Plan

• LRC codes - Basics (Tamo)

• LRC codes - Constructions (Barg)

• LRC codes in practice (Barg)

• Break

• LRC codes - Bounds (Barg)

• Maximally Recoverable Codes (Tamo)

• LRC codes on graphs (Tamo)

Theory and Practice of Codes with Locality July 10, 2016 22 / 22

Constructions of LRC codes

Constructions of LRC codes 1 / 25

From RS codes to other related families

In this part our goal is to construct new families of LRC codes

Advantages of the LRC RS construction
• small alphabet q ≈ n
• large distance
• easy processing

Limitations
• n ≤ q

Fixed q, large n? Possibilities: Reduce q or Increase n
• Take a subfield subcode of an LRC RS code (e.g., a binary code)
• The analysis is simplified if we take a cyclic LRC RS code
• Take a large subset of sampling points (points on an algebraic curve)

Constructions of LRC codes 2 / 25


Cyclic LRC codes

Cyclic codes form a classic topic in coding theory: BCH codes, RM codes, many other well-studied code families are cyclic.

Cyclic q-ary LRC codes

Recall the LRC RS construction.
Data: a subset of points P = (P1, . . . , Pn) ⊂ Fq; a linear space of polynomials V = ⟨x^i b(x)^j⟩, dim V = k, with b(x) constant on the (r + 1)-subblocks of the set P;

V → C
f_a ↦ ev_P(f_a) = (f_a(P1), . . . , f_a(Pn))

Additional assumptions: n | (q − 1); (r + 1) | n; r | k

P = {1, α, . . . , α^{n−1}};  V = ⟨ ∑_{i=0, i ≢ r mod (r+1)}^{(k/r)(r+1)−2} a_i x^i : a_i ∈ Fq ⟩   (1)

(α an nth root of unity)

We obtain a cyclic code C:

(c0, c1, c2, . . . , c_{n−1}) ∈ C ⇒ (c_{n−1}, c0, c1, . . . , c_{n−2}) ∈ C

Constructions of LRC codes 3 / 25
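A sketch (mine) over F13 that checks the two properties numerically for n = 12, r = 2, k = 6, α = 2: a cyclic shift of a codeword is again a codeword, and every symbol can be repaired from the other two symbols in its coset of the subgroup {1, 3, 9}.

```python
p, n, r = 13, 12, 2
alpha = 2                                         # a primitive 12th root of unity in F13
exps = [i for i in range(8) if i % (r + 1) != r]  # exponents {0, 1, 3, 4, 6, 7}
a = dict(zip(exps, [7, 2, 0, 5, 1, 3]))           # an arbitrary message

def ev(coeffs, x):
    return sum(c * pow(x, e, p) for e, c in coeffs.items()) % p

P = [pow(alpha, j, p) for j in range(n)]          # evaluation points 1, alpha, alpha^2, ...
c = [ev(a, x) for x in P]

# Cyclic shift: (c_{n-1}, c_0, ..., c_{n-2}) is the evaluation of f(x/alpha),
# whose coefficient on x^i is a_i * alpha^{-i}, so the exponent set is unchanged.
shifted = [c[-1]] + c[:-1]
a_shift = {e: ai * pow(alpha, -e, p) % p for e, ai in a.items()}
assert shifted == [ev(a_shift, x) for x in P]

# Locality r = 2: on a coset x0*{1, 3, 9} the function f is a line in x,
# so the symbol at x0 is a degree-1 interpolation of the other two symbols.
x0 = P[5]
xs = [x0 * 3 % p, x0 * 9 % p]
ys = [c[P.index(x)] for x in xs]
slope = (ys[1] - ys[0]) * pow((xs[1] - xs[0]) % p, -1, p) % p
assert (ys[0] + slope * (x0 - xs[0])) % p == c[5]
print("cyclic shift and local repair both check out")
```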


Zeros of a cyclic LRC code

Let C be a q-ary cyclic code

(c0, c1, . . . , c_{n−1}) ∈ C → c(x) = ∑_{i=0}^{n−1} c_i x^i

C is an ideal in the ring Fq[x]/(x^n − 1); C = ⟨g(x)⟩, where g(x) is the generator polynomial of C

Definition: Zeros of the code C = zeros of g(x)

Generator matrix: G has rows (1, α^i, α^{2i}, . . . , α^{(n−1)i}), i = 0, 1, . . . , k − 1

Parity-check matrix: H has rows (1, α^m, α^{2m}, . . . , α^{(n−1)m}), m = 1, 2, . . . , n − k

C is a cyclic code with zeros α, α^2, . . . , α^{n−k}

Constructions of LRC codes 4 / 25
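As a small sanity check (a sketch, not from the slides), over F13 with n = 12, k = 6 and α = 2 the evaluation-style generator matrix and the parity-check matrix built from the zeros α, . . . , α^{n−k} are indeed orthogonal:

```python
p, n, k, alpha = 13, 12, 6, 2

G = [[pow(alpha, i * j, p) for j in range(n)] for i in range(k)]             # rows i = 0..k-1
H = [[pow(alpha, m * j, p) for j in range(n)] for m in range(1, n - k + 1)]  # zeros alpha^1..alpha^{n-k}

for grow in G:
    for hrow in H:
        assert sum(gi * hi for gi, hi in zip(grow, hrow)) % p == 0
print("G * H^T = 0 over F13: each alpha^m, m = 1..n-k, annihilates every codeword")
```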


Cyclic codes: Example

• RS code C of length n = 15, k = 8, d = 8, q = 2^4

Zeros of C: α, α^2, α^3, α^4, α^5, α^6, α^7

Generator polynomial g(x) = ∏_{i=1}^{7} (x − α^i), dim(C) = n − deg(g) = 8

BCH bound: d(C) ≥ number of consecutive zeros + 1

• Now suppose that C has zeros

{α, α^2, α^3, α^4, α^5, α^6, α^7} ∪ {α, α^4, α^7, α^10, α^13}.

The distance is still d = 8, and we get locality r = 2. Indeed,

f_a(x) = a1 + a2x + a3x^3 + a4x^4 + a5x^6 + a6x^7

The rows of G are those for the exponents 0, 1, 3, 4, 6, 7 (the monomials 1, x, x^3, x^4, x^6, x^7)

k = 6; d = 8 = n − k − k/r + 2

Constructions of LRC codes 5 / 25


Cyclic LRC codes

• Consider a cyclic code of length n | (q − 1) given by the evaluations ev(f_a(x)) of the polynomials of the form f_a(x) = ∑_{i=0, i ≢ r mod (r+1)}^{(k/r)(r+1)−2} a_i x^i, a_i ∈ Fq

• The zeros of the code are a union of two (overlapping) subsets:

  subset D gives the distance, |D| = n − (k/r)(r + 1) + 1
  subset L supports locality, |L \ D| = k/r − 1

• The code is Singleton-optimal by the BCH bound

The zeros of C are of the form α^j,
j ∈ {1, 2, . . . , n − (k/r)(r + 1) + 1} ∪ {n − (k/r − 1)(r + 1) + 1, n − (k/r − 2)(r + 1) + 1, . . . , n − r}

Constructions of LRC codes 6 / 25


Main ideas I

Let 0 ≤ l ≤ r, ν = n/(r + 1). Consider the ν × n matrix (the zeros from L)

H1 with rows (1, α^{i(r+1)+l}, α^{2(i(r+1)+l)}, . . . , α^{(n−1)(i(r+1)+l)}), i = 0, 1, . . . , ν − 1

The row space of H1 contains all the cyclic shifts of the n-dimensional vector of weight r + 1

v = (1, 0, . . . , 0, α^{lν}, 0, . . . , 0, α^{2lν}, 0, . . . , 0, . . . , α^{rlν}, 0, . . . , 0),  ν = n/(r + 1)

(each run of zeros has length ν − 1)

The code C has parity-check matrix H ∪ H1, where

• H is formed by the rows in D
• H1 is formed by the rows in L \ D

Constructions of LRC codes 7 / 25


Main ideas II

To obtain a code over a small field (e.g., Fp), take a subfield subcode of the code C, i.e.,

D = C ∩ (Fp)^n

Analysis is based on Delsarte’s theorem: C ⊆ Fq^n, q = p^m

C ⟷ C⊥; intersecting with (Fp)^n on the left and applying the trace Tm on the right gives D ⟷ D⊥ = Tm(C⊥)

(Delsarte ’75; Sidelnikov ’72)

Trace Tm : F_{p^m} → Fp: Tm(a) = a + a^p + · · · + a^{p^{m−1}}

• Since D is cyclic, r = d⊥ − 1
• We would like small r, i.e., upper estimates of d⊥
• This is in contrast to classical coding, where one seeks large d⊥

Further ideas involve

• Theory of irreducible (minimal) cyclic codes
• Results on upper bounds on the distance of a cyclic code in terms of its zeros

Constructions of LRC codes 8 / 25


A sampling of results

In a number of cases it is possible to estimate the locality and the number of recovery sets of codes.

Let α be an nth root of unity, and let (2^t − 1) | n for some t.
Let D be an [n, k] binary linear cyclic code whose defining set of zeros contains the group ⟨α^{2^t−1}⟩. Then the locality of D satisfies

r ≤ 2^{t−1} − 1,

and each symbol has ≥ 2^{t−1} recovery sets.

For instance, we have the following binary LRC codes:

 n   k   d   Z(D)                 t   r       Z(D⊥)                          d⊥
 35  20  3   {1, 15}              3   r ≤ 3   {0, 1, 7, 15}                  4
 45  33  3   {1}                  4   r ≤ 7   {0, 1, 3, 5, 9, 15, 21}        8
 27  7   6   {1, 9}               2   r = 1   {0, 3}                         2
 63  36  3   {1, 9, 11, 15, 23}   3   r ≤ 3   {0, 1, 7, 9, 11, 15, 21, 23}   4

Constructions of LRC codes 9 / 25

Some open questions

• In several examples the bounds on locality (dual distance) give tight results. Is it possible to characterize the cases in which the bounds are tight?

• Find the number of recovery sets per symbol for families of cyclic codes.

Constructions of LRC codes 10 / 25

Codes on curves

In this part we discuss another approach to constructing long codes over small alphabets.

RS codes can be viewed as codes on the (affine) line, and can be extended to codes on algebraic curves. The construction becomes more technical, and we proceed by example.

Constructions of LRC codes 11 / 25

LRC codes on curves

Consider the set of pairs (x, y), with x, y ∈ F9, that satisfy the equation x^3 + x = y^4

[Figure: the 27 affine points of the Hermitian curve X over F9, plotted on the (y, x) grid; each column y = const contains the three points of the fiber over y]

Affine points of the Hermitian curve X over F9; α^2 = α + 1

Constructions of LRC codes 12 / 25


Hermitian codes

g : X → P^1
(x, y) ↦ y

Space of functions V := ⟨1, y, y^2, x, xy, xy^2⟩

A = {affine points of the Hermitian curve over F9}; n = 27, k = 6

C : V → F9^n

E.g., message (1, α, α^2, α^3, α^4, α^5):

F(x, y) = 1 + αy + α^2 y^2 + α^3 x + α^4 xy + α^5 xy^2

F(0, 0) = 1, etc.

Constructions of LRC codes 13 / 25


LRC codes on curves

[Figure: the codeword, i.e., the values of F at the 27 affine points of the curve, displayed on the (y, x) grid]

Constructions of LRC codes 14 / 25

Hermitian LRC codes

[Figure: the same codeword grid with the position P = (α, 1) marked]

Let P = (α, 1) be the erased location.

Constructions of LRC codes 15 / 25

Local recovery with Hermitian codes

[Figure: the codeword grid with the value at P = (α, 1) shown as erased]

Let P = (α, 1) be the erased location. Recovery set I_P = {(α^4, 1), (α^3, 1)}

Find f(x) : f(α^4) = α^7, f(α^3) = α^3

⇒ f(x) = αx − α^2

f(α) = 0 = F(P)

Computations can be done using GAP or Magma

Constructions of LRC codes 16 / 25

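The arithmetic also fits in a few lines of Python; here is a sketch of the worked example (the helpers are mine: F9 is represented as pairs (a, b) meaning a + bα with α^2 = α + 1, while the function F and the recovery set are taken from the slides above).

```python
def add(u, v):                      # addition in F9 = F3[alpha], alpha^2 = alpha + 1
    return ((u[0] + v[0]) % 3, (u[1] + v[1]) % 3)

def mul(u, v):                      # (a + b*alpha)(c + d*alpha) = ac + bd + (ad + bc + bd)*alpha
    a, b = u
    c, d = v
    return ((a * c + b * d) % 3, (a * d + b * c + b * d) % 3)

ONE, ALPHA = (1, 0), (0, 1)

def apow(k):                        # alpha^k (alpha has order 8)
    out = ONE
    for _ in range(k % 8):
        out = mul(out, ALPHA)
    return out

def inv(u):                         # u^{-1} = u^7 for u in F9*
    out = ONE
    for _ in range(7):
        out = mul(out, u)
    return out

def F(x, y):                        # F = 1 + a*y + a^2*y^2 + a^3*x + a^4*x*y + a^5*x*y^2
    out = (0, 0)
    for t in (ONE, mul(ALPHA, y), mul(apow(2), mul(y, y)), mul(apow(3), x),
              mul(apow(4), mul(x, y)), mul(apow(5), mul(x, mul(y, y)))):
        out = add(out, t)
    return out

# Erased point P = (alpha, 1); the rest of the fiber over y = 1 is {(alpha^4, 1), (alpha^3, 1)}.
x1, x2 = apow(4), apow(3)
y1, y2 = F(x1, ONE), F(x2, ONE)                  # alpha^7 and alpha^3, as on the slide
neg = lambda u: mul((2, 0), u)                   # -u = 2u in characteristic 3
slope = mul(add(y2, neg(y1)), inv(add(x2, neg(x1))))       # (y2 - y1)/(x2 - x1) = alpha
recovered = add(y1, mul(slope, add(ALPHA, neg(x1))))       # value of the line at x = alpha
print(recovered, F(ALPHA, ONE))                  # both print (0, 0): F(P) = 0 is recovered
```

In power notation the interpolated line is f(x) = αx − α^2, matching the slide.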

Hermitian codes

q = q0^2, q0 a prime power

X : x^{q0} + x = y^{q0+1}

X has q0^3 = q^{3/2} points in Fq

We obtain a family of q-ary codes of length n = q0^3,

k = (t + 1)(q0 − 1),  d ≥ n − tq0 − (q0 − 2)(q0 + 1)

with locality r = q0 − 1. (For q0 = 3, t = 2 this is the n = 27, k = 6, r = 2 code above.)

Constructions of LRC codes 17 / 25
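A tiny helper (the function name and output format are mine) that plugs q0 and t into the parameter formulas above:

```python
def hermitian_lrc_params(q0, t):
    n = q0 ** 3                                  # number of affine points of the curve
    k = (t + 1) * (q0 - 1)
    d_lower = n - t * q0 - (q0 - 2) * (q0 + 1)   # lower bound on the distance
    r = q0 - 1
    return n, k, d_lower, r

print(hermitian_lrc_params(3, 2))   # (27, 6, 17, 2): the n = 27, k = 6, r = 2 code above
```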


Geometric view of the construction

We take g : X → Y = P^1, g(P) = g(x, y) := y

[Figure: the 27 affine points of the curve on the (y, x) grid, with the projection on y collapsing each column (fiber) to a single point]

Space of functions V := ⟨1, y, y^2, x, xy, xy^2⟩

Since y is constant on the fibers (recovery sets), we get back to the univariate polynomial interpolation

It is also possible to take g(P) = x (projection on x); we obtain LRC codes with locality q0

Constructions of LRC codes 18 / 25


General construction: Covering maps of curves

In the RS-like construction, X = Y = P^1

Constructions of LRC codes 19 / 25

General construction: Technical details

Map of curves: X, Y smooth projective absolutely irreducible curves over k

g : X → Y

a rational separable map of degree r + 1

Lift the points of Y: S = {P1, . . . , Ps} ⊂ Y(k). Partition of points:

A := g^{−1}(S) = {P_{ij}, i = 0, . . . , r, j = 1, . . . , s} ⊆ X(k)

such that g(P_{ij}) = P_j for all i, j

Let x ∈ k(X) be such that k(X) = k(Y)(x), and let deg x = h as a projection x : X → P^1_k

Constructions of LRC codes 20 / 25


General construction, II

Let Q∞ ⊂ π^{−1}(∞), deg Q∞ = ℓ ≥ 1

Let L(Q∞) = ⟨f1, . . . , fm⟩, m ≥ ℓ − g_Y + 1

Function space V := ⟨ f_j x^i, i = 0, . . . , r − 1; j = 1, . . . , m ⟩

The code C is the image of the map

e := ev_A : V → k^{(r+1)s}
F ↦ (F(P_{ij}), i = 0, . . . , r, j = 1, . . . , s)

Theorem: The subspace C(D, g) ⊂ Fq^n forms an (n, k, r) linear LRC code with the parameters

n = (r + 1)s
k = rm ≥ r(ℓ − g_Y + 1)
d ≥ n − ℓ(r + 1) − (r − 1)h

provided that the right-hand side of the inequality for d is a positive integer.

Constructions of LRC codes 21 / 25


Asymptotically good sequences of codes

In the classic coding problem, codes on algebraic curves give rise to some excellent code families, in particular improving the asymptotic Gilbert-Varshamov bound on the parameters (Tsfasman-Vladuts-Zink ’81)

Let q = q0^2, where q0 is a prime power. Take Garcia-Stichtenoth towers of curves:

x0 := 1;  X1 := P^1, k(X1) = k(x1);
X_l : z_l^{q0} + z_l = x_{l−1}^{q0+1},  x_{l−1} := z_{l−1}/x_{l−2} ∈ k(X_{l−1})  (if l ≥ 3)

There exist families of q-ary LRC codes with locality r whose rate and relative distance satisfy

R ≥ (r/(r + 1)) (1 − δ − 3/(√q + 1)),  r = √q − 1
R ≥ (r/(r + 1)) (1 − δ − 2√q/(q − 1)),  r = √q

(*) Recall the TVZ ’81 bound without locality: R ≥ 1 − δ − 1/(√q − 1)

Constructions of LRC codes 22 / 25


LRC codes on curves better than the GV bound

The asymptotic GV bound can be improved for any given (constant) r for all q greater than some value.

Constructions of LRC codes 23 / 25


Extensions and open questions

Extensions

• LRC codes that correct locally multiple erasures

• LRC codes on curves with availability

• Adjusting the locality value r

Open questions

• Constructions of LRC codes from known families of good curves

• Parameters of subfield subcodes of AG LRC codes

• Automorphism groups of curves and codes with availability

Constructions of LRC codes 24 / 25

References

S. Goparaju and R. Calderbank, Binary cyclic codes that are locally repairable, Proc. 2014 IEEE Int. Sympos. Inform. Theory, Honolulu, HI, pp. 676–680.

A. Barg, I. Tamo, and S. Vladut, Locally recoverable codes on algebraic curves, March 2016 (Preprint: arXiv:1603.08876).

I. Tamo, A. Barg, S. Goparaju, and R. Calderbank, Cyclic LRC codes, binary LRC codes, and upper bounds on the distance of cyclic codes, International Journal of Information and Coding Theory, to appear (Preprint: arXiv:1603.08878).

A. Barg and I. Tamo, On codes with the locality constraint, IEEE IT Society Newsletter 65, no. 4 (2015), pp. 7–15.

Constructions of LRC codes 25 / 25

Erasure (and LRC) codes in practice: A brief overview

LRC codes in practice 1 / 14

Before LRC codes

Replication: Google (x3, x27), FB (x3), HDFS, others

RS codes, e.g., [6,4] RS codes: RAID; used by FB, others

Array-type codes: RAID 6 (2 parities per block); IBM (EVENODD, X-Code)

“There are two kinds of erasure codes implemented in Raid: XOR code and Reed-Solomon code. [...] The replication on the source file can be reduced to 1 when using Reed-Solomon without losing data safety. The downside of having only one replica of a block is that reads of a block have to go to a single machine, reducing parallelism. Thus Reed-Solomon should be used on data that is not supposed to be used frequently.”
http://wiki.apache.org/hadoop/HDFS-RAID

LRC codes in practice 2 / 14


Applications of erasure codes

Systems using existing solutions:

• Amazon Web Services (AWS) + Glacier archiving service (previously S3); on the market
• Dropbox (moved its storage from AWS to own system)
• Google Colossus distributed FS ([9,6] RS codes; OSDI 2010)
• Facebook F4 ([14,10] RS code; OSDI 2014)
• Ceph storage platform (ceph.com)

Innovative proposals, erasure codes with some form of locality:

• HDFS-RAID, including HDFS Xorbas (FB)
• Microsoft Azure storage (VLDB Endowment, 2013)
• HACFS, adaptive coding for HDFS (IBM) (FAST ’15)
• Glacier (NSDI ’05), Petal (ASPLOS ’96), Weaver (FAST ’05), Stair codes (FAST ’14), CORE (MSST 2013), ...
• Proposals based on regenerating codes

(Many more)

LRC codes in practice 3 / 14


Applications of erasure codes

Systems using existing solutions:

• Amazon Web Services (AWS) + Glacier archiving service (previously S3); on themarket

• Dropbox (moved its storage from AWS to own system)

• Google Colossus distributed FS ([9,6] RS codes; OSDI2010)

• Facebook F4 ([14,10] RS code; OSDI 2014)

• Ceph storage platform (ceph.com)

Innovative proposals, erasure codes with some form of locality:

• HDFS-RAID, including HDFS Xorbas (FB)

• Microsoft Azure storage (VLDB Endowment, 2013)

• HACFS, adaptive coding for HDFS (IBM) (FAST ’15)

• Glacier (NSDI ’05), Petal (ASPLOS ’96), Weaver (FAST ’05), Stair codes (FAST ’14),CORE (MSST 2013), ...

• Proposals based on regenerating codes

(Many more)

LRC codes in practice 3 / 14

Applications of erasure codes

Systems using existing solutions:

• Amazon Web Services (AWS) + Glacier archiving service (previously S3); on themarket

• Dropbox (moved its storage from AWS to own system)

• Google Colossus distributed FS ([9,6] RS codes; OSDI2010)

• Facebook F4 ([14,10] RS code; OSDI 2014)

• Ceph storage platform (ceph.com)

Innovative proposals, erasure codes with some form of locality:

• HDFS-RAID, including HDFS Xorbas (FB)

• Microsoft Azure storage (VLDB Endowment, 2013)

• HACFS, adaptive coding for HDFS (IBM) (FAST ’15)

• Glacier (NSDI ’05), Petal (ASPLOS ’96), Weaver (FAST ’05), Stair codes (FAST ’14),CORE (MSST 2013), ...

• Proposals based on regenerating codes

(Many more)

LRC codes in practice 3 / 14

Applications of erasure codes

Systems using existing solutions:

• Amazon Web Services (AWS) + Glacier archiving service (previously S3); on themarket

• Dropbox (moved its storage from AWS to own system)

• Google Colossus distributed FS ([9,6] RS codes; OSDI2010)

• Facebook F4 ([14,10] RS code; OSDI 2014)

• Ceph storage platform (ceph.com)

Innovative proposals, erasure codes with some form of locality:

• HDFS-RAID, including HDFS Xorbas (FB)

• Microsoft Azure storage (VLDB Endowment, 2013)

• HACFS, adaptive coding for HDFS (IBM) (FAST ’15)

• Glacier (NSDI ’05), Petal (ASPLOS ’96), Weaver (FAST ’05), Stair codes (FAST ’14),CORE (MSST 2013), ...

• Proposals based on regenerating codes

(Many more)

LRC codes in practice 3 / 14

Applications of erasure codes

Systems using existing solutions:

• Amazon Web Services (AWS) + Glacier archiving service (previously S3); on themarket

• Dropbox (moved its storage from AWS to own system)

• Google Colossus distributed FS ([9,6] RS codes; OSDI2010)

• Facebook F4 ([14,10] RS code; OSDI 2014)

• Ceph storage platform (ceph.com)

Innovative proposals, erasure codes with some form of locality:

• HDFS-RAID, including HDFS Xorbas (FB)

• Microsoft Azure storage (VLDB Endowment, 2013)

• HACFS, adaptive coding for HDFS (IBM) (FAST ’15)

• Glacier (NSDI ’05), Petal (ASPLOS ’96), Weaver (FAST ’05), Stair codes (FAST ’14),CORE (MSST 2013), ...

• Proposals based on regenerating codes

(Many more)

LRC codes in practice 3 / 14

Applications of erasure codes

Systems using existing solutions:

• Amazon Web Services (AWS) + Glacier archiving service (previously S3); on themarket

• Dropbox (moved its storage from AWS to own system)

• Google Colossus distributed FS ([9,6] RS codes; OSDI2010)

• Facebook F4 ([14,10] RS code; OSDI 2014)

• Ceph storage platform (ceph.com)

Innovative proposals, erasure codes with some form of locality:

• HDFS-RAID, including HDFS Xorbas (FB)

• Microsoft Azure storage (VLDB Endowment, 2013)

• HACFS, adaptive coding for HDFS (IBM) (FAST ’15)

• Glacier (NSDI ’05), Petal (ASPLOS ’96), Weaver (FAST ’05), Stair codes (FAST ’14),CORE (MSST 2013), ...

• Proposals based on regenerating codes

(Many more)

LRC codes in practice 3 / 14

Optimization of coding for storage

[Xia et al., HACFS, USENIX FAST’15]

LRC codes in practice 4 / 14

Erasure coding in Ceph

Ceph: object-storage-based free-software storage platform for storing data on a single cluster

Erasure coding: uses a [5,3] MDS code

http://docs.ceph.com

LRC codes in practice 5 / 14

Facebook’s storage system

The FB system stores files as redundant pieces of data distributed across different drives and data centers. The system distinguishes between cold, warm, and hot data.

Hot data (Haystack): The data is replicated 3 times, two copies in one center on different racks, and the third copy in a different center.

LRC codes in practice 6 / 14

Facebook’s storage system

Warm data are written once, accessed many times, and can be deleted but not modified (e.g., 400B photos).

Facebook’s F4 “Warm” BLOB storage system relies on [14, 10] RS codes. The stripe of 14 blocks is placed on 14 different nodes, on different racks.

(S. Muralidhar et al., OSDI ’14)

Main issues: Reliability and load analysis based on the models of failures in the system, and storage/system architecture

Optimizing storage, input/output, and bandwidth (the amount of data exchanged in the system).

LRC codes in practice 7 / 14

Data placement

http://www.slideshare.net/ydn/hdfs-raid-facebook

LRC codes in practice 8 / 14

HDFS RAID + LRC (HDFS-XORbas)

LRC codes for the Hadoop file system (M. Sathiamoorthy et al., VLDB Endowment, 2013)

[14,10] RS code with two additional local parities:

X1, X2, X3, X4, X5 | S1 | X6, X7, X8, X9, X10 | S2 | P1, P2, P3, P4

(data)  (local parity)  (data)  (local parity)  (RS parities)

where S1 = Σ_{i=1}^{5} a_i X_i and S2 = Σ_{i=1}^{5} b_i X_{i+5} are the local parities

Reliability analysis in terms of mean time to data loss using a Markov failure model
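As a concrete illustration of the local repair enabled by S1 and S2, here is a minimal Python sketch. It assumes plain XOR (GF(2)) local parities; the actual Xorbas parities use coefficients a_i, b_i over a larger field, and the block contents below are random stand-ins for file chunks.

```python
# Minimal sketch of local-parity repair in the HDFS-Xorbas spirit, assuming
# XOR local parities over GF(2); block names and sizes are illustrative only.
import os

BLOCK = 64  # bytes per block in this toy example

def xor_blocks(blocks):
    """Bytewise XOR of equal-length blocks."""
    out = bytearray(len(blocks[0]))
    for b in blocks:
        for i, byte in enumerate(b):
            out[i] ^= byte
    return bytes(out)

# Ten data blocks X1..X10 (random payloads stand in for file chunks).
X = [os.urandom(BLOCK) for _ in range(10)]

# Local parities over the two halves of the stripe.
S1 = xor_blocks(X[0:5])    # covers X1..X5
S2 = xor_blocks(X[5:10])   # covers X6..X10

# Single-block repair: rebuild X3 from the 5 other symbols of its local group,
# instead of reading 10 blocks as a [14,10] RS repair would.
lost = 2  # index of X3
repaired = xor_blocks([X[i] for i in range(5) if i != lost] + [S1])
assert repaired == X[lost]
print("X3 repaired locally from 5 blocks")
```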

LRC codes in practice 9 / 14

Microsoft Azure storage

(previously Windows Azure storage)

[Figure: codeword layout over F16 with data symbols X1, …, X6, local parities S1, S2, and global parities P1, P2]

S1 = α_{1,1} X1 + α_{1,2} X2 + α_{1,3} X3,   S2 = α_{2,1} X4 + α_{2,2} X5 + α_{2,3} X6

Choose a basis of C so that every 4 erasures other than

{X1, X2, X3, S1}, {X4, X5, X6, S2}

are correctable (Maximally Recoverable Codes; refer to Part 5, “MR codes”).

• C. Huang et al., USENIX 2012: (6,2,2) code, (12,2,2) code (1.33x overhead)

• Installed in Windows 8.1 and Windows Server 2012
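As a sanity check on the correctability condition above, the following counting sketch (an illustration of ours, not taken from the Azure paper) enumerates all 4-erasure patterns of the 10 codeword symbols and flags the only two that are information-theoretically unrecoverable, namely the two local groups.

```python
# Counting sketch for the (6,2,2) LRC above: 6 data symbols, 2 local parities,
# 2 global parities. A 4-erasure pattern is hopeless only if it wipes out an
# entire local group: then a dependent symbol is erased "for free" and only
# 5 independent symbols survive, too few to recover k = 6 information symbols.
from itertools import combinations

symbols = ["X1", "X2", "X3", "X4", "X5", "X6", "S1", "S2", "P1", "P2"]
local_groups = [{"X1", "X2", "X3", "S1"}, {"X4", "X5", "X6", "S2"}]

patterns = list(combinations(symbols, 4))
bad = [p for p in patterns if set(p) in local_groups]

print(len(patterns), "four-erasure patterns in total")   # 210
print(len(bad), "patterns that cannot be corrected")      # the two local groups
```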

LRC codes in practice 10 / 14

Adaptive coding for HDFS

Implements 2 code families:

• 6 x 5 Product codes

• LRC (12,2,2) codes: 12 data blocks, 2 local parities, 2 global parities

Optimizing overhead or recovery time

[Xia et al., HACFS, USENIX FAST’15]

LRC codes in practice 11 / 14

Maximally recoverable codes for SSD arrays

RAID 5 or RAID 6 type architecture (r = 1 to 3 parities per row), with additional global parities

LRC codes in practice 12 / 14

Maximally recoverable codes for SSD arrays

RAID 5 or RAID 6 type architecture (r = 1 or 2 parities per row)

Mixed failure model: device failure plus several sector failures

Algebraic construction of MR codes similar to rank error codes (Blaum et al., 2013)

LRC codes in practice 13 / 14

A selection of references

C. Huang et al., Erasure coding in Windows Azure storage, USENIX ATC, 2012

M. Blaum et al., Partial MDS codes and applications in RAID, IEEE Trans. Inform. Theory, 2013

* S. Muralidhar et al., f4: Facebook’s Warm BLOB storage system, 11th USENIX OSDI, 2014, pp. 383–398

* M. Xia et al., A tale of two erasure codes in HDFS, 13th USENIX FAST, 2015, pp. 213–226

A. Dimakis, LRC codes and index coding, talk at DIMACS workshop, Dec. 2015

J. Plank and C. Huang, Tutorial on erasure codes, USENIX FAST, 2013

M. Sathiamoorthy et al., XORing elephants: Erasure codes for Big Data, VLDB Endowment, 2013, vol. 6, pp. 325–336

K. V. Rashmi et al., Erasure-coded distributed storage systems (FB study), USENIX, 2013

* K. V. Rashmi et al., Jointly optimal erasure codes for I/O, storage and bandwidth, FAST, 2015

* Includes an overview of other industry-related solutions or proposals

LRC codes in practice 14 / 14

Bounds on LRC codes

Bounds on LRC codes 1 / 22

Problem

C ⊂ F_q^n – LRC code of length n, cardinality q^k, distance d, locality r

Def: For every i ∈ [n] there exist a recovery (helper) set R_i ⊂ [n] \ {i}, |R_i| ≤ r, and a function φ_i such that for any x ∈ C

x_i = φ_i(x_j, j ∈ R_i)

(We do not assume that R_i ∩ R_j = ∅ for i ≠ j)

Given k, define d(n, k, r, q) to be the largest possible distance of C

Given d, define k(n, d, r, q) to be the largest possible dimension (log-cardinality) of C

Bounds on d, k ?
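The definition above can be checked mechanically for small codes. The sketch below is a brute-force illustration of ours; the [6,3] binary generator matrix is an arbitrary example chosen only to exercise the checker.

```python
# Brute-force locality check, directly implementing the definition above.
# Feasible only for tiny codes: it enumerates all codewords and candidate sets.
from itertools import combinations, product

def codewords(G):
    """All codewords of the binary linear code with generator matrix G."""
    k, n = len(G), len(G[0])
    return [tuple(sum(m * row[j] for m, row in zip(msg, G)) % 2 for j in range(n))
            for msg in product([0, 1], repeat=k)]

def locality(C):
    """Smallest r such that every coordinate has a recovery set of size <= r
    (assumes distance >= 2, so the full complement always works)."""
    n = len(C[0])
    worst = 0
    for i in range(n):
        found = None
        for r in range(1, n):
            for R in combinations([j for j in range(n) if j != i], r):
                seen = {}
                # R is a recovery set for i if x_R determines x_i on every codeword
                if all(seen.setdefault(tuple(x[j] for j in R), x[i]) == x[i] for x in C):
                    found = r
                    break
            if found is not None:
                break
        worst = max(worst, found if found is not None else n)
    return worst

# Arbitrary [6,3] binary example in which every symbol is the XOR of two others.
G = [[1, 0, 0, 0, 1, 1],
     [0, 1, 0, 1, 0, 1],
     [0, 0, 1, 1, 1, 0]]
print("locality r =", locality(codewords(G)))   # 2
```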

Bounds on LRC codes 2 / 22

First results

k/n ≤ r/(r + 1)

d ≤ n − k − ⌈k/r⌉ + 2

(Gopalan et al., IEEE Trans. IT, 2013)

The bound on the rate was proved in the first part of the tutorial. Let us prove the bound on the distance.
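Before turning to the proof, a quick numeric sketch (ours) that plugs parameters into the two bounds above; the n = 16, k = 10, r = 5 parameters mirror the HDFS-Xorbas layout discussed earlier ([14,10] RS plus two local parities).

```python
# Plug parameters into the two bounds above.
from math import ceil

def lrc_singleton_distance(n, k, r):
    """d <= n - k - ceil(k/r) + 2"""
    return n - k - ceil(k / r) + 2

def lrc_rate_bound(r):
    """k/n <= r/(r+1)"""
    return r / (r + 1)

# Xorbas-style parameters: n = 16, k = 10, locality r = 5.
print(lrc_singleton_distance(16, 10, 5))   # 6
print(10 / 16 <= lrc_rate_bound(5))        # True: 0.625 <= 0.833...
```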

Bounds on LRC codes 3 / 22

The Generalized Singleton bound

Main idea. Let C be a q-ary code of length n, size q^k. The distance d(C) satisfies

for every S ⊂ [n] with |C_S| < q^k:   d(C) ≤ n − |S|    (A)

Construct a set S:

Let m = ⌊(k − 1)/r⌋. Put T := ∅, L := ∅.

For j = 1, 2, …, m: take i_j ∈ (T ∪ L)^c, put T ← T ∪ R_{i_j}, L ← L ∪ {i_j} (R_i is the recovery set for i).

Clearly |T| ≤ k − 1, so |C_T| ≤ q^{k−1}, and |C_{T∪L}| ≤ q^{k−1}

Suppose that |T| = k − 1 (if needed, add some coordinates from [n] \ (T ∪ L)) and put S = T ∪ L.

We have |S| = k − 1 + m = k + ⌈k/r⌉ − 2 and |C_S| < q^k. Conclude by (A).

Bounds on LRC codes 4 / 22

Shortening (alphabet-dependent) bound

Theorem (Cadambe-Mazumdar ’13)

Let k_q(m, d) be the maximum dimension of a q-ary code of length m and distance d. Then

k(n, d, r, q) ≤ min_{s ≥ 1} { sr + k_q(n − s(r + 1), d) }.

Proof. Let C be a linear LRC code with distance d (the linearity assumption is easily lifted).

• Following the proof of the LRC Singleton bound, construct a subset I ⊂ [n] such that |I| = s(r + 1) and dim proj_I(C) ≤ sr.

• Consider the shortening C_I of C by the coordinates in I. We obtain

length(C_I) = n − s(r + 1),   d(C_I) = d

dim(C_I) = k − dim proj_I(C) ≥ k − sr

k_q(n − s(r + 1), d) ≥ k − sr
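A numeric sketch of the shortening bound, with the plain Singleton bound k_q(m, d) ≤ m − d + 1 plugged in for the inner term (our choice for illustration; any valid upper bound on k_q can be substituted).

```python
# Evaluate k(n, d, r, q) <= min_{s>=1} { s*r + k_q(n - s(r+1), d) } using the
# classical Singleton bound as the (locality-free) inner bound k_q.
def kq_singleton(m, d):
    return max(m - d + 1, 0)

def shortening_bound(n, d, r, kq=kq_singleton):
    best = None
    s = 1
    while n - s * (r + 1) >= 0:
        val = s * r + kq(n - s * (r + 1), d)
        best = val if best is None else min(best, val)
        s += 1
    return best

# Illustrative parameters: for n = 16, d = 6, r = 5 the bound gives k <= 10.
print(shortening_bound(16, 6, 5))
```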

Bounds on LRC codes 5 / 22

Asymptotic problem

Bounds on codes are often considered in the asymptotics of n → ∞

Let M(n, r, δn) be the maximum size of a code of length n, distance δn, and locality r

R(r, δ) := lim sup_{n→∞} (1/n) log M(n, r, δn)

Bounds on LRC codes 6 / 22

Gilbert-Varshamov bound

For binary codes,

R(r, δ) ≥ 1 − min_{0 < s ≤ 1} { (1/(r + 1)) log_2((1 + s)^{r+1} + (1 − s)^{r+1}) − δ log_2 s }.

Proof by random coding: estimate the average weight enumerator for the ensemble given by the parity-check matrix

H = [ diag(1 1 … 1, 1 1 … 1, …, 1 1 … 1) ; H_L ],

with one all-ones row of length r + 1 per local group stacked on top of a random matrix H_L. Then

Pr{ d(C) < δn } ≤ δn · q^{−n(r/(r+1) − R)} · min_{0 < s ≤ 1} b(s)^{n/(r+1)} / s^{δn},

b(s) = (1/q) [ (1 + (q − 1)s)^{r+1} + (q − 1)(1 − s)^{r+1} ]

(Cadambe-Mazumdar ’15; Tamo-B.-Frolov ’15)
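The GV-type bound above is easy to evaluate numerically; the grid search below over s ∈ (0, 1] is a simple illustration (the step count is arbitrary).

```python
# Binary GV-type lower bound for LRC codes:
# R(r, delta) >= 1 - min_{0 < s <= 1} { log2((1+s)^(r+1) + (1-s)^(r+1))/(r+1)
#                                       - delta * log2(s) }
from math import log2

def gv_lrc_binary(r, delta, steps=100000):
    best = float("inf")
    for t in range(1, steps + 1):
        s = t / steps  # s ranges over (0, 1]
        val = log2((1 + s) ** (r + 1) + (1 - s) ** (r + 1)) / (r + 1) - delta * log2(s)
        best = min(best, val)
    return 1 - best

print(round(gv_lrc_binary(r=3, delta=0.1), 4))   # lower bound on the rate
```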

Bounds on LRC codes 7 / 22

Asymptotic upper bounds

Let

R_q(r, δ) = lim sup_{n→∞} (1/n) k(n, r, δn)

Using the Cadambe-Mazumdar bound together with classic bounds on k_q(m, d) (with no locality):

R_q(r, δ) ≤ (r/(r + 1)) (1 − δ),   0 ≤ δ ≤ 1

R_q(r, δ) ≤ (r/(r + 1)) (1 − qδ/(q − 1)),   0 ≤ δ ≤ (q − 1)/q

R_q(r, δ) ≤ min_{0 ≤ τ ≤ 1/(r+1)} { τr + (1 − τ(r + 1)) f_q( δ / (1 − τ(r + 1)) ) }

where
f_q(x) := h_q( (1/q)(q − 1 − x(q − 2) − 2√((q − 1)x(1 − x))) ),
h_q(x) := −x log_q(x/(q − 1)) − (1 − x) log_q(1 − x).
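For concreteness, here is a sketch (ours) that evaluates the three upper bounds above, as reconstructed here, for q = 2 and r = 3, the setting of the figure that follows; restricting the argument of f_q to [0, (q − 1)/q] in the τ-minimization and the grid granularity are simplifications of ours.

```python
# Evaluate the Singleton-, Plotkin-, and LP-type upper bounds above for q = 2,
# r = 3. The f_q-argument restriction and the tau grid are simplifications.
from math import log, sqrt

def h_q(x, q):
    """q-ary entropy, with the limiting values at the endpoints."""
    if x <= 0:
        return 0.0
    if x >= 1:
        return log(q - 1, q)
    return -x * log(x / (q - 1), q) - (1 - x) * log(1 - x, q)

def f_q(x, q):
    return h_q((q - 1 - x * (q - 2) - 2 * sqrt((q - 1) * x * (1 - x))) / q, q)

def singleton_type(r, d):
    return r / (r + 1) * (1 - d)

def plotkin_type(r, d, q):
    return max(r / (r + 1) * (1 - q * d / (q - 1)), 0.0)

def lp_type(r, d, q, steps=2000):
    best = float("inf")
    for t in range(steps + 1):
        tau = t / steps / (r + 1)          # tau in [0, 1/(r+1)]
        rem = 1 - tau * (r + 1)
        if rem > 0 and d / rem <= (q - 1) / q:
            best = min(best, tau * r + rem * f_q(d / rem, q))
    return best

q, r, delta = 2, 3, 0.4
print("Singleton-type:", round(singleton_type(r, delta), 4))
print("Plotkin-type:  ", round(plotkin_type(r, delta, q), 4))
print("LP-type:       ", round(lp_type(r, delta, q), 4))
```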

Bounds on LRC codes 8 / 22

Asymptotic bounds

[Figure: asymptotic bounds for binary codes with r = 3 (Singleton, Plotkin, and LP upper bounds and the GV lower bound); rate R vs. relative distance δ]

Binary codes; r = 3

Bounds on LRC codes 9 / 22

Improving GV bound using LRC codes on curves

(B-Tamo-Vladut, ’15)

(details in part on algebraic constructions)

Bounds on LRC codes 10 / 22

Correcting more than one erasure

Our goal here is to correct locally ≥ 2 erasures.

Def. (correcting ρ − 1 erasures): C is an (n, k, r, ρ) LRC code if each i ∈ [n] is contained in a subset I_i, |I_i| ≤ r + ρ − 1, such that the projection of C on I_i forms a code with distance ≥ ρ.

Singleton-type bound:

d ≤ n − k + 1 − (⌈k/r⌉ − 1)(ρ − 1)    (Prakash et al., ISIT ’12)

GV-type bound:

• Proof by random coding, similar to the LRC GV bound

• For large q the GV bound can be improved using codes on curves (refer to the part “Algebraic constructions”)

(A.B., I. Tamo, and S. Vladut, ’16).

Bounds on LRC codes 11 / 22

Correcting more than one erasure - recursive bounds

Theorem (Hu, Tamo, B., ISIT’16)

Let B(n, ρ) be an upper bound on the size of a code of length n and distance ρ. If the function B is log-convex in n, then for any (n, k, r, ρ) LRC code,

k ≤ ⌈ (n − (d − 1)) / (r + ρ − 1) ⌉ · log_q B(r + ρ − 1, ρ)

E.g., locality-dependent Plotkin bound: let ρ > (r + ρ − 1)(q − 1)/q; then

k ≤ ⌈ (n − (d − 1)) / (r + ρ − 1) ⌉ · log_q [ ρ / (ρ − ((q − 1)/q)(r + ρ − 1)) ]

It is also possible to derive a locality-dependent Hamming bound.
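A small numeric sketch of the recursive bound with the Plotkin bound plugged in, as stated above; the parameters below are arbitrary illustrations of ours.

```python
# Evaluate the locality-dependent Plotkin bound above. Parameters are
# illustrative; the bound applies only when rho > (r + rho - 1)(q - 1)/q.
from math import ceil, log

def plotkin_lrc_k_bound(n, d, r, rho, q):
    m = r + rho - 1                       # size of a local group
    theta = (q - 1) / q
    if rho <= m * theta:
        raise ValueError("Plotkin condition rho > (r+rho-1)(q-1)/q not met")
    groups = ceil((n - (d - 1)) / m)
    return groups * log(rho / (rho - theta * m), q)

# Binary example: local groups of size r + rho - 1 = 5 correcting rho - 1 = 3
# erasures locally (rho = 4 > 5/2, so the condition holds).
print(plotkin_lrc_k_bound(n=20, d=8, r=2, rho=4, q=2))
```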

Bounds on LRC codes 12 / 22

Linear programming bounds

In classical coding theory, all (or most) known upper bounds on the size of codes with a given distance can be derived via linear programming (Delsarte ’72). In this part we extend this approach to codes with locality.

Steps:

• Construct an association scheme for codes with locality

• Derive positivity conditions for the distance distribution (Delsarte inequalities)

• Find closed-form expressions for upper bounds

Bounds on LRC codes 13 / 22

Linear programming bounds

C an (n, k, r, ρ) LRC code, distance d.
Assume disjoint recovery groups of size r + ρ − 1.
Product association scheme H(st, q) = (H(t, q))^{⊗s}, s ≥ 2

Let C be a q-ary (n, k, r, ρ) LRC code with distance d. Define

T := { i = (i_1, …, i_s) | i_1 + … + i_s ≥ d, i_p ∈ {0, ρ, ρ + 1, …, ρ + r − 1} for all p = 1, …, s }.

Then |C| ≤ 1 + Σ_{i∈T} a_i, where (a_i, i ∈ T) is given by the following LP problem

maximize Σ_{i∈T} a_i subject to

a_i ≥ 0, i ∈ T;   Σ_{i∈T} a_i Q_{ij} ≥ −K_j^{(r+ρ−1)}(0), j ∈ [r + ρ − 1]^s \ {0}

(Hu et al., Proc. ISIT 2016)

Bounds on LRC codes 14 / 22

Linear programming bounds

Results: It is possible to

• Derive the Singleton, Hamming, and Plotkin bounds on (n, k, r, ρ) LRC codes

• Improve the shortening bound in examples for short lengths (e.g., n = 8–20) by solving the LP problem

Bounds on LRC codes 15 / 22

Availability (Multiple recovery sets)

Def: C ⊂ F^n is an LRC code with availability if every coordinate i ∈ [n] has t pairwise disjoint recovery sets R_i^{(j)}, with |R_i^{(j)}| ≤ r_j, j = 1, …, t.

For simplicity let r_j = r, j = 1, …, t, and use the notation (n, k, r, t) LRC code

Example: (6, 3, 2, 2) LRC code with distance 3, parity-check matrix

H = [ 0 0 0 1 1 1
      0 1 1 0 0 1
      1 0 1 0 1 0 ]
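The example above can be verified directly. The brute-force sketch below (an illustration of ours) recovers the distance and, for each coordinate, two disjoint recovery sets of size 2 read off the low-weight codewords of the dual code.

```python
# Brute-force verification (illustrative) of the (6,3,2,2) example above:
# the code has distance 3, and every coordinate has two disjoint recovery
# sets of size <= 2 coming from weight-3 codewords of the dual code.
from itertools import combinations, product

H = [[0, 0, 0, 1, 1, 1],
     [0, 1, 1, 0, 0, 1],
     [1, 0, 1, 0, 1, 0]]

def span(rows):
    """All GF(2) linear combinations of the given rows."""
    out = set()
    for coeffs in product([0, 1], repeat=len(rows)):
        out.add(tuple(sum(c * row[j] for c, row in zip(coeffs, rows)) % 2
                      for j in range(6)))
    return out

code = [x for x in product([0, 1], repeat=6)
        if all(sum(h * xi for h, xi in zip(row, x)) % 2 == 0 for row in H)]
print("distance:", min(sum(x) for x in code if any(x)))      # 3

dual = span(H)
for i in range(6):
    # each dual word of weight <= 3 through i yields a recovery set of size <= 2
    rec = [frozenset(j for j in range(6) if v[j] and j != i)
           for v in dual if v[i] == 1 and sum(v) <= 3]
    disjoint = any(a.isdisjoint(b) for a, b in combinations(rec, 2))
    print(f"coordinate {i}: recovery sets {sorted(map(sorted, rec))}, two disjoint: {disjoint}")
```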

Constructions:

• Case t = 2: Two-level codes give a natural way of constructing LRC codes with t = 2. Examples include product codes, codes on bipartite graphs.

• Algebraic constructions of codes with t ≥ 2 recovery sets are covered in another part of the tutorial (refer to “Introduction”)

Bounds on LRC codes 16 / 22

Upper bounds on codes with availability

Extension of the Singleton bound:

d ≤ n − k + 2 − ⌈ (t(k − 1) + 1) / (t(r − 1) + 1) ⌉    (Wang-Zhang ’14; Rawat et al. ’14)

Bounds on LRC codes 17 / 22

Upper bounds on codes with availability

Theorem (Tamo-B-Frolov ’15)

The parameters of a t-LRC code are bounded as follows:

kn

ď1

śtj“1p1 ` 1

jr q

d ď n ´

tÿ

i“0

Yk ´ 1ri

]

Proof idea:• construct a directed graph GpV,Eq, V “ rns, pi, jq P E iff j is a part of some recovery set for i

• study colorings and expansion of the graph, which enables us to trace propagation of dependent verticesin G

• The bounds are tight in some examples, but generally are likely to be loose.

• The above bound and the bound on the previous page are incomparable (i.e., thereare examples where each of them is better than the other one).

Bounds on LRC codes 18 / 22

Existence of codes with availability; t “ 2

GV-type bound for codes with two recovery sets (TBF ’16)

• Let G be a regular graph of degree r + 1. Each edge is adjacent to two disjoint sets of r edges, which form 2 recovery sets (bipartite graphs support recovery sets of different sizes)

• Consider the ensemble of linear codes with parity-check matrices

H = [ diag(A_{r+2}, A_{r+2}, …, A_{r+2}) ; H_L ]

where A_{r+2} is the edge-vertex incidence matrix of the complete graph K_{r+2} (or of K_{r1,r2}) and H_L is a random matrix with independent entries

• Analyze the ensemble-average distance of codes
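A small sketch (ours) of the diagonal blocks in the ensemble above, written with rows indexed by vertices and columns by edges; that orientation is our assumption, chosen so that coordinates correspond to edges and each vertex contributes one local parity check.

```python
# Sketch (illustration): incidence matrix of K_{r+2} forming the diagonal
# blocks above. Coordinates correspond to edges; each vertex gives a parity
# check over the r+1 edges at that vertex, so every coordinate lies in two
# local checks (t = 2 recovery sets).
from itertools import combinations

def incidence_K(m):
    """Vertex-by-edge incidence matrix of the complete graph K_m over GF(2)."""
    edges = list(combinations(range(m), 2))
    return [[1 if v in e else 0 for e in edges] for v in range(m)]

r = 3
A = incidence_K(r + 2)   # (r+2) rows, (r+2)(r+1)/2 columns
for row in A:
    print(row)
```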

Bounds on LRC codes 19 / 22

Many recovery sets

Theorem. There exist asymptotically good sequences of q-ary LRC codes (fixed q, n → ∞) with t disjoint recovery sets of size r for any constant t, 1 ≤ t ≤ r

Ideas: Existence of bipartite expanding graphs; large finite field alphabets

[Figure: lower bound vs. Singleton bound for t = 3, r = 6; at δ = 0 the lower bound gives R = 1 − t/(r + 1)]

Singleton bound:

R^{(t)}(r, δ) ≤ (r^t (r − 1)) / (r^{t+1} − 1) · (1 − δ)

Bounds on LRC codes 20 / 22

Open questions

• Alphabet-dependent bounds make little use of locality

• Find a good way to derive lower bounds for t ≥ 3, upper bounds for t ≥ 2

• Settle the corner point for δ = 0 of the asymptotic bound for t ≥ 2

• Remove the structure assumption in the LP bound

• Derive an LP (closed-form asymptotic) bound for LRC codes that correct locally one erasure

Bounds on LRC codes 21 / 22

Selected references

P. Gopalan, C. Huang, H. Simitci, and S. Yekhanin, On the locality of codeword symbols, IEEE Trans. Inform. Theory, vol. 58, no. 11, pp. 6925–6934, 2011.

N. Prakash, G. M. Kamath, V. Lalitha, and P. V. Kumar, Optimal linear codes with a local-error-correction property, Proc. 2012 IEEE Internat. Sympos. Inform. Theory, IEEE, 2012, pp. 2776–2780.

V. Cadambe and A. Mazumdar, Bounds on the size of locally recoverable codes, IEEE Trans. Inform. Theory, vol. 61, no. 11, pp. 5787–5794, 2015.

A. Wang and Z. Zhang, Repair locality with multiple erasure tolerance, IEEE Trans. Inform. Theory, vol. 60, no. 11, pp. 6979–6987, 2014.

A. Rawat, D. Papailiopoulos, A. Dimakis, and S. Vishwanath, Locality and availability in distributed storage, Proc. ISIT 2014.

I. Tamo, A. Barg, and A. Frolov, Bounds on the parameters of locally recoverable codes, IEEE Trans. Inform. Theory, vol. 62, no. 6, pp. 3070–3083, 2016.

S. Hu, I. Tamo, and A. Barg, Combinatorial bounds for LRC codes, Proc. ISIT 2016.

A. Barg, I. Tamo, and S. Vladut, Locally recoverable codes on algebraic curves, arXiv:1603.08876; also Proc. ISIT 2015.

Bounds on LRC codes 22 / 22

Maximally Recoverable Codes
Combining locality and storage efficiency

Maximally Recoverable Codes 1 / 17

Maximally Recoverable (MR) codes - Motivation

• An (n, k) code is MDS ⇐⇒ Any k symbols of the codeword suffice for decoding

• MDS property is very important in storage

• Recovery set = a set of r + 1 symbols s.t. any symbol of the set can be recovered by the remaining r symbols

• A set of k symbols of an (n, k, r) LRC code that contains a recovery set does not suffice for decoding

• =⇒ LRC codes are not MDS

• Goal: LRC codes that are “close” to being MDS

Maximally Recoverable Codes 2 / 17

(n, k, r) MR codes - Definition

An (n, k, r) MR code has the following properties:

• It is an (n, k, r) LRC code

• Disjoint recovery sets R_i, |R_i| = r + 1, for i = 1, …, n/(r+1)

• Any set S of k symbols s.t. R_i ⊈ S for all i suffices for decoding

Remarks:

• If ∃ i with R_i ⊆ S and |S| = k, then decoding of the codeword is impossible

• Terminology: MR codes = Partially MDS codes

• Equivalent definition using Matroid Theory language
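A counting sketch (our own illustration) of the remark above: for an MDS code every k-subset of coordinates is an information set, while for an (n, k, r) MR code exactly the k-subsets containing a full recovery set fail. The parameters below are arbitrary.

```python
# Counting sketch (illustration): how many k-subsets of coordinates are
# useless for decoding an (n, k, r) MR code, namely those containing one of
# the n/(r+1) disjoint recovery sets of size r+1. An MDS code has none.
from itertools import combinations

def bad_k_sets(n, k, r):
    groups = [set(range(g * (r + 1), (g + 1) * (r + 1))) for g in range(n // (r + 1))]
    return sum(1 for S in combinations(range(n), k)
               if any(g <= set(S) for g in groups))

n, k, r = 12, 8, 3          # illustrative parameters: 3 recovery groups of size 4
total = len(list(combinations(range(n), k)))
print(bad_k_sets(n, k, r), "of", total, "k-subsets contain a recovery set")
```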

Maximally Recoverable Codes 3 / 17

MR codes - Basics

Lemma. MR codes are optimal LRC codes

Proof:

• Assume r divides k

• MR code is an optimal LRC code

⇔ d = n − k − k/r + 2

⇔ any d − 1 = n − k − k/r + 1 erasures are recoverable

⇔ any n − (d − 1) = k + k/r − 1 coordinates suffice for decoding

• Any k + k/r − 1 coordinates contain a subset S s.t.

1. |S| = k

2. ∀ i, R_i ⊈ S

• By the MR property, S suffices for decoding

Maximally Recoverable Codes 4 / 17

MR codes - Basics

Lemma. MR codes are optimal LRC codes

[Venn diagram: MR Codes ⊊ Optimal LRC Codes ⊂ LRC Codes]

Q: MR codes = Optimal LRC codes?  Ans: No

Maximally Recoverable Codes 5 / 17

Constructing MR codes

• A probabilistic construction of an optimal LRC code results in an MR code

1. Non-explicit

2. Field size is superpolynomial in the length (is this truly necessary?)

• Explicit constructions [Rawat, Koyluoglu, Silberstein, Vishwanath 14; Gopalan, Huang, Jenkins, Yekhanin 14; Tamo, Papailiopoulos, Dimakis 14]

1. Any n, k, r

2. Field size is superpolynomial

• Construction of Partially MDS codes [Blaum, Lee Hafner, Hetzler 13; Blaum, Plank, Schwartz, Yaakobi 14]

1. Restricted parameters

2. Field size |F| = O(n)

Maximally Recoverable Codes 6 / 17

Constructing MR codes - Easy cases

• r = k

1. An (n, k) RS code is an (n, k, k) MR code

2. |F| = O(n)

• r = 1

1. Duplicating an (n/2, k) RS code gives an (n, k, 1) MR code

2. |F| = O(n)

• Q: Is the RS-like construction an MR code?

1. No, in general

2. Yes, in some cases (other cases?)

Maximally Recoverable Codes 7 / 17

MR codes through linearized polynomials

Basic facts:

• F_{q^m} is a vector space of dimension m over F_q

• A linearized polynomial has the form f(x) = \sum_{i=0}^{m-1} a_i x^{q^i}, where a_i ∈ F_{q^m}

• For x, y ∈ F_{q^m} and α_1, α_2 ∈ F_q: f(α_1 x + α_2 y) = α_1 f(x) + α_2 f(y)

Maximally Recoverable Codes 8 / 17
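
The following is a small Python sketch (mine, not from the talk) of these facts: it implements F_{2^4} with carry-less multiplication reduced modulo the irreducible polynomial x^4 + x + 1 (a choice made only for illustration), evaluates a linearized polynomial f(x) = a_0 x + a_1 x^2 + a_2 x^4, and checks the F_2-linearity f(x + y) = f(x) + f(y) on random inputs; addition in F_{2^m} is bitwise XOR.

import random

M = 4
IRRED = 0b10011  # x^4 + x + 1, irreducible over F_2 (illustrative choice)

def gf_mul(a, b):
    # Carry-less multiplication in F_{2^M}, reducing modulo IRRED after each shift.
    res = 0
    while b:
        if b & 1:
            res ^= a
        b >>= 1
        a <<= 1
        if a & (1 << M):
            a ^= IRRED
    return res

def gf_pow(a, e):
    # Square-and-multiply exponentiation in F_{2^M}.
    res = 1
    while e:
        if e & 1:
            res = gf_mul(res, a)
        a = gf_mul(a, a)
        e >>= 1
    return res

def f(coeffs, x):
    # Linearized polynomial f(x) = sum_i a_i * x^(2^i); addition in F_{2^M} is XOR.
    out = 0
    for i, a in enumerate(coeffs):
        out ^= gf_mul(a, gf_pow(x, 2 ** i))
    return out

random.seed(0)
coeffs = [random.randrange(1, 1 << M) for _ in range(3)]
for _ in range(200):
    x, y = random.randrange(1 << M), random.randrange(1 << M)
    # Over F_q with q = 2 the scalars are 0/1, so linearity amounts to additivity.
    assert f(coeffs, x ^ y) == f(coeffs, x) ^ f(coeffs, y)
print("additivity f(x + y) = f(x) + f(y) verified on random samples")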

Constructing an (n, k, r) MR code [Rawat, Koyluoglu, Silberstein, Vishwanath 14]

• Set m = nr/(r+1)

• α_1, ..., α_m ∈ F_{2^m} linearly independent elements over F_2

• R_1 = {α_1, ..., α_r, \sum_{i=1}^{r} α_i}, R_2 = {α_{r+1}, ..., α_{2r}, \sum_{i=r+1}^{2r} α_i}, ..., R_{n/(r+1)}

• Observations:

1. |∪_i R_i| = n

2. \sum_{α ∈ R_i} α = 0

Encoding: (v_0, ..., v_{k−1}) ↦ f(x) = \sum_{i=0}^{k−1} v_i x^{2^i} ↦ (f(α) : α ∈ ∪_i R_i)

1. Given k information symbols of F_{2^m}

2. Form the linearized polynomial

3. Output the length-n codeword

Maximally Recoverable Codes 9 / 17
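
A toy Python check (my own illustration, not the authors' code) of the two observations, with r = 2, n = 9, so m = nr/(r+1) = 6: the α_i are taken as the standard basis of F_2^m written as bitmasks, which are certainly linearly independent over F_2, and addition in F_{2^m} is XOR. It also verifies, exhaustively for an arbitrary choice k = 4 ≤ m, the independence property used on the next slide.

from itertools import combinations
from functools import reduce
from operator import xor

r, n = 2, 9                                   # illustrative parameters
m = n * r // (r + 1)                          # m = nr/(r+1) = 6
alphas = [1 << i for i in range(m)]           # standard basis of F_2^m as bitmasks

# R_i = { alpha_{(i-1)r+1}, ..., alpha_{ir}, sum of those r alphas }  (sum = XOR)
groups = []
for g in range(n // (r + 1)):
    block = alphas[g * r:(g + 1) * r]
    groups.append(block + [reduce(xor, block)])

union = set().union(*map(set, groups))
assert len(union) == n                                  # observation 1: |U_i R_i| = n
assert all(reduce(xor, R) == 0 for R in groups)         # observation 2: sum over R_i is 0

def f2_rank(vectors):
    # Gaussian elimination over F_2 on bitmask vectors.
    basis = {}
    for v in vectors:
        while v:
            lead = v.bit_length() - 1
            if lead not in basis:
                basis[lead] = v
                break
            v ^= basis[lead]
    return len(basis)

# Any k elements of the union that contain no whole R_i are linearly independent
# over F_2 (checked exhaustively here for the arbitrary choice k = 4).
k = 4
for S in combinations(sorted(union), k):
    if any(set(R) <= set(S) for R in groups):
        continue
    assert f2_rank(list(S)) == k
print("observations 1 and 2 hold, and every admissible k-subset is independent")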

Constructing an (n, k, r) MR code - Cont’d

Locality: recover f(β) = ? for β ∈ R_i

\sum_{α ∈ R_i} α = 0 ⟹ β = \sum_{α ∈ R_i \ {β}} α ⟹ f(β) = f(\sum_{α ∈ R_i \ {β}} α) = \sum_{α ∈ R_i \ {β}} f(α)

so every code symbol is the F_2-sum of the other r symbols in its recovery group.

Maximally Recoverable Codes 10 / 17
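
The repair identity uses nothing beyond the additivity of f over F_2 and the fact that each group XORs to zero. A minimal sketch (illustrative, with a random F_2-linear map on bit-vectors standing in for the linearized polynomial):

import random
from functools import reduce
from operator import xor

random.seed(1)
m, r = 6, 2
cols = [random.randrange(1 << m) for _ in range(m)]   # random F_2-linear map L: F_2^m -> F_2^m

def L(v):
    # Apply the linear map: XOR the column images of the set bits of v.
    return reduce(xor, (cols[i] for i in range(m) if (v >> i) & 1), 0)

alphas = [1 << i for i in range(m)]
R1 = alphas[:r] + [reduce(xor, alphas[:r])]           # one local group; its XOR-sum is zero

for beta in R1:
    others = [a for a in R1 if a != beta]
    # local repair: the lost value equals the XOR of the other values in the group
    assert L(beta) == reduce(xor, (L(a) for a in others))
print("each symbol in the group is the XOR of the other", r, "symbols")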

Constructing an (n, k, r) MR code - Cont’d

MR Property:

• α_1, ..., α_m are linearly independent over F_2

• R_1 = {α_1, ..., α_r, \sum_{i=1}^{r} α_i}, R_2 = {α_{r+1}, ..., α_{2r}, \sum_{i=r+1}^{2r} α_i}, ..., R_{n/(r+1)}

• ⟹ the only F_2-linear dependencies inside ∪_i R_i are the R_i's themselves

• ⟹ if S ⊆ ∪_i R_i with |S| = k and R_i ⊄ S for every i, then the elements of S are linearly independent over F_2

• ⟹ all 2^k sums \sum_{s ∈ S} ε_s s, with ε_s ∈ {0,1}, are distinct

• ⟹ this gives 2^k evaluations of f(x): f(\sum_{s ∈ S} ε_s s) = \sum_{s ∈ S} ε_s f(s)

• deg(f) ≤ 2^{k−1}, since f(x) = \sum_{i=0}^{k−1} v_i x^{2^i}

• f(x) can be interpolated, so the k data symbols are recoverable from S

Maximally Recoverable Codes 10 / 17
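
In LaTeX, the counting behind the last two bullets, spelled out (a restatement of the slide's argument, not a new claim):

\[
  \deg f \;\le\; 2^{\,k-1} \;<\; 2^{k}
  \;=\; \bigl|\{\textstyle\sum_{s\in S}\varepsilon_s s \;:\; \varepsilon_s\in\{0,1\}\}\bigr|,
\]
so the $2^k$ distinct points $\sum_{s\in S}\varepsilon_s s$, together with the values
$f\bigl(\sum_{s\in S}\varepsilon_s s\bigr)=\sum_{s\in S}\varepsilon_s f(s)$ computed from the
$k$ known symbols $\{f(s): s\in S\}$, determine $f$ by ordinary polynomial interpolation;
reading off its coefficients recovers $(v_0,\dots,v_{k-1})$, since
$f(x)=\sum_{i=0}^{k-1} v_i x^{2^i}$.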

Properties of the MR code construction

• Flexible set of parameters n, k, r ✓

• Need m = nr/(r+1) linearly independent elements over F_2 ⟹ |F| = 2^{nr/(r+1)}

• Field size is exponential in n ✗

• Can we do better?

Maximally Recoverable Codes 11 / 17
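
For a sense of scale, a worked example with parameters chosen only for illustration:

\[
  n = 15,\ r = 4 \;\Rightarrow\; m = \frac{nr}{r+1} = 12,\qquad
  |\mathbb{F}| = 2^{m} = 2^{12} = 4096,
\]
already far larger than the code length $n = 15$, and the gap grows exponentially with $n$.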

High-rate MR codes [Gopalan, Huang, Jenkins, Yekhanin 14]

• In any (n, k, r) LRC code, k ≤ nr/(r+1)

• High rate means nr/(r+1) − k = O(n/log n)

• F(n, k, r) = minimum field size needed for an (n, k, r) MR code

• Known bounds on F(n, k, r):

nr/(r+1) − k:   0    1      2      3           4           ≥ 5
F(n, k, r):     2    r+1    Θ(n)   O(n^{3/2})  O(n^{7/3})  O(n^{nr/(r+1)−k−1})

(For nr/(r+1) − k = 0 the code is just n/(r+1) independent parity-check codes of length r + 1, so |F| = 2 suffices.)

Maximally Recoverable Codes 12 / 17

Summary of best known constructions of MR codes

              High-Rate                     Constant-Rate      Vanishing Rate
Dimension     nr/(r+1) − k = O(n/log n)     k/n < r/(r+1)      k = O(n/log n)
Field size    n^{nr/(r+1)−k−1}              2^{nr/(r+1)}       n^k

• High-Rate Regime: Gopalan, Huang, Jenkins, Yekhanin 14

• Constant-Rate Regime: Rawat, Koyluoglu, Silberstein, Vishwanath 14

• Vanishing Rate Regime: T., Papailiopoulos, Dimakis 14

Maximally Recoverable Codes 13 / 17

Lower bounds on the field size

F(n, k, r) is upper bounded by a superpolynomial function of n

Theorem [Gopalan, Huang, Jenkins, Yekhanin 14]

F(n, k, r) ≥ k + 1

Proof:

• Puncturing the code at one coordinate from each recovery set results in an (nr/(r+1), k) MDS code

• By the bound on the field size of an MDS code, nr/(r+1) ≤ |F| + (nr/(r+1) − k) − 1, i.e. |F| ≥ k + 1

Questions:

1. What is the smallest field size needed for an (n, k, r) MR code?

- For an optimal LRC, order n is sufficient

- Contains the MDS conjecture as a subproblem

2. Are MR codes inherently different from optimal LRC codes?

Maximally Recoverable Codes 14 / 17
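
Spelled out, with n' = nr/(r+1) denoting the punctured length, the two proof bullets combine to give the claimed bound:

\[
  n' \;=\; \frac{nr}{r+1} \;\le\; |\mathbb{F}| + (n' - k) - 1
  \quad\Longrightarrow\quad
  |\mathbb{F}| \;\ge\; n' - (n' - k) + 1 \;=\; k + 1 .
\]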

MR codes under arbitrary topologies

• R = {R_i ⊆ [n] : i = 1, ..., m} is a set of constraints on the code's coordinates

1. E.g. R_1 = {1, ..., r+1}, R_2 = {r+2, ..., 2(r+1)}, ..., R_{n/(r+1)}

2. Can be arbitrary code constraints

• C_R = a code with the set of coordinate constraints R

• Definition: given a code C_R, its information sets are I = {S ⊆ [n] : C_R is decodable from S, |S| = k}

• Goal: construct a code C_R with local constraints R that is “highly” decodable (large set I)

• Definition: C_R with information sets I is MR if for any C′_R with information sets I′, I′ ⊆ I (i.e. I is a maximum)

Maximally Recoverable Codes 15 / 17

MR codes under arbitrary topologies

• Definition: C_R with information sets I is MR if for any C′_R with information sets I′, I′ ⊆ I

• Example: R_1 = {1, ..., r+1}, ..., R_{n/(r+1)} = {n−r, ..., n}

C_R is MR if I = {S ⊂ [n] : R_i ⊄ S for all i, |S| = k}

Lemma: For any set R there exists an MR code C_R over a large enough field

Maximally Recoverable Codes 16 / 17
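
A short Python sketch (illustrative parameters of my choosing) that enumerates I for this example topology with n = 6, r = 2, hence two local groups and k = n − n/(r+1) = 4:

from itertools import combinations

n, r = 6, 2
k = n - n // (r + 1)                                   # 4
groups = [set(range(i, i + r + 1)) for i in range(1, n + 1, r + 1)]   # {1,2,3}, {4,5,6}

# Information sets: size-k subsets that do not contain any whole group R_i.
info_sets = [S for S in combinations(range(1, n + 1), k)
             if not any(R <= set(S) for R in groups)]

total = len(list(combinations(range(1, n + 1), k)))
print(len(info_sets), "information sets out of", total)
# Prints: 9 out of 15 -- every size-4 set except the 6 that contain a whole group.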

MR codes under arbitrary topologies - cont’d

“Maximally Recoverable Codes for Grid-like Topologies” [Gopalan, Hu, Saraf, Wang, Yekhanin 16]

1. Good introduction to the problem

2. The first non-trivial lower bound on |F| (for grid-like topologies)

3. More results...

Questions:

1. Given an MR code with a set of constraints R, characterize its information sets I

2. MR code constructions for other topologies (e.g. grid-like topologies)

3. Are the known constructions extendable to other topologies?

4. Field size?

5. Is it possible to construct MR codes using the RS-like construction?

Maximally Recoverable Codes 17 / 17

LRC codes on graphs

In this part we used materials provided to us by Fatemeh Arbabjolfaei (University of California, San Diego), Søren Riis (Queen Mary, University of London) and Arya Mazumdar (University of Massachusetts at Amherst), with their kind permission.

LRC codes 1 / 15

LRC codes on graphs [Mazumdar 2014, Shanmugam and Dimakis 2014]

[Figure: a directed storage-recovery graph G on nodes 1, 2, 3, storing x1, x2, x3]

• Storage recovery graph G

• Each node can recover its content from its incoming neighbors

• Recovery sets: A1 = {2}, A2 = {1, 3}, A3 = {1}

• How much data can be stored?

• Which coding scheme achieves the limit?

LRC codes 2 / 15

Storage Capacity

• The network is modeled by a (directed) graph G = (V, E), |V| = n

• Each node (vertex) stores a symbol from F_q

• Storage code:

1. A set of vectors C ⊆ F_q^n

2. n recovery functions f_i, s.t. for any (x_1, ..., x_n) ∈ C, f_i(x_j : j ∈ N(i)) = x_i

• The storage capacity of G over F_q: Cap_q(G) = max { log_q |C| : C ⊆ F_q^n is a storage code for G } ≤ n

• The storage capacity of G is Cap(G) = sup_q Cap_q(G) = lim_{q→∞} Cap_q(G) (by Fekete’s lemma)

LRC codes 3 / 15

Storage Capacity

• The network is modeled by a (directed) graph G = (V,E), |V| = n

• Each node (vertex) stores a symbol from Fq

LRC codes 3 / 15

Storage Capacity

• The network is modeled by a (directed) graph G = (V,E), |V| = n

• Each node (vertex) stores a symbol from Fq

• Storage code:

LRC codes 3 / 15

Storage Capacity

• The network is modeled by a (directed) graph G = (V,E), |V| = n

• Each node (vertex) stores a symbol from Fq

• Storage code:1. A set of vectors C ⊆ F

nq

LRC codes 3 / 15

Storage Capacity

• The network is modeled by a (directed) graph G = (V,E), |V| = n

• Each node (vertex) stores a symbol from Fq

• Storage code:1. A set of vectors C ⊆ F

nq

2. n recovery functions fi, s.t. for any (x1, ..., xn) ∈ C

LRC codes 3 / 15

Storage Capacity

• The network is modeled by a (directed) graph G = (V,E), |V| = n

• Each node (vertex) stores a symbol from Fq

• Storage code:1. A set of vectors C ⊆ F

nq

2. n recovery functions fi, s.t. for any (x1, ..., xn) ∈ C

fi(xj : j ∈ N(i)) = xi

LRC codes 3 / 15

Storage Capacity

• The network is modeled by a (directed) graph G = (V,E), |V| = n

• Each node (vertex) stores a symbol from Fq

• Storage code:1. A set of vectors C ⊆ F

nq

2. n recovery functions fi, s.t. for any (x1, ..., xn) ∈ C

fi(xj : j ∈ N(i)) = xi

• The storage capacity of G over Fq

Capq(G) = maxC⊆F

nq is a

storage code for G

logq |C|

LRC codes 3 / 15

Storage Capacity

• The network is modeled by a (directed) graph G = (V,E), |V| = n

• Each node (vertex) stores a symbol from Fq

• Storage code:1. A set of vectors C ⊆ F

nq

2. n recovery functions fi, s.t. for any (x1, ..., xn) ∈ C

fi(xj : j ∈ N(i)) = xi

• The storage capacity of G over Fq

Capq(G) = maxC⊆F

nq is a

storage code for G

logq |C| ≤ n

LRC codes 3 / 15

Storage Capacity

• The network is modeled by a (directed) graph G = (V,E), |V| = n

• Each node (vertex) stores a symbol from Fq

• Storage code:1. A set of vectors C ⊆ F

nq

2. n recovery functions fi, s.t. for any (x1, ..., xn) ∈ C

fi(xj : j ∈ N(i)) = xi

• The storage capacity of G over Fq

Capq(G) = maxC⊆F

nq is a

storage code for G

logq |C| ≤ n

• The storage capacity of G is

Cap(G) = supq

Capq(G)

LRC codes 3 / 15

Storage Capacity

• The network is modeled by a (directed) graph G = (V, E), |V| = n

• Each node (vertex) stores a symbol from Fq

• Storage code:
  1. A set of vectors C ⊆ Fq^n
  2. n recovery functions fi, s.t. for any (x1, ..., xn) ∈ C
     fi(xj : j ∈ N(i)) = xi

• The storage capacity of G over Fq

  Capq(G) = max { logq |C| : C ⊆ Fq^n is a storage code for G } ≤ n

• The storage capacity of G is

  Cap(G) = supq Capq(G) = limq→∞ Capq(G) (Fekete's lemma)

LRC codes 3 / 15
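A useful reformulation of the definition: recovery functions f1, ..., fn exist for a set C exactly when no two vectors of C agree on N(i) yet differ in coordinate i (otherwise fi would have to take two different values on the same input). The Python sketch below checks this condition by brute force; the function name and the dict-of-in-neighbors graph encoding are illustrative choices, not notation from the slides.

from itertools import combinations

def is_storage_code(in_nbrs, codewords):
    """Check that every coordinate i is determined by the coordinates in in_nbrs[i].

    in_nbrs: dict mapping node i -> list of in-neighbors N(i)
    codewords: iterable of equal-length tuples over any alphabet
    """
    C = list(codewords)
    for u, v in combinations(C, 2):
        for i, nbrs in in_nbrs.items():
            # If u and v agree on N(i) but differ at i, no function f_i can recover x_i.
            if u[i] != v[i] and all(u[j] == v[j] for j in nbrs):
                return False
    return True

# The 3-node graph G1 with recovery sets A1 = {2}, A2 = {1, 3}, A3 = {1} (0-indexed):
G1 = {0: [1], 1: [0, 2], 2: [0]}
print(is_storage_code(G1, [(0, 0, 0), (1, 1, 1)]))  # True: the repetition code is a storage code
print(is_storage_code(G1, [(0, 0, 0), (0, 0, 1)]))  # False: node 3 cannot distinguish these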

LRC codes on graphs - Example

[Figure: triangle on nodes N1, N2, N3, each pair of nodes connected]

• Each node stores a single bit

• Two bits of information b1, b2 can be stored (Cap2(G) = 2)

• Store:
  • N1 = b1
  • N2 = b2
  • N3 = b1 + b2

• Cap(G) = supq Capq(G) = 2

LRC codes 4 / 15
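As a quick check of the scheme above (assuming, as the claim Cap2(G) = 2 requires, that each node can see the other two): with (N1, N2, N3) = (b1, b2, b1 + b2) over F2 there are four codewords, and every node equals the XOR of its two neighbors, so any single node can be repaired locally. A minimal sketch:

from itertools import product

# Codewords of the triangle scheme: (N1, N2, N3) = (b1, b2, b1 + b2) over F2.
code = [(b1, b2, (b1 + b2) % 2) for b1, b2 in product(range(2), repeat=2)]
assert len(code) == 4  # 2 information bits -> Cap_2(G) >= log2(4) = 2

# Every coordinate is the XOR of the other two, so each node is recoverable from its neighbors.
for x in code:
    for i in range(3):
        j, k = [t for t in range(3) if t != i]
        assert x[i] == (x[j] + x[k]) % 2
print("triangle scheme verified:", code)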

LRC codes on graphs - Example

[Figure: pentagon (5-cycle) on nodes N1, N2, N3, N4, N5]

• C = {(0, 0, 0, 0, 0), (0, 1, 1, 0, 0), (0, 0, 0, 1, 1), (1, 1, 0, 1, 1), (1, 1, 1, 0, 1)}

• N1 = N2 ∧ N5, N2 = N1 ∨ N3, N3 = N2 ∧ ¬N4, N4 = ¬N3 ∧ N5, N5 = N1 ∨ N4

• Cap2(G) ≥ log2(5) = 2.32...

• In fact Cap2(G) = log2(5) = 2.32...

• However Cap(G) = Cap4(G) = 2.5 [Blasiak, Kleinberg, Lubetzky 13, Christofides, Markström 11]

LRC codes 4 / 15
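The recovery rules above can be checked by brute force: their fixed points in {0, 1}^5 are exactly the five codewords of C, so C is a storage code on the pentagon and Cap2(G) ≥ log2 5. The sketch below hard-codes the rules (including the two complemented inputs, which the codeword list forces); names are illustrative.

from itertools import product
from math import log2

C = {(0, 0, 0, 0, 0), (0, 1, 1, 0, 0), (0, 0, 0, 1, 1), (1, 1, 0, 1, 1), (1, 1, 1, 0, 1)}

def f(x):
    """Recovery rules on the 5-cycle: node i reads only its two cycle neighbors."""
    n1, n2, n3, n4, n5 = x
    return (n2 & n5,        # N1 = N2 AND N5
            n1 | n3,        # N2 = N1 OR  N3
            n2 & (1 - n4),  # N3 = N2 AND (NOT N4)
            (1 - n3) & n5,  # N4 = (NOT N3) AND N5
            n1 | n4)        # N5 = N1 OR  N4

fixed_points = {x for x in product(range(2), repeat=5) if f(x) == x}
assert fixed_points == C
print("Cap_2(pentagon) >= log2(5) =", log2(len(fixed_points)))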

LRC codes on graphs - Example

[Figure: pentagon whose five nodes store the bit pairs {1, 5}, {1, 2}, {2, 3}, {3, 4}, {4, 5}]

• Cap4(G) = 2.5 → 5 bits can be stored in the system {1, 2, 3, 4, 5}

• Cap(G) ≤ 2.5?

LRC codes 4 / 15
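One natural reading of the pair labels in the figure (an assumption here, since only the labels survive extraction) is that node i of the pentagon stores the F4 symbol formed by the two information bits (b_{i-1}, b_i), indices mod 5. Its two cycle neighbors then jointly hold exactly those two bits, so 5 bits fit into five F4 symbols and Cap4(G) ≥ log4(2^5) = 2.5. A sketch of that check:

from itertools import product

def node_content(b, i):
    """Node i (0-indexed) stores the bit pair (b[i-1], b[i]), indices mod 5."""
    return (b[(i - 1) % 5], b[i])

# Node i's pair is exactly (second bit of its left neighbor, first bit of its right neighbor),
# so its F4 symbol can be rebuilt from the two adjacent nodes alone.
for b in product(range(2), repeat=5):
    for i in range(5):
        left = node_content(b, (i - 1) % 5)
        right = node_content(b, (i + 1) % 5)
        assert node_content(b, i) == (left[1], right[0])
print("2^5 = 32 codewords on 5 nodes over F4: Cap_4(pentagon) >= log4(32) = 2.5")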

LRC codes on graphs - Example

• Submodularity: H(X, Y, Z) + H(Z) ≤ H(X, Z) + H(Y, Z)

  H(N1, N2, N3, N4, N5) = H(N1, N2, N3, N4, N5) + H(N4, N5) − H(N4, N5)
                        = H(N1, N3, N4, N5) + H(N4, N5) − H(N4, N5)
                        ≤ H(N1, N4, N5) + H(N3, N4, N5) − H(N4, N5)
                        = H(N1, N4) + H(N3, N5) − H(N4, N5)
                        ≤ H(N1) + H(N3) + H(N4) + H(N5) − H(N4, N5)

  H(N1, N2, N3, N4, N5) = H(N2, N4, N5)
                        = H(N4, N5) + H(N2 | N4, N5)
                        ≤ H(N4, N5) + H(N2)

• Summing the two bounds:

  ⇒ 2 H(N1, N2, N3, N4, N5) ≤ H(N1) + H(N2) + H(N3) + H(N4) + H(N5) ≤ 5

  ⇒ Cap(G) ≤ 2.5

• Question: Given an arbitrary graph G, how to calculate Cap(G)?

LRC codes 4 / 15

The Confusion Graph [Bar-Yossef, Birk, Jayram, and Kol 2006]

Confusion Graph

• Given a graph G on n vertices and q ∈ N

• The confusion graph Conf(G) has

  • Vertex set: Fq^n

  • Edge set: v = (v1, ..., vn) ∼ u = (u1, ..., un) if
    1. Exists i s.t. vi ≠ ui and
    2. v|N(i) = u|N(i)

• v, u ∈ Fq^n can both be contained in a storage code ⇔ v ≁ u in Conf(G)

• C ⊆ Fq^n is a storage code ⇔ C is an independent set of Conf(G)

• Corollary: Capq(G) = logq α(Conf(G)), where α(·) is the independence number

LRC codes 5 / 15
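For very small graphs the corollary can be applied literally: build Conf(G) over F2 and find a largest independent set by exhaustive search. The sketch below (helper names illustrative; the search is exponential, so toy sizes only) recovers Cap2 = 2 for the complete bidirected triangle of the earlier example, whose confusion graph is the 3-dimensional hypercube.

from itertools import combinations, product
from math import log2

def confusable(u, v, in_nbrs):
    """u ~ v in Conf(G): they differ at some i while agreeing on N(i)."""
    return any(u[i] != v[i] and all(u[j] == v[j] for j in nbrs)
               for i, nbrs in in_nbrs.items())

def cap2(in_nbrs, n):
    """log2 of the largest independent set of Conf(G) over F_2 (exhaustive)."""
    vertices = list(product(range(2), repeat=n))
    for size in range(len(vertices), 0, -1):
        for S in combinations(vertices, size):
            if not any(confusable(u, v, in_nbrs) for u, v in combinations(S, 2)):
                return log2(size)
    return 0.0

K3 = {0: [1, 2], 1: [0, 2], 2: [0, 1]}  # complete bidirected triangle
print(cap2(K3, 3))                       # 2.0, matching Cap_2 = 2 from the example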

Computing Cap(G)

• Capq(G) = logq α(Conf(G))

• Computing the independence number is hard

• Need to compute it for arbitrarily large confusion graphs (q → ∞)

• Bounds on Cap(G)?

LRC codes 6 / 15

Index Coding [Birk, Kol 1998]

[Figure: a sender broadcasts to three receivers; receiver i wants the unknown xi and already holds some of the other messages as side information]

• Side information graph G1 on nodes 1, 2, 3

• Side information A1 = {2}, A2 = {1, 3}, A3 = {1}

• Send x1 + x2 and x3

• What is the fundamental limit on the number of transmissions?

• Which coding scheme achieves the limit?

• Many generalizations...

LRC codes 7 / 15
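The two-transmission scheme on this slide is easy to make concrete over F2: broadcast (x1 + x2, x3) and let each receiver combine it with its side information A1 = {2}, A2 = {1, 3}, A3 = {1}. A sketch (function names are illustrative):

from itertools import product

def encode(x):                      # broadcast of length L = 2 over F_2
    return ((x[0] + x[1]) % 2, x[2])

def decode(i, bcast, side):         # side = {j: x_j} known to receiver i (0-indexed)
    if i == 0:                      # knows x2, wants x1 = (x1 + x2) - x2
        return (bcast[0] + side[1]) % 2
    if i == 1:                      # knows x1 and x3, wants x2 = (x1 + x2) - x1
        return (bcast[0] + side[0]) % 2
    return bcast[1]                 # receiver 3 reads x3 directly

A = {0: [1], 1: [0, 2], 2: [0]}     # side-information sets, 0-indexed
for x in product(range(2), repeat=3):
    bcast = encode(x)
    assert all(decode(i, bcast, {j: x[j] for j in A[i]}) == x[i] for i in range(3))
print("2 transmissions suffice for all 8 message triples")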

Index Coding [Birk, Kol 1998]

[Figure: broadcast setting with side information graph G1 on nodes 1, 2, 3]

• G = (V, E), |V| = n receivers

• Receiver i requests message xi ∈ Fq

• An incoming edge represents the side information

• An index code of length L is
  1. An encoding function E : Fq^n → Fq^L
  2. n decoding functions Di s.t. for any x ∈ Fq^n
     Di(E(x), x|N(i)) = xi

LRC codes 7 / 15

Index Coding [Birk, Kol 1998]

[Figure: broadcast setting with side information graph G1 on nodes 1, 2, 3]

• Indexq(G) = the shortest length of an index code over Fq

• Index(G) = infq Indexq(G) = limq→∞ Indexq(G)

LRC codes 7 / 15

Index Coding - Continued

• Index code: an encoding function E : Fq^n → Fq^L

• The confusion graph of G, Conf(G), has

  • Vertex set: Fq^n

  • Edge set: v = (v1, ..., vn) ∼ u = (u1, ..., un) if exists i s.t. vi ≠ ui and v|N(i) = u|N(i)

• Fact: E(·) is an index code iff it is a proper coloring of Conf(G)

Corollary

Given a graph G, Indexq(G) = ⌈logq χ(Conf(G))⌉, where χ(·) is the chromatic number

LRC codes 8 / 15

Duality between Storage Capacity and Index Coding

Theorem [Mazumdar 14, Shanmugam and Dimakis 14]

Let G = (V,E), |V| = n, then

Cap(G) + Index(G) = n

Observations:

• Storage capacity and index coding are dual problems

• An upper bound on Index(G) ⇒ a lower bound on Cap(G)

• A lower bound on Index(G) ⇒ an upper bound on Cap(G)

LRC codes 9 / 15
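The theorem is about the limit q → ∞, but the three-node side-information graph G1 already illustrates the relation numerically at q = 2: exhaustive search over Conf2(G1) gives α = 2 and χ = 4, i.e. Cap2(G1) = 1 and Index2(G1) = 2, and 1 + 2 = 3 = n. A brute-force sketch (only sensible for tiny graphs):

from itertools import combinations, product
from math import ceil, log2

G1 = {0: [1], 1: [0, 2], 2: [0]}              # N(1) = {2}, N(2) = {1, 3}, N(3) = {1}
V = list(product(range(2), repeat=3))         # vertices of Conf_2(G1)
E = {(u, v) for u, v in combinations(V, 2)
     if any(u[i] != v[i] and all(u[j] == v[j] for j in G1[i]) for i in G1)}

def independent(S):
    return not any((u, v) in E or (v, u) in E for u, v in combinations(S, 2))

alpha = max(size for size in range(1, len(V) + 1)
            for S in combinations(V, size) if independent(S))

idx = {v: i for i, v in enumerate(V)}

def colorable(k):
    # Does some assignment of k colors to the 8 vertices avoid every edge of Conf?
    return any(all(c[idx[u]] != c[idx[v]] for u, v in E)
               for c in product(range(k), repeat=len(V)))

chi = next(k for k in range(1, len(V) + 1) if colorable(k))

print(log2(alpha), ceil(log2(chi)), log2(alpha) + ceil(log2(chi)))  # 1.0 2 3.0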

Duality between Storage Capacity and Index Coding

Theorem [Mazumdar 14, Shanmugam and Dimakis 14]

Let G = (V, E), |V| = n, then

Cap(G) + Index(G) = n

Proof:

• χf = fractional chromatic number

• Hq = Confq(G) is a Cayley graph of (Fq^n, +) → is vertex transitive

  ⇒ α(Hq) · χf(Hq) = |V(Hq)| = q^n

  ⇒ limq→∞ logq α(Hq) + limq→∞ logq χf(Hq) = n

  ⇒ Cap(G) + limq→∞ logq χf(Hq) = n

LRC codes 9 / 15

Duality between Storage Capacity and Index Coding

Theorem [Mazumdar 14, Shanmugam and Dimakis 14]

Let G = (V, E), |V| = n, then

Cap(G) + Index(G) = n

Proof (continued):

• Cap(G) + limq→∞ logq χf(Hq) = n

• For any graph, and in particular for Hq,

  χf(Hq) ≤ χ(Hq) ≤ χf(Hq) · ln |V(Hq)|

  ⇒ limq→∞ logq χf(Hq) = limq→∞ logq χ(Hq) = Index(G)

LRC codes 9 / 15

Bounds on Cap(G)

• Any bound on Index(G) ⇒ a bound on Cap(G)

Upper Bounds:

• Cap(G) ≤ n − α(G) = VertexCover(G)

  • First proof: Index(G) ≥ α(G)

  • Second proof:
    1. S ⊆ V(G), a maximum independent set
    2. The information in S can be recovered by the nodes V(G)\S

• Shannon type inequalities (recall the proof for the pentagon)

• Non-Shannon type inequalities

  • Theorem [Baber, Christofides, Dang, Riis, Vaughan]:
    There exists a graph G on 10 vertices s.t.
    • Shannon type inequalities give: Cap(G) ≤ 114/17 = 6.705...
    • Non-Shannon type inequalities give: Cap(G) ≤ 20/3 = 6.666...

Lower bounds:

• Index(G) ≤ min{χf(G), Minrank(G)}

• Cap(G) ≥ n − min{χf(G), Minrank(G)}

LRC codes 10 / 15

Guessing game on graphs [Riis2005]

• n players

• A red/blue hat is assigned independently to each player: xj = 0 if blue, xj = 1 if red

• Each player:
  1. observes only the hats of his mates (complete graph)
  2. makes a guess on the color of his hat

• What is the probability that all players guessed correctly?

• Random guess → Psuccess = (1/2)^n

• Strategy: assume Σ_{j=1}^n xj = 0 (mod 2) → Psuccess = 1/2

LRC codes 11 / 15
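The jump from (1/2)^n to 1/2 is easy to verify by counting: under the strategy "guess as if the XOR of all hats were 0", either the true XOR is 0 and every player is right, or it is 1 and every player is wrong, so exactly 2^(n-1) of the 2^n assignments succeed. A sketch:

from itertools import product

def all_correct(x):
    """Complete graph: player j sees every hat except x[j] and assumes the total XOR is 0."""
    n = len(x)
    guesses = [sum(x[k] for k in range(n) if k != j) % 2 for j in range(n)]
    return guesses == list(x)

n = 5
wins = sum(all_correct(x) for x in product(range(2), repeat=n))
print(wins / 2 ** n)  # 0.5 for every n: a factor 2^(n-1) better than random guessing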

Guessing game on graphs [Riis2005]

[Figure: directed graph on three players x1, x2, x3]

• Arbitrary graphs?

• Neighbors A1 = {2}, A2 = {1, 3}, A3 = {1}

• Fundamental limit on the amount of improvement over random guesses?

• What is the optimal strategy?

• Connection to index coding [Yi–Sun–Jafar–Gesbert 2015]

LRC codes 11 / 15

Guessing Number [Riis2005]

• Given a (directed) graph G = (V, E), |V| = n

• Each player has a hat of q possible colors

• Let X = (x1, ..., xn), xi ∈ [q] be a hat assignment

• A strategy f = (f1, ..., fn) composed of n functions fi : (xj : j ∈ N(i)) → [q]

• The strategy is successful iff

  fi(xj : j ∈ N(i)) = xi for all i

• Equivalently: the hat assignment X is a fixed point of the mapping

  X = (x1, ..., xn) ↦ f(X) = (f1(X), ..., fn(X)),

  where fi(X) is a function of the hat colors of the neighbors of i

LRC codes 12 / 15

Guessing Number [Riis2005]

q-Guessing Number - Definition

The q-guessing number of G is

  Guessq(G) = max_{f a strategy} logq [ P(f is successful) / Prand ],
  where Prand = (1/q)^n is the success probability of independent random guessing

            = max_{f a strategy} logq |fixed points of f|

• For a fixed strategy f:

  logq [ P(f is successful) / Prand ] = logq P(f is successful) + n
                                      = logq ( |fixed points of f| / q^n ) + n
                                      = logq |fixed points of f|

Guessing Number - Definition

The guessing number of G is

  Guess(G) = supq Guessq(G)

LRC codes 13 / 15
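For toy graphs the definition can be evaluated directly: enumerate every strategy f = (f1, ..., fn) as a tuple of truth tables, count its fixed points, and take the maximum. On the complete bidirected triangle over F2 this gives 4 fixed points, i.e. Guess2(K3) = 2 = n − 1, matching the parity strategy of the complete-graph game above. A sketch (exhaustive over all 16^3 strategies, so toy cases only):

from itertools import product
from math import log2

n = 3
nbrs = {i: [j for j in range(n) if j != i] for i in range(n)}  # complete graph K3

# A player's local strategy is a truth table over the 2^(n-1) possible views of its neighbors.
views = list(product(range(2), repeat=n - 1))
tables = list(product(range(2), repeat=len(views)))            # 16 local strategies per player

def fixed_points(strategy):
    count = 0
    for x in product(range(2), repeat=n):
        count += all(strategy[i][views.index(tuple(x[j] for j in nbrs[i]))] == x[i]
                     for i in range(n))
    return count

best = max(fixed_points(s) for s in product(tables, repeat=n))
print(best, log2(best))  # 4 2.0  ->  Guess_2(K3) = 2 = n - 1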

Guessing games and LRC codes on graphs

• Given a graph G

  Guessq(G) = max_{f a strategy} logq |fixed points of f|

  Capq(G) = max { logq |C| : C ⊆ Fq^n is a storage code for G }

• The set of fixed points of a strategy forms a storage code C for G

  ⇒ Guessq(G) ≤ Capq(G)

• The codewords of a storage code form the set of fixed points of some strategy for the hat game

  ⇒ Capq(G) ≤ Guessq(G)

• Guessq(G) = Capq(G)

• Guess(G) = Cap(G)

LRC codes 14 / 15

Open questions

• Derive improved bounds on Cap(G)

• How do expansion properties of the graph (expander graphs) affect the storage capacity?

• Approximate the storage capacity using spectral properties of the graph

• Given a graph G, characterize the tradeoff between the distance and the rate of the code

• Networks with time-varying topologies: develop algorithms that efficiently shift between storage codes for different topologies

LRC codes 15 / 15
