rim moussa [email protected] ceria.dauphine.fr/rim/rim.html
Paris Dauphine University, CERIA Lab., 4th October 2004. Contribution to the Design & Implementation of the Highly Available Scalable and Distributed Data Structure: LH*RS. Rim Moussa [email protected] http://ceria.dauphine.fr/rim/rim.html. Thesis Supervisor: Pr. Witold Litwin
-
Contribution to the Design & Implementation of the Highly Available Scalable and Distributed Data Structure: LH*RS
Rim Moussa [email protected] http://ceria.dauphine.fr/rim/rim.html
Thesis Presentation in Computer Science, Distributed Databases
Thesis Supervisor: Pr. Witold Litwin
Examiners: Pr. Thomas J.E. Schwarz, Pr. Tor Risch
Jury President: Pr. Gérard Lévy
Paris Dauphine University, CERIA Lab., 4th October 2004
R. Moussa, U. Paris Dauphine
-
Outline
1. Issue
2. State of the Art
3. LH*RS Scheme
4. LH*RS Manager
5. Experimentations
6. LH*RS File Creation
7. Bucket Recovery
8. Parity Bucket Creation
9. Conclusion & Future Work
-
Facts
* The volume of information grows by 30% per year.
* Technology
  >> Network infrastructure: Gilder's Law, bandwidth triples every year.
  >> Evolution of PCs' storage & computing capacities: Moore's Law, these double every 18 months.
* Bottleneck of disk accesses & CPUs
>> Need for Distributed Data Storage Systems
>> SDDSs (LH*, RP*): high throughput
-
Facts
* Frequent & costly failures
  >> Statistics published by Contingency Planning Research in 1996: for a brokerage application, one hour of service interruption costs $6.45 million.
>> Need for Distributed & Highly Available Data Storage Systems
* Multicomputers
  >> Modular architecture
  >> Good price/performance tradeoff
-
State of the Art
Data Replication
(+) Good response time: mirrors are functional
(-) High storage overhead (n times the data volume with n replicas)
Parity Calculus
Criteria to evaluate erasure-resilient codes:
* Encoding rate (parity volume / data volume)
* Update penalty (parity volumes)
* Group size used for data reconstruction
* Encoding & decoding complexity
* Recovery capabilities
-
Parity Schemes
1-Available Schemes
* XOR parity calculus: RAID technology (levels 3, 4, 5) [PGK88], SDDS LH*g [L96]
k-Available Schemes
* Binary linear codes: [H94] >> tolerate at most 3 failures
* Array codes: EVENODD [B94], X-code [XB99], RDP [C+04] >> tolerate at most 2 failures
* Reed-Solomon codes: IDA [R89], RAID X [W91], FEC [B95], Tutorial [P97], LH*RS [LS00, ML02, MS04, LMS04] >> tolerate k failures (k > 3)
-
Outline
1. Issue
2. State of the Art
3. LH*RS Scheme
   * LH*RS?
   * SDDSs?
   * Reed-Solomon Codes?
   * Encoding/Decoding Optimizations
4. LH*RS Manager
5. Experimentations
-
LH*RS?
LH*: Scalable & Distributed Data Structure
* Distribution using Linear Hashing (LH*LH [KLR96])
* LH*LH Manager [B00]
>> Scalability & high throughput
Parity calculus using Reed-Solomon codes [RS63]
>> High availability
LH*RS [LS00]
-
SDDSs Principles
(1) Dynamic file growth
[Diagram labels: Client, Network, Client, Data Buckets, Coordinator]
-
SDDSs Principles (2)
(2) No centralized directory access
[Diagram labels: Data Buckets, Client, File Image]
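The "file image" principle can be sketched in a few lines. The block below follows the standard LH* client addressing rule from the linear hashing literature, not code shown in these slides; the function names, the hash family `key mod (n0 * 2^level)`, and the image variables `i_image` and `n_image` are assumptions made for illustration.

```python
def h(level, key, n0=1):
    """Linear hashing family: h_level(key) = key mod (n0 * 2^level)."""
    return key % (n0 * 2 ** level)


def client_address(key, i_image, n_image, n0=1):
    """Bucket address computed from the client's own (possibly outdated) file image.

    i_image: file level as the client believes it to be
    n_image: split pointer as the client believes it to be
    """
    a = h(i_image, key, n0)
    if a < n_image:
        # The client believes this bucket has already split: use the next level.
        a = h(i_image + 1, key, n0)
    return a


# A client whose image is (i=2, n=1) needs no central directory lookup.
addr_of_5 = client_address(5, i_image=2, n_image=1)
addr_of_4 = client_address(4, i_image=2, n_image=1)
```

If the image is outdated, the addressed server forwards the request and the client image is corrected afterwards; that part is not shown here.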
-
Reed-Solomon Codes
Encoding
* From m data symbols, calculus of n parity symbols
Data representation: Galois field
* Fields with finite size: q
* Closure property: addition, subtraction, multiplication, division
In GF(2^w):
(1) Addition (XOR)
(2) Multiplication (tables: gflog and antigflog)
    e1 * e2 = antigflog[ gflog[e1] + gflog[e2] ]
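A minimal sketch of the table-based multiplication, for GF(2^8) only. The primitive polynomial 0x11d and the doubled antilog table are assumptions; the slides give only the table names `gflog` and `antigflog` and the formula. `gf_mul_slow` is a hypothetical helper used as a cross-check.

```python
W = 8
SIZE = 1 << W        # 256 field elements
PRIM_POLY = 0x11D    # assumed primitive polynomial x^8 + x^4 + x^3 + x^2 + 1

gflog = [0] * SIZE             # gflog[0] is unused: log(0) is undefined
antigflog = [0] * (2 * SIZE)   # doubled so the sum of two logs needs no modulo

x = 1
for i in range(SIZE - 1):
    antigflog[i] = x
    gflog[x] = i
    x <<= 1
    if x & SIZE:
        x ^= PRIM_POLY
for i in range(SIZE - 1, 2 * SIZE):
    antigflog[i] = antigflog[i - (SIZE - 1)]


def gf_add(e1, e2):
    """Addition (and subtraction) in GF(2^w) is a bitwise XOR."""
    return e1 ^ e2


def gf_mul(e1, e2):
    """e1 * e2 = antigflog[gflog[e1] + gflog[e2]], with 0 handled apart."""
    if e1 == 0 or e2 == 0:
        return 0
    return antigflog[gflog[e1] + gflog[e2]]


def gf_mul_slow(e1, e2):
    """Reference shift-and-XOR multiplication, used only as a cross-check."""
    result = 0
    while e2:
        if e2 & 1:
            result ^= e1
        e1 <<= 1
        if e1 & SIZE:
            e1 ^= PRIM_POLY
        e2 >>= 1
    return result
```

The table version replaces a loop of shifts and XORs by three lookups and one integer addition, which is the cost the later optimization slides try to reduce further.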
-
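RS Encoding
$$
\begin{pmatrix}
1 & 0 & 0 & \cdots & 0 & C_{1,1} & \cdots & C_{1,j} & \cdots & C_{1,n-m} \\
0 & 1 & 0 & \cdots & 0 & C_{2,1} & \cdots & C_{2,j} & \cdots & C_{2,n-m} \\
0 & 0 & 1 & \cdots & 0 & C_{3,1} & \cdots & C_{3,j} & \cdots & C_{3,n-m} \\
\vdots & & & \ddots & \vdots & \vdots & & \vdots & & \vdots \\
0 & 0 & 0 & \cdots & 1 & C_{m,1} & \cdots & C_{m,j} & \cdots & C_{m,n-m}
\end{pmatrix}
$$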
-
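RS Decoding
$$
\begin{pmatrix}
1 & 0 & 0 & \cdots & 0 & C_{1,1} & C_{1,2} & C_{1,3} & \cdots & C_{1,n-m} \\
0 & 1 & 0 & \cdots & 0 & C_{2,1} & C_{2,2} & C_{2,3} & \cdots & C_{2,n-m} \\
0 & 0 & 1 & \cdots & 0 & C_{3,1} & C_{3,2} & C_{3,3} & \cdots & C_{3,n-m} \\
\vdots & & & \ddots & \vdots & \vdots & \vdots & \vdots & & \vdots \\
0 & 0 & 0 & \cdots & 1 & C_{m,1} & C_{m,2} & C_{m,3} & \cdots & C_{m,n-m}
\end{pmatrix}
$$
Data symbols: $S_1, S_2, S_3, S_4, \ldots, S_m$
Parity symbols: $P_1, P_2, P_3, \ldots, P_{n-m}$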
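The encoding and decoding slides can be illustrated with a small erasure-coding sketch over GF(2^8). Everything specific here is an assumption: the primitive polynomial 0x11d, the Cauchy construction of the parity matrix (chosen only because it guarantees that any m surviving columns are invertible, not because it is the thesis's matrix), and the function names. Decoding solves the linear system formed by m surviving columns, using Gauss-Jordan elimination.

```python
# GF(2^8) log/antilog tables; the primitive polynomial 0x11d is an assumption.
EXP = [0] * 512
LOG = [0] * 256
x = 1
for i in range(255):
    EXP[i] = x
    LOG[x] = i
    x <<= 1
    if x & 0x100:
        x ^= 0x11D
for i in range(255, 512):
    EXP[i] = EXP[i - 255]


def mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]


def inv(a):
    return EXP[255 - LOG[a]]  # a must be non-zero


def cauchy_parity_matrix(m, k):
    """C[i][j] = 1 / (x_i + y_j) with x_i = i and y_j = m + j (all distinct)."""
    return [[inv(i ^ (m + j)) for j in range(k)] for i in range(m)]


def encode(data, C):
    """Parity symbol j is the GF sum over i of data[i] * C[i][j]."""
    parity = [0] * len(C[0])
    for j in range(len(parity)):
        for i, symbol in enumerate(data):
            parity[j] ^= mul(symbol, C[i][j])
    return parity


def decode(known, m, C):
    """Rebuild the m data symbols from any m surviving symbols.

    known maps a codeword position (0..m-1: data, m and above: parity)
    to the surviving symbol at that position.
    """
    rows = []
    for p in sorted(known)[:m]:
        if p < m:
            column = [1 if i == p else 0 for i in range(m)]   # identity column
        else:
            column = [C[i][p - m] for i in range(m)]          # parity column
        rows.append(column + [known[p]])                      # augmented row
    for col in range(m):                                      # Gauss-Jordan
        pivot = next(r for r in range(col, m) if rows[r][col])
        rows[col], rows[pivot] = rows[pivot], rows[col]
        f = inv(rows[col][col])
        rows[col] = [mul(f, v) for v in rows[col]]
        for r in range(m):
            g = rows[r][col]
            if r != col and g:
                rows[r] = [a ^ mul(g, b) for a, b in zip(rows[r], rows[col])]
    return [row[m] for row in rows]


C = cauchy_parity_matrix(4, 2)
data = [0x12, 0x34, 0x56, 0x78]
parity = encode(data, C)
# Data symbols 1 and 3 are lost; the two parity symbols replace them.
recovered = decode({0: data[0], 2: data[2], 4: parity[0], 5: parity[1]}, 4, C)
```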
-
Optimizations
Topics: Galois Field, Parity Matrix, GF Multiplication
(+) GF(2^16) vs. GF(2^8) halves the number of symbols, and hence the number of operations in the GF.
    GF(2^8): 1 symbol = 1 byte
    GF(2^16): 1 symbol = 2 bytes
(-) Multiplication tables size
    GF(2^8): 0.768 KB
    GF(2^16): 393.216 KB (512 x 0.768)
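The two table sizes quoted on this slide can be reproduced with a short calculation. The factor of 3 is an assumption that happens to match both figures: one log table of 2^w entries plus an antilog table of 2 x 2^w entries, each entry being one symbol wide. `table_bytes` is a hypothetical helper name.

```python
def table_bytes(w):
    """Total size in bytes of the log and antilog tables for GF(2^w)."""
    symbol_bytes = w // 8          # 1 byte for GF(2^8), 2 bytes for GF(2^16)
    entries = 3 * (1 << w)         # assumed: 2^w log entries + 2 * 2^w antilog entries
    return entries * symbol_bytes


size_gf8 = table_bytes(8)      # bytes for GF(2^8)
size_gf16 = table_bytes(16)    # bytes for GF(2^16)
ratio = size_gf16 // size_gf8  # how many times larger the GF(2^16) tables are
```

Under this assumption the GF(2^8) tables fit comfortably in a processor cache while the GF(2^16) tables may not, which is one way to read the (-) on the slide.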
-
Optimizations (2)
Topics: Galois Field, Parity Matrix, GF Multiplication
* 1st column of 1s: the 1st PB is encoded with XOR calculus only >> gain in encoding & decoding
* 1st row of 1s: any update from the 1st DB is processed with XOR calculus >> performance gain of 4% (case of PB creation, m = 4)
Parity matrix (hexadecimal):
0001 0001 0001
0001 eb9b 2284
0001 2284 974
0001 9e44 d7f1
-
Optimizations (3)
Topics: Galois Field, Parity Matrix, GF Multiplication
Goal: reduce GF multiplication complexity
    e1 * e2 = antigflog[ gflog[e1] + gflog[e2] ]
* Encoding: log pre-calculus of the coefficients of the P matrix >> improvement of 3.5%
  0000 0000 0000
  0000 5ab5 e267
  0000 e267 0dce
  0000 784d 2b66
* Decoding: log pre-calculus of the coefficients of the H^-1 matrix and of the OK symbols vector >> improvement of 4% to 8%, depending on the number of buckets to recover
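The log pre-calculus idea can be sketched as follows, in GF(2^8) for brevity (the slide's matrix is in GF(2^16)). The primitive polynomial, the sample matrix `P` and the helper names are assumptions. Storing `gflog` of each coefficient once removes one table lookup from every later multiplication; since log(1) = 0, the 1s of the parity matrix become the 0000 entries visible on the slide.

```python
# GF(2^8) tables; the primitive polynomial 0x11d is an assumption.
gflog = [0] * 256
antigflog = [0] * 512
x = 1
for i in range(255):
    antigflog[i] = x
    gflog[x] = i
    x <<= 1
    if x & 0x100:
        x ^= 0x11D
for i in range(255, 512):
    antigflog[i] = antigflog[i - 255]


def gf_mul(e1, e2):
    """Plain table multiplication: two log lookups and one antilog lookup."""
    if e1 == 0 or e2 == 0:
        return 0
    return antigflog[gflog[e1] + gflog[e2]]


def precompute_logs(matrix):
    """Replace every (non-zero) coefficient by its logarithm, once."""
    return [[gflog[c] for c in row] for row in matrix]


def gf_mul_by_log(log_c, e):
    """Multiply symbol e by a coefficient given by its log: one log lookup less."""
    return 0 if e == 0 else antigflog[log_c + gflog[e]]


# Hypothetical 4 x 3 parity matrix with a first row and first column of 1s.
P = [[0x01, 0x01, 0x01],
     [0x01, 0x1C, 0x3B],
     [0x01, 0x3B, 0x9E],
     [0x01, 0x9E, 0x47]]
LOG_P = precompute_logs(P)
```

The encoder then keeps `LOG_P` in memory instead of `P` and calls `gf_mul_by_log` for every symbol of every record.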
-
LH*RS Parity Groups
Grouping concept
* m: #data buckets
* k: #parity buckets
* Data record (in a data bucket): Key; Data, stored at insert rank r
* Parity record (in a parity bucket): Rank r; [Key-list]; Parity
[Diagram: data buckets and parity buckets of a group, with records at ranks 2, 1, 0]
A k-available group survives the failure of k buckets.
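A minimal sketch of the record layout described on this slide, restricted to the first parity bucket, whose parity is a plain XOR when the parity matrix has a first column of 1s. The grouping rule `bucket_no // m`, the class and method names, and the fixed-length payloads are assumptions made for illustration.

```python
M = 4  # m: number of data buckets per parity group (assumed value)


def group_of(bucket_no, m=M):
    """Assumed grouping rule: m consecutive data buckets form one parity group."""
    return bucket_no // m


def xor_bytes(a, b):
    """XOR two byte strings, padding the shorter one with zero bytes."""
    n = max(len(a), len(b))
    a, b = a.ljust(n, b"\0"), b.ljust(n, b"\0")
    return bytes(p ^ q for p, q in zip(a, b))


class FirstParityBucket:
    """Parity records of one group: rank -> [key list, XOR parity]."""

    def __init__(self, m=M):
        self.m = m
        self.records = {}

    def on_insert(self, position, rank, key, data):
        """position: index of the sending data bucket inside its group."""
        record = self.records.setdefault(rank, [[None] * self.m, b""])
        record[0][position] = key
        record[1] = xor_bytes(record[1], data)

    def recover(self, rank, surviving_payloads):
        """Payload of the single missing data record at this rank."""
        payload = self.records[rank][1]
        for data in surviving_payloads:
            payload = xor_bytes(payload, data)
        return payload


pb = FirstParityBucket()
pb.on_insert(0, rank=0, key=17, data=b"alpha")
pb.on_insert(1, rank=0, key=42, data=b"bravo")
pb.on_insert(2, rank=0, key=99, data=b"delta")
lost = pb.recover(0, [b"alpha", b"delta"])  # data bucket 1 has failed
```

The other parity buckets would multiply each payload by a Reed-Solomon coefficient before accumulating it, which is what lets a group with k parity buckets survive k failures.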
-
Outline
1. Issue
2. State of the Art
3. LH*RS Scheme
4. LH*RS Manager
   * Communication
   * Gross Architecture
5. Experimentations
6. File Creation
7. Bucket Recovery
-
Communication
Protocols: TCP/IP, UDP, Multicast
UDP
* Individual operations (insert, update, delete, search)
* Record recovery
* Control messages
>> Performance
-
Communication
Protocols: TCP/IP, UDP, Multicast
TCP/IP
* Large buffer transfers
  - New parity buckets
  - Transfer of parity updates & records (bucket split)
  - Bucket recovery
>> Performance & reliability
-
Communication
Protocols: TCP/IP, UDP, Multicast
Multicast
* Looking for new data/parity buckets
>> Multipoint communication
-
Architecture
Enhancements to SDDS2000 architecture:
(1) TCP/IP connection handler
    * TCP/IP connections are passive OPEN, RFC 793 [ISI81], TCP/IP under Win2K Server OS [MB00]
    * Recovery of 1 bucket (3.125 MB): 6.7 s before (SDDS2000), 2.6 s with SDDS2000-TCP >> improvement of 60%
      (hardware config.: 733 MHz CPU machines, 100 Mbps network)
(2) Flow Control & Message Acknowledgement (FCMA)
    * Principle of sending credit & message conservation until delivery [J88, GRS97, D01]
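The "sending credit & message conservation until delivery" principle can be sketched as a small sender-side class. This is a generic illustration, not the SDDS2000 code: the class name, the `transmit` callback and the retransmit-everything-pending timeout policy are assumptions.

```python
from collections import OrderedDict


class CreditSender:
    """Sends at most `credit` unacknowledged messages and keeps each until its ACK."""

    def __init__(self, credit, transmit):
        self.credit = credit
        self.transmit = transmit       # callable(seq, payload) that puts bytes on the wire
        self.pending = OrderedDict()   # seq -> payload, conserved until acknowledged
        self.backlog = []              # messages waiting for credit
        self.next_seq = 0

    def send(self, payload):
        self.backlog.append(payload)
        self._pump()

    def _pump(self):
        # Transmit backlog messages while some credit is left.
        while self.backlog and len(self.pending) < self.credit:
            payload = self.backlog.pop(0)
            seq = self.next_seq
            self.next_seq += 1
            self.pending[seq] = payload
            self.transmit(seq, payload)

    def on_ack(self, seq):
        # Delivery confirmed: drop the copy and use the freed credit.
        self.pending.pop(seq, None)
        self._pump()

    def on_timeout(self):
        # No ACK in time: resend every message still waiting for its ACK.
        for seq, payload in self.pending.items():
            self.transmit(seq, payload)


wire = []
sender = CreditSender(credit=2, transmit=lambda seq, payload: wire.append((seq, payload)))
for record in ("r0", "r1", "r2"):
    sender.send(record)
```

With a credit of 2, the third record stays in the backlog until one of the first two is acknowledged; the credit plays the role of the client window used in the experiments.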
-
Architecture (2)
(3) Dynamic IP addressing structure
    * Before: pre-defined and static IP address table
    * Now: new servers (data or parity) are tagged using multicast
[Diagram labels: Coordinator, Created Buckets, Multicast Group of Blank Data Buckets, Multicast Group of Blank Parity Buckets]
-
Architecture (3)
[Diagram of a bucket's components]
* Ports: UDP listening port, UDP sending port, TCP/IP port, multicast listening port
* Threads: UDP listening thread, TCP listening thread, multicast listening thread, multicast working thread, pool of working threads, ACK management threads
* Structures: message queues, ACK structure (free zones; messages waiting for ACK, i.e. unacknowledged messages)
* Network
-
Experimentation
Performance evaluation
* CPU time
* Communication time
Experimental environment
* 5 machines (Pentium IV: 1.8 GHz, RAM: 512 MB)
* Ethernet network: 1 Gbps
* O.S.: Win2K Server
* Tested configuration: 1 client, a group of 4 data buckets, k parity buckets (k = 0, 1, 2, 3)
-
Outline
1. Issue
2. State of the Art
3. LH*RS Scheme
4. LH*RS Manager
5. Experimentations
6. File Creation
   * Parity Update
   * Performance
7. Bucket Recovery
8. Parity Bucket Creation
-
File Creation
Client operations
* Propagation of data record inserts/updates/deletes to parity buckets
* Update: send only the Δ-record
* Deletes: management of free ranks within data buckets
Data bucket split
* N1: #remaining records
* N2: #leaving records
* Parity group of the splitting data bucket: N1+N2 deletes + N1 inserts
* Parity group of the new data bucket: N2 inserts
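The parity traffic caused by one data bucket split follows directly from the counts on this slide. The sketch below only restates that arithmetic; the function name and the 5,000/5,000 example are hypothetical, not measured values.

```python
def split_parity_updates(n1, n2):
    """Parity updates triggered by one split.

    n1: records remaining in the splitting data bucket
    n2: records leaving for the new data bucket
    """
    splitting_group = {"deletes": n1 + n2, "inserts": n1}
    new_group = {"deletes": 0, "inserts": n2}
    total = sum(splitting_group.values()) + sum(new_group.values())
    return {"splitting_group": splitting_group, "new_group": new_group, "total": total}


# Hypothetical split of a full 10,000-record bucket into two equal halves.
updates = split_parity_updates(n1=5000, n2=5000)
```

Every record of the splitting bucket is first withdrawn from its parity records, then the remaining ones are re-inserted, so a split costs roughly twice the bucket size in parity updates. That cost is a plausible reason why such volumes go through bulk TCP/IP transfers rather than individual messages.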
-
Performances
Config.
* Client window = 1 or client window = 5
* Max bucket size = 10,000 records
* File of 25,000 records
* 1 record = 104 bytes
No difference between GF(2^8) and GF(2^16) (we don't wait for ACKs between DBs and PBs)
-
Performances (client window = 1)
* k = 0 to k = 1: performance degradation of 20%
* k = 1 to k = 2: performance degradation of 8%
[Chart: file creation time (sec) vs. number of inserted keys, one curve per k = 0, k = 1 and k = 2. Final times for 25,000 records: 7.896 s (k = 0), 9.990 s (k = 1), 10.963 s (k = 2).]
Measurements behind the chart (3 trials per configuration; cumulative time sampled every 500 inserted keys, plus the inserts that trigger a bucket split). The average insert time per record is roughly 0.3 ms for k = 0 and up to about 0.4 ms for k = 2.

Comparison: total file creation time for 25,000 records (sec)
Configuration | Trial 1 | Trial 2 | Trial 3 | Average | Improvement (%), New Matrix | Improvement (%), GF[2^8] / GF[2^16]
k = 0, GF[2^8] | 7.859 | 7.953 | 7.875 | 7.896 | |
k = 0, GF[2^16] | 7.907 | 8.062 | 7.985 | 7.985 | | 1.115
k = 1, RS, GF[2^8] | 9.922 | 10.141 | 9.969 | 10.011 | |
k = 1, XOR, GF[2^8] | 9.969 | 10.031 | 9.969 | 9.990 | 0.2097762387 |
k = 1, RS, GF[2^16] | 10.218 | 10.062 | 10.172 | 10.151 | | 1.379
k = 1, XOR, GF[2^16] | 10.156 | 10.156 | 10.062 | 10.125 | 0.2561408118 | 1.333
k = 2, GF[2^8] | 11.187 | 10.859 | 10.843 | 10.963 | |
k = 2, GF[2^16] | 10.984 | 10.938 | 11 | 10.974 | | 0.1002369236
Average | | | | | 0.2329585252 | 0.982

Bucket split durations (sec), GF[2^8]
Configuration | Split | Trial 1 | Trial 2 | Trial 3
k = 0 | DB 0 split | 0.218 | 0.203 | 0.219
k = 0 | DB 1 split | 0.219 | 0.219 | 0.219
k = 0 | DB 2 split | 0.094 | 0.094 | 0.094
(k = 1) + RS | DB 0 split | 0.875 | 0.890 | 0.875
(k = 1) + RS | DB 1 split | 0.859 | 0.860 | 0.906
(k = 1) + RS | DB 2 split | 0.250 | 0.250 | 0.250
(k = 1) + New Matrix | DB 0 split | 0.875 | 0.875 | 0.875
(k = 1) + New Matrix | DB 1 split | 0.875 | 0.891 | 0.875
(k = 1) + New Matrix | DB 0 split | 0.250 | 0.250 | 0.250
(k = 2) + New Matrix | DB 0 split | 0.906 | 0.906 | 0.891
(k = 2) + New Matrix | DB 1 split | 0.922 | 0.921 | 0.891
(k = 2) + New Matrix | DB 2 split | 0.281 | 0.282 | 0.281
-
Performances (client window = 5)
* k = 0 to k = 1: performance degradation of 37%
* k = 1 to k = 2: performance degradation of 10%
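The degradation percentages on the two Performances slides can be checked against the file creation times reported with the charts (7.896 s, 9.990 s, 10.963 s for a client window of 1; 4.349 s, 6.940 s, 7.720 s for a window of 5). The formula below, which takes the extra time relative to the slower configuration and truncates to a whole percent, is an assumption: it is the one that reproduces all four published figures, but the slides do not state it.

```python
def degradation_percent(t_before, t_after):
    """Extra time of the slower configuration, as a share of the slower time (assumed formula)."""
    return 100.0 * (t_after - t_before) / t_after


# File creation times in seconds for k = 0, 1, 2, as reported with the charts.
window_1 = (7.896, 9.990, 10.963)
window_5 = (4.349, 6.940, 7.720)

results = {
    "window=1, k=0 to k=1": degradation_percent(window_1[0], window_1[1]),
    "window=1, k=1 to k=2": degradation_percent(window_1[1], window_1[2]),
    "window=5, k=0 to k=1": degradation_percent(window_5[0], window_5[1]),
    "window=5, k=1 to k=2": degradation_percent(window_5[1], window_5[2]),
}
truncated = {name: int(value) for name, value in results.items()}
```

The larger window shortens every run, but it also makes the relative cost of the first parity bucket more visible, since the data path alone becomes much faster.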
[Chart: file creation time (sec) vs. number of inserted keys, one curve per k = 0, k = 1 and k = 2. Final times for 25,000 records: 4.349 s (k = 0), 6.940 s (k = 1), 7.720 s (k = 2).]
Measurement sheet: UDP listen priority = highest (cumulative time sampled every 500 inserted keys)

Total file creation time for 25,000 records (sec)
k | Trial 1 | Trial 2 | Trial 3
k = 0 | 3.921 | 4.531 | 4.391
k = 1 | 7.641 | 6.187 | 7.735
k = 2 | 8.500 | 19.141 |

Final file state, k = 0
* Trial 1: 6 DBs, {DB0, DB1, DB4, DB5}: 3125 records, {DB2, DB3}: 6250 records
* Trial 2: 5 DBs, {3125, 6251, 6249, 6249, 3126} records respectively for {DB0, ..., DB4}
* Trial 3: 4 DBs, {6251, 6250, 6249, 6250} records respectively for {DB0, ..., DB3}

Record losses at the parity buckets (per-bucket counters: # recs, # recv ins, # sent to PBs, # forwards)
k | Trial | Loss 1 | Loss 2
k = 1 | Trial 1 | 0.05% | 0%
k = 1 | Trial 2 | 0.00% |
k = 1 | Trial 3 | 0.04% | 0%
k = 2 | Trial 1 | 0.40% | 0%
k = 2 | Trial 2 | 0.06% | 0%

Note on trial 2 (k = 2): repeated many times with the same result (about 19 s file creation time). Data records are lost at the client level, which causes message retransmissions.
Measurement sheet: Fast + Ack [client -- DB] (cumulative time sampled every 500 inserted keys)

Configuration
* All DBs and the client run on 1.8 GHz machines with a 1 Gbps network
* Window size = working threads in a DB = 5
* PBs and other buckets, if created, run on the 2.6 GHz machine

Total file creation time for 25,000 records (sec)
k | Trial 1 | Trial 2 | Trial 3 | Average
k = 0 | 3.953 | 4.609 | 4.484 | 4.349
k = 1 | 6.969 | 6.359 | 7.500 | 6.94
k = 2 | 6.906 | 8.141 | 8.109 | 7.72

Bucket split durations (sec), as listed in the sheet
* DB 0 split: 0.250, 0.984, 0.984, 1.063, 1.062, 1.110, 1.125
* DB 1 split: 0.093, 0.265, 0.312, 0.266, 0.343, 0.313, 0.281
* DB 0 split: 0.265, 0.984, 0.609, 0.967, 0.594, 1.015, 1.000
* DB 0 split: 0.547, 0.546, 0.547

Record losses at the parity buckets (per-bucket counters: # recs, # recv ins, # sent to PBs, # forwards)
k | Trial | Loss 1 | Loss 2
k = 1 | Trial 1 | 1.50% |
k = 1 | Trial 2 | 0.70% |
k = 1 | Trial 3 | 0% | 0%
k = 2 | Trial 1 | 3.00% |
k = 2 | Trial 2 | 0% | 0%
k = 2 | Trial 3 | 0.60% | 0%
Fast + Ack [client -- DB]
[Figure: File Creation Time (sec) vs. Number of Inserted Keys, one curve each for k = 0, k = 1 and k = 2. Curve labels: 4.349 s (k = 0), 6.940 s (k = 1), 7.720 s (k = 2).]
-
Outline: 1. Issue; 2. State of the Art; 3. LH*RS Scheme; 4. LH*RS Manager; 5. Experimentations; 6. File Creation; 7. Bucket Recovery (Scenario, Performances); 8. Parity Bucket Creation
-
Scenario: Failure Detection
Coordinator to the Data Buckets and Parity Buckets: "Are you Alive?"
-
Scenario (2): Waiting for Responses
Data Buckets and Parity Buckets to the Coordinator: "OK"
-
Scenario (3): Searching Spare Buckets
Coordinator to the Multicast Group of Blank Data Buckets: "Wanna be Spare?"
-
Scenario (4): Waiting for Replies
Candidates in the Multicast Group of Blank Data Buckets to the Coordinator: "I would"
Each candidate: Launch UDP Listening; Launch TCP Listening; Launch Working Threads; Waiting for Confirmation (*)
(*) If the time-out elapses, cancel everything
-
Scenario (5): Spare Selection
Coordinator to the selected buckets of the Multicast Group of Blank Data Buckets: "You are Hired"; they answer "Confirmed"
Coordinator to the other candidates: "Cancellation"
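The spare-search handshake of Scenario (3) to (5) can be sketched as a small state machine. This is only an illustration under assumptions not stated on the slides: the class and function names are hypothetical, messages are plain method calls instead of multicast/UDP traffic, and the coordinator is assumed to hire the first volunteers that answered.

```python
from enum import Enum


class State(Enum):
    BLANK = "blank"      # idle member of the multicast group
    WAITING = "waiting"  # answered "I would", waiting for confirmation
    SPARE = "spare"      # hired by the coordinator


class BlankBucket:
    """Hypothetical candidate bucket in the multicast group."""

    def __init__(self, name):
        self.name = name
        self.state = State.BLANK

    def on_request(self):
        # A real bucket would launch UDP/TCP listening and working threads here.
        self.state = State.WAITING
        return "I would"

    def on_hired(self):
        self.state = State.SPARE
        return "Confirmed"

    def on_cancel_or_timeout(self):
        # "Cancellation" received or time-out elapsed: cancel everything.
        self.state = State.BLANK


def select_spares(group, needed):
    """Coordinator side: hire `needed` volunteers, cancel the others."""
    volunteers = [b for b in group if b.on_request() == "I would"]
    hired = []
    for bucket in volunteers:
        if len(hired) < needed and bucket.on_hired() == "Confirmed":
            hired.append(bucket)
        else:
            bucket.on_cancel_or_timeout()
    return hired


group = [BlankBucket(n) for n in ("a", "b", "c")]
hired = select_spares(group, needed=2)
print([b.name for b in hired], [b.state.value for b in group])
```

Buckets that are not hired return to the blank state, which mirrors the "cancel everything" rule on the candidate side.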
-
Scenario (6): Recovery Manager Selection
Coordinator to one of the Parity Buckets: "Recover Failed Buckets"
-
Scenario (7): Query Phase
Recovery Manager to the buckets participating in recovery (Data Buckets, Parity Buckets): "Send me Records of rank in [r, r+slice-1]"
Other actors shown: Spare Buckets
-
Scenario (8): Reconstruction Phase
Buckets participating in recovery (Data Buckets, Parity Buckets) to the Recovery Manager: Requested Buffers
Decoding Phase, in parallel with the Query Phase: Recovery Manager to the Spare Buckets: Recovered Slices
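The query and decoding phases can be sketched for the simplest case, one lost data bucket and XOR parity (k = 1). This is a minimal illustration under stated assumptions, not the thesis implementation: buckets are in-memory lists instead of network nodes, every record has the same length, all buckets hold the same number of records, and the names `xor_parity` and `recover_bucket` are hypothetical.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length records byte by byte."""
    return bytes(x ^ y for x, y in zip(a, b))


def xor_parity(data_buckets):
    """Parity record of rank r = XOR of the data records of rank r."""
    return [reduce(xor_bytes, same_rank) for same_rank in zip(*data_buckets)]


def recover_bucket(survivors, parity, slice_size):
    """Rebuild one lost data bucket, one slice of ranks at a time."""
    recovered = []
    for r in range(0, len(parity), slice_size):
        # Query phase: records of rank in [r, r + slice_size - 1] from
        # every surviving data bucket and from the parity bucket.
        buffers = [b[r:r + slice_size] for b in survivors]
        buffers.append(parity[r:r + slice_size])
        # Decoding phase: XOR the records that share a rank.
        recovered.extend(reduce(xor_bytes, same_rank) for same_rank in zip(*buffers))
    return recovered


# Toy group of m = 4 data buckets, 10 records of 8 bytes each.
buckets = [[bytes((17 * b + 5 * r + i) % 256 for i in range(8)) for r in range(10)]
           for b in range(4)]
parity = xor_parity(buckets)
lost = 2
survivors = [b for i, b in enumerate(buckets) if i != lost]
rebuilt = recover_bucket(survivors, parity, slice_size=3)
print(rebuilt == buckets[lost])
```

The slice size only changes how much is requested per round, not the result, which is why it can be tuned to the machines' storage and computing capacities.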
-
Performances: Config.
File Info: File of 125 000 records; Record Size = 100 bytes; Bucket Size = 31250 records = 3.125 MB; Group of 4 Data Buckets (m = 4), k-Available with k = 1, 2, 3
Decoding: GF(2^16); RS+ Decoding (RS + log pre-computation of H^-1 and of the OK Symbols Vector)
Recovery per Slice (adaptive to the PCs' storage & computing capacities)
-
Performances: 1 DB, XOR
Slice   Total Time (sec)  CPU Time (sec)  Com. Time (sec)
1250    0.625             0.266           0.348
3125    0.588             0.255           0.323
6250    0.552             0.240           0.312
15625   0.562             0.255           0.302
31250   0.578             0.250           0.328
-
Performances: 1 DB, RS
Slice   Total Time (sec)  CPU Time (sec)  Com. Time (sec)
1250    0.734             0.349           0.365
3125    0.688             0.359           0.323
6250    0.656             0.354           0.297
15625   0.667             0.360           0.297
31250   0.688             0.360           0.328
-
Performances: XOR vs. RS
Time to Recover 1 DB, XOR: 0.58 sec
Time to Recover 1 DB, RS: 0.67 sec
XOR in GF(2^16) realizes a gain of 13% in Total Time (and 30% in CPU Time)
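The quoted gains can be checked with a few lines of arithmetic. The sketch below assumes that the first 1-DB table holds the XOR measurements and the second the RS measurements, and that the CPU gain is taken over the mean CPU time of the five slice sizes; both are assumptions, and `relative_gain` is a hypothetical helper name.

```python
from statistics import mean

# CPU times (sec) per slice size, copied from the two 1-DB tables.
xor_cpu = [0.266, 0.255, 0.240, 0.255, 0.250]
rs_cpu = [0.349, 0.359, 0.354, 0.360, 0.360]


def relative_gain(slower: float, faster: float) -> float:
    """Fraction of time saved when going from `slower` to `faster`."""
    return (slower - faster) / slower


total_gain = relative_gain(0.67, 0.58)              # recovery times quoted on the slide
cpu_gain = relative_gain(mean(rs_cpu), mean(xor_cpu))
print(f"total: {total_gain:.1%}, cpu: {cpu_gain:.1%}")
```

Under these assumptions the result is roughly 13% in total time and 29% in CPU time, in line with the rounded figures on the slide.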
-
Performances: 2 DBs
Slice   Total Time (sec)  CPU Time (sec)  Com. Time (sec)
1250    0.976             0.577           0.375
3125    0.932             0.589           0.338
6250    0.883             0.562           0.321
15625   0.875             0.562           0.281
31250   0.875             0.562           0.313
-
Performances: 3 DBs
Slice   Total Time (sec)  CPU Time (sec)  Com. Time (sec)
1250    1.281             0.828           0.406
3125    1.250             0.828           0.390
6250    1.211             0.852           0.352
15625   1.188             0.823           0.361
31250   1.203             0.828           0.375
-
Performances: Summary
Time to Recover f Buckets < f x Time to Recover 1 Bucket
Factorized Query Phase: the extra cost is the Decoding Time & the Time to send the Recovered Buffers
f        Bucket Size (MB)  Total Time (sec)  Recovery Speed (MB/sec)
1 (XOR)  3.125             0.58              5.38
1 (RS)   3.125             0.67              4.66
2        6.250             0.9               6.94
3        9.375             1.23              7.62
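The recovery speeds in the summary follow from dividing the recovered volume by the total time. The short check below uses only figures from the slides (RS rows); the names `BUCKET_MB`, `total_time` and `recovery_speed` are introduced here for illustration.

```python
BUCKET_MB = 3.125                            # one bucket: 31250 records of 100 bytes
total_time = {1: 0.67, 2: 0.90, 3: 1.23}     # f failed buckets -> total time (sec), RS


def recovery_speed(f: int, seconds: float) -> float:
    """MB recovered per second when f buckets are rebuilt together."""
    return f * BUCKET_MB / seconds


speeds = {f: recovery_speed(f, t) for f, t in total_time.items()}
# The query phase is shared, so recovering f buckets costs less than f single recoveries.
factorized = all(total_time[f] < f * total_time[1] for f in (2, 3))

for f, s in speeds.items():
    print(f, f"{s:.2f} MB/s")
print("cheaper than f separate recoveries:", factorized)
```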
-
Performances: GF(2^8)
XOR in GF(2^8) improves decoding performance by 60% compared to RS in GF(2^8).
RS/RS+ decoding in GF(2^16) realizes a gain of 50% compared to decoding in GF(2^8).
-
Outline: 1. Issue; 2. State of the Art; 3. LH*RS Scheme; 4. LH*RS Manager; 5. Experimentations; 6. File Creation; 7. Bucket Recovery; 8. Parity Bucket Creation (Scenario, Performances)
-
Scenario: Searching for a new Parity Bucket
Coordinator to the Multicast Group of Blank Parity Buckets: "Wanna Join Group g?"
-
Scenario (2): Waiting for Replies
Candidates in the Multicast Group of Blank Parity Buckets to the Coordinator: "I Would"
Each candidate: Launch UDP Listening; Launch TCP Listening; Launch Working Threads; Waiting for Confirmation (*)
(*) If the time-out elapses, cancel everything
-
Scenario (3): New Parity Bucket Selection
Coordinator to the selected bucket of the Multicast Group of Blank Parity Buckets: "You are Hired"; it answers "Confirmed"
Coordinator to the other candidates: "Cancellation"
-
Scenario (4): Auto-creation, Query Phase
New Parity Bucket to the Group of Data Buckets: "Send me your contents!"
-
Scenario (5): Auto-creation, Encoding Phase
The New Parity Bucket encodes the contents of the Group of Data Buckets
-
Performances: Config.
Max Bucket Size: 5000 .. 50000 records
Bucket Load Factor: 62.5%
Record Size: 100 bytes
Group of 4 Data Buckets
Encoding: GF(2^16); RS++ (log pre-computation & row-of-1s XOR encoding to process the 1st DB buffer)
-
Performances: XOR
Same Encoding Rate
For every Bucket Size: CPU Time ≈ 74% of Total Time
Bucket Size  Total Time (sec)  CPU Time (sec)  Com. Time (sec)
5000         0.190             0.140           0.029
10000        0.429             0.304           0.066
25000        1.007             0.738           0.144
50000        2.062             1.484           0.322
-
Performances: RS
Same Encoding Rate
For every Bucket Size: CPU Time ≈ 74% of Total Time
Bucket Size  Total Time (sec)  CPU Time (sec)  Com. Time (sec)
5000         0.193             0.149           0.035
10000        0.446             0.328           0.059
25000        1.053             0.766           0.153
50000        2.103             1.531           0.322
-
Performances: XOR vs. RS
For Bucket Size = 50000 records:
XOR encoding time: 2.062 sec
RS encoding time: 2.103 sec
XOR realizes a performance gain in CPU Time of 5% (only about 2% in Total Time)
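The CPU share and the total-time gain can be recomputed from the two encoding tables. The sketch assumes the first table (2.062 sec at 50000 records) is the XOR run and the second (2.103 sec) the RS run, which matches the times quoted above; the dictionary names are hypothetical.

```python
# bucket size -> (total time, CPU time) in seconds, copied from the slides
xor_enc = {5000: (0.190, 0.140), 10000: (0.429, 0.304),
           25000: (1.007, 0.738), 50000: (2.062, 1.484)}
rs_enc = {5000: (0.193, 0.149), 10000: (0.446, 0.328),
          25000: (1.053, 0.766), 50000: (2.103, 1.531)}

# Share of the total time spent on CPU, per bucket size (XOR run).
cpu_share = {size: cpu / total for size, (total, cpu) in xor_enc.items()}

# Fraction of total time saved by XOR over RS for the largest bucket.
total_gain = (rs_enc[50000][0] - xor_enc[50000][0]) / rs_enc[50000][0]

for size, share in cpu_share.items():
    print(size, f"{share:.1%}")
print(f"total-time gain at 50000 records: {total_gain:.3f}")
```

With these inputs the CPU share stays between about 71% and 74% for every bucket size, and the total-time gain at 50000 records comes out near 0.02 as a fraction, that is, about 2%.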
-
Performances: GF(2^8)
As with GF(2^16), CPU Time = 3/4 of Total Time
XOR in GF(2^8) improves CPU Time by 22%
-
Performance (Wintel P4, 1.8 GHz, 1 Gbps)
File Creation Rate: 0.33 MB/s for k = 0; 0.25 MB/s for k = 1; 0.23 MB/s for k = 2
Record Insert Time: 0.29 ms for k = 0; 0.33 ms for k = 1; 0.36 ms for k = 2
Bucket Recovery Rate: 4.66 MB/s from 1-unavailability; 6.94 MB/s from 2-unavailability; 7.62 MB/s from 3-unavailability
Record Recovery Time: about 1.3 ms
Key Search Time: Individual > 0.24 ms; Bulk > 0.056 ms
-
Conclusion
Experiments prove: Optimizations; Encoding/Decoding; Architecture; Impact on Performance; Good Recovery Performances
-
Future Work
Update Propagation to Parity Buckets: Reliability, Performance
Reduce Coordinator Tasks
Parity Declustering
Investigation of New Erasure-Resilient Codes
-
Publications
[ML02] R. Moussa, W. Litwin, Experimental Performance Analysis of LH*RS Parity Management, Carleton Scientific Records of the 4th International Workshop on Distributed Data & Structure: WDAS 2002, p. 87-97.
[MS04] R. Moussa, T. Schwarz, Design and Implementation of LH*RS: A Highly-Available Scalable Distributed Data Structure, Carleton Scientific Records of the 6th International Workshop on Distributed Data & Structure: WDAS 2004.
[LMS04] W. Litwin, R. Moussa, T. Schwarz, Prototype Demonstration of LH*RS: A Highly Available Distributed Storage System, Proc. of VLDB 2004 (Demo Session), p. 1289-1292.
[LMS04-a] W. Litwin, R. Moussa, T. Schwarz, LH*RS: A Highly Available Distributed Storage System, journal version submitted, under revision.
-
Thank You For Your Attention
Questions?
The client has an image of the file.
Computing 1 parity symbol (RS): m GF multiplications + m-1 XORs. Computing 1 parity symbol (XOR): m-1 XORs. Multiply by the number of symbols per record, then by the number of records.
Deduct the number of dummy data buckets.
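The operation counts in the note above can be made concrete with a small Galois-field sketch. It is an illustration only: it works in GF(2^8) rather than GF(2^16) to keep the log tables small, it assumes the common primitive polynomial 0x11D with generator 2, and `gf_mul`, `parity_symbol` and `operation_count` are hypothetical names, not the thesis code.

```python
PRIM_POLY = 0x11D  # assumed primitive polynomial for GF(2^8)

# Log / antilog tables, computed once (the "log pre-computation").
EXP = [0] * 510
LOG = [0] * 256
x = 1
for i in range(255):
    EXP[i] = x
    LOG[x] = i
    x <<= 1
    if x & 0x100:
        x ^= PRIM_POLY
for i in range(255, 510):
    EXP[i] = EXP[i - 255]


def gf_mul(a: int, b: int) -> int:
    """Multiply in GF(2^8): add the logs, then take the antilog."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def parity_symbol(coeffs, symbols):
    """One RS parity symbol: m GF multiplications combined by XOR."""
    p = 0
    for c, s in zip(coeffs, symbols):
        p ^= gf_mul(c, s)
    return p


def operation_count(m, symbols_per_record, records, rs=True):
    """(GF multiplications, XORs) needed to encode one parity bucket."""
    per_symbol = (m if rs else 0, m - 1)
    scale = symbols_per_record * records
    return per_symbol[0] * scale, per_symbol[1] * scale


print(parity_symbol([1, 1, 1, 1], [5, 6, 7, 8]))  # coefficients of 1 reduce to plain XOR
print(operation_count(4, 100, 31250))             # m = 4, 100 one-byte symbols per record
```

A coefficient of 1 leaves the symbol unchanged, so a parity row made of 1s needs no multiplications at all and degenerates to XOR encoding.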
Performance of GF(2^16) compared to GF(2^8).
1 is the identity.
In parallel: after the decoding phase, the query phase.
Recovery speeds.