The 6th Internet and Operation Technology Symposium (IOTS2013), WIP session

Hiroki Kashiwazaki (Cybermedia Center, Osaka University), "The latest results of a distributed storage system for a widely distributed virtualization infrastructure", 2013/12/12, IOTS2013 WIP

Upload: hiroki-kashiwazaki

Posted on 28-Jun-2015

DESCRIPTION

Presentation slides from the WIP (Work In Progress) session of the 6th Internet and Operation Technology Symposium (IOTS2013). Three talks are combined into one deck: 1. The latest results of a distributed storage system for a widely distributed virtualization infrastructure. 2. A consideration of an accounting model for environments where dynamic wide-area live migration is available. 3. The design of the virtualization infrastructure at Osaka University and its expansion plan.

TRANSCRIPT

  • 1. The latest results of a distributed storage system for a widely distributed virtualization infrastructure. 2013/12/12, IOTS2013 WIP, Cybermedia Center, Osaka University

  • 2.
  • 3. (3) 179p
  • 4. DR: Disaster Recovery
  • 5. 1978
  • 6. Sun Information Systems
  • 7. mainframe hot site
  • 8. the '80s-'90s
  • 9. Realtime Processing
  • 10. POS (point of sales)
  • 11. the '90s-'00s
  • 12. the Internet
  • 13. 2001.9.11: September 11 attacks
  • 14. 2003.8.14: Northeast blackout of 2003
  • 15. in Japan
  • 16. 2011.3.11: The aftermath of the 2011 Tohoku earthquake and tsunami
  • 17. BCP: Business Continuity Plan
  • 18. Gunma prefecture
  • 19. Ishikari City
  • 20. No, four. Two, two, four.
  • 21. 2011
  • 22.
  • 23. 2012
  • 24.
  • 25. Trans-Japan Inter-Cloud Testbed
  • 26. University of the Ryukyus, SINET, Kitami Institute of Technology, Cybermedia Center Osaka University
  • 27. XenServer 6.0.2 + CloudStack 4.0.0 / CloudStack 4.0.0 + XenServer 6.0.2
  • 28. problems
  • 29. shared storage
  • 30. 50ms
  • 31. RTT > 100ms
  • 32. Storage XenMotion: live migration without shared storage (XenServer 6.1 and later)
  • 33. VSA: vSphere Storage Appliance
  • 34. WIDE cloud
  • 35. Distributed Storage
  • 36. requirement
  • 37. [figure: iozone 3-D throughput surface, Kbytes/sec vs. file size and record size in 2^n KBytes; the requirement is high random R/W performance]
  • 38. POSIX interface protocols: NFS, CIFS, iSCSI
  • 39. RICC: Regional InterCloud Committee
  • 40. Distcloud
  • 41. Confidential. Global VM migration is also available by sharing a "storage space" among VM host machines; real-time availability makes it possible, and the actual data copy follows. (The VM operator needs a virtually common Ethernet segment and a fat pipe for the memory copy.) The diagram shows live migration of a VM between distributed areas: before migration, the TOYAMA, TOKYO and OSAKA sites each copy to DR sites; after migration, the real-time, active-active features make the system look like just a simple "shared storage". Live migration is also possible between DR sites (it requires a common subnet and a fat pipe for the memory copy, of course).
  • 42.
Confidential. Front-end servers aggregate client requests (READ/WRITE) so that many back-end servers can handle user data in a parallel, distributed manner. Both performance and storage space are scalable, depending on the number of servers: clients, front-end (access servers), back-end (core servers); read and write blocks flow through an access gateway (via NFS, CIFS or similar). Scalable performance and scalable storage size come from parallel and distributed processing technology.
  • 43. Back-end (core servers): a file is split into blocks, and blocks and metadata are placed with a consistent hash.
  • 44. Confidential. The key to "distributed replication": (1) assign a new unique ID to any updated block, so that the ID ensures consistency (a file consists of many blocks); (2) create two copies locally for each piece of user data, write the metadata, and return an ACK (for a quick ACK); (3-a) make a copy in a different location right after the ACK; (3-b) remove one of the two local blocks in the future. Multiplicity across multiple locations makes each piece of user data redundant locally at first, with three distributed copies at last.
  • 45. NFS, CIFS, iSCSI
  • 46. [diagram: write at r=0, ACK at r=1, then r=2; redundancy = 3]
  • 47. [diagram: replica and external-copy counters (r, e) per site, with ACK; redundancy = 3 including external copies]
  • 48. VM / hypervisor, 10Gbps, 1/4U server x4, Cisco UCS
  • 49. SINET4, EXAGE/Storage, RICC. Copyright 2012 Yoshiaki Kitaguchi, all rights reserved.
  • 50.
  • 51. SINET4 L2VPN/L3VPN, 10Gbps; inter-site distances 825km, 829km, 440km, 316km, 223km, 417km, 274km (RICC). Copyright 2012 Yoshiaki Kitaguchi, all rights reserved.
  • 52. iozone -aceI (a: full automatic mode; c: include close() in the timing calculations; e: include flush (fsync, fflush) in the timing calculations; I: use DIRECT IO if possible for all file operations)
  • 53.
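Slides 43-47 sketch the write path of the distributed storage: block placement by consistent hash, a fresh unique ID per updated block, two quick local replicas before the ACK, and asynchronous remote copies afterwards, ending with three distributed copies. The following is a minimal sketch of that flow; all class, function and site names are hypothetical, and the real Exage/Storage internals are not shown in the slides:

```python
import hashlib
import itertools
from bisect import bisect_right


def _h(s):
    """Hash a string to a point on the ring."""
    return int(hashlib.md5(s.encode()).hexdigest(), 16)


class Ring:
    """Consistent-hash ring mapping block IDs to core servers (slide 43)."""

    def __init__(self, nodes, vnodes=64):
        # Virtual nodes smooth the distribution across core servers.
        self._points = sorted(
            (_h(f"{n}#{i}"), n) for n in nodes for i in range(vnodes)
        )
        self._keys = [p for p, _ in self._points]

    def lookup(self, block_id):
        i = bisect_right(self._keys, _h(block_id)) % len(self._points)
        return self._points[i][1]


_id_counter = itertools.count()


def write_block(ring, local_site, remote_sites, data, placements):
    """Write path sketched on slide 44; `placements` records
    (site, node, block_id) triples. The block payload `data` would be
    sent to each placement; the transfer itself is omitted here."""
    # (1) Assign a new unique ID to the updated block: the ID, not the
    #     file offset, identifies the version, which ensures consistency.
    block_id = f"blk-{next(_id_counter)}"
    # (2) Two local replicas, then metadata update and a quick ACK.
    local = [(local_site, ring.lookup(f"{block_id}#{i}")) for i in range(2)]
    placements.extend((s, n, block_id) for s, n in local)
    # ... the client is unblocked (ACK) at this point ...
    # (3-a) Copy to the remote site(s) right after the ACK.
    for site in remote_sites:
        placements.append((site, ring.lookup(block_id), block_id))
    # (3-b) Later, remove one of the two local copies; the slides show
    #       three distributed copies at last (redundancy = 3).
    placements.remove((*local[1], block_id))
    return block_id
```

Separating the ACK (issued after the local copies) from the remote copy is what keeps write latency low even when the DR site is hundreds of kilometres away.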
[figure: iozone write-throughput surfaces, Kbytes/sec vs. file size and record size in 2^n KBytes]
  • 54. [figure: iozone throughput (MB/sec) vs. file size (10MB-10GB) for write, rewrite, read, reread, random read, random write, bkwd read, record rewrite, stride read, fwrite and fread; legend: Exage/Storage]
  • 55. SINET4: Hiroshima University, EXAGE L3VPN / SINET4: Kanazawa University, EXAGE L3VPN
  • 56. SINET4: Kanazawa University, EXAGE L3VPN / SINET4: NII, EXAGE L3VPN
  • 57. [chart: throughput (MB/sec), read and write before and after migration, proposed method vs. shared NFS]
  • 58. Read vs. NFS Read
  • 59. Write vs. NFS Write
  • 60. SC2013, 2013/11/17-22 @ Colorado Convention Center
  • 61. Ikuo Nakagawa @ Osaka Univ., INTEC Inc.
  • 62. Kouhei Ichikawa @ NAIST
  • 63. We have been developing a widely distributed cluster storage system and evaluating the storage along with various applications. The main advantage of our storage is its very fast random I/O performance, even though it provides a POSIX-compatible file system interface on top of the distributed cluster storage.
  • 64.
  • 65. Shinji Shimojo @ Osaka Univ., NICT
  • 66.
  • 67.
  • 68.
RTT = 244ms, 1Gbps
  • 69. 2.4km, the Atlantic, Hiroshima
  • 70.
  • 71. consistent hash
  • 72. [table: elapsed time (s): Read 17.9 / 201.6, Write 175.4 / 400.6; I/O read : write = 25.4 MB/s : 20.9 MB/s; dd: 60-70MB/s (read), 50-60MB/s (write)]
  • 73.
  • 74.
  • 75. DC, DR
  • 76.
  • 77. Future Works
  • 78.
  • 79. VM migration
  • 80. [table: layer-by-layer techniques for VM mobility. L2+: VPLS, IEEE 802.1ad PB (Q-in-Q), IEEE 802.1ah (Mac-in-Mac). L2 over L3: VXLAN, OTV, NVGRE. L3 SDN with ID/Locator separation: OpenFlow, LISP. L4-L7: MAT, NEMO, MIP (Kagemusha), mSCTP, SCTP, DNS + reverse NAT, dynamic DNS]
  • 81. 4th RICC workshop @ Okinawa, 2014/3/27 (Thu) - 3/28 (Fri)
  • 82. go to next stage
  • 83. A consideration of an accounting model based on the availability of dynamic wide-area live migration. 2013/12/12, IOTS2013 WIP, Cybermedia Center, Osaka University
  • 84.
  • 85.
  • 86. VM
  • 87. Available supplies: frequency x cores over time
  • 88.
  • 89.
  • 90. 4 cores, 8GB memory, 40GB storage; Virtualized Machines (VMs), virtualization servers, an interface, users, a cloud service provider
  • 91. [diagram: users' real demand vs. imaginary demand (frequency x cores over time); IT services run on VMs (4 cores, 8GB memory, 40GB storage) on virtualization servers; the cloud service provider sees imaginary resources against the available supplies]
  • 92. [diagram: real demand curves (frequency x cores over time) for many VMs]
  • 93. Users, user experience, IT services
  • 94. Users, IT services, Virtualized Machines (VMs): a VM's demand shows per-day, per-week and per-year periodicity (frequency x cores over time)
  • 95. Virtualized Machines (VMs): VMs are consolidated onto virtualization servers
  • 96. Virtualization servers: datacenter ON/OFF
  • 97. Datacenter: power supplier
  • 98. implementation of simulator
  • 99.
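Slides 87-97 model both demand and supply as frequency x cores over time, with daily, weekly and yearly periodicity, which suggests billing for the area under the demand curve rather than a flat fee for "4 cores". A toy sketch of that idea follows; every constant, name and the pricing formula are illustrative assumptions, not the model proposed in the talk:

```python
import math


def demand(t_hours, base_hz=1.0e9, cores=4):
    """Toy demand signal in Hz x cores with per-day and per-week
    periodicity (slides 92-94); all constants are illustrative."""
    day = 1 + 0.5 * math.sin(2 * math.pi * t_hours / 24)
    week = 1 + 0.25 * math.sin(2 * math.pi * t_hours / (24 * 7))
    return base_hz * cores * day * week


def charge(hours, price_per_ghz_core_hour=0.01, supply=4 * 3.4e9):
    """Bill the integral of min(demand, available supply) over time,
    instead of a flat fee for the allocated VM size (slides 87, 91)."""
    total = 0.0
    for t in range(hours):
        used = min(demand(t), supply)  # cannot use more than is supplied
        total += price_per_ghz_core_hour * used / 1e9
    return total
```

For example, `charge(24)` bills one simulated day; because the demand is periodic, a month's bill would reflect actual usage patterns rather than the peak allocation.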
[diagram: the simulator's components. Users require user experience from IT services; VMs show per-day, per-week and per-year demand periodicity (frequency x cores); virtualization servers supply resources; the datacenter aggregates them over time]
  • 100. [diagram: when a virtualization server enters maintenance mode, its VMs migrate to other servers; migration can worsen UX while the required demand must stay within the available supplies (frequency x cores)]
  • 101. strategy of migration
  • 102. Tim Roughgarden (1975-)
  • 103. "Selfish Routing and the Price of Anarchy" (2006)
  • 104. John Forbes Nash Jr. (1928-)
  • 105. non-cooperative game
  • 106.
  • 107.
  • 108.
  • 109. Cathedral and bazaar
  • 110. HPC
  • 111. cloud
  • 112.
  • 113. API
  • 114.
  • 115. go to next stage
  • 116. A design and a project of virtualization infrastructure in Osaka University. 2013/12/12, IOTS2013 WIP, Cybermedia Center, Osaka University
  • 117.
  • 118. Campus Cloud Computing Environment
  • 119. [chart: changes of clock rate of Intel microprocessors, from the Intel 4004 (108KHz) onward; process rule from 600nm down to 22nm, clock rates up to 3900MHz]
  • 120. [chart: changes of the number of cores on Intel Core-series and Xeon processors: Core, Core 2 Duo, i7, Dunnington, Beckton, Westmere, Sandy Bridge, Ivy Bridge] Cybermedia Center, Osaka University
  • 121. ATC / O / C / B. Cybermedia Center, Osaka University
  • 122. migration
  • 123. [diagram: DMZ segment, firewall, load balancer, core switches, service segment, managed segment]
  • 124. [chart: number of joined organizations (up to 40) and number of user accounts on the campus mail system (up to 9000), April 2012 - October 2013] Cybermedia Center, Osaka University
  • 125. 164 cores / 96 cores, 284GB / 432GB, 5.3TB / 3.6TB
  • 126.
  • 127. 1.7 : 1
  • 128.
[chart: averaged monthly changes of CPU usage ratio on the Osaka University campus cloud system, 0-20%, 2013/10/19 17:00 - 2013/11/14 17:00] Cybermedia Center, Osaka University
  • 129. IT, the Cybermedia Center datacenter
  • 130. Cybermedia Center, Osaka University
  • 131. Peter Drucker (1909-2005)
  • 132. "The Deadly Sins in Public Administration" (1980)
  • 133. 2 / 6
  • 134. (1) The first thing to do to make sure that a program will not have results is to have a lofty objective.
  • 135. (2) The second strategy guaranteed to produce non-performance is to try to do several things at once. It is to refuse to establish priorities and to stick to them.
  • 136. (3) The third deadly sin of the public administrator is to believe that "fat is beautiful," despite the obvious fact that mass does not work.
  • 137. (4) "Don't experiment, be dogmatic."
  • 138. (5) "Make sure that you cannot learn from experience" is the next prescription for non-performance in public administration.
  • 139. (6) The last of the administrator's deadly sins is the most damning and the most common: the inability to abandon.
  • 140.
  • 141.
  • 142. design principle
  • 143. (1)
  • 144. (2)
  • 145. (3)
  • 146. (4)
  • 147. (5)
  • 148. (6) 2.5
  • 149. [diagram: PLAN1 / PLAN2 / PLAN3. ODINS VLAN, firewalls, 10GbE L2 switches, load balancers, a software firewall, a software load balancer, L3 switches, a software router; each plan arranges VM segments, VM host segments, storage segments and management segments]
  • 150. VMware NSX
  • 151. VMware VSAN
  • 152. 400
  • 153. 4 : 1 or more
  • 154. 96 cores / 4U, 512GB memory, 20TB
  • 155. DR: Disaster Recovery
  • 156. Distcloud
  • 157.
  • 158. Sebastian Burkhard Thrun (1967-)
  • 159. 50, 10
  • 160. Oxford, Cambridge, Harvard, MIT, Stanford, Princeton, (three online universities), Brigham Young University
  • 161. Angus Maddison, "The World Economy: A Millennial Perspective / Historical Statistics" (2007)
  • 162. [table: the world's top 10 economies by share of world GDP in 1820, 1999, 2009 and a 2050 projection; 1820 is led by China (28.7%) and India (16.0%), 1999 and 2009 by the U.S., and the 2050 projection by China. From: BK Suh, "Mega Trends: An External View", Cisco Connect 2013]
  • 163. motto of Osaka University
  • 164. "Live locally, grow globally."
  • 165. "Survive, locally or globally."
  • 166. 2014/3/31, Cybermedia Center, Osaka University
  • 167.
  • 168.