supercomputers and cloud games

Upload: shinratechnologies

Post on 07-Jul-2015

DESCRIPTION

On September 19th, 2014, Shinra held its first developer event in Tokyo, titled “Supercomputers and Cloud Games.”

TRANSCRIPT

Page 1: Supercomputers and Cloud Games

Supercomputers & cloud gaming

Shinra Technologies, Inc.

Senior vice president

Tetsuji Iwasaki

11/12/2014 © 2014 Shinra Technologies, Inc. All Rights Reserved. 1

Page 2: Supercomputers and Cloud Games

About me

Tetsuji Iwasaki

Hobby: Beer

Started working in the industry in 1990; joined Square-Enix in 1994

Some famous titles: FFT/FFXI/Crysis

+17 game projects

Currently holding these positions:

2011 Square-Enix Holdings, Technology planning specialist

2012 Development director, Eidos Montreal

2014 Shinra Technologies, Inc. SVP (Technology)

Page 3: Supercomputers and Cloud Games

What is cloud gaming?

[Diagram: controller input is sent over the Internet to the data center, and streaming video is sent back.]

「Mini Ninjas」© 2009 Eidos Interactive Ltd. Co-published by Eidos, Inc. and Warner Bros. Interactive Entertainment, a division of Warner Bros. Home Entertainment Inc.

Page 4: Supercomputers and Cloud Games

What is a supercomputer?

There is no clear definition…

Page 5: Supercomputers and Cloud Games

What image of a supercomputer do you have in mind?

Reproduced from the supercomputer "K" page, http://jp.fujitsu.com/about/tech/k/ (accessed 2014/9/17)

Page 6: Supercomputers and Cloud Games

Let's see the top 10 (http://www.top500.org/)

1 Tianhe-2 (MilkyWay-2): TH-IVB-FEP Cluster, Intel Xeon E5-2692 12C 2.200GHz, TH Express-2, Intel Xeon Phi 31S1P. NUDT, China

2 Titan: Cray XK7, Opteron 6274 16C 2.200GHz, Cray Gemini interconnect, NVIDIA K20x. Cray Inc., United States

3 Sequoia: BlueGene/Q, Power BQC 16C 1.60GHz, Custom. IBM, United States

4 K computer: SPARC64 VIIIfx 2.0GHz, Tofu interconnect. Fujitsu, Japan

5 Mira: BlueGene/Q, Power BQC 16C 1.60GHz, Custom. IBM, United States

6 Piz Daint: Cray XC30, Xeon E5-2670 8C 2.600GHz, Aries interconnect, NVIDIA K20x. Cray Inc., Switzerland

7 Stampede: PowerEdge C8220, Xeon E5-2680 8C 2.700GHz, Infiniband FDR, Intel Xeon Phi SE10P. Dell, United States

8 JUQUEEN: BlueGene/Q, Power BQC 16C 1.600GHz, Custom Interconnect. IBM, Germany

9 Vulcan: BlueGene/Q, Power BQC 16C 1.600GHz, Custom Interconnect. IBM, United States

10 Cray XC30, Intel Xeon E5-2697v2 12C 2.7GHz, Aries interconnect. Cray Inc., United States

Page 7: Supercomputers and Cloud Games

Intel® Xeon®

Page 8: Supercomputers and Cloud Games

IBM® Power® BQC

Page 9: Supercomputers and Cloud Games

Fujitsu® SPARC64™ VIIIfx

Page 10: Supercomputers and Cloud Games

NVIDIA® Tesla® / Intel® Xeon Phi™

Page 11: Supercomputers and Cloud Games

The trend

General-purpose processors: 85.4% of the TOP500 systems use Intel processors… not sure exactly, but probably most of them are Xeon

Amazon EC2 is ranked 76th

Amazon EC2 C3 Instance cluster: Intel Xeon E5-2680v2 10C 2.800GHz, 10G Ethernet

Page 12: Supercomputers and Cloud Games

Supercomputers and GPUs

NVIDIA® Tesla®

TESLA GPU ACCELERATORS FOR SERVERS http://www.nvidia.com/object/tesla-servers.html (accessed 2014-9-17)

Intel® Xeon Phi™ Coprocessor

Intel® Xeon Phi™ Coprocessor product specifications http://www.intel.co.jp/content/www/jp/ja/processors/xeon/xeon-phi-detail.html (accessed 2014-9-17)

Page 13: Supercomputers and Cloud Games

The impact of DEGIMA

*Tsuyoshi Hamada, Tetsu Narumi, Rio Yokota, Kenji Yasuoka and Keigo Nitadori. 42 TFlops Hierarchical N-body Simulations on GPUs with Applications in both Astrophysics and Turbulence. SC '09 Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis Article No. 62

Page 14: Supercomputers and Cloud Games

*Introduction to DEGIMA (DEstination for Gpu Intensive MAchine), the Nagasaki University GPU cluster https://www.cps-jp.org/seminar/fy2010/2010-12-01/hamada/pub/20101201_hamada_02.pdf page 5 (accessed 2014-9-17)

Page 15: Supercomputers and Cloud Games

Be careful, just in case

The value of a supercomputer can't be judged by Linpack benchmark performance alone

Maintenance, usability and the purpose of the calculations are not considered by the TOP500 ranking

But maybe people should mind the cost more…

Page 16: Supercomputers and Cloud Games

How to check supercomputers

1 Tianhe-2 (MilkyWay-2) TH-IVB-FEP Cluster, Intel Xeon E5-2692 12C 2.200GHz, TH Express-2, Intel Xeon Phi 31S1P

TH-IVB-FEP Cluster -> system name

Intel Xeon E5-2692 12C 2.200GHz -> CPU name

TH Express-2 -> interconnect

Intel Xeon Phi 31S1P -> accelerator
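As a rough illustration of the field breakdown on this slide, the following Python sketch splits a TOP500-style description string on commas and labels the parts. The names `SystemDescription` and `parse_system` are made up for this example, and the positional field order is an assumption that holds for the Tianhe-2 entry but not for every entry in the list (some have no accelerator, and some omit other fields).

```python
from typing import NamedTuple, Optional


class SystemDescription(NamedTuple):
    system: str
    cpu: str
    interconnect: str
    accelerator: Optional[str]  # not every system has one


def parse_system(description: str) -> SystemDescription:
    """Split 'system, cpu, interconnect[, accelerator]' into named fields.

    Assumes comma-separated fields in exactly this order (hypothetical
    convention for the example, not an official TOP500 format).
    """
    parts = [p.strip() for p in description.split(",")]
    if len(parts) not in (3, 4):
        raise ValueError(f"expected 3 or 4 fields, got {len(parts)}")
    accelerator = parts[3] if len(parts) == 4 else None
    return SystemDescription(parts[0], parts[1], parts[2], accelerator)


tianhe2 = parse_system(
    "TH-IVB-FEP Cluster, Intel Xeon E5-2692 12C 2.200GHz, "
    "TH Express-2, Intel Xeon Phi 31S1P"
)
print(tianhe2)
```

Entries that do not follow the assumed layout raise a `ValueError` rather than being silently mislabeled.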

The K computer's interconnect "Tofu": 6-dimensional mesh/torus

Supercomputer high-dimensional interconnect technology receives the "Imperial Invention Prize" http://pr.fujitsu.com/jp/news/2014/05/29.html (accessed 2014-09-17)
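To make the torus idea concrete, here is a minimal Python sketch that lists the direct neighbours of a node in an n-dimensional torus, where every axis wraps around. It illustrates the general topology only: the function name and the axis sizes are made up, and this is not a model of the actual Tofu wiring.

```python
from typing import List, Tuple

Coord = Tuple[int, ...]


def torus_neighbors(node: Coord, dims: Coord) -> List[Coord]:
    """Return the nodes one hop away from `node` in a torus of shape `dims`."""
    neighbors: List[Coord] = []
    for axis, size in enumerate(dims):
        for step in (-1, 1):
            moved = list(node)
            moved[axis] = (node[axis] + step) % size  # wrap around the ring
            candidate = tuple(moved)
            # Skip self-links (axis of size 1) and duplicates (axis of size 2).
            if candidate != node and candidate not in neighbors:
                neighbors.append(candidate)
    return neighbors


# Hypothetical 6-dimensional torus with 4 nodes along every axis.
dims = (4, 4, 4, 4, 4, 4)
origin_neighbors = torus_neighbors((0, 0, 0, 0, 0, 0), dims)
print(len(origin_neighbors))
```

With at least three nodes per axis, every node has two neighbours per dimension, so a 6-dimensional torus gives each node twelve direct links without any central switch.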

Page 17: Supercomputers and Cloud Games

Questions so far?

Page 18: Supercomputers and Cloud Games

Some technology components of the Shinra system

Remote rendering architecture

RDMA/TCP dual-protocol interconnect

Distribution models depending on game design

Page 19: Supercomputers and Cloud Games

Remote rendering architecture

Page 20: Supercomputers and Cloud Games

Remote rendering architecture

• Rendering on GPU server

• DirectX 11 API calls are executed on my laptop

Page 21: Supercomputers and Cloud Games

Remote rendering architecture: process environment

[Diagram labels, game side: Game.exe (third-party); dinput.dll, dxgi.dll, d3d11.dll; nvwgf2umx.dll; nvlddmkm.sys; …Fakedxgi.dll; Faked3d11.dll; ws2_32.dll; network card.]

[Diagram labels, renderer side: Renderer.exe; ws2_32.dll, dxgi.dll, d3d11.dll; nvwgf2umx.dll; nvlddmkm.sys; network card.]
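The diagram suggests that the fake DLLs capture the game's graphics calls and forward them over the network (ws2_32.dll is the Windows sockets library) to Renderer.exe, which replays them against the real driver. A minimal Python sketch of that capture-and-replay pattern follows. Everything in it is hypothetical and chosen for readability: the `CommandRecorder` and `Renderer` classes, the JSON wire format and the call names are not Shinra's protocol and not the Direct3D API.

```python
import json
from typing import Any, List


class CommandRecorder:
    """Stands in for a 'fake' graphics library: records calls instead of drawing."""

    def __init__(self) -> None:
        self.commands: List[dict] = []

    def __getattr__(self, name: str):
        # Any unknown method the game calls becomes a recorded command.
        def record(*args: Any) -> None:
            self.commands.append({"call": name, "args": list(args)})
        return record

    def flush(self) -> bytes:
        """Serialize the recorded frame so it could be sent over a socket."""
        payload = json.dumps(self.commands).encode("utf-8")
        self.commands = []
        return payload


class Renderer:
    """Stands in for the renderer process: decodes and replays commands."""

    def __init__(self) -> None:
        self.log: List[str] = []

    def clear(self, r: float, g: float, b: float) -> None:
        self.log.append(f"clear({r}, {g}, {b})")

    def draw(self, vertex_count: int) -> None:
        self.log.append(f"draw({vertex_count})")

    def replay(self, payload: bytes) -> None:
        for command in json.loads(payload.decode("utf-8")):
            getattr(self, command["call"])(*command["args"])


# Game side: the game believes it is talking to a local graphics device.
device = CommandRecorder()
device.clear(0.0, 0.0, 0.0)
device.draw(3)
frame = device.flush()

# Renderer side: in a real system `frame` would arrive over the network.
renderer = Renderer()
renderer.replay(frame)
print(renderer.log)
```

The point of the pattern is that the game binary stays unmodified: only the library it loads is swapped, so the GPU work can happen on a different machine.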

Page 22: Supercomputers and Cloud Games

Remote rendering architecture

• Separate CPU & GPU servers

• Many users per logical unit

• Flexible architecture allows efficient CPU/GPU usage

[Diagram: a "logical unit of game system" and a "physical unit", each drawn as groups of CPU and GPU blocks.]

Page 23: Supercomputers and Cloud Games

CPU/GPU performance mismatch

[Diagram labels: CPU, GPU]

Page 24: Supercomputers and Cloud Games

The relationship between cost and performance

Twice as expensive doesn't mean double the performance.

[Scatter plot with a fitted power-law trend line: y = 1037.3x^(-0.826), R² = 0.9055. The axis labels were not preserved; the horizontal axis runs from 0 to 160,000 and the vertical axis from 0 to 0.4.]
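The fitted trend line on this slide can be evaluated directly to see what "twice as expensive" means. The Python sketch below computes how y changes when x doubles. Because the axis labels did not survive, the last step rests on an explicit assumption: that x is cost and y is performance per unit of cost. Under that reading only, total performance is x·y, which grows as x^0.174.

```python
EXPONENT = -0.826
COEFFICIENT = 1037.3


def fitted_y(x: float) -> float:
    """Trend line from the slide: y = 1037.3 * x^(-0.826)."""
    return COEFFICIENT * x ** EXPONENT


# Doubling x scales y by 2^(-0.826), whatever the starting point.
ratio = fitted_y(40_000) / fitted_y(20_000)
print(f"doubling x scales y by {ratio:.3f}")

# ASSUMPTION: x is cost and y is performance per unit cost.
# Then performance = x * y, and doubling the cost scales it by 2 * ratio.
performance_gain = 2 * ratio
print(f"performance gain for double the cost: {performance_gain:.3f}x")
```

Under that assumption, doubling the spend buys only about 13% more performance, which is consistent with the slide's message.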

Page 25: Supercomputers and Cloud Games

Rendering 60 games in a server

Page 26: Supercomputers and Cloud Games

RDMA/TCP dual-protocol interconnect

Page 27: Supercomputers and Cloud Games

The performance of a recent network card (TCP)

Comp01<->GPU01: effective bandwidth 8.8 Gbps

loopback([email protected]): effective bandwidth 3.59 Gbps

Unit size    Comp01<->GPU01 RTT (μsec)    loopback RTT (μsec)
4            42.09                        15.080261
8            41.75                        14.986181
16           42.18                        15.00307
32           41.86                        15.097176
64           42.69                        15.081717
128          42.91                        15.106041
256          43.35                        15.17368
512          44.6                         15.301775
1024         46.6                         15.67151
2048         64.19                        24.330402
4096         79.87                        30.921734
8192         140.06                       45.846207
16384        186.85                       79.473488
32768        291.19                       129.546127
65536        497.89                       227.030136
131072       909.93                       435.540619
262144       1800.49                      929.645325
524288       3483.36                      1904.819336
1048576      6841.73                      4009.06958

Page 28: Supercomputers and Cloud Games

Mellanox ConnectX-3

http://www.mellanox.com/page/products_dyn?product_family=127 (accessed 2014-9-17)

- can use RDMA in an Ethernet environment

- the interconnect of Tianhe-2 (MilkyWay-2) uses RDMA as well

- can skip most of the OS/driver layers and move memory directly to remote machines

Page 29: Supercomputers and Cloud Games

The interconnection of the Shinra system

[Diagram labels, game side: Game.exe (third-party); dinput.dll; nvwgf2umx.dll; …Fakedxgi.dll; Faked3d11.dll; ws2_32.dll.]

[Diagram labels, renderer side: Renderer.exe; ws2_32.dll, dxgi.dll, d3d11.dll; nvwgf2umx.dll; nvlddmkm.sys; video card. A bit stream (001001010001110101110010011101010) is drawn between the two sides.]

Compression (500µs / ratio 1:8)

Transmission to the Renderer

• Using TCP over Gigabit Ethernet (500µs)

• Using RDMA over Converged Ethernet (50µs)

Decompression (200µs)

Delay ≈ 1.2ms
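The stage timings on this slide can be added up as a simple latency budget. The Python sketch below does that for both transports, under the assumption (not stated on the slide) that the three stages run strictly one after another with no overlap. The constant and function names are made up for the example.

```python
# Stage timings taken from the slide, in microseconds.
COMPRESSION_US = 500
DECOMPRESSION_US = 200
TRANSMISSION_US = {
    "TCP over Gigabit Ethernet": 500,
    "RDMA over Converged Ethernet": 50,
}


def total_delay_us(transport: str) -> int:
    """Sum compression, transmission and decompression (assumed sequential)."""
    return COMPRESSION_US + TRANSMISSION_US[transport] + DECOMPRESSION_US


for transport in TRANSMISSION_US:
    print(f"{transport}: {total_delay_us(transport) / 1000:.2f} ms")
```

Under this reading the quoted delay of about 1.2 ms corresponds to the TCP path, while the RDMA path would come to 0.75 ms, leaving compression as the largest single stage.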

Page 30: Supercomputers and Cloud Games

Distribution models depending on game design

Stand alone architecture

SS Architecture

MK Architecture

Page 31: Supercomputers and Cloud Games

Stand alone architecture

[Diagram labels: Compute Server (Game.exe); Rendering Server (Rendering.exe, GPU x 2); Rendering Commands; Input and Video over the internet.]

Page 32: Supercomputers and Cloud Games

SS Architecture

[Diagram labels: Compute Server (Server, Game x 4); Rendering Server (Remote Renderer, GPU x 2); Rendering Commands; Input and 4 x Video Streams over the internet.]

Page 33: Supercomputers and Cloud Games

MK Architecture

4 users in a single process…

[Diagram labels: Compute Server (Game, User x 4); Rendering Server (Remote Renderer, GPU x 2); Rendering Commands; Input and 4 x Video Streams over the internet.]

Page 34: Supercomputers and Cloud Games

We will provide a standardized SDK for these three architectures