
Page 1: INSTITUTE OF COMPUTING  TECHNOLOGY

Cloud-Sea Computing on ZB of Data

Zhiwei Xu
Institute of Computing Technology (ICT)
Chinese Academy of Sciences (CAS)
www.ict.ac.cn, [email protected]

This research is supported in part by the National Basic Research Program of China (Grant 2011CB302502), the Strategic Priority Program of Chinese Academy of Sciences (Grant XDA06010400), and the Guangdong Talents Program.

Page 2: INSTITUTE OF COMPUTING  TECHNOLOGY

We Are Entering a ZB Computing Era

• Two historical observations:
– Per-capita capacity: Mega → Giga → Tera
– Worldwide capacity: Peta → Exa → Zetta

• Two major challenges
– Capacities increase 1000X, while power (and energy) stay at 1X
– Enable existing and new workloads (and values)

Capacity        1986                      2007                      2030
                Per Capita   Worldwide    Per Capita   Worldwide    Per Capita   Worldwide
Storage         4.3 MB       21 PB        44.7 GB      295 EB       5.23 TB      41.8 ZB
Communication   12 MB        59 PB        9.86 GB      65 EB        2.88 TB      23 ZB
GP Computing    0.06 MIPS    0.3 PIPS     0.97 GIPS    6.39 EIPS    4.98 TIPS    40 ZIPS
SP Computing    0.09 MIPS    0.44 PIPS    28.6 GIPS    189 EIPS     321 TIPS     2570 ZIPS

1986 and 2007 data: Hilbert and López, Science 2011: 332 (6025), 60-65. 2030 projection: from a conservative estimation by ICT, CAS.
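
The growth implied by the worldwide columns can be checked with a few lines of arithmetic. The sketch below is only an illustration: the figures are copied from this slide, while the compound-annual-growth framing and the names `WORLDWIDE` and `growth` are assumptions introduced here, not part of the talk.

```python
# Worldwide capacities copied from the slide; SI prefixes as powers of ten.
PREFIX = {"P": 1e15, "E": 1e18, "Z": 1e21}
YEARS = (1986, 2007, 2030)

# Hypothetical container for the four worldwide rows (value, prefix) per year.
WORLDWIDE = {
    "Storage (bytes)":       [(21, "P"), (295, "E"), (41.8, "Z")],
    "Communication (bytes)": [(59, "P"), (65, "E"), (23, "Z")],
    "GP Computing (IPS)":    [(0.3, "P"), (6.39, "E"), (40, "Z")],
    "SP Computing (IPS)":    [(0.44, "P"), (189, "E"), (2570, "Z")],
}


def growth(series):
    """Return (overall factor, implied annual factor) for each period."""
    values = [value * PREFIX[prefix] for value, prefix in series]
    results = []
    for i in range(1, len(values)):
        factor = values[i] / values[i - 1]
        years = YEARS[i] - YEARS[i - 1]
        results.append((factor, factor ** (1 / years)))
    return results


for name, series in WORLDWIDE.items():
    for (factor, annual), start, end in zip(growth(series), YEARS, YEARS[1:]):
        print(f"{name}: {start}-{end} x{factor:,.0f} overall, x{annual:.2f} per year")
```

Under these numbers, each period spans roughly two to four orders of magnitude, which is consistent with the "1000X" shorthand used in the challenges above.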

Page 3: INSTITUTE OF COMPUTING  TECHNOLOGY

Workload Mega Trend: e-People

• e-People = Computing for the Masses
– IT that directly benefits the masses (billions of individuals), not institutions
  • e-People, not e-Business, e-Science, e-Government
– Computer science utilizing the human-cyber-physical ternary universe
  • Ternary computing, not just cyber computing (unary computing)
  • e-People is not fully realized if we have to use cyber devices

[Figure labels]
Institutional Computing: e-Business, e-Science, e-Government
Cyberspace Computing: IT services, IT software, IT hardware
Billions of users, trillions of devices, millions of verticals, ZB of data

Human-facing devices are not enough. Currently videos are the #1 load. 2.88 TB = 8 HD movies per day!

Page 4: INSTITUTE OF COMPUTING  TECHNOLOGY

The Chinese Academy of Sciences NICT Project

• New generation ICT
– 10-year research project (2012-2021)
– 19 institutes, over 200 faculty members
– Targeting potential mainstream markets of 2020-2030
– Aiming at China’s needs in 2020-2050

• Human-cyber-physical ternary computing for ZB of data
– Functional sensing
– Customizable Internet
– Cloud-sea computing

Page 5: INSTITUTE OF COMPUTING  TECHNOLOGY

Functional Sensing: Acquisition of Home Appliances Data

• Application examples (2020-2030)
– Web search → Grid search
  • “Top 100 green households in Beijing and London”
– Appliances R&D
  • Utilizing field data for all appliances (better than software beta-test)

• Acquisition challenge
– Can we timely acquire massive and accurate field data from billions of households, for each and every appliance (lamp, refrigerator, etc.) in every household, with 1 (~3) sensors per home?

Page 6: INSTITUTE OF COMPUTING  TECHNOLOGY

Traditional Sensing

• One sensor per device
– ~50 devices per home, 220V@50Hz
– Up to 128th harmonics

• 256 samples/cycle, 10 bytes/sample
– 6.4 MB/s, or 200TB per year per home
– For China, 200TB x 0.5 billion homes = 100 ZB per year

[Figure: current waveform of a heater in one cycle]
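
The data volumes on this slide follow directly from the sampling parameters. The sketch below reproduces that arithmetic; all inputs are the slide's own figures, while the variable names and the reading of 256 samples/cycle as twice the 128th harmonic (a Nyquist-style choice) are assumptions made here for illustration.

```python
# Inputs taken from the slide.
DEVICES_PER_HOME = 50
MAINS_HZ = 50                # 220V@50Hz mains, i.e. 50 cycles per second
SAMPLES_PER_CYCLE = 256      # assumed here to be 2 x 128 (128th harmonic)
BYTES_PER_SAMPLE = 10
HOMES_IN_CHINA = 500_000_000  # 0.5 billion homes
SECONDS_PER_YEAR = 365 * 24 * 3600

# One sensor per device, every device sampled continuously.
bytes_per_second = (
    DEVICES_PER_HOME * MAINS_HZ * SAMPLES_PER_CYCLE * BYTES_PER_SAMPLE
)
bytes_per_year_home = bytes_per_second * SECONDS_PER_YEAR
bytes_per_year_china = bytes_per_year_home * HOMES_IN_CHINA

print(f"{bytes_per_second / 1e6:.1f} MB/s per home")
print(f"{bytes_per_year_home / 1e12:.0f} TB per year per home")
print(f"{bytes_per_year_china / 1e21:.0f} ZB per year for China")
```

The exact products come out slightly above the slide's rounded 200 TB and 100 ZB, since the slide rounds the per-home figure before multiplying.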

Page 7: INSTITUTE OF COMPUTING  TECHNOLOGY

Functional Sensing

• One sensor per home
• Function is formalized behavior
– Type 0: human sensor
– Type 1: current smart meters
– Type 2: on-off behavior data for each device
– Type 3: event behavior data
– Type 4: finite behavior data (up to kth harmonics for a given finite k)
– Type 5: infinite behavior data

• Data storage needs can be reduced 10,000 times
– 20GB/year per home for aggregated data
– 1TB/year per home for disaggregated data for each device
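
The claimed reduction can be checked against the traditional-sensing figure of 200 TB per year per home. This is a minimal sketch: the three per-home volumes and the 0.5 billion homes come from the slides, the disaggregated figure is read here as 1 TB per year per home (the slide's wording is ambiguous), and the variable names are hypothetical.

```python
GB = 10**9
TB = 10**12
EB = 10**18

TRADITIONAL_PER_HOME = 200 * TB    # traditional sensing, per year per home
AGGREGATED_PER_HOME = 20 * GB      # functional sensing, aggregated data
DISAGGREGATED_PER_HOME = 1 * TB    # functional sensing, disaggregated data
HOMES_IN_CHINA = 500_000_000       # 0.5 billion homes

# Reduction factors relative to one-sensor-per-device sampling.
reduction_aggregated = TRADITIONAL_PER_HOME // AGGREGATED_PER_HOME
reduction_disaggregated = TRADITIONAL_PER_HOME // DISAGGREGATED_PER_HOME

# National yearly totals under functional sensing, in exabytes.
china_aggregated_eb = AGGREGATED_PER_HOME * HOMES_IN_CHINA / EB
china_disaggregated_eb = DISAGGREGATED_PER_HOME * HOMES_IN_CHINA / EB

print(f"aggregated: {reduction_aggregated}x smaller, {china_aggregated_eb:.0f} EB/year")
print(f"disaggregated: {reduction_disaggregated}x smaller, {china_disaggregated_eb:.0f} EB/year")
```

Under this reading, the 10,000-fold figure corresponds to the aggregated case; the disaggregated case is a 200-fold reduction.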

Page 8: INSTITUTE OF COMPUTING  TECHNOLOGY

The REST 2.0 Architecture for Cloud-Sea Computing

[Figure: architecture diagram; recoverable labels listed below]

• Cloud side (cloud-side functions: aggregation, request-response, big data)
– EB-scale billion-thread servers: 100s of units
– PB-scale servers: 10Ks of units
– CDN/CGN: millions

• Protocols: HTTP 2.0+, SeaHTTP

• Sea side (sea-side functions: sensing, interaction, local processing)
– Seaport: billions of units, TB-PB/unit
– Sea Zone: billions, GB-TB
– Trillions, KB-GB

Page 9: INSTITUTE OF COMPUTING  TECHNOLOGY

New Gadgets for Homes

• GB sensor [email protected]
• TB “smart phones” @2W
• PB wuTV (home datacenter) @20W
• PB Personal Watson (iPC) @200W

[Figure labels: Home, wuTV, iPC, SeaHTTP, HTTP 2.0+]

Page 10: INSTITUTE OF COMPUTING  TECHNOLOGY

Three examples of Data Computing

• Off-line (back end): RCFile for Apache Hive
– Production use: Facebook, Taobao, Netflix, Twitter, Yahoo!, LinkedIn, AOL, Salesforce.com, etc.
– http://en.wikipedia.org/wiki/RCFile

• On-line (front end): CCIndex on HBase
– Production use in Taobao, Tencent

• High-speed communication: DataMPI
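
For readers unfamiliar with RCFile, the idea behind its layout can be shown with a toy example: rows are first split horizontally into row groups, and each row group is then stored column by column. The sketch below only illustrates that layout in memory. It is not the Hive implementation, and `to_row_groups`, `scan_column`, and the sample table are hypothetical names and data invented for this illustration.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]
RowGroup = Dict[str, List[Any]]


def to_row_groups(rows: List[Row], group_size: int) -> List[RowGroup]:
    """Split rows into fixed-size row groups, each stored column-wise."""
    groups: List[RowGroup] = []
    for start in range(0, len(rows), group_size):
        chunk = rows[start:start + group_size]
        # Within a row group, values of the same column sit together.
        groups.append({name: [row[name] for row in chunk] for name in chunk[0]})
    return groups


def scan_column(groups: List[RowGroup], name: str) -> List[Any]:
    """Read one column, touching only that column's list in each group."""
    return [value for group in groups for value in group[name]]


# Made-up sample table for the illustration.
table = [
    {"user": "a", "clicks": 3},
    {"user": "b", "clicks": 5},
    {"user": "c", "clicks": 1},
    {"user": "d", "clicks": 7},
    {"user": "e", "clicks": 2},
]
groups = to_row_groups(table, group_size=2)
total_clicks = sum(scan_column(groups, "clicks"))
print(len(groups), total_clicks)
```

A query that needs only `clicks` never reads the `user` values, while all fields of a given row still live in the same row group.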

Alexa Top Sites (2013.06.14)

1. Facebook
2. Google
3. YouTube
4. Yahoo!
5. Baidu
6. Wikipedia
7. Windows Live
8. Twitter
9. QQ (Tencent)
10. Taobao
22. eBay

Page 11: INSTITUTE OF COMPUTING  TECHNOLOGY

DataMPI open sourced at datampi.org

[Figure: execution time of Hadoop vs. DataMPI on the Sort and PageRank workloads. Execution times shown: 99 sec, 18 sec, 364 sec, 103 sec]

Page 12: INSTITUTE OF COMPUTING  TECHNOLOGY

Billion-Thread Server

[Figure: two architecture diagrams; recoverable labels listed below]

• Traditional Architecture of Datacenters
– Network layers: Core, Aggregation, Access
– Software stack: Applications, Runtime Environment, Application Management, Hypervisor, OS
– Hardware: Chipset, CPU, NIC, Memory, Disk
– Serves REST 1.0 Requests

• Transition: Reduce Datacenter Layers, Simplify SW/HW Stacks

• Architecture of Cloud-Sea Server
– Software stack: Applications, Micro OS, Nano Kernel
– Hardware: Workload Processing Unit (WPU), Memory, Storage
– Serves REST 2.0 Requests

Page 13: INSTITUTE OF COMPUTING  TECHNOLOGY

Cloud-Sea Storage

• Emphasize power-on efficiency (70% HW peak), while matching latency, scalability, resilience needs

• Innovations
– stable sets
– metadata clustering
– network RAID

40 benchmark apps: reduces latency 123 times, backend load 50 times

[Figure: plot with axes Time and Addresses]

Page 14: INSTITUTE OF COMPUTING  TECHNOLOGY

Elastic Processor

• A new architecture style (FISC)
– Featuring function instructions executed by programmable ASIC accelerators
– Targeting 1000 GOPS/W applications

• Results: 932 GOPS/W for machine learning

              CISC (Intel X86)   RISC (ARM)   FISC (Function Instruction Set Computer)
Chip types:   10s                1K           10K
Power:        10~100W            1~10W        0.1~1W
Apps/chip:    10M                100K         10K

Page 15: INSTITUTE OF COMPUTING  TECHNOLOGY

References

• Rui Hou, Tao Jiang, Liuhang Zhang et al.: Cost Effective Data Center Servers. HPCA-19, 2013: 179-187

• Zhiwei Xu: High-Performance Techniques for Big Data Computing in Internet Services. Invited speech at SC12, SC Companion 2012: 1861-1895

• Zhiwei Xu: Measuring Green IT in Society. IEEE Computer 45(5): 83-85 (2012)

• Zhiwei Xu: How Much Power Is Needed for a Billion-Thread High-Throughput Server? Frontiers of Computer Science 6(4): 339-346 (2012)

• Zhiwei Xu, Guojie Li: Computing for the Masses. Commun. ACM 54(10): 129-137 (2011)

• Jingjie Liu, Lei Nie, Zhiwei Xu: The Input-Sensing Problem in Ternary Computing and Its Application in Household Energy-Saving. GreenCom 2011: 131-138

• Yongqiang He, Rubao Lee, Yin Huai, Zheng Shao, Namit Jain, Xiaodong Zhang, Zhiwei Xu: RCFile: A Fast and Space-Efficient Data Placement Structure in MapReduce-based Warehouse Systems. ICDE 2011: 1199-1208

• Xiaoyi Lu, Bing Wang, Li Zha, Zhiwei Xu: Can MPI Benefit Hadoop and MapReduce Applications? ICPP Workshops 2011: 371-379

• Qi Guo, Tianshi Chen, Yunji Chen, Zhi-Hua Zhou, Weiwu Hu, Zhiwei Xu: Effective and Efficient Microprocessor Design Space Exploration Using Unlabeled Design Configurations. IJCAI 2011: 1671-1677

• Yongqiang Zou, Jia Liu, Shicai Wang, Li Zha, Zhiwei Xu: CCIndex: A Complemental Clustering Index on Distributed Ordered Tables for Multi-dimensional Range Queries. NPC 2010: 247-261

Page 16: INSTITUTE OF COMPUTING  TECHNOLOGY

Thank you!

[email protected]