Big Data and Implications on Platform Architecture BIGS002 Fayé A Briggs, PhD Intel Fellow and Chief Server Platform Architect, Intel


Post on 16-Jan-2015


Page 1: Big Data and Implications on Platform Architecture

Big Data and Implications on Platform Architecture

BIGS002

Fayé A Briggs, PhD Intel Fellow and Chief Server Platform Architect, Intel

Page 2: Big Data and Implications on Platform Architecture

Agenda

• What is Big Data?
• Big Data use cases
• What does Big Data mean for the data center?
• Call to action

The PDF for this Session presentation is available from our Technical Session Catalog at the end of the day at: intel.com/go/idfsessionsBJ

URL is on top of Session Agenda Pages in Pocket Guide


Page 4: Big Data and Implications on Platform Architecture

What is Big Data?

Datasets whose size is beyond the ability of typical database software tools to capture, store, manage and analyze†

Volume, variety, value and velocity

[Figure: data volume over time, showing structured and unstructured data; labels: Corporate Data, Big Corporate Data, Big Web Data, Big Sensed Data; stages: Capture, Store, Manage, Analyze]

†“Big data: The next frontier for innovation, competition, and productivity”, McKinsey Global Institute

Page 5: Big Data and Implications on Platform Architecture

The Four Pillars and Associated Challenges of Big Data

Volume
• Massive scale and growth of unstructured data
• 80%~90% of total data
• Growing 10x~50x faster than structured (relational) data
• 10x~100x of traditional data warehousing

Variety
• Heterogeneity and variable nature of Big Data
• Many different forms (text, document, image, video, ...)
• No schema or weak schema
• Inconsistent syntax and semantics

Value
• Predictive analytics for future trends and patterns
• Deep, complex analysis (machine learning, statistical modeling, graph algorithms, …), versus traditional business intelligence (querying, reporting, …)

Velocity
• Real-time rather than batch-style analysis
• Data streamed in, tortured, and discarded
• Making impact on the spot rather than after-the-fact

Page 6: Big Data and Implications on Platform Architecture


Big Data Has Gone Mainstream

Behold the Big Data sign gracing Times Square

Greeting commuters on Highway 101 in Silicon Valley, a giant Walmart* Big Data sign

“. . . every large company is working on Big Data projects ‘at a furious pace.’” - Gartner analyst Merv Adrian

“. . . the use of Big Data will be effective for every segment of the economy.” - Michael Chui, McKinsey & Co. analyst

Page 7: Big Data and Implications on Platform Architecture

Big Data Usages Examples (Telecom/Financial/Search)

• Telecom
– Calling patterns, signal processing, forecasting
  Analyze switches/routers data for quality of call, frequency of calls, region loads, etc.
– Act before problems happen. Act before customer calls arrive.

• Financial
– Trading behaviour
  Analyze real-time data to understand market behavior, role of individual institution/investor
– Detect fraud, detect impact of an event, detect the players

• Search Engines
– Process the data collected by Web bot in multiple dimensions
– Enhance relevance of search

Big Data impacts e-connected businesses through efficient capture, processing and storage of huge amounts of data

Page 8: Big Data and Implications on Platform Architecture

Big Data Usages Examples (E-Biz, Media)

• Click Stream Analysis
– Analysis of online users' behavior
– Develop comprehensive insight (Business Intelligence) to run effective strategies in real time

• Graph analysis
– Term for discovering the online influencers in various walks of life
– Enables a business to understand key players and devise effective strategies

• Lifecycle Marketing
– Strategies to move away from spam/mass mail
– Enables a business to spend money on high-probability customers only

• Revenue Attribution
– Term for analyzing the data to accurately attribute revenue back to various marketing investments
– Business can identify effectiveness of campaign to control expenses

Big Data phenomenon allows businesses to know, predict and influence customer behaviors!

Page 9: Big Data and Implications on Platform Architecture

Big Data in Health Care

Cancer Care: American Society of Clinical Oncology “learning health system”, CancerLinQ
• Benefits: Collects and analyzes de-identified cancer care data from millions of patient visits

Genome Sequencing: Improvements in DNA sequencing driving down costs of processing complete set of genomes
• Benefits: Saving lives through better identification and treatment of diseases

GenBank* is the NIH genetic sequence database, an annotated collection of all publicly available DNA sequences

Sequencing cost approaches “$1000.” Analytics, compute, networking, & storage to improve affordability still challenging.

Page 10: Big Data and Implications on Platform Architecture

Big Data Analytics Processing

• “Batch”: Sophisticated data processing: enable “better” decisions
– Analyze, transform, scan, etc. large amount of data
– E.g., ETL, graph construction, anomaly detection, trend analysis

• “Real-time”: Queries on historical data: enable “faster” decisions
– Data at rest but needs to be served in real-time
– E.g., Facebook* uses HBase* “messaging” App serving real-time data to its users

• “Streaming”: Queries on live data: enable “instantaneous” decisions on real-time streaming data
– Large volume of data being ingested and analyzed in real-time
– E.g., detect and block worms in real-time (a worm may infect 1 million hosts in 1.3 sec)

Adapted from Ion Stoica, UC Berkeley

Page 11: Big Data and Implications on Platform Architecture

Big Data Converted to Knowledge in an Iterative Cycle†

Layers: Data Management, Analytics, Visualization

Big Data Ingest
• External unstructured/semi-structured data on distributed parallel file systems (HDFS*, Lustre*, CloudStore*, GPFS, GlusterFS*, etc.)
• Filtering, Cleansing, ETL, Scribe, etc.

Batch Processing
• MR-Hadoop*, GraphLab*, Giraph

Real-Time Processing
• HBase*, SAP* HANA, Spark, Shark, Cassandra*, MongoDB, Drill, Impala

Stream Processing
• Spark Streaming, Storm, S4 (Simple Scalable Streaming System)

Presentation
• Tableau*, R, Progress Software*, Pentaho*, IBM*, others

Knowledge

Functions: Archive, Compute, Transact, Deliver

Batch Analytics
• Call Data Records, Gene Seq & Analysis, GraphLab*-MLearning, ETL, Search, Index Creation, Click Stream Analysis, BI Analytics

Real-time Analytics
• Intelligent Transport, Home Energy, Financial Analytics, Sensor Network Database, Time Series Stock Ticks, Customer Behavior, Ads

Streaming Analytics
• Twitter* Spam Analytics, Video meta-data, Traffic Modeling, Virus Intrusion Detection, Stock trading, Video Surveillance, Retail – Video, PoS data

†Based on Frog’s & IDC’s Layered & Iterative Approach

Page 12: Big Data and Implications on Platform Architecture

Agenda

• What is Big Data?
• Big Data use cases
• What does Big Data mean for the data center?
• Call to action


Page 14: Big Data and Implications on Platform Architecture

Telco Usage - China Mobile Group Guangdong
Hadoop* Big Data storage and analytics

Usage Model: Deliver real-time access to Call Data Records (CDR) for billing self service
• Solution: Hadoop* + Intel® Xeon® processors over RDBMS to remove data access bottlenecks, increase storage, and scale system
• Benefits: Lower TCO, up to 30x performance increase, stable operation, analytics on subscriber usage for targeted promotions
• Characteristics:
– 30TB billing data/month; real-time retrieval of 30 days CDRs
– 300k records/sec, 800k insert speed/sec; 15 analytics queries, 133 server nodes+

Platform and Cluster Architectural Attributes:
• Compute:
– Scale-out Intel Xeon processor E5 based platform for fast analytic query processing
– Memory: 2-4GB/Core for Data Management Hadoop JVM
– PCI Express* Gen3
• Storage: Scale-out Storage (HDD) for capacity; SSD Cache
• I/O: High bandwidth and IOPS to Storage (HDD + SSD cache) for fast record inserts and reads
• Network: High network bandwidth for Hadoop Shuffle data; 10GbE TORS; 10-40GbE Inter-Rack Switch

Page 15: Big Data and Implications on Platform Architecture

Real-Time Analytics: DuPont* – Crop Genetics
HBase* Big Data analytics with “BLAST” to compare protein in genomic data

Usage Model: Comparative genomic research. Run “BLAST” to compare protein with every other protein in the genomic code.
• Solution: The current RDBMS didn’t scale to planned data growth. HBase*/HDFS* provided a reduction in processing time from over 30 days to less than 7 hours.
• Characteristics:
– Genetic data for 4 million organisms – 1TB in size; scale to 14 million
– 12 Trillion HBase rows; single record search takes 1.2 seconds

Platform and Cluster Architectural Attributes:
• Compute:
– Scale-out Intel® Xeon® processor E5 based platform for real-time data serving
– Memory: Higher Memory Capacity 4GB/Core for HBase Memstores
– PCI Express* Gen3
• Storage: Scale-out Storage (HDD) for capacity; SSD Cache
• I/O: High bandwidth and IOPS to Storage (HDD + SSD cache) for fast data access to HBase tables
• Network: High network bandwidth for HBase region servers; 10GbE TORS; 10-40GbE Inter-Rack Switch
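The processing-time figures on this slide imply a lower bound on the speedup. The following is a minimal sketch of that arithmetic; the variable names are illustrative, and the bound assumes "over 30 days" means at least 720 hours and "less than 7 hours" means at most 7 hours.

```python
# Lower bound on the speedup implied by the slide's before/after figures.
HOURS_PER_DAY = 24

def speedup(before_hours: float, after_hours: float) -> float:
    """Ratio of the old processing time to the new processing time."""
    return before_hours / after_hours

rdbms_hours = 30 * HOURS_PER_DAY  # "over 30 days": at least 720 hours
hbase_hours = 7                   # "less than 7 hours": at most 7 hours

min_speedup = speedup(rdbms_hours, hbase_hours)
print(f"Speedup of at least about {min_speedup:.0f}x")
```

Because both figures are bounds rather than measurements, the true ratio is higher than the value printed here.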

Page 16: Big Data and Implications on Platform Architecture


Telco - China Unicom Hadoop* & HBase* for Behavioral Analysis

Subscriber Usage & Billing

ETL

• Log Analysis • Daily Reports

Storage, Analytics

New Customer Segmentation & Insights

Page 17: Big Data and Implications on Platform Architecture

Telco - China Unicom
Hadoop* & HBase* for Behavioral Analysis

Usage Model: Analyze subscriber Web usage and billing to derive new information products
• Solution: Scale out storage based on Hadoop* & HBase* with network optimization based on Web traffic, log analysis for daily reporting
• Benefits: New customer segmentation
• Characteristics:
– 188 nodes, 14TB/server
– 2.5PB raw disk capacity
– High speed data loading
– Real-time query (latency <1s)
– Daily statistics & reports (sum, count, join, etc.)

Platform and Cluster Architectural Attributes:
• Compute:
– Scale-out Intel® Xeon® processor E5 based platform for real-time data serving in < 1 sec
– Memory: Higher Memory Capacity 4GB/Core for HBase Memstores
– PCI Express* Gen3
• Storage: Scale-out Storage (HDD) for capacity; SSD Cache
• I/O: High bandwidth and IOPS to Storage (HDD + SSD cache) for fast data access to HBase tables
• Network: High network bandwidth for HBase region servers; 10GbE TORS; 10-40GbE Inter-Rack Switch
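The cluster characteristics can be cross-checked with simple arithmetic: node count times per-server capacity should land near the quoted raw capacity. This is a sketch only; whether the slide's "PB" is decimal or binary is not stated, so both conversions are shown.

```python
# Cross-check: 188 nodes x 14 TB/server versus the quoted 2.5 PB raw capacity.
nodes = 188
tb_per_server = 14

raw_tb = nodes * tb_per_server   # total raw capacity in TB
raw_pb_decimal = raw_tb / 1000   # if 1 PB = 1000 TB
raw_pb_binary = raw_tb / 1024    # if 1 PB = 1024 TB

print(f"{raw_tb} TB = {raw_pb_decimal:.2f} PB (decimal) "
      f"or {raw_pb_binary:.2f} PB (binary)")
```

Either reading is in the neighborhood of the 2.5PB on the slide, which suggests the quoted figure is rounded.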


Page 19: Big Data and Implications on Platform Architecture

In-Memory GraphLab* Analytics: PageRank
Big Data analytics with GraphLab

Usage Model: Deliver Page Ranking for search
• Solution: Hadoop* + Intel® Xeon® processors: Large number of ML, Genomics, Web, etc. applications can be efficiently run in Graph Parallel solution
• Benefits: Significantly faster solutions to Graph (Vertices, Edges) domain problems
• Characteristics:
– XML docs, News Feeds, Web Pages
– Data collected from Web Pages for Page Ranking

Platform and Cluster Architectural Attributes:
• Compute:
– Scale-out Intel Xeon processor E5 based platform for fast analytic in-memory processing
– Memory: 2-4GB/Core for Data Management Hadoop JVM; for graph data
– PCI Express* Gen3
• Storage: Scale-out Storage (HDD) for capacity; SSD Cache
• I/O: High bandwidth and IOPS to Storage (HDD + SSD cache) for fast record inserts and reads
• Network: High network bandwidth for Hadoop Shuffle data during graph construction; 10GbE TORS; 10-40GbE Inter-Rack Switch

Page 20: Big Data and Implications on Platform Architecture


Pipelined In-Memory Analytics: Twitter* Feed Spam Analytics Spark Streaming Big Data Analytics

• Run a streaming computation as a series of extremely small, deterministic batch jobs

• Batch sizes as low as ½ second, latency ~ 1 second
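The micro-batch idea in the two bullets above can be sketched in plain Python. This is not the Spark Streaming API: the function names, the sample tweets, and the toy spam rule are all hypothetical, chosen only to show a stream being cut into half-second windows that are each processed by the same deterministic batch job.

```python
from collections import defaultdict

def micro_batches(events, batch_seconds=0.5):
    """Group (timestamp, payload) events into fixed-width time windows."""
    windows = defaultdict(list)
    for timestamp, payload in events:
        windows[int(timestamp // batch_seconds)].append(payload)
    # Emit the non-empty windows in time order.
    return [windows[key] for key in sorted(windows)]

def looks_like_spam(tweet):
    """Hypothetical toy rule; a real filter would use a trained model."""
    return "free money" in tweet.lower()

def run_stream(events, batch_seconds=0.5):
    """Run the same deterministic batch job on every micro-batch."""
    return [
        [tweet for tweet in batch if not looks_like_spam(tweet)]
        for batch in micro_batches(events, batch_seconds)
    ]

# Hypothetical sample feed: (seconds since start, tweet text).
events = [
    (0.10, "hello world"),
    (0.30, "FREE MONEY click here"),
    (0.70, "big data talk today"),
    (1.20, "free money now"),
    (1.40, "spark streaming demo"),
]
clean = run_stream(events)
print(clean)
```

Because each window is processed by a pure function of its contents, a failed batch can simply be recomputed, which is the property that makes small deterministic batches attractive for streaming.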


Page 21: Big Data and Implications on Platform Architecture

Pipelined In-Memory Analytics: Twitter* Feed Spam Analytics
Spark Streaming Big Data Analytics

Usage Model: Process and filter out Twitter* feed spam as tweets are ingested
• Solution: Spark Streaming from Berkeley
• Characteristics:
– Data ingested from Twitter feeds

Platform and Cluster Architectural Attributes:
• Compute:
– Scale-out Intel® Xeon® processor E5 based platform for fast analytic query processing
– Memory: Very large Memory Capacity close to the CPU; High memory bandwidth
– PCI Express* Gen3
• Storage: Scale-out Storage (HDD) for capacity; SSD Cache
• I/O: High bandwidth and IOPS to Storage (HDD + SSD cache) for fast record inserts and reads
• Network: High network bandwidth; 10GbE TORS; 10-40GbE Inter-Rack Switch

Page 22: Big Data and Implications on Platform Architecture


Smart City Sensor Model

Smart building sensors

Smart Grid sensors

Pollution sensors

Smart meters

Industrial automation sensors

Portable medical imaging services

Medical sensors on ambulances

Sensors on smartphones

Meteorological sensors

Inductive sensors

Traffic cameras

INTELLIGENT CITY

INTELLIGENT HOSPITAL

INTELLIGENT FACTORY

INTELLIGENT HIGHWAY

Sensors on vehicles

Embedded

Cloud

Dedicated

HPC

Transactional

Social

Location

Page 23: Big Data and Implications on Platform Architecture


The Fusion of Internet of Things and Big Data

Planogram monitoring (Real-time stock level)

Surveillance camera (store statistics)

Interactive display (Behavioral marketing)

Real-time transaction

Store heat map (hot merchandise, browsing history, conversion rate)

RFID

RFID

Dynamic pricing Real-time, personalized ad Auto promotion/coupon Social network connection


Page 25: Big Data and Implications on Platform Architecture

Smart Traffic Intelligent Transport System
HBase* Application for Predictive Analytics

Usage Model: Analyze city traffic to derive statistics for crime prevention, info sharing, and predictive traffic analysis
• Solution: Embed HBase* client in camera for real-time inserts of structured/unstructured data
• Benefits: Automated queries for traffic violation, data mining of fake licenses in <1 minute for all data captured in a week, predictive traffic forecasting
• Characteristics:
– 30,000+ camera data collection points
– Petabytes of traffic data & terabytes of images
– 2 billion HBase records

Platform and Cluster Architectural Attributes:
• Compute:
– Scale-out Intel® Xeon® processor E5 based platform for real-time data serving
– Memory: Higher Memory Capacity 4GB/Core for HBase Memstores
– PCI Express* Gen3
• Storage: Scale-out Storage (HDD) for capacity; SSD Cache
• I/O: High bandwidth and IOPS to Storage (HDD + SSD cache) for fast data access to HBase tables
• Network: High network bandwidth for HBase region servers; 10GbE TORS; 10-40GbE Inter-Rack Switch

Page 26: Big Data and Implications on Platform Architecture


E2E Analytics Use Case - Safe City

Private Cloud Public Cloud

Edge Client (Video capture)

Edge Device Video

Indexer/Analyzer/Transcoder (Image extraction & Metadata Creation)

Video Storage (Edge or Centralized)

Data Center/ Cloud

(Private/Public)

Data Services (VSaaS, VAaaS)

Checkpoint District City State Country

Smart Camera

Police Car

Management System

Typical Scenario: Automated traffic violations
E.g., 70% traffic violation detection by video in 2011

Edge & Backend VA’s Value
• Metadata tagging and compression at the edge
• Enables 10s of new use models for traffic mgmt
• Fastest time to information and public safety

By end of 2017: 457 PB raw video per day (traffic video streamed by HD cameras)

By end of 2017: 76 PB metadata per day (people, cars, geospatial information)

2-3 TB video data/camera/month

300 GB metadata/camera/month

Video created, video analyzed, video cold storage, video metadata stored
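The per-camera and citywide figures on this slide can be compared by computing the metadata-to-video ratio at each scale. This is a sketch of the arithmetic only; it assumes decimal units (1 TB = 1000 GB), and the variable names are illustrative.

```python
# Metadata as a fraction of raw video, per camera and in aggregate.
GB_PER_TB = 1000  # assumption: decimal units

video_tb_per_camera_month = (2.0, 3.0)  # "2-3 TB video data/camera/month"
metadata_gb_per_camera_month = 300.0    # "300 GB metadata/camera/month"

# The ratio is highest when the camera produces the least video.
ratio_high = metadata_gb_per_camera_month / (video_tb_per_camera_month[0] * GB_PER_TB)
ratio_low = metadata_gb_per_camera_month / (video_tb_per_camera_month[1] * GB_PER_TB)

raw_video_pb_per_day = 457.0  # projected for end of 2017
metadata_pb_per_day = 76.0    # projected for end of 2017
aggregate_ratio = metadata_pb_per_day / raw_video_pb_per_day

print(f"Per camera: {ratio_low:.0%} to {ratio_high:.0%} metadata per unit of video")
print(f"Aggregate projection: {aggregate_ratio:.1%}")
```

The two scales give ratios of the same order, which is a useful sanity check when sizing metadata storage separately from video cold storage.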

Page 27: Big Data and Implications on Platform Architecture


Smart City Big Data Architecture Framework

[Figure: Smart City Big Data architecture framework. Labels recovered from the diagram:]

Data sources: Sensors, Cameras; [Un]Structured Data

Functions: Data Acquisition; Video Analytics; Local Analytics; Preprocessing/Cleansing/Filtering/Aggregation; Storage; Complex Event Processing; Analytics Processing; Streaming Analytics; Batch Analytics; Visualization & Interpretation

Platforms: Core System-on-a-Chip; Intel® Core™; Microserver; based on Intel microarchitecture-EN; based on Intel microarchitecture-EP; based on Intel® microarchitecture-EX; Intel® Many Integrated Core Architecture

Scaling: Horizontal Scale; Horizontal & Vertical Scale

Page 28: Big Data and Implications on Platform Architecture

Agenda

• What is Big Data?
• Big Data use cases
• What does Big Data mean for the data center?
• Call to action

Page 29: Big Data and Implications on Platform Architecture

Choice of Compute Platforms Optimized for Big Data

Intel® Xeon® processor E5 Family
Up to 8 cores; up to 20 MB cache; up to 4 channels DDR3 1600 MHz memory; integrated PCI Express* 3.0, up to 40 lanes per socket
• Preferred solution for Hadoop* and scale-out analytic/DW engines
• Up to 80%** performance boost compared to prior generation
• Intel® Integrated I/O with PCI Express* 3.0 provides more bandwidth for large data sets
• Latest DDR3 memory technology/capacity for reduced memory latency

Intel Xeon processor E7 Family (Xeon E7-4800)
Up to 10 cores; up to 30 MB cache; 4 QPI 1.0 lanes for robust scalability; up to 8 channels DDR3 1066 MHz memory
• Preferred solution for in-memory analytic engines and enterprise databases
• Highest cache and thread performance for large-dataset processing
• Up to 2TB memory footprint (4-socket platform) for in-memory apps
• Highest reliability and 8-socket+ scalability

The right analytic platform begins with Intel Xeon processors

QPI = Intel® QuickPath Interconnect. **See backup slides for 80% claim
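The memory specifications quoted for the two processor families can be turned into theoretical peak bandwidth figures. This is a rough sketch under stated assumptions, not a vendor specification: it assumes each DDR3 channel is 64 bits (8 bytes) wide and that the "MHz" figures are transfer rates in MT/s, which is the usual convention for DDR3-1600 and DDR3-1066. Sustained bandwidth on real systems is lower.

```python
# Theoretical peak memory bandwidth per socket from channel count and rate.
BYTES_PER_TRANSFER = 8  # assumption: 64-bit DDR3 channel

def peak_bandwidth_gb_s(channels: int, mega_transfers_per_s: int) -> float:
    """Peak bandwidth in GB/s (decimal) across all channels."""
    return channels * mega_transfers_per_s * BYTES_PER_TRANSFER / 1000

e5_peak = peak_bandwidth_gb_s(channels=4, mega_transfers_per_s=1600)
e7_peak = peak_bandwidth_gb_s(channels=8, mega_transfers_per_s=1066)

print(f"4 channels at 1600 MT/s: {e5_peak:.1f} GB/s theoretical peak")
print(f"8 channels at 1066 MT/s: {e7_peak:.1f} GB/s theoretical peak")
```

The comparison illustrates the trade-off described in the bullets: fewer, faster channels on one platform versus more channels and a larger memory footprint on the other.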

Page 30: Big Data and Implications on Platform Architecture

Platform and Software Optimizations for Hadoop*

Intel® Xeon® processor E5: up to eight cores; up to 20 MB cache; up to four channels DDR3 1600 MHz memory; integrated PCI Express* 3.0, up to 40 lanes per socket

• Up to 80%** performance boost vs. prior generation
– Intel® Advanced Vector Extensions - reduce compute time
– Intel® Turbo Boost Technology - increased performance

• Intel® Distribution for Apache Hadoop* software
– Built on open source releases
– Custom tuning for data types and scaling approaches

** See backup slides for 80% claim
QPI = Intel® QuickPath Interconnect

1 Performance comparison using best submitted/published 2-socket server results on the SPECfp*_rate_base2006 benchmark as of 6 March 2012.
2 Source: Intel internal measurements of average time for an I/O device read to local system memory under idle conditions comparing Intel® Xeon® processor E5-2600 product family (230 ns) vs. Intel® Xeon® processor 5500 series (340 ns). See notes in backup for configuration details
* Other names and brands may be claimed as the property of others

Page 31: Big Data and Implications on Platform Architecture


Intel® Xeon® processor 2S concept

Intel Xeon processor 4S concept

Low Density Servers

Delivering Performance/Power Efficiency

Page 32: Big Data and Implications on Platform Architecture

Microserver: High Density, Low Power System Innovations

• Addressing the low power, high density packaging
• Based on Intel® Atom™ processors
– Next generation Intel Atom processor codename Avoton
– Workloads: Web tier, SaaS, IaaS, PaaS and light data analytics
– For scale-out apps

Page 33: Big Data and Implications on Platform Architecture


Transforming Storage Data explosion …

Source: Intel

Driving storage opportunity

Big Sensed Data

Big Corp Data

Big Web Data

Structured Data

Unstructured Data

Corporate Data

Time

Volume

690% growth in storage capacity 2010-2015+

Intel® Xeon® processors provide storage intelligence

• Deduplication
• Thin provisioning
• Erasure code
• MapReduce
• Encryption

Traditional Storage

Distributed Storage

30% CAGR

16% CAGR

The 690 percent growth figure for storage capacity is based on Intel analysis and IDC data: capacity rises from 26,066 petabytes in 2010 to 179,327 petabytes in 2015, so the 2015 figure is ~690% of (about 6.9x) the 2010 figure.
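The footnote's two capacity figures can be checked directly. The sketch below computes the 2015-to-2010 ratio, the net increase, and the compound annual growth rate implied by those two endpoints alone; the variable names are illustrative, and the implied rate applies to total capacity, not to any individual series on the chart.

```python
# Storage capacity growth implied by the footnote's endpoint figures.
capacity_2010_pb = 26_066
capacity_2015_pb = 179_327
years = 2015 - 2010

ratio = capacity_2015_pb / capacity_2010_pb  # about 6.88
percent_of_2010 = ratio * 100                # 2015 as a percentage of 2010
net_increase_percent = (ratio - 1) * 100     # increase over the 2010 level
implied_cagr = ratio ** (1 / years) - 1      # compound annual growth rate

print(f"2015 capacity is {percent_of_2010:.0f}% of 2010 ({ratio:.2f}x)")
print(f"Net increase: {net_increase_percent:.0f}%")
print(f"Implied CAGR over {years} years: {implied_cagr:.1%}")
```

The ~690% figure matches the ratio of the two endpoints; read strictly as a net increase, the same data gives a somewhat lower percentage.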

Page 34: Big Data and Implications on Platform Architecture


Real-Time Data Analytics

Intelligent Distributed Storage Optimizations

[Figure: traditional allocation (before) vs. thin provisioning (after) for two applications, APPLI 1 and APPLI 2; labels: allocated - used, allocated - free, system-wide capacity reserved]

On-demand utilization of available storage – virtual and real capacity

Analysis of real-time storage determines extent and nature of compression

Strategic positioning of faster storage devices improves storage performance

Intelligent pattern matching reduces large blocks of repeated data
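The last optimization above, reducing repeated blocks of data, is commonly implemented as block-level deduplication. The following is a minimal sketch assuming fixed-size blocks identified by a SHA-256 digest; the function names and the 4 KB block size are illustrative choices, not a description of any specific Intel or storage-vendor implementation.

```python
import hashlib

def deduplicate(data: bytes, block_size: int = 4096):
    """Split data into fixed-size blocks and store each unique block once."""
    store = {}   # digest -> block contents (one copy per unique block)
    recipe = []  # ordered digests needed to rebuild the original data
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)
        recipe.append(digest)
    return store, recipe

def rebuild(store, recipe) -> bytes:
    """Reassemble the original data from the block store and the recipe."""
    return b"".join(store[digest] for digest in recipe)

# Hypothetical input: five 4 KB blocks, of which only two are distinct.
data = b"A" * 4096 * 3 + b"B" * 4096 + b"A" * 4096
store, recipe = deduplicate(data)

stored_bytes = sum(len(block) for block in store.values())
print(f"{len(recipe)} logical blocks, {len(store)} stored blocks, "
      f"{stored_bytes} of {len(data)} bytes kept")
```

Hashing every block is compute-intensive, which is one reason the deck associates this kind of storage intelligence with server-class processors.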

Page 35: Big Data and Implications on Platform Architecture

Agenda

• What is Big Data?
• Big Data use cases
• What does Big Data mean for the data center?
• Call to action

Page 36: Big Data and Implications on Platform Architecture

Call to Action

• Big Data represents a huge industry opportunity for innovation – get involved!
– New solutions for analytics
– Hardware infrastructure innovation – across the platform

• Customers from across enterprise, cloud, government, HPC and telecom are looking to improve decision making with big data
– If you are a developer of solutions: Understand the market opportunities
– If you are a manager of solutions: Understand where big data can help your organization

• Intel is deeply engaged in big data
– Work with us on delivery of big data solutions

Page 37: Big Data and Implications on Platform Architecture

Legal Disclaimer

INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.

• A "Mission Critical Application" is any application in which failure of the Intel Product could result, directly or indirectly, in personal injury or death. SHOULD YOU PURCHASE OR USE INTEL'S PRODUCTS FOR ANY SUCH MISSION CRITICAL APPLICATION, YOU SHALL INDEMNIFY AND HOLD INTEL AND ITS SUBSIDIARIES, SUBCONTRACTORS AND AFFILIATES, AND THE DIRECTORS, OFFICERS, AND EMPLOYEES OF EACH, HARMLESS AGAINST ALL CLAIMS COSTS, DAMAGES, AND EXPENSES AND REASONABLE ATTORNEYS' FEES ARISING OUT OF, DIRECTLY OR INDIRECTLY, ANY CLAIM OF PRODUCT LIABILITY, PERSONAL INJURY, OR DEATH ARISING IN ANY WAY OUT OF SUCH MISSION CRITICAL APPLICATION, WHETHER OR NOT INTEL OR ITS SUBCONTRACTOR WAS NEGLIGENT IN THE DESIGN, MANUFACTURE, OR WARNING OF THE INTEL PRODUCT OR ANY OF ITS PARTS.

• Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined". Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them. The information here is subject to change without notice. Do not finalize a design with this information.

• The products described in this document may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request.

• Intel product plans in this presentation do not constitute Intel plan of record product roadmaps. Please contact your Intel representative to obtain Intel's current plan of record product roadmaps.

• Intel processor numbers are not a measure of performance. Processor numbers differentiate features within each processor family, not across different processor families. Go to: http://www.intel.com/products/processor_number.

• Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order.

• Copies of documents which have an order number and are referenced in this document, or other Intel literature, may be obtained by calling 1-800-548-4725, or go to: http://www.intel.com/design/literature.htm

• Avoton and other code names featured are used internally within Intel to identify products that are in development and not yet publicly announced for release. Customers, licensees and other third parties are not authorized by Intel to use code names in advertising, promotion or marketing of any product or services and any such use of Intel's internal code names is at the sole risk of the user.

• Intel, Xeon, Atom, Core, Sponsors of Tomorrow and the Intel logo are trademarks of Intel Corporation in the United States and other countries.

• *Other names and brands may be claimed as the property of others.

• Copyright ©2013 Intel Corporation.

Page 38: Big Data and Implications on Platform Architecture

Legal Disclaimer

• Intel® Turbo Boost Technology requires a system with Intel Turbo Boost Technology. Intel Turbo Boost Technology and Intel Turbo Boost Technology 2.0 are only available on select Intel® processors. Consult your PC manufacturer. Performance varies depending on hardware, software, and system configuration. For more information, visit http://www.intel.com/go/turbo.

Page 39: Big Data and Implications on Platform Architecture


Risk Factors The above statements and any others in this document that refer to plans and expectations for the first quarter, the year and the future are forward-looking statements that involve a number of risks and uncertainties. Words such as “anticipates,” “expects,” “intends,” “plans,” “believes,” “seeks,” “estimates,” “may,” “will,” “should” and their variations identify forward-looking statements. Statements that refer to or are based on projections, uncertain events or assumptions also identify forward-looking statements. Many factors could affect Intel’s actual results, and variances from Intel’s current expectations regarding such factors could cause actual results to differ materially from those expressed in these forward-looking statements. Intel presently considers the following to be the important factors that could cause actual results to differ materially from the company’s expectations. Demand could be different from Intel's expectations due to factors including changes in business and economic conditions; customer acceptance of Intel’s and competitors’ products; supply constraints and other disruptions affecting customers; changes in customer order patterns including order cancellations; and changes in the level of inventory at customers. Uncertainty in global economic and financial conditions poses a risk that consumers and businesses may defer purchases in response to negative financial events, which could negatively affect product demand and other related matters. Intel operates in intensely competitive industries that are characterized by a high percentage of costs that are fixed or difficult to reduce in the short term and product demand that is highly variable and difficult to forecast. 
Revenue and the gross margin percentage are affected by the timing of Intel product introductions and the demand for and market acceptance of Intel's products; actions taken by Intel's competitors, including product offerings and introductions, marketing programs and pricing pressures and Intel’s response to such actions; and Intel’s ability to respond quickly to technological developments and to incorporate new features into its products. The gross margin percentage could vary significantly from expectations based on capacity utilization; variations in inventory valuation, including variations related to the timing of qualifying products for sale; changes in revenue levels; segment product mix; the timing and execution of the manufacturing ramp and associated costs; start-up costs; excess or obsolete inventory; changes in unit costs; defects or disruptions in the supply of materials or resources; product manufacturing quality/yields; and impairments of long-lived assets, including manufacturing, assembly/test and intangible assets. Intel's results could be affected by adverse economic, social, political and physical/infrastructure conditions in countries where Intel, its customers or its suppliers operate, including military conflict and other security risks, natural disasters, infrastructure disruptions, health concerns and fluctuations in currency exchange rates. Expenses, particularly certain marketing and compensation expenses, as well as restructuring and asset impairment charges, vary depending on the level of demand for Intel's products and the level of revenue and profits. Intel’s results could be affected by the timing of closing of acquisitions and divestitures. Intel’s current chief executive officer plans to retire in May 2013 and the Board of Directors is working to choose a successor. The succession and transition process may have a direct and/or indirect effect on the business and operations of the company. 
In connection with the appointment of the new CEO, the company will seek to retain our executive management team (some of whom are being considered for the CEO position), and keep employees focused on achieving the company’s strategic goals and objectives. Intel's results could be affected by adverse effects associated with product defects and errata (deviations from published specifications), and by litigation or regulatory matters involving intellectual property, stockholder, consumer, antitrust, disclosure and other issues, such as the litigation and regulatory matters described in Intel's SEC reports. An unfavorable ruling could include monetary damages or an injunction prohibiting Intel from manufacturing or selling one or more products, precluding particular business practices, impacting Intel’s ability to design its products, or requiring other remedies such as compulsory licensing of intellectual property. A detailed discussion of these and other factors that could affect Intel’s results is included in Intel’s SEC filings, including the company’s most recent Form 10-Q, report on Form 10-K and earnings release. Rev. 1/17/13

Page 40: Big Data and Implications on Platform Architecture


Backup

Page 41: Big Data and Implications on Platform Architecture

Disclaimer for “Up to 80% performance boost compared to prior generation”

• Performance comparison using best submitted/published 2-socket server results on the SPECfp*_rate_base2006 benchmark as of 6 March 2012. Baseline score of 271 published by Itautec on the Servidor Itautec MX203* and Servidor Itautec MX223* platforms based on the prior generation Intel® Xeon® processor X5690. New score of 492 submitted for publication by Dell on the PowerEdge T620 platform and Fujitsu on the PRIMERGY RX300 S7* platform based on the Intel® Xeon® processor E5-2690. For additional details, please visit www.spec.org. Intel does not control or audit the design or implementation of third party benchmark data or Web sites referenced in this document. Intel encourages all of its customers to visit the referenced Web sites or others where similar performance benchmark data are reported and confirm whether the referenced benchmark data are accurate and reflect performance of systems available for purchase.
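The "up to 80%" claim can be reproduced from the two benchmark scores quoted in this disclaimer. This is a sketch of the arithmetic only; the variable names are illustrative.

```python
# Relative improvement between the two SPECfp*_rate_base2006 scores quoted above.
baseline_score = 271  # prior generation Intel Xeon processor X5690
new_score = 492       # Intel Xeon processor E5-2690

boost_percent = (new_score / baseline_score - 1) * 100
print(f"New score is {boost_percent:.1f}% higher than the baseline")
```

The result is slightly above 80%, consistent with the rounded figure used on the earlier slides.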