InterSystems Symposia 2014
Big Data, Cloud & Virtualization
Tokyo, 2014
Vik Nagjee – Product Manager, Database Platforms
Big Data
Variety
Velocity
Volume
What’s Big about {Big} Data?
The 3 V’s…
The {Big} Data Challenge
Image credit: Diya Soubra [http://www.datasciencecentral.com/forum/topics/the-3vs-that-define-big-data]
What’s the Real {Big} Data Challenge?
Volume
Variety
Velocity
The 4th Dimension of Big Data
VALUE
The {Big} Data Journey: A Data Platform for Just-In-Time Action
Volume Velocity Variety VALUE
Big Data Case Study
ESA: The Gaia Mission
Source: http://upload.wikimedia.org/wikipedia/commons/a/a8/NASA-Apollo8-Dec24-Earthrise.jpg
Gaia: Complete, Faint, Accurate
                           Hipparcos             Gaia
Magnitude limit            12 mag                20 mag
Completeness               7.3 – 9.0 mag         20 mag
Bright limit               0 mag                 6 mag
Number of objects          120,000               47 million to G = 15 mag
                                                 360 million to G = 18 mag
                                                 1192 million to G = 20 mag
Effective distance limit   1 kpc                 50 kpc
Quasars                    1 (3C 273)            500,000
Galaxies                   None                  1,000,000
Accuracy                   1 milliarcsec         7 µarcsec at G = 10 mag
                                                 26 µarcsec at G = 15 mag
                                                 333 µarcsec at G = 20 mag
Photometry                 2-colour (B and V)    Low-res. spectra to G = 20 mag
Radial velocity            None                  15 km s^-1 to G_RVS = 16 mag
Observing                  Pre-selected          Complete and unbiased
Source: http://www.cosmos.esa.int/web/gaia/presentations
One Billion Stars in 3D will provide …
• in our Galaxy …
– the distance and velocity distributions of all stellar populations
– the spatial and dynamic structure of the disk and halo
– its formation history
– a detailed mapping of the Galactic dark-matter distribution
– a rigorous framework for stellar-structure and evolution theories
– a large-scale survey of extra-solar planets (~7,000)
– a large-scale survey of Solar-system bodies (~250,000)
• … and beyond
– definitive distance standards out to the LMC/SMC
– rapid reaction alerts for supernovae and burst sources (~6,000)
– quasar detection, redshifts, microlensing structure (~500,000)
– fundamental quantities to unprecedented accuracy: to 2×10^-6 (2×10^-5 present)
Source: http://www.cosmos.esa.int/web/gaia/presentations
Source: http://n.pr/1p7vyxv
Source: http://www.cosmos.esa.int/web/gaia/data-processing
Core Processing – Powered by InterSystems Caché
• ~1,200,000,000 stars observed by Gaia
• In 5 years, Gaia will observe each star, on average, 80 times:
– 80 x 1,200,000,000 = 96,000,000,000 transits
– 96,000,000,000 transits / 5 years = ~52,600,000 transits / day
• On a nominal day: ~50,000,000 transits = ~285,000 MB of data
• On a “heavy” day: ~350,000,000 transits = ~1,995,000 MB of data
Data volumes
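The transit and volume figures above can be reproduced with a short Python sketch. The star count, observations per star, and nominal-day numbers come from the slide; the 365-day year and all function names are assumptions made here for illustration.

```python
# Back-of-envelope check of the Gaia transit figures quoted on the slide.
STARS = 1_200_000_000                   # from the slide
OBSERVATIONS_PER_STAR = 80              # mission average, from the slide
MISSION_DAYS = 5 * 365                  # assumption: 365-day years

NOMINAL_TRANSITS_PER_DAY = 50_000_000   # from the slide
NOMINAL_MB_PER_DAY = 285_000            # from the slide


def total_transits():
    """Total transits expected over the whole mission."""
    return STARS * OBSERVATIONS_PER_STAR


def mean_transits_per_day():
    """Average daily transits if observations were spread evenly."""
    return total_transits() / MISSION_DAYS


def mb_per_transit():
    """Data produced per transit, implied by the nominal-day figures."""
    return NOMINAL_MB_PER_DAY / NOMINAL_TRANSITS_PER_DAY


def daily_volume_mb(transits):
    """Estimated data volume in MB for a day with the given transit count."""
    return transits * mb_per_transit()


if __name__ == "__main__":
    print(f"total transits:      {total_transits():,}")
    print(f"mean transits / day: {mean_transits_per_day():,.0f}")
    print(f"heavy day (MB):      {daily_volume_mb(350_000_000):,.0f}")
```

The implied size of about 0.0057 MB per transit is what ties the nominal-day and heavy-day volumes together.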
• Per day: ~285,000 MB = ~280 GB = ~0.28 TB
• First 4 months (commissioning period):
– All daily data is kept
– Total growth = 0.28 TB/day x 120 days = ~34 TB
• In the 5th month, cleanup occurs. Remaining data = ~3 TB
• From the 5th month onwards, steady-state size = ~3 TB
Data growth patterns
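The growth pattern above can be sketched as a simple piecewise model. The daily rate, the 120-day commissioning window, and the ~3 TB steady state are taken from the slide; treating the cleanup as an instantaneous drop on day 121 is a simplifying assumption, and `database_size_tb` is a hypothetical name.

```python
# Rough model of database size during and after commissioning.
DAILY_GROWTH_TB = 0.28      # from the slide
COMMISSIONING_DAYS = 120    # "first 4 months", from the slide
STEADY_STATE_TB = 3.0       # from the slide


def database_size_tb(day):
    """Approximate database size in TB at the end of a mission day.

    Assumption: all data is kept through day 120, then a cleanup
    immediately brings the database to its steady-state size.
    """
    if day <= 0:
        return 0.0
    if day <= COMMISSIONING_DAYS:
        return DAILY_GROWTH_TB * day
    return STEADY_STATE_TB


if __name__ == "__main__":
    for day in (30, 60, 120, 150):
        print(f"day {day:>3}: ~{database_size_tb(day):.1f} TB")
```

The peak at the end of commissioning (about 34 TB) is what the storage has to be sized for, even though the long-term footprint is roughly a tenth of that.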
Core Processing – Powered by InterSystems Caché
• One 16-CPU, 1.2 TB RAM server for the IDT/FL DB
• Storage for the IDT/FL DB:
– 1x NetApp FAS3160, 160 SATA disks, iSCSI
– 16x internal SSDs
• List price: ~$200,000 (storage + server)
• One 16-CPU, 1.2 TB RAM server for the asynchronous mirror
• Storage for the async mirror:
– 1x NetApp FAS3250, 35 SATA disks, NFS interconnect
– 16x internal SSDs (internal to each server)
• List price: ~$90,000 (storage + server)
• Application access:
– Java application(s) across ~20 application servers
– Connecting to Caché via JDBC
– List price: ~$10,000 / server = ~$200,000 for the Java application tier
• HA / DR configuration:
– No “hot” HA: 95% uptime SLA guarantee, met by rebuilding the server or failing over to DR
– DR: Caché Database Mirroring
Delightfully Parsimonious Architecture
Mapping the Galaxy for less than $500,000 in hardware [database-specific = ~$300,000]
$500,000 / 1 billion stars = $0.0005 per star
Answering the formation history of the galaxy = Priceless!
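As a quick check, the cost figures can be tallied in a few lines of Python. The list prices are the approximate ones quoted on the slide; the dictionary keys and function names are made up for this sketch.

```python
# Tally of the approximate hardware list prices quoted on the slide (USD).
HARDWARE_LIST_PRICES_USD = {
    "primary_db_server_and_storage": 200_000,
    "async_mirror_server_and_storage": 90_000,
    "java_application_servers": 20 * 10_000,  # ~20 servers at ~$10,000 each
}
DATABASE_ITEMS = ("primary_db_server_and_storage",
                  "async_mirror_server_and_storage")
STARS = 1_000_000_000  # the slide rounds the catalogue to one billion stars


def total_hardware_cost():
    """Sum of all quoted list prices."""
    return sum(HARDWARE_LIST_PRICES_USD.values())


def database_specific_cost():
    """Sum of the database-related items only."""
    return sum(HARDWARE_LIST_PRICES_USD[name] for name in DATABASE_ITEMS)


def cost_per_star(budget_usd):
    """Cost per star for a given hardware budget."""
    return budget_usd / STARS


if __name__ == "__main__":
    print(f"total:             ${total_hardware_cost():,}")
    print(f"database-specific: ${database_specific_cost():,}")
    print(f"per star at $500k: ${cost_per_star(500_000):.4f}")
```

The itemized prices sum to just under the $500,000 headline figure, and the database-specific share comes to a little under $300,000.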
Data Platform for Just-In-Time Action
One unexpected characteristic we have noticed during commissioning concerns stray light. In our test images, an excess of diffuse illumination is sometimes seen on some of the detectors, repeating in a cycle that relates to Gaia’s spin period of 6 hours.
Source: http://www.esa.int/spaceinimages/Images/2013/12/Farewell_to_Gaia
Gaia
Unraveling the chemical and dynamical history of our Galaxy
Cloud & Virtualization
Virtualization & Cloud – Intertwined!
The NIST Definition of Cloud Computing
“Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.”
Source: National Institute of Standards and Technology, Special Publication 800-145
My definition of Cloud Computing
Harnessing advances in information technology to accelerate value delivery to customers
Types of Cloud Computing
• Public Cloud
– an infrastructure as a service (IaaS) provider such as Amazon EC2, Rackspace, Azure, etc.
• Private Cloud
– either provided by an IaaS, or hosted internally at the customer site (using something like OpenStack or CloudStack)
• Virtualization-based Cloud
– this would be something like a VMware (vCloud) environment, or even a fully virtualized environment
• Customer SaaS offering
– where a partner has built a solution based on our products and delivers that solution on a SaaS basis
SaaS
Cloud. The Enabler.
• Deploy Breakthrough Applications in The Cloud
• How?
– Pay-as-you-go
– Virtually *infinite* computing resources
– Elastic
– Provision on-demand
– Stay lean and agile
CAUTION!
Better to go in with your eyes wide open…
Amazon EC2 SLA
Amazon EC2 => 99.95% monthly uptime
• ~22 minutes downtime / month (min threshold)
• Finer print: 99.95% to 99% monthly uptime
• ~22 minutes to 7.2 HOURS downtime/month
• “Service Credit” as compensation
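The downtime figures above follow directly from the uptime percentages. A minimal Python sketch, assuming a 30-day month (the slide does not say which month length it used) and a hypothetical helper name:

```python
# Convert a monthly uptime percentage into the downtime it permits.
MINUTES_PER_MONTH = 30 * 24 * 60  # assumption: 30-day month = 43,200 minutes


def max_downtime_minutes(uptime_percent):
    """Minutes of downtime per month allowed by the given uptime percentage."""
    if not 0.0 <= uptime_percent <= 100.0:
        raise ValueError("uptime_percent must be between 0 and 100")
    return MINUTES_PER_MONTH * (100.0 - uptime_percent) / 100.0


if __name__ == "__main__":
    for uptime in (99.95, 99.0):
        minutes = max_downtime_minutes(uptime)
        print(f"{uptime}% uptime -> {minutes:.1f} min ({minutes / 60:.2f} h)")
```

Under this assumption, 99.95% allows roughly 22 minutes per month and 99% allows 7.2 hours, matching the slide.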
Other considerations?
• Regulatory Compliance
• Cost
• Where’s the Data?
• How Secure Is My Data?
• Etc…
Cloud Case Studies
Providing a Cloud-Enabled Data Platform
3M Health Information Systems
• Ensemble ESB in the Cloud
• Goals
– To simplify inter-application communication
– To reduce maintenance costs
– To increase scalability
– To improve governance of application access
– To increase the flexibility with which new software applications could be added to the overall system, and
– To automate system operations
• Scalable – auto-scale, based on demand
• Elastic – grow, shrink based on demand
• Stateless – no-persistence model
• Automated – single-click, automated deployment
Breakthrough Enterprise Service Bus (ESB) in the cloud
Ontario Systems
• Receivables Management Software for Third-Party Collection Agencies
• New regulatory burden for Collection Agencies – monitor customer complaints, or else!
• Built a cloud-based Complaint Tracker application
• Built & deployed the breakthrough application as a SaaS offering in less than six weeks – using Caché, Ensemble, DeepSee
• Eased burden on existing customers; gained several new customers
“Building on InterSystems technologies, we went from initial concept to delivering a functional product in just 35 days.”
- Chris Cochran, Product Director, Ontario Systems
Breakthrough Software-as-a-Service (SaaS)
Eventsforce
• End-to-end event planning and management solution
• Modular, flexible SaaS offering
• Extremely scalable model – events from tens to thousands of users!
• Extremely elastic model:
– scale up or down during an event
– add or remove functional modules on a live system
• Breakthrough web-based SaaS offering, including mobile app
Breakthrough Software-as-a-Service (SaaS)
Press Computer Systems (PCS) – Social Knowledge
Breakthrough Real-time Perception Management SaaS
Listen | Understand | Engage
Wrap-Up
• Questions?
• You can reach me @ [email protected]
• Thank you!