CS15-319 / 15-619 Cloud Computing
Recitation 8
October 15th and 18th, 2013
Announcements
• Encounter a general bug:
– Post on Piazza
• Encounter a grading bug:
– Post privately on Piazza
• Don’t ask whether your answer is correct
• Don’t post code on Piazza
• Search before posting
• Post feedback on OLI
Project 3, Module 1 Reflections
• Common questions about this module:
– Why did Query 6 and Query 7 perform worse after indexing?
– SELECT COUNT(*) FROM songs WHERE duration > (SELECT AVG(duration) FROM songs) ;
– SELECT COUNT(*) FROM songs WHERE duration <= (SELECT AVG(duration) FROM songs) ;
Project 3, Module 1 Reflections
• Common questions about this module:
– Why did Query 6 and Query 7 perform worse after indexing?
• CREATE INDEX idx_duration ON songs (duration, artist_id(255));
• The index is sorted by the concatenation of duration and artist_id
• Binary search can be used for searching
• AVG() will not benefit from binary search, since it has to read every duration value
• COUNT(*) also seems to have some negative effects
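As a rough illustration of the bullets above, the sketch below models the index as a sorted Python list. The sample rows and the helper names `average_duration` and `count_longer_than` are made up for illustration; this is a toy model, not how any particular database implements its indexes.

```python
import bisect

# Hypothetical sample rows: (duration in seconds, artist_id).
songs = [(210, "AR03"), (95, "AR01"), (300, "AR02"), (180, "AR01"), (240, "AR04")]

# Model of the composite index: entries sorted by (duration, artist_id).
index = sorted(songs)
durations = [duration for duration, _ in index]  # leading column, still sorted


def average_duration(sorted_durations):
    # AVG must touch every entry; the sort order offers no shortcut.
    return sum(sorted_durations) / len(sorted_durations)


def count_longer_than(sorted_durations, threshold):
    # Binary search locates the first entry greater than the threshold.
    return len(sorted_durations) - bisect.bisect_right(sorted_durations, threshold)


avg = average_duration(durations)
print(avg, count_longer_than(durations, avg))
```

The range count needs only a logarithmic number of comparisons once the threshold is known, but computing the threshold itself still requires a full pass over the data.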
Project 3, Module 1 Reflections
• Common questions about this module:
– Why did Query 6 and Query 7 perform worse after indexing?
• Indexes and the raw data do not reside together
– For an average, 2 disk reads happen: 1 for the index and 1 for the actual data, which is slow.
• Different databases have different implementations
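The cost argument above can be sketched with a deliberately simplified read-count model. The function names, the table size, and the 60% match rate are all hypothetical, and real databases (with caching, covering indexes, and different storage layouts) will behave differently.

```python
def reads_full_scan(total_rows):
    # Toy model: scanning the table costs one read per row.
    return total_rows


def reads_via_index(matching_rows):
    # Per the slide: one read for the index entry plus one for the row itself.
    return 2 * matching_rows


total_rows = 1_000_000     # hypothetical table size
matching_rows = 600_000    # hypothetical: 60% of the rows pass the filter

print(reads_full_scan(total_rows), reads_via_index(matching_rows))
```

In this toy model the index path only wins when fewer than half of the rows match, which is one way to see why a filter around the average duration may not gain much from an index.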
Unit 3 Quiz
• Average: 78%
[Figure: Unit 3 Quiz score chart; one axis runs from 0% to 120%, the other from 0 to 200]
Modules to Read
• UNIT 4: Cloud Storage
– Module 12: Cloud Storage
– Module 13: Case Studies: Distributed File Systems
– Module 14: Case Studies: NoSQL Databases
– Module 15: Case Studies: Cloud Object Storage
– Quiz 4: Cloud Storage
Project 3
• Files vs. Databases
– File vs. Database
• Vertical Scaling in Databases
– Vertical Scaling
• Horizontal Scaling in Databases
– Horizontal Scaling
• Working with NoSQL: DynamoDB / HBase
– Amazon DynamoDB
– DynamoDB vs. HBase
Project 3 Module 2 - Vertical Scaling
• Explore the database performance by tweaking 2 parameters
– Instance Type
• m1.large
• m1.xlarge
– Storage Type
• RAM Disk
• Ephemeral Disks
• Amazon EBS
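The two parameters above span a small experiment matrix. The sketch below simply enumerates it; the variable names are illustrative, and it does not provision anything or run a benchmark.

```python
from itertools import product

# The two parameters from the slide.
instance_types = ["m1.large", "m1.xlarge"]
storage_types = ["RAM Disk", "Ephemeral Disks", "Amazon EBS"]

# Every (instance type, storage type) pair is one configuration to measure.
configurations = list(product(instance_types, storage_types))

for instance_type, storage_type in configurations:
    print(f"{instance_type:10s} {storage_type}")
```

Listing the combinations up front makes it easier to check that every configuration was measured before comparing results.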
Different Types of Storage
[Diagram: Virtual Machines mapped to Physical Resources within a Region and Availability Zone. Labels:]
• Physical Machine (Server): CPU, Mem, HDD
• Virtual Machine (Instances)
• Memory: RAM Disk
• Internal HDD (RAID?): Ephemeral Disk
• External Storage (Storage Subsystem), via Switches / Routers: Elastic Block Store (EBS)
• Remote copy (Asynchronous, Synchronous Copy), Disaster Recovery
Different Types of Storage
• Memory - RAM Disk
– Inside the server
– Usually from several gigabytes to several hundred gigabytes
• Internal HDD (Hard Disk Drive)
– Inside the server
– Sometimes employs RAID (Why?)
– Usually from hundreds of gigabytes to several terabytes
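One way to approach the "Why?" above is to compare how common RAID levels trade capacity for redundancy. The sketch below uses the standard capacity formulas for RAID 0, 1, and 5; the function name, disk count, and disk size are hypothetical and not taken from the slides.

```python
def usable_capacity_gb(level, disks, disk_size_gb):
    """Usable capacity of an array of identical disks (simplified model)."""
    if level == 0 and disks >= 2:
        # Striping: all capacity is usable, but there is no redundancy.
        return disks * disk_size_gb
    if level == 1 and disks >= 2:
        # Mirroring: every disk holds the same data.
        return disk_size_gb
    if level == 5 and disks >= 3:
        # Striping with parity: one disk's worth of space goes to parity.
        return (disks - 1) * disk_size_gb
    raise ValueError(f"unsupported RAID level/disk count: {level}/{disks}")


# Hypothetical array: 4 disks of 420 GB each.
for level in (0, 1, 5):
    print(f"RAID {level}: {usable_capacity_gb(level, 4, 420)} GB usable")
```

RAID 0 favors capacity and throughput, while RAID 1 and RAID 5 give up some capacity to survive a disk failure.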
Different Types of Storage
• External Storage Subsystems
– Outside of the server
– Connected by cables via switches, routers, directors (Ethernet, Fiber…)
– Provide extra functionalities (Copy services, concurrent volume accesses, grouping, caching…)
– Shared by multiple servers
– Almost always employs RAID
– Capacity ranges from dozens of TB to hundreds of TB
Different Types of Storage
• External Storage Subsystems
– IBM 2424-951 DS8800 System Storage: 182 TB raw, 129 TB usable with RAID 5 (on eBay: US $899,995.00)
– EMC Symmetrix VMAX 40K
Project 3 Module 2 – Vertical Scaling
[Diagram: a Virtual Machine (Instances) on a Physical Machine (Server), with Local Query and Remote Query paths. Storage labels: Memory: RAM Disk; Internal HDD (RAID?): Ephemeral Disk; External Storage (Storage Subsystem), via Switches / Routers: Elastic Block Store (EBS)]
• Explore the database performance by manipulating the following parameters
– Local access vs. remote access
– m1.large vs. m1.xlarge
– RAM disk / ephemeral disk / ephemeral disk with RAID 0 / EBS
– EBS-optimized vs. not EBS-optimized
Upcoming Deadlines
• Project 3:
• Unit 4:
– UNIT 4: Cloud Storage
– Module 12: Cloud Storage
– Module 13: Case Studies: Distributed File Systems
Demo Outline
• Provisioning with spot instances
• Running sysbench
– RAM disk
– Ephemeral disk