18-847F: Special Topics in Computer Systems
Foundations of Cloud and Machine Learning Infrastructure
Lecture 1: Logistics and Overview
Graduate Seminar Class
A few lectures
Reading research papers
Student presentations
Class Discussions
Final Research Project (No Exams!)
Learning Objectives
o Know the state-of-the-art frameworks in cloud and machine learning and their theoretical foundations
o Read and provide constructive criticism of research papers
o Present to an audience, and answer their questions
o Do creative, collaborative research
Why study Cloud and ML infrastructure?
What are the largest words after ‘Big Data’?
Big Data Gold Rush
Who got rich in the California gold rush?
In the Big Data rush, it’s the infrastructure companies
Topics Covered
o Cloud Computing
o Distributed Storage
o Machine Learning
(Figures: a parameter server applying w' = w − αΔw and exchanging w and Δw with model replicas; stored blocks a, b, and a+b)
Cloud Computing
o Scheduling in Parallel Computing
o MapReduce, Spark
o Straggler Replication
o Task Replication in Queueing Systems
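To make the straggler-replication topic concrete, here is a minimal sketch (not taken from the slides). It assumes task service times are independent and exponentially distributed, launches `replicas` copies of each task, and keeps whichever copy finishes first. The function names and the exponential assumption are illustrative only; the model ignores queueing delay and the cost of running the extra copies.

```python
import random


def task_latency(replicas, rng, mean_service_time=1.0):
    """Latency of one task when `replicas` copies run in parallel.

    Assumption: each copy's service time is exponential with the given mean,
    and the task completes as soon as the fastest copy finishes.
    """
    rate = 1.0 / mean_service_time
    return min(rng.expovariate(rate) for _ in range(replicas))


def mean_latency(replicas, num_tasks=20000, seed=0):
    """Average latency over many simulated tasks (seeded for repeatability)."""
    rng = random.Random(seed)
    total = sum(task_latency(replicas, rng) for _ in range(num_tasks))
    return total / num_tasks


if __name__ == "__main__":
    for r in (1, 2, 3):
        print(f"replicas={r}  mean latency={mean_latency(r):.3f}")
```

Under this exponential assumption the minimum of r copies has mean 1/r of the single-copy mean, so the simulated averages should fall roughly in that proportion. Heavier-tailed service times generally change the trade-off between latency and wasted work.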
Distributed Storage
o Coding for locality/repair
o Systems implementation of codes
o Reducing latency in content download
(Figure: stored blocks a, b, and a+b)
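The a, b, a+b picture can be illustrated with a small sketch. It assumes that "+" means bytewise XOR and that the three blocks sit on three separate nodes; the function names are hypothetical. Any two of the three blocks are then enough to rebuild both a and b.

```python
def xor_bytes(x: bytes, y: bytes) -> bytes:
    """Bytewise XOR of two equal-length blocks."""
    if len(x) != len(y):
        raise ValueError("blocks must have equal length")
    return bytes(p ^ q for p, q in zip(x, y))


def encode(a: bytes, b: bytes) -> dict:
    """Store two data blocks plus one parity block (a XOR b)."""
    return {"a": a, "b": b, "a+b": xor_bytes(a, b)}


def recover(available: dict) -> tuple:
    """Rebuild (a, b) from any two of the three stored blocks."""
    if "a" in available and "b" in available:
        return available["a"], available["b"]
    if "a" in available and "a+b" in available:
        # b = a XOR (a XOR b)
        return available["a"], xor_bytes(available["a"], available["a+b"])
    if "b" in available and "a+b" in available:
        # a = b XOR (a XOR b)
        return xor_bytes(available["b"], available["a+b"]), available["b"]
    raise ValueError("need at least two distinct blocks")


if __name__ == "__main__":
    stored = encode(b"clou", b"data")
    # Simulate losing the node that holds block "a".
    survivors = {k: v for k, v in stored.items() if k != "a"}
    print(recover(survivors))
```

Compared with keeping two full copies of each block (four blocks in total), this layout stores three blocks and still tolerates the loss of any one node.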
Machine Learning
(Figure: a parameter server applies the update w' = w − αΔw and exchanges w and Δw with several model replicas)
o SGD and its convergence
o Distributed Deep Learning
o Hyper-parameter tuning
o GANs, Deep reinforcement learning
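The parameter-server update w' = w − αΔw from the figure can be sketched as follows. This is a toy, single-process simulation under stated assumptions: a linear model with squared loss, three hypothetical data shards generated from the weights (2, −1), and replicas that take turns pulling w and pushing a gradient Δw. Real systems run the replicas concurrently and often asynchronously; the class and function names are illustrative, not any framework's API.

```python
class ParameterServer:
    """Holds the shared weights w and applies w' = w - alpha * delta_w."""

    def __init__(self, w, alpha):
        self.w = list(w)
        self.alpha = alpha

    def pull(self):
        """A replica fetches a copy of the current weights."""
        return list(self.w)

    def push(self, delta_w):
        """A replica sends its gradient; the server updates the weights."""
        self.w = [wi - self.alpha * di for wi, di in zip(self.w, delta_w)]


def replica_gradient(w, shard):
    """Mean gradient of 0.5 * (w.x - y)^2 over one replica's data shard."""
    grad = [0.0] * len(w)
    for x, y in shard:
        err = sum(wi * xi for wi, xi in zip(w, x)) - y
        for j, xj in enumerate(x):
            grad[j] += err * xj / len(shard)
    return grad


def train(shards, alpha=0.1, rounds=100):
    """Replicas take turns: pull w, compute delta_w locally, push it back."""
    server = ParameterServer([0.0, 0.0], alpha)
    for _ in range(rounds):
        for shard in shards:
            w = server.pull()
            server.push(replica_gradient(w, shard))
    return server.pull()


# Hypothetical noiseless data generated from y = 2*x1 - 1*x2.
SHARDS = [
    [((1.0, 0.0), 2.0), ((0.0, 1.0), -1.0)],
    [((1.0, 1.0), 1.0), ((1.0, -1.0), 3.0)],
    [((2.0, 1.0), 3.0), ((0.0, 2.0), -2.0)],
]

if __name__ == "__main__":
    print(train(SHARDS))
```

With this step size the learned weights should approach (2, −1). A larger α or stale gradients from slow replicas can slow or break convergence, which connects to the SGD convergence topic listed above.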
Instructor: Gauri Joshi
SM + PhD, 2010-2016
B.Tech + M.Tech, 2005-2010
Research Staff Member, 2016-2017
Assistant Professor, Fall 2017 onward
Internships
Have worked in all these areas
o Cloud Computing
o Distributed Storage
o Machine Learning
Student Introductions
o Name?
o Department?
o Undergrad/Masters/PhD?
o Previous related classes (if any)?
o What are you looking to learn from this class?
Waiting list will be cleared soon!
Class Hours and Website(s)
o When: Mon, Wed 4:30-6:00 pm
o Where: Scaife Hall 222
o Class Website (Readings, Schedule): https://www.andrew.cmu.edu/user/gaurij/18-847F-Fall-2018.html
o Canvas Site (Readings, Assignments, Projects): https://canvas.cmu.edu/
o No prerequisites. Basic knowledge of probability and linear algebra is encouraged.
Reading Material
Papers will be posted on the class website or on Canvas
o Book chapters
o Survey papers
o Theory papers (Scheduling, Queuing, Coding, Optimization)
o Systems papers (Cloud, Machine Learning)
Additional reference books listed in the syllabus
Instructor/TA and Office Hours
Instructor: Prof. Gauri Joshi (gaurij [AT]andrew.cmu.edu)
TA: Jianyu Wang (jianyuw1 [AT]andrew.cmu.edu)
Office Location: CIC 4105
Office Hours: By appointment
Lectures
o Next week: Deeper Overview of probability and queuing theory
o Guest lectures during the semester by authors of papers relevant to this class
Homeworks (~50%)
o Submit paper review (due 10:00 am before class)
o ~Two reviews per week
o Discussion with classmates is okay, but write reviews in your own words.
Paper Review Format
o Summary of the paper
o Reflects your understanding of the paper
o Significance & correctness of results
o Discussion Questions for Class (at least 2)
o Confusions about the paper, open research directions
o Answers to concept-check questions
Homework Grading Rubric (Total: 10 pts)
o Understanding of the paper (4 pts)
o Discussion Questions (3 pts)
o Concept-check questions (3 pts)
Class Presentations (~15%)
o Sign up for presentation at least 1 week in advance
o Each student will present 1-2 times in the semester (depends on # of students registered)
o 20 min presentation, followed by 25 min discussion
  o Motivation and Related work
  o Summary of main results
  o Your views on the paper
Presentation Grading Rubric (Total: 10 pts)
o Motivation (1.5 pts)
o Clarity (1.5 pts)
o Understanding/Correctness (4 pts)
o Peer-review Feedback (3 pts)
Class Participation (~15%)
o The class will be divided into groups of 3-4 students each
o Each group will discuss one of the discussion questions among themselves
o Summarize the discussion to the whole class
Participation Grading Rubric (Total: 5 pts)
o Attendance and attention (1.5 pts)
o Speaking up in class (1.5 pts)
o Insightful Questions/Comments (2 pts)
Research Project (~20%)
o Groups of 1-3
o Original research on a topic of your choice
  o Topics aligned with your research allowed and encouraged
  o If you can’t think of topics, come talk to Jianyu or me
o Possible Project Types:
  o New theoretical analysis
  o Implementation using one of the frameworks discussed
  o In-depth literature survey of a particular topic
Timeline
o 1-page proposal due Oct 3
o Publishable quality report (max 5 pg) in ACM format
  o Initial draft due: Nov 21
  o Final report due: Dec 7
o Last week of class: Presentations (~25 min per group)
o Peer-review other presentations
Project Grading Rubric (Total: 20 pts)
o Originality (1 pt)
o Review of Related Work (1.5 pts)
o Writing and Organization (1.5 pts)
o Technical Results (4 pts)
o Final presentation (10 pts)
o Peer-Review (2 pts)
In Summary..
o Paper Reading
o Submitting Reviews
o Class Presentations
o Final Project
Might seem like a lot of work, but...
o You will get fast and efficient at reading papers
o The project will be a fun, collaborative exercise
o No exams!
TO DO
o Fill out the sign-up sheet
o Sign up for presentations
o Start reading the papers
o Form groups for class projects
o Start thinking about projects