how to build a better sreryip/building-sre.pdf · disclaimer: this talk represents my own views....
![Page 1: How To Build A Better SREryip/building-sre.pdf · Disclaimer: This talk represents my own views. Agenda ... Influencing Architecture and Implementation . Related Jobs Sysadmin and](https://reader033.vdocuments.net/reader033/viewer/2022043007/5f954845ecd1600de8660df5/html5/thumbnails/1.jpg)
Rae Yip
Sony Gaikai
SCaLE 13x
How To Build A Better SRE
Gaikai – Who we are
● Subsidiary of Sony
● Developed PS4 Remote Play
● Operates PlayStation Now
● Disclaimer: This talk represents my own views
Agenda
● Introduction
● Toolbox
● Concepts
● Application Stacks
● Other Abilities
● Resources
● Q & A
What's an SRE?
● Availability, Scalability, and Security
● Managing Deployment and Changes
● Troubleshooting
● Influencing Architecture and Implementation
Related Jobs
● Sysadmin and Webadmin
– Significant overlap of skills
– Different focus
● Release / Integration Engineer
– Builds, packaging, integration
● Software Engineer / Developer
– DevOps ~ SRE?
Why SRE?
● Traditional 3-tier model may not be best fit
● For people who like challenges
● Cross-disciplinary exposure
● Role tends to reward excellence
Toolbox - Languages
● Languages
– C / C++
– Java
– Python
– JavaScript, Ruby, etc.
● Fluency
– Reading & Writing
– Running & Debugging
Toolbox – Unix (1)
● Commands
– ls, find, man
– df, du
– ping, traceroute, mtr
– ssh
– curl, wget
– rpm, dpkg, ldd
– svn, git
● Scripting
– grep, cut, tr
– awk, sed, jq
– xargs
– bash
– cron
Toolbox – Unix (2)
● System Visibility
– vmstat, iostat, dstat, top
– netstat, lsof
– strace, ltrace
– tcpdump
● Many more
http://www.brendangregg.com/USEmethod/use-linux.html
Toolbox – Unix (3)
● Debuggers
– gdb
– jstack
– pdb
● Profilers
– oprofile
– dtrace
– ftrace
– perf events
– Poor Man's Profiler
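The idea behind the Poor Man's Profiler is to grab stack traces at intervals and count which stacks show up most often. The original tool is commonly described as a gdb-based shell script; what follows is only a minimal Python sketch of the same sampling idea, not that script. The names `sample_stacks` and `busy_loop` are hypothetical and chosen for illustration.

```python
import collections
import sys
import threading
import time
import traceback


def sample_stacks(thread_id, samples=100, interval=0.01):
    """Sample one thread's call stack repeatedly and count identical stacks."""
    counts = collections.Counter()
    for _ in range(samples):
        # sys._current_frames() maps each thread id to its current frame.
        frame = sys._current_frames().get(thread_id)
        if frame is not None:
            # Reduce the stack to a tuple of function names, outermost first.
            stack = tuple(f.name for f in traceback.extract_stack(frame))
            counts[stack] += 1
        time.sleep(interval)
    return counts


if __name__ == "__main__":
    stop = threading.Event()

    def busy_loop():  # hypothetical workload to profile
        while not stop.is_set():
            sum(range(1000))

    worker = threading.Thread(target=busy_loop)
    worker.start()
    counts = sample_stacks(worker.ident, samples=50, interval=0.005)
    stop.set()
    worker.join()
    # Print the most frequently observed stacks with their sample counts.
    for stack, hits in counts.most_common(3):
        print(hits, " > ".join(stack))
```

Stacks that appear in many samples are where the thread spends most of its wall-clock time. The sampling rate and sample count are arbitrary here and would need tuning for a real workload.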
Toolbox – Unix (4)
● Kernel
– dmesg
– modinfo, lsmod
– modprobe, insmod
● Hardware
– lspci
– lshw
– dmidecode
Toolbox – Network (1)
● Host Networking
– ip vs. ifconfig
– Static routing
– iptables (Firewall & NAT)
– VIP failover
– Bonding
Toolbox – Network (2)
● OSI Stack
● TCP vs. UDP
● ICMP, PMTUD
● SCTP
● IPv6
Toolbox – Network (3)
● Application Protocols
– HTTP(S)
– SPDY, HTTP 2.0
– DNS
– NTP
– DHCP
– SNMP
● Routing Protocols
– BGP
– OSPF
– MPLS
Toolbox – Network (4)
● Routers & Switches
– Managed vs. Unmanaged
– Cisco
– Juniper
● Load Balancers
– F5 Big-IP
– Netscaler
– Barracuda
Concepts – Operating Systems
● Scheduler
● Drivers
● User vs. Kernel space
● Virtualization
Concepts – Queueing Theory
● Little's Law
– Load = Arrival Rate x Service Time
● Universal Scalability Law (Gunther 1993)
http://www.perfdynamics.com/Manifesto/USLscalability.html
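Both laws on this slide reduce to short formulas, so a small worked sketch may help. It assumes "Service Time" means the mean time a request spends in the system, and it uses the usual statement of the Universal Scalability Law, C(N) = N / (1 + α(N − 1) + βN(N − 1)), where α models contention and β models coherency cost. The function names and the sample numbers are illustrative assumptions, not values from the talk.

```python
import math


def littles_law_load(arrival_rate_per_s, time_in_system_s):
    """Little's Law: mean number of requests in the system (the 'load')."""
    return arrival_rate_per_s * time_in_system_s


def usl_relative_capacity(n, alpha, beta):
    """Universal Scalability Law: capacity at n nodes relative to one node.

    alpha: contention (serialization) coefficient, assumed 0 <= alpha < 1
    beta:  coherency (crosstalk) coefficient, assumed beta > 0
    """
    return n / (1 + alpha * (n - 1) + beta * n * (n - 1))


def usl_peak_nodes(alpha, beta):
    """Node count at which the USL curve peaks: sqrt((1 - alpha) / beta)."""
    return math.sqrt((1 - alpha) / beta)


if __name__ == "__main__":
    # Illustrative numbers: 200 requests/s, each spending 50 ms in the system.
    print("in flight:", littles_law_load(200, 0.05))
    # Illustrative coefficients; real values come from fitting measurements.
    for n in (1, 8, 32, 128):
        print(n, round(usl_relative_capacity(n, alpha=0.05, beta=0.001), 2))
    print("peak near n =", round(usl_peak_nodes(0.05, 0.001), 1))
```

With any positive β the curve eventually turns downward, which is why adding nodes past the peak can reduce throughput rather than raise it.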
Concepts – Cloud Computing
● What is it?
● How is it different?
● Difference of scale
● Cloud vs. Enterprise
Concepts – Distributed Systems (1)
● Fallacies
– http://en.wikipedia.org/wiki/Fallacies_of_distributed_computing
● Not just theoretical
– https://aphyr.com/posts/288-the-network-is-reliable
● Murphy's Law (De Morgan 1866?)
Concepts – Distributed Systems (2)
● Network is not reliable
– Two Generals' Problem (Akkoyunlu, Ekanadham, Huber 1975)
● Network is not secure
– Byzantine Generals' problem (Pease, Shostak, Lamport 1980)
● CAP principle (Brewer 1999)
Application Stacks (1)
● Apache, nginx
● MySQL / MariaDB, PostgreSQL
● Hadoop, HBase
● Zookeeper
● ElasticSearch, Logstash, Kibana
● Kafka, RabbitMQ
Other Abilities & Traits (1)
● Communication
– Concise writing
– Ability to avoid flame wars
– Interviewing
● Troubleshooting
– USE Method
– Scientific Method
– How To Solve It (Polya)
Other Abilities & Traits (2)
● Dispatching
– Ability to identify the party with the domain expertise to solve the problem
● Ownership
– Ability to see a problem through from end to end
● Statistics
– Ability to use numbers to make your case
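As one small illustration of using numbers to make a case, the sketch below summarizes a set of request latencies with the mean, median, and 99th percentile. The function name `latency_summary` and the sample data are hypothetical and are not taken from the talk. It relies on Python's standard `statistics` module (3.8 or later for `quantiles` and `fmean`).

```python
import statistics


def latency_summary(samples_ms):
    """Return mean, median, and 99th percentile of latency samples."""
    # quantiles(..., n=100) returns 99 cut points; index 98 is the 99th percentile.
    cuts = statistics.quantiles(samples_ms, n=100)
    return {
        "mean": statistics.fmean(samples_ms),
        "median": statistics.median(samples_ms),
        "p99": cuts[98],
    }


if __name__ == "__main__":
    # Hypothetical data: mostly fast requests plus a couple of slow outliers.
    samples = [10] * 98 + [1000] * 2
    for name, value in latency_summary(samples).items():
        print(name, value)
```

In skewed data like this the mean sits well above the median and well below the tail, so quoting only one of them can understate or hide a problem that the tail percentile makes visible.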
Additional Resources (1)
● High Scalability
http://highscalability.com
● Jepsen articles
https://aphyr.com/tags/jepsen
● Brendan Gregg's blog
http://www.brendangregg.com
● ServerFault
http://serverfault.com
Additional Resources (2)
● Conferences
– SCaLE
– Surge
– Velocity
– USENIX (LISA, SREcon)
● Academic papers
– arxiv.org
– IEEE, ACM
Anybody can be an SRE!?
● Not everyone should be an SRE, but good SREs can come from anywhere.
● Diversity of talent and breadth and depth of knowledge are key to solving unanticipated problems.
Thank You!
● Questions?
● Swag