Presentation: Architecting a Cloud Infrastructure


Post on 16-Aug-2015







  1. Architecting a Cloud Infrastructure. Moderator: Chris Colotti, VMware, Inc. With Aidan Dalgleish, VMware, Inc.; Duncan Epping, VMware, Inc.; David Hill, VMware, Inc.; Rawlinson Rivera, VMware, Inc. Session INF-VSP1168, #vmworldinf.
  2. Disclaimer: This session may contain product features that are currently under development. This session/overview of the new technology represents no commitment from VMware to deliver these features in any generally available product. Features are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind. Technical feasibility and market demand will affect final delivery. Pricing and packaging for any new technologies or features discussed or presented have not been determined.
  3. Agenda: Intros; Gathering requirements; Sizing and Scaling; Host Design; vCenter Design; Cluster Design; Networking and Security; Storage; Concluding.
  4. Introduction: Chris Colotti, Consulting Architect, VMware Global Cloud CoE, VCDX 37, Twitter @CColotti. David Hill (not available for VMworld US), Senior Solutions Architect, GTS, Twitter @DaveHill99. Aidan Dalgleish, Consulting Architect, VMware Global Cloud CoE, VCDX 10, Twitter @AidersD.
  5. Introduction: Duncan Epping, Principal Architect, VMware Technical Marketing, VCDX 7, Twitter @DuncanYB. Rawlinson Rivera, Senior Consultant, VMware Professional Services, VCDX 86, Twitter @PunchingClouds.
  6. What is this session about? Architecting a Cloud Infrastructure. See the CIM Case Study whitepaper. This is not a blueprint, it is an example! Design decisions; real-world examples; understanding the potential pitfalls. To tweet or blog about this session, please use the #vmworldinf and #VSP1168 hashtags. Take pictures and tweet them! Best picture will get a free copy of the vSphere 5.1 Clustering Deepdive.
  7. Gathering Requirements
  8. Talk to your customer: the most important part of any engagement. Gather information and document! Categorize: requirements, nice to have, constraints, assumptions. Conceptualize. Sound like VCDX to anyone?
  9. Example Requirements: Increasing agility / flexibility while decreasing cost of doing business; availability of services defined as 99.9% during core business hours; security compliance requires network isolation for specific workloads from other services; minimal workload deployment time; should be able to guarantee resources to groups of workloads as part of internal SLAs; recovery time objective in the case of a datastore failure should be less than 8 hours; servers hosted in the DMZ should be protected within the virtual environment.
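The 99.9% availability requirement above only becomes a concrete downtime budget once "core business hours" is pinned down. A minimal sketch, assuming hypothetical core hours of 10 hours a day, 5 days a week, 52 weeks a year (the slides do not define them); the function name is illustrative only:

```python
def allowed_downtime_hours(availability, hours_per_day, days_per_week, weeks_per_year=52):
    """Return the yearly downtime budget, in hours, inside the measured window."""
    measured_hours = hours_per_day * days_per_week * weeks_per_year
    return measured_hours * (1.0 - availability)


# Hypothetical core business hours: 10 h/day, 5 days/week (an assumption, not from the slides).
budget = allowed_downtime_hours(0.999, hours_per_day=10, days_per_week=5)
print(f"Downtime budget: {budget:.1f} hours per year")
```

Under these assumed hours the budget is about 2.6 hours a year, which is worth stating explicitly to the customer when the SLA is agreed.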
  10. Example Constraints: Dell and AMD have been preselected as the platform of choice; eight 1GbE ports will be used per server; NetApp's NAS offering has been preselected as the solution of choice; all Tier 2 NAS volumes are de-duplicated; physical switches will not be configured for QoS; existing Cisco top-of-rack environment should be used for the virtual infrastructure.
  11. Example Assumptions: Sufficient switch ports are available; current storage infrastructure can handle the expected workload; staff are properly trained on vSphere.
  12. Historical Best Practices: It is best practice to have a 500GB LUN with a maximum of 15 VMs. It is best practice to have an 8-host cluster (vSphere; vCloud Director Fast Provisioning limitations). It is best practice to have a maximum of 4 VMs per core. It is best practice to have a dedicated 1GbE link for vMotion. It is best practice to have high / medium / low resource pools. It is best practice to have an isolated management network. It is best practice to configure vMotion and Management on a VSS and the other traffic on a VDS. It is best practice to ...
  13. Some Use Cases: Server consolidation (power and cooling savings, green computing, lowering TCO); OPEX savings on redundant tasks; self-service provisioning; server infrastructure resource optimization (load balancing, high availability); standardization; business agility (rapid provisioning); Infrastructure as a Service (IaaS).
  14. Conceptualize Your Design: Building blocks; operations; time to market; compliance.
  15. Sizing / Scaling Exercise
  16. Basic details: What does the environment look like today? How many sites? How many potential virtualization candidates? Multiple waves? How will this impact your design / project? Different cluster / datacenter structure? Within the limits? Sizing based on X waves / years? What is the use case? Server consolidation? IaaS? Service Level Agreements (SLA)?
  17. Tooling Options: Use tools like VMware Capacity Planner, PlateSpin Recon, Lanamark. Do we really need it? Don't all results just look the same? What is important? What am I designing for? Average vs peak; consolidation vs performance.
  18. Compute Considerations: How many eggs in one basket? Two sockets vs four sockets. Optimal memory configurations: 8GB DIMMs are cheaper than 2 x 4GB; triple-channel configurations; number of DIMM slots might be different per vendor / model. AMD vs Intel: AMD supports more cores, while Intel generally is faster; VMmark can be used to make perf comparisons! TPS vs no TPS. Using 64-bit guest OSes? Performance gain. Sweet spot? Still seems to be dual socket with 96GB of memory.
  19. Network Sizing: Is this ever really a bottleneck? In most of the Capacity Planner reports we've seen, the expected average network bandwidth requirement is ~4 Mbps, based on an average of 20 VMs per ESXi host. 10GbE will lift all (or most) constraints for a very long time! Use the report to identify anomalies!
  20. Storage Sizing: Not only size but performance matters! $(\text{Total IOPS} \times \%\text{Read}) + ((\text{Total IOPS} \times \%\text{Write}) \times \text{RAID penalty})$. Example: $(42 \times 62\%) + ((42 \times 38\%) \times 2) = 26.04 + (15.96 \times 2) = 26.04 + 31.92 = 57.96$. But what about size? How does this drive your storage considerations?
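The storage sizing formula above can be checked with a few lines of code. This is a minimal sketch that reproduces the slide's worked example (42 IOPS, 62% read, RAID write penalty of 2); the function and parameter names are illustrative only:

```python
def backend_iops(frontend_iops, read_fraction, raid_write_penalty):
    """Back-end IOPS = reads + (writes x RAID write penalty)."""
    write_fraction = 1.0 - read_fraction
    reads = frontend_iops * read_fraction
    writes = frontend_iops * write_fraction * raid_write_penalty
    return reads + writes


# Values from the slide: 42 IOPS, 62% read / 38% write, RAID penalty 2.
result = backend_iops(42, 0.62, 2)
print(f"{result:.2f}")  # 57.96
```

The same function can be rerun with a higher penalty to see how quickly the back-end load grows for write-heavy workloads.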
  21. Storage Considerations: RAID level used impacts IOPS; the IOPS penalty can be severe. Spindle count. RTO impacts the number of VMs per datastore: the backup environment needs to be capable of restoring within the RTO window, so VMs per datastore = (RTO * restore speed) / avg VM size. Don't confuse Mb/s with MB/s! Adding it up: 270 VMs from a backup perspective vs 50 VMs from an IOPS perspective. What does your customer feel comfortable with? Going SSD / hybrid solutions? Potential undesired results. vCloud Director catalogs.
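The RTO rule of thumb above, (RTO * restore speed) / avg VM size, is easy to get wrong by a factor of eight when megabits and megabytes are mixed up. A minimal sketch, assuming a hypothetical restore link of 800 Mb/s and a hypothetical average VM size of 40 GB (neither figure is from the slides), with 1 GB taken as 1024 MB:

```python
def megabits_to_megabytes_per_s(megabits_per_s):
    """Convert Mb/s (megabits) to MB/s (megabytes): 8 bits per byte."""
    return megabits_per_s / 8.0


def max_vms_per_datastore(rto_hours, restore_mb_per_s, avg_vm_size_gb):
    """(RTO x restore speed) / average VM size, rounded down to whole VMs."""
    restorable_gb = rto_hours * 3600 * restore_mb_per_s / 1024
    return int(restorable_gb // avg_vm_size_gb)


# The 8-hour RTO comes from the example requirements; the rest is assumed.
restore_speed = megabits_to_megabytes_per_s(800)   # 100 MB/s
print(max_vms_per_datastore(8, restore_speed, 40))
```

With these assumed inputs the result is 70 VMs per datastore. Feeding 800 in directly as if it were MB/s would overstate the limit eightfold, which is the confusion the slide warns about.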
  22. Hosts
  23. Design Considerations: Vendor: AMD vs Intel. Blade vs rack: density increases; hot spots; costs; management. Additional considerations: Is embedded ESXi available? How much local SSD (capacity and IOPS) can it handle? Does it have built-in 2x 10GbE ports? Does the built-in NIC have hardware iSCSI capability? Management integration.
  24. ESXi boot considerations: 4 methods of booting ESXi: local disk; local SD / USB; SAN boot; PXE boot with Auto Deploy (GUI Fling by Max Daneri!). Considerations: USB is cheap; local disk usually has higher availability than USB; SAN boot makes it easy to move identity, but what about costs? Best of all worlds: PXE boot! Brand new, and it has dependencies.
  25. What Is Auto Deploy: No boot disk? Where does it go? A boot disk holds: Platform Composition (ESXi base, drivers, CIM providers, ...); Configuration (networking, storage, date/time, firewall, admin password, ...); Running State (VM inventory, HA state, license, DPM configuration); Event Recording (log files, core dump). All information on the state of the host is stored off the host in vCenter.
  26. What Is Auto Deploy: No boot disk? Where does it go? All information on the state of the host is stored off the host in vCenter. Image Profile: Platform Composition (ESXi base, drivers, CIM providers, ...). Host Profile: Configuration (networking, storage, date/time, firewall, admin password, ...). vCenter Server: Running State (VM inventory, HA state, license, DPM configuration). Add-on Components: Event Recording (log files, core dump).
  27. vCenter
  28. Design Considerations: How many VMs? Do I need a dedicated vCenter Server? Can I still use the vCenter Appliance? Is there a need for the Web Client? Can I use the vCenter Appliance for that? Use it! Will there be other products used like SRM / View / vCloud Director? vCenter Heartbeat required? Statistic levels will impact performance / scaling.
  29. Sizing vCenter and Update Manager: Read the documentation! 50 hosts / 500 VMs: 2 vCPUs, 4GB. 300 hosts / 3000 VMs: 4 vCPUs, 8GB. 1000 hosts / 10000 VMs: 8 vCPUs, 16GB. Do we want to scale up or scale out? vSphere Update Manager on the same server? How many users will be using vCenter? Use the sizing calculators for the database. Consider reservations.
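The sizing tiers above amount to a small lookup table. The following sketch picks the smallest tier that covers both the host count and the VM count; the tier figures are the ones on the slide and should still be checked against the product documentation, and the names `SIZING_TIERS` and `vcenter_size` are illustrative only:

```python
# (max hosts, max VMs, vCPUs, memory in GB), as listed on the slide.
SIZING_TIERS = [
    (50, 500, 2, 4),
    (300, 3000, 4, 8),
    (1000, 10000, 8, 16),
]


def vcenter_size(hosts, vms):
    """Return (vCPUs, memory_gb) for the smallest tier covering hosts and VMs."""
    for max_hosts, max_vms, vcpus, memory_gb in SIZING_TIERS:
        if hosts <= max_hosts and vms <= max_vms:
            return vcpus, memory_gb
    raise ValueError("inventory exceeds the largest tier; consider scaling out")


print(vcenter_size(40, 450))    # fits the smallest tier
print(vcenter_size(40, 2000))   # VM count pushes it into the middle tier
```

Checking hosts and VMs together matters: a small number of dense hosts can still land in a larger tier because of the VM count.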
  30. Clustering
  31. Where do we start? How many physical datacenters will there be? Will each physical DC need a vCenter Server? For each vCenter, do we need multiple virtual Datacenters? For each DC, do we need multiple Clusters? For each Cluster, how many hosts? [Diagram: hierarchy of Physical DC, vCenter, Datacenter, Cluster, ESXi]
  32. Design Considerations: Separate clusters for DMZ? Why not use vShield App or vShield Edge? Separate clusters for test? Might also allow you to test vSphere patches! The vCenter Datacenter object is a vMotion boundary, not the Cluster! The vCenter Datacenter object is a VDS boundary, not the Cluster! Will you be using HA / FT / DRS / DPM? Did you know that each DRS cluster has its own thread on the vCenter Server? Did you know that with vSphere 5 there's a thing called Datastore Heartbeating? There is no primary / secondary concept as of vSphere 5.0. Admission Control is important!
  33. Design Considerations: Is 8 the perfect cluster size? Primary / secondary nodes (4.1 and prior) vs master / slave (5.x). Blade environment implications on design? LUN count vs path count. Linked clones being used? DRS and DPM love big clusters. HA benefits from big clusters. What about EVC? Should I turn it on by default? EVC can only be enabled when all VMs are powered off. Did you know that DRS requires EVC to be enabled to balance and place FT virtual machines in a cluster? Is there a need for Resource Pools? How will you handle shares? Reservations / limits?
  34. Networking
  35. Design Considerations: What type of vSwitch will be used? VSS vs VDS vs Cisco Nexus 1000v. If vCloud Director is used, what Network Pools are required? What are the pSwitch capabilities? Will VLANs be used? Will PVLANs be required? Consider vShield App? Requirements for Jumbo Frames? They help support larger vCDNI packets and/or IP storage. What type of load balancing will be used, for what type of traffic? Load Based Teaming vs Virtual Port ID vs IP Hash. Additional security requirements?
  36. Design Considerations: Network I/O Control: even in a 1GbE environment NIOC is useful, especially when connecting outbound you want to ... Did you know that limits apply at the NIC pair level? Did you know that shares apply at the NIC port level? Additional security requirements? vShield App; vShield App with Data Security; vShield Edge. 3rd-party security products? HyTrust: 2-factor authentication; audit trails; rigid, hierarchical access controls.
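To make the point about shares on a NIC port concrete, here is a rough sketch of a share-proportional split of one uplink under contention. It assumes that every listed traffic type is active and saturating the link, and the share values are hypothetical; it is a simplification for illustration, not a model of the actual NIOC scheduler:

```python
def split_by_shares(link_mbps, shares):
    """Divide one uplink's bandwidth among traffic types in proportion to shares."""
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("at least one traffic type needs a positive share value")
    return {name: link_mbps * value / total for name, value in shares.items()}


# Hypothetical share values for a single 1GbE uplink (not from the slides).
shares = {"management": 20, "vmotion": 50, "virtual_machine": 100, "nfs": 30}
for name, mbps in split_by_shares(1000, shares).items():
    print(f"{name}: {mbps:.0f} Mbps")
```

Because the split is relative, a traffic type is only held to its portion while the others are competing for the link; an absolute cap is a limit, which the slide notes is applied at a different level.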
  37. vShield App Considerations: vNIC-level firewall. DVFilter used for in- and outbound traffic. vShield App firewall per host! Set rules on vCenter objects like Resource Pools and Portgroups. Deploying a VM with the right rules is easy! Did you know that the vShield Manager cannot be locked out? The DVFilter is not applied to the vShield Manager. You can exclude VMs from protection!
  38. vShield Edge considerations: Multiple edge security services in a single appliance: firewall (5-tuple), NAT, DHCP, VPN, load balancing, network isolation. Data Security options: useful for compliance. Think about resiliency! vSE HA from within vCD in 5.1. vCloud Director leverages vShield Edge heavily! [Diagram: Tenant A, Tenant C, and Tenant X, each behind its own VMware vShield Edge secure virtual appliance providing VPN, load balancer, and firewall]
  39. vShield Manager considerations: Local database backup: vSM uses a local MySQL database; if the database is lost, all configurations on vShield Edge are lost. Availability considerations: FT is supported (but not as of 5.1), and of course HA and VM Monitoring; the new 5.1 appliance ships with 2 vCPUs. vShield Manager failure: all existing, published rules continue to be enforced; all flow logging continues to be sent to the syslog server; no changes to rules or settings can be made; regular vShield Manager backups can be used to rebuild vShield Manager if needed. Security considerations: default passwords.
  40. Storage
  41. Design Considerations: Protocol wars! Multiple tiers? Or even auto-tiering, what is the impact? vSphere Storage APIs: Array Integration (VAAI): does it impact sizing? vSphere Storage APIs: Storage Awareness (VASA): will it impact operations? Thin provisioning? Thin, Thick, and Eager Zeroed Thick. vSphere vs storage array!
  42. Design Considerations: Can we use Storage DRS? Impact on storage array features? Impact on sizing? Impact on other VMware products like vCloud Director? Profile-Driven Storage? How does it utilize VASA? DR requirements? Or possibly in the future? No more worrying about block sizes with VMFS-5. When upgrading VMFS-3 to VMFS-5, the block size does not change! Did you know VAAI is T10 compliant? That makes leveraging it easier for lower-end devices.
  43. Impact of Features: Storage DRS has constraints. SRM does not support Storage vMotion / Storage DRS. vCloud Director does support Storage DRS in 5.1!

      | Feature or Product | Initial Placement | Migration Recommendations |
      |---|---|---|
      | Array-based replication (SRDF, MirrorView, SnapMirror, etc.) | Supported | Manual: I/O and Space |
      | Array-based snapshots | Supported | Manual: I/O and Space |
      | Array-based dedupe | Supported | Manual: I/O and Space |
      | Array-based thin provisioning | Supported | Fully Automated: I/O and Space |
      | Array-based auto-tiering (EMC FAST, Compellent Data Progression, etc.) | Supported | Manual: Space |
      | Array-based I/O balancing (Dell EqualLogic) | Supported | Manual: Space |

  44. Questions

