Post on 12-Nov-2014
Solaris 10 Containers
Kimberly Chang
OS Ambassador
Solaris 10 Adoption, US Client Solutions
http://webhome.sfbay/kchangs
http://blogs.sun.com/kchangs
Server Virtualization
Solaris Containers and Solaris Dynamic System Domains

[Diagram: a Sun server partitioned into Domain 1 and Domain 2, with Containers 1 through 5 running across the domains.]
Server Virtualization

• Consolidates multiple applications
• Provides a security perimeter between applications and the underlying system
• Makes more effective use of hardware
• Simplifies administration
• Adds flexibility to resource management
• Can be hardware- or software-based
Container Components

• Full resource containment – SRM (Solaris 9)
  > Provides predictable service levels
• Isolation – Zones (Solaris 10)
  > Prevents unauthorized access (security boundary)
  > Minimizes fault propagation (fault boundary)
• Service Management Application
  > Ease of management – GUI Container Manager
Zones

• Provide virtualized OS environments, each looking like a Solaris instance
  > Implemented via a lightweight layer in the OS
  > Details of physical resources are hidden
  > Separate nodename, IP address, IP port space
  > Processes cannot see or affect processes in other containers
  > Each zone can be administered independently
  > No porting needed, as the ABI/API is the same
Zones Block Diagram

[Diagram: a global zone (serviceprovider.com) hosting two non-global zones over network devices ce0 and ge0. The red zone (red.com; zone root /zone/redzone; interface ce0:1; storage /aux0/redspace) runs web services (Apache 1.3.22, J2SE), enterprise services (Oracle 8i, IAS 6), login services (OpenSSH sshd 3.4), and core services (ypbind, automountd). The sun zone (sun.com; zone root /zone/sunzone; interfaces ge0:1 and ge0:2) runs web services (Apache 2.0, Tomcat), login services (OpenSSH sshd 3.4), and core services (ldap_cachemgr, inetd). Each zone sees its own /usr and /opt. The global zone runs core services and zone management (zonecfg(1M), zoneadm(1M), zlogin(1), ...), with one zoneadmd per zone.]

Note: by default, "zone" refers to a non-global zone.
Granularity

• 8,000+ zones per OS instance
• Doesn't require dedicated CPUs, memory, physical devices, etc.
  > Just space for a unique root filesystem
• Existing hardware resources can be:
  > multiplexed across zones, or
  > allocated per zone using resource pools
Security

• Security boundary around each zone
• Restricted subset of privileges
  > A compromised zone is unable to escalate its own privileges
• Important name spaces are isolated
• Processes running in a zone are unable to affect activity in other zones or the global zone
Zone Security Properties

• Services can be isolated from each other
  > Quarantining potentially risky software
  > Isolating multiple mutually distrusting parties
  > Containing the potential damage from a breach
• The Global Zone can:
  > observe all activity inside each zone
  > change the contents or processes of each zone
  > not be seen by software in each zone
• Non-global zones run with fewer privileges
Zones are Less Privileged
contract_event – Request reliable delivery of events
contract_observer – Observe contract events for other users
cpc_cpu – Access to per-CPU performance counters
dtrace_kernel – DTrace kernel tracing
dtrace_proc – DTrace process-level tracing
dtrace_user – DTrace user-level tracing
file_chown – Change a file's owner/group IDs
file_chown_self – Give away (chown) own files
file_dac_execute – Override a file's execute permissions
file_dac_read – Override a file's read permissions
file_dac_search – Override a directory's search permissions
file_dac_write – Override a (non-root) file's write permissions
file_link_any – Create hard links to files with a different uid
file_owner – Non-owner can do miscellaneous owner operations
file_setid – Set uid/gid (non-root) to a different id
ipc_dac_read – Override read permissions on IPC, shared memory
ipc_dac_write – Override write permissions on IPC, shared memory
ipc_owner – Override set permissions/owner on IPC
net_icmpaccess – Send/receive ICMP packets
net_privaddr – Bind to privileged ports (<1023, plus extras)
net_rawaccess – Raw access to IP
proc_audit – Generate audit records
proc_chroot – Change root (chroot)
proc_clock_highres – Allow use of high-resolution timers
proc_exec – Allow use of execve()
proc_fork – Allow use of fork*() calls
proc_info – Examine /proc of other processes
proc_lock_memory – Lock pages in physical memory
proc_owner – See/modify other process states
proc_priocntl – Increase priority/scheduling class
proc_session – Signal/trace other session processes
proc_setid – Set process UID
proc_taskid – Assign a new task ID
proc_zone – Signal/trace processes in other zones
sys_acct – Manage the accounting system (acct)
sys_admin – System admin tasks (e.g. domain name)
sys_audit – Control the audit system
sys_config – Manage swap
sys_devices – Override device restrictions (exclusive)
sys_ipc_config – Increase IPC queue size
sys_linkdir – Link/unlink directories
sys_mount – Filesystem administration (mount, quota)
sys_net_config – Configure network interfaces, routes, stack
sys_nfs – Bind NFS ports and use NFS syscalls
sys_res_config – Administer processor sets, resource pools
sys_resource – Modify resource limits (rlimit)
sys_suser_compat – Third-party modules' use of suser
sys_time – Change the system time

Legend: Interesting – some interesting privileges; Basic – non-root privileges; Removed – not available in zones.
Processes
• Certain system calls are not permitted, or have restricted scope, inside a zone
  > http://developers.sun.com/solaris/articles/application_in_zone.html
• All processes can be seen from the global zone
  > But control of those processes is privileged
• Inside a zone, only processes in the same zone can be seen or affected
• proc(4) only shows processes in the same zone
Zone Filesystem

[Diagram: global view vs. zone view of the filesystem. In the global view, the global root / contains /usr, /dev, /etc, /proc, /export, and /zone, under which the zone roots (zone1, 2, 3) live. In the zone view, Zone 1's root appears as /, with its own /bin, /usr, /dev, and so on.]
File Systems & Devices

• Each zone is allocated its own root file system
  > No access to other zones' root file systems
  > Private /dev directory mounted in the zone
• Sparse-root vs. whole-root
  > Sparse: subset of packages; sharing of executables, libraries, and data. /usr, /sbin, /lib, and /platform are by default inherited read-only via lofs
  > Whole: copies are made (needs more storage)
• Raw devices can be given to a zone, with caution
Network & Identity

• Each zone controls its own identity
  > Node name, RPC domain name, time zone, locale
  > Each container can use a different naming service (DNS, LDAP, NIS, etc.)
  > Private IP addresses and ports
• Separate /etc/passwd files mean that unique root users can be assigned
• Only one TCP/IP stack per kernel
  > Zones are shielded from stack specifics – routing, devices, etc.
  > Cannot view other zones' traffic
Zones and Resource Pools

[Diagram: eight CPUs (cpu1 through cpu8) split between Resource Pool A and Resource Pool B. The Global Zone and Non-Global Zones 1, 2, and 3 are each bound to one of the pools.]

Default Resource Pool:
• Processor set (now)
• Scheduling class (now)
• Memory set (TBD)
• Swap set (TBD)
Solaris Container

• Solaris Containers = Zones + Resource Management
• Oracle licensing honors Containers (Zones + RM)
  > http://oracle.com/corporate/pricing/specialtopics.html
  > "Running Oracle Database in Solaris 10 Containers – Best Practices" (MetaLink #317257.1)
FSS Scheduling Class

• CPU allocation is based on "shares" assigned to projects or zones
  > A share defines a guaranteed floor, rather than a cap
  > Limits are imposed only when there is a shortage of CPU
  > The default value is 1 share
• FSS works within a processor set
• Avoid mixing scheduling classes within a pset
• The FSS class can be used for workloads with different CPU utilization patterns
  > e.g. OLTP, DSS, Java
Shares describe relative ratios...

[Diagram: App A (3 shares), App B (5 shares), App C (2 shares), and App D (5 shares). With all four running, the CPU splits roughly 20% / 33% / 14% / 33%; with only A, B, and C running, it splits 30% / 50% / 20%.]
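The ratio arithmetic behind the figure can be sketched in a few lines of shell. This is illustrative only; the share values (A=3, B=5, C=2, D=5) are taken from the example above:

```shell
# Compute each application's CPU percentage from its FSS shares.
# Shares define a relative ratio: pct = 100 * shares / total_shares.
shares="3 5 2 5"   # App A, B, C, D (values from the figure above)

total=0
for s in $shares; do
  total=$((total + s))
done

for s in $shares; do
  awk -v s="$s" -v t="$total" 'BEGIN { printf "%.1f%%\n", 100 * s / t }'
done
```

With all four apps busy this prints 20.0%, 33.3%, 13.3%, and 33.3%; dropping App D (shares="3 5 2") yields 30%, 50%, and 20%, matching the second panel of the figure.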
Solaris Container Resource Management – Fair Share Scheduler
Fair Share Scheduler (FSS(7D))

• Assigns resources based on (shares assigned) / (total shares on the system)
• Two-level model
  > Top level: the global zone administrator assigns shares to zones
  > Second level: a zone administrator assigns shares to projects
• A project's CPU allocation depends on its project shares as well as its zone's shares
• Most sites will use one approach, not both at the same time
Two Level FSS

[Diagram: two-level share allocation across zones twilight, drop, fracture, and global. A zone administrator allocates shares to projects within a zone (3, 1, 2, 1); the global administrator allocates shares to zones (6, 3, 4, 5, 4).]

Worked example: the Database Project holds 2 of its zone's 7 project shares (3+1+2+1), and its zone holds 6 of the system's 22 zone shares (4+5+4+3+6), so its CPU entitlement is 2/7 × 6/22 ≈ 7.8%.
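The two-level calculation above can be checked with a quick shell sketch (illustrative only; the share counts come from the worked example):

```shell
# Two-level FSS: a project's CPU entitlement is
#   (project shares / total project shares in its zone)
# x (zone shares / total zone shares on the system)
project_shares=2
zone_project_total=$((3 + 1 + 2 + 1))       # = 7
zone_shares=6
system_zone_total=$((4 + 5 + 4 + 3 + 6))    # = 22

awk -v ps="$project_shares" -v pt="$zone_project_total" \
    -v zs="$zone_shares"    -v zt="$system_zone_total" \
    'BEGIN { printf "%.1f%%\n", 100 * (ps / pt) * (zs / zt) }'
```

This prints 7.8%, the Database Project's entitlement from the slide.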
Enabling the FSS Scheduler

• Set FSS as the default scheduling class upon next reboot
  > # dispadmin -d FSS
  > 'dispadmin -d' creates /etc/dispadmin.conf
• Dynamically switch to the FSS scheduler
  > sysetup init script:
    # dispadmin -d FSS
    # /etc/init.d/sysetup start
  > 'priocntl' command:
    # priocntl -s -c FSS -i all
    # priocntl -s -c FSS -i pid 1
• Verify
  > # ps -cafe
  > # ps -ef -o user,pid,class,comm
Examples: Single Application Containers

[Diagram: global zone v880-room2-rack5-1 (129.76.1.12) hosting four single-application zones, each with its own zoneadmd, zone console (zcons), virtual network interface, shared read-only /usr, login services (SSH sshd; one zone also runs telnetd), and core services (inetd):
• dns1 zone (dnsserver1; zone root /zone/dns1) – network services (named)
• mail zone (mailserver; zone root /zone/mail1) – network services (sendmail, IMAP)
• web1 zone (foo.org; zone root /zone/web1) – network services (Apache, Tomcat)
• web2 zone (bar.net; zone root /zone/web2) – network services (IWS)
The global zone provides platform administration (syseventd, devfsadm, ifconfig, metadb, ...), remote admin/monitoring (SNMP, SunMC, WBEM), core services (inetd, rpcbind, sshd, ...), and zone management (zonecfg(1M), zoneadm(1M), zlogin(1), ...). The zones span pool1 (4 CPUs, 6 GB, FSS with shares of 10, 30, and 60) and pool2 (4 CPUs, 10 GB), over network devices hme0, ce0, and ce1, plus the storage complex.]
Examples: Multiple Application Containers

[Diagram: global zone v1280-room3-rack12-2 (129.76.4.24) hosting three zones over network devices hme0, ce0, and ce1, plus the storage complex:
• oracle1 zone (oracle_ops; zone root /zone/oracle1) – projects for web service (Apache 1.3.22), app service (IAS, J2SE), dba users (sh, bash, prstat), backup (sqlplus), ora_ops (oracle), and system (inetd, sshd)
• oracle2 zone (ora_ta; zone root /zone/oracle2) – ora_ta project (oracle), dba users project (sh, bash, prstat), and system project (inetd, sshd)
• mail zone (mailserver; zone root /zone/mail1) – network services (sendmail, IMAP), login services (SSH sshd), core services (inetd)
The global zone provides platform administration (syseventd, devfsadm, ifconfig, metadb, ...), remote admin/monitoring (SNMP, SunMC, WBEM), core services (inetd, rpcbind, sshd, ...), and zone management (zonecfg(1M), zoneadm(1M), zlogin(1), ...). The zones span pool1 (8 CPUs, FSS) and pool2 (4 CPUs); FSS share values (70, 20, 15, 10, 10, 10, 5, 60, ...) are assigned to zones and projects as shown in the original figure.]
Sun™ MC – Solaris Container Manager

• Manages systems that run the Solaris 8, 9, and 10 OS
• Manages Solaris Containers across many systems
• Uses Sun Management Center 3.5 Update 1b

Container Management
• View all the Containers in your environment
• Recreate a Container on another system
• Automatic discovery of new objects
• Create/delete/modify projects
• Manage Solaris Zones
• Support for IPQoS for Solaris Zones
• Create/delete/modify pools and zones
• Create new zones through a single wizard
Zone Commands

• Zone configuration – zonecfg
  > Define what a zone looks like
• Console access – zlogin -C
• Zone administration – zoneadm
  > Install, boot, restart, stop, list, verify, uninstall
Zone Administration

• zoneadm(1M) is used by the global zone administrator to:
  > install a new root file system for a configured zone
  > list zones and, optionally, their state
  > verify that the configuration of an installed zone is semantically complete and ready to be booted
  > boot or ready an installed zone
  > halt or reboot a running zone
  > uninstall the root file system of an installed zone
Zone Console

• A zone pseudo-console is available for each zone
  > Mimics a hardware console
  > Accessible via zlogin -C
  > Available prior to zone boot

global# zlogin -C zone1
[Connected to zone 'zone1' console]
zone1# ~.
[Connection to zone 'zone1' console closed]

• Publishes zone state change messages
  [Notice: zone halted]
Demo
Creating a zone
global# zonecfg -z zone1
zone1: No such zone configured
Use 'create' to begin configuring a new zone.
zonecfg:zone1> create

Settings for the zone:
zonecfg:zone1> set zonepath=/zoneroots/zone1
zonecfg:zone1> set autoboot=true
zonecfg:zone1> add net
zonecfg:zone1:net> set address=192.9.200.100/24
zonecfg:zone1:net> set physical=e1000g
zonecfg:zone1:net> end
zonecfg:zone1> add inherit-pkg-dir
zonecfg:zone1:inherit-pkg-dir> set dir=/opt
zonecfg:zone1:inherit-pkg-dir> end
zonecfg:zone1> verify
zonecfg:zone1> commit
zonecfg:zone1> ^D

global# zoneadm list -vc
global# ls -l /etc/zones/zone1.xml
global# zonecfg -z zone1 info
Installing the zone:

global# zoneadm -z zone1 install
Preparing to install zone <zone1>.
Creating list of files to copy from the global zone.
Copying <2394> files to the zone.
Initializing zone product registry.
Determining zone package initialization order.
Preparing to initialize <1048> packages on the zone.
Initialized <1048> packages on zone.
Zone <zone1> is initialized.
Installation of <1> packages was skipped.
Installation of these packages generated warnings: <SFWmuttS>
The file </zoneroots/zone1/root/var/sadm/system/logs/install_log> contains a log of the zone installation.

- It took about 9 minutes on my laptop.

global# zoneadm list -cv
  ID NAME    STATUS     PATH
   0 global  running    /
   1 zone1   installed  /zoneroots/zone1
Boot the zone

global# zoneadm -z zone1 boot
- It took about 4 seconds for the first boot on my laptop.

global# zoneadm list -cv
  ID NAME    STATUS   PATH
   0 global  running  /
   1 zone1   running  /zoneroots/zone1

global# zlogin -C zone1
[Connected to zone 'zone1' console]
<Run through the sysid tools as usual to do initial customization>
Example: Interactive Initial Boot

• sysidtool(1M) runs by default

[NOTICE: zone booting up]
SunOS Release 5.10 Version s10_52 32-bit
Copyright 1983-2004 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.
Hostname: twilight
The system is coming up. Please wait.

Select a Language
  0. English
  1. French
  2. Japanese
  3. Simplified Chinese
  4. Traditional Chinese
Please make a choice (0 - 4), or press h or ? for help:
Example: Hands-off Initial Boot

• Using a sysidcfg(4) file as an alternative

# cat > ~/zone-cfg/zone1.sysidcfg
system_locale=en_US
timezone=US/Pacific
timeserver=localhost
terminal=xterms
security_policy=NONE
network_interface=PRIMARY {hostname=zone1 \
  protocol_ipv6=no}
name_service=NONE
root_password=MNYm8SfoJlvIY
^D

# cp ~/zone-cfg/zone1.sysidcfg \
  /zoneroots/zone1/root/etc/sysidcfg
Example: Process Monitoring

• prstat -Z
• prstat -z <zonename>
• ps -aef -z <zonename>
• ps -aef -Z
• df -hZ

ptree prints process trees, with child processes indented from their respective parents.
• ptree -z <zonename>
Zones and File Systems

• Three different ways of provisioning file systems:
  > LOFS – loopback-mount a directory from the global zone into a non-global zone
  > UFS – mount a real UFS file system directly into a non-global zone
  > Raw – attach raw devices to a non-global zone
• zonecfg requires a separate "add fs" or "add device" stanza for each device or mount point added
Example: Zones + LOFS

global# zonecfg -z zone1
zonecfg:zone1> add fs
zonecfg:zone1:fs> set dir=/opt/local
zonecfg:zone1:fs> set special=/local
zonecfg:zone1:fs> set type=lofs
zonecfg:zone1:fs> add options [rw,nodevices]
zonecfg:zone1:fs> end
zonecfg:zone1> verify
zonecfg:zone1> commit
zonecfg:zone1> ^D

➔ This mounts the /local directory from the global zone at /opt/local in the zone
➔ Useful for sharing data between zones, using the global zone as a go-between
Example: Zones + UFS

global# zonecfg -z dbzone
zonecfg:dbzone> add fs
zonecfg:dbzone:fs> set dir=/opt/local
zonecfg:dbzone:fs> set special=/dev/dsk/c0d0s7
zonecfg:dbzone:fs> set raw=/dev/rdsk/c0d0s7
zonecfg:dbzone:fs> set type=ufs
zonecfg:dbzone:fs> end
zonecfg:dbzone> verify
zonecfg:dbzone> commit
zonecfg:dbzone> ^D

> Mounts the UFS disk slice /dev/dsk/c0d0s7 as /opt/local in the non-global zone
> No exposed mount point for this file system in the global zone
Example: Zones + Raw Devices

global# zonecfg -z zone1
zonecfg:zone1> add device
zonecfg:zone1:device> set match=/dev/rdsk/c0d0s6
zonecfg:zone1:device> end
zonecfg:zone1> add device
zonecfg:zone1:device> set match=/dev/dsk/c0d0s6
zonecfg:zone1:device> end
zonecfg:zone1> verify
zonecfg:zone1> commit
zonecfg:zone1> ^D

> Adds a raw device directly into the non-global zone
> Creates a device node for the new device
> Match can include wildcards and is evaluated each time the zone boots

zone1# newfs /dev/rdsk/c0d0s6
zone1# mount /dev/dsk/c0d0s6 /opt/local
Example: Zones + FSS

global# zonecfg -z zone1
zonecfg:zone1> set pool=newpool
zonecfg:zone1> add rctl
zonecfg:zone1:rctl> set name=zone.cpu-shares
zonecfg:zone1:rctl> add value (priv=privileged,limit=10,action=none)
zonecfg:zone1:rctl> end
zonecfg:zone1> verify
zonecfg:zone1> commit
zonecfg:zone1> ^D

Note: the default pool is used if "set pool" is not specified.

To change the shares of a running zone:
global# prctl -n zone.cpu-shares -r -v 25 -i zone zonename
Resource Pools Management – poolcfg(1M) and pooladm(1M)

• Enable pools
  > # pooladm -e
• Disable pools
  > # pooladm -d
• Save the current configuration to the /etc/pooladm.conf XML file
  > # pooladm -s
• View current configuration info
  > # poolcfg -c info
Pools Configuration

• Create a processor set with min and max numbers of CPUs
  > # poolcfg -c 'create pset dbset (uint pset.min=1; uint pset.max=2)'
• Create a pool
  > # poolcfg -c 'create pool dbpool'
• Associate the set with the pool
  > # poolcfg -c 'associate pool dbpool (pset dbset)'
• View current configuration info
  > # poolcfg -c info
  > # poolstat -r all
Pools Example

tm163-118# poolcfg -c info
system tm163-118
        string  system.comment
        int     system.version 1
        boolean system.bind-default true
        int     system.poold.pid 20514

        pool dbpool
                int     pool.sys_id 3
                boolean pool.active true
                boolean pool.default false
                int     pool.importance 1
                string  pool.comment
                pset    dbset

        pool pool_default
                int     pool.sys_id 0
                boolean pool.active true
                boolean pool.default true
                int     pool.importance 1
                string  pool.comment
                pset    pset_default

        pset dbset
                int     pset.sys_id 1
                boolean pset.default false
                uint    pset.min 1
                uint    pset.max 1
                string  pset.units population
                uint    pset.load 0
                uint    pset.size 1
                string  pset.comment

                cpu
                        int     cpu.sys_id 0
                        string  cpu.comment
                        string  cpu.status on-line

        pset pset_default
                int     pset.sys_id -1
                boolean pset.default true
                uint    pset.min 1
                uint    pset.max 1
                string  pset.units population
                uint    pset.load 0
                uint    pset.size 1
                string  pset.comment

                cpu
                        int     cpu.sys_id 1
                        string  cpu.comment
                        string  cpu.status on-line
Pools and Zones

• Bind a zone to a pool
  > # poolbind -p dbpool -i zoneid dbzone
• Query which pool a process is bound to
  > dbzone# poolbind -q $$
    25177   dbpool
System Parameter Changes in S10

• Many removed and obsoleted parameters
  > http://docs.sun.com/app/docs/doc/817-0404/6mg74vs90?a=view
• Removed System V IPC parameters:

  Message Queues            Semaphores               Shared Memory
  msgsys:msginfo_msgmap     semsys:seminfo_semaem    shmsys:shminfo_shmmin
  msgsys:msginfo_msgmax     semsys:seminfo_semmap    shmsys:shminfo_shmseg
  msgsys:msginfo_msgseg     semsys:seminfo_semmns
  msgsys:msginfo_msgssz     semsys:seminfo_semmnu
                            semsys:seminfo_semvmx
                            semsys:seminfo_semume
                            semsys:seminfo_semusz

• Obsoleted parameters were replaced with controllable resource parameters (with bigger default values)
  > http://docs.sun.com/app/docs/doc/817-1592/6mhahuoim?a=view
Oracle Related Parameters

• System V IPC parameters and the corresponding Solaris resource controls:

  Parameter                        Oracle Recommendation  Required in S10  Resource Control        Default Value
  SEMMNI (semsys:seminfo_semmni)   100                    Yes              project.max-sem-ids     128
  SEMMNS (semsys:seminfo_semmns)   1024                   No               N/A                     N/A
  SEMMSL (semsys:seminfo_semmsl)   256                    Yes              project.max-sem-nsems   512
  SHMMAX (shmsys:shminfo_shmmax)   1/4 of physical RAM    Yes              project.max-shm-memory
  SHMMIN (shmsys:shminfo_shmmin)   1                      No               N/A                     N/A
  SHMMNI (shmsys:shminfo_shmmni)   100                    Yes              project.max-shm-ids     128
  SHMSEG (shmsys:shminfo_shmseg)   10                     No               N/A                     N/A
Resource Control Commands

• System V IPC parameters need not be set in /etc/system
• Set on a per-process or per-project basis
• prctl(1)
  > # prctl -n process.max-file-descriptor <pid>
  > # prctl -n project.cpu-shares -v 10 -r -i project db_project
  > # prctl -n project.max-shm-memory -v 10g -r -i project user.oracle
  > # prctl -n project.max-shm-memory -i project user.oracle
  > # prctl -i project user.oracle
• rctladm(1)
  > # rctladm -l
Zones FAQ/Blogs/Info

• http://www.opensolaris.org/os/community/zones/faq
• http://blogs.sun.com/<whomever>
  > David Comay (comay)
  > Dan Price (dp)
  > John Beck (jbeck)
  > Andy Tucker (tucker)
• http://www.sun.com/bigadmin/content/zones
• http://www.sun.com/blueprints/
SOLARIS 10 CONTAINERS

Kimberly Chang
kimberly.chang@sun.com
http://webhome.sfbay/kchangs
http://blogs.sun.com/kchangs