UNDER THE HOOD OF OPENSHIFT: TURBOCHARGED BY RED HAT ENTERPRISE LINUX
Ian Pilcher, Sr. Solution Architect, Red Hat
Daniel Walsh, Sr. Principal Software Engineer, Red Hat
June 13, 2013
Contents
● OpenShift Overview
● Control Groups
● SELinux
● Namespaces
● Demo
● Q&A
OPENSHIFT OVERVIEW
Brokers and Nodes
Nodes are where user applications live. Brokers keep OpenShift running.
[Diagram: one broker and three nodes, each on RHEL, running on AWS / CloudForms / IaaS (OpenStack) / Virtual (RHEV) / Bare Metal]
Gears
OpenShift gears represent secure containers in RHEL.
[Diagram: one broker and three nodes; "My Gear" running JBoss on RHEL, on AWS / CloudForms / IaaS (OpenStack) / Virtual (RHEV) / Bare Metal]
Lots of Gears!
[Diagram: one broker and two nodes on RHEL, on AWS / CloudForms / IaaS (OpenStack) / Virtual (RHEV) / Bare Metal]
What Is a Gear?
● User and group
  ● Name == gear UUID
  ● UID and GID 1000+
● SELinux category
  ● c${UID}
● Control Group
  ● /openshift/${UUID}
● Home directory
  ● /var/lib/openshift/${UUID}
● Running processes
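Every name in the list above is derived from two values, the gear's UUID and its UID. The Python sketch below shows that mapping. The `Gear` class, its property names, and the example UUID are hypothetical; only the path patterns come from the slide, and the `c${UID}` category is reproduced as written there (the MCS slides later in the deck show a two-category label).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gear:
    """Hypothetical helper that collects the per-gear names from the slide."""
    uuid: str
    uid: int

    @property
    def user(self) -> str:
        # User and group name == gear UUID
        return self.uuid

    @property
    def selinux_category(self) -> str:
        # c${UID}, as written on the slide
        return "c%d" % self.uid

    @property
    def cgroup(self) -> str:
        # /openshift/${UUID}
        return "/openshift/%s" % self.uuid

    @property
    def home(self) -> str:
        # /var/lib/openshift/${UUID}
        return "/var/lib/openshift/%s" % self.uuid


# Example values are made up for illustration.
gear = Gear(uuid="0123456789abcdef0123456789abcdef", uid=1001)
print(gear.user)
print(gear.selinux_category)
print(gear.cgroup)
print(gear.home)
```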
OpenShift Operating System Requirements
Resource Management
Ensures that a gear can consume only its allocated portion of a shared resource.
Provided by: control groups (cgroups), filesystem quotas

Access Control
Prevents a gear from inappropriately reading or modifying system resources.
Provided by: SELinux, filesystem permissions, namespaces

Polyinstantiation
Provides the appearance of access to a system-wide resource.
Provided by: namespaces

[Diagram labels: CPU, Memory, Disk, Network]
LINUX CONTROL GROUPS (cgroups)
OpenShift Control Groups
● One cgroup per gear
● /openshift/${UUID}
● Created by openshift-cgroups service
● cgrulesengd places processes in correct group (based on process EUID)
● Parameters in /etc/openshift/resource_limits.conf
[Diagram: MyApp enclosed by cgroups]
Resource Controller Limits
Small gear (default)
cpu
● cpu.cfs_period_us = 100000
● cpu.cfs_quota_us = 30000
● cpu.rt_period_us = 100000
● cpu.rt_runtime_us = 0
● cpu.shares = 128
memory
● memory.limit_in_bytes = 536870912
● memory.memsw.limit_in_bytes = 641728512
● memory.soft_limit_in_bytes = 9223372036854775807
● memory.swappiness = 60
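The raw numbers are easier to read once converted. The sketch below does the arithmetic for the small gear values on the slide. The dictionary and variable names are only for illustration, and the interpretation follows the standard cgroup v1 meaning of each parameter (CFS quota per period, memory limit, memory-plus-swap limit).

```python
MIB = 1024 * 1024

# Values copied from the slide (small gear profile).
limits = {
    "cpu.cfs_period_us": 100000,
    "cpu.cfs_quota_us": 30000,
    "cpu.shares": 128,
    "memory.limit_in_bytes": 536870912,
    "memory.memsw.limit_in_bytes": 641728512,
    "memory.soft_limit_in_bytes": 9223372036854775807,
}

# CPU time the gear may use per scheduling period, as a percentage of one CPU.
cpu_percent = limits["cpu.cfs_quota_us"] * 100 // limits["cpu.cfs_period_us"]

# Hard RAM limit, RAM-plus-swap limit, and the swap allowance between them.
ram_mib = limits["memory.limit_in_bytes"] // MIB
ram_plus_swap_mib = limits["memory.memsw.limit_in_bytes"] // MIB
swap_mib = ram_plus_swap_mib - ram_mib

# The soft limit is the largest signed 64-bit value, i.e. effectively unset.
soft_limit_unset = limits["memory.soft_limit_in_bytes"] == 2**63 - 1

print("CPU cap: %d%% of one CPU" % cpu_percent)
print("RAM: %d MiB, RAM+swap: %d MiB, swap: %d MiB"
      % (ram_mib, ram_plus_swap_mib, swap_mib))
print("soft limit unset:", soft_limit_unset)
```

So a small gear is capped at 30% of one CPU, 512 MiB of RAM, and 100 MiB of swap on top of that. The `cpu.shares` value of 128 is a relative weight (the cgroup v1 default is 1024), so it only matters when gears compete for CPU.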
Additional Resource Controllers
cpuacct
● Gathers CPU usage statistics for all group (gear) processes.
● Statistics not currently used.
net_cls
● Tags all network packets generated by gear with class identifier (generated from gear's SELinux category).
● Class identifier can be used for traffic shaping.
● Traffic shaping not currently used.
freezer
● Stops all processes in group (gear) from executing.
● Used by OpenShift Online to achieve massive scale.
● Not used in OpenShift Enterprise.
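With the cgroup v1 freezer controller, a group is stopped or resumed by writing `FROZEN` or `THAWED` to its `freezer.state` file. The sketch below shows only that mechanism; it is not how OpenShift Online drives the freezer. The function name is hypothetical, and the demo writes into a temporary directory standing in for a gear's freezer cgroup directory, whose real mount point is not given in the slides.

```python
import os
import tempfile


def set_freezer_state(cgroup_dir: str, frozen: bool) -> str:
    """Write FROZEN or THAWED to freezer.state in the given cgroup directory."""
    state = "FROZEN" if frozen else "THAWED"
    with open(os.path.join(cgroup_dir, "freezer.state"), "w") as f:
        f.write(state)
    return state


# Demo against a temporary directory (an assumption for illustration only;
# on a real node this would be the gear's directory under the freezer hierarchy).
demo_dir = tempfile.mkdtemp()
set_freezer_state(demo_dir, frozen=True)
with open(os.path.join(demo_dir, "freezer.state")) as f:
    frozen_value = f.read()

set_freezer_state(demo_dir, frozen=False)
with open(os.path.join(demo_dir, "freezer.state")) as f:
    thawed_value = f.read()

print(frozen_value, thawed_value)
```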
SECURITY-ENHANCED LINUX (SELinux)
OpenShift and SELinux
[Diagram: a broker and a node on RHEL; on the node, SELinux surrounds each MyApp gear; all running on AWS / CloudForms / IaaS (OpenStack) / Virtual (RHEV) / Bare Metal]
SELinux is a LABELING System
● Everything has a label
  ● Process, file, dir, chr_file, blk_file, port, node.
● SELinux policy defines the access between process labels and all other labels.
● The kernel controls the access.
Containers != Security
● Running root in a container, machine pwned
● Local Privilege Escalation, machine pwned
● Much of the system is not containerized.
  ● Audit
  ● /sys
  ● selinuxfs, cgroupfs, sysfs
● Need to block mount
● Need to block mknod
Security Goals
http://en.wikipedia.org/wiki/Maginot_line
SELinux is Type Enforcement
system_u:system_r:openshift_t:s0:c1,c2
system_u:system_r:openshift_var_lib_t:s0:c1,c2
seinfo -t | grep openshift
openshift_mail_tmp_t, httpd_openshift_content_t, openshift_cgroup_read_tmp_t, openshift_initrc_tmp_t, openshift_var_lib_t, openshift_var_run_t, openshift_app_t, openshift_min_t, openshift_net_t, openshift_tmp_t, openshift_min_app_t, openshift_net_app_t, openshift_cgroup_read_t, httpd_openshift_script_exec_t, openshift_cron_tmp_t, openshift_initrc_t, httpd_openshift_script_t, openshift_cron_exec_t, openshift_initrc_exec_t, openshift_rw_file_t, openshift_log_t, openshift_cron_t, openshift_mail_t, openshift_port_t, httpd_openshift_ra_content_t, httpd_openshift_rw_content_t, httpd_openshift_htaccess_t, openshift_cgroup_read_exec_t, openshift_t, openshift_tmpfs_t
SELinux is Type Enforcement
● Process labels can be on files
● File labels != process labels
  ● openshift_t -> Process
  ● openshift_var_lib_t -> File
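Type enforcement looks at the third field of a label. The sketch below splits a label of the form shown on these slides into its four fields; `Context` and `parse_context` are hypothetical names, not part of any SELinux library.

```python
from typing import NamedTuple


class Context(NamedTuple):
    """The four fields of an SELinux label: user:role:type:level."""
    user: str
    role: str
    type: str
    level: str


def parse_context(label: str) -> Context:
    # Split on the first three colons only; the level itself contains a colon
    # (for example "s0:c1,c2").
    user, role, type_, level = label.split(":", 3)
    return Context(user, role, type_, level)


process = parse_context("system_u:system_r:openshift_t:s0:c1,c2")
gear_file = parse_context("system_u:system_r:openshift_var_lib_t:s0:c1,c2")

print(process.type)    # type enforcement decides on this field
print(gear_file.type)
print(process.level)   # MCS decides on this field
```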
SELinux is MCS: Multi-Category Security
system_u:system_r:openshift_t:s0:c1,c2
system_u:system_r:openshift_var_lib_t:s0:c1,c2
● MCS enforcement separates "same types"
  ● openshift_t:s0:c1,c2 -> openshift_var_lib_t:s0:c1,c2
  ● openshift_t:s0:c3,c4 -> openshift_var_lib_t:s0:c3,c4
  ● openshift_t:s0:c1,c2 -/-> openshift_var_lib_t:s0:c3,c4 (access denied)
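The cases above can be modeled as a set comparison: a process may access a file when the process's categories include all of the file's categories. The sketch below is a simplification under that assumption. It handles only simple levels such as `s0:c1,c2` (no ranges like `c0.c1023`), ignores type enforcement entirely, and uses hypothetical function names.

```python
def categories(level: str) -> frozenset:
    """Return the category set of a simple level such as 's0:c1,c2'."""
    _sensitivity, _sep, cats = level.partition(":")
    return frozenset(c for c in cats.split(",") if c)


def mcs_allows(process_level: str, file_level: str) -> bool:
    """True when the process's categories are a superset of the file's."""
    return categories(process_level) >= categories(file_level)


print(mcs_allows("s0:c1,c2", "s0:c1,c2"))  # same gear
print(mcs_allows("s0:c3,c4", "s0:c3,c4"))  # same gear
print(mcs_allows("s0:c1,c2", "s0:c3,c4"))  # different gears
```

Under this model a file labeled plain `s0` (no categories) is not separated by MCS at all, which is why each gear needs its own category pair.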
MCS In Action
[Diagram: processes labeled openshift_t:s0:c1,c2 and openshift_t:s0:c3,c4 run on the kernel and host hardware (memory, storage, etc.); SELinux separates their files, labeled openshift_var_lib_t:s0:c1,c2 and openshift_var_lib_t:s0:c3,c4]
MCS Labeling based on UID
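def gen_level(uid):
    # Map a gear UID to a unique pair of MCS categories.
    SETSIZE = 1023
    TIER = SETSIZE
    ORD = uid
    # Walk down the tiers until the remainder fits in the current one.
    while ORD > TIER:
        ORD = ORD - TIER
        TIER = TIER - 1
    TIER = SETSIZE - TIER
    ORD = ORD + TIER
    return "s0:c%d,c%d" % (TIER, ORD)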
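To see how the mapping behaves, the sketch below repeats the slide's function (reindented, logic unchanged) and runs it on a few UIDs. The sample UIDs and the 1 to 5000 range used for the uniqueness check are arbitrary choices for illustration.

```python
def gen_level(uid):
    SETSIZE = 1023
    TIER = SETSIZE
    ORD = uid
    while ORD > TIER:
        ORD = ORD - TIER
        TIER = TIER - 1
    TIER = SETSIZE - TIER
    ORD = ORD + TIER
    return "s0:c%d,c%d" % (TIER, ORD)


# The first 1023 UIDs pair c0 with c1..c1023; the next tier starts at c1,c2.
for uid in (1, 1000, 1023, 1024, 1025, 2046):
    print(uid, gen_level(uid))

# Every UID in the sampled range gets a distinct label.
labels = [gen_level(uid) for uid in range(1, 5001)]
all_unique = len(set(labels)) == len(labels)
print("unique:", all_unique)
```

The first category is always lower than the second, so the function enumerates unordered pairs of categories, one pair per UID.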
How do the labels get on gears
● Host receives packet for a gear
  ● OpenShift server
    ● Launches application with correct SELinux label.
    ● Sends packet to application
● If connection comes in via git or ssh
  ● Ssh uses pam_openshift
    ● Launch sh with correct context
    ● Launch git with correct context
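Launching a command "with correct context" can be tried by hand with the coreutils `runcon` command, which takes a full context followed by the command to run. The sketch below only builds that argument list and does not execute it. It is a stand-in for illustration, not what pam_openshift does internally; the function name is hypothetical, the default user, role, and type are taken from the labels shown on the earlier slides, and whether policy permits the transition is not addressed.

```python
def runcon_argv(level, command,
                user="system_u", role="system_r", type_="openshift_t"):
    """Build a runcon command line for the given MCS level (not executed)."""
    context = "%s:%s:%s:%s" % (user, role, type_, level)
    return ["runcon", context] + list(command)


argv = runcon_argv("s0:c1,c2", ["/bin/sh"])
print(" ".join(argv))
```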
LINUX NAMESPACES
OpenShift & Linux Namespaces
RHEL 6 OpenShift
● Mount: mounting/unmounting filesystems
  ● /tmp, /var/tmp and /dev/shm
RHEL 7 OpenShift
● IPC: SysV message queues, semaphore/shared memory segments
● Network: IPv4/IPv6 stacks, routing, firewall, /proc/net and /sys/class/net directory trees, sockets
  ● Critical to fix localhost problem
● PID: private /proc, multiple PID 1's
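On recent Linux kernels, the namespaces a process belongs to are visible as entries under `/proc/<pid>/ns`. The sketch below lists them for the current process; two processes share a namespace when the corresponding values match. The function name is hypothetical, and it returns an empty dictionary where `/proc/<pid>/ns` does not exist (non-Linux systems or older kernels).

```python
import os


def namespace_ids(pid="self"):
    """Map namespace name (mnt, ipc, net, pid, ...) to its identifier string."""
    ns_dir = "/proc/%s/ns" % pid
    result = {}
    if not os.path.isdir(ns_dir):
        return result
    for name in sorted(os.listdir(ns_dir)):
        try:
            # Each entry is a symlink whose target names the namespace instance.
            result[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            # Entry not readable as a link; skip it.
            continue
    return result


ids = namespace_ids()
for name, ident in ids.items():
    print(name, ident)
```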
DEMO