UCS Security Part 2

www.silantia.com1

UCS Security

System Policies
High Availability
System Events
SNMP
Firmware
TAC Information

System Policies

Overview of High Availability

High Availability

Two fabric interconnects and two IOMs per chassis give each blade two data paths.

Clustering the fabric interconnects (FIs) requires the same Cisco UCS Manager version and the same FI model on both nodes.

Clustering is done through the L1 and L2 ports on each fabric interconnect. These ports are non-configurable.

The L1 and L2 ports are 1000BaseTX and are connected with straight-through Cat6 cable.

The ports are pre-configured to run LACP and CDP. The links form an 802.3ad bond managed by the underlying OS.

High Availability

Cisco UCS Manager controller:
Distributed application that runs on both the primary and subordinate UCS Manager instances
Each instance is represented by a node ID
Separate process running on Cisco NX-OS
Defines the running mode of the UCS Manager processes

Cisco NX-OS:
Starts all Cisco UCS Manager processes
Monitors and restarts UCS Manager processes

High Availability

Local storage:
NVRAM and flash store static data
Read and written by the local Cisco UCS Manager instance
Replicated when both nodes are up

Chassis EEPROM:
Serial EEPROM stores state data
Up to three chassis have their EEPROM written with state information in two partitions
Read and written by both chassis management controllers
Used to assist Cisco UCS Manager in determining the state of the cluster

Viewing and Changing Management HA

connect local-mgmt
dc101-A# sh cluster extended-state
Cluster Id: 0x898942147f8311e2-0x8af9547feeed8104
Start time: Sun May 26 18:36:30 2013
Last election time: Sun May 26 18:36:33 2013
A: UP, PRIMARY
B: UP, SUBORDINATE
A: memb state UP, lead state PRIMARY, mgmt services state: UP
B: memb state UP, lead state SUBORDINATE, mgmt services state: UP
heartbeat state PRIMARY_OK
INTERNAL NETWORK INTERFACES:
eth1, UP
eth2, UP
HA READY
Detailed state of the device selected for HA storage:
Chassis 1, serial: FOX1450H4JK, state: active
dc101-A# cluster lead cluster force

L1 and L2 ports

Serial EEPROM Chassis
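As a rough illustration of how the output above might be checked from a script, the following Python sketch parses the per-node lines, the internal interface lines and the HA READY marker. The sample text is copied from the slide. The function names `parse_cluster_state` and `primary_node` are hypothetical helpers invented for this example and are not part of any Cisco tool.

```python
import re

# Sample text copied from the "sh cluster extended-state" output shown above.
SAMPLE = """\
Cluster Id: 0x898942147f8311e2-0x8af9547feeed8104
Start time: Sun May 26 18:36:30 2013
Last election time: Sun May 26 18:36:33 2013
A: UP, PRIMARY
B: UP, SUBORDINATE
A: memb state UP, lead state PRIMARY, mgmt services state: UP
B: memb state UP, lead state SUBORDINATE, mgmt services state: UP
heartbeat state PRIMARY_OK
INTERNAL NETWORK INTERFACES:
eth1, UP
eth2, UP
HA READY
"""

# Patterns are assumptions based only on the sample output above.
NODE_RE = re.compile(
    r"^([AB]): memb state (\w+), lead state (\w+), mgmt services state: (\w+)$"
)
LINK_RE = re.compile(r"^(eth\d+), (\w+)$")


def parse_cluster_state(text):
    """Hypothetical helper: extract node, link and HA readiness info."""
    nodes, links, ha_ready = {}, {}, False
    for raw in text.splitlines():
        line = raw.strip()
        match = NODE_RE.match(line)
        if match:
            node, memb, lead, mgmt = match.groups()
            nodes[node] = {"memb": memb, "lead": lead, "mgmt": mgmt}
            continue
        match = LINK_RE.match(line)
        if match:
            links[match.group(1)] = match.group(2)
            continue
        if line == "HA READY":
            ha_ready = True
    return {"nodes": nodes, "links": links, "ha_ready": ha_ready}


def primary_node(state):
    """Return the node whose lead state is PRIMARY, or None if there is none."""
    for name, info in state["nodes"].items():
        if info["lead"] == "PRIMARY":
            return name
    return None


state = parse_cluster_state(SAMPLE)
print(primary_node(state), state["links"], state["ha_ready"])
```

In this reading, eth1 and eth2 correspond to the L1 and L2 links, so a script could flag a cluster where either link is not UP or where HA READY is missing.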

High Availability (split brain issues)

Partition in space:
A partition in space occurs when the private network fails (no path from L1 to L1 and from L2 to L2).
There is a risk of two active management nodes (active-active).
Both nodes are demoted to subordinate and a quorum race begins. The node that claims the most resources wins.

Partition in time:
A partition in time occurs when a node boots alone in the cluster.
The node compares its database version against the serial EEPROM and discovers that its version number is lower than the current database version.
There is a risk of applying an old configuration to UCS components.
This node will not become the active management node.
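The two checks described above can be modelled in a few lines of Python. This is only a toy model of the decision rules as the slide states them, not Cisco UCS Manager's actual implementation. Integer version numbers, the tie handling, and the names `may_become_primary` and `quorum_winner` are assumptions made for the example.

```python
def may_become_primary(local_db_version, eeprom_db_version):
    """Partition in time: a node booting alone must not lead if its
    database is older than the version recorded in the chassis EEPROM.
    Integer versions are an assumption of this sketch."""
    return local_db_version >= eeprom_db_version


def quorum_winner(claims):
    """Partition in space: the node that claims the most resources wins.
    `claims` maps a node name to the number of resources it claimed.
    Returning None on a tie is an assumption; the slide does not say
    how ties are resolved."""
    if not claims:
        return None
    best = max(claims.values())
    leaders = [node for node, count in claims.items() if count == best]
    return leaders[0] if len(leaders) == 1 else None


# A node with a stale database (version 41 against 42) stays subordinate.
print(may_become_primary(41, 42))
# Node A claimed two resources and node B one, so A wins the race.
print(quorum_winner({"A": 2, "B": 1}))
```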

System Events

Fault severity

Severity: Description

Critical: A service-affecting condition that requires immediate corrective action. This severity might indicate that the managed object is out of service and its capability must be restored.

Major: A service-affecting condition that requires urgent corrective action. This severity might indicate a severe degradation in the capability of the managed object and that its full capability must be restored.

Minor: A non-service-affecting fault condition that requires corrective action to prevent a more serious fault from occurring.

Warning: A potential service-affecting fault that currently has no significant effects in the system.

Condition: An informational message about a condition, possibly independently insignificant.

Info: A basic notification or informational message, possibly independently insignificant.
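Because the severities above form an ordered scale, a script that triages faults can treat them as comparable values. The sketch below assumes that ordering (Info lowest, Critical highest). The numeric values, the record layout and the `at_least` helper are hypothetical and chosen only for illustration.

```python
from enum import IntEnum


class Severity(IntEnum):
    # Numeric values are an assumption; only the relative order matters.
    INFO = 0
    CONDITION = 1
    WARNING = 2
    MINOR = 3
    MAJOR = 4
    CRITICAL = 5


def at_least(faults, threshold):
    """Hypothetical helper: keep faults at or above the given severity."""
    return [f for f in faults if Severity[f["severity"].upper()] >= threshold]


# Made-up fault records, not real UCS fault codes.
faults = [
    {"id": "example-1", "severity": "info"},
    {"id": "example-2", "severity": "major"},
    {"id": "example-3", "severity": "warning"},
    {"id": "example-4", "severity": "critical"},
]

urgent = at_least(faults, Severity.MAJOR)
print([f["id"] for f in urgent])
```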

Fault states

State: Description

Active: A fault was raised and is currently active.

Cleared: A fault was raised but did not reoccur during the flap interval. The condition that caused the fault has been resolved, and the fault has been cleared.

Flapping: A fault was raised, cleared, and then raised again within a short time interval, known as the flap interval.

Soaking: A fault was raised and then cleared within the flap interval. Because this may be a flapping condition, the fault severity remains at its original active value, but this state indicates that the condition that raised the fault has cleared.
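One way to read the four states above is as a function of when the fault was raised, cleared and possibly raised again, relative to the flap interval. The Python sketch below encodes that reading. It is a simplified model built from the table, not the fault engine in Cisco UCS Manager, and the function name, the use of plain seconds and the sample interval of 10 are assumptions.

```python
def fault_state(cleared_at, reraised_at, now, flap_interval):
    """Classify a fault that has already been raised.

    cleared_at / reraised_at are timestamps in seconds, or None if the
    event has not happened. This is a toy model of the table above.
    """
    if cleared_at is None:
        return "active"      # raised and never cleared
    if reraised_at is not None:
        if reraised_at - cleared_at <= flap_interval:
            return "flapping"  # raised again inside the flap interval
        return "active"      # raised again later: treated as active again
    if now - cleared_at < flap_interval:
        return "soaking"     # condition cleared, interval still running
    return "cleared"         # interval elapsed without a recurrence


FLAP = 10  # sample flap interval in seconds (assumed value)
print(fault_state(None, None, 5, FLAP))
print(fault_state(3, None, 8, FLAP))
print(fault_state(3, 7, 8, FLAP))
print(fault_state(3, None, 20, FLAP))
```

The four calls walk through active, soaking, flapping and cleared in that order.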

System Events settings

Admin tab -> Faults, Events and Audit Log -> Settings

SNMP

All SNMP versions are supported: v1, v2c and v3.
A username and password are configurable on the device for SNMP version 3.
All SNMP transactions use the cluster IP address as their source IP address.
Admin tab -> Communication Management -> Communication Services -> SNMP

Firmware

UCSM, IOM and fabric interconnect upgrade

The following steps are done under Equipment -> Firmware Management -> Update/Activate Firmware, in this order:
Activate the new Cisco UCS Manager image.
Activate the new I/O module image.
Activate the new image on the subordinate fabric interconnect.
Manually fail over the primary role to the fabric interconnect that has already been upgraded. This step is done from the command line with the following command:
UCS-A(local-mgmt)# cluster {force primary | lead {a | b}}
Verify that the data path has been restored.
Activate the new image on the fabric interconnect that was previously primary.

Note: During a fabric interconnect upgrade each blade loses one path, but the other path remains available, so UCS fabric failover and/or VMware NIC teaming should keep traffic flowing.

Activating the IOM image does not reboot the IOM. The IOM reboots and upgrades when its connected fabric interconnect reboots and is upgraded.
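The ordering above matters because the primary role has to move before the last fabric interconnect is activated. As a small planning aid, the Python sketch below prints the sequence for a given primary node. It only generates text for an operator to follow and does not talk to any device. The `upgrade_plan` function is a hypothetical helper, and the `cluster lead` line is derived from the command syntax shown on the slide.

```python
def upgrade_plan(primary):
    """Return the ordered upgrade steps for a cluster whose current
    primary fabric interconnect is 'a' or 'b'. Hypothetical helper."""
    if primary not in ("a", "b"):
        raise ValueError("primary must be 'a' or 'b'")
    subordinate = "b" if primary == "a" else "a"
    return [
        "Activate the new Cisco UCS Manager image",
        "Activate the new I/O module image",
        f"Activate the new image on subordinate fabric interconnect {subordinate.upper()}",
        # Failover command built from: cluster {force primary | lead {a | b}}
        f"Fail over from local-mgmt: cluster lead {subordinate}",
        "Verify that the data path has been restored",
        f"Activate the new image on fabric interconnect {primary.upper()} (previously primary)",
    ]


plan = upgrade_plan("a")
for number, step in enumerate(plan, start=1):
    print(f"{number}. {step}")
```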

Firmware

Host firmware packages: A grouping of adapter, BIOS, board controller and storage controller firmware into a single entity that can then be used in a service profile.

Management firmware packages: A set of CIMC images for different kinds of blades.

When either package is applied to a service profile that is already associated, it triggers a maintenance task. When the firmware update is actually applied depends on how that task is scheduled.

TAC Information

Go to the Admin tab, click All, and then click “Collect TAC specific information”.

TAC Information

cisco-ucspe# connect local-mgmt
cisco-ucspe(local-mgmt)# show tech-support
chassis    Chassis
fex        FEX (fabric-extender) Module
server     Rack Server
ucsm       UCSM
ucsm-mgmt  UCSM Management(excludes fabric interconnect)

cisco-ucspe(local-mgmt)# show tech-support chassis 1 cimc 2

cisco-ucspe(local-mgmt)# show tech-support chassis 1 iom 1
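When tech-support bundles are needed for several blades or IOMs, the command strings can be generated rather than typed. The Python sketch below builds only the two chassis forms shown above (`cimc` and `iom`). The helper name `tech_support_command` is hypothetical, and the sketch does not cover any other options the CLI may accept.

```python
def tech_support_command(chassis, component, slot):
    """Build a 'show tech-support chassis' command string.

    Only the two component keywords seen in the examples above are
    accepted; anything else is rejected rather than guessed.
    """
    if component not in ("cimc", "iom"):
        raise ValueError("component must be 'cimc' or 'iom'")
    return f"show tech-support chassis {int(chassis)} {component} {int(slot)}"


# Commands for CIMC 1 and 2 in chassis 1, followed by IOM 1.
commands = [tech_support_command(1, "cimc", slot) for slot in (1, 2)]
commands.append(tech_support_command(1, "iom", 1))
for command in commands:
    print(command)
```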
