OVS Dom0 Upgrade Livemigration SOP v1.56


Upload: akshay-joshi

Post on 05-Dec-2015


Doc : Dom0 Upgrade Execution Process

Version : 1.56

--------------------------------------------------------------------------------------------------------------------------

Objective :-

This Document describes the steps to be followed to perform the OVMS upgrade or Dom0 upgrade.

Pre-checks to be performed 1 hr before the scheduled execution

1. Ensure the DBA has taken the HC for the affected Dom0 (ping the channel ovms-up-project)

/mig-os is mounted automatically by the script.

Mig-os mount source details :-

mkdir /mig-os ; mount -o noacl rm02stor29-nas:/export/roh_29a_hwmig/rmdc_dom0 /mig-os
mkdir /mig-os ; mount -o noacl adc08ntap17-bkp:/vol/aoh_17a_hwmig/adc_dom0 /mig-os
mkdir /mig-os ; mount -o noacl sl05stor02-nas:/export/sloh_02a_hwmig/sldc_dom0 /mig-os
mkdir /mig-os ; mount -o noacl tvp01stor05-nas:/export/tvpoh_05a_hwmig/tvpdc_dom0 /mig-os
mkdir /mig-os ; mount -o noacl epc002oodstor02-nas:/export/epcoh_02a_hwmig/epdc_dom0 /mig-os
mkdir /mig-os ; mount -o noacl syc002oodstor01-nas:/export/sycoh_01a_hwmig/sydc_dom0 /mig-os
mkdir /mig-os ; mount -o noacl llg02stor02-bkp:/export/llgoh_07a_hwmig/llg_dom0 /mig-os

2. Validate Prep Data and Current Environment

============== TEMPLATE ==============

SR# :

Source Dom0 :

Target Dom0 :

Source Dom0 VM list :

Nos of VMs from PREP :

Nos of VMs from Portal :

Are the VM names Listed in Prep the same as current VM List [Y/N] :

============== TEMPLATE ==============

Use this as a reference throughout the activity.
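The last template field (whether the prep VM names match the current list) can be answered mechanically. The following is a minimal sketch, assuming both lists have been saved to plain-text files with one VM name per line; the function name and the file layout are hypothetical and not part of the SOP tooling.

```shell
# Hypothetical helper: print VM names that appear in only one of two lists.
# Assumes one VM name per line in each file.
vm_list_diff() {
    prep_sorted=$(mktemp) && current_sorted=$(mktemp) || return 2
    sort -u "$1" > "$prep_sorted"
    sort -u "$2" > "$current_sorted"
    # comm -3 drops names common to both files; names only in the first file
    # are unindented, names only in the second file are prefixed with a tab.
    comm -3 "$prep_sorted" "$current_sorted" |
        awk -F'\t' '{ if ($1 != "") print "only in prep: " $1; else print "only in current: " $2 }'
    rm -f "$prep_sorted" "$current_sorted"
}
```

Empty output corresponds to answering Y in the template; any printed line is a name to investigate before proceeding.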


3. Check the OS version of each VM. It should not be OEL 6.x.

If the OEL version of every VM is OK, proceed to the next steps.
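Until the version check is automated, a small helper can make the manual check less error-prone. This sketch assumes the release string comes from a file such as /etc/enterprise-release or /etc/oracle-release on the VM and contains the text "release N.M"; the function names are illustrative only.

```shell
# Hypothetical helper: extract the major release number from a release string
# read on stdin (for example the contents of /etc/enterprise-release).
release_major() {
    sed -n 's/.*release \([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Succeeds (exit 0) only when a major version was found and it is not 6.
vm_release_ok() {
    major=$(printf '%s\n' "$1" | release_major)
    [ -n "$major" ] && [ "$major" -ne 6 ]
}

# Example (how the string is fetched from the VM is up to you):
# vm_release_ok "$(cat /etc/enterprise-release)" || echo "OEL 6.x or unknown release"
```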

4. Take the Source Dom0 pre health checks and paste the output in the SR

Run:

/ptsadmin/os_migration/OEL/Dom0_Upgrade_sysinfo_pre_upg_post_checks.sh -o source -o pre -rfc 3-TESTII

The above script covers:

- HW model of the Dom0
- BIOS version, ILOM version
- uname
- cat /etc/enterprise-release
- mig-os is mounted
- Reset ILOM password
- Shows total vcpus of all running VMs, shows total memory of all running VMs
- Checks the vm.cfg for the vcpu_max parameter
- Checks if VMs are on local disk
- Identifies the container to snap
- EM agent status, ovs-agent status
- LDAP status
- ifconfig, bond, MTU, route details
- Checks if any HW faults exist (DIMM, power supply, sensors, ...)

5. Ensure you are able to access the ILOM console of the Dom0

6. Ensure that /etc/ovms-provision.conf exists. If it does not, perform the following step:

cat > /etc/ovms-provision.conf << EOF

DATACENTER=<Valid values found in Appendix 7.1 of OMCS_Hypervisor_Provisioning.pdf>

ZONE=<Dom0s Network ZONE>

PROVISIONING_TICKET=<SR/RFC Number>

EOF

vi /etc/ovms-provision.conf

7. Replace the values assigned to DATACENTER, ZONE, and PROVISIONING_TICKET with the relevant information for the patching ticket at hand.
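After editing, it is easy to leave one of the "<...>" placeholders in the file. The sketch below checks that all three keys are present and no longer hold a placeholder. It assumes the simple KEY=value layout shown in the heredoc; the function name is hypothetical.

```shell
# Hypothetical check: every required key is present and no longer holds a
# "<...>" placeholder left over from the template.
check_provision_conf() {
    conf=${1:-/etc/ovms-provision.conf}
    status=0
    for key in DATACENTER ZONE PROVISIONING_TICKET; do
        value=$(sed -n "s/^${key}=//p" "$conf" | head -n 1)
        case $value in
            ''|'<'*) echo "$key is missing or still a placeholder"; status=1 ;;
        esac
    done
    return $status
}
```

Running `check_provision_conf` with no argument checks /etc/ovms-provision.conf; a non-zero exit status means the file still needs editing.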


8. Set BO

EM : https://omcsem.oracle.com/apex/f?p=200:35:1039424757297901::NO::: Locate your EM, then set the BO

Give the BO name as : SR#-Server-Bounce

9. Disable Pinger using global dc url:

Ex: https://globaldc.oracle.com/host/detail/vmfsnchpg029

On Temp/Stage Dom0

10. Take the Target Dom0 validations

Run:

/ptsadmin/os_migration/OEL/Dom0_Upgrade_sysinfo_pre_upg_post_checks.sh -o target -o validate -rfc 3-TESTII

- This checks that the same bridges from the Source are available on the temp Dom0
- Ensure the same storage and swap container is available
- Hardware is compatible and adequate memory and CPU are available
- Source and target Dom0s must have the same model; if they are of different models, immediately cancel the SR and send it back to Bizops

11. The SA has to :-

- Ensure the target is tagged with the same VLANs as the Source Dom0. Check the related SR to confirm the VLANs are already tagged.

Search for the SR based on <sub>, and check that the SR is closed and includes the VLANs in use on the source Dom0.


12.Execution Phase :-

12.1 Perform Live migration using the script

If there is no dependency on other SRs for the target Dom0, proceed to the next steps.

If there is a dependency, go with the plan from the Lead.

________________________________________________

The source pre step has generated the migVMs.lst file, which contains the source VM names.

Ex: sample output

Creating the : migVMs.lst file for this 3-11215965617
--------------------------------------------
54ee4ba4a53a4430ab01b7eec5c3cc92 64bit VM
f7e3fd9dd37543dca5fbc1fb6369003d 64bit VM
--------------------------------------------

/mig-os/logs/OEL-DOM0_UPG/3-11215965617/migVMs.lst -- was generated

Please review it and modify it if necessary. Make sure it has exactly the same number of hosts as xm list.
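The count comparison can be sketched as below. Two assumptions are made here: `xm list` prints one header row followed by one row per domain (including Domain-0), and migVMs.lst holds one VM per line. The helper names are hypothetical.

```shell
# Count guest domains in `xm list` output read from stdin
# (skips the header row and Domain-0).
count_xm_guests() {
    awk 'NR > 1 && $1 != "Domain-0" { n++ } END { print n + 0 }'
}

# Count non-blank lines in a list file (assumed: one VM per line).
count_list_entries() {
    grep -c '[^[:space:]]' "$1"
}

# Example, using the sample path from this SOP:
# lst=/mig-os/logs/OEL-DOM0_UPG/3-11215965617/migVMs.lst
# [ "$(xm list | count_xm_guests)" -eq "$(count_list_entries "$lst")" ] \
#     && echo "counts match" || echo "counts differ, review migVMs.lst"
```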

12.2 Start the migration

Take confirmation from the SL on the HC by this time.

Initiate a ping to the VMs and take their uptime.

Ensure df -hP is not hanging on any mounts.

Run:

Ex: change the SR and the target Dom0 to your own

/ptsadmin/os_migration/OEL/migVM.sh_custom_pchandru_v1.1 /mig-os/logs/OEL-DOM0_UPG/3-11215965617/migVMs.lst [target Dom0]
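For the df -hP check in this step, running the command directly can leave your own session stuck if a mount is hung. One possible approach, sketched here with a hypothetical helper, is to run it in the background and only wait a bounded time for it to report back. The 30-second limit in the example is an arbitrary choice, not a value from this SOP.

```shell
# probe_cmd LIMIT_SECONDS COMMAND [ARGS...]
# Runs the command in the background and polls for its result. Returns the
# command's exit status if it finishes in time, or 1 if no result arrived
# within the limit (treated as hung).
probe_cmd() {
    limit=$1
    shift
    marker=$(mktemp) || return 2
    # The subshell records the exit status in the marker file on completion.
    ( "$@" > /dev/null 2>&1; echo "$?" > "$marker" ) &
    waited=0
    while [ ! -s "$marker" ]; do
        if [ "$waited" -ge "$limit" ]; then
            rm -f "$marker"
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    status=$(cat "$marker")
    rm -f "$marker"
    return "$status"
}

# Example:
# probe_cmd 30 df -hP || echo "df -hP hung or failed, check the mounts"
```

The helper never waits on the child process itself, so a df blocked in the kernel on a stale mount does not block the caller; the stuck process is simply left behind for investigation.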


EX:

[root@rmc002oodhost112 ~]# /ptsadmin/os_migration/OEL/migVM.sh_custom_pchandru_v1.1 /mig-os/logs/OEL-DOM0_UPG/3-11215965617/migVMs.lst rmc002oodhost206

This Dom0's configuration
nr_cpu : 32
total_memory : 262086
free_memory : 7097
used memory : 254989
=============================================
Please login to rmc002oodhost206
and run 'xm info' then check
'nr_cpu' must be >= 32
'free_memory' must be >= 254989
'total_memory' should be >= 262086
=============================================
Ready to start migration [y/N]? y

Migration would take more than hour 256G RAM Dom0
Do not interrupt. Be patient
Really ready to start migration [y/N]? y
xm migrate -l 54ee4ba4a53a4430ab01b7eec5c3cc92 rmc002oodhost206
xm migrate -l f7e3fd9dd37543dca5fbc1fb6369003d rmc002oodhost206
Done
Looking for any errors above
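The manual comparison the script asks for (target CPUs and free memory against the source figures) can be sketched as follows. It assumes the `xm info` output of the target has been saved to a file in the usual "key : value" layout and that the CPU count appears under the key nr_cpus (the script prompt abbreviates it to nr_cpu); adjust the key if your output differs. Function names are hypothetical.

```shell
# Read one field from `xm info`-style "key : value" text on stdin.
xm_info_field() {
    awk -F: -v key="$1" '
        { k = $1; gsub(/[ \t]/, "", k) }
        k == key { v = $2; gsub(/[ \t]/, "", v); print v; exit }'
}

# target_fits INFO_FILE NEED_CPUS NEED_FREE_MB
# Succeeds when the target reports at least the required CPUs and free memory.
target_fits() {
    cpus=$(xm_info_field nr_cpus < "$1")
    free=$(xm_info_field free_memory < "$1")
    [ -n "$cpus" ] && [ -n "$free" ] &&
        [ "$cpus" -ge "$2" ] && [ "$free" -ge "$3" ]
}

# Example with the figures from the sample run above:
# xm info > /tmp/target_xm_info.txt      # run on the target Dom0
# target_fits /tmp/target_xm_info.txt 32 254989 || echo "target too small"
```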

12.3 Monitor the logs on source and target : tail -f /var/log/xen/xend.log

Validate by running xm list that no VMs remain on the source Dom0.

Access/verify each VM after migrating it to the Temp Dom0.


13. Upgrade (this step disables uptrack and performs the upgrade)

13.1 Run:

time /ptsadmin/os_migration/OEL/Dom0_Upgrade_sysinfo_pre_upg_post_checks.sh -o source -o upgrade -rfc 3-TESTII

Log location for the upgrade: Ex /var/log/od-provision/2015-08-08-20\:40\:13.log

13.2 Bounce the Dom0

Check grub.conf. We should see the new kernel.

Power off the Dom0

Use: /ptsadmin/os_migration/bin/system_restart.pl -s

-- From console --
Take the current power status of the host : show /SYS ... it should be off
Start the Dom0 from the console : start /SYS

13.3 Once the Dom0 is up, run the below

/ptsadmin/ateam/bin/fix_firmware --- (go with defaults)

If it suggests "Firmware upgrade needed", allow it to proceed and perform the steps below:

a) INFORM A TEAM LEAD (this item is being tracked)

b) Shutdown EM Agent

c) Shutdown Dom0

d) Power off Dom0 host

e) Reboot ILOM by executing reset /SP in an ILOM ssh session

f) Power on Dom0 host and await boot-up

g) rerun : /ptsadmin/ateam/bin/fix_firmware ... it should not show firmware upgrade needed


14. Post checks on Source Dom0

14.1 To Take: Source Dom0 – Post checks

/mig-os is auto mounted by script

Run:

/ptsadmin/os_migration/OEL/Dom0_Upgrade_sysinfo_pre_upg_post_checks.sh -o source -o post -rfc 3-TESTII

The above script covers /ptsadmin/ateam/bin/ovms-go_live_check.pl as well

14.2 Ensure that OVM discovery completes successfully through the OVM UI

- You can identify the OVM manager for a Dom0 by running the following as root on the Dom0 (no value means that the Dom0 is not currently managed by OVMM):

ovs-agent-db read_item server manager_core_api_url | cut -f2 -d@ | cut -f1 -d:

Ex:

[root@auc026oodhost147 pchandru]# ovs-agent-db read_item server manager_core_api_url | cut -f2 -d@ | cut -f1 -d:

10.224.94.217
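To see what the two cut stages do, the same pipeline can be applied to a sample string. The URL below is a made-up example of the general user:password@host:port shape; the real value stored by ovs-agent-db may look different.

```shell
# Same field splitting as the pipeline above, applied to text on stdin:
# keep what follows the "@", then keep what precedes the next ":".
manager_host() {
    cut -f2 -d@ | cut -f1 -d:
}

# Made-up URL for illustration only.
echo 'https://user:secret@10.224.94.217:7002/ovm/core' | manager_host
# prints 10.224.94.217
```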

Connect to the OVMM server and follow steps given in the below doc

14.3 If the upgrade has gone correctly and all verifications passed, then remove the uptrack RPMs

yum erase -y uptrack-libyaml uptrack-python-pycurl uptrack-PyYAML uptrack


15. Bring the VMs back to the original Dom0 (flip them back)

15.1 Login to Source Dom0 : take xm list

15.2 Login to Stage/Temp Dom0 : take xm list

15.3 Move the VMs back from the Temp/Target Dom0 to the Source

- Login to the Temp/Target Dom0

Ensure df -hP is not hanging on any mounts

15.4 Run: use the same migVMs.lst that was created in the source pre phase

Ex : Replace according to your SR

/ptsadmin/os_migration/OEL/migVM.sh_custom_pchandru_v1.1 /mig-os/logs/OEL-DOM0_UPG/3-11215965617/migVMs.lst [Source Dom0]

15.5 Take xm list on the temp/stage dom0 : No VMs should be running now

15.6 Take xm list on the Source dom0 : VMs should be running now

15.7 Login to each VM and take uptime

15.8 On Source Dom0:

Once all VMs are up and moved back, check that the /etc/xen/auto_OVMSUPGBKP.RFCNUMBER backup folder was automatically moved back to /etc/xen/auto by the script when we executed the source post step.

15.9 Unmount /mig-os from both source and target dom0

16. Inform Shift Lead to take Post Health checks


17. SR final Update Template/Release note

The scheduled maintenance activity has been completed successfully. Should you encounter any issue, please contact:

SA1 Contact:
------------
Name : Devesh Kumar
Mobile : +91 9611523300
Email ID : [email protected]

Management Escalation :
---------------------
Name : karuppiah rama
Email ID : [email protected]
Mobile : +91 98456 33725

Name : Hari Yalavarthy
Email ID : [email protected]
Mobile : +91 9845902356

Regards,

******END******


Doc : Dom0 Upgrade PREP Process

Version : 1.56

--------------------------------------------------------------------------------------------------------------------------

OVMS Dom0 Upgrade Prep Procedure

1. Capture the details in the template

============== TEMPLATE ==============

Source Dom0 :

Target Dom0 :

Source Dom0 VM list :

Nos of VMs from Portal :

Nos of VMs currently running :

Are the VM names Listed in SR the same as current VM List [Y/N] :

Additional VMs Found in Prep :

Instance names on Addn hosts :

NETWORK SR# :

============== TEMPLATE ==============

(Avoid using the same target Dom0 for more than one schedule on the same day.)

Use the below code to get the VM names:-

--

# For each guest in xm list (header row and Domain-0 skipped), print its OVM_simple_name
for i in $(xm list | awk 'NR > 1 && $1 != "Domain-0" { print $1 }'); do
    grep -hi OVM_simple_name /OVS/Repositories/*/VirtualMachines/"$i"/vm.cfg | awk '{ print $3 }'
done

2. If there is a mismatch between the portal and the current list, fill in the required fields in the portal

Paste the below

Additional VMs Found in Prep :

Instance names on Addn hosts :

and send the SR to Bizops as well : Prep status: completed , Execution_status: Back to bizops


3. Mount the mig-os on the Source and Target Dom0s

RMDC : mkdir /mig-os ; mount -o noacl rm02stor29-nas:/export/roh_29a_hwmig/rmdc_dom0 /mig-os
ADC : mkdir /mig-os ; mount -o noacl adc08ntap17-bkp:/vol/aoh_17a_hwmig/adc_dom0 /mig-os
SLDC : mkdir /mig-os ; mount -o noacl sl05stor02-nas:/export/sloh_02a_hwmig/sldc_dom0 /mig-os
TVP : mkdir /mig-os ; mount -o noacl tvp01stor05-nas:/export/tvpoh_05a_hwmig/tvpdc_dom0 /mig-os
EPDC : mkdir /mig-os ; mount -o noacl epc002oodstor02-nas:/export/epcoh_02a_hwmig/epdc_dom0 /mig-os
SYDC : mkdir /mig-os ; mount -o noacl syc002oodstor01-nas:/export/sycoh_01a_hwmig/sydc_dom0 /mig-os
LLG : mkdir /mig-os ; mount -o noacl llg02stor02-bkp:/export/llgoh_07a_hwmig/llg_dom0 /mig-os
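The per-data-center mount commands can be folded into one helper so the same snippet works on any Dom0. This is a sketch only: the export paths are copied from the list in this step, while the function names and the "skip if already mounted" check are additions for illustration.

```shell
# Print the /mig-os NFS source for a data center label.
mig_os_source() {
    case $1 in
        RMDC) echo 'rm02stor29-nas:/export/roh_29a_hwmig/rmdc_dom0' ;;
        ADC)  echo 'adc08ntap17-bkp:/vol/aoh_17a_hwmig/adc_dom0' ;;
        SLDC) echo 'sl05stor02-nas:/export/sloh_02a_hwmig/sldc_dom0' ;;
        TVP)  echo 'tvp01stor05-nas:/export/tvpoh_05a_hwmig/tvpdc_dom0' ;;
        EPDC) echo 'epc002oodstor02-nas:/export/epcoh_02a_hwmig/epdc_dom0' ;;
        SYDC) echo 'syc002oodstor01-nas:/export/sycoh_01a_hwmig/sydc_dom0' ;;
        LLG)  echo 'llg02stor02-bkp:/export/llgoh_07a_hwmig/llg_dom0' ;;
        *)    echo "unknown data center: $1" >&2; return 1 ;;
    esac
}

# Mount /mig-os unless it is already a mount point (must run as root).
mount_mig_os() {
    src=$(mig_os_source "$1") || return 1
    mkdir -p /mig-os
    grep -qs ' /mig-os ' /proc/mounts || mount -o noacl "$src" /mig-os
}

# Example: mount_mig_os RMDC
```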

4.On Source Dom0 - take prepwork

RUN:

/ptsadmin/os_migration/OEL/Dom0_Upgrade_sysinfo_pre_upg_post_checks.sh -o source -o prepwork -rfc 3-TESTII

This script covers the steps below:

- Takes a list of the VMs running on the Source Dom0
- Takes the total memory and VCPUs used/assigned to the VMs on the Source Dom0
- Identifies the bridges of the VMs running on the Source Dom0
- Identifies the storage container and swap container on the Source Dom0
- Resets the ILOM password on the Source Dom0
- Checks if a VM is on local disk

5. Check the OS version of each VM. It should not be OEL 6.x

(We are working on automation; until then, please check manually.)

6.On Target Dom0 : Target validation

/ptsadmin/os_migration/OEL/Dom0_Upgrade_sysinfo_pre_upg_post_checks.sh -o target -o prepvalidate -rfc 3-TESTII

This will compare:-

Same Storage and swap container is Available on Target dom0

Show if enough free memory is available for VM migration

Same bridges of source Dom0 Vms are available on the Target Dom0
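If the bridge comparison ever needs to be repeated by hand, it can be sketched from saved `brctl show` output of both hosts. The file names and function names below are hypothetical, and the parsing assumes the usual `brctl show` layout (one header row, bridge rows with at least name, id and STP flag, and single-field continuation rows for extra interfaces).

```shell
# List bridge names from `brctl show` output on stdin. The header and the
# single-field continuation rows (extra interfaces) are skipped.
bridge_names() {
    awk 'NR > 1 && NF >= 3 { print $1 }' | sort -u
}

# missing_bridges SOURCE_FILE TARGET_FILE
# Prints bridges present in the source capture but absent from the target.
missing_bridges() {
    src_names=$(mktemp) && tgt_names=$(mktemp) || return 2
    bridge_names < "$1" > "$src_names"
    bridge_names < "$2" > "$tgt_names"
    comm -23 "$src_names" "$tgt_names"
    rm -f "$src_names" "$tgt_names"
}

# Example:
# brctl show > /mig-os/source_bridges.txt    # on the source Dom0
# brctl show > /mig-os/target_bridges.txt    # on the target Dom0
# missing_bridges /mig-os/source_bridges.txt /mig-os/target_bridges.txt
```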


7. If any bridges are missing, create them and bring them up

Collect the bridge details from the source. Ex:
[root@rmc002oodhost436 ~]# brctl show | grep -i br600_RM2_10
br600_RM2_10 8000.0010e062f4c5 no bond0.600

In this case :
Bridge is : br600_RM2_10
Bond interface : bond0.600

Check that the same base bondX (in this case bond0) exists on the target Dom0

ls -l /proc/net/bonding/bond*

[root@rmc002oodhost065 ~]# ls -l /proc/net/bonding/bond*
-r--r--r-- 1 root root 0 Sep 4 01:53 /proc/net/bonding/bond0
-r--r--r-- 1 root root 0 Sep 4 01:53 /proc/net/bonding/bond1
[root@rmc002oodhost065 ~]#

Create the same bridge and bondX files on the target by copying the config from the source Dom0

Ex:
[root@rmc002oodhost436 ~]# ls -l /etc/sysconfig/network-scripts/ifcfg-* | grep -i 600
-rw-r--r-- 1 root root 139 Sep 3 20:53 /etc/sysconfig/network-scripts/ifcfg-bond0.600
-rw------- 1 root root 176 Sep 3 20:54 /etc/sysconfig/network-scripts/ifcfg-br600_RM2_10
[root@rmc002oodhost436 ~]#

Edit these files on the target and remove lines like

HWADDR=xx.xx.xx.xx
MACADDR=xx.xx.xx.xx

Now cat these files and ensure the above entries are gone.

Bring up the interface and the bridge

ifup bond0.600
ifup br600_RM2_10
brctl show | grep br600_RM2_10

It should show up in the list.

Follow the same procedure for each missing bridge and create it on the target. Rerun the target validation; it should now pass.
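The manual edit that removes the HWADDR and MACADDR lines from the copied ifcfg files can also be done non-interactively. A minimal sketch, assuming GNU sed (for in-place editing with -i) and a hypothetical function name:

```shell
# Remove HWADDR= and MACADDR= lines from a copied ifcfg file, in place.
strip_mac_lines() {
    sed -i -e '/^[[:space:]]*HWADDR=/d' -e '/^[[:space:]]*MACADDR=/d' "$1"
}

# Example, using the file names from the sample above:
# strip_mac_lines /etc/sysconfig/network-scripts/ifcfg-bond0.600
# strip_mac_lines /etc/sysconfig/network-scripts/ifcfg-br600_RM2_10
```

As in the manual procedure, cat the files afterwards to confirm that the entries are gone before running ifup.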

8. Source and target Dom0s must have the same HW model

If we come across a HW mismatch, check whether the same HW is available among the rest of the reserved Dom0s.

If we still do not have sufficient HW, send the SR back to Bizops:

Execution Status : Back to Bizops
Prep status : mis-match (drop-down option in the portal)

9. VLAN Tagging:-

9.1 Check in the portal for the VLAN tagging SR and confirm the required VLANs are tagged

Confirm it is completed.

Update the VLAN tagging page

9.2 If VLAN tagging is not in place, identify all the VMs in the given server pool, get the VLAN details of each VM and raise a network VLAN tagging SR

- As of now Virendra is doing this (limited to a single person)
- Please contact Virendra in case we need to open an SR for all the VMs of the pool.


9.3 If we find more than two bridges for a VM, check whether it has a -rac interface

In that case we need to run ./getnet for -rac as well and get it tagged

Run for :

-rac

-vip

-Frontend

Login : hqsun1 - take the vlan id details of the VMs

Ex:
bash-2.03$ /home/vvarughe/getnet vmohsadvg501

Request: vmohsadvg501

FQDN: vmohsadvg501.oracleoutsourcing.com
IP: 148.87.205.163
Netmask: 255.255.255.240

vlan id: rmdc-z2-advg-v160
vlan note: NAIR-DB-Advantage-Sales-and-Marketing-(sr3-5756856194)

Network: 148.87.205.160/28
First IP: 148.87.205.161
Last IP: 148.87.205.174
Broadcast IP: 148.87.205.175
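The network, first, last and broadcast addresses in the getnet output follow directly from the IP and netmask, which is useful when cross-checking a VLAN against the subnet in a tagging SR. The sketch below reproduces that arithmetic with plain shell integer operations; the function names are illustrative and have nothing to do with how getnet itself works.

```shell
# Pack a dotted-quad IPv4 address into a single integer.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Unpack an integer back into dotted-quad form.
int_to_ip() {
    echo "$(( ($1 >> 24) & 255 )).$(( ($1 >> 16) & 255 )).$(( ($1 >> 8) & 255 )).$(( $1 & 255 ))"
}

# subnet_range IP NETMASK: print network, first/last host and broadcast.
subnet_range() (
    ip=$(ip_to_int "$1")
    mask=$(ip_to_int "$2")
    net=$(( ip & mask ))                      # clear the host bits
    bcast=$(( net | (~mask & 0xFFFFFFFF) ))   # set all host bits
    echo "network=$(int_to_ip "$net") first=$(int_to_ip $((net + 1))) last=$(int_to_ip $((bcast - 1))) broadcast=$(int_to_ip "$bcast")"
)

subnet_range 148.87.205.163 255.255.255.240
# prints network=148.87.205.160 first=148.87.205.161 last=148.87.205.174 broadcast=148.87.205.175
```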


9.4 To get the switch port details (PFE and SFE)

Use portal : http://pnvcapp01.oracle.com/cgi-bin/switches.cgi

9.5 Sample VLAN tagging SR Template

Subject: OVMS Project:auc055oodhost044:VLAN tagging request

Suggested SR request

This SR is part of the OVMS project for tagging all servers which belongs to rmc002oodpool006

Please VLAN tag the following Dom0
HOSTNAME: rmc002oodhost052
Mac address : 00:10:E0:57:C9:4F

SW1: cosprings1z2-swi-107.oracle.com
SW2: cosprings1z2-swi-108.oracle.com
PORT: 12

VLAN:
rmdc-z2-isuv-v219 10.222.66.192/28 NANR-RAC-Illinois-State-University-(sr3-9100325470)

Raise SR to : Network Team

Sample SR#3-11199598771

Follow up till completion of the VLAN tagging SR

Validate all VLANS from the list given are tagged.

10. During Prep stage for : cit-em-agent-12-R4

copy the : /mig-os/pchandru/cit-em-agent-12-R4.remove file

as : /usr/local/git/etc/software/cit-em-agent-12-R4.remove

This is now automated


11. Ensure service_code = none

From the script output make sure SERVICE_CODE_CHECK:PASS

If it failed, update ITAS:

go to ITAS -- > Asset Details -- > change service_code to None

It will take 15 min to push up to APS

12. Ensure the Business_area of the host is commercial

From the script output make sure : business_area_check:PASS

For any other value, please inform the Lead.

13. Check that the ILOM DNS name resolves and uses the correct format

Refer below:-

ILOM_NAME="${DATACENTER}-mgmt-${SERIAL}.${DOMAIN_NAME}"
# ADC Zones 7, 8, 9, 15, 26, 31, 32, 33, 34, 36, 38, 39, 40
# RMDC Zones 2, 5, 6
# LLG Zones 1, 2, 3, 6, 7
# TVP Zones 1, 2
# SLDC Zones 4, 5

ILOM_NAME="${HOSTNAME}-ilom.${DOMAIN_NAME}"
# ADC Zones 42 to 60, 70
# LLG Zones 10, 11
# SLDC Zones 6 to 14, 40
# SYDC Zones 1 & 2, 3
# EPDC Zone 2
# TRDC Zones 1, 2
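A quick way to apply the two ILOM naming schemes is to build the expected name and then ask the resolver for it. The sketch below is illustrative: the function names are hypothetical, the caller picks the scheme that matches the zone, and the resolution check assumes the `getent` utility is available on the host.

```shell
# Build the ILOM DNS name for each of the two naming schemes.
ilom_name_mgmt() {      # args: DATACENTER SERIAL DOMAIN_NAME
    echo "$1-mgmt-$2.$3"
}

ilom_name_host() {      # args: HOSTNAME DOMAIN_NAME
    echo "$1-ilom.$2"
}

# Succeeds if the name resolves through the system resolver.
resolves() {
    getent hosts "$1" > /dev/null
}

# Example (all values are placeholders):
# name=$(ilom_name_host myhost example.com)
# resolves "$name" || echo "$name does not resolve"
```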