SUSE® High Availability for SAP HANA TDI in a …
Post on 20-Aug-2018
Before we start
A big thank-you to:
- HPE (Presales UK) for assisting with some of the non-SUSE content for
this presentation. Stop by the stand in the technology showcase!
What to Expect
• What is SAP HANA TDI?
• HANA on VMware
• Best practices
• HANA TDI Challenges at the OS Level
• SUSE High Availability for SAP HANA
• Overview (performance optimized)
• VM and OS Considerations
• Deployment Steps
• SLE HA for HANA
What is SAP HANA TDI?
In the beginning, SAP HANA was simple …
You could only buy HANA in appliance form
Initially only 3 hardware vendors
Use cases were “limited”
The performance improvements being talked about were just incredible
SAP HANA Deployment Options
Appliances
• SAP and HW vendor validated
• Most popular HANA deployment and consumption model
• Low risk
• All hardware & software components are pre-installed
Tailored Datacenter Integration
• Allows leveraging existing hardware, network and storage for increased flexibility
• Reduced upfront costs
Cloud
• Public and Private
• Initial cost savings
SAP HANA Delivery Overview
[Diagram: two stacks, each with Storage, Networking, Compute, SLES and Application layers]
SAP HANA Delivery Overview
 | Appliance | TDI
Hardware Selection | Limited choice | Usage of preferred server, storage and networking components
Implementation Effort | Low for customer | Increased for SI / customer
Solution Validation | SAP, OS and hardware vendor validate the complete HANA solution. | Customer manages validation process using SAP tools.
Support | Full support offered by HW vendor | Individual support agreements
SAP HW Config Tool for SAP HANA TDI
[Diagram: NETWORK, COMPUTE, STORAGE]
https://help.sap.com/saphelp_hanaplatform/helpdata/en/2f/334531d3314262aa7c605f8f5f02c1/content.htm
Why talk about VMware?
• Everyone is doing it!
• SAP is the 3rd most common application that gets certified!
• (after Microsoft SQL Server and SharePoint)
• Mature – VMware partnership with SAP since 2007
• Validated Architecture for SAP HANA since May 2014
• (Sapphire) vSphere 6.0 now certified for Scale-Up HANA instances.
Which version?
                   vSphere 5.5u2   vSphere 6.0
Hosts per Cluster  32              64
VMs per Cluster    4000            8000
CPUs per Host      320             480*
RAM per Host       6TB             12TB
VMs per Host       512             1024
vCPU per VM        64              128
vRAM per VM        1TB             4TB
What is supported?
vSphere 5.5u2 vSphere 6.0
Non-prod Production Non-Prod Production
Scale Up < 1TB yes
Scale Up < 4TB no no yes yes
Scale Out < 1TB yes yes yes no
Scale Out < 4TB no no yes no
HANA Versions SPS07+ SPS09+ SPS11+
http://blogs.vmware.com/apps/2016/05/sap-hana-on-vsphere-deployment-options-and-best-practices.html
What About Performance?
Official line:
3-12% Drop in performance.
HPE Document:
HANA Performance increase of 6% on 6.0 vs 5.5.
https://www.hpe.com/h20195/v2/getpdf.aspx/4AA6-6194ENW.pdf
Best Practices (basics)
• Set memory reservations for SAP HANA virtual machines
• Configure paravirtual SCSI controllers and network adapters
• Enable Hyper-Threading on the ESXi host
• Use dedicated networks for NFS storage, vMotion, management and client traffic
• Use vMotion and VMware snapshots only during off-peak times, and take snapshots without memory
• Use network traffic latency optimization where needed on the client-facing adapters, but not on the bandwidth adapters used for storage or SAP HANA internode communication
• Use the HANA HWCCT tool to determine whether the storage subsystem meets the HANA KPIs
Source: www.vmware.com/go/sap-hana
http://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/partners/sap-hana-scale-out-deployments-on-vsphere.pdf (80 pages)
VMware Tools
VMware tools and drivers are integrated with SUSE Linux Enterprise Server 12 for the best out-of-the-box experience:
• open-vm-tools: eliminates the need to install VMware Tools separately and reduces operational expenses and virtual machine downtime
• vmware_balloon: physical memory management driver
• vmw_vmci, vmw_vsock: provide fast and efficient communication between guest virtual machines and hypervisors
• vmxnet3: next generation of paravirtualized NIC, designed for performance
• vmw_pvscsi: driver for the paravirtualized SCSI device, which improves disk performance
• vmwgfx: kernel driver for 3D graphics
Fully supported by VMware via L3 support agreement
OS Challenges
Hardware / Platform Choice
OS Install
Patching and Maintenance
Achieving SLA
OS Configuration
OS Kernel settings & Package selections for SAP HANA
SAP HANA (SAP HANA Linux)
SAP B1 (SAP Business One Linux)
# SAP Notes 1)
# Words 2)
# Parameters 3)
496
~570,000
5
451
~150,000
1) Search terms on support.sap.com/notes were: “SAP HANA Linux”, “SAP Business One Linux” and “SAP NetWeaver Linux”
2) Estimated amount of words based on a sample of 5 randomly chosen notes
3) Linux parameters (kernel, etc), which are mentioned in those SAP Notes
4) Additional packages required compared to a SLES installation with base-pattern only
# Packages 4)
5
3624
File Systems
The systems integrator, IT or the customer is responsible for sizing, creating and attaching file systems during or after the OS install.
SUSE Linux Enterprise High Availability
HANA Single Box – System Replication
resource failover
active / active
node 1 node 2
SAP HANA SR and SUSE Linux Enterprise High Availability Extension Cluster (HANA Single Box)
[Diagram sequence: a Pacemaker cluster spans node 1 and node 2; each node runs SAP HANA system PR1, linked by System Replication]
1. node 1: PR1 primary; node 2: PR1 secondary; vIP present
2. node 1: PR1 primary; node 2: PR1 secondary; no vIP shown
3. node 1: PR1 [primary]; node 2: PR1 primary; vIP present
4. node 1: PR1 secondary; node 2: PR1 primary; vIP present
VMX Settings (Time)
Time should be synced through VMware only in case of “Resume”.
Options can be added with a vSphere client or inside the .vmx file.
tools.syncTime = "0"
time.synchronize.continue = "0"
time.synchronize.restore = "0"
time.synchronize.resume.disk = "1"
time.synchronize.shrink = "0"
time.synchronize.tools.startup = "0"
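A quick way to confirm that a VM carries the time settings listed above is to parse its .vmx file and compare. The following is a minimal sketch, not a VMware tool: the expected values are copied from the slide, while the function names and the simple `key = "value"` parsing are assumptions made for illustration. The comparison is literal, so a file that uses `"FALSE"` instead of `"0"` would be reported as a mismatch.

```python
# Expected values are taken from the slide; everything else is illustrative.
EXPECTED_TIME_SETTINGS = {
    "tools.syncTime": "0",
    "time.synchronize.continue": "0",
    "time.synchronize.restore": "0",
    "time.synchronize.resume.disk": "1",
    "time.synchronize.shrink": "0",
    "time.synchronize.tools.startup": "0",
}


def parse_vmx(text):
    """Parse 'key = "value"' lines of a .vmx file into a dict."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip('"')
    return settings


def time_sync_mismatches(vmx_text):
    """Return {key: (expected, found)} for every setting that deviates."""
    found = parse_vmx(vmx_text)
    return {
        key: (expected, found.get(key))
        for key, expected in EXPECTED_TIME_SETTINGS.items()
        if found.get(key) != expected
    }
```

An empty result means all six settings match; a missing key is reported with `None` as the found value.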
NTP
Make adjustments to the ntp.conf file:
tinker panic 0
# NOT server 127.127.1.0
# NOT fudge 127.127.1.0 stratum 0
Disable Services
Hardware-related services are generally not required in virtual machines.
Examples are:
microcode.ctl
irq_balancer
fbset
alsasound
smartd
mcelog
boot.multipath
multipathd
I/O Scheduler
The Linux I/O scheduler attempts to minimize the number of I/O
operations and disk head movements by performing two basic functions:
• Request merging - combining multiple adjacent requests to form a
single request
• Elevator - ordering requests to minimize disk seek times
By default the CFQ (Completely Fair Queuing) scheduler is used.
To change this default, use the following boot parameter:
elevator=noop
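After a reboot with the boot parameter, the active scheduler can be read back from sysfs, where the kernel lists the available schedulers and marks the active one in square brackets (for example `noop deadline [cfq]`). The sketch below assumes that format; the device name `sda` and the function names are placeholders chosen for illustration.

```python
import re


def active_scheduler(sysfs_text):
    """Return the scheduler shown in square brackets, or None if none is marked."""
    match = re.search(r"\[([^\]]+)\]", sysfs_text)
    return match.group(1) if match else None


def read_active_scheduler(device="sda"):
    """Read the active scheduler for a block device (device name is an assumption)."""
    with open(f"/sys/block/{device}/queue/scheduler") as handle:
        return active_scheduler(handle.read())
```

Calling `read_active_scheduler("sda")` on the guest should return `noop` once the parameter is in effect.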
File Systems
Minimise disk I/O where possible: access-time updates can cause more activity than you might be aware of and have an impact on performance.
Use the mount options noatime, nodiratime for all file systems.
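To find file systems that are still mounted without these options, the contents of /proc/mounts can be scanned. This is a rough sketch under stated assumptions: the list of file system types to check and the function name are illustrative, and only `noatime` is tested because on Linux `noatime` already covers directories.

```python
def mounts_missing_noatime(proc_mounts_text, fstypes=("xfs", "ext3", "ext4", "btrfs")):
    """Return mount points of the given file system types mounted without noatime."""
    missing = []
    for line in proc_mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # not a regular /proc/mounts entry
        mount_point, fstype, options = fields[1], fields[2], fields[3].split(",")
        if fstype in fstypes and "noatime" not in options:
            missing.append(mount_point)
    return missing
```

On a live system the input would come from `open("/proc/mounts").read()`.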
Pre-Requisites
• Two-node clusters
• Scale-up (single-box to single-box) HANA system replication
• Both nodes are in the same network segment (layer 2)
• There is no other SAP HANA system (like QA) on the replicating node
that needs to be stopped during takeover
• Both SAP HANA instances have the same SAP Identifier (SID) and
Instance Number
Don’t Forget …
Register and Patch the OS!
Access to:
• Channels
• Tools
• Updates
• Extra Packages
System Replication Overview
Guide
• Back Up the Primary Database
• Set Up System Replication on Primary and Secondary Systems
• Enable Primary Node
• Verify the state of system replication
• Enable the Secondary Node
• TEST and VERIFY, then PROCEED
https://help.sap.com/saphelp_hanaplatform/helpdata/en/74/418e86b48542ffb38b54072e0b66ce/content.htm
SAP HANA System Replication Modes
Fullsync  Primary waits indefinitely for the commit from the secondary.
Sync      Primary gets the commit from the secondary as soon as the transaction is on the persistence layer. Primary waits for a defined time, then continues stand-alone.
Syncmem   Primary gets the commit from the secondary as soon as the transaction is in RAM. Primary waits for a defined time, then continues stand-alone.
Async     Primary does not expect any commits from the secondary.
**This should be chosen by the SAP BASIS Team
SLE HA Architecture
[Diagram: SLE HA stack on SLES, made up of Corosync, Fencing / STONITH, Pacemaker and Resource Agents, alongside HANA, SBD and Local Disk]
Installation
SLE HA Pattern
zypper in -t pattern ha_sles
Resource Agents
zypper in SAPHanaSR SAPHanaSR-doc
Corosync
/etc/corosync/corosync.conf must be identical on all nodes.
/etc/corosync/authkey must be identical as well if secure authentication is used.
totem {
    version: 2
    secauth: on
    interface {
        member {
            memberaddr: 172.17.2.101
        }
        member {
            memberaddr: 172.17.2.102
        }
        ringnumber: 0
        bindnetaddr: 172.17.2.0
        mcastport: 5405
        ttl: 1
    }
    transport: udpu
}
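One simple way to check the "identical on all nodes" requirement is to compare checksums of the copies. The sketch below assumes the files from each node have already been fetched to local paths (how they get there is left open); the function names are illustrative.

```python
import hashlib


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def all_identical(paths):
    """True if every file in paths has the same content."""
    return len({sha256_of(path) for path in paths}) <= 1
```

The same check applies to the authkey file, which is binary and is therefore read in binary mode here as well.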
SBD (Fencing / STONITH)
What is an SBD?
Setting up and viewing an SBD
sbd -d xxxx create
sbd -d xxxx list
sbd -d /dev/disk/by-id/scsi-12345 list
0 suse01 clear
1 suse02 clear
Best Practice
Shared VMDK, iSCSI Target (Multiple)
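The `sbd ... list` output shown above has one line per slot: slot number, node name and the last message. A small parser makes it easy to spot slots that are not `clear`. This is a sketch that assumes whitespace-separated fields as in the sample; the function names are illustrative.

```python
def parse_sbd_list(output):
    """Parse 'sbd -d <device> list' output into a list of slot records."""
    slots = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) < 3 or not fields[0].isdigit():
            continue  # skip anything that is not a slot line
        slots.append({"slot": int(fields[0]), "node": fields[1], "message": fields[2]})
    return slots


def nodes_with_pending_messages(output):
    """Return node names whose slot message is anything other than 'clear'."""
    return [slot["node"] for slot in parse_sbd_list(output) if slot["message"] != "clear"]
```

For the two-node sample above, `nodes_with_pending_messages` returns an empty list.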
SBD Cont…
# /etc/sysconfig/sbd
SBD_DEVICE="/dev/disk/by-id/scsi-12345"
SBD_WATCHDOG="yes"
SBD_PACEMAKER="yes"
SBD_STARTMODE="always"
SBD_OPTS=""
Sharing Disks (VMware)
• For this scenario:
• Shared Virtual Disks are for SBD device only
• Persistent & independent
• Created as eager-zeroed thick (performance)
You can set this directly in the .vmx file or in the vSphere client.
To share two disks, the configuration file entries would look like this:
scsi1:0.sharing = "multi-writer"
scsi1:1.sharing = "multi-writer"
Which VMware features are supported / unsupported?
https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1034165
Cluster Watchdog
What is a watchdog?
• Timer
• Reset system in the event of a system hang
• Normally a hardware board.
For VMware use Linux Watchdog - Softdog
echo "softdog" >> /etc/modules-load.d/watchdog.conf
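The modules-load.d entry only takes effect at the next boot, so it is worth confirming that the module is actually loaded before starting the cluster. The sketch below checks the contents of /proc/modules, where each line begins with a module name; the function names are illustrative.

```python
def loaded_modules(proc_modules_text):
    """Return the set of module names listed in /proc/modules content."""
    return {line.split()[0] for line in proc_modules_text.splitlines() if line.strip()}


def softdog_loaded(proc_modules_text):
    """True if the softdog watchdog module appears in the module list."""
    return "softdog" in loaded_modules(proc_modules_text)
```

On a live system the input would come from `open("/proc/modules").read()`.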
Basic Cluster configuration
crm (cluster resource manager) is the tool used to manage the
Pacemaker configuration.
USE WITH CAUTION!
crm(live)configure# property stonith-enabled="true"
crm(live)configure# property no-quorum-policy="ignore"
crm(live)configure# rsc_defaults resource-stickiness="1000"
crm(live)configure# op_defaults timeout="600"
SAP HANA Resource Agents
[Diagram: two nodes, each with SLES, HANA, Local Disk, Corosync, Pacemaker, IP Address and HANA Topology; one hosts the HANA DB Primary, the other the HANA DB Secondary; SLE HA and SBD are labelled once]
SAP HANA Resource
primitive HANADB ocf:suse:SAPHana \
params SID=HA1 InstanceNumber=10 PREFER_SITE_TAKEOVER=true \
DUPLICATE_PRIMARY_TIMEOUT=7200 AUTOMATED_REGISTER=true \
op start timeout=3600 interval=0 \
op stop timeout=3600 interval=0 \
op monitor timeout=700 interval=60 role=Master \
op monitor timeout=700 interval=61 role=Slave
PREFER_SITE_TAKEOVER (true)
DUPLICATE_PRIMARY_TIMEOUT (7200)
AUTOMATED_REGISTER (true)
HANA Topology Resource
primitive HANATopology ocf:suse:SAPHanaTopology \
params SID=HA1 InstanceNumber=10 \
op start timeout=600 interval=0 \
op stop timeout=300 interval=0 \
op monitor timeout=60 interval=60
Resource Agents (under the hood)
Startframework   sapstartsrv / sapcontrol / HDB (calls, output format “GetProcessList”)
HANA-Topology    landscapeHostConfiguration.py (rc, output format)
SR-Topology      hdbnsutil (calls, output format “-sr_state --sapcontrol=1”)
SAP Hostagent    saphostctrl (call, output format “ListInstances”)
SR-Status        hdbsql (now) / systemReplicationStatus.py (future) (now; rc, calls, output format)
SLE HA
Virtual IP Address
primitive primaryip IPaddr2 \
params ip=172.17.2.103 \
op start timeout=20s interval=0 \
op stop timeout=20s interval=0 \
op monitor timeout=20s interval=10s \
meta target-role=Started
Final Cluster Configuration
Create a Master/Slave resource that will manipulate the state of the
HANA instances on both nodes.
ms HANAMS HANADB
Clone the topology resource so it runs on both nodes.
clone TopologyClone HANATopology meta interleave=true
Create a colocation to tie the IP to the Master side – that is, the Primary
HANA instance.
colocation MASTERIP 2000: primaryip:Started HANAMS:Master
Create an order constraint so that the topology clone is started before the
Master/Slave HANA resource.
order HANAORDER Optional: TopologyClone HANAMS
What did we learn?
[Diagram: two nodes on ESX 01 and ESX 02, each with SLES, HANA, Local Disk, Corosync, Pacemaker, IP Address and Topology; one hosts the HANA DB Primary, the other the HANA DB Secondary; SBD is labelled once]