Oracle Application Server 10g
Tuning Techniques
Donald K. Burleson
Burleson Oracle Consulting
www.rampant-books.com
Oracle Books from $9.95
Guidehorse.com
On-site custom Oracle training
Oracle Tuning & Oracle Support
Remote DBA Support
Topics:
Oracle Application Server 10g Tuning Approach
Oracle Application Server 10g Monitoring
Tuning with RAM
Load Balancing
Oracle Application Server 10g Architecture
[Diagram: Internet -> Web Cache servers (4) -> HTTP Servers (6) -> RAC Servers (5) -> Database Files]
Keys to Success
Parameter Tuning
RAM Cache Tuning
Server Tuning
Parameter Tuning
Oracle Application Server 10g parameters – Adjusting the Oracle9iAS configuration parameters for each Oracle9iAS component influences performance and throughput.
Database parameters – Because most Oracle9iAS systems are disk I/O intensive, adjusting the Oracle database parameters for the Infrastructure database (iasdb) and the back-end database can heavily influence performance.
RAM Tuning
Data buffer tuning – Adding RAM to the database db_cache_size on the Oracle Infrastructure and back-end database can greatly reduce disk I/O and improve throughput.
Web cache tuning – Adding RAM to the Oracle9iAS web cache can improve the delivery rates of HTML and XML through the Oracle HTTP Server (OHS).
Server Tuning
Hardware configuration – Adding RAM or CPU resources to existing servers will improve throughput on the server.
Hardware load balancing – Add new servers to the Oracle9iAS farm and relocate Oracle9iAS components. Spare servers can be configured with both Web Cache and App Server, and the appropriate components can be started as needed.
Server parameter tuning – Adjusting the parameters on your server can have a huge impact on the performance of the Oracle Application Server 10g.
Monitoring Techniques
Response Time Monitoring – DCM and OEM
Wait Event Monitoring – Determine the source of Latency for each Component.
Server Resources – Once the farm is tuned, overloads can be addressed with dynamic server allocation.
Wait Event Monitoring (for iasdb and database)
How would you tune this database?

                                                     % Total
Event                            Waits   Time (s)   Ela Time
------------------------------   -----   --------   --------
CPU time                                       30      71.43
db file parallel write              95          1      23.53
control file sequential read        54          1       2.33
log file parallel write             62          0        .95
db file sequential read             20          0        .68

How would you tune this database?

                                                     % Total
Event                            Waits   Time (s)   Ela Time
------------------------------   -----   --------   --------
db file sequential read             45         22      41.43
db file scattered read              95         14      25.55
control file sequential read        54          1       2.33
log file parallel write             62          0        .95
db file parallel write              20          0        .68
Oracle Application Server 10g Monitoring
Dynamic Monitoring Service (DMS)
OC4J – Measure Parse Time for Incoming Request and Free RAM in the JVM
Portal – Display Portal Metrics
Servlet – Instrument Servlets to Generate Performance Metrics
OHS – Measure Active HTTP Requests
DMS has over 300 metrics
dmstool -l | grep completed

/appsvr/OC4J:3303:6004/oc4j/default/WEBs/parseRequest.completed
/appsvr/OC4J:3303:6004/oc4j/default/WEBs/processRequest.completed
/appsvr/OC4J:3303:6004/oc4j/default/WEBs/resolveContext.completed
/appsvr/OC4J:3303:6004/oc4j/portal/WEBs/parseRequest.completed
/appsvr/OC4J:3303:6004/oc4j/portal/WEBs/processRequest.completed
/appsvr/OC4J:3303:6004/oc4j/portal/WEBs/resolveContext.completed
/ap/OC4J:3303:6004/oc4j/syndserver/WEBs/parseRequest.completed
/ap/OC4J:3303:6004/oc4j/syndserver/WEBs/processRequest.completed
/ap/OC4J:3303:6004/oc4j/syndserver/WEBs/resolveContext.completed
Collect 100 sets at 60 second intervals
dmstool -i 60 -c 100 \
   /appsvr/Apache:2534:6004/Apache/handle.completed \
   /appsvr/Apache:2534:6004/Apache/request.completed \
   /appsvr/Apache:2534:6004/Apache/handle.completed \
   /appsvr/Apache:2534:6004/Apache/request.completed >> t1.lst

Output Listing

Sun Jul 13 20:19:43 MDT 2003
/appsvr/Apache:2534:6004/Apache/handle.completed      240320 ops
/appsvr/Apache:2534:6004/Apache/request.completed     146504 ops
/appsvr/Apache:2534:6004/Apache/connection.completed   56908 ops
Compute delta in spreadsheet
Plot with Chart Wizard
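As an alternative to computing the deltas in a spreadsheet, the subtraction can be scripted. The following is a minimal sketch, not part of the original deck. It assumes each metric line in the dmstool listing has the three-field form shown in the output listing above (metric path, cumulative value, "ops") and that timestamp lines can be ignored. The function name dms_deltas is hypothetical.

```shell
# Print the change in each cumulative DMS counter between successive samples.
# Assumption: metric lines look like "<metric path> <cumulative value> ops",
# as in the output listing above; all other lines (timestamps) are skipped.
# Reads the named file(s), or standard input when no file is given.
dms_deltas() {
    awk '
        NF == 3 && $3 == "ops" {
            if ($1 in prev) {
                # emit "<metric path> <delta since previous sample>"
                print $1, $2 - prev[$1]
            }
            prev[$1] = $2
        }
    ' "$@"
}
```

Running dms_deltas t1.lst > deltas.lst would then give one delta per metric per interval, ready for plotting.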
DMS can be scripted:
#!/bin/ksh
PATH=$PATH:/home/oracle/oraportal904/bin
export PATH

# Dump Stats for Later Analysis
dmstool -dump >> dumparch.lst

# Dumping OHS Stats to a File
dmstool -table ohs_server >> ohs.lst
Sending OHS stats to a flat file:
# Dumping OHS Stats to a File
dmstool -table ohs_server >> ohs.lst
grep connection.active   ohs.lst > con_active.lst
grep request.active      ohs.lst > req_active.lst
grep busyChildren.value  ohs.lst > busy_child.lst
grep readyChildren.value ohs.lst > readyChild.lst
grep numChildren.value   ohs.lst > det.lst
OHS Server Output
Sun Jul 13 21:01:45 MDT 2003
----------
ohs_server
----------
busyChildren.value: 16
...
childStart.count: 24748 ops
connection.active: 24 threads
...
numChildren.value: 44
...
readyChildren.value: 27
...
request.avg: 15321 usecs
request.completed: 150942 ops
...
Plotting OHS response time
[Chart: OHS response time in milliseconds (y-axis, 0 to 80,000) plotted against active connections (x-axis, 1 to 66)]
List OHS performance metrics

dmstool -table ohs_module -c 1

Name: mod_oc4j.c
...
decline.count: 13487 ops
handle.active: 0 threads
handle.avg: 3 usecs
handle.completed: 13487 ops
handle.maxTime: 8 usecs
handle.minTime: 2 usecs
handle.time: 43710 usecs

Name: http_core.c
...
decline.count: 0 ops
handle.active: 0 threads
handle.avg: 0 usecs
handle.completed: 0 ops
handle.maxTime: 0 usecs
The output is hard to parse
Computing real response time

One of the problems with the OHS statistics is that one-time operations skew the overall averages in the ohs_response listings. Dropping the single fastest and single slowest operation from the totals gives a more representative average:

               (time - min - max)
real_average = ------------------
                (completed - 2)

Using the data from the previous mod_oc4j.c listing, we can compute the real response time:

               (43,710 - 2 - 8)     43,700
real_average = ---------------- = -------- = 3.24 microseconds (the listing reports usecs)
                 (13,487 - 2)       13,485
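The same arithmetic can be scripted so it does not have to be repeated by hand for every module. This is a minimal sketch that assumes the four values (handle.time, handle.minTime, handle.maxTime, handle.completed) have already been extracted from the listing. The function name real_average is hypothetical, and the result is in whatever unit DMS reports.

```shell
# real_average TIME MIN MAX COMPLETED
# Average time per operation after discarding the fastest and slowest one:
#   (time - min - max) / (completed - 2)
# Prints "n/a" when there are too few operations to trim both ends.
real_average() {
    LC_ALL=C awk -v t="$1" -v mn="$2" -v mx="$3" -v n="$4" 'BEGIN {
        if (n <= 2) { print "n/a"; exit }
        printf "%.2f\n", (t - mn - mx) / (n - 2)
    }'
}
```

With the mod_oc4j.c figures, real_average 43710 2 8 13487 prints 3.24.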
Using Aggrespy
Web Cache Monitoring
Oracle Application Server 10g Web Cache
[Diagram: Internet -> Web Cache -> Web Servers -> Database, with labels "Trigger" and "Programmatic"]
Web Cache Tuning
Static and Dynamic Information
Cacheability Rules
Cache Invalidations
Multi-version HTML
Rule for Each Page Component
Web Cache statistics:
Requests – This shows the current, average and maximum transactions per second. A backlog in this section indicates that the Web Cache is overwhelmed and another Web Cache server should be started.
Errors – This summarizes the network, site-busy and partial-page errors for the Web Cache.
Misses – This section shows cacheable and non-cacheable misses along with the number of refreshes for the Web Cache.
Compression – This section shows the total amount of RAM saved by compression and provides a great gauge of the effectiveness of the Web Cache.
Oracle Application Server 10g Load Balancing
Software Load Balancing
Web Cache to OHS – Web Cache interrogates OHS statistics and routes requests to the least-loaded server.
OHS to Database Listener – OHS distributes load to multiple listeners.
Database Listener – Listeners route to multiple dispatchers under MTS, which load balance to the least-loaded RAC instance.
Oracle Application Server 10g Load Balancing
[Diagram: three tiers. Web Cache Tier (Web Cache servers) -> Application Server Tier (Web Servers) -> Database Server Tier (database instances)]
Hardware Load Balancing
[Diagram: Internet -> Web Cache servers -> HTTP Servers -> RAC Servers -> Database Files, hosted in a blade server rack with "OHS & WC" blades and "Oracle RAC" blades]
8-node RAC with 8 extra nodes pre-defined:
cat cmcfg.ora
HeartBeat=15000
ClusterName=TEST
PollInterval=1000
MissCount=210
ServicePort=9998
KernelModuleName=hangcheck-timer
PrivateNodeNames= int-rac1 int-rac2 int-rac3 int-rac4 int-rac5 int-rac6 int-rac7 int-rac8 int-rac9 int-rac10 int-rac11 int-rac12 int-rac13 int-rac14 int-rac15 int-rac16
PublicNodeNames= rac1 rac2 rac3 rac4 rac5 rac6 rac7 rac8 rac9 rac10 rac11 rac12 rac13 rac14 rac15 rac16
HostName=rac1
CmDiskFile=/mnt/ps/db/quorum
Dynamically add a 9th RAC node:
a) Add a Thread and enable the thread
b) Start the 9th instance
-- a redo thread needs at least two log groups before it can be enabled
ALTER DATABASE ADD LOGFILE THREAD 9 SIZE 100M;
ALTER DATABASE ADD LOGFILE THREAD 9 SIZE 100M;
ALTER DATABASE ENABLE PUBLIC THREAD 9;
startup pfile=initrac9.ora
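Monitoring Servers with vmstat

SAMPLE_TIME=300
# Assumption: the slide does not show how SERVER_NAME is set; uname is used here
SERVER_NAME=`uname -n`

while true
do
   # run vmstat and direct the output into the Oracle table . . .
   vmstat ${SAMPLE_TIME} 2 > /tmp/msg$$

   # sed drops the first three lines of vmstat output;
   # the field positions ($8, $9, $14-$16) vary by UNIX dialect
   cat /tmp/msg$$ | sed 1,3d |
   awk '{ printf("%s %s %s %s %s %s\n", $1, $8, $9, $14, $15, $16) }' |
   while read RUNQUE PAGE_IN PAGE_OUT USER_CPU SYSTEM_CPU IDLE_CPU
   do
      $ORACLE_HOME/bin/sqlplus -s perfstat/perfstat@iasdb <<EOF
      insert into perfstat.stats\$vmstat values (
         sysdate, $SAMPLE_TIME, '$SERVER_NAME',
         $RUNQUE, $PAGE_IN, $PAGE_OUT,
         $USER_CPU, $SYSTEM_CPU, $IDLE_CPU, 0 );
      EXIT
EOF
   done

   # remove the work file inside the loop; the loop never exits
   rm /tmp/msg$$
done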
Monitoring Servers with vmstat

root> vmstat 5 5

kthr     memory             page              faults        cpu
----- ----------- ------------------------ ------------ -----------
 r  b    avm  fre  re  pi  po   fr   sr  cy   in    sy   cs us sy id wa
 7  5 220214  141   0   0   0   42   53   0 1724 12381 2206 19 46 28  7
 9  5 220933  195   0   0   1  216  290   0 1952 46118 2712 40 55  0  5
13  5 220646  452   0  14   1   33   54   0 2130 86185 3014 38 59  0  3
 6  5 220228  672   0   0   0    0    0   0 1929 25068 2485 25 49 16 10
Assuming an 8 CPU server:
CPU has enqueues when runqueue (r column) > cpu_count
RAM is paging when scan rate (sr) peaks before page-in (pi)
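The run-queue rule above can be checked mechanically. This is a minimal sketch, assuming the run queue is the first column of each vmstat data row, as in the listing above. The function name flag_cpu_overload is hypothetical, and the CPU count is passed in by the caller.

```shell
# flag_cpu_overload CPU_COUNT
# Reads vmstat output on standard input and reports every sample whose
# run queue (first column) exceeds the given CPU count.
# Header lines are skipped because their first field is not a number.
flag_cpu_overload() {
    awk -v cpus="$1" '$1 ~ /^[0-9]+$/ && $1 + 0 > cpus + 0 {
        printf "run queue %d exceeds %d CPUs\n", $1, cpus
    }'
}
```

For example, vmstat 5 5 | flag_cpu_overload 8 would flag the samples with run queues of 9 and 13 in the listing above.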
Server exception reports
Wed Dec 20                                                  page 1

run queue > 2
May indicate an overloaded CPU. When runqueue exceeds the number of CPUs on the server, tasks are waiting for service.

SERVER_NAME     date      hour  runq  pg_in  pg_ot  usr  sys  idl
--------------- --------  ----  ----  -----  -----  ---  ---  ---
AD-01           00/12/13    17     3      0      0   87    5    8
Whenever UNIX performs a page-in, RAM on the server may have been exhausted and swap pages may be in use.

SERVER_NAME     date      hour  runq  pg_in  pg_ot  usr  sys  idl
--------------- --------  ----  ----  -----  -----  ---  ---  ---
AD-01           00/12/13    16     0      5      0    1    1   98
AD-01           00/12/14    09     0      5      0   10    2   88
AD-01           00/12/15    16     0      6      0    0    0  100
AD-01           00/12/19    20     0     29      2    1    2   98
PROD1DB         00/12/13    14     0      3     43    4    4   93
PROD1DB         00/12/19    07     0      2      0    1    3   96
PROD1DB         00/12/19    11     0      3      0    1    3   96
Fix for Server Stress
Overloaded CPU
– Offload Task to Another Server
– Add CPUs
– Add Additional Instances/Servers
Overloaded RAM
– Add RAM (cheap: $1k/gig)
– Reallocate RAM from Other Components
RAM Disk Solution
Disk I/O remains the biggest bottleneck
100 gig RAM costs $100k
6,000 times faster than disk for Oracle
Your app will still run inefficiently, but it runs 6,000 times faster!
UNIX server monitoring rules:
The UNIX vmstat utility provides a wealth of information about the ongoing performance of the Oracle9iAS server.
The vmstat run queue value (r) can indicate a CPU shortage whenever the run queue exceeds the number of CPUs on the server.
The vmstat page-in value (pi) can indicate a RAM shortage.
You can easily define a vmstat extension table to hold historical server information and use a UNIX shell script to periodically collect server performance information.
The UNIX server information can be used to generate alert reports and long-term trend reports.
Total Response Time:
[Diagram: total response time broken into Client time, Network time, Forms Server time and Database time, across the Client, Form Server and Database Server]
Extend iasdb for performance monitoring:
create table FormStats (
   FORM_ID   VARCHAR2(120),
   EVENT     VARCHAR2(120),
   FSERVER   NUMBER,
   DBASE     NUMBER,
   NWORK     NUMBER,
   CLIENT    NUMBER,
   -- DATE is a reserved word in Oracle, so the column name must be quoted
   "DATE"    DATE
);
Plotting response time data:
Top offending Forms:

Top 10 Forms and Events that use the most Average Form Server Time
with a minimum of 10 executions and greater than 2 seconds for execution.

1. Form:   d:\prod\forms\F_END_USER_GENERATED_LETTERS.fmx
   Event:  CLICK F_END_USER_GENERATED_LETTERS BUTTONS SAVE_BTN 1 MOUSE
   Avg Tm: 5.00 Seconds. Number of Executions: 62

2. Form:   d:\prod\forms\F_PC_PICK_RETURNS.fmx
   Event:  CLICK F_PC_PICK_RETURNS BUTTONS PROCESS 1 MOUSE
   Avg Tm: 4.00 Seconds. Number of Executions: 13

Top 10 Forms and Events that use the most Average Database Time
with a minimum of 2 executions and greater than 5 seconds for execution.

1. Form:   d:\prod\forms\f_pc_case_maint.fmx
   Event:  CLICK F_DIARY DIARY_TAB_ALLOUT DATE_OF_INCIDENT 9
   Avg Tm: 472.00 Seconds. Number of Executions: 2
Conclusions
Develop a proactive, time-based performance data collection scheme. Real-time OEM and Aggrespy metrics are of little use.
Optimize by adjusting RAM resource parameters.
Once the system is optimized, server monitoring is critical.
Server load balancing is critical to properly scale Oracle9iAS.