7/27/2019 SEOUC - How to Solve the Wrong Problem
SEOUC 2011 November 4th, 2011
How to Solve the Wrong Problem (DFS Lock Handle)
Romeo Vasileniuc
BB&T Specialized Lending
-
About Romeo..
I have worked with Oracle databases since 1994
I've been involved in nearly all aspects of Oracle database technologies, including RAC, ASM, Data Guard and Streams
Designed and implemented many varieties of highly available database environments using RAC on ASM, OCFS2 and Tru64 CFS
I enjoy performance tuning
Proficient Perl developer
In my current role as data warehouse architect with BB&T, I have architected and implemented many business-driven solutions using Oracle and other vendor products to meet critical business needs, especially in the warehousing area
Oracle Certified Master, OCP, OCE
-
Agenda
DWH Story
DFS Lock Handle
Oracle DB Storage Overview
Direct I/O
Sync/Async Mode
Supported Platforms
System Monitoring
Oracle testing tools: Orion
Q&A
-
DWH Story
[Diagram: TDWH Service (TDWH01 nodes) and DWH Service (DWH01 nodes). Labels: 9.2.0.3 >> 10.2.0.4 with TTS (Tru64 9.2.0.3, Tru64 10.2.0.4, RHEL5 10.2.0.4); 10.2.0.4 >> 10.2.0.5 (RHEL5 10.2.0.5); Storage Vendor A, Storage Vendor B.]
-
DFS Lock Handle
Parameter  Description
name   The name or "type" of the enqueue or global lock can be determined by looking at the two high-order bytes of P1 or P1RAW. The name is always two characters. Use the following SQL statement to retrieve the lock name:
         select chr(bitand(p1,-16777216)/16777215)||chr(bitand(p1,16711680)/65535) "Lock"
         from v$session_wait
         where event = 'DFS enqueue lock acquisition';
mode   The mode is usually stored in the low-order bytes of P1 or P1RAW and indicates the mode of the enqueue or global lock request:
         select chr(bitand(p1,-16777216)/16777215)||chr(bitand(p1,16711680)/65535) "Lock",
                bitand(p1, 65535) "Mode"
         from v$session_wait
         where event = 'DFS enqueue lock acquisition';
id1    The first identifier (id1) of the enqueue or global lock takes its value from P2 or P2RAW. The meaning of the identifier depends on the name (P1).
id2    The second identifier (id2) of the enqueue or global lock takes its value from P3 or P3RAW. The meaning of the identifier depends on the name (P1).
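The same decoding can be done outside the database once a P1 value has been captured. The following is a minimal Python sketch, assuming P1 is available as an integer; the function name `decode_p1` and the sample value are hypothetical and chosen only to illustrate the byte layout described above.

```python
def decode_p1(p1):
    """Split a P1 value into (lock name, mode).

    The two high-order bytes hold the two-character lock name,
    the low-order 16 bits hold the requested mode.
    """
    name = chr((p1 >> 24) & 0xFF) + chr((p1 >> 16) & 0xFF)
    mode = p1 & 0xFFFF
    return name, mode


# Hypothetical P1RAW value 0x49560005: 0x49 = 'I', 0x56 = 'V', mode 5.
print(decode_p1(0x49560005))
```

An "IV" result would match the invalidation lock type named in the bug description on the next slide.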
-
DFS Lock Handle 10.2.0.4
Bug 8215444
Sessions may hang in a RAC shared server environment while waiting for invalidation locks to be acquired. This is most likely to occur when using Shared Servers with RAC - it should be very rare with dedicated server connections.
Rediscovery Notes: In a RAC environment, sessions wait for "DFS lock handle" with the lock being waited on being an invalidation lock (type "IV")
Fixed in:
10.2.0.5
11.2.0.2
12.1 (future release)
-
File System Characteristics
Performance is not the most important point
Oracle does not support files on file systems that do not have a write-through-cache capability
The file system must acknowledge the write operations (Standard NFS UDP / Network Appliance modified NFS)
Security requirements: data files should be accessible only to the database owner
Journaling file systems: changes are recorded in a journal file (fsck completes more quickly than on non-journaled file systems)
-
Recommended File Systems
Single node: any file system supported by the Linux vendor
Multi-node (RAC):
RAW
OCFS/OCFS2
Redhat Global File System (GFS) See Document 329530.1
NFS-based storage systems (e.g. NetApp, EMC)
ASM
-
Oracle Memory/Disk Workflow
[Diagram components: User Process, Redo Log Buffer, Redo Log Writer, DB Writer, Buffer Cache, CBC, Storage]
-
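Linux Write Operations: Ext3
[Diagram: an Oracle process in user space calls into kernel space, with a kernel switch on the call and another on the return, and the kernel issues the I/O to disk. Components shown: Buffer Cache, Page Cache, Disk.]
Oracle Kernel: fd1 = open("/system01.dbf", O_SYNC|..)
Kernel: ssize_t write(fd1, const void *buf, size_t count)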
-
Linux Read Operations: Ext3
[Diagram: an Oracle process in user space calls into kernel space, with a kernel switch on the call and another on the return, and the kernel issues the I/O to disk. Components shown: Buffer Cache, Page Cache, Disk.]
Oracle Kernel: fd1 = open("/system01.dbf", O_SYNC|..);
Kernel: ssize_t read(fd1, void *buf, size_t count)
-
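Linux Write Operations: RAW, OCFS
[Diagram: an Oracle process in user space calls into kernel space, with a kernel switch on the call and another on the return, and the kernel issues the I/O to disk. Components shown: Buffer Cache, Disk.]
Oracle Kernel: fd1 = open("/system01.dbf", O_SYNC|O_DIRECT|..);
Kernel: ssize_t write(fd1, const void *buf, size_t count)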
-
Linux Read Operations: RAW, OCFS
[Diagram: an Oracle process in user space calls into kernel space, with a kernel switch on the call and another on the return, and the kernel issues the I/O to disk. Components shown: Buffer Cache, Disk.]
Oracle Kernel: fd1 = open("/system01.dbf", O_SYNC|O_DIRECT|..);
Kernel: ssize_t read(fd1, void *buf, size_t count)
-
Linux System Call: open(..)
int open(const char *pathname, int flags);
O_APPEND
    The file is opened in append mode.
O_ASYNC
    Enable signal-driven I/O.
O_CLOEXEC (Since Linux 2.6.23)
    Enable the close-on-exec flag for the new file descriptor.
O_CREAT
    If the file does not exist it will be created.
O_DIRECT (Since Linux 2.4.10)
    Try to minimize cache effects of the I/O to and from this file. In general this will degrade performance, but it is useful in special situations, such as when applications do their own caching. File I/O is done directly to/from user space buffers. The O_DIRECT flag on its own makes an effort to transfer data synchronously, but does not give the guarantees of the O_SYNC flag that data and necessary metadata are transferred. To guarantee synchronous I/O, O_SYNC must be used in addition to O_DIRECT.
O_SYNC
    The file is opened for synchronous I/O. Any writes on the resulting file descriptor will block the calling process until the data has been physically written to the underlying hardware.
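To make the flag combination concrete, here is a minimal Python sketch that writes one block with O_SYNC|O_DIRECT and falls back to O_SYNC alone when the file system rejects O_DIRECT. The function name `sync_write`, the 4096-byte block size and the file name are assumptions for illustration; this is not how Oracle itself opens its data files.

```python
import mmap
import os
import tempfile

BLOCK = 4096  # assumed alignment; a real device may require a different size


def sync_write(path, payload):
    """Write one aligned block, preferring O_SYNC|O_DIRECT, else O_SYNC only.

    Returns (flags used, bytes written).
    """
    # O_DIRECT needs an aligned buffer; an anonymous mmap is page-aligned.
    buf = mmap.mmap(-1, BLOCK)
    buf.write(payload[:BLOCK].ljust(BLOCK, b"\0"))

    base = os.O_WRONLY | os.O_CREAT | os.O_SYNC
    attempts = [(base, "O_SYNC")]
    if hasattr(os, "O_DIRECT"):  # the flag is not defined on every platform
        attempts.insert(0, (base | os.O_DIRECT, "O_SYNC|O_DIRECT"))

    for flags, label in attempts:
        try:
            fd = os.open(path, flags, 0o600)
        except OSError:
            continue  # e.g. the file system does not support O_DIRECT
        try:
            # With O_SYNC the call returns only after the data is written out.
            return label, os.write(fd, buf)
        except OSError:
            continue
        finally:
            os.close(fd)
    raise OSError("no usable open mode for " + path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(sync_write(os.path.join(d, "demo.dbf"), b"hello"))
```

Which label comes back depends on the file system holding the temporary directory, which is the kind of limitation the implementation tips later in the deck warn about.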
-
AIO Overview
Standard feature of 2.6 kernels (patches for 2.4)
Initiate a number of I/O operations without having to block or wait for any to complete
Later, or after being notified of I/O completion, the process can retrieve the results of the I/O
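The submit-now, collect-later pattern can be sketched in Python. The standard library has no binding for kernel AIO, so this sketch emulates the pattern with worker threads and `os.pread`; it illustrates the programming model only, not the kernel mechanism. The function name `read_chunks` and the pool size are assumptions.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def read_chunks(path, offsets, size):
    """Start one positional read per offset, then collect the results."""
    fd = os.open(path, os.O_RDONLY)
    try:
        with ThreadPoolExecutor(max_workers=4) as pool:
            # Submit every request first; submit() does not wait for the disk.
            pending = [pool.submit(os.pread, fd, size, off) for off in offsets]
            # The caller is free to do CPU work here while reads are in flight.
            # Collect the completions, in submission order.
            return [f.result() for f in pending]
    finally:
        os.close(fd)
```

For a file containing `AAAABBBBCCCC`, calling `read_chunks(path, [0, 4, 8], 4)` returns the three four-byte chunks in order.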
-
I/O models
              Blocking                        Non-blocking
Synchronous   Read/Write                      Read/Write (O_NONBLOCK)
Asynchronous  I/O multiplexing (select/poll)  AIO
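Two cells of this matrix are easy to demonstrate with a pipe. The Python sketch below first performs a synchronous non-blocking read, which returns immediately with an error because no data is available, and then uses `select` to block until the descriptor becomes readable. The function name `demo_models` and the payload are hypothetical.

```python
import os
import select


def demo_models():
    """Show a non-blocking read and select-based multiplexing on a pipe."""
    r, w = os.pipe()
    os.set_blocking(r, False)
    events = []
    try:
        os.read(r, 16)  # nothing has been written yet
    except BlockingIOError:
        # Synchronous non-blocking: the call returns at once instead of waiting.
        events.append("would block")
    os.write(w, b"redo")
    # I/O multiplexing: block in select() until the descriptor is readable.
    ready, _, _ = select.select([r], [], [], 1.0)
    if r in ready:
        events.append(os.read(r, 16))
    os.close(r)
    os.close(w)
    return events


print(demo_models())
```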
-
Synchronous blocking I/O model
-
Synchronous non-blocking I/O model
-
Asynchronous blocking I/O model
-
Asynchronous non-blocking I/O model (AIO)
-
AIO interface APIs
API function  Description
aio_read      Request an asynchronous read operation
aio_error     Check the status of an asynchronous request
aio_return    Get the return status of a completed asynchronous request
aio_write     Request an asynchronous write operation
aio_suspend   Suspend the calling process until one or more asynchronous requests have completed (or failed)
aio_cancel    Cancel an asynchronous I/O request
lio_listio    Initiate a list of I/O operations
-
Example AIO Usage: AIOCB Structure
#include <aio.h>

int aio_read(struct aiocb *aiocbp);

struct aiocb {
    /* The order of these fields is implementation-dependent */
    int             aio_fildes;     /* File descriptor */
    off_t           aio_offset;     /* File offset */
    volatile void  *aio_buf;        /* Location of buffer */
    size_t          aio_nbytes;     /* Length of transfer */
    int             aio_reqprio;    /* Request priority */
    struct sigevent aio_sigevent;   /* Notification method */
    int             aio_lio_opcode; /* Operation to be performed; lio_listio() only */

    /* Various implementation-internal fields not shown */
};
-
Example AIO Usage
#include <aio.h>
...
int main()
{
    // open the file
    int file = open("/tmp/file.txt", O_RDONLY|O_DIRECT|O_SYNC, 0);
    // create the buffer
    char* buffer = new char[SIZE_TO_READ];
    // create the control block structure
    aiocb cb;
    memset(&cb, 0, sizeof(aiocb));
    cb.aio_nbytes = SIZE_TO_READ;
    cb.aio_fildes = file;
    cb.aio_offset = 0;
    cb.aio_buf = buffer;
    // read!
    if (aio_read(&cb) == -1)
    {
        cout
-
FILESYSTEMIO_OPTIONS
Value     AIO  DIO  Description
asynch    Yes  No   Allows asynchronous I/O to be used where supported by the OS.
directIO  No   Yes  Allows direct I/O to be used where supported by the OS. Direct I/O bypasses any Unix buffer cache. As of 10.2 most platforms will try to use the "directio" option for NFS mounted disks (and will also check NFS attributes are sensible).
setall    Yes  Yes  Enables both asynchronous and direct I/O.
none      No   No   Disables asynchronous and direct I/O so that Oracle uses normal synchronous writes, without any direct I/O options.
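The four values reduce to two independent switches. A minimal Python sketch of that mapping follows, assuming the value is read as a string (for example from a parameter listing); the names `FILESYSTEMIO_OPTIONS` and `io_modes` are hypothetical helpers, not part of any Oracle tool.

```python
# (async I/O, direct I/O) implied by each FILESYSTEMIO_OPTIONS value,
# following the descriptions in the table above.
FILESYSTEMIO_OPTIONS = {
    "none": (False, False),
    "asynch": (True, False),
    "directio": (False, True),
    "setall": (True, True),
}


def io_modes(value):
    """Return which I/O modes a FILESYSTEMIO_OPTIONS setting requests."""
    aio, dio = FILESYSTEMIO_OPTIONS[value.strip().lower()]
    return {"async_io": aio, "direct_io": dio}


print(io_modes("SETALL"))
```

An unknown value raises `KeyError`, which is deliberate: the sketch does not guess at settings outside the four listed.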
-
FILESYSTEMIO_OPTIONS - ASM
Value     AIO DIO                      Description
asynch    DISK_ASYNCH_IO {TRUE|FALSE}  Allows asynchronous I/O to be used where supported by the OS.
directIO  DISK_ASYNCH_IO {TRUE|FALSE}  Allows direct I/O to be used where supported by the OS. Direct I/O bypasses any Unix buffer cache. As of 10.2 most platforms will try to use the "directio" option for NFS mounted disks (and will also check NFS attributes are sensible).
setall    DISK_ASYNCH_IO {TRUE|FALSE}  Enables both asynchronous and direct I/O.
none      DISK_ASYNCH_IO {TRUE|FALSE}  Disables asynchronous and direct I/O so that Oracle uses normal synchronous writes, without any direct I/O options.
-
DISK_ASYNCH_IO
Controls whether I/O to datafiles, control files and logfiles is asynchronous (that is, whether parallel server processes can overlap I/O requests with CPU processing during table scans).
Default value: TRUE
Set to FALSE to disable asynchronous I/O
If you set DISK_ASYNCH_IO to FALSE, you should set DBWR_IO_SLAVES to a value other than its default (0) in order to simulate asynchronous I/O
If DBWR_IO_SLAVES > 0, the number of processes used by ARCH and LGWR is set to 4. Also, RMAN server processes will be set to 4, but only if asynchronous I/O is disabled
-
How to check the usage at OS Level
If I/O async is enabled:
[oracle@dwh01 ~]$ sudo cat /proc/slabinfo | grep kio
kioctx 89 144 320 12 1 : tunables 54 27 8 : slabdata 12 12 0
kiocb 47 240 256 15 1 : tunables 120 60 8 : slabdata 13 16 0
If I/O async is disabled:
[oracle@lnx6 ~]$ sudo cat /proc/slabinfo | grep kio
kioctx 0 0 320 12 1 : tunables 54 27 8 : slabdata 0 0 0
kiocb 0 0 256 15 1 : tunables 120 60 8 : slabdata 0 0 0
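The check above can be automated by reading the active-object count, which is the second column of each slabinfo row. The following Python sketch parses text in the format shown on the slide; the function name `aio_slab_counts` is hypothetical, and the cache names `kioctx` and `kiocb` are taken from the slide and may differ on other kernel versions.

```python
def aio_slab_counts(slabinfo_text):
    """Return {cache name: active objects} for the AIO-related slab caches."""
    counts = {}
    for line in slabinfo_text.splitlines():
        fields = line.split()
        if fields and fields[0] in ("kioctx", "kiocb"):
            counts[fields[0]] = int(fields[1])  # second column: active objects
    return counts


SAMPLE_ENABLED = """\
kioctx 89 144 320 12 1 : tunables 54 27 8 : slabdata 12 12 0
kiocb 47 240 256 15 1 : tunables 120 60 8 : slabdata 13 16 0
"""

counts = aio_slab_counts(SAMPLE_ENABLED)
# Any non-zero count suggests that some process is using kernel AIO.
print(counts, any(v > 0 for v in counts.values()))
```

On a live system the input would be the contents of /proc/slabinfo, which normally requires root to read.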
-
How to check the usage at DB Level
If I/O async is enabled:
[oracle@a01 ~]$ ps -eaf | grep lgwr
oracle 13924 1 0 Oct12 ? 00:19:18 ora_lgwr_DWH1
[oracle@a01 ~]$ strace -p 13924
io_submit(47566907363328, 2, {{0x2b4309936b70, 0, 1, 0, 41}, {0x2b4309936d38, 0, 1, 0, 42}}) = 2
io_getevents(47566907363328, 1, 1024, {{0x2b4309936b70, 0x2b4309936b70, 512, 0}, {0x2b4309936d38, 0x2b4309936d38, 512, 0}}, {600, 0}) = 2
If I/O async is disabled:
[oracle@b01 ~]$ ps -eaf | grep lgwr
oracle 28400 1 0 Feb22 ? 05:41:21 ora_lgwr_PMT1
[oracle@b01 ~]$ strace -p 28400
..
pwrite64(20, "\1\"\0\0\4)\0\0p\317\10\0\20\200..\213k\r\1\0\0042"..., 512, 5376000) = 512
pwrite64(21, "\1\"\0\0\4)\0\0p\317\10\0\20\200..\213k\r\1\0\0042"..., 512, 5376000) = 512
..
-
How to check the usage at DB Level - ASM
Async I/O is enabled by default on 10g and later and on ASM
You can disable it using DISK_ASYNCH_IO=false
[oracle@c01 ~] strace -p 13260
...
open(0xfe2c85f0, O_WRONLY|O_CREAT|O_APPEND|O_LARGEFILE, 0660) = 8
writev(8, [?] 0xffffb608, 2) = 80
...
read(16, "MSA\0\2\0\10\0P\0\0\0\222\377\377\377@\313\373\5\0\0\0"..., 80) = 80
...
-
Oracle Linux, Filesystem & I/O Type Supportability (Doc ID 279069.1)
            10g Async I/O  10g Direct I/O  11g Async I/O  11g Direct I/O
ext3/ext4   Yes (8)        Yes (8)         Yes (8)        Yes (8)
raw         Depr. (2)      Depr. (2)       Depr. (2)      Depr. (2)
block       Yes (3)        Yes (3)         Yes (3)        Yes (3)
ASM (4)     Yes            Yes             Yes            Yes
OCFS2       Yes            Yes             Yes            Yes
NFS         Yes            Yes (7)         Yes (7)        Yes
GFS: Yes (5), Yes (5)
GFS2: Yes (6), Yes (6)
-
System Monitoring - LGWR
[oracle@dwh1 ~]$ ps -eaf | grep lgwr
oracle 28878 1 0 11:47 ? 00:00:02 ora_lgwr_test1
[oracle@dwh1 ~]$ strace -cp 28878
Process 28878 attached - interrupt to quit
Process 28878 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 91.68    0.076143          43      1791           pwrite
  5.67    0.004709          16       290           pread
  2.59    0.002152          10       212         7 semtimedop
  0.06    0.000053           0      5945           times
  0.00    0.000000           0         1           read
  0.00    0.000000           0        36           open
  0.00    0.000000           0        36           close
  0.00    0.000000           0        21           stat
  0.00    0.000000           0        35           writev
  0.00    0.000000           0        10           sendto
  0.00    0.000000           0       123           kill
  0.00    0.000000           0        67           semctl
  0.00    0.000000           0        10           getrusage
------ ----------- ----------- --------- --------- ----------------
100.00    0.083057                  8577         7 total
-
System Monitoring - DBWR
[oracle@dwh1 ~]$ ps -eaf | grep dbw
oracle 28876 1 0 11:47 ? 00:00:03 ora_dbw0_test1
[oracle@dwh1 ~]$ strace -cp 28876
Process 28876 attached - interrupt to quit
Process 28876 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 89.77    0.019979          54       370           pwrite
  5.67    0.001262           0     70462           getrusage
  4.49    0.000999          42        24         9 semtimedop
  0.07    0.000015           0       817           times
  0.00    0.000000           0         2           mmap
  0.00    0.000000           0        21           semctl
------ ----------- ----------- --------- --------- ----------------
100.00    0.022255                 71696         9 total
-
System Monitoring LGWR - ASM
[oracle@pd3 ~]$ ps -eaf | grep lgwr
oracle 30809 1 0 13:37 ? 00:00:00 ora_lgwr_test1
[oracle@pd3 ~]$ strace -cp 30809
Process 30809 attached - interrupt to quit
Process 30809 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 65.97    0.034288          91       377           io_submit
 13.77    0.007155           4      1839           io_getevents
 11.15    0.005794          20       295           pread
  8.04    0.004178          13       314         7 semtimedop
  0.82    0.000428          31        14           pwrite
  0.23    0.000118           0      5475           times
  0.02    0.000011           0       248           kill
  0.00    0.000000           0        32           open
  0.00    0.000000           0        32           close
  0.00    0.000000           0        21           stat
  0.00    0.000000           0        32           writev
  0.00    0.000000           0         6           sendto
  0.00    0.000000           0        42           semctl
  0.00    0.000000           0         8           getrusage
------ ----------- ----------- --------- --------- ----------------
100.00    0.051972                  8735         7 total
-
System Monitoring DBWR - ASM
[oracle@pd3 ~]$ ps -eaf | grep dbw
oracle 30807 1 0 13:37 ? 00:00:00 ora_dbw0_test1
[oracle@pd3 ~]$ strace -cp 30807
Process 30807 attached - interrupt to quit
Process 30807 detached
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 88.57    0.036811         449        82           io_submit
  5.32    0.002210           0     96686           getrusage
  4.36    0.001814           4       451           io_getevents
  1.11    0.000463           1       416           mmap
  0.57    0.000238           0      6008           kill
  0.06    0.000024           0      3488           times
  0.00    0.000000           0         2           semop
  0.00    0.000000           0        96           semctl
  0.00    0.000000           0        22         8 semtimedop
------ ----------- ----------- --------- --------- ----------------
100.00    0.041560                107251         8 total
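The four strace -c summaries differ mainly in whether writes go through pwrite or through io_submit. A rough Python sketch for classifying such a summary follows; the function names `syscall_calls` and `write_path` and the decision rule (compare call counts) are assumptions made for illustration, and the sample is abbreviated from the LGWR on ASM slide.

```python
def syscall_calls(summary_text):
    """Map syscall name to call count from `strace -c` summary text."""
    calls = {}
    for line in summary_text.split("\n"):
        f = line.split()
        # Data rows: % time, seconds, usecs/call, calls, [errors], syscall
        if len(f) not in (5, 6) or f[-1] == "total":
            continue
        try:
            float(f[0])  # rejects the header and the dashed separator rows
            calls[f[-1]] = int(f[3])
        except ValueError:
            continue
    return calls


def write_path(summary_text):
    """Guess whether the traced process writes via AIO or synchronous calls."""
    calls = syscall_calls(summary_text)
    aio = calls.get("io_submit", 0)
    sync = calls.get("pwrite", 0) + calls.get("pwrite64", 0)
    if aio > sync:
        return "asynchronous"
    if sync > 0:
        return "synchronous"
    return "unknown"


SAMPLE_ASM_LGWR = """\
 65.97    0.034288          91       377           io_submit
 13.77    0.007155           4      1839           io_getevents
  8.04    0.004178          13       314         7 semtimedop
  0.82    0.000428          31        14           pwrite
100.00    0.051972                  8735         7 total
"""

print(write_path(SAMPLE_ASM_LGWR))
```

A short trace window can mislead, so the result is best read as a hint that prompts a closer look at the parameters rather than as proof.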
-
ORION (Oracle I/O Calibration Tool)
Standalone tool for calibrating the I/O performance of storage systems that are intended to be used for Oracle databases
No need to create and run an Oracle DB
May be configured to simulate OLTP or DWH environments
-
Orion Configuration
[oracle@pd3 ~]$ cat dwh_dg1.lun
/dev/mapper/mpath7p1
/dev/mapper/mpath8p1
/dev/mapper/mpath9p1
/dev/mapper/mpath14p1
-
Orion Configuration
[oracle@dwh01 orion]$ cat orion-test1.sh
#!/bin/bash
HOST=`hostname -s`
DG=$1
TEST="${HOST}-${DG}"

cd ${ROOT}

echo "Hostname: ${HOST}"
echo " dg: ${DG}"
echo " test: ${TEST}"

sudo ./orion_linux_x86-64 -run advanced \
    -testname ${TEST} \
    -matrix point \
    -num_small 0 \
    -num_large 4 \
    -size_large 1024 \
    -num_disks 4 \
    -type seq \
    -num_streamIO 4 \
    -simulate raid0 \
    -cache_size 0 \
    -stripe 1024 \
    -write 50 \
    -verbose
-
Orion Run: summary.txt
ORION VERSION 11.1.0.7.0

Commandline: -run simple -testname dg1 -num_disks 4

This maps to this test:
Test: dg1
Small IO size: 8 KB
Large IO size: 1024 KB
IO Types: Small Random IOs, Large Random IOs
Simulated Array Type: CONCAT
Write: 0%
Cache Size: Not Entered
Duration for each Data Point: 60 seconds
Small Columns:, 0
Large Columns:, 0, 1, 2, 3, 4, 5, 6, 7, 8
Total Data Points: 29

Name: /dev/mapper/mpath7p1 Size: 1099522496512
Name: /dev/mapper/mpath8p1 Size: 1099522496512
Name: /dev/mapper/mpath9p1 Size: 1099522496512
Name: /dev/mapper/mpath14p1 Size: 1099522496512
4 FILEs found.

Maximum Large MBPS=330.91 @ Small=0 and Large=8
Maximum Small IOPS=8856 @ Small=20 and Large=0
Minimum Small Latency=0.85 @ Small=1 and Large=0
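When comparing several Orion runs (for example one per storage vendor), it helps to pull the three headline figures out of each summary.txt. A minimal Python sketch follows, assuming the summary lines keep the wording shown above; the function name `parse_orion_summary` and the result layout are hypothetical.

```python
import re

# Matches the three headline lines at the end of an Orion summary.txt.
HEADLINE = re.compile(
    r"(Maximum Large MBPS|Maximum Small IOPS|Minimum Small Latency)"
    r"=([\d.]+) @ Small=(\d+) and Large=(\d+)"
)


def parse_orion_summary(text):
    """Return {metric: {"value", "small", "large"}} for the headline figures."""
    results = {}
    for name, value, small, large in HEADLINE.findall(text):
        results[name] = {
            "value": float(value),
            "small": int(small),   # outstanding small I/Os at that data point
            "large": int(large),   # outstanding large I/Os at that data point
        }
    return results


SAMPLE = """\
Maximum Large MBPS=330.91 @ Small=0 and Large=8
Maximum Small IOPS=8856 @ Small=20 and Large=0
Minimum Small Latency=0.85 @ Small=1 and Large=0
"""

for metric, point in parse_orion_summary(SAMPLE).items():
    print(metric, point)
```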
-
Implementation Tips..
Test, Test, Test
Make friends in the Storage/System Administration Groups
Be aware of any existing bugs/limitations:
Data Pump Export (EXPDP) Received Error ORA-31641 (Doc ID 1330406.1): occurs when filesystemio_options is set to DIRECTIO or SETALL and expdp exports tables to a file system (tmpfs) that does not support O_DIRECT. Workaround:
alter system set filesystemio_options=none scope=spfile;
-
References
Metalink
Oracle Linux, Filesystem & I/O Type Supportability (Doc ID 279069.1)
ASM Inherently Performs Asynchronous I/O Regardless of filesystemio_options Parameter (Doc ID 751463.1)
How To Check if Asynchronous I/O is Working On Linux (Doc ID 237299.1)
Supported and Recommended File Systems on Linux (Doc ID 236826.1)
Comparing Performance Between RAW IO vs OCFS vs EXT 2/3 (Doc ID 236679.1)
Using Redhat Global File System (GFS) as shared storage for RAC (Doc ID 329530.1)
Asynchronous I/O (aio) on RedHat Advanced Server 2.1 and RedHat Enterprise Linux 3 (Doc ID 225751.1)
Asynchronous I/O Support on OCFS/OCFS2 and Related Settings: filesystemio_options, disk_asynch_io (Doc ID 432854.1)
Other Sites
M. Tim Jones - Boost application performance using asynchronous I/O (http://www.ibm.com/developerworks/linux/library/l-async/index.html)
-
Questions & Contact Info
http://blog.romeosoft.com/