advanced shell scripting for oracle professionals
Post on 15-Apr-2017
ADVANCED SHELL SCRIPTING
FOR ORACLE
PROFESSIONALS
JEVGEŅIJS REUTS
INTRO
• WHO? - Oracle Applications Database Consultant @ Pythian
- Oracle Database Administrator Certified Professional
- Working with Oracle since 2006
- Not shell scripting guru
• WHY? - Interesting case to share with community
© 2015 Pythian Confidential 2
BUSINESS CASE
• Migrate users of a particular department from an on-premises Oracle Internet Directory 10g instance to an Oracle Internet Directory 11g instance located in Amazon (AWS)
CUSTOMER REQUIREMENTS
• Migrate users to new basedn or “tree” in OID 11g
• CSV with usernames provided
• Initial downtime requirement max 4h
• Later changed to “No Downtime allowed”
INITIAL REVIEW
• CSV contains 2,2M usernames
• OID cn: attribute = username
– Example:
USER_ID,SOURCE_USER_ID
317149,SNOWLIS
• Usernames have no pattern for filtering
• Total number of users in 10g OID: 7,7M
OID USER RECORD EXAMPLE
[oracle@oid10g ~]$ ldapsearch -h localhost -p 389 -D "cn=orcladmin" -w password -L -s sub -b "cn=users,dc=test,dc=example,dc=com" "(cn=SNOWLIS)" "*"
dn: cn=SNOWLIS, cn=users,dc=test,dc=example,dc=com
authpassword;oid: {SASL/MD5-DN}E5GNW+/uc5Q4vaUHTpoV8w==
authpassword;oid: {SASL/MD5-U}em8szBiI6lQe7oSZys9S6w==
authpassword;oid: {SASL/MD5}OIcK6dZZFlu7kZOw8+RxEQ==
authpassword;orclcommonpwd: {MD5}UVSevJPyPkXxUHoK1QMOfw==
authpassword;orclcommonpwd: {X- ORCLLMV}C5A7687D19248DD11D71060D896B7A46
authpassword;orclcommonpwd: {X- ORCLNTV}769F744EC914822D37C66B8EFBFD68F9
authpassword;orclcommonpwd: {X- ORCLIFSMD5}AMLZgqATptPU1TkLgpGh1w==
authpassword;orclcommonpwd: {X- ORCLWEBDAV}Fg/OrZz6AEATMeJMXWm19A==
cn: SNOWLIS
mail: test.test@example.com
objectclass: orcluserv2
objectclass: organizationalPerson
objectclass: top
objectclass: person
objectclass: inetorgperson
orclisenabled: ENABLED
orclpassword: {x- orcldbpwd}1.0:059A0F10E478B5BB
sn: SNOWLIS
uid: SNOWLIS
userpassword: {SHA}1btDzs8cj+zHwHLzsgEaUCJ0nn0=
INITIAL APPROACH
• Create shell script
• Script reads usernames from the CSV line by line
• Checks with ldapsearch whether the entry exists
• If it exists, runs ldapsearch again to dump the full entry content to an LDIF file
• Replace basedn or “tree” with sed
• Import users with native OID bulkload utility (uses SQL*Loader, downtime required)
INITIAL APPROACH EXAMPLE
# Skip the CSV header, take the username column and look each user up in OID
cat ${v_base_dir}/usernames.csv | grep -v "USER_ID,SOURCE_USER_ID" | awk 'BEGIN {FS=","}{print $2}' | while read v_username ; do
  # First ldapsearch only checks whether the entry exists
  v_ldap_result=$(ldapsearch -h localhost -p 389 -D "cn=orcladmin" -w ${v_oid_pwd} -L -s sub -b "cn=users,dc=test,dc=example,dc=com" "(cn=${v_username})" "dn" | wc -l)
  if [ ${v_ldap_result} -gt 0 ] ; then
    # Second ldapsearch dumps the whole entry, followed by a blank separator line
    ldapsearch -h localhost -p 389 -D "cn=orcladmin" -w ${v_oid_pwd} -L -s sub -b "cn=users,dc=test,dc=example,dc=com" "(cn=${v_username})" "*" >> ${v_base_dir}/content_generated_from_cvs.ldif
    echo "" >> ${v_base_dir}/content_generated_from_cvs.ldif
  else
    echo ${v_username} >> ${v_base_dir}/users_not_in_oid.log
  fi
done
PROBLEM WITH INITIAL APPROACH
• Single ldapsearch operation takes ~1s
• For 2,2M users that is 2,2M seconds
• Or 611 hours or 25 days
• Not an option
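The estimate on this slide can be reproduced with bash integer arithmetic. This is only a back-of-the-envelope check: the figures are the ones from the slide, and the variable names are made up for illustration.

```shell
#!/usr/bin/env bash
# Figures from the slide: 2,2M usernames, about 1 second per ldapsearch.
users=2200000
secs_per_lookup=1

total_secs=$(( users * secs_per_lookup ))
total_hours=$(( total_secs / 3600 ))   # integer division, rounds down
total_days=$(( total_hours / 24 ))     # integer division, rounds down

echo "${total_secs} s = ${total_hours} h = ${total_days} d"
```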
APPROACH 2
• Full export of OID basedn or “tree”
ldifwrite connect="SID" basedn="cn=users,dc=test,dc=example,dc=com" ldiffile=content_generated_from_cvs.ldif threads=8
• Create shell script to read full export file and compare usernames against CSV file
• If the user exists, dump the full entry content to an LDIF file
• Replace basedn or “tree” with sed
• Import users with native OID bulkload utility (uses SQL*Loader, downtime required)
PROBLEM WITH APPROACH 2
• Full export file size 9GB
• Total 7,7M users x 21 attributes
• Huge number of lines (7,7M x 21 is roughly 160M)
• CSV file has 2,2M lines
• HOW TO HANDLE THIS EFFICIENTLY?
WAY TO GO
• Use BASH associative array
• Load usernames from the CSV into the array
• Then read the full user dump file and check each entry against the array
• Loading 2,2M rows from the CSV into the array took 50 minutes
• NOTE: BASH associative arrays are available
since bash version 4.0
BASH ASSOCIATIVE ARRAY
# load csv to array
declare -A myarray1
while read line_data
do
  myarray1[${line_data}]=1
done <<< "$(cat usernames.csv | grep -v "USER_ID,SOURCE_USER_ID" | awk 'BEGIN {FS=","}{print $2}')"
[oracle@oid10g ~]$ echo ${myarray1[SNOWLIS]}
1
[oracle@oid10g ~]$ echo ${myarray1[SNOWLIS1]}
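A variation on the load step, as a sketch only: `read` can split each CSV line on the comma itself, so no cat, grep or awk process is needed. The sample file, the `users` array name and the upper-casing of keys are assumptions for illustration. The slides do not say whether this would shorten the 50-minute load.

```shell
#!/usr/bin/env bash
# Requires bash 4.0+ for associative arrays and ${var^^}.
# Small sample CSV in the same layout as the slide (second row is made up).
csv=$(mktemp)
printf '%s\n' 'USER_ID,SOURCE_USER_ID' '317149,SNOWLIS' '317150,jdoe' > "$csv"

declare -A users
{
  read -r _header                      # consume and discard the header line
  while IFS=',' read -r _id username; do
    [ -n "$username" ] || continue     # ignore blank or malformed lines
    users[${username^^}]=1             # key is the upper-cased username
  done
} < "$csv"
rm -f "$csv"

echo "loaded ${#users[@]} usernames"
echo "SNOWLIS  -> ${users[SNOWLIS]:-not found}"
echo "SNOWLIS1 -> ${users[SNOWLIS1]:-not found}"
```

Upper-casing the key at load time keeps lookups consistent if the export side is also upper-cased before the lookup.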
CONSTRUCTING MAIN BLOCK
• Reading full dump file
• If the line is a dn: attribute, extract the cn (username) from it
• Check whether that username exists in the array
• If it exists, set the print flag and dump all lines until the next dn: attribute
CONSTRUCTING MAIN BLOCK
# Nothing is printed until the first matching dn: line is seen
print_status=0
while read v_user_entry_item ; do
  # Is this line the start of a new entry?
  v_user_entry_res=$(echo "${v_user_entry_item}" | grep "^dn:" | wc -l)
  if [ ${v_user_entry_res} -gt 0 ] ; then
    # "dn: cn=SNOWLIS, cn=users,..." -> "SNOWLIS"
    v_username=$(echo "${v_user_entry_item}" | awk 'BEGIN {FS=","}{print $1}' | awk 'BEGIN {FS="="}{print $2}')
    if [ "1" == "${myarray1[$v_username]}" ]; then
      print_status=1
    else
      print_status=0
    fi
  fi
  if [ ${print_status} = "1" ] ; then
    echo "${v_user_entry_item}" >> content_to_load.ldif
  fi
done < ${v_base_dir}/content_generated_from_cvs.ldif
RUNNING THE SCRIPT
• Script started, working as expected
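• But still slow, estimated time to complete > 24h
• Why is the script taking so long?
• strace -c -f -p <pid>
• Cat, grep, sed and awk are external utilities
• When bash runs an external command it forks a child process, so the script forks several times for every line of the dump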
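To make the forking cost concrete, here is a small sketch that counts dn: lines in two ways: once by piping every line through grep and wc, as the first version of the script does, and once with bash built-ins only. The sample data and variable names are invented for illustration.

```shell
#!/usr/bin/env bash
# Build a tiny LDIF-like sample: three entries, each followed by a blank line.
sample=$(mktemp)
for i in 1 2 3; do
  printf 'dn: cn=USER%s, cn=users,dc=test,dc=example,dc=com\ncn: USER%s\n\n' "$i" "$i"
done > "$sample"

count_forking=0
while read -r line; do
  # $(...) forks a subshell, and grep and wc are two more child processes
  hit=$(echo "${line}" | grep "^dn:" | wc -l)
  if [ "${hit}" -gt 0 ]; then count_forking=$(( count_forking + 1 )); fi
done < "$sample"

count_builtin=0
while read -r line; do
  # substring expansion and [ are handled inside the shell, no fork
  if [ "${line:0:3}" = "dn:" ]; then count_builtin=$(( count_builtin + 1 )); fi
done < "$sample"
rm -f "$sample"

echo "forking: ${count_forking}, builtin: ${count_builtin}"
```

Both loops give the same count. Wrapping each loop in bash's `time` keyword on a larger file gives a rough idea of the difference.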
STRACE OUTPUT
strace -c -f -p 17011
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 95.11    0.134360        1600        84        28 wait4
  1.48    0.002091           6       336           fstat
  1.16    0.001635           3       486           dup2
  0.93    0.001316           0      3140           rt_sigprocmask
  0.20    0.000280           1       249           write
  0.19    0.000264           2       112           getegid
  0.18    0.000251           1       420           mmap
  0.17    0.000246           1       375           open
  0.15    0.000213           0       841        57 close
  0.10    0.000141           0      1035           fcntl
  0.08    0.000111           0       280        84 stat
  0.05    0.000072           1       140        28 access
  0.04    0.000058           2        28           munmap
  0.04    0.000056           0      1000           lseek
  0.03    0.000049           1        84           brk
  0.03    0.000041           0       196           mprotect
  0.02    0.000030           0       621           rt_sigaction
  0.02    0.000029           0       112           getuid
  0.02    0.000028           0       557       557 ioctl
  0.00    0.000000           0       669           read
------ ----------- ----------- --------- --------- ----------------
100.00    0.141271                 11232       754 total
BASH STRING PROCESSING
${parameter:offset:length}
string="dn: cn=SNOWLIS, cn=users,dc=test,dc=example,dc=com"
echo ${string:0:3}
dn:
${parameter:offset}
string="cn: SNOWLIS"
echo ${string:4}
SNOWLIS
BASH STRING PROCESSING
string=snowlis
echo ${string^^}
SNOWLIS
string=SNOWLIS
echo ${string,,}
snowlis
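The same built-in approach can replace the two awk calls that pull the username out of a dn: line. A minimal sketch using the `${parameter%%pattern}` and `${parameter#pattern}` expansions, which are standard bash but are not shown on the slides. The variable names are illustrative.

```shell
#!/usr/bin/env bash
dn_line="dn: cn=SNOWLIS, cn=users,dc=test,dc=example,dc=com"

first_rdn=${dn_line%%,*}    # drop everything from the first comma: "dn: cn=SNOWLIS"
username=${first_rdn#*=}    # drop everything up to the first "=": "SNOWLIS"

echo "${username}"
```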
REWRITTEN SCRIPT VERSION
echo "Processing full export LDIF..."
print_status=0
TMP=""
# Read the full export line by line, buffering one entry at a time in TMP.
# IFS= and -r keep leading spaces (folded LDIF lines) and backslashes intact.
while IFS= read -r v_user_entry_item ; do
  if [ "X${v_user_entry_item:0:3}" == "Xdn:" ] ; then
    # A new entry starts: flush the previous one if it matched
    if [ ${print_status} = "1" ] ; then
      echo "${TMP}" >> content_to_load.ldif
      print_status=0
    fi
    TMP=""
  fi
  if [ "X${v_user_entry_item:0:4}" == "Xcn: " ] ; then
    v_user_entry_item_cn=${v_user_entry_item:4}
    if [ "1" == "${myarray1[${v_user_entry_item_cn^^}]}" ]; then
      print_status=1
      myarray1[${v_user_entry_item_cn^^}]=2    # mark the username as found
    else
      print_status=0
    fi
  fi
  TMP="${TMP}
${v_user_entry_item}"
done < ${v_base_dir}/content_generated_from_cvs.ldif
# Flush the last entry
if [ ${print_status} = "1" ] ; then
  echo "${TMP}" >> content_to_load.ldif
fi
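The script sets matched keys to 2, which suggests one possible follow-up: anything still set to 1 was in the CSV but never seen in the export. The slides do not show this step, so the following is a sketch under that assumption, with a small hand-filled array standing in for the real one.

```shell
#!/usr/bin/env bash
# Requires bash 4.0+. Stand-in for the array after the main loop has run:
# 1 = loaded from the CSV but not seen in the export, 2 = found and written out.
declare -A myarray1=( [SNOWLIS]=2 [JDOE]=1 [ASMITH]=2 )

not_found=()
for v_username in "${!myarray1[@]}"; do      # iterate over the keys
  if [ "${myarray1[$v_username]}" = "1" ]; then
    not_found+=( "$v_username" )
  fi
done

echo "not found in export: ${not_found[*]}"
```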
RESULTS
• Script execution time decreased to 4h
• Still not fast enough
• Redesign the script to run in 4 parallel sessions
• Split the full dump file into 4 parts with the split command
• Merge the four output files
• Script execution time decreased to 1h
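The split, parallel run and merge could look roughly like the sketch below. It assumes GNU coreutils `split`, and `process_chunk` is a hypothetical stand-in for the real filtering loop. A line-based split can also cut an LDIF entry in half at a chunk boundary; the slides do not say how that was handled, and this sketch does not handle it either.

```shell
#!/usr/bin/env bash
# Assumes GNU coreutils split (-n l/4 makes 4 chunks without cutting a line).
workdir=$(mktemp -d)
for i in 1 2 3 4 5 6 7 8; do echo "cn: USER${i}"; done > "${workdir}/full.ldif"

# Hypothetical stand-in for the real filtering loop: copies "cn: " lines.
process_chunk() {
  while read -r line; do
    if [ "${line:0:4}" = "cn: " ]; then echo "${line}"; fi
  done < "$1" > "$1.out"
}

# Produces part_aa, part_ab, part_ac and part_ad
split -n l/4 "${workdir}/full.ldif" "${workdir}/part_"

for chunk in "${workdir}"/part_??; do
  process_chunk "${chunk}" &           # one background job per chunk
done
wait                                   # block until all four jobs finish

# The glob expands in sorted order, so the original line order is kept
cat "${workdir}"/part_??.out > "${workdir}/merged.ldif"
merged_lines=$(wc -l < "${workdir}/merged.ldif")
echo "merged ${merged_lines} lines"
rm -rf "${workdir}"
```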
POST PROCESSING & IMPORT
• Replace basedn or “tree” with sed
sed -i "s/cn=Users,dc=test,dc=example,dc=com/cn=users,ou=he,dc=test,dc=example2,dc=net/g" content_to_load.ldif
• Remove internal OID attributes with sed
• Run import with ldapadd native OID tool
ldapadd -h localhost -p 3060 -D "cn=orcladmin" -w <pwd> -f content_to_load.ldif -c
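Both sed steps can be combined in one pass, sketched here on a four-line sample. The slides do not list which internal attributes were removed, so `orclguid` and its value are placeholders for illustration only, and the sample dn is written to match the case of the sed pattern above.

```shell
#!/usr/bin/env bash
in_file=$(mktemp)
out_file=$(mktemp)
# Made-up sample entry; the orclguid value is not real data.
cat > "${in_file}" <<'EOF'
dn: cn=SNOWLIS,cn=Users,dc=test,dc=example,dc=com
cn: SNOWLIS
orclguid: 0123456789ABCDEF0123456789ABCDEF
sn: SNOWLIS
EOF

old_base="cn=Users,dc=test,dc=example,dc=com"
new_base="cn=users,ou=he,dc=test,dc=example2,dc=net"

# First expression rewrites the base DN, second drops one attribute line.
sed -e "s/${old_base}/${new_base}/g" -e '/^orclguid:/d' "${in_file}" > "${out_file}"

new_dn=$(head -n 1 "${out_file}")
remaining=$(grep -c '^orclguid:' "${out_file}" || true)   # grep exits 1 on zero matches
cat "${out_file}"
rm -f "${in_file}" "${out_file}"
```

Writing to a second file instead of using `sed -i` keeps the filtered export untouched until the result has been checked.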
CONCLUSION
• Cat, awk, sed and grep utilities are efficient and useful when working with small files
• When working with huge files, use bash string processing where possible
• Bash associative arrays can improve the performance of your scripts
BEER TIME !!!!