ERP COMMON ERRORS & SOLUTIONS

This document lists all UNIX, Oracle and SAP related errors that are worth sharing across teams.

TABLE OF CONTENTS

UNIX
ORACLE
SAP GENERAL
SAP PRESENTATION
SAP BACKUPS
SAP SECURITY

UNIX

ERROR MSG/SYMPTOM POSSIBLE CAUSE & SUGGESTED SOLUTION PREVENTIVE ACTION

1. nfiles has reached 80%

Number of simultaneously open files in UNIX. A reboot is needed to bring nfiles to zero.

This cannot be prevented in 3.0F. Once a file is opened, it is never closed until reboot.

2. SOE Refresh: when was the last time files were refreshed into a host and what files were refreshed?

SOE refresh / swinstall logs can be found in /var/adm/sw/swagent.log

3. ethp2010: Disk utilization of /sapmnt/E4Q is 90%

To find out which directory is the biggest: ls -F | grep "/" | xargs du -s

To find out which file inside a directory is the biggest: ls -l | sort -r -k5n

To find out which directory is the biggest under the same mount point: du -x .

If BIethp2010* are the biggest files, ask Basis to run SAP_REORG_BATCHINPUT housekeeping job with the correct client variant.

The BI* files are batch input log files that are generated whenever there are batch input jobs running. These are cleaned up by BASIS job: SAP_REORG_BATCHINPUT

This job contains a step which runs abap RSBDCREO which reorganises the SBDC sessions for a particular client. This clears out any sessions that have been fully processed and leaves the ones that still have processing requirements on them. The current job in E4P was pointing to client 098 instead of 420 so client 420 was not being cleaned up and the file was just getting bigger. The changes made were to change the variant to point at client 420 and change the retention period for SBDC sessions from 7 days to 3 days.
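The disk-usage lookups in item 3 can be wrapped in two small shell helpers. This is a minimal sketch, not an existing script: biggest_dirs and biggest_files are hypothetical names, and it assumes a du that accepts -k and -x and user/group names without spaces in the ls -l output.

```shell
# Sketch: list the largest immediate subdirectories of a directory, biggest first.
# biggest_dirs is a hypothetical helper name; sizes are reported in KB.
biggest_dirs() {
    dir=${1:-.}
    count=${2:-10}
    # -s summarises each subdirectory, -k reports KB, -x stays on the same mount point.
    for d in "$dir"/*/; do
        [ -d "$d" ] && du -skx "$d"
    done | sort -rn | head -n "$count"
}

# Sketch: list the largest regular files directly inside a directory, biggest first.
# biggest_files is a hypothetical helper name.
biggest_files() {
    dir=${1:-.}
    count=${2:-10}
    # Keep only regular files (lines starting with "-"); column 5 of ls -l is the size in bytes.
    ls -l "$dir" | grep '^-' | sort -k5,5nr | head -n "$count"
}
```

For example, biggest_dirs /sapmnt/E4Q 5 would show the five largest subdirectories, and biggest_files on the directory it points to would show whether the BI* batch input logs dominate.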

4. What is the IP of the Unix server?

a) nslookup <SID> : Will use DNS
b) /etc/rc.config.d/netconf
c) /etc/hosts
d) lanscan (get lan0); ifconfig lan0

5. What do the svr* jobs scheduled in EMEA systems do?

a) svxrj001 job performs a database system check. Check if database is up and running.

b) svxrj010 job performs a sapdba -next. This checks if there are no problems with the max.extents.

6. Setting up a link in Unix

ln -s <source> <new file>
ln -s /var/opt/gsss/data/ANR /var/opt/gsss/data/anr

The new file must not exist beforehand.
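The requirement that the link name must not already exist can be enforced before calling ln. If the name already exists as a directory, ln -s creates the link inside that directory instead. The following is a minimal sketch; safe_link is a hypothetical helper name, not an existing script.

```shell
# Sketch: create a symbolic link only when the link name is not already in use.
# safe_link is a hypothetical helper name.
safe_link() {
    source_path=$1
    link_name=$2
    # -e is true for existing files and directories; -L also catches a dangling link.
    if [ -e "$link_name" ] || [ -L "$link_name" ]; then
        echo "refusing: $link_name already exists" >&2
        return 1
    fi
    ln -s "$source_path" "$link_name"
}
```

Usage would look like: safe_link /var/opt/gsss/data/ANR /var/opt/gsss/data/anr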

AIM COMMON ERRORS & SOLUTIONS of 30document.doc last saved 4/4/2006 05:53:00 PM

7. MERCATOR_EXEL TCP/IP connection on L6A gets timeout error

Remsh problem between bdhs and L6A for l6aadm. Make sure grace logins don’t expire and that remsh is possible.

8. To see memory of server

glance \ F3

9. To see how many inodes a filesystem is occupying

df -i or bdf -i


ORACLE

ERROR MSG/SYMPTOM POSSIBLE CAUSE & SUGGESTED SOLUTION PREVENTIVE ACTION

1. DB error

oerr ora <error #>

If error not found, it could imply Oracle internal error. Ask user to contact Oracle for advice

Check /oracle/<sid>/background/alert_<sid>.log for alert log file.

2. Switch tablespace from backup mode to normal mode

1. sesu - orasid
2. svrmgrl
3. connect internal;
4. select * from V$BACKUP;
5. alter tablespace <tablespacename> end backup;

Alternatively, log in, go to /opt/nb/bin and as root run:

# ./backoff_backup.sh

This will automatically take all tablespaces out of backup mode.
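When several tablespaces are stuck in backup mode, the statements for the manual route can be generated instead of typed one by one. This is a minimal sketch under stated assumptions: backup_mode_query and end_backup_sql are hypothetical helper names, the tablespace names in the usage line are examples only, and the query assumes the standard V$BACKUP (FILE#, STATUS) and DBA_DATA_FILES (FILE_ID, TABLESPACE_NAME) views. It is not a description of what backoff_backup.sh does internally.

```shell
# Sketch: print a query that lists the tablespaces with datafiles still in backup mode.
# backup_mode_query is a hypothetical helper name.
backup_mode_query() {
    cat <<'EOF'
select distinct d.tablespace_name
  from v$backup b, dba_data_files d
 where b.file# = d.file_id
   and b.status = 'ACTIVE';
EOF
}

# Sketch: print one "end backup" statement per tablespace name given as an argument.
# end_backup_sql is a hypothetical helper name.
end_backup_sql() {
    for ts in "$@"; do
        printf 'alter tablespace %s end backup;\n' "$ts"
    done
}
```

For example, end_backup_sql PSAPBTABD PSAPBTABI > /tmp/end_backup.sql writes a script that can then be run after connect internal.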


3. Moving datafile without stopping SAP

When the target and source are different:

1. Make a copy of the datafile from the source to the target.

cp /oracle/SID/sapdata20/btabi_1/btabi.data1 /oracle/SID/sapdata10/btabi_1/btabi.data1

Note: You will have to create the directory [e.g. btabi_1] on the target and change its ownership to orasid:dba. Once you have copied the file over, you will also have to change the ownership of the file itself [here btabi.data1] to orasid:dba.

2. su - orasid
3. svrmgrl; connect internal
4. alter tablespace psapbtabi offline;
5. alter tablespace psapbtabi rename datafile '/oracle/SID/sapdata20/btabi_1/btabi.data1' to '/oracle/SID/sapdata10/btabi_1/btabi.data1';
6. alter tablespace psapbtabi online;

Step 5 may generate the error:

ORA-01113: file n needs media recovery

This is likely to happen if some database activity occurs on the tablespace that has been taken offline while this change is being made.

If you encounter this error, do the following [after svrmgrl and connect internal]:

1. recover datafile '/oracle/SID/sapdata20/btabi_1/btabi.data1';

When the target and source are the same:

1. su - oraas1
2. svrmgrl
3. SVRMGR> connect internal
4. SVRMGR> alter database datafile '/oracle/AS1/sapdata25/btabi_44/btabi.data44' offline;
5. Copy the file to a temp directory such as /oracle/sid/sapreorg or /var/adm/crash.
6. Mount the directory /oracle/AS1/sapdata25 with the correct ownership.
7. Copy the file from the temp directory to the mounted filesystem.
8. SVRMGR> ALTER DATABASE RECOVER DATAFILE '/oracle/AS1/sapdata25/btabi_44/btabi.data44';
9. The following error may be received:

ORA-00279: change 143381204 generated at 09/27/00 17:20:00 needed for thread 1
ORA-00289: suggestion : /oracle/AS1/saparch/AS1arch1_81136.dbf
ORA-00280: change 143381204 for thread 1 is in sequence #81136

To solve:

10. SVRMGR> ALTER DATABASE RECOVER LOGFILE '/oracle/AS1/saparch/AS1arch1_81136.dbf';
11. SVRMGR> alter database datafile '/oracle/AS1/sapdata25/btabi_44/btabi.data44' online;


4. Create a large rollback tablespace

1. svrmgrl; connect internal
2. CREATE TABLESPACE PSAPROLLBIG DATAFILE '/oracle/<SID>/sapdata<#>/rollbig_1/rollbig.data1' SIZE 400M DEFAULT STORAGE (INITIAL 64M NEXT 16M PCTINCREASE 0);

(The path '.../rollbig_1' must be created first, if necessary)

5. Create a rollback segment

CREATE ROLLBACK SEGMENT PRS_BIG TABLESPACE PSAPROLLBIG STORAGE (INITIAL 64M NEXT 16M MAXEXTENTS 300 OPTIMAL 512M);

6. Activate PRS_BIG and deactivate the old rollback segments

> alter rollback segment prs_1 offline;
...
> alter rollback segment prs_10 offline;
> alter rollback segment prs_big online;

To make this permanent:

1. Insert the line "rollback_segments=(PRS_BIG)" in init<SID>.
2. Deactivate the line "rollback_segments=(PRS_1,...)" with #.
3. Restart the DB.


7. Maximum number of datafiles has been reached for a particular tablespace. Cannot add more datafiles.

The db_files parameter is there in two places, one in init.ora file (which is easy to change) and the other in controlfile, which can be changed only by rebuilding the controlfile.

TEMPORARY FIX

If the tablespace is filling up urgently, the size of an Oracle datafile can be increased online without a restart:

alter database datafile '<datafilename>' resize <new size>;

PERMANENT FIX (downtime needed)

The db_files parameter needs to be changed in init<sid>.ora and in the controlfile.

1. Change db_files in /oracle/<SID>/dbs/init<sid>.ora to a higher number and reboot.
2. Change the controlfile:
- svrmgrl
- alter database backup controlfile to trace
- Get the trace file from /oracle/<SID>/saptrace/usertrace/ora*.trc
- Remove the first few statements. Retain from “STARTUP MOUNT” onwards.
- Change the parameter MAXDATAFILES to a higher number.
- Run this as an SQL script in svrmgrl (i.e. @ora_26824.trc)

8. ORA-12203 : TNS:unable to connect to destination;

Work processes completed but unable to restart. They can be restarted manually without any problems

Note 131561

Many work processes tried to connect to the listener at the same time, but the listener was not able to process all the requests. Modify listener.ora to contain the parameter QUEUESIZE=50 and restart the listener.


9. ORA-03113 end-of-file on communication channel

This occurs when you use svrmgrl even just to start the database. Note that it generally happens only on a new system, or on a system where Unix patches or Oracle upgrades have been applied.

There is no straightforward solution to this; you have to try the following to narrow it down. The types of cause to look for:

Wrong permissions for oracle binaries:

/oracle/<SID>/bin must have (6751 or -rwsr-x--x).
ora<sid> must have dba group.

(This was the cause for oragba sid)

Oracle might have stopped incorrectly. Check the Oracle processes and shared memory (e.g. by using ipcs).

Clean these up (kill and ipcrm).

HP-UX 11: SHMMAX must be > 512 MB (note that for 32-bit it must be less than 4 GB).

Svrmgrl and sqlplus may need relinking (were there any Unix patches?).

Various other errors:

(Trouble with online JFS, etc.) Get more information (from the alert log etc.) and get help from Oracle/SAP.



SAP GENERAL

ERROR MSG/SYMPTOM POSSIBLE CAUSE & SUGGESTED SOLUTION PREVENTIVE ACTION

1. Slow response time

Network communication team to check whether the network is carrying heavy traffic.

ST06 to check CPU utilization.
SM04 to check all users logged on to the system.

ST03 -> Performance DB -> Time Profile to check system average response time, wait time etc.
ST03 -> Performance DB -> Top Time to check for long-running programs.

SM21 to check system logs.
ST22 to check for short dumps. Use ‘Selection’ to specify more detailed information.

SM66 to check programs run by users. ‘Sort’ to identify long-running jobs.

SM37 to check for job details, errors etc.

Contact users to monitor/cancel their jobs if the jobs are hogging the system.

Basis could cancel these jobs only when necessary.

If forced to kill job in unix, PID in SM66 is the app server PID (client).

To get the oracle process, use ST04-> Detail Analysis-> Oracle session and get the PID of oracle that corresponds to client PID in app server.

System performance data to be reviewed.

Get in touch with user and make the following preparation prior to any long running batch job:

1. Clean up pooldisk.
2. Lower the /oracle/sid/saparch threshold so that DM gets notified earlier.
3. Lower the saparch_guard.cfg threshold so that saparch kicks in earlier.


2. saplicense

>saplicense -show
to see all licenses installed.

>saplicense -get
to get the license used.

>saplicense -install
to install a new license. The old license must be deleted before the new one is installed.

>saplicense -temp
to install a temp license.

Each SAP system comes with a 30-day temp license. If a license expires, we can use the temp license until the new license is received from SAP.

To request for a new license, first obtain the Installation Number of the system.

1) Go to the web site www003.sap-ag.de
2) Log in with your OSS id/passwd
3) Click on ‘License Keys’ under ‘Modifications’
4) Click ‘Request License Key’ under ‘Application’
5) Click the installation no. under which you want to apply for the license
6) Click ‘New System’ under System Overview

Fill in all the info required, e.g.
System ID : ALK
System type : Development System
Software product : SAP R/3
SW product release : 45B
Basis release : 45B
Database : Oracle
Operating system : HP-UX
Priority : Medium
Hardware Key : D12390003
Desired time limitation : 31/12/9999
Reason of request : New installation

Click ‘New’ near Overview List.
Click ‘Submit’.
You can always track the status of the application by choosing ‘License Requested by me’.

After you get the license from the web, apply it with the following steps:

Log in to the SAP system.
sesu - sidadm
saplicense -install
Fill in all the information as listed on the web.

>saplicense -show
to see all licenses installed.

>saplicense -get
to get the license used.

Check that the license used has expiry date 31/12/9999.

3. pgsapinfo time out on connection to host

This could mean that the server is down, or it could be a false alarm as follows:

There are a lot of RFC connections queued up on the server, which prevents sapheart from logging in. This is sometimes due to a large download of master data or IDocs from another server. Usually, users are not able to log in during this period. The backlog is usually cleared in a few minutes.

If it happens often, we can look at increasing the timeout value or suppressing the page for a timeout error.

Note: /momauto/managers/evtmgt/bin/sapheart/run.sh calls the pgsapinfo C program.

4. st03 does not show updated performance data.

Check to see if RSCOLL00 (collector for performance monitor) is scheduled to run every hour via sm37. If not, ask Basis to schedule the job

5. st03 contains invalid/non-active servers that were carried over during DBCopy from the source system

These servers do not do any harm. To clean them up, do the following: st03 -> Goto -> Performance Database -> Contents of database -> Delete 1 server. Enter the name of the inactive server. This will delete its data from the MONI table.


6. Cannot ftp or ping sapserv4 (204.79.199.2) from any of the Asia SAP servers

1. All UNIX boxes configured to access sapserv4/sapserv3 [OSS/FTP] must be specifically authorized by SAP [at TCP/IP level, not at hostname level]. Normally one would open a ticket with SAP to add a VALID UNIX box. Surely, the SAP router bdhifddi can do this.

Complete list of servers registered in SAP: etsp0021, bdhi, bdhifddi, bdhk, bdhy, bhhc, blhd, blhp, sihp8008, testframe.

Possible cause of problem: IP has changed and SAP has not been updated. Open message to SAP to correct the problem

2. Run "traceroute 204.79.199.2" from the Asia SAP server. If it always stops at swcfw.gw.na.pg.com (192.44.167.2), which is the firewall router, and reports that maxttl expired before reaching sapserv4.na.pg.com (204.79.199.2):

Forward the ticket to G.IPIA to add a permit statement between the IP of the SAP server (e.g. 155.124.118.25) and sapserv4 (204.79.199.2) on both the inbound and outbound access control lists.

3. To connect to OSS, the saprouter software must be running (started as <sid>adm or root with: saprouter -r &). It is running on bdhi. Ftp works without the saprouter software.

7. DBIF_RSQL_NO_MEMORY error on table CDPOS in ANR 4*0. There is a 7.5MB cache limit

Note 108328 recommends setting the instance profile parameter rsdb/rclu/cachelimit=0, and setting the environment variable RCLU_CACHELIMIT=0 for R3trans and tp. Related note: 141465.


8. SAP-FI DBIF_RSQL_SQL_ERROR in AS1 client 525

See the Oracle instructions on how to do this:

1. Offline the small segments (i.e. PRS_1 to PRS_9).
2. Create the RBIG segment in PSAPROLLBIG and online RBIG in AS1.

9. Stat file keeps on growing even after manual deletion

OSS 6833

1) Set Max. number of records per cumulated call to 10,000. The standard value (used on almost all our other systems) is 20,000, but this seems to be too much and causes timeouts.

2) stat/timeout = 600

3) Schedule SAP_COLLECTOR_FOR_PERFMONITOR to run more frequently.

4) Check the version of saposcol (saposcol -v). Different OS versions have to run different versions of saposcol. /usr/sap/SID/SYS/exe/run/saposcol


10. Connection to partner broken. SAP-Gateway on host sihp8042.ap.pg.com. Module niuxi.c. NI component. Error message during test connection of TCP/IP via sm59

RFC-TCPIP connection from app server sihp8034 to db server sihp8042 was not working. Initial symptom appeared when Thailand is not able to transfer ASN from A6P to the interface server sihp8018. This was working in A6A, but not in A6P. Note that the TCPIP connection from another app server sihp8035 to the db server is functioning well.

Cause : Codepage settings of sihp8034 (app) is different from sihp8042 (db) and sihp8035 (app). All servers that house A5*, A6*, A7* have codepage set to 1100 (default Latin-1 or ISO8859-1). But sihp8034 was set to 1610 (ISO8859-9).

Solution:

1) Change the Unix environment variables of a6padm on sihp8034. The environment variables can be found in /home/a6padm/.sapenv_sihp8034.csh and .sh.

SAP_CODEPAGE = 1100, or don't set the variable at all so that it takes the default 1100.
PATH_TO_CODEPAGE -> deleted this variable since it is not used.

2) Changed profile parameters of the 3 servers to use the en_US.iso88591 locale instead of tr_TR.iso88599 in parameter abap/locale_ctype.

3) Rebounced A6P app and db servers

** Only #1 and #3 are necessary to solve the problem; however, for consistency purposes, we also did the changes on the profile. en_US.iso88591 is the locale that corresponds to the codepage 1100.

To find out if this is the cause of the problem, run RSCP0001 / RSCP0018 via se38.

11. Changes / Transports to the database server are not seen from the application server

Modify default.pfl parameter to contain:

rdisp/bufrefmode=sendon,exeauto

12. /usr/sap/trans is full

Run the scripts:

1) Go to /usr/sap/trans/data
2) /usr/sap/trans/scripts/compandmove <# of days> /usr/sap/trans/archive/datafiles


SAP PRESENTATION

SAP GUI

ERROR MSG/SYMPTOM POSSIBLE CAUSE & SUGGESTED SOLUTION PREVENTIVE ACTION

1. Logon Load Balancing Error

- SAP system may be down for maintenance. Check the change tickets/c4415 messages for planned changes.
- User has manually entered the picklist entry and the server has changed. Ask user to get LCC to reinstall picklist.
- Check the services file in C:\win95 for sapmsXXX and sapdpXXX entries. Ask user to get LCC to install the latest services file from GWR.
- If none of the above is the cause, ask user to get LCC to reinstall picklist.
- Services file may have been overwritten by an ESD job. Ask Helpdesk to re-install the services file or re-install the picklist.

Broadcast planned system outages. Emphasize to users not to make manual changes to the picklist.

2. Application Helpfiles not available The Helpfile location parameters have not been set up. Set up the following parameters in the default profile

eu/iwb/path_win32 = global/simpstd/plainhtm/helpdata/
eu/iwb/server_win32 = intraprod7.internal.pg.com

and then rebounce the system.

This parameter change has been effected on nearly all the systems already so issue should not arise

3. Other applications conflict with SAP GUI. Users load non-SEWP’d applications on their workstation which conflict with SAP GUI. SEWP C&I the applications.

Users should not use non-SEWP’d applications

4. Error msg with SAP GUI’s that were installed via CD

NO SOLUTION as SAP GUI’s installed via CD are NOT supported by AIM or Basis

Users should not install SAP GUI via CD


SAP PRINTING

ERROR MSG/SYMPTOM POSSIBLE CAUSE & SUGGESTED SOLUTION PREVENTIVE ACTION

1. SAP spool indicates that print job has successfully completed but no print out.

DIAGNOSIS

Try sending output from UNIX via lp -d"<printername>" <filename>
Use lpstat <unix printer name> or lpstat -c <request id> to see the status of your printout.

If it is just queuing up and not printing, then it is a unix spooler problem

Job stuck at Unix level. Use the printers command to check that the printer status is ENABLED, the print queue is spelt correctly and the status check is o.k. If DISABLE then ENABLE printer. If print queue spelt incorrectly, delete printer and add new printer. If status check is not o.k., take corrective action. E.g add Unix host to UXP or ask LCC to restart the UXP software

Job stuck at Unix level. Stop/start the unix lp spooler.

2. Spool table TSP01 reaches limit of 32000

Run the RSPO0041 job with transaction SE38 with the following variant: client: *, the "delete all print jobs with min age" option checked, and Commit parameter: 1000. The table size can be checked with the following SQL statement in svrmgrl:

select count(*) from sapr3.tsp01;

Basis to check batch job logs

3. Print jobs cannot be printed out. Printer configured incorrectly at SAP or Unix level or both. Reconfigure printer.

Follow proper procedures for printer configuration.

4. Print job at SAP & Unix level is completed but printer does not print it out. Cannot print from other windows applications like Microsoft Word and Excel.

Netware or local site infrastructure down. Forward issue to local site to handle.


5. SAP Spool work process hung

SM51 shows SPOOL process running the RSPOWP00 and CPU time is very long for this work process

Large print job submitted with many other print jobs waiting. Ask user to delete the print job and run it as two or more smaller print jobs.

Too many print jobs created. Ask user to delete some print jobs.

Large print job stuck at Unix level, verified using lpstat. Perform lpshut and then lpsched to stop/start the Unix print spooler.

6. SAP Spool work process hung

SM51 shows SPOOL process running the RSPOWP00 and CPU time is very long for this work process

Large print job submitted with many other print jobs waiting. Ask user to delete the print job and run it as two or more smaller print jobs.

Too many print jobs created. Ask user to delete some print jobs.

Large print job stuck at Unix level, verified using lpstat. Perform lpshut and then lpsched to stop/start the Unix print spooler.

If the system responds with scheduler already running:
- cd /usr/spool/lp
- ls
- rm SCHEDLOCK file if present and stop/restart the scheduler.

7. SAP ‘Output Status’ reports ‘error’

/usr/bin/lp: destination xxx non-existent

Printer inadvertently deleted. Send SC back to customer requesting original info and recreate (use SSC#253)

Printer created on SAP but not UNIX. Create on UNIX using printer script and retest at UNIX and SAP level.

Ensure complete end-to-end setup and testing has been performed for new printer requests.

Ensure printer deletion CBA is followed correctly.

8. SAP ‘Output Status’ reports ‘wait’

Spool temp files get corrupted. Use the UNIX printer script to check printer and spooler status, then:

- Perform ps -ef | grep on lp and note the printer that is hung.
- Shut down the scheduler via lpshut.
- Kill any remaining lp processes.
- cd /var/opt/spool/lp/requests to view what’s in the temp queue and remove it.
- Restart lpsched.

- Delete /usr/sap/L71/DVEBMGS00/data/SP*
- Clean up stats file via st03
- Clean up work process trace files via sm50
- Recycle spooler


9. EMEA Printing Support

EMEA Printing Servers: etsp0033; etsp3005

remsh etsp0033 -l prtmgr "lpstat -p ETC920R"

smitty

10. No printing possible. SM21 system logs show the following errors:

cannot open /usr/sap/<SID>/DVEBMGS00/data/stat
cannot open /usr/sap/<SID>/DVEBMGS00/data/SPxxxxB

The above errors appear even though the filesystem is not 100% utilized.

2 possible solutions:

1. Reorg the TemSe database using transaction SP12: TemSe Database -> Reorganisation, then delete obsolete requests.
2. Remove all SPxxxxB files from the Unix directory /usr/sap/<SID>/DVEBMGS00/data

11. Overflow of spool database

1. Delete / reorg TemSe via sp12.
2. Check the variant of the SAP_REORG_SPOOL housekeeping job.


SAP BACKUPS

Note: To get a better idea of the reason for a failure, the following 3 steps are highly recommended:

1. Look into the logs and search for the first occurrence of any error message instead of looking at the tail of the logs. Very little information is conveyed, and hardly anything can be deduced, from the brbackup/brarchive return code at the end of the logs. Most of the following data relates to the first error message that comes up in the logs.

2. Note that it is the first error message that really counts; all the errors that crop up after it are usually a consequence of the first error.

3. Note that the /var/opt/ctma/ctm/sysout logs are much more accurate and detailed than the logs seen in SAP GUI or even in the /saparch or /sapbackup directories. So if you cannot see the error in the usual places, look at the sysout logs. [Go to the directory and run $ ll | grep [jobname that failed] to see the logs.]

ERROR MSG/SYMPTOM POSSIBLE CAUSE & SUGGESTED SOLUTION PREVENTIVE ACTION

1. RC=1 Backup ended successfully with warnings

Just a warning message with no critical implications.

Look at the logs and see the messages that were flashed. Let the system owners know if you see something particularly abnormal. Make sure you check the /var/opt/ctm/ctma/sysout file to verify that nothing has been missed.

2. sihp8040. sapg900_cleanup_pool.sh failed with rc=1

2 possible reasons (check the logs under /var/opt/ctm/ctma/sysout : ll | grep 900):

1. Mostly because there are no logs that qualify to be deleted; this is not an issue at all.
2. Some error really happened during the cleanup process.

Action for each issue:

1. Close the ticket with no action.
2. Fire the cleanup job again under /var/opt/sapg/scripts:

# ./sapg900_cleanup_pool.sh +3 var/opt/saparch/prod
# ./sapg900_cleanup_pool.sh +10 var/opt/saparch/dvlp


3. Tablespace XXXX is already in backup mode

Last backup ended abruptly or was killed.

Two ways to handle this:

1. Manually end backup mode via Oracle commands:

sesu - orasid
svrmgrl
connect internal;
select * from V$BACKUP;
alter tablespace <tablespacename> end backup;

2. Fire script /opt/nb/bin/backoff_backup.sh <SID>

4. lock.bra exists

Last backup ended abruptly or was killed.

1. Remove the file from /oracle/SID/sapbackup and rerun.
2. Under /opt/nb/bin fire the script ./removelock.sh SID

5. pgbrclean.sh not found

The script is missing from the /sapmnt/SID/exe directory.

Copy over the script from the /opt/nb/bin directory and in case it is missing from there, ftp it from another source to /bin and copy it to /etc. The backup would normally be successful, so don’t rerun the backup, just fire the script as ora<sid>

6. backint execute permission denied

Problem with the permissions of backint. Usually the sticky bit would be missing.

Check the permissions for the backint file under /usr/openv/netbackup/bin.

If it is different from -rwsr-sr-t redirect to regional UIT to change the permissions.
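The permission comparison can be scripted so that the result is unambiguous before the ticket is redirected. This is a minimal sketch; check_mode is a hypothetical helper name, and it only compares the first ten characters of the ls -ld output with the expected mode string.

```shell
# Sketch: compare a file's mode string (first 10 characters of ls -ld) with an expected value.
# check_mode is a hypothetical helper name.
check_mode() {
    file=$1
    expected=$2
    # cut keeps only the type and permission characters, e.g. -rwsr-sr-t
    actual=$(ls -ld "$file" | cut -c1-10)
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file is $actual"
    else
        echo "MISMATCH: $file is $actual, expected $expected" >&2
        return 1
    fi
}
```

Usage would look like: check_mode /usr/openv/netbackup/bin/backint -rwsr-sr-t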

7. Specified schedule does not exist in the specified class

The parmecs file would have been refreshed to the incorrect parameters

Under /var/opt/ubkp/parmecs change the schedule to w_8w_e or whatever value was defined in netbackup

Work with UIT to make sure correct parmecs file gets refreshed.


8. Two errors while executing a backup may say the following:

- invalid username/password: logon denied
  Connect to database instance SID failed
- ORA-00942: table or view does not exist
  BR303E Determination of Oracle RDBMS version failed

Auto-secure problem.
Sapdba_role does not exist.

Define the sapdba role for the OPS$SIDADM account [by running sapdba_role.sql]:

1. $ sesu - ora<sid> [Switch user to ora<sid>]
2. $ cdexe [Changes to directory /usr/sap/<SID>/SYS/exe/run]
3. Execute sapdba_role.sql with the following parameters: $ sqlplus internal @sapdba_role <SID> UNIX

Create user OPS$ORA<SID> to enable ‘ora<sid>’ to perform normal sapdba work using $ sapdba -u / [This is the only way we need to run SAPDBA in a secured environment]:

1. # sesu - ora<sid> [Change user to ora<sid>]
2. $ sqlplus system/password [Connect to database using sqlnet]
3. SQL> @/tmp_mnt/orasid.sql [Run orasid.sql and give SID name]


9. ORA-00942: table or view does not exist
BR303E Determination of Oracle RDBMS version failed

POSSIBLE CAUSES:

1) Auto-secure problem
2) The SAPDBA, CONNECT, RESOURCE roles are not there for OPS$<SID>ADM. To verify whether this is so:

svrmgrl
connect internal
select * from dba_role_privs;

Check what roles have been granted to OPS$<SID>ADM.

SOLUTION:

1) Ticket to UIT
2) Grant the correct roles to the OPS$ user:

svrmgrl
connect internal
grant sapdba to OPS$<SID>ADM;
grant connect to OPS$<SID>ADM;
grant resource to OPS$<SID>ADM;

10. 903 job failure with error:

Internal error for /oracle/L01/saparch/L01arch1_54854.db…

This was due to insufficient dbdump space. The backup was successful in handling the sapdatas, but not the archives, since the dbdumps are all very full. Alan Randall recommended adding two 35G dbdumps (dbdump11 and dbdump12).

# default: $ORACLE_HOME/sapbackup
backup_root_dir = (?/dbdump0,?/dbdump1,?/dbdump2,?/dbdump3,?/dbdump4,?/dbdump5,?/dbdump6,?/dbdump7,?/dbdump8,?/dbdump9,?/dbdump10,?/sapdata92,?/sapdata94,?/sapdata95,?/sapdata96)


12. /var/opt/saparch in pooldisk is 100% full

1) Attempt to delete using

/var/opt/sapg/scripts/sapg900_cleanup_pool.sh +x /var/opt/saparch

2) Run backup of pooldisk

/var/opt/ubkp/scripts/ubkp049_varopt_genfile.sh generates the listings of files to be backed up in /var/opt directory

/var/opt/ubkp/scripts/ubkp050_varopt.sh <CLASS> <SCHED> <HOST> <FILE> backs up the files

/var/opt/ubkp/scripts/ubkp050_varopt.sh sihp8040 d_2w_e sihp8040 ubkp050_varopt


Following is the list of errors for which you should just forward the ticket to the resolution owner:
[Note: The status code is the backint system status, not the brbackup/brarchive return code]

STATUS  MESSAGE                                                               ACTION
13      file read failed                                                      1/ netbackup problem --> UIT
23      socket read failed                                                    -> UIT
25      cannot connect on socket                                              1/ rerun 2/ send to UIT
40      network connection broken                                             1/ rerun 2/ send to UIT
41      network connection timed out                                          1/ rerun 2/ send to UIT
49      client did not start                                                  1/ configuration error or autosecure problem, send to UIT
52      timed out waiting for media manager to mount volume                   1/ rerun 2/ send to UIT
84      media write error                                                     1/ remove tape 2/ rerun job
85      media read error                                                      1/ rerun job 2/ send to UIT
227     no entity was found                                                   1/ setup problem, send to UIT
230     the specified class does not exist in the configuration database      1/ setup problem, send to UIT
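The status table above can be turned into a small lookup for use in ticket handling scripts. This is a minimal sketch; backint_action is a hypothetical helper name, and the wording of each action is taken from the table rather than from any NetBackup tool.

```shell
# Sketch: map a backint status code from the table above to the documented action.
# backint_action is a hypothetical helper name.
backint_action() {
    case "$1" in
        13)             echo "netbackup problem, send to UIT" ;;
        23)             echo "send to UIT" ;;
        25|40|41|52|85) echo "rerun, then send to UIT" ;;
        49)             echo "configuration error or autosecure problem, send to UIT" ;;
        84)             echo "remove tape, then rerun job" ;;
        227|230)        echo "setup problem, send to UIT" ;;
        *)              echo "status $1 is not in the table, check the logs" >&2
                        return 1 ;;
    esac
}
```

For example, backint_action 84 prints the tape action, while an unlisted code returns a non-zero exit status so that a calling script can fall back to manual analysis.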


SAP SECURITY

ERROR MSG/SYMPTOM POSSIBLE CAUSE & SUGGESTED SOLUTION PREVENTIVE ACTION

1. Symptom 1: R3trans -d ; /usr/sap/trans/bin tp connect <SID> gives error message: “2EETW169 no connect possible: DBMS = ORACLE tns_names_ora = ‘<SID>’”. Log shows: sql error 1017

Symptom 2: Joining <sid>adm to DBA group solves problem at UNIX level, but does not solve problem at SAP level. Tp check from within R/3 fails (STMS -> Overview -> Systems -> Select SID -> R/3 System -> Check -> Transport Tool)

Listener.ora listens to incoming IPC connections only. But the client tries to connect using TCP/IP as defined in tnsnames.ora.

1) Modify /oracle/<SID>/listener.ora to contain only (ADDRESS = (PROTOCOL = TCP)(HOST = sihp8029)(PORT = 1527)) in the address list.

2) Stop and Start listener (lsnrctl stop; lsnrctl start)

* This can be done online. Stopping listener prevents new incoming connections (new transports,..), but all connections started by processes when SAP was restarted still can connect to the database even when listener is not up.

EXPLANATION

Taking A5P as an example:

The listener.ora configuration has three settings:

1) IPC with KEY A5P.world
2) IPC with KEY A5P
3) TCP/IP at port 1527 - Community sap.world

SQLNET.ORA has AUTOMATIC_IPC = ON [This means IPC is preferred to TCP/IP if available]. If the Oracle client is on the same box, the client can connect through IPC. But this will not help for an application server.

But the tnsnames.ora file tells the Oracle client to connect at TCP/IP port 1527 - sap.world (option 3).

This is why tnsping reports that it connects to TCP port 1527. (tnsping A5P)

But listener status tells you only IPC because the listener program has to connect to one of the listener option (here it runs down the list so option 1 is IPC with KEY A5P.world) to give you the status, even though it listens in all the above options. But when you start listener it would


ERROR MSG/SYMPTOM POSSIBLE CAUSE & SUGGESTED SOLUTION PREVENTIVE ACTIONtell you that it is listening on all three options defined in the listener.ora file.

As a standard, I suggest defining only one protocol (TCP/IP on port 1527) in listener.ora and specifying the same in tnsnames.ora. Do not provide all 3 options in listener.ora.

The problem happens when listener.ora no longer supports the TCP/IP protocol. The client then cannot connect, because it connects via TCP/IP.
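The check for a TCP/IP entry in listener.ora can be sketched as a small shell function. This is only a sketch under stated assumptions: the function name listener_has_tcp is hypothetical, and it assumes each ADDRESS entry is written on a single line, as in step 1 of the solution.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given listener.ora style file has an
# ADDRESS entry with PROTOCOL = TCP and the given port.
# Assumption: each ADDRESS entry sits on one line.
listener_has_tcp() {
    file=$1
    port=$2
    # Skip comment lines, then strip whitespace and upper-case the text so
    # that "Port= 1527" and "PORT=1527" are treated the same.
    grep -v '^[[:space:]]*#' "$file" |
        tr -d ' \t' | tr 'a-z' 'A-Z' |
        grep 'PROTOCOL=TCP' | grep -q "(PORT=$port)"
}
```

A possible call would be: listener_has_tcp /oracle/<SID>/listener.ora 1527 || echo "no TCP/IP address on port 1527". The function only reads the file; it does not change it or restart the listener.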

2. SourceOne is not able to receive data from A5P (sihp8040).

When security rules were enforced, the “cdirect” id was no longer able to surrogate to the sapsys group.

UIT included the CoDi program (/opt/cdirect/cdunix/ndm/bin/ndmsmgr) in the Autosecure config file privpgms.init, which means this program is trusted to switch user id while it is running. Consequently, specific rules in the Autosecure database are no longer needed.

3. Backup returns error when autosecure rules were put in forced mode

/usr/openv/netbackup/bin/backint should be owned by root:sys and have permissions 7755.

Forward ticket to UIT to do this change: “chmod 7755 /usr/openv/netbackup/bin/backint”
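Before forwarding the ticket, the current owner and mode of backint can be compared with the expected values. The following is a minimal sketch, not an official check: the function name check_backint_line is hypothetical, and it works on one line of "ls -l" output. It relies on the fact that ls shows mode 7755 on a regular file as -rwsr-sr-t.

```shell
#!/bin/sh
# Hypothetical helper (not part of NetBackup): check one "ls -l" style
# line against the owner, group and mode expected for backint.
# Mode 7755 on a regular file is shown by ls as -rwsr-sr-t
# (setuid, setgid and sticky bits on top of rwxr-xr-x).
check_backint_line() {
    # Word-split the line into fields: mode, links, owner, group, ...
    set -- $1
    mode=${1%[.+]}   # drop a trailing ACL/SELinux marker if ls prints one
    owner=$3
    group=$4
    [ "$mode" = "-rwsr-sr-t" ] && [ "$owner" = "root" ] && [ "$group" = "sys" ]
}
```

A possible call would be: check_backint_line "$(ls -ld /usr/openv/netbackup/bin/backint)" || echo "owner or mode differs, forward ticket to UIT".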

4. Backup failing due to SQL error -942

SAPDBA role has not been defined yet and <sid>adm is not granted SAPDBA role.

DIAGNOSIS:
1) sesu - ora<sid>
2) svrmgrl
3) connect internal
4) select * from dba_role_privs;
5) OPS$<sid>adm should have the CONNECT, RESOURCE and SAPDBA roles assigned. If the SAPDBA role is not assigned, do the following.

SOLUTION:
1) sesu - ora<sid>
2) cdexe
3) sqlplus internal @sapdba_role <SID> UNIX

Whenever chdbpass is executed and OPS$<sid>adm is recreated, you need to run the sapdba_role script afterwards:

1) sesu - ora<sid>
2) cdexe
3) sqlplus internal @sapdba_role <SID> UNIX
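The role check from the diagnosis of item 4 can be sketched as a filter over the query output. This is a sketch under assumptions: the function name missing_roles is hypothetical, and it expects rows on standard input whose first two whitespace-separated fields are GRANTEE and GRANTED_ROLE, as a dba_role_privs listing would show them.

```shell
#!/bin/sh
# Hypothetical helper: read "GRANTEE GRANTED_ROLE ..." rows on stdin and
# print each required role that the given grantee does not have.
# Required roles are taken from the diagnosis: CONNECT, RESOURCE, SAPDBA.
missing_roles() {
    grantee=$1
    rows=$(cat)   # keep the rows so they can be scanned once per role
    for role in CONNECT RESOURCE SAPDBA; do
        printf '%s\n' "$rows" |
            awk -v g="$grantee" -v r="$role" \
                '$1 == g && $2 == r { found = 1 } END { exit !found }' ||
            echo "$role"
    done
}
```

If the function prints SAPDBA, the sapdba_role script from the solution still has to be run. No output means all three roles are present for that grantee.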


5. <sid>adm can still do svrmgrl and connect to the database

<sid>adm should not be able to connect to the database. Only ora<sid> should be part of the DBA group.

DIAGNOSIS:
1) more /etc/group | grep dba
2) If you see users other than ora<sid>, request UIT to take them out of the DBA group.

6. Where are the new backup scripts and ITO monitoring scripts that work with Phase I security?

1) Backup scripts: (pgsapbackup.sh, pgsaparchive.sh, saparch_guard.sh, pgbrclean.sh).

bdhm:/var/spool/sw/applications/UBKP_SAPCTMA UBKP_SAPCTMA

2) Monitoring scripts: (thresh_ora.free, thresh_ora.ext, thresh_ora.frag, TableExtents.sql, FreeSpace.sql, TbleSpceFrag.sql)

bdhm:/var/opt/sapsoe_arch/depot MONITOR

Monitoring scripts work with secured and non-secured systems. Old backup scripts for non-secured systems are in: bdhm:/opt/soeg/bin/

7. zpas doesn’t work for some systems. zpas reports an error in the status. Tried to log in using the new password that was propagated by zpas but cannot log in.

Your user account has not been created in the system, or your account on that system is locked. zpas (password propagation) only works for those systems where your account exists. Open ITSM ticket SSC 985.

8. zpas propagates the new password to a system, but when you log in, the system asks you to change your password immediately. The effect is that your passwords across all systems are no longer synchronized.

The password you provided to zpas is not unique: it was one of the last 5 passwords you have already used.

Provide a unique password.


9. How to know which policy model a particular server is subscribed to?

AIM is using 2 policy models for hp:
1 - hp_sap_sec@bdhp4176 : This is the highly secured policy model for all Production systems
2 - hp_sap_std@bdhp4176 : This is the standard policy model for all non-production systems

To know which:
1) Go to: http://golx4001.na.pg.com/~markj/pmdbs/
2) 'grep pmd /usr/seos/seos.ini' will return several lines. Look for the line that says, "parent_pmd = ..." This will identify the policy model.
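The parent_pmd lookup can be sketched in shell as follows. The function names parent_pmd and policy_kind are hypothetical, and the sketch assumes the setting appears on its own line in the form "parent_pmd = value". The mapping to production and non-production uses the two policy model names listed above.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the parent_pmd setting from a
# seos.ini style file. Assumes lines of the form "parent_pmd = value".
parent_pmd() {
    sed -n 's/^[[:space:]]*parent_pmd[[:space:]]*=[[:space:]]*//p' "$1"
}

# Hypothetical helper: map a policy model name to the kind of system it
# is meant for, using the two models named in this document.
policy_kind() {
    case $1 in
        hp_sap_sec@*) echo "production (highly secured)" ;;
        hp_sap_std@*) echo "non-production (standard)" ;;
        *)            echo "unknown" ;;
    esac
}
```

A possible call would be: policy_kind "$(parent_pmd /usr/seos/seos.ini)".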

10. How to change password of system without knowing old password

Option 1: via SAPDBA

Option 2:
1) sesu - ora<sid>
2) svrmgrl
3) connect internal
4) alter user system identified by <new password>;

11. How to check the autosecure audit logs?

1. sesu to root
2. seaudit -sd 09-mar-2001 -a | grep "* W *"

12. Connect internal as <sid>adm asks for a password.

Startup and shutdown using “startsap” and “stopsap” hang during startup and shutdown of the database.

1) Modify /oracle/<sid>/rdbms/lib/config.c

# ifdef SEQ_PSX
#define SS_OPER_GRP "dba"
#else
#define SS_OPER_GRP "oper"
# endif /* SEQ_PSX */

char *ss_dba_grp[] = {SS_DBA_GRP, SS_OPER_GRP};

2) Stop listener. Stop SAP and Oracle.
3) Relink Oracle using the relink.sh script as ora<sid>. The relink script can be found in the sihp8044:/oracle/oraanq home directory.
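Before relinking, it may help to confirm which group names config.c currently defines. The following is only a sketch: the function name show_group_defines is hypothetical, and it simply lists the #define lines for SS_DBA_GRP and SS_OPER_GRP without judging which value is correct for a given system.

```shell
#!/bin/sh
# Hypothetical helper: print the #define lines for SS_DBA_GRP and
# SS_OPER_GRP found in the given config.c style file.
show_group_defines() {
    grep -E '^[[:space:]]*#[[:space:]]*define[[:space:]]+SS_(DBA|OPER)_GRP' "$1"
}
```

A possible call would be: show_group_defines /oracle/<sid>/rdbms/lib/config.c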

