portable sas : language and platform considerations robert a. cruz… ·  · 2009-09-25page 1 of...



Portable SAS®: Language and Platform Considerations
Robert A. Cruz, Info-Mation Systems, Hollister, CA

Portable SAS®: Language and Platform Considerations
Abstract
    Audience
O. Introduction
I. The Roots of Portability Issues
    I.A Platforms
    I.B What Do We Mean by “Portable” Software?
II. Platform Differences
    II.A Hardware Differences
    II.B Software Differences
III. Base Language Considerations
    III.A Instruction Set Considerations
    III.B Internal Memory Considerations
    III.C Character Set Considerations
    III.D Numeric Considerations
    III.E INFORMAT and OUTFORMAT Considerations
    III.F Macro Language Considerations
    III.G PROC Considerations
    III.H System, Statement, and PROC Options Considerations
IV. Operating System Considerations
    IV.A External Executables Considerations
    IV.B Other Platform-Specific Features
    IV.C Isolating System Dependencies
V. Creating Portable SAS Programs
    V.A The Three-Pronged Strategy to Create a Portable SAS Program
VI. Conclusion
Acknowledgements
Recommended Reading
Contact Information
Trademarks, Brand and Product Names
Appendix A: SAS Language Elements with Portability Considerations
Endnotes

ABSTRACT
Techniques for creating portable SAS programs will be discussed. Portable mainline code can be executed unchanged on multiple platforms. Requirements for Windows, Unix, and mainframe systems will be presented. Considerations include language features to use or avoid, and coding techniques to use or avoid. Issues to be dealt with include internal representation of characters and numbers.

Techniques for addressing the peculiarities of each platform will be presented. Windows, Unix, and mainframe platforms will be covered. Interfacing to sequential files and databases will be covered, as will considerations for system commands and sort utility. Automated platform identification and adaptation by macros will be covered.

Search Keywords: Portable SAS, Windows, Unix, MVS, CMS, PC, Server, Mainframe, Collation
Operating System: ALL
Applicable SAS Products: Base SAS

AUDIENCE
This presentation is of interest to Beginner, Intermediate, and Expert SAS users who must deal with portability issues.

O. INTRODUCTION
The objective of this paper is to familiarize the reader with those issues that impact the ability of a SAS program to function in the same manner when run on different computing platforms. Techniques for creating a portable program will be illustrated. These techniques include choices of base language elements, and an approach to structuring a SAS program to improve portability.

I. THE ROOTS OF PORTABILITY ISSUES

I.A PLATFORMS
A computing platform is the combination of hardware and software on which an application is executed. This combination identifies the environment in which the program will run. The platform may be identified by a particular industry term, such as “Wintel” (the Windows operating system running on an Intel-compatible processor), or by implication. For example, citing “CMS” as the platform implies an operating system in the VM/CMS family running on IBM System/370-family hardware. Note that these are historical references, as the current version of this operating system is z/VM and the current hardware is the IBM System z10.

Very often, the required environment must be specified in more detail, giving a minimum amount of memory, a processor class, a storage capacity, or some combination of these. A particular release of an operating system or other corequisite software is also frequently required, because the application exploits features available only in that version.

Examples of platforms are:
- 32-bit Windows running on an Intel-compatible CPU
- Windows running on an Alpha processor
- Linux running on an x86-class CPU
- Linux running on server-class hardware (it may also be necessary to specify which manufacturer’s hardware is in use)
- Linux running on an IBM mainframe (such as System z)
- MVS-family operating system (such as z/OS) running on an IBM mainframe (such as System z)
- CMS-family operating system (such as z/VM) running on an IBM mainframe (such as System z)

As you can see, neither the hardware nor the operating system alone determines the platform. In SAS documentation, the equivalent term for platform is operating environment. SAS publishes a series of “Companion” manuals, one for each platform [1].

I.B WHAT DO WE MEAN BY “PORTABLE” SOFTWARE?
The concept of “portability” in computing refers to the ability to move an application program from one computing platform to another without having to change it. Portable programs either do not involve platform aspects that differ from one platform to another, or take them into account.

Some languages were created with portability as a design objective. An early example of this was the “P-code” system used by Pascal language compilers in the early- to mid-1970s [2]. Chief among these today is Java, which runs in a virtual machine that isolates the Java program from its platform. These virtual machines are themselves not portable, but they enable the Java applications which run within them to be portable.

II. PLATFORM DIFFERENCES

II.A HARDWARE DIFFERENCES

II.A.1 Executable Instructions
Hardware design concerns itself with data representation and instruction implementation. Hardware design has produced a number of differing approaches to instruction encoding, including CISC (Complex Instruction Set Computer), RISC (Reduced Instruction Set Computer), and VLIW (Very Long Instruction Word). It is not necessary to know the details of these instruction implementations to realize that they are quite different.

II.A.2 Character Data

II.A.2.i Character Sets

Data representations, too, can be significantly different. Characters in text strings cannot be stored directly in digital computers, as there is no internal hardware for “a” or “B”, only bits for 0 or 1. In order to store character data, a numeric value has to be assigned to each character (letter, digit, or special character). The mapping of characters to numeric values is referred to as a character set or character encoding. Over time, two distinct major character code sets have evolved. One of these is EBCDIC (Extended Binary Coded Decimal Interchange Code), which evolved in the IBM mainframe environment. The other is ASCII (American Standard Code for Information Interchange), which evolved from origins in teletype equipment. There were some other character sets, such as BCDIC [3] and Baudot [4], which are no longer in use. In recent years, Unicode has been standardized in an effort to extend ASCII for use with alphabets in existence worldwide. Due to their separate evolution, ASCII and EBCDIC are not identical: they share some characters, but others are unique to one character set or the other.

In addition to visible characters (graphemes), these character sets include control characters, which direct the output device to take some action. Control characters are assigned a two- or three-letter mnemonic when used in discussions. Examples of control characters that originated with teletype devices include BEL (ring the bell on the output device), HT (jump forward to the next Horizontal Tab position), BS (backspace), CR (Carriage Return: position to the beginning of the line), and LF (Line Feed: move down to the next line). Some control characters were assigned for communications functions, such as ACK (acknowledge transmission).

There are variations of ASCII, including Extended ASCII, ISO-8 (an 8-bit international version), and, ultimately, Unicode. There are also variations of EBCDIC, created for different national markets. These variations are identified by their “code page”. Even the definition of a code page can change over time. For the purposes of this document, “ASCII” will mean the 7-bit US ASCII character set, and “EBCDIC” will refer to the characters shown in IBM Publication GX20-1850-3, as this version of EBCDIC is widely supported by peripherals and software on IBM mainframe hardware. For a look at the modern EBCDIC Code Page 037, see SA22-7871-05, “z/Architecture Reference Summary” [5].

In this paper, “EBCDIC” may be used as a shorthand to designate platforms that support this character set, such as IBM z/OS and IBM z/VM. “ASCII” may be used as a shorthand for platforms that support that character set, such as Windows and Unix. Finally, please note that “blank” is synonymous with “space” when referring to that character.

II.A.2.ii Contrasting the Major Character Sets

There are two major character sets in use today: EBCDIC, which is used on IBM mainframes and midrange systems, and ASCII, which is used on nearly all other platforms, including PCs, workstations, and servers, as well as some embedded systems. It is worthwhile to note that there is no intrinsic reason why an IBM mainframe must use EBCDIC: z/Linux runs using ASCII. Unicode is also being adopted widely, but I will not discuss it here, because (for the most part) our SAS programs will only need to deal with ASCII, which is a proper subset of Unicode [6].

ASCII and EBCDIC share many display characters. These include uppercase letters (A-Z), lowercase letters (a-z), digits (0-9), and the special characters exclamation point (!), commercial at sign (@), hash mark or pound sign or number sign (#), dollar sign ($), percent sign (%), ampersand (&), asterisk (*), left parenthesis ("("), right parenthesis (")"), underscore (_), plus sign (+), minus sign or hyphen or dash (-), equal sign (=), left brace ({), right brace (}), left bracket ([), right bracket (]), colon (:), semicolon (;), quote ("), apostrophe ('), less-than sign (<), greater-than sign (>), question mark (?), comma (,), period (.), forward slash (/), back-slash (\), grave accent (`), tilde (~), and space.

US-ASCII includes one display character which is not supported in EBCDIC: the circumflex accent (^), which also represents an “up arrowhead”, or a “caret”.

Conversely, EBCDIC contains display characters which are not found in US-ASCII. These are the cents sign (¢), the not sign (¬), and the broken vertical line (¦). Note that although ASCII code point ’7C’X is shown as a broken vertical line on many keyboards and in some fonts, the standard describes it as an (unbroken) vertical line; therefore it is not equated with the EBCDIC broken vertical line (’6A’X), but rather with the EBCDIC vertical bar (’4F’X).

A similar situation exists with respect to control characters: some are in both character sets while some are in one or the other. As with display characters, there are twice as many control characters in EBCDIC (code points 0-63) as in ASCII (code points 0-31). Few control characters are of interest to us in SAS programming, but one exception is the HT (horizontal tab) character. HT has code point 5 (‘05’X) in EBCDIC, and 9 (‘09’X) in ASCII. Text editors use this character to position the cursor to the next designated position (called a tab stop). When doing so, the editor may either insert sufficient spaces to fill in the empty space, or it may leave the HT character as part of the file instead. For historical reasons, many text editors use a default setting of one tab stop every eight positions.

There are two situations in which the HT character is of interest: when we are reading source code created by the SAS editor (or certain other text editors) with the “convert tabs to spaces” option turned off, and when we are reading spreadsheets that have been exported in “tab-delimited” format.

EBCDIC control characters (code points 0 to 63), the US alphabet (upper- and lower-case), and basic punctuation are portable among EBCDIC platforms. Other characters may vary from one code page to another. Similarly, only ASCII values from 0 to 127 are guaranteed portable across all ASCII platforms; the higher-valued code points may vary according to the code page in use. Only values from 32 to 126 are printable.
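The code-point differences described in this section can be checked with a short script. The following Python sketch is illustrative only. It assumes that Python's built-in `cp037` codec is an acceptable stand-in for the EBCDIC variant discussed here (the GX20-1850-3 reference card may differ from Code Page 037 in some positions), and the helper names `code_point` and `show` are hypothetical.

```python
# Compare the code points of a few characters in US-ASCII and EBCDIC.
# Assumption: Python's built-in "cp037" codec stands in for EBCDIC.

def code_point(char, encoding):
    """Return the single-byte code point of char, or None if unmappable."""
    try:
        return char.encode(encoding)[0]
    except UnicodeEncodeError:
        return None

def show(value):
    """Format a code point in SAS hex-literal style, or '--' if absent."""
    return "--" if value is None else f"'{value:02X}'X"

SAMPLES = {
    "HT (horizontal tab)": "\t",
    "vertical bar": "|",
    "broken vertical bar": "\u00a6",
    "cent sign": "\u00a2",
    "not sign": "\u00ac",
    "capital A": "A",
    "digit 0": "0",
}

if __name__ == "__main__":
    print(f"{'Character':<22}{'ASCII':<10}{'EBCDIC':<10}")
    for name, char in SAMPLES.items():
        ascii_cp = show(code_point(char, "ascii"))
        ebcdic_cp = show(code_point(char, "cp037"))
        print(f"{name:<22}{ascii_cp:<10}{ebcdic_cp:<10}")
```

Characters with no US-ASCII code point are reported as `--`. Characters present in both sets still have different numeric values, which is why text moved between the two families of platforms has to be translated rather than copied byte for byte.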

II.A.3 Integer Data
Integers, of course, are the essence of computing, being used for everything from counters to memory addresses. Integers can be stored in either of two representations: ones’ complement or two’s complement. While non-negative values look the same in both formats, negative numbers are represented differently. In addition to this consideration, integers may be stored in various lengths: 8 bits (usually referred to as byte integers), 16 bits (referred to as “short” or “halfword” integers), and 32 bits (“long” or “fullword” integers). Newer 64-bit machines also support 64-bit integers.

Finally, in addition to considerations of length and representation (format), there are differences in how machines store integers. Some store data “forwords”, with the high-order bits (in the Most Significant Byte, or MSB) at a lower storage address, and the low-order bits at the higher storage address. Thus, the data is stored in memory in the same arrangement as it would appear in an arithmetic register for computation. This is a typical convention for processors whose architecture supports accessing memory one word at a time, and it is also referred to as “big-endian”. However, some machines were designed to access memory one byte at a time, and such designs can lead to a method of storing data referred to as “backwords”. In such an implementation, the byte at the lowest storage address contains the least significant portion of the integer, while increasingly higher storage addresses hold bytes containing increasingly significant portions of the integer. This arrangement, where the Least Significant Byte (LSB) is stored first (at the lower memory address), is also referred to as “little-endian”.
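The difference between the two signed representations mentioned above shows up only for negative values. The following Python sketch prints the bit patterns for a small value in each form. The function names are hypothetical, and an 8-bit width is assumed purely to keep the output short.

```python
# Bit patterns of a signed integer in two's-complement and
# ones'-complement form (8 bits by default, for readability).

def twos_complement(value, bits=8):
    """Return the two's-complement bit string of value."""
    if not -(1 << (bits - 1)) <= value < (1 << (bits - 1)):
        raise ValueError("value does not fit in the requested width")
    return format(value & ((1 << bits) - 1), f"0{bits}b")

def ones_complement(value, bits=8):
    """Return the ones'-complement bit string of value."""
    limit = (1 << (bits - 1)) - 1
    if not -limit <= value <= limit:
        raise ValueError("value does not fit in the requested width")
    if value >= 0:
        return format(value, f"0{bits}b")
    # A negative value is the bitwise inverse of its magnitude.
    return format(~(-value) & ((1 << bits) - 1), f"0{bits}b")

if __name__ == "__main__":
    for number in (5, -5):
        print(f"{number:+d}  two's: {twos_complement(number)}"
              f"  ones': {ones_complement(number)}")
```

For +5 both functions return the same pattern, while for -5 the two patterns differ in the low-order bit.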


This example shows how the value +517 would be stored as a 32-bit integer in “forwords” (big-endian, or MSB first) and “backwords” (little-endian) sequences.

                    2^24     2^16     2^8      2^0
    Big-Endian    .--------.--------.--------.--------.
    or “Forwords” |00000000|00000000|00000010|00000101|
                  '--------'--------'--------'--------'
    Address:          N       N+1      N+2      N+3

                    2^0      2^8      2^16     2^24
    Little-Endian .--------.--------.--------.--------.
    or “Backwords”|00000101|00000010|00000000|00000000|
                  '--------'--------'--------'--------'
    Address:          N       N+1      N+2      N+3

Be sure to read “Byte Ordering for Integer Binary Data on Big Endian and Little Endian Platforms” on page 75 of SAS® 9.1.3 Language Reference: Dictionary, Third Edition.
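The two byte orderings can be reproduced on any machine, whatever its native ordering. The following Python sketch serializes +517 as a 32-bit integer both ways; the variable and function names are illustrative only.

```python
# Store +517 as a 32-bit integer in both byte orders.
import struct
import sys

VALUE = 517

big = VALUE.to_bytes(4, byteorder="big")        # MSB first ("forwords")
little = VALUE.to_bytes(4, byteorder="little")  # LSB first ("backwords")

def bit_string(data):
    """Render each byte as eight binary digits, lowest address first."""
    return " ".join(f"{byte:08b}" for byte in data)

if __name__ == "__main__":
    print("big-endian:   ", bit_string(big))
    print("little-endian:", bit_string(little))
    print("this machine: ", sys.byteorder)
    # struct offers the same choice through its '>' and '<' prefixes.
    print(struct.pack(">i", VALUE) == big, struct.pack("<i", VALUE) == little)
```

Reading the bytes back with the wrong ordering does not raise an error; it silently produces a different number, which is why binary integer data needs an agreed byte order when it crosses platforms.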

II.A.4 Decimal Data
In addition to binary representation of numbers, some hardware can also store and perform computations in a pseudo-decimal format, sometimes called packed decimal. Intel refers to this format as packed Binary Coded Decimal (BCD). Decimal computations can be essential to business applications, where pennies must be correct, even on multi-million dollar numbers, and rounding can raise the suspicion of auditors. In terms of machine resources, conversion between character representation and packed decimal is much more economical than conversion between character and integer formats. In packed decimal format, one decimal digit is stored in four bits (0000 to 1001), and two decimal digits occupy one byte (one digit each in the high-order and low-order half of the byte). These half-bytes are sometimes called nibbles. While these conventions are common, implementations differ in the representation of the sign of the number, both in which value(s) are used for negative or positive, and in the placement of the sign within the stored value. There are also differences in the maximum number of digits which can be stored as a single number.

This example shows how the value +517 would be stored in decimal format by an Intel 386 (or successor) and by IBM System/360-class machines.

    Value:       0    5    1    7
               .----+----.----+----.
    Intel 386  |0000 0101|0001 0111|
               '----+----'----+----'
    Address:        N        N+1

    Value:       5    1    7    +
               .----+----.----+----.
    IBM S/360  |0101 0001|0111 1100|
               '----+----'----+----'
    Address:        N        N+1

Note: the Intel 386 implementation of decimal arithmetic is rudimentary. Multiple instructions are required to perform simple operations, and only four digits can be processed at a time. There are no signs, leaving the tracking of the correct sign as an exercise for the programmer. All implementations rely on the programmer to track the position of the decimal point for any non-integer values.

Note: IBM packed decimal recognizes hex C, A, E, and F as plus signs, and D and B as minus signs. When performing computations, the hardware generates a result with one of the preferred signs, C (+) or D (-).

Note: The IBM POWER architecture (a RISC architecture used in IBM’s System i & System p, and Power Blade servers) does not support a packed decimal data representation.
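The IBM packed decimal layout described above (one digit per nibble, sign in the trailing nibble) can be sketched in a few lines. The following Python example is a minimal illustration, not a description of how SAS or the hardware performs the conversion; the function names `pack_decimal` and `unpack_decimal` are hypothetical.

```python
# Encode and decode integers in IBM-style packed decimal:
# one decimal digit per nibble, with the sign in the last nibble.

PLUS_SIGNS = (0xA, 0xC, 0xE, 0xF)
MINUS_SIGNS = (0xB, 0xD)

def pack_decimal(value):
    """Return value as packed decimal bytes using the preferred signs."""
    sign = 0xC if value >= 0 else 0xD
    nibbles = [int(digit) for digit in str(abs(value))] + [sign]
    if len(nibbles) % 2:          # pad on the left to fill whole bytes
        nibbles.insert(0, 0)
    return bytes((high << 4) | low
                 for high, low in zip(nibbles[0::2], nibbles[1::2]))

def unpack_decimal(data):
    """Return the integer stored in packed decimal bytes."""
    nibbles = []
    for byte in data:
        nibbles.extend((byte >> 4, byte & 0x0F))
    *digits, sign = nibbles
    if any(digit > 9 for digit in digits):
        raise ValueError("invalid digit nibble")
    if sign in PLUS_SIGNS:
        factor = 1
    elif sign in MINUS_SIGNS:
        factor = -1
    else:
        raise ValueError("invalid sign nibble")
    return factor * int("".join(str(digit) for digit in digits))

if __name__ == "__main__":
    print(pack_decimal(517).hex())       # two bytes: 51 7c
    print(unpack_decimal(bytes([0x51, 0x7F])))
```

The decoder accepts all of the sign codes listed in the note above, while the encoder emits only the preferred codes C and D.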

II.A.5 Floating-Point Data
Floating-point format is used to represent rational numbers. This format allows the representation of a wide range of values, from very small to very large, both positive and negative. To accomplish this, the value in question is encoded in scientific notation, s × b^e, where s is the significand (or coefficient or mantissa), b is the base (or radix), and e is an integer exponent of the base. The base used is implicit in the format, and is commonly 2 or 16, with some instances of 10 being available. The coefficient provides the significant digits of the value.

The value 258.25 would be written in scientific notation as 2.5825 × 10^+2. With a radix of 10, the coefficient is 2.5825 and the exponent is 2, and these two values (2 and 2.5825) would be stored as the floating-point number. In a base 16 representation, the number would be written as 1.024 (base 16) × 16^+2, and the values 2 and 1.024 (base 16) would be stored. Finally, a radix of 2 would result in 1.0000001001 (base 2) × 2^+8, with values of 8 and .0000001001 (base 2) being stored. (Where did the leading 1 go? In most binary floating-point implementations, the high-order 1 is implicit under ordinary circumstances. This ingenious trick effectively extends the significand by 1 bit.)
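The decomposition of a value into significand and exponent can be observed directly. The following Python sketch takes 258.25 apart as a 64-bit binary floating-point number; it assumes the interpreter's float type is the IEEE 754 64-bit binary format (the case on mainstream platforms), and the variable names are illustrative.

```python
# Take 258.25 apart into sign, exponent, and fraction fields.
# Assumption: Python floats are IEEE 754 64-bit binary numbers.
import struct

value = 258.25

# Base 10 view: coefficient 2.5825, exponent +2.
decimal_form = f"{value:.4e}"

# Base 2 view: hexadecimal significand digits, power-of-two exponent.
binary_form = value.hex()

# Split the 64 stored bits into their three fields.
raw = int.from_bytes(struct.pack(">d", value), "big")
sign = raw >> 63                       # 1 bit
biased_exponent = (raw >> 52) & 0x7FF  # 11 bits, stored with a bias of 1023
fraction = raw & ((1 << 52) - 1)       # 52 bits, leading 1 not stored

exponent = biased_exponent - 1023
# Restore the implicit leading 1 when rebuilding the value.
rebuilt = (-1) ** sign * (1 + fraction / 2**52) * 2.0**exponent

if __name__ == "__main__":
    print(decimal_form)
    print(binary_form)
    print(sign, exponent, format(fraction, "052b")[:10])
    print(rebuilt)
```

The stored fraction field begins with the bits 0000001001, and the value is recovered only when the implicit leading 1 is added back during reconstruction.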


Floating-point numbers are stored in several sizes; most implementations have up to three. In general, the smallest size is 32 bits, and is referred to as either short or single-precision. The intermediate size is 64 bits long, and is referred to as either long or double-precision. The length of the largest size varies, and can be anywhere from 80 to 128 bits. This size is referred to as extended. The IEEE [7] standards also provide for a half format, which is 16 bits long. The size of the storage used for the floating-point number determines the number of significant digits (the length of the mantissa). In some implementations (notably those based on the IEEE-754 standard), the amount of storage used also determines the maximum value for the exponent, thereby affecting the range of values as well.
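The effect of storage size on significant digits can be demonstrated by storing a value in a smaller format and reading it back. The following Python sketch does this with the `struct` module's IEEE binary formats (`e` for 16-bit, `f` for 32-bit, `d` for 64-bit); the helper name `round_trip` and the test values are chosen for illustration.

```python
# Show how the storage size limits the number of significant bits.
import struct
import sys

def round_trip(value, fmt):
    """Store value in the given struct format and read it back."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]

# 4097 needs 13 significant bits; the half format keeps 11.
half = round_trip(4097.0, ">e")

# 2**24 + 1 needs 25 significant bits; the short format keeps 24.
short = round_trip(16777217.0, ">f")

# The long format keeps 53 bits, so the same value survives intact.
long_ = round_trip(16777217.0, ">d")

if __name__ == "__main__":
    print("half: ", half)
    print("short:", short)
    print("long: ", long_)
    print("long format:", sys.float_info.mant_dig, "bits,",
          sys.float_info.dig, "decimal digits")
```

In each of the smaller formats the value comes back changed without any error being reported, so a number that is exact on one platform's default size may not be exact after passing through a shorter one.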

There are three standard representations for floating-point numbers in use today: the IEEE 754-1985 standard [8], which is a base 2 (binary) format; the IBM S/360 floating-point format, which is a base 16 (hexadecimal) format; and the IEEE 754-2008 standard. There is also another format used by Cray on its SV1 machines [9], as well as most of its older models. There is also an old IEEE 854-1987 radix-independent standard, which never saw wide use. The IEEE 754-2008 standard has a binary format and two decimal formats [10]. The IBM POWER6 server architecture was the first hardware implementation of this standard [11]. The IEEE 754-2008 standard was subsequently implemented by the IBM System z9 and z10 hardware [12].

Other differences between various implementations have to do with the nature of rounding, and so-called guard digits, which are additional digits of precision present only during internal computations. The IEEE standard provides for special values, such as infinities, not-a-number (NaN) values, sub-normal values, and negative zero. The IBM hexadecimal format can encounter subnormal values, but will not generate them (an exception occurs instead).
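The IEEE special values listed above are easy to produce and inspect. The following Python sketch assumes an IEEE 754 binary platform on which subnormal results are not flushed to zero; the variable names are illustrative.

```python
# Produce and inspect the IEEE special values.
# Assumptions: IEEE 754 binary floats, subnormals not flushed to zero.
import math
import sys

infinity = sys.float_info.max * 2        # overflow yields +infinity
not_a_number = infinity - infinity       # undefined result yields NaN
negative_zero = -0.0
smallest_normal = sys.float_info.min
subnormal = smallest_normal / 2          # below the normalized range

if __name__ == "__main__":
    print("infinity:     ", infinity, math.isinf(infinity))
    print("NaN:          ", not_a_number, not_a_number == not_a_number)
    print("negative zero:", negative_zero, negative_zero == 0.0,
          math.copysign(1.0, negative_zero))
    print("subnormal:    ", subnormal, 0.0 < subnormal < smallest_normal)
```

A NaN compares unequal even to itself, and negative zero compares equal to zero while still carrying its sign. Neither behavior has a counterpart in the IBM hexadecimal format.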

II.A.5.i Summary of Fixed-Point Data Type Implementations

    Property          Intel 386                 IBM System z               IBM POWER6

    BINARY INTEGER DATA TYPE
    Format:           two’s complement          two’s complement           two’s complement
    Sizes:            8, 16, 32, 64-bit         16, 32, 64-bit             32, 64-bit
    Ordering:         little-endian             big-endian                 big-endian
    Special values:   None                      None                       None

    PACKED DECIMAL DATA TYPE
    Sizes:            2 to 18 digits            1 to 15 digits             Not Applicable
    Format:           1 digit/nibble,           1 digit/nibble,            Not Applicable
                      no sign                   trailing nibble is sign
    Ordering:         big-endian                big-endian                 Not Applicable
    Special values:   None                      None                       Not Applicable

II.A.5.ii Summary of Floating-Point Data Type Implementations

    Property               Intel 386          IBM System z       IBM System z       IBM POWER6

    Format:                IEEE-754-1985      IBM base 16        IEEE-754-2008      IEEE-754-2008
                           (base 2)                              base 2             base 2

    Sizes:
      Short                32-bit             32-bit             32-bit             32-bit
      Long                 64-bit             64-bit             64-bit             64-bit
      Extended             80-bit             128-bit            128-bit            128-bit

    Ordering:              little-endian      big-endian         big-endian         big-endian

    Significant digits:
      Short, decimal       7                  6-7                7                  7
      Short, binary        24                 21-24              24                 24
      Long, decimal        15                 16                 15                 15
      Long, binary         53                 53-56              53                 53
      Extended, decimal    19                 33                 33                 33
      Extended, binary     53                 109-112            113                113

    Value range (normalized):
      Short                +1.18x10^-38       +8.6x10^-78        -3.37x10^-38       -3.37x10^-38
                           to +3.40x10^+38    to +7.2x10^+75     to +3.37x10^+38    to +3.37x10^+38
                                              (all precisions)
      Long                 +2.2x10^-308       (as above)         -1.67x10^308       -1.67x10^308
                           to +1.79x10^+308                      to +1.67x10^308    to +1.67x10^308
      Extended             +3.4x10^-4932      (as above)         -1.2x10^4932       -1.2x10^4932
                           to +8.4x10^4933                       to +1.2x10^4932    to +1.2x10^4932

    Special values:        Yes                No                 Yes                Yes

As you can see, there are some noteworthy differences between the IBM base 16 and IEEE base 2 schemes. I have not included the IEEE-754-2008 Decimal Floating Point format because it is not used by SAS. The IBM base 16 design produces the same range for all precisions, whereas the range for the IEEE base 2 design varies according to the storage size. Interestingly, the IBM base 16 representation results in a more asymmetrical negative vs. positive exponent range than the IEEE base 2 design. The most interesting similarity between these standards is that the short and long formats provide roughly the same precision. The Intel 386 implementation is based on the 1985 version of the IEEE-754 standard and has an extended precision of 80 bits13, whereas the IBM IEEE base 2 implementation is based on the 2008 version of the IEEE-754 standard, has an extended precision of 128 bits, and provides precision comparable to the IBM base 16 format.

II.B SOFTWARE DIFFERENCES

Software differences between platforms derive from the features of the respective operating systems. These differences may include the manner in which data is stored and accessed, and the syntax and function of commands.

II.B.1 Commands

Operating systems may support one or more distinct sets of commands. Unix-based systems may provide one or more of the C shell, Korn shell (ksh), and the “Bourne again” shell (bash). IBM MVS-family operating systems have one language for running batch jobs (JCL) and another for interactive use (TSO Commands).

Commands may not be consistent across operating systems from the same source. For example, the commands for IBM’s TSO are significantly different from those for IBM’s CMS, even though they are both interactive systems. There are even variations in commands between Linux and BSD Unix, which both have their origins in Bell Labs’ Unix.

Another general trend is that new commands are added with each release of an OS. Current Windows users have commands at their disposal that were not available in previous generations of that OS. In addition, existing commands often evolve from one iteration of an OS to the next, sprouting new features or behaviors under particular conditions. Naturally, Windows has its own set of commands (which trace their lineage back to PC-DOS).

II.B.2 Data Storage

The manner in which data is stored on various platforms can be quite different. In Windows and Unix, text files (such as a SAS report) are stored as strings of characters with special character(s) signifying the end of a line or file. Due to the usage of these sequences of special characters, they cannot be used as data within the file. This results in a different method of access for binary files than text files.

On the IBM mainframe legacy operating systems, collections of data, known as records, are stored according to either a pre-arranged length or along with metadata specifying the length of each record. The distinction between text lines and binary data is not a consideration in these systems, because any data values can be stored in a record without causing processing errors.

The consequence of this situation is that blindly transferring data from an IBM mainframe platform to a Unix or PC platform may inadvertently introduce end-of-line or end-of-file markers that would result in a different interpretation of the data when it is read on the target platform.

II.B.3 Access Methods

Some operating systems support methods of storing and accessing data that others don’t. For example, IBM MVS-family operating systems provide an indexed file organization named VSAM which allows a program to access data via a character key. The DEC VMS OS also supports a native indexed file organization. There is no direct analog for this file organization in Windows or Unix. Indexed files, while supported by SAS Base Language, present special portability issues, which must be addressed on a case-by-case basis.

II.B.4 Database Access

Access to databases from SAS programs is provided through ODBC14. This works consistently across platforms, and does not usually present portability issues.

III. BASE LANGUAGE CONSIDERATIONS

We have seen that there may be significant variations between platforms. How can we possibly code SAS programs that are immune to these differences? Our strategy is three-pronged:

- let SAS shield us from many platform differences
- write our programs in a manner that avoids code which is prone to being affected by platform differences
- when necessary, isolate platform-specific code from the portable portion of the program

III.A INSTRUCTION SET CONSIDERATIONS

It was previously pointed out that different processors use machine instruction sets that can be dramatically different. We don’t need to concern ourselves with this, as the SAS compiler for a given platform produces object code appropriate to that platform.

III.B INTERNAL MEMORY CONSIDERATIONS

Depending on the platform in question, a SAS program might be able to access and possibly alter instructions and/or data in memory through use of the PEEK, PEEKC, PEEKLONG, PEEKCLONG functions and/or POKE and POKELONG routines.

Use of these functions will almost invariably lead to a program that is not portable, because the typical use of them is to access hardware- or OS-specific data. Such control structures will vary from one platform to another. Even SAS cautions against their use15:

CAUTION: The CALL POKE routine is intended only for experienced programmers in specific cases. If you plan to use this routine, use extreme care both in your programming and in your typing. Writing directly into memory can cause devastating problems. This routine bypasses the normal safeguards that prevent you from destroying a vital element in your SAS session or in another piece of software that is active at the time.

Naturally, this warning applies to all the functions and routines in the PEEK and POKE families.

Even if memory containing raw data is accessed, there will be issues, such as the representation of the data on the given platform, and storage order (big-endian vs. little-endian).

III.C CHARACTER SET CONSIDERATIONS

III.C.1 Pitfalls

There are many pitfalls in dealing with ASCII and EBCDIC character sets16.

III.C.1.i Pitfall: Embedded Characters in Hex, Octal, or Decimal Notation

In order for your code to be portable, you do not want to use any non-graphic representation of characters, such as hexadecimal or octal literals, or conversions from decimal values. When characters are coded in such a manner, they are not automatically converted to the target platform’s native character set when they are transferred from one platform to another. The following are all ways in which the ASCII character “a” can be coded in a SAS program:

'a'
"a"
'61'X
BYTE(97)

However, only the first two are portable. If a program containing these were transferred to an EBCDIC platform, only the first two would be translated; the last two would remain essentially ASCII, even in the EBCDIC environment.

There are situations where you cannot use a character string to represent a character you need. One of these is control codes, which have no corresponding glyph. Another would be when you want to represent a character that is not available on the platform you are using to write your code. In order for your application to be portable under these circumstances, you will have to create code that can choose the proper code point for the current platform.

Suppose we are reading a spreadsheet stored in tab-delimited form. Because the file is being read, the statement to use is INFILE (FILE is its counterpart for output):

INFILE SPRDSHT DSD DLM=horizontal_tab ;

Note that the input file has a fileref of SPRDSHT, and that the DSD option has been specified to cause SAS to treat two consecutive delimiters as a missing value and remove quotation marks from character values.

The issue now becomes how to code the horizontal_tab. For an ASCII platform, we could code:

INFILE SPRDSHT DSD DLM='09'X ;

But this will fail as soon as the code is executed on a platform that uses EBCDIC, where HT is '05'X. Conversely, if the code were written explicitly for EBCDIC, it wouldn’t work on an ASCII platform. The solution is to have the application detect the character set in use, and set the proper value for the horizontal tab. The solution could look like this:

ATTRIB HT LENGTH= $1 ;
IF RANK( '3' ) = 51 /* 51 in ASCII, 243 in EBCDIC */
THEN /*ASCII */ HT = '09'X ;
ELSE /*EBCDIC*/ HT = '05'X ;
INFILE SPRDSHT DSD DLM=HT ;

This example assumes that you have no choice regarding the format used to export the spreadsheet. If you can specify the manner in which the spreadsheet is exported, the “comma-delimited” form will avoid these platform-specific complications.

There is one additional trap. Perl regular expressions (PRX) provide a means of coding a character in octal notation. This is done by a backslash followed by up to three octal digits. Examples include "\5" (EBCDIC HT, '05'X), "\11" (ASCII HT, '09'X), and "\132" (ASCII “Z”, EBCDIC “!”, '5A'X). You must be on the lookout for these hidden time bombs!

Generally, you want to avoid those functions which convert between numeric and character forms. These are RANK() and BYTE(). In addition, such conversions can occur by using INPUT(), INPUTC(), INPUTN(), PUT(), PUTN(), or PUTC() functions with INFORMATs and OUTFORMATs such as $HEX, $OCTAL, HEX, and OCTAL.

III.C.1.ii Pitfall: Assuming That a Given Set of Characters is Contiguous

Suppose that you need to generate a SAS data set with one observation for each of the letters A through Z. It is tempting to start with the internal representation for “A”, and add 1 to it to generate each of the subsequent letters. The code to do so would look like this:

DATA Letters_A_Z ;
  ATTRIB letter LENGTH= $1 ;
  RETAIN ii ;
  DO ii = RANK( "A" ) TO RANK( "Z" ) ;
    letter = BYTE( ii ) ;
    OUTPUT ;
  END ;
  STOP ;
RUN ;
PROC PRINT ;

This would work in an ASCII platform, but not in an EBCDIC platform. Why? Because the assumption that these letters have contiguous code points is false for EBCDIC. In fact, in EBCDIC, the tilde (~) falls between lowercase “r” and lowercase “s”!

This logic can be coded in a platform-independent manner by not relying on the specific characteristics of the codes assigned to these letters.

DATA Letters_A_Z ;
  ATTRIB letter LENGTH= $1 ;
  ATTRIB capitals LENGTH= $26 ;
  ATTRIB ii LENGTH= 8 ;
  capitals = "ABCDEFGHIJKLMNOPQRSTUVWXYZ" ;
  /* Walk the literal by position; no code-point arithmetic is involved */
  DO ii = 1 TO LENGTH( capitals ) ;
    letter = SUBSTR( capitals, ii, 1 ) ;
    OUTPUT ;
  END ;
  STOP ;
RUN ;
PROC PRINT ;

At first blush, it might seem that we could use COLLATE( RANK( "A" ), RANK( "Z" ) ) as a more convenient way to generate "ABCDE...XYZ". However, this will not work, because COLLATE generates all the characters between the first and last positions given. On an ASCII platform, “A” is code point 65, and “Z” is code point 90, giving us 90 – 65 + 1 = 26 characters, just as we expect. However, on an EBCDIC platform, “A” is code point 193, and “Z” is code point 233, giving us 233 – 193 + 1 = 41 characters, which is just the same result we would have gotten by incrementing the internal values ourselves, as in the non-portable code example.
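The size of the range COLLATE would return can be checked on whatever platform is at hand. The following is a minimal diagnostic sketch, not part of the original examples; the variable name n_chars is illustrative. It uses RANK() only to inspect the platform, not as a technique for portable code.

```sas
DATA _NULL_ ;
   /* Number of code points from "A" through "Z", inclusive.          */
   /* This is 26 on an ASCII platform; it is larger on an EBCDIC      */
   /* platform, where the capital letters are not contiguous.         */
   n_chars = RANK( "Z" ) - RANK( "A" ) + 1 ;
   PUTLOG n_chars= ;
RUN ;
```

Any value other than 26 in the log indicates that the range contains characters besides the capital letters.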

You should avoid performing any arithmetic computations on character values. Consequently, you should avoid functions that perform such computations, such as COLLATE(), as well as functions that enable such computations, including RANK() and BYTE(). Instead, use character functions, such as UPCASE(), LOWCASE(), and TRANSLATE(), to transform characters.
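As an illustration of the character-function approach, here is a minimal sketch; the variable names and the sample string are assumptions made for the example. Neither statement depends on the code points of the characters involved.

```sas
DATA _NULL_ ;
   LENGTH text upper swapped $ 20 ;
   text    = "portable-sas" ;
   /* Case conversion without arithmetic on code points */
   upper   = UPCASE( text ) ;
   /* TRANSLATE( source, to, from ): replace each hyphen with an underscore */
   swapped = TRANSLATE( text, "_", "-" ) ;
   PUTLOG upper= swapped= ;
RUN ;
```

Because the characters are written as ordinary literals, they are converted along with the rest of the program when it is transferred to a platform with a different character set.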

III.C.1.iii Pitfalls of Character Comparison

An important difference between ASCII and EBCDIC is their collating sequence. Collating sequence refers to the numeric ordering of the characters within the character set. This ordering is important when we perform comparisons other than equal or not equal. Although control characters precede display characters and space is the first (lowest) display character in both character sets, the rest differs.

The characters in EBCDIC have code points that start with the control characters, followed by the space character (usually referred to as “blank” by mainframe programmers). Most special characters come next, then lowercase letters, uppercase letters, and finally numeric digits. Less common special characters, which were added after the initial design, can be found between the lowercase and uppercase letters, and even in between some letters.

Characters in ASCII start with the control characters, followed by the space character, special characters, digits, more special characters, uppercase letters, more special characters, lowercase letters, and (you guessed it!) more special characters.

This means that a statement like:

IF "A" > "a" THEN PUTLOG "A>a" ;
ELSE PUTLOG "A<=a" ;

will give different results on an ASCII platform than on an EBCDIC platform.

The only comparisons that yield the same results for both character sets involve only one of these subsets of characters:

- all uppercase letters
- all lowercase letters
- all digits

Does this matter? Maybe! The first principle here is to avoid any assumptions about the ordering of characters between these groups. Due to the different ordering of characters within the two character sets, a file that was sorted on one platform might not be considered sorted when transferred to another platform. For this reason, it is best to assume that a file transferred from a foreign platform, or of unknown origin, is not sorted.

A match-merge operation is implemented by using the MERGE statement in conjunction with a BY statement. This operation is dependent on the ordering of two or more SAS data sets. Comparing the BY-variables from two or more data sets is subject to the same pitfalls as other comparisons. For a match-merge to work properly, all of the character BY-variables in the input data sets must be of the same character set, and sorted in the order (ascending or descending) specified in the BY statement.
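One way to act on this is to sort every input on the platform where the merge runs, immediately before merging. The sketch below assumes two hypothetical data sets, WORK.CUSTOMERS and WORK.ORDERS, that share a character BY-variable named cust_id; none of these names come from the paper.

```sas
/* Re-sort both inputs locally so they follow the native collating sequence */
PROC SORT DATA=WORK.CUSTOMERS ;
   BY cust_id ;
RUN ;

PROC SORT DATA=WORK.ORDERS ;
   BY cust_id ;
RUN ;

/* Match-merge on the freshly sorted BY-variable */
DATA WORK.CUST_ORDERS ;
   MERGE WORK.CUSTOMERS WORK.ORDERS ;
   BY cust_id ;
RUN ;
```

The extra sorts cost some run time, but they remove any dependence on where, or in which character set, the inputs were last sorted.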

III.C.2 Character Functions

In addition to the character functions listed above, there are several functions whose default values differ according to the platform they run on.

Function or Subroutine: Considerations

ANYPUNCT(string <,start>)
NOTPUNCT(string <,start>)
    The results of the ANYPUNCT and NOTPUNCT functions depend directly on the translation table that is in effect (see “TRANTAB= System Option” in the SAS Language Reference: Dictionary) and indirectly on the ENCODING and LOCALE system options.

BYTE(n)
    NOT portable

COLLATE(start-position[,{end-position|, length}])
    NOT portable

RANK(c)
    NOT portable: Returns the position of a character in the collating sequence for the native character set

CALL SCAN(string, n, position, length<,delimiters>);
CALL SCANQ(string, n, position, length<, delimiters>);
SCAN(string, n<, delimiter(s)>)
SCANQ(string, n<, delimiter(s)>)
    The default delimiters supplied for these functions differ between ASCII and EBCDIC platforms. To insure consistent operation, be sure to explicitly specify these parameters.

TRANSLATE(source, to-1, from-1, <... to-n, from-n>)
    Portable, but see notes in Appendix A
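For the SCAN family, supplying the delimiter list explicitly can be sketched as follows. The sample string and variable names are illustrative only.

```sas
DATA _NULL_ ;
   LENGTH line word2 $ 40 ;
   line  = "alpha,beta gamma" ;
   /* Third argument names the delimiters: comma and blank only, */
   /* so the platform's default delimiter list is never used.    */
   word2 = SCAN( line, 2, ", " ) ;
   PUTLOG word2= ;   /* writes word2=beta */
RUN ;
```

With the delimiters spelled out as a literal, the same words are returned whichever character set the platform uses.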

III.C.3 Characters in Your SAS Program

Aside from characters contained in character constants (literals), there are considerations in which characters are used in the code itself. SAS programs are generally composed of characters which exist in both the ASCII and EBCDIC character sets.

The exception to this observation is the character used as the NOT symbol. SAS permits several characters to be used as the NOT symbol. These are: the EBCDIC NOT symbol (¬), the ASCII cap or hat symbol (^), and the tilde (~). The EBCDIC NOT symbol (¬) is not recognized by the SAS compiler on ASCII platforms. The ASCII circumflex symbol (^) is not recognized by the SAS compiler on EBCDIC platforms. Fortunately, the tilde (~) is recognized on all platforms. I strongly recommend that you use tilde as your NOT symbol.

Some programming languages allow <> or >< as the not-equal operator. Don’t make the mistake of doing this in SAS, as these operators are interpreted as MAX and MIN, respectively.

An alternative to using tilde as the NOT symbol would be to use mnemonics wherever a NOT operation was required: NE for ~=, and NOT in place of standalone ~ for Boolean operations.
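A minimal sketch of the mnemonic style follows; the variables a and b and the log messages are illustrative.

```sas
DATA _NULL_ ;
   a = 1 ;
   b = 2 ;
   /* NE in place of ~= */
   IF a NE b THEN PUTLOG "a and b differ" ;
   /* NOT in place of a standalone ~ */
   IF NOT ( a > b ) THEN PUTLOG "a is not greater than b" ;
RUN ;
```

Because the mnemonics are spelled with ordinary letters, no special symbol has to survive the transfer between character sets.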

There are no difficulties with the vertical bar (|) character. Although it appears on some keyboards and fonts as a broken vertical line (¦), it will transfer to EBCDIC systems as the vertical bar (code point '4F'X), and its interpretation as the OR operator is universal. This is true despite the fact that EBCDIC includes a broken vertical line (¦) character at code point '6A'X.

The final characters that we must deal with are the left bracket ([) and the right bracket (]). A historical problem exists with these characters in EBCDIC. No code points were assigned to these characters in the original EBCDIC specification. Once there were output devices that could print these characters, code points were assigned to them for this limited purpose ('AD'X and 'BD'X). Those code points are part of the current definition for EBCDIC Code Page 1047. However, the code points assigned as part of EBCDIC Code Page 037 are different ('BA'X and 'BB'X). This second set of code points may have originated with a different output device (display tubes). The net result is that transfer and display of these characters is problematic.

What is the responsible SAS programmer to do in the face of this? It’s simple: don’t use brackets. They are not a necessity in SAS (as they are, for example, in C, C++, Java, et al). Instead, I recommend that you use braces ({ and }) or simply parentheses.

III.D NUMERIC CONSIDERATIONS

III.D.1 Binary Numeric Data

We have seen a variety of numeric data types (binary integer, decimal fixed-point, and hexadecimal/binary/decimal floating-point), in a range of sizes (8 to 128 bits). We must honor these representations when reading and writing binary data on the various platforms. If the requirement to deal with these data types is imposed by external sources, you must code properly for them, and there are suitable SAS INFORMATs and OUTFORMATs for this purpose. However, if you have the flexibility to choose, design your SAS program so as to avoid all binary formats in files that might be passed from one platform to another.

III.D.2 Computational Accuracy

Clearly, we want our programs to produce the same results regardless of the platform they are run under. Yet there is a wide range of precisions available. SAS performs its computations using floating-point representation. Floating-point precision can be anywhere from 6 to 33 digits.

If SAS chooses to use different precisions on different platforms, the results from a run on one platform could be different from another. Single-precision computations on any one platform could produce different results than double-precision calculations on a different, or even the same, platform. The other possibility we need to consider is the difference between IEEE and IBM hexadecimal floating point hardware: even at the same nominal precision, could the results differ?

You can specify the amount of storage to be used for a numeric variable, and thereby its precision, using the LENGTH attribute. This can be done either by means of the ATTRIB statement, or the LENGTH statement. For example:

ATTRIB short LENGTH=4 ;
LENGTH long 8 ;

SAS gives the possible lengths17:

For numeric variables, 2 to 8 or 3 to 8, depending on your operating environment.

We see from this that SAS does not support any form of extended precision. Here is SAS’ statement on the data format they use on IBM mainframe hardware18:

To store numbers of large magnitude and to perform computations that require many digits of precision to the right of the decimal point, SAS stores all numeric values in 8-byte floating-point (real binary) representation.

On platforms which employ the IEEE standard for floating-point numbers, such as Windows, SAS states19:

The default length of numeric variables in SAS data sets is 8 bytes. … In SAS under Windows, the Windows data type of numeric values that have a length of 8 is LONG REAL. The precision of floating-point values is always accurate to [at least] 15 digits.

All of this brings us to the best choice: let SAS use the default length of 8 for numeric values. This gives us at least 15 digits of precision on all platforms. It is also the least amount of effort, as we do not have to explicitly declare a non-standard length for all our numeric variables. Ordinarily, we will not have to do anything to assure portability of our programs, but when the LENGTH of a numeric variable is given explicitly, we must remember to code it as 8. Changes in length for input and output can be handled by the length specified on the INFORMAT and OUTFORMAT used for the variable.

Although this technique insures the most precision available, and similar precision on all platforms, there is still the issue of range. In theory, a value greater than roughly 10^75 could cause a problem on IBM mainframes. I personally have not encountered this situation, and I believe it to be highly improbable. However, you know your data better than I, so you will know when this is a possibility. Values smaller in magnitude than about 10^-78 will become zero on IBM mainframe systems. This may, or may not, affect your results. Once again, whether or not this is a realistic possibility, and whether or not it will have any impact, can only be determined by you, based on the data and the processing algorithm.

The bottom line is that, due to different floating-point implementations on different platforms, there may be differences in computed values. This is nearly impossible between platforms that both use the same floating-point representation (for example IEEE floating-point on a PC and on a server), and possible but unlikely between machines that have different floating-point representations (for example IBM mainframe hexadecimal floating-point and IEEE floating-point on either a PC or a server). Such differences should be small.
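A common defensive habit, which the discussion above does not prescribe, is to compare computed values within a tolerance instead of testing them for exact equality. The sketch below is illustrative only: the variable names and the threshold of 1E-12 are assumptions, and a suitable threshold depends on your data.

```sas
DATA _NULL_ ;
   x = 0.1 + 0.2 ;
   y = 0.3 ;
   tolerance = 1E-12 ;   /* illustrative threshold, not a recommendation */
   /* Treat the values as equal when they differ by less than the tolerance */
   IF ABS( x - y ) < tolerance THEN PUTLOG "x and y agree within tolerance" ;
   ELSE PUTLOG "x and y differ" ;
RUN ;
```

A comparison written this way is less sensitive to small platform-to-platform differences in the last digits of a computed value.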

III.E INFORMAT AND OUTFORMAT CONSIDERATIONS

INFORMATs and OUTFORMATs can be bane or boon for portability. When you are reading an input file from, or are creating output for, a specific platform, you will have to use INFORMATs and OUTFORMATs appropriate to that platform. Some key INFORMATs and OUTFORMATs used for such conversions are:

- $ASCIIw.
    o On EBCDIC systems, $ASCIIw. converts EBCDIC character data to ASCII.
    o On all other systems, $ASCIIw. behaves like the $CHARw. format.
- $EBCDICw.
    o On ASCII systems, $EBCDICw. converts ASCII character data to EBCDIC.
    o On all other systems, $EBCDICw. behaves like the $CHARw. format.
- IBRw.d, PIBRw.d – converts SAS numbers to/from DEC/Intel (“backwards”) Integer Binary form
- IEEEw.d – converts SAS numbers to/from IEEE floating-point form
- S370FIBw.d, S370FIBUw.d, S370FPIBw.d – converts SAS numbers to/from IBM System/370 Integer (Fixed) Binary form
- S370FPDw.d, S370FPDUw.d – converts SAS numbers to/from IBM System/370 Packed Decimal form
- S370FRBw.d – converts SAS numbers to/from IBM System/370 Floating Point form
- S370FZDw.d, S370FZDLw.d, S370FZDTw.d, S370FZDUw.d – converts SAS numbers to/from IBM System/370 Zoned Decimal format
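To show how these INFORMATs might be combined, here is a sketch that reads a fixed-length mainframe record. The fileref MFDATA, the data set name, the variable names, and the 19-byte record layout are all hypothetical.

```sas
/* Hypothetical layout: 10-byte EBCDIC name, 4-byte binary integer,   */
/* 5-byte packed decimal amount with 2 implied decimal places         */
DATA WORK.MAINFRAME_RECS ;
   INFILE MFDATA RECFM=F LRECL=19 ;
   INPUT @1  name   $EBCDIC10.
         @11 count  S370FIB4.
         @15 amount S370FPD5.2 ;
RUN ;
```

Because each INFORMAT names the source representation explicitly, the step does not rely on the native formats of the platform it runs on.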

Some formats pose a hazard, because of their differing behavior on different platforms. These formats produce output that is in a form specific to the platform the SAS program is being run on. Such representations are referred to as native. As long as the data is written and read on the same platform, these formats can be used safely. In addition, they may be more efficient than formats which write data in a non-native representation.

- FLOAT4. – converts SAS numbers to/from the native single-precision, floating-point value
- HEXw. – converts SAS numbers to/from the hexadecimal representation for their native integer or floating-point binary representation
- $HEXw. – converts SAS characters to/from the hexadecimal representation for their native representation
- IBw.d, PIBw.d – converts SAS numbers to/from native Integer Binary form
- OCTALw. – converts SAS numbers to/from the octal representation of their native integer representation
- $OCTALw. – converts SAS characters to/from the octal representation of their native representation
- PDw.d – converts SAS numbers to/from native Packed Decimal format
- RBw.d – converts SAS numbers to/from native Floating Point (Real Binary)
- ZDw.d – converts SAS numbers to/from native Zoned Decimal format

Consult the SAS Companion for a given platform to learn about its native formats.

III.F MACRO LANGUAGE CONSIDERATIONS

Like base code, macro language is also susceptible to the following considerations:

- comparisons affected by the character set’s collating sequence
- the default delimiters supplied for %SCAN and %QSCAN differ between ASCII and EBCDIC platforms. To insure consistent operation, be sure to explicitly specify these parameters.

For more detailed information, see “Macro Language Elements with System Dependencies”, on page 144 of SAS 9.1 Macro Language Reference. Refer to Appendix A for additional automatic macro variables (beginning with “&”) and statements and autocall macros (both of which begin with “%”).

III.G PROC CONSIDERATIONS

Comparison is at the heart of the SORT and MERGE operations. Consequently, there are portability considerations for PROC SORT and match-merging operations. Sorting or merging on numeric values alone should not impact portability. However, sorting based on character strings, unless those keys solely contain unsigned fixed-point, decimal-aligned numbers, will produce varying results from one platform to another.

This difference in ordering does not impact the portability of your program as long as the data is maintained solely within one platform. However, if data is transferred from one platform to another, it may no longer be considered sorted on the target platform. For data which will be moved from one platform to another, there are two possible approaches: (a) re-sort the data after it has been transferred to the target platform, or (b) pre-sort the data on the sending platform.

You can specify the <SORTSEQ=>ASCII or <SORTSEQ=>EBCDIC option of the SORT PROC to force sorting by the collating sequence of a particular character set, regardless of the sorting platform’s native character set. WARNING: do not use SORTSEQ= if you have already converted the data to a particular character set by use of the $ASCII. or $EBCDIC. OUTFORMAT.
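Approach (b) might be sketched as follows for data that will be sent to an ASCII platform. The data set names and the BY-variable are hypothetical.

```sas
/* Sort by the ASCII collating sequence, whatever the native character set */
PROC SORT DATA=WORK.NAMES OUT=WORK.NAMES_ASCII SORTSEQ=ASCII ;
   BY last_name ;
RUN ;
```

The OUT= option leaves the original data set in its native order, so it can still be used locally.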

Other PROCs with platform dependencies include CIMPORT, CPORT, IMPORT, EXPORT, CONVERT, SOURCE, CALENDAR (when used with ODS), CATALOG, CHART (when used with ODS), CONTENTS, COPY (when used with the XPORT engine), DATASETS, FORMAT, MIGRATE, OPTIONS, PLOT (when used with ODS), PRINTTO, and REPORT. Often, it is the uncommon features of these PROCs that present portability issues, but for any PROC that you will use on more than one platform, you should read the appropriate section of the Base SAS® 9.1.3 Procedures Guide. In addition, you should read the platform companion documentation for PROCs CATALOG, DATASETS, FONTREG (when used under z/OS), FORMAT, OPTIONS, PRINTTO, and REPORT.

I strongly recommend that you avoid PROC PMENU. If you cannot avoid it, research it thoroughly, and check its operation carefully on each platform.

Note that carriage control creates portability issues, no matter which PROC or statement it is used with. If you are dealing with carriage control, be sure to research any of the following that you will be using: PROC PRINTTO, the FSLIST command, the FILE, INFILE, FILENAME, INPUT, & PUT statements, the FAPPEND, FOPEN, FWRITE, & MOPEN functions, the SKIP= system option. Check both the base language and platform companion manuals.

All PROCs which perform computations, such as CORR, MEANS, SUMMARY, TABULATE, UNIVARIATE, etc, are subject to floating-point considerations, such as range and accuracy (covered above).

For a list of PROCs which will only work on particular platforms, see Appendix 2, “Operating Environment-Specific Procedures” in Base SAS® 9.1.3 Procedures Guide.

Many PROCs deal with printing. Printing is implemented differently on different platforms. For information related to printing, consult SAS Language Reference: Concepts. Additional information may be available in the SAS companion documentation for your operating environment.

II.H SYSTEM, STATEMENT, AND PROC OPTIONS CONSIDERATIONS

There is a hidden trap that can catch you unawares as you move your program from one platform to another: System Options. The reason for this is that not only can default settings vary from one platform to another, but when SAS was installed on each platform, it may have been customized with different system options by the administrator who performed the installation! To be prepared to deal with problems that arise that are linked to system options, place a PROC OPTIONS step at the start of your portable programs. Something as simple as a difference in the L= option can produce surprising (and unpleasant) compilation errors; run-time variations are even more troublesome. The best response to this hazard is to include an OPTIONS statement with a complete set of explicit options for every setting relevant to your program.
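A program prologue along these lines might look like the sketch below. The particular options and values are illustrative assumptions; the set that matters depends on what your program actually relies on.

```sas
/* Record the option settings in effect on this platform in the log */
PROC OPTIONS ;
RUN ;

/* Then set explicitly the options this program depends on */
OPTIONS LINESIZE=132 PAGESIZE=60 MISSING='.' NOCENTER ;
```

Keeping the PROC OPTIONS output in the log also gives you a record to compare when a run on one platform behaves differently from a run on another.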

II.H.1 Character Encoding Option Considerations

The term encoding refers to the process of assigning code points from a given code page to represent a text string. For example, text encoded in EBCDIC will be represented internally differently than the same text encoded in Unicode. Encoding is controlled by the ENCODING= data set option, the ENCODING= option of the FILE, FILENAME, and INFILE statements, and the INENCODING= and OUTENCODING= options of the LIBNAME statement. The $UCS* family of FORMATs and INFORMATs are affected by encoding considerations. For the necessary understanding of encoding options, see “The ENCODING data set option” and “Encoding Values in SAS Language Elements” in SAS National Language Support (NLS): User’s Guide.
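As a small illustration, the encoding of an external text file can be stated on the FILENAME statement. In this sketch the fileref, the path, the data set name, and the choice of "latin1" are all assumptions; consult the NLS guide for the encoding values that apply to your files.

```sas
/* Hypothetical fileref and path; the encoding is declared, not left to the default */
FILENAME rawtxt "input.txt" ENCODING="latin1" ;

DATA WORK.LINES ;
   INFILE rawtxt TRUNCOVER ;
   INPUT line $CHAR80. ;
RUN ;
```

Stating the encoding in the program avoids relying on a session default that may differ from one installation to the next.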

II.H.2 Transcoding Option Considerations

The process of transforming text from one character encoding to another is called transcoding. SAS transcoding is controlled by the TRANTAB= system option. This option affects the operation of the character searching functions ANYALNUM, ANYALPHA, ANYGRAPH, ANYLOWER, ANYPRINT, ANYPUNCT, NOTALNUM, NOTALPHA, NOTFIRST, NOTGRAPH, NOTLOWER, NOTPRINT, NOTPUNCT, NOTSPACE, NOTUPPER, as well as the TRANTAB and URLDECODE functions. For details, see “The TRANTAB= system option” in SAS National Language Support (NLS): User’s Guide.

II.H.3 Locale Option Considerations

The locale of your platform determines things such as the character that is used to designate monetary units (such as the dollar sign, pounds sterling sign, euro sign, etc) as well as date format. Formats affected by your locale include NLDATEw., NLDATEMNw., NLDATEWw., NLDATEWNw., NLDATMw., NLDATMAPw., NLDATMTMw., NLTIMAPw., NLTIMEw., NLMNYw.d, NLMNYIw.d, NLNUMw.d, NLNUMIw.d, NLPCTw.d, and NLPCTIw.d. See “LOCALE System Option: OpenVMS, UNIX, Windows, and z/OS” in SAS National Language Support (NLS): User’s Guide.

IV. OPERATING SYSTEM CONSIDERATIONS

IV.A EXTERNAL EXECUTABLES CONSIDERATIONS

IV.A.1 OS Command Considerations

Through use of the system() function, SAS programs can issue a command to the operating system. OS commands are inherently dependent on the OS for their syntax and function and, except possibly for some trivial cases, are not portable. In addition to the system() function, OS commands may be issued by using the X (eXecute) statement and the %SYSEXEC macro statement. I strongly recommend against use of these facilities.

IV.A.2 User-Written Module Considerations

The use of the CALL MODULE subroutines can be portable, but the user must port these routines to each target platform, and possibly tailor the tables which describe the interface between SAS and the user module as well. This facility is also supported by the %SYSCALL macro statement.

IV.B OTHER PLATFORM-SPECIFIC FEATURES

Under some platforms, it is possible for SAS to access hardware features that may not exist on other platforms. An example of this can be found in the SAS Companion for Windows, where an illustration is given of how to read data from a COM: (RS-232 serial) port [20]. Any code that accesses hardware features which are not available on all target platforms will not qualify as portable.

Likewise, exploiting platform-specific software features can just as readily undermine portability. Accessing or altering hardware or OS internal data via PEEK and POKE, issuing commands via the system() function, executing routines in DLLs (or their analogs on non-Windows systems), and interacting with objects using OLE or DDE (in Windows) or with ISPF (on MVS TSO) must all be avoided. An example of logic that exploits MVS data structures is given by SAS for “Listing ASCB Bytes” and “Creating a DATA Step View” of the TIOT (Task Input/Output Table), both on the z/OS platform [21].

Another way to interact with the OS is the SYSGET() function. This function operates somewhat differently on each platform, and may undermine the portability of your program.

SAS provides a number of platform-independent ways to obtain information about the computing environment. The most useful of these are the Portable Automatic Macro Variables, which contain analogous information on each platform. They are [22]:

Portable Automatic Macro Variable Name: Contents

&SYSDEVIC: The name of the current graphics device on DEVICE=

&SYSENV: The mode of execution (“FORE” or “BACK”)

&SYSJOBID: The name of the currently executing batch job. On *nix platforms, this is the PID (Process ID); on MVS platforms, this is the job name.

&SYSRC: The last return code generated by your host environment (in response to a SYSTEM() invocation, the X statement, or the %SYSEXEC, %TSO or %CMS macro statements).

&SYSSCP: System Control Program (Operating System). Possible values [12] include:
    VMS for VAX VMS
    OS2 for IBM OS/2
    WIN for Microsoft Windows
    CMS for IBM VM/CMS

&SYSSCPL: System Control Program (Operating System) detail. Possible values [12] include:
    WIN_32S for 32-bit Windows (generic)
    WIN_95 for Windows 95
    WIN_NT for Windows NT

&SYSPARM: The character string that was passed to SAS by the SYSPARM= system option

There are other automatic macro variables whose values are determined entirely within SAS, and which are safe to use on all platforms. These are listed in Appendix A.
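A quick way to see what these variables contain on a given platform is to write them to the log. This is only a sketch; the values displayed will differ from one platform and session to the next.

```sas
/* Display the environment as SAS reports it on this platform */
%PUT NOTE: SYSSCP=&SYSSCP SYSSCPL=&SYSSCPL ;
%PUT NOTE: SYSENV=&SYSENV SYSJOBID=&SYSJOBID ;
%PUT NOTE: SYSPARM=&SYSPARM ;
```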

IV.C ISOLATING SYSTEM DEPENDENCIES

Inevitably, there are situations where platform-dependent statements are a necessity. The obvious case is the identification of input and output files. The need to do this varies from one platform to another.

The IBM mainframe platforms MVS, CMS, and VSE [23], and their various incarnations (currently z/OS, z/VM, and z/VSE), use system commands external to the SAS program to associate files with DDNAMEs (Data Definition Names). Because SAS originated in these environments, a DDNAME automatically becomes a SAS fileref. The beauty of IBM’s method of divorcing the identification of a program’s input and output files from the program itself is that the program can be run multiple times with multiple inputs and outputs without ever having to be modified. A program can be run with magnetic tape input and disk output, or vice versa. Output can go to a printer one time, or to a file for subsequent FTP the next time the program is run. Once again, all this is done without any need to change the program.

On Windows and Unix platforms, SAS programs identify external input and output files by using the FILENAME statement. The names of the files to be read and written are coded explicitly in the program. This means that whenever an input or output file name changes, the program code has to be changed. This will cause both portability problems and maintenance complications.

V. CREATING PORTABLE SAS PROGRAMS

V.A THE THREE-PRONGED STRATEGY TO CREATE A PORTABLE SAS PROGRAM

I recommend a three-pronged strategy for creating portable SAS programs: avoid, externalize, and encapsulate [24]. The tactics are listed in their order of priority.

V.A.1 Avoid Non-Portable Language Elements

This is straightforward: if you find yourself using a non-portable SAS language element, look for an alternative means to accomplish your goal, or ask yourself if what you are doing is truly necessary. In programming, simplicity is a virtue (and sometimes harder to achieve than complexity)! See Appendix A for recommendations on specific language elements.

Some SAS functions that should be avoided include PEEK, PEEKC, PEEKLONG, PEEKCLONG, POKE, POKELONG, RANK, BYTE, and COLLATE. The FILEMAP, COMMPORT, and PIPE keywords of the FILENAME statement should be avoided. All SAS Windowing commands must be avoided. The INFORMATs and OUTFORMATs $HEX., $OCTAL., HEX., and OCTAL. must be used with great care (although they’re OK for debugging). Finally, the CALL MODULE family (MODULE, MODULEN, MODULEC, MODULEI, MODULEIN, MODULEIC) presents special problems, because the modules executed would have to be ported as well, along with their interface definition tables.

V.A.2 Externalize Non-Portable Language Elements

This tactic removes platform-specific elements from the core logic of your program, and moves them to an outer shell which can perform prologue and epilogue processing specific to the environment. It then becomes possible to have a small piece of code which is platform-dependent and a large module whose code is common to all platforms. You will then have one version of the core logic, along with several different versions of the platform-specific code.

For example, if we have code like the following:

    DATA Processed_Data ;
       FILE "C:\Data\Weekly_Sumry_2009-05-12.DAT" ;    /* For output from PUT statement */
       INFILE "C:\Data\Weekly_Data_2009-05-12.DAT" ;   /* External input */
       . . .
    RUN ;

The file specifications are specific to the platform (Windows in this case). We can decouple the filespecs from the FILE and INFILE statements by using the FILENAME statement:

    FILENAME InptFile "C:\Data\Weekly_Data_2009-05-12.DAT" ;
    FILENAME OutptFil "C:\Data\Weekly_Sumry_2009-05-12.DAT" ;
    DATA Processed_Data ;
       FILE OutptFil ;      /* For output from PUT statement */
       INFILE InptFile ;    /* External input */
       . . .
    RUN ;

This step makes maintenance easier, but does not make the program portable. What we must do is create two separate SAS programs:

    /* The Windows-specific part of the code */
    FILENAME InptFile "C:\Data\Weekly_Data_2009-05-12.DAT" ;
    FILENAME OutptFil "C:\Data\Weekly_Sumry_2009-05-12.DAT" ;
    %INCLUDE APPLIB(WklySmry) /* Common code */ ;

Stored separately as WklySmry in the application code repository:

    /* The platform-independent part of the code */
    DATA Processed_Data ;
       FILE OutptFil ;      /* For output from PUT statement */
       INFILE InptFile ;    /* External input */
       . . .
    RUN ;

Now that we have this structure, it is easy to create platform-specific code for Unix, MVS, etc., modeled on the Windows-specific code.
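For instance, a Unix counterpart of the Windows-specific wrapper might look like the sketch below. The /data directory is a hypothetical location, and the APPLIB fileref is assumed to have been defined for the Unix code repository in the same way as on Windows.

```sas
/* The Unix-specific part of the code (paths are hypothetical) */
FILENAME InptFile "/data/Weekly_Data_2009-05-12.DAT" ;
FILENAME OutptFil "/data/Weekly_Sumry_2009-05-12.DAT" ;
%INCLUDE APPLIB(WklySmry) /* Common code */ ;
```

Only the two file specifications differ from the Windows version; the common code in WklySmry is included unchanged.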

V.A.3 Encapsulate Non-Portable Language Elements

The final tactic applies to those platform-dependent language features that are absolutely necessary, but cannot be externalized. These situations should be rare, as macros can generally be used to externalize such features.

Take, for example, the problem of causing a SAS program to pause for one minute. Under MVS and CMS, this would be achieved by CALL SLEEP( 60000 ) ;, whereas on a Windows machine, you would have to code dummy = SLEEP( 60 ) ; .

Further suppose this action must be taken at some point during the course of a DATA step, and cannot be externalized to a prologue or epilogue. To accomplish this delay, you could code:

    SELECT ( "&SYSSCP" ) ;
       WHEN ( "OS", "CMS" )  /*MVS|CMS*/           CALL SLEEP( 60000 ) ;
       WHEN ( "WIN" )        /*Windows*/           dummy = SLEEP( 60 ) ;
       OTHERWISE             /*Unknown platform*/  ABORT ;
    END /* Case of operating system */ ;

It would be even better to create a macro to generate this code. The following example uses seconds as the units for its parameter, but you could code it differently:

    %MACRO TakeaNap( duration_in_seconds ) ;
       SELECT ( "&SYSSCP" ) ;
          WHEN ( "OS" /*MVS*/, "CMS" /*CMS*/ )
             CALL SLEEP( ( &duration_in_seconds ) * 1000 ) ;
          WHEN ( "WIN" ) /*Windows*/
             dummy = SLEEP( &duration_in_seconds ) ;
          OTHERWISE /*Unknown platform*/
             ABORT ;
       END /* Case of operating system */ ;
    %MEND TakeaNap ;

The user would then code %TakeaNap( 60 ) to sleep for one minute.
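As a usage sketch, the macro call is placed inside a DATA step at the point where the pause is needed. The PUT messages are illustrative only, and the TakeaNap macro is assumed to have been compiled earlier in the session.

```sas
DATA _NULL_ ;
   PUT "Pausing for one minute..." ;
   %TakeaNap( 60 )      /* Expands to the platform-appropriate SLEEP code */
   PUT "Resuming." ;
RUN ;
```

No semicolon is needed after the macro call, because the generated code already ends with one.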

This macro generates code that must decide what to execute at run time. You could take advantage of the fact that &SYSSCP is known at macro time to generate only the code to be executed, without any run-time decision overhead. Such a macro would look like:

    %MACRO TakeaNap( duration_in_seconds ) ;
       %IF "&SYSSCP" = "OS" /*MVS*/ | "&SYSSCP" = "CMS" /*CMS*/ %THEN %DO ;
          CALL SLEEP( ( &duration_in_seconds ) * 1000 ) ;
       %END ;
       %ELSE %IF "&SYSSCP" = "WIN" /*Windows*/ | "&SYSSCP" = "WIN_SRV" /*Windows Servers*/ %THEN %DO ;
          dummy = SLEEP( &duration_in_seconds ) ;
       %END ;
       %ELSE /*Unknown platform*/ %DO ;
          ABORT ;
       %END ;
    %MEND TakeaNap ;

Now that this platform-dependency is encapsulated in a macro, the macro can be externalized, and shared with other programs as needed.

Note: I used ABORT in the last example so that if the expanded code were never executed, it would not cause the program to fail. Depending on your philosophy of these things, you could use %ABORT to ensure that the code is never executed on an unsupported platform.
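A sketch of that stricter alternative follows. The macro name TakeaNapStrict and the wording of the error message are assumptions made for illustration; %ABORT is coded without parameters, in line with the recommendation in Appendix A.

```sas
%MACRO TakeaNapStrict( duration_in_seconds ) ;
   %IF "&SYSSCP" = "OS" | "&SYSSCP" = "CMS" %THEN %DO ;
      CALL SLEEP( ( &duration_in_seconds ) * 1000 ) ;
   %END ;
   %ELSE %IF "&SYSSCP" = "WIN" %THEN %DO ;
      dummy = SLEEP( &duration_in_seconds ) ;
   %END ;
   %ELSE %DO ;
      /* Stop during macro processing, before any generated code can run */
      %PUT ERROR: TakeaNapStrict does not support platform &SYSSCP.. ;
      %ABORT ;
   %END ;
%MEND TakeaNapStrict ;
```

The difference from the ABORT version is one of timing: here the failure occurs as soon as the macro is invoked on an unsupported platform, rather than when the DATA step reaches the generated statement.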

VI. CONCLUSION

The ability to run the same SAS program on any one of several platforms allows an organization to lower costs and increase flexibility. This enables the enterprise to best leverage its computing resources.

ACKNOWLEDGEMENTS

I wish to acknowledge the invaluable assistance of Mr. Donald Weimer, as well as constructive review feedback from Ivan Padilla, Mike Whitaker, and one other.

RECOMMENDED READING

The fine points of character encoding in code pages, transcoding between code pages, and the code pages themselves can be found in SAS National Language Support (NLS): User’s Guide.

You may also want to peruse “Processing Data Using Cross-Environment Data Access (CEDA)” in SAS Language Reference: Concepts.

I strongly recommend that you review the SAS companion manual for each platform you will be porting to.

CONTACT INFORMATION

Your comments and questions are valued and encouraged. Contact the author at:

Robert Cruz, Info-Mation Systems
1691 El Camino de Vida
Hollister, CA 95023
Work Phone: 831-207-9132
Fax: 413-771-053
E-mail: [email protected]

TRADEMARKS, BRAND AND PRODUCT NAMES

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are trademarks of their respective companies.

APPENDIX A: SAS LANGUAGE ELEMENTS WITH PORTABILITY CONSIDERATIONS

For information on options, see Section II.H, “System, Statement, and PROC Options Considerations”, above.

Language Element / Considerations / Portability Recommendation

Statement: %ABORT

The effect of this statement varies from one platform to another, and even with operating methods (foreground vs. background or batch) on a given platform.

Code this statement without any parameters (neither ABEND nor RETURN) to ensure portability.

Format: $ASCIIw.

Produces output or reads input in the ASCII character set, regardless of the platform’s native character set.

Platform independent. Use this format to read data from, or produce data for, ASCII platforms, or to ensure portability of the data.

Format: $EBCDICw.

Produces output or reads input in the EBCDIC character set, regardless of the platform’s native character set.

Platform independent. Use this format to read data from, or produce data for, EBCDIC platforms, or to ensure portability of the data.

Format: FLOAT4.

Converts SAS numbers to/from the native single-precision, floating-point value

This format produces output that is in a form specific to the platform the SAS program is being run on.

As long as the data is written and read on the same platform, this format can be used safely. In order to have a uniform floating-point format across all platforms, use either the IEEEw.d or the S370FRBw.d format consistently.

Format: HEXw.

Converts SAS numbers to/from the hexadecimal representation for their native integer or floating-point binary representation

This format produces output that is in a form specific to the platform the SAS program is being run on.

As long as the data is written and read on the same platform, this format can be used safely.

Format: $HEXw.

Converts SAS characters to/from the hexadecimal representation for their native representation

This format produces output that is in a form specific to the platform the SAS program is being run on.

As long as the data is written and read on the same platform, this format can be used safely.

Format: IBw.d, PIBw.d

Converts SAS numbers to/from native Integer Binary form

These formats produce output that is in a form specific to the platform the SAS program is being run on.

As long as the data is written and read on the same platform, these formats can be used safely. In order to have a uniform integer binary format across all platforms, use the S370FIBw.d format consistently.

Format: IBRw.d, PIBRw.d

Converts SAS numbers to/from DEC/Intel (“backwords”) Integer Binary form

Platform independent. Use this format to read data from, or produce data for, little-endian platforms, or to ensure portability of the data. Or, you could use the S370FIBw.d format to ensure portability.

Format: IEEEw.d

Converts SAS numbers to/from IEEE floating-point form

Platform independent. Use this format to read data from, or produce data for, platforms that use the IEEE floating-point representation, or to ensure portability of the data. Or, you could use the S370FRBw.d format to ensure portability.

Function: MCIPISTR(MCI-string-command)

Submits an MCI string command to a piece of multimedia equipment

Only available on the Windows platform.

NOT Portable. Avoid.

Function: MOD(argument-1, argument-2)

Returns the remainder from the division of the first argument by the second argument, fuzzed to avoid most unexpected floating-point results

Function: MODZ(argument-1, argument-2)

Returns the remainder from the division of the first argument by the second argument, without fuzzing.

Arguments must be integer values, even though they are stored as floating-point numbers. The differences in floating-point implementations mean that some integer values might be representable on one platform, but not on another.

Avoid using the MOD function with large arguments. Do not exceed 2**53-1.

Function: MOPEN(directory-id,member-name<,open-mode<,record-length <,record-format>>>)

Opens a file by directory id and member name, and returns the file identifier or a 0

The permissible format for member-name values varies by platform.

A value of “P” for record-format specifies a platform-dependent file.

NOT Portable. Avoid. If this function is necessary, try to externalize it. Alternatively, you might be able to restrict member-name to a subset common to all platforms. Try to avoid a record-format of “P”.

Format: OCTALw.

Converts SAS numbers to/from the octal representation for their native integer binary representation

This format produces output that is in a form specific to the platform the SAS program is being run on.

As long as the data is written and read on the same platform, this format can be used safely.

Format: $OCTALw.

Converts SAS characters to/from the octal representation for their native representation

This format produces output that is in a form specific to the platform the SAS program is being run on.

As long as the data is written and read on the same platform, this format can be used safely.

Function: PATHNAME((fileref | libref) <, search-opt>)

Returns the physical name of a SAS data library or of an external file, or returns a blank.

Pathnames vary in format and length from one platform to another.

Limit your use of this. It can be beneficial to list the names of input and/or output files used by a SAS program, but avoid processing the pathnames the function returns.

Format: PDw.d

Converts SAS numbers to/from native Packed Decimal format

This format produces output that is in a form specific to the platform the SAS program is being run on.

As long as the data is written and read on the same platform, this format can be used safely. For a uniform packed decimal format across all platforms, use the S370FPDw.d format consistently.

Function: PEEK(address<, length>)

Stores the contents of a memory address into a numeric variable on a 32-bit platform

Function: PEEKC(address<, length>)

Stores the contents of a memory address in a character variable on a 32-bit platform

A SAS program could potentially access memory owned by either the OS (if the OS designates it as R/W) or owned by SAS itself. There is no commonality in the placement, content or format of data or instructions in memory from one platform to another.

Avoid.

Depending on what information you are looking for, try to find a SAS facility (such as the automatic macro variables or FINFO function) that will provide it to you.

Function: PEEKCLONG(address<, length>)

Stores the contents of a memory address in a character variable on 32-bit and 64-bit platforms

Function: PEEKLONG(address<, length>)

Stores the contents of a memory address in a numeric variable on 32-bit and 64-bit platforms

Same as PEEK().

Same as PEEK().

Subroutine: CALL POKE(source, pointer<, length>);

Writes a value directly into memory on a 32-bit platform

Subroutine: CALL POKELONG(source, pointer<, length>);

Writes a value directly into memory on 32-bit and 64-bit platforms

A SAS program could potentially access memory owned by either the OS (if the OS designates it as R/W) or owned by SAS itself. There is no commonality in the placement, content or format of data or instructions in memory from one platform to another.

Avoid. Period.

Modifying system or application storage can lead to disastrous results.

Tabled Distribution of RAND function: RAND(’TABLE’, p1, p2, ...)

The maximum number of probability parms depends on your platform, but is at least 32,767.

Code no more than 32,767 probability parameters.

Format: RBw.d

Converts SAS numbers to/from native Floating Point (Real Binary)

This format produces output that is in a form specific to the platform the SAS program is being run on.

As long as the data is written and read on the same platform, this format can be used safely. In order to have a uniform cross-platform floating-point format, use either the IEEEw.d or the S370FRBw.d format consistently.

Format: S370FIBw.d, S370FIBUw.d, S370FPIBw.d

Converts SAS numbers to/from IBM System/370 Integer (Fixed) Binary form

Platform independent. Use this format to read data from, or produce data for, platforms that use the IBM fixed-point format, or to ensure portability of the data. Or, you could use the IBRw.d format to ensure portability.

Format: S370FPDw.d, S370FPDUw.d

Converts SAS numbers to/from IBM System/370 Packed Decimal form

Platform independent. Use this format to read data from, or produce data for, platforms that use the IBM packed decimal representation, or to ensure portability of the data.

Format: S370FRBw.d

Converts SAS numbers to/from IBM System/370 Floating Point form

Platform independent. Use this format to read data from, or produce data for, platforms that use the IBM hex floating-point representation, or to ensure portability of the data. Or, you could use the IEEEw.d format to ensure portability.

Format: S370FZDw.d, S370FZDLw.d, S370FZDTw.d, S370FZDUw.d

Converts SAS numbers to/from IBM System/370 Zoned Decimal format

Platform independent. Use this format to read data from, or produce data for, platforms that use the IBM zoned decimal representation, or to ensure portability of the data.

Subroutine: CALL SCAN(string,n,position,length<,delimiters>)

Subroutine: CALL SCANQ(string,n,position,length<,delimiters>)

Function: SCAN(string,n<,delimiters>)

Function: SCANQ(string,n<,delimiters>)

Function: %SCAN(string,n<,delimiters>)

Function: %QSCAN(string,n<,delimiters>)

The default value for the delimiters parameter is dependent on the platform:

EBCDIC: blank . < ( + | & ! $ * ) ; ¬ / , % ¦ ¢

ASCII code pages with circumflex: blank . < ( + & ! $ * ) ; ^ - / , % |

ASCII code pages without circumflex: blank . < ( + & ! $ * ) ; ~ - / , % |

Always specify an explicit value for the delimiter parameter.

Subroutine: CALL SOUND(frequency, duration);

Windows only.

NOT portable. Avoid, or if necessary, encapsulate.

Automatic variable: &SYSCC The condition code that SAS returns to your operating environment.

Preferably, avoid this variable. If it is a necessity, you will need to code platform-sensitive logic.

Automatic variable: &SYSCMD Last unrecognized command from the command line of a macro window.

Avoid. If you are following the injunction against all windowed commands, you won’t need this.

Automatic variable: &SYSDEVIC Contains the name of the current graphics device. These names are platform-dependent.

Portable, but may give different results in each platform.

Automatic variable: &SYSENV Reports whether SAS is running interactively.

Portable.

Automatic variable: &SYSFILRC Contains the return code from the last FILENAME statement

Portable if you only test for zero or non-zero, rather than specific non-zero values.

Function: SYSGET(operating-environment-variable)

Function: %SYSGET(operating-environment-variable)

Returns the value of the specified operating environment variable

The method of setting environment variables prior to invoking SAS differs on each platform

To use this in a portable fashion, you would have to insure that the same environment variables were properly set in all the platforms you run SAS on.

Automatic variable: &SYSINFO Contains return codes provided by some SAS procedures

Portable.

Automatic variable: &SYSJOBID Contains the name of the current batch job or user ID.

The length and format of a User ID may vary from one platform to another.

Portable.

Automatic variable: &SYSLAST Contains the name of the SAS data file created most recently

Portable.

Automatic variable: &SYSLCKRC Contains the return code from the most recent LOCK statement

Portable if you only test for zero or non-zero, rather than specific non-zero values.

Automatic macro variable: &SYSLIBRC Contains the return code from the most recent LIBNAME statement

Portable if you only test for zero or non-zero, rather than specific non-zero values.

Statement: %SYSLPUT Exchange macro variable values between local and remote systems.

Portable.

Automatic variable: &SYSMACRONAME Contains the name of the currently executing macro

Portable.

Automatic variable: &SYSMENV Contains the invocation status of the macro that is currently executing

Portable.

Automatic variable: &SYSMSG Contains the text to display in the message area of a macro window

Portable.

Automatic variable: &SYSNCPU Contains the number of processors available to SAS for computation

Portable.

Automatic variable: &SYSPARM Contains a character string that can be passed from the operating environment to SAS program steps

Portable.

Automatic variable: &SYSPBUFF Contains text supplied as macro parameter values

Portable.

Automatic variable: &SYSPROCESSID Contains the process id of the current SAS process

Portable.

Automatic variable: &SYSPROCESSNAME Contains the process name of the current SAS process

Portable.

Automatic variable: &SYSPROCNAME Contains the name of the procedure (or “DATASTEP” for data steps) currently being processed by the SAS language processor

Portable.

Function: %SYSPROD(product) Reports whether a SAS software product is licensed at the site

Portable.

Autocall macro: %SYSRC(character-string)

Returns a value corresponding to an error condition

Portable. Note: do not confuse this with the automatic variable of the same name.

Automatic variable: &SYSRC Contains the last return code generated by your operating system.

Note: do not confuse this with the autocall macro of the same name.

NOT portable. You shouldn’t have any need for this automatic variable if you are following the recommendation against issuing OS commands.

Statement: %SYSRPUT Exchange macro variable values between local and remote systems.

Portable

Automatic variable: &SYSSCP

Contains a value which identifies the operating system.

For a list of possible values, see Table 13.3, SYSSCP and SYSSCPL Values, in “SYSSCP and SYSSCPL Automatic Macro Variables” in SAS 9.1 Macro Language: Reference

Portable. Can be used to determine the current operating system family for logic which encapsulates platform-specific logic, for instance Unix vs. z/OS.

Automatic variable: &SYSSCPL

Contains a value which is either blank or further identifies the operating system.

For a list of possible values, see Table 13.3, SYSSCP and SYSSCPL Values, in “SYSSCP and SYSSCPL Automatic Macro Variables” in SAS 9.1 Macro Language: Reference

Portable. Can be used to determine the current operating system variant for logic which encapsulates platform-specific logic, for example Windows 95 vs. Windows NT.

Automatic variable: &SYSSITE Contains the number assigned to your site

None. Portable.

Automatic variable: &SYSSTARTID Contains the ID generated from the last STARTSAS statement

May be blank. Portable.

Automatic variable: &SYSSTARTNAME Contains the process name generated from the last STARTSAS statement

May be blank. Portable.

Subroutine: CALL SYSTEM(command);

Statement: X < command >;

Function: SYSTEM(command)

Function: %SYSTEM(command)

Statement: %SYSEXEC < command >;

Runs command in the OS environment.

The value of the command parameter is platform-dependent. The results of command are platform-dependent.

NOT portable.

Alternatives depend on the intended purpose of the command.

Automatic variable: &SYSTIME Contains the time a SAS job or session began executing

None. Portable.

Automatic variable: &SYSUSERID Contains the user ID or login of the current SAS process

The length and format of a User ID may vary from one platform to another.

Portable.

Automatic variable: &SYSVER Contains the release number of SAS software

None. Portable.

Automatic variable: &SYSVLONG Contains the release number and maintenance level of SAS software

None. Portable.

Function: TRANSLATE(source, to-1, from-1<, …to-n, from-n>)

See the SAS documentation for your operating environment for more information.

You must have pairs of to and from arguments on some operating environments. On other operating environments, a segment of the collating sequence replaces null from arguments.

Portable if you specify complete to-from pairs.

Only one pair is ever necessary, and I recommend you restrict yourself to that form.

Function: URLDECODE(argument)

Returns a string that was decoded using the URL escape syntax

Characters specified by an escape sequence are assumed to be in ASCII encoding. On an EBCDIC platform, SAS uses the transport-to-local translation table to convert these characters to their corresponding EBCDIC characters.

Use with care. For more information see “TRANTAB= System Option” on page 1639 of SAS 9.1.3 Language Reference: Dictionary, Third Edition.

Function: WAKEUP(datetime)

Specifies the day and time that execution will ensue

Supported only on the Windows platform.

Avoid. You might be able to replicate this function on other platforms by using the SLEEP function (with appropriate calculations), but SLEEP itself is problematic.

Commands: Windowing commands All SAS Windowing commands act differently on various platforms, due to varying implementations of the interactive environment.

Avoid.

Format: ZDw.d Exact layout is platform-dependent. Consult the companion documentation for the operating environment in question.

As long as the data is written and read on the same platform, this format can be used safely. In order to have a uniform format for zoned decimal across all platforms, use the S370FZDw.d format.

ENDNOTES

1 SAS Publications can be downloaded gratis from http://support.sas.com/documentation/onlinedoc/base/index.html ; the Companion manuals are listed at http://support.sas.com/documentation/onlinedoc/base/index.html#companion
2 Wikipedia, the free encyclopedia. “P-code machine”. http://en.wikipedia.org/wiki/P-code_machine (09 Jul 2009)
3 BCDIC was also used by CDC, prior to, and following, the creation of EBCDIC (see http://en.wikipedia.org/wiki/CDC_3000), and also by the HP 3000 MPE/ix-series computers (see http://docs.hp.com/en/32212-90008/apcs03.html), as well as other computers
4 Wikipedia, the free encyclopedia. “Baudot Code”. http://en.wikipedia.org/wiki/Baudot_code (15 Jun 2009)
5 You can obtain IBM publications in hardcopy (for a fee) and many of them at no charge in softcopy (PDF and/or Bookmaster) format from http://www.elink.ibmlink.ibm.com/publications/servlet/pbi.wss?CTY=US, by clicking on “Search for Publications”
6 Unicode Consortium (1991-2009). “C0 Controls and Basic Latin”. http://www.unicode.org/charts/PDF/U0000.pdf (10 Jul 2009)
7 IEEE is a professional organization for the advancement of technology, see http://www.ieee.org/web/aboutus/home/index.html
8 Wikipedia, the free encyclopedia. “IEEE 754-1985”. http://en.wikipedia.org/wiki/IEEE_754-1985 (09 Jul 2009)
9 Wikipedia, the free encyclopedia. “Cray SV1”. http://en.wikipedia.org/wiki/Cray_SV1 (15 Jul 2009)
10 Wikipedia, the free encyclopedia. “Floating Point”. http://en.wikipedia.org/wiki/Floating_point (09 Jul 2009)
11 Wikipedia, the free encyclopedia. “POWER6”. http://en.wikipedia.org/wiki/POWER6 (09 Jul 2009)
12 Wikipedia, the free encyclopedia. “Z10”. http://en.wikipedia.org/wiki/IBM_System_z10#Decimal_Floating_Point (09 Jul 2009)
13 Morse, Isaacson, Albert (1987). The 80386/387 Architecture. NY, NY: John Wiley & Sons.
14 Wikipedia, the free encyclopedia. “Open Database Connectivity”. http://en.wikipedia.org/wiki/Odbc (15 Jun 2009)
15 SAS Institute Inc (2005). “CALL POKE Routine”, pg 354, SAS® 9.1.3 Language Reference: Dictionary, Third Edition. Cary, NC: SAS Institute.
16 IBM Corp. “ASCII to EBCDIC conversion”. http://www-03.ibm.com/servers/eserver/zseries/zos/unix/bpxa1p03.html (09 Jul 2009)
17 SAS Institute Inc (2005). “LENGTH Statement”, pg 1294, SAS® 9.1.3 Language Reference: Dictionary, Third Edition. Cary, NC: SAS Institute.
18 SAS Institute Inc (2004). “Representation of Numeric Variables”, pg 207, SAS® 9.1.3 Companion for z/OS. Cary, NC: SAS Institute.
19 SAS Institute Inc (2004). “Length and Precision of Variables”, pg 579, SAS® 9.1.3 Companion for Windows. Cary, NC: SAS Institute.
20 SAS Institute Inc (2004). “Using Reserved Operating System Physical Names”, pg 155, SAS® 9.1.3 Companion for Windows. Cary, NC: SAS Institute.
21 SAS Institute Inc (2005). “CALL PEEKC Function”, pg 713, SAS® 9.1.3 Language Reference: Dictionary, Third Edition. Cary, NC: SAS Institute.
22 Check the exact value of this variable with the SAS companion for your particular platform.
23 IBM Corp. “About VSE”. http://www-03.ibm.com/servers/eserver/zseries/zvse/about/ (15 Jul 2009)
24 I wish I had a catchy mnemonic for this… any suggestions? Send them to me at the e-mail address above.