Introduction to SAS Programming

Christina L. Ughrin

Statistical Software Consulting

Some notes pulled from SAS Programming I: Essentials Training

SAS Datasets

Examining the structure of SAS Datasets

SAS Data Sets

Two Sections

Descriptor Section

Data Section

Data Set Descriptor Section

SAS Data Section

Attributes of Variables

Name

e.g. Status

Type

Numeric or Character

e.g. Status in this example is character (T, TT, PT, or NTT) and Satisfaction is numeric (1 to 5).

SAS Data Set Terminology

Variables – columns in a SAS data set.

Observations – rows in a SAS data set.

Numeric Data – values that are treated as numbers. By default they are stored as floating-point values in 8 bytes, which allows 16 to 17 significant digits.

Character Data – non-numeric data values made up of letters, numerals, special characters, and blanks. May be stored with a length of 1 to 32,767 bytes. One byte is equal to one character.

SAS Data Set and Variable Name Criteria

Can be up to 32 characters long.

Can be uppercase, lowercase, or a mixture of the cases.

Are not case sensitive.

Must start with a letter or underscore; cannot start with a number.

Cannot contain blanks or special characters other than the underscore.

SAS Dates

Dates are treated as a special kind of numeric data.

They are stored as the number of days since January 1st, 1960.

January 1st, 1960 is the 0 point; earlier dates are stored as negative numbers. SAS dates can go back to 1582 (Gregorian calendar) and forward to the year 20,000.

Dates are displayed using a format. SAS supports a number of different date formats.

Time is stored as the number of seconds since midnight. A SAS datetime value is the number of seconds since midnight on January 1st, 1960.
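The way dates, times, and datetimes are stored can be checked directly in the log. The following is a minimal sketch, not taken from the original slides; the variable names d, t, and dt are made up for illustration.

```sas
/* SAS date, time, and datetime values are plain numbers. */
data _null_;
    d  = '02JAN1960'd;            /* date literal: 1 day after the zero point */
    t  = '00:01:00't;             /* time literal: 60 seconds after midnight */
    dt = '01JAN1960:00:01:00'dt;  /* datetime literal: 60 seconds after the zero point */
    put d= t= dt=;                /* writes the raw stored numbers to the log */
    put d= date9.;                /* same date value, displayed with the DATE9. format */
run;
```

The first PUT statement writes d=1 t=60 dt=60, and the second writes d=02JAN1960: the stored value does not change, only the format used to display it.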

Missing Data in SAS

Missing values are valid values.

For character data, missing values are displayed as blanks.

For numeric data, missing values are displayed as periods.
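As a small illustration (a sketch, not from the original slides), the step below creates one observation with a missing character value and a missing numeric value. The data set name work.missing_demo is hypothetical; Status and Satisfaction are borrowed from the earlier example.

```sas
data work.missing_demo;
    length Status $ 3;
    Status = ' ';          /* character missing value: a blank */
    Satisfaction = .;      /* numeric missing value: a period */
    /* MISSING() returns 1 when its argument is missing */
    if missing(Satisfaction) then put 'Satisfaction is missing';
run;

proc print data=work.missing_demo;
run;
```

In the printed output, Status shows as a blank and Satisfaction shows as a period.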

SAS Syntax

Statements in SAS are like sentences. The punctuation, though, is a semicolon ( ; ) rather than a period ( . ).

Most statements (but not all) start with an identifying keyword (e.g. proc, data, label, options, format…). Assignment statements are the main exception.

Statements are strung together into sections, called steps, similar to paragraphs. A step normally ends with the word "run" and a semicolon.
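To make the "sentences and paragraphs" picture concrete, here is a minimal sketch of a short program. It is not the program from the original slide; the data set work.scores and its variables are invented for illustration.

```sas
options nodate;                    /* global statement, keyword OPTIONS */

/* DATA step: one "paragraph" */
data work.scores;
    input ID Score;                /* keyword INPUT */
    Bonus = Score * 0.1;           /* assignment statement: no leading keyword */
    datalines;
1 80
2 95
;
run;

/* PROC step: another "paragraph" */
proc print data=work.scores label;
    label Bonus = 'Bonus points';  /* keyword LABEL */
    format Bonus 5.1;              /* keyword FORMAT */
run;
```

Every statement ends with a semicolon, and each step ends with run;.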

Example of SAS Syntax

SAS Syntax Rules

SAS statements are free-format.

One or more blanks or special characters are used to separate words.

Statements can begin and end in any column.

A single statement can span multiple lines.

Several statements can be on the same line.

Example of SAS Free Format

Using the free-format Syntax

rules of SAS though can make it difficult for others (or you) to read your

program. This is akin to

writing a page of text with little attention to line breaks. You may still have

Capital letters and periods, but where a sentence begins and ends may be a bit confusing.

Example of SAS Formatted

Using the free-format syntax rules of SAS, though, can make it difficult for others (or you) to read your program. This is akin to writing a page of text with little attention to line breaks. You may still have capital letters and periods, but where a sentence begins and ends may be a bit confusing. Isn't this paragraph a bit easier to read?
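The same contrast can be shown with code. The sketch below is an illustration rather than the slide's own example; it assumes the sashelp.class sample data set that ships with SAS. Both steps are valid and produce the same report.

```sas
/* Free format: valid, but hard to follow */
proc print data=sashelp.class noobs; var Name
Age; where Age
> 13; run;

/* The same step with one statement per line and indentation */
proc print data=sashelp.class noobs;
    var Name Age;
    where Age > 13;
run;
```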

SAS Comments

Type /* to begin a comment.

Type your comment text.

Type */ to end the comment.

Or, type an * at the beginning of a statement. Everything between the * and the next ; is treated as a comment.

e.g. *infile 'tutor.dat';

Alternatively, highlight the lines that you would like to comment out and press Ctrl+/. To uncomment them, highlight them and press Ctrl+Shift+/.
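Both comment styles can be mixed in one program. A minimal sketch, with a made-up data set name work.example:

```sas
/* Block comment: it can span
   several lines */
data work.example;
    x = 1;       /* a block comment can also follow a statement */
    * This statement-style comment ends at the semicolon;
    *y = 2;
run;
```

The line *y = 2; is commented out, so work.example contains only the variable x.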

SAS Windows

Log

Editor

Explorer

Enhanced Editor Window

Your program appears in this window.

You can either open it from a file or type the program directly into the window.

Once the program is in the window, click Submit (the running-person icon on the toolbar) to run it.

Screenshot labels: Enhanced Editor, Output

SAS Log

The SAS Log provides a "blow by blow" account of the execution of your program. It includes how many observations were read and written, as well as errors and notes.

Errors appear in red.

Output Window

SAS Library

SAS data libraries are like drawers in a filing cabinet, and SAS data sets are the files within those drawers. The icons for SAS libraries match that metaphor.

To use a "drawer", you assign it a library reference name (libref).

Two libraries are already defined when SAS starts: work (temporary; its contents are deleted when the session ends) and sasuser (permanent).

You can also create your own libraries (drawers) using the libname statement.

Establishing the libname

libname Tina 'E:\Trainings\JMP Training';
run;

Type the libname statement in the Enhanced Editor, then click the running icon to submit it.

Viewtable Window

Data Step Programming

A SAS data set can be created using either another SAS data set or raw data as input.

To create a SAS data set from another SAS data set, use the DATA and SET statements.

To create a SAS data set from raw data, use the DATA statement with INFILE and INPUT statements.

SET cannot be used to read raw data, and INFILE and INPUT cannot be used to read existing SAS data sets.

Reading a SAS Dataset

DATA name-of-new-SAS-dataset;
SET name-of-existing-SAS-dataset;
additional statements
RUN;
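A concrete instance of this pattern might look like the sketch below. It assumes the libref train has already been assigned and that train.sastraining (the data set used in the later proc print examples) contains a numeric variable Years; the output data set work.senior and the variable Senior are hypothetical.

```sas
/* Assumption: libref train is assigned and train.sastraining exists */
data work.senior;
    set train.sastraining;    /* read each observation from the existing data set */
    Senior = (Years > 10);    /* additional statement: 1 if Years > 10, otherwise 0 */
run;
```

The new data set holds every variable from train.sastraining plus Senior, and the original data set is left unchanged.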

Reading Raw Data
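The original slide's example is not reproduced in this transcript. As a stand-in, here is a minimal sketch of the INFILE and INPUT pattern; the data set work.faculty, the variables, and the three data lines are invented for illustration. It reads in-stream lines so that it runs without an external file.

```sas
data work.faculty;
    infile datalines dlm=',';           /* raw data source; comma-delimited */
    input ID Department :$20. Years;    /* Department is character, up to 20 bytes */
    datalines;
1,Psychology,12
2,Anthropology,7
3,Nursing,15
;
run;

proc print data=work.faculty;
run;
```

To read an external file instead, the INFILE statement would name the file in quotes (as in the earlier infile 'tutor.dat' example) and the DATALINES block would be dropped.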

Selecting Variables

You can use a DROP or KEEP statement in a DATA step to control which variables are written to a new SAS data set.
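A sketch of both statements, assuming the libref train is assigned and that train.sastraining contains the variables ID, Department, Satisfaction, and Years that appear in the later examples; the output data set names are hypothetical.

```sas
/* KEEP: list the variables to write */
data work.keep_demo;
    set train.sastraining;
    keep ID Department Satisfaction;
run;

/* DROP: list the variables to leave out */
data work.drop_demo;
    set train.sastraining;
    drop Years;
run;
```

KEEP is usually shorter when only a few variables are wanted, and DROP when only a few are unwanted.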

Date Functions

Create SAS date values

TODAY() – obtains the date value from the system clock.

MDY(month,day,year) – uses numeric month, day, and year values to return the corresponding SAS date value.

Extract information from SAS date values

YEAR(SAS-date) – extracts the year from a SAS date and returns a four-digit value for the year.

QTR(SAS-date) – extracts the quarter from a SAS date and returns a number from 1 to 4.

MONTH(SAS-date) – extracts the month from a SAS date and returns a number from 1 to 12.

WEEKDAY(SAS-date) – extracts the day of the week and returns a number from 1 to 7 (1 = Sunday, 7 = Saturday).
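The functions above can be tried out in a DATA step that writes to the log. This is a minimal sketch with made-up variable names, not the slide's own example.

```sas
data _null_;
    d = mdy(4, 12, 2018);    /* SAS date value for 12 April 2018 */
    y = year(d);             /* 2018 */
    q = qtr(d);              /* 2 */
    m = month(d);            /* 4 */
    w = weekday(d);          /* 5, a Thursday (1 = Sunday) */
    now = today();           /* current date from the system clock */
    put d= date9. y= q= m= w=;
    put now= date9.;
run;
```

The first PUT statement writes d=12APR2018 y=2018 q=2 m=4 w=5; the second depends on the day the program is run.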

Date Function – Weekday Function

Proc Univariate
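The screenshots for these slides are not part of the transcript. A minimal call might look like the sketch below, which assumes the libref train is assigned and that Satisfaction and Years are numeric variables in train.sastraining, as the other examples suggest.

```sas
/* Descriptive statistics (moments, quantiles, extreme values)
   for each variable listed in the VAR statement */
proc univariate data=train.sastraining;
    var Satisfaction Years;
run;
```

Without a VAR statement, proc univariate analyzes every numeric variable in the data set.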

Getting started with programming

Proc Print

Proc Print – Beginning Procedures

Examining data using the proc print procedure:

Display particular variables of interest.

Display particular observations.

Display a list report with column totals.

Default List Report

proc print data=train.sastraining;
run;

Printing Particular Variables

Use the VAR statement, which allows you to:

Select the variables for your proc print.

Define the order of the variables in the proc print.

proc print data=train.sastraining;
    var ID Department Satisfaction;
run;

Suppressing Obs Column

The NOOBS option suppresses the Obs column (the observation number) that appears on the left-hand side of proc print output.

proc print data=train.sastraining NOOBS;
run;

Subsetting Data with the WHERE Statement

Allows you to select particular observations based on criteria.

Can be used with most SAS procedures (in a DATA step, subsetting is generally done with IF statements instead).

A WHERE expression is built from operands and operators:

Operands: variables and constants

Operators: comparison, logical, and special operators, and functions

Comparison Operators

Mnemonic   Symbol      Definition
EQ         =           equal to
NE         ^= or ~=    not equal to
GT         >           greater than
LT         <           less than
GE         >=          greater than or equal to
LE         <=          less than or equal to
IN         (none)      equal to one of a list
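Mnemonics and symbols are interchangeable, and IN takes a parenthesized list of values. The sketch below assumes that train.sastraining is the same data set as in the earlier Status example, so that Status holds values such as T and TT and Years is numeric; that is an assumption, not something the slides state.

```sas
/* IN: keep observations whose Status matches any value in the list */
proc print data=train.sastraining noobs;
    where Status in ('T', 'TT');
run;

/* GE is the mnemonic form of >= */
proc print data=train.sastraining noobs;
    where Years ge 10;
run;
```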

Examples of WHERE Comparison Operators

proc print data=train.sastraining NOOBS;
    where department = 'Psychology';
run;

WHERE Logical Operators

AND (&) – if both expressions are true, then the compound expression is true.

OR (|) – if either expression is true, then the compound expression is true.

NOT (^) – can be combined with other operators to reverse the logic of a comparison.

Examples of WHERE Logical Operators

proc print data=train.sastraining NOOBS;
    where department = 'Psychology' and years > 10;
run;

proc print data=train.sastraining NOOBS;
    where department = 'Psychology' or department = 'Anthropology';
run;

WHERE Special Operators

BETWEEN-AND – selects observations in which the value of the variable falls within a range of values (the endpoints are included).

CONTAINS (?) – selects observations in which the value includes the specified substring (the comparison is case sensitive).

Examples of WHERE Special Operators

proc print data=train.sastraining NOOBS;
    where years between 10 and 15;
run;

proc print data=train.sastraining NOOBS;
    where Department ? 'Nurs';
run;

Column Totals

The SUM statement in proc print provides a total for each numeric variable it lists.

It also provides subtotals if the data are printed in groups (with a BY statement).

Example of Column Total

Proc Sort

Overview of Proc Sort

Sorts (arranges) the observations of a data set.

Can create a new SAS data set containing the rearranged observations (with the OUT= option); otherwise the sorted observations replace the input data set.

Can sort on more than one variable at a time.

Sorts in ascending (default) or descending order.

Does not provide printed output (that requires a separate proc print step).

Treats missing data as the smallest possible value.
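Several of these points can be combined in one step. The sketch below assumes the libref train is assigned and that train.sastraining contains Department, Years, and Satisfaction; the output data set work.by_dept is a hypothetical name.

```sas
/* OUT= writes the sorted copy to a new data set and leaves the input as is.
   DESCENDING applies only to the variable that follows it. */
proc sort data=train.sastraining out=work.by_dept;
    by Department descending Years;
run;

/* proc sort prints nothing, so print the result separately */
proc print data=work.by_dept noobs;
    var Department Years Satisfaction;
run;
```

Within each department, observations with a missing Years value come last here, because missing sorts as the smallest value and the order is descending.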

Proc Sort Example

proc sort data=train.sastraining;
    by Department;
run;

proc print data=train.sastraining NOOBS;
    var Department Satisfaction Years;
run;

Printing Totals and Subtotals: Proc Sort and Proc Print Example

proc sort data=train.sastraining;
    by Department;
run;

proc print data=train.sastraining NOOBS;
    by Department;
    sum years;
run;

Page Breaks with Proc Sort and Proc Print

proc sort data=train.sastraining;
    by Department;
run;

proc print data=train.sastraining NOOBS;
    by Department;
    pageby Department;
    sum years;
run;