introduction to sas programming - university libraries · introduction to sas programming christina...
TRANSCRIPT
Introduction to SAS
Programming
Christina L. Ughrin
Statistical Software Consulting
Some notes pulled from SAS
Programming I: Essentials Training
SAS Datasets
Examining the structure of SAS
Datasets
SAS Data Sets
Two Sections
Descriptor Section
Data Section
Data Set Descriptor Section
SAS Data Section
Attributes of Variables
Name
e.g. Status
Type
Numeric or Character
e.g. Status in this example is character (T, TT,
PT, or NTT) and Satisfaction is numeric (1 to 5).
SAS Data Set Terminology
Variables – columns in a SAS data set.
Observations – rows in a SAS data set.
Numeric Data – values that are treated as numeric
and may include 8 bytes of floating storage for 16 to
17 significant digits.
Character Data – non numeric data values such as
letters, numbers, special characters, and blanks.
May be stores with a length of 1 to 32, 767 bytes.
One byte is equal to one character.
SAS Data Set and Variable
Name Criteria
Can be 32 characters long.
Can be uppercase, lowercase, or a mixture of
the cases.
Are not case sensitive
Cannot start with number and cannot contain
special characters or blanks.
Must start with a letter or underscore.
SAS Dates
Dates are treated as special kind of numeric data.
They are the number of days since January 1st, 1960.
January 1st 1960 is the 0 point. SAS dates can go back to 1582 (Gregorian Calendar) and forward to the year 20000.
Dates are displayed using a format. There are a number of different date formats supported by SAS.
Time is scored as the number of seconds since midnight. SAS date time is the number of seconds since January 1st, 1960.
Missing Data in SAS
Missing values are valid values.
For character data, missing values are displayed as blanks.
For numeric data, missing values are displayed as periods.
SAS Syntax
SAS Syntax
Statements in SAS are like sentences. The punctuation though is a semicolon( ; )rather than a period ( . )
Most Statements (but not all) start with an identifying key word (e.g. proc, data, label, options, format…)
Statements are strung together into sections similar to paragraphs. These paragraphs in a Windows OS are ended with the word “run” and a semicolon.
Example of SAS Syntax
SAS Syntax Rules
SAS statements are format free.
One or more blanks or special characters are
used to separate words.
They can begin and end in any column.
A single statement can span multiple lines.
Several statements can be on the same line.
Example of SAS Free Format
Using the free-format Syntax
rules of SAS though can make it difficult for others (or you) to read your
program. This is akin to
writing a page of text with little attention to line breaks. You may still have
Capital letters and periods, but where a sentence begins and ends may be a bit confusing.
Example of SAS Formatted
Using the free-format Syntax rules of SAS though can make it difficult for others
(or you) to read your program. This is akin to writing a page of text with little
attention to line breaks. You may still have capital letters and periods, but where a
sentence begins and ends may be a bit confusing. Isn‟t this paragraph a bit
easier to read?
SAS Comments
Type /* to begin a comment.
Type your comment text.
Type */ to end the comment.
Or, type an * at the beginning of a line. Everything
between the * and the ; will be commented.
e.g. *infile „tutor.dat‟;
Alternatively, highlight the text that you would like to
comment and use the keys Ctrl / to comment the
line. To uncomment a line, highlight and use the
Ctrl Shift / keys.
SAS Comments
SAS Windows
SAS Windows
Log
Editor
Explorer
Enhanced Editor Window
Your program script appears in this window.
You can either bring it in from a file or type the program right into the window.
Once the program is in the window, you can Click Submit (or the running guy).
Enhanced
Editor
Output
SAS Log
SAS Log provides a “blow by blow” account of the execution of your program. It includes how many observations were read and output, as well as, errors and notes.
Note the errors in red.
Output Window
SAS Library
SAS Data Libraries are like drawers in a filing cabinet. The SAS data sets are files within those drawers. Note the icons for the SAS library match that metaphor.
In order to assign a “drawer”, you assign a library reference name (libref).
There are two drawers already in your library: work (temporary) and sasuser (permanent).
You can also create your own libraries (drawers) using the libname statement.
Establishing the libname
libname Tina „E:\Trainings\JMP Training‟;
run;
Type the libname
command in the
Enhanced Editor.
Click on the
running icon
Viewtable Window
Data Step Programming
SAS data set can be created using another SAS
data set as input or raw data
To create a SAS data set using another SAS data
set, the DATA and SET statements are used.
To create a SAS data set from raw data, you use
INFILE and INPUT statements.
DATA and SET cannot be used for raw data and
INFILE and INPUT cannot be used for existing SAS
datasets.
Reading a SAS Dataset
DATA (name of new SAS dataset)
SET (name of existing SAS dataset)
Additional statements
Run;
Reading a SAS Dataset
Reading SAS Dataset
Reading Raw Data
Selecting Variables
You can use a DROP or KEEP statement in a
DATA step to control which variables are
written to a new SAS data set.
Selecting Variables
Selecting Variables
Date Functions
Create SAS date values TODAY() – obtains the date value from the system clock
MDY(month,day,year) – uses numeric month, day, and year values to return the corresponding SAS date value.
Extract information from SAS date values YEAR (SAS-date) – extracts the year from a SAS date and
returns a four-digit value for year
QTR (SAS-date) – extracts the quarter from a SAS date and returns a number from 1-4
MONTH (SAS-date) extracts the month from a SAS date and returns a number from 1 to 12
WEEKDAY (SAS-date) – extracts the day of the week and returns a number from 1 to 7
Date Function – Weekday
Function
Proc Univariate
Proc Univariate
Proc Univariate
Proc Univariate
Getting started with
programming
Proc Print
Proc Print – Beginning
Procedures
Examining data using proc print procedure.
Display particular variables of interest.
Display particular observations.
Display a list report with column totals.
Default List Report
Proc print data=train.sastraining;
Run;
Printing Particular Variables
Use the VAR statement which allows you to: Select variables for your
proc print
Define the order of the variables in the proc print.
Proc print data=train.sastraining;
var ID Department Satisfaction;
Run;
Suppressing Obs Column
The NOOBS option
suppresses the number
of observations column
that shows up on the
left hand side of a proc
print output.
Proc print
data=train.sastraining
NOOBS;
Run;
Subsetting Data with the
WHERE Statement
Allows you to select particular observations based on criteria.
Can be used with most SAS procedures (“IF” statements are generally used in the Data step though).
Operands Variables and Observations
Operators Comparisons
Logical,
Special
Functions
Comparison Operators
Mneumonic Symbol Definition
EQ = equal to
NE ^= or ~= not equal to
GT > greater than
LT < less than
GE >= greater than or equal to
LE <= less than or equal to
IN equal to one of a list
Examples of WHERE
Comparison Operators
Proc print
data=train.sastraining
NOOBS;
where department=
„Psychology‟;
Run;
proc print data=train.sastraining
NOOBS;
where department=
'Psychology';
run;
WHERE Logical Operators
And (&) Used if both expressions are true,
then the compound expression is true.
OR (|) Used if either expression is true, then
the compound expression is true.
Not (^) Can be combined with other operators
to reverse the logic of a comparison.
Examples of WHERE Logical
Operators
proc print data=train.sastraining NOOBS;
where department= 'Psychology' and
years>10;
run;
proc print data=train.sastraining NOOBS;
where department= 'Psychology' or
department='Anthropology';
run;
WHERE Special Operators
BETWEEN-AND – Used to select
observations in which the value of the
variable falls within a range of values.
CONTAINS ? – Used when one wants to
select observations that include the specified
substring.
Examples of WHERE Special
Operators
proc print data=train.sastraining NOOBS;
where years between 10 and 15;
run;
proc print data=train.sastraining NOOBS;
where Department ? 'Nurs';
run;
Column Totals
Can provide a Total
Can also provide subtotals if data is printed in
groups.
Example of Column Total
Proc Sort
Overview of Proc Sort
Sorts (arranges) observations of the data set.
Can create a new SAS data set containing rearranged observations.
Can sort on more than one variable at a time.
Sorts ascending (default) and descending.
Does not provide printed output (that requires the proc print statements).
Treats missing data as smallest possible value.
Proc Sort Example
proc sort data=train.sastraining;
by Department;
run;
proc print data=train.sastraining NOOBS;
var Department Satisfaction Years;
run;
Printing Totals and Subtotals Proc
Sort and Proc Print Example
proc sort data=train.sastraining;
by Department;
run;
proc print data=train.sastraining NOOBS;
by Department;
sum years;
run;
Page Breaks with Proc Sort
and Proc Print
proc sort data=train.sastraining;
by Department;
run;
proc print data=train.sastraining NOOBS;
by Department;
Pageby Department;
sum years;
run;