Presentation and Data
http://www.lisa.stat.vt.edu
Short Courses
Intro to SAS
Download Data to Desktop
Mark Seiss, Dept. of Statistics
Introduction to SAS Part 1
February 21, 2011
Reference Material
The Little SAS Book – Delwiche and Slaughter
SAS Programming I: Essentials
SAS Programming II: Manipulating Data with the DATA Step
Presentation and Data
http://www.lisa.stat.vt.edu
Presentation Outline
Part 1
1. Introduction to the SAS Environment
2. Working With SAS Data Sets
Part 2
1. Summary Procedures
2. Basic Statistical Analysis Procedures
Presentation Outline
Questions/Comments
Individual Goals/Interests
Introduction to the SAS Environment
1. SAS Programs
2. SAS Data Sets and Data Libraries
3. SAS System Help
4. Creating SAS Data Sets
SAS Programs
• File extension - .sas
• Editor window has four uses:
  • Access and edit existing SAS programs
  • Write new SAS programs
  • Submit SAS programs for execution
  • Save SAS programs
• SAS program – sequence of steps that the user submits for execution
• Submitting SAS programs
  • Entire program
  • Selection of the program
SAS Programs
• Syntax Rules for SAS statements
  • Free-format – can use upper or lower case
  • Usually begin with an identifying keyword
  • Can span multiple lines
  • Always end with a semicolon
  • Multiple statements can be on the same line
• Errors
  • Misspelled keywords
  • Missing or invalid punctuation (a missing semicolon is common)
  • Invalid options
  • Indicated in the Log window
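The syntax rules above can be sketched in a few lines. The data set name demo and the variable x are made up for illustration.

```sas
/* Keywords are not case-sensitive: DATA, data, and Data are equivalent. */
DATA demo;
   x = 1;
run;

/* One statement may span several lines ... */
proc print
   data=demo
   noobs;
run;

/* ... and several statements may share one line. */
proc print data=demo; run;
```

If the semicolon after DATA demo were left out, SAS would read the next line as part of the DATA statement and would typically report a syntax error in the Log window.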
SAS Programs
• 2 Basic steps in SAS programs:
  • DATA steps
    • Typically used to create SAS data sets and manipulate data
    • Begin with a DATA statement
  • PROC steps
    • Typically used to process SAS data sets
    • Begin with a PROC statement
• The end of a DATA or PROC step is indicated by:
  • RUN statement – most steps
  • QUIT statement – some steps
  • Beginning of another step (DATA or PROC statement)
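A minimal sketch of the two step types and their step boundaries follows. It assumes the SASHELP sample library (which ships with SAS) is available; class_copy is an illustrative name.

```sas
/* DATA step: creates the data set work.class_copy. */
data class_copy;
   set sashelp.class;
run;

/* PROC step: processes the data set; ended by RUN. */
proc print data=class_copy;
run;

/* Some procedures, such as PROC SQL, are ended by QUIT instead. */
proc sql;
   select name, age from class_copy;
quit;
```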
SAS Programs
• Output generated from SAS program – 2 Windows
  • SAS log
    • Information about the processing of the SAS program
    • Includes any warnings or error messages
    • Accumulates in the order the DATA and PROC steps are submitted
  • SAS output
    • Reports generated by the SAS procedures
    • Accumulates output in the order it is generated
SAS Data Sets and Data Libraries
• SAS Data Set
  • Specifically structured file that contains data values.
  • File extension - .sas7bdat
  • Rows and Columns format – similar to Excel
    • Columns – variables in the table corresponding to fields of data
    • Rows – single record or observation
  • Two types of variables
    • Character – contain any value (letters, numbers, symbols, etc.)
    • Numeric – floating point numbers
  • Located in SAS Data Libraries
SAS Data Sets and Data Libraries
• SAS Data Libraries
  • Contain SAS data sets
  • Identified by assigning a library reference name – libref
  • Temporary
    • Work library
    • SAS data files are deleted when session ends
    • Library reference name not necessary (a data set named without a libref is stored in Work by default)
  • Permanent
    • SAS data sets are saved after session ends
    • SASUSER library
    • You can create and access your own libraries
SAS Data Sets and Data Libraries
• SAS Data Libraries cont.
  • Assigning library references
    • Syntax
      LIBNAME libref 'SAS-data-library';
    • Rules for Library References
      • 8 characters or fewer
      • Must begin with letter or underscore
      • Other characters are letters, numbers, or underscores
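As a sketch of the LIBNAME syntax above: the libref mydata and the folder path are placeholders and should be replaced with a folder that exists on your machine.

```sas
/* Assign the libref "mydata" to a folder; the path is a placeholder. */
libname mydata 'C:\Users\yourname\Desktop\sasdata';

/* List the data sets stored in that library. */
proc contents data=mydata._all_ nods;
run;

/* Remove the libref when done (the files themselves are not deleted). */
libname mydata clear;
```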
SAS Data Sets and Data Libraries
• SAS Data Libraries cont.
  • Identifying SAS data sets within SAS Data Libraries
    libref.filename
  • Accessing SAS data sets within SAS Data Libraries
    Example:
    DATA new_data_set;
      set libref.filename;
    run;
  • Creating SAS data sets within SAS Data Libraries
    Example:
    DATA libref.filename;
      set old_data_set;
    run;
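To illustrate two-level names, here is a small sketch that assumes the SASHELP sample library is available; class_copy is an illustrative name. By default a one-level name refers to the temporary Work library, so both steps write the same data set.

```sas
/* One-level name: stored in the Work library by default. */
data class_copy;
   set sashelp.class;
run;

/* Two-level name: writes the same data set as the step above. */
data work.class_copy;
   set sashelp.class;
run;
```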
SAS System Help
• SAS Help and Documentation
  • Help → SAS Help and Documentation
  • Red Book Icon
• SAS Online Help
  • http://support.sas.com/
Creating SAS Data Sets
• Creating SAS data sets from raw data
• 4 methods
1. Importing existing data sets using Import menu option
2. Importing existing raw data in SAS program
3. Manually entering raw data in SAS program
4. Manually entering raw data using Table Editor
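Method 3 (manually entering raw data in a SAS program) might look like the sketch below. The data set fruit and its values are invented for illustration. Method 2 follows the same pattern but uses an INFILE statement pointing at an external file instead of DATALINES.

```sas
data fruit;
   /* The $ marks name as a character variable; price is numeric. */
   input name $ price;
   datalines;
apple 0.50
pear 0.75
;
run;

proc print data=fruit;
run;
```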
Creating SAS Data Sets
• Using the import data menu option
1. File → Import Data
2. Standard data source → select the file format
3. Specify file location or Browse to select file
4. Create name for the new SAS data set and specify location
Creating SAS Data Sets
• Compatible file formats
  • Microsoft Excel Spreadsheets
  • Microsoft Access Databases
  • Comma-Separated Files (.csv)
  • Tab-Delimited Files (.txt)
  • dBASE Files (.dbf)
  • JMP data sets
  • SPSS Files
  • Lotus Spreadsheets
  • Stata Files
  • Paradox Files
Creating SAS Data Sets
• Example Data Sets
  • Excel File – State_SAT_data.xls
    • http://www.stat.ucla.edu/labs/datasets/sat.dat
    • Extracted from 1997 Digest of Education Statistics, an annual publication of the U.S. Department of Education
    • Contains variables that show the relationship between public school expenditure and SAT performance
    • Variables:
      – State (state)
      – Current expenditure per pupil (expend)
      – Average pupil to teacher ratio (PT_ratio)
      – Estimated annual salary of teachers (salary)
      – Percentage of eligible students taking the SAT (students)
      – Average verbal SAT score (verbal)
      – Average math SAT score (math)
      – Average total score (total)
Creating SAS Data Sets
• Example Data Sets Cont.
  • Text file – State_region_data.txt
    • Contains region assignments for each state
      • 1 = New England
      • 2 = Middle Atlantic
      • 3 = East North Central
      • 4 = West North Central
      • 5 = South Atlantic
      • 6 = East South Central
      • 7 = West South Central
      • 8 = Mountain
      • 9 = Pacific
Creating SAS Data Sets
Import State_SAT_data.xls
Assign as work.state_sat_data.sas7bdat
Import State_region_data.txt
Assign as work.state_region_data.sas7bdat
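The same two imports can also be written in code with PROC IMPORT. This is only a sketch under several assumptions: the folder paths are placeholders, both files are assumed to have a header row, the text file is assumed to be tab-delimited, and DBMS=XLS requires the SAS/ACCESS Interface to PC Files to be licensed.

```sas
/* Paths are placeholders; point them at the downloaded files. */
proc import datafile='C:\Users\yourname\Desktop\State_SAT_data.xls'
            out=work.state_sat_data
            dbms=xls
            replace;
   getnames=yes;   /* first row holds the variable names (assumed) */
run;

proc import datafile='C:\Users\yourname\Desktop\State_region_data.txt'
            out=work.state_region_data
            dbms=tab   /* tab-delimited layout is an assumption */
            replace;
   getnames=yes;
run;
```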
Introduction to the SAS Environment
Questions/Comments
Working With SAS Data Sets
1. Data Set Information
2. Data Set Manipulation
3. Data Set Processing
4. Combining Data Sets
A. Concatenating/Appending
B. Merging
5. Saving Data Sets
Data Set Information
• PROC CONTENTS
  • Output contains a table of contents of the specified data set
  • Data Set Information
    • Data set name
    • Number of observations
    • Number of variables
  • Variable Information
    • Type (numeric or character)
    • Length
  • Syntax:
    PROC CONTENTS DATA=input_data_set;
    RUN;
Data Set Information
Assignment
Obtain Data Set Information for work.state_sat_data and work.state_region_data
Data Set Information
Solution
proc contents data=state_sat_data;
run;
proc contents data=state_region_data;
run;
Data Set Manipulation
• Create a new SAS data set using an existing SAS data set as input
  • Specify name of the new SAS data set after the DATA statement
  • Use SET statement to identify SAS data set being read
  • Syntax:
    DATA output_data_set;
      SET input_data_set;
      <additional SAS statements>;
    RUN;
  • By default the SET statement reads all observations and variables from the input data set into the output data set.
Data Set Manipulation
• Assignment Statements
  • Evaluate an expression
  • Assign resulting value to a variable
  • General Form: variable = expression;
  • Example: miles_per_hour = distance/time;
• SAS Functions
  • Perform arithmetic functions, compute simple statistics, manipulate dates, etc.
  • General Form: variable = function_name(argument1, argument2, …);
  • Example: Time_worked = sum(Day1, Day2, Day3, Day4, Day5);
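A short sketch comparing the SUM function with plain addition follows. The data set work_week and its values are invented for illustration.

```sas
data work_week;
   input day1-day5;
   /* SUM ignores missing values. */
   time_worked = sum(day1, day2, day3, day4, day5);
   /* The + operator returns missing if any term is missing. */
   time_added  = day1 + day2 + day3 + day4 + day5;
   datalines;
8 8 8 8 8
8 . 8 8 4
;
run;

proc print data=work_week;
run;
```

For the second row, time_worked is 28 while time_added is missing, which is one reason to prefer the function when some days may be blank.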
Data Set Manipulation
• Selecting Variables
  • Use DROP and KEEP to determine which variables are written to new SAS data set.
  • 2 Ways
    • DROP and KEEP as statements
      – Form: DROP Variable1 Variable2;
              KEEP Variable3 Variable4 Variable5;
    • DROP and KEEP options in SET statement
      – Form: SET input_data_set (KEEP=Var1);
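Both forms can be sketched against the SASHELP.CLASS sample data set (assumed to be available; it contains name, sex, age, height, and weight). The output data set names are illustrative.

```sas
/* KEEP= option on SET: only name and age are read into the step. */
data ages;
   set sashelp.class (keep=name age);
run;

/* DROP statement: all variables are read, but height and weight
   are not written to the output data set. */
data no_size;
   set sashelp.class;
   drop height weight;
run;
```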
Data Set Manipulation
• Conditional Processing
  • Uses IF-THEN-ELSE logic
  • General Form:
    IF <expression1> THEN <statement>;
    ELSE IF <expression2> THEN <statement>;
    ELSE <statement>;
  • <expression> is a true/false condition, such as:
    • Day1=Day2, Day1 > Day2, Day1 < Day2
    • Day1+Day2=10
    • Sum(day1,day2)=10
    • Day1=5 and Day2=5
Data Set Manipulation
• Conditional Processing
Symbolic    Mnemonic         Example
=           EQ               IF region = 'Spain';
~= or ^=    NE               IF region ne 'Spain';
>           GT               IF rainfall > 20;
<           LT               IF rainfall lt 20;
>=          GE               IF rainfall ge 20;
<=          LE               IF rainfall <= 20;
&           AND              IF rainfall ge 20 & temp < 90;
| or !      OR               IF rainfall ge 20 OR temp < 90;
            IS NOT MISSING   WHERE region IS NOT MISSING;
            BETWEEN AND      WHERE region BETWEEN 'Plain' AND 'Spain';
            CONTAINS         WHERE region CONTAINS 'ain';
            IN               IF region IN ('Rain', 'Spain', 'Plain');
• IS NOT MISSING, BETWEEN AND, and CONTAINS are valid only in WHERE expressions (WHERE statement or WHERE= option), not in IF statements
Data Set Manipulation
• Conditional Processing cont.
  • If <expression1> is true, <statement> is processed
  • ELSE IF and ELSE are only processed if <expression1> is false
  • Only one statement can be specified using this form
  • Use DO and END statements to execute a group of statements
  • General Form:
    IF <expression> THEN DO;
      <statements>;
    END;
    ELSE DO;
      <statements>;
    END;
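A small sketch of the DO/END form, using the SASHELP.CLASS sample data set (assumed available). The variables group and senior and the cutoff of 14 are invented for illustration.

```sas
data class_groups;
   set sashelp.class;
   /* Declare the length first so 'younger' is not truncated to 5 characters. */
   length group $ 8;
   if age >= 14 then do;
      group  = 'older';
      senior = 1;
   end;
   else do;
      group  = 'younger';
      senior = 0;
   end;
run;
```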
Data Set Manipulation
• Subsetting Rows (Observations)
  • We will look at two ways
    • Using IF statement
    • Using WHERE option in SET statement
• IF statement
  • Writes to the new data set only the observations for which the expression is true
  • General Form: IF <expression>;
  • Examples:
    IF career = 'Teacher';
    IF sex ne 'M';
  • In the second example, only observations where sex is not equal to 'M' will be written to the output data set
Data Set Manipulation
• Subsetting Rows (Observations) cont.
• WHERE option in SET statement
  • Use this option to read only the rows from the input data set for which the expression is true
  • General Form: SET input_data_set (where=(<expression>));
  • Example: SET vacation (where=(destination='Bermuda'));
  • Only observations where the destination equals 'Bermuda' will be read from the input data set
• Comparison
  • Resulting output data set is equivalent
  • IF statement – all rows read from the input data set
  • WHERE option – only rows where expression is true are read from input data set
  • Difference in processing time when working with big data sets
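The comparison above can be tried with the SASHELP.CLASS sample data set (assumed available); the output data set names are illustrative. Both steps produce the same rows, but the notes in the log should report a different number of observations read.

```sas
/* Subsetting IF: every row is read, then rows failing the test are discarded. */
data girls_if;
   set sashelp.class;
   if sex = 'F';
run;

/* WHERE= option: only rows passing the test are read. */
data girls_where;
   set sashelp.class (where=(sex = 'F'));
run;
```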
Data Set Manipulation
• Assignments
1. Create new data set work.state_SAT_data2 from work.state_SAT_data
  • Assign new variable upper_ind
  • If total > 1000 then upper_ind=1
  • Otherwise upper_ind=0
2. Create new data set work.south from work.state_region_data
  • Specify that work.south contains only records from regions 5, 6, or 7
  • Specify that work.south contains only the state variable
Data Set Manipulation
• Solutions
1.
data state_sat_data2;
  set state_sat_data;
  if total>1000 then upper_ind=1;
  else upper_ind=0;
run;
Data Set Manipulation
• Solutions
2.
data south;
  set state_region_data;
  if region=5 or region=6 or region=7;
  keep state;
run;
OR
data south;
  set state_region_data(where=(region=5 or region=6 or region=7));
  keep state;
run;
Data Set Manipulation
• PROC SORT sorts data according to specified variables
• General Form:
  PROC SORT DATA=input_data_set <options>;
    BY Variable1 Variable2;
  RUN;
• Sorts data by Variable1 and then, within Variable1, by Variable2
• By default, SAS sorts data in ascending order
  • Numbers low to high
  • Letters A to Z
• Use the DESCENDING keyword before a variable in the BY statement for numbers high to low and letters Z to A
  • BY City DESCENDING Population;
  • SAS sorts data first by City A to Z and then by Population high to low
Data Set Manipulation
• Some PROC SORT Options
  • NODUPKEY
    • Eliminates observations that have the same values for the BY variables (the first observation in each BY group is kept)
  • OUT=output_data_set
    • By default, PROC SORT replaces the input data set with the sorted data set
    • Using this option, PROC SORT creates a new sorted data set and the input data set remains unchanged
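A sketch of these options on the SASHELP.CLASS sample data set (assumed available). OUT= is used in both steps so that the sample data set itself is left untouched; the output names are illustrative.

```sas
/* Sort by sex A to Z and, within sex, by age high to low. */
proc sort data=sashelp.class out=class_sorted;
   by sex descending age;
run;

/* NODUPKEY: keep only the first observation for each value of sex. */
proc sort data=sashelp.class out=one_per_sex nodupkey;
   by sex;
run;
```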
Data Set Processing
• DATA steps read in data from existing data sets or raw data files one row at a time, like a loop
• DATA step reads data from the input data set in the following way:
1. Read the current row from the input data set into the Program Data Vector (PDV)
2. Process SAS statements
3. Write the PDV to the output data set
4. Set current row to the next row in the input data set
5. Return to Step 1
• One row at a time is processed
• Thus we cannot simply add the value of a variable in one row to the value in another row
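The loop described above can be observed in the log with a minimal sketch like this one, which assumes the SASHELP.CLASS sample data set is available. The automatic variable _N_ counts the iterations of the DATA step.

```sas
/* _NULL_ means no output data set is created; the step only writes to the log. */
data _null_;
   set sashelp.class (obs=3);   /* read only the first three rows */
   put _n_= name= age=;         /* one log line per iteration */
run;
```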
Data Set Processing
• Data Set Processing – Example
• Consider the following submitted code:
data state_sat_data2;
set state_sat_data;
if total>1000 then upper_ind=1;
else upper_ind=0;
run;
Data Set Processing
• Data Set Processing – Example
• Execution of the DATA step ("Current" marks the statement being executed)
            data state_sat_data2;
  Current → set state_sat_data;
            if total>1000 then upper_ind=1;
            else upper_ind=0;
            run;
PDV
State    Expend  PT_ratio  Salary  Students  Verbal  Math  Total  Upper_ind
Alabama  4.405   17.2      31.144  8         491     538   1029   .
State_sat_data2
State    Expend  PT_ratio  Salary  Students  Verbal  Math  Total  Upper_ind
(no observations yet)
Data Set Processing
• Data Set Processing – Example
• Execution of the DATA step
            data state_sat_data2;
            set state_sat_data;
  Current → if total>1000 then upper_ind=1;
            else upper_ind=0;
            run;
PDV
State    Expend  PT_ratio  Salary  Students  Verbal  Math  Total  Upper_ind
Alabama  4.405   17.2      31.144  8         491     538   1029   1
State_sat_data2
State    Expend  PT_ratio  Salary  Students  Verbal  Math  Total  Upper_ind
(no observations yet)
Data Set Processing
• Data Set Processing – Example
• Execution of the DATA step
            data state_sat_data2;
            set state_sat_data;
            if total>1000 then upper_ind=1;
            else upper_ind=0;
  Current → run;
PDV
State    Expend  PT_ratio  Salary  Students  Verbal  Math  Total  Upper_ind
Alabama  4.405   17.2      31.144  8         491     538   1029   1
State_sat_data2
State    Expend  PT_ratio  Salary  Students  Verbal  Math  Total  Upper_ind
Alabama  4.405   17.2      31.144  8         491     538   1029   1
Data Set Processing
• Data Set Processing – Example
• Execution of the DATA step
  Current → data state_sat_data2;
            set state_sat_data;
            if total>1000 then upper_ind=1;
            else upper_ind=0;
            run;
PDV
State    Expend  PT_ratio  Salary  Students  Verbal  Math  Total  Upper_ind
Alabama  4.405   17.2      31.144  8         491     538   1029   .
State_sat_data2
State    Expend  PT_ratio  Salary  Students  Verbal  Math  Total  Upper_ind
Alabama  4.405   17.2      31.144  8         491     538   1029   1
Data Set Processing
• Data Set Processing – Example
• Execution of the DATA step
            data state_sat_data2;
  Current → set state_sat_data;
            if total>1000 then upper_ind=1;
            else upper_ind=0;
            run;
PDV
State    Expend  PT_ratio  Salary  Students  Verbal  Math  Total  Upper_ind
Alaska   8.963   17.6      47.951  47        445     489   934    .
State_sat_data2
State    Expend  PT_ratio  Salary  Students  Verbal  Math  Total  Upper_ind
Alabama  4.405   17.2      31.144  8         491     538   1029   1
Data Set Processing
• Data Set Processing – Example
• Execution of the DATA step
            data state_sat_data2;
            set state_sat_data;
            if total>1000 then upper_ind=1;
  Current → else upper_ind=0;
            run;
PDV
State    Expend  PT_ratio  Salary  Students  Verbal  Math  Total  Upper_ind
Alaska   8.963   17.6      47.951  47        445     489   934    0
State_sat_data2
State    Expend  PT_ratio  Salary  Students  Verbal  Math  Total  Upper_ind
Alabama  4.405   17.2      31.144  8         491     538   1029   1
Data Set Processing
• Data Set Processing – Example
• Execution of the DATA step
            data state_sat_data2;
            set state_sat_data;
            if total>1000 then upper_ind=1;
            else upper_ind=0;
  Current → run;
PDV
State    Expend  PT_ratio  Salary  Students  Verbal  Math  Total  Upper_ind
Alaska   8.963   17.6      47.951  47        445     489   934    0
State_sat_data2
State    Expend  PT_ratio  Salary  Students  Verbal  Math  Total  Upper_ind
Alabama  4.405   17.2      31.144  8         491     538   1029   1
Alaska   8.963   17.6      47.951  47        445     489   934    0
Combining Data Sets
• Concatenating (or Appending)
  • Stacks each data set upon the other
  • If one data set does not have a variable that the other data sets do, the variable in the new data set is set to missing for the observations from that data set.
  • General Form:
    DATA output_data_set;
      SET data1 data2;
    RUN;
  • PROC APPEND may also be used
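A self-contained sketch of both approaches follows. The data sets q1, q2, and q3 and all of their values are invented for illustration.

```sas
data q1;
   input month sales;
   datalines;
1 8223
2 6034
;
run;

data q2;
   input month sales goal;
   datalines;
4 5100 5000
;
run;

data q3;
   input month sales goal;
   datalines;
5 4800 5000
;
run;

/* Stack q1 on top of q2; goal is missing for the rows that came from q1. */
data all_sales;
   set q1 q2;
run;

/* PROC APPEND adds the rows of DATA= to the end of the BASE= data set. */
proc append base=all_sales data=q3;
run;
```

The DATA step rewrites the whole output data set, while PROC APPEND only adds rows to the end of an existing one, which can matter when the base data set is large.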
Combining Data Sets
• Merging Data Sets
  • One-to-One Match Merge
    • A single record in a data set corresponds to a single record in all other data sets
    • Example: Patient and Billing Information
  • One-to-Many Match Merge
    • Matching one observation from one data set to multiple observations in other data sets
    • Example: County and State Information
• Note: Data sets must be sorted by the BY variables (PROC SORT) before they can be match-merged
Combining Data Sets
• One-to-One Match Merge
  • Usually need at least one common variable between data sets – matching purposes
  • For the example, a patient ID would be needed
  • Do not need common variable if all data sets are in exactly the same order
  • General Form:
    DATA output_data_set;
      MERGE input_data_set1 input_data_set2;
      BY variable1 variable2;
    RUN;
Combining Data Sets
• One-to-One Match Merge
• Example:
Performance
Month  Sales
1      8223
2      6034
3      4220
Goals
Month  Goal
1      9000
2      6000
3      5000
Code:
DATA compare;
  MERGE performance goals;
  BY month;
  difference=sales-goal;
RUN;
Combining Data Sets
• One-to-One Match Merge
• Example cont.:
Compare
Month  Sales  Goal  Difference
1      8223   9000  -777
2      6034   6000  34
3      4220   5000  -780
Combining Data Sets
• One-to-Many Match Merge
  • Requires at least one common variable in the data sets for matching purposes
  • For the example, State information is in both the state and county files
  • If two data sets have variables with the same name, the variable in the second data set will overwrite the variable in the first.
  • General Form:
    DATA output_data_set;
      MERGE Data1 Data2 Data3;
      BY Variable1 Variable2;
    RUN;
Combining Data Sets
• One-to-Many Match Merge
• Example:
Videos
Category  Sales
Aerobics  12.99
Aerobics  13.99
Aerobics  13.99
Step      12.99
Step      12.99
Weights   15.99
Adjustment
Category  Adjustment
Aerobics  .20
Step      .30
Weights   .25
Code:
DATA prices;
  MERGE videos adjustment;
  BY category;
  NewPrice=(1-adjustment)*sales;
RUN;
Combining Data Sets
• One-to-Many Match Merge
• Example cont.:
Prices
Category  Sales  Adjustment  NewPrice
Aerobics  12.99  .20         10.39
Aerobics  13.99  .20         11.19
Aerobics  13.99  .20         11.19
Step      12.99  .30         9.09
Step      12.99  .30         9.09
Weights   15.99  .25         11.99
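For readers who want to run the one-to-many example end to end, here is a self-contained sketch. It enters a few of the rows shown above with DATALINES, deliberately out of order, so the PROC SORT steps are needed before the merge.

```sas
data videos;
   input category $ sales;
   datalines;
Aerobics 12.99
Step 12.99
Aerobics 13.99
Weights 15.99
;
run;

data adjustment;
   input category $ adjustment;
   datalines;
Aerobics .20
Step .30
Weights .25
;
run;

/* Both inputs must be sorted by the BY variable before a match merge. */
proc sort data=videos;     by category; run;
proc sort data=adjustment; by category; run;

data prices;
   merge videos adjustment;
   by category;
   /* Each adjustment row is matched to every videos row in its category. */
   newprice = (1 - adjustment) * sales;
run;

proc print data=prices;
run;
```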
Combining Data Sets
• Assignment
Create the dataset work.state_data
Merge work.state_sat_data2 with work.state_region_data by the state variable
Combining Data Sets
• Solution
proc sort data=state_sat_data2;
by state;
run;
proc sort data=state_region_data;
by state;
run;
data state_data;
merge state_sat_data2 state_region_data;
by state;
run;
Saving Data Sets
• Save as SAS data set (.sas7bdat)
  LIBNAME libref "destination folder";
  DATA libref.filename;
    SET current_name;
    <optional statements>;
  RUN;
• Other Formats
1. File → Export Data
2. Specify SAS data set
3. Standard data source → select the file format
4. Specify File Folder and Filename
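Both ways of saving can be sketched in code. The libref mydata and the folder paths are placeholders, and work.state_data is assumed to exist from the merge assignment. PROC EXPORT is the code counterpart of the Export Data menu option.

```sas
/* Placeholder folder; it must already exist. */
libname mydata 'C:\Users\yourname\Desktop\sasdata';

/* Permanent copy: written as state_data.sas7bdat in that folder. */
data mydata.state_data;
   set work.state_data;
run;

/* Export to a comma-separated file instead. */
proc export data=work.state_data
            outfile='C:\Users\yourname\Desktop\state_data.csv'
            dbms=csv
            replace;
run;
```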
Working With SAS Data Sets
Questions/Comments
Attendee Questions
If time permits