sas tips & tricks

31
SAS Tips & Tricks Brought to you by your fellow MASUG members

Upload: luann

Post on 30-Jan-2016

69 views

Category:

Documents


0 download

DESCRIPTION

SAS Tips & Tricks. Brought to you by your fellow MASUG members. Jun Tang, Data Analyst, MS. UT Health Science center Missing Data in Multivariate Analysis. Missing Data in Multivariate Analysis. - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: SAS Tips & Tricks

SAS Tips & Tricks

Brought to you by your fellow MASUG members

Page 2: SAS Tips & Tricks

Jun Tang, Data Analyst, MS

UT Health Science center Missing Data in Multivariate Analysis

Page 3: SAS Tips & Tricks

Missing Data in Multivariate Analysis

In multivariate analysis, we encountered this: 76 of 306 observations in data set LIB.FILENAME omitted due to missing values. It’s 25% of the sample. Sample size is drastically reduced.

Page 4: SAS Tips & Tricks

Missing Data in Multivariate Analysis

To solve this problem, we consider two situations:1. Do not use Multivariate Analysis, in stead, use Univariate Test. This will not

affect the sample size of those non-missing value variables. 2. Example: MANOVA change to TTEST.

proc anova;class AAMC; model ratevar1 ratevar2 ratevar3 …=AAMC;manova h=AAMC/printe printh;run;

Change to:proc ttest data=lib.filename;class AAMC;VAR ratevar1 ratevar2 ratevar3 …;run;

Page 5: SAS Tips & Tricks

Missing Data in Multivariate Analysis

To replace the missing data :Mean Estimation;Hot-Deck imputation;Predict missing values from Regression;Multiple Imputation (which is still in experimental form in Release 8.1 of the SAS system).

SAS procedures for doing MI:a. PROC MI creates Multiple Imputed data sets;b. PROC MIANALYZE combines results after analysis.Example:proc MI data=lib.filename <seed=37851> <mu0=50 10 180> out=outmi;VAR ratevar1 ratevar2 ratevar3…;run;

Refs: 1. Kim Chantala and C.Suchindran, Multiple Imputation for missing data; 2. www.ats.ucla.edu/stat/sas/v8/miv802.pdf, chapter 9, the MI procedure;

Page 6: SAS Tips & Tricks

Shelly Lensing, Sr. Biostatistician IIDeqing Pei, Statistical Analyst II St. Jude Children's Research Hospital Missing Readings Without a Missing

Record

Page 7: SAS Tips & Tricks

Missing Readings Without a Missing Record

This is a problem that just came up: We needed to group each set of readings

(3h, 7h, 23h, 44h) for each ID to indicate group 1, 2, etc. over time.

Some readings were missing, but there was no missing record

Page 8: SAS Tips & Tricks

Missing Readings Without a Missing Record

data one; input id sample $ 10-18 mtx_con @37 hdmtx_date mmddyy8.; sampnum=compress(sample,'MTXHR')+0; format hdmtx_date mmddyy8.; datalines; *** example data;11111 MTX 3HR 81.29 11/20/9511111 MTX 7HR 22.07 11/20/9511111 MTX 23HR 2.24 11/21/9511111 MTX 44HR 1.49 11/22/9511111 MTX 3HR 92.16 01/19/9611111 MTX 44HR 0.18 01/21/9611111 MTX 7HR 24.27 02/20/9611111 MTX 23HR 3.34 02/21/9622222 MTX 3HR 127.22 12/17/9422222 MTX 7HR 29.73 12/17/9422222 MTX 23HR 0.65 12/18/9422222 MTX 7HR 45.00 12/28/9422222 MTX 23HR 1.88 12/29/9422222 MTX 3HR 132.48 02/06/9522222 MTX 7HR 37.96 02/06/9522222 MTX 23HR 1.53 02/07/9522222 MTX 44HR 0.13 02/08/9522222 MTX 7HR 85.71 02/09/9522222 MTX 44HR 30.01 02/11/9522222 MTX 44HR 35.01 02/13/9533333 MTX 3HR 65.02 02/13/9533333 MTX 23HR 34.13 02/14/9533333 MTX 23HR 24.13 02/15/9533333 MTX 3HR 74.23 02/16/9533333 MTX 44HR 24.06 02/17/95

;run;

Page 9: SAS Tips & Tricks

Missing Readings Without a Missing Record

proc sort data=one; *** make sure data are in order; by id hdmtx_date sampnum;run;

data two; set one; by id;

lagsamp=lag(sampnum); * Uses logic of when hour-based measurements should occur relative to each other; * to find first record of each set; if sampnum=3 then first=1; if sampnum=7 and lagsamp ne 3 then first=1; if sampnum=23 and lagsamp not in(7,3) then first=1; if sampnum=44 and lagsamp not in(23,7,3) then first=1;

if first.id then do; * restarts group number with each new id; group=0; end; retain group; if first=1 then group+1; * increments group number for each grouping;run;

proc print;run;

Page 10: SAS Tips & Tricks

Vaibhav Gardé , Senior Marketing Analyst FedEx Services - Marketing, Planning and

Analysis Group Interactive Mode

– Beats the Batch Mode in most cases

Page 11: SAS Tips & Tricks

I like to use the SAS Interactive Mode on UNIX. It is almost as convenient to use as Windows, and my code gets neatly color coded! This makes it easy to catch minor error in syntax – like forgetting a quote, semi-colon, etc.

Page 12: SAS Tips & Tricks

In the interactive mode, it is quite easy to see the exact location, size, modification date, etc. of all your tables.

Page 13: SAS Tips & Tricks

You can even right click to open (view) Oracle tables, SAS datasets to see the records contained therein. Much easier than the usual UNIX SQL options! (Can also delete, copy, save, etc.)

Page 14: SAS Tips & Tricks

You can see a continuously visible log file (again color coded) to monitor the progress/success of your program. In case there is an error, it shows up in RED.

There are other benefits of using the Interactive mode, but I have highlighted just these few. Hope you find it worthwhile to check it out!

Page 15: SAS Tips & Tricks

Alan Teal, Chief, Technical Support Branch Defense Contract Audit Agency Macro to Document Date/Time of

Processing

Page 16: SAS Tips & Tricks

Macro to Document Date/Time of Processing Purpose: It is a good practice to document the

date and time when a SAS job was run, especially if the same job is often run but with different data sets.

The following macro writes the date and time to the SAS log.

It can also be used to determine how long each Data step or PROC processed, in order to evaluate which steps might be made more efficient.

Page 17: SAS Tips & Tricks

Macro to Document Date/Time of ProcessingSAS Log Shown Below:1 * TheTimeIs.SAS Writes the date and time to the SAS Log to document processing.;2 * Created 03-25-2003, Alan Teal;34 %MACRO Tim;5 DATA _Null_;6 TheTimeIs=Datetime();7 PUT TheTimeIs= DateTime20.;8 %MEND Tim;910 %Tim1112 RUN;

TheTimeIs=25MAR2003:12:36:47NOTE: DATA statement used: real time 0.32 seconds

When the %Tim macro is called TheTimeIs (date/time) is written to the SAS Log, as shown above.

Page 18: SAS Tips & Tricks

The Unknown SAS Instructor

I just have one comment to make…

Page 19: SAS Tips & Tricks

I just have one comment to make…

/* Here’s an easy way to comment out a chunk of code: */

%macro skip;data new; set old; . . .run;%mend skip;/* And the best part is, you can use it as many times as you want within the same program! */

Page 20: SAS Tips & Tricks

Stephanie Thompson, Sr. Merchandising Analyst AutoZone Extra rsubmits

Page 21: SAS Tips & Tricks

Extra rsubmits

Ever forget to remove an extra rsubmit from your code and it crashes?

Keep them there and use them at will with…...

Page 22: SAS Tips & Tricks

Extra rsubmitsrsubmit;proc sql;create table payprelim_vp as (SELECT VP, sum(purchase_cost) as p_cost, sum(inventory_cost) as i_cost FROM payprelim_t group by vp);*rsubmit;data q.payables_vp;set payprelim_vp;payables = p_cost / i_cost;run;proc print data = q.payables_vp;format purchase_cost dollar12. inventory_cost

dollar12. payables percent6.2;run;

Higlight from after the ‘*’ to the end and submit.

Page 23: SAS Tips & Tricks

Walter Smith, Marketing Project Manager WorldWide Forecasting, FedEx Dates and Fixing Capitalization

Page 24: SAS Tips & Tricks

Creating SAS Dates From Character Data

Often date data is stored in files or tables as character data and is more useful when converted to SAS date values (number of days since Jan 1, 1960)

Page 25: SAS Tips & Tricks

*-- read in some dates as character ---;

data chardates; infile cards; input yyyymm $6. ;cards;200206200207200208200303run;

*---------- create SAS dates ----------;data sasdates; set chardates; yy = input( substr( yyyymm , 1 , 4 ) , 4. ); mm = input( substr( yyyymm , 5 ) , 2. ); sasdate = mdy( mm , 1 , yy ); date9 = sasdate; mmddyy = sasdate; attrib sasdate label='Numeric Date' format=5.; attrib date9 label='Displayed As Date9' format=date9.; attrib mmddyy label='Displayed As mmddyy' format=mmddyy10.; keep yyyymm sasdate date9 mmddyy; tt = 1; put 'tt=' tt 'tt=' tt date9.;run;title 'Character & SAS Dates';proc print; run;

Sample Code

Page 26: SAS Tips & Tricks

The Result

Character & SAS Dates

Obs yyyymm sasdate date9 mmddyy

1 200206 15492 01JUN2002 06/01/2002 2 200207 15522 01JUL2002 07/01/2002 3 200208 15553 01AUG2002 08/01/2002 4 200303 15765 01MAR2003 03/01/2003

Page 27: SAS Tips & Tricks

SAS Dates in Macro Variables

Sometimes it is convenient to have date values in the macro environment

Additional manipulations can be done – the macro function %sysfunc() gives access to nearly all the datastep functions!

Page 28: SAS Tips & Tricks

Sample Code & Results*--- put min & max into macro vars ----;

proc sql noprint;

select put(min(sasdate),5.), max(sasdate)

into :mindate, :maxdate

from sasdates;

quit;

%put >> mindate="&mindate" maxdate="&maxdate";

%let nextmonth = %sysfunc( intnx( month , &maxdate , 1 ) , date9. );

%let sas_date_const = %unquote(%bquote(')&nextmonth%bquote('d));

%put >> nextmonth=&nextmonth sas_date_const=&sas_date_const;

>> mindate="15492" maxdate=" 15765“

>> nextmonth=01APR2003 sas_date_const='01APR2003'd

Page 29: SAS Tips & Tricks

Capitalizing Proper Names

Often names of people, places and things in data is in mixed or case, or worse, IN ALL CAPITAL LETTERS.

Names are easier to read when each word of the name has only the initial letter capitalized.

Page 30: SAS Tips & Tricks

Sample Code

*--------- read in some data ----------;

data names; infile cards; input fullname $30. ;cards;Chris GeithFaRoUK HiJaZi unusual caseSOUTH KOREATARIQ AZIZWalter j SMITHlos angeles CA state abbrevrun;

*---- change names to initial caps ----;data newnames; set names; oldname = fullname; length newname $30; fullname = lowcase( fullname ); n_wrds = length( compbl( fullname ) ) - length( compress( fullname , ' ' ) ) +1; do i=1 to n_wrds; wrd = scan( fullname , i , ' ' ); newwrd = upcase(substr(wrd,1,1))||substr(wrd,2); newname = trim(newname)||' '||newwrd; end; keep oldname newname;run;title 'Names With Initial Capitals';proc print; run;

Page 31: SAS Tips & Tricks

Results

Names With Initial Capitals

Obs oldname newname

1 Chris Geith Chris Geith

2 FaRoUK HiJaZi unusual case Farouk Hijazi Unusual Case

3 SOUTH KOREA South Korea

4 TARIQ AZIZ Tariq Aziz

5 Walter j SMITH Walter J Smith

6 los angeles CA state abbrev Los Angeles Ca State Abbrev