![Page 1: Beginning Data Manipulation HRP 223 - Topic 4 Oct 19 th 2011](https://reader030.vdocuments.net/reader030/viewer/2022032800/56649d2c5503460f94a02eb6/html5/thumbnails/1.jpg)
Beginning Data Manipulation
HRP 223 - Topic 4
Oct 19th 2011
Some fake data
Procedures summarize over a dataset.
Functions work within a single record of a dataset.
Notice that SAS remembers the capitalization of variable names.
Print a Dataset
What SAS writes
Labels fix these
Formats fix these
I changed the capitalization.
Calculate a mean
Average months treatment
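The slide shows this step as a screenshot. As a rough sketch of the same idea in code, a procedure such as PROC MEANS summarizes one variable over every record in the dataset. The dataset name `work.fake` and the variable names are hypothetical, made up for illustration.

```sas
/* Hypothetical fake data: dataset and variable names are assumptions */
data work.fake;
    input id monthsTreatment;
    datalines;
1 6
2 12
3 9
;
run;

/* A procedure summarizes over the dataset: one mean across all records */
proc means data=work.fake mean;
    var monthsTreatment;
run;
```

With these three made-up records the reported mean is 9.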
Average 3 labs
Search the function list in the SAS OnlineDoc for a function that calculates an average.
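One function that fits is `mean()`, which averages its arguments within a single record. The sketch below assumes three lab variables named `lab1`, `lab2`, and `lab3`; those names and the data values are hypothetical.

```sas
/* Hypothetical lab values: names and numbers are assumptions */
data work.labs;
    input id lab1 lab2 lab3;
    /* A function works within one record: average the three labs row by row. */
    /* mean() skips missing arguments, so record 2 averages only lab1 and lab3. */
    labAverage = mean(lab1, lab2, lab3);
    datalines;
1 4.0 5.0 6.0
2 5.0 . 7.0
;
run;

proc print data=work.labs;
run;
```

The same call can be written as `mean(of lab1-lab3)` when the variables share a numbered prefix.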
Modifying datasets with SQL
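The slides for this step are screenshots, so the following is only a sketch of what an equivalent query could look like in code: PROC SQL copies a dataset and adds a computed column. The dataset and variable names are hypothetical.

```sas
/* Hypothetical input data: names and values are assumptions */
data work.labs;
    input id lab1 lab2 lab3;
    datalines;
1 4.0 5.0 6.0
2 5.0 . 7.0
;
run;

proc sql;
    /* Create a new table with every original column plus a new one. */
    /* With several arguments, mean() acts row by row like the SAS function. */
    create table work.labs2 as
    select *,
           mean(lab1, lab2, lab3) as labAverage
    from work.labs;
quit;
```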
I like to split my .egp file into several process flows:
• One sets the libraries and formats.
• One does cleaning.
• One (or several) for analyses.
Right click here and choose Properties.
Label this process flow Make data.
Note the name.
Note the new name.
Automatically Make Libraries and/or Formats
• You can make a process flow that runs whenever you start up your project. Just name the process flow autoexec.
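As an illustration, a code node in the autoexec process flow might hold nothing more than a library assignment. The libref `hrp223` and the folder path below are hypothetical placeholders; substitute your own.

```sas
/* Hypothetical libref and path: replace both with your own project folder */
libname hrp223 "C:\projects\hrp223\data";
```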
User Defined Formats
• I typically create my formats with code, but you can use the GUI if you prefer.
Set this
A short name
After clicking Run, rename the node to match the format name.
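For comparison, here is a minimal sketch of the code route for a user-defined format. The format name `yesNo`, the dataset, and the variable `smoker` are hypothetical examples, not taken from the slides.

```sas
/* Define a user-defined format (the name yesNo is an assumption) */
proc format;
    value yesNo
        0 = "No"
        1 = "Yes";
run;

/* Hypothetical data that uses the format */
data work.demo;
    input id smoker;
    format smoker yesNo.;
    datalines;
1 0
2 1
;
run;

/* The stored values stay 0 and 1; the printout shows No and Yes */
proc print data=work.demo;
run;
```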
Make At Least 1 Analysis Process Flow
• If you have an autoexec process flow, you don’t need to include the library in the analysis process flow, but I like to see it:
Moving Between Process Flows
Here
Or here
Need a new variable?
• You can check a value using an if statement in a data step:
else
• If the value is not greater than or equal to 175 then set the result to be good:
New character variables are 8 characters wide if you create them with an input statement. Otherwise SAS uses the first reference to the variable to set its length.
For existing variables, SAS gets the length from the first reference in the source dataset.
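The if/else check and the length rule can be seen together in a small sketch. The variable names `labValue` and `result` and the data values are hypothetical; only the 175 cutoff comes from the slide.

```sas
/* Hypothetical data: labValue and result are assumed names */
data work.scored;
    input id labValue;
    /* First reference to result is "Bad", so SAS makes it 3 characters wide */
    if labValue >= 175 then result = "Bad";
    /* "Good" does not fit in 3 characters and is stored as "Goo" */
    else result = "Good";
    datalines;
1 180
2 150
;
run;

proc print data=work.scored;
run;
```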
Change this to "Bad " (with a trailing space, so the variable is 4 characters wide) or use a length statement.
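A sketch of the length statement fix, again with the hypothetical names `labValue` and `result`:

```sas
data work.scored;
    /* Declare the width before the first assignment so "Good" fits */
    length result $ 4;
    input id labValue;
    if labValue >= 175 then result = "Bad";
    else result = "Good";
    datalines;
1 180
2 150
;
run;

proc print data=work.scored;
run;
```

Declaring the length first also makes `result` the first variable in the dataset, which is harmless but worth knowing when you look at the output.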
Missing values are treated as negative infinity in comparisons, so a missing value is less than any number.
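Because a missing value compares as smaller than any number, it fails the `>= 175` test and falls through to the else branch. One way to guard against that, sketched with the same hypothetical variable names and an assumed "Unknown" category, is to test for missing first:

```sas
data work.scored;
    length result $ 7;
    input id labValue;
    /* Handle missing first; otherwise it would be labeled "Good" */
    if missing(labValue) then result = "Unknown";
    else if labValue >= 175 then result = "Bad";
    else result = "Good";
    datalines;
1 180
2 .
3 150
;
run;

proc print data=work.scored;
run;
```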
You can get the same result with SQL.
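The slides show the SQL version as screenshots. Assuming the result in question is the Bad/Good flag built earlier with if/else, a SQL sketch would use a CASE expression. The table and column names are hypothetical.

```sas
/* Hypothetical input data */
data work.labs;
    input id labValue;
    datalines;
1 180
2 .
3 150
;
run;

proc sql;
    create table work.scored as
    select id,
           labValue,
           /* CASE plays the role of if / else if / else */
           case
               when labValue is missing then "Unknown"
               when labValue >= 175 then "Bad"
               else "Good"
           end as result
    from work.labs;
quit;
```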
Showing Combinations
• Often I am asked to show sets of treatments or sets of drugs. This quickly gets too complex for contingency tables (5 treatments need a 2x2x2x2x2 table, which has 32 cells).
• I use binary lists. For example, common cancer treatments include Chemo, Radiation, and Surgery (but you can use this same system for fine distinctions). Use one letter per treatment: uppercase if the person got it, lowercase if not. Somebody who got Chemo and Surgery but no Radiation can be represented as CrS. Code everybody like that and count the combinations.
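A minimal sketch of this coding scheme, assuming three 0/1 indicator variables named `chemo`, `radiation`, and `surgery` (hypothetical names and data):

```sas
/* Hypothetical data: 1 = received the treatment, 0 = did not */
data work.treatments;
    input id chemo radiation surgery;
    length combo $ 3;
    /* Uppercase letter if the treatment was given, lowercase if not */
    combo = cats(ifc(chemo = 1, "C", "c"),
                 ifc(radiation = 1, "R", "r"),
                 ifc(surgery = 1, "S", "s"));
    datalines;
1 1 0 1
2 1 1 1
3 1 0 1
4 0 0 1
;
run;

/* Count the combinations, most common first */
proc freq data=work.treatments order=freq;
    tables combo;
run;
```

With these four made-up records, CrS appears twice and CRS and crS once each. Adding a fourth or fifth treatment only adds a letter to the string rather than another dimension to a table.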