lecture 6



Page 1: Lecture 6

1

1

SAS Lecture 6 – ODS and graphs

Aidan McDermott, May 2, 2006

2

ODS
The output delivery system

3

The output delivery system

• During the 1990s SAS introduced a more extensive way of dealing with SAS output, called the output delivery system, or ODS for short. Until then:

• Each procedure decided what tables to print and in what format.

• This created difficulties:
– Output formats were inconsistent between procedures.
– Some procedures did not produce certain output datasets.
– Output was designed largely for line printers (except for graphics output) and used monospace fonts.

4

The output delivery system

• The ODS combines raw data with table definitions to produce output objects.

• These objects can be sent to ODS destinations:
– traditional monospace output
– output for high-resolution printers
– datasets
– HTML
– LaTeX
– etc.

• The ODS provides table definitions that define the structure of the output. You can customize these definitions or create your own.

• Procedures no longer handle output but pass the raw data and a table name to the ODS.

5

The output delivery system

• Destination examples:

Destination    Produces
listing        Listing output
output         SAS datasets
HTML           HTML output
latex          LaTeX source
printer        Postscript, pdf, …

• Extendable to new destinations in the future

• You use the ODS statement to specify one or more destinations.

6

The output delivery system

• ODS creates output objects.

• Each output object contains the results of a procedure or DATA step (the data component) and may also contain information about how to render the results (the table definition).

• Numeric data is stored at machine precision.

• You can change the default table definition.

• This is somewhat reminiscent of variable values and formats.


7

The output delivery system

8

The output delivery system

ODS creates a link to each output object in the Results window and identifies each output object by the appropriate icon.

9

The ODS statement

• ODS is a global statement: it can go anywhere.
• It is used to choose a destination.
• It is used to choose which objects to send to the destination and which to exclude.

options nodate nonumber;
ods printer pdf file = 'odsprinter.pdf';
proc reg data=mydata;
  model y = x;
run;
ods printer close;

will produce a pdf file for printing. (Use PS for a postscript file.)

10

PDF output

11

Producing HTML

• One of the most important features of the ODS is its ability to produce HTML.

• The ODS HTML statement can create:
– a body file containing the results from procedures
– a table of contents that links to the body file
– a table of pages that links to the body file
– a frame that displays the table of contents, the table of pages, and the body file

12

HTML example

ods html file     = 'odshtml-body.htm'
         contents = 'odshtml-contents.htm'
         page     = 'odshtml-page.htm'
         frame    = 'odshtml-frame.htm';

proc univariate data=infant;
  var rate98 rate99;
  title;
run;

ods html close;


13

HTML example

ods html file     = 'body.htm'
         contents = 'contents.htm'
         page     = 'page.htm'
         frame    = 'frame.htm';

14

ODS

• You can control which objects get printed to the html files by naming them.

• Proc univariate produces five output objects. To find out what they are called, you can use the trace statement (or look in the help):

ods trace on;

This writes a list of objects to the log:

ods trace on;
proc univariate data=infant;
  var rate98 rate99;
run;
ods trace off;

15

Log

16

HTML example

ods html file     = 'odshtml-body.htm'
         contents = 'odshtml-contents.htm'
         page     = 'odshtml-page.htm'
         frame    = 'odshtml-frame.htm';
ods select BasicMeasures Quantiles;

proc univariate data=infant;
  var rate98 rate99;
  title;
run;

ods html close;

• You can control the format of the output by using proc template to change the default template.
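The ODS statement can also exclude objects rather than select them. The following is a minimal sketch of that, assuming the infant dataset from the slides is available and that ODS TRACE reports the extreme-observations table of proc univariate under the name ExtremeObs; check the log in your own session before relying on the name.

```sas
/* Sketch only: assumes the infant dataset used in the slides */
ods html file='odshtml-body.htm';

/* Send every univariate output object except the extreme observations table */
ods exclude ExtremeObs;

proc univariate data=infant;
  var rate98 rate99;
run;

ods html close;
```

ODS EXCLUDE is convenient when you want most of a procedure's output and only need to drop one or two tables; ODS SELECT is shorter when you want only a few.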

17

HTML example

• You can, of course, output the body only if you wish:

ods html body = 'c:\myhtml\admit.htm';

proc print data=clinic.admit label;
  var sex age height weight actlevel;
  label actlevel='Activity Level';
run;

ods html close;

18

HTML example

ods html body = 'c:\myhtml\admit.htm';

proc print data=clinic.admit label;
  var sex age height weight actlevel;
  label actlevel='Activity Level';
run;

ods html close;


19

Default destination

• By default the listing destination is open. You can close it by typing:

ods listing close;

or by changing the setting on the Results tab of the Preferences window.

20

ODS - creating datasets

• You can create an output dataset by using the OUTPUT destination:

ODS OUTPUT output-object=SAS-data-set;

/* close the listing */
ods listing close;

/* Ask for the dataset measures to be made */
ods output BasicMeasures=measures;
proc univariate data=infant;
  var rate98 rate99;
  title;
run;

21

ODS - creating datasets

/* Open HTML */
ods html body     = 'measures-body.htm'
         contents = 'measures-contents.htm'
         frame    = 'measures-frame.htm';

proc print data=measures noobs headings=horizontal;
  title "Output dataset produced from univariate";
run;

/* Reset the destinations to their defaults */
ods html close;
ods listing;
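The same pattern applies to other procedures. As a rough sketch, assuming the mydata dataset with variables y and x from the earlier proc reg slide, and assuming ODS TRACE lists the coefficient table of proc reg as ParameterEstimates, the estimates can be captured in a dataset called est (a name chosen here for illustration):

```sas
/* Sketch only: mydata, y and x are taken from the earlier proc reg slide */
ods listing close;

/* Ask ODS to write the coefficient table to the dataset est */
ods output ParameterEstimates=est;

proc reg data=mydata;
  model y = x;
run;
quit;

/* Reopen the listing and look at what was captured */
ods listing;

proc print data=est noobs;
run;
```

Running ODS TRACE first is the safer habit, since the object names differ from one procedure to the next.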

22

URLs

ods html body     = 'c:\records\data.html'
         contents = 'c:\records\toc.html'
         frame    = 'c:\records\frame.html';

toc.html contains a reference to c:\records\data.html

frame.html contains a reference to c:\records\toc.html and c:\records\data.html

23

URLs

If you move the web pages to a new location or put them on a web server, you would need to update all the references. Instead use a URL specification:

ods html body     = 'c:\records\data.html' (url='data.html')
         contents = 'c:\records\toc.html' (url='toc.html')
         frame    = 'c:\records\frame.html';

This is fine if data.html and toc.html are at the same location as frame.html. Otherwise give full URLs:

ods html body     = 'c:\records\data.html' (url='http://mysite.com/myreports/data.html')
         contents = 'c:\records\toc.html' (url='http://mysite.com/mycontents/toc.html')
         frame    = 'c:\records\frame.html';

24

Path option

Putting the full pathname in each of the body, contents, and frame options is not a good idea. Instead you can use the path option:

ods html path     = 'c:\records' (url=none)
         body     = 'data.html' (url='data.html')
         contents = 'toc.html' (url='toc.html')
         frame    = 'frame.html';

Using relative paths is usually a good idea.


25

Style

You can change the appearance of the html by using the style option.

ods listing close;
ods html body     = 'c:\records\data.html' (url='data.html')
         contents = 'c:\records\toc.html' (url='toc.html')
         frame    = 'c:\records\frame.html'
         style    = brick;

Other styles: beige, brick, brown, D3D, default, minimal

26

Word and Excel

The html destination is an excellent way to transfer your SAS output to Microsoft Word or Excel.

• Some things to note:
– Use a gif device driver to translate graphic output.
– Use colors and fonts compatible with Word and Excel.
– Colors may be given in cxRRGGBB format, where RR is for red, GG for green, and BB for blue (numbers are in hexadecimal), so cxFF0000 is pure red.

goptions device = gif570 colors=(cxAAAA00,cx555500);

27

Example

We have mortality data on two causes of death for two U.S. cities, by sex and age group, for the years 1987 to 2000. We wish to make a plot and a table and transfer the results to, say, Excel. The data look like this:

28

Example

29

Example

30

Example


31

SAS/GRAPH

There are a small number of graphic types commonly used in public health presentations and publications. These basic types are either used alone or mixed together to form a composite graphic. Here we will look at how to build some of these basic types of graph.

Golden Rule: Everybody is a graph critic.

32

Two types of graph maker

If you are using SAS for statistics and data management then it seems natural to use it to produce your graphs as well. Sometimes a statistical procedure will produce the graph you are looking for anyway. The choice of tool depends on whether you need a one-off graph for a presentation or production-line graphs.

To produce “quick and dirty” graphs you can use Graph-n-go:
– very easy to use
– not bad for putting multiple graphs on one page
– the data viewer is a graph type
– only a small number of graph types available
– not all options available
– labor intensive, so not suitable for production-line graphs

Use SAS/Graph procedures:
– very flexible
– complete control over graphic elements
– less labor intensive in the long run
– harder to learn
– the same control can be used for SAS/STAT graphics output

33

Some common types of graph

• Charts
• Histograms
• Stem and leaf plots
• Boxplots
• Plots
• Contour plots / 3-dimensional plots
• Maps
• Gantt charts
• Trellis plots
• Trees / pedigrees / dendrograms
• (mathematical) graphs / networks
• Flow charts / entity-relationship diagrams

34

35

Graphic output within SAS

• You have already seen some graphic output from within SAS.

• proc means, proc univariate, proc genmod, proc lifetest etc. all produce graphs

• Other procedures in SAS specifically produce graphs, even some procedures that are not part of SAS/Graph (proc boxplot is an example).

Here our aim is to produce publication/presentation-quality graphs.

36

Graph basics

• SAS stores graphs in catalogs (an entity similar to a folder in Windows).
• Graphs are stored in a SAS proprietary format.
• By default graphs are stored in a catalog called Gseg in the work library.
• Graphs can be translated to postscript, gif, jpeg, and a number of other commonly used formats for printing or including in other documents (Word, html, etc.).


37

Graphic control

There are three ways to control the look of a SAS/GRAPH graph:

1. Use options within the procedure
2. Use global commands
3. Use goptions

38

GOPTIONS

• set the environment for a graphics program to run and send output, independent of the program
• remain in effect for the entire SAS session unless changed or reset
• control the appearance of graphic elements by specifying default fonts, colors, text heights, etc.
• useful when you want the same options in multiple procs

39

PROC GOPTIONS

• used to review current GOPTIONS
• lists alphabetically all of the current GOPTIONS in the LOG window

proc goptions;
run;

You can also type goptions at the command line.

40

GOPTIONS

GOPTIONS options-list;

ROTATE=          portrait or landscape (will override the setting in the print dialog box)

RESET=ALL        resets all options to defaults, including all global statements

RESET=GOPTIONS   resets only goptions statements

41

COLORS=          default color list for the device driver (device dependent)

GUNIT=           unit of measurement for height in global statements, such as TITLE and FOOTNOTE
                 cells - character cells
                 pct   - percent of graphics area
                 in    - inches
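The options above can be combined in one or more GOPTIONS statements. This is a minimal sketch, not a recommended configuration; the particular values (landscape, a text height of 4 percent) are chosen only for illustration.

```sas
/* Start from the defaults so that settings from earlier programs do not leak in */
goptions reset=all;

/* Measure heights in percent of the graphics area and rotate to landscape */
goptions gunit=pct rotate=landscape htext=4;

/* Review the settings now in effect; the list is written to the log */
proc goptions;
run;
```

Putting a reset at the top of each graphics program keeps its output independent of whatever was run earlier in the session.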

42

Data

• From the SAS samples folder.
• Three Californian pollutant monitoring stations (AZU, LIV, SFO).
• One monthly measurement (taken on the 15th of the month) of CO, O3, SO4, temperature, etc. for each station; 36 observations in all.
• Month is a numeric variable taking the value 1 for January, 2 for February, etc.


43

Californian Air pollutant Data –ca88air

44

Charts

• Examples
– Look for graphic elements in each chart
– Look for common data types
– Look for similarities among the examples

45 46

47 48


49 50

51 52

Charts

• All the examples used a small number of graphic elements.
• The main difference between plots is the polygon/area type.
• Most involved a categorical/discrete variable and a numeric variable.
– A histogram uses a continuous variable to create categories.
– The counts of a categorical variable can be used to create the numeric variable.

53

Proc GCHART

produces charts based on the values of one or more chart variables.

produces vertical and horizontal bar charts, block charts, pie charts etc.

graphs based on statistics - counts, percentages, sums, or means

run-group processing

numeric and character variables

54

Proc GCHART example

proc format;
  value seas 1 = 'Win' 2 = 'Spr' 3 = 'Sum' 4 = 'Fal';
run;

data ca88air;
  set vol1.ca88air(where=(station="SFO"));

  if      ( month in (12,1,2)  ) then season = 1;
  else if ( month in (3,4,5)   ) then season = 2;
  else if ( month in (6,7,8)   ) then season = 3;
  else if ( month in (9,10,11) ) then season = 4;

  format season seas.;
  format month mth.;
run;


55

Proc GCHART example

title1 h=4 'Mean seasonal carbon monoxide for station SFO';
footnote j=l h=4 f=simplex 'Bar Chart - vertical';

proc gchart data=ca88air;
  vbar season / sumvar=co type=mean discrete ctext=black clm=95;
run;
quit;

56

57

Proc GCHART syntax

PROC GCHART data=data set name;

One of the following:
  VBAR variables / options;
  HBAR variables / options;
  STAR variables / options;
  PIE variables / options;
  BLOCK variables / options;

run;

58

VBAR

separate bar chart for each chart variable

each bar represents the statistic selected for a value of the chart variable

response axis (vertical) provides a scale for statistic graphed

midpoint axis - horizontal axis

59

VBAR SYNTAX

VBAR chart-variable(s) / options;

chart-variable(s)
  specifies one or more variables that define the categories of data to chart.

options
  specifies appearance, statistics, axes, and midpoint options

60

VBAR

Midpoints are the values of the chart variable that identify categories of data. By default, midpoints are selected or calculated by the procedure. The way the procedure handles the midpoints depends on whether the values of the chart variable are character, discrete numeric, or continuous numeric.

character chart variables
- a separate bar is drawn for each value


61

VBAR

numeric chart variables
- By default each bar represents a range of values. The procedure determines the ranges, calculates the median value of each range, and displays the median value at each midpoint on the chart. A value that falls exactly halfway between two midpoints is placed in the higher range.
- The DISCRETE option instead generates a midpoint for each unique value of the chart variable.

62

VBAR OPTIONS

For character or discrete numeric values, you can use the MIDPOINTS= option to rearrange the midpoints or to exclude midpoints from the chart.

For character data, list the values in quotes:

MIDPOINTS='Sydney' 'Atlanta' 'Paris'

63

VBAR OPTIONS

For continuous numeric variables, use the MIDPOINTS= option to change the number of midpoints, to control the range of values each midpoint represents, or to change the order of the midpoints. To control the range of values each midpoint represents, use the MIDPOINTS= option to specify the median value of each range. For example, to select the ranges 20-29, 30-39, and 40-49, specify

MIDPOINTS=25 35 45

64

VBAR OPTIONS

Other options:

DISCRETE
  separate bar for each value of a numeric variable

TYPE=statistic
  specifies the chart statistic
  FREQ  frequency (the default when SUMVAR= is not used)
  PCT   percentage
  SUM   sum (the default when SUMVAR= is used)
  MEAN  mean

CLM=confidence-level
  draws chart confidence intervals (error bars)
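The MIDPOINTS=, TYPE=, and CLM= options can be combined on one VBAR statement. The sketch below assumes a hypothetical dataset patients with a continuous variable age and a numeric variable sbp; neither appears in the lecture data. It also spells out ERRORBAR=BOTH, on the assumption that your release needs the error bars requested explicitly.

```sas
/* Sketch only: patients, age and sbp are hypothetical names */
proc gchart data=patients;
  /* Bars centred at 25, 35 and 45, i.e. the ranges 20-29, 30-39, 40-49.     */
  /* Bar height is the mean of sbp in each range, with 95% confidence limits. */
  vbar age / midpoints=25 35 45
             sumvar=sbp type=mean
             errorbar=both clm=95;
run;
quit;
```

Leaving out MIDPOINTS= lets the procedure choose its own ranges, which is usually fine for a first look at the data.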

65

VBAR SYNTAX

SUMVAR=variable
  specifies the variable to use for sum or mean calculations for each midpoint. The resulting statistics are represented by the length of the bars along the response axis, and they are displayed at major tick marks. REQUIRED if specifying TYPE=MEAN or TYPE=SUM.

RAXIS=AXISn   response axis
MAXIS=AXISn   midpoint axis
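RAXIS= and MAXIS= each point at a global AXIS statement. A minimal sketch, assuming the ca88air dataset and the season variable built in the earlier GCHART example; the axis labels are made up for illustration.

```sas
/* axis1 is used for the response axis, axis2 for the midpoint axis */
axis1 label=(angle=90 'Mean carbon monoxide') minor=none;
axis2 label=('Season');

proc gchart data=ca88air;
  vbar season / discrete sumvar=co type=mean
                raxis=axis1 maxis=axis2;
run;
quit;
```

Because AXIS statements are global, the same axis1 and axis2 definitions stay available to later procedures until they are redefined or reset.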

66

GLOBAL STATEMENTS

• define titles and footnotes
• used to control axes, symbols, patterns, and legends
• can be defined anywhere inside a proc or before a proc
• in effect until canceled, replaced, or the end of the SAS session
• cancel by repeating the statement with no options or by using goptions RESET=ALL;


67

GLOBAL STATEMENTS

TITLE      defines titles
FOOTNOTE   defines footnotes
AXIS       defines appearance of axes
PATTERN    defines patterns used in graphs (histograms)
LEGEND     defines legends
SYMBOL     defines symbols (plotting)
NOTE       adds text to graph

68

TITLE STATEMENT

• creates, changes, or cancels a title for all subsequent graphics output in a SAS session
• up to 10 titles are allowed
• the keyword TITLE can be followed by an unlimited number of text strings and options
• text strings are enclosed in single or double quotes
• the most recently created TITLE number replaces the previous TITLE of the same number

69

Title syntax

TITLE<1...10> <options | 'text'> ... <options-n | 'text-n'>;

Options:

FONT=font          specifies the font for the subsequent text

HEIGHT=n<units>    specifies the height of text characters in number of units
H=n<units>

JUSTIFY=R|L|C      specifies the alignment: R=right, L=left, C=center
J=R|L|C            By default, JUSTIFY=C.

70

PATTERN STATEMENT

• defines the characteristics of patterns used in charts
– type of fill pattern: solid, empty, lined
– color

An example of a global statement

71

PATTERN STATEMENT

PATTERN<1...99> options;

OPTIONS

COLOR=   pattern color

VALUE=   fill
         E    empty
         S    solid
         Ln   left-slanting lines
         Rn   right-slanting lines
         Xn   crosshatched lines

         where n is 1-5, 1 indicating the lightest
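The TITLE and PATTERN options can be tried together on a small bar chart. This is a sketch under stated assumptions: it reuses the ca88air dataset and the season variable from the earlier GCHART example, the title text is invented, and PATTERNID=MIDPOINT is added so that the patterns are assigned bar by bar.

```sas
/* Two titles: a larger centered one and a smaller left-justified one */
title1 font=simplex height=4 justify=center 'Carbon monoxide by season';
title2 height=3 justify=left 'Station SFO';

/* pattern1: solid blue fill; pattern2: medium-density red crosshatch */
pattern1 color=blue value=solid;
pattern2 color=red  value=x3;

proc gchart data=ca88air;
  /* patternid=midpoint assigns patterns bar by bar, */
  /* so pattern1 and pattern2 apply to the first two bars */
  vbar season / discrete sumvar=co type=mean patternid=midpoint;
run;
quit;
```

Heights are interpreted in the current GUNIT= units, so the same numbers give different sizes after a goptions gunit=pct statement.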

72

Proc GCHART example

pattern1 color=blue value=fill;
pattern2 color=red value=fill;

proc gchart data=ca88air;
  star month / sumvar=co type=mean discrete ctext=black noheading;
run;
quit;


73 74

Exporting graphs

1. Make sure the graphics window has focus by clicking on it.
2. Choose File, then Export as Image.
3. Select the type of image (gif, …).
4. Open the other software program (for example PowerPoint).
5. Insert the picture.

75

Saving graphs

Graphs can also be saved in a SAS catalog. They are stored in a SAS proprietary format. They can be viewed with proc greplay.

goptions replace;
libname mylib 'c:\Temp\sasclass\myfiles';
proc gchart data=mydat gout=mylib.mygraphs;
…

proc greplay allows multiple plots on one page.
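To look at graphs saved this way, proc greplay can be run without its full-screen windows. The following is a minimal sketch, assuming the graphs were stored in a catalog named mygraphs in the library mylib defined by the libname statement shown here; adjust both names to match your own GOUT= setting.

```sas
/* Sketch only: assumes graphs were saved earlier with gout=mylib.mygraphs */
libname mylib 'c:\Temp\sasclass\myfiles';

proc greplay igout=mylib.mygraphs nofs;
  list igout;      /* write the catalog entries to the log */
  replay _all_;    /* redisplay every graph in the catalog */
run;
quit;
```

Listing the catalog first shows the entry names, which is what you would use to replay a single graph instead of all of them.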

76

PROC GPLOT

• graphs one variable against another, producing presentation-quality plots
• coordinates of each point correspond to the values in one or more observations of the input data set
• run-group processing
- the procedure does not end with a run
- submit new statements and produce more graphs without another PROC
- ends with QUIT or PROC or DATA
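Run-group processing can be seen in a short sketch. It assumes the ca88air dataset with the variables o3, co, and month described on the data slide.

```sas
proc gplot data=ca88air;
  plot o3 * month;
run;              /* the first graph is drawn; the procedure stays active */

  plot co * month;
run;              /* a second graph, with no new PROC statement */
quit;             /* ends the procedure */
```

Forgetting the final quit leaves the procedure running until the next PROC or DATA step starts, which is why the examples in these slides end with quit.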

77

Proc GPLOT

• produces two-dimensional graphs that plot one variable against another within a set of coordinate axes
• graphs are automatically scaled to the values of your data, although scaling can be controlled with options or with AXIS statements
• scatterplots, bubble plots, and plots with interpolated lines (SYMBOL statement)

78

[Figure: a plot with the vertical axis (Y variable) and the horizontal axis (X variable) labeled, showing tick marks and their values.]


79

GPLOT SYNTAX

PROC GPLOT data=data-set-name <options>;
  PLOT request-list </ options-list>;

The request list is of the form:

vertical*horizontal            e.g. PLOT y*x;

vertical*horizontal=variable   e.g. PLOT y*x=z;

80

GPLOT SYNTAX

Graphics options on PLOT statement

CTEXT=color
LEGEND=LEGENDn   (uses nth global LEGEND statement)
HAXIS=AXISn      (uses nth global AXIS statement)
VAXIS=AXISn      (uses nth global AXIS statement)

81

Proc GPLOT example

• Suppose we are asked to draw a plot of ozone by month for the three stations SFO, LIV, AZU. After consulting the help we might try:

proc gplot data=ca88air;
  plot o3 * month;
run;
quit;

which produces:

82

83

Proc GPLOT example

• Increase the size of the text
• Use a format to print out month names
• Clear the unwanted footnote

goptions gunit=pct htext=4;
footnote1;

proc gplot data=ca88air;
  plot o3 * month;
  format month mth.;
  title1 '1988 Air Quality Data - Ozone';
run;

84


85

Proc GPLOT example

• Back to the help
• You can make a stratified plot by station
• The x-axis is too crowded: use a different format

proc gplot data=ca88air;
  plot o3 * month = station;
  format month mthc.;
  title1 '1988 Air Quality Data - Ozone';
run;

86

87

Proc GPLOT example

• The symbols in the plot are too small
• Use symbol global statements!

symbol1 v=dot i=join c=blue  h=1.3;
symbol2 v=dot i=join c=green h=1.3;
symbol3 v=dot i=join c=brown h=1.3;

proc gplot data=ca88air;
  plot o3 * month = station;
  format month mthc.;
  title1 '1988 Air Quality Data - Ozone';
run;

88

89

Proc GPLOT example

The x-axis is not right: use an axis global statement.

axis1 minor = none
      label = (f=simplex j=c 'Ozone levels at three locations')
      major = (h=1.1)
      order = (0 to 13 by 1)
      value = (f=simplex h=3.0);

proc gplot data=ca88air;
  plot o3 * month = station / haxis=axis1;
  format month mthc.;
  title1 '1988 Air Quality Data - Ozone';
run;

90


91

Proc GPLOT example

• The x-axis has extra characters: use a new format or use an axis global statement
• The y-axis label needs to be rotated and placed in the center of the axis
• The legend needs moving: use the legend global statement

axis1 minor = none
      label = (f=centb j=c 'Ozone levels at three locations')
      major = (h=1.0)
      order = (0 to 13 by 1)
      value = (f=simplex h=3.0 " " "J" "F" "M" "A" "M" "J" "J" "A" "S" "O" "N" "D" " ");

92

Proc GPLOT example

axis2 label = (f=centb rotate=0 angle=90 j=c 'Ozone')
      value = (f=simplex h=3.0);

legend1 across=3 position=(bottom center inside) label=none;

proc gplot data=ca88air;
  plot o3 * month = station / haxis=axis1 vaxis=axis2 legend=legend1;
  format month mthc.;
  title1 '1988 Air Quality Data - Ozone';
run;

93 94

proc g3d and proc gcontour produce 3-dimensional analogs of gplot
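The PLOT request y*x=z carries over to the three-dimensional procedures of SAS/GRAPH, proc g3d and proc gcontour. The sketch below assumes a hypothetical dataset surface holding a value z for each point of a regular grid of x and y values; none of these names come from the lecture data.

```sas
/* Sketch only: surface, x, y and z are hypothetical names */

/* Surface plot of z over the x-y grid */
proc g3d data=surface;
  plot y * x = z;
run;
quit;

/* Contour plot of the same data */
proc gcontour data=surface;
  plot y * x = z;
run;
quit;
```

Both procedures expect the data on a grid, so irregularly spaced observations usually need to be interpolated first.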