Chapter 4: Creating Simple Queries

Upload: henry-levine

Post on 02-Jan-2016

TRANSCRIPT

Page 1: Chapter 4: Creating Simple Queries

Chapter 4: Creating Simple Queries

4.1 Introduction to Querying Data

4.2 Filtering and Sorting Data

4.3 Creating New Columns with an Expression

4.4 Grouping and Summarizing Data in a Query

4.5 Joining Tables

4.6 Joining Tables Including Nonmatching Rows (Self-Study)

4.7 Creating New Columns by Recoding Values (Self-Study)

Page 2: Chapter 4: Creating Simple Queries

4.1 Introduction to Querying Data

Page 3: Chapter 4: Creating Simple Queries

Objectives

- State the function of the Filter and Sort task and the Query Builder.
- Compare the functionality available in each task.

Page 4: Chapter 4: Creating Simple Queries

Filter and Sort Task and the Query Builder

The Filter and Sort task and the Query Builder each create a new data source according to criteria that you specify. The Filter and Sort task works with a single table; the Query Builder can also combine several tables.

Page 6: Chapter 4: Creating Simple Queries

4.01 Multiple Answer Poll

Double-click any data source in your project. Select Filter and Sort and explore the available tabs. What functionality do you think is supported by this task?

a. Subsetting rows

b. Selecting columns

c. Calculating new columns

d. Controlling the sort order of the rows

e. Summarizing data

f. Creating a SAS data set

Page 7: Chapter 4: Creating Simple Queries

4.01 Multiple Answer Poll – Correct Answers

Double-click any data source in your project. Select Filter and Sort and explore the available tabs. What functionality do you think is supported by this task?

a. Subsetting rows

b. Selecting columns

c. Calculating new columns

d. Controlling the sort order of the rows

e. Summarizing data

f. Creating a SAS data set

Correct answers: a, b, d, and f. Calculating new columns and summarizing data require the Query Builder.

Page 8: Chapter 4: Creating Simple Queries

Filter and Sort Task

The Filter and Sort task enables you to create a new SAS table by selecting rows, columns, and a sort sequence.

Page 10: Chapter 4: Creating Simple Queries

4.02 Quiz

Close the Filter and Sort task and return to the data grid. Select Query Builder. What options appear to be available that are not present in the Filter and Sort task?

Page 11: Chapter 4: Creating Simple Queries

4.02 Quiz – Correct Answer

Close the Filter and Sort task and return to the data grid. Select Query Builder. What options appear to be available that are not present in the Filter and Sort task?

Possible answers: Query name, Output name, Computed Columns, Prompt Manager, Tools, Options, Add Tables, Join Tables

Page 12: Chapter 4: Creating Simple Queries

Query Builder

The Query Builder enables you to create a new SAS table by selecting rows, columns, and a sort sequence. It also lets you compute new columns, join tables, group and summarize data, and modify column attributes.

Page 13: Chapter 4: Creating Simple Queries

Filter and Sort Task versus the Query Builder

Capability                   Filter and Sort   Query Builder
Sort data                    Yes               Yes
Filter rows and columns      Yes               Yes
Create a new SAS data set    Yes               Yes
Define new columns           No                Yes
Join tables                  No                Yes
Group and summarize data     No                Yes
Define column attributes     No                Yes
Remove duplicates            No                Yes

Page 14: Chapter 4: Creating Simple Queries

4.2 Filtering and Sorting Data

Page 15: Chapter 4: Creating Simple Queries

Objectives

- Apply a filter in a query.
- Exclude columns in a query.
- Reorder rows in a query.

Page 16: Chapter 4: Creating Simple Queries

Business Scenario

Orion Star wants to analyze Internet sales since 2008. To prepare the data for input to the various analytic tasks, the company must generate a new data source from the orders table, including only those Internet orders placed on or after 01JAN2008.

Internet orders (Order_Type = 3)
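The filter in this scenario can be sketched as a query. The block below is only an illustration written in Python with the standard-library sqlite3 module; it is not the code that SAS Enterprise Guide produces. Order_Type comes from the slide; the Order_ID column, the ISO date strings, and all sample rows are assumptions made for the example.

```python
import sqlite3

# Hypothetical sample of the orders table. Order_ID and every value below
# are made up; dates are stored as ISO strings so they compare correctly.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (Order_ID INTEGER, Order_Type INTEGER, Order_Date TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, 1, "2008-03-15"),  # not an Internet order, so it is filtered out
        (2, 3, "2007-12-31"),  # Internet order placed before the cutoff
        (3, 3, "2008-06-01"),
        (4, 3, "2008-01-01"),  # the cutoff date itself is included
    ],
)

# Select columns, keep Internet orders placed on or after 1 January 2008,
# and sort the result by order date.
internet_orders = conn.execute(
    """
    SELECT Order_ID, Order_Date
    FROM orders
    WHERE Order_Type = 3 AND Order_Date >= '2008-01-01'
    ORDER BY Order_Date
    """
).fetchall()

print(internet_orders)
```

The three clauses correspond to the three things the scenario asks for: the column list selects columns, WHERE subsets rows, and ORDER BY sets the sort sequence.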

Page 17: Chapter 4: Creating Simple Queries

Filter and Sort Task

The Variables, Filter, and Sort tabs in the Filter and Sort task provide functionality to select rows and columns in a designated sort order.

Page 18: Chapter 4: Creating Simple Queries

Filter and Sort: Filter

Simple filters can be built using variable names, operators, and data values. Select Advanced Edit… to build more complex filters.

Page 19: Chapter 4: Creating Simple Queries

Advanced Filter Builder

The Advanced Filter Builder provides access to advanced operators and SAS functions to create more complex rules for extracting rows.

Page 20: Chapter 4: Creating Simple Queries

Filter and Sort: Sort and Results

You can sort by multiple variables and designate either ascending or descending sequence for each. You can also name the task and the output table.

Page 21: Chapter 4: Creating Simple Queries

Query Builder

The Query Builder provides similar tabs for selecting columns, filtering rows, and sorting data. Additional functionality is available, including the following:

- modifying column properties
- grouping and summarizing data
- applying formats
- selecting distinct rows
- joining tables

Page 22: Chapter 4: Creating Simple Queries

Using Query Results in Tasks

Data sources generated from queries can serve as the input data for follow-up tasks.

Page 23: Chapter 4: Creating Simple Queries

Selecting Columns and Filtering Rows

Page 25: Chapter 4: Creating Simple Queries

Exercise

This exercise reinforces the concepts discussed previously.

Page 27: Chapter 4: Creating Simple Queries

4.3 Creating New Columns with an Expression

Page 28: Chapter 4: Creating Simple Queries

Objectives

- Define a new column of data in a query by building an expression.

Page 29: Chapter 4: Creating Simple Queries

Business Scenario

Orion Star wants to analyze shipment methods by determining how many days elapse between each order date and delivery date. The company also wants to calculate the total amount invoiced to the customer, which is the sum of total retail price and shipping charges.

Delivery_Date - Order_Date

SUM(Total_Retail_Price, Shipping)
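To make the two expressions concrete, here is a small Python sketch that computes the same quantities for a couple of made-up orders. The four input column names follow the slide; the output names Days_To_Deliver and Total_Invoiced and all values are hypothetical, and the sample deliberately contains no missing values.

```python
from datetime import date

# Hypothetical order rows; the values are invented for illustration.
orders = [
    {"Order_Date": date(2008, 1, 10), "Delivery_Date": date(2008, 1, 14),
     "Total_Retail_Price": 120.50, "Shipping": 9.50},
    {"Order_Date": date(2008, 2, 1), "Delivery_Date": date(2008, 2, 1),
     "Total_Retail_Price": 35.00, "Shipping": 0.00},
]

for row in orders:
    # Delivery_Date - Order_Date: days elapsed between order and delivery.
    row["Days_To_Deliver"] = (row["Delivery_Date"] - row["Order_Date"]).days
    # SUM(Total_Retail_Price, Shipping): total amount invoiced.
    row["Total_Invoiced"] = row["Total_Retail_Price"] + row["Shipping"]

print([(r["Days_To_Deliver"], r["Total_Invoiced"]) for r in orders])
```

Subtracting two dates yields a number of days in both settings, because SAS stores a date as a count of days.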

Page 30: Chapter 4: Creating Simple Queries

Computed Columns

New summarized columns, recoded columns, or columns based on an expression can be added to a query in the Query Builder.

Select to begin creating a new column.

Page 31: Chapter 4: Creating Simple Queries

New Computed Column Wizard

A wizard guides you through the process of creating the new column and assigning attributes such as the column name, label, and format.

Page 32: Chapter 4: Creating Simple Queries

Expression Editor

The Expression Editor enables you to build expressions based on variables, operators, and functions.

Page 33: Chapter 4: Creating Simple Queries

SAS Functions

A SAS function is a routine that returns a value that is determined from specified arguments.

General form of a SAS function:

function-name(argument1,argument2, . . .)

Example:

sum(Salary,Bonus)

Page 34: Chapter 4: Creating Simple Queries

Using SAS Functions

SAS functions can do the following:

- perform arithmetic operations
- compute sample statistics (for example, sum, mean, and standard deviation)
- manipulate SAS dates
- process character values
- perform many other tasks

Sample statistics functions ignore missing values.

Page 36: Chapter 4: Creating Simple Queries

4.03 Multiple Choice Poll

What is the result of the expression given the values of Var1, Var2, and Var3?

Var1+Var2+Var3

Var1  Var2  Var3
9     .     3

a. . (missing)

b. 3

c. 9

d. 12

Page 37: Chapter 4: Creating Simple Queries

4.03 Multiple Choice Poll – Correct Answer

What is the result of the expression given the values of Var1, Var2, and Var3?

Var1+Var2+Var3

Var1  Var2  Var3
9     .     3

a. . (missing)

b. 3

c. 9

d. 12

Correct answer: a. An arithmetic expression returns a missing value when any operand is missing.

Page 38: Chapter 4: Creating Simple Queries

4.04 Multiple Choice Poll

What is the result of the expression given the values of Var1, Var2, and Var3?

sum(Var1,Var2,Var3)

Var1  Var2  Var3
9     .     3

a. . (missing)

b. 3

c. 9

d. 12

Page 39: Chapter 4: Creating Simple Queries

4.04 Multiple Choice Poll – Correct Answer

What is the result of the expression given the values of Var1, Var2, and Var3?

sum(Var1,Var2,Var3)

Var1  Var2  Var3
9     .     3

a. . (missing)

b. 3

c. 9

d. 12

Correct answer: d. The SUM function ignores missing values, so the result is 9 + 3 = 12.
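The difference between the + operator and the SUM function in the two polls can be imitated in a few lines of Python. This is an emulation for illustration only: None stands in for a SAS missing value, and both helper functions are hypothetical names written for this sketch.

```python
def add_with_operator(*values):
    """Emulate Var1+Var2+Var3: any missing operand makes the result missing."""
    if any(v is None for v in values):
        return None
    return sum(values)


def sas_style_sum(*values):
    """Emulate sum(Var1,Var2,Var3): missing arguments are ignored."""
    present = [v for v in values if v is not None]
    # When every argument is missing, the result is missing as well.
    return sum(present) if present else None


var1, var2, var3 = 9, None, 3  # None plays the role of the missing value (.)

operator_result = add_with_operator(var1, var2, var3)
function_result = sas_style_sum(var1, var2, var3)

print(operator_result)  # None
print(function_result)  # 12
```

This is why the business scenario uses SUM(Total_Retail_Price, Shipping) rather than a plus sign: an order with a missing shipping charge still gets a total.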

Page 40: Chapter 4: Creating Simple Queries

Computed Columns

Computed columns appear in the left pane and can be used in a filter, for sorting, or as an input to another computed column.

Page 41: Chapter 4: Creating Simple Queries

Creating a Column with an Expression

This demonstration illustrates using the Computed Column wizard to define new columns based on advanced expressions.

SUM(Total_Retail_Price, Shipping)

Delivery_Date - Order_Date

Page 43: Chapter 4: Creating Simple Queries

Exercise

This exercise reinforces the concepts discussed previously.

Page 45: Chapter 4: Creating Simple Queries

4.4 Grouping and Summarizing Data in a Query

Page 46: Chapter 4: Creating Simple Queries

Objectives

- Assign a grouping variable in a query.
- Select the analysis variable and the summary statistic to compute.
- Filter grouped data.

Page 47: Chapter 4: Creating Simple Queries

Business Scenario

Orion Star wants to offer a sales promotion that highlights the most lucrative products. The company would like a list of all products with a total profit that exceeds $500.

Page 48: Chapter 4: Creating Simple Queries

Grouping Data

The Query Builder can be used to group and summarize data.

Page 49: Chapter 4: Creating Simple Queries

Grouping Data

Data can be grouped and summarized using the Select Data tab.

Choose a statistic for columns to be summarized.

Columns without an assigned statistic will automatically define the groups.

Page 50: Chapter 4: Creating Simple Queries

Grouping by Column Values

The query result includes one row for every unique value of the group column(s) and a calculated statistic for the summarized column(s).
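As a rough illustration of "one row per group value", the following Python sketch runs a grouped query with the standard-library sqlite3 module. It is a stand-in, not SAS code; the order_items table, its columns, and its rows are assumptions made for the example.

```python
import sqlite3

# Hypothetical detail table: several profit rows per product.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE order_items (Product_ID INTEGER, Profit REAL)")
conn.executemany(
    "INSERT INTO order_items VALUES (?, ?)",
    [(101, 300.0), (101, 250.0), (102, 40.0), (103, 600.0), (102, 10.0)],
)

# Product_ID carries no summary statistic, so it defines the groups;
# Profit is summarized with SUM.
totals = conn.execute(
    """
    SELECT Product_ID, SUM(Profit) AS Total_Profit
    FROM order_items
    GROUP BY Product_ID
    ORDER BY Product_ID
    """
).fetchall()

print(totals)
```

Five detail rows collapse to three result rows, one for each distinct Product_ID.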

Page 52: Chapter 4: Creating Simple Queries

4.05 Quiz

1. Open the Query Builder and use any data source in the current project.

2. Click the Filter Data tab and notice the layout.

3. Return to the Select Data tab and add any two columns.

4. For one of the columns in the Select Data tab, select Count in the Summary field.

5. Return to the Filter Data tab.

How does the Filter Data tab change after a query includes grouped data?

Page 53: Chapter 4: Creating Simple Queries

4.05 Quiz – Correct Answer

How does the Filter Data tab change after a query includes grouped data?

An additional pane labeled “Filter the summarized data” is added to the Filter Data tab.

Page 54: Chapter 4: Creating Simple Queries

Filtering Data

The Filter Data tab can be used to filter both raw data and summarized data.
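The two kinds of filter act at different stages: a raw-data filter removes rows before grouping, and a summarized-data filter removes groups afterward. The Python sqlite3 sketch below illustrates this with WHERE and HAVING. It is an analogy rather than SAS code, and the table, column names, and rows are assumptions.

```python
import sqlite3

# Hypothetical detail table with an order type and a profit per row.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE order_items (Product_ID INTEGER, Order_Type INTEGER, Profit REAL)"
)
conn.executemany(
    "INSERT INTO order_items VALUES (?, ?, ?)",
    [
        (101, 3, 300.0), (101, 3, 250.0), (101, 1, 900.0),
        (102, 3, 40.0),
        (103, 3, 600.0), (103, 1, 50.0),
        (104, 1, 700.0),
    ],
)

top_products = conn.execute(
    """
    SELECT Product_ID, SUM(Profit) AS Total_Profit
    FROM order_items
    WHERE Order_Type = 3          -- raw-data filter, applied before grouping
    GROUP BY Product_ID
    HAVING SUM(Profit) > 500      -- summarized-data filter, applied to groups
    ORDER BY Product_ID
    """
).fetchall()

print(top_products)
```

Product 104 has a 700.0 profit row but is dropped by the raw-data filter before any totals are computed; product 102 survives that filter and is then dropped because its total does not exceed 500.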

Page 55: Chapter 4: Creating Simple Queries

Summarizing and Filtering by Groups

This demonstration illustrates grouping, summarizing, and filtering grouped data.

Page 57: Chapter 4: Creating Simple Queries

Exercise

This exercise reinforces the concepts discussed previously.

Page 59: Chapter 4: Creating Simple Queries

4.5 Joining Tables

Page 60: Chapter 4: Creating Simple Queries

Objectives

- Join multiple tables by common columns.
- Include only matching rows.

Page 61: Chapter 4: Creating Simple Queries

Business Scenario

In a previous query, products with total profits exceeding $500 were identified. Analysts asked for more details about these top products, including the product category, product, supplier, and country name. The columns to include come from three different tables: topproducts, products, and country_lookup.

Page 62: Chapter 4: Creating Simple Queries

Business Scenario

To include the necessary columns, the topproducts SAS table must be joined with the products SAS table and the country_lookup Excel spreadsheet.

Page 63: Chapter 4: Creating Simple Queries

Joining Tables

Joining tables enables you to extract and simultaneously process data from more than one table.

Page 64: Chapter 4: Creating Simple Queries

Joining Tables

By default, the Query Builder includes only matching rows in the results.
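A matching-rows-only join can be illustrated with a small Python sqlite3 sketch. The table names topproducts and products come from the scenario; the column names, the sample rows, and the use of SQLite instead of SAS are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE topproducts (Product_ID INTEGER, Total_Profit REAL);
    CREATE TABLE products (Product_ID INTEGER, Product_Name TEXT);
    -- Hypothetical rows: 999 has no product record, 102 is not a top product.
    INSERT INTO topproducts VALUES (101, 550.0), (103, 600.0), (999, 720.0);
    INSERT INTO products VALUES (101, 'Tent'), (102, 'Lantern'), (103, 'Kayak');
    """
)

# Inner join: a row appears only when the key exists in both tables.
matched = conn.execute(
    """
    SELECT t.Product_ID, p.Product_Name, t.Total_Profit
    FROM topproducts AS t
    INNER JOIN products AS p ON t.Product_ID = p.Product_ID
    ORDER BY t.Product_ID
    """
).fetchall()

print(matched)
```

Neither product 999 nor product 102 appears in the output, because each is present in only one of the two tables.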

Page 66: Chapter 4: Creating Simple Queries

4.06 Multiple Answer Poll

Which customers will be returned by the Query Builder if these tables are combined using the default join type?

a. Smith, John (00001)

b. Anderson, Tim (00002)

c. Jones, Betsy (00003)

d. Customer 00004

e. Rigsbee, Marilyn (00005)

Page 67: Chapter 4: Creating Simple Queries

4.06 Multiple Answer Poll – Correct Answers

Which customers will be returned by the Query Builder if these tables are combined using the default join type?

a. Smith, John (00001)

b. Anderson, Tim (00002)

c. Jones, Betsy (00003)

d. Customer 00004

e. Rigsbee, Marilyn (00005)

Page 68: Chapter 4: Creating Simple Queries

Tables and Joins Window

Select Join Tables to access the Tables and Joins window. This window enables you to add tables and to verify or change the criteria used to join them.

Page 69: Chapter 4: Creating Simple Queries

Join Properties

The Join Properties window enables you to modify the join type or condition. Selecting a different join type can be used to identify or eliminate nonmatching rows.

Page 70: Chapter 4: Creating Simple Queries

Query Options

Select Options to customize the query, including the type of result produced, query limits, and the SAS server that will execute the query.

Page 72: Chapter 4: Creating Simple Queries

Setup for the Poll

1. Right-click any data source in the project and select Query Builder….

2. Select Options > Server and carefully read the warning regarding the SAS server for the query.

Page 73: Chapter 4: Creating Simple Queries

4.07 Multiple Choice Poll

Assume that you have SAS on both your local machine and a remote server. If you want to join an Excel spreadsheet on your PC with a large table on the server, what should you do?

a. Nothing. Allow SAS Enterprise Guide to choose where to process the query.

b. Modify the query options to force the query to process on the local server.

c. Modify the query options to force the query to process on your remote SAS Server.

Page 74: Chapter 4: Creating Simple Queries

4.07 Multiple Choice Poll – Correct Answer

Assume that you have SAS on both your local machine and a remote server. If you want to join an Excel spreadsheet on your PC with a large table on the server, what should you do?

a. Nothing. Allow SAS Enterprise Guide to choose where to process the query.

b. Modify the query options to force the query to process on the local server.

c. Modify the query options to force the query to process on your remote SAS Server.

Page 75: Chapter 4: Creating Simple Queries

Join Results

When joining tables in the Query Builder, you can also filter or sort on any column from the input tables, compute new columns, and group and summarize data.

Page 76: Chapter 4: Creating Simple Queries

Joining Tables

This demonstration illustrates how to join multiple tables and store the result in a data table.

Page 78: Chapter 4: Creating Simple Queries

Exercise

This exercise reinforces the concepts discussed previously.

Page 80: Chapter 4: Creating Simple Queries

4.6 Joining Tables Including Nonmatching Rows (Self-Study)

Page 81: Chapter 4: Creating Simple Queries

Objectives

- Perform different join types.

Page 82: Chapter 4: Creating Simple Queries

Business Scenario

In an effort to improve customer retention, the Marketing Department at Orion Star would like to identify those customers in the database who did not place a recent order.

Page 83: Chapter 4: Creating Simple Queries

Joining Tables

Types of joins:

- Matching Rows Only (SAS Enterprise Guide default): returns only the rows from one table that have a corresponding match in every other table.
- All Rows from one or both tables: returns all of the matched rows from both tables and the unmatched rows from at least one table. The variants are All Rows from A, All Rows from A and B, and All Rows from B.
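Which keys survive each join type can be pictured with set operations, in the spirit of the A and B diagrams on this slide. The Python sketch below is a simplification that assumes each key occurs at most once per table; the customer IDs are made up.

```python
# Hypothetical customer IDs present in two tables, A and B.
table_a = {"00001", "00002", "00003"}
table_b = {"00002", "00003", "00005"}

join_keys = {
    "Matching rows only": table_a & table_b,     # keys found in both tables
    "All rows from A": table_a,                  # matches plus unmatched rows of A
    "All rows from B": table_b,                  # matches plus unmatched rows of B
    "All rows from A and B": table_a | table_b,  # every key from either table
}

for join_type, keys in join_keys.items():
    print(f"{join_type}: {sorted(keys)}")
```

In an actual join, the columns that come from the table without a match are filled with missing values for the unmatched rows.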

Page 84: Chapter 4: Creating Simple Queries

Review: Matching Rows Only

Page 85: Chapter 4: Creating Simple Queries

Including Nonmatching Rows

All rows from customerdatabase and itemsordered

Page 86: Chapter 4: Creating Simple Queries

Including Nonmatching Rows

All rows from customerdatabase

Page 87: Chapter 4: Creating Simple Queries

Including Nonmatching Rows

All rows from itemsordered

Page 88: Chapter 4: Creating Simple Queries

Join Properties (Review)

The Join Properties window enables you to modify the join type or condition. Selecting a different join type can be used to identify or eliminate nonmatching rows.

Page 89: Chapter 4: Creating Simple Queries

Isolating Nonmatching Rows

The query can also include a filter to isolate the nonmatching rows from one or both tables.

Customers in the CustomerDatabase table who have not placed orders

Filter to include only rows where Customer_ID is missing from the orders table
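The pattern described here, keeping all rows from one table and then filtering on a missing key from the other, can be sketched in Python with sqlite3. This is an illustrative stand-in rather than SAS code; the column names and sample rows are assumptions, and SQL NULL plays the role of a SAS missing value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customerdatabase (Customer_ID TEXT, Customer_Name TEXT);
    CREATE TABLE orders (Order_ID INTEGER, Customer_ID TEXT);
    -- Hypothetical rows: customer 00002 has never placed an order.
    INSERT INTO customerdatabase VALUES
        ('00001', 'Customer A'), ('00002', 'Customer B'), ('00003', 'Customer C');
    INSERT INTO orders VALUES (1, '00001'), (2, '00003'), (3, '00003');
    """
)

# Keep every customer row, then retain only those with no matching order:
# for them, the Customer_ID taken from the orders table is missing (NULL).
no_orders = conn.execute(
    """
    SELECT c.Customer_ID, c.Customer_Name
    FROM customerdatabase AS c
    LEFT JOIN orders AS o ON c.Customer_ID = o.Customer_ID
    WHERE o.Customer_ID IS NULL
    """
).fetchall()

print(no_orders)
```

The filter must test the key column from the orders side of the join; the customer-side key is never missing.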

Page 91: Chapter 4: Creating Simple Queries

4.08 Multiple Choice Poll

Which join type is the most appropriate starting point for isolating orders placed for products that are no longer included in the products table?

a. Matching rows only

b. All rows from products

c. All rows from orders

d. All rows from products and orders

Page 92: Chapter 4: Creating Simple Queries

4.08 Multiple Choice Poll – Correct Answer

Which join type is the most appropriate starting point for isolating orders placed for products that are no longer included in the products table?

a. Matching rows only

b. All rows from products

c. All rows from orders

d. All rows from products and orders

Correct answer: c. Keeping all rows from orders retains the orders that have no match in the products table, which can then be isolated with a filter.

Page 93: Chapter 4: Creating Simple Queries

Joining Tables Including Nonmatching Rows

This demonstration illustrates how to change the join type to include nonmatching rows in a query.

Page 95: Chapter 4: Creating Simple Queries

Exercise

This exercise reinforces the concepts discussed previously.

Page 97: Chapter 4: Creating Simple Queries

4.7 Creating New Columns by Recoding Values (Self-Study)

Page 98: Chapter 4: Creating Simple Queries

Objectives

- Recode individual values or a range of values in a column.

Page 99: Chapter 4: Creating Simple Queries

Business Scenario

To further analyze profit per order, management would like to categorize each order in the following ranges:

- $0 to $100
- $100 to $500
- $500 and Above

Page 100: Chapter 4: Creating Simple Queries

Recoded Columns

New columns can also be derived by recoding values from an existing column.

Page 101: Chapter 4: Creating Simple Queries

Recoded Values

Recoding a column enables you to assign a value to a new column based on the value of an existing column.

When Order_Type=1 is TRUE, then Order_Type_Detail='Retail Sale'.

If FALSE, when Order_Type=2 is TRUE, then Order_Type_Detail='Catalog Sale'.

If FALSE, when Order_Type=3 is TRUE, then Order_Type_Detail='Internet Sale'.
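The chain of conditions in this flowchart behaves like a SQL CASE expression. The Python sqlite3 sketch below is an illustration under assumptions, not the code that the wizard produces: the Order_ID column and the sample rows are made up, while Order_Type, Order_Type_Detail, and the three labels come from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (Order_ID INTEGER, Order_Type INTEGER)")
# Hypothetical rows covering the three order types shown on the slide.
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 1), (2, 2), (3, 3)])

# Each WHEN is tested in turn; the first one that is true supplies the value.
recoded = conn.execute(
    """
    SELECT Order_ID,
           CASE Order_Type
               WHEN 1 THEN 'Retail Sale'
               WHEN 2 THEN 'Catalog Sale'
               WHEN 3 THEN 'Internet Sale'
           END AS Order_Type_Detail
    FROM orders
    ORDER BY Order_ID
    """
).fetchall()

print(recoded)
```

The original Order_Type column is left unchanged; the recoded text goes into the new Order_Type_Detail column.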

Page 103: Chapter 4: Creating Simple Queries

4.09 Quiz

What should be assigned to the new column if Order_Type = 999?

Page 104: Chapter 4: Creating Simple Queries

4.09 Quiz – Correct Answer

What should be assigned to the new column if Order_Type = 999?

Possible answers:

- Assign a missing value.
- Assign ‘999’.
- Assign ‘Other’.

Page 105: Chapter 4: Creating Simple Queries

Recode a Column

The New Computed Column wizard provides an option for recoding the values of an existing column in the input table.

Page 106: Chapter 4: Creating Simple Queries

Specify a Replacement

The wizard enables you to specify replacements based on distinct values, ranges, or conditions.

Select the new column type before you define replacement values.

Determine a value for data not assigned a replacement.
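A range-based replacement with a fallback value might look like the following Python sketch, which uses the profit ranges from the business scenario. The function name is hypothetical, and two choices are assumptions because the slides do not specify them: each range includes its lower bound and excludes its upper bound, and values outside every range receive the label Other.

```python
def recode_profit(profit):
    """Map a profit value to a range label; boundary handling is an assumption."""
    if profit is None:
        return "Other"            # fallback for data with no assigned replacement
    if 0 <= profit < 100:
        return "$0 to $100"
    if 100 <= profit < 500:
        return "$100 to $500"
    if profit >= 500:
        return "$500 and Above"
    return "Other"                # for example, a negative profit


# Hypothetical profit values, including both boundaries and two uncovered cases.
profits = [25.0, 100.0, 499.99, 500.0, -10.0, None]
labels = [recode_profit(p) for p in profits]

print(labels)
```

Because the new column holds text labels, it is a character column, which is why the column type has to be chosen before the replacement values are defined.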

Page 107: Chapter 4: Creating Simple Queries

Creating a New Column by Recoding Values

This demonstration illustrates recoding values in a query to create a new column based conditionally on an existing column.

Page 109: Chapter 4: Creating Simple Queries

Chapter Review

1. Name at least three tasks that you can do in the Query Builder that you cannot do in the Filter and Sort task.

2. Can you filter or sort on a calculated column?

3. What is the default join type?

Page 110: Chapter 4: Creating Simple Queries

Chapter Review Answers

1. Name at least three tasks that you can do in the Query Builder that you cannot do in the Filter and Sort task.

Define new columns. Join tables. Group and summarize data. Define column attributes. Remove duplicate rows.

2. Can you filter or sort on a calculated column?

Yes, you can filter or sort on a column whose values are created during processing.

3. What is the default join type?

The default join type is the inner join, or matching rows only.