investigating objects and data patterns using base r

106
Investigating objects and data patterns using base R Managing and Manipulating Data Using R 1 / 106

Upload: others

Post on 17-May-2022

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Investigating objects and data patterns using base R

Investigating objects and data patterns using base RManaging and Manipulating Data Using R

1 / 106

Page 2: Investigating objects and data patterns using base R

Lecture outline1. Investigate objects, base R

1.1 Functions to describe objects

1.2 Variables names

1.3 View and print data

1.4 Missing values

2. Subsetting using subset operators

2.1 Subset atomic vectors using []

2.2 Subsetting lists/data frames using []

2.3 Subsetting lists/data frames using [[]] and $

2.4 Subset Data frames by combining [] and $

3. Subset using subset() function

4. Creating variables

5. Appendix

5.1 Sorting data

2 / 106

Page 3: Investigating objects and data patterns using base R

1 Investigate objects, base R

3 / 106

Page 4: Investigating objects and data patterns using base R

Load .Rdata data frames we will use today

Data on off-campus recruiting events by public universities

▶ Data frame object df_event▶ One observation per university, recruiting event

▶ Data frame object df_school▶ One observation per high school (visited and non-visited)

rm(list = ls()) # remove all objects in current environment

getwd()#> [1] "C:/Users/ozanj/Documents/rclass1/lectures/patterns_base_r"#load dataset with one obs per recruiting eventload(url("https://github.com/ozanj/rclass/raw/master/data/recruiting/recruit_event_somevars.RData"))#load("../../data/recruiting/recruit_event_somevars.Rdata")

#load dataset with one obs per high schoolload(url("https://github.com/ozanj/rclass/raw/master/data/recruiting/recruit_school_somevars.RData"))#load("../../data/recruiting/recruit_school_somevars.Rdata")

4 / 106

Page 5: Investigating objects and data patterns using base R

1.1 Functions to describe objects

5 / 106

Page 6: Investigating objects and data patterns using base R

Simple base R functions to describe objects

This section introduces some base R functions to describe objects (some of these youhave seen before)

▶ list objects, list.files() and ls()▶ remove objects, rm()▶ object type, typeof()▶ object length (number of elements), length()▶ object structure, str()▶ number of rows and columns, ncol() and nrow()

I use the functions typeof() , length() , str() anytime I encounter a new object

▶ Helps me understand the object before I start working with it

6 / 106

Page 7: Investigating objects and data patterns using base R

Listing objects

Files in your working directory

list.files() function lists files in your current working directory

▶ if you run this code from .Rmd file, working directory is location .Rmd file is stored

getwd() # what is your current working directory#> [1] "C:/Users/ozanj/Documents/rclass1/lectures/patterns_base_r"list.files()#> [1] "base_r_week1_video_lecture_script.R" "fp1.JPG"#> [3] "fp2.JPG" "one_carriage_train_vs_contents.png"#> [5] "patterns_base_r.pdf" "patterns_base_r.Rmd"#> [7] "patterns_base_r.tex" "smaller_trains.png"#> [9] "test.txt" "three_carriage_train.png"#> [11] "transform-logical.png"

7 / 106

Page 8: Investigating objects and data patterns using base R

Objects currently open in your R session

Listing objects currently open in your R session

ls() function lists objects currently open in R

x <- "hello!"ls() # Objects open in R#> [1] "df_event" "df_school" "x"

Removing objects currently open in your R session

rm() function removes specified objects open in R

rm(x)ls()#> [1] "df_event" "df_school"

Command to remove all objects open in R (I don’t run it)

#rm(list = ls())

8 / 106

Page 9: Investigating objects and data patterns using base R

Base R functions to describe objects, typeof()

typeof() function determines the the internal storage type of an object (e.g., logicalvector, integer vector, list)

▶ syntax▶ tyepof(x)

▶ arguments▶ x : any R object

▶ help:

?typeof

Examples

▶ Recall that a data frame is an object where type is a list

typeof(c(TRUE,TRUE,FALSE,NA))#> [1] "logical"typeof(df_event)#> [1] "list"typeof(x = df_event)#> [1] "list"

9 / 106

Page 10: Investigating objects and data patterns using base R

Base R functions to describe objects, length()length() function determines the length of an R object

▶ for atomic vectors and lists, length() is the number of elements in the object▶ syntax

▶ length(x)▶ arguments

▶ x : any R object▶ help:

?length

Example, length of an atomic vector is

length(c(TRUE,TRUE,FALSE,NA))#> [1] 4

Example, length of a list or data frame

▶ length of a list is the number of elements▶ data frame is a list▶ length of a data frame = number of elements = number of variables

length(df_event) # = num elements = num columns#> [1] 33

10 / 106

Page 11: Investigating objects and data patterns using base R

Base R functions to describe objects, str()str() function compactly displays the structure of an R object

▶ “structure” includes type, length, and attribute of object and also nested objects▶ syntax: str(object)▶ arguments (partial)

▶ object : any R object▶ max.level : max level of nesting to display nested structures; default NA = all levels

▶ help: ?str

Example, atomic vectors

str(c(TRUE,TRUE,FALSE,NA))#> logi [1:4] TRUE TRUE FALSE NAstr(object = c(TRUE,TRUE,FALSE,NA))#> logi [1:4] TRUE TRUE FALSE NA

Example, lists/data frames (output omitted)

x <- list(c(1,2), list("apple", "orange"), list(2, 3)) # liststr(x)

str(df_event) # data frame

11 / 106

Page 12: Investigating objects and data patterns using base R

Base R functions to describe objects, ncol() and nrow()ncol() nrow() and dim() functions

▶ Description▶ ncol() = number of columns; nrow() = number of rows

▶ syntax: ncol(x) nrow(x) dim(x)▶ arguments

▶ x : a vector, array, data frame, or NULL▶ value/return:

▶ if object x is an atomic vector: ncol() and nrow() returns NULL▶ if object x is a list but not a data frame: ncol() and nrow() returns NULL▶ if object x is a data frame: ncol() and nrow() returns integer of length 1

Example, object is a data frame

ncol(df_event) # num columns = num elements = num variables#> [1] 33nrow(df_event) # num rows = num observations#> [1] 18680# can wrap ncol() or nrow() within str() to see what functions return#str(ncol(df_event))

Example, object is atomic vector or list that is not a data frame (output omitted)

ncol(c(TRUE,TRUE,FALSE,NA)) # atomic vectorx <- list(c(1,2), list("apple", "orange"), list(2, 3)) # listnrow(x)

12 / 106

Page 13: Investigating objects and data patterns using base R

Base R functions to describe objects, dim()dim() function returns the dimensions of an object (e.g., number of rows and

columns)

▶ syntax: dim(x)▶ arguments

▶ x : a vector, array, data frame, or NULL▶ value/return:

▶ if object x is a data frame: dim() returns integer of length 2▶ first element = number of rows; second element = number of columns

▶ if object x is an atomic vector: dim() returns NULL▶ if object x is a list but not a data frame: dim() returns NULL

Example, object is a data frame

dim(df_event) # shows number rows by columns#> [1] 18680 33

str(dim(df_event)) # can wrap dim() within str() to see what functions return#> int [1:2] 18680 33

Example, object is atomic vector or list that is not a data frame (output omitted)

dim(c(TRUE,TRUE,FALSE,NA)) # atomic vectorx <- list(c(1,2), list("apple", "orange"), list(2, 3)) # listdim(x)

13 / 106

Page 14: Investigating objects and data patterns using base R

1.2 Variables names

14 / 106

Page 15: Investigating objects and data patterns using base R

names() functionnames() function gets or sets the names of elements of an object

▶ syntax:▶ get the names of an object: names(x)▶ set the names of an object: names(x) <- value

▶ arguments (partial)▶ x : an R object▶ value : a character vector with same length as object x or NULL

▶ value/return▶ names(x) returns a character vector of length = length(x) in which each element is

the name of the element of x

Example, get names (of atomic vector)

a <- c(v1=1,v2=2,3,v4="hi!") # named atomic vectora#> v1 v2 v4#> "1" "2" "3" "hi!"length(a)#> [1] 4names(a)#> [1] "v1" "v2" "" "v4"length(names(a)) # investigate length of object names(a)#> [1] 4str(names(a)) # investigate structure of object names(a)#> chr [1:4] "v1" "v2" "" "v4"

15 / 106

Page 16: Investigating objects and data patterns using base R

names() functionnames() function gets or sets the names of elements of an object

▶ syntax:▶ get the names of an object: names(x)▶ set the names of an object: names(x) <- value

▶ arguments (partial)▶ x : an R object▶ value : a character vector with same length as object x or NULL

▶ value/return▶ names(x) returns a character vector of legnth = length(x) in which each element is

the name of the element of x

Example, set names (of atomic vector)

names(a) <- NULL # set names of vector a to NULLa#> [1] "1" "2" "3" "hi!"names(a)#> NULL

names(a) <- c("var1","var2","var3","var4") # set names of vector aa#> var1 var2 var3 var4#> "1" "2" "3" "hi!"names(a)#> [1] "var1" "var2" "var3" "var4"

16 / 106

Page 17: Investigating objects and data patterns using base R

Applying names() function to a data frameRecall that a data frame is an object where type is a list and each element is named

▶ each element is a variable▶ each element name is a variable name

Example (output omitted)

names(df_event)

Investigate the object names(df_event)

typeof(names(df_event)) # type = character vector#> [1] "character"length(names(df_event)) # length = number of variables in data frame#> [1] 33str(names(df_event)) # structure of names(df_event)#> chr [1:33] "instnm" "univ_id" "instst" "pid" "event_date" "event_type" ...

We can even assign a new object based on names(df_event)

names_event <- names(df_event)typeof(names_event) # type = character vector#> [1] "character"length(names_event) # length = number of variables in data frame#> [1] 33str(names_event) # structure of names(df_event)#> chr [1:33] "instnm" "univ_id" "instst" "pid" "event_date" "event_type" ...

17 / 106

Page 18: Investigating objects and data patterns using base R

Variable names

Refer to specific named elements of an object using this syntax:

▶ object_name$element_name

When object is data frame, refer to specific variables using this syntax:

▶ data_frame_name$varname▶ This approach to isolating variables is very useful for investigating data

#df_event$instnmtypeof(df_event$instnm)#> [1] "character"typeof(df_event$med_inc)#> [1] "double"

18 / 106

Page 19: Investigating objects and data patterns using base R

Variable namesData frames are lists with the following criteria:

▶ each element of the list is (usually) a vector; each element of list is a variable▶ length of data frame = number of variables

length(df_event)#> [1] 33nrow(df_event)#> [1] 18680#str(df_event)

▶ each element of the list (i.e., variable) has the same length▶ Length of each variable is equal to number of observations in data frame

typeof(df_event$event_state)#> [1] "character"length(df_event$event_state)#> [1] 18680str(df_event$event_state)#> chr [1:18680] "MA" "MA" "MA" "MA" "MA" "MA" "MA" "MA" "MA" "MA" "MA" "MA" ...

typeof(df_event$med_inc)#> [1] "double"length(df_event$med_inc)#> [1] 18680str(df_event$med_inc)#> num [1:18680] 71714 89122 70137 70137 71024 ... 19 / 106

Page 20: Investigating objects and data patterns using base R

Variable namesThe object df_school has one obs per high school

▶ variable visits_by_100751 shows number the of visits by University of Alabamato each high school

▶ like all variables in a data frame, the var visits_by_100751 is just a vector

typeof(df_school$visits_by_100751)#> [1] "integer"length(df_school$visits_by_100751) # num elements in vector = num obs#> [1] 21301str(df_school$visits_by_100751)#> int [1:21301] 0 0 0 0 0 0 0 0 0 0 ...sum(df_school$visits_by_100751) # sum of values of var across all obs#> [1] 3338

We perform calculations on a variable like we would on any vector of same type

v <- c(2,4,6)typeof(v)#> [1] "double"length(v)#> [1] 3sum(v)#> [1] 12

20 / 106

Page 21: Investigating objects and data patterns using base R

1.3 View and print data

21 / 106

Page 22: Investigating objects and data patterns using base R

Viewing and printing, data frames

Many ways to view/print a data frame object. Here are three ways:

1. Simply type the object name (output omitted)▶ number of observations and rows printed depend on YAML header settings and on

object “attributes” (attributes discussed in future week)

df_event

2. Use the View() function to view data in a browser

View(df_event)

3. head() to show the first n rows. The default is 6 rows.

#?head#head(df_event)head(df_event, n=5)

22 / 106

Page 23: Investigating objects and data patterns using base R

Viewing and printing, data framesobj_name[<rows>,<cols>] to print specific rows and columns of data frame

▶ particularly powerful when combined with sequences (e.g., 1:10 )

Examples (output omitted):

▶ Print first five rows, all vars

df_event[1:5, ]

▶ Print first five rows and first three columns

df_event[1:5, 1:3]

▶ Print first three columns of the 100th observation

df_event[100, 1:3]

▶ Print the 50th observation, all variables

df_event[50,]

23 / 106

Page 24: Investigating objects and data patterns using base R

Viewing and printing, variables within data framesRecall that:

▶ obj_name$var_name print specifics elements (i.e., variables) of a data frame

df_event$zip

▶ each element (i.e., variable) of data frame is an atomic vector with length =number of observations

typeof(df_event$zip)#> [1] "character"length(df_event$zip)#> [1] 18680

▶ each element of a variable is the value of the variable for one observation

Print specific elements (i.e., observations) of variable based on element position

▶ syntax: obj_name$var_name[<element position>]▶ vectors don’t have “rows” or “columns”; they just have elements▶ syntax combined with sequences (e.g., print first 10 observations)

df_event$event_state[1:10] # print obs 1-10 of variable "event_state"#> [1] "MA" "MA" "MA" "MA" "MA" "MA" "MA" "MA" "MA" "MA"df_event$event_type[6:10] # print obs 6-10 of variable "event_type"#> [1] "private hs" "private hs" "public hs" "private hs" "public hs"

24 / 106

Page 25: Investigating objects and data patterns using base R

Viewing and printing, variables within data framesPrint specific elements (i.e., observations) of variable based on element position

▶ syntax: obj_name$var_name[<element position>]

Example, print individual elements

df_event$zip[1:5] # print obs 1-5 of variable for event zip code#> [1] "01002" "01007" "01020" "01020" "01027"df_event$zip[1] # print obs 1 of variable for event zip code#> [1] "01002"df_event$zip[5] # print obs 5 of variable for event zip code#> [1] "01027"df_event$zip[c(1,3,5)] # print obs 5 of variable for event zip code#> [1] "01002" "01020" "01027"

Print specific elements of multiple variables using combine function c()

▶ syntax:c(obj_name$var1_name[<element position>], obj_name$var2_name[<element position>],...)

▶ Example: print first five observations of variables "event_state" and"event_type"

c(df_event$event_state[1:5],df_event$event_type[1:5])#> [1] "MA" "MA" "MA" "MA" "MA" "public hs"#> [7] "public hs" "public hs" "public hs" "public hs"

25 / 106

Page 26: Investigating objects and data patterns using base R

Exercise

Printing exercise using the df_school data frame

1. Use the obj_name[<rows>,<cols>] syntax to print the first 5 rows and 3columns of the df_school data frame

2. Use the head() function to print the first 4 observations3. Use the obj_name$var_name[1:10] syntax to print the first 10 observations of a

variable in the df_school data frame4. Use combine() to print the first 3 observations of variables “school_type” &

“name”

26 / 106

Page 27: Investigating objects and data patterns using base R

Solution

1. Use the obj_name[<rows>,<cols>] syntax to print the first 5 rows and 3columns of the df_school data frame

df_school[1:5,1:3]#> state_code school_type ncessch#> 1 AK public 020000100208#> 2 AK public 020000100211#> 3 AK public 020000100212#> 4 AK public 020000100213#> 5 AK public 020000300216

27 / 106

Page 28: Investigating objects and data patterns using base R

Solution2. Use the head() function to print the first 4 observations

head(df_school, n=4)#> state_code school_type ncessch name#> 1 AK public 020000100208 Bethel Regional High School#> 2 AK public 020000100211 Ayagina'ar Elitnaurvik#> 3 AK public 020000100212 Kwigillingok School#> 4 AK public 020000100213 Nelson Island Area School#> address city zip_code pct_white pct_black#> 1 1006 Ron Edwards Memorial Dr Bethel 99559 11.7764 0.5988#> 2 106 Village Road Kongiganak 99559 0.0000 0.0000#> 3 108 Village Road Kwigillingok 99622 0.0000 0.0000#> 4 118 Village Road Toksook Bay 99637 0.0000 0.0000#> pct_hispanic pct_asian pct_amerindian pct_other num_fr_lunch total_students#> 1 1.5968 0.998 84.6307 0.3992 362 501#> 2 0.0000 0.000 99.4505 0.5495 182 182#> 3 0.0000 0.000 100.0000 0.0000 116 120#> 4 0.0000 0.000 100.0000 0.0000 187 201#> num_took_math num_prof_math num_took_rla num_prof_rla avgmedian_inc_2564#> 1 146 24.82 147 24.99 76160.0#> 2 17 1.70 17 1.70 76160.0#> 3 14 3.50 14 3.50 NA#> 4 30 3.00 30 3.00 57656.5#> visits_by_110635 visits_by_126614 visits_by_100751 inst_110635 inst_126614#> 1 0 0 0 CA CO#> 2 0 0 0 CA CO#> 3 0 0 0 CA CO#> 4 0 0 0 CA CO#> inst_100751#> 1 AL#> 2 AL#> 3 AL#> 4 AL

28 / 106

Page 29: Investigating objects and data patterns using base R

Solution

3. Use the obj_name$var_name[1:10] syntax to print the first 10 observations of avariable in the df_school data frame

df_school$name[1:10]#> [1] "Bethel Regional High School" "Ayagina'ar Elitnaurvik"#> [3] "Kwigillingok School" "Nelson Island Area School"#> [5] "Alakanuk School" "Emmonak School"#> [7] "Hooper Bay School" "Ignatius Beans School"#> [9] "Pilot Station School" "Kotlik School"

29 / 106

Page 30: Investigating objects and data patterns using base R

Solution

4. Use combine() to print the first 3 observations of variables “school_type” &“name”

c(df_school$school_type[1:3],df_school$name[1:3])#> [1] "public" "public"#> [3] "public" "Bethel Regional High School"#> [5] "Ayagina'ar Elitnaurvik" "Kwigillingok School"

30 / 106

Page 31: Investigating objects and data patterns using base R

1.4 Missing values

31 / 106

Page 32: Investigating objects and data patterns using base R

Missing valuesMissing values have the value NA

▶ NA is a special keyword, not the same as the character string "NA"

use is.na() function to determine if a value is missing

▶ is.na() returns a logical vector

is.na(5)#> [1] FALSEis.na(NA)#> [1] TRUEis.na("NA")#> [1] FALSEtypeof(is.na("NA")) # example of a logical vector#> [1] "logical"

nvector <- c(10,5,NA)is.na(nvector)#> [1] FALSE FALSE TRUEtypeof(is.na(nvector)) # example of a logical vector#> [1] "logical"

svector <- c("e","f",NA,"NA")is.na(svector)#> [1] FALSE FALSE TRUE FALSE

32 / 106

Page 33: Investigating objects and data patterns using base R

Missing values are “contagious”

What does “contagious” mean?

▶ operations involving a missing value will yield a missing value

7>5#> [1] TRUE7>NA#> [1] NAsum(1,2,NA)#> [1] NA0==NA#> [1] NA2*c(0,1,2,NA)#> [1] 0 2 4 NANA*c(0,1,2,NA)#> [1] NA NA NA NA

33 / 106

Page 34: Investigating objects and data patterns using base R

Functions and missing values example, table()

table() function is useful for investigating categorical variables

str(df_event$event_type)#> chr [1:18680] "public hs" "public hs" "public hs" "public hs" "public hs" ...table(df_event$event_type)#>#> 2yr college 4yr college other private hs public hs#> 951 531 2001 3774 11423

34 / 106

Page 35: Investigating objects and data patterns using base R

Functions and missing values example, table()By default table() ignores NA values#?tablestr(df_event$school_type_pri)#> int [1:18680] NA NA NA NA NA 1 1 NA 1 NA ...table(df_event$school_type_pri)#>#> 1 2 5#> 3765 8 1

useNA argument controls if table includes counts of NA s. Allowed values:

▶ never (“no”) [DEFAULT VALUE]▶ only if count is positive (“ifany”);▶ even for zero counts (“always”)”

nrow(df_event)#> [1] 18680table(df_event$school_type_pri, useNA="always")#>#> 1 2 5 <NA>#> 3765 8 1 14906

Broader point: Most functions that create descriptive statistics have options abouthow to treat missing values‘

▶ When investigating data, good practice to always show missing values35 / 106

Page 36: Investigating objects and data patterns using base R

2 Subsetting using subset operators

36 / 106

Page 37: Investigating objects and data patterns using base R

Subsetting to Extract Elements

“Subsetting” refers to isolating particular elements of an object

Subsetting operators can be used to select/exclude elements (e.g., variables,observations)

▶ there are three subsetting operators: [] , $ , [[]]▶ these operators function differently based on vector types (e.g, atomic vectors,

lists, data frames)

37 / 106

Page 38: Investigating objects and data patterns using base R

Wichham refers to number of “dimensions” in R objectsAn atomic vector is a 1-dimensional object that contains n elements

x <- c(1.1, 2.2, 3.3, 4.4, 5.5)str(x)#> num [1:5] 1.1 2.2 3.3 4.4 5.5

Lists are multi-dimensional objects▶ Contains n elements; each element may contain a 1-dimensional atomic vector or

a multi-dimensional list. Below list contains 3 dimensions

list <- list(c(1,2), list("apple", "orange"))str(list)#> List of 2#> $ : num [1:2] 1 2#> $ :List of 2#> ..$ : chr "apple"#> ..$ : chr "orange"

Data frames are 2-dimensional lists▶ each element is a variable (dimension=columns)▶ within each variable, each element is an observation (dimension=rows)

ncol(df_school)#> [1] 26nrow(df_school)#> [1] 21301 38 / 106

Page 39: Investigating objects and data patterns using base R

2.1 Subset atomic vectors using []

39 / 106

Page 40: Investigating objects and data patterns using base R

Subsetting elements of atomic vectors

“Subsetting” a vector refers to isolating particular elements of a vector

▶ I sometimes refer to this as “accessing elements of a vector”▶ subsetting elements of a vector is similar to “filtering” rows of a data-frame▶ [] is the subsetting function for vectors

Six ways to subset an atomic vector using []

1. Using positive integers to return elements at specified positions2. Using negative integers to exclude elements at specified positions3. Using logicals to return elements where corresponding logical is TRUE4. Empty [] returns original vector (useful for dataframes)5. Zero vector [0], useful for testing data6. If vector is “named,” use character vectors to return elements with matching

names

40 / 106

Page 41: Investigating objects and data patterns using base R

1. Using positive integers to return elements at specified positions (subsetatomic vectors using [])

Create atomic vector x

(x <- c(1.1, 2.2, 3.3, 4.4, 5.5))#> [1] 1.1 2.2 3.3 4.4 5.5str(x)#> num [1:5] 1.1 2.2 3.3 4.4 5.5

[] is the subsetting function for vectors

▶ contents inside [] can refer to element number (also called “position”).▶ e.g., [3] refers to contents of 3rd element (or position 3)

x[5] #return 5th element#> [1] 5.5

x[c(3, 1)] #return 3rd and 1st element#> [1] 3.3 1.1

x[c(4,4,4)] #return 4th element, 4th element, and 4th element#> [1] 4.4 4.4 4.4

#Return 3rd through 5th elementx[3:5]#> [1] 3.3 4.4 5.5

41 / 106

Page 42: Investigating objects and data patterns using base R

2. Using negative integers to exclude elements at specified positions (subsetatomic vectors using [])

Before excluding elements based on position, investigate object

x#> [1] 1.1 2.2 3.3 4.4 5.5

length(x)#> [1] 5str(x)#> num [1:5] 1.1 2.2 3.3 4.4 5.5

Use negative integers to exclude elements based on element position

x[-1] # exclude 1st element#> [1] 2.2 3.3 4.4 5.5

x[c(3,1)] # 3rd and 1st element#> [1] 3.3 1.1x[-c(3,1)] # exclude 3rd and 1st element#> [1] 2.2 4.4 5.5

42 / 106

Page 43: Investigating objects and data patterns using base R

3. Using logicals to return elements where corresponding logical is TRUE(subset atomic vectors using [])

x#> [1] 1.1 2.2 3.3 4.4 5.5

When using x[y] to subset x , good practice to have length(x)==length(y)

length(x) # length of vector x#> [1] 5length(c(TRUE,FALSE,TRUE,FALSE,TRUE)) # length of y#> [1] 5length(x) == length(c(TRUE,FALSE,TRUE,FALSE,TRUE)) # condition true#> [1] TRUEx[c(TRUE,TRUE,FALSE,FALSE,TRUE)]#> [1] 1.1 2.2 5.5

Recycling rules:

▶ in x[y] , if x is different length than y , R “recycles” length of shorter tomatch length of longer

length(c(TRUE,FALSE))#> [1] 2x#> [1] 1.1 2.2 3.3 4.4 5.5x[c(TRUE,FALSE)]#> [1] 1.1 3.3 5.5

43 / 106

Page 44: Investigating objects and data patterns using base R

3. Using logicals to return elements where corresponding logical is TRUE(subset atomic vectors using [])

x#> [1] 1.1 2.2 3.3 4.4 5.5

Note that a missing value ( NA ) in the index always yields a missing value in theoutput:

x[c(TRUE, FALSE, NA, TRUE, NA)]#> [1] 1.1 NA 4.4 NA

Return all elements of object x where element is greater than 3:

x # print object X#> [1] 1.1 2.2 3.3 4.4 5.5x>3 # for each element of X, print T/F whether element value > 3#> [1] FALSE FALSE TRUE TRUE TRUEstr(x>3)#> logi [1:5] FALSE FALSE TRUE TRUE TRUEx[x>3] # prints only the values that had TRUE at that position#> [1] 3.3 4.4 5.5

44 / 106

Page 45: Investigating objects and data patterns using base R

3. Using logicals to return elements where corresponding logical is TRUE(subset atomic vectors using []) [cont.]

The visits_by_100751 column shows how many visits the University of Alabamamade to each school. Let’s subset this to only include 2 or more visits:

df_school$visits_by_100751[1:100]#> [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0#> [38] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0#> [75] 0 0 0 0 0 0 0 0 0 0 5 2 4 4 3 3 3 3 3 3 2 3 3 3 3 1df_school$visits_by_100751[1:100]>2#> [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE#> [13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE#> [25] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE#> [37] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE#> [49] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE#> [61] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE#> [73] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE#> [85] TRUE FALSE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE FALSE TRUE#> [97] TRUE TRUE TRUE FALSEdf_school$visits_by_100751[df_school$visits_by_100751>2]#> [1] 5 4 4 3 3 3 3 3 3 3 3 3 3 4 4 3 3 3 3 3 4 4 3 3 3 4 3 3 3 3 3 3 3 4 3 3 4#> [38] 4 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 3 3 3 3 4 5 3 3 3 4 3 3 4 3 3 3 3#> [75] 3 4 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 5 4 6 5 3 5 5 5 3#> [112] 4 4 3 3 3 5 3 4 4 3 3 3 3 4 4 3 3 4 3 3 3 3 3 3 3 3 3 3 3 4 3 3 3 3 3 3 3#> [149] 3 3 3 3 4 3 3 3 3 3 3 3 3

45 / 106

Page 46: Investigating objects and data patterns using base R

4. Empty [] returns original vector (subset atomic vectors using [])

x#> [1] 1.1 2.2 3.3 4.4 5.5

x[]#> [1] 1.1 2.2 3.3 4.4 5.5

This is useful for sub-setting data frames, as we will show below

46 / 106

Page 47: Investigating objects and data patterns using base R

5. Zero vector [0] (subset atomic vectors using [])

Zero vector, x[0]

▶ R interprets this as returning element 0

x[0]#> numeric(0)

Wickham states:

▶ “This is not something you usually do on purpose, but it can be helpful forgenerating test data.”

47 / 106

Page 48: Investigating objects and data patterns using base R

6. If vector is named, character vectors to return elements with matchingnames (subset atomic vectors using [])

Create vector y that has values of vector x but each element is named

x#> [1] 1.1 2.2 3.3 4.4 5.5

(y <- c(a=1.1, b=2.2, c=3.3, d=4.4, e=5.5))#> a b c d e#> 1.1 2.2 3.3 4.4 5.5

Return elements of vector based on name of element

▶ enclose element names in single '' or double "" quotes

#show element named "a"y["a"]#> a#> 1.1

#show elements "a", "b", and "d"y[c("a", "b", "d" )]#> a b d#> 1.1 2.2 4.4

48 / 106

Page 49: Investigating objects and data patterns using base R

2.2 Subsetting lists/data frames using []

49 / 106

Page 50: Investigating objects and data patterns using base R

Subsetting lists using []Using [] operator to subset lists works the same as subsetting atomic vector

▶ Using [] with a list always returns a list

list_a <- list(list(1,2),3,"apple")str(list_a)#> List of 3#> $ :List of 2#> ..$ : num 1#> ..$ : num 2#> $ : num 3#> $ : chr "apple"

#create new list that consists of elements 3 and 1 of list_alist_b <- list_a[c(3, 1)]str(list_b)#> List of 2#> $ : chr "apple"#> $ :List of 2#> ..$ : num 1#> ..$ : num 2

#show elements 3 and 1 of object list_a#str(list_a[c(3, 1)])

50 / 106

Page 51: Investigating objects and data patterns using base R

Subsetting data frames using []

Recall that a data frame is just a particular kind of list

▶ each element = a column = a variable

Using [] with a list always returns a list

▶ Using [] with a data frame always returns a data frame

Two ways to use [] to extract elements of a data frame

1. use “single index” df_name[<columns>] to extract columns (variables) based onelement position number (i.e., column number)

2. use “double index” df_name[<rows>, <columns>] to extact particular rows andcolumns of a data frame

51 / 106

Page 52: Investigating objects and data patterns using base R

Subsetting data frames using [] to extract columns (variables) based onelement position

Use “single index” df_name[<columns>] to extract columns (variables) based onelement number (i.e., column number)

Examples [output omitted]

names(df_event)

#extract elements 1 through 4 (elements=columns=variables)df_event[1:4]df_event[c(1,2,3,4)]

str(df_event[1:4])#extract columns 13 and 7df_event[c(13,7)]

52 / 106

Page 53: Investigating objects and data patterns using base R

Subsetting Data Frames to extract columns (variables) and rows(observations) based on positionality

use “double index” syntax df_name[<rows>, <columns>] to extact particular rowsand columns of a data frame

▶ often combined with sequences (e.g., 1:10 )

#Return rows 1-3 and columns 1-4df_event[1:3, 1:4]#> instnm univ_id instst pid#> 1 UM Amherst 166629 MA 57570#> 2 UM Amherst 166629 MA 56984#> 3 UM Amherst 166629 MA 57105

#Return rows 50-52 and columns 10 and 20df_event[50:52, c(10,20)]#> event_state pct_tworaces_zip#> 50 MA 1.9831#> 51 MA 1.9831#> 52 MA 1.9831

53 / 106

Page 54: Investigating objects and data patterns using base R

Subsetting Data Frames to extract columns (variables) and rows(observations) based on positionality

use “double index” syntax df_name[<rows>, <columns>] to extact particular rowsand columns of a data frame

recall that empty [] returns original object (output omitted)

#return original data framedf_event[]

#return specific rows and all columns (variables)df_event[1:5, ]

#return all rows and specific columns (variables)df_event[, c(1,2,3)]

54 / 106

Page 55: Investigating objects and data patterns using base R

Use [] to extract data frame columns based on variable names

Selecting columns from a data frame by subsetting with [] and list of elementnames (i.e., variable names) enclose in quotes

“single index” approach extracts specific variables, all rows (output omitted)

df_event[c("instnm", "univ_id", "event_state")]

“Double index” approach extracts specific variables and specific rows

▶ syntax df_name[<rows>, <columns>]

df_event[1:5, c("instnm", "event_state", "event_type")]#> instnm event_state event_type#> 1 UM Amherst MA public hs#> 2 UM Amherst MA public hs#> 3 UM Amherst MA public hs#> 4 UM Amherst MA public hs#> 5 Stony Brook MA public hs

55 / 106

Page 56: Investigating objects and data patterns using base R

Student exercises

Use subsetting operators from base R in extracting columns (variables), observations:

1. Use both “single index” and “double index” in subsetting to create a newdataframe by extracting the columns instnm , event_date , event_type fromthe df_event data frame. And show what columns (variables) are in the newlycreated dataframe.

2. Use subsetting to return rows 1-5 of columns state_code , name , addressfrom the df_school data frame.

56 / 106

Page 57: Investigating objects and data patterns using base R

Solution to Student Exercises

Solution to 1

base R using subsetting operators

# single indexdf_event_br <- df_event[c("instnm", "event_date", "event_type")]#double indexdf_event_br <- df_event[, c("instnm", "event_date", "event_type")]names(df_event_br)#> [1] "instnm" "event_date" "event_type"

Solution to 2

base R using subsetting operators

df_school[1:5, c("state_code", "name", "address")]#> state_code name address#> 1 AK Bethel Regional High School 1006 Ron Edwards Memorial Dr#> 2 AK Ayagina'ar Elitnaurvik 106 Village Road#> 3 AK Kwigillingok School 108 Village Road#> 4 AK Nelson Island Area School 118 Village Road#> 5 AK Alakanuk School 9 School Road

57 / 106

Page 58: Investigating objects and data patterns using base R

2.3 Subsetting lists/data frames using [[]] and $

58 / 106

Page 59: Investigating objects and data patterns using base R

Subset single element from object using [[]] operator, atomic vectorsSo far we have used [] to extract elements from an object

▶ Apply [] to atomic vector: returns atomic vector with elements you requested▶ Apply [] to list: returns list with elements you requested

[[]] also extract elements from an object

▶ Applying [[]] to atomic vector gives same result as [] ; that is, an atomicvector with element you request

(x <- c(1.1, 2.2, 3.3, 4.4, 5.5))#> [1] 1.1 2.2 3.3 4.4 5.5

str(x[3])#> num 3.3

str(x[[3]])#> num 3.3

▶ Caveat: when applying [[]] to atomic vector, you can only subset a singleelement

x[c(3,4)] # single bracket; this works#> [1] 3.3 4.4

#x[[c(3,4)]] # double bracket; this won't work59 / 106

Page 60: Investigating objects and data patterns using base R

Subsetting lists using [] vs. [[]] , introduce “train metaphor”

Applying [[]] to a list

▶ Understanding what [] vs. [[]] does to a list is very important but requiressome explanation!

Advanced R chapter 4.3 by Wickham uses the “train metaphor” to explain a listvs. contents of a list and how this relates to [] vs. [[]]

Below code chunk makes a list named list_x that contains 3 elements

list_x <- list(1:3, "a", 4:6) # create list object list_x

In our train metaphor, object list_x is a train that contains 3 carriages

60 / 106

Page 61: Investigating objects and data patterns using base R

Subsetting lists using [] vs. [[]] , introduce “train metaphor”list object list_x is a train that contains 3 carriages

When we “subset a list” – that is, extract one or more elements from the list – wehave two broad choices (image below)

1. Extracting elements using [] always returns a list, usually one with fewerelements

▶ you can think of this as a train with fewer carriages

#str(list_x)str(list_x[1]) # returns a list#> List of 1#> $ : int [1:3] 1 2 3

2. Extracting element using [[]] returns contents of particular carriage▶ I say applying [[]] to a list or data frame returns a simpler object that moves up one

level of hierarchy

str(list_x[[1]]) # returns an atomic vector#> int [1:3] 1 2 3 61 / 106

Page 62: Investigating objects and data patterns using base R

Subset lists using [] vs. [[]] , deepen understanding of []Rules about applying subset operator [] to a list

▶ Applying [] to a list always returns a list▶ Resulting list contains 1 or more elements depending on what typed inside []

Here is a list object named list_x

list_x <- list(1:3, "a", 4:6)

Here is an image of a few “trains” that can be created by applying [] to list_x

And here is code to create the “trains” shown in above image (output omitted)

list_x[1:2]list_x[-2]list_x[c(1,1)]list_x[0]list_x[] # returns the original list; not shown in above train picture 62 / 106

Page 63: Investigating objects and data patterns using base R

Subset lists using [] vs. [[]] , deepen understanding of [[]]Rules about applying subset operator [[]] to a list

▶ Can apply [[]] to return the contents of a single element of a list

Create list list_x and show “train” Image of applying list_x[1]vs. list_x[[1]]

list_x <- list(1:3, "a", 4:6)

Object created by list_x[1] is a list with one element (output omitted)

list_x[1]str(list_x[1])

Object created by list_x[[1]] is a vector with 3 elements (output omitted)

▶ list_x[[1]] gives us “contents” of element 1▶ Since element 1 contains a numeric vector, object created by list_x[[1]] is a

numeric vector

list_x[[1]]str(list_x[[1]]) 63 / 106

Page 64: Investigating objects and data patterns using base R

Subset lists using [] vs. [[]] , deepen understanding of [[]]

Rules about applying subset operator [[]] to a list

▶ Can apply [[]] to return the contents of a single element of a list

list_x <- list(1:3, "a", 4:6) # create list list_x

We cannot use [[]] to subset multiple elements of a list (output omitted)

▶ e.g., we could write list_x[[2]] but not list_x[[2:3]]

list_x[[c(2)]] # this works, subset element 2 using [[]]list_x[[c(2,3)]] # this doesn't work; subset element 2 and 3 using [[]]list_x[c(2,3)] # this works; subset element 2 and 3 using []

64 / 106

Page 65: Investigating objects and data patterns using base R

Subset lists using [] vs. [[]] , deepen understanding of [[]]Like [] , can use [[]] to return contents of named elements specified using quotes

▶ syntax: obj_name[["element_name"]]

Same list as before, but this time elements named

list_x <- list(var1=1:3, var2="a", var3=4:6)

Subset list list_x using [[]] element names vs. element position

list_x[["var1"]]#> [1] 1 2 3

# list_x[[1]] # same as abovelist_x[["var3"]]#> [1] 4 5 6

# list_x[[3]] # same as above

We can do the same thing with data frames because data frames are lists

▶ e.g., df_event[["zip"]] returns contents of element named "zip"▶ object created by df_event[["zip"]] is character vector of length = 18,680

# df_event[["zip"]] # this works but long outputstr(df_event[["zip"]])#> chr [1:18680] "01002" "01007" "01020" "01020" "01027" "01027" "01027" ...typeof(df_event[["zip"]])#> [1] "character"length(df_event[["zip"]])#> [1] 18680

65 / 106

Page 66: Investigating objects and data patterns using base R

General rules of applying [] vs [[]] to (nested) objects

What we just learned about applying [] vs [[]] to lists applies more generally to“nested objects”

▶ “nested objects” are objects with a hierarchical structure such that an element ofan object contains another object

General rules of applying [] vs. [[]] to nested objects

▶ subset any object x using [] will return object with same data structure as x▶ subset any object x using [[]] will return an object thay may or may not have

same data structure of x▶ if object x is not a nested object, then applying [[]] to a single element of x will

return object with same data structure as x▶ if object x has a nested data structure, then then applying [[]] to a single element

of x will “move up one level of hierarchy” to extract the contents of an element withinthe object x

66 / 106

Page 67: Investigating objects and data patterns using base R

Subset lists/data frames using $

list_x <- list(var1=1:3, var2="a", var3=4:6)

obj_name$element_name is shorthand operator for obj_name[["element_name"]]

These three lines of code all give the same result

list_x[[1]]#> [1] 1 2 3list_x[["var1"]]#> [1] 1 2 3list_x$var1#> [1] 1 2 3

df_name$var_name : easiest way in base R to refer to variable in a data frame

▶ these two lines of code are equivalent

str(df_event[["zip"]])#> chr [1:18680] "01002" "01007" "01020" "01020" "01027" "01027" "01027" ...str(df_event$zip)#> chr [1:18680] "01002" "01007" "01020" "01020" "01027" "01027" "01027" ...

67 / 106

Page 68: Investigating objects and data patterns using base R

2.4 Subset Data frames by combining [] and $

68 / 106

Page 69: Investigating objects and data patterns using base R

Subset Data Frames by combining [] and $ , MotivationMotivation

▶ When working with data frames we often want to isolate those observations thatsatisfy certain conditions

▶ This is often referred to as “filtering”▶ We filter observations based on the values of one or more variables

▶ Perhaps you have seen “filtering” in Microsoft Excel▶ open some spreadsheet that contains variables (columns) and observations (rows)▶ click on Data » Filter and then filter observations based on values of variable(s)

Filtering example using data frame df_school

▶ Observations:▶ One observation per high school (public and private)

▶ Variables:▶ high school characteristics; number of off-campus recruiting visits from particular

universities▶ NCES ID for UC Berkeley is 110635▶ variable visits_by_110635 shows number of visits a high school received from UC

Berkeley▶ Task:

▶ Isolate observations where the high school received at least 1 visit from UC Berkeley

69 / 106

Page 70: Investigating objects and data patterns using base R

Subset Data Frames by combining [] and $

Task:

▶ Using df_school , isolate obs where school received at least 1 visit from UCBerkeley

General syntax: df_name[df_name$var_name <condition>, ]

▶ Note that syntax uses “double index” df_name[<rows>, <columns>] syntax▶ Therefore, the <condition> in above syntax is isolating <rows>

▶ Cannot use “single index” syntax df_name[<columns>]

Solution to task (output omitted)

▶ Note: below code filters observations but keeps all variables

df_school[df_school$visits_by_110635 >= 1, ]

70 / 106

Page 71: Investigating objects and data patterns using base R

Subset Data Frames by combining [] and $ , decompose syntaxTask: Isolate obs where school received at least 1 visit from UC Berkeley

▶ general syntax: df_name[df_name$var_name <condition>, ]▶ solution: df_school[df_school$visits_by_110635 >= 1, ]

Decomposing syntax df_school[df_school$visits_by_110635 >= 1, ]

▶ df_school$visits_by_110635 >= 1 : returns a logical ( TRUE / FALSE ) atomicvector with length equal to number of obs in df_school

typeof(df_school$visits_by_110635 >= 1)length(df_school$visits_by_110635 >= 1)str(df_school$visits_by_110635 >= 1)

▶ df_school[df_school$visits_by_110635 >= 1, ]▶ uses “double index” df_name[<rows>, <columns>] syntax to extract rows, columns▶ rows: extract rows where df_school$visits_by_110635 >= 1 is TRUE▶ columns: since <columns> is empty, extracts all columns

▶ key point: df_name[df_name$var_name <condition>, ] is “subset a vectorapproach #3”: “Using logicals to return elements where condition TRUE ”

▶ example using atomic vectors (output omitted)

x <- c(1.1, 2.2, 3.3, 4.4, 5.5)x[x>3]

71 / 106

Page 72: Investigating objects and data patterns using base R

Subset Data Frames by combining [] and $ , keep desired columns

▶ General syntax to filter desired observations (rows) and variables (columns) ofdata frame:

▶ df_name[df_name$var_name <condition>, <desired columns>]

Tasks (output omitted)

▶ Extract observations where the high school received at least 1 visit from UCBerkeley and the first three columns

df_school[df_school$visits_by_110635 >= 1, 1:3]

▶ Extract observations where the high school received at least 1 visit from UCBerkeley and variables “state_code” “school_type” “name”

df_school[df_school$visits_by_110635 >= 1, c("state_code","school_type","name")]

72 / 106

Page 73: Investigating objects and data patterns using base R

Subset Data Frames by combining [] and $ , more examplesSyntax:

▶ filter based on one variable:▶ df_name[df_name$var_name <condition>, <columns>]

▶ Example syntax to filter based on two conditions being true▶

df_name[df_name$var_name <condition> & df_name$var_name <condition>, <columns>]

Pro tip:

▶ wrap above syntax within nrow() function to count how many observations(rows) satisfy the condition (as opposed to printing all rows that satisfy condition)

Tasks▶ Count obs where high schools received at least 1 visit by Bama (100751) and at

least one visit by Berkeley (110635)

nrow(df_school[df_school$visits_by_110635 >= 1 &df_school$visits_by_100751 >= 1, ])

#> [1] 247# Equivalently:# nrow(df_school[df_school[["visits_by_110635"]] >= 1 &# df_school[["visits_by_100751"]] >= 1, ])

▶ Count obs where schools received 1+ visit by Bama or 1+ visit by Berkeley

nrow(df_school[df_school$visits_by_110635 >= 1| df_school$visits_by_100751 >= 1, ])

#> [1] 276373 / 106

Page 74: Investigating objects and data patterns using base R

Logical operators for comparisons▶ Logical operators to isolate/filter observations of data frame

Symbol Meaning== Equal to!= Not equal to> greater than>= greater than or equal to< less than<= less than or equal to& AND| OR%in% includes

▶ Visualization of “Boolean” operators (e.g., AND, OR, AND NOT)

Figure 1: “Boolean” operations, x=left circle, y=right circle, from Wichkam (2018)74 / 106

Page 75: Investigating objects and data patterns using base R

Subset Data Frames by combining [] and $ , more examplesExample: Count the number of out-of-state high schools that UC Berkeley visited.

# The `inst_110635` variable contains the home state of UC Berkeleyunique(df_school$inst_110635)#> [1] "CA"

# Filter for schools visited by UC Berkeley AND whose state is not "CA"nrow(df_school[df_school$visits_by_110635 >= 1 &

df_school$state_code != df_school$inst_110635, ])#> [1] 302

Example: Count the number of schools in the Northeast that received a visit fromeither UC Berkeley, U of Alabama, or CU Boulder.

# Vector containing states located in the Northeast regionnortheast_states <- c('CT', 'ME', 'MA', 'NH', 'RI', 'VT', 'NJ', 'NY', 'PA')

# Filter for schools in the Northeast AND visited by any of the 3 univsnrow(df_school[df_school$state_code %in% northeast_states &

(df_school$visits_by_110635 >= 1 |df_school$visits_by_100751 >= 1 |df_school$visits_by_126614 >= 1), ])

#> [1] 544

75 / 106

Page 76: Investigating objects and data patterns using base R

Subset Data Frames by combining [] and $ , NA Observations

Filtering observations of data frame using [] combined with $ is more complicatedin the presence of missing values ( NA values)

The next few slides will explain

▶ why it is more complicated▶ how to filter correctly when NA s are present

76 / 106

Page 77: Investigating objects and data patterns using base R

Subset Data Frames by combining [] and $ , NA ObservationsWhen sub-setting via [] combined with $ , result will include:

▶ rows where condition is TRUE▶ as well as rows with NA (missing) values for condition.

Task (using df_event , which has one obs per university, recruiting event)

▶ How many events at public HS with at least $50k median household income?

sum(is.na(df_event$med_inc)) # number of observations (all events) with missing values for med_inc#> [1] 470

#num obs event_type=="public hs" and med_inc is missingnrow(df_event[df_event$event_type == "public hs"

& is.na(df_event$med_inc)==1 , ]) # note TRUE evaluates to 1#> [1] 75

#num obs event_type=="public hs" & med_inc is not NA & med_inc >= $50,000nrow(df_event[df_event$event_type == "public hs"

& is.na(df_event$med_inc)==0 & df_event$med_inc>=50000 , ]) # note FALSE evaluates to 0#> [1] 9941

#num obs event_type=="public hs" and med_inc >= $50,000nrow(df_event[df_event$event_type == "public hs"

& df_event$med_inc>=50000 , ])#> [1] 10016

77 / 106

Page 78: Investigating objects and data patterns using base R

Subset Data Frames by combining [] and $ , NA ObservationsTo exclude rows where condition is NA if subset using [] combined w/ $

▶ use which() to ask only for values where condition evaluates to TRUE▶ which() returns position numbers for elements where condition is TRUE

#?whichc(TRUE,FALSE,NA,TRUE)#> [1] TRUE FALSE NA TRUEstr(c(TRUE,FALSE,NA,TRUE))#> logi [1:4] TRUE FALSE NA TRUEwhich(c(TRUE,FALSE,NA,TRUE))#> [1] 1 4

Task: Count events at public HS with at least $50k median household income?

#Base R, `[]` combined with `$`; without which()nrow(df_event[df_event$event_type == "public hs" & df_event$med_inc>=50000, ])#> [1] 10016

#Base R, `[]` combined with `$`; with which()nrow(df_event[which(df_event$event_type == "public hs"

& df_event$med_inc>=50000), ])#> [1] 9941

78 / 106

Page 79: Investigating objects and data patterns using base R

Student Exercises

Subsetting Data Frames with [] and $ :

1. Show how many public high schools in California with at least 50% Latinx(hispanic in data) student enrollment from df_school.

2. Show how many out-state events at public high schools with more than $30Kmedian from df_event (do not forget to exclude missing values).

79 / 106

Page 80: Investigating objects and data patterns using base R

Solution to Student Exercises

Solution to 1

base R using [] and $

df_school_br1<- df_school[df_school$school_type == "public"& df_school$pct_hispanic >= 50& df_school$state_code == "CA", ]

nrow(df_school_br1)#> [1] 713

80 / 106

Page 81: Investigating objects and data patterns using base R

Solution to Student Exercises

Solution to 2:

base R using [] and $

# use is.na to exclude NAnrow(df_event[df_event$event_type == "public hs" & df_event$event_inst =="Out-State"

& df_event$med_inc > 30000 & is.na(df_event$med_inc) ==0, ])#> [1] 7784

# use which to exclude NAnrow(df_event[which(df_event$event_type == "public hs" & df_event$event_inst =="Out-State"

& df_event$med_inc > 30000 ), ])#> [1] 7784

81 / 106

Page 82: Investigating objects and data patterns using base R

3 Subset using subset() function

82 / 106

Page 83: Investigating objects and data patterns using base R

Subset functionThe subset() is a base R function to “filter” observations from some object x

▶ object x can be a matrix, data frame, list▶ subset() automatically excludes elements/rows with NA for condition▶ Can also use subset() to select variables▶ subset() can be combined with:

▶ assignment ( <- ) to create new objects▶ nrow() to count number of observations that satisfy criteria

?subset

Syntax [when object is data frame]: subset(x, subset, select, drop = FALSE)

▶ x is object to be subset▶ subset is the logical expression(s) (evaluates to TRUE/FALSE ) indicating

elements (rows) to keep▶ select indicates columns to select from data frame (if argument is not used

default will keep all columns)▶ drop to preserve original dimensions [SKIP]

▶ can take values TRUE or FALSE ; default is FALSE▶ only need to worry about dataframes when subset output is single column

83 / 106

Page 84: Investigating objects and data patterns using base R

Subset function, examplesRecall the previous example where we count events at public HS with at least $50kmedian household income.

▶ Note. subset() automatically excludes rows where condition is NA :

#Base R, `[]` combined with `$`, without which(); includes `NA`nrow(df_event[df_event$event_type == "public hs"

& df_event$med_inc>=50000, ])#> [1] 10016

#Base R, `[]` combined with `$`, with which(); excludes `NA`nrow(df_event[which(df_event$event_type == "public hs"

& df_event$med_inc>=50000), ])#> [1] 9941

#Base R, `subset()`; excludes `NA`nrow(subset(df_event, event_type == "public hs"

& med_inc>=50000))#> [1] 9941

#Base R, `subset()`; excludes `NA`; explicitly name arguments of subset()nrow(subset(x = df_event, subset = event_type == "public hs"

& med_inc>=50000))#> [1] 9941

84 / 106

Page 85: Investigating objects and data patterns using base R

Subset function, examples

Using df_school , show all public high schools that are at least 50% Latinx(var= pct_hispanic ) student enrollment in California

▶ Using base R, subset() [output omitted]

#public high schools with at least 50% Latinx student enrollmentsubset(x= df_school, subset = school_type == "public" & pct_hispanic >= 50

& state_code == "CA")

▶ Can wrap subset() within nrow() to count number of observations thatsatisfy criteria

nrow(subset(df_school, school_type == "public" & pct_hispanic >= 50& state_code == "CA"))

#> [1] 713

85 / 106

Page 86: Investigating objects and data patterns using base R

Subset function, examples

Note that subset() identify the number of observations for which the condition isTRUE

nrow(subset(x = df_school, subset = TRUE))#> [1] 21301nrow(subset(x = df_school, subset = FALSE))#> [1] 0

86 / 106

Page 87: Investigating objects and data patterns using base R

Subset function, examples

Count all CA public high schools that are at least 50% Latinx and received at least 1visit from UC Berkeley (var= visits_by_110635 )

nrow(subset(df_school, school_type == "public" & pct_hispanic >= 50& state_code == "CA" & visits_by_110635 >= 1))

#> [1] 100

87 / 106

Page 88: Investigating objects and data patterns using base R

Subset function, examples

subset() can also use %in% operator, which is more efficient version of ORoperator |

▶ Count number of schools from MA, ME, or VT that received at least one visitfrom University of Alabama (var= visits_by_100751 )

nrow(subset(df_school, state_code %in% c("MA","ME","VT")& visits_by_100751 >= 1))

#> [1] 108

88 / 106

Page 89: Investigating objects and data patterns using base R

Subset function, examplesUse the select argument within subset() to keep selected variables

▶ syntax: select = c(var_name1,var_name2,...,var_name_n)

Subset all CA public high schools that are at least 50% Latinx AND only keepvariables name and address

subset(x = df_school, subset = school_type == "public" & pct_hispanic >= 50& state_code == "CA", select = c(name, address))

#> name#> 1254 Tustin High#> 1301 Bell Gardens High#> 1309 Santa Ana High#> 1332 Warren High#> 1336 Hollywood Senior High#> 1337 Venice Senior High#> 1345 Sequoia High#> 1360 Santa Barbara Senior High#> 1361 Santa Paula High#> 1362 Azusa High#> 1363 Alta Loma High#> 1373 Garden Grove High#> 1374 Mt. Eden High#> 1375 Ocean View High#> 1376 Inglewood High#> 1378 Bell Senior High#> 1379 El Camino High#> 1380 Heritage High#> 1391 San Lorenzo High#> 1394 Segerstrom High#> 1404 Santa Fe High#> 1422 Healdsburg High#> 1425 Savanna High#> 1431 Carpinteria Senior High#> 1433 Etiwanda High#> 1434 Colony High#> 1436 Centennial High#> 1437 Duarte High#> 1438 James Lick High#> 1439 Oak Grove High#> 1440 Yerba Buena High#> 1442 Arroyo High#> 1443 El Rancho High#> 1448 Herbert Hoover High#> 1449 Fullerton Union High#> 1450 Rancho Alamitos High#> 1451 Santiago High#> 1454 Glen A. Wilson High#> 1455 Hayward High#> 1456 Leadership Public Schools - Hayward#> 1457 West Valley High#> 1459 Westminster High#> 1461 Lemoore High#> 1464 Millikan High#> 1467 Nathaniel Narbonne Senior High#> 1468 Theodore Roosevelt Senior High#> 1470 Woodrow Wilson Senior High#> 1471 Linda Esperanza Marquez High C School of Social Justice#> 1473 Monrovia High#> 1474 Schurr High#> 1476 Ygnacio Valley High#> 1477 Napa High#> 1478 Vintage High#> 1479 John H. Glenn High#> 1483 Aspire Lionel Wilson College Preparatory Academy#> 1484 El Modena High#> 1486 Pajaro Valley High#> 1487 Pasadena High#> 1491 Richmond High#> 1492 John W. North High#> 1496 Hoover High#> 1499 Leadership High#> 1506 Harbor High#> 1509 Elsie Allen High#> 1512 Sonoma Valley High#> 1513 Saint Helena High#> 1517 La Serna High#> 1518 Pioneer High#> 1519 Pioneer High#> 1523 Rancho Verde High#> 1554 Hesperia High#> 1559 Natomas High#> 1560 Gonzales High#> 1562 Alhambra High#> 1563 High Tech High Chula Vista#> 1566 San Marcos Senior High#> 1567 Da Vinci Science#> 1570 La Sierra High#> 1571 Norte Vista High#> 1572 Katella High#> 1573 Loara High#> 1578 Brawley High#> 1579 Calipatria High#> 1583 Southwest High#> 1584 Chaffey High#> 1585 Montclair High#> 1586 Ontario High#> 1589 Chino High#> 1594 Bloomington High#> 1595 Compton High#> 1596 Dominguez High#> 1599 Robert F. Kennedy High#> 1601 Downey High#> 1602 Andrew P. Hill High#> 1603 Mount Pleasant High#> 1612 San Pasqual High#> 1615 Henry J. Kaiser High#> 1616 Fontana A. B. Miller High#> 1617 Summit High#> 1618 Jurupa Hills High#> 1621 School of Unlimited Learning#> 1622 Buena Park High#> 1623 La Habra High#> 1624 Sonora High#> 1627 Los Amigos High#> 1628 Gilroy High#> 1632 La Puente High#> 1633 Tennyson High#> 1634 Imperial High#> 1636 Highland High#> 1637 South High#> 1639 Golden Valley High#> 1641 Mira Monte High#> 1645 Wilson High#> 1646 Phineas Banning Senior High#> 1647 Birmingham Community Charter High#> 1648 Canoga Park Senior High#> 1649 Carson Senior High#> 1650 Grover Cleveland Charter High#> 1653 Benjamin Franklin Senior High#> 1654 John C. Fremont Senior High#> 1655 James A. Garfield Senior High#> 1656 Alexander Hamilton Senior High#> 1657 Huntington Park Senior High#> 1658 Manual Arts Senior High#> 1659 James Monroe High#> 1660 San Fernando Senior High#> 1661 University Senior High#> 1662 Vaughn Next Century Learning Center#> 1664 Belmont Senior High#> 1666 Middle College High#> 1667 South East High#> 1668 Maywood Academy High#> 1670 Animo Watts College Preparatory Academy#> 1671 West Adams Preparatory High#> 1672 Alain Leroy Locke College Prep Academy#> 1673 Alliance Health Services Academy High#> 1674 Felicitas and Gonzalo Mendez High#> 1675 Ramon C. Cortines School of Visual and Performing Arts#> 1676 RFK Community Schools-UCLA Community K-12#> 1677 E. Los Angeles Renaiss Acad at Esteban E. Torres High #2#> 1678 PUC Lakeview Charter High#> 1679 Performing Arts Community at Diego Rivera Learning Complex#> 1680 Animo College Preparatory Academy#> 1681 Lynwood High#> 1682 Marco Antonio Firebaugh High#> 1685 Thomas Downey High#> 1688 Montebello High#> 1689 Applied Technology Center#> 1690 Seaside High#> 1691 Moreno Valley High#> 1692 Canyon Springs High#> 1693 Valley View High#> 1694 Vista del Lago High#> 1700 North Monterey County High#> 1701 Norwalk High#> 1702 Coliseum College Prep Academy#> 1703 Aspire Golden State College Preparatory Academy#> 1706 Channel Islands High#> 1707 Oxnard High#> 1709 Watsonville High#> 1710 Ceiba College Preparatory Academy#> 1711 Paramount High#> 1713 Pittsburg Senior High#> 1718 Eisenhower Senior High#> 1719 Wilmer Amina Carter High#> 1723 Ramona High#> 1727 Nogales High#> 1734 Everett Alvarez High#> 1735 Alisal High#> 1736 San Benito High#> 1737 Cajon High#> 1738 San Gorgonio High#> 1739 Arroyo Valley High#> 1744 Lincoln High#> 1745 Gompers Preparatory Academy#> 1751 Abraham Lincoln High#> 1756 Mission Hills High#> 1758 Saddleback High#> 1759 Nova Academy#> 1760 Hector G. Godinez#> 1763 Ernest Righetti High#> 1764 Santa Maria High#> 1765 Pioneer Valley High#> 1767 Selma High#> 1768 Woodside High#> 1769 Summit Preparatory Charter High#> 1770 Silverado High#> 1771 Soledad High#> 1773 Eastlake High#> 1774 Montgomery Senior High#> 1775 Otay Ranch Senior High#> 1783 West Covina High#> 1784 Edgewood High#> 1785 California High#> 1794 Citrus Hill High#> 1855 Assurance Learning Academy#> 1857 Cuyama Valley High#> 1862 Sultana High#> 1863 Summit Leadership Academy-High Desert#> 1864 Mirus Secondary#> 1866 Oak Hills High#> 1869 Upland High#> 1875 Mendota High#> 1877 Elsinore High#> 1879 Lakeside High#> 1883 Dos Palos High#> 1884 Farmersville High#> 1889 Delhi High#> 1890 Riverdale High#> 1893 Orland High#> 1894 Merrill F. West High#> 1900 Gridley High#> 1902 Riverbank High#> 1905 Waterford High#> 1907 Monache High#> 1908 Porterville High#> 1909 Granite Hills High#> 1910 Strathmore High#> 1911 Butterfield Charter High#> 1912 Harmony Magnet Academy#> 1913 Dinuba High#> 1914 Caruthers High#> 1919 School of Arts and Enterprise#> 1922 Turlock High#> 1924 Fusion Charter#> 1929 Lifeline Education Charter#> 1939 Hamilton High#> 1943 Washington High#> 1944 W. E. B. DuBois Public Charter#> 1945 Woodlake High#> 1948 Da Vinci Design#> 1949 Anahuacalmecac Inter. Univ. Prep. High Sch of N. America#> 1950 Artesia High#> 1953 Alta Vista Public#> 1956 Alternatives in Action#> 1959 Alpaugh Junior-Senior High#> 1962 Anaheim High#> 1965 Magnolia High#> 1966 Western High#> 1972 Anderson Valley Junior-Senior High#> 1973 Antelope Valley High#> 1974 Palmdale High#> 1977 Highland High#> 1978 Littlerock High#> 1979 Eastside High#> 1980 William J. (Pete) Knight High#> 1985 Crossroads Charter#> 1986 Gladstone High#> 1987 Baldwin Park High#> 1988 Sierra Vista High#> 1989 Opportunities for Learning - Baldwin Park#> 1990 Opportunities For Learning - Baldwin Park II#> 1991 Banning High#> 1992 Barstow High#> 1993 Bassett Senior High#> 1996 Bellflower High#> 1997 Mayfair High#> 1998 REALM Charter High#> 2005 San Dimas High#> 2006 Borrego Springs High#> 2008 Diego Springs Academy#> 2011 Options for Youth-Burbank Charter#> 2012 Summit Charter Academy#> 2013 Burton Pathways Charter Academy#> 2018 Calexico High#> 2019 Calistoga Junior-Senior High#> 2020 Del Mar High#> 2028 Hawthorne High#> 2029 Leuzinger High#> 2030 Lawndale High#> 2031 Family First Charter#> 2032 Central High East Campus#> 2033 Central Union High#> 2034 Ceres High#> 2036 Central Valley High#> 2037 Charter Oak High#> 2041 Don Antonio Lugo High#> 2042 Chowchilla Union High#> 2043 Cloverdale High#> 2046 Coachella Valley High#> 2047 West Shores High#> 2048 Desert Mirage High#> 2049 NOVA Academy - Coachella#> 2050 Coalinga High#> 2051 Colton High#> 2052 Grand Terrace High Sch at the Ray Abril Jr. Edal Complex#> 2053 Colusa High#> 2054 Centennial High#> 2055 Thurgood Marshall#> 2056 Corcoran High#> 2058 Corona High#> 2061 Covina High#> 2062 Northview High#> 2063 South Hills High#> 2064 Orosi High#> 2067 Diego Hills Charter#> 2069 Delano High#> 2070 Cesar E. Chavez High#> 2073 Indio High#> 2074 La Quinta High#> 2075 Shadow Hills High#> 2076 Dixon High#> 2077 Opportunities for Learning - Duarte#> 2082 William C. Overfelt High#> 2085 Latino College Preparatory Academy#> 2086 Escuela Popular Accelerated Family Learning#> 2087 Escuela Popular/Center for Training and Careers, Family Lrng#> 2088 KIPP San Jose Collegiate#> 2089 Summit Public School: Rainier#> 2092 South El Monte High#> 2093 El Monte High#> 2094 Mountain View High#> 2103 Escondido High#> 2104 Orange Glen High#> 2107 Esparto High#> 2111 Fallbrook High#> 2113 Fillmore Senior High#> 2114 Firebaugh High#> 2115 Fontana High#> 2118 Fowler High#> 2123 Edison High#> 2124 Fresno High#> 2125 McLane High#> 2126 Roosevelt High#> 2127 Sunnyside High#> 2128 Carter G. Woodson Public Charter#> 2129 Erma Duncan Polytechnical High#> 2130 Sierra Charter#> 2131 Fresno Academy for Civic and Entrepreneurial Leadership#> 2134 Galt High#> 2138 Geyserville New Tech Academy#> 2140 Christopher High#> 2149 Monte Vista High#> 2150 Mount Miguel High#> 2154 Gustine High#> 2155 Los Altos High#> 2156 William Workman High#> 2157 Hanford High#> 2158 Hanford West High#> 2159 Sierra Pacific High#> 2161 Hawthorne Math and Science Academy#> 2162 Impact Academy of Arts & Technology#> 2165 Alta Vista South Public Charter#> 2169 Tahquitz High#> 2172 Holtville High#> 2174 Morningside High#> 2175 Animo Inglewood Charter High#> 2182 Diego Valley Charter#> 2183 Rubidoux High#> 2184 Jurupa Valley High#> 2185 Patriot High#> 2187 Kerman High#> 2189 Arvin High#> 2190 Bakersfield High#> 2191 East Bakersfield High#> 2192 Foothill High#> 2195 Shafter High#> 2196 West High#> 2197 Ridgeview High#> 2198 Kern Workforce 2000 Academy#> 2199 Independence High#> 2201 King City High#> 2202 Greenfield High#> 2203 Reedley High#> 2204 Orange Cove High#> 2205 Kingsburg High#> 2208 Pescadero High#> 2214 Laton High#> 2215 Environmental Charter High#> 2216 Le Grand High#> 2218 Animo Leadership High#> 2219 Lennox Mathematics, Science and Technology Academy#> 2222 Linden High#> 2223 Lindsay Senior High#> 2224 Loma Vista Charter#> 2226 Live Oak High#> 2227 Tokay High#> 2231 Rio Valley Charter#> 2233 Lompoc High#> 2235 Jordan High#> 2237 Cabrillo High#> 2238 Avalon K-12#> 2241 Chatsworth Charter High#> 2242 Eagle Rock High#> 2243 Elizabeth Learning Center#> 2244 Fairfax Senior High#> 2245 Foshay Learning Center#> 2246 John H. Francis Polytechnic#> 2247 Robert Fulton College Preparatory#> 2248 Gardena Senior High#> 2249 Ulysses S. Grant Senior High#> 2250 Thomas Jefferson Senior High#> 2251 John F. Kennedy High#> 2252 David Starr Jordan Senior High#> 2253 Abraham Lincoln Senior High#> 2254 Los Angeles Senior High#> 2255 John Marshall Senior High#> 2256 North Hollywood Senior High#> 2257 Reseda Senior High#> 2258 San Pedro Senior High#> 2259 South Gate Senior High#> 2260 Sylmar Senior High#> 2261 Van Nuys Senior High#> 2262 Verdugo Hills Senior High#> 2264 PUC Comm Chtr Mid and PUC Comm Chtr Early College High#> 2265 Los Angeles Leadership Academy#> 2266 Magnolia Science Academy#> 2267 Animo South Los Angeles Charter#> 2268 New Designs Charter#> 2269 Harbor Teacher Preparation Academy#> 2271 Animo Venice Charter High#> 2272 Animo Pat Brown#> 2273 Alliance Gertz-Ressler Richard Merkin 6-12 Complex#> 2274 Northridge Academy High#> 2275 International Studies Lrng Center at Legacy High Sch Complex#> 2276 Port of Los Angeles High#> 2278 Alliance Judy Ivie Burton Technology Academy High#> 2279 Alliance Collins Family College-Ready High#> 2280 PUC CA Academy for Liberal Studies Early College High#> 2282 Oscar De La Hoya Animo Charter High#> 2283 Renaissance Arts Academy#> 2284 Central City Value#> 2285 North Valley Military Institute College Preparatory Academy#> 2286 Wallis Annenberg High#> 2287 Santee Education Complex#> 2288 Los Angeles Academy of Arts & Enterprise Charter#> 2290 Alliance Patti And Peter Neuwirth Leadership Academy#> 2291 Alliance Dr. Olga Mohan High#> 2292 Animo Ralph Bunche High#> 2293 Animo Jackie Robinson High#> 2294 Alliance Ouchi-O'Donovan 6-12 Complex#> 2295 Alliance Marc & Eva Stern Math and Science#> 2296 School of Business and Tourism at Contreras Learning Complex#> 2297 East Valley Senior High#> 2298 Arleta High#> 2299 Panorama High#> 2300 Bright Star Secondary Charter Academy#> 2302 Magnolia Science Academy 2#> 2303 Contreras Lrng Center-Los Angeles Sch of Global Studies#> 2304 Discovery Charter Preparatory No. 2#> 2306 Student Empowerment Academy#> 2307 Helen Bernstein High#> 2309 Alliance Media Arts and Entertainment Design High#> 2310 RFK Community Schools-Los Angeles High School of the Arts#> 2311 Edward R. Roybal Learning Center#> 2312 APEX Academy#> 2313 Contreras Learning Center-Academic Leadership Community#> 2314 Belmont SH-LA Teacher Preparatory Academy#> 2315 Alliance Environmental Science and Technology High#> 2316 Magnolia Science Academy 4#> 2317 RFK Community Schools- for the Visual Arts and Humanities#> 2318 New Designs Charter School-Watts#> 2319 RFK Community Schools-New Open World Academy K-12#> 2320 Sun Valley High#> 2321 Daniel Pearl Journalism & Communications Magnet#> 2322 RFK Community Schools-Ambassador-Global Leadership#> 2323 Alliance Cindy and Bill Simon Technology Academy High#> 2324 Alliance Tennenbaum Family Technology High#> 2325 Aspire Pacific Academy#> 2326 Los Angeles Big Picture High#> 2327 Humanitas Acad of Art and Tech at Esteban E. Torres High #4#> 2328 Soc Just Leadership Acad at Esteban E. Torres High #5#> 2329 Acad of Environmental & Soc Policy (ESP) at Roosevelt High#> 2330 Alliance College-Ready Academy High 16#> 2331 E. Los Angeles Perf Arts Acad at Esteban E. Torres High #1#> 2332 Math, Science, & Technology Magnet Academy at Roosevelt High#> 2333 PUC Triumph Charter High#> 2334 Engr and Tech Acad at Esteban E. Torres High #3#> 2335 Alliance Renee and Meyer Luskin Academy High#> 2336 Cesar E. Chavez Lrng Acads-Acad of Scientific Explr (ASE)#> 2337 PUC Early College Academy for Leaders and Scholars (ECALS)#> 2338 Dr. Maya Angelou Community High#> 2339 Synergy Quantum Academy#> 2340 Cesar E. Chavez Lrng Acads-Soc Just Humanitas Acad#> 2341 Rancho Dominguez Preparatory#> 2342 Los Angeles River at Sonia Sotomayor Learning Academies#> 2343 Cesar E. Chavez Lrng Acads-Arts,Theatre, Entertmnt (ArTES)#> 2344 Green Design at Diego Rivera Learning Complex#> 2345 Valley Academy of Arts and Sciences#> 2346 Sch of Hist and Dramatic Arts at Sonia Sotomayor Lrng Acads#> 2347 Cesar E. Chavez Lrng Acads-Teacher Preparation Acad#> 2348 Public Service Community at Diego Rivera Learning Complex#> 2349 Communication and Tech at Diego Rivera Lrng Complex#> 2351 Augustus F. Hawkins High A Critical Design and Gaming#> 2352 Acad for Multilingual Arts and Sci at Mervyn M. Dymally High#> 2353 Augustus F. Hawkins High B Community Health Advocates#> 2355 Leadership in Entertainment and Media Arts (LEMA)#> 2356 Visual and Performing Arts at Legacy High School Complex#> 2357 STEM Academy at Bernstein High#> 2358 Sci, Tech, Engr, Arts and Math at Legacy High Sch Complex#> 2359 Augustus F. Hawkins High C Rspnsbl Indigenous Soc Entrepr#> 2360 Linda Esperanza Marquez High A Hntngtn Park Inst of Appl Med#> 2361 Linda Esperanza Marquez High B LIBRA Academy#> 2362 Contreras Learning Center-School of Social Justice#> 2363 Early College Academy-LA Trade Tech College#> 2364 Camino Nuevo High No. 2#> 2365 Humanities and Arts (HARTS) Academy of Los Angeles#> 2366 Los Banos High#> 2367 Pacheco High#> 2371 Nipomo High#> 2372 Madera High#> 2373 Madera South High#> 2374 Mammoth High#> 2375 East Union High#> 2376 Manteca High#> 2387 McFarland High#> 2390 Golden Valley High#> 2391 Atwater High#> 2392 Livingston High#> 2393 Merced High#> 2394 Buhach Colony High#> 2397 Peter Johansen High#> 2399 Grace M. Davis High#> 2400 Modesto High#> 2402 Mojave Jr./Sr. High#> 2404 Mountain Park#> 2406 Learning for Life Charter#> 2407 Marina High#> 2409 Live Oak High#> 2418 Mt. Diablo High#> 2427 Humphreys College Academy of Business, Law and Education#> 2429 Orestimba High#> 2430 Costa Mesa High#> 2431 Estancia High#> 2432 La Mirada High#> 2434 Nuview Bridge Early College High#> 2436 LIFE Academy#> 2439 Lighthouse Community Charter High#> 2440 Oakland Military Institute, College Preparatory Academy#> 2441 MetWest High#> 2442 Oakland Unity High#> 2445 ARISE High#> 2448 Castlemont High#> 2449 Fremont High#> 2450 LPS Oakland R & D Campus#> 2451 Oceanside High#> 2452 Pacific View Charter#> 2454 National University Academy - Orange Center#> 2457 Orange High#> 2459 Mojave River Academy#> 2460 Riverside Preparatory#> 2462 Hueneme High#> 2463 Rio Mesa High#> 2464 Pacifica High#> 2467 Pacific Coast Charter#> 2468 Palm Springs High#> 2469 Desert Hot Springs High#> 2470 Cathedral City High#> 2471 Guidance Charter#> 2472 Antelope Valley Learning Academy#> 2473 Palo Verde High#> 2477 Parlier High#> 2478 Blair High#> 2479 Marshall Fundamental#> 2480 John Muir High#> 2482 Learning Works#> 2483 Patterson High#> 2484 Perris High#> 2485 Paloma Valley High#> 2486 California Military Institute#> 2487 Gateway to College Academy#> 2488 Pierce High#> 2491 Valencia High#> 2500 Point Arena High#> 2501 Fremont Academy of Engineering and Design#> 2502 Ganesha High#> 2503 Garey High#> 2504 Palomares Academy of Health Science#> 2505 Pomona High#> 2506 Diamond Ranch High#> 2507 School of Extended Educational Options#> 2509 Princeton Junior-Senior High#> 2510 Ambassador Phillip V. Sanchez Public Charter#> 2513 Aspire East Palo Alto Charter#> 2517 Avenal High#> 2518 Rialto High#> 2519 Kennedy High#> 2521 Middle College High#> 2522 Leadership Public Schools: Richmond#> 2526 Arlington High#> 2527 Polytechnic High#> 2528 Roseland Charter#> 2535 New Technology High#> 2536 North Salinas High#> 2537 Salinas High#> 2538 San Bernardino High#> 2539 Pacific High#> 2540 Provisional Accelerated Learning Academy#> 2541 ASA Charter#> 2542 Public Safety Academy#> 2543 Options for Youth-San Bernardino#> 2544 Indian Springs High#> 2546 Charter School of San Diego#> 2547 Clairemont High#> 2551 Mission Bay High#> 2555 Preuss School UCSD#> 2557 Audeo Charter#> 2558 High Tech High International#> 2561 San Diego Business/Leadership#> 2562 San Diego MVP Arts#> 2566 Kearny Eng, Innov & Design#> 2567 San Diego Science and Technology#> 2569 Arroyo Paseo Charter High#> 2570 Health Sciences High#> 2571 King-Chavez Community High#> 2576 O'Connell (John) High#> 2581 City Arts and Tech High#> 2582 Five Keys Independence HS (SF Sheriff's)#> 2584 Options for Youth San Gabriel#> 2585 San Jacinto High#> 2586 San Jacinto Valley Academy#> 2587 Mountain Heights Academy#> 2588 Gunderson High#> 2591 San Jos? High#> 2592 Willow Glen High#> 2593 Downtown College Preparatory#> 2601 Aspire Alexander Twilight Secondary Academy#> 2608 San Pasqual Valley High#> 2609 San Rafael High#> 2610 Sanger High#> 2612 Valley High#> 2613 Century High#> 2614 Piner High#> 2615 Abraxis Charter#> 2616 East Palo Alto Academy#> 2617 Shandon High#> 2623 Tomales High#> 2632 Options for Youth-Victorville Charter#> 2634 Victor Valley High#> 2635 University Preparatory#> 2636 Adelanto High#> 2639 South San Francisco High#> 2641 Rosamond High#> 2643 Edison High#> 2644 Franklin High#> 2645 Stagg Senior High#> 2647 Aspire Langston Hughes Academy#> 2648 Stockton High#> 2650 Stockton Collegiate International Secondary#> 2651 Health Careers Academy#> 2652 Pacific Law Academy#> 2653 Crescent Valley Public Charter#> 2658 Bonita Vista Senior High#> 2659 Castle Park Senior High#> 2660 Chula Vista Senior High#> 2661 Hilltop Senior High#> 2662 Mar Vista Senior High#> 2663 Southwest Senior High#> 2664 Sweetwater High#> 2665 MAAC Community Charter#> 2666 San Ysidro High#> 2667 Olympian High#> 2672 Sierra Vista Charter High#> 2673 Tulare Union High#> 2674 Tulare Western High#> 2675 Mission Oak High#> 2676 Tulelake High#> 2680 Accelerated Achievement Academy#> 2684 Taylion San Diego Academy#> 2687 Ventura High#> 2689 Golden West High#> 2690 Mt. Whitney High#> 2691 Redwood High#> 2692 El Diamante High#> 2693 Visalia Charter Independent Study#> 2695 Guajome Park Academy Charter#> 2696 Vista High#> 2697 Rancho Buena Vista High#> 2698 North County Trade Tech High#> 2701 Wasco High#> 2702 West Sacramento Early College Prep Charter#> 2703 West Park Charter Academy#> 2707 Crescent View South Charter#> 2712 Whittier High#> 2716 Golden Valley High#> 2718 Mission View Public#> 2721 Williams Junior/Senior High#> 2726 Winters High#> 2727 Woodland Senior High#> 2734 Making Waves Academy#> 2735 Crescent View West Charter#> 2736 Big Picture High School - Fresno#> 2740 YouthBuild Charter School of California#> 2741 YouthBuild Charter School of California Central#> 2742 The Education Corps#> 2743 College Bridge Academy#> 2745 Paramount Academy#> 2746 Pioneer Technical Center#> 2747 Madera County Independent Academy#> 2748 Merced Scholars Charter#> 2749 Monterey County Home Charter#> 2750 Muir Charter#> 2753 EPIC de Cesar Chavez#> 2756 Riverside County Education Academy#> 2757 Come Back Kids#> 2758 Gateway College and Career Academy#> 2760 Venture Academy#> 2761 one.Charter#> 2762 San Joaquin Building Futures Academy#> 2763 Leadership Public Schools - San Jose#> 2764 Summit Public School: Tahoma#> 2767 Stanislaus Alternative Charter#> 2769 La Sierra High#> 2771 Vista Real Charter High#> 2775 Aspire California College Preparatory Academy#> 2778 Orange County Conservation Corps Charter#> 2780 International Polytechnic High#> 2781 Academia Avance Charter#> 2782 Los Angeles International Charter High#> 2783 Optimist Charter#> 2784 Tranquillity High#> 2785 Anzar High#> address#> 1254 1171 El Camino Real#> 1301 6119 Agra St.#> 1309 520 W. Walnut#> 1332 8141 De Palma St.#> 1336 1521 N. Highland Ave.#> 1337 13000 Venice Blvd.#> 1345 1201 Brewster Ave.#> 1360 700 E. Anapamu St.#> 1361 404 N. Sixth St.#> 1362 240 N. Cerritos Ave.#> 1363 8880 Baseline Rd.#> 1373 11271 Stanford Ave.#> 1374 2300 Panama St.#> 1375 17071 Gothard St.#> 1376 231 S. Grevillea Ave.#> 1378 4328 Bell Ave.#> 1379 400 Rancho del Oro Dr.#> 1380 26000 Briggs Rd.#> 1391 50 E. Lewelling Blvd.#> 1394 2301 W. MacArthur Blvd.#> 1404 10400 S. Orr and Day Rd.#> 1422 1024 Prince St.#> 1425 301 N. Gilbert St.#> 1431 4810 Foothill Rd.#> 1433 13500 Victoria Ave.#> 1434 3850 E. Riverside Dr.#> 1436 1820 Rimpau Ave.#> 1437 1565 E. Central Ave.#> 1438 57 N. White Rd.#> 1439 285 Blossom Hill Rd.#> 1440 1855 Lucretia Ave.#> 1442 4921 N. Cedar Ave.#> 1443 6501 S. Passons Blvd.#> 1448 5550 N. First St.#> 1449 201 E. Chapman Ave.#> 1450 11351 Dale St.#> 1451 12342 Trask Ave.#> 1454 16455 Wedgeworth Dr.#> 1455 1633 E. Ave.#> 1456 28000 Calaroga Ave.#> 1457 3401 Mustang Way#> 1459 14325 GoldenW. St.#> 1461 101 E. Bush St.#> 1464 2800 Snowden Ave.#> 1467 24300 W.ern Ave.#> 1468 456 S. Mathews St.#> 1470 4500 Multnomah St.#> 1471 6361 Cottage St.#> 1473 845 W. Colorado Blvd.#> 1474 820 N. Wilcox Ave.#> 1476 755 Oak Grove Rd.#> 1477 2475 Jefferson St.#> 1478 1375 Trower Ave.#> 1479 13520 Shoemaker Ave.#> 1483 400 105th Ave.#> 1484 3920 Spring St.#> 1486 500 Harkins Slough Rd.#> 1487 2925 E. Sierra Madre Blvd.#> 1491 1250 23rd St.#> 1492 1550 W. Third St.#> 1496 4474 El Cajon Blvd.#> 1499 241 Oneida Ave. #301#> 1506 300 La Fonda Ave.#> 1509 599 Bellevue Ave.#> 1512 20000 BRd.way#> 1513 1401 Grayson Ave.#> 1517 15301 E. Youngwood Dr.#> 1518 10800 Ben Avon#> 1519 1400 Pioneer Ave.#> 1523 17750 Lasselle St.#> 1554 9898 Maple Ave.#> 1559 3301 Fong Ranch Rd.#> 1560 501 Fifth St.#> 1562 101 S. Second St.#> 1563 1945 Discovery Falls Dr.#> 1566 4750 Hollister Ave.#> 1567 13500 Aviation Blvd.#> 1570 4145 La Sierra Ave.#> 1571 6585 Crest Ave.#> 1572 2200 E. Wagner Ave.#> 1573 1765 W. Cerritos Ave.#> 1578 480 N. Imperial Ave.#> 1579 601 W. Main St.#> 1583 2001 Ocotillo Dr.#> 1584 1245 N. Euclid Ave.#> 1585 4725 Benito St.#> 1586 901 W. Francis St.#> 1589 5472 Park Pl.#> 1594 10750 Laurel Ave.#> 1595 601 S. Acacia St.#> 1596 15301 S. San Jose Ave.#> 1599 1401 Hiett Ave.#> 1601 11040 Brookshire Ave.#> 1602 3200 Senter Rd.#> 1603 1750 S. White Rd.#> 1612 3300 Bear Valley Pkwy.#> 1615 11155 Almond Ave.#> 1616 6821 Oleander Ave.#> 1617 15551 Summit Ave.#> 1618 10700 Oleander Ave.#> 1621 2336 Calaveras St.#> 1622 8833 Acad Dr.#> 1623 801 W. Highlander Ave.#> 1624 401 S. Palm St.#> 1627 16566 Newhope St.#> 1628 750 W. Tenth St.#> 1632 15615 E. Nelson Ave.#> 1633 27035 Whitman Rd.#> 1634 517 W. Barioni Blvd.#> 1636 2900 Royal Scots Way#> 1637 1101 Planz Rd.#> 1639 801 Hosking Ave.#> 1641 1800 S. Fairfax Rd.#> 1645 4400 E. Tenth St.#> 1646 1527 Lakme Ave.#> 1647 17000 Haynes St.#> 1648 6850 Topanga Canyon Blvd.#> 1649 22328 S. Main St.#> 1650 8140 Vanalden Ave.#> 1653 820 N. Ave. 54#> 1654 7676 S. San Pedro St.#> 1655 5101 E. Sixth St.#> 1656 2955 Robertson Blvd.#> 1657 6020 Miles Ave.#> 1658 4131 S. Vermont Ave.#> 1659 9229 Haskell Ave.#> 1660 11133 O'Melveny Ave.#> 1661 11800 Texas Ave.#> 1662 13330 Vaughn St.#> 1664 1575 W. 2nd St.#> 1666 1600 W. Imperial Hwy, Bldg. 16#> 1667 2720 Tweedy Blvd.#> 1668 6125 Pine Ave.#> 1670 12628 Avalon Blvd.#> 1671 1500 W. Washington Blvd.#> 1672 325 E. 111th St.#> 1673 12226 S. W.ern Ave.#> 1674 1200 Plaza Del Sol#> 1675 450 N. Grand Ave.#> 1676 700 S. Mariposa Ave.#> 1677 4211 Dozier St.#> 1678 919 Eighth St.#> 1679 6100 S. Central Ave.#> 1680 2265 E. 103rd St.#> 1681 4050 E. Imperial Hwy.#> 1682 5246 MLK Blvd.#> 1685 1000 Coffee Rd.#> 1688 2100 W. Cleveland Ave.#> 1689 1200 W. Mines Ave.#> 1690 2200 Noche Buena St.#> 1691 23300 Cottonwood Ave.#> 1692 23100 Cougar Canyon Dr.#> 1693 13135 Nason St.#> 1694 15150 Lasselle St.#> 1700 13990 Castroville Blvd.#> 1701 11356 E. Leffingwell Rd.#> 1702 1390 66th Ave.#> 1703 1009 66th Ave.#> 1706 1400 Raiders Way#> 1707 3400 W. Gonzales Rd.#> 1709 250 E. Beach St.#> 1710 260 W. Riverside Dr.#> 1711 14429 S. Downey Ave.#> 1713 1750 Harbor St.#> 1718 1321 N. Lilac Ave.#> 1719 2630 N. Linden Ave.#> 1723 7675 Magnolia Ave.#> 1727 401 S. Nogales St.#> 1734 1900 Independence Blvd.#> 1735 777 Williams Rd.#> 1736 1220 Monterey St.#> 1737 1200 Hill Dr.#> 1738 2299 E. Pacific Ave.#> 1739 1881 W. Baseline St.#> 1744 4777 Imperial Ave.#> 1745 1005 47th St.#> 1751 555 Dana Ave.#> 1756 1 Mission Hills Ct.#> 1758 2801 S. Flower#> 1759 1010 W. 17th St.#> 1760 3002 Centennial Rd.#> 1763 941 E. Foster Rd.#> 1764 901 S. BRd.way#> 1765 675 Panther Dr.#> 1767 3125 Wright St.#> 1768 199 Churchill Ave.#> 1769 890 BRd.way#> 1770 14048 Cobalt Rd.#> 1771 425 Gabilan Dr.#> 1773 1120 E.lake Pkwy.#> 1774 3250 Palm Ave.#> 1775 1250 Olympic Pkwy.#> 1783 1609 E. Cameron Ave.#> 1784 1301 Trojan Way#> 1785 9800 S. Mills Ave.#> 1794 18150 Wood Rd.#> 1855 5701 S. W.ern Ave.#> 1857 4500 Highway 166#> 1862 17311 Sultana Ave.#> 1863 12850 Muscatel St.#> 1864 14073 Main St., Ste. 103#> 1866 7625 Cataba Rd.#> 1869 565 W. 11th St.#> 1875 1200 Belmont Ave.#> 1877 21800 Canyon Dr.#> 1879 32593 Riverside Dr.#> 1883 1701 E. Blossom St.#> 1884 631 E. Walnut Ave.#> 1889 16881 W. Schendel Ave.#> 1890 3086 W. Mt. Whitney Ave.#> 1893 101 Shasta St.#> 1894 1775 W. Lowell Ave.#> 1900 300 E. Spruce St.#> 1902 6200 Claus Rd.#> 1905 121 S. Reinway#> 1907 960 N. Newcomb St.#> 1908 465 W. Olive Ave.#> 1909 1701 E. Putnam Ave.#> 1910 22568 Ave. 196#> 1911 600 W. Grand Ave.#> 1912 19429 Rd. 228#> 1913 340 E. Kern St.#> 1914 2580 W. Tahoe Ave.#> 1919 295 N. Garey Ave.#> 1922 1600 E. Canal Dr.#> 1924 2217 Geer Rd.#> 1929 225 S. Santa Fe Ave.#> 1939 620 Canal St.#> 1943 6041 S. Elm Ave.#> 1944 2604 MLK Blvd.#> 1945 400 W. Whitney Ave.#> 1948 12501 Isis Ave.#> 1949 4736 Hntngtn Dr. S.#> 1950 12108 E. Del Amo Blvd.#> 1953 11988 Hesperia Rd., Ste. B#> 1956 6221 E 17th St.#> 1959 5313 Rd. 39#> 1962 811 W. Lincoln Ave.#> 1965 2450 W. Ball Rd.#> 1966 501 S. W.ern Ave.#> 1972 18200 Mountain View Rd.#> 1973 44900 N. Division St.#> 1974 2137 E. Ave. R#> 1977 39055 25th St. W.#> 1978 10833 E. Ave. R#> 1979 3200 E. Ave. J-8#> 1980 37423 70th St. E.#> 1985 418 W. 8th St.#> 1986 1340 N. Enid#> 1987 3900 N. Puente Ave.#> 1988 3600 N. Frazier St.#> 1989 320 N. Halstead St.#> 1990 320 N. Halstead St.#> 1991 100 W. W.ward#> 1992 430 S. First Ave.#> 1993 755 Ardilla Ave.#> 1996 15301 S. McNab Ave.#> 1997 6000 N. Woodruff Ave.#> 1998 2023 Eighth St.#> 2005 800 W. Covina Blvd.#> 2006 2281 Diegueno Rd.#> 2008 310 BRd.way#> 2011 1610 W. Burbank Blvd.#> 2012 175 S. Mathew St.#> 2013 1414 W. Olive Ave.#> 2018 1030 Encinas Ave.#> 2019 1608 Lake St.#> 2020 1224 Del Mar Ave.#> 2028 4859 W. El Segundo Blvd.#> 2029 4118 W. Rosecrans Ave.#> 2030 14901 S. Inglewood Ave.#> 2031 4953 Marine Ave.#> 2032 3535 N. Cornelia Ave.#> 2033 1001 Brighton Ave.#> 2034 2320 Central Ave.#> 2036 4033 Central Ave.#> 2037 1430 E. Covina Blvd.#> 2041 13400 Pipeline Ave.#> 2042 805 Humboldt Ave.#> 2043 509 N. Cloverdale Blvd.#> 2046 83-800 Airport Blvd.#> 2047 2381 Shore Hawk Ave.#> 2048 86-150 Ave. 66#> 2049 52-780 Frederick St.#> 2050 750 Van Ness Ave.#> 2051 777 W. Valley Blvd.#> 2052 21810 Main St.#> 2053 901 Colus Ave.#> 2054 2606 N. Central Ave.#> 2055 12501 N. Wilmington#> 2056 1100 Letts Ave.#> 2058 1150 W. Tenth St.#> 2061 463 S. Hollenbeck Ave.#> 2062 1016 W. Cypress Ave.#> 2063 645 S. Barranca St.#> 2064 41815 Rd. 128#> 2067 4585 College Ave.#> 2069 1331 Cecil Ave.#> 2070 800 Browning Rd.#> 2073 81-750 Ave. 46#> 2074 79-255 W.ward Ho Dr.#> 2075 39-225 Jefferson St.#> 2076 555 College Way#> 2077 1008 Hntngtn Dr.#> 2082 1835 Cunningham Ave.#> 2085 14271 Story Rd.#> 2086 467 N. White Rd.#> 2087 149 N. White Rd.#> 2088 1790 Edal Park Dr.#> 2089 1750 S. White Rd.#> 2092 1001 Durfee Ave.#> 2093 3048 N. Tyler Ave.#> 2094 2900 Pkwy. Dr.#> 2103 1535 N. BRd.way#> 2104 2200 Glen Ridge Rd.#> 2107 17121 Yolo Ave.#> 2111 2400 S. Stage Coach Ln.#> 2113 555 Central Ave.#> 2114 1976 Morris Kyle Dr.#> 2115 9453 Citrus Ave.#> 2118 701 E. Main St.#> 2123 540 E. California Ave.#> 2124 1839 Echo Ave.#> 2125 2727 N. Cedar Ave.#> 2126 4250 E. Tulare St.#> 2127 1019 S. Peach Ave.#> 2128 3333 N. Bond Ave.#> 2129 4330 E. Garland Ave.#> 2130 1931 N. Fine Ave.#> 2131 1713 Tulare St., Ste. 202#> 2134 145 N. Lincoln Way#> 2138 1300 Moody Ln.#> 2140 850 Day Rd.#> 2149 3230 Sweetwater Springs Blvd.#> 2150 8585 Blossom Ln.#> 2154 501 N. Ave.#> 2155 15325 E. Los Robles Ave.#> 2156 16303 E. Temple Ave.#> 2157 120 E. Grangeville Blvd.#> 2158 1150 W. Lacey Blvd.#> 2159 1259 N. 13th Ave.#> 2161 4467 W. BRd.way#> 2162 2560 Darwin St.#> 2165 689 W. Second St.#> 2169 4425 Titan Trail#> 2172 755 Olive Ave.#> 2174 10500 S. Yukon Ave.#> 2175 3425 W. Manchester Blvd.#> 2182 511 N. 2nd St.#> 2183 4250 Opal St.#> 2184 10551 Bellegrave Ave.#> 2185 4355 Camino Real#> 2187 205 S. First St.#> 2189 900 Varsity Rd.#> 2190 1241 G St.#> 2191 2200 Quincy St.#> 2192 501 Park Dr.#> 2195 526 Mannel Ave.#> 2196 1200 New Stine Rd.#> 2197 8501 Stine Rd.#> 2198 5801 Sundale Ave.#> 2199 8001 Old River Rd.#> 2201 720 BRd.way St.#> 2202 225 S. El Camino Real#> 2203 740 W. N. Ave.#> 2204 1700 Anchor Ave.#> 2205 1900 18th Ave.#> 2208 350 Butano Cut Off#> 2214 6449 DeWoody#> 2215 16315 Grevillea Ave.#> 2216 12961 E. Le Grand Rd.#> 2218 11044 S. Freeman Ave.#> 2219 11036 Hawthorne Blvd.#> 2222 18527 E. Front St.#> 2223 1849 E. Tulare Rd.#> 2224 290 N. Harvard#> 2226 2351 Pennington Rd.#> 2227 1111 W. Century Blvd.#> 2231 1530 W. Kettleman Ln., Ste. A#> 2233 515 W. College Ave.#> 2235 6500 Atlantic Ave.#> 2237 2001 Santa Fe Ave.#> 2238 200 Falls Canyon Rd.#> 2241 10027 Lurline Ave.#> 2242 1750 Yosemite Dr.#> 2243 4811 Elizabeth St.#> 2244 7850 Melrose Ave.#> 2245 3751 S. Harvard Blvd.#> 2246 12431 Roscoe Blvd.#> 2247 7477 Kester Ave.#> 2248 1301 W. 182nd St.#> 2249 13000 Oxnard St.#> 2250 1319 E. 41st St.#> 2251 11254 Gothic Ave.#> 2252 2265 E. 103rd St.#> 2253 3501 N. BRd.way#> 2254 4650 W. Olympic Blvd.#> 2255 3939 Tracy St.#> 2256 5231 Colfax Ave.#> 2257 18230 Kittridge St.#> 2258 1001 W. 15th St.#> 2259 3351 Firestone Blvd.#> 2260 13050 Borden Ave.#> 2261 6535 Cedros Ave.#> 2262 10625 Plainview Ave.#> 2264 11500 Eldridge Ave.#> 2265 234 E. Ave. 33#> 2266 18238 Sherman Way#> 2267 11130 W.ern Ave.#> 2268 2303 S. Figueroa Way#> 2269 1111 Figueroa Pl.#> 2271 820 BRd.way St.#> 2272 8255 Beach St.#> 2273 2023 S. Union Ave.#> 2274 9601 Zelzah Ave.#> 2275 5225 Tweedy Blvd.#> 2276 250 W. Fifth St.#> 2278 10101 S. BRd.way#> 2279 2071 Saturn Ave.#> 2280 7350 N. Figueroa St.#> 2282 1114 S. Lorena St.#> 2283 1800 Colorado Blvd.#> 2284 221 N. W.moreland Ave.#> 2285 16651-A Rinaldi St.#> 2286 4000 S. Main St.#> 2287 1921 S. Maple Ave.#> 2288 600 S. La Fayette Park Pl.#> 2290 4610 S. Main St.#> 2291 644 W. 17th St.#> 2292 1655 E. 27th St., Ste. B#> 2293 3500 S. Hill St.#> 2294 5356 S. Fifth Ave.#> 2295 5151 State University Dr.#> 2296 322 Lucas Ave.#> 2297 5525 Vineland Ave.#> 2298 14200 Van Nuys Blvd.#> 2299 8015 Van Nuys Blvd.#> 2300 5431 W. 98th St.#> 2302 17125 Victory Blvd.#> 2303 322 Lucas Ave.#> 2304 12550 Van Nuys Blvd.#> 2306 1319 E. 41st St.#> 2307 1309 N. Wilton Pl.#> 2309 113 S. Rowan Ave.#> 2310 701 S. Catalina St.#> 2311 1200 W. Colton St.#> 2312 1309 N. Wilton Pl., 3rd Fl.#> 2313 322 Lucas Ave.#> 2314 1575 W. Second St.#> 2315 2930 Fletcher Dr.#> 2316 11330 W. Graham Pl., B-9#> 2317 701 S. Catalina St.#> 2318 12714 S. Avalon Blvd.#> 2319 3201 W. Eighth St.#> 2320 9171 Telfair Ave.#> 2321 6649 Balboa Blvd.#> 2322 701 S. Catalina St.#> 2323 10720 S. Wilmington Ave.#> 2324 2050 San Fernando Rd.#> 2325 2565 58th St.#> 2326 700 Wilshire Blvd., Ste. 400#> 2327 4211 Dozier St.#> 2328 4211 Dozier St.#> 2329 3921 Selig Pl.#> 2330 1575 W. Second St.#> 2331 4211 Dozier St.#> 2332 456 S. Mathews St.#> 2333 9171 Telfair Ave.#> 2334 4211 Dozier St.#> 2335 2941 W. 70th St.#> 2336 1001 Arroyo Ave.#> 2337 2050 San Fernando Rd.#> 2338 300 E. 53rd St.#> 2339 300 E. 53rd St.#> 2340 1001 Arroyo Ave.#> 2341 4110 Santa Fe Ave.#> 2342 2050 San Fernando Rd.#> 2343 1001 Arroyo Ave.#> 2344 6100 S. Central Ave.#> 2345 10445 Balboa Blvd.#> 2346 2050 San Fernando Rd.#> 2347 1001 Arroyo Ave.#> 2348 6100 S. Central Ave.#> 2349 6100 S. Central Ave.#> 2351 825 W. 60th St.#> 2352 8800 S. San Pedro St.#> 2353 825 W. 60th St.#> 2355 3501 N. BRd.way#> 2356 5225 Tweedy Blvd.#> 2357 1309 N. Wilton Pl.#> 2358 5225 Tweedy Blvd.#> 2359 825 W. 60th St.#> 2360 6361 Cottage St.#> 2361 6361 Cottage St.#> 2362 322 S. Lucas Ave.#> 2363 400 W. Washington Blvd.#> 2364 3500 W. Temple St.#> 2365 24300 S. W.ern Ave.#> 2366 1966 11th St.#> 2367 200 N. Ward Rd.#> 2371 525 N. Thompson Rd.#> 2372 200 S. L St.#> 2373 755 W. Pecan Ave.#> 2374 365 Sierra Park Rd.#> 2375 1700 N. Union Rd.#> 2376 450 E. Yosemite Ave.#> 2387 259 W. Sherwood Ave.#> 2390 2121 E. Childs Ave.#> 2391 2201 Fruitland Ave.#> 2392 1617 Main St.#> 2393 205 W. Olive Ave.#> 2394 1800 Buhach Rd.#> 2397 641 Norseman Dr.#> 2399 1200 W. Rumble Rd.#> 2400 18 H St.#> 2402 15732 O St.#> 2404 950 S. Mountain Ave.#> 2406 330 Reservation Rd., Ste. F#> 2407 298 Patton Pkwy.#> 2409 1505 E. Main Ave.#> 2418 2455 Grant St.#> 2427 6515 Inglewood Ave.#> 2429 707 Hardin Rd.#> 2430 2650 Fairview Rd.#> 2431 2323 Placentia Ave.#> 2432 13520 Adelfa Dr.#> 2434 30401 Reservoir Ave.#> 2436 2101 35th Ave.#> 2439 444 Hegenberger Rd.#> 2440 3877 Lusk St.#> 2441 1100 3rd Ave.#> 2442 6038 Brann St.#> 2445 3301 E. 12th St., Ste. 205#> 2448 8601 MacArthur Blvd.#> 2449 4610 Foothill Blvd.#> 2450 8601 MacArthur Blvd.#> 2451 1 Pirates Cove Way#> 2452 3670 Ocean Ranch Blvd.#> 2454 3530 S. Cherry Ave.#> 2457 525 N. Shaffer St.#> 2459 16519 Victor St., Ste. 404#> 2460 19121 Third St.#> 2462 500 W. Bard Rd.#> 2463 545 Central Ave.#> 2464 600 E. Gonzales Rd.#> 2467 294 Green Valley Rd.#> 2468 2401 E. Baristo Rd.#> 2469 65850 Pierson Blvd.#> 2470 69250 Dinah Shore Dr.#> 2471 37230 37th St. E.#> 2472 1601 Palmdale Blvd., Ste. C#> 2473 667 N. Lovekin Blvd.#> 2477 603 Third St.#> 2478 1201 S. Marengo Ave.#> 2479 990 N. Allen Ave.#> 2480 1905 N. Lincoln Ave.#> 2482 90 N. Daisy Ave.#> 2483 200 N. Seventh St.#> 2484 175 E. Nuevo Rd.#> 2485 31375 Bradley Rd.#> 2486 755 N. A St.#> 2487 Santa Rosa Jr College#> 2488 960 Wildwood Rd.#> 2491 500 N. Bradford Ave.#> 2500 270 Lake St.#> 2501 725 W. Franklin Ave.#> 2502 1151 Fairplex Dr.#> 2503 321 W. Lexington Ave.#> 2504 2211 N. Orange Grove Ave.#> 2505 475 Bangor St.#> 2506 100 Diamond Ranch Dr.#> 2507 1460 E. Holt Ave., St. 100#> 2509 473 State St.#> 2510 5659 E. Kings Canyon Rd.#> 2513 1286 Runnymede St.#> 2517 601 Mariposa St.#> 2518 595 S. Eucalyptus Ave.#> 2519 4300 Cutting Blvd.#> 2521 2600 Mission Bell Dr.#> 2522 251 S. 12th St.#> 2526 2951 Jackson St.#> 2527 5450 Victoria Ave.#> 2528 100 Sebastopol Rd.#> 2535 1400 Dickson St.#> 2536 55 Kip Dr.#> 2537 726 S. Main St.#> 2538 1850 N. E St.#> 2539 1020 Pacific St.#> 2540 2450 Blake St.#> 2541 3512 N. E St.#> 2542 1482 E. Enterprise Dr.#> 2543 985-A S. E St.#> 2544 650 N. Del Rosa Dr.#> 2546 10170 Huennekens St.#> 2547 4150 Ute Dr.#> 2551 2475 Grand Ave.#> 2555 9500 Gilman Dr.#> 2557 10170 Huennekens St.#> 2558 2855 Farragut Rd.#> 2561 1405 Park Blvd.#> 2562 1405 Park Blvd.#> 2566 7651 Wellington Way#> 2567 1405 Park Blvd.#> 2569 3773 El Cajon Blvd.#> 2570 3910 University Ave., Ste. 100#> 2571 201 A St.#> 2576 2355 Folsom St.#> 2581 325 La Grande Ave.#> 2582 70 Oak Grove#> 2584 405 S. San Gabriel Blvd.#> 2585 500 Idyllwild Dr.#> 2586 480 N. San Jacinto Ave.#> 2587 1000 Ramona Blvd.#> 2588 620 Gaundabert Ln.#> 2591 275 N. 24th St.#> 2592 2001 Cottle Ave.#> 2593 1460 The Alameda#> 2601 2360 El Camino Ave.#> 2608 Rt.1, 676 Baseline Rd.#> 2609 185 Mission Ave#> 2610 1045 Bethel Ave.#> 2612 1801 S. Greenville St.#> 2613 1401 S. Grand Ave.#> 2614 1700 Fulton Rd.#> 2615 1207 Cleveland Ave.#> 2616 1050 Myrtle St.#> 2617 101 S. First St.#> 2623 3850 Irvin Rd.#> 2632 15048 Bear Valley Rd.#> 2634 16500 Mojave Dr.#> 2635 13853 Seneca Rd#> 2636 15620 Joshua St.#> 2639 400 B St.#> 2641 2925 Rosamond Blvd.#> 2643 1425 S. Center St.#> 2644 300 N. Gertrude St.#> 2645 1621 Brookside Rd.#> 2647 2050 W. Ln.#> 2648 22 S. Van Buren St.#> 2650 One N. Sutter#> 2651 931 E. Magnolia St.#> 2652 1621 Brookside Rd.#> 2653 309 W. Main St., Ste. 110#> 2658 751 Otay Lakes Rd.#> 2659 1395 Hilltop Dr.#> 2660 820 Fourth Ave.#> 2661 555 Claire Ave.#> 2662 505 Elm Ave.#> 2663 1685 Hollister St.#> 2664 2900 Highland Ave.#> 2665 1385 Third Ave.#> 2666 5353 Airway Rd.#> 2667 1925 Magdalena Ave.#> 2672 351 N. K St.#> 2673 755 E. Tulare Ave.#> 2674 824 W. Maple Ave.#> 2675 3442 E. Bardsley Ave.#> 2676 850 Main St.#> 2680 1031 N. State St.#> 2684 100 N. Rancho Sante Fe Rd.#> 2687 2 N. Catalina St.#> 2689 1717 N. McAuliff Rd.#> 2690 900 S. Conyer St.#> 2691 1001 W. Main St.#> 2692 5100 W. Whitendale Ave.#> 2693 1821 W. Meadow Ln.#> 2695 2000 N. Santa Fe Ave.#> 2696 1 Panther Dr.#> 2697 1601 Longhorn Dr.#> 2698 1126 N. Melrose Dr., Ste. 301#> 2701 1900 Seventh St.#> 2702 1504 Fallbrook St.#> 2703 2695 S. Valentine Ave.#> 2707 1901 E. Shields Ave., Ste. 169#> 2712 12417 E. Philadelphia St.#> 2716 27051 Robert C. Lee Pkwy.#> 2718 26334 Citrus St.#> 2721 222 11th St.#> 2726 101 Grant Ave.#> 2727 21 N. W. St.#> 2734 4123 Lakeside Dr.#> 2735 1901 E. Shields Ave., Ste. 130#> 2736 1207 S. Trinity St.#> 2740 155 W. Washington Blvd.#> 2741 155 W. Washington Blvd.#> 2742 2824 S. Main St.#> 2743 2824 S. Main St.#> 2745 1942 Randolph St.#> 2746 1025 S. Madera Ave.#> 2747 28123 Ave. 14#> 2748 1850 Wardrobe Ave., Bldg. H#> 2749 901 Blanco Cir.#> 2750 9845 Horn Rd., Ste. 150#> 2753 410 W. J St., Ste. A#> 2756 13730 Perris Blvd.#> 2757 3939 13th St.#> 2758 4800 Magnolia Ave.#> 2760 2829 Transworld Dr.#> 2761 2707 Transworld Dr.#> 2762 3100 Monte Diablo Ave.#> 2763 1881 Cunningham Ave.#> 2764 14271 Story Rd.#> 2767 3920 Bluebird St.#> 2769 1735 E. Houston Ave.#> 2771 401 S. A St., Ste. 3#> 2775 2125 Jefferson Ave.#> 2778 1548 E. Walnut Ave.#> 2780 3801 W. Temple Ave.#> 2781 115 N. Ave. 53#> 2782 625 Coleman Ave.#> 2783 6957 N. Figueroa St.#> 2784 6052 Juanche St.#> 2785 2000 San Juan Hwy.

89 / 106

Page 90: Investigating objects and data patterns using base R

Subset function, examples

Combine subset() with assignment ( <- ) to create a new data frame

Create a new date frame of all CA public high schools that are at least 50% LatinxAND only keep variables name and address

df_school_v2 <- subset(df_school, school_type == "public" & pct_hispanic >= 50& state_code == "CA", select = c(name, address))

head(df_school_v2, n=5)#> name address#> 1254 Tustin High 1171 El Camino Real#> 1301 Bell Gardens High 6119 Agra St.#> 1309 Santa Ana High 520 W. Walnut#> 1332 Warren High 8141 De Palma St.#> 1336 Hollywood Senior High 1521 N. Highland Ave.

nrow(df_school_v2)#> [1] 713

90 / 106

Page 91: Investigating objects and data patterns using base R

Student Exercises

Using subset() from base R:

1. Create a new dataframe by extracting the columns instnm , event_date ,event_type from df_event data frame. And show what columns (variables)

are in the newly created dataframe.2. Create a new dataframe from the df_school data frame that includes

out-of-state public high schools with 50%+ Latinx student enrollment thatreceived at least one visit by the University of California Berkeley (var=visits_by_110635). And count the number of observations.

3. Count the number of public schools from CA, FL or MA that received one or twovisits from UC Berkeley from the df_school data frame.

4. Subset all public out-of-state high schools visited by University of CaliforniaBerkeley that enroll at least 50% Black students, and only keep variablesstate_code , name and zip_code .

91 / 106

Page 92: Investigating objects and data patterns using base R

Solution to Student Exercises

Solution to 1

df_event_br <- subset(df_event, select=c(instnm, event_date, event_type))names(df_event_br)#> [1] "instnm" "event_date" "event_type"

Solution to 2

df_school_br <- subset(df_school, state_code != "CA" & school_type == "public"& pct_hispanic >= 50 & visits_by_110635 >=1 )

nrow(df_school_br)#> [1] 10

Solution to 3

nrow(subset(df_school, state_code %in% c("CA", "FL", "MA")& school_type == "public" & visits_by_110635 %in% c(1,2) ))

#> [1] 246

92 / 106

Page 93: Investigating objects and data patterns using base R

Solution to Student Exercises

Solution to 4

subset(df_school, school_type == "public" & state_code != "CA"& visits_by_110635 >= 1 & pct_black >= 50,select = c(state_code, name, zip_code))

#> state_code name zip_code#> 4523 GA Grady High School 30309#> 8519 MD Frederick Douglass High 20772#> 9666 MN DOWNTOWN CAMPUS 55403#> 10681 MS MURRAH HIGH SCHOOL 39202#> 14569 OH Shaker Hts High School 44120#> 14632 OH Cleveland Heights High School 44118#> 17142 SC Spring Valley High 29229#> 17143 SC Richland Northeast High 29223#> 17623 TN Soulsville Charter School 38106#> 17624 TN KIPP Memphis Collegiate High School 38108

93 / 106

Page 94: Investigating objects and data patterns using base R

4 Creating variables

94 / 106

Page 95: Investigating objects and data patterns using base R

Create new data frame based on df_school_allData frame df_school_all has one obs per US high school and then variablesidentifying number of visits by particular universities

load(url("https://github.com/ozanj/rclass/raw/master/data/recruiting/recruit_school_allvars.RData"))names(df_school_all)#> [1] "state_code" "school_type" "ncessch"#> [4] "name" "address" "city"#> [7] "zip_code" "pct_white" "pct_black"#> [10] "pct_hispanic" "pct_asian" "pct_amerindian"#> [13] "pct_other" "num_fr_lunch" "total_students"#> [16] "num_took_math" "num_prof_math" "num_took_rla"#> [19] "num_prof_rla" "avgmedian_inc_2564" "latitude"#> [22] "longitude" "visits_by_196097" "visits_by_186380"#> [25] "visits_by_215293" "visits_by_201885" "visits_by_181464"#> [28] "visits_by_139959" "visits_by_218663" "visits_by_100751"#> [31] "visits_by_199193" "visits_by_110635" "visits_by_110653"#> [34] "visits_by_126614" "visits_by_155317" "visits_by_106397"#> [37] "visits_by_149222" "visits_by_166629" "total_visits"#> [40] "inst_196097" "inst_186380" "inst_215293"#> [43] "inst_201885" "inst_181464" "inst_139959"#> [46] "inst_218663" "inst_100751" "inst_199193"#> [49] "inst_110635" "inst_110653" "inst_126614"#> [52] "inst_155317" "inst_106397" "inst_149222"#> [55] "inst_166629"

95 / 106

Page 96: Investigating objects and data patterns using base R

Create new data frame based on df_school_allCreate new version of data frame, called school_v2 , which we’ll use to introducehow to create new variables

library(tidyverse) # below code use tidyverse functions and pipe operator#> -- Attaching packages --------------------------------------- tidyverse 1.3.1 --#> v ggplot2 3.3.5 v purrr 0.3.4#> v tibble 3.1.3 v dplyr 1.0.7#> v tidyr 1.1.3 v stringr 1.4.0#> v readr 2.0.0 v forcats 0.5.1#> -- Conflicts ------------------------------------------ tidyverse_conflicts() --#> x dplyr::filter() masks stats::filter()#> x dplyr::lag() masks stats::lag()school_v2 <- df_school_all %>%

select(-contains("inst_")) %>% # remove vars that start with "inst_"rename( # rename selected variables

visits_by_berkeley = visits_by_110635,visits_by_boulder = visits_by_126614,visits_by_bama = visits_by_100751,visits_by_stonybrook = visits_by_196097,visits_by_rutgers = visits_by_186380,visits_by_pitt = visits_by_215293,visits_by_cinci = visits_by_201885,visits_by_nebraska = visits_by_181464,visits_by_georgia = visits_by_139959,visits_by_scarolina = visits_by_218663,visits_by_ncstate = visits_by_199193,visits_by_irvine = visits_by_110653,visits_by_kansas = visits_by_155317,visits_by_arkansas = visits_by_106397,visits_by_sillinois = visits_by_149222,visits_by_umass = visits_by_166629,num_took_read = num_took_rla,num_prof_read = num_prof_rla,med_inc = avgmedian_inc_2564

)

glimpse(school_v2)

96 / 106

Page 97: Investigating objects and data patterns using base R

Base R approach to creating new variablesCreate new variables using assignment operator <- and subsetting operators [] and$ to create new variables and set conditions of the input variables

Pseudo syntax: df$newvar <- ...

▶ where ... argument is expression(s)/calculation(s) used to create new variables▶ expressions can include subsetting operators and/or other base R functions

Task: Create measure of percent of students on free-reduced lunchbase R approach

school_v2_temp<- school_v2 #create copy of dataset; not necessaryschool_v2_temp$pct_fr_lunch <-

school_v2_temp$num_fr_lunch/school_v2_temp$total_students

#investigate variable you createdstr(school_v2_temp$pct_fr_lunch)#> num [1:21301] 0.723 1 0.967 0.93 1 ...school_v2_temp$pct_fr_lunch[1:5] # print first 5 obs#> [1] 0.7225549 1.0000000 0.9666667 0.9303483 1.0000000

tidyverse approach (with pipes)

school_v2_temp <- school_v2 %>%mutate(pct_fr_lunch = num_fr_lunch/total_students)

97 / 106

Page 98: Investigating objects and data patterns using base R

Base R approach to creating new variables

If creating new variable based on the condition/values of input variables, basically thetidyverse equivalent of mutate() with if_else() or recode()

▶ Pseudo syntax: df$newvar[logical condition]<- new value▶ logical condition : a condition that evaluates to TRUE or FALSE

98 / 106

Page 99: Investigating objects and data patterns using base R

Base R approach to creating new variablesTask: Create 0/1 indicator if school has median income greater than $100ktidyverse approach (using pipes)

school_v2_temp %>% select(med_inc) %>%mutate(inc_gt_100k= if_else(med_inc>100000,1,0)) %>%count(inc_gt_100k) # note how NA values of med_inc treated

#> # A tibble: 3 x 2#> inc_gt_100k n#> <dbl> <int>#> 1 0 18632#> 2 1 2045#> 3 NA 624

Base R approach

school_v2_temp$inc_gt_100k<-NA #initialize an empty column with NAs# otherwise you'll get warning

school_v2_temp$inc_gt_100k[school_v2_temp$med_inc>100000] <- 1school_v2_temp$inc_gt_100k[school_v2_temp$med_inc<=100000] <- 0count(school_v2_temp, inc_gt_100k)#> # A tibble: 3 x 2#> inc_gt_100k n#> <dbl> <int>#> 1 0 18632#> 2 1 2045#> 3 NA 624

99 / 106

Page 100: Investigating objects and data patterns using base R

Creating variablesTask: Using data frame wwlist and input vars state and firstgen , create a4-category var with following categories:

▶ “instate_firstgen”; “instate_nonfirstgen”; “outstate_firstgen”;“outstate_nonfirstgen”

tidyverse approach (using pipes)

load(url("https://github.com/ozanj/rclass/raw/master/data/prospect_list/wwlist_merged.RData"))wwlist_temp <- wwlist %>%

mutate(state_gen = case_when(state == "WA" & firstgen =="Y" ~ "instate_firstgen",state == "WA" & firstgen =="N" ~ "instate_nonfirstgen",state != "WA" & firstgen =="Y" ~ "outstate_firstgen",state != "WA" & firstgen =="N" ~ "outstate_nonfirstgen")

)str(wwlist_temp$state_gen)#> chr [1:268396] NA "instate_nonfirstgen" "instate_nonfirstgen" ...wwlist_temp %>% count(state_gen)#> # A tibble: 5 x 2#> state_gen n#> <chr> <int>#> 1 instate_firstgen 32428#> 2 instate_nonfirstgen 58646#> 3 outstate_firstgen 32606#> 4 outstate_nonfirstgen 134616#> 5 <NA> 10100

100 / 106

Page 101: Investigating objects and data patterns using base R

Base R approach to creating new variablesTask: Using wwlist and input vars state and firstgen , create a 4-category var

base R approach

wwlist_temp <- wwlist

wwlist_temp$state_gen <- NAwwlist_temp$state_gen[wwlist_temp$state == "WA"

& wwlist_temp$firstgen =="Y"] <- "instate_firstgen"wwlist_temp$state_gen[wwlist_temp$state == "WA"

& wwlist_temp$firstgen =="N"] <- "instate_nonfirstgen"wwlist_temp$state_gen[wwlist_temp$state != "WA"

& wwlist_temp$firstgen =="Y"] <- "outstate_firstgen"wwlist_temp$state_gen[wwlist_temp$state != "WA"

& wwlist_temp$firstgen =="N"] <- "outstate_nonfirstgen"

str(wwlist_temp$state_gen)#> chr [1:268396] NA "instate_nonfirstgen" "instate_nonfirstgen" ...count(wwlist_temp, state_gen)#> # A tibble: 5 x 2#> state_gen n#> <chr> <int>#> 1 instate_firstgen 32428#> 2 instate_nonfirstgen 58646#> 3 outstate_firstgen 32606#> 4 outstate_nonfirstgen 134616#> 5 <NA> 10100 101 / 106

Page 102: Investigating objects and data patterns using base R

5 Appendix

102 / 106

Page 103: Investigating objects and data patterns using base R

5.1 Sorting data

103 / 106

Page 104: Investigating objects and data patterns using base R

Base R sort() for vectors

sort() is a base R function that sorts vectors

Syntax: sort(x, decreasing=FALSE, ...)

▶ where x is object being sorted▶ By default it sorts in ascending order (low to high)▶ Need to set decreasing argument to TRUE to sort from high to low

#?sort()x<- c(31, 5, 8, 2, 25)sort(x)#> [1] 2 5 8 25 31sort(x, decreasing = TRUE)#> [1] 31 25 8 5 2

104 / 106

Page 105: Investigating objects and data patterns using base R

Base R order() for dataframesorder() is a base R function that sorts vectors

▶ Syntax: order(..., na.last = TRUE, decreasing = FALSE)▶ where ... are variable(s) to sort by▶ By default it sorts in ascending order (low to high)▶ Need to set decreasing argument to TRUE to sort from high to low

Descending argument only works when we want either one (and only) variabledescending or all variables descending (when sorting by multiple vars)

▶ use - when you want to indicate which variables are descending while using thedefault ascending sorting

df_event[order(df_event$event_date), ]df_event[order(df_event$event_date, df_event$total_12), ]

#sort descending via argumentdf_event[order(df_event$event_date, decreasing = TRUE), ]df_event[order(df_event$event_date, df_event$total_12, decreasing = TRUE), ]

#sorting by both ascending and descending variablesdf_event[order(df_event$event_date, -df_event$total_12), ]

105 / 106

Page 106: Investigating objects and data patterns using base R

Example, sorting

▶ Create a new dataframe from df_events that sorts by ascending by event_date ,ascending event_state , and descending pop_total .

base R using order() function:

df_event_br1 <- df_event[order(df_event$event_date, df_event$event_state,-df_event$pop_total), ]

106 / 106