
Team 17: The Hipsters’ Troubling Telephone Game: A Causal Analysis of Gentrification Effects on the Number of 311 Noise Complaints Calls

Xing Liu, Alessandro Micheli, Harrison Zhu

October 25, 2020

1 Executive Summary

Hipsters are often associated with gentrification in New York City neighborhoods as well as with the rise of the sharing economy. We ask whether the effect of gentrification on total noise complaints in New York City can be measured using sharing-economy data.

We analyzed Airbnb short-term rentals, For-Hire Vehicle rides and demographic data on residents across New York City to identify potential drivers of gentrification. An exploratory data analysis showed that gentrification is mostly associated with changes in income, demographics, Airbnb short-term rentals and For-Hire Vehicle trips. Using a clustering algorithm, we grouped neighborhoods into 3 groups of varying gentrification levels. Going beyond simple association, we used a controlled experimental setup based on these levels to estimate the effect of gentrification on 311 noise complaint call volumes.

By estimating the Conditional Average Treatment Effect (CATE), we find that the large call volumes exhibited by several neighbourhoods in Brooklyn are associated with gentrification, in accordance with what is reported in news media and academic studies. On the other hand, our findings corroborate the hypothesis that the high number of 311 noise complaint calls registered in Manhattan’s gentrifying neighbourhoods could be related to factors different from those behind gentrification.

2 Introduction and Exploratory Data Analysis

The displacement of lower-class residents and the influx of more highly educated, wealthier individuals have disrupted the socio-political landscape of many American metropolitan areas. The corresponding effects and causes fall under the name of “gentrification”. Even though gentrification is routinely documented by news sources (see for example [1; 2; 3]), the forces fueling this phenomenon are still much debated among academics as well as media pundits. A popular news article in BuzzFeed [1] noted the growing relationship between gentrification, 311 complaints and subsequent police interventions. Its findings corroborate the association between the influx of Caucasian residents and complaints targeted at the original, often minority, residents. New York City (NYC) governmental agencies recorded a staggering 44 million customer interactions in 2018 [4], several times the NYC population. The top categories remain unchanged from 2017, with noise complaints being the number one cause of service requests.

Part of the most recent research has focused on the phenomenon of “Hipster gentrification”: displacements generated by a 21st-century alternative subculture that fuels, for example, the rise of coffee shops and vintage clothing stores in many metropolises around the world. The media often associate this subculture with the rise of the sharing economy, such as Airbnb and FHVs (For-Hire Vehicles)¹ (see for example the articles [5; 6; 7]). In the United States, this subculture is often associated with upper-middle-class Caucasian young adults [8].

In this report, we use Airbnb rental locations, FHV rides, US Census and 311 data to examine the leading factors behind noise complaint calls in NYC. Appendix A contains further details regarding the data used in this study.

Figure 1 shows the total number of 311 noise complaint calls in the NYC area in 2018. Brooklyn and Manhattan present some of the highest call volumes. Our results are compatible with what was already observed by [9]. Such noise calls are often attributed to the ongoing gentrification taking place in areas such as Brooklyn [10].

Figure 2 compares the absolute change in Airbnb rentals with the relative change in the number of educated residents (Bachelor’s or higher degrees) for each census tract of the NYC area from 2010 to 2018. The data is ranked in ascending order and then rescaled between -1 and +1 for comparison. The tracts on the West Side of Manhattan as well as in “gentrified” Brooklyn both present an increase in the number of educated residents and in Airbnb rentals. Other tracts showing the same agreement are scattered sparsely throughout the map.
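The rank-then-rescale step described above can be sketched as follows. This is a minimal illustration, not the code used for Figure 2: the function name `rank_rescale`, the average-rank treatment of ties and the sample numbers are all assumptions.

```python
def rank_rescale(values):
    """Rank values in ascending order, then map the ranks linearly onto [-1, 1].

    Ties receive the average of the ranks they span (an assumption; the
    report does not say how ties were handled).
    """
    n = len(values)
    if n == 0:
        return []
    if n == 1:
        return [0.0]
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    start = 0
    while start < n:
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < n and values[order[end + 1]] == values[order[start]]:
            end += 1
        avg_rank = (start + end) / 2.0  # zero-based average rank of the run
        for pos in range(start, end + 1):
            ranks[order[pos]] = avg_rank
        start = end + 1
    # Zero-based ranks lie in [0, n - 1]; map them onto [-1, 1].
    return [2.0 * r / (n - 1) - 1.0 for r in ranks]


# Hypothetical per-tract changes in Airbnb rentals (illustrative numbers only).
airbnb_change = [12, -3, 40, 0, 7]
scaled = rank_rescale(airbnb_change)
```

Because only the ranks enter the rescaling, two variables with very different units (an absolute change and a relative change) end up on the same scale, which is what makes the two panels comparable.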

Figure 1: A heat map of the total number of 311 noise complaint calls in 2018, with Manhattan and Brooklyn labelled on the map. The colour scale (“Total Number of Noise Complaints in 2018”) is normalized from -1 to 1, corresponding to the lowest and highest call volumes, respectively.

Lastly, Table 1 shows the Pearson correlation coefficients between the absolute change in FHV rides and (i) the absolute change in Caucasian residents, and (ii) the absolute change in educated residents in the boroughs of Manhattan and Brooklyn. Specifically, for each NYC taxi zone we compute the absolute change in the number of FHV rides from 2014 to 2018 and correlate it with the absolute change in Caucasian residents across all the tracts within each taxi zone. A similar procedure is carried out for the change in educated residents. Table 1 shows statistically significant, though modest, correlations between the change in FHV rides and the change in Caucasian residents as well as educated residents, therefore supporting the existence of an association between the variables.
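The per-zone procedure can be sketched as below. This is a hedged illustration: the zone and tract identifiers, the numbers, and the choice to sum tract-level changes within a zone are assumptions, and the helper names are hypothetical. A p-value such as the one in Table 1 would typically come from a library routine such as `scipy.stats.pearsonr`, which is not used here so that the sketch depends only on the standard library.

```python
import math
from collections import defaultdict


def aggregate_by_zone(tract_changes, tract_to_zone):
    """Sum tract-level changes over all tracts that fall inside each taxi zone."""
    totals = defaultdict(float)
    for tract, change in tract_changes.items():
        totals[tract_to_zone[tract]] += change
    return dict(totals)


def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical inputs: ride counts per zone and resident changes per tract.
rides_2014 = {"Z1": 100, "Z2": 250, "Z3": 80}
rides_2018 = {"Z1": 400, "Z2": 300, "Z3": 500}
tract_to_zone = {"T1": "Z1", "T2": "Z1", "T3": "Z2", "T4": "Z3"}
resident_change = {"T1": 30, "T2": 20, "T3": 5, "T4": 90}

zones = sorted(rides_2014)
fhv_change = [rides_2018[z] - rides_2014[z] for z in zones]
zone_resident_change = aggregate_by_zone(resident_change, tract_to_zone)
r = pearson_r(fhv_change, [zone_resident_change[z] for z in zones])
```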

However, cause-effect relationships should be interpreted with care, especially with respect to racial demographics, as the corresponding effects are often obscured by strong confounders. In the next section, we describe a methodological setup for uncovering the causal effects of gentrification on 311 call volumes.

¹ For-Hire Vehicle providers include, for example, Uber and Lyft.


Pearson Correlations - Change in FHV Rides

              Educated Residents   Caucasian Residents
Correlation   0.23                 0.22
p-value       0.01                 0.01

Table 1: Pearson correlation between the change in FHV rides and the change in educated residents (“Educated Residents”) and the change in Caucasian residents (“Caucasian Residents”) from 2014 to 2018. The census and ride data considered are those of the boroughs of Manhattan and Brooklyn.

Figure 2: Left Panel (“Airbnb in NYC”): Absolute change of Airbnb short-term rentals from 2010 to 2018. The scale is normalized from -1 to 1, corresponding to the smallest and highest absolute changes, respectively. Right Panel (“Educated Residents in NYC”): Relative change in the number of educated residents (Bachelor’s or higher degrees) from 2010 to 2018. The scale is normalized from -1 to 1, corresponding to the smallest and highest relative changes, respectively. Manhattan and Brooklyn are labelled on both maps.

3 Methodology

Denote by $\mathcal{X}$ the covariate space and by $\mathbb{N}$ the response space. Capital letters indicate random variables and lowercase letters their samples. We want to conduct a causal analysis of the effect of gentrification on call volumes based on $K$ gentrification groups, which we call treatment groups. We work with data at taxi zone resolution (130 zones), obtained by aggregating census-tract-level data (around 1315 tracts). We consider call volumes $Y(Z)$, with $Z$ being the treatment group. Before we proceed with the model setup, we need to make the following assumptions:

• Unconfoundedness: This means the conditional independence $(Y(1), \dots, Y(K)) \perp\!\!\!\perp Z \mid X$. The assignment of the treatment group depends only on the covariates considered and not on the call volumes. Indeed, as described in many social science articles on gentrification and calls, it is the underlying covariates $X$, such as changes in income and demographics, that are directly associated with them.

• Common Support: Each zone, conditioned on its covariates $x$, has a non-zero probability of being assigned to any other treatment group. This is a realistic assumption, as zones are in general susceptible to changes in gentrification level.

• Stable Unit Treatment Value Assumption (SUTVA): We assume that one zone’s gentrification assignment does not interfere with the call volumes of another zone. This may not hold strictly in reality for tracts, as noise complaints about one tract could be made from nearby or adjacent tracts. We could not easily examine this hypothesis due to data scarcity, but since only the boundaries would be affected, we assume that this interference is insignificant for our model. We refer the reader to [11] for methodology on treatment effects based on boundaries. Moreover, having aggregated census tracts into zones of larger area alleviates, to some extent, this problem of interference between nearby zones.

We propose the standard Poisson observation model for 311 calls:

$$y \sim \mathrm{Poisson}(\exp(f(x, z))),$$

for some $f \in \mathcal{H} \subseteq \{f : \mathcal{X} \times \{1, \dots, K\} \to \mathbb{R}\}$, where $x \in \mathcal{X}$ and $z$ indicates the treatment group. A Negative-Binomial distribution would allow more flexibility, but for simplicity we prefer a Poisson distribution. To specify the function class $\mathcal{H}$, we experimented with linear models and with placing a Gaussian Process (GP) prior on $f$, opting for the latter as it gave a lower loss, in terms of a sum of statistical estimation and model approximation losses.
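To make the observation model concrete, the sketch below evaluates the Poisson log-likelihood of observed counts given latent values $f$ under the exponential link. It is an illustration only; the function name and the numbers are hypothetical and not taken from the report.

```python
import math


def poisson_log_likelihood(counts, latent_f):
    """Log-likelihood of counts y_i ~ Poisson(exp(f_i)) for latent values f_i."""
    total = 0.0
    for y, f in zip(counts, latent_f):
        # log p(y | f) = y * f - exp(f) - log(y!)
        total += y * f - math.exp(f) - math.lgamma(y + 1)
    return total


# Hypothetical call counts for three zones and two candidate latent vectors.
calls = [3, 0, 7]
good_fit = [math.log(3.0), math.log(0.5), math.log(7.0)]
poor_fit = [0.0, 0.0, 0.0]
ll_good = poisson_log_likelihood(calls, good_fit)
ll_poor = poisson_log_likelihood(calls, poor_fit)
```

Latent values whose exponentials sit close to the observed counts receive a higher log-likelihood, which is the quantity a fitted $f$ is rewarded for.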

With $K$ treatment groups, we propose the structure

$$f(x, z) = g(x) + \eta(x, z) = g(x) + \sum_{k=1}^{K} \eta_k(x)\,\delta_{z,k},$$

where $g$ is a function, $\delta_{z,k} = 1$ if $z = k$ and $0$ otherwise, and $\eta_k : \mathcal{X} \to \mathbb{R}$ is the intensity effect for group $k$. We opted for a GP with zero mean and kernel structure

$$k((x, z), \cdot) = k_{\mathrm{RBF}}(o, \cdot) + k_{\mathrm{Matern12}}(s, \cdot) + k_{\mathrm{Linear}}(z, \cdot),$$

where $x = (o, s)^T \in \mathcal{X}$, such that $o$ represents the non-geographical covariates and $s \in \mathbb{R}^2$ the spatial location. We use a radial basis function (RBF) kernel to model the dependence between covariates $o_i$ and a Matern12 kernel for spatial dependence. Indeed, as the First Law of Geography states: “everything is related to everything else, but near things are more related than distant things”. The linear kernel for the treatment assignments specifies a linear effect on the intensity effects $\eta_k$, as GPs with linear kernels produce samples that are linear functions. In practice, we use a one-hot encoding of the treatment covariate under the S-Learner setup [12].
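The additive kernel can be written out directly, as in the sketch below. It is not the GPflow implementation used in the report: the unit lengthscales and variances are placeholder assumptions (in practice such hyperparameters would be learned), and all function names and the sample zones are hypothetical.

```python
import math


def k_rbf(o1, o2, lengthscale=1.0, variance=1.0):
    """RBF (squared-exponential) kernel on the non-geographical covariates."""
    sq = sum((a - b) ** 2 for a, b in zip(o1, o2))
    return variance * math.exp(-0.5 * sq / lengthscale ** 2)


def k_matern12(s1, s2, lengthscale=1.0, variance=1.0):
    """Matern-1/2 (exponential) kernel on the 2-D spatial coordinates."""
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(s1, s2)))
    return variance * math.exp(-dist / lengthscale)


def k_linear(z1, z2, variance=1.0):
    """Linear kernel on the one-hot encoded treatment group."""
    return variance * sum(a * b for a, b in zip(z1, z2))


def one_hot(group, num_groups=3):
    """Encode group k in {1, ..., K} as a length-K indicator vector."""
    return [1.0 if k == group else 0.0 for k in range(1, num_groups + 1)]


def k_total(p1, p2):
    """Sum of the three kernels; each point is a tuple (o, s, z)."""
    (o1, s1, z1), (o2, s2, z2) = p1, p2
    return (k_rbf(o1, o2) + k_matern12(s1, s2)
            + k_linear(one_hot(z1), one_hot(z2)))


# Two hypothetical zones: (covariates o, coordinates s, treatment group z).
zone_a = ([0.2, -1.0], [40.70, -73.95], 1)
zone_b = ([0.2, -1.0], [40.70, -73.95], 3)
same = k_total(zone_a, zone_a)   # all three kernels at their maximum
cross = k_total(zone_a, zone_b)  # linear term vanishes for different groups
```

With a one-hot encoding, the linear term contributes only when two points share the same treatment group, so the two otherwise identical zones above differ in covariance by exactly that term.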

Conditional Average Treatment Effect (CATE): The conditional average treatment effect (CATE) is defined as the function

$$\tau_{j,k}(x) := \mathbb{E}[Y(j) - Y(k) \mid X = x] = \exp(f(x, j)) - \exp(f(x, k))$$

for any $x \in \mathcal{X}$. Under our model, the CATE for treatment group $j$ compared with another group $k$ can be estimated as $\hat{\tau}_{j,k}(x) = \exp(\hat{f}(x, j)) - \exp(\hat{f}(x, k))$, where $\hat{f}$ is the posterior mean of the Gaussian process, which is an estimate of $f$. The CATE can be interpreted as the influence, or the direct effect, of being in gentrification group $j$ rather than in group $k$.
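The plug-in estimate can be sketched as follows. The function `toy_f_hat` is a hypothetical stand-in for the GP posterior mean, with made-up coefficients, used only to show how the same fitted function is evaluated under two treatment groups.

```python
import math


def cate(f_hat, x, j, k):
    """Plug-in CATE estimate exp(f_hat(x, j)) - exp(f_hat(x, k)) at covariates x."""
    return math.exp(f_hat(x, j)) - math.exp(f_hat(x, k))


def toy_f_hat(x, z):
    """Hypothetical stand-in for the posterior mean: g(x) + eta_z(x)."""
    g = 0.5 * x[0]  # baseline log-intensity shared by all groups
    eta = {1: 0.0, 2: 0.3, 3: 0.3 + 0.2 * x[1]}[z]  # group-specific effect
    return g + eta


x_zone = [2.0, 1.0]  # illustrative covariate vector for one taxi zone
tau_31 = cate(toy_f_hat, x_zone, 3, 1)
tau_11 = cate(toy_f_hat, x_zone, 1, 1)
```

Since the difference is taken on the intensity scale rather than the log scale, the estimate is expressed in expected calls, and comparing a group with itself gives zero by construction.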


Figure 3: Left Panel (“Clusters in Manhattan and Brooklyn”, showing Clusters 1 to 3): K-means clustering over the absolute change of FHV rides, educated residents and number of Airbnb rentals from 2014 to 2018 for each individual taxi zone. Right Panel (“CATE on counterfactual being Cluster 1”, colour scale from 0 to 2500): CATE on noise call volumes computed with the counterfactual being Cluster 1.

4 Results

We focus on the boroughs of Manhattan and Brooklyn, where the number of 311 noise complaint calls is the highest. In order to establish the K treatment groups defined in Section 3, we perform K-means clustering over the absolute change of FHV rides, educated residents, number of Airbnb rentals and average median income from 2014 to 2018, where each covariate is computed over each individual taxi zone. The choice of K = 3 was the most compatible with several results from the social sciences that we analyzed, and it minimises the geographic dispersion of the members of each cluster.
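A minimal sketch of this clustering step is given below, using plain Lloyd iterations with K = 3. Several choices are assumptions made for illustration rather than details from the report: the standardization of features, the deterministic farthest-first seeding, the helper names and the six hypothetical zones.

```python
def standardize(rows):
    """Scale each column to zero mean and unit variance (assumed preprocessing)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 for constant columns to avoid dividing by zero.
    stds = [(sum((v - m) ** 2 for v in c) / len(c)) ** 0.5 or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def farthest_first(rows, k):
    """Deterministic seeding: row 0, then repeatedly the row farthest from all centers."""
    centers = [list(rows[0])]
    while len(centers) < k:
        nxt = max(rows, key=lambda r: min(sq_dist(r, c) for c in centers))
        centers.append(list(nxt))
    return centers


def kmeans(rows, k=3, max_iters=100):
    """Lloyd's algorithm; returns one cluster label per row."""
    centers = farthest_first(rows, k)
    labels = None
    for _ in range(max_iters):
        new_labels = [min(range(k), key=lambda c: sq_dist(row, centers[c]))
                      for row in rows]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            if members:  # keep the old center if a cluster empties out
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical zones: change in FHV rides, educated residents, Airbnb rentals, income.
zone_changes = [
    [100, 10, 5, 1000], [120, 12, 6, 1200],            # slow change
    [900, 80, 40, 6000], [950, 85, 45, 6500],          # ongoing change
    [2500, 200, 120, 15000], [2600, 210, 130, 16000],  # rapid change
]
labels = kmeans(standardize(zone_changes), k=3)
```

The label values themselves are arbitrary; ordering the clusters by gentrification level, as done for Clusters 1 to 3, requires inspecting the cluster centers afterwards.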

Figure 3 shows the clusters overlaid on the NYC street map. Each cluster corresponds to a different degree of gentrification, in ascending order from 1 to 3. Cluster 1 contains areas with a slow rate of gentrification, such as those in the southern part of Brooklyn, or areas in which gentrification has already happened and which are therefore no longer gentrifying. Cluster 2 contains areas where gentrification is ongoing, for example Greenpoint in North Brooklyn. Finally, Cluster 3 corresponds to “super-gentrifying” areas, such as the neighbourhoods of Crown Heights and Park Slope (yellow blocks in Brooklyn) as well as Hell’s Kitchen in Manhattan. The results of our clustering are compatible with a wide range of news articles and research publications, including [13; 14].

To approximate the GP posterior, we used the inducing points method, fitting the approximation via stochastic gradient descent with the GPflow package [15]. For $\tau_{3,1}$ as an example, we adopt the ordinal procedure of setting the one-hot-encoded linear kernel input to (0, 1, 1) for the first term.

Several neighbourhoods in Manhattan belonging to either Cluster 2 or 3 present a very high number of 311 noise complaint calls whilst being on the lower end of the CATE, as displayed in Figure 3. This corroborates the hypothesis that factors other than Airbnb rentals, FHV rides and demographics could be at the origin of such sustained call volumes. On the other hand, the neighbourhoods of Crown Heights, Park Slope, Bedford-Stuyvesant and Bushwick in the borough of Brooklyn present some of the highest CATE values on the entire map. This finding supports the idea shared by many news sources [16; 17] that factors such as the rise of Airbnb rentals, the presence of hipsters [18] and other major demographic factors are behind the ongoing gentrification in those areas and the related number of 311 noise complaint calls.

Policies analogous to those adopted by the Mayor of London [19], which cap the number of days per year a property may be let short-term, could be put in place to mitigate the effect of gentrification in many of the Brooklyn neighbourhoods.

Related Works: Many researchers have used a similar causal inference setup to investigate the effect of gentrification on a range of aspects. [20] used a similar set of socio-economic features to partition Brisbane, Australia, into subareas according to their gentrification levels, and analyzed the influence of gentrification on neighbour complaints. [21] and [22], on the other hand, both compared the mental health of residents in gentrifying and non-gentrifying areas. [21] looked at the inverse-probability-weighted estimates for the average treatment effect of living in a gentrified area on psychological distress, whilst [22] adopted a matching technique, comparing the mental health of individuals who had similar social characteristics, to conclude that residents in gentrified areas showed more depression and anxiety symptoms.

Although there are both papers [23] and news articles [24] that analyze how Airbnb listings and FHV rides are correlated with gentrification, to the best of our knowledge, our approach of using a combination of these two factors has not been investigated in the literature.

Future Works: Possible future work could involve analyzing this phenomenon at the census tract level, given higher-resolution data on taxi rides and Airbnb rentals. Extra data, e.g. on the number of food and clothing franchises popular among hipsters, could also be included to expand the features. Another interesting question is whether the geographical borders could be included in the modelling to alleviate the possible violation of the Stable Unit Treatment Value Assumption (SUTVA), which is a key assumption for our causal model.

References

[1] L. T. Vo. They played dominoes outside their apartment for decades. Then the white people moved in and police started showing up. BuzzFeed News, 2018.

[2] L. Freeman and F. Braconi. Gentrification and displacement New York City in the 1990s. Journal of the American Planning Association, 70(1):39–52, 2004.

[3] L. Lees. Super-gentrification: the case of Brooklyn Heights, New York City. Urban Studies, 40(12):2487–2509, 2003.

[4] 311 sets new record with 44 million customer interactions in 2018. Available at: https://www1.nyc.gov/311/311-sets-new-record-in-2018.page. Accessed: 2020-10-24.

[5] When it comes to Airbnb, the hipsters are winning. The Independent, 2016.

[6] Airbnb lifestyle: The rise of the hipster nomad. Available at: https://techcrunch.com/2014/10/03/airbnb-lifestyle-the-rise-of-the-hipster-nomad/. Accessed: 2020-10-24.

[7] Taxi driver: Lyft is as bad for St. Louis workers as Walmart, fast food. Available at: https://www.riverfronttimes.com/newsblog/2014/04/21/taxi-driver-lyft-is-as-bad-for-st-louis-workers-as-walmart-fast-food. Accessed: 2020-10-24.

[8] M. Greif. What was the hipster? New York Mag, 2010.

[9] S. Weaver. Brooklyn is officially the noisiest borough in NYC. TimeOut, 2020.

[10] V. Yee. Gentrification in a Brooklyn neighborhood forces residents to move on. The New York Times, 2015.

[11] M. Rischard, Z. Branson, L. Miratrix, and L. Bornn. Do school districts affect NYC house prices? Identifying border differences using a Bayesian nonparametric approach to geographic regression discontinuity designs. Journal of the American Statistical Association, (just-accepted):1–35, 2020.

[12] J. L. Hill. Bayesian nonparametric modeling for causal inference. Journal of Computational and Graphical Statistics, 20(1):217–240, 2011.

[13] A. Jacobson. Rapid change in Hell’s Kitchen. The New York Times, 2015.

[14] L. Lees. Super-gentrification: the case of Brooklyn Heights, New York City. Urban Studies, 40(12):2487–2509, November 2003.

[15] A. G. De G. Matthews, M. Van Der Wilk, T. Nickson, K. Fujii, A. Boukouvalas, P. León-Villagrá, Z. Ghahramani, and J. Hensman. GPflow: A Gaussian process library using TensorFlow. The Journal of Machine Learning Research, 18(1):1299–1304, 2017.

[16] Crown Heights bar received 86 complaints in 2016, all from the same woman, Oct 2018. URL https://brooklyneagle.com/articles/2018/10/23/crown-heights-bar-received-86-complaints-in-2016-all-from-the-same-woman/.

[17] H. Halle. Noise complaints are way up now that New Yorkers are stuck at home. TimeOut, 2020.

[18] R. Carroll. Hipster-bashing in California: angry residents fight back against gentrification. The Guardian, 2017.

[19] Mayor of London. Mayor calls for registration system to enforce short-term letting law, 2019. Available at https://www.london.gov.uk/press-releases/mayoral/registration-system-for-short-term-letting-law. Accessed: 2020-10-24.

[20] L. Cheshire, R. Fitzgerald, and Y. Liu. Neighbourhood change and neighbour complaints: how gentrification and densification influence the prevalence of problems between neighbours. Urban Studies, 56(6):1093–1112, 2019. doi: 10.1177/0042098018771453. URL https://doi.org/10.1177/0042098018771453.

[21] L. D. Tran, T. H. Rice, P. M. Ong, S. Banerjee, J. Liou, and N. A. Ponce. Impact of gentrification on adult mental health. Health Services Research, 55(3):432–444, 2020.

[22] R. J. Smith, A. J. Lehning, and K. Kim. Aging in place in gentrifying neighborhoods: implications for physical and mental health. The Gerontologist, 58(1):26–35, 2018.

[23] D. Wachsmuth and A. Weisler. Airbnb and the rent gap: gentrification through the sharing economy. Environment and Planning A: Economy and Space, 02 2018. doi: 10.1177/0308518X18778038.

[24] B. Eldredge. Uber marketed for everyone, most used in gentrified neighborhoods, 2015. Available at https://www.brownstoner.com/brooklyn-life/uber-marketed-everyone-used-gentrified-neighborhoods/. Accessed: 2020-10-24.

[25] S. Gupta. Airbnb rental listings dataset mining. Towards Data Science, 2019.


[26] TLC trip record data. Available at: https://www1.nyc.gov/site/tlc/about/tlc-trip-record-data.page. Accessed: 2020-10-24.

[27] Uber pickups in New York City. Available at: https://www.kaggle.com/fivethirtyeight/uber-pickups-in-new-york-city. Accessed: 2020-10-24.

[28] United States Census Bureau data. Available at: https://www.census.gov/data.html. Accessed: 2020-10-24.

Appendix

A Complementary Data

We retrieved Airbnb data from [25]. This is part of a GitHub repository whose code was used to scrape short-term rentals from 2009 to 2018. The data for 2014 and 2018 contains about 35,000 Airbnb short-term rentals.

The FHV data for the year 2018 is retrievable from the NYC Data website [26], whilst for 2014 we used the Uber Pickups Kaggle dataset [27]. The data is too large for the size restriction of the competition, so we subsample 15% of it to stay within the requirements. FHV data includes many trips starting from the main NYC airports, namely JFK Airport, Newark Airport and LaGuardia Airport. Before subsampling, we removed all the trips starting at one of these airports. The number of FHV trips studied in this analysis is around 33 million.
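The two preprocessing steps, dropping airport pickups and then keeping a 15% random subsample, can be sketched as below. The record layout (a `pickup_zone` key), the zone labels, the fixed seed and the function name are assumptions for illustration, not details of the datasets or of the code used in the report.

```python
import random

# Assumed labels for the three airport pickup zones.
AIRPORT_ZONES = {"JFK Airport", "Newark Airport", "LaGuardia Airport"}


def filter_and_subsample(trips, fraction=0.15, seed=0):
    """Drop trips that start at an airport, then keep a random fraction of the rest."""
    non_airport = [t for t in trips if t["pickup_zone"] not in AIRPORT_ZONES]
    rng = random.Random(seed)  # fixed seed so the subsample is reproducible
    sample_size = round(fraction * len(non_airport))
    return rng.sample(non_airport, sample_size)


# Hypothetical trip records: 20 airport pickups and 100 other pickups.
trips = ([{"pickup_zone": "JFK Airport"}] * 20
         + [{"pickup_zone": "Greenpoint"}] * 100)
kept = filter_and_subsample(trips)
```

Filtering before sampling, as the text describes, means the 15% is taken over non-airport trips only, so airport traffic cannot crowd out the neighbourhood trips of interest.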

The education data used in Figure 2 was taken from the United States Census Bureau dataset [28], where we define an “educated” individual as one with a bachelor’s degree or higher.

All the data used in this report is public and free of charge.
