Integrating big data in the Belgian CPI
Ken Van Loon, Dorien Roels
Geneva, May 2018
http://statbel.fgov.be
Overview
Supermarket scanner data
  Current method
  Experimental results: multilateral methods & splicing options
Web scraping
  Footwear
  Second-hand cars
  Renting student rooms
  Hotel reservations
  Consumer electronics
Scanner data
Current methodology: “dynamic” method
In production since January 2015
Monthly chained Jevons index
Threshold: $\frac{s_m + s_{m-1}}{2} > \frac{1}{n \cdot \lambda}$, with $\lambda = 1.25$
Imputations
Dumping filters
Outlier filters
SKUs instead of GTIN
Linking relaunches
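The threshold rule and the monthly chained Jevons index can be illustrated with a minimal sketch. It assumes that $s_m$ is an item's expenditure share within its product group in month $m$ and that $n$ is the number of items in that group; the slide does not define either symbol. The function names, the toy data and the treatment of a missing share as zero are illustrative assumptions. Imputations, dumping filters and outlier filters are left out.

```python
import math

def shares(expenditure):
    """Turn expenditure per item into expenditure shares within the product group."""
    total = sum(expenditure.values())
    return {item: value / total for item, value in expenditure.items()}

def select_items(shares_prev, shares_curr, lam=1.25):
    """Keep items whose mean share over two adjacent months exceeds 1 / (n * lam).

    Assumption: n counts every item seen in either month, and a missing share is 0.
    """
    items = set(shares_prev) | set(shares_curr)
    cutoff = 1.0 / (len(items) * lam)
    return {
        item for item in items
        if (shares_prev.get(item, 0.0) + shares_curr.get(item, 0.0)) / 2 > cutoff
    }

def jevons(prices_prev, prices_curr, items):
    """Unweighted geometric mean of price relatives for items priced in both months."""
    matched = [i for i in items if i in prices_prev and i in prices_curr]
    log_sum = sum(math.log(prices_curr[i] / prices_prev[i]) for i in matched)
    return math.exp(log_sum / len(matched))

def chained_jevons(months, lam=1.25):
    """months: list of (prices, expenditure) dict pairs. First month = 100."""
    index = [100.0]
    for (p0, e0), (p1, e1) in zip(months, months[1:]):
        items = select_items(shares(e0), shares(e1), lam)
        index.append(index[-1] * jevons(p0, p1, items))
    return index

# Toy product group with three items (hypothetical data).
months = [
    ({"A": 2.0, "B": 4.0, "C": 1.0}, {"A": 50.0, "B": 45.0, "C": 5.0}),
    ({"A": 2.2, "B": 4.0, "C": 0.5}, {"A": 55.0, "B": 40.0, "C": 5.0}),
    ({"A": 2.2, "B": 4.4, "C": 0.5}, {"A": 50.0, "B": 45.0, "C": 5.0}),
]
series = chained_jevons(months)
print(series)
```

In this toy data item C has a 5% share, below the cutoff of 1 / (3 × 1.25) ≈ 26.7%, so its price halving does not enter the index.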
Goal: switch to a multilateral method in 2020
Ongoing research: first results
13-month window length when splicing

| Multilateral methods | Splicing & extension methods |
| --- | --- |
| GEKS-Törnqvist | Movement Splice |
| Time Product Dummy | Window Splice |
| Geary-Khamis | Half Splice |
| Augmented Lehr | Mean Splice |
| | Fixed Base Monthly Expanding Window (FBEW) |
| | Fixed Base Moving Window (FBMW) |
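As an illustration of the first method in the table, the sketch below computes a GEKS-Törnqvist series over a single window using the standard textbook definition: the index between period 0 and period t is the geometric mean, over all link periods l in the window, of the bilateral Törnqvist index from 0 to l times the one from l to t. The data layout (one price dict and one quantity dict per period) and the toy values are assumptions for illustration, not the Statbel implementation.

```python
import math

def tornqvist(p0, q0, p1, q1):
    """Bilateral Törnqvist index from period 0 to period 1 on matched items."""
    matched = [i for i in p0 if i in p1]
    v0 = sum(p0[i] * q0[i] for i in matched)
    v1 = sum(p1[i] * q1[i] for i in matched)
    log_index = 0.0
    for i in matched:
        s0 = p0[i] * q0[i] / v0  # expenditure share in period 0
        s1 = p1[i] * q1[i] / v1  # expenditure share in period 1
        log_index += 0.5 * (s0 + s1) * math.log(p1[i] / p0[i])
    return math.exp(log_index)

def geks_tornqvist(prices, quantities):
    """GEKS-Törnqvist index series over one window, first period = 1.0."""
    n_periods = len(prices)
    series = []
    for t in range(n_periods):
        log_sum = 0.0
        for l in range(n_periods):
            p_0l = tornqvist(prices[0], quantities[0], prices[l], quantities[l])
            p_lt = tornqvist(prices[l], quantities[l], prices[t], quantities[t])
            log_sum += math.log(p_0l * p_lt)
        series.append(math.exp(log_sum / n_periods))
    return series

# Hypothetical three-period window with two items.
prices = [{"A": 1.0, "B": 2.0}, {"A": 1.1, "B": 1.9}, {"A": 1.2, "B": 2.1}]
quantities = [{"A": 10, "B": 5}, {"A": 8, "B": 7}, {"A": 9, "B": 6}]
print(geks_tornqvist(prices, quantities))
```

With a two-period window the GEKS series reduces to the bilateral Törnqvist index, which is a convenient sanity check.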
Dataset for testing multilateral methods
One retailer: period of 37 months
COICOPs:
01.1 Food (excl. seasonal products)
01.2 Non-alcoholic beverages
02.1 Alcoholic beverages
12.1.3.2 Articles for personal hygiene & beauty products
Around 480 product groups in total
Incl. extra relaunch linking
Relaunches
Comparison of multilateral methods (full window)
Results with splicing & extension methods (rolling window = 13 months)
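The splicing options differ only in which overlapping period links the newly computed window to the series already published. A minimal sketch, assuming the commonly used definitions of the movement, window, half and mean splice: the old window covers periods t-W to t-1, the new window covers t-W+1 to t, and both are index series from the same multilateral method. The function name and arguments are hypothetical, and the fixed base variants (FBEW, FBMW) are not covered.

```python
import math

def splice(published, old_window, new_window, method="movement"):
    """Extend the published series by one period.

    old_window: index series on periods t-W .. t-1
    new_window: index series on periods t-W+1 .. t
    Overlap: new_window[k] and old_window[k + 1] refer to the same period.
    """
    width = len(new_window)

    def link(k):
        # Change from overlap period k to t in the new window, divided by
        # the change from the same period to t-1 in the old window.
        return (new_window[-1] / new_window[k]) / (old_window[-1] / old_window[k + 1])

    if method == "movement":    # link on t-1
        factor = link(width - 2)
    elif method == "window":    # link on t-W+1
        factor = link(0)
    elif method == "half":      # link on the middle of the window
        factor = link((width - 1) // 2)
    elif method == "mean":      # geometric mean over all overlapping periods
        logs = [math.log(link(k)) for k in range(width - 1)]
        factor = math.exp(sum(logs) / len(logs))
    else:
        raise ValueError(f"unknown splicing method: {method}")
    return published + [published[-1] * factor]

# Hypothetical example with a window of 4 periods.
old = [1.00, 1.02, 1.05, 1.04]
new = [1.00, 1.03, 1.02, 1.06]
published = [100.0, 102.0, 105.0, 104.0]
for name in ("movement", "window", "half", "mean"):
    print(name, splice(published, old, new, name)[-1])
```

When both windows agree on the price changes inside the overlap, all four options return the same value; they diverge only to the extent that the new window revises earlier movements.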
Dumping impact: TPD/GK minus GEKS
Web scraping
Since 2014
R (rvest, RSelenium)
CPI: international train travel, videogames, …
About 70 scripts
Run daily or several times a week
Scraped product groups and services:
  Clothing
  Footwear
  Hotel reservations
  Airfares
  International train travel
  Second-hand cars
  Consumer electronics
  Drugstores
  Books
  Videogames
  DVD & Blu-ray discs
  Supermarkets
  Student rooms
  …
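The slides name R with rvest and RSelenium as the scraping tools. As a language-neutral illustration of what one such script does, here is a small Python sketch using only the standard library. The HTML snippet, the class names `name` and `price`, and the price format are hypothetical; a real script would fetch the page from a retailer's site and follow that site's actual markup.

```python
from html.parser import HTMLParser

class PriceParser(HTMLParser):
    """Collect (name, price) pairs from <span class="name"> and <span class="price">."""

    def __init__(self):
        super().__init__()
        self._field = None    # which field the next text node belongs to
        self._current = {}    # fields gathered for the offer being read
        self.offers = []

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "span" and css_class in ("name", "price"):
            self._field = css_class

    def handle_data(self, data):
        if self._field is None:
            return
        self._current[self._field] = data.strip()
        self._field = None
        if {"name", "price"} <= self._current.keys():
            # Convert a price such as "€ 59,95" to a float.
            raw = self._current["price"].replace("€", "").replace(",", ".").strip()
            self.offers.append((self._current["name"], float(raw)))
            self._current = {}

# Hypothetical page fragment standing in for a downloaded product listing.
SAMPLE_HTML = """
<div class="product"><span class="name">Sneaker X</span><span class="price">€ 59,95</span></div>
<div class="product"><span class="name">Boot Y</span><span class="price">€ 120,00</span></div>
"""

parser = PriceParser()
parser.feed(SAMPLE_HTML)
print(parser.offers)
```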
Web scraping – Footwear
Websites of the largest footwear retailers
Feature of this market:
  Products leave the market at a significantly lower price → downward drift
Non-matched model with stratification
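One possible reading of a non-matched model with stratification is sketched below: individual products are not followed over time, and instead the average price within each stratum is compared between two months. The slide does not give the formula, so the geometric mean within strata, the equal-weight geometric aggregation across strata, and the stratum labels are all assumptions made for illustration.

```python
import math
from collections import defaultdict

def stratum_means(offers):
    """offers: iterable of (stratum, price). Returns the geometric mean price per stratum."""
    logs = defaultdict(list)
    for stratum, price in offers:
        logs[stratum].append(math.log(price))
    return {s: math.exp(sum(v) / len(v)) for s, v in logs.items()}

def stratified_index(base_offers, current_offers):
    """Index (base = 100) from stratum-level average prices, without matching products."""
    base = stratum_means(base_offers)
    curr = stratum_means(current_offers)
    common = sorted(set(base) & set(curr))  # strata observed in both months
    log_relatives = [math.log(curr[s] / base[s]) for s in common]
    return 100.0 * math.exp(sum(log_relatives) / len(log_relatives))

# Hypothetical scraped offers: the products differ between the two months.
base_month = [("men/sneakers", 50.0), ("men/sneakers", 72.0), ("women/boots", 100.0)]
current_month = [("men/sneakers", 66.0), ("women/boots", 100.0)]
print(stratified_index(base_month, current_month))
```

Because products that leave the market at a reduced price are never matched against their own earlier price, this kind of comparison is less exposed to the downward drift that a matched chained index shows in this market.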
Web scraping – Second-hand cars
Not yet covered
Daily scraping
Time dummy hedonic method (incl. characteristics/depreciation)
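A time dummy hedonic index regresses the log price on period dummies and product characteristics; the exponentiated dummy coefficients give the quality-adjusted index. The sketch below pools all observations and solves the least-squares normal equations in plain Python. The single characteristic (age in years, standing in for depreciation), the data and the function names are hypothetical; the actual model specification is not given in the slides.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def time_dummy_hedonic(observations, n_periods):
    """observations: list of (period, price, characteristics). Period 0 = 100.

    Model: ln(price) = alpha + sum_t delta_t * D_t + beta . characteristics
    """
    rows, y = [], []
    for period, price, chars in observations:
        dummies = [1.0 if period == t else 0.0 for t in range(1, n_periods)]
        rows.append([1.0] + dummies + list(chars))
        y.append(math.log(price))
    k = len(rows[0])
    # Normal equations (X'X) b = X'y of ordinary least squares.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    coef = solve(xtx, xty)
    return [100.0] + [100.0 * math.exp(d) for d in coef[1:n_periods]]

# Hypothetical listings: (period, price in euro, [age in years]).
listings = [
    (0, 15000.0, [1.0]), (0, 12500.0, [3.0]), (0, 10000.0, [5.0]),
    (1, 14200.0, [2.0]), (1, 11800.0, [4.0]), (1, 9700.0, [6.0]),
]
print(time_dummy_hedonic(listings, n_periods=2))
```

The listings in the two periods are different cars, which is why a characteristics-based regression is used rather than matching identical items.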
Web scraping – Renting student rooms
Not yet covered
Aggregator sites are scraped:
  Price
  Room size
  Specific type of room
  Address → geocoding
Geocoding:
Resulting index:

| Year | City 1 | City 2 |
| --- | --- | --- |
| T | 100 | 100 |
| T+1 | 102.1 | 102.3 |

Current research:
  Price collection expanded to more cities
  More characteristics
Web scraping – Hotel reservations
Domestic destinations: Cities, Seaside, Ardennes
Virtual reservations:

| | Manual | Web scraping |
| --- | --- | --- |
| Frequency | 1x / month | Daily |
| Reservation | 4 weeks before arrival | 4 & 8 weeks before arrival |
| Characteristics | Fri – Sun, double room | Fri – Sun, incl. breakfast & free cancellation |
| Method | Sample of hotels | Stratification: destination, area, weeks booked before arrival date, hotel classification |
| Price | Per hotel | Per stratum |
Web scraping – Consumer electronics
Product characteristics
Time dummy hedonic method
Splicing methods compared with:
  Monthly matching Jevons index (MM)
  Monthly chaining & replenishment (MCR)
  Simulating current official method
Conclusion
Scanner data
  Dynamic method gives results close to the multilateral methods
  No large differences between splicing/extension options
  Lehr index deviates
Web scraping
  Footwear/Hotels: no large differences between classical price collection and web scraping
  Second-hand cars/Renting student rooms: index calculation possible
  Consumer electronics: use of hedonics
Thank you!