1
SDMX Connectors: using SDMX data in statistical
packages and tools (EXCEL, R, Matlab, SAS)
Gianpaolo Lopez
Attilio Mattiocco
Diana Nicoletti
Bank of Italy IT Support Unit for
Economic Research Department
SDMX Global Conference - OECD, Paris, 11-13 September 2013
2
Motivations for the “SDMX Connectors”
What do users want?
• To use the statistical tools they know (R, EXCEL, Matlab, SAS etc.) to analyse
data from different sources
• To discover the data they might want to use for their analysis
• To repeat the analysis they have developed in their tools, with updated data
[Diagram labels: Data Processing and Analysis; Data Dissemination; Data Provider; Data Capture]
What do users face?
• Data of interest is often replicated inside the organization
• Different external and internal data providers use different formats
• Manual steps are needed to get the data into the tool
• Frustration …
The first step of a typical data analysis process is the retrieval
of the data to be processed.
3
SDMX and Web Services greatly simplify data retrieval from external sources.
They also reduce the amount of external data that an organization needs to replicate in its
internal systems, without affecting the efficiency of data processing.
[Diagram labels: Data Process; Data Dissemination; SDMX Provider; SDMX Connector]
But…
• The SDMX standard and Web Services technology are quite complex.
• End users don’t want to cope with this kind of “IT complexity”.
The SDMX Connectors framework has been developed to hide this
complexity from the end user.
the “SDMX Connectors”
4
The Framework
[Diagram labels: internet/intranet; ECB; OECD; ISTAT; BIS; LOCAL DB; In the future? ECB Secure, OECD Secure; SDMX Providers; IMF; SDMX library; SDMX Connectors]
5
Use case: a user wants to get exchange rates from the ECB…
6
• Data Flows in SDMX are families of data that share a common structure
• The structure of an SDMX data flow is multidimensional and is declared in a Data Structure
Definition (DSD) in SDMX 2.1 (called a Key Family in SDMX 2.0)
• The dimensions of a data flow can be used (like the columns of an SQL table) to retrieve the
specific parts of the data flow that are of interest
Example: Exchange Rates dataflow in the ECB
( http://sdw.ecb.europa.eu/browse.do?node=2018794 )
Data Flow identifier: EXR
Dimensions: FREQ, CURRENCY, CURRENCY_DENOM, EXR_TYPE, EXR_SUFFIX
In order to get the desired time series, you need to know the name of the dataflow, the
dimensions and the codes of the dimensions that correspond exactly to your needs…
Some basics about SDMX
IT’S NOT EASY!
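To make the point about data flow, dimensions and codes concrete, here is a minimal Python sketch of how a series key for the EXR data flow could be assembled. It is not the SDMX Connectors code. It assumes the SDMX 2.1 REST-style key convention (one value per dimension, in DSD order, separated by dots; an empty position means "any value"; "+" joins alternative codes). The function names, the base URL and the code values used (M, USD, EUR, SP00, A) are illustrative assumptions, not taken from the slides.

```python
# Dimension order for the EXR data flow, as listed on the slide.
EXR_DIMENSIONS = ["FREQ", "CURRENCY", "CURRENCY_DENOM", "EXR_TYPE", "EXR_SUFFIX"]


def build_series_key(dimensions, selection):
    """Join one value per dimension, in DSD order, into a dot-separated key.

    A dimension missing from `selection` is left empty (wildcard);
    a list of codes is joined with "+" (any of these codes).
    """
    unknown = set(selection) - set(dimensions)
    if unknown:
        raise ValueError(f"unknown dimensions: {sorted(unknown)}")
    parts = []
    for name in dimensions:
        value = selection.get(name, "")
        if isinstance(value, (list, tuple)):
            value = "+".join(value)
        parts.append(value)
    return ".".join(parts)


def build_data_url(base_url, flow, key):
    """Compose a REST-style data query URL; base_url is a placeholder."""
    return f"{base_url.rstrip('/')}/data/{flow}/{key}"


# Hypothetical selection: the code values are illustrative assumptions.
monthly_usd = build_series_key(
    EXR_DIMENSIONS,
    {"FREQ": "M", "CURRENCY": "USD", "CURRENCY_DENOM": "EUR",
     "EXR_TYPE": "SP00", "EXR_SUFFIX": "A"},
)
print(monthly_usd)
print(build_data_url("https://example.org/sdmx", "EXR", monthly_usd))
```

Every position in the key depends on knowing both the dimension order and the valid codes, which is the information an end user usually does not have at hand.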
7
To facilitate query building, we have developed a lightweight HELPER tool, driven by the DSDs
that the data provider makes available through web services
Knowing SDMX means… Knowing DSDs…
8
With the HELPER tool, the user can browse through the DSDs to search for data of interest…
9
The user can learn the codes needed to build the queries…
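The kind of lookup the HELPER supports can be sketched in a few lines of Python. This is an illustration only, not the HELPER implementation: the code lists below are a small hand-written excerpt (an assumption, not the provider's full lists), and `find_codes` is a hypothetical helper name. In the real tool the code lists would come from the provider's DSD.

```python
# Illustrative, hand-written excerpt; NOT the full code lists of any provider.
SAMPLE_CODE_LISTS = {
    "FREQ": {"A": "Annual", "M": "Monthly", "D": "Daily"},
    "CURRENCY": {"USD": "US dollar", "JPY": "Japanese yen", "GBP": "Pound sterling"},
}


def find_codes(code_lists, text):
    """Return (dimension, code, description) triples whose description contains `text`."""
    needle = text.lower()
    return [
        (dim, code, desc)
        for dim, codes in code_lists.items()
        for code, desc in codes.items()
        if needle in desc.lower()
    ]


# Search the descriptions for a word and get back the code to use in a query.
print(find_codes(SAMPLE_CODE_LISTS, "dollar"))
```

The search is case-insensitive and runs over the human-readable descriptions, so the user can go from a word they know to the code a query needs.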
10
The “everyday” life … of the data analyst
Excel video demo
11
The same access from R … the data
12
The same access from R … the Helper
13
The same access from Matlab …
14
Next steps and …
• SDMX 2.1 support, REST (in progress), JSON?
• Easier configuration of data sources (in progress)
• Registry for data discovery?
• New SDMX data providers? Further statistical tools?
• Open source, collaborations…
A dream? “What would be possible if ALL data providers were to give data in SDMX? If we also had an SDMX registry that would tell us where the data is? Life could be so easy, both for those who need to make data available to researchers and for the researchers, who want all the data … and all of this would work even better if we had DSDs for global use (e.g. BOP and SNA) …”
15
Thank You!
16
• Backup slides for Excel demo…