intelligent etl-processes for geo data · 2019-04-23 · for talend real-time big data •use big...
![Page 1: Intelligent ETL-Processes for Geo Data · 2019-04-23 · For Talend Real-time Big Data •Use Big Data Capabilities of Talend Platform for Geodata-Processing •Create and Deploy](https://reader033.vdocuments.net/reader033/viewer/2022050422/5f918e743442960b9f248dca/html5/thumbnails/1.jpg)
Dr. Wassilios Kazakos, Head of Marketing & Business Development
Intelligent ETL-Processes for Geo Data
![Page 2: Intelligent ETL-Processes for Geo Data · 2019-04-23 · For Talend Real-time Big Data •Use Big Data Capabilities of Talend Platform for Geodata-Processing •Create and Deploy](https://reader033.vdocuments.net/reader033/viewer/2022050422/5f918e743442960b9f248dca/html5/thumbnails/2.jpg)
Business focus:
• Data analytics & information delivery solutions with focus on geo-spatial data
• Implementation of Data Warehouses with spatial and non-spatial data
Product Cadenza:
• Our platform for geo-analytics & data discovery
• Several thousand users in Germany and Austria
Headquarters: Karlsruhe, Germany. 100 employees.
Company Profile
![Page 3: Intelligent ETL-Processes for Geo Data · 2019-04-23 · For Talend Real-time Big Data •Use Big Data Capabilities of Talend Platform for Geodata-Processing •Create and Deploy](https://reader033.vdocuments.net/reader033/viewer/2022050422/5f918e743442960b9f248dca/html5/thumbnails/3.jpg)
Project Context: WIRE (1)
WIRE: Intelligent Methods for Integration and Quality Assurance of Geo Data
• German SME research project, Funded by BMBF, Grant # 01IS16039
• Duration: 01/2017 – 12/2018
Application scenarios:
• Smart agriculture
• Environment monitoring / water management
Project Partners
• Disy Informationssysteme GmbH
• FZI, Research Centre for Information Technology
![Page 4: Intelligent ETL-Processes for Geo Data · 2019-04-23 · For Talend Real-time Big Data •Use Big Data Capabilities of Talend Platform for Geodata-Processing •Create and Deploy](https://reader033.vdocuments.net/reader033/viewer/2022050422/5f918e743442960b9f248dca/html5/thumbnails/4.jpg)
Project Context: WIRE (2)
Motivation
• Massive growth of big geo data from applications, which has to be fused in real time
• Data management and quality control need more automation and tool support
Some methods and tools developed:
• Geospatial extension for Talend Data Integration workbench
• Machine-learning services for geo-data schema mapping
• Algorithms for detection of geo-data quality problems
• Automated corrections of geo-data errors
Today‘s topic
Mobile Sensors
Source: Yara
Source: DLR
Social Media
Satellite Data
![Page 5: Intelligent ETL-Processes for Geo Data · 2019-04-23 · For Talend Real-time Big Data •Use Big Data Capabilities of Talend Platform for Geodata-Processing •Create and Deploy](https://reader033.vdocuments.net/reader033/viewer/2022050422/5f918e743442960b9f248dca/html5/thumbnails/5.jpg)
What is ETL?
„E“ Extract
„T“ Transform
„L“ Load
Filter, Analyse, Visualize
Integrated quality data
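The three steps can be sketched in a few lines of Python using only the standard library. This is an illustrative sketch, not Talend code: the CSV content, the `readings` table and the function names are assumptions made up for the example.

```python
import csv
import io
import sqlite3

# Hypothetical input: a small CSV export with one sensor reading per row.
RAW_CSV = """station,value,unit
A,12.5,mg/l
B,,mg/l
C,7.0,mg/l
"""

def extract(text):
    """Extract: read the raw rows from the CSV source."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: drop rows without a value and convert the value to float."""
    return [(r["station"], float(r["value"]), r["unit"]) for r in rows if r["value"]]

def load(records, conn):
    """Load: write the cleaned records into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS readings (station TEXT, value REAL, unit TEXT)"
    )
    conn.executemany("INSERT INTO readings VALUES (?, ?, ?)", records)
    conn.commit()

conn = sqlite3.connect(":memory:")  # in-memory database as a stand-in target
load(transform(extract(RAW_CSV)), conn)
count = conn.execute("SELECT COUNT(*) FROM readings").fetchone()[0]
```

After the run, the target table holds only the rows that passed the transform step, which is the "integrated quality data" that filtering, analysis and visualization build on.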
![Page 6: Intelligent ETL-Processes for Geo Data · 2019-04-23 · For Talend Real-time Big Data •Use Big Data Capabilities of Talend Platform for Geodata-Processing •Create and Deploy](https://reader033.vdocuments.net/reader033/viewer/2022050422/5f918e743442960b9f248dca/html5/thumbnails/6.jpg)
What is Geo-ETL?
„E“ Extract
„T“ Transform
„L“ Load
Filter, Analyse, Visualize
Integrated quality data
+ Geometric Operations
+ Validation and correction (geometric)
+ Combination of geo and non-geo data
+ (Formats, Databases, Interfaces)
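A minimal sketch of what geometric validation, correction and the combination with non-geo data might look like in a transform step. The checks shown (ring closure, duplicate points) are common examples chosen for illustration; the function names, the sample ring and the `crop` attribute are assumptions, not part of the plug-in.

```python
def validate_ring(ring):
    """Return a list of problems found in a polygon ring of (x, y) tuples."""
    problems = []
    if len(ring) < 4:
        problems.append("too few points")
    if ring and ring[0] != ring[-1]:
        problems.append("ring not closed")
    if any(a == b for a, b in zip(ring, ring[1:])):
        problems.append("duplicate consecutive points")
    return problems

def correct_ring(ring):
    """Remove consecutive duplicate points and close the ring if needed."""
    cleaned = [p for i, p in enumerate(ring) if i == 0 or p != ring[i - 1]]
    if cleaned and cleaned[0] != cleaned[-1]:
        cleaned.append(cleaned[0])
    return cleaned

# Hypothetical field boundary with two typical defects.
raw_ring = [(0, 0), (4, 0), (4, 0), (4, 3), (0, 3)]
found = validate_ring(raw_ring)
fixed = correct_ring(raw_ring)

# Combination of geo and non-geo data: attach attributes by a shared id.
attributes = {"field-1": {"crop": "wheat"}}
feature = {"id": "field-1", "geometry": fixed, **attributes["field-1"]}
```

Here the validation reports an unclosed ring and a duplicated vertex, and the corrected ring passes the same checks without findings.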
![Page 7: Intelligent ETL-Processes for Geo Data · 2019-04-23 · For Talend Real-time Big Data •Use Big Data Capabilities of Talend Platform for Geodata-Processing •Create and Deploy](https://reader033.vdocuments.net/reader033/viewer/2022050422/5f918e743442960b9f248dca/html5/thumbnails/7.jpg)
Basis: Talend Data Integration Platform
Leading Platform for Big Data and Cloud and Data Integration
• User Interface for Job definition / code creation
• Hundreds of connectors, components and routines
• Repository management for Job-reuse and teamwork
• Monitoring and logging optimized for large installations
• Free and open-source, yet still powerful, entry version
Professional software for Big Data and cloud data integration
BUT: no geodata and no geo-operations
![Page 8: Intelligent ETL-Processes for Geo Data · 2019-04-23 · For Talend Real-time Big Data •Use Big Data Capabilities of Talend Platform for Geodata-Processing •Create and Deploy](https://reader033.vdocuments.net/reader033/viewer/2022050422/5f918e743442960b9f248dca/html5/thumbnails/8.jpg)
GeoSpatial Integration for Talend
„Everything you need to extend ETL to Geo-ETL“
Extension of the Talend Palette
Data connectors for
• Oracle Locator/Spatial, PostGIS
• SpatiaLite, Shapefile, WKT, WKB
• GeoJSON
Components for
• Length and area calculation
• Geometry transformation
• Geometry validation
• Buffer, Bounding Box, Centroid, …
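To make some of the listed operations concrete, here is a small Python sketch of length, area, bounding box and centroid for planar coordinates. It illustrates the underlying geometry only and is not the plug-in's implementation; the function names are assumptions, and buffering is left out because it needs considerably more machinery.

```python
import math

def length(points):
    """Length of a line (or perimeter of a closed ring) in coordinate units."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))

def area(ring):
    """Area of a closed ring via the shoelace formula."""
    twice = sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in zip(ring, ring[1:]))
    return abs(twice) / 2

def bounding_box(points):
    """Axis-aligned bounding box as (min_x, min_y, max_x, max_y)."""
    xs, ys = zip(*points)
    return (min(xs), min(ys), max(xs), max(ys))

def centroid(ring):
    """Area-weighted centroid of a closed ring with non-zero area."""
    twice_area = cx = cy = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        cross = x1 * y2 - x2 * y1
        twice_area += cross
        cx += (x1 + x2) * cross
        cy += (y1 + y2) * cross
    return (cx / (3 * twice_area), cy / (3 * twice_area))

# Hypothetical 4 x 3 rectangle, given as a closed ring.
rect = [(0, 0), (4, 0), (4, 3), (0, 3), (0, 0)]
```

For the rectangle above, `length(rect)` is 14, `area(rect)` is 12, the bounding box is `(0, 0, 4, 3)` and the centroid is `(2.0, 1.5)`. Real geodata in geographic coordinates would first need a projection or a geodesic calculation.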
Routines
• More than 40 geometric operations
• Like the Visvalingam-Whyatt algorithm (simplification)
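The idea behind Visvalingam-Whyatt simplification is to repeatedly drop the point whose triangle with its two neighbours has the smallest area, until every remaining point contributes at least a given area. The following is a simple quadratic-time sketch of that idea; production implementations typically use a priority queue, and the names `simplify_vw` and `min_area` are assumptions for this example, not the plug-in's routine names.

```python
def triangle_area(a, b, c):
    """Area of the triangle spanned by three (x, y) points."""
    return abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])) / 2

def simplify_vw(points, min_area):
    """Drop interior points until each one spans at least min_area."""
    pts = list(points)
    while len(pts) > 2:
        # Effective area of every interior point with its current neighbours.
        areas = [
            triangle_area(pts[i - 1], pts[i], pts[i + 1])
            for i in range(1, len(pts) - 1)
        ]
        smallest = min(areas)
        if smallest >= min_area:
            break
        # Remove the least significant point; areas are recomputed next round.
        del pts[areas.index(smallest) + 1]
    return pts

# Hypothetical line with one barely noticeable bump at (1, 0.1).
line = [(0, 0), (1, 0.1), (2, 0), (3, 2), (4, 0)]
simplified = simplify_vw(line, min_area=0.5)
```

With a threshold of 0.5 only the small bump is removed, while the pronounced peak at `(3, 2)` and the end points are kept.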
![Page 9: Intelligent ETL-Processes for Geo Data · 2019-04-23 · For Talend Real-time Big Data •Use Big Data Capabilities of Talend Platform for Geodata-Processing •Create and Deploy](https://reader033.vdocuments.net/reader033/viewer/2022050422/5f918e743442960b9f248dca/html5/thumbnails/9.jpg)
GeoSpatial Integration for Talend
Use drag & drop with the new components
GeoSpatial components are fully integrated in the Talend ETL process
![Page 10: Intelligent ETL-Processes for Geo Data · 2019-04-23 · For Talend Real-time Big Data •Use Big Data Capabilities of Talend Platform for Geodata-Processing •Create and Deploy](https://reader033.vdocuments.net/reader033/viewer/2022050422/5f918e743442960b9f248dca/html5/thumbnails/10.jpg)
GeoSpatial Integration for Talend Real-time Big Data
• Use the Big Data capabilities of the Talend platform for geodata processing
• Create and deploy Spark code with geo-routines on Hadoop/Spark clusters
Parallel execution in a Spark cluster to process Big Data streams coming from sensors, social media, etc.
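The pattern behind such parallel execution is to split the records into partitions and apply the same geo-routine to each partition independently. The sketch below mimics this locally with a thread pool from the Python standard library; it is a stand-in to show the partition-wise map, not Spark code and not code generated by Talend. The bounding box, the sample records and all function names are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical area of interest as (min_lon, min_lat, max_lon, max_lat).
AREA = (-122.52, 37.70, -122.35, 37.83)

def in_bbox(lon, lat, bbox):
    """Geo-routine: test whether a position lies inside a bounding box."""
    min_lon, min_lat, max_lon, max_lat = bbox
    return min_lon <= lon <= max_lon and min_lat <= lat <= max_lat

def process_partition(partition):
    """Work done per partition: keep only records inside the area of interest."""
    return [rec for rec in partition if in_bbox(rec[1], rec[2], AREA)]

def split(records, size):
    """Cut the record list into partitions of at most `size` records."""
    return [records[i:i + size] for i in range(0, len(records), size)]

# Hypothetical position records: (source id, longitude, latitude).
records = [
    ("bus-1", -122.42, 37.77),
    ("bus-2", -121.89, 37.34),
    ("tram-7", -122.40, 37.79),
    ("bike-3", -122.27, 37.80),
]

with ThreadPoolExecutor(max_workers=2) as pool:
    results = pool.map(process_partition, split(records, 2))
    inside = [rec[0] for part in results for rec in part]
```

Because each partition is processed without reference to the others, the same routine can in principle be distributed across the nodes of a cluster, which is what a Spark job does at scale.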
![Page 11: Intelligent ETL-Processes for Geo Data · 2019-04-23 · For Talend Real-time Big Data •Use Big Data Capabilities of Talend Platform for Geodata-Processing •Create and Deploy](https://reader033.vdocuments.net/reader033/viewer/2022050422/5f918e743442960b9f248dca/html5/thumbnails/11.jpg)
Case Study: San Francisco
San Francisco Municipal Transportation Agency
• All vehicles send radio signals every minute
• Metro (Muni), buses, car- and bicycle-sharing
• Data are stored in a Hadoop data lake
• Talend Big Data Platform + GeoSpatial Plugin for Talend
• Currently: mainly data analysis
• Vision: real-time data services for other departments, the public & companies
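As an illustration of the kind of analysis such periodic position signals allow, the following Python sketch sums the distance travelled per vehicle from timestamped positions using the haversine formula. The record layout, the vehicle ids and the sample values are assumptions made up for the example; the actual data model of the case study is not described in the slides.

```python
import math
from collections import defaultdict

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation

def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance between two lon/lat positions in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def distance_per_vehicle(pings):
    """Sum the distance between consecutive pings for every vehicle."""
    tracks = defaultdict(list)
    for vehicle, timestamp, lon, lat in pings:
        tracks[vehicle].append((timestamp, lon, lat))
    totals = {}
    for vehicle, track in tracks.items():
        track.sort()  # order the positions by timestamp
        totals[vehicle] = sum(
            haversine_km(a[1], a[2], b[1], b[2]) for a, b in zip(track, track[1:])
        )
    return totals

# Hypothetical pings: (vehicle id, seconds since start, longitude, latitude).
pings = [
    ("bus-1", 60, 1.0, 0.0),
    ("bus-1", 0, 0.0, 0.0),
    ("bus-2", 0, 0.0, 0.0),
]
totals = distance_per_vehicle(pings)
```

One degree of longitude along the equator is roughly 111 km, so `bus-1` ends up with about that distance, while `bus-2` with a single ping has travelled 0 km.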
![Page 12: Intelligent ETL-Processes for Geo Data · 2019-04-23 · For Talend Real-time Big Data •Use Big Data Capabilities of Talend Platform for Geodata-Processing •Create and Deploy](https://reader033.vdocuments.net/reader033/viewer/2022050422/5f918e743442960b9f248dca/html5/thumbnails/12.jpg)
Contact
Dr. Wassilios Kazakos
Disy Informationssysteme GmbH
Tel. +49 721 16006-000
You can download and try the plug-in for free here…
www.disy.net/spatial-etl
Intelligent ETL-Processes for Geo-Data
Thank you for your attention