TP5.1: Integration Platform for Temporal Geographic Data

Subproject manager
Prof. Alfons Kemper, Ph.D.

The goal of this subproject is to implement a prototype of a data integration and communication platform for geospatial data with temporal attributes. Current solutions cannot meet the data management requirements of Industrie 4.0 and the Internet of Things (IoT): the large data volumes and the speed at which data arrives call for high-performance approaches. In addition, sensor data needs to be joined with transactional business data.

To implement intelligent services, it is desirable to perform real-time analytics on this integrated database. SAP founder Hasso Plattner and Microsoft founder Bill Gates described this requirement as “information at your fingertips” and pointed out that this kind of “real world awareness” will have a dramatic impact on business processes and on personal mobility services (e.g., traffic forecasts).

In a first study [1], we compared HyPer with modern stream processing systems, including Apache Flink. We used a workload from the telecommunication domain that runs analytics on the most current state of all phone call metadata.

The prototype is based on the main-memory DBMS HyPer, which is developed at TU Munich. This work addresses two characteristics of connected mobility workloads: 1) data is continuously being produced and 2) it comes with temporal and geospatial attributes.

Figure 1: Analytics on Fast Data: Complex Analytics on Current Data

We found a performance and usability gap between HyPer and the stream processing systems and proposed concrete solutions to close it. Next, we will implement these solutions in our prototype, including continuous queries, windowing functionality, and an extension of the SQL interface.
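To make the notion of windowing more concrete, the following Python sketch aggregates call durations over fixed, non-overlapping (tumbling) time windows. It is only an illustration under stated assumptions: it does not reflect HyPer's implementation or its planned SQL extension, and the names CallEvent, its fields, and tumbling_window_totals are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class CallEvent:
    # Hypothetical event layout; the field names are illustrative only.
    subscriber_id: int
    timestamp: int  # seconds since an arbitrary epoch
    duration: int   # call duration in seconds


def tumbling_window_totals(
    events: Iterable[CallEvent], window_size: int
) -> Dict[Tuple[int, int], int]:
    """Sum call durations per (window start, subscriber) for tumbling windows."""
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    totals: Dict[Tuple[int, int], int] = defaultdict(int)
    for event in events:
        # Each event belongs to exactly one window, identified by its start time.
        window_start = (event.timestamp // window_size) * window_size
        totals[(window_start, event.subscriber_id)] += event.duration
    return dict(totals)


# Small usage example with made-up events and 60-second windows.
events = [
    CallEvent(subscriber_id=1, timestamp=5, duration=60),
    CallEvent(subscriber_id=1, timestamp=50, duration=30),
    CallEvent(subscriber_id=2, timestamp=61, duration=120),
    CallEvent(subscriber_id=1, timestamp=119, duration=10),
]
totals = tumbling_window_totals(events, window_size=60)
```

In a continuous query, such totals would be updated incrementally as events arrive instead of being recomputed over a finished list; the batch form is used here only to keep the sketch short.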

[1] Analytics on Fast Data: Main-Memory Database Systems versus Modern Streaming Systems (EDBT 2017)

Supervised student projects
“Analytics on Fast Data Using Modern Stream Processing Systems” – Jan Böttcher
“Efficient Geospatial Joins Using Specialized Radix Trees” – Raul Persa
“An Efficient Nearest Neighbor Join Algorithm for Lines and Points in Main Memory” – David Becher