Research

Data integration and quality

Getting comprehensive information from heterogeneous data sources is crucial for many data-informed applications. Our research focuses on algorithms and systems that combine such data effectively, yielding high data quality while operating in a resource-conscious way.

  • Entity resolution
  • Semi-structured data
  • Dynamic and streaming data
  • Adaptive data integration
  • Data lakes
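As a toy illustration of entity resolution (not one of our actual systems), the sketch below greedily clusters records from different sources whose names are sufficiently similar; the records, threshold, and similarity measure are all assumptions for the example:

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Character-level similarity of two case-normalized strings."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()

def resolve(records, threshold=0.85):
    """Greedy entity resolution: put a record into the first cluster whose
    representative name exceeds the similarity threshold, else start a new one."""
    clusters = []
    for rec in records:
        for cluster in clusters:
            if similarity(rec["name"], cluster[0]["name"]) >= threshold:
                cluster.append(rec)
                break
        else:
            clusters.append([rec])
    return clusters

# Hypothetical records from two sources describing overlapping entities.
records = [
    {"name": "Jane Smith", "src": "crm"},
    {"name": "jane  smith", "src": "web"},
    {"name": "John Doe", "src": "crm"},
]
clusters = resolve(records)  # two clusters: the Jane Smith duplicates merge
```

Real entity-resolution systems replace the quadratic pairwise comparison with blocking and learned similarity functions, but the clustering decision has the same shape.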

Accountable and fair data processing

As decisions increasingly rely on data analysis and machine learning, we develop technologies to understand, explain, document, monitor, and improve the underlying data processing. We thereby facilitate responsible data use.

  • Data provenance
  • Fairness and bias
  • Metadata modeling
  • Data capture, management, and querying
  • Trust in data engineering
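To make the idea of data provenance concrete, here is a minimal sketch (an illustration, not our production tooling) that records, for each transformation step, what went in and what came out, so a pipeline's behavior can later be explained and audited; all names and the example data are assumptions:

```python
import time

class Provenance:
    """Minimal provenance log: one entry per transformation step."""
    def __init__(self):
        self.log = []

    def step(self, name, func, data):
        """Apply func to data and record input/output row counts."""
        result = func(data)
        self.log.append({
            "step": name,
            "input_rows": len(data),
            "output_rows": len(result),
            "timestamp": time.time(),
        })
        return result

prov = Provenance()
rows = [{"age": 34}, {"age": -1}, {"age": 58}]
clean = prov.step("drop_invalid_age",
                  lambda rs: [r for r in rs if r["age"] >= 0],
                  rows)
# prov.log now documents that this step filtered out one row
```

Fine-grained provenance additionally tracks which individual input records contributed to each output, which is what enables explanations of single results.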

Data exploration and analytics

Investigating what information lies in one’s data and applying appropriate analysis techniques to derive insights is typically an interactive process. In our research, we devise algorithms and systems that guide users such as data scientists and domain experts through this process.

  • Human-in-the-loop exploration
  • Iterative pipeline refinement
  • Recommendations
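As a simple illustration of recommendation-guided exploration (an assumed example, not one of our published methods), the sketch below ranks not-yet-explored numeric columns by variance as a crude proxy for "interesting to look at next"; in a human-in-the-loop setting, the user's choice would feed back into the next round of suggestions:

```python
from statistics import pvariance

def recommend_columns(table, explored):
    """Rank unexplored numeric columns by population variance, highest first."""
    scores = {}
    for col in table[0]:
        if col in explored:
            continue
        values = [row[col] for row in table]
        if all(isinstance(v, (int, float)) for v in values):
            scores[col] = pvariance(values)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical table: 'price' varies far more than 'qty', so it is suggested first.
table = [
    {"price": 10.0, "qty": 1},
    {"price": 250.0, "qty": 2},
    {"price": 12.5, "qty": 1},
]
suggestions = recommend_columns(table, explored=set())
```

Actual exploration systems use richer interestingness measures and learn from user feedback, but the loop of "score, suggest, observe, refine" is the same.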

Complex data processing pipelines

As the amount and variety of data constantly increase, their processing and management require novel data management technologies. We investigate solutions that cater to different types of data, domain requirements, deployment environments, and users with varying levels of digital and data management literacy.

  • Data pipeline debugging
  • Performance optimization
  • Domain-specific pipelines
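One common pipeline-debugging technique is to instrument each stage and trace row counts, so an unexpected drop can be attributed to the responsible stage. The sketch below (an illustration under assumed stage names and data, not our actual tooling) shows the idea:

```python
def run_pipeline(data, stages):
    """Run named stages in order, recording the row count after each one
    so sudden drops can be traced to the responsible stage."""
    trace = [("input", len(data))]
    for name, stage in stages:
        data = stage(data)
        trace.append((name, len(data)))
    return data, trace

# Hypothetical two-stage cleaning pipeline.
stages = [
    ("parse_amount", lambda rs: [dict(r, amount=float(r["amount"])) for r in rs]),
    ("drop_negative", lambda rs: [r for r in rs if r["amount"] >= 0]),
]
rows = [{"amount": "3.5"}, {"amount": "-1"}]
result, trace = run_pipeline(rows, stages)
# trace reveals that drop_negative removed one of the two rows
```

Debuggers for real pipelines extend this with data sampling, schema checks, and provenance links back to the offending input records.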