Project R-8195

Title

A process discovery algorithm for exploratory data analysis (Research)

Abstract

Industry is becoming increasingly data-driven. Over the past decade, both the amount of data collected and the nature of that data have changed. This project focuses on event data, which describe how (business) processes are executed. The first step in retrieving insights from data is exploratory data analysis (EDA). Although many algorithms discover process models from event data, none of them is well suited for EDA. Models for EDA have two important requirements. Firstly, they should describe only the observed data. Secondly, they should be comprehensible, so that interesting patterns are easily recognized. The main issue with existing process discovery techniques is that they create models containing behavior that was not observed. Additionally, almost none of the existing techniques optimize their models for comprehensibility. This project contributes to both process mining and data analytics. It creates the first discovery algorithm suitable for EDA. The models it creates represent only the observed behavior and are optimized for comprehensibility. Further contributions of this project are a first comprehensibility measure that takes duplicate tasks into account, and alternative visualizations for partial parallelism and long-term dependencies.

Period of project

01 October 2017 - 31 August 2018