Tracing Data Utility for Noise-Adding Plugins in Process Mining
Type: Bachelor CS
If you are interested, please contact:
Introduction: Process mining bridges the gap between data mining and Business Process Management (BPM). In healthcare, it helps evaluate the event logs extracted from care providers' information systems. An event log records the treatments given to patients together with their timestamps, which makes large-scale analysis of care processes easier and more precise. Several tools exist for extracting valuable information from such large datasets. ProM is one of the most reliable and widely used publicly available tools for process mining. It helps analysts understand business processes by automatically generating process models with plugins such as the Inductive Miner, the Heuristics Miner, and the Alpha Miner. These models allow detailed analysis through discovery and conformance checking against the original event log. The comparisons can be used to quantify four quality indicators: fitness, precision, simplicity, and generalization. Each indicator ranges from 0 to 1, where 0 indicates the poorest match between the event log and the derived process model and 1 indicates the best match.
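To make the 0-to-1 scale concrete, the following is a minimal Python sketch of a fitness-like score. It is an illustration under stated assumptions, not ProM's conformance checking: the "model" is assumed to be just the set of directly-follows pairs seen in a training log, and the helper names (`directly_follows`, `trace_fitness`) and the toy activity labels are hypothetical.

```python
from typing import List, Set, Tuple

Trace = List[str]  # one patient case as an ordered list of activity names


def directly_follows(log: List[Trace]) -> Set[Tuple[str, str]]:
    """Collect every (a, b) pair where b immediately follows a in some trace."""
    return {(a, b) for trace in log for a, b in zip(trace, trace[1:])}


def trace_fitness(log: List[Trace], allowed: Set[Tuple[str, str]]) -> float:
    """Fraction of traces whose consecutive pairs are all allowed by the model.

    Assumption: this is a simplified stand-in for fitness, not the
    replay- or alignment-based fitness computed by process mining tools.
    """
    if not log:
        return 0.0
    fitting = sum(all(pair in allowed for pair in zip(t, t[1:])) for t in log)
    return fitting / len(log)


# Toy logs with hypothetical activity names.
training_log = [
    ["register", "triage", "treat", "discharge"],
    ["register", "treat", "discharge"],
]
test_log = [
    ["register", "triage", "treat", "discharge"],  # replayable on the model
    ["register", "discharge"],                     # contains an unseen pair
]

model = directly_follows(training_log)
fitness = trace_fitness(test_log, model)
print(f"fitness = {fitness:.2f}")
```

A score of 1 here means every test trace can be replayed on the simplified model, and 0 means none can; precision, simplicity, and generalization would need their own definitions.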
Assignment: The main objective of this project is to apply the noise-adding plugins of the latest version of ProM to multiple healthcare datasets (three or more) and to measure how the test model differs from the training models (discovered from logs with added noise). The difference can be quantified with three quality indicators: fitness, precision, and simplicity. Exploring other indicators and parameters is part of the assignment. The aim of these experiments is to determine whether noise-adding plugins yield optimal data utility. The study will help locate the privacy-utility trade-off in process mining, with a focus on healthcare.
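The experiment loop can be prototyped outside ProM before running the real plugins. The sketch below is a hedged illustration only: `add_noise` is a hypothetical stand-in that removes, swaps, or inserts one event per selected trace (it does not reproduce any specific ProM plugin), and the Jaccard similarity of directly-follows pairs is an assumed utility proxy rather than one of the indicators named above.

```python
import random
from typing import List, Set, Tuple

Trace = List[str]


def add_noise(log: List[Trace], rate: float, rng: random.Random) -> List[Trace]:
    """Return a copy of the log in which each trace is perturbed with probability `rate`."""
    activities = sorted({a for trace in log for a in trace})
    noisy = []
    for trace in log:
        t = list(trace)  # copy so the original log is left untouched
        if t and rng.random() < rate:
            op = rng.choice(["remove", "swap", "insert"])
            if op == "remove" and len(t) > 1:
                del t[rng.randrange(len(t))]
            elif op == "swap" and len(t) > 1:
                i = rng.randrange(len(t) - 1)
                t[i], t[i + 1] = t[i + 1], t[i]
            else:
                # "insert", or a fallback for single-event traces
                t.insert(rng.randrange(len(t) + 1), rng.choice(activities))
        noisy.append(t)
    return noisy


def directly_follows(log: List[Trace]) -> Set[Tuple[str, str]]:
    """Collect every (a, b) pair where b immediately follows a in some trace."""
    return {(a, b) for trace in log for a, b in zip(trace, trace[1:])}


def jaccard(a: Set[Tuple[str, str]], b: Set[Tuple[str, str]]) -> float:
    """Overlap of two sets on a 0-to-1 scale (1 means identical)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Toy log with hypothetical activity names; a real study would load healthcare logs.
log = [["register", "triage", "treat", "discharge"]] * 5 + [["register", "treat", "discharge"]] * 5

rng = random.Random(42)  # fixed seed so a run can be repeated
for rate in (0.0, 0.2, 0.5, 1.0):
    noisy = add_noise(log, rate, rng)
    utility = jaccard(directly_follows(log), directly_follows(noisy))
    print(f"noise rate {rate:.1f}: directly-follows similarity {utility:.2f}")
```

Sweeping the noise rate in this way gives a first impression of how quickly structural information degrades, which is the kind of curve the privacy-utility trade-off is read from.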
Data Source: data.4tu.nl, MIMIC III
ProM tool: http://www.promtools.org/doku.php
 Blum, F. (2015). Metrics in process discovery. Tech. Rep. TR/DCC. 1–21.
 Ghawi, R. (2016). Process Discovery using Inductive Miner and Decomposition. arXiv preprint arXiv:1610.07989.
 van der Aalst, W. On the Role of Fitness, Precision, Generalization and Simplicity in Process Discovery.