Research paper presentation at the 17th International Conference on Business Process Management (BPM'2019) in Vienna, 3 September 2019. Paper available at: http://kodu.ut.ee/~dumas/pubs/bpm2019-optimization.pdf
Presentation delivered by Adriano Augusto
2. Context
Automated discovery of (business) process models from event logs
Automated Process Discovery Approach (APDA)
a » b » c » g » e » h 10
a » b » c » f » g » h 10
a » b » d » g » e » h 10
a » b » d » e » g » h 10
a » b » e » c » g » h 10
a » b » e » d » g » h 10
a » c » b » e » g » h 10
a » c » b » f » g » h 10
a » d » b » e » g » h 10
a » d » b » f » g » h 10
3. Process Model Quality
How good is an automatically discovered process model?
[Diagram: Event Log → APDA → Process Model; the discovered model is compared against the event log]
Fitness, Precision (F-score)
Generalization
Simplicity
Soundness
5. DFG-based APDAs
[Diagram: Event Log → DFG-based APDA (e.g. Split Miner) → Process Model. Inside the APDA: Discover DFG → Manipulate DFG (e.g. Filtering) → Convert DFG to Model (e.g. BPMN)]
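The first two steps of a DFG-based APDA, discovering a directly-follows graph (DFG) and filtering it, can be sketched in Python using the example log from the Context slide. This is a minimal illustration: the names `EVENT_LOG`, `discover_dfg` and `filter_dfg` are hypothetical, and the plain frequency cut-off shown here is an assumption, not Split Miner's actual filtering algorithm.

```python
from collections import Counter

# Example log from the slides: each trace variant mapped to its frequency.
EVENT_LOG = {
    ("a", "b", "c", "g", "e", "h"): 10,
    ("a", "b", "c", "f", "g", "h"): 10,
    ("a", "b", "d", "g", "e", "h"): 10,
    ("a", "b", "d", "e", "g", "h"): 10,
    ("a", "b", "e", "c", "g", "h"): 10,
    ("a", "b", "e", "d", "g", "h"): 10,
    ("a", "c", "b", "e", "g", "h"): 10,
    ("a", "c", "b", "f", "g", "h"): 10,
    ("a", "d", "b", "e", "g", "h"): 10,
    ("a", "d", "b", "f", "g", "h"): 10,
}


def discover_dfg(log):
    """Count how often one activity is directly followed by another."""
    dfg = Counter()
    for trace, freq in log.items():
        for src, dst in zip(trace, trace[1:]):
            dfg[(src, dst)] += freq
    return dfg


def filter_dfg(dfg, min_freq):
    """Keep only edges seen at least min_freq times (naive, illustrative filter)."""
    return {edge: n for edge, n in dfg.items() if n >= min_freq}
```

Calling `discover_dfg(EVENT_LOG)` yields one weighted edge per directly-follows pair; raising `min_freq` in `filter_dfg` removes infrequent edges, which is the kind of input parameter the following slides set out to tune.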
6. DFG-based APDAs
[Diagram: Event Log and input params → DFG-based APDA (e.g. Split Miner) → Process Model. Inside the APDA: Discover DFG → Manipulate DFG (e.g. Filtering) → Convert DFG to Model (e.g. BPMN)]
7. What is the best input configuration?
[Diagram: for each configuration i = 1, 2, ..., N: Log and Configuration i → DFG-based APDA (e.g. Split Miner) → Model (i) → Assess Quality → Model Quality (i). The N quality values are then compared.]
8. What is the best input configuration?
[Diagram: for each configuration i = 1, 2, ..., N: Log and Configuration i → DFG-based APDA (e.g. Split Miner) → Model (i) → Assess Quality → Model Quality (i). The N quality values are compared and the winner is announced: "Model (x) is the BEST!"]
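The exhaustive procedure on this slide (run the APDA once per configuration, assess each model, compare) can be sketched as a grid search. The function `best_configuration` and its arguments `discover`, `assess` and `param_grid` are hypothetical placeholders for illustration; they are not part of Split Miner or of the authors' tooling.

```python
from itertools import product


def best_configuration(log, discover, assess, param_grid):
    """Try every configuration in the grid and keep the best-scoring model.

    discover(log, **config) -> model   (stand-in for a DFG-based APDA)
    assess(model, log)      -> float   (stand-in for a quality measure)
    param_grid: dict mapping parameter name -> list of candidate values
    """
    best = None
    names = sorted(param_grid)
    for values in product(*(param_grid[name] for name in names)):
        config = dict(zip(names, values))
        model = discover(log, **config)
        score = assess(model, log)
        if best is None or score > best[0]:
            best = (score, config, model)
    return best  # (score, config, model), or None for an empty grid
```

The number of APDA runs is the product of the sizes of all value lists, which is why the next slide asks how to be more efficient.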
9. How to be more efficient?
Optimization Metaheuristics
Population Based
Evolutionary computation
Ant colony
Bee colony
Particle swarm
…
Single-solution Based
Repetitive local search
Iterative local search
Tabu search
Simulated annealing
…
10. Adapting the Metaheuristics to our Context
Repetitive Local Search (RLS)
Iterative Local Search (ILS)
Tabu Search (TS)
Simulated Annealing (SA)
1. Solution Space
2. Solution Neighbourhood
3. Objective Function
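All four metaheuristics need the three ingredients listed above; they differ mainly in how they move between solutions. As one point of reference, the textbook acceptance rule of simulated annealing for a maximisation problem is sketched below. This is the generic Metropolis-style criterion, shown under the assumption that higher objective values are better; it is not necessarily the exact variant used in the paper, and the name `accept` and its `rng` parameter are illustrative.

```python
import math
import random


def accept(current_score, candidate_score, temperature, rng=random):
    """Decide whether to move to a candidate solution (maximisation).

    Better or equal candidates are always accepted; worse ones are accepted
    with probability exp((candidate - current) / temperature), so a high
    temperature tolerates large drops in quality and a low one almost none.
    """
    if candidate_score >= current_score:
        return True
    probability = math.exp((candidate_score - current_score) / temperature)
    return rng.random() < probability
```

Accepting an occasional worse neighbour is what lets this kind of search leave a local optimum instead of stopping there.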
11. Adapting the Metaheuristics to our Context
[Diagram: the N-configuration comparison (Log and Configuration i → DFG-based APDA → Model (i) → Assess Quality → Model Quality (i) → Compare → "Model (x) is the BEST!"), annotated with the labels Solution Space and Objective Function and the open question "Neighbours?"]
12. Optimizing a DFG-based APDA
[Diagram: Event Log and input params → Discover DFG → Manipulate DFG (e.g. Filtering) → Convert DFG to Model (e.g. BPMN)]
13. Optimizing a DFG-based APDA
[Diagram: Event Log and input params → Discover DFG → Manipulate DFG (e.g. Filtering) → Convert DFG to Model (e.g. BPMN). New step: Assess Quality]
14. Assess Quality
Which dimension: fitness, precision, generalization, or simplicity?
→ Fitness and precision, combined into the F-score
Which measure to use?
→ Candidates: alignments, anti-alignments, PCC, entropy, Markovian accuracy
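The F-score combines fitness and precision as their harmonic mean, so a model scores well only if both are high. A minimal Python sketch, assuming both inputs lie in [0, 1] (the function name `f_score` is illustrative):

```python
def f_score(fitness: float, precision: float) -> float:
    """Harmonic mean of fitness and precision (both assumed to be in [0, 1])."""
    if fitness + precision == 0:
        return 0.0  # avoid division by zero when both measures are 0
    return 2 * fitness * precision / (fitness + precision)
```

For example, a model with fitness 0.9 and precision 0.6 gets an F-score of 0.72, lower than the arithmetic mean of 0.75, because the harmonic mean penalises the imbalance.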
15. Optimizing a DFG-based APDA
[Diagram: Event Log and input params → Discover DFG → Manipulate DFG (e.g. Filtering) → Convert DFG to Model (e.g. BPMN). Added step: Assess Quality]
16. Optimizing a DFG-based APDA
[Diagram: Event Log and input params → Discover DFG → Manipulate DFG (e.g. Filtering) → Convert DFG to Model (e.g. BPMN). Added steps: Assess Quality, Explore Neighbour DFGs (new)]
17. Explore Neighbour DFGs
Given a DFG, its closest neighbours are the DFGs that have one more or one fewer edge.
Adding edges adds behaviour (increasing the fitness of the model).
Removing edges removes behaviour (increasing the precision of the model).
[Diagram: a DFG surrounded by its neighbour DFGs, with Model and Quality labels]
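The neighbourhood described on this slide can be sketched by treating a DFG as a set of edges and generating every edge set that differs from it by exactly one edge. The function `neighbours` and the `candidate_edges` argument (for instance, all edges of the unfiltered DFG) are assumptions made for illustration; the sketch also ignores any structural constraints, such as keeping the graph connected, that a real implementation would likely need.

```python
def neighbours(dfg_edges, candidate_edges):
    """Yield every edge set that differs from dfg_edges by exactly one edge.

    dfg_edges:       edges (src, dst) currently in the DFG
    candidate_edges: edges that may be added, e.g. those of the unfiltered DFG
    """
    current = frozenset(dfg_edges)
    # One edge fewer: removes behaviour, aiming at higher precision.
    for edge in sorted(current):
        yield current - {edge}
    # One edge more: adds behaviour, aiming at higher fitness.
    for edge in sorted(set(candidate_edges) - current):
        yield current | {edge}
```

A DFG with m edges and k unused candidate edges therefore has m + k closest neighbours.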
18. Optimizing a DFG-based APDA
[Diagram: Event Log and input params → Discover DFG → Manipulate DFG (e.g. Filtering) → Convert DFG to Model (e.g. BPMN). Added steps: Assess Quality, Explore Neighbour DFGs, Convert DFGs to Models (new)]
19. Optimizing a DFG-based APDA
[Diagram: Event Log and input params → Discover DFG → Manipulate DFG (e.g. Filtering) → Convert DFG to Model (e.g. BPMN). Added steps: Assess Quality, Explore Neighbour DFGs, Convert DFGs to Models, Assess Quality of the neighbour models (new)]
20. Optimizing a DFG-based APDA
[Diagram: Event Log and input params → Discover DFG → Manipulate DFG (e.g. Filtering) → Convert DFG to Model (e.g. BPMN). Added steps: Assess Quality, Explore Neighbour DFGs, Convert DFGs to Models, Assess Quality of the neighbour models, Select Best DFG Candidate (new)]
21. Optimizing a DFG-based APDA
[Diagram: Event Log and input params → Discover DFG → Manipulate DFG (e.g. Filtering) → Convert DFG to Model (e.g. BPMN). Added steps: Assess Quality, Explore Neighbour DFGs, Convert DFGs to Models, Assess Quality of the neighbour models, Select Best DFG Candidate, Check Termination Condition (new)]
Termination conditions:
Timeout
Number of iterations
Objective function threshold
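The three termination conditions can be combined in a single check that stops as soon as any one of them holds. This is a sketch only: `should_stop` and its parameter names are hypothetical, and the optional `now` argument exists purely to make the function easy to test.

```python
import time


def should_stop(start_time, iteration, best_score, *,
                timeout_s, max_iterations, target_score, now=None):
    """Return True as soon as any of the three termination conditions holds."""
    now = time.monotonic() if now is None else now
    timed_out = now - start_time >= timeout_s          # timeout
    out_of_iterations = iteration >= max_iterations    # number of iterations
    good_enough = best_score >= target_score           # objective function threshold
    return timed_out or out_of_iterations or good_enough
```

In a search loop, `start_time` would be taken once with `time.monotonic()` before the first iteration and the check repeated after each candidate selection.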
22. Optimizing a DFG-based APDA
[Diagram: the complete loop, labelled "Optimization Metaheuristic". Event Log and input params → Discover DFG → Manipulate DFG (e.g. Filtering) → Convert DFG to Model (e.g. BPMN). Loop steps: Assess Quality, Explore Neighbour DFGs, Convert DFGs to Models, Assess Quality of the neighbour models, Select Best DFG Candidate, Check Termination Condition. If the condition is not fulfilled, the loop continues; if it is fulfilled, the Process Model is returned.]
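The loop on this slide can be sketched as a plain greedy hill climb over DFGs. This is a deliberately simplified stand-in, not RLS, ILS, TS or SA as implemented by the authors: it stops at the first local optimum, where those metaheuristics would restart, perturb, keep a tabu list, or accept a worse candidate. The function `optimize` and its arguments `neighbours`, `to_model` and `assess` are hypothetical names.

```python
def optimize(initial_dfg, neighbours, to_model, assess,
             max_iterations=100, target=1.0):
    """Greedy hill climbing over DFGs (simplified stand-in for the loop).

    neighbours(dfg) -> iterable of neighbour DFGs
    to_model(dfg)   -> process model (e.g. a BPMN model)
    assess(model)   -> objective value, higher is better
    """
    current = initial_dfg
    best_dfg, best_score = current, assess(to_model(current))
    for _ in range(max_iterations):          # termination: number of iterations
        if best_score >= target:             # termination: objective threshold
            break
        # Explore neighbour DFGs, convert each to a model, assess its quality.
        scored = [(assess(to_model(n)), n) for n in neighbours(current)]
        if not scored:
            break
        # Select the best DFG candidate.
        score, current = max(scored, key=lambda pair: pair[0])
        if score <= best_score:
            break  # local optimum: a real metaheuristic would not give up here
        best_dfg, best_score = current, score
    return to_model(best_dfg), best_score
```

Every iteration converts and assesses all neighbours, so the cost of the objective function dominates the running time of the search.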
23. Optimization Framework
[Diagram: an APDA – Metaheuristic Interface connecting three pools of components: Optimization Metaheuristics, DFG-based APDAs, and Objective Functions]
Input: Event Log and Settings (optimization metaheuristic ID, APDA ID, objective function ID)
Output: Process Model
24. Optimization Framework Instantiation
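[Diagram: the APDA – Metaheuristic Interface instantiated with concrete components]
Optimization metaheuristics: RLS, ILS, TS, SA
DFG-based APDA: Split Miner
Objective function: Markovian F-score
Input: Event Log and Settings (optimization metaheuristic ID, APDA ID, objective function ID)
Output: Process Model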
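One way to picture the interface is as three registries keyed by ID, with the settings selecting one entry from each. The sketch below is an assumption about how such a dispatch could look; the classes `Settings` and `Framework`, their fields, and the call signature of the registered components are all hypothetical and do not describe the authors' implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass(frozen=True)
class Settings:
    metaheuristic_id: str  # e.g. "RLS", "ILS", "TS", "SA"
    apda_id: str           # e.g. "split-miner"
    objective_id: str      # e.g. "markovian-f-score"


@dataclass
class Framework:
    """Registries of pluggable components, looked up by ID at run time."""
    metaheuristics: Dict[str, Callable] = field(default_factory=dict)
    apdas: Dict[str, Callable] = field(default_factory=dict)
    objectives: Dict[str, Callable] = field(default_factory=dict)

    def run(self, log, settings: Settings):
        # An unknown ID raises KeyError, which surfaces configuration mistakes early.
        search = self.metaheuristics[settings.metaheuristic_id]
        apda = self.apdas[settings.apda_id]
        objective = self.objectives[settings.objective_id]
        # The metaheuristic drives the APDA and scores results with the objective.
        return search(log, apda, objective)
```

With this shape, adding another DFG-based APDA or quality measure means registering one more entry, without touching the metaheuristics.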
25. Evaluation Setup
— 20 real-life event logs (10 BPIC logs, RTFMP, SEPSIS case, and 8 private logs)
— 3 baselines without hyper-parameter optimization: Inductive Miner (IM), Evolutionary Tree Miner (ETM), Split Miner (SM)
— 1 baseline with hyper-parameter optimization: Split Miner (HPO)
— Measures: Markovian accuracy, alignment-based accuracy, simplicity, and execution time
27. Limitations
— Slower than baselines with default input params (Inductive Miner, Split Miner)
— More complex models when optimizing fitness
— Not applicable to arbitrary APDAs, only to DFG-based APDAs
29. Future Work
— Add more DFG-based APDAs to our framework (Fodina Miner and Inductive Miner)
— Explore alternative quality measures to drive the optimization metaheuristics
— Combine accuracy and simplicity measures