Data-driven Process Discovery
Revealing Conditional Infrequent Behavior
from Event Logs
Felix Mannhardt, Massimiliano de Leoni,
Hajo A. Reijers, Wil M.P. van der Aalst
Process Discovery
Three traces recorded for three process instances
The Noise Challenge
Source: van der Aalst, W.M.P.: Process Mining - Data Science in Action, Second Edition. Springer (2016)
“Happy Path” Discovery
Noise vs. potentially interesting infrequent behavior
Infrequent behavior
What exactly is noise in event logs?
• Infrequent out-of-order events
• Recording errors
• Exceptional behavior
Random / No explanation
State of the Art – Based on Control-flow & Frequencies
Inductive miner
Heuristics miner
Existing noise-filtering techniques are based on the control-flow perspective!
Proposed Method: Data-aware Heuristic Miner
Ⓐ priority = white
Ⓑ nurse = Alice
Ⓒ type = out
Source: van der Aalst, W.M.P.: Process Mining - Data Science in Action, Second Edition. Springer (2016)
Overview: Data-aware Heuristic Miner (DHM)
Input: Event log (L)
(1) Dependency conditions (C)
(2) Conditional directly-follows relation: $a >_{C_{a,b},L} b$
(3) Conditional dependency measure: $a \Rightarrow_{C,L} b$
(4) Causal Net
Step (1) Dependency Conditions
Binary classifiers that predict occurrence of directly-follows relations:
YES: Activity b directly-follows activity a
NO: Other activities (≠b) directly-follow a
Any classifier can be employed! (We used C4.5.)
Input: Event log (L). Output: Dependency conditions (C).
Building Dependency Conditions – Example #1
Relation: Register, END (Ⓐ)

| Count | Relation | Class | Priority | Nurse | … |
|---|---|---|---|---|---|
| 1430 | (Register, END) | 1 | white | … | … |
| 39780 | (Register, X-Ray) | 0 | green | … | … |
| 49295 | (Register, Check) | 0 | orange | … | … |
| 9491 | (Register, Visit) | 0 | red | … | … |

Classifier: if priority = white then YES, otherwise NO
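
A minimal sketch of how such a dependency condition could be learned from a log. It is not the C4.5 learner used in the talk: `learn_one_rule` is a hypothetical one-attribute stand-in, and the log layout (a trace as a list of `(activity, attributes)` pairs with an explicit `END` event) and the choice to read attributes from the source event are assumptions made for illustration.

```python
from collections import Counter, defaultdict

def training_instances(log, a, b):
    """Collect (attributes, label) for every event of activity a that has a
    direct successor. The label is True iff that successor is b."""
    instances = []
    for trace in log:
        for (act, attrs), (nxt, _) in zip(trace, trace[1:]):
            if act == a:
                instances.append((attrs, nxt == b))
    return instances

def learn_one_rule(instances, attribute):
    """Hypothetical stand-in for C4.5: answer YES for every value of
    `attribute` whose majority class in the training data is YES."""
    votes = defaultdict(Counter)
    for attrs, label in instances:
        votes[attrs.get(attribute)][label] += 1
    yes_values = {value for value, c in votes.items() if c[True] > c[False]}
    return lambda attrs: attrs.get(attribute) in yes_values

# Toy log (invented data): one trace per priority colour.
log = [
    [("Register", {"priority": "white"}), ("END", {})],
    [("Register", {"priority": "green"}), ("X-Ray", {}), ("END", {})],
    [("Register", {"priority": "orange"}), ("Check", {}), ("END", {})],
    [("Register", {"priority": "red"}), ("Visit", {}), ("END", {})],
]

condition = learn_one_rule(training_instances(log, "Register", "END"), "priority")
print(condition({"priority": "white"}))  # YES case
print(condition({"priority": "green"}))  # NO case
```

On this toy log the learned rule answers YES only for priority = white, which mirrors the classifier stated on the slide.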
Building Dependency Conditions – Example #2
Relation: X-Ray, Visit (Ⓑ)

| Count | Relation | Class | Priority | Nurse | … |
|---|---|---|---|---|---|
| 20400 | (X-Ray, Visit) | 1 | … | Alice | … |
| 7923 | (X-Ray, Final Visit) | 0 | … | Peter | … |

(X-Ray, Check): detected as parallel activities
Classifier: if nurse = Alice then YES, otherwise NO
Step (2) Conditional Directly-follows Relation
Inputs: Event log (L) and dependency conditions (C)
Conditional directly-follows relation: $a >_{C_{a,b},L} b$
Relation: a directly followed by b under condition $C_{a,b}$
$|X >_{C_{X,V},L} V| = 1$
$|V >_{C_{X,V},L} X| = 0$
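
The counting behind the conditional directly-follows relation can be sketched as follows. The function name, the log layout (a trace as a list of `(activity, attributes)` pairs) and the decision to evaluate the condition on the attributes of the first event are assumptions for illustration, not the ProM implementation.

```python
def conditional_df_count(log, a, b, condition):
    """Count how often a is directly followed by b while `condition`
    holds for the attributes recorded with the a event."""
    return sum(
        1
        for trace in log
        for (act, attrs), (nxt, _) in zip(trace, trace[1:])
        if act == a and nxt == b and condition(attrs)
    )

def nurse_is_alice(attrs):
    """Condition for the relation (X-Ray, Visit) from Example #2."""
    return attrs.get("nurse") == "Alice"

# Toy log (invented data) with two traces.
log = [
    [("X-Ray", {"nurse": "Alice"}), ("Visit", {})],
    [("X-Ray", {"nurse": "Peter"}), ("Final Visit", {})],
]

print(conditional_df_count(log, "X-Ray", "Visit", nurse_is_alice))  # X before V
print(conditional_df_count(log, "Visit", "X-Ray", nurse_is_alice))  # V before X
```

With X = X-Ray and V = Visit, this toy log gives the counts 1 and 0 shown on the slide.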
Step (3) – Conditional Dependency Measure
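From the conditional directly-follows relation $a >_{C_{a,b},L} b$ to the conditional dependency measure $a \Rightarrow_{C,L} b$
Adapting the Heuristics Miner for conditional directly-follows:
$$
a \Rightarrow_{C,L} b =
\begin{cases}
\dfrac{|a >_{C_{a,b},L} b| - |b >_{C_{a,b},L} a|}{|a >_{C_{a,b},L} b| + |b >_{C_{a,b},L} a| + 1}, & \text{for } a \neq b \\[2ex]
\dfrac{|a >_{C_{a,a},L} a|}{|a >_{C_{a,a},L} a| + 1}, & \text{otherwise}
\end{cases}
$$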
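
As a worked illustration, the measure can be computed directly from the two conditional directly-follows counts. The function and parameter names are hypothetical, and the counts in the usage lines are invented.

```python
def conditional_dependency(n_ab, n_ba=0, same_activity=False):
    """Conditional dependency measure from conditional directly-follows counts.

    n_ab: number of times a is directly followed by b under the condition
    n_ba: number of times b is directly followed by a under the same condition
    same_activity: True for the self-loop case a == b
    """
    if same_activity:
        return n_ab / (n_ab + 1)
    return (n_ab - n_ba) / (n_ab + n_ba + 1)

# Invented counts: 9 observations of a before b, none of b before a.
print(conditional_dependency(9, 0))
# Invented self-loop: a directly follows itself 4 times.
print(conditional_dependency(4, same_activity=True))
```

The `+ 1` in the denominator keeps the value strictly below 1, so a relation observed only a few times scores lower than one observed often.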
Step (4) – Discover a Causal Net with Conditional Dep.
Input: conditional dependency measure $a \Rightarrow_{C,L} b$. Output: Causal Net.
Step 4.1: Build Unconditional Dependency Graph
• Observation Threshold (θobs)
• Dependency Threshold (θdep)
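
A sketch of how the two thresholds could filter the unconditional dependency graph. It assumes that θobs is a minimum absolute number of observations of a relation (it may instead be defined relative to the log size) and that the dependency value is the standard Heuristics Miner measure. The function name and all counts are invented for illustration.

```python
def dependency_graph(df_counts, theta_obs, theta_dep):
    """Keep relation (a, b) if it was observed at least theta_obs times
    (assumed absolute count) and its dependency value reaches theta_dep."""
    edges = set()
    for (a, b), n_ab in df_counts.items():
        n_ba = df_counts.get((b, a), 0)
        if a == b:
            dep = n_ab / (n_ab + 1)
        else:
            dep = (n_ab - n_ba) / (n_ab + n_ba + 1)
        if n_ab >= theta_obs and dep >= theta_dep:
            edges.add((a, b))
    return edges

# Invented directly-follows counts.
df_counts = {
    ("Register", "X-Ray"): 40,
    ("X-Ray", "Register"): 1,
    ("Register", "END"): 2,
}

edges = dependency_graph(df_counts, theta_obs=5, theta_dep=0.9)
print(edges)
```

In this example the infrequent relation (Register, END) is dropped. Relations of this kind are the ones Step 4.2 may later bring back when a data condition explains them.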
Step (4) – Discover a Causal Net with Conditional Dep. #2
Step 4.2: Expand with Conditional Dependencies
• Dependency Threshold (θdep)
• Condition Threshold (θcon) [e.g., AUROC, Kappa, F-Score, ..]
Figure labels: Ⓐ, Ⓑ, Ⓒ (conditional dependencies, based on $a \Rightarrow_{C,L} b$)
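
The expansion step can be sketched as follows, taking Cohen's kappa as the quality score compared against θcon. The function names, the candidate structure and all numbers are assumptions for illustration. Only the kappa formula itself is standard.

```python
def cohens_kappa(tp, fp, fn, tn):
    """Cohen's kappa of a binary classifier from its confusion matrix."""
    total = tp + fp + fn + tn
    observed = (tp + tn) / total
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total**2
    if expected == 1:  # only one class present: kappa is undefined, report 0
        return 0.0
    return (observed - expected) / (1 - expected)

def expand_with_conditional(edges, candidates, theta_dep, theta_con):
    """Add relation (a, b) if its conditional dependency reaches theta_dep
    and the quality of its condition reaches theta_con.

    candidates maps (a, b) -> (conditional dependency, condition quality).
    """
    added = {
        rel
        for rel, (dep, quality) in candidates.items()
        if rel not in edges and dep >= theta_dep and quality >= theta_con
    }
    return edges | added

# Invented confusion matrix for the condition of (Register, END).
kappa = cohens_kappa(tp=14, fp=1, fn=0, tn=985)

edges = {("Register", "X-Ray")}
candidates = {
    ("Register", "END"): (14 / 15, kappa),  # strong dependency, good condition
    ("X-Ray", "Check"): (0.95, 0.10),       # strong dependency, poor condition
}
expanded = expand_with_conditional(edges, candidates, theta_dep=0.9, theta_con=0.5)
print(sorted(expanded))
```

Kappa is used here rather than plain accuracy because the YES class is rare: a classifier that always answers NO would be highly accurate yet score a kappa of 0.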
Step (4) – Discover a Causal Net with Conditional Dep. #3
Step 4.3: Connect Tasks
[Figure: dependency graph; label: added]
Step (4) – Discover a Causal Net with Conditional Dep. #4
Step 4.4: Build Causal Net as in the Heuristic Miner
• Binding Threshold (θbin)
Evaluation – Can we rediscover conditions? (Synthetic)
• Noise level 0.05 means that in 5% of the traces one event is out of order
• Compared three methods:
  • Heuristic Miner without filtering (HMA)
  • Heuristic Miner with filtering (HMF)
  • Data-aware Heuristic Miner (DHM)
• Comparison based on graph edit distance (GED), since we want to evaluate at the dependency level
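
One way to inject noise of this kind is sketched below. It assumes that "one event out of order" can be modelled as swapping one randomly chosen pair of adjacent events. The function name and the seeding are hypothetical, and the generator used in the evaluation may differ.

```python
import random

def add_noise(log, noise_level, seed=0):
    """Return a copy of the log in which a fraction `noise_level` of the
    traces has one pair of adjacent events swapped (assumed noise model)."""
    rng = random.Random(seed)
    noisy = [list(trace) for trace in log]
    # Only traces with at least two events can have an event out of order.
    eligible = [i for i, trace in enumerate(noisy) if len(trace) >= 2]
    k = min(len(eligible), round(noise_level * len(noisy)))
    for i in rng.sample(eligible, k):
        j = rng.randrange(len(noisy[i]) - 1)
        noisy[i][j], noisy[i][j + 1] = noisy[i][j + 1], noisy[i][j]
    return noisy

# Toy log (invented data): 100 identical traces.
log = [["Register", "X-Ray", "Visit"] for _ in range(100)]
noisy = add_noise(log, noise_level=0.05)
changed = sum(1 for old, new in zip(log, noisy) if old != new)
print(changed)
```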
Evaluation – Does it work in practice?
Hospital Billing – Event Log (100,000 cases)
Conclusion & Future Work
Implemented in ProM 6.7: Data-aware Heuristic Miner
• Data-first approach:
  • Data attributes influence control-flow discovery
  • Conditional infrequent behavior
• Combines classification methods and heuristic process discovery
• Validated on large real-life event logs
• Extend the idea to more complex patterns of behavior
  • Long-term dependencies
  • Duplicate activities
• Suggest suitable parameter settings / hyperparameter optimization
Questions
@fmannhardt - f.mannhardt@tue.nl - fmannhardt.de


Editor's Notes

  1. PhD student in Eindhoven. Co-authors: Massimiliano, Hajo, and Wil
  2. This is the wrong way around. Adapt example!!