
Example python notebook & resources to get started using the pm4py library.


RubyNixx/Process_Mining_Python_Healthcare


Introduction

pm4py is a Python library that supports state-of-the-art process mining algorithms. It is open source (licensed under GPL) and intended for use in both academic and industry projects. pm4py is a product of the Fraunhofer Institute for Applied Information Technology.

This repo contains links to process mining training, links to documentation on installing and using the library, and an example Python notebook that you can run with the artificial healthcare data provided.

Presentation

I have presented on process mining to several communities in the East of England. The presentation is linked below:

Presentation

Recommended training in process mining

Coursera Process Mining

Process Mining in Python - YouTube Videos

pm4py tutorials - tutorial #1: What is Process Mining? This video covers what process mining is: examples, the definition of process mining, event logs, and the main tasks of process mining (process discovery, conformance checking, process enhancement).

pm4py tutorials - tutorial #2: Importing CSV Files This video covers an example process; how to read graphical representations of processes; example data in CSV format (can be downloaded here); importing CSV data in Python using the pandas library; reformatting the data into an event log using the format_dataframe function; and obtaining start and end activities using the get_start_activities and get_end_activities functions from the pm4py library.

pm4py tutorials - tutorial #3: Importing XES Files This video covers case-level attributes; the XES format (tools supporting XES, how XES looks, an example XES file); XES extensions, including the standard extensions (website) and extensions at log, trace, and event level; public XES datasets; globals (default values) in XES files; classifiers; meta information in XES files; reading XES files using the read_xes function from the pm4py library; and getting start and end activities.

pm4py tutorials - tutorial #4: Playing with Event Data; Lambda Functions

pm4py tutorials - tutorial #5: Playing with Event Data; Shipped Filters

pm4py tutorials - tutorial #6: Exporting Event Data

pm4py tutorials - tutorial #7: Process Discovery

pm4py tutorials - tutorial #8: Conformance Checking
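
To make the ideas in tutorials #2 and #7 more concrete, here is a minimal plain-Python sketch (standard library only, not pm4py itself) of what an event log is and what start activities, end activities, and directly-follows counts mean. The column names `case_id`, `activity`, and `timestamp` and the patient events are hypothetical, invented purely for illustration. In pm4py the corresponding steps are handled by functions such as format_dataframe, get_start_activities, and get_end_activities, as covered in the videos.

```python
import csv
import io
from collections import Counter, defaultdict

# Hypothetical event log: one row per event (illustrative data only).
CSV_TEXT = """case_id,activity,timestamp
P1,Triage,2024-01-01T09:00:00
P1,Assessment,2024-01-01T09:30:00
P1,Discharge,2024-01-01T11:00:00
P2,Triage,2024-01-01T09:10:00
P2,Assessment,2024-01-01T10:00:00
P2,Admission,2024-01-01T12:00:00
P3,Triage,2024-01-01T09:20:00
P3,Discharge,2024-01-01T09:50:00
"""


def build_traces(csv_text):
    """Group events by case and order each case by timestamp."""
    events_by_case = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        events_by_case[row["case_id"]].append((row["timestamp"], row["activity"]))
    # ISO 8601 timestamps in a single format sort correctly as strings.
    return {
        case: [activity for _, activity in sorted(events)]
        for case, events in events_by_case.items()
    }


traces = build_traces(CSV_TEXT)

# First and last activity of every case.
start_activities = Counter(trace[0] for trace in traces.values())
end_activities = Counter(trace[-1] for trace in traces.values())

# How often activity A is immediately followed by activity B within a case.
directly_follows = Counter(
    pair for trace in traces.values() for pair in zip(trace, trace[1:])
)

print("Start activities:", dict(start_activities))
print("End activities:", dict(end_activities))
print("Directly-follows counts:", dict(directly_follows))
```

The directly-follows counts are the raw material for the simplest process maps. Discovery algorithms in pm4py build on the same kind of information but are considerably more sophisticated than this sketch.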

Resources Available:

Official pm4py

pm4py official documentation

Process Intelligence website

pm4py installation support

YouTube Videos

Process Mining - Data Science in Action Book

dcr4py - supporting documentation

dcr4pydocs - extension of pm4py documentation

pm4py-dcr

Example Publications using pm4py

https://processintelligence.solutions/pm4py/publications

Structure of this repo

Python Notebook - open in Google Colab

Example artificial healthcare data

Additional example datasets for healthcare & process mining

Ambulance Data

Real-life log of a Dutch academic hospital, originally intended for use in the first Business Process Intelligence Contest (BPIC 2011). Uploaded to this repo.

Data.XML

Hospital_log.xes.gz
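
The hospital log is distributed in XES, an XML-based event log format in which a log contains traces (cases) and each trace contains events. The sketch below uses only the standard library to show that structure on a tiny hypothetical XES snippet; the case and activity names are invented and are not taken from the hospital log. For real files, the read_xes function from pm4py is the more practical route.

```python
import xml.etree.ElementTree as ET

# Tiny hypothetical XES document (illustrative, not from the hospital log).
XES_TEXT = """<log xes.version="1.0">
  <trace>
    <string key="concept:name" value="case-1"/>
    <event>
      <string key="concept:name" value="Registration"/>
      <date key="time:timestamp" value="2024-01-01T09:00:00+00:00"/>
    </event>
    <event>
      <string key="concept:name" value="Blood test"/>
      <date key="time:timestamp" value="2024-01-01T10:00:00+00:00"/>
    </event>
  </trace>
</log>
"""


def local_name(tag):
    """Drop an XML namespace prefix such as '{http://...}trace'."""
    return tag.rsplit("}", 1)[-1]


def read_traces(xes_text):
    """Return {case id: [activity names]} from an XES string."""
    traces = {}
    root = ET.fromstring(xes_text)
    for trace in root:
        if local_name(trace.tag) != "trace":
            continue  # skip extensions, globals, classifiers, etc.
        case_id, activities = None, []
        for child in trace:
            name = local_name(child.tag)
            if name == "string" and child.get("key") == "concept:name":
                # Trace-level concept:name is the case identifier.
                case_id = child.get("value")
            elif name == "event":
                # Event-level concept:name is the activity name.
                for attr in child:
                    if attr.get("key") == "concept:name":
                        activities.append(attr.get("value"))
        traces[case_id] = activities
    return traces


traces = read_traces(XES_TEXT)
print(traces)
```

A gzipped file could be read into a string first with `gzip.open(path, "rt", encoding="utf-8")`, assuming the file is UTF-8 encoded and small enough to hold in memory. This sketch ignores most of the XES standard (extensions, globals, classifiers, nested attributes).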

Requirements

pm4py depends on some other Python packages, with different levels of importance:

Essential requirements: numpy, pandas, deprecation, networkx

Normal requirements (installed by default with the pm4py package, important for mainstream usage): graphviz, intervaltree, lxml, matplotlib, pydotplus, pytz, scipy, stringdist, tqdm

Optional requirements (not installed by default): scikit-learn, pyemd, pyvis, jsonschema, polars, openai, pywin32, python-dateutil, requests, workalendar

Source: https://github.com/paul-cvp/pm4py-dcr
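
After installing pm4py (typically with `pip install pm4py`), it can be useful to see which of the optional packages are already present in your environment. The snippet below is a small standard-library sketch of one way to check. The helper name `missing_packages` and the selection of packages are illustrative choices, not part of pm4py.

```python
import importlib.util

# pip name -> import name for a few of the optional packages listed above.
# The two can differ (for example, scikit-learn is imported as sklearn).
OPTIONAL_MODULES = {
    "scikit-learn": "sklearn",
    "pyvis": "pyvis",
    "polars": "polars",
    "requests": "requests",
    "python-dateutil": "dateutil",
}


def missing_packages(modules):
    """Return the pip names whose import name cannot be found."""
    return sorted(
        pip_name
        for pip_name, import_name in modules.items()
        if importlib.util.find_spec(import_name) is None
    )


print("Missing optional packages:", missing_packages(OPTIONAL_MODULES))
```

Anything reported as missing only matters if you use the pm4py features that depend on it.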
