Process mining
Process mining is a data mining technique based on the analysis of log files. As a method of process management, process mining offers the possibility of analyzing business processes and identifying potential for optimization.
What is process mining?
Process mining comprises techniques in the area of business process management that serve to analyze business processes. These are data-supported methods of process analysis that focus on the evaluation of event logs – information stored in IT systems about individual process steps. Process mining applications apply special data mining algorithms to log files and transaction data to identify trends and patterns. The aim is to gain a better understanding of relevant business processes in order to make them more efficient.
Process mining types
In research, process mining is also referred to as “Automated Business Process Discovery” (ABPD) and describes techniques used to create, evaluate, and extend process models. The Process Mining Manifesto by the IEEE Task Force on Process Mining distinguishes between three types of process mining techniques:
- Discovery: Discovery process mining techniques are used to identify processes and create process models.
- Conformance: Conformance process mining techniques enable an assessment of the conformity of existing process models to current data.
- Extension/enhancement: Extension (also known as enhancement) process mining techniques are used to enhance existing process models.
The IEEE Task Force on Process Mining is a research group of the Institute of Electrical and Electronics Engineers (IEEE) at the Eindhoven University of Technology that aims to promote the development and understanding of process mining technologies through research and education.
How does process mining work?
Process mining combines data mining and computational intelligence (CI) techniques with process modeling and analysis. A process is described as a series of logically linked process steps that can be recorded as events.
The starting point for any process mining technique is event data in the form of log files that reproduce events in chronological order and can be assigned to both a process step and a process instance.
While the term “process” generally refers to a business transaction at planning level, a process instance is a concrete run-through of a process. Process instances can be identified individually by dimensions such as time and location or the people and devices involved. For example, processing an application for a life insurance policy with an insurance company would be a process. The processing of Mr. Doe’s insurance application, on the other hand, is an instance of the previously modeled standard process.
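The relationship between events, process steps, and process instances can be illustrated with a minimal sketch. It assumes a hypothetical event log in which every event carries a case ID (the process instance), an activity name (the process step), and a timestamp; the field layout, the case IDs, and the activity names are invented for illustration and do not come from any particular IT system.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical event log: (case_id, activity, ISO timestamp).
# The events arrive unordered, as they might when merged from several systems.
EVENTS = [
    ("case-2", "Receive application", "2024-03-02T10:00:00"),
    ("case-1", "Receive application", "2024-03-01T09:00:00"),
    ("case-1", "Check risk", "2024-03-01T11:30:00"),
    ("case-2", "Check risk", "2024-03-02T12:00:00"),
    ("case-1", "Issue policy", "2024-03-03T08:15:00"),
    ("case-2", "Reject application", "2024-03-04T16:45:00"),
]


def build_traces(events):
    """Group events by process instance and order them chronologically."""
    by_case = defaultdict(list)
    for case_id, activity, timestamp in events:
        by_case[case_id].append((datetime.fromisoformat(timestamp), activity))
    # Sorting the (timestamp, activity) pairs puts each case in time order.
    return {
        case_id: [activity for _, activity in sorted(rows)]
        for case_id, rows in by_case.items()
    }


traces = build_traces(EVENTS)
for case_id, trace in sorted(traces.items()):
    print(case_id, "->", " / ".join(trace))
```

Each resulting trace is one process instance, for example the handling of a single insurance application, expressed as an ordered list of process steps.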
The IEEE has defined a standard schema for each process mining type.
“Discovery” process mining techniques provide pattern recognition algorithms that enable models to be derived from existing event log data. They are based on information recorded as log files by IT systems.
The result of this type of process mining is usually a process model. In a manufacturing plant, for example, a model like this could be derived from time stamps that indicate when each product passes through a certain production step.
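As a rough illustration of discovery, the following sketch derives a directly-follows graph from a few traces. This is one of the simplest structures that can be extracted from an event log and is not the algorithm of any specific tool; established discovery algorithms are considerably more elaborate. The production steps and traces are hypothetical.

```python
from collections import Counter


def directly_follows(traces):
    """Count how often one activity is immediately followed by another."""
    counts = Counter()
    for trace in traces:
        # Pair every activity with its direct successor in the same trace.
        for current, following in zip(trace, trace[1:]):
            counts[(current, following)] += 1
    return counts


# Hypothetical traces: the ordered production steps of three products.
TRACES = [
    ["Cut", "Weld", "Paint", "Inspect"],
    ["Cut", "Weld", "Inspect"],
    ["Cut", "Weld", "Paint", "Inspect"],
]

dfg = directly_follows(TRACES)
for (current, following), count in sorted(dfg.items()):
    print(f"{current} -> {following}: {count}")
```

The edge counts show the dominant path (Cut, Weld, Paint, Inspect) as well as a less frequent variant in which painting is skipped.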
Common presentation techniques for process models are:
- BPMN (Business Process Model and Notation)
- EPC (Event-driven Process Chain)
- HIPO diagrams
- Communication structure analysis
- Petri net models
- SOM (Semantic Object Model)
- UML (Unified Modeling Language)
- BPEL (WS-Business Process Execution Language)
Process mining techniques are not necessarily limited to the creation, validation, and extension of process models. Social structures, organizational charts, business rules, or guidelines can also be displayed with process mining techniques.
“Conformance” process mining techniques are used to validate process models. If a process model already exists, it is advisable to compare it at regular intervals with new event log data to ensure that the model corresponds to how the real processes are being documented. Process mining techniques are used to compare the existing process model with current event data in order to determine differences between the model and reality. The resulting diagnosis of a conformance test like this enables conclusions to be drawn about the quality of the process model under investigation. A conformance test can be applied to both descriptive and normative process models.
Descriptive models describe processes as they actually run. Normative models provide information on how a process should run in the best case. These are also known as actual and target models.
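A conformance test can be sketched in a strongly simplified form as follows. The sketch assumes that the normative model is given as a set of allowed transitions plus a start activity and a set of permitted end activities. This representation, the activity names, and the helper `check_trace` are assumptions made for illustration; established conformance checking techniques such as token replay or alignments are more sophisticated.

```python
def check_trace(trace, allowed, start, ends):
    """Return a list of deviations of one trace from the model."""
    if not trace:
        return ["empty trace"]
    deviations = []
    if trace[0] != start:
        deviations.append(f"starts with {trace[0]!r} instead of {start!r}")
    for current, following in zip(trace, trace[1:]):
        if (current, following) not in allowed:
            deviations.append(f"step {current!r} -> {following!r} is not in the model")
    if trace[-1] not in ends:
        deviations.append(f"ends with {trace[-1]!r}")
    return deviations


# Hypothetical normative (target) model of an insurance application process.
ALLOWED = {
    ("Receive application", "Check risk"),
    ("Check risk", "Issue policy"),
    ("Check risk", "Reject application"),
}
START = "Receive application"
ENDS = {"Issue policy", "Reject application"}

# Hypothetical observed traces, one per process instance.
LOG = {
    "case-1": ["Receive application", "Check risk", "Issue policy"],
    "case-2": ["Receive application", "Issue policy"],
    "case-3": ["Receive application", "Check risk"],
}

report = {case: check_trace(trace, ALLOWED, START, ENDS) for case, trace in LOG.items()}
fitting = sum(1 for deviations in report.values() if not deviations)
print(f"{fitting} of {len(report)} cases conform to the model")
for case, deviations in report.items():
    for deviation in deviations:
        print(f"{case}: {deviation}")
```

The share of conforming cases gives a crude indication of how well model and reality match, while the individual deviations point to where they differ.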
The “extension” process mining techniques aim to extend and improve existing process models with the help of newly acquired information. The result is a new, extended process model.
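One conceivable form of extension is to annotate the transitions of an existing model with the average time that passes between two steps. The following sketch assumes hypothetical, already chronologically ordered events per case; the data and the function name `mean_transition_hours` are illustrative only.

```python
from collections import defaultdict
from datetime import datetime


def mean_transition_hours(cases):
    """Average time in hours between directly consecutive activities."""
    durations = defaultdict(list)
    for events in cases.values():
        for (act_a, time_a), (act_b, time_b) in zip(events, events[1:]):
            delta = datetime.fromisoformat(time_b) - datetime.fromisoformat(time_a)
            durations[(act_a, act_b)].append(delta.total_seconds() / 3600)
    return {edge: sum(hours) / len(hours) for edge, hours in durations.items()}


# Hypothetical cases: chronologically ordered (activity, ISO timestamp) pairs.
CASES = {
    "case-1": [
        ("Receive application", "2024-03-01T09:00:00"),
        ("Check risk", "2024-03-01T11:00:00"),
        ("Issue policy", "2024-03-02T11:00:00"),
    ],
    "case-2": [
        ("Receive application", "2024-03-04T10:00:00"),
        ("Check risk", "2024-03-04T14:00:00"),
        ("Issue policy", "2024-03-06T14:00:00"),
    ],
}

annotated = mean_transition_hours(CASES)
for (act_a, act_b), hours in annotated.items():
    print(f"{act_a} -> {act_b}: {hours:.1f} h on average")
```

In this invented data set, the step from risk check to policy issue takes far longer on average than the preceding step, which is the kind of additional information an extended model can carry.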
Analysis perspectives
Process mining covers four different levels of observation:
- Control flow perspective: Process mining from a control flow perspective aims to represent the sequence of activities within a process as a process model (e.g. as a Petri net, UML activity diagram, EPC, or BPMN model).
- Organizational perspective: Process mining from an organizational perspective highlights how people and IT systems relate to each other through participation in a business process. Activity profiles and roles are defined and compared with each other. The result of an analysis like this is a social network that visualizes the network of relationships.
- Case perspective: Process mining with a case perspective is used to analyze individual process instances. These are described and categorized as cases according to their properties. The classification takes place according to the data values recorded for the respective process instance – for example, according to which actors are involved.
- Time perspective: Process mining with a time perspective takes a close look at the absolute or relative point in time and the frequency of events. The prerequisite for this is that all event logs have a time stamp. Analyses of this kind allow simulations that enable conclusions to be drawn about patterns, trends, and obstacles in the process flow. For example, bottlenecks in the process chain can be identified.
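For the organizational perspective in the list above, a simple way to approximate the network of relationships is to count how often work passes from one person to another within a case. The sketch below assumes that each event records the resource that performed it; the names, activities, and the function `handover_network` are hypothetical.

```python
from collections import Counter


def handover_network(cases):
    """Count handovers of work between different resources within each case."""
    handovers = Counter()
    for events in cases.values():
        resources = [resource for _, resource in events]
        for giver, receiver in zip(resources, resources[1:]):
            if giver != receiver:  # consecutive steps by the same person are no handover
                handovers[(giver, receiver)] += 1
    return handovers


# Hypothetical cases: chronologically ordered (activity, resource) pairs.
CASES = {
    "A": [("Register", "Anna"), ("Review", "Ben"), ("Approve", "Cara")],
    "B": [("Register", "Anna"), ("Review", "Ben"), ("Review", "Ben"), ("Approve", "Cara")],
    "C": [("Register", "Anna"), ("Review", "Cara"), ("Approve", "Cara")],
}

network = handover_network(CASES)
for (giver, receiver), count in sorted(network.items()):
    print(f"{giver} -> {receiver}: {count}")
```

The weighted pairs can be read as the edges of a social network in which frequent handovers indicate close working relationships.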
In practice, process mining today is primarily used for control flow detection. The focus is on “discovery” process mining techniques with a control flow perspective, which make it possible to identify the chronological sequence of individual process steps and to compare it with the desired target state.
Phases of process mining
The IEEE has developed the L* life-cycle model as a reference model for applying process mining techniques. This divides the procedure for process mining projects into five phases:
Phase | Action | Description |
0 | Planning and classification | According to the L* life-cycle model, process mining projects start with a planning phase. The following questions are answered in this phase: - Which process is examined? - Which events are relevant? - Which indicators are relevant? - Which actors and IT systems are involved? - How can the required data be obtained? - What are the goals of the process mining project? |
1 | Extracting relevant data | The planning phase is followed by the extraction of relevant data from the available IT systems: - Log files - Models - Etc. |
2 | Creating the control flow model | In phase 2, a control flow model is derived from the collected data and related to the log files. |
3 | Creating an integrated model | If the data basis is sufficient, the model created in phase 2 will be extended by further perspectives in phase 3. |
4 | Operative support | Phase 4 includes the use of the model to support operational processes. |
Where is process mining used?
Process mining can be used wherever detailed information about the individual steps of relevant business processes is recorded and permanently stored with the help of IT systems. It can be used, for example, when companies:
- Process workflows via workflow management systems
- Make transactions using ERP systems
- Manage support requests via a ticket system
- Ensure the quality of medical treatment via clinical treatment pathways
This makes process mining suitable for use in retail and OEM, banking, development, sales, and the insurance industry to improve business processes such as ordering processes, manufacturing processes, or cash flows.
Workflow management and knowledge management are central fields of applications for process mining techniques. In addition, knowledge gained from process mining projects is used in the development of assistance systems.
Many companies use technologies such as databases, ERP systems, and knowledge management systems to safeguard factual knowledge. As a rule, however, process knowledge is not captured. This is where process mining comes in, with methods that make implicit process knowledge explicit.
Workflow management systems describe business processes at formal levels and automate the coordination and control of individual process steps. The system provides users with user interfaces for communication and for accessing data and programs. Workflow management is based on modeled workflows that allow the system to recognize events (such as inputting a document by e-mail) and automatically react to them. This automation is based on process models that can be created, checked, and extended using process mining methods.
Advantages of process mining technology
Process mining techniques can be used wherever individual steps of business-relevant processes are recorded as logs. Algorithms from the fields of data mining and computational intelligence now make it possible to analyze even complex event data and derive insights into how business processes can be made more efficient and secure.
The high degree of automation distinguishes process mining from classical techniques for creating process models. By extracting information on real events from day-to-day operations, process mining methods reproduce process sequences realistically. Compared to manual techniques, process mining is faster and more accurate. In addition, the growing volume of data can no longer be handled manually.
Another advantage of professional process mining applications is the extensive visualization options. Process models are presented to skilled workers and managers on interactive dashboards, which enable a dynamic view of process flows and sometimes provide additional analysis tools.
Challenges with implementation
Companies encounter difficulties when implementing process mining techniques if the data to be analyzed is inconsistent because of a heterogeneous IT infrastructure. If uniform descriptions for events are missing, the corresponding log files must first be preprocessed. This not only means additional effort, but may also distort the data, so that the result, strictly speaking, no longer represents the real data.
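The preprocessing step described above can be sketched as a mapping from system-specific event labels to uniform activity names. The mapping table, labels, and function names below are hypothetical. The sketch deliberately sets unmapped events aside instead of guessing a name for them, in order to limit the risk of distorting the data.

```python
# Hypothetical mapping from system-specific labels to uniform activity names.
CANONICAL = {
    "order created": "Create order",
    "create_order": "Create order",
    "so-create": "Create order",
    "goods issued": "Ship goods",
    "ship_goods": "Ship goods",
}


def normalize(label):
    """Return the uniform activity name, or None if the label is unknown."""
    return CANONICAL.get(label.strip().lower())


def harmonize(events):
    """Split events into harmonized events and events that need manual review."""
    clean, unmapped = [], []
    for case_id, label, timestamp in events:
        activity = normalize(label)
        if activity is None:
            unmapped.append((case_id, label, timestamp))
        else:
            clean.append((case_id, activity, timestamp))
    return clean, unmapped


# Hypothetical raw events from two systems with different naming conventions.
RAW_EVENTS = [
    ("case-1", "Order Created ", "2024-03-01T09:00:00"),
    ("case-2", "SO-CREATE", "2024-03-01T09:05:00"),
    ("case-1", "ship_goods", "2024-03-02T14:00:00"),
    ("case-2", "pack", "2024-03-02T15:00:00"),
]

clean, unmapped = harmonize(RAW_EVENTS)
print(f"{len(clean)} events harmonized, {len(unmapped)} left for manual review")
```

Keeping the unmapped events visible makes the additional effort explicit and leaves the decision about ambiguous labels to someone who knows the source systems.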
In addition, companies are confronted with technical hurdles during implementation. Process mining is only effective if the respective applications have access to all relevant IT systems. This requires appropriate interfaces, properly configured connected systems, and close cooperation with the provider of the process mining application.
The effort required for implementation also increases when companies combine standard applications for managing business processes with tools that they have developed themselves in order to adapt them to individual needs.