Digitization has enabled companies to record their operations in event logs, in which every activity of a business process is stored as data with attributes such as a timestamp and an event name. These logs are useful because they provide insight into operations and can be used to build process models that optimize the business process. However, the optimization is only as good as the stored data, and event logs with missing events lead to poor data and poor analysis models.
In a collaborative study, researchers from Pusan National University, South Korea, including Dr Sunghyun Sim and Prof. Hyerim Bae, together with Prof. Ling Liu from the Georgia Institute of Technology, developed a method that can restore missing data in an event log. The study, published in IEEE Transactions on Services Computing, relies on imputation methods that exploit correlations between the available data to recover missing information. “Since data is collected from multiple angles in many information systems, there is a relationship between the data collected. From this point, our study suggested a method of restoring missing event values using the relationship between entities in the event log, which can overcome human or system error,” explains Dr Sim.
In event logs, events have attributes that relate to other events in “single event” or “multiple event” relationships. In the first case, each attribute of an event corresponds to a unique attribute in another event. Based on this relationship, the researchers developed a systematic event imputation (SEI) method that restores a missing value by simply referring to the available value to which it relates.
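To illustrate what restoring a value through a one-to-one relationship can look like, here is a minimal Python sketch. It is not the authors' implementation: the attribute names `resource` and `activity` and the function `impute_single` are hypothetical, and the sketch assumes that a missing value may be filled only when the related attribute points to exactly one observed value.

```python
from collections import defaultdict

def impute_single(events, key="resource", target="activity"):
    """Fill a missing `target` when its `key` maps to exactly one observed value."""
    # Collect every target value observed for each key value.
    seen = defaultdict(set)
    for e in events:
        if e.get(key) is not None and e.get(target) is not None:
            seen[e[key]].add(e[target])

    repaired = []
    for e in events:
        e = dict(e)  # copy so the input log is left untouched
        candidates = seen.get(e.get(key), set())
        # Only a unique match is treated as a safe one-to-one relationship.
        if e.get(target) is None and len(candidates) == 1:
            e[target] = next(iter(candidates))
        repaired.append(e)
    return repaired

# Hypothetical toy log; None marks a missing activity name.
log = [
    {"case": 1, "resource": "scanner", "activity": "Scan"},
    {"case": 1, "resource": "packer", "activity": "Pack"},
    {"case": 2, "resource": "scanner", "activity": None},
    {"case": 2, "resource": "clerk", "activity": None},
]
repaired = impute_single(log)
```

In this toy log the missing activity for `scanner` can be filled because `scanner` was only ever seen with one activity, while the `clerk` event stays missing because nothing relates to it.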
However, in the latter case, where the attributes have multiple matches, simple attribute matching is not possible. For such situations, the researchers developed a multiple event imputation (MEI) method in which the missing events are first estimated and then used to create event sequences, or chains of events. These sequences can be compared with an event log that has no missing data to restore the missing event attributes.
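The idea of comparing candidate sequences with complete data can be sketched in a few lines of Python. This is a simplified stand-in, not the method from the paper: the function `impute_by_sequence` and the activity names are hypothetical, and the sketch simply picks the most frequent complete sequence that agrees with every observed event.

```python
from collections import Counter

def impute_by_sequence(trace, complete_traces):
    """Fill None slots using the most frequent complete trace that fits."""
    counts = Counter(tuple(t) for t in complete_traces)
    best, best_count = None, 0
    for ref, n in counts.items():
        if len(ref) != len(trace):
            continue
        # A reference fits if it agrees with every observed (non-None) event.
        fits = all(a is None or a == b for a, b in zip(trace, ref))
        if fits and n > best_count:
            best, best_count = ref, n
    # If nothing fits, return the trace unchanged rather than guess.
    return list(best) if best is not None else list(trace)

# Hypothetical complete event sequences.
complete = [
    ["Receive", "Check", "Approve", "Ship"],
    ["Receive", "Check", "Approve", "Ship"],
    ["Receive", "Check", "Reject", "Close"],
]
filled = impute_by_sequence(["Receive", None, "Approve", "Ship"], complete)
ambiguous = impute_by_sequence(["Receive", "Check", None, None], complete)
unknown = impute_by_sequence(["Order", None], complete)
```

When several complete sequences fit, as in the `ambiguous` call, the sketch falls back on frequency, which is one plausible tie-breaking choice among many.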
These imputation methods were applied together in a bagging recurrent event imputation (BREI) algorithm, which uses bootstrap sampling and recurrent event imputation (REI) to repair the event log. In tests on real event logs, the researchers found that their algorithm improved restoration accuracy by 10 to 30% compared with existing restoration algorithms. In addition, it could restore the data to almost 90% accuracy even when more than half of the data was missing.
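Bagging in general means repeating an estimate on bootstrap samples (drawn with replacement) and combining the results by voting. The Python sketch below shows that generic pattern applied to a toy imputer. It is an assumption-laden illustration, not the published algorithm: `most_frequent_match`, `bagged_impute`, the parameters `n_rounds` and `seed`, and the activity names are all hypothetical.

```python
import random
from collections import Counter

def most_frequent_match(trace, reference):
    """Toy base imputer: most frequent reference trace that fits the observed events."""
    counts = Counter(tuple(t) for t in reference if len(t) == len(trace))
    matches = [(n, ref) for ref, n in counts.items()
               if all(a is None or a == b for a, b in zip(trace, ref))]
    return max(matches, key=lambda m: m[0])[1] if matches else None

def bagged_impute(trace, reference, n_rounds=25, seed=0):
    """Impute on bootstrap samples of `reference` and take a majority vote per slot."""
    rng = random.Random(seed)
    votes = [Counter() for _ in trace]
    for _ in range(n_rounds):
        # Bootstrap sample: same size as the reference set, drawn with replacement.
        sample = rng.choices(reference, k=len(reference))
        guess = most_frequent_match(trace, sample)
        if guess is None:
            continue
        for i, activity in enumerate(guess):
            votes[i][activity] += 1
    # Observed events are kept; missing ones take the most voted value, if any.
    return [a if a is not None else (v.most_common(1)[0][0] if v else None)
            for a, v in zip(trace, votes)]

# Hypothetical complete sequences used as reference data.
reference = [["Receive", "Check", "Ship"]] * 4 + [["Receive", "Pack", "Ship"]]
result = bagged_impute(["Receive", None, "Ship"], reference)
```

Voting across resampled reference sets makes the fill less sensitive to any single unusual sequence, which is the usual motivation for bagging.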
In addition to optimizing business processes, the researchers are optimistic that such an algorithm can be extended to other applications that rely on data quality. One promising avenue lies in improving the data supplied to AI systems, and this method has the potential to accelerate the development of AI technologies. “It is possible to improve the performance of artificial intelligence by improving the quality of data in its learning process. The algorithm will also help prevent model malfunctions by improving the quality of the data it collects in a real-time environment,” says Prof. Bae.
The high precision of the new algorithm, as well as its versatility, will surely ensure its widespread application in industry in the near future.
Authors: Sunghyun Sim (1), Hyerim Bae (1), and Ling Liu (2)
Original article title: Bagging Recurrent Event Imputation for Repair of Imperfect Event Log With Missing Categorical Events
Journal: IEEE Transactions on Services Computing
(1) Pusan National University, South Korea
(2) Georgia Institute of Technology, United States
About Pusan National University
Pusan National University, located in Busan, South Korea, was founded in 1946 and is now the No. 1 national university of South Korea in research and educational competency. In addition to its main campus, the university has smaller campuses in Yangsan, Miryang, and Ami. The university prides itself on the principles of truth, freedom, and service and has approximately 30,000 students, 1,200 professors, and 750 faculty members. It is made up of 14 colleges (schools) and one independent division, with 103 departments in total.
About the authors
Dr Sunghyun Sim obtained his master's degree and Ph.D. in Industrial Engineering from Pusan National University, South Korea, in 2021. His research interests include automatic process exploration, quality improvement of event logs, and process optimization based on deep learning methods.
Professor Hyerim Bae obtained his Ph.D. in Industrial Engineering from Seoul National University, South Korea. Since 2004, he has been a professor in the Department of Industrial Engineering at Pusan National University, South Korea. His interests include AI-powered smart ports, cloud computing, process mining for smart factories, and big data analytics for operational intelligence.
Website Address: http://baelab.pusan.ac.kr/
Method of research: Computer simulation/modeling