LNBIP 317

Jan Mendling
Haralambos Mouratidis (Eds.)

Information Systems
in the Big Data Era
CAiSE Forum 2018
Tallinn, Estonia, June 11–15, 2018
Proceedings



Lecture Notes
in Business Information Processing
Series Editors
Wil M. P. van der Aalst
RWTH Aachen University, Aachen, Germany
John Mylopoulos
University of Trento, Trento, Italy
Michael Rosemann
Queensland University of Technology, Brisbane, QLD, Australia
Michael J. Shaw
University of Illinois, Urbana-Champaign, IL, USA
Clemens Szyperski
Microsoft Research, Redmond, WA, USA

317





Editors
Jan Mendling
Wirtschaftsuniversität Wien
Vienna
Austria

Haralambos Mouratidis
University of Brighton
Brighton
UK

ISSN 1865-1348
ISSN 1865-1356 (electronic)
Lecture Notes in Business Information Processing
ISBN 978-3-319-92900-2
ISBN 978-3-319-92901-9 (eBook)

Library of Congress Control Number: 2018944409
© Springer International Publishing AG, part of Springer Nature 2018
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the
material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation,
broadcasting, reproduction on microfilms or in any other physical way, and transmission or information
storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now
known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this book are
believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors
give a warranty, express or implied, with respect to the material contained herein or for any errors or
omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in
published maps and institutional affiliations.
Printed on acid-free paper
This Springer imprint is published by the registered company Springer International Publishing AG
part of Springer Nature
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland


Preface

This volume contains the papers presented at the CAiSE Forum 2018, held during June 11–15, 2018, in Tallinn, Estonia. CAiSE is a well-established, highly visible conference series on information systems engineering. The CAiSE Forum is a place within the CAiSE conference for presenting and discussing new ideas and tools related to information systems engineering. Intended to serve as an interactive platform, the forum aims at the presentation of emerging new topics and controversial positions, as well as the demonstration of innovative systems, tools, and applications. The forum sessions at the CAiSE conference facilitate interaction, discussion, and the exchange of ideas among presenters and participants.

Contributions to the CAiSE 2018 Forum were welcome to address any of the conference topics, in particular the theme of this year's conference: "Information Systems in the Big Data Era." We invited two types of submissions:

– Visionary papers presenting innovative research projects, which are still at a relatively early stage and do not necessarily include a full-scale validation. Visionary papers are presented as posters in the forum.
– Demo papers describing innovative tools and prototypes that implement the results of research efforts. The tools and prototypes are presented as demos in the forum.

The management of paper submission and reviews was supported by the EasyChair conference system. There were 29 submissions, 13 of which were nominated by the Program Committee (PC) chairs of the CAiSE main conference. Each submission was reviewed by three PC members. The committee decided to accept 22 papers.

As chairs of the CAiSE Forum, we would like to express our gratitude to the PC members for their thorough evaluations of the submitted forum papers. We also thank the local organization team for their great support. Finally, we wish to thank all authors who submitted papers to CAiSE and to the CAiSE Forum.

April 2018

Jan Mendling
Haralambos Mouratidis


Organization

General Chairs
Marlon Dumas, University of Tartu, Estonia
Andreas Opdahl, University of Bergen, Norway

Organization Chair
Fabrizio Maggi, University of Tartu, Estonia

Program Committee Chairs
Jan Mendling, Wirtschaftsuniversität Wien, Austria
Haralambos Mouratidis, University of Brighton, UK

Program Committee
Raian Ali, Bournemouth University, UK
Said Assar, Institut Mines-Telecom, France
Saimir Bala, WU Vienna, Austria
Jan Claes, Ghent University, Belgium
Dirk Fahland, Eindhoven University of Technology, The Netherlands
Luciano García-Bañuelos, University of Tartu, Estonia
Haruhiko Kaiya, Shinshu University, Japan
Christos Kalloniatis, University of the Aegean, Greece
Dimka Karastoyanova, University of Groningen, The Netherlands
Henrik Leopold, Vrije Universiteit Amsterdam, The Netherlands
Daniel Lübke, Leibniz Universität Hannover, Germany
Massimo Mecella, Sapienza University of Rome, Italy
Selmin Nurcan, Université Paris 1 Panthéon-Sorbonne, France
Cesare Pautasso, University of Lugano, Switzerland
Michalis Pavlidis, University of Brighton, UK
Luise Pufahl, Hasso Plattner Institute, University of Potsdam, Germany
David Rosado, University of Castilla-La Mancha, Spain
Sigrid Schefer-Wenzl, FH Campus Vienna, Austria
Stefan Schönig, University of Bayreuth, Germany
Arik Senderovich, Technion, Israel
Arnon Sturm, Ben-Gurion University, Israel
Lucinéia Heloisa Thom, Federal University of Rio Grande do Sul, Brazil
Matthias Weidlich, Humboldt-Universität zu Berlin, Germany
Moe Wynn, Queensland University of Technology, Australia


Contents

Enabling Process Variants and Versions in Distributed Object-Aware
Process Management Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
Kevin Andrews, Sebastian Steinau, and Manfred Reichert

Achieving Service Accountability Through Blockchain and Digital Identity . . . 16
Fabrizio Angiulli, Fabio Fassetti, Angelo Furfaro, Antonio Piccolo,
and Domenico Saccà

CrowdCorrect: A Curation Pipeline for Social Data Cleansing and Curation . . . 24
Amin Beheshti, Kushal Vaghani, Boualem Benatallah,
and Alireza Tabebordbar

Service Discovery and Composition in Smart Cities . . . . . . . . . . . . . . . . . . 39
Nizar Ben-Sassi, Xuan-Thuy Dang, Johannes Fähndrich,
Orhan-Can Görür, Christian Kuster, and Fikret Sivrikaya

CJM-ab: Abstracting Customer Journey Maps Using Process Mining . . . . . . 49
Gaël Bernard and Periklis Andritsos

PRESISTANT: Data Pre-processing Assistant . . . . . . . . . . . . . . . . . . . . . . 57
Besim Bilalli, Alberto Abelló, Tomàs Aluja-Banet, Rana Faisal Munir,
and Robert Wrembel

Systematic Support for Full Knowledge Management Lifecycle
by Advanced Semantic Annotation Across Information
System Boundaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
Vishwajeet Pattanaik, Alex Norta, Michael Felderer, and Dirk Draheim

Evaluation of Microservice Architectures: A Metric
and Tool-Based Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
Thomas Engel, Melanie Langermeier, Bernhard Bauer,
and Alexander Hofmann

KeyPro - A Decision Support System for Discovering Important
Business Processes in Information Systems . . . . . . . . . . . . . . . . . . . . . . . . 90
Christian Fleig, Dominik Augenstein, and Alexander Maedche

Tell Me What's My Business - Development of a Business Model
Mining Software: Visionary Paper. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
Christian Fleig, Dominik Augenstein, and Alexander Maedche

Checking Business Process Correctness in Apromore. . . . . . . . . . . . . . . . . 114
Fabrizio Fornari, Marcello La Rosa, Andrea Polini, Barbara Re,
and Francesco Tiezzi

Aligning Goal and Decision Modeling . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
Renata Guizzardi, Anna Perini, and Angelo Susi

Model-Driven Test Case Migration: The Test Case Reengineering
Horseshoe Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
Ivan Jovanovikj, Gregor Engels, Anthony Anjorin, and Stefan Sauer

MICROLYZE: A Framework for Recovering the Software Architecture
in Microservice-Based Environments . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
Martin Kleehaus, Ömer Uludağ, Patrick Schäfer, and Florian Matthes

Towards Reliable Predictive Process Monitoring . . . . . . . . . . . . . . . . . . . . 163
Christopher Klinkmüller, Nick R. T. P. van Beest, and Ingo Weber

Extracting Object-Centric Event Logs to Support Process Mining
on Databases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182
Guangming Li, Eduardo González López de Murillas,
Renata Medeiros de Carvalho, and Wil M. P. van der Aalst

Q-Rapids Tool Prototype: Supporting Decision-Makers in Managing
Quality in Rapid Software Development . . . . . . . . . . . . . . . . . . . . . . . . . 200
Lidia López, Silverio Martínez-Fernández, Cristina Gómez,
Michał Choraś, Rafał Kozik, Liliana Guzmán, Anna Maria Vollmer,
Xavier Franch, and Andreas Jedlitschka

A NMF-Based Learning of Topics and Clusters for IT Maintenance
Tickets Aided by Heuristic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209
Suman Roy, Vijay Varma Malladi, Abhishek Gangwar,
and Rajaprabu Dharmaraj

From Security-by-Design to the Identification of Security-Critical
Deviations in Process Executions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218
Mattia Salnitri, Mahdi Alizadeh, Daniele Giovanella, Nicola Zannone,
and Paolo Giorgini

Workflow Support in Wearable Production Information Systems. . . . . . . . . 235
Stefan Schönig, Ana Paula Aires, Andreas Ermer, and Stefan Jablonski

Predictive Process Monitoring in Apromore . . . . . . . . . . . . . . . . . . . . . . . 244
Ilya Verenich, Stanislav Mõškovski, Simon Raboczi, Marlon Dumas,
Marcello La Rosa, and Fabrizio Maria Maggi

Modelling Realistic User Behaviour in Information Systems Simulations
as Fuzzing Aspects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 254
Tom Wallis and Tim Storer

Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269


Enabling Process Variants and Versions
in Distributed Object-Aware Process
Management Systems
Kevin Andrews(B), Sebastian Steinau, and Manfred Reichert
Institute of Databases and Information Systems,
Ulm University, Ulm, Germany
{kevin.andrews,sebastian.steinau,manfred.reichert}@uni-ulm.de

Abstract. Business process variants are common in many enterprises
and properly managing them is indispensable. Some process management suites already offer features to tackle the challenges of creating and
updating multiple variants of a process. As opposed to the widespread
activity-centric process modeling paradigm, however, there is little to no
support for process variants in other process support paradigms, such
as the recently proposed artifact-centric or object-aware process support
paradigm. This paper presents concepts for supporting process variants
in the object-aware process management paradigm. We offer insights into
the distributed object-aware process management framework PHILharmonicFlows as well as the concepts it provides for implementing variants
and versioning support based on log propagation and log replay. Finally,
we examine the challenges that arise from the support of process variants and show how we solved these, thereby enabling future research
into related fundamental aspects to further raise the maturity level of
data-centric process support paradigms.

Keywords: Business processes · Process variants · Object-aware processes

1 Introduction

Business process models are a popular method for companies to document their
processes and the collaboration of the involved humans and IT resources. However, through globalization and the shift towards offering a growing number of products in a large number of countries, many companies face a sharp increase in the complexity of their business processes [4,5,11]. For example, automotive manufacturers that, years ago, only had to ensure that they had stable
processes for building a few car models, now have to adhere to many regulations
for different countries, the increasing customization wishes of customers, and far
faster development and time-to-market cycles. With the addition of Industry
4.0 demands, such as process automation and data-driven manufacturing, it is
© Springer International Publishing AG, part of Springer Nature 2018
J. Mendling and H. Mouratidis (Eds.): CAiSE Forum 2018, LNBIP 317, pp. 1–15, 2018.
becoming more important for companies to establish maintainable business processes that can be updated and rolled out across the entire enterprise as fast as
possible.
However, the increase of possible process variants poses a challenge, as each
additional constraint derived from regulations or product specifics either leads to
larger process models or more process models showing different variants of otherwise identical processes. Neither scenario is ideal, which is why there has
been research over the past years into creating more maintainable process variants [5,7,11,13]. As previous research on process variant support has focused
on activity-centric processes, our contribution provides a novel approach supporting process variants in object-aware processes. Similar to case handling or
artifact-centric processes, object-aware processes are inherently more flexible
than activity-centric ones, as they are less strictly structured, allowing for more
freedom during process execution [1,3,6,10,12]. This allows object-aware processes to support processes that are very dynamic by nature and challenging to

formulate in a sequence of activities in a traditional process model.
In addition to the conceptual challenges process variants pose in a centralized
process server scenario, we examine how our approach contributes to managing
the challenges of modeling and executing process variants on an architecture that
can support scenarios with high scalability requirements. Finally, we explain how
our approach can be used to enable updatable versioned process models, which
will be essential for supporting schema evolution and ad hoc changes in object-aware processes.
To help understand the notions presented in this contribution, we provide
the fundamentals of object-aware process management and process variants in
Sect. 2. Section 3 examines the requirements identified for process variants. In
Sect. 4 we present the concept for variants in object-aware processes as the main
contribution of this paper. In Sect. 5 we evaluate whether our approach meets the
identified requirements and discuss threats to validity as well as persisting challenges. Section 6 discusses related work, whereas Sect. 7 provides a summary and
outlook on our plans to provide support for migrating running process instances
to newer process model versions in object-aware processes.

2 Fundamentals

2.1 Object-Aware Process Management

PHILharmonicFlows, the object-aware process management framework we are
using as a test-bed for the concepts presented in this paper, has been under
development for many years at Ulm University [2,8,9,16,17]. This section gives
an overview on the PHILharmonicFlows concepts necessary to understand the
remainder of the paper. PHILharmonicFlows takes the basic idea of a data-driven
and data-centric process management system and improves it by introducing
the concept of objects. One such object exists for each business object present in
a real-world business process. As can be seen in Fig. 1, a PHILharmonicFlows object consists of data, in the form of attributes, and a state-based process model describing the object lifecycle.

Fig. 1. Example object including lifecycle process (attributes: Amount: Integer, Date: Date, Approved: Bool, Comment: String; states: Initialized, Decision Pending, Approved, Rejected)

The attributes of the Transfer object (cf. Fig. 1) include Amount, Date, Approved, and Comment. The lifecycle process, in turn, describes the different
states (Initialized, Decision Pending, Approved, and Rejected), an instance of a
Transfer object may have during process execution. Each state contains one or
more steps, each referencing exactly one of the object attributes, thereby forcing
that attribute to be written at run-time. The steps are connected by transitions,
allowing them to be arranged in a sequence. The state of the object changes
when all steps in a state are completed. Finally, alternative paths are supported
in the form of decision steps, an example of which is the Approved decision step.
As PHILharmonicFlows is data-driven, the lifecycle process for the Transfer
object can be understood as follows: The initial state of a Transfer object is Initialized. Once a Customer has entered data for the Amount and Date attributes,
the state changes to Decision Pending, which allows an Account Manager to
input data for Approved. Based on the value for Approved, the state of the
Transfer object changes to Approved or Rejected. Obviously, this fine-grained
approach to modeling a business process increases complexity when compared
to the activity-centric paradigm, where the minimum granularity of a user action
is one atomic activity or task, instead of an individual data attribute.
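The lifecycle semantics described above can be sketched as a small state machine. This is an illustrative sketch only: all class and method names are hypothetical, modeled on the paper's Fig. 1 example rather than the actual PHILharmonicFlows implementation.

```python
# Hypothetical sketch of an object lifecycle: states contain steps, each step
# referencing exactly one attribute that must be written before the state
# completes. Names (Transfer, Initialized, ...) follow the paper's example.

LIFECYCLE = {
    "Initialized": ["Amount", "Date"],  # steps: attributes to be written
    "Decision Pending": ["Approved"],   # decision step
    "Approved": [],                     # terminal states have no steps here
    "Rejected": [],
}

class Transfer:
    def __init__(self):
        self.attributes = {}
        self.state = "Initialized"

    def write(self, attribute, value):
        self.attributes[attribute] = value
        self._advance()

    def _advance(self):
        # The state changes once all steps in the current state are completed.
        pending = [a for a in LIFECYCLE[self.state] if a not in self.attributes]
        if pending:
            return
        if self.state == "Initialized":
            self.state = "Decision Pending"
        elif self.state == "Decision Pending":
            # Decision step: the written value selects the successor state.
            self.state = "Approved" if self.attributes["Approved"] else "Rejected"

t = Transfer()
t.write("Amount", 27000)
t.write("Date", "2017-06-03")
assert t.state == "Decision Pending"
t.write("Approved", True)
assert t.state == "Approved"
```

Note how the granularity of user interaction here is a single attribute write, not an atomic activity, which is exactly the difference to the activity-centric paradigm discussed above.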
However, as an advantage, the object-aware approach allows for automated form generation at run-time. This is facilitated by the lifecycle process of an object, which dictates the attributes to be filled out before the object may switch to the next state, resulting in a personalized and dynamically created form. An example of such a form, derived from the lifecycle process in Fig. 1, is shown in Fig. 2.

Fig. 2. Example form (Bank Transfer – Decision; fields: Amount, Date, Approved*, Comment; Submit button)
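The form-generation idea can be approximated in a few lines: the fields of a form are exactly the attributes the current state still requires. The encoding of the lifecycle and all names below are hypothetical, based on the Fig. 1 example, not the actual PHILharmonicFlows API.

```python
# Illustrative sketch: a form is derived at run-time from the lifecycle by
# listing the attributes the current state still requires.

LIFECYCLE = {
    "Initialized": ["Amount", "Date"],
    "Decision Pending": ["Approved"],
}

def generate_form(state, written_attributes):
    """Return the input fields for a dynamically created form."""
    return [a for a in LIFECYCLE[state] if a not in written_attributes]

assert generate_form("Initialized", {}) == ["Amount", "Date"]
assert generate_form("Initialized", {"Amount": 27000}) == ["Date"]
assert generate_form("Decision Pending", {}) == ["Approved"]
```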



Note that a single object and its resulting form only constitutes one part of
a complete PHILharmonicFlows process. To allow for complex executable business processes, many different objects and users may have to be involved [17].
It is noteworthy that users are simply special objects in the object-aware process management concept. The entire set of objects (including those representing
users) present in a PHILharmonicFlows process is denoted as the data model,
an example of which can be seen in Fig. 3a. At run-time, each of the objects
can be instantiated to so-called object instances, each of which represents a concrete instance of an object. The lifecycle processes present in the various object
instances are executable concurrently at run-time, thereby improving performance. Figure 3b shows a simplified example of an instantiated data model at
run-time.


Fig. 3. Data model: (a) design-time; (b) run-time

In addition to the objects, the data model contains information about the
relations existing between them. A relation constitutes a logical association
between two objects, e.g., a relation between a Transfer and a Checking Account.
Such a relation can be instantiated at run-time between two concrete object
instances of a Transfer and a Checking Account, thereby associating the two
object instances with each other. The resulting meta information, i.e., the information that the Transfer in question belongs to a certain Checking Account, can
be used to coordinate the processing of the two objects with each other.
Finally, complex object coordination, which becomes necessary as most processes consist of numerous interacting business objects, is possible in PHILharmonicFlows as well [17]. As objects publicly advertise their state information,
the current state of an object can be utilized as an abstraction to coordinate
with other objects corresponding to the same business process through a set of
constraints, defined in a separate coordination process. As an example, consider a
constraint stating that a Transfer may only change its state to Approved if there
are less than 4 other Transfers already in the Approved state for one specific
Checking Account.
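The coordination constraint from this example can be sketched as a simple check over relation instances and advertised states. The data structures and the function name are illustrative assumptions, not the paper's coordination-process notation.

```python
# Sketch of the coordination constraint above: a Transfer may only switch to
# "Approved" if fewer than 4 other Transfers related to the same Checking
# Account are already in that state.

def may_approve(transfer_id, relations, states, limit=4):
    """relations maps transfer id -> checking account id;
    states maps transfer id -> currently advertised lifecycle state."""
    account = relations[transfer_id]
    approved_others = [t for t, acc in relations.items()
                       if acc == account and t != transfer_id
                       and states[t] == "Approved"]
    return len(approved_others) < limit

relations = {f"t{i}": "acc1" for i in range(5)}
states = {"t0": "Decision Pending", "t1": "Approved", "t2": "Approved",
          "t3": "Approved", "t4": "Approved"}
assert may_approve("t0", relations, states) is False  # 4 already approved
states["t4"] = "Rejected"
assert may_approve("t0", relations, states) is True   # only 3 approved now
```

Note that the check only needs the relation and the publicly advertised states, which is what makes this abstraction suitable for coordinating otherwise independent objects.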




The various components of PHILharmonicFlows, i.e., objects, relations, and
coordination processes, are implemented as microservices, turning PHILharmonicFlows into a fully distributed process management system for object-aware
processes. For each object instance, relation instance, or coordination process
instance one microservice is present at run-time. Each microservice only holds
data representing the attributes of its object. Furthermore, the microservice only
executes the lifecycle process of the object it is assigned to. The only information
visible outside the individual microservices is the current “state” of the object,
which, in turn, is used by the microservice representing the coordination process
to properly coordinate the objects’ interactions with each other.
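The encapsulation described in this paragraph can be illustrated as follows. The class and method names are hypothetical; the paper does not specify the real microservice interface.

```python
# Sketch: each object instance runs as its own microservice that keeps
# attributes and lifecycle execution private and only advertises its state.

class ObjectMicroservice:
    def __init__(self, object_type):
        self._object_type = object_type
        self._attributes = {}        # private: never leaves this service
        self._state = "Initialized"  # the only externally visible information

    def handle_internal_event(self, attribute, value, next_state=None):
        # Only this microservice executes the lifecycle of its object.
        self._attributes[attribute] = value
        if next_state is not None:
            self._state = next_state

    def advertised_state(self):
        # Consumed by the coordination-process microservice.
        return {"type": self._object_type, "state": self._state}

svc = ObjectMicroservice("Transfer")
svc.handle_internal_event("Amount", 27000, next_state="Decision Pending")
assert svc.advertised_state() == {"type": "Transfer", "state": "Decision Pending"}
```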
2.2 Process Variants


Simply speaking, a process variant is one specific path through the activities of
a process model, i.e., if there are three distinct paths to completing a business
goal, three process variants exist. As an example, take the process of transferring
money from one bank account to another, for which there might be three alternate execution paths. For instance, if the amount to be transferred is greater
than $10,000, a manager must approve the transfer, if the amount is less than
$10,000, a mere clerk may approve said transfer. Finally, if the amount is less
than $1,000, no one needs to approve the transfer. This simple decision on who
has to approve the transfer implicitly creates three variants of the process.
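The three implicit variants can be expressed as a single decision function. The thresholds are taken from the text; the function name is illustrative, and the behavior at exactly $1,000 and $10,000 is not specified in the text, so the boundaries below are assumptions.

```python
# The bank transfer example's approval routing, which implicitly creates
# three process variants. Boundary handling at exactly $1,000 / $10,000 is
# an assumption; the text only gives "less than" / "greater than".

def required_approver(amount):
    if amount >= 10000:
        return "manager"
    if amount >= 1000:
        return "clerk"
    return None  # transfers under $1,000 need no approval

assert required_approver(500) is None
assert required_approver(5000) == "clerk"
assert required_approver(25000) == "manager"
```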
As previously stated, modeling such variants is mostly done by incorporating them into one process model as alternate paths via choices (cf. Fig. 4a). As
demonstrated in the bank transfer example, this is often the only viable option,
because the amount to be transferred is not known when the process starts.
Clearly, for more complex processes, each additional choice increases the complexity of the process model, making it harder to maintain and update.
To demonstrate this, we extend our previous example of a bank transfer with
the addition of country-specific legal requirements for money transfers between
accounts. Assuming the bank operates in three countries, A, B, and C, country
A imposes the additional legal requirement of having to report transfers over
$20,000 to a government agency. Country B, on the other hand, requires the reporting of all transfers to a government agency, while country C has no

such requirements. The resulting process model would now have to reflect all these additional constraints, making it substantially larger (cf. Fig. 4b).

Fig. 4. Bank transfer process: (a) base; (b) including extra requirements
Obviously, this new process model contains more information than necessary
for its execution in one specific country. Luckily, if the information necessary
to choose the correct process variant is available before starting the execution
of the process, a different approach can be chosen: defining the various process variants as separate process models and choosing the right variant before
starting the process execution. In our example this can be done as the country is known before the transfer process is started. Therefore, it is possible to
create three country-specific process model variants, for countries A, B, and C,
respectively. Consequently, each process model variant would only contain the
additional constraints for that country not present in the base process model.
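This design-time variant selection can be sketched as a base model plus per-country additions, with the variant chosen before execution starts (the country is known beforehand). The model contents and names below are illustrative assumptions.

```python
# Sketch: country-specific variants extend a shared base model with only the
# constraints that country adds; the variant is chosen before instantiation.

BASE_MODEL = ["determine transfer amount", "approval decision"]

VARIANT_DELTAS = {
    "A": ["report transfers over $20,000 to government agency"],
    "B": ["report all transfers to government agency"],
    "C": [],  # no additional legal requirements
}

def instantiate_variant(country):
    # The full model is the base plus the country-specific additions.
    return BASE_MODEL + VARIANT_DELTAS[country]

assert instantiate_variant("C") == BASE_MODEL
assert "report all transfers to government agency" in instantiate_variant("B")
```

The maintenance problem the next paragraph describes follows directly from this structure: a change to BASE_MODEL affects all variants, whereas without the split it would have to be repeated in three separate models.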
This reduces the complexity of the process model from the perspective of
each country, but introduces the problem of having three different models to
maintain and update. Specifically, changes that must be made to those parts
of the model common to all variants, in our example the decision on who must
approve the transfer, cause redundant work as there are now multiple process
models that need updating. Minimizing these additional time-consuming workloads, while enabling clean variant-specific process models, is a challenge that
many researchers and process management suites aim to solve [5,7,11,14,15].

3 Requirements

The requirements for supporting process variants in object-aware processes are
derived from the requirements for supporting process variants in activity-centric
processes, identified in our previous case studies and a literature review [5,7,11].
Requirement 1 (Maintainability). Enabling maintainability of process variants is paramount to variant management. Without advanced techniques, such
as propagating changes made to a base process to its variants, optimizing a
process would require changes in all individual variants, which is error-prone
and time-consuming. To enable the features that improve maintainability, the
base process and its variants must be structured as such (cf. Req. 2). Furthermore, process modelers must be informed if changes they apply to a base process
introduce errors in the variants derived from them (cf. Req. 3).

Requirement 2 (Hierarchical structuring). As stated in Req. 1, a hierarchical
structure becomes necessary between variants. Ideally, to further reduce workloads when optimizing and updating processes, the process variants of both lifecycle and coordination processes can be decomposed into further sub-variants.
This allows those parts of the process that are shared among variants, but which
are not part of the base process, to be maintained in an intermediate model.
Requirement 3 (Error resolution). As there could be countless variants, the
system should report errors to process modelers automatically, as manual checking of all variants could be time-consuming. Additionally, to ease error resolution, the concept should allow for the generation of resolution suggestions. To be able
to detect which variants would be adversely affected by a change to a base model,
automatically verifiable correctness criteria are needed, leading to Req. 4.
Requirement 4 (Correctness). The correctness of a process model must be
verifiable at both design- and run-time. This includes checking correctness before
a pending change is applied in order to judge its effects. Additionally, the effects
of a change on process model variants must be determinable to support Req. 5.
Requirement 5 (Scalability). Finally, most companies that need process variant management solutions maintain many process variants and often act globally.
Therefore, the solutions for the above requirements should be scalable, both in
terms of computational complexity as well as in terms of the manpower necessary
to apply them to a large number of variants. Additionally, as the PHILharmonicFlows architecture is fully distributed, we have to ensure that the developed
algorithms work correctly in a distributed computing environment.

4 Variants and Versioning of Process Models

This section introduces our concepts for creating and managing different

deployed versions and variants of data models as well as contained objects in an
object-aware process management system. We start with the deployment concept, as the variant concept relies on many of the core notions presented here.
4.1 Versioning and Deployment Using Logs

Versioning of process models is a trivial requirement for any process management system. Specifically, one must be able to separate the model currently
being edited by process modelers from the one used to instantiate new process
instances. This ensures that new process instances can always be spawned from
a stable version of the model that no one is currently working on. This process is
referred to as deployment. In the current PHILharmonicFlows implementation,
deployment is achieved by copying an editable data model, thereby creating a
deployed data model. The deployed data model, in turn, can then be instantiated
and executed while process modelers keep updating the editable data model.
As it is necessary to ensure that already running process instances always
have a corresponding deployed model, the deployed models have to be versioned
upon deployment. This means that the deployment operation for an editable data
model labeled "M" automatically creates a deployed data model M_T38 (data model M, timestamp 38). Timestamp T38 denotes the logical timestamp of the version to be deployed, derived from the number of modeling actions that have been applied in total. At a later point, when the process modelers have updated the editable data model M and deploy the new version, the deployment operation gets the logical timestamp for the deployment, i.e., T42, and creates the deployed data model M_T42 (data model M, timestamp 42). As M_T38 and
M_T42 are copies of the editable model M at the moment (i.e., timestamp) of deployment, they can be instantiated and executed concurrently at run-time. In particular, process instances already created from M_T38 should not be in conflict with newer instances created from M_T42.
The editable data model M, the two deployed models M_T38 and M_T42, as well
as some instantiated models can be viewed in Fig. 5. The representation of each
model in Fig. 5 contains the set of objects present in the model. For example,
{X, Y } denotes a model containing the two objects X and Y. Furthermore, the
editable data model has a list of all modeling actions applied to it. For example,
L13:[+X] represents the 13th modeling action, which added an object labeled
“X”. The modeling actions we use as examples throughout this section allow
adding and removing entire objects. However, the concepts can be applied to
any of the many different operations supported in PHILharmonicFlows, e.g.,
adding attributes or changing the coordination process.
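The log-based mechanism described in this section can be sketched as follows: modeling actions are logged, a deployed copy is created by replaying the logs on an empty model, and the delta between two deployed versions is the difference of their logs. The log encoding ("+X" adds object X) is an illustrative assumption based on the notation in Fig. 5.

```python
# Sketch of log replay for deployment. Only add/remove of whole objects is
# modeled, matching the example operations used in this section.

def replay(log_entries):
    """Recreate a data model (here: its set of objects) from logs."""
    model = set()
    for entry in log_entries:
        op, obj = entry[0], entry[1:]
        if op == "+":
            model.add(obj)
        elif op == "-":
            model.discard(obj)
    return model

logs_T38 = ["+X", "+Y"]                          # logs(M_T38)
logs_T42 = ["+X", "+Y", "+A", "+B", "+C", "+D"]  # logs(M_T42)

assert replay(logs_T38) == {"X", "Y"}
assert replay(logs_T42) == {"X", "Y", "A", "B", "C", "D"}

# Version-migration delta M_T38 Δ M_T42: the entries present only in the
# newer logs, i.e., logs(M_T42) \ logs(M_T38).
delta = [e for e in logs_T42 if e not in logs_T38]
assert delta == ["+A", "+B", "+C", "+D"]
```

Because every modeling action is deterministic, replaying the same logs always reproduces the same model, which is what makes both deployment-by-replay and the log set difference well defined.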

Fig. 5. Deployment example

To reiterate, versioned deployment is a basic requirement for any process
management system and constitutes a feature that most systems offer. However,
we wanted to develop a concept that would, as a topic for future research, allow
for the migration of already running processes to newer versions. Additionally,
as we identified the need for process variants (cf. Sect. 1), we decided to tackle
all three issues, i.e., versioned deployment, variants, and version migration of
running processes, in one approach.
Deploying a data model by simply cloning it and incrementing its version number is not sufficient for enabling version migration. Version migration
requires knowledge about the changes that must be applied to instances
running on an older version to migrate them to the newest version, denoted as
M_T38 Δ M_T42 in our example. In order to obtain this information elegantly, we log
all actions a process modeler completes when creating the editable model until
the first deployment. We denote these log entries belonging to M as logs(M). To
create the deployed model, we replay the individual log entries l ∈ logs(M) to
a new, empty data model. As all modeling actions are deterministic, this recreates the data model M step by step, thereby creating the deployed copy, which
we denote as M_T38. Additionally, as replaying the logs in logs(M) causes each
modeling action to be repeated, the deployment process causes the deployed data


Enabling Process Variants and Versions

9

model M_T38 to create its own set of logs, logs(M_T38). Finally, as data model M
remains editable after a deployment, additional log entries may be created and
added to logs(M). Each consecutive deployment causes the creation of another
deployed data model and set of logs, e.g., M_T42 and logs(M_T42).
As the already deployed version M_T38 has its own set of logs, i.e.,
logs(M_T38), it is trivial to determine M_T38 Δ M_T42, as it is simply the set difference, i.e., logs(M_T42) \ logs(M_T38). As previously stated, M_T38 Δ M_T42 can be
used later on to enable version migration, as it describes the necessary changes to
instances of M_T38 when migrating them to M_T42. An example of how we envision
this concept functioning is given in Fig. 5 for the migration of the instantiated
model M_T38_2 to the deployed model M_T42.
To enable this logging-based copying and deployment of a data model in
a distributed computing environment, the log entries have to be fitted with
additional meta-information. As an example, consider the simple log entry l42,
which was created after a user added a new object type to the editable data
model:

l42 = { dataModelId   6123823241189
        action        AddObjectType
        params        [name : “Object D”]
        timestamp     42 }

Clearly, the log entry contains all information necessary for its replay: the id of
the data model or object the logged action was applied to, the type of action that
was logged, and the parameters of this action. However, due to the distributed
microservice architecture PHILharmonicFlows is built upon, a logical timestamp
for each log entry is required as well. This timestamp must be unique and sortable
across all microservices that represent parts of one editable data model, i.e., all
objects, relations, and coordination processes. This allows PHILharmonicFlows
to gather the log entries from the individual microservices, order them in exactly
the original sequence, and replay them to newly created microservices, thereby
creating a deployed copy of the editable data model.
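The logging-based deployment described above can be sketched as follows. This is a minimal illustration, not the actual PHILharmonicFlows API: the `LogEntry` fields mirror the example entry l42 above, but all class and function names, and the reduction of modeling actions to adding/removing object types, are our own simplifying assumptions.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class LogEntry:
    data_model_id: int  # id of the data model or object the action was applied to
    action: str         # type of action, e.g. "AddObjectType"
    params: dict        # action parameters, e.g. {"name": "Object D"}
    timestamp: int      # logical timestamp, unique and sortable across microservices

@dataclass
class DataModel:
    objects: set = field(default_factory=set)
    logs: list = field(default_factory=list)

    def apply(self, entry: LogEntry) -> None:
        # Deterministically apply one modeling action and log it.
        if entry.action == "AddObjectType":
            self.objects.add(entry.params["name"])
        elif entry.action == "RemoveObjectType":
            self.objects.discard(entry.params["name"])
        self.logs.append(entry)

def deploy(editable: DataModel) -> DataModel:
    # Gather the log entries, order them by logical timestamp (i.e., in exactly
    # the original sequence), and replay them to a new, empty data model.
    # Determinism guarantees the result is an exact copy with its own logs.
    deployed = DataModel()
    for entry in sorted(editable.logs, key=lambda e: e.timestamp):
        deployed.apply(entry)
    return deployed
```

Note that ordering by the logical timestamp is what makes the sketch safe even when entries arrive from different microservices out of order.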
Coincidentally, it must be noted that the example log entry l42 is the one
created before the deployment of M_T42. By labeling the deployment based on the
timestamp of the last log entry, the modeling actions that need to
be applied to an instance of M_T38 to update it to M_T42 can be immediately
identified as the sequence l39, l40, l41, l42 ⊂ logs(M_T42), as evidenced by the
example in Fig. 5.
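The computation of the delta between two deployed versions can be sketched as follows, representing each log entry simply as a (timestamp, action) pair. This is an illustrative reduction of the mechanism, not the engine's actual data structures.

```python
def version_delta(logs_old, logs_new):
    # The changes needed to migrate instances of the older version to the
    # newer one: the set difference logs(M_T42) \ logs(M_T38), ordered by
    # logical timestamp so the actions can be replayed in sequence.
    old = set(logs_old)
    return sorted(entry for entry in logs_new if entry not in old)

logs_t38 = [(13, "+X"), (24, "+Y")]
logs_t42 = logs_t38 + [(39, "+A"), (40, "+B"), (41, "+C"), (42, "+D")]
delta = version_delta(logs_t38, logs_t42)
# delta == [(39, "+A"), (40, "+B"), (41, "+C"), (42, "+D")]
```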
4.2 Variants

As previously stated, we propose reusing the logging concept presented in
Sect. 4.1 for creating and updating variants of data models and contained objects.
In Sect. 4.1, we introduced two example data models, M_T38 and M_T42, which
were deployed at different points in time from the same editable data model.
Additionally, we showed that the differences between these two deployed models
are the actions applied by four log entries, namely l39, l40, l41, l42. Expanding




upon this idea, we developed a concept for creating variants of data models using
log entries for each modeling action, which we present in this section.
An example of our concept, in which two variants, V1 and V2, are created
from the editable data model M, is shown in Fig. 6. The editable base model
M has a sequence of modeling actions that were applied to it and logged in
l1, ..., l42. Furthermore, the two variants of M were created at different points
in time, i.e., at different logical timestamps. Variant V1 was created at timestamp
T39, i.e., the last action applied before creating the variant had been logged in
l39.
As we reuse the deployment concept for variants, the actual creation of a data
model variant is, at first, merely the creation of an identical copy of the editable
data model in question. For variant V1, this means creating an empty editable
data model and replaying the actions logged in the log entries l1, ..., l39 ⊆
logs(M), ending with the creation of object A. As replaying the logs to the new
editable data model M_V1 creates another set of logs, logs(M_V1), any further
modeling actions that process modelers apply only to M_V1 can be logged in
logs(M_V1) instead of logs(M). This allows us to add or remove elements without
altering the base model or other variants. An example is given by the removal
of object A in l40 ∈ logs(M_V1), an action not present in logs(M) or logs(M_V2).
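Variant creation thus amounts to replaying a prefix of the base model's log. A minimal sketch, again with log entries reduced to (timestamp, action) pairs (an assumption made for illustration):

```python
def create_variant(base_logs, creation_timestamp):
    # A variant starts as an identical copy of the editable base model,
    # created by replaying only the log entries up to the logical timestamp
    # at which the variant was created.
    return [entry for entry in base_logs if entry[0] <= creation_timestamp]

base_logs = [(13, "+X"), (24, "+Y"), (39, "+A"), (40, "+B"), (41, "+C"), (42, "+D")]
v1_logs = create_variant(base_logs, 39)  # replay l_1, ..., l_39
# v1_logs == [(13, "+X"), (24, "+Y"), (39, "+A")], ending with the creation of A
```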

Deployed Model
M_V2_T43

Instantiated Model
M_V1_T43_2


{X,Y,A,B,C,D,F}

{X,Y,E,B,C}

Fig. 6. Variant example

Up until this point, a variant is nothing more than a copy that can be edited
independently of the base model. However, in order to provide a solution for
maintaining and updating process variants (cf. Req. 1), the concept must also
support the automated propagation of changes made to the base model to each
variant. To this end, we introduce a hierarchical relationship between editable
models, as required by Req. 2, denoted by ⊑. In the example (cf. Fig. 6), both
variants are beneath data model M in the variant hierarchy, i.e., M_V1 ⊑ M and
M_V2 ⊑ M. For possible sub-variants, such as M_V2V1, the hierarchical relationship is transitive, i.e., M_V2V1 ⊑ M_V2 and M_V2 ⊑ M imply M_V2V1 ⊑ M.
To fulfill Req. 1 when modeling a variant, e.g., M_V1 ⊑ M, we utilize the
hierarchical relationship to ensure that all modeling actions applied to M are



propagated to M_V1, always ensuring that logs(M_V1) ⊆ logs(M) holds. This is
done by replaying new log entries added to logs(M) to M_V1, which, in turn,
creates new log entries in logs(M_V1). As an example, Fig. 7 shows the replaying
of one such log entry, l40 ∈ logs(M), to M_V1, which creates log entry l42 ∈ logs(M_V1).
[Fig. 7. Log propagation example — (1) initial situation: M with logs …, L39:[+A] ({X,Y,A}) and variant V1 with log …, L41:[+E] ({X,Y,E}); (2) add object B: M receives L40:[+B] ({X,Y,A,B}); (3) log propagation: L40 is propagated to variant V1; (4) log replay: the replay creates L42:[+B] in the variant's log ({X,Y,E,B}).]


In the implementation, we realized this by making the propagation of the log
entry for a specific modeling action part of the modeling action itself, thereby
ensuring that updating the base model, including all variants, is atomic. However, it must be noted that, while the action being logged in both editable
data models is the same, the logs have different timestamps. This is due to the
fact that M_V1 has the variant-specific log entries l40, l41 ⊂ logs(M_V1), so
l40 ∈ logs(M) is appended to the end of logs(M_V1) as l42 ∈ logs(M_V1). As evidenced by Fig. 6, variants created this way are fully compatible with the existing
deployment and instantiation concept. In particular, from the viewpoint of the
deployment concept, a variant is simply a normal editable model with its own
set of logs that can be copied and replayed to a deployed model.
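The re-timestamping that happens during propagation can be sketched as follows (illustrative only; entries are reduced to (timestamp, action) pairs, and the function name is our own):

```python
def propagate(base_entry, variant_logs):
    # Replaying a base-model log entry to a variant appends it to the
    # variant's own log under the variant's next local timestamp, so the
    # same action carries different timestamps in the two logs (cf. Fig. 7,
    # where l_40 in logs(M) becomes l_42 in logs(M_V1)).
    next_timestamp = variant_logs[-1][0] + 1 if variant_logs else 1
    variant_logs.append((next_timestamp, base_entry[1]))

variant_logs = [(40, "-A"), (41, "+E")]  # variant-specific entries
propagate((40, "+B"), variant_logs)      # l_40:[+B] arrives from logs(M)
# variant_logs == [(40, "-A"), (41, "+E"), (42, "+B")]
```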

5 Evaluation

The presented concept covers all requirements (cf. Sect. 3), as we will show in
the following. The main requirement, and the goal of this research, was to develop
a concept for maintainable process variants of object-aware data models and
contained objects (cf. Req. 1). We solved this challenge by introducing
a strict hierarchical structure between the base data model, variants, and even
sub-variants (cf. Req. 2). Furthermore, our solution ensures that the variants
are always updated with changes made to their base models. As presented in
Sect. 4.2, this is done by managing logs with logical timestamps and replaying
them to variants that are lower in the hierarchy. This ensures that any modeling
action applied to a variant always takes the current base model into consideration. However, this strict propagation of all modeling actions to all variants
poses additional challenges. For instance, expanding on the situation presented
in Fig. 6, a modeling action that changes part of the lifecycle process of object
A could be logged as l43 ∈ logs(M), causing the log to be replayed to variant



V1. However, V1 does not have an object A anymore, as is evidenced by the set
of objects present, i.e., {X, Y, E, B, C, D}. Clearly, this is due to the fact that
l40 ∈ logs(M_V1) removed object A from that variant.
As it is intentional for variant V1 not to comprise object A, this particular
case poses no further challenge, as changes to an object not existing in a variant
can be ignored by that variant. However, there are other scenarios to be considered, one of which is the application of modeling actions in a base model that
have already been applied to a variant, such as introducing a transition between
two steps in the lifecycle process of an object. If this transition already exists in
a variant, the log replay to that variant will create an identical transition. As
two transitions between the same steps are prohibited, this action would break
the lifecycle process model of the variant and, in consequence, the entire object
and data model it belongs to. Fig. 8 showcases this issue with a simplified example
of the bank transfer object next to a variant with an additional transition between
Amount and Date. The problem arises when trying to add a transition between
Amount and Date to the base lifecycle process model, as the corresponding log
entry gets propagated to the variant, causing a clash.
[Fig. 8. Conflicting actions example — the simplified lifecycle process of the bank transfer object (base), with steps Initialized, Date, Amount, and Immediate and end states Approved and Rejected, shown next to a variant that already contains an additional transition between Amount and Date.]

To address this and similar issues, which pose a threat to the validity of our concept, we utilize the existing data model verification algorithm implemented in
the PHILharmonicFlows engine [16]. In particular, we leverage our distributed,
microservice-based architecture to create clones of the parts of a variant that will
be affected by a log entry awaiting application. In the example from Fig. 8, we
can create a clone of the microservice serving the object, apply the log entry describing
the transition between Amount and Date, and run our verification algorithm on
the clone. This detects any problem caused in a variant by a modeling
action and generates an error message with resolution options, such as deleting
the preexisting transition in the variant (cf. Reqs. 3 and 4). In case the action causes no
problem, we apply it to the microservice of the original object.
How the user interface handles the error message (e.g., offering users a decision on how to fix the problem) is out of the scope of this paper, but it has been
implemented and tested as a proof of concept for some of the possible errors.
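The clone-and-verify step can be sketched as follows. This is a deliberate simplification: the real engine clones microservices and runs its full verification algorithm [16], whereas here a lifecycle model is a plain dictionary and verification only checks for duplicated transitions — all names and the data layout are assumptions.

```python
import copy

def verify(lifecycle):
    # Stand-in for the model verification algorithm: reject lifecycle models
    # in which two transitions connect the same pair of steps.
    return len(lifecycle["transitions"]) == len(set(lifecycle["transitions"]))

def apply_propagated_action(lifecycle, transition):
    # Apply the propagated modeling action to a clone first; only if the
    # clone still verifies is the original lifecycle model modified.
    clone = copy.deepcopy(lifecycle)
    clone["transitions"].append(transition)
    if not verify(clone):
        return False  # instead, report an error with resolution options
    lifecycle["transitions"].append(transition)
    return True

variant = {"transitions": [("Amount", "Date")]}   # transition already exists
ok = apply_propagated_action(variant, ("Amount", "Date"))
# ok == False, and the variant's lifecycle model remains unbroken
```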
All other concepts presented in this paper have been implemented and tested
in the PHILharmonicFlows prototype. We have headless test cases simulating a
multitude of users completing randomized modeling actions in parallel, as well as


around 50,000 lines of unit testing code, covering various aspects of the engine,
including the model verification, which, as we just demonstrated, is central to
ensuring that all model variants are correct. Furthermore, the basic mechanism
used to support variants, i.e., the creation of data model copies using log entries,
has been an integral part of the engine for over a year. As we rely heavily on it
for deploying and instantiating versioned data models (cf. Sect. 4.1), it is utilized
in every test case and, therefore, thoroughly tested.
Finally, through the use of the microservice-based architecture, we can ensure
that time-consuming operations, such as verifying models for compatibility with
actions caused by log propagation, are highly scalable and cannot cause bottlenecks [2]. This would hardly be an issue at design-time either way, but we are
ensuring that this basis for our future research into run-time version migration, or
even migration between variants, is highly scalable (cf. Req. 5). Furthermore, the
preliminary benchmark results for the distributed PHILharmonicFlows engine,
running on a cluster of 8 servers with 64 CPUs in total, are promising. As copying
data models using logs is central to the concepts presented in this paper, we
benchmarked the procedure for various data model sizes (5, 7, and 14 objects)
and quadrupling increments of concurrently created copies of each data model.
The results in Table 1 show very good scalability for the creation of copies, as
creating 64 copies only takes about twice as long as creating one copy. The varying performance between models of only slightly different size can be attributed to the
fact that some of the more complex modeling operations are not yet optimized.
Table 1. Results

Example process   Objects   1 copy    4 copies   16 copies   64 copies
Recruitment          5       880 ms    900 ms    1120 ms     2290 ms
Intralogistics       7      2680 ms   2830 ms    4010 ms     9750 ms
Insurance           14      4180 ms   4470 ms    7260 ms    12170 ms

6 Related Work

Related work deals with the modeling, updating, and management of process variants
in the activity-centric process modeling paradigm [5,7,11,13,15], as well as the
management of large numbers of process versions [4].
The Provop approach [5] allows for the flexible configuration of large
process variant collections. The activity-centric variants are derived from base
processes by applying change operations. For each variant, only the set of change operations constituting the delta to the base process is saved, reducing the
amount of redundant information. Provop further includes variant selection techniques that allow the correct variant of a process to be instantiated at run-time,
based on the context in which the process is running.
An approach allowing for the configuration of process models using questionnaires is presented in [13]. It builds upon concepts presented in [15], namely the


14

K. Andrews et al.

introduction of variation points in process models and modeling languages (e.g.
C-EPC). A process model can be altered at these variation points before being
instantiated, based on values gathered by the questionnaire. This capability has
been integrated into the APROMORE toolset [14].
An approach enabling flexible business processes based on the combination of
process models and business rules is presented in [7]. It allows generating ad-hoc
process variants at run-time by ensuring that the variants adhere to the business
rules, while taking the actual case data into consideration as well.
Focusing on the actual procedure of modeling process variants, [11] offers
a decomposition-based modeling method for entire families of process variants.
The procedure manages the trade-off between modeling multiple variants of a
business process in one model and modeling them separately.
A versioning model for business processes that supports advanced capabilities
is presented in [4]. The process model is decomposed into block fragments and
persisted in a tree data structure, which allows versioned updates and branching
on parts of the tree, utilizing the tree structure to determine affected parts of
the process model. Unaffected parts of the tree can be shared across branches.

Our literature review has shown that there is interest in process variants and
in developing concepts for managing their complexity. However, existing research
focuses on the activity-centric process management paradigm, making the current lack of process variant support in other paradigms, such as artifact- or
data-centric ones, even more evident. With the presented research, we close this gap.

7 Summary and Outlook

This paper focuses on the design-time aspects of managing data model variants
in a distributed object-aware process management system. Firstly, we presented
a mechanism for copying editable design-time data models to deployed run-time data models. This feature, by itself, could have been conceptualized and
implemented in a number of different ways, but we strove to find a solution
that also meets the requirements for managing process variants. Secondly, we
expanded upon the concepts created for versioned deployment to allow creating,
updating, and maintaining data model variants. Finally, we showed how these
concepts can be combined with our existing model verification tools to support
additional requirements, such as error messages for affected variants.
There are still open issues, some of which have been solved for activity-centric process models but likely require entirely new solutions for non-activity-centric processes. Specifically, one capability we intend to realize for object-aware
processes is the ability to take the context in which a process will run into account
when selecting a variant.
When developing the presented concepts, we kept future research into truly
flexible process execution in mind. Specifically, we are currently in the process
of implementing a prototypical extension to the current PHILharmonicFlows
engine that will allow us to upgrade instantiated data models to newer versions.
This kind of version migration will allow us to fully support schema evolution.



Additionally, we are expanding the error prevention techniques presented in
our evaluation to allow for the verification of data model correctness for already
instantiated data models at run-time. We plan to utilize this feature to enable ad-hoc
changes to instantiated objects and data models, such as adding an attribute
to one individual object instance without changing the deployed data model.
Acknowledgments. This work is part of the ZAFH Intralogistik, funded by the European Regional Development Fund and the Ministry of Science, Research and the Arts
of Baden-Württemberg, Germany (F.No. 32-7545.24-17/3/1).

References

1. van der Aalst, W.M.P., Weske, M., Grünbauer, D.: Case handling: a new paradigm for business process support. Data Knowl. Eng. 53(2), 129–162 (2005)
2. Andrews, K., Steinau, S., Reichert, M.: Towards hyperscale process management. In: Proceedings of the EMISA, pp. 148–152 (2017)
3. Cohn, D., Hull, R.: Business artifacts: a data-centric approach to modeling business operations and processes. IEEE TCDE 32(3), 3–9 (2009)
4. Ekanayake, C.C., La Rosa, M., ter Hofstede, A.H.M., Fauvet, M.-C.: Fragment-based version management for repositories of business process models. In: Meersman, R., et al. (eds.) OTM 2011. LNCS, vol. 7044, pp. 20–37. Springer, Heidelberg (2011)
5. Hallerbach, A., Bauer, T., Reichert, M.: Capturing variability in business process models: the Provop approach. JSEP 22(6–7), 519–546 (2010)
6. Hull, R.: Introducing the guard-stage-milestone approach for specifying business entity lifecycles. In: Proceedings of the WS-FM, pp. 1–24 (2010)
7. Kumar, A., Yao, W.: Design and management of flexible process variants using templates and rules. Comput. Ind. 63(2), 112–130 (2012)
8. Künzle, V.: Object-aware process management. Ph.D. thesis, Ulm University (2013)
9. Künzle, V., Reichert, M.: PHILharmonicFlows: towards a framework for object-aware process management. JSME 23(4), 205–244 (2011)
10. Marin, M., Hull, R., Vaculín, R.: Data-centric BPM and the emerging case management standard: a short survey. In: Proceedings of the BPM, pp. 24–30 (2012)
11. Milani, F., Dumas, M., Ahmed, N., Matulevičius, R.: Modelling families of business process variants: a decomposition driven method. Inf. Syst. 56, 55–72 (2016)
12. Reichert, M., Weber, B.: Enabling Flexibility in Process-Aware Information Systems: Challenges, Methods, Technologies. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-30409-5
13. La Rosa, M., Dumas, M., ter Hofstede, A.H.M., Mendling, J.: Configurable multi-perspective business process models. Inf. Syst. 36(2), 313–340 (2011)
14. La Rosa, M., Reijers, H.A., van der Aalst, W.M.P., Dijkman, R.M., Mendling, J., Dumas, M., García-Bañuelos, L.: APROMORE: an advanced process model repository. Expert Syst. Appl. 38(6), 7029–7040 (2011)
15. Rosemann, M., van der Aalst, W.M.P.: A configurable reference modelling language. Inf. Syst. 32(1), 1–23 (2007)
16. Steinau, S., Andrews, K., Reichert, M.: A modeling tool for PHILharmonicFlows objects and lifecycle processes. In: Proceedings of the BPMD (2017)
17. Steinau, S., Künzle, V., Andrews, K., Reichert, M.: Coordinating business processes using semantic relationships. In: Proceedings of the CBI, pp. 143–152 (2017)

