Pragmatic interoperability and translation of industrial engineering problems into modelling and simulation solutions

Pragmatic interoperability between platforms and service-oriented architectures exists whenever there is an agreement on the roles of participants and components as well as minimum standards for good practice. In this work, it is argued that open platforms require pragmatic interoperability, complementing syntactic interoperability (e.g., through common file formats), and semantic interoperability by ontologies that provide agreed definitions for entities and relations. For consistent data management and the provision of services in computational molecular engineering, community-governed agreements on pragmatics need to be established and formalized. For this purpose, if ontology-based semantic interoperability is already present, the same ontologies can be used. This is illustrated here by the role of the “translator” and procedural definitions for the process of “translation” in materials modelling, which refers to mapping industrial research and development problems onto solutions by modelling and simulation. For the associated roles and processes, substantial previous standardization efforts have been carried out by the European Materials Modelling Council (EMMC). In the present work, the Materials Modelling Translation Ontology (MMTO) is introduced, and it is discussed how the MMTO can contribute to formalizing the pragmatic interoperability standards developed by the EMMC.


Introduction
The capabilities of service, software, and data architectures increase greatly if they are able to integrate a variety of heterogeneous resources into a common framework. In general, this requires an exchange of information with a multitude of systems of resources, each of which follows the paradigm and structure, or language, favoured by its designers. As the number n of such systems increases, establishing and maintaining a direct 1 : 1 compatibility between each pair of standards (e.g., file formats) becomes impractical, considering that n(n − 1) converters would need to be developed and updated as each of the relevant standards is modified; beside the unfavorable scaling of the effort required to develop such an architecture, this would also presuppose that the designers of each system understand all other systems and are interested in ensuring a compatibility with each of them, neither of which can be taken for granted. Instead, n : 1 : n interoperability based on a single intermediate standard only requires 2n mappings; if the interoperability standard has the approval of a significant community, which can be expected whenever the number of participating systems is large enough, developers have an intrinsic interest in maintaining the interoperability. For this purpose, they merely need to keep track of changes to their own system and the common intermediate standard.
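The scaling argument above can be checked with a few lines of Python. This is only an illustrative sketch; the function names are hypothetical and not part of any EMMC standard. It counts the directed converters needed for direct 1 : 1 compatibility, n(n − 1), against the mappings needed with a single intermediate standard, 2n.

```python
def pairwise_converters(n: int) -> int:
    """Directed converters for direct 1:1 compatibility among n systems: n(n - 1)."""
    return n * (n - 1)


def intermediate_standard_mappings(n: int) -> int:
    """Mappings when each system maps to and from one common standard: 2n."""
    return 2 * n


if __name__ == "__main__":
    for n in (2, 3, 5, 10, 50):
        print(n, pairwise_converters(n), intermediate_standard_mappings(n))
```

The two counts coincide at n = 3; for any larger number of systems, the intermediate standard requires fewer mappings, and the gap grows quadratically.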
Hence, interoperability is generally the favoured approach to integrating distributed and heterogeneous infrastructures. Since this addresses a problem of languages, the aspects of interoperability can be categorized according to their relation to three major areas of the theory of formal languages: Syntax, semantics, and pragmatics, or how to write correctly (according to a given format or grammar), how to associate a meaning with the communicated content (by which data items become information), and how to deal with information and transactions that involve an exchange of information. While well-known and well-established approaches for ensuring syntactic and semantic interoperability exist, pragmatic interoperability has not received the same degree of attention. However, it is equally important. The statement "the accused is guilty of high treason" is syntactically correct in English. Its denotational meaning might be clarified by linking "the accused" to an individual representing the specific person, and "is guilty of" and "high treason," respectively, to a relation and an entity from an ontology representing the laws of the country. However, its impact will vary greatly depending on who says it (e.g., a journalist, the prosecutor, or the judge), at which point, and in which context. If multiple countries decide to set up a joint court, they need to agree on the legal framework and on the language to be used at its sessions, but also on the pragmatics, much of which relates to role definitions and minimum requirements for good practice: How is a person appointed to become a judge, what qualifications are needed, and what code of conduct needs to be followed?
Software and data architectures often neglect to explicitly formulate any requirements at the level of pragmatics, since they are assumed to be guaranteed by institutional procedures (e.g., who is given an account, and who may ingest data). However, this delegation of responsibilities cannot be upheld for open infrastructures where anybody is invited to participate and to which a multitude of external tools and platforms connect, each of which may have its own users, roles, service definitions, access regulations, interfaces, and protocols. Accordingly, finding that semantic interoperability cannot reach its goals if it is not supplemented by an agreement on "what kind of socio-technical infrastructure is required," it has been proposed to work toward a universal pragmatic web [26]; in full consequence, this would add a third layer to the world-wide web infrastructure, operating on top of the semantic web and hypertext/syntactic web layers. This raises the issue of requirements engineering (i.e., specifying and implementing requirements) for service-oriented infrastructures, which becomes non-trivial whenever "stakeholders do not deliberately know what is needed" [30]. Previous work has established that ontologies are not only a viable tool for semantic interoperability, but also for enriching the structure provided for the semantic space by definitions of entities, relations, and rules that are employed to specify jointly agreed pragmatics [26,29]; to provide additional procedural information, workflow patterns have been suggested as a tool [28], e.g., employing the Business Process Model and Notation (BPMN) [1]. Since BPMN workflow diagrams can be transformed to RDF triples on the basis of an ontology [24], this approach is well suited to domains of knowledge where ontologies already exist.
The present work follows a similar approach; it intends to contribute to the aim of the European Materials Modelling Council (EMMC) to make services, platforms, and tools for modelling and simulation of fluid and solid materials interoperable at all levels, which includes pragmatic interoperability. The workflow pattern standard of the EMMC is MODA (i.e., Model Data) [7], which as an ontology becomes OSMO, the ontology for simulation, modelling, and optimization; this ontology development and the release [17] of OSMO version 1.2 constitutes the point of departure for the present discussion. One of the concepts at the core of this line of work is that of materials modelling translation, i.e., the process of guiding an industrial challenge toward a solution with the help of modelling [11,12]. The experts that facilitate this process are referred to as translators; they provide a service for companies and can be either academics, software owners, internal employees of a company, or independent engineers. By employing translators and their translation services, the interpretation of modelling and simulation is adapted to decision making processes in industry. Translators are expected to support the uptake of methods from computational molecular engineering by their industrial partners to facilitate innovations leading to novel or improved products and manufacturing processes. Previous work on data science pragmatics by Neff et al. [21] concludes that it is particularly relevant to "get involved in observing the day-to-day practices of the work of data science" when addressing a scenario that "requires translation across multiple knowledge domains" to "make data valuable, meaningful, and actionable." This is the case here as well.
The remainder of this work is structured as follows: Section 2 introduces the approach to interoperability established by the EMMC (and associated projects), the relevant definitions of roles and best practices concerning materials modelling translation, and how ontologies can be employed in this context. For this purpose, Section 3 introduces the main contribution from the present work, the Materials Modelling Translation Ontology (MMTO) version 1.1, together with OSMO version 1.4, which is extended in comparison to the previous release [17]. Section 4 discusses the identification of key performance indicators (KPIs) and suggests a procedure for aligning simulation workflows with KPIs. Finally, a conclusion is given in Section 5.

Review of Materials Modelling (RoMM) and ontologies
Where a physically based modelling approach is followed, physical equations (PEs) are employed jointly with materials relations (MRs) that parameterize and complement the PEs, e.g., for a particular substance. The combination of PEs and MRs is referred to as the system of governing equations (GEs); on the basis of the Review of Materials Modelling (RoMM) [3], common PE types are identified and categorized into four groups according to their granularity level: Electronic, atomistic, mesoscopic, or continuum. Subsequent to the review activity and the agreement on a basic nomenclature as formalized by RoMM [3], the EMMC developed MODA, a semi-formalized simplified graph representation for simulation workflows [7]; this notation, which is immediately intelligible to human readers, but not immediately machine-processable, was further extended to permit the inclusion of graph elements that represent logical data transfer (LDT) [17]. In MODA graphs, there are four classes of vertices, which are here referred to as sections:
1. Use case, i.e., the physical system to be simulated.
2. Materials model, i.e., the system of GEs, with one or multiple PEs and MRs.
3. Solver, i.e., the numerical solution of the model in terms of exactly the variables that occur in the GEs explicitly (and nothing beyond this).
4. Processor, i.e., any computational operation beyond the above.
For each section, the MODA standard contains a list of text fields, which are here referred to as aspects, where more detailed information can be provided; however, since this is plain text, it is not immediately possible to extract semantically annotated content from this representation automatically. In LDT graphs, additionally, there are vertices for logical resources that store logical variables, i.e., abstractions of quantities and data structures that are exchanged between sections; the representation of the workflows and the flow of information is conceptual, or logical, in the sense that it does not carry any information on how the exchange of data is realized technically [17].
The strict distinction between the model, the solver, and processor(s) from MODA itself already constitutes a substantial abstraction from the implementation in typical simulation software architectures; e.g., in case of a molecular dynamics simulation, the PE is given by Newton's equations of motion (and nothing else), while the force field is the MR. Accordingly, the numerical solution of the GEs, i.e., the trajectory including positions, orientations, velocities, and possibly forces over time, and nothing else, constitutes the solver output according to MODA. Any other quantities (e.g., the pressure, if it is not a boundary condition, but a simulation result) need to be formally represented as the output of a processor. Moreover, in many cases, it is hard to draw a clear boundary between the model and its numerical implementation (i.e., in MODA, the solver), since models and solvers are often co-developed. In such cases, the PE catalogue from RoMM [3] is intended to serve as an orientation.
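The separation of model, solver, and processor in the molecular dynamics example can be made concrete with a small data-structure sketch. The class and function names below are hypothetical and are not taken from MODA or OSMO; the sketch only shows how a quantity such as the pressure is attributed to a processor rather than to the solver.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class MaterialsModel:
    """System of governing equations: physical equations plus materials relations."""
    physical_equations: list[str]
    materials_relations: list[str]


@dataclass
class Solver:
    """Numerical solution in terms of the variables occurring explicitly in the GEs."""
    model: MaterialsModel
    output_variables: list[str]


@dataclass
class Processor:
    """Any computational operation beyond the solver."""
    name: str
    inputs: list[str]
    outputs: list[str]


def produced_by(quantity: str, solver: Solver, processors: list[Processor]) -> str:
    """Report which section formally yields the given quantity."""
    if quantity in solver.output_variables:
        return "solver"
    for processor in processors:
        if quantity in processor.outputs:
            return f"processor: {processor.name}"
    return "not produced"


# Molecular dynamics example from the text (variable names are illustrative).
md_model = MaterialsModel(["Newton's equations of motion"], ["force field"])
md_solver = Solver(md_model, ["positions", "orientations", "velocities", "forces"])
pressure = Processor(
    "pressure evaluation",
    inputs=["positions", "velocities", "forces"],
    outputs=["pressure"],
)

if __name__ == "__main__":
    print(produced_by("velocities", md_solver, [pressure]))
    print(produced_by("pressure", md_solver, [pressure]))
```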
Semantic technology, centered on the use of ontologies as a tool, is increasingly applied to data management in all areas, including computational chemistry and molecular engineering [15]. As a feature of the semantic web, ontologies can link to entity definitions from other ontologies, facilitating distributed development and complex multi-tier architectures. The highest level of abstraction is usually given by a top-level ontology (or upper ontology). These components of the semantic web, such as the Basic Formal Ontology (BFO) [2] and the Descriptive Ontology for Linguistic and Cognitive Engineering (DOLCE) [6], are largely domain-independent and reused frequently in very diverse contexts; at the top level, philosophical concerns are at least as significant as the practical demands of research data technology. The EMMC advances its own top-level ontology, the European Materials and Modelling Ontology (EMMO), which is based on the philosophical paradigms of nominalism, mereotopology, and semiosis following Peirce [14]. To achieve interoperability within the framework of the projects and infrastructures involving the EMMC community, all lower-level, i.e., domain-specific ontologies need to be aligned with the EMMO.
In particular, for the Virtual Materials Marketplace project (VIMMP), which develops a platform where services and solutions related to computational molecular engineering can be traded, and with which multiple other platforms are expected to interoperate, it is essential to standardize the semantic space, e.g., for the exchange of information during data ingest and data retrieval [18]; this includes the characterization of services, models, documents, data access, etc., and may involve communication with external resources such as model and property databases. The domain ontologies that are developed by VIMMP for this purpose are referred to as marketplace-level ontologies [18]; the marketplace-level ontology OSMO, which was developed by VIMMP in collaboration with the TaLPas project, is directly based on the MODA workflow graph standard as well as its LDT extension [17]. Thereby, a section from MODA, e.g., a solver, becomes an osmo:section entity, e.g., an osmo:solver. However, in MODA, the aspects (entries) of a section can only contain plain text; by using the relation osmo:has object content from OSMO, it becomes possible to point to semantically characterized content defined anywhere on the semantic web, including individuals and classes from OSMO and other ontologies. Similarly, OSMO formalizes the workflow graph elements and the exchanged logical variables. In this way, the MODA standard becomes machine-processable through OSMO. The materials modelling translation ontology from the present work, cf. Section 3, is based on this approach; it is closely connected to OSMO, and by extending the structure of the accessible semantic space to additionally deal with translation in materials modelling, it explicitly builds on OSMO and implicitly generalizes the MODA standard.
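How an aspect can point to semantically characterized content, rather than plain text, may be illustrated with a toy triple store in plain Python. This is a sketch under stated assumptions: only osmo:solver and osmo:has object content are taken from the text (written as printed here); the relation "ex:has aspect" and all individuals with the prefix "ex:" are hypothetical placeholders, and no claim is made about the actual serialization used by OSMO.

```python
# Each triple is (subject, predicate, object).
# "ex:" names are hypothetical; "osmo:" names are written as printed in the text.
TRIPLES = [
    ("ex:md solver", "rdf:type", "osmo:solver"),
    ("ex:md solver", "ex:has aspect", "ex:md solver numerical method"),
    ("ex:md solver numerical method", "osmo:has object content", "ex:velocity verlet integrator"),
]


def objects(subject: str, predicate: str) -> list[str]:
    """Return all objects o such that (subject, predicate, o) is in the store."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def object_contents(section: str) -> list[str]:
    """Follow section -> aspect -> object content and collect the linked entities."""
    return [
        content
        for aspect in objects(section, "ex:has aspect")
        for content in objects(aspect, "osmo:has object content")
    ]


if __name__ == "__main__":
    print(object_contents("ex:md solver"))
```

Because the object content is an entity rather than a string, a consumer can follow the link further, which is not possible for a plain-text entry.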

Specification of roles and processes
The role of the materials modelling translator is defined in the EMMC Translators' Guide (ETG) [12]; a translator needs to be able to bridge the "language gap" between industrial end users, software owners, and model providers who are usually academics. The work of a translator aims at delivering not just modelling results, but a valuable and beneficial solution for a problem from industrial engineering practice. An instance of the materials modelling translation process, some agreed features of which are codified by the ETG [12] and the EMMC Translation Case Template (ETCT) [11], is referred to as a translation case (TC). It begins with exploring and understanding the business case (BC) and the industrial case (IC), or multiple relevant BCs and/or ICs, which describe the socioeconomic objectives and boundary conditions, cf. Section 3.
Role definitions are known to be helpful in establishing sustainable good practices in data stewardship; e.g., this is illustrated by recent work proposing the position of the Scientific Data Officer (SDO), jointly with providing a role definition that is tailor-made for addressing major concerns from research data management in high-performance computing [25]. The responsibilities associated with this role relate to technical, organizational, ethical, and legal aspects of data stewardship [20]. One of the most important tasks that an SDO is expected to perform is data annotation; since data can only be curated with the help of metadata, concrete tasks include adaptation of existing metadata models to a use case and the support of automated metadata extraction. Moreover, the SDO's responsibilities also include a mediation role between different groups of interest, e.g., between scientists and the operators of computing and storage facilities. This mediation role has a high impact on pragmatic interoperability, since many problems arise when different technical languages, terminologies, or jargons conflict. In this way, the SDO is in a position that shares certain characteristics with that of the materials modelling translator, particularly if metadata are seen as a form of communication as proposed by Edwards et al. [10].
Translation can be a process with multiple iterations. Thereby, the active and regular contact with the end user (i.e., the industrial client of the translator) is a prerequisite for an effective and successful working relationship: The translator needs to be in communication with the client during the whole project duration to discuss regularly the project dynamics, possible changes to the line of work and development, and any other relevant feedback. The level of detail required for a modelling and simulation based contribution to an economic analysis of the considered value-added chain and its elements makes it necessary to go beyond computational molecular engineering in the strict sense, since eventually, key performance indicators (KPIs) of processes and products need to be optimized. The subsequent sections propose a solution for documenting these processes, providing the required BC, IC, and TC descriptions (Section 3), and associating OSMO workflows with KPIs (Section 4).

Materials modelling translation ontology (MMTO)
The present first release of the MMTO, version 1.1, and the revised release of OSMO, version 1.4, are openly accessible through the VIMMP website [27]. The revision of OSMO generalizes the section structure from MODA by introducing osmo:application case as a new direct superclass of osmo:use case. Beside use cases, in this way, BCs (mmto:business case), ICs (mmto:industrial case), and TCs (mmto:translation case) become subclasses of osmo:application case, by which they can be dealt with in a similar way as the sections from MODA. The relevant part of the MMTO and OSMO class hierarchy is visualized in Fig. 1, including the relations that are most useful and common in this context.
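The effect of introducing osmo:application case can be sketched as a subclass lookup. The class names are written as printed in the text; treating BCs, ICs, and TCs as direct subclasses and assuming single inheritance are simplifications made for this illustration only, and the helper name is hypothetical.

```python
# Direct superclass relation; single inheritance is assumed for this sketch only.
DIRECT_SUPERCLASS = {
    "osmo:use case": "osmo:application case",
    "mmto:business case": "osmo:application case",
    "mmto:industrial case": "osmo:application case",
    "mmto:translation case": "osmo:application case",
}


def is_subclass_of(cls: str, ancestor: str) -> bool:
    """Walk up the direct-superclass chain and test whether ancestor is reached."""
    current = cls
    while current in DIRECT_SUPERCLASS:
        current = DIRECT_SUPERCLASS[current]
        if current == ancestor:
            return True
    return False


if __name__ == "__main__":
    for name in DIRECT_SUPERCLASS:
        print(name, is_subclass_of(name, "osmo:application case"))
```

Any tool that handles osmo:application case generically can then treat use cases, BCs, ICs, and TCs in the same way.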
The TC aspects, cf. Tab. 1, directly correspond to the ETCT text fields [11], except that the MMTO permits the provision of semantically characterized content; this follows the approach from OSMO, which delivers the same feature for the text fields from MODA. The aspects by which BCs and ICs are described in the MMTO are given in Tabs. 2 and 3. Thereby, a BC can represent any purely economic consideration or an optimization problem at the management level, whereas an IC refers to an industrial engineering problem or an optimization problem at the technical or research and development level. Within the translation process, a suitable approach based on modelling and simulation is identified and carried out; subsequently, the outcome is translated back to support an actionable decision at the BC and IC levels. The stages of the translation process according to the ETG [12], together with the corresponding MMTO entities, are reported in Tab. 4. To show how this would actually be realized on a virtual marketplace, an illustrative exchange of communications taking place during a translation process (ordered as a sequence in time from top to bottom) is depicted in Fig. 2, together with the class hierarchy of the relevant branch of the MMTO.
The MMTO and OSMO are connected to the EMMO through the European Virtual Marketplace Ontology (EVMPO), a module which is developed jointly by the VIMMP and MarketPlace projects, and the EMMO-VIMMP Integration (EVI) component for ontology alignment [18]. Additionally, the MMTO employs the ISO 4217 standard for currency descriptions through the Currency Amount Ontology (CAO) module of the Financial Industry Business Ontology (FIBO) [5,9], cf. Tab. 2; it also refers to entity definitions from two further marketplace-level ontologies from the VIMMP project [18]: Classes of agents (e.g., vico:end user and vico:agent) and messages (e.g., vico:interlocution and vico:statement) from the VIMMP Communication Ontology (VICO), and the description of marketplace-interaction evaluations (here, vivo:translation assessment) from the VIMMP Validation Ontology (VIVO), cf. Tab. 1.

Types of KPIs
The idea of using KPIs as a valuable vehicle to map industrial problems onto modelling and simulation workflows is still under debate. The general impression is that there is some confusion concerning what a KPI actually means in a particular context. In business administration and management, a KPI is understood to be a natural-language description of something which is a selling argument. This reflects the point of view corresponding to organizational roles that are comparably distant from research and development, e.g., in sales or high-level management. In scenarios that arise in the context of such organizational roles, it necessarily appears to be most crucial to address concerns that are immediately relevant to business-to-administration (B2A), business-to-business (B2B), and business-to-customer (B2C) relations [4]. We propose to reserve the keyword KPI (mmto:key performance indicator) for indicators (scalar quantities) that are directly relevant to characterizing, modelling, or optimizing such scenarios. On this basis, from the point of view of a materials modelling translator, two major distinctions need to be made:
1. Some KPIs are closely related to human sentience (aesthetics, haptics, taste, etc.). Studies aiming at gaining information on these quantities typically rely on market research and other empirical methods that involve human subjects; such indicators are referred to as subjective KPIs (mmto:subjective kpi). Conversely, an objective KPI (mmto:objective kpi) can be determined by a standardized process, e.g., a measurement, experiment, or simulation, the result of which (assuming that it is conducted correctly) does not depend on the person that carries it out.
2. An objective KPI is technological (mmto:technological kpi) if it is observed or measured within a technical or experimental process, referring directly to properties of the real product or manufacturing process; properties of a model, which are determined by simulation, are computational KPIs (mmto:computational kpi).

Table 2. Aspects of a business case (BC), mmto:business case, in the MMTO.
BC aspect class name | content description | content type
mmto:bca description | abstract or a rough description of the BC | plain text, i.e., xs:string
mmto:bca industrial case | industrial case(s) associated with the BC | mmto:industrial case
mmto:bca red zone | red zone(s), i.e., operational constraint(s) | plain text, i.e., xs:string
mmto:bca context | context of the BC; revenue streams, risk management, distribution channels, etc. | plain text, i.e., xs:string
mmto:bca currency | budgeting currency | cao:Currency from CAO [5,9]
mmto:bca contribution to cost | contribution to cost (in budgeting currency) | description type: plain text, i.e., xs:string; magnitude type: decimal, i.e., xs:decimal
mmto:bca total cost | total cost (in budgeting currency) | decimal, i.e., xs:decimal
mmto:bca contribution to benefit | contribution to benefit (in budgeting currency) | description type: plain text, i.e., xs:string; magnitude type: decimal, i.e., xs:decimal
mmto:bca total benefit | total benefit (in budgeting currency) | decimal, i.e., xs:decimal
mmto:bca return on investment | return on investment | decimal, i.e., xs:decimal
mmto:bca decision support | employed decision support system(s) | osmo:decision support system
The distinction between subjective and objective KPIs is similar to that between critical-to-customer (CTC) and critical-to-quality (CTQ) measures [13,19,23]. The formulation given above, however, is more closely related to concepts from the EMMO. Due to its foundation on Peircean semiotics [22], it is straightforward in the EMMO to categorize signs by the way in which their interpretation depends on the subjective impression of an interpreter or observer: In particular, the same distinction is made in EMMO version 0.9.10 [14]. Accordingly, this approach is most amenable to a prospective alignment of the MMTO with the EMMO and the approach to interoperability guided by the EMMC.
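The two distinctions made in this section amount to a small decision procedure, sketched below. The enum values repeat the MMTO class names as printed in the text; the function name and its two boolean parameters are hypothetical simplifications, and real cases may be less clear-cut.

```python
from enum import Enum


class KpiKind(Enum):
    SUBJECTIVE = "mmto:subjective kpi"
    TECHNOLOGICAL = "mmto:technological kpi"    # objective, from the real product or process
    COMPUTATIONAL = "mmto:computational kpi"    # objective, property of a model


def classify_kpi(depends_on_observer: bool, determined_by_simulation: bool) -> KpiKind:
    """Apply the two distinctions in order: subjective vs. objective, then
    technological vs. computational (only meaningful for objective KPIs)."""
    if depends_on_observer:
        return KpiKind.SUBJECTIVE
    if determined_by_simulation:
        return KpiKind.COMPUTATIONAL
    return KpiKind.TECHNOLOGICAL


if __name__ == "__main__":
    # Illustrative cases, not taken from the source.
    print(classify_kpi(depends_on_observer=True, determined_by_simulation=False))
    print(classify_kpi(depends_on_observer=False, determined_by_simulation=False))
    print(classify_kpi(depends_on_observer=False, determined_by_simulation=True))
```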

KPI analysis
The relation between properties accessible to materials modelling and the technological KPIs that are most immediately relevant to real industrial processes and products is necessarily indirect; it requires the mediation through a translation process and a TC as formalized above, which includes modelling KPIs as a function of other quantities, i.e., the creation of KPI models (mmto:kpi model).
For the present purpose, a KPI model is given by a condition, correlation, or other formalism containing a set of variables, which can, but need not, be KPIs or other indicators, by which one or multiple KPIs are represented (i.e., here, predicted, correlated, or modelled). In this way, e.g., computational KPIs determined as the outcome of a complex simulation workflow can be correlated, and a technological KPI can be estimated on the basis of computational KPIs. Accordingly, KPI models can represent observables: Mathematical operators that map logical (e.g., physical) variables to scalar quantities. This mapping additionally depends on all boundary and initial conditions. The construction of the combined map from materials options and initial and boundary conditions to a number is a key aspect of materials modelling translation. Based on previous experience from the FORCE project, the following procedure, abbreviated PRO (partition → rationalize → OSMO workflows), is recommended for mapping a set of KPIs, which are relevant to an IC, to a collection of simulation workflows:
1. Partitioning: Sort the KPIs, which should be formulated in the language of the end user (or close to that language), into groups according to their principal dependencies on the data space. Individual KPIs may depend on a subset of the data space dimensions while other dimensions are irrelevant.
2. Rationalization: Based on the analysis of the data space and its relation to the KPIs, specify a formal structure for the KPI model(s). For physically based models, decide on the PE types, MR, initial and boundary conditions and any other parameters necessary for describing the material.
3. OSMO workflow development and documentation: A KPI model, or multiple KPI models which are structurally similar or related, can be mapped to an OSMO simulation workflow. Analyse the workflow to determine the expected reliability of the result.

Table 4. MMTO representation of the stages of a materials modelling translation process (subclasses of mmto:translation step), as specified by the ETG [12]. The numbers in the first column, which follow the ETG, are related to the stages by the datatype property mmto:has emmc guide no. The considered sections, i.e., individuals of the classes given in the third column without an asterisk, are related to the stages by mmto:considers section. ⋆ Remark on no. 4: The relation connecting mmto:translation step modelling to osmo:workflow graph is mmto:considers workflow; ⋆⋆ remark on no. 6: The relations connecting mmto:translation step decision to osmo:decision support system and mmto:kpi model, respectively, are mmto:has decision support and mmto:considers kpi model.
no. | MMTO class identifier and description | entities connected by relations
1 | mmto:translation step bc: good understanding of the business case | mmto:business case
2 | mmto:translation step ic: good understanding of the industrial case | mmto:industrial case
3 | mmto:translation step data: analysis of the data from experiment and simulation available to the end user | osmo:use case
4 ⋆ | mmto:translation step modelling: translation to simulation workflows | osmo:materials model and osmo:workflow graph ⋆
5 | mmto:translation step execution: execution and validation strategy |
6 ⋆⋆ | mmto:translation step decision: evaluation of the simulation results to facilitate an actionable decision | osmo:decision support system ⋆⋆ and mmto:kpi model ⋆⋆
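The partitioning step of the PRO procedure described above can be sketched as a grouping operation. This is one possible reading, adopted here as an assumption: KPIs are placed in the same group when they depend on exactly the same data space dimensions. The function name, the KPI names, and the dimension names are hypothetical examples and do not come from the FORCE project.

```python
from __future__ import annotations

from collections import defaultdict


def partition_kpis(dependencies: dict[str, set[str]]) -> dict[frozenset[str], list[str]]:
    """Group KPIs by the set of data space dimensions on which they depend."""
    groups: dict[frozenset[str], list[str]] = defaultdict(list)
    for kpi, dimensions in dependencies.items():
        groups[frozenset(dimensions)].append(kpi)
    return dict(groups)


# Hypothetical KPIs of an industrial case and their principal dependencies.
example = {
    "viscosity": {"temperature", "composition"},
    "density": {"temperature", "composition"},
    "drying time": {"temperature", "film thickness"},
}

if __name__ == "__main__":
    for dimensions, kpis in partition_kpis(example).items():
        print(sorted(dimensions), kpis)
```

Each resulting group is a candidate for a common KPI model in the rationalization step, and thus for a shared OSMO workflow.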

Conclusion
By developing the MMTO, which extends the section concept from OSMO to cover BCs, ICs, and TCs, a formalism was introduced by which translation in materials modelling can be represented in a way that implicitly also extends MODA, the pre-existing EMMC standard for simulation workflows. Just as OSMO is the ontology version of MODA, the MMTO is the ontology version of an implicit generalization of MODA by which, beside the simulation workflow itself, its socioeconomic context can be described. In this way, the MMTO is also a tool for representing the exchange of information during translation processes (e.g., on KPIs) as a workflow, analogous to the formalization of MODA and LDT workflow graphs within OSMO. Since it is given as an ontology, the aspects from the MMTO (and from OSMO), which correspond to plain-text form entries in MODA, can contain links to entities defined elsewhere in the semantic web which can be immediately processed computationally, and to which automated reasoning can be applied. Where available, previous agreements have been taken into account in the form of the ETG and the ETCT, which are codified by the MMTO. To guarantee pragmatic interoperability between translation-related services and platforms such as materials modelling marketplaces, open translation environments, business decision support systems, and open innovation platforms, substantial further standardization efforts will be required.

[Figure caption, beginning not recovered] ... [18]; this diagram was generated using OWLViz [16].