About SEA4DQ

Cyber-physical systems (CPS) and the Internet of Things (IoT) are omnipresent in many industrial sectors and application domains, where the quality of the data acquired and used for decision support is a common concern. Data quality can deteriorate due to factors such as sensor faults and failures caused by operation in harsh and uncertain environments.

How can software engineering and artificial intelligence (AI) help manage and tame data quality issues in CPS/IoT?

This is the question we aim to investigate in the SEA4DQ workshop. Emerging trends in software engineering need to take data quality management seriously, as CPS/IoT are increasingly data-centric in how they acquire and process data along the edge-fog-cloud continuum. This workshop will provide researchers and practitioners with a forum for exchanging ideas, experiences, understanding of the problems, visions for the future, and promising solutions to data quality problems in CPS/IoT.

Topics of Interest

  • Software/hardware architectures and frameworks for data quality management in CPS/IoT
  • Software engineering and AI to pre-process and clean data
  • Software engineering and AI to detect and repair anomalies in CPS/IoT data
  • Software engineering and AI to cluster data as events
  • Software tools for data quality management, testing, and profiling
  • Public sensor datasets from CPS/IoT (manufacturing, digital health, energy, ...)
  • Distributed ledger and blockchain technologies for quality tracking
  • Quantification of data quality hallmarks and uncertainty in data repair
  • Sensor data fusion techniques for improving data quality and prediction
  • Augmented data quality
  • Case studies that have evaluated an existing technique or tool on real systems, not only toy problems, to manage data quality in cyber-physical systems in different sectors
  • Certification and standardization of data quality in CPS/IoT
  • Approaches for secure and trusted data sharing, especially for data quality, management, and governance in CPS/IoT
  • Trade-offs between data quality and data security in CPS/IoT

Schedule

17th of November 2022 - All times are in Singapore local time (GMT+8)

Start - End | Topic | Presenters | Session Chairs
08:50 - 09:00 | Physical and Virtual Meeting Setup | Organization Committee
09:00 - 09:15 | Welcome, Objectives and Agenda | Organization Committee
09:15 - 10:15 | Keynote: Online Reinforcement Learning for Self-adaptive Systems | Prof. Andreas Metzger | Phu Nguyen
10:15 - 10:30 | Data Quality Issues in Solar Panels Installations: A Case Study (Position Paper) | Dumitru Roman (Titi), Antoine Pultier, Xiang Ma, Ahmet Soylu, Alexander G. Ulyashin | Phu Nguyen
10:30 - 11:00 | AM Break
11:00 - 11:30 | Data Quality as a Microservice - an ontology and rule based approach for quality assurance of sensor data in manufacturing machines (Long Paper) | Jørgen Stang, Dirk Walther, Per Myrseth | Beatriz Cassoli, Nicolas Jourdan
11:30 - 11:50 | Preliminary Findings on the Occurrence and Causes of Data Smells in a Real-World Business Travel Data Processing Pipeline (WIP Paper) | Valentina Golendukhina, Harald Foidl, Michael Felderer, Rudolf Ramler | Beatriz Cassoli, Nicolas Jourdan
11:50 - 12:20 | Effect of Time Patterns in Mining Process Invariants for Industrial Control Systems: An Experimental Study (Long Paper) | Muhammad Azmi Umer, Aditya Mathur, Muhammad Taha Jilani | Beatriz Cassoli, Nicolas Jourdan
12:20 - 14:00 | Lunch Break
14:00 - 15:00 | Keynote: Data Quality and Model Under-Specification Issues | Prof. Foutse Khomh | Sagar Sen
15:00 - 15:15 | Data Quality Issues for Vibration Sensors: A Case Study in Ferrosilicon Production (Position Paper) | Maryna Waszak, Terje Moen, Sølve Eidnes et al. | Sagar Sen
15:15 - 15:30 | InterQ Research Project Presentation | Nicolas Jourdan, Beatriz Cassoli | Sagar Sen
15:30 - 16:00 | PM Break
16:00 - 17:00 | Panel Discussion | Prof. Foutse Khomh, Prof. Andreas Metzger, Sallam Abualhaija, Sagar Sen | Beatriz Cassoli, Nicolas Jourdan
17:00 - 17:05 | Closing Statement | Phu Nguyen

Registration is handled through the FSE conference website.

Keynotes

Prof. Dr. Andreas Metzger

Head of Adaptive Systems and Big Data Applications,
University of Duisburg-Essen, Germany

Title: "Data Quality Issues in Online Reinforcement Learning for Self-adaptive Systems"

A self-adaptive system can modify its structure and behavior at runtime based on its perception of the environment, itself, and its requirements. By adapting itself at runtime, the system can maintain its requirements in the presence of dynamic environment changes. Examples are elastic cloud systems, intelligent IoT systems, and proactive process management systems. One key element of a self-adaptive system is its self-adaptation logic, which encodes when and how the system should adapt itself. When developing the adaptation logic, developers face the challenge of design-time uncertainty: they have to anticipate potential environment states and the precise effect of adaptation in a given environment state, while the knowledge available at design time may not be sufficient to do so. A recent industrial survey identified design-time uncertainty as one of the most frequently observed difficulties in designing self-adaptation logic in practice. This talk will explore the opportunities, but also the challenges, that modern machine learning algorithms offer in building the self-adaptation logic in the presence of design-time uncertainty. It will focus on online reinforcement learning as an emerging approach, in which the system learns during operation from interactions with its environment, thereby effectively leveraging data only available at runtime. In particular, the talk will focus on three different issues related to data quality and will introduce initial solutions for these issues: (1) data non-stationarity, (2) data sparsity, and (3) data intransparency. The talk will close with a critical discussion of limitations and an outlook on future research opportunities.


Prof. Foutse Khomh

Head of SoftWare Analytics and Technologies (SWAT) Lab,
University of Montréal, Canada

Title: "Data Quality and Model Under-Specification Issues"

Nowadays, we are witnessing an increasing demand in both industry and academia for exploiting Deep Learning (DL) to solve complex real-world problems. However, the performance of these high-capacity learners is currently bounded by the quality and volume of their underlying training data. The use of incomplete, erroneous, or inappropriate training data, and the implementation of poor data management practices in a training pipeline, often result in unreliable, biased, or under-specified models. In this talk, I will report on some recent research that we have conducted to identify best practices of data management for DL. I will also report on recent techniques and tools that we have developed to help detect the root cause of model under-specification issues early on during a DL training process.

Accepted Papers

  • Data Quality as a Microservice - an ontology and rule based approach for quality assurance of sensor data in manufacturing machines | Full Paper
    Jørgen Stang, Dirk Walther, Per Myrseth
  • Effect of Time Patterns in Mining Process Invariants for Industrial Control Systems: An Experimental Study | Full Paper
    Muhammad Azmi Umer, Aditya Mathur, Muhammad Taha Jilani
  • Preliminary Findings on the Occurrence and Causes of Data Smells in a Real-World Business Travel Data Processing Pipeline | WIP Paper
    Valentina Golendukhina, Harald Foidl, Michael Felderer, Rudolf Ramler
  • Data Quality Issues for Vibration Sensors: A Case Study in Ferrosilicon Production | Short Paper
    Maryna Waszak, Terje Moen, Sølve Eidnes, Alexander Stasik, Anders Hansen, Gregory Bouquet, Antoine Pultier, Xiang Ma, Idar Tørlen, Bjørn Rune Henriksen, Arianeh Aamodt, Dumitru Roman
  • Data Quality Issues in Solar Panels Installations: A Case Study | Short Paper
    Dumitru Roman, Antoine Pultier, Xiang Ma, Ahmet Soylu, Alexander G. Ulyashin

Abstracts will be added after the camera-ready submission deadline has passed.

Important Dates

  • Abstract submission deadline: July 25, 2022 (optional)
  • Submission deadline: August 1, 2022
  • Notification of Acceptance: August 26, 2022
  • Camera-Ready Submission: September 9, 2022 (hard deadline)
  • Workshop: November 17, 2022

Organization Committee

Phu Nguyen (Main Contact)
General Chair
SINTEF, Norway
phu.nguyen@sintef.no

Sagar Sen (Main Contact)
Co-Program Chair
SINTEF, Norway
sagar.sen@sintef.no

Maria Chiara Magnanini
Co-Program Chair
Politecnico di Milano, Italy

Beatriz Cassoli
Moderator, Co-Web Chair
TU Darmstadt, Germany

Nicolas Jourdan
Moderator, Co-Web Chair
TU Darmstadt, Germany

Mikel Armendia
Publicity Chair
Tekniker, Spain


Program Committee*

  • Abhilash Anand, DNV AS, Norway
  • Enrique Garcia Ceja, Optimeering, Oslo, Norway
  • Sudipto Ghosh, Colorado State University, USA
  • Helena Holmström Olsson, Malmö University, Sweden
  • Frank Alexander Kraemer, NTNU, Norway
  • Felix Mannhardt, KIT-AR, Germany
  • Dusica Marijan, Simula Research Laboratory, Norway
  • Andreas Metzger, University of Duisburg-Essen, Germany
  • Jan Nygård, Cancer Registry of Norway, Norway
  • Karl John Pedersen, DNV AS, Norway
  • Dimitra Politaki, INLECOM, Greece
  • Dumitru Roman, SINTEF / University of Oslo, Norway
  • Marc Roper, University of Strathclyde, UK
  • Helge Spieker, Simula Research Laboratory, Norway
  • Jean-Yves Tigli, Université Côte d’Azur, France
  • Hong-Linh Truong, Aalto University, Finland
  • Katinka Wolter, Free University of Berlin, Germany
  • Amina Ziegenbein, Technische Universität Darmstadt, Germany
* The PC member list is in alphabetical order.

The SEA4DQ 2022 Workshop is sponsored by the research projects InterQ and DAT4.Zero, which are funded by the European Union’s Horizon 2020 Research and Innovation programme.