WEBIST 2022 Abstracts


Area 1 - HCI in Mobile Systems and Web Interfaces

Full Papers
Paper Nr: 7
Title:

Identifying User Experience Aspects for Voice User Interfaces with Intensive Users

Authors:

Kristina Kölln, Jana Deutschländer, Andreas M. Klein, Maria Rauschenberger and Dominique Winter

Abstract: Voice User Interfaces (VUIs) are becoming increasingly available while users raise, e.g., concerns about privacy issues. User Experience (UX) helps in the design and evaluation of VUIs with focus on the user. Knowledge of the relevant UX aspects for VUIs is needed to understand the user’s point of view when developing such systems. Known UX aspects are derived, e.g., from graphical user interfaces or expert-driven research. The user’s opinion on UX aspects for VUIs, however, has thus far been missing. Hence, we conducted a qualitative and quantitative user study to determine which aspects users take into account when evaluating VUIs. We generated a list of 32 UX aspects that intensive users consider for VUIs. These overlap with, but are not limited to, aspects from established literature. For example, while Efficiency and Effectivity are already well known, Simplicity and Politeness are inherent to known VUI UX aspects but are not necessarily focused. Furthermore, Independency and Context-sensitivity are some new UX aspects for VUIs.

Short Papers
Paper Nr: 5
Title:

Impact of Usage Behaviour on the User Experience of Netflix, Microsoft Powerpoint, Bigbluebutton and Zoom

Authors:

Jessica Kollmorgen, Martin Schrepp and Jörg Thomaschewski

Abstract: In order to be able to meaningfully classify the user experience and thus the popularity of products, UX questionnaires such as the UEQ, SUS or UMUX are frequently used in practice to measure the UX. This makes it possible to specifically evaluate the ratings of pragmatic and hedonic UX factors. However, it is conceivable that, in addition to users’ own perceptions, external factors also have an influence on the evaluation of the UX of products. These include, for example, time or duration of use. It can be assumed that users who rate the UX of a product as good also use this product more frequently and vice versa. Such a consideration of influencing factors is particularly interesting for products that have been used frequently in recent years and thus also during the pandemic. For this reason, Netflix, Microsoft PowerPoint, Zoom and BigBlueButton were selected, which cover the range from primarily hedonic to primarily pragmatic quality. These are examined for their UX ratings as well as influencing factors.

Paper Nr: 38
Title:

Incorporating the User Attention in User Interface Logs

Authors:

A. Martínez-Rojas, A. Jiménez-Ramírez, J. G. Enríquez and D. Lizcano-Casas

Abstract: Business process analysis is a key factor in the lifecycle of Robotic Process Automation. Currently, task mining techniques provide mechanisms to analyze information about the process tasks to be automated, e.g., to identify repetitive tasks or process variations. Existing proposals mainly rely on the user interactions with the UIs of the system (i.e., at keyboard and mouse level) and on information that can be gathered from them (e.g., the window name), which is stored in a UI event log. In some contexts, the latter information is limited because the system is accessed through virtualized environments (e.g., Citrix or TeamViewer). Other approaches address this issue by extending the UI Log with screenshots. Regardless of the context, the aim is to store as much information as possible in the UI Log so that it can be analyzed later on, e.g., by extracting features from the screenshots. This amount of information can introduce considerable noise into the log, obscuring what is relevant to the process. To amend this, the current approach proposes a method that includes a gaze analyzer, which helps to identify which of the recorded information is process-relevant. More precisely, the proposal extends the UI Log definition with the attention change level, which records when the user’s attention changes from one element on the screen to another. This paper sets the research settings for the approach and enumerates the future steps to conduct it.

Paper Nr: 46
Title:

Data Visualization, Accessibility and Graphicacy: A Qualitative Study of Communicative Artifacts through SUS Questionnaire

Authors:

Alessio Caccamo

Abstract: The study presented here examines the accessibility of information conveyed through the language of infographics, analyzing how usable five Data Visualization artifacts are for users consuming their information content. The artifacts were selected according to Anceschi’s degree of iconicity of representation. Specifically, the study compared the SUS evaluations of two groups [F=100 – M=100], homogeneous in educational grade and age but differing in whether or not they had proven Visual Design competence. It investigates whether a basic soft skill is sufficient to achieve an optimal level of accessibility or whether Graphicacy competence is the discriminating factor, and therefore whether infographic language can be considered a universal language or not. A three-variable correlation design was constructed: two independent variables, the System Usability Scale (SUS) along with the degree of iconicity of the representation, and one dependent variable, namely the amount of information extracted from the infographic. The results show that both Group A and Group B exhibit a general difficulty in accessing information, correlated to the degree of iconicity of the infographic representation. Specifically, in the “non designer” group, no infographic achieved the minimum usability rating, whereas in the “designer” group it was achieved only by the two artifacts with a medium/low degree of iconicity. From the analysis of the data, Graphicacy – acquired within the educational curriculum of Designers – would appear to be a determining element in the correct decoding of communicative artifacts. Drawing on existing data and literature, the contribution confirms that Graphicacy has been neglected in comparison to Literacy, Numeracy, and Articulacy, and that the complexity and sophistication of infoaesthetic may be incomprehensible without timely data visualization literacy.

Paper Nr: 48
Title:

Space Geeks: A Proposed Serious Game to Teach Array Concept for Novice Programming Students

Authors:

Abdelbaset Jamal Assaf, Mohammed Eshtay and Lana Issa

Abstract: The failure rates in introductory programming courses still show that there is a continuous need for research to investigate and propose new methods and techniques for teaching introduction to programming courses, in order to attract more people to the information technology field and build more skilled programmers from their first course. This study investigates students’ levels in multiple topics in introduction to programming and then proposes a new science-fiction-themed game called Space Geeks. The game is initially designed to target arrays and is extendable to cover more programming concepts. The design of this game helps students enhance their coding skills, motivates them through game features, and helps them understand the array concept through visualisation and graphics. This work will open more insights into further introductory topics such as arrays, since other work has already focused on topics such as variables, input/output, and problem solving.

Paper Nr: 53
Title:

An Interaction Effort Score for Web Pages

Authors:

Juan Cruz Gardey, Julián Grigera, Andrés Rodríguez, Gustavo Rossi and Alejandra Garrido

Abstract: There is a lack of automatic evaluation models to measure the user experience (UX) of online systems, especially in relation to the user interaction. In this paper we propose the interaction effort score as a factor that contributes to the measure of the UX of a web page. The interaction effort is automatically computed as an aggregation of the effort on each interactive widget of a page, and for all users that have interacted with them. In turn, the effort on each widget is predicted from different micro-measures computed on the user interaction, by learning from manual UX expert ratings. This paper describes the evaluation of the interaction effort of different web forms, and how it compares to other metrics of usability and user interaction. It also shows possible applications of the interaction effort score in the automatic evaluation of web pages.
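As a rough illustration of the aggregation described above, the sketch below averages hypothetical per-widget effort predictions first across users and then across widgets. The use of the arithmetic mean, the function name `page_effort`, and the sample values are assumptions made for illustration; the abstract does not state which aggregation function the authors use.

```python
from collections import defaultdict
from statistics import mean


def page_effort(predictions):
    """Aggregate predicted effort into one score per widget and one per page.

    predictions: iterable of (user_id, widget_id, effort) tuples, where
    effort is a number already predicted for that user's interaction.
    The mean is an assumed aggregation, not the published one.
    """
    per_widget = defaultdict(list)
    for _user, widget, effort in predictions:
        per_widget[widget].append(effort)
    # Average over all users who interacted with each widget.
    widget_scores = {w: mean(values) for w, values in per_widget.items()}
    # Average the widget scores to obtain a page-level score.
    return mean(list(widget_scores.values())), widget_scores


# Hypothetical predictions for a form with two widgets.
sample = [("u1", "name", 1.0), ("u2", "name", 3.0), ("u1", "email", 4.0)]
score, by_widget = page_effort(sample)
```

Averaging per widget before averaging per page keeps a heavily used widget from dominating the page score, which is one possible design choice among several.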

Paper Nr: 54
Title:

Cornucopia: Tool Support for Selecting Machine Learning Lifecycle Artifact Management Systems

Authors:

Marius Schlegel and Kai-Uwe Sattler

Abstract: The explorative and iterative nature of developing and operating machine learning (ML) applications leads to a variety of ML artifacts, such as datasets, models, hyperparameters, metrics, software, and configurations. To enable comparability, traceability, and reproducibility of ML artifacts across the ML lifecycle steps and iterations, platforms, frameworks, and tools have been developed to support their collection, storage, and management. Selecting the best-suited ML artifact management systems (AMSs) for a particular use case is often challenging and time-consuming due to the plethora of AMSs, their different focus, and imprecise specifications of features and properties. Based on assessment criteria and their application to a representative selection of more than 60 AMSs, this paper introduces an interactive web tool that enables the convenient and time-efficient exploration and comparison of ML AMSs.

Paper Nr: 21
Title:

Study on VR Application Efficiency of Selected Android OS Mobile Devices

Authors:

Przemyslaw Falkowski-Gilski and Karol Fidurski

Abstract: Currently, the number of scenarios for using VR (Virtual Reality) technology grows every year. Yet, there are still issues associated with it, related to the performance of the mobile device itself. The aim of this work is to perform an analysis of the effectiveness of virtual reality applications on mobile platforms. We put the main emphasis on examining the performance and efficiency of four different hardware and software platforms, evaluated in a number of research scenarios related to typical user activities. The performance of various consumer devices running Android OS was assessed using selected benchmark applications. Additionally, a custom-built environment was also created to facilitate further testing, including an enhanced HCI (Human-Computer Interface) linking the mobile device, head-mounted goggles, and a powerful desktop PC. The performed tests and obtained results can aid any interested individual when choosing the right mobile device, as well as configuring the VR environment, for various UX (User Experience) purposes.

Area 2 - Internet Technology

Full Papers
Paper Nr: 6
Title:

Classifying the Reliability of the Microservices Architecture

Authors:

Adrian Ramsingh, Jeremy Singer and Phil Trinder

Abstract: Microservices are popular for web applications as they offer better scalability and reliability than monolithic architectures. Reliability is improved by loose coupling between individual microservices. However, in production systems some microservices are tightly coupled, or chained together. We classify the reliability of microservices: if a minor microservice fails then the application continues to operate; if a critical microservice fails, the entire application fails. Combining reliability (minor/critical) with the established classifications of dependence (individual/chained) and state (stateful/stateless) defines a new three-dimensional space: the Microservices Dependency State Reliability (MDSR) classification. Using three web application case studies (Hipster-Shop, Jupyter and WordPress) we identify microservice instances that exemplify the six points in MDSR. We present a prototype static analyser that can identify all six classes in Flask web applications, and apply it to seven applications. We explore case study examples that exhibit either a known reliability pattern or a bad smell. We show that our prototype static analyser can identify three of six patterns/bad smells in Flask web applications. Hence MDSR provides a structured classification of microservice software with the potential to improve reliability. Finally, we evaluate the reliability implications of the different MDSR classes by running the case study applications against a fault injector.

Paper Nr: 15
Title:

Few-shot Approach for Systematic Literature Review Classifications

Authors:

Maísa Kely de Melo, Allan Victor Almeida Faria, Li Weigang, Arthur Gomes Nery, Flávio Augusto R. de Oliveira, Ian Teixeira Barreiro and Victor Rafael Rezende Celestino

Abstract: Systematic Literature Review (SLR) studies aim to leverage relevant insights from scientific publications to achieve a comprehensive overview of the academic progress of a specific field. In recent years, a major effort has been expended in automating the SLR process by extracting, processing, and presenting the synthesized findings. However, implementations capable of few-shot classification for fields of study with a smaller amount of material available seem to be lacking. This study aims to present a system capable of conducting automated systematic literature reviews with classification constrained by few-shot learning. We propose an open-source, domain-agnostic meta-learning SLR framework for few-shot classification, which has been validated using 64 SLR datasets. We also define an Adjusted Work Saved over Sampling (AWSS) metric to take into account the class imbalance during validation. The initial results show that AWSS@95% scored as high as 0.9 when validating our learner with data from 32 domains (just 16 examples were used for training in each domain), and only four of them resulted in scores lower than 0.1. These findings indicate significant savings in screening time for literature reviewers.
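For orientation, the sketch below computes the standard, unadjusted Work Saved over Sampling metric commonly used in screening automation, WSS@R = (TN + FN) / N − (1 − R). It is not the authors' AWSS, whose adjustment for class imbalance is not given in the abstract; the function name and the counts are hypothetical.

```python
def work_saved_over_sampling(true_negatives, false_negatives, total, recall=0.95):
    """Standard WSS at a given recall level.

    (TN + FN) / N is the fraction of documents a reviewer no longer has to
    read; (1 - recall) is the saving that random sampling would achieve at
    the same recall. This is the unadjusted metric, not AWSS.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    return (true_negatives + false_negatives) / total - (1.0 - recall)


# Hypothetical screening outcome: 1000 candidate papers, 605 excluded by the
# classifier, of which 5 were in fact relevant (false negatives).
wss_95 = work_saved_over_sampling(true_negatives=600, false_negatives=5, total=1000)
```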

Paper Nr: 23
Title:

Agile in Higher Education: How Can Value-based Learning Be Implemented in Higher Education?

Authors:

Eva-Maria Schön, Ilona Buchem and Stefano Sostak

Abstract: The corona pandemic has shown how important it is to be able to react quickly to changing conditions. In many organizations, agile process models and agile practices are used for this purpose. This paper examines how agility can be implemented in higher education. Using two case studies, we analyze how agile practices and agile values are implemented for knowledge and skills development. Our results present a student-centered approach where lecturers supported self-organized learning. In the student-centered approach, prior knowledge and experience of learners are taken into account, and the learning process is adjusted through continuous feedback. With the introduction of agility, a value shift towards value-based learning is taking place. Value-based learning supports competency-based teaching since the focus is less on imparting technical knowledge and more on imparting competencies.

Paper Nr: 25
Title:

Privacy Policy Beautifier: Bringing Privacy Policies Closer to Users

Authors:

Michalis Kaili and Georgia M. Kapitsaki

Abstract: A plethora of privacy policies are available online, as all websites need to inform users in detail about the processing of their personal data, especially in recent years in order to comply with the European General Data Protection Regulation (GDPR). As these texts tend to be very long, understanding their content takes time and is not easy for the majority of users, who usually do not spend the time to read the policy texts. In this paper, we present our work on Privacy Policy Beautifier, which aims to bring privacy policies closer to the user by highlighting specific parts of the text and presenting the information in different formats: textual with colors, pie chart, word cloud and GDPR terms presence. For the main part of this process, we utilize machine learning techniques. Privacy Policy Beautifier has been evaluated via the survey methodology with 90 users and shows potential for assisting in the creation of more user-friendly privacy policy representations.

Paper Nr: 35
Title:

AKIP Process Automation Platform: A Framework for the Development of Process-Aware Web Applications

Authors:

Ulisses Telemaco Neto, Toacy Oliveira, Raquel Pillat, Paulo Alencar, Don Cowan and Glaucia Melo

Abstract: An increasing number of platforms for Business Process Automation (BPA) have been developed in recent years, including open-source and proprietary solutions. However, there are still some unsolved problems and limitations related to the adoption of these solutions, which include: vendor lock-in, limited UI/UX, limited integration, outdated technology stack, and lack of support for non-process features. The framework presented in this paper addresses these problems and limitations by providing an open-source platform to facilitate the development of process-aware information systems (PAISs) based on code generation techniques. The platform is capable of generating a functional process-aware Web application from a business process model defined in BPMN. To the best of our knowledge, there is no other software tool that generates fully functional process-aware Web applications. The presented framework has been evaluated in academia and industry and used to develop dozens of non-profit and commercial process-aware applications.

Paper Nr: 41
Title:

Machine Learning and Raspberry PI Cluster for Training and Detecting Skin Cancer

Authors:

Elias Rabelo Matos, Edward David Moreno and Kalil Araujo Bispo

Abstract: Context: Melanoma is the most popular and aggressive type of skin cancer, with thousands of cases and deaths worldwide each year. But melanoma isn’t the only type of skin lesion. Since 2016, the ISIC (International Skin Imaging Collaboration) has been launching challenges toward skin lesion detection. In this paper, we use the HAM10000 dataset, which is part of the ISIC archive and contains seven classes of skin lesions, to train a DenseNet network aiming to achieve state-of-the-art accuracy. Objective: We evaluate the use of a low-cost cluster with four Raspberry Pi devices to check its viability as a machine learning cluster for detecting one type of skin cancer. Method: We trained a deep convolutional neural network using the pre-trained models of four networks and obtained 89% accuracy, which is a state-of-the-art value. Afterwards, we performed two experiments: (i) we used the knowledge transfer technique to run an MLP model on four Raspberry Pi devices and (ii) we trained the pre-trained DenseNet on a Raspberry Pi cluster to verify whether a low-cost cluster is viable for this approach. Results: We found that it is not possible to train our network using only four Raspberry Pi devices, since they have low computational resources, but we show which additional resources are needed to perform this task. Despite this, we achieved 80% accuracy using the knowledge transfer technique and only four Raspberry Pi devices.

Paper Nr: 47
Title:

Semi Real-time Data Cleaning of Spatially Correlated Data in Traffic Sensor Networks

Authors:

Federica Rollo, Chiara Bachechi and Laura Po

Abstract: The new Internet of Things (IoT) era is submerging smart cities with data. Various types of sensors are widely used to collect massive amounts of data and to feed several systems such as surveillance, environmental monitoring, and disaster management. In these systems, sensors are deployed to make decisions or to predict an event. However, the accuracy of such decisions or predictions depends upon the reliability of the sensor data. By their nature, sensors are prone to errors, therefore identifying and filtering anomalies is extremely important. This paper proposes an anomaly detection and classification methodology for spatially correlated data of traffic sensors that combines different techniques and is able to distinguish between traffic sensor faults and unusual traffic conditions. The reliability of this methodology has been tested on real-world data. The application on two days affected by car accidents reveals that our approach can detect unusual traffic conditions. Moreover, the data cleaning process could enhance traffic management by ameliorating the traffic model performances.
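One way to picture how spatial correlation can separate sensor faults from unusual traffic is sketched below: an anomalous reading that is echoed by nearby sensors is treated as a traffic condition, while an isolated one is treated as a fault. This rule, the threshold `min_agreeing`, and all names and data are assumptions for illustration only; the abstract does not describe the authors' actual combination of techniques.

```python
def classify_reading(sensor, flagged, neighbours, min_agreeing=1):
    """Label one sensor's reading for a single time step.

    flagged:    set of sensor ids whose reading deviates from its usual pattern
    neighbours: dict mapping a sensor id to the ids of spatially close sensors
    The agreement rule is a hypothetical simplification.
    """
    if sensor not in flagged:
        return "normal"
    # Count how many nearby sensors report an anomaly at the same time.
    agreeing = sum(1 for n in neighbours.get(sensor, ()) if n in flagged)
    return "unusual traffic" if agreeing >= min_agreeing else "sensor fault"


# Hypothetical layout: s1, s2, s3 share a road segment; s4 and s5 another.
neighbours = {
    "s1": ["s2", "s3"],
    "s2": ["s1", "s3"],
    "s3": ["s1", "s2"],
    "s4": ["s5"],
    "s5": ["s4"],
}
flagged = {"s1", "s2", "s4"}
labels = {s: classify_reading(s, flagged, neighbours) for s in neighbours}
```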

Short Papers
Paper Nr: 3
Title:

Who Ate All Our Cookies? Investigating Publishers’ Challenges Caused by Changes in Third-party Cookie Tracking

Authors:

Valerio Stallone, Aline Gägauf and Tania Kaya

Abstract: This paper investigates the potential reactions of Swiss publishers, as actors within the digital advertising ecosystem, to the forthcoming fundamental changes to user tracking on the World Wide Web. The results of this mixed-methods study initiate the discussion on the future of cookie tracking by setting and then answering four hypotheses regarding first-party tracking, shared ID solutions, Google’s Privacy Sandbox, and a national walled garden system. The results show a clear inclination of Swiss publishers towards first-party tracking and shared ID solutions, a neutral stance towards Google’s efforts to undo the harm provoked by its upcoming change, and an aversion towards a nation-wide walled garden. These findings are intended to increase the volume of the discussion on the effects of BigTech’s changes on the digital advertising ecosystem as a whole and therefore stimulate further research on the effects on single actors within this ecosystem – beyond the publishers themselves.

Paper Nr: 11
Title:

Temporal Evolution of Topics on Twitter

Authors:

Daniel Pereira, Wladmir Brandão and Mark Song

Abstract: Social networks have become an environment where users express their feelings and share news in real time. But analyzing the content produced by the users is not simple, considering the number of posts. It is worthwhile to understand what is being expressed by users to get insights about companies, public figures, and news. To the best of our knowledge, the state of the art lacks studies about how the topics discussed by social network users change over time. In this context, this work measures how topics discussed on Twitter vary over time. We used Formal Concept Analysis to measure how these topics varied, considering the support and confidence metrics. We tested our solution on two case studies, the first using RepLab 2013 and the second creating a database of tweets that discuss vaccines in Brazil. The results confirm that it is possible to understand what Twitter users were discussing and how these topics changed over time. Our work benefits companies that want to analyze what users are saying about them.
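The support and confidence metrics mentioned above have standard definitions, sketched below over tweets represented as sets of terms. This shows only the two metrics, not the Formal Concept Analysis pipeline of the paper; the function names and the sample tweets are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= t) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Support of (antecedent and consequent) divided by support of antecedent."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    return support(transactions, set(antecedent) | set(consequent)) / base


# Hypothetical tweets, each reduced to a set of topic terms.
tweets = [
    {"vaccine", "brazil"},
    {"vaccine", "dose"},
    {"vaccine", "brazil", "dose"},
    {"news"},
]
s = support(tweets, {"vaccine", "brazil"})
c = confidence(tweets, {"vaccine"}, {"brazil"})
```

Computing both values separately for each time window would give one simple way to see whether a topic association strengthens or fades.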

Paper Nr: 14
Title:

Evaluation of Project Work in Public Administrations in E-government and Digitization Projects in Germany

Authors:

Hanna Looks, Anja Gebauer, Jörg Thomaschewski, María José Escalona and Eva-Maria Schön

Abstract: Particularly in times of crisis, it is apparent that the digital transformation in Germany has not yet progressed far enough. Public administrations are confronted with legally prescribed obligations to implement e-government and digitization projects in the near future. Project work and the involvement of users in the implementation are increasingly coming into focus. The objective of this paper is to identify the challenges in project work in the context of the implementation of e-government and digitalization projects in public administration. This was done by comparing survey results from 2018 and 2021 and analyzing whether agility can make a decisive contribution to eliminating these challenges. Moreover, it was investigated whether user involvement in implementation is already taking place. This study was conducted for the first time in 2018 and again in 2021 using an online survey. The two samples of the questionnaire study (2018 and 2021) show that public administrations are increasingly coming into contact with agile methods. Furthermore, challenges were identified that limit improved project work in public administrations and that need to be addressed.

Paper Nr: 19
Title:

Multi-label Emotion Classification using Machine Learning and Deep Learning Methods

Authors:

Drashti Kher and Kalpdrum Passi

Abstract: Emotion detection in online social networks benefits many applications such as personalized advertisement services and suggestion systems. Emotion can be identified from various sources like text, facial expressions, images, speeches, paintings, songs, etc. Emotion detection can be done by various techniques in machine learning. Traditional emotion detection techniques mainly focus on multi-class classification while ignoring the co-existence of multiple emotion labels in one instance. This research work is focussed on classifying multiple emotions from data to handle complex data with the help of different machine learning and deep learning methods. Before modeling, data analysis is done first and then the data is cleaned. Data preprocessing is performed in steps such as stop-words removal, tokenization, stemming and lemmatization, using the Natural Language Toolkit (NLTK). All the input variables are converted into vectors by naive text encoding techniques like word2vec, Bag-of-words, and term frequency-inverse document frequency (TF-IDF). This research is implemented using the Python programming language. To solve the multi-label emotion classification problem, machine learning and deep learning methods were used. Evaluation parameters such as accuracy, precision, recall, and F1-score were used to evaluate the performance of the classifiers Naïve Bayes, support vector machine (SVM), Random Forest, K-nearest neighbour (KNN), and GRU (Gated Recurrent Unit) based RNN (Recurrent Neural Network) with Adam optimizer and Rmsprop optimizer. GRU based RNN with Rmsprop optimizer achieves an accuracy of 82.3%, Naïve Bayes achieves the highest precision of 0.80, Random Forest achieves the highest recall score of 0.823, and SVM achieves the highest F1 score of 0.798 on the challenging SemEval2018 Task 1: E-c multi-label emotion classification dataset. Also, a one-way Analysis of Variance (ANOVA) test was performed on the mean values of the performance metrics (accuracy, precision, recall, and F1-score) for all the methods.
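To make the multi-label evaluation concrete, the sketch below computes micro-averaged precision, recall, and F1 over sets of emotion labels. Micro-averaging is an assumption, since the abstract does not say how the scores were averaged across labels; the function name and the example labels are hypothetical.

```python
def micro_precision_recall_f1(y_true, y_pred):
    """Micro-averaged metrics for multi-label predictions.

    y_true, y_pred: equally long lists of label sets, one set per instance.
    Counts are pooled over all instances and labels before dividing.
    """
    tp = sum(len(t & p) for t, p in zip(y_true, y_pred))  # correctly predicted labels
    fp = sum(len(p - t) for t, p in zip(y_true, y_pred))  # predicted but not true
    fn = sum(len(t - p) for t, p in zip(y_true, y_pred))  # true but not predicted
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical gold and predicted emotion sets for three tweets.
gold = [{"joy", "optimism"}, {"anger"}, {"sadness", "fear"}]
pred = [{"joy"}, {"anger", "disgust"}, {"sadness", "fear"}]
precision, recall, f1 = micro_precision_recall_f1(gold, pred)
```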

Paper Nr: 20
Title:

Deep-vacuity: A Proposal of a Machine Learning Platform based on High-performance Computing Architecture for Insights on Government of Brazil Official Gazettes

Authors:

Leonardo R. De Carvalho, Felipe S. Lopes, Jefferson Chaves, Marcos C. Lima, Flávio E. Gomes De Deus, Aletéia P. F. A. von Paungarthem and Flavio De Barros Vidal

Abstract: Brazil publishes region information, public tenders for the hiring of civil servants, and also government contracts with companies in its Official Gazettes. All this volume of information can contain interesting relationships that reveal unique characteristics of the government, such as the effectiveness of public policies and even the existence of illegal schemes. Establishing these relationships is not a trivial task and requires great effort. Therefore, this work proposes the Deep Vacuity platform, which, by using a High-Performance Computing architecture along with Machine Learning techniques, can collect, cleanse, consolidate and analyze the data, offering a friendly interface for decision-makers.

Paper Nr: 32
Title:

Adaptive and Collaborative Inference: Towards a No-compromise Framework for Distributed Intelligent Systems

Authors:

Alireza Furutanpey and Schahram Dustdar

Abstract: Deep Neural Networks (DNNs) are the backbone of virtually all complex, intelligent systems. However, networks which achieve state-of-the-art accuracy cannot execute inference tasks within a reasonable time on commodity hardware. Consequently, latency-sensitive mobile and Internet of Things (IoT) applications must compromise by executing a heavily compressed model locally or offloading their inference task to a remote server. Sacrificing accuracy is unacceptable for critical applications, such as anomaly detection. Offloading inference tasks requires ideal network conditions and harbours privacy risks. In this position paper, we introduce a series of planned research work with the overarching aim to provide a (close to) no compromise framework for accurate and fast inference. Specifically, we envision a composition of solutions that leverage the upsides of different computing paradigms while overcoming their downsides through collaboration and adaptive methods that maximise resource efficiency.

Paper Nr: 37
Title:

Interaction Lab: Web User Interaction Tracking and Analysis Tool

Authors:

Daniel Fernández-Lanvin, Javier de Andrés, Martín González-Rodríguez and Pelayo García Torre

Abstract: Web interaction is a complex process that involves a series of gestures, patterns and determining factors. The degree to which these factors influence the user experience in any of its facets (performance, satisfaction, etc.) is a critical aspect since it can mean the success or failure of a website. This influence can be measured through experiments and is an important area of research in Human-Computer Interaction. This paper presents a web tool designed to support this type of experiment, providing a semi-automated way to instrument web applications, collect the interaction data of the subjects and analyse it once the experiment is finished.

Paper Nr: 43
Title:

The Ginger: Another Spice to Hinder Attacks on Password Files

Authors:

Francesco Buccafurri, Vincenzo De Angelis and Sara Lazzaro

Abstract: One of the threats to password-based authentication is that the attacker is able to steal the password file from the server. Despite the fact that, thanks to the currently adopted security mechanisms such as salt, pepper, and key derivation functions, it is very hard for the attacker to reverse the password file, dedicated hardware is available that can make this attack feasible. Therefore, there is still a need to better counter password-file reversing. In this paper, we propose a new mechanism, called ginger, which can be combined with the above mechanisms, to increase the robustness of password-based authentication against password-file reversing. Unlike pepper and salt, ginger is stored client-side, and enables a stateful authentication process. A careful security analysis shows the benefits of the proposed innovation.
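As background for the existing mechanisms named above, the sketch below combines a per-user salt, a server-side pepper, and a key derivation function (PBKDF2 from the Python standard library). It does not model ginger, whose client-side, stateful protocol is not specified in the abstract. Appending the pepper to the password, the iteration count, and the function names are assumptions chosen for illustration.

```python
import hashlib
import hmac
import secrets

# Pepper: one server-wide secret kept outside the password file (assumed
# here to live in memory; a real deployment would load it from secure storage).
PEPPER = secrets.token_bytes(32)
ITERATIONS = 100_000  # illustrative value, not a recommendation


def hash_password(password, salt=None):
    """Return (salt, digest) for storage in the password file."""
    if salt is None:
        salt = secrets.token_bytes(16)  # salt: random, unique per user, stored in clear
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8") + PEPPER, salt, ITERATIONS
    )
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Recompute the digest and compare it in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)


stored_salt, stored_digest = hash_password("correct horse")
```

An attacker who steals only the password file obtains the salt and digest but not the pepper, which is the gap that additional mechanisms such as ginger aim to widen further.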

Paper Nr: 51
Title:

A Distributed Registry of Multi-perspective Data Services in Cyber Physical Production Networks

Authors:

Ada Bagozi, Devis Bianchini and Anisa Rula

Abstract: The advances in smart technologies, such as sensor networks, cloud computing, data management and artificial intelligence, enable production systems to communicate with each other and rapidly configure themselves to meet dynamic production needs. In this context, the adoption of service-oriented computing is aimed at enabling modular and standardised software infrastructures, platform-independent interactions between software components and information hiding for ensuring data sovereignty in a fully distributed environment. However, for a full-fledged exploitation of service-oriented computing capabilities in the Industry 4.0 production systems, the existing service design solutions still lack a clear specification of which data the service relies on, what the business goal of the service is, and when it is invoked within the information flow throughout the production network. In this paper, we propose the model of a registry of data-oriented services in an industrial production chain. The organisation of services in the registry is guided by multiple aspects of the production network, namely: (i) the business goal of a real production network; (ii) the perspective on production data that is managed through the service; (iii) the high-level action performed by the service. The modelling strategy has been conceived to properly guide service design against ad-hoc solutions, thus facilitating future service selection and composition to meet the business goals of collaborating actors. The resulting portfolio of services can be tailored by each actor of the production network, leading to a distributed registry that allows each actor to preserve control over the owned data. A case study has been performed to demonstrate the feasibility of the data-oriented services.

Paper Nr: 57
Title:

A Data-Centric Anomaly-Based Detection System for Interactive Machine Learning Setups

Authors:

Joseph Bugeja and Jan A. Persson

Abstract: A major concern in the use of Internet of Things (IoT) technologies in general is their reliability in the presence of security threats and cyberattacks. In particular, there is a growing recognition that IoT environments featuring virtual sensing and interactive machine learning may be subject to additional vulnerabilities when compared to traditional networks and classical batch learning settings. This is partly because adversaries could more easily manipulate the user feedback channel with malicious content. To this end, we propose a data-centric anomaly-based detection system, based on machine learning, that facilitates the process of identifying anomalies, particularly those related to poisoning integrity attacks targeting the user feedback channel of interactive machine learning setups. We demonstrate the capabilities of the proposed system in a case study involving a smart campus setup consisting of different smart devices, namely, a smart camera, a climate sensmitter, smart lighting, a smart phone, and a user feedback channel over which users could furnish labels to improve detection of correct system states, namely, activity types happening inside a room. Our results indicate that anomalies targeting the user feedback channel can be accurately detected at 98% using the Random Forest classifier.

Paper Nr: 58
Title:

Using Feature Analysis to Guide Risk Calculations of Cyber Incidents

Authors:

Benjamin Aziz and Alaa Mohasseb

Abstract: The prediction of incident features, for example through the use of text analysis and mining techniques, is one method by which the risk underlying Cyber security incidents can be managed and contained. In this paper, we define risk as the product of the probability of misjudging incident features and the impact such misjudgment could have on incident responses. We apply our idea to a simple case study involving a dataset of Cyber intrusion incidents in South Korean enterprises. We investigate two problems: first, the prediction of response actions to future incidents involving malware; and second, the utilisation of the knowledge of the response actions in guiding analysis to determine the type of malware or the name of the malicious code.
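
The risk definition stated in this abstract (probability of misjudging a feature multiplied by the impact of that misjudgment) can be illustrated with a minimal sketch. The class name, field names and numeric values below are hypothetical and chosen only for illustration; they are not taken from the paper, which may estimate probability and impact quite differently.

```python
from dataclasses import dataclass


@dataclass
class FeatureRisk:
    """Risk attached to one incident feature (all names are illustrative)."""
    feature: str
    p_misjudge: float  # assumed probability of misjudging the feature, in [0, 1]
    impact: float      # assumed impact score of a misjudgment (arbitrary units)

    def risk(self) -> float:
        # Risk as the product of probability and impact, as stated in the abstract.
        if not 0.0 <= self.p_misjudge <= 1.0:
            raise ValueError("p_misjudge must lie in [0, 1]")
        return self.p_misjudge * self.impact


def rank_by_risk(items):
    """Return the features ordered from highest to lowest risk."""
    return sorted(items, key=lambda item: item.risk(), reverse=True)


if __name__ == "__main__":
    # Hypothetical values, not results from the paper.
    candidates = [
        FeatureRisk("malware_type", p_misjudge=0.25, impact=8.0),
        FeatureRisk("response_action", p_misjudge=0.5, impact=2.0),
    ]
    for item in rank_by_risk(candidates):
        print(item.feature, item.risk())
```

Ranking features by this product is one possible way to decide where analysis effort goes first; the abstract itself does not say how the values are used beyond guiding risk calculations.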

Paper Nr: 10
Title:

Transfer, Measure and Extend Maintainability Metrics for Web Component based Applications to Achieve Higher Quality

Authors:

Tobias Münch and Rainer Roosmann

Abstract: The last few years have seen an increased interest in the composite W3C web components standard. Web components can be used with or without frameworks like Angular or React. Modern web applications are developed according to the principles of component-based software development (CBSD). Therefore, a web application is a composition of multiple connected web components. In order to operate and continuously extend a web application successfully in the long term, the non-functional requirement of maintainability is of crucial importance. This paper describes a model to collect, measure and compare maintainability in web components and web applications. These consist of object-oriented (OOP) code and a bound HTML fragment. Previous knowledge of maintainability is extended to the interconnection between OOP code and HTML fragments. In particular, the coupling and cohesion between web components are analyzed. Through the developed model, the maintainability of web components can be specified in more detail. This allows web developers to analyze the quality of their web applications and reach a higher software-quality level.

Paper Nr: 40
Title:

Towards a Pattern-based Approach for Transforming Legacy COBOL Applications Into RESTful Web Services

Authors:

Christoph Gaudl and Philipp Brune

Abstract: Many aspects of modern life still depend on large-scale, monolithic legacy applications, e.g. in financial services, transport or public administration. Typically, these applications are written in ancient programming languages such as COBOL and use proprietary transaction processing monitors like CICS. While the modernization or replacement of these legacy applications has been discussed in literature and practice for decades, still no universal solution exists. In many cases, an evolutionary modernization strategy has proven successful in practice, allowing the software architecture to be modernized as well, not only the program code. Therefore, in this paper an analysis pattern is derived for transforming stateful, transactional COBOL programs into stateless RESTful web services. This pattern is evaluated by analyzing and transforming an example COBOL application to Java. While the approach proves useful in the case of the example application, it needs to be further investigated in a broader range of real-world scenarios.

Area 3 - Social Network Analytics

Full Papers
Paper Nr: 28
Title:

Cooking Reviews Segmentation and Classification based on Deep Learning and Named Entity Detection

Authors:

Randa Benkhelifa and Nasria Bouhyaoui

Abstract: YouTube is one of the most used online social networking (OSN) websites for exchanging recipes. It allows users to upload, search for, download, rate and review them. Sentiment analysis of comments on food and cooking recipes aims to identify what people think about a cooking recipe video through users’ comments. Nowadays, users give their opinion not only about recipes; they also evaluate the cook through their comments, where a cook’s reputation can affect the users’ opinion about his cooking recipes. Frequently, when a cook has a good reputation, his recipes are well received, and vice versa. In this paper, we propose a new approach that deals with the sentiment classification of cooking reviews. Firstly, we examine the benefit of performing named entity detection and conjunctions on our corpus for text segmentation in order to divide the comment into segments concerning the cook and segments concerning the recipe. Next, we make two sentiment classifications (about the cook and about the recipe). Finally, we incorporate the polarity of the cook sentiment classification in the recipe sentiment classification in order to analyse the effect of the opinion about the cook on the performance of the categorization of the shared cooking recipes comments in OSNs.

Paper Nr: 34
Title:

Unsupervised Aspect Term Extraction for Sentiment Analysis through Automatic Labeling

Authors:

Danny Suarez Vargas, Lucas R. C. Pessutto and Viviane Pereira Moreira

Abstract: In sentiment analysis, there has been growing interest in performing finer granularity analysis focusing on entities and their aspects. This is the goal of Aspect-based Sentiment Analysis, which commonly involves the following tasks: Opinion Target Extraction (OTE), Aspect Term Extraction (ATE), and Polarity Classification (PC). This work focuses on the second task, which is the most challenging and least explored in the unsupervised context. The difficulty arises mainly due to the nature of the data (user-generated contents or product reviews) and the inconsistent annotation of the evaluation datasets. Existing approaches for ATE and OTE either depend on annotated data or are limited by the availability of domain- or language-specific resources. To overcome these limitations, we propose UNsupervised Aspect Term Extractor (UNATE), an end-to-end unsupervised ATE solution. Our solution relies on a combination of topic models, word embeddings, and a BERT-based classifier to extract aspects even in the absence of annotated data. Experimental results on datasets from different domains have shown that UNATE achieves precision and F-measure scores comparable to the semi-supervised and unsupervised state-of-the-art ATE solutions.

Paper Nr: 55
Title:

Do Top Higher Education Institutions’ Social Media Communication Differ Depending on Their Rank?

Authors:

Alvaro Figueira and Lirielly Vitorugo Nascimento

Abstract: Higher Education Institutions use social media as a marketing channel to attract and engage users so that the institution is promoted and thus a wide range of benefits can be achieved. These institutions are evaluated globally on various success parameters, and the results are published in rankings. In this paper, we analyze their publishing strategies and compare the results with their overall ranking positions. The results show that there is a tendency to find a particular strategy in the top ranked universities. We also found cases where the strategies are less prominent and do not match the ranking positions.

Short Papers
Paper Nr: 4
Title:

A Skill Sharing Platform for Team Collaboration and Knowledge Exchange

Authors:

Victor Obionwu, Andreas Nurnberger and Gunter Saake

Abstract: Teamwork or collaborative learning processes are known to be highly dependent on the connection and subsequent interactions that are established among the participants. Such collaboration encourages knowledge creation and sharing, which results in participants developing skills faster with respect to the subject matter. Platforms such as blogs are known to be efficient in stimulating reflection, a sense of community, and collaboration. Thus, in this study, we discuss our blog implementation in SQLValidator, a web-based learning platform that focuses on database-related courses. We further discuss the students’ experiences while using blogs as knowledge acquisition tools. Furthermore, qualitative data were collected from the observation of students to gain more perspective about their experiences in using the blogs in their learning.

Paper Nr: 18
Title:

“eRReBIS” Business Intelligence based Intelligent Recommender System for e-Recruitment Process

Authors:

Siwar Ayadi, Manel Bensassi and Henda Ben Ghezala

Abstract: Due to the continuous and growing spread of the coronavirus worldwide, it is important, especially in the business era, to develop accurate data-driven decision-support systems to help business decision-makers process and manage large amounts of information in the recruitment process. In this context, e-Recruitment Recommender systems emerged as decision support systems that aim to help stakeholders in finding items that match their preferences. However, existing solutions do not allow the recruiter to manage the whole process from different points of view. Thus, the main goal of this paper is to build an accurate and generic data-driven system based on Business Intelligence architecture. The strengths of our proposal lie in the fact that it allows decision makers to (1) consider multiple and heterogeneous data sources, and access and manage data in order to generate strategic reports and recommendations at all times; (2) combine many similarity measures in the recommendation process; and (3) apply prescriptive analysis and machine learning algorithms to offer adapted and efficient recommendations.

Area 4 - Web Intelligence and Semantic Web

Full Papers
Paper Nr: 17
Title:

Using Transfer Learning To Classify Long Unstructured Texts with Small Amounts of Labeled Data

Authors:

Carlos Alberto Alvares Rocha, Marcos Vinícius Pinheiro Dib, Li Weigang, Andrea Ferreira Portela Nunes, Allan Victor Almeida Faria, Daniel Oliveira Cajueiro, Maísa Kely de Melo and Victor Rafael Rezende Celestino

Abstract: Text classification is a traditional problem in Natural Language Processing (NLP). Most of the state-of-the-art implementations require high-quality, voluminous, labeled data. Models pre-trained on large corpora have proven beneficial for text classification and other NLP tasks, but they can only take a limited number of symbols as input. This is a real case study that explores different machine learning strategies to classify a small amount of long, unstructured, and uneven data to find a proper method with good performance. The collected data includes texts of financing opportunities that international R&D funding organizations provided on their websites. The main goal is to find international R&D funding eligible for Brazilian researchers, sponsored by the Ministry of Science, Technology and Innovation (MCTI). We use pre-training and word embedding solutions to learn the relationship of the words from other datasets with considerable similarity and larger scale. Then, using the acquired features, based on the available dataset from MCTI, we apply transfer learning plus deep learning models to improve the comprehension of each sentence. Compared to the baseline accuracy rate of 81%, based on the available datasets, and the 85% accuracy rate achieved through a Transformer-based approach, the Word2Vec-based approach improved the accuracy rate to 88%. The research results serve as a successful case of artificial intelligence in a federal government application.

Paper Nr: 45
Title:

Characterizing Open Government Data Available on the Web from the Quality Perspective: A Systematic Mapping Study

Authors:

Rafael Chiaradia Almeida, Glauco de Figueiredo Carneiro and Edward David Moreno

Abstract: Context: Data openness can create opportunities for new and disruptive digital services on the web that have the potential to benefit the whole society. However, the quality of those data is a crucial factor for the success of any endeavor based on information made available by the government. Objective: Analyze the current state of the art of quality evaluation of open government data available on the web from the perspective of discoverability, accessibility, and usability. Methods: We performed a systematic mapping review of the published peer-reviewed literature from 2011 to 2021 to gather evidence on how practitioners and researchers evaluate the quality of open government data. Results: Out of 792 records, we selected 21 articles from the literature. Findings suggest no consensus regarding the quality evaluation of open government data. Most studies did not mention the datasets’ application domain, and the preferred data analysis approach mainly relies on human observation. Of the non-conformities cited, data discoverability and usability stand out from the others. Conclusions: There is also no consensus regarding the dimensions to be included in the evaluation. None of the selected articles reported the use of machine learning algorithms for this end.

Paper Nr: 52
Title:

Exploiting Linked Data-based Personalization Strategies for Recommender Systems

Authors:

Gabriela Oliveira Mota Da Silva, Lara Sant'Anna do Nascimento and Frederico Araújo Durão

Abstract: People seek assertive and reliable recommendations to support their daily decision-making tasks. To this end, recommendation systems rely on personalized user models to suggest items to a user. Linked Data-driven systems are a kind of Web Intelligent systems, which leverage the semantics of links between resources in the RDF graph to provide metadata (properties) for the user modeling process. One problem with this approach is the sparsity of the user-item matrix, caused by the many different properties of an item. However, feature selection techniques have been applied to minimize the problem. In this paper, we perform a feature selection preprocessing step based on the ontology summary data. Additionally, we combine it with a personalization strategy that associates weights with relevant properties according to the user’s previous interactions with the system. These strategies together aim to improve the performance and accuracy of the recommender system, since only latent representations are processed by the recommendation engine. We perform several experiments on two publicly available datasets in the film and music domains. We compare the advantages and disadvantages of the proposed strategies with non-personalized and non-preprocessed approaches. The experiments show significant increases in top-n recommendation tasks in Precision@K (K=5, 10), MAP, and NDCG.
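
For readers unfamiliar with the Precision@K metric named in this abstract, a minimal sketch of its standard definition follows: the fraction of the top K recommended items that are relevant to the user. The function name and the example item identifiers are hypothetical, and the paper's own evaluation code may differ (for instance in how lists shorter than K are handled).

```python
def precision_at_k(recommended, relevant, k):
    """Fraction of the first k recommended items that appear in `relevant`.

    recommended: ranked list of item ids, best first.
    relevant: set of item ids the user actually found relevant.
    Assumption: lists shorter than k are still divided by k.
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    return hits / k


if __name__ == "__main__":
    # Hypothetical ranking and relevance judgments, not data from the paper.
    ranking = ["film_a", "film_b", "film_c", "film_d", "film_e"]
    liked = {"film_a", "film_c", "film_x"}
    print(precision_at_k(ranking, liked, 5))
```

Averaging this value over all test users gives the system-level Precision@K that such experiments typically report.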

Short Papers
Paper Nr: 13
Title:

Factoid vs. Non-factoid Question Identification: An Ensemble Learning Approach

Authors:

Alaa Mohasseb and Andreas Kanavos

Abstract: Question Classification is one of the most important applications of information retrieval. Identifying the correct question type constitutes the main step to enhance the performance of question answering systems. However, distinguishing between factoid and non-factoid questions is considered a challenging problem. In this paper, a grammar-based framework has been adapted for question identification. Ensemble learning models were used for the classification process, and experimental results show that the combination of grammatical question features with the ensemble learning models helped in achieving a good level of accuracy.

Paper Nr: 16
Title:

Comparative Analysis of Recurrent Neural Network Architectures for Arabic Word Sense Disambiguation

Authors:

Rakia Saidi, Fethi Jarray and Mohammed Alsuhaibani

Abstract: Word Sense Disambiguation (WSD) refers to the process of discovering the correct sense of an ambiguous word occurring in a given context. In this paper, we address the problem of Word Sense Disambiguation for low-resource languages such as Arabic. We model the problem as a supervised sequence-to-sequence learning task where the input is a stream of tokens and the output is the sequence of senses for the ambiguous words. We propose four recurrent neural network (RNN) architectures, namely Vanilla RNN, LSTM, BiLSTM and GRU. We achieve, respectively, 85.22%, 88.54%, 90.77% and 92.83% accuracy with Vanilla RNN, LSTM, BiLSTM and GRU. The obtained results demonstrate the superiority of the GRU-based deep learning model for WSD over the existing RNN models.

Paper Nr: 22
Title:

A Simple Algorithm for Checking Pattern Query Containment under Shape Expression Schema

Authors:

Haruna Fujimoto and Nobutaka Suzuki

Abstract: Query containment is one of the major fundamental problems for various kinds of data including RDF/graph, and is related to many important practical problems, e.g., determining independence of queries from updates and rewriting queries using views. In this paper, we consider a query containment problem under Shape Expression (ShEx), where a query is defined as a pattern graph with projection. We adopt a graph-theoretic approach to cope with the containment problem, and propose a simple sound algorithm for solving the problem. In our preliminary experiments, we first verified that the results of our algorithm are correct for all pairs of queries generated in the experiments. We also show that types of ShEx schema can be used to reduce the search space for checking pattern query containment.

Paper Nr: 27
Title:

Classification of the Top-cited Literature by Fusing Linguistic and Citation Information with the Transformer Model

Authors:

Masanao Ochi, Masanori Shiro, Jun’ichiro Mori and Ichiro Sakata

Abstract: The scientific literature contains a wide variety of data, including language, citations, and images of figures and tables. The Transformer model, released in 2017, was initially used in natural language processing but has since been widely used in various fields, including image processing and network science. Many Transformer models trained with an extensive data set are available, and we can apply small new data to the models for our focused tasks. However, classification and regression studies for scholarly data have been conducted primarily by using each data set individually and combining the extracted features, with insufficient consideration given to the interactions among the data. In this paper, we propose an end-to-end fusion method for linguistic and citation information in scholarly literature data using the Transformer model. The proposed method shows the potential to efficiently improve the accuracy of various classifications and predictions by end-to-end fusion of various data in the scholarly literature. Using a dataset from the Web of Science, we classified papers with the top 20% citation counts three years after publication. The results show that the proposed method improves the F-value by 2.65 to 6.08 percentage points compared to using only particular information.

Paper Nr: 30
Title:

Towards Soft Web Intelligence by Collecting and Processing JSON Data Sets from Web Sources

Authors:

Paolo Fosci and Giuseppe Psaila

Abstract: Over the last two decades, Web Intelligence has denoted a plethora of approaches to discover useful knowledge from the vast World-Wide Web; however, dealing with the immense variety of the Web is not easy and the challenge is still open. In this paper, we move from the previous functionalities provided by the J-CO Framework (a research project under development at the University of Bergamo, Italy) to identify a vision of Web Intelligence scopes in which capabilities of soft computing and soft querying provided by a stand-alone tool can actually create novel possibilities of performing useful analyses of JSON data sets directly coming from Web sources. The paper identifies some extensions to the J-CO Framework, which we implemented; then it shows an example of soft querying enabled by these extensions.

Paper Nr: 8
Title:

Term-based Website Evaluation Applying Word Vectors and Readability Measures

Authors:

Kiyoshi Nagata

Abstract: The homepage is now an important means of transmitting information not only in companies but also in any type of organization. However, it cannot be said that the page structure of a website is always in an appropriate state. Research on websites has been actively conducted from both academic and practical perspectives, sometimes within three major categories: web content mining, web structure mining, and web usage mining. In this paper, we mainly focus on term-based properties and propose a system that evaluates the appropriateness of link relationships in a given site, taking content and link structure properties into consideration. We also consider the readability of the text in each webpage by applying several readability measures to evaluate their uniformity across all pages. We implement these systems in our previously developed multilingual support application, and present some results of applying it to several websites.

Paper Nr: 9
Title:

POSER: A Semantic Payload Lowering Service

Authors:

Daniel Spieldenner

Abstract: Establishing flexible, data-source- and structure-independent interoperability between databases, web services or devices is a ubiquitous problem in today’s digitized, connected world. The concept of semantic interoperability is a promising approach to abstract communication from concrete protocols and data structures to a more meaning-driven data representation. While transforming structured data into semantic knowledge graphs is a well investigated problem, actually using semantic data in an ecosystem of established legacy services, often providing only a standard structured data API, is still an open issue. In this paper we propose an approach on how to formally describe possible mappings from a higher-level semantic data representation onto syntactically fixed structured data objects, introduce an algorithm describing how to generate structured data objects from semantic input using mapping rules following these concepts, and illustrate the approach with an example implementation for a possible interoperability layer service, connecting a semantic input data set to a JSON API.

Paper Nr: 12
Title:

ELEVEN Data-Set: A Labeled Set of Descriptions of Goods Captured from Brazilian Electronic Invoices

Authors:

Vinícius Di Oliveira, Li Weigang and Geraldo Pereira Rocha Filho

Abstract: The task of classifying short text through machine learning (ML) models is promising and challenging for economy-related sectors such as electronic invoice processing and auditing. Considering the scarcity of labeled short text data sets and the high cost of establishing new labeled short text databases for supervised learning, especially when they are manually established by experts, this research proposes ELEVEN (ELEctronic inVoicEs in portuguese laNguage) Data-Set in an open data format. This labeled short text database is composed of the product descriptions extracted from electronic invoices. These short Portuguese text descriptions are unstructured, but limited to 120 characters. First, we construct BERT and other models to demonstrate the short text classification using ELEVEN. Then, we show three successful cases, also using the data set we developed, to identify correct product codes according to the short text descriptions of goods captured from the electronic invoices and others. ELEVEN consists of 1.1 million merchandise descriptions recorded as labeled short-texts, annotated by specialist tax auditors, and detailed according to the Mercosur Common Nomenclature. For easy public use, ELEVEN is shared on GitHub via the link: https://github.com/vinidiol/descmerc.

Paper Nr: 29
Title:

Towards an Ontology-based Recommender System for Assessment in a Collaborative Elearning Environment

Authors:

Asma Hadyaoui and Lilia Cheniti-Belcadhi

Abstract: Personalized recommendations can help learners to overcome the information overload problem, by recommending learning resources according to learners’ preferences and level of knowledge. In this context, we propose a Recommender System for a personalized formative assessment in an online collaborative learning environment based on an assessment ePortfolio. Our proposed Recommender System allows recommending the next assessment activity and the most suitable peer to receive feedback from, and give feedback to, by connecting that learner’s ePortfolio with the ePortfolios of other learners in the same assessment platform. The recommendation process has to meet the learners’ progressions, levels, and preferences stored and managed on the assessment ePortfolio models: the learner model, the pre-test model, the assessment activity model, and the peer-feedback model. For the construction of each one, we proposed a semantic web approach using ontologies and eLearning standards to allow reusability and interoperability of data. Indeed, we used CMI5 specifications for the assessment activity model. IEEE PAPI Learner is used to describe learners and their relationships. To formalize the peer-feedback model and the pre-test model we referred to the IMS/QTI specifications. Our ontology for the assessment ePortfolio is the fundamental layer for our personalized Recommender System.

Paper Nr: 36
Title:

Publish and Enrich Geospatial Data as Linked Open Data

Authors:

Claire Ponciano, Markus Schaffert, Falk Würriehausen and Jean-Jacques Ponciano

Abstract: The rapid growth of geospatial data (at least 20% every year) makes spatial data increasingly heterogeneous. With the emergence of Semantic Web technologies, more and more approaches are trying to group these data in knowledge graphs, making it possible to link data together and to facilitate their sharing, use and maintenance. These approaches face the problem of homogenising these data, which on the one hand do not share a unified structure and on the other hand use a vocabulary that varies greatly depending on the application domain for which the data are intended and the language in which they are described. In order to solve this problem of homogenisation, we present in this paper the foundations of a framework for efficiently grouping heterogeneous spatial data in a knowledge base. This knowledge base is based on an ontology linked to Schema.org and DCAT-AP, and provides a data structure compatible with GeoSPARQL. This framework allows the integration of geospatial data independently of their original language by translating them using Neural Machine Translation.

Paper Nr: 49
Title:

An Artificial Intelligence Application for a Network of LPI-FMCW Mini-radar to Recognize Killer-drones

Authors:

Alberto Lupidi, Alessandro Cantelli-Forti, Edmond Jajaga and Walter Matta

Abstract: The foundation of Internet Information Systems was initially inspired by military applications. Means of air attack are pervasive in all modern armed conflicts or terrorist actions. Thus, building web-enabled, real-time, rapid and intelligent distributed decision-making systems is of immense importance. We present the intermediate results of the NATO-SPS project “Anti-Drones” that aims to fuse data from low-probability-of-intercept mini-radars and a network of optical sensors communicating with web interfaces. The main focus of this paper is describing the architecture of the system and the low-cost mini-radar sensor exploiting the micro-Doppler effect to detect, track and recognize threats. The recognition of the target via an artificial intelligence system is the pillar to assess these threats in a reliable way.

Paper Nr: 50
Title:

How Textual Datasets Enhance the PADI-Web Tool?

Authors:

Mathieu Roche, Elena Arsevska, Sarah Valentin, Sylvain Falala, Julien Rabatel and Renaud Lancelot

Abstract: The ability to rapidly detect outbreaks of emerging infectious diseases is a health priority of global health agencies. In this context, event-based surveillance (EBS) systems gather outbreak-related information from heterogeneous data sources, including online news articles. EBS systems, thus, increasingly marshal text-mining methods to reduce the amount of manual curation of the freely available text. This paper documents the use of datasets obtained through an EBS system, PADI-Web (Platform for Automated extraction of Disease Information from the web), dedicated to digital outbreak detection in animal health. This paper describes the datasets used for improving three important tasks related to PADI-Web, i.e., news classification, information extraction and dissemination.