WEBIST 2023 Abstracts


Area 1 - HCI in Mobile Systems and Web Interfaces

Full Papers
Paper Nr: 26
Title:

Accessibility of e-Government Websites in Italy: The User Experience of People with Disabilities

Authors:

Maria Claudia Buzzi, Marina Buzzi, Giuseppe Della Penna, Barbara Leporini and Francesca Ricci

Abstract: Public Administration services must be accessible to anyone, including people with disabilities who interact via assistive technology. In 2016, the European Union published Directive 2016/2102 with the aim of making such services more accessible to any citizen, regardless of their abilities. This paper investigates the accessibility of e-Government services in Italy from the point of view of people with disabilities: seventy-six users participated in an online survey, and the collected answers have been further refined through semi-structured interviews. Results have been compared with a previous study, showing that the number of services has increased but no substantial improvement in terms of accessibility has been recorded. Simplified interaction and increased efficiency are still lacking, even if overall user satisfaction seems to have slightly improved.
Download

Paper Nr: 42
Title:

Paperless Checklist for Process Validation and Production Readiness: An Industrial Use Case

Authors:

José Cosme, Tatiana Pinto, Anabela Ribeiro, Vítor Filipe, Eurico V. Amorim and Rui Pinto

Abstract: The Digital Model concept of factory floor equipment allows simulation, visualization and processing, as well as communication between the various workstations. The Digital Twin is the concept used for the digital representation of equipment on the factory floor, capable of collecting a set of data about the equipment and production using physical sensors installed in the equipment. Within the scope of data visualization and processing, there is a need to manage information about the parameters/conditions that the assembly line equipment must meet to start a production order or during a shift handover. This study proposes a paperless checklist to manage equipment information and monitor production ramp-up. The proposed solution is validated in a real-world industrial scenario, by comparing its suitability against the current paper-based approach to logging information. Results show that the paperless checklist presents advantages over the current approach, since it enables multi-access viewing and logging while maintaining a digital history of log changes for further analysis.
Download

Short Papers
Paper Nr: 27
Title:

Enhancing User Experience in e-Government: A Deep Dive into e-Government Forms and Citizen Perceptions

Authors:

Asma Aldrees and Denis Gračanin

Abstract: Understanding citizens’ perceptions is essential for improving e-government services and strengthening the relationship between citizens and the government. Therefore, this study focuses on the design principles of e-government forms and their impact on citizens’ experiences. Specifically, it examines the context of e-government in the United States and seeks to understand citizens’ perceptions. A web prototype for e-government forms was developed based on the US Web Design System (USWDS) guidelines. A web-based survey using a user experience questionnaire was conducted, with five scales: efficiency, trust, trustworthiness of content, quality of content, and clarity. Then, we recruited 200 US citizens to evaluate the implemented e-form. The results indicated positive user experiences across all scales. However, the trust scale received the lowest score, despite being considered the most important by citizens. Participants recognized the importance of trust but felt it was not fully established. More research is needed to investigate the trust value of e-government design principles in the US. By following established design principles and addressing trust concerns, governments can create user-friendly interfaces that foster trust and meet citizens’ expectations.
Download

Paper Nr: 41
Title:

Reconstruction and Validation of the UX Factor Trust for the User Experience Questionnaire Plus (UEQ+)

Authors:

Andreas Hinderks, Martin Schrepp, Maria Rauschenberger and Jörg Thomaschewski

Abstract: As digital technologies advance, user experience (UX) has become crucial for the success of software and services. The User Experience Questionnaire Plus (UEQ+) is a flexible tool used to evaluate UX through questionnaires tailored to specific problems, yet a critical factor often overlooked is Trust. Trust, understood as a user’s belief in a software’s ability to function consistently, securely, and with respect for user data privacy, is especially pivotal in areas like financial services, health informatics, and e-commerce platforms. This paper focuses on the construction and validation of Trust as a new factor in the UEQ+. During the construction phase, an initial collection of potential items was assembled for the Trust factor. A subsequent study involving 405 participants facilitated the reduction of these items to four, a task accomplished via factor analysis. The following stages involved two additional validation phases, enlisting a total of 897 participants, wherein the selected items were subject to validation. The culmination of this process resulted in a newly validated factor, Trust, which is constituted by the following items: insecure-secure, untrustworthy-trustworthy, unreliable-reliable, and non-transparent-transparent.
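To illustrate the item-reduction step described above, the sketch below fits a one-factor model and keeps the items with the strongest loadings. This is a minimal, generic illustration, not the authors' analysis: the item labels and response data are hypothetical placeholders.

    import numpy as np
    from sklearn.decomposition import FactorAnalysis

    # Hypothetical responses: 405 participants x 8 candidate trust items on a 7-point scale.
    rng = np.random.default_rng(0)
    responses = rng.integers(1, 8, size=(405, 8)).astype(float)
    item_labels = [f"item_{i}" for i in range(8)]  # placeholders, not the real UEQ+ items

    # One-factor model: items with the highest absolute loadings on the factor are retained.
    fa = FactorAnalysis(n_components=1, random_state=0)
    fa.fit(responses)
    loadings = fa.components_[0]

    keep = sorted(range(8), key=lambda i: abs(loadings[i]), reverse=True)[:4]
    print("retained items:", [item_labels[i] for i in keep])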
Download

Paper Nr: 43
Title:

There Are no Major Age Effects for UX Aspects of Voice User Interfaces Using the Kano Categorization

Authors:

Jana Deutschländer, Anna C. Weigand, Andreas M. Klein, Dominique Winter and Maria Rauschenberger

Abstract: Voice user interface (VUI) evaluation often focuses on measuring the user experience (UX) quality of UX aspects for VUIs. However, it is crucial to differentiate among these UX aspects concerning their relevance to specific target groups, different usage contexts, or user characteristics such as age. Therefore, we identified potential age-specific characteristics and determined their nature, if any. We applied the Kano model using an age segmentation to categorize 32 UX aspects based on VUI user data (N = 384). Our findings reveal that UX aspects of VUIs are broadly consistent across all age groups, and VUI developers and researchers should consider the important ones. Some age effects are visible and could impact the success of VUIs.
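For readers unfamiliar with the Kano categorization used above, the sketch below applies one common variant of the standard Kano evaluation table to a respondent's functional/dysfunctional answer pair. It is a generic illustration of the method, not the authors' exact procedure or data.

    def categorize(functional: str, dysfunctional: str) -> str:
        """One common variant of the Kano evaluation table.
        A=Attractive, O=One-dimensional, M=Must-be, I=Indifferent, R=Reverse, Q=Questionable.
        Answers are the usual options: like, must-be, neutral, live-with, dislike."""
        f, d = functional, dysfunctional
        if f == "like":
            return {"like": "Q", "dislike": "O"}.get(d, "A")
        if f == "dislike":
            return {"dislike": "Q"}.get(d, "R")
        # must-be / neutral / live-with rows
        return {"like": "R", "dislike": "M"}.get(d, "I")

    print(categorize("like", "dislike"))     # O: wanted when present, missed when absent
    print(categorize("neutral", "neutral"))  # I: indifferent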
Download

Paper Nr: 56
Title:

Enhancing Users’ Interactions in Mobile Augmented Reality Systems Through Fuzzy Logic-Based Modelling of Computer Skills

Authors:

Christos Troussas, Christos Papakostas and Cleo Sgouropoulou

Abstract: Mobile augmented reality (AR) systems offer exciting opportunities for blending digital content with the real world. However, engagement in mobile AR environments relies largely on users’ computer skills, which vary among users and impact their ability to utilize this technology. This research addresses the gap in understanding the influence of users’ computer skills on interactions in mobile AR. In view of the above, this paper presents a fuzzy logic-based model to assess and refine users’ computer skills in the context of mobile AR systems. By modelling users’ computer skills, the system provides personalized assistive messages and feedback. These messages are designed to align with the fuzzy weights that have been established, enhancing users’ interactions within the mobile AR environment. The presented approach is integrated into a personalized mobile AR system for spatial ability training. The evaluation results demonstrate a highly positive outlook. A major conclusion of this work is that fuzzy logic modelling has significant potential to enhance user experiences and drive advancements in mobile augmented reality technology.
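A minimal sketch of the kind of fuzzy modelling described above: a skill score is fuzzified into overlapping levels, and the level with the highest membership degree selects the assistive feedback. The membership functions, weights and messages here are hypothetical, not the ones used in the paper.

    def tri(x: float, a: float, b: float, c: float) -> float:
        """Triangular membership function with support [a, c] and peak at b."""
        if x <= a or x >= c:
            return 0.0
        return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

    def skill_memberships(score: float) -> dict:
        """Fuzzify a 0-100 computer-skill score into three overlapping levels."""
        return {
            "novice": tri(score, -1, 0, 50),
            "intermediate": tri(score, 25, 50, 75),
            "advanced": tri(score, 50, 100, 101),
        }

    def assistive_message(score: float) -> str:
        """Pick the feedback whose fuzzy weight (membership degree) is highest."""
        m = skill_memberships(score)
        level = max(m, key=m.get)
        messages = {
            "novice": "Step-by-step guidance with on-screen hints.",
            "intermediate": "Short tips shown only for unfamiliar AR gestures.",
            "advanced": "Minimal feedback; advanced shortcuts enabled.",
        }
        return messages[level]

    print(skill_memberships(60))
    print(assistive_message(60))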
Download

Paper Nr: 58
Title:

Pragmatic versus Hedonic: Determining the Dominant Quality in User Experience for Professional and Leisure Collaboration Tools

Authors:

Lisa Eidloth, Anna-Lena Meiners, Jörg Thomaschewski and Andreas Hinderks

Abstract: As collaborative technologies become integral in both professional and leisure settings, especially with the rise of remote work and digital communities due to COVID-19, understanding the user experience (UX) factors is critical. This study aims to explore the differential importance of these UX factors across professional and leisure contexts, leveraging the widespread use of collaboration tools for an in-depth analysis. The objective of the study is to identify and assess key UX factors in collaboration tools, and to quantify their differential impact in professional and leisure settings. Our research underscores the nuanced role of context in evaluating the importance of UX factors in collaboration tools, with significant variances observed across professional and leisure settings. While some UX factors, including accessibility, clarity, and intuitive use, maintained universal importance across contexts and tools, others, specifically dependability and efficiency, contradicted assumptions of being universal "hygiene factors", demonstrating the complexity of UX evaluations. This complexity necessitates a differentiated approach for each context and collaboration tool type, challenging the possibility of a singular evaluation or statement.
Download

Paper Nr: 81
Title:

Towards a Framework for AI-Assisted Data Storytelling

Authors:

Angelica Lo Duca

Abstract: Data storytelling is the practice of building stories supported by data to engage the audience and inspire them to make decisions. Applying data storytelling to data visualization means adding a narrative that better explains the visual and engages the audience. Generative AI can help transform data visuals into data stories. This paper proposes AI-DaSt (AI-based Data Storytelling), a framework that helps build data stories based on generative AI. The framework focuses on visual charts and incorporates two main generative AI models provided by the OpenAI APIs: text generation and image generation. We use GPT-3.5 for the chart title, commentary and notes, and image generation for the images included in the chart. We also describe the potential ethical issues and possible countermeasures related to using Generative AI in data storytelling. Finally, we focus on a practical use case, which shows how to transform a data visualization chart into a data story using the implemented framework.
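To make the text- and image-generation steps concrete, here is a minimal sketch of how a chart title, commentary and illustrative image could be requested from the OpenAI API. It assumes the openai Python package (version 1.x) and an API key in the environment; the prompts, model choices and chart summary are illustrative and not the framework's actual implementation.

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    chart_summary = "Monthly visitors to a museum website, Jan-Dec 2022, peaking in August."

    # Text generation: a title and short commentary for the chart.
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "You write concise titles and commentary for data charts."},
            {"role": "user", "content": f"Chart data: {chart_summary}\nReturn a title and a two-sentence commentary."},
        ],
    )
    print(resp.choices[0].message.content)

    # Image generation: an illustrative picture to embed next to the chart.
    img = client.images.generate(model="dall-e-2",
                                 prompt="A welcoming museum entrance in summer, flat illustration",
                                 n=1, size="512x512")
    print(img.data[0].url)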
Download

Paper Nr: 35
Title:

Identifying Student Profiles in CSCL Systems for Programming Learning Using Quality in Use Analysis

Authors:

Rafael Duque, Miguel Á. Redondo, Manuel Ortega, Sergio Salomón and Ana I. Molina

Abstract: In the digital age, computer programming skills are in high demand, and collaborative learning is essential for their development. Computer-Supported Collaborative Learning (CSCL) systems enable real-time collaboration among students, regardless of their location, by offering resources and tools for programming tasks. To optimize the learning experience in CSCL systems, user profiling can be used to tailor educational content, adapt learning activities, provide personalized feedback, and facilitate targeted interventions based on individual learners’ needs, preferences, and performance patterns. This paper describes a framework that can be applied to profile students of CSCL systems. By analysing log files, computational models, and quality measures, the framework captures various dimensions of the learning process and generates user profiles based on the Myers-Briggs Type Indicator (MBTI) personality types. The work also conducts a case study that applies this framework to COLLECE 2.0, a CSCL system that supports programming learning.
Download

Paper Nr: 36
Title:

Investigating the Use of the Thunkable End-User Framework to Develop Haptic-Based Assistive Aids in the Orientation of Blind People

Authors:

Alina Vozna, Giulio Galesi and Barbara Leporini

Abstract: Nowadays, mobile devices are essential tools for visiting cultural heritage sites. Thus, it is very important to provide an inclusive cultural mobile experience for everyone. In this study, we investigate how to create accessible apps to enhance the experience of visually impaired people in outdoor cultural itineraries. In such a context, the integration of specific features for improving accessibility for blind people may require advanced skills. This study investigates how a simple-to-use development environment, the Thunkable framework, which does not require specific technical competencies, can be exploited to easily develop an accessible app.
Download

Paper Nr: 49
Title:

User-Centered Design and Iterative Refinement: Promoting Student Learning with an Interactive Dashboard

Authors:

Gilbert Drzyzga and Thorleif Harder

Abstract: The study uses a user-centered design methodology to develop a prototype for an interactive student dashboard that focuses on user needs. This includes iterative testing and integration of user feedback to develop a usable interface that presents academic data in a more understandable and intuitive manner. Key features of the dashboard include academic progress tracking and personalized recommendations based on machine learning. The primary target audience is online students who may study in isolation and have less physical contact with their peers. The learner dashboard (LD) will be developed as a plug-in to the university’s learning management system. The study presents the results of a workshop with students experienced in human-computer interaction. They evaluated a prototype of the LD using established interaction principles. The research provides critical insights for future advancements in educational technology and drives the creation of more interactive, personalized, and easy-to-use tools in the academic landscape.
Download

Paper Nr: 62
Title:

A User Interface for Tuning QoS Parameters in Recommendation-Based Business Process Scenario Adaptation

Authors:

Kiriakos Sgardelis, Dionisis Margaris, Dimitris Spiliotopoulos and Costas Vassilakis

Abstract: The Web Services Business Process Execution Language (BPEL) is a special-purpose language that orchestrates web services into a high-level business process. A typical BPEL scenario contains invocations to preselected web services, along with their parameters. However, many recent research works support dynamic service selection, based on user-set policies and criteria. Furthermore, users may request a service recommendation, in which case functionally equivalent service offerings by different providers will be considered by the personalization module. Along with the recommendation request, users provide the policy parameters, which include minimum and maximum bounds for the non-functional attributes concerning the service, and the system exploits these bounds to select and use the optimal candidate services. However, in many real-life cases, a person will accept/purchase a product or a service that exceeds the threshold(s) that he/she initially set, e.g., if the overhead is marginal, the offer is deemed appealing, or no satisfactory service candidates are identified using the initial settings. In this paper, we present and evaluate a specialized User Interface that allows the user to review service candidates marginally exceeding the specified bounds and consider them while making the final service selection.
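A small sketch of the bound-relaxation idea described above: candidates within the bounds are selected directly, while those exceeding a bound only marginally are kept aside for the user to review. The attribute names, values and tolerance are hypothetical, not the paper's actual personalization module.

    # Each candidate service exposes non-functional attributes, e.g. price and response time.
    candidates = [
        {"name": "svc-A", "price": 9.0, "response_ms": 180},
        {"name": "svc-B", "price": 11.0, "response_ms": 140},   # marginally over the price bound
        {"name": "svc-C", "price": 15.0, "response_ms": 90},
    ]
    bounds = {"price": (0.0, 10.0), "response_ms": (0.0, 200.0)}

    def violation(svc: dict) -> float:
        """Largest relative amount by which any attribute exceeds its bounds (0 = within bounds)."""
        worst = 0.0
        for attr, (lo, hi) in bounds.items():
            v = svc[attr]
            if v < lo:
                worst = max(worst, (lo - v) / max(lo, 1e-9))
            elif v > hi:
                worst = max(worst, (v - hi) / hi)
        return worst

    strict = [s for s in candidates if violation(s) == 0]
    # Marginal candidates (e.g. up to 15% over a bound) are shown to the user for review.
    marginal = [s for s in candidates if 0 < violation(s) <= 0.15]
    print([s["name"] for s in strict], [s["name"] for s in marginal])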
Download

Area 2 - Internet Technology

Full Papers
Paper Nr: 14
Title:

Impact of Item Polarity on the Scales of the User Experience Questionnaire (UEQ)

Authors:

Martin Schrepp, Jessica Kollmorgen and Jörg Thomaschewski

Abstract: Measuring user experience is vital for the long-term success of interactive products. Questionnaires like the modular extension of the User Experience Questionnaire (UEQ+) are an established instrument for this purpose. Different item formats are available for these questionnaires, such as the number of response options (most frequently 5- or 7-point Likert scales). But the item format of a UX questionnaire can of course influence the measured results. In this paper, we investigate whether changing semantic differential items to a one-sided polarity influences the effort participants require to answer these items and the measured scale scores. To this end, we conducted 6 studies with 438 collected responses for the well-known products Microsoft PowerPoint, WhatsApp and Google Maps. Each product was evaluated by a sample of participants with the original UEQ and a modified version of the UEQ with one-sided polarity. In the modified version, the positive term of the semantic differential was always placed on the right, while in the original UEQ it appears on the right for half of the items and on the left for the other half. The results showed that the effort to complete the questionnaire (completion time and number of required corrections) was lower for the version with one-sided polarity, but the differences were so small that they are not practically relevant. However, the results also showed that the change to a one-sided polarity introduced an answer tendency, which impacts the scale scores. Therefore, the results obtained with the two versions of the UEQ cannot be compared directly. Based on this, we can conclude that it is not possible to directly compare the scores of the original UEQ scales with the corresponding scores of UEQ+ scales.
Download

Paper Nr: 17
Title:

Enhancing SSR in Low-Thread Web Servers: A Comprehensive Approach for Progressive Server-Side Rendering with any Asynchronous API and Multiple Data Models

Authors:

Fernando M. Carvalho and Pedro Fialho

Abstract: Naive server-side rendering (SSR) techniques require a dedicated server thread per HTTP request, thereby limiting the number of concurrent requests to the available server threads. Furthermore, this approach proves impractical for modern low-thread servers like WebFlux, VertX, and Express Node.js. To achieve progressive rendering, asynchronous data models provided by non-blocking APIs must be utilized. Nevertheless, this method can introduce undesirable interleaving between template view processing and data access, potentially resulting in malformed HTML documents. Some template engines offer partial remedies through specific templating dialects, but they encounter two limitations. Firstly, their compatibility is confined to specific types of asynchronous APIs, such as the reactive stream Publisher API. Secondly, they typically support only a single asynchronous data model at a time. In this research, we propose an alternative web templating approach that embraces any asynchronous API (e.g., Publisher, promises, suspend functions, flow, etc.) and allows for multiple asynchronous data sources. Our approach is implemented on top of HtmlFlow, a Java-based DSL for writing type-safe HTML. We evaluated it against state-of-the-art reactive servers, specifically WebFlux, and compared it with popular templating idioms like Thymeleaf and KotlinX.html. Our proposal effectively overcomes the limitations of existing approaches.
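The paper's implementation targets HtmlFlow on the JVM; the sketch below only illustrates, in Python asyncio terms, the general idea of progressive rendering in which the template awaits each asynchronous data source in document order, so well-formed HTML chunks are emitted without interleaving. It is illustrative and does not reflect the HtmlFlow API.

    import asyncio

    async def fetch_title() -> str:
        await asyncio.sleep(0.1)          # stands in for a non-blocking data source
        return "Latest articles"

    async def fetch_items() -> list:
        await asyncio.sleep(0.2)
        return ["First post", "Second post"]

    async def render():
        """Async generator: emits well-formed HTML chunks as data becomes available."""
        yield "<html><body>"
        yield f"<h1>{await fetch_title()}</h1><ul>"
        for item in await fetch_items():
            yield f"<li>{item}</li>"
        yield "</ul></body></html>"

    async def main():
        async for chunk in render():
            print(chunk)                   # a real server would flush each chunk to the response

    asyncio.run(main())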
Download

Paper Nr: 18
Title:

A Data Service Layer for Web Browser Extensions

Authors:

Alex Tacuri, Sergio Firmenich, Gustavo Rossi and Alejandro Fernandez

Abstract: Web browser extensions are the preferred method for end-users to modify existing web applications (and the browser itself) to fulfill unanticipated requirements. Some extensions improve existing websites based on online data, combining techniques such as mashups and augmentation. To obtain data when no APIs are available, extension developers resort to scraping. Scraping is frequently implemented with hard-coded DOM references, making code fragile. Scraping becomes more difficult when a scraping pipeline involves several websites (i.e., the result of scraping composes elements from various websites). It is challenging (if not impossible) to reuse the scraping code in different browser extensions. We propose a data service layer for browser extensions. It encapsulates site-specific search and scraping logic and exposes object-oriented search APIs. The data service layer includes a visual programming environment for the specification of data search and object model creation, which are then exposed as a programmatic API. While using this data service layer, developers are unconcerned with the complexity of data search, retrieval, scraping, and composition.
Download

Paper Nr: 31
Title:

Easy to Find: A Natural Language Query Processing System on Advertisements Using an Automatically Populated Database

Authors:

Yiu-Kai Ng

Abstract: Many commercial websites, such as Target.com, which aspire to increase clients’ transactions and thus profits, offer users easy-to-use pull-down menus and/or keyword searching tools to locate advertisements (ads for short) posted at their sites. These websites, however, cannot handle natural language queries, which are formulated for specific information needs and can only be processed properly by natural language query processing (NLQP) systems. We have developed a novel NLQP system, denoted AdProc, which retrieves database records that match information specified in ads queries on multiple ads domains. AdProc relies on an underlying database (DB), which contains pre-processed (ads) records that provide the source of answers to users’ queries. AdProc automates the process of populating a DB using online ads and answering user queries on multiple ads domains. Experimental results using ads queries collected through Facebook on a dataset of online ads extracted from Craigslist.org and Coupons.com show that AdProc is highly effective in (i) classifying online ads, (ii) labeling, extracting, and populating data from ads in natural language into an underlying database D, (iii) assigning ads queries into their corresponding domains to be processed, and (iv) retrieving records in D that satisfy the users’ information needs.
Download

Paper Nr: 57
Title:

Comparing the Energy Consumption of WebAssembly and JavaScript in Mobile Browsers

Authors:

Dennis Pockstaller, Stefan Huber and Lukas Demetz

Abstract: With WebAssembly, a new web technology has been developed that allows compiled bytecode to be executed directly in the browser, which, unlike JavaScript code, does not have to be initially compiled by the browser and can therefore be executed faster. This allows the development of complex web applications. A challenge for these complex web applications is the increasing importance of mobile devices and their limited battery capacity. The goal of this study is to determine whether the energy consumption of web applications can be reduced by using WebAssembly instead of JavaScript. For this purpose, an automated experiment was performed on Android smartphones with different algorithms implemented in WebAssembly and JavaScript and run in common browsers. The energy consumption was measured hardware-based with the Monsoon HVPM measuring device. The results show that WebAssembly consumes about 20% to 30% less energy than JavaScript. In addition, differences between the two tested browsers, Chrome and Firefox, in the energy consumption of JavaScript and WebAssembly were found. This potential reduction of energy consumption also helps reduce the user’s CO2 footprint. The flexible study design used allows for further investigations with other types of devices and other compilers.
Download

Paper Nr: 65
Title:

Supporting the Automated Generation of Acceptance Tests of Process-Aware Information Systems

Authors:

Tales M. Paiva, Toacy C. Oliveira, Raquel M. Pillat and Paulo C. Alencar

Abstract: Software quality assurance is a crucial process that ensures software products meet specified requirements and quality standards. Achieving exhaustive test coverage is essential for quality assurance, particularly in complex and dynamic Process-Aware Information Systems (PAIS) built upon the Business Process Model and Notation (BPMN). Manual testing in such systems is challenging due to the many execution paths, dependencies, and external interfaces. This paper proposes a model-based testing strategy that uses BPMN models and build specifications as input to generate a Robotic Process Automation (RPA) script that automates a comprehensive User Acceptance Test procedure. Leveraging RPA to automate user interactions reduces the need for testers to manually input PAIS-related information when handling user forms. We also present a case study to demonstrate the feasibility of our approach.
Download

Paper Nr: 85
Title:

Quantifying Fairness Disparities in Graph-Based Neural Network Recommender Systems for Protected Groups

Authors:

Nikzad Chizari, Keywan Tajfar, Niloufar Shoeibi and María N. Moreno-García

Abstract: The wide acceptance of Recommender Systems (RS) among users for product and service suggestions has led to the proposal of multiple recommendation methods that have contributed to solving the problems presented by these systems. However, the focus on bias problems is much more limited. Some of the most successful and recent methods, such as Graph Neural Networks (GNNs), present problems of bias amplification and unfairness that need to be detected, measured, and addressed. In this study, an analysis of RS fairness is conducted, focusing on measuring unfairness toward protected groups, including gender and age. We quantify fairness disparities within these groups and evaluate recommendation quality for item lists using a metric based on Normalized Discounted Cumulative Gain (NDCG). Most bias assessment metrics in the literature are only valid for the rating prediction approach, but RS usually provide recommendations in the form of item lists. The metric for lists enhances the understanding of fairness dynamics in GNN-based RS, providing a more comprehensive perspective on the quality and equity of recommendations among different user groups.
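As a sketch of the list-based evaluation described above, the snippet computes a generic NDCG@k per recommended list and a simple between-group gap. The grouping, relevance lists and gap definition are hypothetical stand-ins, not the paper's exact metric.

    import math

    def ndcg_at_k(relevances: list, k: int) -> float:
        """NDCG@k for one recommended list, given per-position relevance scores."""
        dcg = sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))
        ideal = sorted(relevances, reverse=True)
        idcg = sum(rel / math.log2(i + 2) for i, rel in enumerate(ideal[:k]))
        return dcg / idcg if idcg > 0 else 0.0

    # Hypothetical per-user relevance lists, grouped by a protected attribute.
    groups = {
        "female": [[1, 0, 1, 0, 0], [0, 1, 0, 0, 1]],
        "male":   [[1, 1, 0, 1, 0], [1, 0, 1, 1, 0]],
    }
    avg = {g: sum(ndcg_at_k(r, 5) for r in lists) / len(lists) for g, lists in groups.items()}
    disparity = max(avg.values()) - min(avg.values())
    print(avg, "disparity:", round(disparity, 3))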
Download

Short Papers
Paper Nr: 11
Title:

Compilation of Distributed Programs to Services Using Multiple Programming Languages

Authors:

Thomas M. Prinz

Abstract: Service-orientation recommends dividing software into separate independent services, with each service being implemented in the programming language that best fits the service’s problem space. However, data must be shared between the distributed services, so common data models and interfaces must be defined in each programming language used. This leads to higher development effort and additional dependencies, diminishing the benefits. This paper explains a new idea that arranges a distributed program as if it were a single one, even though it consists of different parts using possibly different programming languages. For this purpose, the idea of meta network programming languages is introduced. They are based on network machines and hide the complexity arising during the development of distributed software. A compiler translates and distributes these programs by splitting them into several parts. As a result, this should reduce the overhead of developing distributed general-purpose software. The intention of this position paper is to give new ideas to implement distributed programs in the future. An implementation of the idea does not exist yet.
Download

Paper Nr: 23
Title:

Implications of Edge Computing for Static Site Generation

Authors:

Juho Vepsäläinen, Arto Hellas and Petri Vuorimaa

Abstract: Static site generation (SSG) is a common technique in the web development space to create performant websites that are easy to host. Numerous SSG tools exist, and the approach has been complemented by newer approaches, such as Jamstack, that extend its usability. Edge computing represents a new option to extend the usefulness of SSG further by allowing the creation of dynamic sites on top of a static backdrop, providing dynamic resources close to the user. In this paper, we explore the impact of the recent developments in the edge computing space and consider its implications for SSG.
Download

Paper Nr: 24
Title:

The State of Disappearing Frameworks in 2023

Authors:

Juho Vepsäläinen, Arto Hellas and Petri Vuorimaa

Abstract: Disappearing frameworks represent a new type of thinking for web development. In the current mainstream JavaScript frameworks, the focus has been on developer experience at the cost of user experience. Disappearing frameworks shift the focus by aiming to deliver as little, even zero, JavaScript to the client. In this paper, we look at the options available in the ecosystem in mid-2023 and characterize them in terms of functionality and features to provide a state-of-the-art view of the trend. We found that the frameworks rely heavily on compilers, often support progressive enhancement, and most of the time support static output. While solutions like Astro are UI library agnostic, others, such as Marko, are more opinionated.
Download

Paper Nr: 25
Title:

Students’ Interests Related to Web and Mobile Technologies Study

Authors:

Manuela A. Petrescu, Adrian Sterca and Ioan Badarinza

Abstract: We explore in this paper the interests and challenges of students regarding web and mobile technologies. Our study is based on a survey among undergraduate students attending a Web Programming course. In particular, we study the challenges students face in pursuing a successful career in web or mobile development, and we have found that the most important one is the large effort required to keep up to date with the fast-changing web and mobile technologies. Overall, the attitude of the surveyed undergraduate students towards web development and mobile development is rather positive, as more than 60% of them said that they are interested in a career in web or mobile development. We also found out that most of them prefer working on back-end web technologies. As for the specific web technologies students are interested in, they are highly varied. Overall, our study provides valuable insights into the interests and challenges of students regarding web and mobile technologies, which can guide the development of effective teaching and learning approaches in this area.
Download

Paper Nr: 30
Title:

Enhancing Soft Web Intelligence with User-Defined Fuzzy Aggregators

Authors:

Paolo Fosci and Giuseppe Psaila

Abstract: In our previous work, we proposed Soft Web Intelligence as the interpretation of the general notion of Web Intelligence in the current technological panorama, in which JSON data sets are acquired from the Internet, stored within JSON document stores and then processed and queried by means of soft computing and soft querying methods. Specific extensions to the J-CO Framework and to its query language (named J-CO-QL+) made it possible to practically implement the concept. However, no “data intelligence” activity can do without aggregating data, yet J-CO-QL+ did not provide statements for defining “user-defined fuzzy aggregators”. In this paper, we present the novel constructs introduced into J-CO-QL+ to allow users to define and use their own fuzzy aggregators, so as to evaluate membership degrees to fuzzy sets starting from array fields within processed JSON documents. This way, complex soft queries are enabled, enhancing Soft Web Intelligence.
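To convey what a user-defined fuzzy aggregator does, the sketch below turns an array field of a JSON document into a single membership degree. This is a language-neutral Python illustration, not J-CO-QL+ syntax; the field names and membership function are made up.

    def membership_high(value: float) -> float:
        """Membership degree of one measurement in the fuzzy set 'high' (piecewise linear)."""
        return max(0.0, min(1.0, (value - 20.0) / 15.0))

    def mostly_high(values: list) -> float:
        """User-defined aggregator: degree to which an array of measurements is 'mostly high'
        (here, the mean of the element-wise memberships)."""
        return sum(membership_high(v) for v in values) / len(values) if values else 0.0

    doc = {"station": "S1", "temperatures": [18.5, 27.0, 31.2, 24.8]}   # a processed JSON document
    doc["hot_station_degree"] = mostly_high(doc["temperatures"])
    print(doc)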
Download

Paper Nr: 34
Title:

New Perspectives on Data Exfiltration Detection for Advanced Persistent Threats Based on Ensemble Deep Learning Tree

Authors:

Xiaojuan Cai and Hiroshi Koide

Abstract: Data exfiltration of Advanced Persistent Threats (APTs) is a critical concern for high-value entities such as governments, large enterprises, and critical infrastructures, as attackers deploy increasingly sophisticated and stealthy tactics. Although extensive research has focused on methods to detect and halt APTs at the onset of an attack (e.g., examining data exfiltration over Domain Name System tunnels), there has been a lack of attention towards detecting sensitive data exfiltration once an APT has gained a foothold in the victim system. To address this gap, this paper analyzes data exfiltration detection from two new perspectives: exfiltration over a command-and-control channel and limitations on exfiltration transfer size, assuming that APT attackers have established a presence in the victim system. We introduce two detection mechanisms (Transfer Lifetime Volatility & Transfer Speed Volatility) and propose an ensemble deep learning tree model, EDeepXGB, based on eXtreme Gradient Boosting, to analyze data exfiltration from these perspectives. By comparing our approach with eight deep learning models (including four deep neural networks and four convolutional neural networks) and four traditional machine learning models (Naive Bayes, Quadratic Discriminant Analysis, Random Forest, and AdaBoost), our approach demonstrates competitive performance on the latest public real-world dataset (Unraveled-2023), with Precision of 91.89%, Recall of 93.19%, and F1-Score of 92.49%.
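For orientation, here is a minimal gradient-boosting baseline over flow-level features of the kind mentioned above (transfer lifetime, volume, speed). The feature names, labelling rule and data are synthetic stand-ins; this is not the EDeepXGB ensemble itself.

    import numpy as np
    from xgboost import XGBClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import precision_score, recall_score, f1_score

    rng = np.random.default_rng(42)
    n = 2000
    # Synthetic flow features: [transfer_lifetime_s, bytes_out, transfer_speed_Bps] (stand-ins).
    X = np.column_stack([
        rng.exponential(30, n),
        rng.lognormal(10, 1.5, n),
        rng.lognormal(8, 1.0, n),
    ])
    # Toy labelling rule: long-lived flows moving unusually large volumes count as "exfiltration".
    y = ((X[:, 0] > 60) & (X[:, 1] > np.percentile(X[:, 1], 90))).astype(int)

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0, stratify=y)
    model = XGBClassifier(n_estimators=200, max_depth=4)
    model.fit(X_tr, y_tr)
    pred = model.predict(X_te)
    print("precision", precision_score(y_te, pred, zero_division=0),
          "recall", recall_score(y_te, pred, zero_division=0),
          "f1", f1_score(y_te, pred, zero_division=0))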
Download

Paper Nr: 50
Title:

Data Exfiltration by Hotjar Revisited

Authors:

Libor Polčák and Alexandra Slezáková

Abstract: Session replay scripts allow website owners to record the interaction of each website visitor and aggregate the interactions to reveal the interests and problems of the visitors. However, previous research identified such techniques as privacy intrusive. This position paper updates the information on data collection by Hotjar. It revisits the previous findings to detect and describe the changes. The default policy to gather inputs changed; the recording script gathers only information from explicitly allowed input elements. Nevertheless, Hotjar does record content reflecting users’ behaviour outside input HTML elements. Even though we propose changes that would prevent the leakage of the reflected content, we argue that such changes will most likely not appear in practice. The paper discusses improvements in handling TLS. Not only do web page operators interact with Hotjar through encrypted connections, but Hotjar scripts do not work on sites not protected by TLS. Hotjar respects the Do Not Track signal; however, users need to connect to Hotjar even in the presence of the Do Not Track setting. Worse, malicious web operators can trick Hotjar into recording sessions of users with the active Do Not Track setting. Finally, we propose and motivate the extension of GDPR Art. 25 obligations to processors.
Download

Paper Nr: 54
Title:

Apache Spark Based Deep Learning for Social Transaction Analysis

Authors:

Raouf Jmal, Mariam Masmoudi, Ikram Amous, Corinne A. Zayani and Florence Sèdes

Abstract: In an attempt to cope with the increasing number of trust-related attacks, a system that analyzes whole social transactions in real time becomes a necessity. Traditional systems cannot analyze transactions in real time, and most of them use machine learning approaches which are not suitable for the real-time processing of social transactions in a big data environment. Therefore, in this paper, we propose a novel deep learning detection system based on Apache Spark that is capable of handling huge volumes of transactions and streaming batches. Our model is made up of two main phases: the first phase builds a supervised deep learning model to classify transactions (either benign or malicious). The second phase analyzes transaction streams using Spark Streaming, which applies the model to batches of data in order to make predictions in real time. To verify the effectiveness of the proposed system, we implemented it and performed several comparison experiments. The obtained results show that our approach achieves more satisfactory efficiency and accuracy compared to other works in the literature. Thus, it is very suitable for real-time detection of malicious transactions with large capacity and high speed.
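A rough sketch of the second (streaming) phase in PySpark Structured Streaming, where a previously trained classifier would be applied to each micro-batch. The schema, file-based source and the trivial stand-in rule are placeholders, not the authors' system.

    from pyspark.sql import SparkSession
    from pyspark.sql.types import StructType, StructField, StringType, DoubleType

    spark = SparkSession.builder.appName("transaction-stream").getOrCreate()

    schema = StructType([
        StructField("tx_id", StringType()),
        StructField("amount", DoubleType()),
        StructField("trust_score", DoubleType()),
    ])

    # Placeholder source: JSON transaction files dropped into a directory (could be Kafka instead).
    stream = spark.readStream.schema(schema).json("/tmp/incoming-transactions")

    def score_batch(batch_df, batch_id):
        """Apply the phase-1 model to each micro-batch; here a trivial stand-in rule."""
        pdf = batch_df.toPandas()
        if pdf.empty:
            return
        pdf["malicious"] = (pdf["trust_score"] < 0.3) & (pdf["amount"] > 1000)  # stand-in for model.predict
        print(f"batch {batch_id}: {int(pdf['malicious'].sum())} flagged of {len(pdf)}")

    query = stream.writeStream.foreachBatch(score_batch).start()
    query.awaitTermination()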
Download

Paper Nr: 61
Title:

A Stateless Bare PC Web Server

Authors:

Fahad Alotaibi, Ramesh Karne and Alex Wijesinha

Abstract: Bare PC Web servers that run on 32-bit or 64-bit machines and use TCP or UDP for transport have been built previously. This paper describes the design and implementation of a new stateless UDP-based bare PC multi-core Web server. It also presents performance measurements. The server extends previous server designs with several novel architectural and protocol enhancements. A load balancing technique suitable for multi-core servers is included to illustrate a simple way to efficiently process HTTP requests. The architecture presented here could be adapted in future to build simple conventional Web servers.
Download

Paper Nr: 66
Title:

Automatic Mapping of Business Web Applications

Authors:

Adrian Sterca, Virginia Niculescu, Alexandru Kiraly and Darius Bufnea

Abstract: We present an automated tool that can be used to construct conceptual maps of business web applications. This conceptual map depicts in an abstract, hierarchical way the possible navigation and operational paths in the UI (i.e. User Interface) of the web application. Our tool discovers this conceptual map by navigating automatically through the UI of the target business web application and by mapping UI operations to conceptual operations in a database. The output product of our tool is this conceptual map represented in a graphical or serialized form. This conceptual map can be used for documenting a business web application so that new users of the web application quickly gain the necessary knowledge to navigate through the UI screens of the application and to operate the application. It can also be used for developing RPA (i.e. Robotic Process Automation) solutions that automate process execution on that business web application. This tool comes in the form of a browser extension.
Download

Paper Nr: 86
Title:

A Decentralized Authentication Model for Internet of Vehicles Using SSI

Authors:

Victor Emanuel C. Borges, Danilo S. Santos and Dalton G. Valadares

Abstract: The Internet of Vehicles (IoV) ecosystem is well-regarded for its overall security, yet authentication remains a critical concern due to existing vulnerabilities that expose users to potential malicious attacks. Although researchers have devised authentication mechanisms and protocols to address these issues, there are two significant risk factors often overlooked by prevalent solutions. The first is trust in out-of-coverage mode, which can leave vehicles vulnerable to receiving forged messages. The second is the centralization of the standard authentication mechanism, where reliance on a centralized third-party service introduces authentication vulnerabilities that can result in access loss. In this article, we propose an innovative solution that incorporates the Self-Sovereign Identity (SSI) decentralized identity model within the Trust Over IP architecture to provide vehicular authentication. This integration establishes decentralized identification mechanisms suitable for various contexts within the IoV ecosystem. Our primary focus is enhancing security in the Advanced Driver-Assistance System (ADAS) context. We leverage the SSI model to design a specialized authentication scheme, aiming to effectively mitigate associated security risks through decentralization. This approach strengthens authentication security within the IoV ecosystem, addressing the mentioned vulnerabilities.
Download

Paper Nr: 32
Title:

Malicious Web Links Detection Using Ensemble Models

Authors:

Claudia-Ioana Coste, Anca-Mirela Andreica and Camelia Chira

Abstract: Malicious links are becoming the main propagation vector for web malware. They may lead to serious security issues, such as phishing, distribution of fake news and low-quality content, drive-by downloads, and malicious code execution. Malicious link detection is a challenging domain because of the dynamics of the online environment, where web links and web content are always changing. Moreover, the detection should be fast and accurate enough to contribute to a better online experience. The present paper conducts an experimental analysis of machine learning algorithms used in malicious web link detection. The algorithms chosen for analysis are Logistic Regression, Naïve Bayes, Ada Boost, Gradient Boosted Tree, Linear Discriminant Analysis, Multi-layer Perceptron and Support Vector Machine with different kernel types. Our purpose is twofold. First, we compare these algorithms run individually and calibrate their parameters. Second, we chose 10 models and used them in ensemble models. The results of these experiments show that the ensemble models reach higher metric scores than the individual models, improving maliciousness prediction up to 96% precision.
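A compact sketch of the kind of ensemble evaluated above, combining several of the listed base learners in a soft-voting scheme. The features and data are synthetic stand-ins for URL features (e.g. length, digit ratio, number of subdomains), and these are not the paper's tuned models.

    from sklearn.datasets import make_classification
    from sklearn.ensemble import VotingClassifier, AdaBoostClassifier, GradientBoostingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.naive_bayes import GaussianNB
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.neural_network import MLPClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import precision_score

    # Synthetic stand-in for extracted URL features and benign/malicious labels.
    X, y = make_classification(n_samples=3000, n_features=20, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

    ensemble = VotingClassifier(
        estimators=[
            ("lr", LogisticRegression(max_iter=1000)),
            ("nb", GaussianNB()),
            ("ada", AdaBoostClassifier()),
            ("gbt", GradientBoostingClassifier()),
            ("lda", LinearDiscriminantAnalysis()),
            ("mlp", MLPClassifier(max_iter=500)),
        ],
        voting="soft",  # average predicted probabilities across the base learners
    )
    ensemble.fit(X_tr, y_tr)
    print("precision:", precision_score(y_te, ensemble.predict(X_te)))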
Download

Paper Nr: 40
Title:

Access Control Using Facial Recognition with Neural Networks for Restricted Zones

Authors:

Rodrigo Reaño, Piero Carrión and Juan-Pablo Mansilla

Abstract: Facial recognition is a technology that has proven to be effective and accurate in identifying people. This technology, when used with IP cameras, provides a very effective and practical access control system. Moreover, this system is able to learn and improve its facial recognition capability over time through the use of neural networks, leading to higher accuracy and a lower false positive rate in the field. Thus, this paper presents a face recognition system, based on neural networks, for monitoring and controlling the access of people in small and medium-sized enterprises (SMEs), with the use of IP cameras for the versatility of continuously tracking people circulating in restricted areas. In addition, common security problems identified in these environments are addressed and solutions are offered through the implementation of the proposed system. Finally, the results obtained demonstrate that the system offers an efficient and secure solution for monitoring and controlling the access of people in restricted areas of small and medium-sized enterprises (SMEs). Its accurate identification capability, combined with the elimination of barriers and convenience for users, significantly improves security and user experience.
Download

Paper Nr: 52
Title:

Quality Metrics for Reinforcement Learning for Edge Cloud and Internet-of-Things Systems

Authors:

Claus Pahl and Hamid R. Barzegar

Abstract: Computation at the edge or within the Internet-of-Things (IoT) requires the use of controllers to make the management of resources in this setting self-adaptive. Controllers are software components that observe a system, analyse its quality, and recommend and enact decisions to maintain or improve quality. Today, reinforcement learning (RL), which operates on a notion of reward, is often used to construct these controllers. Here, we investigate quality metrics and quality management processes for RL-constructed controllers in edge and IoT settings. We introduce RL and control principles and define a quality-oriented controller reference architecture. This forms the basis for the central contribution: a quality analysis metrics framework embedded into a quality management process.
Download

Paper Nr: 68
Title:

Tourpedia App: A Web Application for Tourists and Accommodation Owners

Authors:

Angelica Lo Duca and Andrea Marchetti

Abstract: We set out a strategy to add missing details to Tourpedia, a knowledge base containing accommodation information built entirely on open data. The strategy is based on developing a Web application (called Tourpedia App), which incentivizes accommodation owners to correct and add information about their activity to Tourpedia. Accommodation owners who compile missing details receive in return valuable statistics about the accommodation context. Tourists can also use the Tourpedia App to search for accommodation and tourist attractions. The paper describes the strategy implemented to incentivize accommodation owners to release information about their activity. In addition, it describes how the Tourpedia App is implemented and how tourists and accommodation owners can use it. The main finding of this study is the implementation of the Tourpedia App, a prototype demonstrating that it is possible to build real applications based on open data.
Download

Paper Nr: 70
Title:

Examining the Feasibility of Incorporating Social Media Platforms into Professional Training Programs

Authors:

Malak Alharbi, Jennifer Warrender and Marie Devlin

Abstract: The efficacy of incorporating social media platforms into professional training programs remains uncertain and ambiguous. To optimize the effectiveness of utilizing social media platforms in such programs, it is imperative to comprehend the requirements and influences of instructors and organizations, as well as the preferences and needs of the target audience. This necessitates finding a harmonious balance between leveraging the interactive features of social media and effectively mitigating potential challenges. That is why this research adopts a qualitative case study approach to examine the feasibility of incorporating social media platforms into professional training programs. The study involved pre-service instructors from three universities in Saudi Arabia who pursued their studies at faculties of Computer Science and engaged in training courses. The findings of the study indicate that Twitter emerges as the predominant social media application employed for educational and learning endeavours on a daily basis. The level of belief in the efficacy of social media applications as effective learning environments is substantial. Despite the considerable difficulties individuals encounter in communicating with their peers and professionals during training courses, the individuals involved persist in their efforts. The influence of social media applications on the sharing of learning content with colleagues is substantial. In addition, the utilization of social media applications and the act of sharing content with colleagues have a significant impact on the learning process of pre-service teachers.
Download

Paper Nr: 72
Title:

The Impact of IOT Cybersecurity Testing in the Perspective of Industry 5.0

Authors:

Tauheed Waheed and Eda Marchetti

Abstract: The continuous advancements in IoT (Internet of Things) have brought various benefits and have opened new horizons for the industrial revolution in the 21st century. Industry 4.0 and Industry 5.0 also promote using IoT devices to build better and more productive autonomous systems. The behaviour of these complex software systems evolves as they are augmented with the physical components and security mechanisms of IoT devices. Breaches of IoT security and privacy have recently caused financial losses in various industrial sectors. More importantly, they have damaged people's trust in technology and IoT systems, motivating a rediscovery of IoT cybersecurity from a broader perspective. The paper aims to enhance the security- and privacy-by-design methodology and provides an overview of the issues and challenges in cybersecurity testing. We also propose a Cybersecurity Testing Framework (CTF) to enhance IoT cybersecurity that will help resolve significant security and privacy challenges related to Industry 5.0.
Download

Paper Nr: 83
Title:

WebAssembly and JavaScript Performance Analysis for Video Production Filters

Authors:

Victor Vlad and Sabin C. Buraga

Abstract: Modern Web development, especially in the area of multimedia editing and processing, revolves around the ever-evolving study of WebAssembly techniques for moving portions of legacy video applications to the Web, considering mainly front-end circumstances. This research evaluates the execution time differences between WebAssembly and JavaScript in the context of video filter applications, such as color correction, blur, grayscale, and associated computational processes running directly in a modern Web browser. For discrete video filtering tasks, both programming languages can have similar processing times. However, the real advantage of WebAssembly becomes apparent when multiple filters are used together. The article also explores a multi-node graph solution to chain combinations of video filters. The conducted experiments showed that Google Chrome is the best browser for rendering video content using a WebAssembly implementation. In the case of JavaScript processing, the best performance is provided by Mozilla Firefox.

Paper Nr: 84
Title:

Decentralized Identification and Information Exchange in Distributed, Blockchain-Based Internet Architectures: A Technology Review

Authors:

Laura M. Marques da Fonseca, Hamid R. Barzegar and Claus Pahl

Abstract: In many Web and Internet-based systems, sharing Personally Identifiable Information (PII) to identify persons and other entities is common, but centralized systems such as central registries have limitations in terms of control of privacy and identity that a decentralized identity management architecture could address. This study aims to compare the current and potential systems, analyze protocols for decentralized identification and data exchange, propose a protocol selection method, and provide a simple code example. The goal is to assess the feasibility of decentralized processes in software-based business workflows. The methodology involves reviewing protocol materials, including white-papers, articles, and code docs, alongside ontological aspects of identification. Challenges to implementing Decentralized Identifiers (DIDs) include interoperability and the evolving Web/Internet landscape towards more decentralization, openness, and greater user utility.
Download

Area 3 - Social Network Analytics

Full Papers
Paper Nr: 53
Title:

Improved Random Key Cuckoo Search Optimization Algorithm for Community Detection in Social Networks

Authors:

Randa Boukabene, Fatima B. Tayeb and Narimene Dakiche

Abstract: Social network analysis is a prominent and thriving research field, with community detection being a particularly active area of study. In this study, we propose a cuckoo search-based approach for identifying the best network partitions by maximizing the modularity function. The proposed algorithm judiciously combines the continuous nature of the standard cuckoo search algorithm with the discrete nature of the community detection problem to achieve the best results. First, the algorithm incorporates the random key representation, which operates in a continuous space. This representation enables the algorithm to perform global and local walks, supporting both exploration and exploitation within the search space. Second, the algorithm utilizes the locus-based representation to handle the discrete aspect of the community detection problem. Experiments on both synthetic and real-world networks demonstrate the effectiveness and efficiency of our proposed algorithm.
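The paper combines a random-key representation with a locus-based decoding; the sketch below only shows the simpler idea of turning continuous keys into a discrete partition and scoring it with modularity. The binning decoder and the candidate solution are illustrative, not the authors' exact operators.

    import networkx as nx

    def decode_random_keys(keys: list, k: int) -> list:
        """Map each node's continuous key in [0, 1) to one of k communities by binning."""
        groups = {}
        for node, key in enumerate(keys):
            label = min(int(key * k), k - 1)
            groups.setdefault(label, set()).add(node)
        return list(groups.values())

    G = nx.karate_club_graph()
    keys = [0.1 if n < 17 else 0.9 for n in G.nodes]     # one candidate solution ("nest")
    partition = decode_random_keys(keys, k=2)
    print("modularity:", round(nx.algorithms.community.modularity(G, partition), 3))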
Download

Paper Nr: 74
Title:

Impacts of Social Factors in Wage Definitions

Authors:

Arthur R. Soares de Quadros, Sarah S. Magalhães, Giulia Zanon de Castro, Jéssica D. Almeida de Lima, Wladmir C. Brandão and Alessandro Vieira

Abstract: Now more than ever, automated decision-making systems such as Artificial Intelligence models are being used to make decisions based on sensitive/social data. For this reason, it is important to understand the impacts of social features in these models for salary predictions and wage classifications, so as to avoid perpetuating the unfairness that exists in society. In this study, publicly accessible data about jobs and employees in Brazil was analyzed by descriptive and inferential statistical methods to measure social bias. The impact of social features on decision-making systems was also evaluated, with the impact varying depending on the model. This study concluded that, for a model with a complex approach to analyzing the training data, social features are not able to define its predictions with an acceptable pattern, whereas for models with a simpler approach, they are. This means that, depending on the model used, an automated decision-making system can be more, or less, susceptible to social bias.
Download

Short Papers
Paper Nr: 19
Title:

Is this a Good Book? The Role of Intrinsic and Extrinsic Cues for Perceived Product Quality in Textbooks in e-Commerce

Authors:

Maria Madlberger and Ruslan Tagiev

Abstract: Previous e-commerce research has largely investigated the role of extrinsic cues for the assessment of product quality by online consumers. In addition, online retailers are also providing intrinsic cues to reduce uncertainties on product quality. This paper empirically investigates the role of intrinsic and extrinsic cues by an experimental design involving the average user rating as well as a product sample in the context of a printed textbook. The research design is a two-by-two between-subjects factorial experimental design with four conditions (high/low average user rating, presence/absence of sample pages). The results show that the average user rating impacts perceived product quality, however a main effect of sample availability and the interaction effect with average user rating could not be demonstrated. The study contributes to research on perceived product quality in e-commerce and the utilization of cues by investigating how online consumers use intrinsic and extrinsic cues to evaluate product quality.
Download

Paper Nr: 63
Title:

BSODCS: Bee Swarm Optimization for Detecting Community Structure

Authors:

Narimene Dakiche

Abstract: This paper presents BSODCS, a Bee Swarm Optimization algorithm for detecting community structure within networks. It employs artificial bees to explore a search space and construct solutions for community detection. To accommodate the specific features of networks, we adopt a locus-based adjacency encoding scheme. Each bee makes decisions regarding its neighboring solutions and shares information through a dance. To explore the neighborhood of each bee, we use Pearson’s correlation as the heuristic information. The modularity of the bees’ solutions serves as a metric for evaluating their quality. The algorithm is tested on well-known real-world networks, and the experimental findings demonstrate that BSODCS outperforms other existing swarm-based methods, delivering higher-quality results.
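For readers unfamiliar with the locus-based adjacency encoding mentioned above: a genotype assigns to every node one of its neighbours, and communities are the connected components of the resulting edge set. The sketch below shows only this generic decoding, not the BSODCS algorithm itself; the toy genotype is hypothetical.

    import networkx as nx

    def decode_locus_based(G: nx.Graph, genotype: dict) -> list:
        """genotype[v] is a neighbour of v; linked nodes end up in the same community."""
        H = nx.Graph()
        H.add_nodes_from(G.nodes)
        H.add_edges_from((v, genotype[v]) for v in G.nodes)
        return [set(c) for c in nx.connected_components(H)]

    G = nx.karate_club_graph()
    # Toy genotype: every node points to its lowest-numbered neighbour.
    genotype = {v: min(G.neighbors(v)) for v in G.nodes}
    communities = decode_locus_based(G, genotype)
    print(len(communities), "communities; modularity =",
          round(nx.algorithms.community.modularity(G, communities), 3))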
Download

Paper Nr: 69
Title:

A Novel Hybrid Approach Combining Beam Search and DeepWalk for Community Detection in Social Networks

Authors:

Aymene Berriche, Marwa Naïr, Kamel M. Yamani, Mehdi Z. Adjal, Sarra Bendaho, Nidhal E. Chenni, Fatima B. Tayeb and Malika Bessedik

Abstract: In the era of rapidly expanding social networks, community detection within social graphs plays a pivotal role in various applications such as targeted marketing, content recommendations, and understanding social dynamics. The community detection problem consists of finding a strategy for detecting cohesive groups, based on shared interests, choices, and preferences, given a social network where nodes represent users and edges represent interactions between them. In this work, we propose a hybrid method for the community detection problem that encompasses both traditional tree search algorithms and deep learning techniques. We begin by introducing a beam-search algorithm with a modularity-based agglomeration function as a foundation. To enhance its performance, we further hybridize this approach by incorporating DeepWalk embeddings into the process and leveraging a novel similarity metric for community structure assessment. Experimentation on both synthetic and real-world networks demonstrates the effectiveness of our method, particularly excelling in small to medium-sized networks, outperforming widely adopted methods.
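As an illustration of how node embeddings can inform an agglomeration step like the one above, the sketch scores candidate community merges by the cosine similarity of their embedding centroids. The embeddings are random stand-ins for DeepWalk vectors, and the paper's actual similarity metric and beam-search details may differ.

    import numpy as np

    def community_centroid(embeddings: np.ndarray, members: list) -> np.ndarray:
        return embeddings[members].mean(axis=0)

    def merge_score(embeddings: np.ndarray, a: list, b: list) -> float:
        """Cosine similarity between community centroids: higher = better merge candidate."""
        ca, cb = community_centroid(embeddings, a), community_centroid(embeddings, b)
        return float(ca @ cb / (np.linalg.norm(ca) * np.linalg.norm(cb)))

    rng = np.random.default_rng(1)
    emb = rng.normal(size=(6, 16))        # stand-in for DeepWalk embeddings of 6 nodes
    candidates = {("A", "B"): merge_score(emb, [0, 1], [2, 3]),
                  ("A", "C"): merge_score(emb, [0, 1], [4, 5])}
    # A beam search would keep the top-k scoring merges at every step.
    print(max(candidates, key=candidates.get), candidates)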
Download

Paper Nr: 75
Title:

Experimental Analysis of Pipelining Community Detection and Recommender Systems

Authors:

Ryan Dutra de Abreu, Laura Silva de Assis and Douglas O. Cardoso

Abstract: Community detection and recommender systems are two subjects of the highest relevance among data-oriented computational methods, considering their current applications in various contexts. This work investigated how pipelining these tasks may lead to better recommendations than those obtained without awareness of implicit communities. We experimentally assessed various combinations of methods for community detection and recommendation algorithms, as well as synthetic and real datasets. This aimed to unveil interesting patterns in the behavior of the resulting systems. Our results show that insights into communities can significantly improve both the effectiveness and efficiency of recommendation algorithms in some favorable scenarios. These findings can be used to help data science researchers and practitioners to better understand the benefits and limitations of this methodology.
Download
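
The pipelining idea above can be illustrated with a deliberately simplified sketch: detect communities first, then run a trivial popularity-based recommender per community. The interaction graph, consumption log, and recommender are placeholders rather than the combinations evaluated in the paper.

import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities
from collections import Counter

# Hypothetical user-user interaction graph and user -> consumed-items log.
G = nx.karate_club_graph()
consumed = {u: {f"item_{(u * 7 + k) % 10}" for k in range(3)} for u in G.nodes}

# Step 1: community detection.
communities = greedy_modularity_communities(G)

# Step 2: one simple popularity-based recommender per community.
def recommend(user, top_n=3):
    community = next(c for c in communities if user in c)
    popularity = Counter(item for member in community for item in consumed[member])
    return [item for item, _ in popularity.most_common() if item not in consumed[user]][:top_n]

print(recommend(0))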

Area 4 - Web Intelligence and Semantic Web

Full Papers
Paper Nr: 16
Title:

Who Says What (WSW): A Novel Model for Utterance-Aware Speaker Identification in Text-Based Multi-Party Conversations

Authors:

Y. P. Priyadarshana, Zilu Liang and Ian Piumarta

Abstract: Multi-party conversation (MPC) analysis is a growing and challenging research area that involves multiple interlocutors and complex discourse structures among multiple utterances. Even though most existing methods consider implicit complicated structures in MPC modelling, much work remains to be done for speaker-centric written discourse parsing in MPC analysis. On the other hand, pre-trained language models (PLMs) have achieved significant success in utterance-interlocutor semantic modelling. In this study, we propose Who Says What (WSW), a novel PLM that models who says what in an MPC, equipping discourse parsing with deep semantic structures and contextualized representations of utterances and interlocutors. To our knowledge, this is the first attempt to use the relative semantic distance of utterances in MPCs to design self-supervised tasks for MPC utterance structure modelling and MPC utterance semantic modelling. Experiments on four public benchmark datasets show that our model outperforms the existing state-of-the-art MPC understanding baselines by considerable margins and achieves new state-of-the-art performance in the response utterance selection and speaker identification downstream tasks.
Download

Paper Nr: 33
Title:

FaceCounter: Massive Attendance Taking in Educational Institutions Through Facial Recognition

Authors:

Adrian Moscol and Willy Ugarte

Abstract: Our purpose is to implement a facial recognition system that will improve efficiency when taking attendance in educational institutions, as well as reduce possible cases of identity theft. To achieve this objective, a facial recognition system will be created that, upon receiving a photograph of the students present in the classroom, will identify them and confirm their attendance in the database. The investigation of pre-trained models using the agile benchmarking technique will be important; the analyzed and compared models will serve as a basis for the development of the facial recognition system. This program will be connected to an application with a simple interface so that teachers can save class or evaluation time by taking attendance or confirming the identity of the students present. It will also increase security by preventing possible identity theft, such as the use of false fingerprint molds in admission exams or false IDs in partial and/or final exams.
Download
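
To illustrate the general face-matching step such a system depends on (not the authors' pipeline or models), the following hedged sketch uses the open-source face_recognition library; the image files, the enrolment dictionary, and the 0.6 threshold are assumptions.

import face_recognition

# Hypothetical enrolment: one reference photo per student.
enrolled = {
    "student_001": face_recognition.face_encodings(
        face_recognition.load_image_file("refs/student_001.jpg"))[0],
    "student_002": face_recognition.face_encodings(
        face_recognition.load_image_file("refs/student_002.jpg"))[0],
}

# Classroom photo: detect every face and match it against the enrolled encodings.
classroom = face_recognition.load_image_file("classroom.jpg")
present = set()
for encoding in face_recognition.face_encodings(classroom):
    distances = face_recognition.face_distance(list(enrolled.values()), encoding)
    best = distances.argmin()
    if distances[best] < 0.6:          # commonly used matching threshold, illustrative here
        present.add(list(enrolled.keys())[best])

print("attendance:", sorted(present))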

Paper Nr: 48
Title:

PhotoRestorer: Restoration of Old or Damaged Portraits with Deep Learning

Authors:

Christopher Mendoza-Dávila, David Porta-Montes and Willy Ugarte

Abstract: Several studies have proposed different image restoration techniques; however, most of them focus on restoring a single type of damage or, if they do restore different types of damage, their results are not very good or they have long execution times, leaving a large margin for improvement. Therefore, we propose the creation of a convolutional neural network (CNN) to classify the type of damage in an image and, accordingly, use pretrained models to restore that type of damage. For the classifier we apply the transfer learning technique, using the Inception V3 model as the basis of our architecture. To train the classifier, we used the FFHQ dataset, a dataset of people’s faces, and added different types of damage to the images using masks and functions. The results show that using a classifier to identify the type of damage in images is a good pre-restoration option to reduce execution times and improve restored image results.
Download
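
A minimal transfer-learning sketch in the spirit of the damage-type classifier described above, assuming a TensorFlow/Keras setup with a frozen InceptionV3 base; the number of damage classes and the commented-out dataset are placeholders, not the paper's configuration.

import tensorflow as tf

NUM_DAMAGE_CLASSES = 4  # hypothetical: e.g. scratches, stains, fading, tears

base = tf.keras.applications.InceptionV3(
    weights="imagenet", include_top=False, input_shape=(299, 299, 3))
base.trainable = False  # transfer learning: keep the ImageNet features frozen

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(NUM_DAMAGE_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
# model.fit(damaged_faces_dataset, epochs=10)  # dataset construction omitted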

Paper Nr: 77
Title:

Ontology-Driven Intelligent Group Pairing in Project-Based Collaborative Learning

Authors:

Asma Hadyaoui and Lilia Cheniti-Belcadhi

Abstract: In this research project, we investigate the influence of real-time online feedback from peer groups on the assessment of group work in the setting of Project-Based Collaborative Learning (PBCL). Peer feedback plays a crucial role in assisting students in evaluating their learning progress and acquiring valuable skills. Nevertheless, its effectiveness in group environments has yet to be explored. To tackle this issue, we propose an intelligent approach driven by ontologies to collect pertinent peer group feedback from the most compatible groups. We make use of agglomerative clustering to identify groups that closely match and connect them to exchange feedback. We utilize the information embedded in the ontology to create pairs of groups exhibiting similar behaviors and dynamics during project-based learning activities. To assess the effectiveness of our approach, we divide our dataset into two equal parts. We apply our intelligent pairing method to one half and a random approach to the other. We conduct assessments both before and after peer group feedback to measure its impact on project outcomes, including critical thinking and creativity. The results indicate a substantial improvement in project outcomes, particularly in terms of critical thinking and creativity, due to peer group feedback. Additionally, the groups formed using the agglomerative clustering algorithm demonstrate a higher increase in project validation (8.33%) compared to the random approach (5.27%). Our research underscores the effectiveness of integrating intelligence into the peer group feedback process, especially in the context of PBCL. The proposed ontology presents a promising solution for optimizing the assessment process, leading to improved results and the cultivation of critical thinking and creativity among students.
Download
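
Purely as an illustration of the group-pairing step (with made-up behaviour features rather than ontology-derived ones), groups can be clustered with agglomerative clustering and then paired with their nearest neighbour inside the same cluster:

import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.metrics.pairwise import euclidean_distances

# Hypothetical behaviour/dynamics features per learner group (rows = groups).
rng = np.random.default_rng(0)
group_features = rng.random((8, 5))

labels = AgglomerativeClustering(n_clusters=3).fit_predict(group_features)

# Pair each group with the closest other group in the same cluster.
dists = euclidean_distances(group_features)
np.fill_diagonal(dists, np.inf)
pairs = {}
for g in range(len(group_features)):
    same_cluster = np.where(labels == labels[g])[0]
    candidates = [h for h in same_cluster if h != g]
    pairs[g] = min(candidates, key=lambda h: dists[g, h]) if candidates else None

print(labels, pairs)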

Paper Nr: 79
Title:

Documents as Intelligent Agents: An Approach to Optimize Document Representations in Semantic Search

Authors:

Oliver Strauß and Holger Kett

Abstract: Finding good representations for documents in the context of semantic search is a relevant problem with applications in domains like medicine, research, or data search. In this paper we propose to represent each document in a search index by a number of different contextual embeddings. We define and evaluate eight different strategies to combine embeddings of the document title, document passages, and relevant user queries by means of linear combinations, averaging, and clustering. In addition, we apply an agent-based approach to search whereby each data item is modeled as an agent that tries to optimize its metadata and presentation over time by incorporating information received via the users’ interactions with the search system. We validate the document representation strategies and the agent-based approach on a medical information retrieval dataset and find that a linear combination of the title embedding, the mean passage embedding, and the mean over the clustered embeddings of relevant queries offers the best trade-off between search performance and index size. We further find that incorporating embeddings of relevant user queries can significantly improve the performance of representation strategies based on semantic embeddings. The agent-based system performs slightly better than the other representation strategies but comes with a larger index size.
Download
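
A toy numpy sketch of the best-performing combination reported above, i.e. a linear combination of the title embedding, the mean passage embedding, and the mean over clustered query embeddings; the dimensions, weights, and k-means step are assumptions, not the paper's exact configuration.

import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
dim = 384  # hypothetical embedding dimensionality

title_emb = rng.normal(size=dim)
passage_embs = rng.normal(size=(12, dim))   # one embedding per document passage
query_embs = rng.normal(size=(30, dim))     # embeddings of relevant user queries

# Cluster the query embeddings and average the cluster centroids.
centroids = KMeans(n_clusters=4, n_init=10, random_state=0).fit(query_embs).cluster_centers_
query_component = centroids.mean(axis=0)

# Linear combination of the three components (weights are illustrative).
w_title, w_passage, w_query = 0.4, 0.3, 0.3
doc_representation = (w_title * title_emb
                      + w_passage * passage_embs.mean(axis=0)
                      + w_query * query_component)
print(doc_representation.shape)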

Short Papers
Paper Nr: 21
Title:

Java Binding for JSON-LD

Authors:

Martin Ledvinka

Abstract: JSON-LD is a data format that is easy to understand and use, with a Linked Data background. As such, it is one of the most approachable Semantic Web technologies. Moreover, it can bring major benefits even to applications not primarily based on the Semantic Web, especially regarding their interoperability. This work presents JB4JSON-LD, a software library allowing seamless integration of JSON-LD into REST APIs of Java Web applications without having to deal with individual nodes of the JSON-LD graph. The library is compared to existing alternatives, and a demo application as well as a real-world information system are used to illustrate its use.
Download

Paper Nr: 39
Title:

DOM-Based Clustering Approach for Web Page Segmentation: A Comparative Study

Authors:

Adrian Sterca, Oana Nourescu, Adriana Guran and Camelia Serban

Abstract: Web page segmentation plays a crucial role in analyzing and understanding the content of web pages, enabling various web-related tasks. Approaches based on computer vision and machine learning have limitations determined by the need for large datasets for training and validation. In this paper, we propose a Document Object Model (DOM) based approach that uses clustering algorithms for web page segmentation. By leveraging the hierarchical structure of the DOM, our approach aims to achieve accurate and reliable segmentation results. We conduct an empirical study using a custom-built dataset to compare the performance of different clustering algorithms for web segmentation. Our research objectives focus on dataset creation, feature identification, distance metric definition, and the selection of appropriate clustering algorithms. The findings provide insights into the effectiveness and limitations of our approach, enabling informed decision-making in real-world applications.
Download
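
As an illustrative sketch in the direction of the DOM-based approach above (not the authors' feature set, distance metrics, or dataset), simple structural features can be extracted per DOM element with BeautifulSoup and clustered:

import numpy as np
from bs4 import BeautifulSoup
from sklearn.cluster import KMeans

html = """<html><body>
  <nav><a href="#">Home</a><a href="#">About</a></nav>
  <article><h1>Title</h1><p>Some longer body text here.</p><p>More text.</p></article>
  <footer><p>Contact</p></footer>
</body></html>"""

soup = BeautifulSoup(html, "html.parser")
elements = soup.body.find_all(True)

def depth(el):
    return len(list(el.parents))

# Simple structural features: depth, number of child tags, direct text length, link count.
features = np.array([[depth(el),
                      len(el.find_all(True, recursive=False)),
                      len(el.get_text(strip=True)),
                      len(el.find_all("a"))] for el in elements], dtype=float)

labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(features)
for el, lab in zip(elements, labels):
    print(lab, el.name)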

Paper Nr: 55
Title:

On the Construction of Database Interfaces Based on Large Language Models

Authors:

João Pinheiro, Wendy Victorio, Eduardo Nascimento, Antony Seabra, Yenier Izquierdo, Grettel García, Gustavo Coelho, Melissa Lemos, Luiz P. Leme, Antonio Furtado and Marco Casanova

Abstract: This paper argues that Large Language Models (LLMs) can be profitably used to construct natural language (NL) database interfaces, including conversational interfaces. Such interfaces will be simply called LLM-based database (conversational) interfaces. It discusses three problems: how to use an LLM to create an NL database interface; how to fine-tune an LLM to follow instructions over a particular database; and how to simplify the construction of LLM-based database (conversational) interfaces. The paper covers the first two problems with the help of examples based on two well-known LLM families, GPT and LLaMA, developed by OpenAI and Meta, respectively. Likewise, it discusses the third problem, with the help of examples based on two frameworks, LangChain and LlamaIndex.
Download
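
A heavily reduced sketch of the first problem discussed above, turning a natural-language question into SQL over a known schema; the call_llm function is a placeholder for whatever LLM client is used (GPT, LLaMA, or a framework such as LangChain or LlamaIndex), and the schema is invented.

import sqlite3

SCHEMA = """
CREATE TABLE customers(id INTEGER PRIMARY KEY, name TEXT, country TEXT);
CREATE TABLE orders(id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
"""

PROMPT_TEMPLATE = (
    "You are a SQL assistant. Given this SQLite schema:\n{schema}\n"
    "Write a single SQL query answering the question below. Return only SQL.\n"
    "Question: {question}"
)

def call_llm(prompt: str) -> str:
    # Placeholder for an LLM call; wiring to a concrete client is omitted here.
    raise NotImplementedError

def answer(question: str, connection: sqlite3.Connection):
    sql = call_llm(PROMPT_TEMPLATE.format(schema=SCHEMA, question=question))
    return connection.execute(sql).fetchall()

# Usage (once call_llm is wired to a real model):
# conn = sqlite3.connect("shop.db")
# print(answer("Which customers from Brazil spent more than 100 in total?", conn))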

Paper Nr: 59
Title:

Pleural Effusion Classification on Chest X-Ray Images with Contrastive Learning

Authors:

Felipe A. Zeiser, Ismael G. Santos, Henrique C. Bohn, Cristiano A. da Costa, Gabriel O. Ramos, Rodrigo R. Righi, Andreas Maier, José M. Andrade and Alexandre Bacelar

Abstract: Diagnosing pleural effusion is important to recognize the disease’s etiology and reduce the length of hospital stay for patients after fluid content analysis. In this context, machine learning techniques have been increasingly used to help physicians identify radiological findings. In this work, we propose using contrastive learning to classify chest X-rays with and without pleural effusion. A model based on contrastive learning is trained to extract discriminative features from the images and reports, maximizing the similarity between correct image and text pairs. Preliminary results show that the proposed approach is promising, achieving an AUC of 0.900, an accuracy of 86.28%, and a sensitivity of 88.54% for classifying pleural effusion on chest X-rays. These results demonstrate that the proposed method achieves results comparable or superior to the state of the art. Contrastive learning can be a promising alternative to improve the accuracy of medical image classification models, contributing to a more accurate and effective diagnosis.
Download
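
The image-report contrastive objective described above, maximising the similarity of correct image-text pairs, can be sketched with a CLIP-style symmetric InfoNCE loss in PyTorch; the random embeddings stand in for the outputs of the image and text encoders, which are not reproduced here.

import torch
import torch.nn.functional as F

def contrastive_loss(image_emb, text_emb, temperature=0.07):
    # Symmetric InfoNCE: the i-th image should match the i-th report and vice versa.
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = image_emb @ text_emb.t() / temperature
    targets = torch.arange(logits.size(0))
    return (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets)) / 2

# Hypothetical batch of embeddings from an image encoder and a text encoder.
batch, dim = 16, 512
loss = contrastive_loss(torch.randn(batch, dim), torch.randn(batch, dim))
print(loss.item())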

Paper Nr: 64
Title:

SPORENLP: A Spatial Recommender System for Scientific Literature

Authors:

Johannes Wirth, Daniel Roßner, René Peinl and Claus Atzenbeck

Abstract: SPORENLP is a recommendation system designed to support the review of scientific literature. It operates on a sub-dataset comprising 15,359 publications, with a total of 117,941,761 pairwise comparisons. This dataset includes both metadata comparisons and text-based similarity aspects obtained using natural language processing (NLP) techniques. Unlike other recommendation systems, SPORENLP does not rely on specific aspect features. Instead, it identifies the top k candidates based on shared keywords and embedding-related similarities between publications, enabling content-based, intuitive, and adjustable recommendations without excluding possible candidates through classification. To provide users with an intuitive interface for interacting with the dataset, we developed a web-based front-end that takes advantage of the principles of spatial hypertext. A qualitative expert evaluation was conducted on the dataset. The dataset creation pipeline and the source code for SPORENLP will be made freely available to the research community, allowing further exploration and improvement of the system.
Download
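
A toy sketch of the candidate-selection idea above (top-k by shared keywords combined with embedding similarity); the keyword sets, embeddings, and equal weighting are assumptions, not the system's actual scoring.

import numpy as np

# Hypothetical corpus: per-publication keyword sets and embeddings.
keywords = {"p1": {"graph", "nlp"}, "p2": {"nlp", "embedding"}, "p3": {"vision"}}
rng = np.random.default_rng(0)
embeddings = {p: rng.normal(size=64) for p in keywords}

def top_k(query_id, k=2):
    q_kw, q_emb = keywords[query_id], embeddings[query_id]
    scores = {}
    for p, kw in keywords.items():
        if p == query_id:
            continue
        jaccard = len(q_kw & kw) / len(q_kw | kw)
        e = embeddings[p]
        cosine = float(q_emb @ e / (np.linalg.norm(q_emb) * np.linalg.norm(e)))
        scores[p] = 0.5 * jaccard + 0.5 * cosine  # equal weighting, purely illustrative
    return sorted(scores, key=scores.get, reverse=True)[:k]

print(top_k("p1"))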

Paper Nr: 78
Title:

Intelligent Agents with Graph Mining for Link Prediction over Neo4j

Authors:

Michalis Nikolaou, Georgios Drakopoulos, Phivos Mylonas and Spyros Sioutas

Abstract: Intelligent agents (IAs) are highly autonomous software applications designed to perform tasks in a broad spectrum of virtual environments by circulating freely around them, possibly in numerous copies, and taking actions as needed, thereby increasing human digital awareness. Consequently, IAs are indispensable for large-scale digital infrastructure across fields as diverse as logistics and long supply chains, smart cities, enterprise and Industry 4.0 settings, and Web services. In order to achieve their objectives, IAs frequently rely on machine learning algorithms. One prime example, which lies in the general direction of evaluating network structural integrity, is link prediction, which depending on the context may indicate growth potential or a malfunction. IAs employing machine learning algorithms and local structural graph attributes pertaining to connectivity patterns are presented. Their performance is evaluated with metrics including the F1 score and the ROC curve on a benchmark dataset of scientific citations provided by Neo4j containing ground truth.
Download
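
A generic sketch, on a networkx toy graph rather than the Neo4j citation dataset, of training a classifier on local structural features and evaluating it with the F1 score and ROC-AUC; the features and classifier are illustrative, and training and evaluation are done on the same pairs only for brevity.

import random
import networkx as nx
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, roc_auc_score

random.seed(0)
G = nx.karate_club_graph()

# Positive examples: existing edges; negatives: sampled non-edges.
positives = list(G.edges())
negatives = random.sample(list(nx.non_edges(G)), len(positives))
pairs = positives + negatives
y = np.array([1] * len(positives) + [0] * len(negatives))

def features(pairs):
    cn = [len(list(nx.common_neighbors(G, u, v))) for u, v in pairs]
    jc = [p for _, _, p in nx.jaccard_coefficient(G, pairs)]
    aa = [p for _, _, p in nx.adamic_adar_index(G, pairs)]
    return np.column_stack([cn, jc, aa])

X = features(pairs)
clf = LogisticRegression(max_iter=1000).fit(X, y)
pred = clf.predict(X)
print(f1_score(y, pred), roc_auc_score(y, clf.predict_proba(X)[:, 1]))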

Paper Nr: 82
Title:

Similarity Learning for Person Re-Identification Using Deep Auto-Encoder

Authors:

Sevdenur Kutuk, Rayan Abri, Sara Abri and Salih Cetin

Abstract: Person re-identification (ReID) has been one of the most crucial issues in computer vision, particularly for reasons of security and privacy. Person re-identification generally aims to create a unique identity for a person seen in the field of view of a camera and to identify the same person in different frames of the same camera or within the relevant frames of multiple cameras. Due to low resolution and noisy frames, crowded scenes, scenes with occlusion, weather and light changes, and datasets with insufficient numbers of samples containing different states of the same person for training supervised models, person re-identification remains a challenging and actively studied problem. In this paper, we propose a hybrid person re-identification model that uses Normalized Cross-Correlation (NCC) and cosine similarity to determine whether extracted features belong to the same person, which we call DAE-ID (Deep Auto-Encoder Identification). The model is built using a pre-trained You Only Look Once Version 4 (YOLOv4) algorithm to detect objects and a convolutional auto-encoder trained on the Motion Analysis and Re-identification Set (MARS) dataset for feature extraction. Our method outperforms state-of-the-art methods on the Chinese University of Hong Kong (CUHK03) dataset, with 0.966 rank-1 and 0.857 mAP, and on the Duke Multi-Tracking Multi-Camera Re-Identification (DukeMTMC-reID) dataset, with 0.956 rank-1 and 0.841 mAP, for single-person re-identification.
Download
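
The matching step described above, deciding whether two auto-encoder feature vectors belong to the same person via cosine similarity and normalized cross-correlation, can be sketched as follows; the feature vectors, the equal weighting, and the decision threshold are made up.

import numpy as np

def cosine_similarity(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def normalized_cross_correlation(a, b):
    # Zero-mean NCC of two 1-D feature vectors (equivalent to Pearson correlation).
    a, b = a - a.mean(), b - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical encoder outputs for two detections.
rng = np.random.default_rng(0)
feat_a = rng.normal(size=256)
feat_b = feat_a + 0.1 * rng.normal(size=256)   # a slightly perturbed copy

score = 0.5 * cosine_similarity(feat_a, feat_b) + 0.5 * normalized_cross_correlation(feat_a, feat_b)
same_person = score > 0.8   # illustrative threshold
print(round(score, 3), same_person)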

Paper Nr: 22
Title:

A Proposed Ontology-Based Sociocultural Context Model

Authors:

Fatma-Zohra Rennane and Abdelkrim Meziane

Abstract: The global business landscape, including the handicraft sector in the Maghreb region, has witnessed a significant transformation with the emergence of Information and Communication Technologies (ICT). To adapt to this evolving landscape, many businesses have made the strategic shift to online operations, capitalizing on the vast opportunities offered by ICT. By establishing a strong online presence through e-commerce platforms and social media, handicraft businesses can expand their customer reach and tap into a broader market. However, the adoption of ICT remains a formidable challenge for handicraft women. This challenge stems from multiple factors such as poverty, gender disparities, language barriers, and limited literacy. To address these obstacles and provide personalized services with relevant information, a context ontology integrating sociocultural aspects is proposed. This ontology serves as a comprehensive framework, capturing the socio-cultural nuances of the handicraft sector. By leveraging this ontology, tailored ICT solutions can be developed, taking into account the socio-cultural challenges faced by these women. This approach allows for the provision of personalized services that align with their specific requirements, fostering the effective adoption of ICTs and empowering handicraft women in the Maghreb region to thrive in the digital age.
Download

Paper Nr: 71
Title:

Enhancing Industrial Productivity Through AI-Driven Systematic Literature Reviews

Authors:

Jaqueline G. Coelho, Guilherme D. Bispo, Guilherme F. Vergara, Gabriela M. Saiki, André M. Serrano, Li Weigang, Clovis Neumann, Patricia H. Martins, Welber Santos de Oliveira, Angela B. Albarello, Ricardo A. Casonatto, Patrícia Missel, Roberto D. Medeiros Junior, Jefferson O. Gomes, Carlos Rosano-Peña and Caroline C. F. da Costa

Abstract: The advent of Artificial Intelligence (AI) has opened up new possibilities for improving productivity in various industry sectors. In this paper, we propose a novel framework aimed at optimizing systematic literature reviews (SLRs) for industrial productivity. By combining traditional keyword selection methods with AI-driven classification techniques, we streamline the review process, making it more efficient. Leveraging advanced natural language processing (NLP) approaches, we identify six key sectors for optimization, thereby reducing workload in less relevant areas and enhancing the efficiency of SLRs. This approach helps conserve valuable time and resources in scientific research. Additionally, we implemented four machine learning models for category classification, achieving an impressive accuracy rate of over 75%. The results of our analyses demonstrate a promising pathway for future automation and refinements to boost productivity in the industry.
Download
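
The category-classification step mentioned above can be sketched generically, not with the authors' models or data, as a TF-IDF plus linear-classifier pipeline in scikit-learn; the labelled abstracts below are placeholders.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical labelled abstracts: text -> industrial sector category.
texts = [
    "predictive maintenance of machine tools with vibration sensors",
    "scheduling optimisation for assembly lines",
    "energy consumption forecasting in smelting plants",
    "robot path planning for warehouse logistics",
]
labels = ["maintenance", "scheduling", "energy", "logistics"]

classifier = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression(max_iter=1000))
classifier.fit(texts, labels)

print(classifier.predict(["forecasting electricity demand of a steel plant"]))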

Paper Nr: 73
Title:

Semantic Micro-Front-End Approach to Enterprise Knowledge Graph Applications Development

Authors:

Milorad Tosic, Nenad Petrovic and Olivera Tosic

Abstract: Industry 4.0 has been mainly driven by developments in IoT devices and artificial intelligence, raising the heterogeneity of the data acquired by sensing devices, as well as of data from existing legacy systems (such as ERP), that are crucial for digital transformation. Until recently, migration of enterprise applications to the Cloud has been considered the only viable long-term solution. However, after hidden infrastructure costs of the Cloud-only approach were discovered, a number of businesses have begun considering hybrid Cloud-Edge architectures where Micro-Services Architectures (MSA) on the backend are complemented with Micro-Front-End (MFE) applications. However, the architecture must be very carefully optimized in order to avoid high risks and costs due to increased system complexity. In this paper, a semantic-driven approach based on an Enterprise Knowledge Graph (EKG) and ontologies with automated mapping between them is introduced in order to manage this complexity. Ontologies are adopted for an automated, low-code approach to the composition and deployment of MFE components targeting enterprise productivity applications. MFE applications generated this way are built upon a Semantic Micro Services backend that can be transparently distributed between Cloud and Edge. Our approach is illustrated on a case study for semantic annotation of a manufacturing area, which utilizes a shared marketplace component for IoT-based indoor positioning.
Download