WEBIST 2025 Abstracts


Area 1 - HCI in Mobile Systems and Web Interfaces

Full Papers
Paper Nr: 25
Title: Multimodal Large Language Models for Portuguese Alternative Text Generation for Images
Authors: Víctor Alexsandro Elisiário and Willian Massami Watanabe

Abstract: Since the creation of the Web Content Accessibility Guidelines (WCAG), the Web has become increasingly accessible to people with disabilities. However, related works report that Web developers are not always aware of accessibility specifications and that many Web applications still contain accessibility barriers. Therefore, this work proposes the use of Multimodal Large Language Models (MLLMs), leveraging Google’s Cloud Vision API and contextual information extracted from Web pages’ HTML, to generate alternative texts for images using the Gemini-1.5-Pro model. To evaluate this approach, a case study was conducted to analyze the perceived relevance of the generated descriptions. Six Master’s students in Computer Science participated in a blind analysis, assessing the relevance of the descriptions produced by the MLLM alongside the original alternative texts provided by the page authors. The evaluations were compared to measure the relative quality of the descriptions. The results indicate that the descriptions generated by the MLLM are at least equivalent to those created by humans. Notably, the best performance was achieved without incorporating additional contextual data. These findings suggest that alternative texts generated by MLLMs can effectively meet the needs of blind or visually impaired users, thereby enhancing their access to Web content.
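
To make the described pipeline concrete, a minimal sketch using the google-generativeai Python SDK follows; the model name comes from the abstract, while the prompt wording, file name, and HTML context string are illustrative assumptions, not the authors' implementation.

    import google.generativeai as genai  # pip install google-generativeai
    from PIL import Image

    genai.configure(api_key="YOUR_API_KEY")
    model = genai.GenerativeModel("gemini-1.5-pro")

    image = Image.open("figure.jpg")
    # Hypothetical context extracted from the surrounding HTML of the page.
    html_context = "figure inside an article about urban cycling"

    # Ask the model for a concise Portuguese alt text, optionally with page context.
    response = model.generate_content([
        "Gere um texto alternativo curto em português para esta imagem. "
        f"Contexto da página: {html_context}",
        image,
    ])
    print(response.text)  # candidate value for the img alt attribute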

Paper Nr: 70
Title: Linking User Experience and Business Outcomes: How Perceived Usefulness of AI Chatbots Predicts Satisfaction and NPS
Authors: Tim-Can Werning, María José Escalona and Andreas Hinderks

Abstract: The integration of AI-based features is rapidly transforming interactions with software systems. While these innovations aim to enhance functionality, their impact on user experience and on business outcomes such as satisfaction and loyalty remains underexplored. This study investigates how the user experience (UX) of AI chatbots relates to two key user-level outcomes: Customer Satisfaction (CSAT) and Net Promoter Score (NPS). Drawing on a sample of N = 146 users, we conducted regression analyses, including interaction terms with AI usage frequency and perceived competency. Results indicate that perceived Usefulness significantly predicts both CSAT and NPS, with partial support for a moderation effect of AI usage frequency. Specifically, higher usage increases the positive impact of Usefulness on NPS. Overall, our regression models for CSAT and NPS explained around 39% and 48% of the variance, respectively. These results indicate a good model fit and underline the importance of good UX in AI systems, as it significantly affects user satisfaction and loyalty. In summary, by linking established UX metrics to strategic business indicators, we show how UX professionals can contribute more business value, and we offer guidance for adopting a more user-centered perspective on AI development.

Paper Nr: 78
Title: Classification of Augmented Reality Design Recommendations on User Experience Dimensions: Preliminary Study Results
Authors: Stefan Graser, Jessica Kollmorgen, Martin Schrepp, María José Escalona and Stephan Böhm

Abstract: Augmented Reality (AR) in Corporate Training (CT) enables immersive and interactive learning scenarios, resulting in a new user experience (UX). Within software development, UX is a crucial success factor. While numerous AR-specific design recommendations exist, it remains unclear how these contribute to the actual user experience perceived by learners. This misalignment between intended and actual UX highlights the challenge for AR authors. For UX evaluation, questionnaires can be used to collect data from target groups and produce reliable quantitative data describing UX quality. However, to avoid being too time-consuming, a questionnaire should not include too many items for capturing users’ UX impressions. Since UX questionnaires typically capture only high-level impressions, their results often do not provide clear suggestions for designers or developers on how to improve an application. Linking design recommendations to questionnaire scales would help connect UX evaluation results more directly to design changes that are likely to improve users’ UX impressions. We describe a study establishing such a mapping for the application domain of AR in corporate training. Preliminary results provide an initial classification of AR design recommendations across relevant UX dimensions.

Short Papers
Paper Nr: 17
Title: Hayshark: A Web-Based System for Remote Visualization of Network Traffic
Authors: Antero Taivalsaari

Abstract: Any computer or device connected to the Internet today will be exposed to a significant amount of network traffic from potentially thousands of different IP addresses. The vast majority of this traffic is entirely invisible to ordinary users. In this paper we introduce a system called Hayshark that can be used for visualizing and analyzing network traffic on a multitude of computing devices in a quick “at-a-glance” fashion. The Hayshark system constructs and dynamically maintains a live 3D graph representation of all incoming network traffic, and presents the data streams in a visual web-based user interface with force-directed graph layout. The user interface includes detailed message statistics and the ability to quickly dive deeper into individual streams if any anomalies are detected in their behavior.

Paper Nr: 58
Title: A Smartwatch-Based Approach to Support and Analysis of Driver Stress and Anxiety
Authors: Tiago Mota de Oliveira, André Roberto Ortoncelli, Claudemir Casa, Claudinei Casa and Luciano Silva

Abstract: This study presents a smartwatch-based solution for monitoring drivers’ stress and anxiety levels using heart rate data, standing out for not requiring synchronization with other devices. The system captures heart rate variations and GPS coordinates, offering real-time feedback to assist drivers while storing all data in a cloud database for subsequent expert analysis. Additionally, a reporting tool is provided to help specialists (e.g., psychologists) evaluate drivers’ emotional states and offer appropriate support. A pilot study was conducted with eleven students from a driver training center during practical lessons to assess the proposed system. The results show that the application was positively received by all participants, with two expressing interest in using it beyond the study. These findings suggest that the proposed solution could enhance driver well-being and preparedness, particularly among new drivers.

Paper Nr: 80
Title: The Impact of Technology on Italian Students During the COVID-19 Pandemic: Learning, Emotions, Behaviour and Support
Authors: Maria Claudia Buzzi, Marina Buzzi, Barbara Leporini, Luca Bastiani, Francesca Denoth, Michela Franchini, Stefania Pieroni and Sabrina Molinaro

Abstract: The COVID-19 pandemic has profoundly affected our lives in almost every area, including school. Students have experienced a transformation in their learning environments with the widespread, almost universal shift from face-to-face to online lessons, which has impacted their academic performance and feelings of isolation. This paper examines the impact of technology on the learning, social, and human aspects of Italian students during the COVID-19 pandemic, which accelerated their immersion in the digital world. To this aim, the paper analyses the positive and negative aspects of technology usage during the pandemic on students' lives, specifically on academic performance, emotional well-being, behaviours, and the support they received during this period. Data from 152 Italian students aged 18 and above were collected to understand how technology helped them cope with school activities, isolation, and psychological well-being. Students appreciated the convenience and accessibility of digital lessons and materials, but they also faced challenges such as concentration difficulties, reduced social contact, and accessibility issues. The results confirm but also expand previous research findings. Statistical analysis highlights that higher levels of support are associated with better academic performance, and that better Internet connectivity is linked to greater student autonomy.

Paper Nr: 81
Title: Eye-Based Cognitive Overload Prediction in Human-Machine Interaction via Machine Learning
Authors: Maria Trigka, Elias Dritsas and Phivos Mylonas

Abstract: Cognitive overload significantly affects human performance in complex interaction settings, making its early detection essential for designing adaptive systems. This study investigated whether gaze-derived features can reliably predict overload states using supervised machine learning (ML). The analysis was conducted on an eye-tracking dataset from a cognitively demanding visual task that incorporated fixations, saccades, and pupil diameter measurements. Five classifiers, namely, Logistic Regression (LR), Naive Bayes (NB), Support Vector Machine (SVM), XGBoost (XGB), and Multilayer Perceptron (MLP), were evaluated using stratified train/test splits and 5-fold cross-validation. XGB achieved the best performance, with an accuracy of 0.902, a precision of 0.958, a recall of 0.821, an F1 score of 0.884, and an area under the ROC curve (AUC) of 0.956. These findings confirm that gaze-derived features alone can reliably distinguish between cognitive overload states. The results also revealed trade-offs between simple models, which are easier to interpret but more conservative, and complex models, such as XGB and MLP, which achieved stronger predictive performance. Future studies should address subject-independent validation, incorporate temporal modeling of gaze dynamics, and explore personalization and cross-task generalization to advance robust and adaptive cognitive monitoring.
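
As a concrete illustration of the evaluation protocol described, a minimal sketch with scikit-learn and xgboost follows; the feature columns, label name, and input file are invented placeholders, and the hyperparameters are library defaults rather than the paper's configuration.

    import pandas as pd
    from sklearn.model_selection import train_test_split, cross_val_score
    from xgboost import XGBClassifier

    # Hypothetical gaze-feature table: fixation/saccade/pupil columns plus a binary overload label.
    df = pd.read_csv("gaze_features.csv")
    X = df[["fixation_duration", "saccade_amplitude", "pupil_diameter"]]
    y = df["overload"]

    # Stratified train/test split, as in the study.
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=42)

    clf = XGBClassifier(eval_metric="logloss")
    # 5-fold cross-validation on the training portion.
    print("CV accuracy:", cross_val_score(clf, X_tr, y_tr, cv=5).mean())
    clf.fit(X_tr, y_tr)
    print("Test accuracy:", clf.score(X_te, y_te))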

Paper Nr: 94
Title: Context-Aware Warning Systems: Leveraging Driving Environment Data for Improved Driver and Road User Warnings
Authors: Alexander Stocker, Tahir Emre Kalayci, Michael Spitzer and Gerald Musser

Abstract: Web technologies, Internet of Things (IoT) frameworks, and modern communication standards are increasingly transforming the automotive sector, giving rise to software-defined vehicles. These vehicles operate as connected entities within a broader digital ecosystem, enabling real-time data exchange with infrastructure, cloud services, and other road users. This ongoing digitalization opens new opportunities to improve road safety through intelligent, context-aware driver assistance systems. Our paper introduces a novel context-aware driver warning system to be developed as part of the ROADGUARD project. The system will fuse data from in-cabin driver monitoring with data about the external driving environment to enhance the accuracy and contextual relevance of safety alerts. Conventional Driver Monitoring Systems (DMS) often rely solely on gaze-based heuristics, which can lead to false positives when environmental context is not considered. Our approach will overcome this limitation by integrating multimodal sensing, AI-driven edge inference, secure data sharing, and adaptive, multi-target warning delivery. Our proposed system architecture is structured around three interconnected subsystems: Sensing, Sharing, and Acting. It will enable not only more precise, real-time alerts for drivers but also cooperative warnings for vulnerable road users such as pedestrians and cyclists. By embedding situational awareness and supporting data-driven improvement via mobility data spaces, our system supports the Vision Zero objective of eliminating traffic fatalities.

Paper Nr: 22
Title: A Case Study on Using Generative AI in Literature Reviews: Use Cases, Benefits, and Challenges
Authors: Eva-Maria Schön, Jessica Kollmorgen, Michael Neumann and Maria Rauschenberger

Abstract: Context: Literature reviews play a critical role in the research process. They are used not only to generate new insights but also to contextualize and justify one’s own research within the existing body of knowledge. Problem: For years, the number of scientific publications has been increasing rapidly; conducting literature reviews can therefore be time-consuming and error-prone. Objective: We investigate how integrating generative Artificial Intelligence (GenAI) tools may optimize the literature review process in terms of efficiency and methodological quality. Method: We conducted a single case study with 16 Master’s students at a University of Applied Sciences in Germany. They all carried out a Systematic Literature Review (SLR) using generative AI tools. Results: Our study identified use cases for the application of GenAI in literature reviews, as well as benefits and challenges. Conclusion: The results reveal that GenAI is capable of supporting literature reviews, especially critical parts such as primary study selection. Participants can scan large volumes of literature in a short time and overcome language barriers using GenAI. At the same time, it is crucial to assess GenAI outputs and ensure adequate quality assurance throughout the research process due to technology limitations, such as hallucination.

Paper Nr: 65
Title: Insertion of HCI Practices in a Usability Engineering Course
Authors: Luiz Felipe Cirqueira dos Santos, Mariano Florencio Mendonça, Edmir Queiroz, Elisrenan Barbosa da Silva, Shexmo Richarlison Ribeiro dos Santos, Marcus Vinicius Santana Silva, Alberto Luciano de Souza Bastos, Marcos Cesar Barbosa Dos Santos, Marcos Venicius Santos and Markson Fábio da Silva Santos

Abstract: Including Human-Computer Interaction (HCI) practices in the curriculum is crucial for preparing future professionals to design systems that effectively meet users’ needs. HCI encompasses methods and techniques to improve interactive systems’ usability, user experience, and effectiveness. This article presents an experience report on teaching HCI techniques to a Usability Engineering class using the Flipped Classroom methodology. These techniques made it possible to explore the subject in a practical way and to offer a critical, analytical perspective on developing practical, efficient, and satisfying user interfaces.

Area 2 - Internet Technology

Full Papers
Paper Nr: 19
Title: Developers’ Insight on Manifest v3 Privacy and Security Webextensions
Authors: Libor Polčák, Giorgio Maone, Michael McMahon and Martin Bednář

Abstract: Webextensions can improve web browser privacy, security, and user experience. The APIs offered by the browser to webextensions affect their possible functionality. Chrome is currently transitioning to a modified set of APIs called Manifest v3. This paper studies the challenges and opportunities of Manifest v3 through in-depth, structured qualitative research. Even though some projects observed positive effects, a majority express concerns over limited benefits to users, the removal of crucial APIs, or the need to find workarounds. Our findings indicate that the transition affects different types of webextensions differently: some can migrate without losing functionality, while others remove functionality or decline to update. The respondents identified several critical missing APIs, including reliable APIs to inject content scripts, APIs for storing confidential content, and others.

Paper Nr: 21
Title: Enhancing Data Governance in Data Trustees Through ODRL-Based End-of-Life Policies
Authors: Michael Steinert and Daniel Tebernum

Abstract: While data sharing drives innovation, ensuring compliance with legal, regulatory, and trust requirements presents significant challenges. Research identifies data trustees as intermediaries between providers and consumers, facilitating compliant and trusted data sharing. However, an underserved aspect is managing the end-of-life (EoL) of shared data, where standardized, machine-interpretable mechanisms for detailed EoL policies are lacking. To address this gap, we propose an extension of the Open Digital Rights Language (ODRL) to incorporate semantically rich EoL policies. This enables the specification of data deletion requirements, supporting legal and regulatory obligations. Data trustees can use these enhanced policies to coordinate EoL actions among all parties. The explicit semantics within these policies facilitate clearer accountability and support the creation of auditable logs by making EoL obligations machine-interpretable and unambiguous. Our ODRL extension has been evaluated by ODRL and data governance experts, ensuring its robustness and relevance for practical implementation. This work contributes to the standardization of EoL data management by analyzing and articulating the detailed requirements for EoL policies in the context of data trustees, and by proposing a specific ODRL extension to meet these requirements. For practitioners using ODRL, our extension provides enhanced, machine-interpretable EoL capabilities, improving compliance and trust.
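
For orientation, a sketch of how such a policy might be serialized follows, shown as a Python dict mirroring ODRL's JSON-LD form. The delete action and dateTime constraint are standard ODRL Common Vocabulary terms; the eol: namespace and its properties are invented stand-ins for the authors' extension.

    # Illustrative ODRL agreement with a hypothetical end-of-life (eol:) extension.
    policy = {
        "@context": ["http://www.w3.org/ns/odrl.jsonld",
                     {"eol": "https://example.org/eol#"}],   # placeholder namespace
        "@type": "Agreement",
        "uid": "https://example.org/policy/42",
        "obligation": [{
            "assigner": "https://example.org/data-trustee",
            "assignee": "https://example.org/data-consumer",
            "action": "delete",                # ODRL Common Vocabulary action
            "target": "https://example.org/dataset/7",
            "constraint": [{
                "leftOperand": "dateTime",     # delete no later than this date
                "operator": "lteq",
                "rightOperand": "2026-12-31T23:59:59Z",
            }],
            "eol:deletionMethod": "eol:CryptographicErasure",  # hypothetical extension property
        }],
    }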

Paper Nr: 23
Title: Real-Time Sound Mapping of Object Rotation and Position in Augmented Reality Using Web Browser Technologies
Authors: Victor Vlad and Sabin Corneliu Buraga

Abstract: Growing focus on immersive media within the browser has been driven by recent advances in technologies such as WebXR for augmented reality (AR), the Web Audio API for spatial sound rendering, and object-tracking libraries such as TensorFlow.js. This research presents a real-time system for spatial audio mapping of physical object motion within a browser-based augmented reality environment. By leveraging native Web technologies, the system captures the rotation and position of real-world objects and translates these parameters into dynamic 3D soundscapes rendered directly in the user’s browser. In contrast to conventional AR applications that necessitate native platforms, the proposed solution operates exclusively within standard Web browsers, eliminating the requirement for additional installations. Performance evaluations demonstrate the system’s proficiency in delivering low-latency, directionally precise sound localization in real time. These findings suggest promising applications within the interactive media domain and underscore the burgeoning potential of the Web platform for advanced multimedia processing.

Paper Nr: 29
Title: Automated Test Generation Using LLM Based on BDD: A Comparative Study
Authors: Shexmo Richarlison Ribeiro dos Santos, Luiz Felipe Cirqueira dos Santos, Marcus Vinicius Santana Silva, Marcos Cesar Barbosa dos Santos, Mariano Florencio Mendonça, Marcos Venicius Santos, Marckson Fábio da Silva Santos, Alberto Luciano de Souza Bastos, Sabrina Marczak, Michel S. Soares and Fabio Gomes Rocha

Abstract: In Software Engineering, seeking methods that save time in product development and improve delivery quality is essential. BDD (Behavior-Driven Development) offers an approach that, through creating user stories and acceptance criteria in collaboration with stakeholders, aims to ensure quality through test automation, allowing the validation of criteria for product acceptance. The lack of test automation poses a problem, requiring manual work to validate acceptance. To address test automation in BDD, we conducted an experiment using standardized prompts based on user stories and acceptance criteria written in Gherkin syntax, automatically generating tests with four Large Language Models (ChatGPT, Gemini, Grok, and GitHub Copilot). The experiment compared the following aspects: response similarity, test coverage with respect to acceptance criteria, accuracy, efficiency in the time required to generate the tests, and clarity. The results showed that the LLMs have significant differences in their responses, even with similar prompts. We observed variations in test coverage and accuracy, with ChatGPT standing out in both cases. In terms of time efficiency, Grok was the fastest while Gemini was the slowest. Finally, regarding the clarity of the responses, ChatGPT and GitHub Copilot were similar and more effective than the others. The results show that the LLMs adopted in the study can understand and generate automated tests accurately. However, they do not yet eliminate the need for human assessment; rather, they serve as a support to speed up the automation process.

Paper Nr: 43
Title: Evaluating Use of ARQ Strategies in Communication Protocols for Search and Rescue
Authors: Antonello Calabrò, Eda Marchetti and Maria Teresa Paratore

Abstract: Search and Rescue (SAR) operations often occur in remote and challenging environments where conventional communication infrastructures are unavailable or unreliable. Effective communication is crucial for mission success. Low-power wide area network (LPWAN) protocols, particularly the LORA (Long Range) protocol, have gained traction due to their low power consumption and extended range. However, LORA’s low reliability presents significant challenges in the time-sensitive context of SAR operations, necessitating effective communication strategies. This paper examines the reliability of communication protocols in real scenarios, focusing on the Stop & Wait (S&W) and Selective Repeat (SR) Automatic Repeat reQuest (ARQ) protocols. It evaluates their suitability by addressing operational constraints such as geographic barriers, time sensitivity, and simplicity of implementation. Key contributions include a review of the current literature, a real-world implementation of the ARQ mechanisms, and a comparative analysis of the two protocols under real-world conditions, considering both computational constraints and deployment feasibility.
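
To make the Stop & Wait mechanism concrete, a minimal sender-side sketch in Python follows; the one-byte frame header, timeout value, and retry limit are illustrative choices, not the implementation evaluated in the paper.

    import socket

    TIMEOUT_S = 2.0    # retransmission timeout (placeholder value)
    MAX_RETRIES = 5

    def stop_and_wait_send(sock: socket.socket, payload: bytes, seq: int) -> bool:
        """Send one frame and block until the matching ACK arrives."""
        frame = bytes([seq]) + payload          # 1-byte sequence-number header
        sock.settimeout(TIMEOUT_S)
        for _ in range(MAX_RETRIES):
            sock.send(frame)
            try:
                ack = sock.recv(1)
                if ack and ack[0] == seq:       # ACK echoes the sequence it confirms
                    return True                 # delivered; caller alternates seq (0/1)
            except socket.timeout:
                continue                        # lost frame or ACK: retransmit
        return False                            # give up after MAX_RETRIES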

Paper Nr: 45
Title: AI Model Cards: State of the Art and Path to Automated Use
Authors: Ali Mehraj, An Cao, Kari Systä, Tommi Mikkonen, Pyry Kotilainen, David Hästbacka and Niko Mäkitalo

Abstract: In software engineering, the integration of machine learning (ML) and artificial intelligence (AI) components into modern web services has become commonplace. To comply with evolving regulations, such as the EU AI Act, the development of AI models must adhere to the principles of transparency. This includes the training data used, the intended use, potential biases, and the risks associated with these models. To support these goals, documents named Model Cards were introduced to standardize ethical reporting and allow stakeholders to evaluate models based on various goals. In our ongoing research, we aim to automate risk analysis and regulatory compliance checks in software systems. We envision that model cards can serve as useful tools to achieve this goal. Given the evolving format of model cards over time, we conducted a review of the current state and practice of model cards, analyzing 90 model cards from four model repositories to assess their relevance to our vision. The study’s contribution is a thorough analysis of the model cards’ structure and content, as well as their ethical reporting. Our study reveals the variance in information reporting, the loose structure, and the lack of ethical reporting in the model cards. Based on the findings, we propose a unified model card template that aims to enhance the structure, promote greater transparency, and establish a foundation for future machine-interpretable AI model cards.

Paper Nr: 64
Title: Performance Evaluation of REST and GraphQL API Models in Microservices Software Development Domain
Authors: Mohamed S. M. Elghazal, Adel Aneiba and Essa Q. Shahra

Abstract: This study presents a comprehensive comparative analysis of REST and GraphQL API models within the context of microservices development, offering empirical insights into the strengths and limitations of each approach. The research explores the effectiveness and efficiency of GraphQL versus REST, focusing on their impact on critical software quality metrics and user experience. Using a controlled experimental setup, the study evaluates key performance indicators, including response time, data transfer efficiency, and error rates. The findings reveal that REST APIs demonstrate superior memory efficiency and faster response times, particularly under high-load conditions, making them a reliable choice for performance-critical microservices. On the other hand, GraphQL excels in offering greater flexibility for data retrieval, but exhibits higher response times and higher error rates when handling complex queries. This research provides a nuanced understanding of the trade-offs between the REST and GraphQL API interaction models, offering actionable guidance to developers and researchers in selecting the optimal API model for microservice-based applications. The insights are particularly valuable for balancing considerations such as performance, flexibility, and reliability in real-world implementations.
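
The trade-off between the two models is easiest to see side by side; the following sketch uses the Python requests library, with endpoint paths and field names invented for illustration.

    import requests

    BASE = "https://api.example.com"  # hypothetical microservice gateway

    # REST: fixed resource shapes; fetching nested data may take several round trips.
    order = requests.get(f"{BASE}/orders/42").json()
    customer = requests.get(f"{BASE}/customers/{order['customerId']}").json()

    # GraphQL: one request with client-chosen fields; the server resolves the nesting,
    # at the cost of more server-side work for complex queries.
    query = """
    { order(id: 42) { total customer { name email } } }
    """
    result = requests.post(f"{BASE}/graphql", json={"query": query}).json()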

Paper Nr: 93
Title: IMMBA: Integrated Mixed Models with Bootstrap Analysis - A Statistical Framework for Robust LLM Evaluation
Authors: Vinícius Di Oliveira, Pedro Carvalho Brom and Li Weigang

Abstract: Large Language Models (LLMs) have advanced natural language processing across diverse applications, yet their evaluation remains methodologically limited. Standard metrics such as accuracy or BLEU offer aggregate performance snapshots but fail to capture the inherent variability of LLM outputs under prompt changes and decoding parameters like temperature and top-p. This limitation is particularly critical in high-stakes domains, such as legal, fiscal, or healthcare contexts, where output consistency and interpretability are essential. To address this gap, we propose IMMBA: Integrated Mixed Models with Bootstrap Analysis, a statistically principled framework for robust LLM evaluation. IMMBA combines Linear Mixed Models (LMMs) with bootstrap resampling to decompose output variability into fixed effects (e.g., retrieval method, decoding configuration) and random effects (e.g., prompt phrasing), while improving estimation reliability under relaxed distributional assumptions. We validate IMMBA in a Retrieval-Augmented Generation (RAG) scenario involving structured commodity classification under the Mercosur Common Nomenclature (NCM). Our results demonstrate that IMMBA isolates meaningful performance factors and detects significant interaction effects across configurations. By integrating hierarchical modelling and resampling-based inference, IMMBA offers a reproducible and scalable foundation for evaluating LLMs in sensitive, variance-prone settings.
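
A minimal sketch of the statistical core follows, combining a statsmodels mixed model with a naive group-level bootstrap; the column names, formula, and replication count are illustrative assumptions, not the IMMBA reference implementation.

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    # Hypothetical evaluation table: one row per LLM run.
    df = pd.read_csv("llm_runs.csv")  # columns: score, retrieval, temperature, prompt_id

    # Fixed effects for configuration, random intercept per prompt phrasing.
    model = smf.mixedlm("score ~ retrieval * temperature", df, groups=df["prompt_id"])
    print(model.fit().summary())

    # Naive bootstrap of the estimates by resampling whole prompt groups;
    # a production version would skip non-converging refits.
    rng = np.random.default_rng(0)
    prompts = df["prompt_id"].unique()
    estimates = []
    for _ in range(200):
        sample_ids = rng.choice(prompts, size=len(prompts), replace=True)
        boot = pd.concat([df[df["prompt_id"] == p] for p in sample_ids])
        fit = smf.mixedlm("score ~ retrieval * temperature", boot,
                          groups=boot["prompt_id"]).fit()
        estimates.append(fit.params)
    print(pd.DataFrame(estimates).quantile([0.025, 0.975]))  # bootstrap intervals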

Short Papers
Paper Nr: 13
Title: Performance and Scalability of Frontends in Processing Vectorized Text Data: A Comparison Between Traditional Monolithic and Modular Frontends
Authors: Luiz Felipe Cirqueira dos Santos, Alberto Luciano de Souza Bastos, Shexmo Richarlison Ribeiro dos Santos, Mariano Florencio Mendonça, Marcus Vinicius Santana Silva, Marcos Cesar Barbosa dos Santos, Marcos Venicius Santos, Marckson Fábio da Silva Santos, Fabio Gomes Rocha and Michel S. Soares

Abstract: Text vectorization is essential for information search and retrieval systems, requiring efficient front-end architectures. This study compares monolithic and modular approaches for vectorized data processing, evaluating performance and scalability. Using Dynamic Capacity Theory as a basis, we developed two functionally equivalent implementations and evaluated them quantitatively using the Lighthouse tool (Chrome DevTools) in Timespan and Snapshot modes. Results show clear tradeoffs: while the monolithic architecture presents better initial performance, the modular solution shows superior scalability for large data volumes. Our conclusions provide practical guidelines for architectural selection based on specific vector processing requirements, contributing to the optimization of high-demand web systems.

Paper Nr: 20
Title: Influence of UX Factors on User Behavior: The Critical Incident Technique
Authors: Jessica Kollmorgen, Yaprak Turhan, María José Escalona and Jörg Thomaschewski

Abstract: Measuring user experience (UX) is essential for strengthening user loyalty and product success. Usage frequency plays a central role, as it can both influence and be influenced by UX. Standard measurement methods like questionnaires can assess UX factors and calculate their impact on usage frequency. Alongside UX factors, socio-demographic data like gender are also collected in practice, as they can affect usage patterns depending on the product. However, the question remains whether additional holistic UX factors exist that are not yet captured in standard UX questionnaires. Understanding these could improve UX and raise long-term usage. To investigate the possibility of new factors, we applied the Critical Incident Technique (CIT), a method from psychology, to UX research. Using Netflix as an example, we employed CIT in a questionnaire to capture very positive/negative (“critical”) user experiences and conducted 12 interviews to assess the incidents’ influence on usage frequency. Beyond the known UX factors, the study identified additional holistic factors such as Nostalgia and Anticipation, which were also shown to impact usage frequency. Overall, CIT proves to be a promising method for capturing holistic UX factors, providing a foundation for future research into the context of use.

Paper Nr: 24
Title: Extending the GeoJSON Standard with Deontic Logic Policies
Authors: Benjamin Aziz

Abstract: GeoJSON is a widely used format for encoding geographic data structures using the JSON format. It enables easy integration of spatial data in Web applications and supports various geometries like points, lines and polygons. However, its openness and simplicity can introduce safety and security challenges. Amongst these is the inability to express policy rules related to a geometry, leading to potential security and safety risks for any critical geofencing solutions that use the standard. In this paper, we propose an extension to the GeoJSON standard that adds a policy element to geometric features, such that the policy consists of a set of rules to be evaluated when the geometry is activated (i.e. stepped on, crossed over or entered/exited). We use deontic logic to express such rules, and demonstrate the usefulness of this approach on a couple of potential real-world examples.
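
A sketch of what such an extended Feature might look like follows, written as a Python dict mirroring GeoJSON; the policy member, its field names, and the trigger values are invented here to illustrate the idea, not the paper's exact schema.

    # GeoJSON Feature extended with a hypothetical "policy" member.
    # Deontic modalities: "obligation", "permission", "prohibition".
    geofence = {
        "type": "Feature",
        "geometry": {
            "type": "Polygon",
            "coordinates": [[[-1.09, 50.80], [-1.07, 50.80],
                             [-1.07, 50.81], [-1.09, 50.81], [-1.09, 50.80]]],
        },
        "properties": {"name": "restricted harbour zone"},
        "policy": {                      # proposed extension (field names illustrative)
            "rules": [
                {"modality": "prohibition", "action": "enter",
                 "trigger": "entered", "subject": "unauthorised-vessel"},
                {"modality": "obligation", "action": "report-position",
                 "trigger": "crossed", "subject": "any-vessel"},
            ]
        },
    }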

Paper Nr: 41
Title: Go-Pregel: A User-Friendly Framework for Distributed Graph Processing
Authors: Gabriel Gandour, Celso Massaki Hirata and Juliana de Melo Bezerra

Abstract: Graphs are widely used for tasks such as visualization and decision-making. When dealing with large-scale graphs, efficient storage and computation become critical. To address these challenges, distinct tools have been developed to support the implementation and execution of distributed graph algorithms. These tools simplify the development process by abstracting the underlying distribution mechanisms, making them largely transparent to the end user. However, to optimize and extend these implementations, developers must have a solid understanding of distributed computing concepts, such as communication, coordination, concurrency, and scalability, which are essential for effectively managing distributed graph processing. This work aims to explore the fundamental principles of distributed computing in the context of graph processing. To support this, we introduce Go-Pregel, a framework implemented in Golang and inspired by the core concepts of Google’s Pregel. The proposed Go-Pregel serves as a flexible experimental platform for both educational and research purposes, enabling users to better understand the underlying mechanisms of distributed systems and graph processing.

Paper Nr: 42
Title: ACCURATE - eternAl infrastruCture for seCUrity in softwaRe and hArdware developmenT and assessmEnt
Authors: Antonello Calabrò, Eda Marchetti and Sanaz Nikghadam-Hojjati

Abstract: This paper addresses the increasing complexity of cybersecurity and the need for compliance with evolving EU regulations, highlighting the limitations of traditional software and hardware development processes in managing security, trust, and long-term compliance. To bridge these gaps, the paper proposes a novel lifecycle and supporting architecture named ACCURATE (eternal infrastructure for security in software and hardware development and assessment). ACCURATE is inspired by the DevOps approach and integrates continuous real-time monitoring, detection, and vulnerability management throughout the entire lifecycle. ACCURATE is designed for software and hardware development, as well as post-development continuous assessment. The main novelty is conceiving the “Eternal” stage, focusing on ongoing post-deployment assessment and protection, ensuring systems remain resilient against emerging threats. ACCURATE aims to transform the security landscape by embedding continuous safeguarding mechanisms throughout the development and operational stages, ultimately ensuring the integrity and reliability of both software and hardware systems in a rapidly evolving technological environment.

Paper Nr: 47
Title: Towards an Integrative Approach Between the SofIA Methodology and ChatGPT for the Extraction of Requirements from User Interviews
Authors: P. Peña-Fernández, I. Ruiz-Marchueta, J. A. García-García and M. J. Escalona Cuaresma

Abstract: The elicitation and specification of software requirements are critical activities in software engineering, usually involving interviews between analysts and end users. These interactions are essential for understanding user needs but can lead to inconsistencies or incomplete information in the subsequent generation of use cases. This paper explores the integration of ChatGPT with the SofIA software methodology to address these challenges, leveraging natural language processing capabilities to enhance the transformation of interview transcripts into detailed use cases. The proposed approach combines the structured guidance of SofIA with ChatGPT’s ability to process and generate coherent textual outputs, facilitating the automated identification, categorisation, and refinement of requirements. A proof of concept in a real-world software development scenario was conducted to evaluate this integration, focusing on metrics such as accuracy, completeness, and time efficiency. This work contributes to the advancement of requirements engineering by introducing a semi-automated, user-centred approach that bridges the gap between human interviews and formal documentation. Future research directions include scaling the approach to more complex domains and refining its adaptability to diverse project requirements.

Paper Nr: 53
Title: Modelling Goals for Complex Problems: An Approach on the SofIA Methodology
Authors: F. Gracia Ahufinger, Javier J. Gutiérrez, J. A. García-García and María-José Escalona

Abstract: Complex Problem Solving (CPS) is a paradigm in modern software development. Modelling goals for complex requirements is a challenge that the meta-model of SofIA, the Software Methodology for Industrial Application, addresses by leveraging the Cynefin framework to define complexity and employing Scrum to manage iterative development. The key contributions of this article are the introduction of new meta-model elements that facilitate goal-oriented modelling within the SofIA framework, the establishment of relationships between goals and the various artefacts developed during the construction of Information Systems, and a practical application of the extended SofIA meta-model, demonstrated through a case study that shows its effectiveness in a real-world project. The paper provides an example of integrating Cynefin and Scrum within a Model-Driven (Software) Engineering (MDE) context to tackle CPS. The extended SofIA approach aims to improve decision-making and project success by defining clear objectives and iteratively evaluating their adequacy and impact on overall system development.

Paper Nr: 63
Title: Toward Decentralized Digital Asset Management on the Blockchain
Authors: Luiz Vasconcelos Júnior, Bryan Diniz Borck, Celso Massaki Hirata and Juliana de Melo Bezerra

Abstract: The emergence of cryptocurrencies has significantly altered the financial landscape, introducing both opportunities and challenges, particularly within asset management. By harnessing the immutability and transparency of blockchain technology, our goal is to present a model that enables the decentralization of the traditional hedge fund industry by removing intermediaries. Our approach integrates smart contracts and pioneering standards like ERC-6551, offering a way for managers to hold and operate investors’ assets securely. Furthermore, the model facilitates decentralized decision-making, enabling investors to participate in fund governance using voting mechanisms. This work outlines a concrete path toward decentralizing asset management, marking a significant step in reshaping traditional practices and fostering innovation in the financial sector.

Paper Nr: 72
Title: A Job Finder Chatbot-Based Web Platform: A Use Case for Software Engineers
Authors: Panagiotis Fotiadis, Georgia M. Kapitsaki and Maria Papoutsoglou

Abstract: Finding a job requires browsing through a vast number of position openings, usually on various online sites. This process becomes more complicated when users are considering different potential job locations. Even though tools that automate the process exist (e.g. sonara.ai, dream.jobs), they lack interactivity with the user and the integration of external resources that contain information on the user’s skills. With the aim of bridging the gap between job seekers and potential employers by matching resumes and job listings more efficiently and effectively, in this work we introduce an AI-driven, chatbot-based web platform to assist job seeking. The initial implementation targets the job-seeking needs of software engineers, but more disciplines can easily be added. We have integrated the developer’s CV and GitHub account, and we describe the design and implementation process of the web platform; a small-scale user evaluation has also been performed. We argue that the chatbot can be used as a starting point for similar job-seeking assistants.

Paper Nr: 73
Title: Design and Implementation of a Real-Time Web Infrastructure for Student Monitoring: A Kafka-Based Plugin for Moodle
Authors: Rima Kilany Chamoun, Wadad Wazen and Mario Gharib

Abstract: Modern Learning Management Systems (LMS) require increasingly responsive and scalable infrastructures to support real-time learning analytics. This paper presents the design and implementation of a robust technical architecture that integrates Moodle, an open-source LMS, with Apache Kafka, a distributed streaming platform, to enable real-time student performance monitoring. The proposed solution captures high-velocity event data from Moodle (e.g., assignment submissions, quiz attempts, forum activity) and routes it through dynamically generated Kafka topics into a scalable pipeline, where it is processed in real time and stored in MongoDB for downstream analysis. This infrastructure supports immediate visualization of engagement data, threshold-triggered alerts, and seamless extensibility toward predictive analytics using Kafka Streams and machine learning models. The system demonstrates how architectural innovations in event-driven web applications can be applied to education, enabling data-driven interventions and advancing the capabilities of LMS platforms beyond traditional batch-based reporting.
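
A minimal sketch of the described pipeline follows, using the kafka-python and pymongo libraries; the topic naming scheme and event fields are illustrative, not the plugin's actual schema.

    import json
    from kafka import KafkaProducer, KafkaConsumer   # pip install kafka-python
    from pymongo import MongoClient

    # Producer side (bridging from the Moodle plugin): publish one learning event.
    producer = KafkaProducer(bootstrap_servers="localhost:9092",
                             value_serializer=lambda v: json.dumps(v).encode())
    event = {"user_id": 17, "course_id": 3, "type": "quiz_attempt", "grade": 8.5}
    producer.send("moodle.course-3.events", event)    # per-course topic (illustrative)
    producer.flush()

    # Consumer side: stream events into MongoDB for dashboards and alerts.
    consumer = KafkaConsumer("moodle.course-3.events",
                             bootstrap_servers="localhost:9092",
                             value_deserializer=lambda v: json.loads(v.decode()))
    collection = MongoClient()["lms"]["events"]
    for message in consumer:
        collection.insert_one(message.value)          # downstream analytics read from here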

Paper Nr: 90
Title: From Data to Warnings: Challenges in Building in-Vehicle Data-Driven Hazard Warning Systems
Authors: Alexander Stocker and Gerald Musser

Abstract: Data offers a strong potential for advanced, data-driven services such as in-vehicle hazard warning systems. As data spaces and ecosystems mature, access to relevant assets for these applications will grow. This paper reviews the state of driver warning and reports on a project that developed a prototypical data-driven hazard warning system to alert drivers to potential route dangers. We present its architecture and key implementation challenges, including backend event generation, frontend warning mechanisms, data availability and integration, transformation of heterogeneous inputs into actionable warnings, definition of warning logics, handling of data validity and expiration, and human factors such as modality and user acceptance. By addressing these challenges through our prototype, the paper highlights technical and systemic requirements for dependable, data-driven warning applications in the evolving mobility data ecosystem.

Paper Nr: 100
Title: Paper-Based Health Records: A Case Study on the Digitization of Handwritten Clinical Records
Authors: Vincenza Carchiolo, Michele Malgeri and Lorenzo Spadaro Sapari

Abstract: This paper presents a case study focused on the application of handwriting recognition to digitize historical clinical records containing significant handwritten content. The primary objective is to assess the feasibility of using commercial OCR technologies, in particular Microsoft Azure’s handwriting recognition API, for processing health documents. The study aims to determine whether these tools can support the extraction of meaningful clinical information, not only by recognizing individual characters but also by leveraging the structural layout of documents, such as forms, to infer semantic content. Our methodology includes empirical evaluation of OCR output on real-world patient records, alongside a qualitative analysis of common recognition errors. In addition, we review relevant approaches from the literature, highlighting recent advances in deep learning for document understanding. The findings indicate that general-purpose OCR systems are currently insufficient for reliable clinical data extraction in such contexts, primarily due to the complexity and variability of handwritten medical records. However, the results also suggest that structural cues present in form-based documents could be harnessed, through tailored AI-based techniques, to significantly improve recognition and downstream information retrieval.

Paper Nr: 15
Title: Are We Building Sustainable Software? Adoption, Challenges, and Early-Stage Strategies
Authors: Thalita Reis, André Araújo, Rodrigo Gusmão, Artur Farias, José Silva and Alenilton Silva

Abstract: The growing environmental impact of digital systems has brought sustainability to the forefront of software engineering research and practice. Green Software Engineering proposes a set of principles and practices aimed at reducing the energy consumption and carbon footprint of software systems. This study investigates the extent to which sustainable development practices are being adopted in the software industry and identifies the software life cycle stages in which they are applied. A structured literature review was conducted to analyze empirical evidence based on a set of design and coding practices focused on sustainability. The results reveal a fragmented and predominantly reactive adoption of these practices, with an emphasis on the development and maintenance phases. In contrast, earlier stages such as requirements elicitation and prototyping remain largely unexplored. The study also identifies key challenges, including the lack of standardization, limited real-world validation, and the absence of practical frameworks aligned with organizational processes. These findings highlight the need for comprehensive strategies and tools to support the integration of sustainability into all phases of the software development life cycle.

Paper Nr: 26
Title: Guidelines for Adopting Behavior-Driven Development (BDD): A Case Study
Authors: Shexmo Richarlison Ribeiro dos Santos, Marcus Vinicius Santana Silva, Marcos Cesar Barbosa dos Santos, Luiz Felipe Cirqueira dos Santos, Mariano Florencio Mendonça, Marcos Venicius Santos, Marckson Fábio da Silva Santos, Alberto Luciano de Souza Bastos and Fábio Gomes Rocha

Abstract: Context: To analyze the effectiveness of adopting Behavior-Driven Development (BDD) with agile teams that have not previously used this framework. Problem: Lack of a study that consolidates guidelines for adopting BDD. Solution: To achieve the proposed objective, the authors carried out two previous studies: the first identified the state of the art of BDD through a Systematic Multivocal Literature Review; subsequently, the state of the practice was analyzed through a Survey of professionals who use BDD in their work activities. This research then sought to validate the guidelines found. Method: A case study was carried out in a private company focused on security software development, with an agile team that did not use BDD. Summarization of results: The results showed the effectiveness and validity of the previously identified guidelines for adopting BDD. They also revealed a need to improve communication among stakeholders, the only problem found in BDD’s adoption. Contributions and impact: The main contribution of this study is the set of guidelines for adopting BDD, which supports both academia and industry in understanding and implementing this framework.

Paper Nr: 54
Title: A Model-Based Approach to the Definition of Collaborative Processes in Supply Chain Environments
Authors: Mohsen Khorram Dastjerdi, J. A. García-García, J. G. Enríquez and M. J. Escalona Cuaresma

Abstract: Global supply chains involve multiple actors and complex interactions, often spanning multiple organizations and geographic regions. However, issues such as limited process automation, coordination gaps, and concerns about data security and integrity continue to hinder collaboration in these networks. This paper addresses these challenges by presenting a metamodel to support secure and seamless collaboration in the diamond jewelry supply chain in Spain. Based on business process management standards, the metamodel improves traceability, coordination, and automation across organizational boundaries. The metamodel includes mechanisms for hash-based certification, certificate of origin, and shared traceability rules that facilitate the exchange of transparent information while maintaining data integrity. A case study focused on diamond certification from Lesotho to Spain demonstrates the feasibility and benefits of this approach. The results show improvements in process efficiency, fraud reduction, and stakeholder trust, especially in high-value and highly regulated supply chains.

Paper Nr: 75
Title: Leaving the Tech Debt Behind: How to Sustainably Improve the User and Developer Experience of a Legacy Frontend by Designing, Building and Migrating to a New Web Client
Authors: Friedrich Maiwald, Nina Weber, Kay Massow and Ilja Radusch

Abstract: Legacy web applications often suffer from declining user and developer experience, driven by technical debt, outdated technologies, and architectural complexity. This paper presents a structured approach for modernizing such systems, targeting simultaneous improvements in user experience (UX) and developer experience (DX). The process encompasses strategy selection, requirements elicitation and prioritization, technology evaluation, and construction best practices, all grounded in research and practical guidelines. Evaluation methods are defined for both UX and DX, combining established frameworks and actionable technical metrics. The approach is applied to the real-world migration of a legacy web application to a new frontend, using a gradual Strangler Fig strategy. The case study demonstrates how well-founded decisions with stakeholder involvement, modular architecture, modern tooling, and resilient testing can break free from legacy constraints. Quantitative and qualitative results show substantial gains in user satisfaction, codebase health, and developer productivity. The findings suggest that systematic modernization not only resolves immediate issues but enables sustainable, maintainable web applications. Future work should explore advanced quality assessment, long-term effects, and the integration of AI to support decision-making and automation in the modernization process.

Paper Nr: 83
Title: Improving Cybersecurity for Smart Home Systems
Authors: Tauheed Waheed, Eda Marchetti and Antonello Calabrò

Abstract: Smart home systems consist of various interconnected devices that can be vulnerable to security risks, potentially compromising the integrity of the entire system. This paper aims to address cybersecurity challenges and identify research gaps in order to enhance cybersecurity for smart home systems. We highlight the significance and impact of testing on smart home systems, with a focus on the drawbacks of current testing strategies. Our analysis of the limitations of current testing methodologies identified a lack of user involvement in the testing process. We therefore propose our user-centric SCTM (Smart-home Cybersecurity Testing Methodology) and its behavioral model to strengthen cybersecurity in smart home systems.

Paper Nr: 85
Title: Towards a Progressive Scalability for Modular Monolith Applications
Authors: Maurício Carvalho, Juliana de Melo Bezerra and Karla Donato Fook

Abstract: Cloud-native software startups face intense pressure from limited resources, high uncertainty, and the need for rapid validation. In this context, early architectural decisions have lasting effects on scalability, maintainability, and adaptability. Although microservices are often favored for their modularity, they introduce significant operational overhead and require organizational maturity that many startups lack. Traditional monoliths offer simplicity but tend to evolve into rigid, tightly coupled systems. When designed with disciplined modularity, modular monoliths can offer internal boundaries that support sustainable growth while avoiding the fragmentation and complexity of premature microservices adoption. The existing literature emphasizes microservices, leaving gaps in guidance for modular monoliths on topics like modularization, scalability, onboarding, and deployment. This paper proposes guidelines for designing scalable modular monoliths, maintaining architectural flexibility, and reducing complexity, thereby supporting long-term evolution under typical startup constraints. The initial category of guidelines is presented, and their intended structure is thoroughly outlined.

Area 3 - Social Network Analytics

Short Papers
Paper Nr: 59
Title: Exploring Audience Reactions on YouTube: An Approach to Sentiment and Toxicity Analysis
Authors: André F. Rollwagen, Gabriel Zurawski, Stefano Carraro, Roberto Tietzmann, Marcelo C. Fontoura and Isabel H. Manssour

Abstract: Social media platforms like YouTube have become central spaces for public expression, allowing users to share their emotional reactions and, at times, toxic content. Understanding these reactions requires scalable and reproducible approaches that can handle informal, user-generated content. This paper presents an integrated approach for analyzing sentiment polarity and detecting toxic speech in YouTube video comments written in Portuguese. For this, we developed a set of Python scripts that automate data collection and apply Natural Language Processing (NLP) techniques to perform both tasks. These scripts are publicly available and can be adapted for use in various video and social analysis contexts. Interactive visualizations were also generated to support the interpretation of results. The applicability of the approach is demonstrated through two case studies involving highly controversial videos, which allow us to explore the relationship between sentiment, toxicity, and audience engagement patterns. The results provide valuable insights into the dynamics of public discourse and offer tools for future research on audience speech analysis on YouTube.
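
As an indication of the collection step, a minimal sketch against the YouTube Data API v3 commentThreads endpoint follows; the endpoint and parameters are standard, while the video id and the downstream classifier are placeholders for whatever Portuguese NLP model is used.

    import requests

    API_KEY = "YOUR_KEY"
    VIDEO_ID = "VIDEO_ID_HERE"  # placeholder video id
    url = "https://www.googleapis.com/youtube/v3/commentThreads"
    params = {"part": "snippet", "videoId": VIDEO_ID, "maxResults": 100, "key": API_KEY}

    comments = []
    while True:
        page = requests.get(url, params=params).json()
        for item in page.get("items", []):
            comments.append(item["snippet"]["topLevelComment"]["snippet"]["textDisplay"])
        if "nextPageToken" not in page:
            break
        params["pageToken"] = page["nextPageToken"]   # follow pagination

    # classify_pt() stands in for a Portuguese sentiment/toxicity model:
    # results = [classify_pt(c) for c in comments]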

Paper Nr: 33
Title: A Long Short-Term Memory (LSTM) Neural Architecture for Presaging Stock Prices
Authors: Tej Nileshkumar Doshi, Shubham Ghadge, Yamini Gonuguntla, Namirah Imtieaz Shaik, Ashutosh Mathore and Bonaventure Chidube Molokwu

Abstract: Stock price prediction is crucial for informed investment decisions (Bathla, 2020; Hochreiter and Schmidhuber, 1997). This study explores the application of Long Short-Term Memory (LSTM) architecture for analyzing and predicting stock prices of major technology companies: Alphabet Inc. (GOOG), Apple Inc. (AAPL), NVIDIA Corporation (NVDA), Meta Platforms, Inc. (META), and Tesla Inc. (TSLA). The fundamental challenge addressed is capturing temporal dependencies and complex patterns in financial time series data, which traditional statistical methods often fail to model accurately (Box et al., 1978; Hyndman and Athanasopoulos, 2013). Our methodology involved collecting historical stock data from the Yahoo Finance API (Edwards et al., 2018), preprocessing through normalization and sequence creation (Hochreiter and Schmidhuber, 1997), and training separate LSTM models for each stock. Results indicate that LSTM models provide satisfactory accuracy with R² scores exceeding 0.93 for most stocks (Li et al., 2023; Selvin et al., 2017), capturing both short-term and long-term patterns (Panchal et al., 2024; Ouf et al., 2024). The implications are significant for investors and financial analysts seeking enhanced predictive tools for market forecasting (Pramod and Pm, 2020).
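
A compact sketch of the modelling recipe follows, using yfinance and Keras; the window length, layer sizes, and training settings are illustrative defaults, not the paper's exact configuration.

    import numpy as np
    import yfinance as yf
    from sklearn.preprocessing import MinMaxScaler
    from tensorflow import keras

    # Daily closing prices for one ticker, scaled to [0, 1].
    close = yf.download("AAPL", start="2015-01-01")["Close"].values.reshape(-1, 1)
    scaled = MinMaxScaler().fit_transform(close)

    # Sliding windows: 60 past days predict the next-day price.
    WINDOW = 60
    X = np.array([scaled[i - WINDOW:i, 0] for i in range(WINDOW, len(scaled))])
    y = scaled[WINDOW:, 0]
    X = X.reshape(-1, WINDOW, 1)

    model = keras.Sequential([
        keras.layers.LSTM(50, return_sequences=True, input_shape=(WINDOW, 1)),
        keras.layers.LSTM(50),
        keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")
    model.fit(X, y, epochs=10, batch_size=32, validation_split=0.1)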

Area 4 - Web Intelligence and Semantic Web

Full Papers
Paper Nr: 56
Title: A Context-Enriched Hybrid ARIMAX–Deep Learning Framework for Robust Cryptocurrency Price Forecasting
Authors: Gerasimos Vonitsanos, Andreas Kanavos and Phivos Mylonas

Abstract: The inherent volatility and nonlinear dynamics of cryptocurrency markets pose substantial challenges to accurate price forecasting. This paper proposes a novel context-enriched hybrid modeling framework that integrates classical time series analysis with deep learning techniques to enhance prediction accuracy for Bitcoin price movements. A comprehensive evaluation is conducted on ARIMA, ARIMAX, Support Vector Machines (SVM), and Long Short-Term Memory (LSTM) networks using high-resolution market data from 2019 to 2024. The framework leverages exogenous variables (such as trading volume, market capitalization, and moving averages) to enrich model inputs and capture contextual signals. Experimental results demonstrate that hybrid configurations, particularly ARIMAX-based models, consistently achieve the lowest Root Mean Squared Error (RMSE) and highest coefficient of determination (R²), closely tracking real market trends. These findings confirm the effectiveness of combining statistical rigor with the nonlinear learning capabilities of deep architectures. Furthermore, the study highlights the potential of extending this approach with ensemble strategies for even greater robustness. This work contributes to the development of accurate, data-driven forecasting tools for decision-making in highly dynamic and speculative digital asset markets.
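
A minimal sketch of the ARIMAX component follows, using statsmodels; the input file, column names, and (p, d, q) order are illustrative assumptions.

    import pandas as pd
    from statsmodels.tsa.statespace.sarimax import SARIMAX

    # Hypothetical Bitcoin frame: close price plus contextual regressors.
    df = pd.read_csv("btc.csv", parse_dates=["date"], index_col="date")
    exog = df[["volume", "market_cap", "ma_7"]]       # exogenous context signals

    model = SARIMAX(df["close"], exog=exog, order=(2, 1, 2))
    fit = model.fit(disp=False)

    # One-step-ahead forecasts require future exogenous values (here: last known row).
    print(fit.forecast(steps=1, exog=exog.iloc[[-1]]))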

Paper Nr: 60
Title: Deep Learning for Multimedia Feature Extraction for Personalized Recommendation
Authors: Aymen Ben Hassen, Sonia Ben Ticha and Anja Habacha Chaibi

Abstract: The analysis of multimedia content plays a crucial role in various computer vision applications, and digital multimedia constitutes a major part of multimedia data. In recent years, the multimedia content of products has gained increasing attention in recommendation systems, since the visual appearance of products has a significant impact on users’ decisions. The main goal of personalized recommender systems is to offer users recommendations that reflect their personal preferences. In recent years, deep learning models have demonstrated strong performance and great potential in utilizing multimedia features, especially for videos and images. This paper presents a new approach that utilizes multimedia content to build a personalized user model. We employ deep learning techniques to extract latent features from the video content of items, which are then associated with user preferences to build the personalized model. This model is subsequently incorporated into a Collaborative Filtering (CF) approach to provide recommendations and enhance their accuracy. We experimentally evaluate our approach using the MovieLens dataset and compare our results with those of other methods that deal with different textual and image attributes describing items.
Download

Paper Nr: 66
Title:

Automated Evaluation of Database Conversational Agents

Authors:

Matheus O. Silva, Eduardo R. S. Nascimento, Yenier T. Izquierdo, Melissa Lemos and Marco A. Casanova

Abstract: Database conversational agents support dialogues that help users interact with databases in their own jargon. A strategy for constructing such agents is to adopt an LLM-based architecture. However, evaluating agent-based systems is complex and lacks a definitive solution, as responses from such systems are open-ended, with no direct relationship between the input and the expected response. This paper therefore focuses on the problem of evaluating LLM-based database conversational agents. It first introduces a tool that constructs test datasets for such agents by exploring the schema and the data values of the underlying database. The paper then describes an evaluation agent that behaves like a human user to assess the responses of a database conversational agent on a test dataset. Finally, the paper includes a proof-of-concept experiment with an implementation of a database conversational agent over two databases: the Mondial database and an industrial database in production at an energy company.
Download
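A hedged sketch of the schema-exploring idea behind the test-dataset tool: enumerate tables and columns, sample data values, and pair templated questions with ground-truth answers. SQLite introspection and the question template are illustrative assumptions, not the paper's implementation.

    # Hedged sketch: build (question, ground truth) pairs from a database.
    import sqlite3

    conn = sqlite3.connect("mondial.db")  # hypothetical local copy
    tests = []
    for (table,) in conn.execute("SELECT name FROM sqlite_master WHERE type='table'"):
        cols = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
        for (value,) in conn.execute(f"SELECT {cols[0]} FROM {table} LIMIT 3"):
            question = f"Which rows of {table} have {cols[0]} equal to {value!r}?"
            truth = conn.execute(
                f"SELECT * FROM {table} WHERE {cols[0]} = ?", (value,)
            ).fetchall()
            tests.append((question, truth))
    print(len(tests), "test cases generated")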

Paper Nr: 86
Title:

PSemQE: Disambiguating Short Queries Through Personalised Semantic Query Expansion

Authors:

Oliver Baumann and Mirco Schoenfeld

Abstract: Locating items in large information systems can be challenging, especially if the query has multiple senses referring to different items: depending on the context, Amazon may refer to the river, rainforest, or a mythical female warrior. We propose and study Personalised Semantic Query Expansion (PSemQE) as a means of disambiguating short, ambiguous queries in information retrieval systems. This study examines PSemQE’s effectiveness in retrieving relevant documents matching intended senses of ambiguous terms and ranking them higher versus a base query without expansion. Synthetic user profiles focused on narrow domains were generated to model well-defined information needs. Word embeddings trained on these profiles were used to expand queries with semantically related terms. Experiments were conducted on corpora of varying sizes to measure the retrieval of predetermined target articles. Our results show that PSemQE successfully disambiguated polysemous queries and ranked the target articles higher than the base query. Furthermore, PSemQE produces result sets with higher relevance to user interests. Despite limitations like synthetic profiles and cold-start issues, this study shows PSemQE’s potential as an effective query disambiguation engine. Overall, PSemQE can enhance search relevance and user experience by leveraging user information to provide meaningful responses to short, ambiguous queries.
Download
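A hedged sketch of the expansion step: train word vectors on a (here, toy) profile corpus and append each query term's nearest neighbours, so an ambiguous term is pulled toward the user's domain. The corpus and hyperparameters are assumptions, not the paper's setup.

    # Hedged sketch: personalised query expansion with profile-trained vectors.
    from gensim.models import Word2Vec

    profile_corpus = [
        ["amazon", "rainforest", "biodiversity", "deforestation"],
        ["amazon", "river", "basin", "tributary"],
    ]  # hypothetical tokenized profile documents
    model = Word2Vec(profile_corpus, vector_size=100, window=5, min_count=1, epochs=50)

    def expand(query, topn=3):
        # Append the nearest neighbours of each query term in the profile space.
        terms = query.lower().split()
        for term in list(terms):
            if term in model.wv:
                terms += [w for w, _ in model.wv.most_similar(term, topn=topn)]
        return " ".join(terms)

    print(expand("amazon"))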

Paper Nr: 97
Title:

Collective Intelligence with Large Language Models for the Review of Public Service Descriptions on Gov.br

Authors:

Rafael Marconi Ramos, Pedro Carvalho Brom, João Gabriel de Moraes Souza, Li Weigang, Vinícius Di Oliveira, Silvia Araújo dos Reis, Jose Francisco Salm Junior, Vérica Freitas, Herbert Kimura, Daniel Oliveira Cajueiro, Gladston Luiz da Silva and Victor Rafael R. Celestino

Abstract: This paper presents an intelligent multi-agent system to improve clarity, accessibility, and legal compliance of public service descriptions on the Brazilian Gov.br platform. Leveraging large language models (LLMs) like GPT-4, agents with specialized contextual profiles simulate collective deliberation to evaluate, rewrite, and select optimal service texts based on ten linguistic and seven legal criteria. An interactive voting protocol enables consensus-based editorial refinement. Experimental results show the system produces high-quality texts that balance technical accuracy with linguistic simplicity. Implemented as a Mixture of Experts (MoE) architecture through prompt-conditioning and rhetorical configurations within a shared LLM, the approach ensures scalable legal and linguistic compliance. This is among the first MoE applications for institutional text standardization on Gov.br, establishing a state-of-the-art precedent for AI-driven public sector communication.
Download

Paper Nr: 98
Title:

Evaluating LLM-Based Resume Information Extraction: A Comparative Study of Zero-Shot and One-Shot Learning Approaches in Portuguese-Specific and Multi-Language LLMs

Authors:

Arthur Rodrigues Soares de Quadros, Wesley Nogueira Galvão, Victória Emanuela Alves Oliveira, Alessandro Vieira and Wladmir Cardoso Brandão

Abstract: This paper presents a comprehensive evaluation of Large Language Models (LLMs) in the task of information extraction from unstructured resumes in Portuguese. We examine six models, including both multilingual and Portuguese-specific variants, using 0-shot and 1-shot prompting strategies. To assess accuracy, we employ two complementary metrics: cosine similarity between model predictions and ground truth, and a composite LLM-as-a-Judge metric that weights factual information, semantic information, and order of components. Additionally, we analyze token cost and execution time to assess the practicality of each solution in production environments. Our results indicate that Gemini 2.5 Pro consistently achieves the highest accuracy, particularly under 1-shot prompting. GPT 4.1 Mini and GPT 4o Mini provide strong cost-performance trade-offs. Portuguese-specific models like Sabiá 3 achieve high average accuracy, especially in the 0-shot setting under the cosine similarity metric. We also demonstrate how the inclusion of sections frequently missing in real resumes can significantly distort model evaluation. Our findings help determine model selection strategies for real-world applications involving semi-structured document parsing in the context of resume information extraction.
Download
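The cosine-similarity metric the abstract mentions can be illustrated with sentence embeddings, as in the hedged sketch below; the encoder model and the example strings are assumptions, not the paper's protocol.

    # Hedged sketch: cosine similarity between an extraction and ground truth.
    from sentence_transformers import SentenceTransformer, util

    encoder = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")
    prediction = "Experiência: engenheiro de software na Acme, 2019-2023"
    ground_truth = "Experiência profissional: engenheiro de software, Acme (2019 a 2023)"

    emb = encoder.encode([prediction, ground_truth], convert_to_tensor=True)
    print(f"cosine similarity: {util.cos_sim(emb[0], emb[1]).item():.3f}")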

Short Papers
Paper Nr: 34
Title:

One-Shot Learning, Video-to-Audio Commentary System for Football and/or Soccer Games

Authors:

Khushi Mahajan, Reshma Merin Thomas, Sheiley Patel, Anvitha Reddy Thupally and Bonaventure Chidube Molokwu

Abstract: Automated real-time sports commentary poses a considerable problem at the convergence of Computer Vision and Natural Language Processing (NLP), especially in dynamic settings such as football. This research introduces a novel deep learning-based system for generating natural language commentary with synchronized audio output by detecting, tracking, and semantically interpreting football match events. Our proposed system leverages YOLOv9 (You Only Look Once, version 9) for object detection, ByteTrack for maintaining temporal identity, and a homography-based spatial transformer for mapping visual cues onto the pitch. A rule-based module using proximity and trajectory transition logic identifies possession, passes, duels, and goals. Commentary is synthesized by a template-matching natural language generator and rendered as audio by the Google Text-to-Speech (gTTS) engine. The fundamental problem we address is the lack of modular, interpretable Artificial Intelligence (AI) systems that can bridge visual perception with Natural Language Generation in sports broadcasting. Prior studies performed detection or classification with isolated Machine-Learning (ML) models; our work proposes a framework that is unified, explainable, and real-time. This research has implications for accessible broadcasting and performance analytics, as well as AI-powered sports media production.
Download
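The homography step can be illustrated in a few lines of OpenCV, as in the hedged sketch below: four pixel/pitch point pairs calibrate a mapping that projects a tracked detection onto field coordinates. The calibration points are assumptions, not the paper's values.

    # Hedged sketch: pixel-to-pitch mapping with a homography.
    import numpy as np
    import cv2

    img_pts = np.float32([[120, 80], [1180, 90], [1250, 680], [60, 670]])  # pitch corners in the frame
    pitch_pts = np.float32([[0, 0], [105, 0], [105, 68], [0, 68]])         # corresponding metres

    H, _ = cv2.findHomography(img_pts, pitch_pts)
    player_px = np.float32([[[640, 360]]])              # a tracked detection centre
    player_xy = cv2.perspectiveTransform(player_px, H)  # position on the pitch
    print(player_xy.ravel())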

Paper Nr: 35
Title:

Predictive Modelling for Diabetes Mellitus with Respect to Basic Medical History

Authors:

Patrick Purta, Aryan Mishra, Vishal Reddy Vadde, Ruthvika Bojjala, Gopichand Jagarlamudi and Bonaventure Chidube Molokwu

Abstract: In this work, we observed how three (3) common oversampling techniques, SMOTE, SMOTE-ENN, and SVM-SMOTE, affect the performance of Machine Learning (ML) models applied to predicting diabetes risk on the Pima-Indian (Akimel O’odham) Diabetes dataset. Our aim was to determine whether using these methods to mitigate class imbalance in a medical dataset might cause the ML models to overfit; in other words, to do very well on the training data but lose accuracy on new data. Our project began with a simple question: “Can oversampling fix class imbalance in a given dataset without hurting the model’s ability to generalize?” Previous studies have shown that oversampling can help balance target classes within a dataset, but these studies do not always address the risk of overfitting. To answer this, we combined each oversampling technique with three (3) ensemble methods, Extra Trees, Gradient Boosting, and Random Forest, and compared their performance via cross-validation. Our results reveal that, although each method improves metrics on the training data, the models tend to under-perform slightly on unseen test data. This suggests that while oversampling is a useful strategy, it must be applied with caution to avoid overfitting. These insights are important for refining predictive models, especially in healthcare contexts where reliable performance is critical.
Download
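One way to guard against the leakage the abstract worries about is to resample only inside the training folds, as in this hedged sketch using an imbalanced-learn pipeline (the dataset and estimator are placeholders; the paper uses the Pima-Indian data and three ensembles).

    # Hedged sketch: SMOTE applied only to training folds via a pipeline.
    from imblearn.pipeline import Pipeline
    from imblearn.over_sampling import SMOTE
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score
    from sklearn.datasets import load_breast_cancer  # placeholder binary dataset

    X, y = load_breast_cancer(return_X_y=True)
    pipe = Pipeline([
        ("smote", SMOTE(random_state=42)),            # resamples the fit data only
        ("clf", RandomForestClassifier(random_state=42)),
    ])
    print(cross_val_score(pipe, X, y, cv=5, scoring="f1").mean())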

Paper Nr: 55
Title:

Multi-Objective Policy Optimization for Effective and Cost-Conscious Penetration Testing

Authors:

Xiaojuan Cai, Lulu Zhu, Zhuo Li and Hiroshi Koide

Abstract: Penetration testing, which identifies security vulnerabilities before malicious actors can exploit them, is essential for strengthening cybersecurity defenses. Effective testing helps discover deep, high-impact vulnerabilities across complex networks, while efficient testing ensures fast execution, low resource utilization, and reduced risk of detection in constrained or sensitive environments. However, achieving both effectiveness and efficiency in real-world network environments presents a core challenge: deeper compromises often require more actions and time. At the same time, excessively conservative strategies may miss critical vulnerabilities. This work addresses the trade-off between maximizing attack performance and minimizing operational costs. We propose a multi-objective reinforcement learning framework that minimizes costs while maximizing rewards. Our approach introduces a Lagrangian-based policy optimization scheme in which a dynamically adjusted multiplier balances the relative importance of rewards and costs during learning. We evaluate our method on benchmark environments with varied network topologies and service configurations. Experimental results demonstrate that our method achieves successful penetration performance and significantly reduces time costs compared to the baselines, thereby improving the adaptability and practicality of automated penetration testing in real-world scenarios.
Download
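The Lagrangian scheme can be sketched abstractly: the policy is trained on reward minus a weighted cost, and the multiplier rises whenever episode costs exceed a budget. Everything below is a placeholder illustration of that update, not the paper's algorithm or environment.

    # Hedged sketch: scalarized objective r - lambda * c with dual ascent on lambda.
    import numpy as np

    lam, lr_lam, cost_budget = 0.0, 0.05, 10.0
    for episode in range(100):
        rewards = np.random.rand(20)  # stand-in for environment rollouts
        costs = np.random.rand(20)    # e.g. time or detection-risk costs
        objective = rewards.sum() - lam * costs.sum()
        # ... a policy-gradient update on `objective` would happen here ...
        # Dual ascent: raise lambda if the cost constraint is violated.
        lam = max(0.0, lam + lr_lam * (costs.sum() - cost_budget))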

Paper Nr: 67
Title:

Software Testing Evidence: Results from a Systematic Mapping

Authors:

Artur S. Farias, Rodrigo Rocha, Igor Vanderlei, Jean Araujo, André Araújo and Jamilson Dantas

Abstract: Software testing is a fundamental aspect of software development, essential for ensuring product quality and reliability. This paper presents the findings of a systematic mapping of the literature, aimed at addressing key research questions related to software testing practices. The study investigates the types of software testing that are most commonly utilized, the predominant approaches employed, the challenges encountered during testing execution, the reported benefits of implementing software testing, the best practices acknowledged by the industry for efficient testing, and the tools and technologies frequently applied in the field. The research methodology followed a structured protocol that guided the systematic mapping across five scientific databases: IEEE Xplore, ACM, ScienceDirect, Springer Link, and Scopus. Following a comprehensive screening process, a total of 341 primary studies were systematically reviewed. The results provide valuable insights into current software testing practices and highlight the challenges faced in this area. Additionally, this study identifies effective solutions and best practices, assisting researchers and industry professionals in improving their software testing processes.
Download

Paper Nr: 68
Title:

Semantic Prompting over Knowledge Graphs for Next-Generation Recommender Systems

Authors:

Antony Seabra, Claudio Cavalcante and Sergio Lifschitz

Abstract: This paper presents a novel recommender system framework that integrates Knowledge Graphs (KGs) and Large Language Models (LLMs) through dynamic semantic prompt generation. Rather than relying on static templates or embeddings alone, the system dynamically constructs natural language prompts by traversing RDF-based knowledge graphs and extracting relevant entity relationships tailored to the user and recommendation task. These semantically enriched prompts serve as the interface between structured knowledge and the generative capabilities of LLMs, enabling more coherent and context-aware suggestions. We validate our approach in three practical scenarios: personalized product recommendation, identification of users for targeted marketing, and product bundling optimization. Results demonstrate that aligning prompt construction with domain semantics significantly improves recommendation quality and consistency. The paper also discusses strategies for prompt generation, template abstraction, and knowledge selection, highlighting their impact on the robustness and adaptability of the system.
Download
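A hedged sketch of the traversal-to-prompt idea: walk the triples around a target entity with rdflib and serialize them into natural-language lines that preface the recommendation request. The graph file, namespace, and templating are assumptions, not the paper's framework.

    # Hedged sketch: serialize KG triples into an LLM prompt.
    from rdflib import Graph, URIRef

    g = Graph()
    g.parse("catalog.ttl", format="turtle")  # hypothetical product KG
    user = URIRef("http://example.org/user/42")

    lines = []
    for s, p, o in g.triples((user, None, None)):
        pred = p.split("/")[-1].replace("_", " ")
        obj = o.split("/")[-1] if isinstance(o, URIRef) else str(o)
        lines.append(f"{s.split('/')[-1]} {pred} {obj}.")

    prompt = ("Known facts about the user:\n" + "\n".join(lines)
              + "\nRecommend three products and justify each briefly.")
    print(prompt)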

Paper Nr: 77
Title:

Large Language Models in Open Government Data Analysis: A Systematic Mapping Study

Authors:

Alberto Luciano de Souza Bastos, Luiz Felipe Cirqueira dos Santos, Shexmo Richarlison Ribeiro dos Santos, Marcus Vinicius Santana Silva, Marcos Cesar Barbosa dos Santos, Marcos Venicius Santos, Marckson Fábio da Silva Santos, Mariano Florencio Mendonça and Fabio Gomes Rocha

Abstract: Background: The convergence of Large Language Models (LLMs) and open government data presents transformative potential for public administration, yet there is a significant gap in understanding adoption patterns in this emerging domain. Aim: This study analyzes adoption patterns of Large Language Models in open government data analysis, characterizing researchers’ perceptions of benefits, limitations, and methodological implications. Method: We conducted a systematic mapping study following the guidelines of Petersen et al. (2008), searching six academic databases. After screening, 24 primary studies were analyzed, covering contribution types, validation methods, government domains, and LLM models. Results: The analysis revealed a predominance of the GPT model family, with health as the priority domain (4 studies), followed by security and justice (3 studies each). Conversational interfaces and information extraction were the dominant functions (9 studies each). Conclusions: The field demonstrates an evolution toward hybrid solutions integrating LLMs with structured knowledge resources. Consistent challenges across technologies, namely ethical issues, privacy concerns, and data quality, indicate the need for unified frameworks. Future research should focus on developing practical solutions to achieve technical maturity comparable to established software engineering fields.
Download

Paper Nr: 88
Title:

Fuzzy-Weighted Sentiment Recognition for Educational Text-Based Interactions

Authors:

Christos Troussas, Christos Papakostas, Akrivi Krouska and Phivos Mylonas

Abstract: In web-based educational environments, students often express complex emotional states – such as confusion, frustration, or engagement – through reflective texts, forum posts, and peer interactions. Traditional sentiment analysis tools struggle to capture these subtle, mixed signals due to their reliance on rigid classification schemes and lack of domain sensitivity. To address this, we propose a fuzzy-weighted sentiment recognition framework designed specifically for educational text-based interactions. The system combines an augmented sentiment lexicon, rule-based modifier detection, and semantic similarity using pretrained Sentence-BERT embeddings to extract nuanced sentiment signals. These inputs are interpreted by a Mamdani-type fuzzy inference engine, producing a continuous sentiment score and a confidence weight that reflect both the strength and reliability of the learner’s affective state. The paper details the linguistic pipeline, fuzzy membership functions, inference rules, and aggregation strategies that enable interpretable and adaptive sentiment modeling. Evaluation on a corpus of 1125 annotated student texts from a university programming course shows that the proposed system outperforms both lexicon-based and deep learning baselines in accuracy, robustness, and interpretability, demonstrating its value for affect-aware educational applications.
Download
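A hedged, much-simplified sketch of the Mamdani step with scikit-fuzzy: two inputs (a lexicon polarity and an embedding similarity) drive a continuous sentiment output. The membership shapes and rules are stand-ins for the paper's full pipeline.

    # Hedged sketch: Mamdani-type fuzzy inference over two sentiment cues.
    import numpy as np
    import skfuzzy as fuzz
    from skfuzzy import control as ctrl

    lexicon = ctrl.Antecedent(np.linspace(-1, 1, 101), "lexicon")
    similarity = ctrl.Antecedent(np.linspace(0, 1, 101), "similarity")
    sentiment = ctrl.Consequent(np.linspace(-1, 1, 101), "sentiment")

    lexicon["neg"] = fuzz.trimf(lexicon.universe, [-1, -1, 0])
    lexicon["pos"] = fuzz.trimf(lexicon.universe, [0, 1, 1])
    similarity["low"] = fuzz.trimf(similarity.universe, [0, 0, 0.6])
    similarity["high"] = fuzz.trimf(similarity.universe, [0.4, 1, 1])
    sentiment["neg"] = fuzz.trimf(sentiment.universe, [-1, -1, 0])
    sentiment["neutral"] = fuzz.trimf(sentiment.universe, [-0.5, 0, 0.5])
    sentiment["pos"] = fuzz.trimf(sentiment.universe, [0, 1, 1])

    system = ctrl.ControlSystem([
        ctrl.Rule(lexicon["pos"] & similarity["high"], sentiment["pos"]),
        ctrl.Rule(lexicon["neg"] & similarity["high"], sentiment["neg"]),
        ctrl.Rule(similarity["low"], sentiment["neutral"]),
    ])
    sim = ctrl.ControlSystemSimulation(system)
    sim.input["lexicon"], sim.input["similarity"] = 0.6, 0.8
    sim.compute()
    print(sim.output["sentiment"])  # continuous score in [-1, 1]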

Paper Nr: 92
Title:

“We Need to Analyze Students GenAI Use”: Towards an AI Adoption Framework for Higher Education

Authors:

Lasse Bischof, Eva-Maria Schön, Maria Rauschenberger and Michael Neumann

Abstract: Context: Generative AI (GenAI) tools such as ChatGPT are rapidly transforming how students learn and work. While adoption among learners is high, institutional frameworks in higher education often lag behind. Objective: This study pursues two primary objectives: 1) identifying students’ use cases for GenAI, and 2) synthesizing these into a systematic description of how to integrate GenAI into higher education. Method: To address these objectives, we conducted a case study at the University of Applied Sciences and Arts Hannover. We used a questionnaire that included both quantitative and qualitative questions. Results: Our findings reveal that 129 of the students (n=151) use GenAI tools in their studies. Based on a synthesis of the results, we created a systematic description of GenAI integration into higher education. Contributions: We offer specific solutions: with the AI Adoption Framework, higher education institutions will be able to review and adapt their regulations and curricula in relation to GenAI to keep up with the pace of change in the field.
Download

Paper Nr: 96
Title:

Linguistic Analogies in Word Embeddings: Where Are They?

Authors:

Riccardo Contessi, Paolo Fosci and Giuseppe Psaila

Abstract: Word Embedding has greatly improved Natural-Language Processing. In word-embedding models, words are represented as vectors in a multi-dimensional space; these vectors are trained through neural networks, by means of very large corpora of textual documents. Linguistic analogies are claimed to be encoded within word-embedding models, in such a way that they can be dealt with through simple vector-offset operations. This paper aims to give an answer to the following research question: given a word-embedding model, are linguistic analogies really present? It seems rather unrealistic that complex semantic relationships are encoded within a word-embedding model, which is trained to encode positional relationships between words. The investigation methodology is presented, and the results are discussed. This leads to the following question: “Linguistic analogies in Word Embeddings: where are they?”.
Download
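The vector-offset test at the heart of the paper's question can be reproduced in two lines with pretrained vectors, as in the hedged sketch below (the model choice is an assumption; the paper's corpora and methodology may differ).

    # Hedged sketch: "man is to king as woman is to ?" via king - man + woman.
    import gensim.downloader as api

    kv = api.load("glove-wiki-gigaword-100")  # pretrained GloVe vectors
    print(kv.most_similar(positive=["king", "woman"], negative=["man"], topn=3))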

Paper Nr: 101
Title:

Ontology-Grounded Language Modeling: Enhancing GPT-Based Philosophical Text Generation with Structured Knowledge

Authors:

Claire Ponciano, Markus Schaffert and Jean-Jacques Ponciano

Abstract: We present an ontology-grounded approach to GPT-based text generation aimed at improving factual grounding, historical plausibility, and stylistic fidelity in a case study: Baruch Spinoza’s Latin writings. We construct a compact ontology from Linked Open Data (Wikidata/DBpedia) augmented with expert-curated facts, serialize triples into natural-language statements, and interleave these with a canonical Latin corpus during fine-tuning of a GPT-2 (124M) model. At inference, retrieval-augmented generation (RAG) prepends ontology-derived facts and lightweight stylistic instructions, guiding the model toward historically consistent continuations in Spinoza’s register. Evaluation follows an 80/20 paragraph split of Ethica: we generate continuations for the 80% of segments retained and measure semantic similarity (BERTScore) against the 20% held out. This evaluation is complemented by an expert assessment of historical plausibility and by cosine similarity scores for stylistic authenticity. Relative to a GPT-2 baseline trained only on the Latin corpus, our ontology-grounded variant achieves a higher BERTScore and produces fewer factual and conceptual errors, preserving Latin rhetorical structure. These results indicate that structured knowledge integration is a feasible and effective way to make generative models more reliable for cultural-heritage text.
Download
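The triple-serialization step can be illustrated as below; the triples and templates are invented stand-ins for the curated ontology, included only to show the mechanism of interleaving facts with the training corpus.

    # Hedged sketch: turn (subject, predicate, object) triples into statements.
    triples = [
        ("Baruch Spinoza", "born_in", "Amsterdam"),
        ("Baruch Spinoza", "author_of", "Ethica"),
        ("Ethica", "written_in", "Latin"),
    ]
    templates = {
        "born_in": "{s} was born in {o}.",
        "author_of": "{s} is the author of {o}.",
        "written_in": "{s} was written in {o}.",
    }

    statements = [templates[p].format(s=s, o=o) for s, p, o in triples]
    with open("ontology_facts.txt", "w") as f:
        f.write("\n".join(statements))  # interleaved with the Latin corpus at fine-tuning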

Paper Nr: 18
Title:

Bridging BDI Multi-Agent Systems and the Semantic Web Through the Triples-to-Beliefs-to-Triples Paradigm

Authors:

Carmelo Fabio Longo, Rocco Paolillo, Misael Mongiovì, Andrea Giovanni Nuzzolese, Francesco Poggi, Michele Geremia Ceriani, Antonio Zinilli, Giusy Giulia Tuccari and Corrado Santoro

Abstract: Well-established agent engineering frameworks in the state of the art, owing to their dated designs, were not conceived to operate under a shared semantics, nor do they provide an agent modeling language and environment that integrates seamlessly with one. This is especially challenging in dynamic, distributed environments where new concepts, data sources, and agents can emerge at runtime, potentially leading to semantic conflicts or inconsistencies. This paper proposes the novel paradigm Triples-to-Beliefs-to-Triples (T2B2T), which is ontologically described, enabling multi-agent systems to integrate seamlessly and consistently with the Semantic Web. To validate the approach, this paper also proposes a framework called SEMAS that implements the T2B2T paradigm, providing a bridge between Belief-Desire-Intention (BDI) mental attitudes and triples describing a domain, with an abstraction over the SPARQL language that feeds the agents’ inference process. This enables more sophisticated forms of reasoning under the closed-world assumption, by supporting predicates without any limitation on arity and compositional structures, and also allows the employment of decentralized functions for the dynamic generation of new triples not included in the original ontologies. As a case study, SEMAS was applied to decision-making in academic mobility with real data from the SCOPUS database, demonstrating how the generated inferences can be tailored to the specific conditions of individual agents, and how new triples can be inferred to capture the impact of agents’ decisions on the evolution of the knowledge domain.
Download

Paper Nr: 36
Title:

A Machine-Learning, Predictive-Analytical Model for Thyroid-Cancer Risk Assessment

Authors:

Sanjay Manda, Manohar Adapa, Harsha Sai Jasty, Rishma Sree Pathakamuri, Siddhartha Vinnakota and Bonaventure Chidube Molokwu

Abstract: Thyroid cancer is a significant global health problem due to the increasing number of people being diagnosed, and existing diagnostic methods rely heavily on invasive biopsies and imaging that fail to account for various patient risk factors. This research aims to develop a comprehensive and precise model to forecast thyroid cancer risk through the application of state-of-the-art machine learning techniques. We applied a number of preprocessing methods, such as imputation of missing values, outlier detection, and categorical feature encoding, and used the Synthetic Minority Oversampling Technique (SMOTE) to address class imbalance. We employed advanced feature engineering methods, such as polynomial transformation, logarithmic scaling, and clinical risk scoring, to extract important predictive patterns. Our model was thoroughly tested using the CatBoost (Categorical Boosting) algorithm against other algorithms (Logistic Regression, Random Forest, XGBoost, and LightGBM). The CatBoost model showed outstanding predictive performance with 88% accuracy, 93% precision, 78% recall, 85% F1-score, and a ROC-AUC of 90%. These findings suggest that CatBoost can differentiate well between high-risk and low-risk thyroid cancer cases. This robust prediction model identifies at-risk individuals early and accurately, assists in making informed clinical decisions, and could reduce healthcare expenditure and prevent futile treatment, improving patients’ quality of life.
Download
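A hedged sketch of the modelling step: CatBoost handles the categorical clinical fields natively and is scored with the metrics the abstract reports. The file and column names are assumptions, not the paper's dataset.

    # Hedged sketch: CatBoost risk classifier with native categorical features.
    import pandas as pd
    from catboost import CatBoostClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import classification_report, roc_auc_score

    df = pd.read_csv("thyroid_risk.csv")  # hypothetical dataset
    cat_features = ["gender", "family_history", "radiation_exposure"]
    X, y = df.drop(columns=["risk"]), df["risk"]
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=42)

    model = CatBoostClassifier(iterations=500, learning_rate=0.05, verbose=0)
    model.fit(X_tr, y_tr, cat_features=cat_features)
    print(classification_report(y_te, model.predict(X_te)))
    print("ROC-AUC:", roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]))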

Paper Nr: 38
Title:

Predictive Model for Heart-Related Issues Based on Demographic, Societal, and Lifestyle Factors

Authors:

Bindu Chandra Shekar Reddy, Pravallika Dharmavarapu, Roopal Dixit, Prudhvi Kodali, Akanksha Ojha and Bonaventure Chidube Molokwu

Abstract: This research predicts cardiovascular disease (CVD) risk by analyzing demographic, societal, and lifestyle factors, supporting early intervention for conditions like heart attacks. With CVD causing around 17.9 million deaths annually worldwide (WHO), there is a critical need for accessible, accurate predictive models. We propose an XGBoost-based machine learning model trained on a 70,000-patient dataset enriched with features such as median income, stress, and diet risk. After robust preprocessing and feature engineering, including BMI and pulse pressure, the model achieves 73% accuracy, 76% precision, 68% recall, 72% F1-score, and 80% ROC-AUC. Key predictors include pulse pressure, cholesterol, and age, indicating that this multifactor approach can enhance clinical decision-making and inform scalable health solutions.
Download
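The two engineered features named in the abstract are simple to derive, as in the hedged sketch below; the column names follow a common public cardio-dataset layout and are assumptions.

    # Hedged sketch: BMI and pulse-pressure features feeding XGBoost.
    import pandas as pd
    from xgboost import XGBClassifier
    from sklearn.model_selection import cross_val_score

    df = pd.read_csv("cardio.csv")  # hypothetical 70,000-patient table
    df["bmi"] = df["weight"] / (df["height"] / 100) ** 2  # weight in kg, height in cm
    df["pulse_pressure"] = df["ap_hi"] - df["ap_lo"]      # systolic minus diastolic

    X, y = df.drop(columns=["cardio"]), df["cardio"]
    model = XGBClassifier(n_estimators=300, max_depth=5, eval_metric="logloss")
    print(cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean())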

Paper Nr: 51
Title:

Web-Based Crowd Detection and Emotion Analysis for Fashion Retail Using Computer Vision

Authors:

Fiorella Valencia Rivera, Erik Romero Polli and Efrain Bautista Ubillus

Abstract: This study proposes a web-based solution to address the difficulty fashion retail stores face in obtaining accurate information on crowding and their customers’ emotional states. Using computer vision techniques, the application leverages the YOLO algorithm for people detection and convolutional neural networks (CNN) for emotion classification. Integrating this data provides retailers with strategic insights to optimize space layout, improve resource allocation, and adjust their marketing strategies, allowing managers to make decisions based on objective data. The study emphasizes ethical considerations, including data anonymization and secure storage, and highlights limitations and future research directions, such as real-world testing and collaboration with retailers for contextually accurate data collection. The system was validated in a simulated environment that replicated the operating conditions of a retail store, allowing an initial evaluation of its performance.
Download

Paper Nr: 79
Title:

Online News Verification: An AI-Based Platform for Assessing and Visualizing the Reliability of Online News Articles

Authors:

Anne Grohnert, Simon Burkard, Michael John, Christian Giertz, Stefan Klose, Andreas Billig, Amelie Schmidt-Colberg and Maryam Abdullahi

Abstract: Assessing the reliability of online news articles poses a significant challenge for users. This paper presents a novel digital platform that enables users to analyze German-language news articles based on various reliability-related aspects, including opinion strength, sentiment, and article dissemination. Unlike many existing approaches focused solely on detecting fake news, this platform emphasizes the comparative analysis and visualization of relevant reliability indicators across articles from different publishers. The paper provides a comprehensive overview of the current state of the art, describing various existing approaches to detecting and presenting disinformational online content, before presenting the technical system architecture and user interfaces of the designed platform. A concluding user evaluation revealed some limitations and opportunities for further development, but showed generally positive feedback on the platform’s diverse analysis criteria and visual presentation to support users in assessing the credibility of news articles. Potential future applications range from evaluating article neutrality to verifying citations in academic contexts.
Download

Paper Nr: 95
Title:

Graph-Based Personalized Recommendation in Intelligent Educational Platforms: A Case Study in Engineering Education

Authors:

Sofia Merino Costa, Rui Pinto and Gil Gonçalves

Abstract: The fragmentation of digital learning materials in engineering education makes it difficult for students to find relevant content. This paper presents a graph-based recommender system integrated into an intelligent Knowledge Management System (KMS) to support personalized learning. Using Neo4j, the system models users, learning objects, and semantic relationships to generate contextualized recommendations across dashboard, module, and Learning Path (LP) views. Its scoring mechanism combines semantic similarity, interaction history, and graph proximity to provide adaptive, explainable suggestions. A mixed-methods evaluation with engineering students showed high alignment with user interests and positive perceptions of transparency and personalization. The system effectively transitioned from fallback to tailored recommendations as user interactions increased. Results highlight the potential of graph-based approaches to improve content relevance, discovery, and learner engagement in web-based educational platforms, in line with Education 5.0 principles.
Download
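A hedged sketch of the graph-proximity part of the scoring: a Cypher query that ranks unseen learning objects by how many of the user's past interactions relate to them. The labels, relationship types, and credentials are assumptions, not the paper's schema.

    # Hedged sketch: graph-proximity recommendations from Neo4j.
    from neo4j import GraphDatabase

    driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
    QUERY = """
    MATCH (u:User {id: $user_id})-[:INTERACTED_WITH]->(seen:LearningObject)
    MATCH (seen)-[:RELATED_TO]->(rec:LearningObject)
    WHERE NOT (u)-[:INTERACTED_WITH]->(rec)
    WITH rec, count(seen) AS proximity
    RETURN rec.title AS title, proximity
    ORDER BY proximity DESC LIMIT 5
    """

    with driver.session() as session:
        for record in session.run(QUERY, user_id="student-42"):
            print(record["title"], record["proximity"])
    driver.close()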