WEBIST 2010 Abstracts

Area 1 - Internet Technology

Full Papers
Paper Nr: 19

SHAPING STANDARDS - People and Voting Rights and the Case of IEEE 802.11


Kai Jakobs

Abstract: Based on the ‘social shaping of technology’ approach, this paper briefly discusses the impact that a) the individuals who populate a standards body’s working group and b) this body’s voting rules have on its final standards. It draws primarily upon a qualitative empirical study. In particular, the paper uses the IEEE 802.11 group as a real-world sample group to further highlight the issues discussed more theoretically.

Paper Nr: 34



Mohammed Al-Badawi, Siobhán North and Barry Eaglestone

Abstract: In the context of benchmarking XML implementations, several XML benchmarks have been produced to either test the application’s overall performance or evaluate individual XML functionalities of a specific XML implementation. Among five popular XML benchmarks investigated in this article, all techniques rely on code-generated datasets which disregard many of XML’s irregular aspects such as varying the depth and breadth of the XML documents’ structure. This paper introduces a new test-model called the “3D XML benchmark” which aims to address these limitations by extending the dataset and query-set of existing XML benchmarks. Our experimental results have shown that XML techniques can perform inconsistently over different XML databases for some query classes, thus justifying the use of an improved benchmark.

Paper Nr: 74



Christophe Cruz, Christophe Nicolle and Renaud Vanlande

Abstract: This paper presents a Semantic Web approach for facility management. The Web-based platform lets geographically dispersed project participants, from facility managers and architects to electricians and plumbers, directly use and exchange project documents in a centralized virtual environment using a simple Web browser. A 3D visualization lets participants move around in the building being designed and obtain information about the objects that compose it. The approach is based on a semantic model called CDMF and on IFC 2x3. CDMF improves data management during the lifecycle of a building. Based on graph combinations and the contextual element SystemGraph, our proposal addresses the problems of model evolution, data mapping, the management of temporal data, and the adaptation of data to the use and the user. Our framework, based on Building Information Modeling features, facilitates data maintenance (data migration, model evolution) during the building lifecycle and reduces the volume of data.

Paper Nr: 76

The Influence of Service Interactions on Individual Service Reliability in a Composition Scenario


Abhishek Srivastava and Paul Sorenson

Abstract: Selecting an optimal service from a group of functionally equivalent ones is non-trivial, even more so when the service to be selected is part of a composite application. Past research has addressed this issue by using the Quality of Service (QoS) attributes of the services to determine the best of the functionally equivalent candidates. This paper also tackles the problem using one of the more important QoS attributes, reliability. The novelty of the proposed technique is that, while earlier papers have considered the reliability of individual services in a composition in isolation, we take into account the influence that the interaction among services in a composition has on individual reliabilities. The service domain, together with the interactions, is represented as a continuous-time Markov chain, and through an appropriate procedure the reliability of individual services is calculated in the form of a ‘failure distance’. The services selected are the ones with the largest values of failure distance. The results of our experiments are included to validate this technique.

Paper Nr: 120

Automatic Tag Identification in Web Service Descriptions


Zeina Azmeh, Jean-Rémy Falleri, Chouki Tibermacine and Marianne Huchard

Abstract: With the increasing interest in service-oriented architectures, the number of existing Web services is growing dramatically. Finding a particular service among this ever-increasing number of services is therefore becoming a time-consuming task. User tags or keywords have proven to be a useful technique for smoothing the browsing experience in large document collections. Some service search engines, like Seekda, already offer this kind of facility. Service tagging, which is a fairly tedious and error-prone task, is done manually by the providers and the users of the services. In this paper we propose an approach that automatically extracts tags from Web service descriptions. It identifies a set of relevant tags extracted from a service description and leaves to the users only the task of assigning tags not present in this description. The proposed approach is validated on a corpus of 146 services extracted from Seekda.
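
To illustrate the general idea of extracting tags from a service description, the following minimal sketch splits operation and message names into words and ranks them by frequency. It is not the authors' method: the stop list, the function names and the frequency ranking are assumptions made only for this illustration.

```python
import re
from collections import Counter

# Hypothetical stop list; a real system would use a larger, tuned list.
STOP_WORDS = {"get", "set", "by", "the", "of", "service", "request", "response"}


def split_identifier(identifier):
    """Split a camelCase or snake_case identifier into lowercase words."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", identifier)
    return [part.lower() for part in parts]


def candidate_tags(identifiers, top_n=3):
    """Rank the words found in service identifiers by how often they occur."""
    counts = Counter(
        word
        for identifier in identifiers
        for word in split_identifier(identifier)
        if word not in STOP_WORDS and len(word) > 2
    )
    return [word for word, _ in counts.most_common(top_n)]
```

For example, calling `candidate_tags` on the names `getWeatherByCity`, `WeatherForecastRequest` and `CityForecastResponse` would propose `weather`, `forecast` and `city` as candidate tags.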

Paper Nr: 123

GREEN WEB ENGINEERING: A Set of Principles to Support the Development and Operation of "Green" Websites and their Utilization during a Website’s Life Cycle


Markus Dick, Stefan Naumann and Alexandra Held

Abstract: The power consumption of ICT and the Internet is still increasing. To date, it is not clear whether the energy savings achieved through ICT outweigh the energy consumed by ICT. In either case, it is advisable to improve the energy efficiency of the Web. In this paper, we present a set of 12 principles which help, for example, to reduce network load through caching or compression. To classify our suggestions, we group them by three main roles in Web Engineering. Additionally, we recommend using data centres that apply "classic" Green IT.

Paper Nr: 126



Li Li and Wu Chou

Abstract: The RESTful architectural style that underlies the Web has gained rapid adoption as a way to develop web services for machines. But the full potential of REST is hindered by the fact that HTML pages designed for human interaction are not suitable for machine processing. To address this problem, we developed a microformat framework, called micro-resource, to extend web sites into dual RESTful web services for both humans and machines with minimal changes. This framework avoids the pitfalls of alternative “parallel” web services by keeping the correspondence and duality between the human and machine webs. The framework is simple, extensible and composable with other existing microformats. An initial application of this framework to RESTful service composition shows that the approach is efficient and feasible.

Paper Nr: 150

Minimal-Footprint Middleware for the Creation of Qualified Signatures


Martin Centner, Wolfgang Bauer and Clemens Orthacker

Abstract: Qualified electronic signatures are recognized as being equivalent to handwritten signatures and are supported by EU legislation. They require a secure signature creation device (SSCD) such as a smart card. Unfortunately, there are no standard means for integrating SSCDs with Web applications, nor are the existing means widely deployed. Web application providers are still faced with a lack of deployment of such means and a lack of integration with standard software. This paper presents a novel approach that addresses these issues with a middleware that does not require users to install dedicated software for the creation of qualified electronic signatures. The middleware is deployed as a web application and splits the signature creation process into two parts: one part is performed on the server side, and the other part (which requires access to functions of the secure signature creation device) is deployed and executed on demand as a lightweight component in the user’s browser.

Paper Nr: 159

WSCOLAB: Structured collaborative tagging for Web service matchmaking


Maciej Gawinecki, Giacomo Cabri, Marcin Paprzycki and Maria Ganzha

Abstract: One of the key requirements for the success of Service Oriented Architecture is the discoverability of Web services. Unfortunately, the application of authoritatively defined taxonomies does not work for the volume of services published on the Web. Collaborative tagging claims to address this problem, but is impeded by the lack of structure for describing Web service interfaces. We introduce structured collaborative tagging to improve Web service descriptions and report the performance of the proposed technique obtained during the Cross-Evaluation track of the Semantic Service Selection 2009 contest. The results show that the proposed classification schema can be used successfully for both Web service tagging and querying.

Short Papers
Paper Nr: 100

Per-request Contracts for Web Services Transactions


David Paul, Frans Henskens and Michael Hannaford

Abstract: To allow providers to keep their autonomy and ensure the overall system can run satisfactorily, it is common practice in the Web Services environment for providers to reduce the strength of some of the traditionally required ACID properties when offering transactional support. However, current standards require providers to offer a constant level of transactional support for each operation they provide. We describe a method that allows service providers to dynamically decide on the level of transactional support to offer for each client request. This allows the provider to base the level of transactional support offered on the current state of the system and internal logic, resulting in potential benefits for both service providers and consumers.

Paper Nr: 109

Web Authentic and Similar Texts Detection using AR Digital Signature


Marios Poulos, Nick Skiadopoulos and George Bokos

Abstract: In this paper, we propose a new identification technique, offered in web form and based on an AR model with O(n) time complexity, with the aim of creating a unique serial number for texts and detecting authentic or similar texts. To this end, we used a 15th-order autoregressive (AR) model, and for the identification procedure we employed the cross-correlation algorithm. Empirical investigation showed that the proposed method may be used as an accurate method for identifying identical, similar, or conceptually different texts. This unique identification method for texts, in combination with SCI and DOI, may be the solution to many problems that the information society faces, such as plagiarism and clone detection, copyright-related issues, and tracking, and may also help in many facets of the education process, such as lesson planning and student evaluation. The advantages of the exported serial number are obvious, and we aim to highlight them while discussing its combination with DOI. Finally, this method may be used by the information services sector and the publishing industry for standard serial-number definition and identification, as a copyright management system, or both.

Paper Nr: 122

DNS-based Load Balancing for Web Services


Alan Nakai, Edmundo Madeira and Luiz Eduardo Buzato

Abstract: A key issue for the good performance of geographically replicated web services is the efficiency of the load balancing mechanism used to distribute client requests among the replicas. This work revisits the research on DNS-based load balancing mechanisms in a SOA (Service-Oriented Architecture) scenario. In this kind of load balancing solution, the Authoritative DNS (ADNS) of the distributed web service performs the role of the client request scheduler, redirecting clients to one of the server replicas according to some load distribution policy. This paper proposes a new policy that combines client load information and server load information in order to reduce the negative effects of DNS caching on load balancing. We also present the results obtained through an experimental testbed built on the basis of the TPC-W benchmark.
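
As a rough illustration of how a scheduler at the ADNS might combine server load and client load, the sketch below picks the replica with the lowest projected utilization. It is not the policy proposed in the paper: the `Replica` fields, the per-domain request rate estimate and the scoring formula are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    capacity: float            # requests/s the replica can serve (assumed known)
    reported_load: float       # requests/s last reported by the replica
    pending_load: float = 0.0  # load assigned by the ADNS since the last report


def pick_replica(replicas, client_domain_rate):
    """Pick the replica whose projected utilization is lowest.

    client_domain_rate is a hypothetical estimate of the requests/s that the
    resolving client domain will send while the DNS answer stays cached.
    """
    def projected_utilization(replica):
        total = replica.reported_load + replica.pending_load + client_domain_rate
        return total / replica.capacity

    best = min(replicas, key=projected_utilization)
    # Remember the assignment so later answers account for cached resolutions.
    best.pending_load += client_domain_rate
    return best
```

Tracking `pending_load` is one simple way to account for the fact that a cached DNS answer keeps directing a whole client domain to the same replica until the answer expires.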

Paper Nr: 145

Are recommender systems real-time in mobile environment? Towards instantaneous recommenders


Armelle Brun and Anne Boyer

Abstract: Recommendation technologies have traditionally been used in domains such as e-commerce and Web navigation to recommend resources to customers and help them get the right resources at the right moment. Interest in the collaborative filtering approach to recommender systems has increased considerably over the last few years. Model-based collaborative filtering includes the popular usage mining models, such as sequential association rules. This model is usually presented as a real-time recommender. In the last few years, the m-commerce domain has emerged, and m-commerce recommenders display recommendations on the mobile device instead of the classical computer screen. In this paper, preserving user privacy is an important objective, and one way to comply with this constraint is to store the recommender on the mobile side. However, although usage mining recommenders are considered real-time, many of them require significant time to generate recommendations, and when the recommender is implemented on a mobile device it may no longer be real-time. Although some work has focused on decreasing the time required to compute recommendations, the computational complexity remains relatively high. In some applications, recommendations may be required instantaneously, and classical models thus lead to unsatisfied users. We put forward a new incremental recommender that provides instantaneous recommendations when exploiting usage mining recommender systems in the framework of m-commerce.

Paper Nr: 147

Indoor Location Using Wireless Networks based on Bayesian Reasoning


Jesús F. Rodríguez-Aragón, Vidal Moreno Rodilla, Belén Curto Diego, Francisco Javier Serrano Rodríguez, Raúl Alves and María José Polo Martín

Abstract: This paper describes a solution for indoor location in the context of wireless local networks. First, sampling and training are carried out through off-line scene analysis. Second, the mobile entity is localized in a self-positioning fashion using a method based on Bayesian networks.
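
The two phases described above can be sketched as follows. This is a simplified stand-in, not the authors' Bayesian network: it assumes a naive Bayes model with one Gaussian per access point and location, a uniform prior over locations, and a hypothetical penalty constant for access points never seen at a location during training.

```python
import math

# Hypothetical log-likelihood penalty for an access point unseen in training.
MISSING_AP_LOG_PENALTY = -10.0


def train(fingerprints):
    """Off-line phase: fingerprints maps location -> list of {ap: rssi_dbm}.

    Returns location -> {ap: (mean, variance)}.
    """
    model = {}
    for location, samples in fingerprints.items():
        stats = {}
        for ap in {ap for sample in samples for ap in sample}:
            values = [sample[ap] for sample in samples if ap in sample]
            mean = sum(values) / len(values)
            variance = sum((v - mean) ** 2 for v in values) / len(values)
            stats[ap] = (mean, max(variance, 1.0))  # floor avoids zero variance
        model[location] = stats
    return model


def locate(model, observation):
    """On-line phase: return the posterior probability of each location."""
    log_posterior = {}
    for location, stats in model.items():
        log_p = 0.0  # uniform prior contributes the same constant everywhere
        for ap, rssi in observation.items():
            if ap in stats:
                mean, variance = stats[ap]
                log_p += (-0.5 * math.log(2 * math.pi * variance)
                          - (rssi - mean) ** 2 / (2 * variance))
            else:
                log_p += MISSING_AP_LOG_PENALTY
        log_posterior[location] = log_p
    peak = max(log_posterior.values())
    weights = {loc: math.exp(lp - peak) for loc, lp in log_posterior.items()}
    total = sum(weights.values())
    return {loc: weight / total for loc, weight in weights.items()}
```

The mobile device can run `locate` on its own signal strength readings, which matches the self-positioning setting in which no server needs to know where the device is.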

Paper Nr: 156

smilingPhotos: bridging the gap between digital photo albums and printed photo books


Ombretta Gaggi and Fabrizio Ghidoni

Abstract: The number of photos has increased significantly in recent years, together with the problem of how to manage them: a picture can be printed as is, arranged into a more interesting photo book, or shared through the web. Digital photos have the advantage that they can easily be stored and shared with family and friends in different cities or countries, but we must consider that many people do not even see the PC screen as a convenient vehicle for synchronously reviewing and sharing photos with others. This paper presents a framework to bridge the gap between digital web albums and printed photo books. The idea is to allow the user to create, at the same time, a photo book for later printing and a multimedia presentation that shares with friends not only the pictures taken but also the experiences they tell. The paper presents smilingPhotos, a tool for this double authoring: the user creates a multimedia slideshow which can be enriched with audio comments (music or speech), transition effects, animations and a definition of the layout of the images. The result can be shared across the network or automatically translated into a static photo book suitable for printing.

Paper Nr: 177



Leandro Krug Wives, José Palazzo M. de Oliveira, Zakaria Maamar, Samir Tata and Mohamed Sellami

Abstract: In this position paper, we provide a brief overview of Recommender Systems (RS) and Web Services (WS). We then propose a research roadmap for the challenges and opportunities that could arise from the combined use of WS and RS. While these challenges are expected to hinder this use, we discuss the actions that need to be taken to overcome them and hence make this use a win-win situation for both WS and RS. We illustrate how the combination of RS and WS takes place in terms of what RS can do for WS and what WS can do for RS. Finally, we conclude by pointing out the actions to take so that this combination turns out to be successful.

Paper Nr: 185

Dynamically Reconfigurable Data-Intensive Service Composition


Onyeka Ezenwoye, Salome Busi and S. Masoud Sadjadi

Abstract: The distributed nature of services poses significant challenges to building robust service-based applications. A major aspect of this challenge is finding a model of service integration that promotes ease of dynamic reconfiguration in response to internal and external stimuli. Centralized models of composition are not conducive to data-intensive applications such as those in the scientific domain. Decentralized compositions are more complicated to manage, especially since no service has a global view of the interaction. In this paper we identify the requirements for dynamic reconfiguration of data-intensive composite services. A hybrid composition model that combines the attributes of centralization and decentralization is proposed. We argue that this model promotes dynamic reconfiguration of data-intensive service compositions.

Paper Nr: 189



Alina Bianca Andreica, Florina Livia Covaci, Daniel Stuparu and Gabriel Pop

Abstract: The present paper focuses on means of integrating dedicated information systems based on various technologies (PHP/PostgreSQL, ASP/MS SQL) into a global web portal based on MS technology. The portal also provides learning management content and e-learning facilities for various user categories.

Paper Nr: 202



George Gkotsis and Nikos Karacapilidis

Abstract: Motivated by the fact that contemporary argumentation systems provide little or no support with regard to argument and information processing, this paper presents a generic computational model that is able to identify and assess structural similarities in argumentative discourses. Focusing on the structure of such discourses, we sketch representative scenarios where the proposed model can be applied to a wide range of argumentation systems in order to define, elaborate and mine meaningful argumentation patterns. We argue that the proposed model makes a considerable contribution to both theoretical and practical aspects of argumentation.

Paper Nr: 17

RODIN - A medium-weight portal for the aggregation and mashing of heterogeneous data sources


Rene Schneider and Fabio Ricci

Abstract: RODIN (ROue D'INformation) is a project that aims to develop an innovative tool for the bundling and coupling of user-relevant, heterogeneous information resources. Information specialists and other service users will be able to gather the information resources that are of interest for their work or their personal interests in a dynamic and user-friendly information aggregate. The tool includes a search engine that allows a simultaneous search in all components of the aggregate and will include an ontology-based search refinement algorithm that links the data and looks for broader and narrower results based on the search results. RODIN represents the alternative portal approach within the context of the E-lib.ch project, the Swiss digital library.

Paper Nr: 47

Securing Access to Embedded Systems (An effective concept for devices lacking internet connection)


Bruno Juchli, Roland Portmann and Peter Sollberger

Abstract: Many embedded systems provide a web interface for maintenance tasks such as system configuration, test execution and firmware updating. Access to this interface usually needs to be restricted to authorized employees. This paper shows an efficient and cost-effective concept for securing maintenance interfaces using widespread standards and technology. By storing authorisation information in standard-compliant X.509 certificate extensions, Transport Layer Security (TLS) and an X.509 Public Key Infrastructure (PKI) provide mutual authentication, message integrity and confidentiality, and enable the authorisation of employees. Practical experience with the implementation completes this paper.

Paper Nr: 54

MULTI-TIER BASED VISUAL COLLABORATION - A Model using Semantic Networks and Web3D


Eldar Sultanow and Edzard Weber

Abstract: Geographically distributed development has to deal with the challenge of awareness to a far greater extent than locally concentrated development. Awareness denotes the state of being informed about, and understanding, the project-related activities, states or relationships of each individual employee within a given group or of the group as a whole. In multifarious offices, where social interaction is necessary in order to distribute and locate information as well as experts (and their availability etc.), awareness becomes a concurrent process, which increases the need for easy ways for staff to access this information, deferred or decentralized, in a formalized and problem-oriented way. Appropriate visualization and navigation of this information is required so that staff and project managers can orient themselves efficiently. This paper develops a model for visualizing collaboration in development projects using semantic networks and Web3D.

Paper Nr: 61



Ruben Diego Carrera, Jose Manuel De la Horra, Rubén Pérez Álvarez, J.César González Galván and Francisco Ballester

Abstract: In recent years, non-face-to-face training has acquired a growing role in our society. To a large extent this is the result of the possibilities offered by new technologies, linked to the development of information technology and, more specifically, the Internet. On-line courses have proliferated exponentially, with a growing number of enterprises using these techniques for staff training. However, e-learning has a number of limitations that prevent it from becoming fully widespread. The solution adopted in this respect is the creation of a computer platform for “on line-off line” non-face-to-face learning. This is how FONOFF (“Formación On line-Off line” / “On line Off line Learning”) arises, which provides the advantages of the classic “on line” method while compensating for some of its shortcomings.

Paper Nr: 66

INTERSECTION Approach to Vulnerability Handling


Michal Choras, Salvatore D'Antonio, Rafal Kozik and Witold Holubowicz

Abstract: In this paper we present our approach to vulnerability handling in heterogeneous networks. Vulnerabilities of heterogeneous networks such as satellite, GSM, GPRS, UMTS, wireless sensor networks and the Internet have been identified, classified and described in the framework of the European co-funded project INTERSECTION (INfrastructure for heTErogeneous, Resilient, SEcure, Complex, Tightly Inter-Operating Networks). Since computer security incidents usually occur across administrative domains and interconnected networks, it would clearly be advantageous for different organizations and network operators to be able to share data on network vulnerabilities. The exchange of vulnerability information and statistics would be crucial for the proactive identification of trends that can lead to incident prevention. Network operators have always been reluctant to disclose information about attacks on their systems or through their networks. However, this tendency seems to be giving way to a new awareness that only through cooperation can networking infrastructures be made robust to attacks and failures. Starting from these considerations, we developed two components for vulnerability data management and classification, namely the INTERSECTION Vulnerability Database (IVD) and the Project INTERSECTION Vulnerability Ontology Tool (PIVOT). Both tools are presented in this paper.

Paper Nr: 71



Christophe Cruz and Christophe Nicolle

Abstract: This paper presents a flexible method to enrich and populate an existing OWL ontology from XML data using graph-based rules. These rules are defined in order to automatically populate the new version of the OWL ontology. Today, most data exchanged between information systems is expressed in XML syntax. Unfortunately, when these data have to be integrated, the integration becomes difficult because of semantic heterogeneity. Consequently, leading research in the domain of database systems is moving to semantic models in order to store data together with the definition of its semantics. To benefit from these new systems and technologies, and to integrate different data sources, a flexible approach consists in populating an existing OWL ontology from XML data. In this paper we present such a method, based on the definition of a graph that represents the rules driving the population process.

Paper Nr: 72

Applying Query By Example in OCL for Platform-Independent Programming


Piotr Habela, Krzysztof Kaczmarski, Grzegorz Falda, Wiktor Filipowicz, Krzysztof J. Stencel and Kazimierz Subieta

Abstract: Precise modelling of behaviour is an area where programming meets modelling, and textual syntax competes with a visual one. By developing a UML based platform-independent framework, we aimed to find a visual syntax aid to make the language more approachable to stakeholders, while taking advantage of existing UML syntax intuitions and offering a truly higher level of abstraction. Our solution consists of seamlessly integrated UML Actions and the Object Constraint Language (OCL) as a database query language, featuring both a textual and a visual syntax. In this paper we describe a declarative, Query by Example (QBE)-based approach to visualizing OCL expressions over a UML object-oriented model instance, to be used inside of textual or visual imperative statements. Such visual OCL expressions can also be used as ad-hoc queries. The paper presents a choice of visual syntax and describes its underlying semantics.

Paper Nr: 77

A publish/subscribe model for personal data on the Internet


Mark Wallis, Frans Henskens and Michael Hannaford

Abstract: With the recent increase in web application reliance on user-generated content, issues such as data duplication, data age and data ownership are becoming an increasing problem. It is now common to have multiple distinct web applications storing duplicate copies of a user's personal information in distinct storage formats and locations. This paper proposes a change in paradigm that places the ownership of a user's personal data back into their own hands by moving the storage of that data away from web applications and onto private storage nodes exposed by 3rd party providers. Web applications can then subscribe to various pieces of data under electronic contracts that govern the data's usage.
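
The subscription idea can be illustrated with a minimal in-memory sketch. The class names `StorageNode` and `Contract`, the field-level permission model and the callback interface are assumptions made for this example; the paper's actual contract format and protocol are not described in the abstract.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List, Tuple


@dataclass(frozen=True)
class Contract:
    """Hypothetical electronic contract: the fields a subscriber may receive."""
    subscriber: str
    allowed_fields: FrozenSet[str]


class StorageNode:
    """Holds one user's personal data and notifies subscribed applications."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}
        self._subscriptions: List[Tuple[Contract, Callable[[str, str], None]]] = []

    def subscribe(self, contract: Contract,
                  on_update: Callable[[str, str], None]) -> None:
        """Register a web application under the terms of its contract."""
        self._subscriptions.append((contract, on_update))

    def publish(self, field_name: str, value: str) -> None:
        """Store a new value and push it only to subscribers allowed to see it."""
        self._data[field_name] = value
        for contract, on_update in self._subscriptions:
            if field_name in contract.allowed_fields:
                on_update(field_name, value)
```

Because the single copy of each field lives on the storage node, an update by the user reaches every subscribed application at once instead of leaving stale duplicates behind.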

Paper Nr: 142



Jose Alfonso Aguilar, Irene Garrigos, Jose-Norberto Mazon and Juan Trujillo

Abstract: Web engineering software development faces continuous changes in implementation technology. This requires analysts, developers and designers to invest extra effort in the design and maintenance of Web applications in order to adapt them to changes in requirements and implementation technologies. In this context, defining the requirements (functional and non-functional) that the system must meet to fully satisfy user needs is a complex task. Using a single technique does not ensure a quality result, so a set of techniques can be adopted according to the preferred development methodology. Therefore, this paper presents a systematic review, supported by a formal and well-defined strategy, of the current state of the art in approaches for the modeling, analysis and specification of Web engineering requirements. The motivation for this review is to structure the conceptual basis of Web engineering approaches to requirements analysis, thus identifying gaps in current research and suggesting areas for further investigation.

Paper Nr: 155



José Miguel Rubio León, Francisco Jóse Reyes Cáceres and Jorge Inostroza

Abstract: Mobile technology is an increasingly competitive market, represented by mobile devices of various ranges and technologies, such as PDAs, smartphones and cell phones, which offer a wealth of services and resources. As these devices and their processing and storage capabilities develop, so does the production of larger applications that exploit these capabilities. This is where the mobile gaming market appears. However, because these devices come in a variety of ranges, operating systems and implementations, game development suffers from a lack of uniformity and consistency in the communication between devices. This article proposes a game architecture model for a ubiquitous environment of mobile devices, focused primarily on communication between these devices using the Bluetooth protocol, with the aim of standardizing applications across platforms and devices and providing support to video game developers.

Paper Nr: 160

VoIPIntegration: VoIP Control and Processing System


Francisco Javier Serrano Rodríguez, Guillermo González Talaván, Vidal Moreno Rodilla, Belén Curto Diego, Jesús F. Rodríguez-Aragón and Ángeles María Moreno Montero

Abstract: This paper presents the development of a multi-device VoIP system that is extensible through plugins. The heterogeneous character of such systems has led us to consider several different communication mechanisms. The use of common devices for telephone communications has also been considered. A complete system has been developed in order to integrate all possible components of a VoIP system.

Area 2 - Web Interfaces and Applications

Full Papers
Paper Nr: 44

Annotations and Hypertrails with SpreadCrumbs - An Easy Way to Annotate, Refind and Share.


Ricardo Kawase, Wolfgang Nejdl and Eelco Herder

Abstract: Annotating has been shown to be an important activity during reading, especially during “active reading”. Annotations support understanding, interpretation, sensemaking and scannability. As valuable as in paper-based contexts, digital online annotations provide several benefits for annotators and collaborators. To study the impact of these benefits, we present in this paper SpreadCrumbs, a straightforward Web annotation tool. SpreadCrumbs offers simple annotation interactions and metaphors that support most of the users’ annotation needs in the digital environment by enhancing the web experience with “in-context” annotations and providing a unique form of social navigation support with hypertrails. The results of our studies with the tool show the importance of annotations, the empirical outperformance of “in-context” annotations over other methods, and the benefits of supporting social navigation.

Paper Nr: 64

Semantic Drift in Ontologies


Geir Solskinnsbakk, Jon Atle Gulla, Veronika Haderlein, Per Myrseth and Olga Cerrato

Abstract: Ontology evolution is the process of incrementally and consistently adapting an existing ontology to changes in the relevant domain. Even though ontology management and versioning tools are now available, they are of limited use for ontology evolution unless the desired changes are known beforehand. Ontology learning toolsets are often employed, but they require large document sets and do not take the existing structures into account. Semantic drift refers to how concepts’ intentions gradually change as the domain evolves. When a semantic drift is detected, it means that a concept is gradually understood in a different way or its relationships with other concepts are undergoing some changes. A semantic drift captures small domain changes that are hard to detect with traditional ontology engineering approaches. This paper discusses a new approach to detecting and assessing semantic drift in ontologies. The method makes use of concept signatures that are constructed on the basis of how concepts are used and described. Comparing how signatures change over time, we see how concepts’ semantic content evolves and how their relationships to other concepts gradually reflect these changes. An experiment with the DNV’s business sector ontology from 2004 and 2008 demonstrates the value of this approach to ontology evolution.
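
One simple way to picture the comparison of concept signatures over time is shown below. This is only an illustrative sketch: it assumes a signature is a term-frequency vector built from texts that mention the concept, and that drift is measured as one minus the cosine similarity of two signatures. The paper's actual signature construction and drift measure may differ.

```python
import math
import re
from collections import Counter


def signature(texts):
    """Build a term-frequency signature from texts describing a concept."""
    counts = Counter()
    for text in texts:
        counts.update(re.findall(r"[a-z]+", text.lower()))
    return counts


def cosine(a, b):
    """Cosine similarity between two term-frequency signatures."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def drift(old_texts, new_texts):
    """Drift score in [0, 1]: 0 means unchanged usage, 1 means no shared terms."""
    return 1.0 - cosine(signature(old_texts), signature(new_texts))
```

Computing `drift` for the same concept on document sets from two points in time gives a score that can be tracked to flag concepts whose usage is changing.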

Paper Nr: 79

The Art of Multi-faceted Tagging - Interweaving spatial annotations, categories, meaningful URIs and tags


Nicole Ullmann, Patrick Siehndel, Fabian Abel, Daniel Krause and Ricardo Kawase

Abstract: In this paper we present TagMe!, a tagging and exploration front-end for Flickr images, which enables users to attach tag assignments to a specific area within an image and to categorize tag assignments. We analyze the differences between tags and categories and show how both facets can be applied to learn semantic relations between concepts referenced by tags and categories. TagMe! automatically maps tags and categories to DBpedia URIs to clearly define their meaning. In our experiments we compare different strategies to realize such semantic mappings and show that even lightweight approaches map tags and categories with high precision (86.85% and 93.77% respectively). We further discuss how multi-faceted tagging helps to improve the retrieval of folksonomy entities. The TagMe! system is currently available at http://tagme.groupme.org

Paper Nr: 80



M. Elena Renda

Abstract: A common characteristic of most traditional search and retrieval systems is that they are oriented towards a generic user, often failing to connect people with what they are really looking for. In this paper we present PISA, a Personalized Information Search Assistant, which, rather than relying on the unrealistic assumption that the user will precisely specify what she is really looking for when searching, leverages implicit information about the user’s interests. PISA is a desktop application which provides the user with a highly personalized information space where she can create, manage and organize folders (similarly to email programs), and manage documents retrieved by the system into her folders to best fit her needs. Furthermore, PISA offers different mechanisms to search the Web, and the possibility of personalizing result delivery and visualization. PISA learns user and folder profiles from the user’s choices, and uses these profiles to improve retrieval effectiveness in searching by selecting the relevant resources to query and filtering the results accordingly. A working prototype has also been developed, tested and evaluated. Preliminary user evaluation and experimental results are very promising, showing that the personalized search environment PISA provides considerably increases effectiveness and user satisfaction in the searching process.

Paper Nr: 97

Pragmatics of Storyboarding - Web Information Systems Portfolios


Klaus-Dieter Schewe and Bernhard Thalheim

Abstract: A Web Information System (WIS) can be described by a storyboard, which on a high level of abstraction specifies who will be using the system, in which way and for which goals. Syntax and semantics of storyboarding have been well-elaborated. Pragmatics is the necessary complement addressing what the storyboard means for its users. The part of pragmatics concerned with usage analysis by means of life cases, user models and contexts has been dealt with before. In this paper we complement usage analysis by WIS portfolios, which comprise two parts: the information portfolio and the utilisation portfolio. The former one is concerned with information consumed and produced by the WIS users, which leads to content chunks; the latter one captures functionality requirements.

Paper Nr: 125

Efficient Literature Research Based on Semantic Tagnets


Karl-Heinz Krempels, Uta Christoph and Daniel Götten

Abstract: In this paper we present an approach that is capable of automatically generating semantic tagnets for given sets of German tags (keywords) and an arbitrary text corpus using three different analysis methods. The resulting tagnets are used to estimate similarities between texts that are manually tagged with the keywords from the given tagset. Basically, this approach can be used in digital libraries to provide an efficient and intuitive interface for literature research. Although it is mainly optimized for the German language, the proposed methods can easily be enhanced to generate tagnets for a given set of English keywords.

Short Papers
Paper Nr: 31



Dragos Palaghita, Bogdan Vintila and Maria Dascalu

Abstract: The paper proposes a new, improved algorithm for creating hierarchies of features and options for self-adapting web interfaces, as an alternative to the common one used by many applications. The user interface concept is presented. Types of user interfaces are described. Quality characteristics of the user interfaces are analyzed. Ways of fulfilling these quality characteristics while keeping the costs low are discussed. Advantages and disadvantages of self-adapting and static web interfaces are given. The most common algorithm for creating hierarchies of features and options is described and analyzed. The advantages and disadvantages of the proposed algorithm are discussed. New directions for the development of self-adapting web interfaces are highlighted.

Paper Nr: 49



Jarkko Alajääski, Harri Ketamo and Kristian Kiili

Abstract: In this study the user experiences of a commercial educational product were gathered in order to build personas that can be used to revise the market segmentation. Personas are empirically formed archetypical characters representing distinct behavioural clusters, goals and the motivation of end users. Usually personas are used in different production phases as tools that help designers and marketing people in decision making. In this study, the personas were formed by applying k-means cluster analysis to quantified user interviews. According to the results of the study, qualitatively formed personas showed their strengths as decision making tools: they helped the publisher to maintain the focus on a learner's needs, wants and requirements during the whole process of development.

Paper Nr: 62

A Reference Ontology Based Approach for Service Oriented Ontology Management


Shuying Wang, Jinghui Lu and Miriam Capretz

Abstract: To establish effective information exchange among applications in a distributed B2B environment, the business participants are not only required to share their functions or service interfaces, but in many cases, they also need to exchange their data models. Ontology, as a popular semantic form of knowledge representation, can be used to represent data models, thus allowing applications to locate and integrate these models in a more intelligent way. In this paper, we introduce a reference ontology based approach for service oriented ontology management. Specifically, STAR, a domain specific reference ontology, is built and used for the experiments in a real life case. Furthermore, in order to validate and evaluate our approach and implementation, a prototype system is developed to provide ontology deploying, browsing and mapping operations on a service-oriented mechanism. Our experiments have provided promising results, which are consistent with our original ideas of managing ontologies and optimizing ontology mappings to facilitate data interoperability in a distributed environment.

Paper Nr: 69

Integration of Spatial Technologies and Semantic Web Technologies for Industrial Archaeology


Christophe Cruz, Franck Marzani, Ashish Karmacharya and Frank Boochs

Abstract: We propose a method that uses the advancement in spatial technologies from current database systems within the Semantic Web Technologies in order to enrich and to populate the knowledge of a domain defined in an OWL-DL ontology. The results of spatial operations and functions are used to populate and to enrich ontologies with new individuals and new relationships. The advantage of spatial analysis within Semantic Web technologies is the diversity of the functionalities provided by the combination of spatial operations and the rule language of the Semantic Web (SWRL). This method is applied in the industrial archaeology domain in order to enhance the knowledge management.

Paper Nr: 75



David G. W. Scott, April MacPhail and Thomas Connolly

Abstract: Occupational Healthcare (OH) is about the promotion and maintenance of the physical, mental and social well-being of employees. It aims to protect staff from workplace risks, but also to manage the effect of any health issues on their work. Given the cost of absence through illness to both the organisation and the individual, and given the government legislation that exists in this area, OH is of increasing importance to organisations and many now outsource this service. This paper discusses how a Knowledge Transfer Partnership (KTP) project between a university and an OH provider led to the development of a web-based Management Information System (MIS) for Occupational Health that allows organisations to better manage their OH provision and sickness absences. The system is currently being evaluated in a large public sector organisation and early feedback is positive.

Paper Nr: 93

Human Computer Collaboration to Improve Annotations in Semantic Wikis


Hala Skaf-Molli, Armelle Brun and Anne Boyer

Abstract: Semantic wikis are very promising tools for producing structured and unstructured data. However, they suffer from a lack of user provided semantic annotations, resulting in a loss of efficiency, despite their high potential. This paper focuses on an original way to encourage users to semantically annotate pages. We propose a system that suggests automatically computed annotations to users. Users thus only have to validate, complete, modify, refuse or ignore these suggested annotations. We assume that as the annotation task becomes easier, more users will provide annotations. The system we propose is based on collaborative filtering recommender systems; it does not exploit the content of the pages but the usage made of these pages by the users: annotations are deduced from the usage of the pages and the annotations previously provided. The resulting semantic wikis contain several kinds of annotations that are differentiated by their status: human provided or computer provided annotations, human-computed interactions (suggested by the system, validated by the users) and refused annotations (suggested by the system, refused by the user). Navigation and (semantic) search will thus be facilitated and more efficient.

Paper Nr: 107



Andrea Andrenucci

Abstract: This paper discusses a follow-up study aimed at investigating the extraction of word relations from a medical parallel corpus in the field of Psychology. Word relations are extracted in order to create a bilingual lexicon for cross lingual question answering between Swedish and English on a medical portal. Six different variants of the corpus were utilized: word inflections with and without POS tagging, syntactically parsed word inflections, lemmas with and without POS tagging, syntactically parsed lemmas. The purpose of the study was to analyze the quality of the word relations obtained from the different versions of the corpus and to understand which version of the corpus was more suitable for extracting a bilingual lexicon in the field of psychology. The word alignments were evaluated with the help of reference data (gold standard) and with measures such as precision and recall.

Paper Nr: 117

Designing client view navigations using REST style service patterns


Eunjung Lee and Kyong-Jin Seo

Abstract: This paper considers an approach to the development of view navigations in a REST client page. When a page interfaces with multiple service methods, it needs to maintain multiple views, along with local data. For this reason, it is necessary to develop navigational code between views and service requests. The contributions of this paper are as follows: First, we discuss a formal approach for using REST service method patterns in order to design client page views and navigations. Second, we present type conditions for possible method calls and view moves. In addition, we introduce a design model to help developers describe the relations between views and resources on an abstract level. Finally, we present a prototype implementation for navigational code generation using XForms pages, applying the proposed approach and standard patterns.

Paper Nr: 119

SITEMAPS FROM A MODEL DRIVEN PERSPECTIVE. A first step for bridging the gap between information architecture and navigation design.


Antonia Mª Reina Quintero and Jesus Torres

Abstract: Researchers claim that there is a disconnection between information architecture and navigation design. One way of bringing these two fields closer is to share deliverables. However, it is difficult to change the minds of audiences to make them use deliverables they are not used to. Thus, we propose letting audiences use those deliverables they are more comfortable with, and then transforming one deliverable into another, as far as possible. To achieve this aim, firstly, we need to have a deep knowledge of deliverables, and secondly, a set of mappings has to be defined in order to translate the information the source deliverable is covering into the target deliverable. Our approach uses metamodelling as the technique to define the pieces that compose deliverables and their relationships, and model transformations for mapping deliverables. In this context, the paper focuses on one of the most widely used information architecture deliverables, sitemaps, and its main contributions are: (1) a sitemap metamodel, which defines the minimum set of elements that can be used for specifying sitemaps; and (2) a set of model-to-model transformations to obtain an XHTML skeleton of structural and utility navigation.

Paper Nr: 140

TIYU A location based music player for sports


Georg J. Schneider and Henning Voss

Abstract: This paper describes a mobile location based music player for outdoor sports like running, cycling or hiking. Athletes can boost their performance while listening to the right songs. The TIYU system augments trails with appropriate music for the athletes, either automatically using an intelligent selection mechanism or manually via a web based authoring tool. The trails created in this way can be shared with other users on a web based sports community platform.

Paper Nr: 146

CrosSing Framework: A Dynamic Infrastructure to Develop Knowledge-Based Recommenders in Cross Domains


Mustafa Azak and Aysenur Birturk

Abstract: We propose a dynamic framework that differs from previous work in that it focuses on the easy development of knowledge-based recommenders and offers an intensive cross-domain capability with the help of domain knowledge. The framework has a generic and flexible structure in which data models and user interfaces are generated based on ontologies. New recommendation domains can be integrated into the framework easily in order to improve recommendation diversity. We accomplish cross-domain recommendation via an abstraction of domain features when direct matching of the domain features is not possible because the domains are not very close to each other.

Paper Nr: 151



Philipp Obermeier, Anne Augustin, Marko Harasic and Robert Tolksdorf

Abstract: Distributed semantic stores can employ self-organization principles to improve their scalability. We present an implementation that uses ant colony optimization for clustering similar semantic information, facilitating scalable retrieval. For the clustering mechanisms we use similarity measures that do not rely on access to a complete ontology. We describe a syntactical and a fingerprint-based similarity measure and discuss them with regard to expressiveness and computational effort. The results of an evaluation show that, with increasing volume of data and number of processes, the fingerprint-based measure performs much better than the syntactical one. We conclude with a discussion of how to combine the advantages of the two measures and propose some technical enhancements improving the efficiency of the system.

Paper Nr: 178

Information Uniqueness in Wikipedia Articles


Nikos Kirtsis, Sofia Stamou, Paraskevi Tzekou and Nikos Zotos

Abstract: Wikipedia is one of the most successful worldwide collaborative efforts to put together user generated content in a meaningfully organized and intuitive manner. Currently, Wikipedia hosts millions of articles on a variety of topics, supplied by thousands of contributors. A critical factor in Wikipedia’s success is its open nature, which enables everyone to edit, revise and/or question (via talk pages) the article contents. Considering the phenomenal growth of Wikipedia and the lack of a peer review process for its contents, it becomes evident that both editors and administrators have difficulty in validating its quality on a systematic and coordinated basis. This difficulty has motivated several research works on how to assess the quality of Wikipedia articles. In this paper, we propose the exploitation of a novel indicator for the Wikipedia articles’ quality, namely information uniqueness. In this respect, we describe a method that captures the information duplication across the article contents in an attempt to infer the amount of distinct information every article communicates. Our approach relies on the intuition that an article offering unique information about its subject is of better quality compared to an article that discusses issues already addressed in several other Wikipedia articles.

Paper Nr: 180

APPLYING MEDICINE 2.0 TO THE I-CAN Managing the Needs and Rights of End Users


Deborah Richards

Abstract: This paper considers how Medicine 2.0 features can be added to an existing e-health application known as the I-CAN (Instrument for the Classification and Assessment of Support Needs). One of the biggest problems with a social networking feature based around health concerns as introduced is the issue of privacy. Even though participation is on a completely unsolicited, opt-in basis, there are access and privacy issues involved in such a tool. A preliminary design proposal is presented which takes into account the needs, responsibilities, rights and abilities of the (direct and indirect) users.

Paper Nr: 183

Transcription Support System using Subversion


Takehiko Murakawa, Hitoshi Fukuoka, Daichi Noda and Masaru Nakagawa

Abstract: We report on a data management system and interface for reading the photographed image and the text of a Buddhist sutra written in Chinese, and for modifying the text so that its content matches the image. By using Subversion we maintain the text files efficiently and easily obtain the difference in content between any two points in time. To confirm that the system can be employed as a multiuser transcription support tool, we present the working model and report an experiment in which workers used the system and in which we found revision markings made by two or more workers. Furthermore, we propose a method for piecing together the workers' respective outcomes to produce the integrated text file.

Paper Nr: 21

Pirka'r: Tool for Web Designers, Supporting Development of Multi-platform Web Application


Jun Iio, Hiroyuki Shimizu, Akihiko Matsumoto and Hisayoshi Sasaki

Abstract: The elemental technologies of the WWW have been standardized as HTML and CSS, recommended by W3C. Therefore, not only the servers but also web browsers are considered independent components within the whole web application system, so that they should be interchangeable. However, actual web applications have not yet reached this ideal situation. The problem of interoperability discrepancies between different implementations of web browsers still remains. So far we have studied the reason why this problem arose and collected instances of the problem. Based on the results of our previous study, we developed a web-system designers' tool which provides useful information to help them avoid the pitfalls of the interoperability discrepancy problems.

Paper Nr: 23

SemSon: Connecting Ontologies and Web Applications


Geert Vanderhulst, Karin Coninx and Kris Luyten

Abstract: The emergence of semantic data on the web puts the development of dynamic web applications to the test. On the one hand, we witness more and more semantic information becoming available on the web. On the other hand, we can see web-based applications evolve from server-side applications to responsive client-side applications running in a web browser. In this paper we present a framework that bridges the gap between semantic data defined in OWL ontologies and client-side web applications implemented in JavaScript.

Paper Nr: 29



N. Ben Fairweather, Mohammed Altayar and Neil McBride

Abstract: Enterprise Information Portals (EIPs) have become crucial components in contemporary organisations, and universities and other higher education institutions are not exempt. While there are many studies concerning the adoption, implementation and utilisation of EIPs in organisations, there are few studies that touch on this issue in the academic environment. The aim of this paper is to report initial findings from an in-progress research project on the adoption of campus portals in some Saudi and UK universities. This study adopts a qualitative research approach based on multiple case studies. A research methodology was designed to conduct the research and to collect data through semi-structured interviews and documentation; the data were then analysed using various qualitative data analysis techniques such as coding and categorising, cross-interview analysis and document analysis. The findings of the study show that there are many factors that affect the adoption of campus portals, such as organisational factors, innovation factors, economic factors, technical factors and environmental factors. Finally, the paper proposes an initial model, concludes with the main findings, and provides some recommendations and suggestions for further research.

Paper Nr: 35



Antonio Pintus, Carmen Santoro and Fabio Paternò

Abstract: This paper presents a methodology that defines a model-based approach to composing User Interfaces for Business Processes based on Web service technology. The core concepts of the methodology are represented by an integration of modern task model notations developed in the HCI area, such as ConcurTaskTrees, with a mainstream notation for Business Process modelling (BPMN) developed in the workflow/business process area. The main advantage is to obtain thorough support in designing complex interactive business applications able to flexibly compose Web services and obtain meaningful associated user interfaces which can be not only Web interfaces but also extended to a multi-modal fruition. The proposed methodology allows collaborative work between business process modellers and user interface modellers while remaining open to iterative refinements. In this paper we also briefly compare the considered notations and discuss an example application of the proposed method.

Paper Nr: 43

A NEW APPROACH TOWARDS VERTICAL SEARCH ENGINES, Intelligent Focused Crawling and Multilingual Semantic Techniques


Sybille Peters, Claus-Peter Rückemann and Wolfgang Sander-Beuermann

Abstract: Search engines typically consist of a crawler which traverses the web retrieving documents and a search front-end which provides the user interface to the acquired information. Focused crawlers refine the crawler by intelligently directing it to predefined topic areas. The evolution of search engines today is expedited by supplying more search capabilities such as a search for metadata as well as search within the content text. Semantic web standards have supplied methods for augmenting webpages with metadata. Machine learning techniques are used where necessary to gather more metadata from unstructured webpages. This paper analyzes the effectiveness of techniques for vertical search engines with respect to focused crawling and metadata integration exemplarily in the field of "educational research". A search engine for these purposes implemented within the EERQI project is described and tested. The enhancement of focused crawling with the use of link analysis and anchor text classification is implemented and verified. A new heuristic score calculation formula has been developed for focusing the crawler. Full-texts and metadata from various multilingual sources are collected and combined into a common format.

Paper Nr: 56



Ruben Diego Carrera, Jose Manuel De la Horra, Alba Fuertes, Nuria Forcada, Francisco Ballester and Miquel Casals

Abstract: The purpose of this paper is to describe a knowledge management system (SGAC, Active Knowledge Management System) and an e-learning system that have been developed to manage the knowledge generated inside a research project, to transfer this knowledge from the research project to society and, moreover, to facilitate e-learning tasks. This paper approaches the problem of knowledge transfer from a case study angle. The systems are implemented in “The Multidimensional City”, which is a multi-disciplinary research project that promotes the development and implementation of Spanish technological innovation in underground construction. The SGAC system aims at achieving full integration of a large set of contents created in research projects related to underground construction sites. To this end, the project has developed several types of metadata for tagging and enriching contents. All this enriched information becomes the base of different e-learning courses in underground construction, promoting an updated education in this knowledge area and the dissemination of the results generated in the research project. These courses are accessible from an interactive e-learning platform.

Paper Nr: 57

Towards a Quality Evaluation Framework for Model-Driven Web Engineering Methodologies


Francisco José Domínguez Mayo, Maria Jose Escalona, Manuel Mejías Risoto and Arturo Henry Torres Zenteno

Abstract: Various development methodologies currently exist in the field of Model-Driven Web Engineering (MDWE). Given the high number of methodologies available, it is necessary to evaluate the quality of the existing methodologies and provide helpful information to developers. Furthermore, proposals are constantly appearing and the need may arise not only to evaluate the quality but also to find out how it can be improved. This article presents the work being carried out in this field and describes tasks to define a Quality Evaluation Framework (QuEF) to evaluate, under objective measures, the quality of Model-Driven Web Engineering methodologies.

Paper Nr: 86

GVSIGDROID: An open source GIS solution for the Android platform


Cristian Martín-Reinhold, Joaquín Huerta and Carlos Granell

Abstract: Mobile GIS applications are gaining attention due to a wide range of potential target applications such as e-commerce, tourism, education, agriculture, and field research. Most of these application domains require easy-to-use geospatial applications operating on mobile devices to enable both visualization and editing of widely-used geospatial data and formats. To overcome this limitation, this paper introduces gvSIGDroid, an open source geospatial mobile application that attempts to reach the new potential users emerging from the Android platform. gvSIGDroid, whose core functionalities rely on gvSIGMobile modules whereas its user interface has been especially designed for the Android platform, allows mobile users to retrieve, visualize, navigate and modify both local and remote geospatial layers. Based on this prototype, future extensions can be deployed so as to provide missing functionalities such as Location Based Services and data sharing.

Paper Nr: 94

SENSORGIS: An Integrated Architecture for Information Systems Based on Sensor Networks


Jianzhao Huang, Nicholas Boers, Eleni Stroulia, Pawel Gburzynski and Ioanis Nikolaidis

Abstract: In this paper, we describe SensorGIS, an integrated architecture for WSN applications. SensorGIS provides an integrated service-oriented architecture for collecting, archiving, analyzing, and visualizing sensor network data in a geographic information system (GIS). By using an extendible GIS framework as one of its user views, SensorGIS can contextually communicate the collected data, its trends, and distinct values of interest. In addition, it is designed in the service-oriented style and hence is extendible in terms of the analyses and visualizations. Finally, it integrates an online collaborative forum that enables annotation of the collected data with the users’ observations and interpretations.

Paper Nr: 98

A COMPARATIVE STUDY OF THESAURI TOOLS A perspective from integrability in information systems


Beatriz Perez Leon and M. Mercedes Martínez-González

Abstract: The Semantic Web has brought a renewed interest in thesauri tools as support for semantic searches and other added value services. Tools that manage thesauri make it possible to create, edit, and query thesauri. But there is also the possibility to import thesauri and to integrate thesauri. In fact, integrability at the information level has also received an important push with the stabilisation of the SKOS standard in August 2009 as a W3C Recommendation. In this paper several thesauri tools are evaluated. The criteria used for the evaluation, which include integrability issues, are presented and later applied to a set of tools. The results of the evaluation and the conclusions obtained from its comparison are presented.

Paper Nr: 103

Emotion Based Music Retrieval Using Consistency Principle and Multi-Query Method


Song-Yi Shin, Joon-Whoan Lee, Eun-Jong Park and Kyoungbae Eum

Abstract: In this paper, we propose the construction of multi-queries and the consistency principle for emotion-based music retrieval. Existing content-based music retrieval methods which use a single query reflect the retrieval intention of the user by moving the query point or updating the weights. However, these methods are limited in representing the complicated factors of emotion. In the proposed method, additional queries of music are taken in each feedback process to express the user’s emotion. We classify the music by emotion, and the music is clustered by the MKBC (Mercer Kernel-Based Clustering) method. After that, the inclusion degree of each descriptor is obtained. This means the weight that represents the importance of each descriptor for each emotion in order to reduce the computation. We obtained excellent results within the second retrieval through the feedback. In the feedback process, we used the consistency principle and multi-queries.

Paper Nr: 112

eHumanities Desktop - An Architecture for flexible Annotation in Iconographic Research


Rüdiger Gleim, Alexander Mehler and Paul Warner

Abstract: This article addresses challenges in maintaining and annotating image resources in the field of iconographic research. We focus on the task of bringing together generic and extensible techniques for resource and annotation management with the highly specific demands in this area of research. Special emphasis is put on the interrelation of images, image segments and textual contents. In addition, we describe the architecture, data model and user interface of the open annotation system used in the image database application that is a part of the eHumanities Desktop.

Paper Nr: 152



Antonia Huertas and Enric Mor

Abstract: Learning logic in engineering presents difficulties similar to those in mathematics: very low academic performance and a high student dropout rate. In subjects of this kind, interactive training activities with immediate feedback are fundamental. In a traditional face-to-face logic course, the interaction with the instructor usually provides this feedback. In an e-learning or web-based paradigm, the role of the instructor should be supported by an intelligent tutoring system. In this paper we present the design and development process of a learning tool for a logic course. This tool follows a student-centred design approach in order to provide the right tool for a successful learning experience. A general discussion of learning tools for logic is also presented, showing that topics of this kind have concrete and specific needs in online learning.

Paper Nr: 163



Walter Balzano, Maria Rosaria Del Sorbo and Antonio Tarantino

Abstract: Multimedia databases store huge amounts of heterogeneous information, but users’ queries usually search for just very short sections of data that are hidden and mixed with each other. This work presents a support methodology for Information Retrieval Systems on a collection of Multimedia Data Objects. The main idea of this retrieval methodology is to exploit raw metadata stored in multimedia objects to realize a classification using an innovative approach based on a spatial dispersion index. A convenient synthetic representation of multimedia objects is drawn from the lexical database WordNet, which provides the system with synonymic and polysemic semantic knowledge. Finally, with the aim of achieving an alternative segmentation into document classes, a clustering algorithm based on the Nearest Neighbour geospatial index is used.

Paper Nr: 179

PDF/A - Towards a True Digital Archival Surrogate (DAS) for Digital Manuscript Collections


Rodney Obien and Jeffrey Monseau

Abstract: Digital surrogates provide a non-invasive means to study old manuscript documents that are often too fragile and valuable for wide public access. These surrogates are generated from web-accessible derivatives made from high-resolution archival masters; these masters serve as long-term digital preservation copies. What if there were a file format that combined the functions of digital surrogate, web-accessible derivative, and archival master? This paper considers the archival file format PDF/A (ISO 19005-1) as a digital archival surrogate, or DAS, that combines the functions of surrogate, derivative, and master. Furthermore, the paper discusses the versatility of PDF/A in dealing with the complex nature of old manuscripts, and the possible implications of adopting PDF/A as a DAS standard.

Paper Nr: 192

A Hub Architecture for Service Ecosystems: Towards Business-to-Business Automation with an Ontology-Enabled Collaboration Platform


Alex Norta

Abstract: The management and coordination of business process collaboration is changing because of globalization, specialization, and innovation. Service-oriented computing (SOC) is a means towards business-process automation, and many industry standards have recently emerged to become part of the service-oriented architecture (SOA) stack. In a globalized world, organizations face new challenges for setting up and carrying out collaborations in semi-automating ecosystems for business services. A need emerges for service hubs that not only store service offers and requests together with their issuing organizations and assigned owners, but that also allow an evaluation of trust and reputation in an anonymized electronic service marketplace. In this paper, we explore the features of a semi-automating ecosystem in which business processes are expressed as services and where hubs are essential for bringing together service offers and requests. The presented Hub architecture is designed so that business managers benefit from an interface that borrows concepts of social-networking sites while the complex computing machinery for matching service offers and requests remains hidden from the user. The partial implementation of service-hub components demonstrates the feasibility of our approach.

Paper Nr: 195

Web-based 'Computer Assisted Surgical Anatomy Mapping' (CASAM). A new online tool to reduce complications in surgery


Anton Kerver, Noeska N. Smit, B. M. W. Sedee, Sverre Rabbelier, Charl P. Botha and Gert-Jan Kleinrensink

Abstract: In surgery, one of the major problems is a safe approach to the operation site. For surgeons it is paramount to know the location of surgically relevant nerves and vessels. Especially in surgery of the lateral (outside) foot, the anatomy is not always completely clear since the location of nerves and vessels is highly variable. CASAM was therefore developed by students in Delft and Rotterdam (Netherlands). This web application is based on the Django framework and is a useful tool for three user groups: 1) Researchers: After dissected specimens have been photographed, a Thin Plate Spline transformation is used to compute an average foot, and the pictures of individual specimens are warped to match this reference average foot. Renditions can be made to depict relevant surgical anatomy. Finally, the researchers can define a zone in the lateral foot in which it is safe to approach the operation site. 2) Surgeons: Relevant anatomy (gathered by the researcher) can be warped over the picture of the patient. This pre-operative planning using CASAM assists the surgeon in determining a ‘tailor made’ safe zone for each patient. 3) Students: For educational purposes, a drawn incision line can be compared to the computed location of nerves and vessels, thus providing personal feedback.

Paper Nr: 197

WEB ACCESSIBILITY - Portuguese Web Accessibility With WCAG-1.0 and WCAG-2.0


Ramiro Gonçalves, Henrique Mamede, Jorge Pereira and José Martins

Abstract: Web accessibility is growing in importance as each day goes by. Alongside this growth, the need for access to web resources by those with some sort of disability is also increasing. The web is very important for spreading information and for interaction between the various elements of society. Given this, it is mandatory that the web presents itself as a totally accessible resource, so that it can help disabled citizens integrate into society. This obligation should be even greater for enterprises because most of them use the web as a marketing and business platform. This document is meant to be a position paper comparing the results of web accessibility evaluations of Portuguese websites using version 1 and version 2 of the W3C Web Content Accessibility Guidelines.

Area 3 - Society, e-Business and e-Government

Full Papers
Paper Nr: 53

WEB SITE BRAND ATTRIBUTES AND E-SHOPPER LOYALTY: A comparative study of Spain and Scotland


Sandra Loureiro and Silvina Santana

Abstract: This study examines the impact of web site brand personality, web site brand association, web site brand image, and web site brand relationship on e-shopper loyalty to the web site. The model was estimated on data from consumers of online products in Spain and Scotland using the PLS technique. The findings suggest that web site brand association and web site brand personality are good predictors of web site brand image. However, web site brand image does not explain the intention of Spanish students to recommend a web site and to use it to buy again.

Paper Nr: 111

A Framework for Delivering Personalized e-Government Tourism Services


Malak Al-hassan, Helen Lu and Jie Lu

Abstract: E-government (e-Gov) has become one of the most important parts of government strategies. Significant efforts have been devoted to e-Gov tourism services in many countries because tourism is one of the major profitable industries. However, the current e-Gov tourism services are limited to simple online presentation of tourism information. Intelligent e-Gov tourism services, such as personalized e-Gov (Pe-Gov) tourism services, are highly desirable for helping users decide "where to go, and what to do/see" among a massive number of destinations and an enormous range of attractions and activities. This paper proposes a framework of Pe-Gov tourism services using recommender system techniques and semantic ontology. This framework has the potential to enable tourism information seekers to locate the most interesting destinations with the most suitable activities with the least search effort. Its workflow and some outstanding features are depicted with an example.

Paper Nr: 169



Akira Kawaguchi, Andrew Nagel, Chiu Chan and Neville Parker

Abstract: This paper discusses the implementation of one type of information system for the New York City bus transit service, as a case study in providing value-added transportation services for people with impaired mobility. Information technology is a key tool for finding flexible transportation services, especially for disabled people. Useful information gives these vulnerable people psychological reassurance and makes them feel safer and more secure. Residents in metropolitan areas increasingly rely on the convenience of public transportation, and they are becoming used to exchanging information relevant to their regional community in online settings. Improving transit accessibility requires the same kind of cooperation between transportation companies, local businesses, and residents. The widespread use of mobile wheelchairs has a socioeconomic impact. The longer-term significance of this research lies in its implications for adapting this kind of intelligent model to future welfare or assistive activities.

Short Papers
Paper Nr: 12

Evaluation of Collecting Reviews in Centralized Online Reputation Systems


Ling Liu, Malcolm Munro and William Song

Abstract: Background: Centralized Online Reputation Systems (ORS) have been widely used by internet companies. They collect users’ opinions on products, transactions and events as reputation information, then aggregate and publish the information to the public. Aim: Studies of reputation system evaluation to date have tended to focus on isolated systems or on their aggregating algorithms only. This paper proposes an evaluation mechanism to measure different reputation systems in the same context. Method: Reputation systems naturally have differing interfaces and track different aspects of user behavior; however, from an information system perspective, they all share five underlying components: Input, Processing, Storage, Output and Feedback Loop. Therefore, reputation systems can be divided into these five components and measured by the properties of each. Results: The paper concentrates on the evaluation of Input and develops a set of simple formulas to represent the cost of reputation information collection. This is then applied to three different sites, and the resulting analysis shows the pros and cons of the differing approaches of each of these sites.

Paper Nr: 14

Is Internet Access a Human Right?


Anh Tuan Nuyen

Abstract: In June 2009, the highest court of France, the Constitutional Council, declared internet access to be a basic human right. Many people are now campaigning to have it recognized as a human right by the United Nations, along with those human rights already recognized by the world body. The main motivation behind the campaign is the desire to close the digital divide, particularly that between rich and poor nations. However, while having internet access recognized as a human right might go some way towards addressing the digital divide issue, the theoretical case for recognition has not been clearly established. Without a solid theoretical case, recognizing something to be a human right is a misunderstanding of the nature of that something as well as of human rights. The former kind of misunderstanding may result in misdirected efforts at promoting the activity in question and the latter in a debasement of human rights. This paper will provide an account of human rights and will argue that, on the basis of such an account, internet access is not a human right, even though it is an important right in itself and one that enables the promotion of other human rights.

Paper Nr: 24



HyunSook Ahn, DongMan Lee and Hyunsun Park

Abstract: This study investigates how government support influences the performance of e-business companies. Drawing on previous studies, funding support for technology development and marketing support, which currently account for the biggest part of the support provided by the Korean government to the e-business sector, were selected as independent variables. Meanwhile, performance indicators specific to e-business, such as human resources development, competitiveness enhancement, profitability, and growth in technology assets, were chosen as dependent variables. The data was collected through a survey of CEOs and executives of e-business companies that had received or were receiving government support. Funding support for technology development had a positive influence on competitiveness enhancement, profitability, and technology asset growth. Marketing support, while it had a significant influence on competitiveness enhancement and technology asset growth, proved to have no measurable effect on profitability.

Paper Nr: 26

The Twittering Machine


Miranda Mowbray

Abstract: This paper is a study of the use of Twitter by automated agents, based on data sampled in July-September 2009. It discusses the dramatic rise in rapidly-tweeting automated Twitter accounts beginning in late June 2009; some surprising behaviour by automated Twitter profiles that make direct use of Twitter’s API; and techniques used for automated spamming on Twitter. Ideas are suggested for ways in which Twitter might defend against some common types of automated Twitter spam. The paper ends by outlining some general conclusions for designers of social information systems.

Paper Nr: 55



Juha Puustjärvi and Leena Puustjärvi

Abstract: Many studies have indicated that most patients are not satisfied with the medical treatment information on the Web, though many e-health tools provide links to materials or other websites that have information about patients’ health conditions or medications. In addition, many studies have demonstrated that patients should have easy access to their own health information as well as to any information they need in order to make decisions about their own health care. However, while there are a variety of tools for managing and sharing medical information, no integrated tool for health information management and sharing has been developed. Meeting this challenge requires a means to capture and interconnect information from various sources that are relevant to one patient, and to create a personal health space containing links to the health information that is related to the customer or in which the customer is interested. In this paper we describe our work on developing a personal health assistant, which integrates the tools supporting personal health records, information therapy and health-oriented blogs. Technically, the personal health assistant is based on knowledge management technologies, and it is easily extensible to capture additional e-health tools.

Paper Nr: 96



Daniel Vecchiato, Maria Beatriz Toledo, Itana Gimenes and Marcelo Fantinato

Abstract: Electronic contracts (e-contracts) usually describe cross-organizational business processes, defining electronic services to be provided and consumed as well as constraints on service execution such as Quality of Service (QoS). Due to market dynamism, organizations involved in a cooperation commonly need to make adjustments to a pre-established e-contract. These changes should be allowed through renegotiation of contractual clauses after the e-contract has been signed and while it is being enacted. In this paper, feature modeling is used to represent electronic services (e-services), QoS attributes and control operations to be applied when QoS attribute levels are not met. In addition, an execution environment is proposed to support contract establishment, business process execution, service monitoring and contract renegotiation.

Paper Nr: 108

SECURITY IN E-BUSINESS: Understanding Customers' Perceptions and Concerns


Ja'far Alqatawna, Mohammed Hjouj Btoush and Jawed Siddiqi

Abstract: It has become apparent to many security researchers that traditional security approaches are not sufficient to provide adequate security for today's pervasive electronic business environment. We and others argue that security is a socio-technical problem whose social components are not sufficiently addressed or understood. Our contribution aims to overcome this problem by developing a better understanding of online customers’ security perceptions in Jordan. An interpretive approach is employed, and a general inductive coding process is used to analyse the collected data. On the basis of this study's findings we argue that many customer-related aspects need to be considered in order to elevate e-Business security. These aspects include perceptions and concerns as well as knowledge of and interaction with other stakeholders.

Paper Nr: 110



Adrian Buzgar and Sabin-Corneliu Buraga

Abstract: This paper proposes UPCITY, a Service-Oriented Architecture for eGovernment. UPCITY tracks the stages of a local community problem-solving workflow on an interactive map, by using a zoomable user interface, as well as a timeline to add a temporal dimension to data present in the system. Usability related features, as well as interoperability with popular social networks, are used to encourage citizen participation. We provide an extensible platform by means of a flexible plug-in system, exemplified by an epidemic tracker.

Paper Nr: 115

Information Cards and Affirmative Statements


Mario Ivkovic and Martin Centner

Abstract: E-government services require strong methods of identification and authentication in order to protect personal rights and to comply with corresponding laws. The requirements for the authentication process can be fulfilled by electronic signatures. Identification in e-government applications often relies on government-issued identifiers provided by electronic identity (eID) cards. An eID card with signature creation capabilities is typically called a Citizen Card. The Information Cards technology, a recently introduced user-centric identity management framework, is gaining more and more importance in the field of eID. Given the expected importance of Information Cards in the future, it would be reasonable to utilize them for e-government services. In this paper we present an approach to using Citizen Cards together with Information Cards for identification and authentication in e-government services.

Paper Nr: 129



Falk Scheiding, Melanie Stiller, Jan Finzen and Claudia Dukino

Abstract: In recent years, providers of e-business software have started tailoring their solutions to the needs of SMEs, e.g. smaller sized ERP and CRM systems. However, for many SMEs, e-business systems are still too expensive or require a lot of effort on the SME’s side. As SMEs often do not employ IT specialists who possess the necessary skills and knowledge to evaluate and select an appropriate e-business software system that fits the company’s needs, the need for external support becomes evident. The eBSN eBusiness Solutions Guide is an online tool that specifically helps SMEs find suitable e-business solutions. It is equipped with different search algorithms and offers an e-business competence calculator. The paper introduces the tool and focuses on the methods and concepts used to match the offers of e-business suppliers with SME needs.

Paper Nr: 166

New Tide of eCommerce: Case from China


June Lu, Chun-Sheng Yu and Xue-Bin Dong

Abstract: Electronic commerce (eCommerce) is expected to play an increasingly important role in the 21st century global market. This paper describes a case based on interviews with the CEO of an SME implementing EC business in the East Coastal Area of China. Discussion of the case is based on Molla and Licker’s (2005) six EC stages of growth and their Perceived E-Readiness Model (PERM) for developing countries. The findings help to explore the eCommerce status, major contributing factors, and unique values of eCommerce for China. The findings should also be important for studying eCommerce trends in the emerging markets.

Paper Nr: 172



Elin Wihlborg, Linus Johansson Krafve and Ulf Melin

Abstract: This paper shows how e-government can, or might even have to, be considered as a public policy transformation. In the process of merging authorities into new organisations, public policies on e-government appeared as a key activity. The case study presented in the paper is the formation of the new Swedish Transport Agency, formed out of several formerly independent authorities. The Swedish case represents a mature public administration with basic democratic core values. The main contribution of the case study is to point out the importance of translating policies into organizational practices.

Paper Nr: 188



Huayu Zhang, Agnes Koschmider and Andreas Oberweis

Abstract: Social networks are known to stimulate the exchange and sharing of information among peers. Moreover, social networks can initiate cooperation (e.g., people sharing music) and collaboration (e.g., searching for collaborators for research work). However, social networks are not widely used as work resources (e.g., for help or support requests), mostly due to missing coordination mechanisms. This paper describes how collaboration can be coordinated in social networks. The proposed way to achieve this is based on the use of a set of activity lists of social network members. An activity list specifies all personal activities required to reach a collaborative output. Based on the activity lists, a process model can be generated that controls and analyzes the coordination. Activities requiring collaboration are performed using the social network. The approach is illustrated with a use case.

Paper Nr: 201



Christian Baumann, Paul Peitz, Oliver Raabe and Richard Wacker

Abstract: As advances in Web-based infrastructures allow applications to be composed of services from different providers across the Internet, it is not possible to foresee legal requirements for every situation. Therefore, new legal challenges arise for modular applications in an Internet of Services. However, since such service-based systems are becoming more and more self-describing through sophisticated description schemas, we propose to apply standard legal methodology to this situation. By formalizing legal norms and the process of legal assessment to obtain legal rights and obligations, we envision an autarchic system which can subsume service description facts under the terms of legal regulations in order to obtain legal consequences. This paper contributes the scientific concept for transferring legal methodology, as known in the offline world for decades, to a distributed and modular online business world, which composes its applications dynamically with services from different providers.

Paper Nr: 7



Ashutosh Tiwari, Rafael Navarro Fontestad and Christopher Turner

Abstract: Today, information overload and the lack of systems that provide employees with the right knowledge and skills are common challenges that large organisations face. This can lead to knowledge workers re-inventing the wheel due to problems in the retrieval of information from both internal and external sources. Web 2.0 tools aim to address this type of issue by facilitating collaboration and knowledge sharing in a corporate setting. This paper describes the benefits and constraints associated with the use of Web 2.0 tools and examines the drivers behind the adoption of such tools in industry. A number of landscape overview models are presented here that attempt to describe the effect of using Web 2.0 tools on a knowledge-based organisation. An organisation active in the construction industry is the focus of a case study in which Web 2.0 tools are matched to real knowledge sharing and collaboration problems.

Paper Nr: 59



Francesc Miralles, Ferran Giones and Rosa Rudo

Abstract: The effect of cultural values on IT adoption has attracted growing interest in recent years. Researchers posit that cultural values can shed additional light on the factors that determine IT user acceptance and use. In this research-in-progress work, the authors propose a model based on previous user acceptance theories to develop a research study into the role that individual cultural values play in the adoption of those social network features that threaten users' privacy the most. The authors posit that adoption of the features that are most critical from the point of view of users' privacy can be explained from the perspective of individuals' cultural values. In this preliminary work, the authors have developed the model and have drawn up a set of hypotheses. In the following steps of the research, the authors will develop a survey to start the quantitative research.

Paper Nr: 65

Context-Oriented Knowledge Management for Intelligent User Assistance in Smart Space


Alexey Kashevnik, Nikolay Shilov and Alexander Smirnov

Abstract: Topics such as smart homes and smart cars have become widespread recently. The paper presents an innovative approach to context-oriented knowledge management in the smart space. The smart space consists of a set of devices that can interact with each other and exchange information and services. Knowledge management in such systems allows coordinating the activities of a large number of entities that can communicate within the smart space.

Paper Nr: 89



Marco Remondino and Marco Pironti

Abstract: To understand the adoption of collaborative systems, it is of great importance to know about the economic effects of collaboration itself. Decision makers should be able to evaluate potential drawbacks and advantages of collaboration: strategies may be seen as a mixture of cost reduction, product differentiation and improvement of decision making and/or planning. In this context, information technology may help a firm to create sustained competitive advantages over competitors. It is less clear whether collaboration is of any use in such an environment. According to the Economics literature, the most important factors affecting the benefits of collaboration are market structure, the kind and degree of uncertainty faced by the firms, their risk preferences and their collaboration propensity. The results depend on the way these factors are combined. We present a microeconomic model and use techniques from game theory for the analysis. The way the model is constructed will allow the derivation of closed-form solutions. Traditional learning models either cannot represent individualities in a social system or represent all of them in the same way, i.e. as focused and rational agents; they do not represent individual inclinations and preferences. Results indicating whether collaboration in various areas makes sense will be obtained. This makes it possible to judge the potential of available collaborative technology. The basic model presented may be extended in various ways.

Paper Nr: 165

COLLABORATIVE OBSERVATIONS OF WEATHER: A Weather Information Sharers’ Community of Practice


Katarina Elevant

Abstract: Besides its occasional disastrous impacts, weather also affects daily life. Societal and environmental challenges of the future include both providing customized weather information in time according to users’ needs, and detecting climate change and its impacts on land and ecosystems. The accuracy of weather and climatic information is, however, limited by spatial and temporal borders that need to be overridden. Also, weather information services cannot be fully customized due to the spatial inaccuracy of weather forecasts and observations. Here, the role of social media, collective and civic intelligence and crowdsourcing should be investigated. This paper envisions a community of practice where a social network of weather-interested users provides usable observations of weather and environmental change. User-generated weather observations can be processed based on principles of collective intelligence and co-creation, in order to improve, customize and personalize weather information. The paper presents the interface of a web weather 2.0 service, a new method to collect weather and climatic information. Additionally, methods for turning the input into usable weather information are suggested, and particular attention is given to the motivation to contribute user-generated weather observations.

Paper Nr: 193



Mário Rodrigues, Gonçalo Paiva Dias and António Teixeira

Abstract: Effective provision of government services implies that, besides being provided online, services become available through other channels, are organized according to citizens' expectations, are accessible to everyone, anytime and anywhere, and include information from unstructured sources. It is also essential to provide the tools that allow citizens to correctly identify the services they need. In this paper we discuss how e-gov service delivery can be improved by using human language technologies. We argue that these technologies can contribute to: delivering services in more inclusive ways; providing human-centered and multilingual service and support; and including non-structured information scattered across different sources.

Paper Nr: 204



Noel Carroll, Ita Richardson and Eoin Whelan

Abstract: The unprecedented growth in service-based business processes over a short period of time has underscored the need to understand the mechanisms, and to theorise the business models and business process management, adopted across many organisations today. This research presents a survey of the literature and argues that the inability of current Business Process Management (BPM) techniques to visualise and monitor web-enabled business processes prevents us from transforming information on network activity and infrastructures, thus preventing managers from anticipating change and adapting to more agile business practices. This research therefore proposes the need to develop a framework to enhance managers’ ability to monitor key performance indicators (KPIs) while improving business process restructuring practices. This paper reports on the current status of the research, which is largely derived from the literature review to date.

Area 4 - Web Intelligence

Full Papers
Paper Nr: 27

SENTIMENT ANALYSIS RELOADED: A Comparative Study On Sentiment Polarity Identification Combining Machine Learning And Subjectivity Features


Ulli Waltinger

Abstract: This paper presents an empirical study on machine learning-based sentiment analysis. Though polarity classification has been extensively explored at different document-structure levels (e.g. document, sentence, word), little work has been done investigating feature selection methods and subjectivity resources. We systematically analyze four different English subjectivity resources for the task of sentiment polarity identification. While the results show that the size of the dictionaries clearly correlates with polarity-based feature coverage, this property does not correlate with classification accuracy. Polarity-based feature selection that considers a minimum amount of prior polarity features, in combination with SVM-based machine learning methods, exhibits the best performance (acc = 84.1; F1 = 83.9) in comparison to the classical approaches to polarity identification. Based on the findings of the English-based experimental setup, a new German subjectivity resource is proposed for the task of German-based sentiment analysis. With F1 = 85.9, the results of the experiments show its good adaptability to the new domain.

Paper Nr: 46

GRSK: A generalist recommender system


Inma Garcia, Laura Sebastia, Sergio Pajares and Eva Onaindía de la Rivaherrera

Abstract: This paper describes the main characteristics of GRSK, a Generalist Recommender System Kernel. It is a RS based on the semantic description of the domain, which allows the system to work with any domain as long as the data of this domain can be defined through an ontology representation. GRSK uses several Basic Recommendation and Hybrid Techniques to obtain the recommended items. Through the GRSK configuration process, it is possible to select which techniques to use and to parameterize different aspects of the recommendation process, in order to adjust the GRSK behavior to the particular application domain. The experimental results will show that GRSK can be successfully used with different domains.

Paper Nr: 48

Probability-based Extended Profile Filtering: An Advanced Collaborative Filtering Algorithm for User-Generated Content


Toon De Pessemier, Kris Vanhecke, Simon Dooms, Tom Deryckere and Luc Martens

Abstract: The enormous offer of (user-generated) content on the internet and its continuous growth make the selection process increasingly difficult for end-users. This overabundance of content can be handled by a recommendation system that observes user preferences and assists people by offering interesting suggestions. However, present-day recommendation systems are optimized for suggesting premium content and partially lose their effectiveness when recommending user-generated content. The transitoriness of the content and the sparsity of the data matrix are two major characteristics that influence the effectiveness of the recommendation algorithm and by which premium and user-generated content systems can be distinguished. Therefore, we developed an advanced collaborative filtering algorithm which takes into account the specific characteristics of user-generated content systems. As a solution to the sparsity problem, inadequate profiles will be extended with the most likely future consumptions. These extended profiles will increase the profile overlap probability, which will increase the number of neighbours in a collaborative filtering system. In this way, the personal suggestions are based on an enlarged group of neighbours, which makes them more precise and diverse than traditional collaborative filtering recommendations. This paper explains the proposed algorithm in detail and demonstrates its improvements over standard collaborative filtering algorithms.

Paper Nr: 87



Gang Liu, Zhi Lu, Tianyong Hao and Wenyin Liu

Abstract: An automatic method for annotating text with semantic labels is proposed for question answering systems. The approach first extracts the keywords from a given question. A semantic label selection module is then employed to select the semantic labels used to tag the keywords. In order to disambiguate keywords with multiple senses and assign semantic labels to them, a Bayesian method is used that refers to historically annotated questions. If there is no corresponding label, WordNet is then employed to obtain appropriate candidate labels by calculating the similarity between each keyword in the question and the concept list in our predefined Tagger Ontology. This ontology is designed to organize semantic labels in a hierarchical structure with only two levels, and all concepts in the ontology are mapped to WordNet correspondingly. Experimental results show that the precision of this keyword annotation method is 76% on average.

Paper Nr: 113

Ad-hoc Georeferencing of Web-pages Using Street-name Prefix Trees


Andrei Tabarcea, Ville Hautamäki and Pasi Fränti

Abstract: A bottleneck in constructing location-based web searches is that most web pages do not contain any explicit geocoding such as geotags. An alternative solution can be based on ad-hoc georeferencing, which relies on street addresses, but the problem is how to extract and validate the address strings from free-form text. We propose a rule-based solution that detects address-based locations using a gazetteer and street-name prefix trees created from the gazetteer. We compare this approach against a method that doesn’t require a gazetteer (a heuristic method that assumes that a street name has a certain structure) and a method that also uses data structures created from the gazetteer in the form of street-name arrays. Experiments using our location-based search engine prototype (MOPSI) for Finland and Singapore show that the proposed prefix-tree solution is twice as fast and 10% more accurate than its rule-based alternative and 10 times faster if an array structure is used when accessing the gazetteer.
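
As an illustration of the street-name prefix tree mentioned in this abstract, the following is a minimal sketch and not the MOPSI implementation. The class `StreetTrie`, the function `find_streets`, the `"$end"` marker and the sample street names are all hypothetical; the sketch only assumes that the gazetteer is available as a plain list of street names.

```python
class StreetTrie:
    """Prefix tree over lower-cased street names, one character per edge."""

    def __init__(self, street_names):
        self.root = {}
        for name in street_names:
            node = self.root
            for ch in name.lower():
                node = node.setdefault(ch, {})
            # Multi-character key, so it cannot collide with a one-character edge.
            node["$end"] = name

    def longest_match(self, lowered, start):
        """Return the longest gazetteer name starting at lowered[start], or None.

        `lowered` must already be lower-cased. A match is accepted only if it
        ends at a word boundary, so "Main Streetcar" does not match "Main Street".
        """
        node, best = self.root, None
        for pos in range(start, len(lowered)):
            ch = lowered[pos]
            if ch not in node:
                break
            node = node[ch]
            end = pos + 1
            at_boundary = end == len(lowered) or not lowered[end].isalnum()
            if "$end" in node and at_boundary:
                best = node["$end"]
        return best


def find_streets(text, trie):
    """Scan free-form text and collect every gazetteer street name found."""
    lowered = text.lower()
    found, i = [], 0
    while i < len(lowered):
        at_word_start = i == 0 or not lowered[i - 1].isalnum()
        match = trie.longest_match(lowered, i) if at_word_start else None
        if match:
            found.append(match)
            i += len(match)  # skip past the matched name
        else:
            i += 1
    return found
```

Each lookup walks the tree one character at a time and stops at the first character that no gazetteer entry shares, which is the property that makes a prefix tree cheaper than comparing the text against every entry of an array.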

Paper Nr: 114



Viktor De Boer, Maarten Van Someren and Tiberiu Lupascu

Abstract: To automatically classify and process web pages, current systems use the textual content of those pages, including both the displayed content and the underlying (HTML) code. However, a very important feature of a web page is its visual appearance. In this paper, we show that using generic visual features we can classify the web pages for several different types of tasks. The features used in this document are simple color and edge histograms, Gabor and texture features. These were extracted using an off-the-shelf visual feature extraction method. In three experiments, we classify web pages based on their aesthetic value, their recency and the type of website. Results show that these simple, global visual features already produce good classification results. We also introduce an online tool that uses the trained classifiers to assess new web pages.

Paper Nr: 143



Arnaud Renard, Sylvie Calabretto and Béatrice Rumpler

Abstract: Nowadays, semantics is one of the greatest challenges in the evolution of IR systems, including the (semi-)structured IR systems considered here. Usually, this challenge requires an additional external semantic resource related to the document collection. In order to compare concepts and, from a wider point of view, to work with semantic resources, it is necessary to have semantic similarity measures. Similarity measures assume that the concepts related to the terms have been identified without ambiguity. Therefore, misspelled terms interfere with the term-to-concept matching process. Existing semantics-aware (semi-)structured IR systems rely on basic concept identification but do not account for uncertainty in term spelling. We choose to deal with this last aspect, and we suggest a way to detect and correct misspelled terms through a fuzzy semantic weighting formula which can be integrated into an IR system. In order to evaluate the expected gains, we have developed a prototype whose first results on small datasets seem interesting.

Short Papers
Paper Nr: 18

Applied Visual Exploration on Real-time News Feeds Using Polarity and Geo-spatial Analysis


Milos Krstajic, Peter Bak, Daniela Oelke, Martin Atkinson, William Ribarsky and Daniel Keim

Abstract: This paper presents a visual analytics approach to explore large news article collections in the domains of polarity and spatial analysis. The exploration is performed on the data collected with Europe Media Monitor (EMM), a system which monitors over 2500 online sources and processes 90,000 articles per day. By analyzing the news feeds, we want to find out which topics are important in different countries and what the general polarity of the articles within these topics is. To assess the polarity of a news article, automatic techniques for polarity analysis are employed and the results are represented using Literature Fingerprinting for visualization. In the spatial description of the news feeds, every article can be represented by two geographic attributes, the news origin and the location of the event itself. In order to assess these spatial properties of news articles, we conducted our geo-analysis, which is able to cope with the size and spatial distribution of the data. Within this application framework, we show how real-time news feed data can be analyzed efficiently.

Paper Nr: 22



Alejandro Figueroa

Abstract: This work presents a data-driven definition question answering (QA) system that outputs a set of temporally anchored definitions as answers. A peculiarity of this class of incipient definition QA system is that it has to be lightweight as it is usually compelled to process several documents when searching for answers. The system in this work builds surface language models on top of a corpus automatically acquired from Wikipedia abstracts, and ranks answer candidates in agreement with these models afterwards. Additionally, this study deals at greater length with the impact of several surface features in the ranking of temporally anchored answers.

Paper Nr: 28

Decay-based Ranking for Social Application Content


George Papadakis, Claudia Niederee and Wolfgang Nejdl

Abstract: Social applications are prone to information explosion, due to the proliferation of user-generated content. Locating and retrieving information in their context poses, therefore, a great challenge. Classical information retrieval methods are, however, inadequate in this environment, and users inevitably drown in an information flood. In this paper, we present a novel method that facilitates users’ information quests by identifying and improving the accessibility of the most important resources. This is achieved through an information valuation method that estimates how likely it is for each information item to be accessed in the near future. The experiments verify that our method performs significantly better than others typically used in social applications, while also being more versatile.

Paper Nr: 45

Web Analytics - Analysing, Classifying and Describing Web Metrics with Fuzzy Logic


Darius Zumstein

Abstract: In the Internet economy, it has become a crucial task of electronic business to monitor and optimize websites, their usage and online marketing success. Web analytics, which is defined as the measurement, collection, analysis and reporting of Internet data, is an effective instrument of website management. First, this paper describes the technical functionality and use of web analytics and discusses different web metrics. Second, a fuzzy web analytics approach is proposed, which makes it possible to classify metrics precisely into more than one class at the same time. Third, a fuzzy web metrics index has been developed for multi-dimensional, intelligent web analysis. Fuzzy logic enables computing with words and more intuitive, human-oriented queries, segmentation and descriptions of metrics in natural language. Finally, a web analytics framework is suggested to analyze and control key performance indicators in a web controlling loop.
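
To illustrate how a fuzzy approach can place one metric value in more than one class at the same time, here is a minimal sketch. The choice of bounce rate as the metric, the class names and the breakpoints (20, 40, 60, 80 percent) are assumptions made for illustration only and are not taken from the paper.

```python
def rising(x, lo, hi):
    """Membership that is 0 below lo, 1 above hi, and linear in between."""
    return min(1.0, max(0.0, (x - lo) / (hi - lo)))


def falling(x, lo, hi):
    """Mirror image of rising(): 1 below lo, 0 above hi."""
    return 1.0 - rising(x, lo, hi)


def classify_bounce_rate(rate_percent):
    """Return the degree of membership of a bounce rate in each fuzzy class.

    The breakpoints are hypothetical; a real deployment would tune them
    per website.
    """
    return {
        "low": falling(rate_percent, 20, 40),
        "medium": min(rising(rate_percent, 20, 40), falling(rate_percent, 60, 80)),
        "high": rising(rate_percent, 60, 80),
    }


if __name__ == "__main__":
    # A 30% bounce rate lies halfway between the "low" and "medium" classes.
    print(classify_bounce_rate(30))
```

With sharp class boundaries a value of 30% would have to be assigned to exactly one class, whereas the memberships above express that it is partly low and partly medium.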

Paper Nr: 50

BALANCING ADAPTIVE CONTENT WITH AGENTS: Modeling and reproducing group behavior as computational system


Harri Ketamo

Abstract: To ensure the quality of adaptive content, there should be continuous testing during the development phase. One of the most important reasons to empirically test the content during the development phase is the balance of the adaptive framework. Empirical testing is time-consuming and in many cases several iterative cycles are needed. In 2007 we started to develop methods of testing in a computational test bench. The idea for speeding up the production process was based on software agents that could behave like a real user community. The study shows that we can construct very reliable artificial behaviour when comparing it to human behaviour at group level. In the design phase's usability tests, we are especially interested in group behaviour rather than single actions, which means that the method suits its purpose.

Paper Nr: 83

CORD: A Hybrid Approach for Efficient Clustering of Ordinal Data using Fuzzy Logic and Self-Organizing Maps


Natascha Hoebel and Stanislav Kreuzer

Abstract: This paper presents CORD, a hybrid clustering system which combines modifications of three modern clustering approaches to create a hybrid solution that is able to efficiently process very large sets of ordinal data. The Self-organizing Maps algorithm for categorical data by Chen and Marques is used here for a rough preclustering to find the initial position and number of centroids. The main clustering task utilizes a k-modes algorithm and its fuzzy set extension described by Kim et al. for categorical data using fuzzy centroids. Finally, to deal with large amounts of data, the BIRCH algorithm described by Zhang et al. for efficient clustering of very large databases (VLDBs) is adapted to ordinal data. BIRCH can be used as a preliminary phase for both Fuzzy Centroids and NCSOM. Both algorithms profit from this symbiosis, as their iterative computations can be done on data that is fully held in main memory. Combining these approaches, the resulting system is able to extract significant information efficiently even from very large datasets. The presented reference implementation of the hybrid system shows good results. The aim is to cluster and visually analyze large amounts of user profiles. This should help in understanding Web user behavior and personalizing advertisement.

Paper Nr: 158

Supporting Information Retrieval in RSS Feeds


Nacéra Bennacer, Georges Dubus and Mathieu Bruyen

Abstract: Really Simple Syndication (RSS) information feeds present new challenges to information retrieval technologies. In this paper we propose an RSS feed retrieval approach which aims to give the user a personalized view of items and to make access to their content easier. In our proposal, we define different filters in order to construct the vocabulary used in the text describing feed items. This filtering takes into account both the lexical category and the frequency of terms. The set of feed items is then represented in an m-dimensional vector space. The k-means clustering algorithm, with an adapted centroid computation and distance measure, is applied to find clusters automatically. The clusters, indexed by relevant terms, can then be refined, labeled and browsed by the user. We evaluate the approach on a collection of feed items collected from news sites. The resulting clusters show good cohesion and separation. This provides meaningful classes to organize the information and to classify new feed items.
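
The pipeline outlined in this abstract (vocabulary filtering, vector representation, k-means) can be sketched roughly as follows. This is an illustrative simplification, not the authors' method: a stopword set stands in for the lexical-category filter, cosine distance and a plain mean centroid stand in for the adapted distance and centroid computation, and the first k vectors are used as initial centroids so that the result is deterministic. All function names and parameters are hypothetical.

```python
import math
from collections import Counter


def build_vocabulary(docs, min_df=2, stopwords=frozenset()):
    """Keep terms that occur in at least min_df documents and are not stopwords."""
    df = Counter(term for doc in docs for term in set(doc.lower().split()))
    return sorted(t for t, n in df.items() if n >= min_df and t not in stopwords)


def vectorize(doc, vocab):
    """Represent a document as term counts over the m vocabulary terms."""
    counts = Counter(doc.lower().split())
    return [float(counts[t]) for t in vocab]


def cosine_distance(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0  # treat an empty vector as maximally distant
    return 1.0 - dot / (norm_u * norm_v)


def kmeans(vectors, k, iterations=10):
    """Plain k-means with cosine distance; returns one cluster label per vector."""
    centroids = [list(v) for v in vectors[:k]]  # deterministic initialisation
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: attach each vector to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: cosine_distance(v, centroids[c]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels
```

A typical use would build the vocabulary from all item descriptions, vectorize each item, and pass the vectors to `kmeans`; the terms with the largest centroid weights could then serve as cluster labels for browsing.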

Paper Nr: 199

BEYOND OPINION MINING: How Can Automatic Online Opinion Analysis Help in Product Design?


Ying Liu

Abstract: The rapid development of the WWW, information technology and e-commerce has made Internet forums, e-opinion portals and personal blogs widely accessible to consumers. As a result, it has become extremely popular for consumers to share their experience and point out their preferences and concerns with respect to a specific product on the Web. These online customer reviews contain vital information from which product designers can gain insights into their customers and products, and make improvements accordingly. However, the sheer amount of data, their distributed locations and the inherent ambiguity of human language have challenged designers greatly. In this paper, we aim to outline an intelligent system that is able to automatically gather global online reviews with respect to certain products of interest, identify the product features and customer requirements, and, most importantly, relate them to the product’s engineering characteristics through quality function deployment (QFD), a tool that is widely used by product designers in the customer-driven design paradigm. We also highlight the challenges and relevant research issues involved in fulfilling such an ambition. As a pioneer study, we believe that this research will greatly help designers in the era of global competition and e-commerce.

Paper Nr: 203

Bio-inspired data placement in peer-to-peer networks


Benoît Romito, François Bourdon and Hugo Pommier

Abstract: In this paper we present the benefits of using a multi-agent system to manage data placement in a decentralized storage application. In our model, after a fragmentation step, each piece of data is associated with a mobile agent that makes its own decisions. To manage agent placement, we apply flocking rules in a peer-to-peer network called SCAMP. Each agent follows simple rules and the emerging behavior is a flock of fragments. To provide efficient load balancing, agents drop pheromones among network peers. We ran experiments to measure the cohesion degree of our flock and the network coverage of a flock. We also discuss the availability and reliability of our approach.

Paper Nr: 205

Learning from ‘Tag Clouds’: A Novel Approach to build Datasets for Memory-Based Reasoning Classification of Relevant Blog Articles


Ahmad Ammari and Valentina Zharkova

Abstract: The advent of the Social Web has created massive online media by turning former information consumers into present information producers. The best example is the blogosphere. Blog websites are collections of articles written by millions of blog writers for millions of blog readers. Blogging has become a very popular means for Web 2.0 users to communicate, express, share, collaborate, and debate through their blog posts. However, as a consequence of the massive number of blogs and the diverse topics of blog posts available on the Web, most blog search engines face the serious challenge of finding the blog articles that are truly relevant to the particular topic that blog readers are looking for. To help handle this problem, an intelligent approach to blog post search that takes advantage of the concept of ‘tag clouds’ and leverages many open source libraries has been proposed. A Memory-Based Reasoning model has been built using SAS Enterprise Miner to assess the effectiveness of the approach. Results are very encouraging, as retrieval precision indicates a significant improvement in retrieving relevant posts for the user compared with traditional means of blog post retrieval.

Paper Nr: 13

Automatic Generation of On-line Conceptual Assessment Courses using TagHelper


Ismael Pascual-Nieto and Diana Perez-Marin

Abstract: TagHelper is a verbal data analysis application. It is based on the use of the Weka toolkit. It is able to classify sentences as one of a set of categories previously introduced into the system. TagHelper has been used to support data analysis in English, German, and Chinese. TagHelper has been recently extended to support Spanish too. The Will Tools are a set of web-based learning tools able to automatically assess students' free-text answers written in Spanish or in English. In this paper, we describe a new procedure to generate a conceptual assessment course in the format required by the Will Tools automatically from web data using TagHelper in Spanish. The procedure has been successfully implemented, and two different courses have already been generated.

Paper Nr: 32



Fahad Kalil, Edimar Manica, Daniel Lichtnow, Valderi Leithardt, Ana Marilza Pernas Fleischmann and José Palazzo M. de Oliveira

Abstract: Authorship is an important criterion for evaluating content quality. Frequently, Web users have to spend a lot of time on Web searches to find information about an author’s expertise. This paper presents an approach to help Web users in this task. The approach consists of a set of techniques to extract information about authors from the Web and an architecture for an extraction tool. An application scenario is presented in which the user can read details about a specific author of a Web page when reading the document.

Paper Nr: 95

TOWARDS RECOMMENDER SYSTEMS BASED ON KALMAN FILTERS - A new approach by state space modelling


Samuel Nowakowski, Armelle Brun and Anne Boyer

Abstract: This position paper proposes an original approach based on a new formulation of a recommender system. This formulation uses a state space description for users and web resources. States and parameters are then predicted and estimated with the two-stage algorithm of a Kalman filter. In this paper, we give the main theoretical results of this original approach.
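
For readers unfamiliar with the two stages of a Kalman filter (prediction, then correction), a minimal scalar sketch follows. It is a generic textbook filter, not the state space model of the paper: the function name `kalman_step`, the scalar state (which could be read, for example, as a user's current interest in a resource) and all default parameter values are assumptions made for illustration.

```python
def kalman_step(x, p, z, a=1.0, h=1.0, q=0.01, r=1.0):
    """Run one predict/correct cycle of a scalar Kalman filter.

    x, p : previous state estimate and its variance
    z    : new observation
    a, h : state transition and observation coefficients (assumed known)
    q, r : process and observation noise variances (assumed known)
    """
    # Stage 1, prediction: propagate the estimate through the state model.
    x_pred = a * x
    p_pred = a * p * a + q

    # Stage 2, correction: blend the prediction with the observation.
    gain = p_pred * h / (h * p_pred * h + r)
    x_new = x_pred + gain * (z - h * x_pred)
    p_new = (1.0 - gain * h) * p_pred
    return x_new, p_new


if __name__ == "__main__":
    estimate, variance = 0.0, 1.0
    for observation in [2.0, 2.1, 1.9, 2.0]:
        estimate, variance = kalman_step(estimate, variance, observation)
        print(round(estimate, 3), round(variance, 3))
```

The gain decides how far the estimate moves towards each new observation: when the prediction is uncertain relative to the observation noise the gain is large, and it shrinks as the variance of the estimate decreases.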

Paper Nr: 137

Combining Desktop Data and Web 3.0 Technologies to Profile a User


Vasileios Lapatas and Michalis Stefanidakis

Abstract: An interesting promise of Web 3.0 is the seamless integration of desktop and web spaces. Private data, locked until recently inside a user’s computer, can lead to the intelligent generation of web content. This paper presents an idea of how desktop and web data can be integrated in creative ways in the World Wide Web. As a proof of concept, an application which can profile a user based on his bookmarks is demonstrated. An existing web service is used to classify bookmarks, thus enabling platforms with limited processing power to perform the profiling process. Results gathered from the classification process indicate that even a generic classifier that can be neither trained nor fine-tuned can produce results with high accuracy. With accurate user profiles, web content can be created in an intelligent way, enabling better Web 3.0 applications.

Paper Nr: 154



Przemyslaw Jarzebowski, Maciej Dabrowski, Thomas Acton and Sean O'Riain

Abstract: Online shopping is a very goal-oriented activity. Consumers have a set of preferences for a product or service that is used as criteria for assessment of the available alternatives. However, crucial information about products is often available as text reviews. Finding a product with specific features is extremely time-consuming using the typical search functionality found in existing shopping sites. In this work we propose a method for the seamless integration of unstructured information from product reviews with structured product descriptions using opinion mining. We demonstrate our method through shopping for a used car based on 148240 car reviews. Evaluation results using a user study and simulations show that the technique enables customers to assess more product characteristics and potentially make better decisions.