WEBIST 2009 Abstracts


Area 1 - Internet Technology

Full Papers
Paper Nr: 38
Title:

Simulating the Human Factor in Reputation Management Systems for P2P Networks

Authors:

Guido Boella, Marco Remondino and Gianluca Tornese

Abstract: A compelling problem in peer-to-peer (P2P) networks for file sharing is the spreading of inauthentic files. To counter this, reputation management systems (RMS) have been introduced. These systems dynamically assign users a reputation value, which is considered when deciding whether to download files from them. RMS have been shown, via simulation, to make P2P networks safe from attacks by malicious peers spreading inauthentic files. But in large networks of millions of users, non-malicious users benefit from sharing inauthentic files because of the credit system. In this paper we show, using agent-based simulation, that reputation systems are effective only if there is widespread cooperation by users in verifying the authenticity of files, starting during the download phase, while the size of the punishment imposed by the reputation system is less relevant. This was not evident in previous works, since they make several ideal assumptions about the behavior of peers, who have to verify files to discover inauthentic ones. Agent-based simulation makes it possible to study the human factor behind the behavior of peers, in particular the advantage of spreading inauthentic files and of not checking their authenticity as soon as possible during the download, thus unwillingly cooperating in the spreading of inauthentic files.
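As a rough illustration of the mechanism this abstract describes, the sketch below updates a reputation value after each file check and uses it to decide whether to download from a peer. The update rule, the threshold and all names are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class ReputationTable:
    """Hypothetical per-peer reputation store (not the paper's model)."""
    reward: float = 1.0      # assumed credit for sharing an authentic file
    penalty: float = 5.0     # assumed punishment for an inauthentic file
    threshold: float = 0.0   # assumed minimum reputation to download from
    scores: dict = field(default_factory=dict)

    def report(self, peer: str, authentic: bool) -> None:
        """Record the outcome of a file check performed by a downloader."""
        delta = self.reward if authentic else -self.penalty
        self.scores[peer] = self.scores.get(peer, 0.0) + delta

    def should_download_from(self, peer: str) -> bool:
        """Unknown peers start at 0.0, which meets the default threshold."""
        return self.scores.get(peer, 0.0) >= self.threshold


table = ReputationTable()
table.report("peer_a", authentic=True)
table.report("peer_b", authentic=False)
```

In this sketch the punishment only takes effect when a downloader actually calls `report`, which mirrors the abstract's point that user cooperation in verifying files matters more than the size of the penalty.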

Paper Nr: 49
Title:

An Agent-Based Programming Model for Developing Client-Side Concurrent Web 2.0 Applications

Authors:

Giulio Piancastelli, Alessandro Ricci and Mattia Minotti

Abstract: Using the event-driven programming style of JavaScript to develop the concurrent and highly interactive client-side of Web 2.0 applications is showing more and more shortcomings in terms of engineering properties such as reusability and maintainability. Additional libraries, frameworks, and AJAX techniques do not help reduce the gap between the single-threaded JavaScript model and the concurrency needs of applications. We propose to exploit a different programming model based on a new agent-oriented abstraction layer, where first-class entities -- namely agents and artifacts -- can be used, respectively, to capture concurrency of activities and their interaction, and to represent tools and resources used by agents during their activities. We specialise the model in the context of client-side Web development, by characterising common domain agents and artifacts that form an extension of an existing programming framework. Finally, we design and implement a simple but significant case study to showcase the capabilities of the model and verify the feasibility of the technology.

Paper Nr: 150
Title:

SHIP – SIP HTTP INTERACTION PROTOCOL / Proposing a Thin-Client Architecture for IMS Applications

Authors:

Rene Gabner, Marco Happenhofer, Sandford Bessler and Joachim Zeiss

Abstract: IMS is capable of providing a wide range of services. As a result, terminal software becomes more and more complex to deliver network intelligence to user applications. Currently mobile terminal software needs to be permanently updated so that the latest network services and functionality can be delivered to the user. In the Internet, browser-based user interfaces ensure that an interface offering the latest services in the net is made available to the user immediately. Client software is virtualized using script and widget technologies, which allow user interfaces to run on different hardware platforms and operating systems. Our approach, called SHIP, combines the benefits of the Session Initiation Protocol (SIP) and those of the HTTP protocol to bring the same type of user interfacing to IMS. SIP (IMS) realizes authentication, session management, charging and Quality of Service (QoS); HTTP provides access to Internet services and allows the user interface of an application to run on a mobile terminal while processing and orchestration are done on the server. A SHIP-enabled IMS client only needs to handle data transport and session management via SIP, HTTP and RTP, and render streaming media, HTML and JavaScript. Furthermore, the SHIP architecture allows new kinds of applications, which combine audio, video and data within a single multimedia session.

Paper Nr: 171
Title:

RICH PRESENCE AUTHORIZATION USING SECURE WEB SERVICES

Authors:

Li Li and Wu Chou

Abstract: This paper presents an extended Role-Based Access Control (RBAC) model for rich presence authorization using secure web services. Following the information symmetry principle, the standard RBAC model is extended to support data integrity, flexible and intuitive authorization specification, efficient authorization process and cascaded authority within web services architecture. In conjunction with the extended RBAC model, we introduce an extensible presence architecture prototype using WS-Security and WS-Eventing to secure rich presence information exchanges based on PKI certificates. Applications and performance measurements of our presence system are presented to show that the proposed RBAC framework for presence and collaboration is well suited for real-time communication and collaboration.

Short Papers
Paper Nr: 18
Title:

A SIP-based Web Session Migration Service

Authors:

Michael Adeyeye, Neco Ventura and David Humphrey

Abstract: Web session handoff is one of the ways of improving the web browsing experience; other ways include the use of bookmarks and web history synchronization between two PCs. This paper discusses the implementation and evaluation of a SIP-based web session migration service. A graphical tool, called a Data Flow Diagram, is used to describe how the session migration service works. This work is compared with other existing web session migration approaches. In addition, the large-scale deployment and limitations of the service are also discussed. Although not all web sessions could be migrated, the session mobility service worked in a Peer-to-Peer environment and offered SIP functionalities within web browsers. That is, a web browser can now act as an adaptive User Agent Client to surf the Internet and set up multimedia sessions like a SIP client. In summary, it is a novel approach to web session migration in which SIP is used to transfer session data. It also borrows SIP Mobility mechanisms to introduce new services, namely content sharing and session handoff, to the web browsing experience.

Paper Nr: 119
Title:

JWIG: Yet Another Framework for Maintainable and Secure Web Applications

Authors:

Anders Møller and Mathias Schwarz

Abstract: Although numerous frameworks for web application programming have been developed in recent years, writing web applications remains a challenging task. Guided by a collection of classical design principles, we propose yet another framework. It is based on a simple but flexible server-oriented architecture that coherently supports general aspects of modern web applications, including dynamic XML construction, session management, data persistence, caching, and authentication, but it also simplifies programming of server-push communication and integration of XHTML-based applications and XML-based web services. The resulting framework provides a novel foundation for developing maintainable and secure web applications.

Paper Nr: 120
Title:

APPLICATIONS OF SERVICE ORIENTED ARCHITECTURE FOR THE INTEGRATION OF LMS AND M-LEARNING APPLICATIONS

Authors:

Miguel Á. Conde, Miguel Ángel Conde González, Marc Alier Forment and Francisco García Peñalvo

Abstract: Mobile learning applications introduce a new degree of ubiquitousness in the learning process. There is a new generation of ICT-powered mobile learning experiences that exist in isolated contexts: experiences limited to small learning communities. These emerging mobile learning experiences appear while web-based learning, especially Learning Management Systems (LMS), is consolidated and widely adopted by learning institutions, teachers and learners. The innovative techniques breeding in the experimental world of mobile learning need to be translated into the mainstream ecosystems. Mobile learning is not intended to replace e-learning or web-based learning, but to extend it. So, mobile learning applications need to be integrated somehow in the web-based LMS. Doing so requires addressing interoperability issues on both ends: the LMS and the mobile application. Web services and Service Oriented Architecture offer a standardized and effective way to achieve interoperability between systems. This paper presents an architecture that allows two-way interoperability between LMS and mobile applications: accessing LMS contents from the mobile device, and embedding part of the mobile applications inside the LMS framework. This architecture incorporates elements from well-known interoperability standards (IMS LTI and OKI) and has been validated with two projects related to the Open Source LMS Moodle.

Paper Nr: 131
Title:

INTEGRAL SECURITY MODEL FOR THE EXCHANGE OF OBJECTS IN SERVICES ORIENTED ARCHITECTURE

Authors:

Emilio Rodriguez-Priego and Francisco J. García Izquierdo

Abstract: Nowadays, security approaches and solutions for SOA focus mainly on messages and data, but they neglect code security (both service code and exchanged code). Moreover, some security aspects (e.g. validity, correctness) are usually overlooked. We state that any security approach will be incomplete if the security of both data (messages) and code (service code) is not addressed in a general sense. In this paper, we extend a previous approach to securing code in SOA. We analyze general problems related to the exchange of code and state in SOA and in the specific case of Web Services architectures. A new general model of security is presented. This model covers any aspect related to the authorship, distribution, transformation, execution and validation of both code and data.

Paper Nr: 142
Title:

PACD: A BITMAP-BASED FRAMEWORK FOR PROCESSING XML DATA

Authors:

Mohammed Al-Badawi, Siobhán North and Barry Eaglestone

Abstract: Current XML/RDBMS storage models and query processing technologies are reviewed in this paper, leading to the identification of query expressiveness and performance limitations. A novel serialized XML query processing framework is proposed to address these. The proposed query processor (called PACD) is based on a bitmap representation for XML’s structural relationships. XPath axes, plus their extension (i.e. “next” axis) for accessing the document order, are translated to sparse matrices allowing data compression, query complexity reduction and XML updates relaxation. Experimental results, outlined in this paper, show promising performance improvements over conventional techniques in a wide range of query types.
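To make the idea of a bitmap representation of XML structural relationships more concrete, the following sketch numbers elements in document order and stores the child axis as one bitset per node. It is an illustrative encoding only: the function names and the use of Python integers as bitsets are assumptions, not the PACD storage format.

```python
import xml.etree.ElementTree as ET


def build_child_bitmaps(xml_text):
    """Number elements in document order and return (tags, child_bits).

    child_bits[i] is an int used as a bitset: bit j is set when node j
    is a child of node i. Illustrative encoding, not the PACD format.
    """
    root = ET.fromstring(xml_text)
    nodes = list(root.iter())                 # pre-order == document order
    index = {id(n): i for i, n in enumerate(nodes)}
    child_bits = [0] * len(nodes)
    for node in nodes:
        for child in node:
            child_bits[index[id(node)]] |= 1 << index[id(child)]
    return [n.tag for n in nodes], child_bits


def descendants(child_bits, i):
    """Evaluate the descendant axis of node i by OR-ing child rows."""
    result, frontier = 0, child_bits[i]
    while frontier:
        result |= frontier
        next_frontier, j, remaining = 0, 0, frontier
        while remaining:
            if remaining & 1:
                next_frontier |= child_bits[j]
            remaining >>= 1
            j += 1
        frontier = next_frontier & ~result    # keep only unseen nodes
    return result


tags, bits = build_child_bitmaps("<a><b><c/></b><d/></a>")
```

Because most rows contain very few set bits, a matrix of this kind is sparse, which is the property the abstract relies on for compression.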

Paper Nr: 200
Title:

TOWARDS SERVICE ORIENTATION ON RESOURCE CONSTRAINED DEVICES

Authors:

Nils Glombitza, Carsten Buschmann, Dennis Pfisterer, Stefan Fischer and Horst Pahl

Abstract: For the flexible integration of enterprise applications and business processes, the Service Oriented Architecture (SOA) concept and the web service technology are the state of the art today. Especially for powerful hardware, a lot of web service and related technologies were developed during the last years. But with the development of Future Internet technologies, there is a demand for integrating all kinds of devices into a SOA. This includes especially devices with extremely limited resources, such as wireless sensor nodes, which are not capable of running today's web service technologies. In this paper we disclose the need for research action on different layers of the web service technology stack. We discuss promising solutions for running standard compliant web services in sensor networks and integrating sensor network and enterprise IT web services as well as BPEL business processes seamlessly. We introduce the L2D2 project in which our concept will be realized and proven.

Paper Nr: 32
Title:

XML PROCESSING. NO PARSING

Authors:

Yevgeniy Guseynov

Abstract: The main properties considered lacking from XML for a potentially efficient interchange format are Compactness and Processing Efficiency, with Parsing being the main deterrent to Processing Efficiency. The proposed Contiguous Memory Tree (CMT) and its XML API completely resolve Parsing and Processing Efficiency, permitting an efficient interchange format for XML. CMT is based on the presentation of XML documents as a tree that contiguously resides in memory and is simultaneously a stream that can be directly copied as a message and an application object that can be directly accessed through the CMT XML API. The CMT XML API does not need to read and evaluate markup or decode information items, which takes much CPU time when processing, and is thus significantly more efficient than any existing formatting schemes and SAX and DOM parsers.
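The idea of a tree that lives in one contiguous buffer and can be both sent as a message and traversed in place can be sketched as follows. The fixed-size record layout (tag id, first-child index, next-sibling index) and every name here are hypothetical; the abstract does not specify the actual CMT layout.

```python
import struct

# Assumed fixed-size record: tag id, first-child index, next-sibling index.
# An index of -1 means "none". Hypothetical layout, not the CMT format.
RECORD = struct.Struct("<iii")


def pack_tree(records):
    """Pack (tag_id, first_child, next_sibling) tuples into one buffer."""
    buf = bytearray(RECORD.size * len(records))
    for i, rec in enumerate(records):
        RECORD.pack_into(buf, i * RECORD.size, *rec)
    return bytes(buf)


def children(buf, i):
    """Yield the child indices of node i by reading records in place."""
    _, child, _ = RECORD.unpack_from(buf, i * RECORD.size)
    while child != -1:
        yield child
        _, _, child = RECORD.unpack_from(buf, child * RECORD.size)


# Node 0 has children 1 and 2; node 1 has child 3.
message = pack_tree([(10, 1, -1), (11, 3, 2), (12, -1, -1), (13, -1, -1)])
```

The same `message` bytes could be written to a socket unchanged and traversed by the receiver with `children`, with no markup to read or decode.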

Paper Nr: 41
Title:

Modeling Service Systems in Service-Oriented Environments

Authors:

Dionisis Adamopoulos

Abstract: The advent of deregulation combined with new opportunities opened by advances in telecommunications technologies has significantly changed the paradigm of telecommunications services, leading to a dramatic increase in the number and type of services that telecommunication companies can offer. Building new advanced multimedia telecommunications services in a distributed and heterogeneous environment is very difficult, unless there is a methodology to support the entire service development process in a structured and systematic manner, and assist and constrain service designers and developers by setting out goals and providing specific means to achieve these goals. Therefore, in this paper, after a brief presentation of a proposed service creation methodology, its service design phase is examined in detail focusing on the essential activities and artifacts. In this process, the exploitation of important service engineering techniques and UML modelling principles is especially considered. Finally, alternative and complementary approaches for service design are highlighted and a validation attempt is briefly outlined.

Paper Nr: 46
Title:

A STUDY OF NATIVE XML DATABASES. Document update, querying, access control and Application Programming Interfaces in Native XML Databases.

Authors:

M. M. Martínez-González, Miguel A. Martínez-Prieto and María Muñoz-Nieto

Abstract: Native XML databases (NXD) are called to play a crucial role in the near future. The experience with relational databases shows that standard methods for accessing and manipulating databases are necessary if wide acceptance of these database systems is to be expected in information systems whose applications should access databases preferably through standard APIs. In this paper, the current state of APIs that provide standard access to NXD, standard query languages and standard methods for document update and access control are analyzed from the perspective acquired with our experience using NXD in information systems. Our conclusions show the weak points which still need to be improved as compared with relational databases.

Paper Nr: 60
Title:

Web Browser Transactionality

Authors:

Mark Wallis, Frans Henskens, Michael Hannaford and David Paul

Abstract: As the complexity of web applications increases, new challenges are faced in relation to data integrity and system scalability. Traditional client/server fat applications allow for a high level of transactionality between the client and server, due largely to transactional protocols and tight coupling between components. Transactional functionality within web applications has historically been limited to the web server hosting the application. The scope of the traditional transaction in this context does not extend outside of the web server and its attached services. This paper proposes that web applications can achieve increased system integrity by extending the scope of the transaction to encompass tasks performed by the web browser. An additional layer is introduced to the standard HTTP protocol to facilitate the new functionality, and a simulator is presented as the basis for further research.

Paper Nr: 146
Title:

SWAPT, Semantic Workflow Architecture for Petroleum Techniques

Authors:

Nabil Belaid, Jean-François Rainaud and Yamine Ait-ameur

Abstract: In the petroleum industry, many engineering studies are conducted to evaluate the potential of geological structures to be exploited as hydrocarbon reservoirs. They follow series of complex workflows composed of activities carried out by geologists. Nowadays these workflows are built mainly according to the experience gained by experts in their previous work. It is not possible to share this experience between geologists by reusing and composing activities if a minimum of semantics is not applied to describe them and the workflows that use them. The focus of our work is to evaluate the benefit of using semantics to make geologists' daily work easier. In this article, we first explain how we can operate today without semantics. Then, we enrich such complex workflows and the data they manipulate with semantic annotations through ontology-based characterizations (Geological Data and Activities Ontologies). As future work, we plan to use these annotations for a full architecture that would assist geologists in building their workflows.

Paper Nr: 174
Title:

Universal, Ontology-based Description of Devices and Services

Authors:

Boyan Bontchev

Abstract: A major problem with existing service discovery middleware solutions (e.g. UPnP, Bluetooth and Jini) is that services available on one platform cannot be easily accessed from services based on another platform, which is known as the “hidden service problem”. One of the reasons for this incompatibility is that each of these platforms uses a different method for service description. This paper highlights the importance of unified, platform-neutral, semantic-oriented device and service descriptions to achieve seamless integration between different technologies and systems for advertising, discovering and invoking services. Moreover, it presents an ontology-driven approach for describing devices and their services which will enable service-seeking peers to reason about available services and devices, and make intelligent and informed decisions regarding which services to use, and how. We argue that having separate descriptions (ontologies) for a device and for the services it offers is a more flexible, more modular and therefore better design. The proposed device ontology is to be used to describe the characteristics of the device that offers a specific service. For the development of semantic models of services, our analysis shows that making several enhancements to OWL-S (using OWL subclassing) will enable its use as a common language for service description. Finally, we outline some of the ongoing and future work that gives a real-world perspective on and application of the proposed ontologies.

Paper Nr: 178
Title:

RDF rules for XML Data conversion to OWL Ontology

Authors:

Christophe Cruz and Christophe Nicolle

Abstract: The paper presents a flexible method to enrich and populate an existing OWL ontology from XML data based on RDF rules. These rules are defined in order to automatically populate the new version of the OWL ontology. Basic rules are defined to identify elements in XML schemas and an OWL schema. Advanced mapping rules build on the basic rules in order to define the mapping between XML schema elements and OWL schema elements. In addition, this flexible method allows users to reuse rules for other conversions and populations.

Paper Nr: 189
Title:

Adaptive Integration of Information

Authors:

Christophe Nicolle and Christophe Cruz

Abstract: The objective of this paper is twofold. On the one hand, this paper presents a concise state-of-the-art of information integration methods in the field of information systems. In this part, academic proposals are examined and confronted with actual industrial realities. On the other hand, this paper suggests a new approach to improve the integration of information. This approach is based on the use of semantic adaptive graphs. The adaptive feature of our proposal makes it possible to manage two specific aspects related to information integration: the adaptation of information according to the user’s access rights and the lifecycle of the integrated information.

Area 2 - Web Interfaces and Applications

Full Papers
Paper Nr: 42
Title:

DISCOVERING LINKS INTO THE FUTURE ON THE WEB

Authors:

Muhammad T. Afzal

Abstract: Current search engines require the explicit specification of queries in retrieving related materials. Based on personalized information acquired over time, such retrieval systems aggregate or approximate the intent of users. In this case, an aggregated user profile is often constructed, with minimal application of context-specific information. This paper describes the design and realization of the idea of ‘Links into the Future’ for discovering related documents from the Web, within the context of an electronic journal. The information captured based on an individual’s current activity is applied for discovering relevant information along a temporal domain. This information is further pushed directly to the users’ local contexts. This paper as such presents a framework for the characterization and discovery of highly relevant documents.

Paper Nr: 52
Title:

INTERACTIVE USER INTERFACES FOR NEUROANATOMY EXPLORATION

Authors:

Felix G. Hamza-Lup and Tina Thompson

Abstract: Human neuroanatomy is extremely complex, and functional neuroanatomical pathways cannot be dissected and easily visualized in an anatomy lab. Teaching students to see neuroanatomical relationships over the extent of the neuraxis is challenging. The ability to internalize a 3D map of the neuraxis with the appropriate clinically relevant neuro-pathways superimposed is critical for medical students, as it facilitates long-term retention of the information as opposed to short-term memorization. Interactive 3D simulations can play a significant role in facilitating learning through engagement, immediate feedback and by providing real-world contexts.

Paper Nr: 65
Title:

Game-based learning: Conceptual methodology for creating educational games

Authors:

Stephanie B. Linek, Daniel Schwarz, Matthias Bopp and Dietrich Albert

Abstract: Game-based learning builds upon the idea of using the enjoyment and the motivational potential of video games in the educational context. Thus, the design of educational games has to address optimizing enjoyment as well as optimizing learning. Within the EC-project ELEKTRA a methodology about the conceptual design of digital learning games was developed. Thereby state-of-the-art psycho-pedagogical approaches (like the Competence-based Knowledge Space Theory) were combined with insights of media-psychology (e.g., on parasocial interaction) as well as with best-practice game design. This science-based interdisciplinary approach was enriched by enclosed empirical research to answer open questions. Additionally, several evaluation-cycles were implemented to achieve further improvements. The psycho-pedagogical core of the methodology can be summarized by the ELEKTRA’s 4Ms: Macroadaptivity, Microadaptivity, Metacognition and Motivation. The conceptual framework of the developed methodology is structured in eight phases which have several interconnections and feedback-cycles that enable a close interdisciplinary collaboration between game design, pedagogy, cognitive science and media psychology.

Paper Nr: 70
Title:

SiteGuide: an Example-based Approach to Web Site Development Assistance

Authors:

Vera Hollink, Viktor De Boer and Maarten Van Someren

Abstract: We present 'SiteGuide', a tool that helps web designers to decide which information will be included in a new web site and how the information will be organized. SiteGuide takes as input URLs of web sites from the same domain as the site the user wants to create. It automatically searches the pages of these example sites for common topics and common structural features. On the basis of these commonalities it creates a model of the example sites. The model can serve as a starting point for the new web site. Also, it can be used to check whether important elements are missing in a concept version of the new site. Evaluation shows that SiteGuide is able to detect a large part of the common topics in example sites and to present these topics in an understandable form to its users.

Paper Nr: 98
Title:

ARHINET – A SYSTEM FOR GENERATING AND PROCESSING SEMANTICALLY-ENHANCED ARCHIVAL ECONTENT

Authors:

Ioan Salomie, Mihaela Dinsoreanu, Cristina Pop, Sorin L. Suciu, Tudor Vlad and Ioana Iacob

Abstract: This paper addresses the problem of generating and processing eContent from archives and digital libraries. We present a system that adds semantic mark-up to the content of historical documents, thus enabling document and knowledge retrieval in response to semantic queries. The system functionality follows two main workflows: eContent generation and knowledge acquisition on the one hand, and knowledge processing and retrieval on the other. Within the first workflow, the relevant domain information is extracted from documents written in natural languages, followed by semantic annotation and domain ontology population. In the second workflow, ontologically guided queries trigger reasoning processes that provide relevant search results.

Paper Nr: 106
Title:

semantic Service-Oriented Design And Development Methodology For Enterprise Healthcare Integration

Authors:

Ratnesh Sahay, Manfred Hauswirth and Ronan Fox

Abstract: The application of ontologies (semantic) to enhance any existing or proposed Service-Oriented Architecture (SOA) based software architecture has various levels of use in terms of intra and inter enterprise integration. The use of ontologies in an architectural design and development methodology of any service-oriented enterprise software holds the key to offer a dynamic, flexible and scalable solution. Current efforts in semantic Service-Oriented Architecture (sSOA) involve primarily the top-down modeling of services and data. A road-map that meets industrial demand of existing (bottom-up) services and data is missing. This paper analyses a healthcare standard (HL7) as an integration mechanism to connect healthcare enterprises. We have applied semantics on top of HL7 profiles to fill the gap between HL7 and SOA artifacts. The results have shown that semantics can ease the integration steps and burden to connect healthcare enterprises. We have proposed an integration platform which is based on a semantic Service-oriented Architecture (sSOA). Our goal is to apply lightweight semantics that incorporate and benefit from both development methodologies (top-down and bottom-up), to create a converged approach, for enterprise healthcare integration.

Paper Nr: 133
Title:

Context-aware Ranking Algorithms in Folksonomies

Authors:

Fabian Abel, Nicola Henze and Daniel Krause

Abstract: Folksonomy systems have been shown to contribute to the quality of Web search ranking strategies. In this paper, we analyze and compare graph-based ranking algorithms: FolkRank and SocialPageRank. We enhance these algorithms by exploiting the context of tags, and evaluate the results on the GroupMe! dataset. In GroupMe!, users can organize and maintain arbitrary Web resources in self-defined groups. When users annotate resources in GroupMe!, this can be interpreted in the context of a certain group. The grouping activity itself is easy for users to perform: simple drag & drop operations allow users to collect and group resources. However, it delivers valuable semantic information about resources and their context. We show how to use this information to improve the detection of relevant search results, and compare different strategies for ranking result lists in folksonomy systems.
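Graph-based ranking of this kind can be illustrated with a generic personalized PageRank-style power iteration over an undirected graph of users, tags and resources. This is a simplified sketch, not FolkRank or SocialPageRank as defined by their authors; the node names, edge weights, damping factor and iteration count are all assumptions.

```python
def personalized_rank(edges, preference, damping=0.7, iterations=50):
    """Power iteration on an undirected weighted graph.

    edges: dict mapping (node_a, node_b) -> co-occurrence weight.
    preference: dict mapping node -> preference weight (should sum to 1).
    """
    neighbours = {}
    for (a, b), w in edges.items():
        neighbours.setdefault(a, {})[b] = w
        neighbours.setdefault(b, {})[a] = w
    nodes = sorted(neighbours)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_rank = {}
        for n in nodes:
            # Each neighbour m passes on its rank in proportion to edge weight.
            spread = sum(
                rank[m] * w / sum(neighbours[m].values())
                for m, w in neighbours[n].items()
            )
            new_rank[n] = damping * spread + (1 - damping) * preference.get(n, 0.0)
        rank = new_rank
    return rank


# Hypothetical folksonomy: one user, two tags, two resources.
edges = {
    ("tag:python", "res:doc1"): 1.0,
    ("res:doc1", "user:alice"): 1.0,
    ("user:alice", "res:doc2"): 1.0,
    ("res:doc2", "tag:java"): 1.0,
}
ranks = personalized_rank(edges, {"tag:python": 1.0})
```

Putting the whole preference weight on `tag:python` pulls `res:doc1` above `res:doc2`. Context such as a GroupMe! group could be modelled in the same way by adding group nodes and edges, though that extension is an assumption here.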

Paper Nr: 136
Title:

TAKING INTO ACCOUNT USER PROFILES IN THE DEVELOPMENT OF DOMAIN ONTOLOGIES Application to Conformity Checking in Construction

Authors:

Anastasiya Yurchyshyna, Catherine Faron-Zucker, Nhan Le Thanh and Alain Zarli

Abstract: This paper presents a formal method for the development of a domain ontology adapted to different user profiles, in the context of conformity checking in construction. We start by describing our research domain: conformity checking in construction. Then we discuss some ontology-based approaches for formalising domain knowledge and focus on methods of ontology development for different user profiles. In order to efficiently adapt our conformity-checking ontology for different user profiles, we suggest a semantic approach to the development of the domain ontology that takes into account domain knowledge. This approach is based on three main steps. First, we describe the knowledge acquisition method developed for our conformity-checking model. Second, we propose an approach for adapting the initial domain ontology by integrating the feedback on semantic searches conducted by non-professional users. Third, we introduce an approach for the development of different facets of the domain ontology adapted for different user profiles. Finally, we describe the C3R web-based prototype, which implements our method for the development of a domain ontology aware of different user profiles.

Paper Nr: 143
Title:

A User Interface to Define and Adjust Policies for Dynamic User Models

Authors:

Daniel Olmedilla, Arne Wolf Koesling, Daniel Krause, Juri Luca de Coi, Nicola Henze and Fabian Abel

Abstract: A fine-grained user-aware access control to user profile data is the key requirement for sharing user profiles among applications, and hence improving the effort of these systems massively. Policy languages like Protune can handle access restrictions very well but are too complicated to be used by non-experts. In this paper, we identify policy templates and embed them into a user interface that enables users to specify powerful access policies and makes them aware of the current and future consequences of their policies.

Paper Nr: 167
Title:

POPULATING A DOMAIN ONTOLOGY FROM A WEB BIOGRAPHICAL DICTIONARY OF MUSIC - An Unsupervised Rule-Based Method to Handle Brazilian Portuguese Texts

Authors:

Eduardo Motta, Sean Siqueira and Alexandre Andreatta

Abstract: An increasing amount of information is available on the web and is usually expressed as text, representing unstructured or semi-structured data. Semantic information is implicit in these texts, since they are mainly intended for human consumption and interpretation. Since unstructured information is not easily handled automatically, an information extraction process has to be used to identify concepts and establish relations among them. The outcome of information extraction can be represented as a domain ontology. Ontologies are an appropriate way to represent structured knowledge bases, enabling sharing, reuse and inference. In this paper, an information extraction process is used for populating a domain ontology. It targets Brazilian Portuguese texts from a biographical dictionary of music, which requires specific tools due to some language-specific aspects. An unsupervised rule-based method is proposed. Through this process, latent concepts and relations expressed in natural language can be extracted and represented as an ontology, allowing new uses and visualizations of the content, such as semantic browsing and inference of new knowledge.
Download

Paper Nr: 175
Title:

On the Evolution of Search Engine Rankings

Authors:

Panagiotis Metaxas

Abstract: Search engines have greatly influenced the way we experience the web. Since the early days of the web, users have been relying on them to get informed and make decisions. When the web was relatively small, web directories were built and maintained using human experts to screen and categorize pages according to their characteristics. By the mid-1990s, however, it was apparent that the human expert model of categorizing web pages does not scale. The first search engines appeared and they have been evolving ever since, taking over the role that web directories used to play. But what need makes a search engine evolve? Beyond the financial objectives, there is a need for quality in search results. Users interact with search engines through search query results. Search engines know that the quality of their ranking will determine how successful they are. If users perceive the results as valuable and reliable, they will use the search engine again. Otherwise, it is easy for them to switch to another search engine. Search results, however, are not simply based on well-designed scientific principles, but they are influenced by web spammers. Web spamming, the practice of introducing artificial text and links into web pages to affect the results of web searches, has been recognized as a major search engine problem. It is also a serious problem for users, because they are not aware of it and they tend to confuse trusting the search engine with trusting the results of a search. In this paper, we analyze the influence that web spam has on the evolution of the search engines and we identify the strong relationship of spamming methods on the web to propagandistic techniques in society. Our analysis provides a foundation for understanding why spamming works and offers new insight on how to address it. In particular, it suggests that one could use social anti-propagandistic techniques to recognize web spam.
Download

Paper Nr: 182
Title:

Improving Web Search by Exploiting Search Logs

Authors:

Hongyan Ma

Abstract: With the increasing use of Web search engines, there is an acute need for more adaptive and more personalizable Information Retrieval (IR) systems. This study proposes an innovative probabilistic method exploiting search logs to gather useful data about contexts and users to support adaptive retrieval. Real users’ search logs from an operational Web search engine, Infocious, were processed to obtain past queries and click-through data for adaptive indexing and unified probabilistic retrieval. An empirical experiment of retrieval effectiveness was conducted. The results demonstrate that the log-based probabilistic system yields statistically superior performance over the baseline system.
Download

Short Papers
Paper Nr: 15
Title:

CarpoolNow: Just-In-Time Carpooling Without Elaborate Preplanning

Authors:

Dominic Massaro

Abstract: Carpooling reduces the number of cars on the road, reduces gas consumption, and saves participants money. In order to free carpooling from rigid schedules and preplanning, just-in-time carpooling allows a large member base of passengers and drivers to be matched with each other automatically and instantly, allowing for on-the-spot arrangement of rides. A mobile phone call or text message initiates an automatic process in which drivers and passengers are matched to a shared ride wherever and whenever they need it, without the scheduling constraints of traditional carpooling. This program faces a number of challenging barriers in technology and behavioral science. These include the creation of a seamless interaction between mobile phones and the internet server, voice recognition and SMS solutions, safety of mobile phone use and driving, and motivation, safety, and trust among participating members of the carpooling community.
Download

Paper Nr: 28
Title:

T-Prox: A User-Tracking Proxy for Usability Testing

Authors:

Sven Lilienthal

Abstract: While usability analyses are becoming more and more accepted for conventional software, many reasons are given for forgoing usability analyses of web-based applications and websites. Most companies dread the high monetary and personnel costs together with the unfamiliar process. Nevertheless, this leads to lower acceptance and thereby to lower success. This motivates the development of an easy-to-use solution which enables companies and institutions to easily analyze their web systems without having to rely on external usability experts and expensive labs and equipment.
Download

Paper Nr: 36
Title:

VISUALIZING NETWORKS OF MUSIC ARTISTS WITH RAMA

Authors:

Luis Sarmento, Eugénio Oliveira, Fabien Gouyon and Bruno Costa

Abstract: In this paper we present RAMA (Relational Artist MAps), a simple yet efficient interface to navigate through networks of music artists. RAMA is built upon a dataset of artist similarity and user-defined tags regarding 583,000 artists gathered from Last.fm. This third-party, publicly available data about artist similarity and artist tags is used to produce a visualization of artist relations. RAMA provides two simultaneous layers of information: (i) a graph built from artist similarity data, and (ii) overlaid labels containing user-defined tags. Differing from existing artist network visualization tools, the proposed prototype emphasizes commonalities as well as main differences between artist categorizations derived from user-defined tags, hence providing enhanced browsing experiences to users.
Download

Paper Nr: 47
Title:

DEVELOPMENT OF A WEB-AVAILABLE EPIDEMIOLOGICAL SURVEILLANCE SYSTEM INTEGRATING GEOGRAPHIC INFORMATION - The Public Health Emergencies Support System at the Portuguese General Directorate for Health

Authors:

Andre Oliveira and Pedro Cabral

Abstract: The application of geographic information tools in Public Health management already includes many areas of study, one of which deals with the integration of Geographic Information Systems (GIS) in epidemiological surveillance systems, with the objective of aiding Public Health officials in decision-making. Some of these systems are already operational in several countries, operating at various spatial and temporal scales, and with different levels of priority. The present article introduces the development of a Public Health spatial data management infrastructure within the Portuguese General Directorate of Health, named the Public Health Emergencies Support System and essentially aimed at performing epidemiological surveillance tasks. This is a multiplatform environment that brings together relational databases, geographic information systems and web technology, making it possible to supply daily and weekly updated results to health officials through the Internet. Satisfactory results were obtained with the implementation of SSESP, since most of the planned infrastructure and functionalities are already operational. Some of the system’s present handicaps and evolutionary perspectives are also discussed.
Download

Paper Nr: 50
Title:

FOCUSING WEB CRAWLS ON LOCATION-SPECIFIC CONTENT

Authors:

Lefteris Kozanidis, Sofia Stamou and George Spiros

Abstract: Retrieving relevant data for location-sensitive keyword queries is a challenging task that has so far been addressed as a problem of automatically determining the geographical orientation of web searches. Unfortunately, identifying localizable queries is not sufficient per se for performing successful location-sensitive searches, unless there exists a geo-referenced index of data sources against which localizable queries are searched. In this paper, we propose a novel approach towards the automatic construction of a geo-referenced search engine index. Our approach relies on a geo-focused crawler that incorporates a structural parser and uses GeoWordNet as a knowledge base in order to automatically deduce the geo-spatial information that is latent in the pages’ contents. Based on location-descriptive elements in the page URLs and anchor text, the crawler directs the pages to a location-sensitive downloader. This downloading module resolves the geographical references of the URL location elements and organizes them into indexable hierarchical structures. The location-aware URL hierarchies are linked to their respective pages, resulting in a geo-referenced index against which location-sensitive queries can be answered.
Download

Paper Nr: 62
Title:

Experimental Comparison of Adaptive Links Annotation Technique with Adaptive Direct Guidance Technique

Authors:

Jozef Kapusta, Michal Munk and Milan Turčáni

Abstract: Adapting an educational environment by means of adaptive hypermedia systems (AHS) involves not only implementing these systems and developing applicable adaptive problem-solving structures, but also evaluating e-learning, the pedagogical-psychological aspects of creating materials that support education, the structuring of the subject matter, the efficiency of its presentation, etc. Based on their knowledge of the field, the authors of this article carried out an experiment aimed at the quantitative evaluation of results when searching for options for applying AHS in the informatics courses at the Department of Informatics, Faculty of Natural Sciences, Constantine the Philosopher University in Nitra. The experimental results verified the didactical efficiency of e-learning courses built using adaptive hypermedia systems, the time-effectiveness of these courses, as well as the choice of the best adaptation form. In the experiment, the adaptive annotation technique was compared with the direct guidance technique. An important finding of the experiment was that the direct guidance technique, when compared with other techniques, was the least time-effective, but its didactical efficiency was the highest.
Download

Paper Nr: 69
Title:

A QUERY EXPANSION METHODOLOGY IN A COOPERATION OF INFORMATION SYSTEMS BASED ON ONTOLOGIES

Authors:

Guillermo Valente Gomez Carpio, Lylia Abrouk and Nadine Cullot

Abstract: The great development of Internet technologies and the emergence of the semantic web have led to the specification of architectures and tools to describe and to allow the “relevant” sharing of heterogeneous information sources. Shared data can be annotated and mapped to an agreed representation of their semantics using ontologies. An ontology is a representation of a domain of interest which is agreed upon by a community of people. The aim of the paper is to propose a methodology for expanding user queries in a heterogeneous cooperation of information systems. We propose a complete architecture called OWSCIS (Ontology and Web Services Cooperation of Information Sources) for the cooperation of information systems, based on the use of a reference ontology and several local ontologies. This paper highlights the query expansion methodology in this architecture. The objective is to help and guide users during the query process by analysing their queries and using the usual behaviours of users to predict their needs.
Download

Paper Nr: 74
Title:

Markup and Validation Agents in Vijjana: A Pragmatic Model for Collaborative, Self-organizing, Domain Centric Knowledge Networks

Authors:

Ramana Reddy, Luyi Wang, Sumitra Reddy and Sriram Devalapalli

Abstract: In this paper we describe the Markup and Validation agents in Vijjana, a model for transforming a collection of URLs (Uniform Resource Locators) into a useful knowledge network which reveals the semantic connections between these disparate knowledge units. The markup process is similar to, but much more involved than, traditional bookmarking. All the relevant metadata corresponding to a particular URL is generated and passed on to the organizing agent, which adds this URL to the database. The validation agent checks and ensures that the database is consistent and has valid entries.
Download

Paper Nr: 75
Title:

INCREMENTAL END-USER QUERY CONSTRUCTION FOR THE SEMANTIC DESKTOP

Authors:

Ricardo Kawase, Enrico Minack, Samur Araujo, Daniel Schwabe and Wolfgang Nejdl

Abstract: This paper describes the design and implementation of a user interface that allows end-users to incrementally construct a query over the information in the Personal Information Management (PIM) domain. It allows semantically enriched keyword queries, implemented in the Semantic Desktop of the NEPOMUK Project. The Semantic Desktop user is able to explicitly articulate machine-processable knowledge, as described by its metadata. Therefore, searching this semantic information space can also benefit from the knowledge articulation within the query. Contrary to keyword queries, where it is not possible to provide semantic information, structured query languages such as SPARQL enable exploiting this knowledge explicitly.
Download

Paper Nr: 82
Title:

Towards Computerized Digital Preservation Based on Intelligent Agents and Web Services

Authors:

Geyong Min, Jianmin Jiang and Xiaolong Jin

Abstract: The explosively growing volume of digital information results in pressing demands to transfer digital objects from active IT systems to digital repositories, libraries, and archives for long-term preservation. However, existing strategies of digital preservation are labour intensive and often require specialist skills. In order to meet the preservation demands of immense digital information, it is necessary to find new levels of automation and self-reliance in preservation strategies. On the other hand, intelligent agent technology is widely viewed as a promising approach to developing large-scale complex software systems. It has already been successfully applied in some industrial and commercial areas. Meanwhile, Web services have evolved into a key paradigm for distributed computing. They provide an efficient way to realize loosely-coupled architecture and interoperable solutions across heterogeneous platforms and systems. Therefore, Web services have received great attention from both industry and academia. However, to the best of our knowledge, there are no initiatives that employ the technologies of intelligent agents and Web services as the general methodology to study long-term digital preservation in the open literature. In this paper, we describe an intelligent agent and Web service based architecture of the PROTAGE system, which is funded by the European FP7 Research Programme and aims to computerize long-term digital preservation. We discuss the fundamental agents involved in the PROTAGE system as well as their interactions. We further present a general framework of automated decision making based on intelligent agents and Web services, which are crucial for the automation of long-term digital preservation. Finally, we discuss some key issues related to the implementation of the PROTAGE system.
Download

Paper Nr: 85
Title:

ICT TRAINING APPROACH FOR THE STRUCTURAL STEEL DESIGN UNDER THE EUROCODES

Authors:

Miguel Serrano, Manuel Aenlle, Carlos Lopez-Colina, Fernando Gayarre and Alfonso Lozano

Abstract: Fortunately, the design process of steel buildings across Europe is finally covered by a unified code: Eurocode 3, “Design of steel structures”. Nevertheless, although Eurocodes will soon become mandatory documents, designs will not be standardized, because each country has a set of National Annexes which must be taken into account when designing in that particular country. Furthermore, every country also has its own body of non-conflicting complementary information. A problem then arises when engineers need to produce designs in other European countries, either for a company based in one state or as individuals. Also, allowing engineers time out of the office to attend the intensive training courses required for gaining experience with the new design codes frequently represents an obstacle for their employers. In an attempt to solve these problems, a strong trans-national partnership has been working on a project which aims to develop an ICT-supported, flexible training approach to allow designers to apply Eurocodes in accordance with the national regulations and practices of different member states. The resulting material shows how to design a typical building according to the different national contexts. The developed portal incorporates facilities for course presentation, forums, blogs and on-line translation.
Download

Paper Nr: 94
Title:

MEDFINDER: Using Semantic Web, Web 2.0 and Geolocation methods to develop a Decision Support System to locate doctors

Authors:

Alejandro Rodríguez González, Jesús Fernandez, Mateusz Radzimski, Myriam Mencke, Enrique Jiménez-Domingo, Juan Miguel Gómez, Giner Alor-Hernandez and Ruben Posada Gomez

Abstract: Currently the introduction of new technologies for efficient medical diagnosis and treatment is having a profound social and organizational impact, initiating the need to exploit the latent potential of novel methods. A requirement has emerged to maximally exploit the fusion of new technologies for more efficient patient care than is presently available. This scenario motivated the objective of the current paper, combining an existing medical diagnosis system which uses semantic technology and probabilistic techniques with Web 2.0 and geolocation methods to develop a system to locate the most appropriate doctor for a patient. Results of an initial prototype implementation are promising.
Download

Paper Nr: 99
Title:

2D 3D Web Transitions: Methods and Techniques

Authors:

Eric Deléglise, Diponkar Paul and Morten Fjeld

Abstract: While numerous web applications exploit either 2D or 3D technology, the work presented here suggests that it is feasible to integrate 2D and 3D environments. Besides presenting methods and techniques for realizing 2D and 3D web environments, novel ways of putting corresponding technologies to work are suggested. The prospective user benefits and inherent limitations of realized models are presented in our work. Then the issues that arise when switching between 2D and 3D environments are explored. The background and justification for the three methods and corresponding techniques presented here illustrate how to enable and facilitate a fluent and effective transition between 2D and 3D web environments. While some of the methods and techniques presented are still at a conceptual stage, two techniques have been realized and the findings are presented in detail. A video presentation of realized techniques is offered. Usability of the methods and techniques realized is discussed.
Download

Paper Nr: 111
Title:

A Conceptual Model for Digital Libraries Evolution

Authors:

Antonina Dattolo, Paolo Casoto, Andrea Baruzzo and Carlo Tasso

Abstract: The evolution and preservation of digital libraries are not simply a matter of technological decisions, but they can be better understood if treated as the integration of three complementary dimensions (based on the informational, technological and social domains). These dimensions together form a conceptual framework suitable to characterize the whole digital library concept. In this paper, starting from the experience and the lessons learned in the realization of the EU-India E-Dvara project, we propose such a framework, providing motivational examples and discussing appropriate solutions. In particular, we discuss the issues concerning the adaptation of the technical infrastructure, the coordination of different user roles, and the evolution of data, in order to select the dimensions on which we base our framework.
Download

Paper Nr: 114
Title:

A Tool Based on Web Services to Query Biodiversity Information

Authors:

Joana G. Malaverri, Bruno M. Vilar and Claudia B. Medeiros

Abstract: Biodiversity Information Systems are complex software systems that present data management solutions to allow researchers to analyze species and their interactions. The complexity of these systems varies with the data handled, users targeted and environment in which they are executed. An open problem, especially in a Web environment, is data heterogeneity together with the diversity of user vocabularies and needs. This hampers query processing. In this paper we present a tool based on Web services to expand and process biodiversity queries using ontology information. This solution relies on a new database organization, also described here, which combines in a single model data collected in the field with data found in archival sources. This tool is being tested using real case studies, developed in cooperation with biologists. It is being integrated within a large Web-based biodiversity system.
Download

Paper Nr: 117
Title:

MANAGING TRANSACTIONAL COMPOSITIONS OF WEB SERVICE APPLICATIONS

Authors:

Juha Puustjärvi

Abstract: The ACID transaction model has evolved over time to incorporate more complex transaction structures and to selectively relax the atomicity and isolation properties. Such advanced transaction models are more appropriate for SOA, which is geared toward open environments consisting of autonomous and heterogeneous systems. However, due to the autonomy and heterogeneity of local systems, supporting transactional compositions of Web service applications is problematic in SOA. In addition, the interfaces of Web services are not usually designed for transactional compositions. Nor are there mechanisms for registering Web services’ abilities to participate in transactional compositions, or for registering Web services’ coordinators. How these problems can be avoided by introducing a Composition server is the topic of this paper.
Download

Paper Nr: 118
Title:

APPLICATION VERSIONING, SELECTIVE CLASS RECOMPILATION AND MANAGEMENT OF ACTIVE INSTANCES IN A FRAMEWORK FOR DYNAMIC APPLICATIONS

Authors:

Georgios Voulalas and Georgios Evangelidis

Abstract: In our previous research we have presented the core functional and data components of a framework for the development and deployment of web-based applications. The framework enables the operation of multiple applications within a single installation and supports runtime evolution by dynamically recompiling classes based on the source code that is retrieved from the database. It is structured upon a universal database schema (meta-model). The contributions of this paper include a versioning mechanism that enables access to old data in their real context (i.e. within the version of the application that created this data), a proposal for selective recompilation of new classes that allows applications to evolve safely at the minimum processing cost, and a policy for handling active classes (i.e. classes that have running instances) that need to be dynamically recompiled in order to reflect changes.
Download

Paper Nr: 121
Title:

Integration between Digital Terrestrial Television and Internet by means of a DVB-MHP web browser

Authors:

Irene Amerini, Giovanni Ballocca, Rudy Becarelli, Roberto Borri, Roberto Caldelli and Francesco Filippini

Abstract: The process of digital TV convergence, expected to be completed in Europe by the end of 2012, makes it desirable to develop a system that can provide access to the Internet by offering web content on a television screen. In this paper we present WebClimb, a web browser that pursues an effective integration of Digital Terrestrial Television (DTT) and the Internet on the DVB-MHP platform. WebClimb is a Java-based web browser that enables users to browse the web through an easy-to-use Graphical User Interface (GUI), driven by a TV remote control. We explain the system architecture of this novel web browser and we evaluate the operating modes of a WebClimb prototype on commercial digital terrestrial set-top boxes.
Download

Paper Nr: 138
Title:

Architecture to connect tool-based web interfaces to service-oriented architectures

Authors:

Matthias L. Hemmje, Martin Mois and Claus-Peter Klas

Abstract: DAFFODIL is a user-oriented digital library system for accessing distributed libraries through a unique user interface. Beyond that, DAFFODIL also provides high-level functions to support proven search strategies which allow a system-wide search and navigation over the integrated digital libraries. The DAFFODIL system is based on distributed agents and comes with a Java Swing client which has to be installed on the user’s system. This paper describes an architecture for combining the powerful services of DAFFODIL with a modern Ajax-enabled interface to lower the barriers to efficient and effective information search for all users.
Download

Paper Nr: 140
Title:

The Geospatial Semantic Web: are GIS Catalogs prepared for this?

Authors:

Carla G. Macário and Claudia B. Medeiros

Abstract: Geospatial information catalogs are complex infrastructures that store and publish geographic information. They are an important part of Geographic Information Systems (GIS), systems that manage geospatial data for a wide variety of application domains. To be useful, a catalog must efficiently support discovery and retrieval of geospatial information, working as a key component for planning and decision-making in a variety of domains. Catalogs use standards to support data interoperability. However, the simple adoption of standards and specifications for geospatial data description enables only syntactic interoperability. Semantic heterogeneity still presents challenges for the so-called Geospatial Semantic Web. This work discusses some features that GIS catalogs should have, focusing on semantic issues. We tested some existing and well-known catalogs, comparing them by means of these features. Based on this comparison, we identified some open issues that should be addressed when considering advanced geospatial applications on the Web.
Download

Paper Nr: 149
Title:

Extracting precise activities of users from HTTP logs

Authors:

Kiyotaka Takasuka, Yoshikatsu Tada, Kazutaka Maruyama and Minoru Terada

Abstract: Browsing histories are often used to build user profiles for browsing support and personalization. However, the browsing history also contains HTTP requests generated concomitantly with user activity (concomitant requests), which must be removed in order to build correct user profiles. Current filtering methods are based on rather simple characteristics of requests, such as the extension of the file name or the reported content type. We propose a more efficient filtering method based on other characteristics, such as the intervals between requests and the referer relations of requests. In this paper we analyze these characteristics in real web transactions and evaluate their usefulness for filtering.
Download

Paper Nr: 152
Title:

Strategic Innovation Management on the Basis of Searching and Mining Press Releases

Authors:

Jan Finzen, Steffen Koch, Holger Kett and Maximilien Kintz

Abstract: Press releases may contain a lot of information that is especially applicable in strategic innovation management: They contain up-to-date information by definition and thus may give hints to upcoming trends and techniques. They also tell a lot about the strategies of partners, customers, and, most of all, competitors. We analysed many of today’s existing press release search engines and identified a number of shortcomings: The query frontends do not provide enough flexibility with regard to search space restriction, the presentation of result lists typically cannot be influenced by the user, and the ranking order often remains unclear. Press releases offer a number of features that make them useful for automatic handling but are widely ignored by today’s search engines: They are relatively homogeneously structured and contain certain kinds of easily extractable meta-data that can be utilized for use cases such as monitoring trends (date of publishing), discovering geographical competency clusters (author and address information), or identifying relations between companies (firm name mentions). We describe the prototype of a new press release search engine that makes use of the above-mentioned meta-data and additionally offers sophisticated search features tailored to the needs of innovation professionals.
Download

Paper Nr: 154
Title:

USING THE STRUCTURAL CONTENT OF DOCUMENTS TO AUTOMATICALLY GENERATE QUALITY METADATA

Authors:

Lars H. Edvardsen, Trond Aalberg, Ingeborg Torvik Sølvberg and Hallvard Trætteberg

Abstract: Giving search engines access to high quality document metadata is crucial for efficient document retrieval efforts on the Internet and on corporate Intranets. Such metadata is currently sparse. This paper presents how the structural content of document files can be used for Automatic Metadata Generation (AMG) efforts, basing efforts directly on the documents’ content (code) and enabling effective usage of combinations of AMG algorithms for additional harvesting and extraction efforts. This enables usage of AMG efforts to generate high quality metadata in terms of syntax, semantics and pragmatics, from data sources that are non-homogeneous in terms of visual characteristics and language of their intellectual content.
Download

Paper Nr: 157
Title:

A Collaborative Filtering approach combining Clustering and Navigational based correlations

Authors:

Ilham Esslimani, Armelle Brun and Anne Boyer

Abstract: Recommender systems are widely used for automatic personalization of information on web sites and information retrieval systems. Collaborative Filtering (CF) is the most popular recommendation technique, but several CF systems still suffer from problems like the availability of rating data and space dimensionality for neighborhood selection. In this paper, we present a new CF approach (PSN-CF) that uses usage traces to model users. These traces are used to estimate ratings that will be employed to generate clusters. Then, the PSN-CF evaluates navigational correlations between users within these clusters. Predictions are performed in a following step. The performance of PSN-CF is evaluated in terms of accuracy and processing time on a real usage dataset. We show that PSN-CF substantially improves the accuracy of predictions in terms of MAE. Moreover, the use of clustering and positive sequences before computing the navigational correlations contributes to an important reduction of processing time.
Download

Paper Nr: 185
Title:

A Web-based Multilingual Utterance Collection System for the Medical Field

Authors:

Taku Fukushima, Takashi Yoshino and Ryuichi Nishimura

Abstract: We have developed a web-based multilingual utterance collection system, named OTOCKER, for the medical field. The purpose of OTOCKER is to act as a voice data collection platform for intercultural communication. Although speech synthesis systems have improved significantly, fluent and smooth speech synthesis is still a problem. Currently, it is difficult to synthesize speech in different languages. In the medical field, in particular, it is important that the intended meaning of spoken words be conveyed effectively along with the different nuances. Therefore, we use an utterance collection system that collects people’s voices directly. The limitations of this system are (1) an insufficient number of correct sentences pertaining to the medical field and (2) the difficulty of participating easily in voice recording and collection. We can solve the first problem by using a system that collects parallel medical texts. The second problem can be solved by using w3voice, a web-based voice-recording system. This system requires only a web browser to run. This paper presents the design of OTOCKER, its prototype, and the results of its trial.
Download

Paper Nr: 199
Title:

EFFECTS OF CRAWLING STRATEGIES ON THE PERFORMANCE OF FOCUSED WEB CRAWLING

Authors:

Ari Pirkola and Tuomas Talvensaari

Abstract: Focused crawlers are programs that selectively download Web documents (pages), restricting the scope of crawling to a specific domain or topic. We investigate different focused crawling strategies including the use of data fusion in focused crawling. Documents in the domains of genomics and genetics were fetched by Nalanda iVia Focused Crawler using three crawling strategies. In the first one, a text classifier was trained to identify relevant documents. In the latter two strategies, the identification of relevant documents was based on query-document matching. In experiments, the crawling results of the single strategies were combined to yield fused crawling results. The experiments showed, first, that different single strategies overlap only to a small extent, identifying mainly different relevant documents. Second, a query-based strategy where the words of the link context were weighted gave the best coverage (i.e., number of relevant documents) after 10 000 and 40 000 documents had been downloaded. The combination of the two query-based strategies was the best fused strategy but it did not perform better than the best single strategy.
Download

Paper Nr: 5
Title:

EXTENDED VISUALIZATION FOR A DIGITAL JOURNAL

Authors:

Muhammad S. Khan, Muhammad T. Afzal, Narayanan Kulathuramaiyer and Hermann Maurer

Abstract: Content analysis has been a tradition of many electronic and printed journals, in order to ensure quality and the journal’s standing. Traditionally, researchers have tried to analyze patterns in scholarly publications using ordinary tables and statistical charts. In this paper we present an interactive visualization system that supports a deeper analysis of the trends and patterns hidden in the scholarly publications of a digital journal. We apply this technique to the Journal of Universal Computer Science (J.UCS). The proposed visualization system is an easy-to-use web application, based on animated 2D bubble charts and pie charts to handle geographical, temporal and large kinds of categorical data. The paper gives a brief overview of the state-of-the-art visualization techniques available to understand the knowledge structure of any given academic discipline. The design and technical aspects of the proposed visualization tool and various interesting results drawn from it are discussed.
Download

Paper Nr: 22
Title:

Development of a Support System for University Course Selection using Semantic Web Technology

Authors:

Minoru Nakayama and Jun Hohshito

Abstract: The authors propose a support system for university students to create their own course schedules using semantic web technology. The system provides course information, such as syllabi, students' assessment scores and reviews, represented as an RDF-based ontology, while participants create their own course schedules. A prototype system was developed for course selection in two departments, and its effectiveness was determined. As a result, the number of courses selected increased significantly, and participants' subjective responses were encouraging when they consulted the system.
Download

Paper Nr: 29
Title:

Virtual Language Framework (VLF) - A Semantic Abstraction Layer (Position Paper)

Authors:

Frederic Hallot and Wim MEES

Abstract: In a previous paper, we presented the concept of the Semantic Abstraction Layer (SAL) as a theoretical abstraction aiming to solve some recurrent design problems related to semantics and multilinguality. In this paper, after a short recall of what a SAL is, we present the Virtual Language Framework (VLF), which is our implementation of the SAL concept. We present two approaches for implementing the VLF, one centralized and the other decentralized. We discuss their advantages and drawbacks and then present our solution, which combines both strategies. We end with a short description of an ongoing project at the Royal Military Academy of Belgium where the VLF is used in the context of a disaster management information system.
Download

Paper Nr: 51
Title:

Towards a Template-based Generation of Virtual 3D Museum Environments

Authors:

Daniel Biella and Wolfram Luther

Abstract: This paper focuses on the question of how metadata and existing metadata standards can be used for the administration, layout, storage, retrieval and visualization of web-based virtual 3D museum environments. We present enhanced metadata concepts which encompass the infrastructure of a virtual museum or laboratory with stationary or mobile interfaces to communicate with the information sources or to interact with the artifacts.
Download

Paper Nr: 54
Title:

Making Forms Accessible

Authors:

Norbert Kuhn, Michael Schmidt, Stefan Naumann, Andreas Truar and Stefan Richter

Abstract: The purpose of this paper is to discuss an idea of how documents can be designed so that they adapt almost automatically to the actual reader. We observe that paper is still the dominant means of transporting information. In particular, in the relationship between governmental authorities and citizens, paper-based documents still play a vital role. Therefore, we describe an approach to designing documents and governmental forms so that they may change their appearance in accordance with the preferences of their recipients. This allows the creation of documents that are easier to understand for different target groups, such as elderly people, people with visual impairments or people suffering from dyslexia. We further describe a prototypical system called GUIDO which implements our ideas in the form of a special Web service that considers the actual circumstances of the user.
Download

Paper Nr: 64
Title:

A STRAIGHTFORWARD APPROACH FOR ONLINE ANNOTATIONS: SPREADCRUMBS. Enhancing and simplifying online collaboration.

Authors:

Ricardo Kawase and Wolfgang Nejdl

Abstract: Countless user studies and everyday observations have shown that individuals make annotations while reading: highlighting, circling and underlining important parts of the text, and adding written comments. Since the Web became the biggest accessible source of information, many reading activities happen online in the browser. In this sense, it is expected that individuals would keep their annotation behaviors, provided that the appropriate tools are available. Although several Web annotation projects currently exist, it is difficult to identify the most prominent in the field. With SpreadCrumbs, we simplify the annotation actions and the social navigation support. SpreadCrumbs users can add in-context annotations to any webpage with minimal cognitive load, as they would do when reading a paper; in addition, SpreadCrumbs enhances online collaboration and provides mechanisms to support social navigation by means of existing social networks. It allows users to freely express themselves and to add any desirable substance to the resources. Technically, annotations carry valuable information about the content, more than bookmarks or tags, having a greater impact on collaboration and search for re-finding. SpreadCrumbs exploits all these advantages with an intuitive and easy-to-use user interface.
Download

Paper Nr: 71
Title:

MODELLING AND DEPLOYING SECURITY POLICIES

Authors:

Xabier Larrucea and Rubén Alonso

Abstract: The Web Services (WS) platform is one of the most widely accepted implementations of Service Oriented Architectures (SOA). There is a large number of specifications related to the so-called WS-*. In this context, support for the specification and deployment of Security Policies is still immature and needs improvement. This paper focuses on the growing importance of security, the increase of collaboration amongst organizations and the emergent need for modelling SOA and security aspects. It presents a methodology and modelling framework based on the Eclipse platform for designing security aspects in SOA, and a derivation mechanism to automatically generate web service security elements. This approach is illustrated with an example.
Download

Paper Nr: 130
Title:

INTERACTIVE ANALYSIS OF MULTIDIMENSIONAL DATA ON THE WEB BY USING TIME-TUNNEL

Authors:

Seiji Okajima and Yoshihiro Okada

Abstract: In recent years, the Internet has become popular in various application fields, so a huge number of data records are generated and stored on the web. In this situation, we need a tool that helps us analyze such multidimensional data to obtain new findings from them. In this paper, we introduce a visual and interactive analysis tool for multidimensional data called Time-tunnel. Time-tunnel visualizes any number of time-series numerical data records as individual charts, each of which is displayed on an individual rectangular plane called a data-wing in a 3D virtual space. Through direct manipulation on a computer screen, the user can easily overlap more than two data-wings to compare their charts and recognize the similarities or differences among those data records. Simultaneously, a radar chart of those data at any time point is displayed to show the similarity and correlation among them. In this work, the authors extended Time-tunnel to make it available on the web, and this paper clarifies the usefulness of the web version of Time-tunnel by showing practical analysis examples.
Download

Paper Nr: 139
Title:

VISUALIZATION OF AND RETRIEVAL OF BACKGROUND INFORMATION RELATING TO WORDS IN WEB DOCUMENTS

Authors:

Kouji Shimatsuka and Tatsuhiro Yonekura

Abstract: When people encounter unfamiliar words, they often use tools such as search engines to obtain background information on these words. However, the semantic content of words can be complex, and it is not always possible to understand the meaning of words from textual information alone. In this paper we quantify the semantic content of words by means of a simple and convenient text-based method whereby the semantic content is constructed from linguistic, visual and auditory characteristic values. Using characteristic vectors generated in this way, users are able to visually check and search for background information on unfamiliar terms in a web document.
Download

Paper Nr: 147
Title:

OCC FOR EMOTION GENERATION IN E-LEARNING SYSTEMS

Authors:

Efthimios Alepis and Maria Virvou

Abstract: This paper describes an educational authoring tool that incorporates the OCC cognitive theory of emotions in order to help instructors create and author affective courses. The authoring tool provides an important facility to instructors for the creation of their own tutoring characters for the user interface of the resulting applications. In this way, the tutoring characters, which are speaking, animated personas, may represent the teaching behaviour of the human instructor who is in charge of the remote lessons. Students who are going to use the resulting educational applications will have a user interface that is more human-like and affective. Thus they may feel less deprived of the human-human interaction between them and a human teacher that would take place in the setting of a real classroom.
Download

Paper Nr: 148
Title:

GENERIC ARCHITECTURE FOR INCORPORATING CLUSTERING INTO E-COMMERCE APPLICATIONS

Authors:

Anastasios Savvopoulos and Maria Virvou

Abstract: Today, product recommending applications use many techniques in order to achieve personalization. These techniques may prove successful but lack portability. This means that it is very difficult to apply the same architecture and techniques that have been used on one system to a totally different one. In this paper we propose a generic architecture that can be used to achieve personalization in a product recommending system. The main advantage of this architecture is that every system can use it, whether it is built on PHP, ASP.NET or a different web technology. We present a case study in which we applied this architecture. This case study demonstrates the independence of our architecture and shows that it can be applied easily to any kind of remote recommending system.
Download

Paper Nr: 151
Title:

A MOBILE BROWSER FOR GEO-REFERENCED IMAGES USING AN ACCELEROMETER-BASED COMPASS

Authors:

Francesco Massidda, Davide Carboni and Roberto Manca

Abstract: In this paper a new mobile browser for geo-referenced pictures is introduced. Based on common embedded GPS and accelerometer sensors, the implemented mobile browser is able to show tagged photos from the web, depending on the direction the user is facing, allowing a position-dependent touristic, commercial or cultural preview of our cities. Due to the lack of integrated digital compasses in present mobile phones, a novel compass simulator developed using data samples from built-in accelerometers represents an interesting and cheap tool. Test results show that it is possible to reach a great level of accuracy in cardinal point approximation and that this approach can also be used in several different application fields.
Download

Paper Nr: 180
Title:

A Web Likely-Word Instant Organizer (WebLio): Dynamic Hints During Knowledge Collectors Move Mouse Over A Sentence

Authors:

Po-Hsun Cheng, Ying-Pei Chen and Mei-Ju Su

Abstract: The more complicated web resources become, the more sophisticated web browsing technologies need to be. This paper illustrates a concept for extracting the semantic content of a web page and automatically following the cursor location to organize the likely-words from a sentence for data intelligence. Such a web browsing concept could be implemented with a couple of cross-browser techniques. We believe this concept will become popular, in various forms, in future browsers. Moreover, this concept will be another important step for human-computer interaction, especially for minimizing the time expense and maintaining the likely-keywords library during further web surfing.
Download

Paper Nr: 181
Title:

A Web-Based Tool for Creating Georeferenced Boundary Maps

Authors:

Omar Valenzuela, Néstor Rodríguez and José A. Borges

Abstract: This work presents a web-based application with the capability of creating, manipulating and editing georeferenced boundary maps (polygons) of geographical areas using georeferenced images. The application is platform-independent. It can run on any computer with a browser and an Internet connection. The application provides the capability of locally storing boundary maps, attaching to them spatial information from the images that facilitates image searching. Boundary maps can be imported from other sources and they can also be exported to make them available for other GIS applications. A usability test conducted with the application demonstrated that it is easy to learn and use. Participants were able to complete 96% of the tasks after a fifteen-minute tutorial. In addition, they found all the interaction actions tested to be easy to use.
Download

Paper Nr: 183
Title:

INCREMENTAL MAINTENANCE OF ONTOLOGIES BASED ON BIPARTITE GRAPH MATCHING

Authors:

Preetpal Singh and Kalpdrum Passi

Abstract: Today’s Information Society demands complete access to available information, which is often heterogeneous and distributed. A key challenge in building the Semantic Web is integrating heterogeneous data sources. This paper presents an incremental method for maintaining integration of data in ontologies across diverse domains. For example, an increasing number of smaller, task-oriented ontologies are emerging across Bioinformatics domains to represent domain knowledge. Integrating these heterogeneous ontologies is crucial for applications utilizing multiple ontologies. Most ontologies share a core of common knowledge allowing them to communicate, but no single ontology contains complete domain knowledge. Recent papers examined integrating ontologies using bipartite graph matching techniques. However, they do not address the issue of incrementally maintaining the matching in evolving ontologies. In this paper we present an incremental algorithm, OntoMaintain, which incrementally calculates the perfect matching among evolving ontologies and simultaneously updates the labels of the concepts of ontologies. We show that our algorithm has a complexity of O(n^2), compared to the O(n^3) complexity of traditional matching algorithms. In addition, our experimental results show that the OntoMaintain algorithm maintains the correctness of the ‘brute force method’ while significantly reducing the time needed to find perfect matchings in evolving ontologies.
Download

Paper Nr: 197
Title:

EVOLUTION OF NEWS SERVICES IN THE GULF COOPERATION COUNCIL (GCC)

Authors:

Jilowey Alqahtani

Abstract: This study explores the evolution of electronic news services on the Internet in the Gulf Cooperation Council (GCC) states. The research draws on surveys of a number of media professionals and users from GCC countries, in addition to the perceptions of the providers and users of the news services. Attention was focused on the rapid evolution of Internet technology. The study also discusses to what extent GCC media organisations have exploited the Internet. It is found that GCC news services have exploited Internet technology and increased over a short period of time. Despite this, it seems that the GCC suffers from difficulties such as weak Internet infrastructure, Arabic fonts and an acute shortage of specialised personnel.
Download

Area 3 - Society, e-Business and e-Government

Full Papers
Paper Nr: 4
Title:

Designing for Social Awareness of Cooperative Activities

Authors:

Monique Janneck

Abstract: Mechanisms supporting a shared representation of activities—or awareness—within a group of people are an important prerequisite for successful computer supported cooperative activities. This article highlights the design of awareness mechanisms from a social psychological viewpoint of human behaviour in and within groups. Based on this, design guidelines for awareness functions supporting cooperative activities—with an emphasis on promoting social awareness—are proposed and evaluated empirically. Results show that users’ awareness was influenced positively as predicted by the design guidelines.
Download

Paper Nr: 37
Title:

VIETE - ENABLING TRUST EMERGENCE IN SERVICE-ORIENTED COLLABORATIVE ENVIRONMENTS

Authors:

Florian Skopik, Hong-Linh Truong and Schahram Dustdar

Abstract: In activity-centric environments where people from different companies and disciplines work together remotely and where new virtual teams are formed and dissolved continuously, how to find the most suitable collaboration partner for a given task and how well one partner is able to collaborate with another are challenging research questions. Determining and considering people’s professional competencies, collaboration behavior and relationships is a prerequisite for enhancing overall collaboration performance and success, because these factors strongly affect the notion of trust used to select and grade partners. In this paper we analyze these factors and their impact on trust relationships in modern service-oriented collaboration environments. We present VieTE, a framework for trust emergence therein, supporting the analysis of trust between partners in various contexts and from different views. In contrast to other approaches, which mostly rely on manual and subjective user feedback, VieTE automatically monitors collaboration efforts and deduces trust between any two partners based on past collaboration, previous successes, and individual competencies.
Download

Paper Nr: 67
Title:

HOW DOES ALGORITHM VISUALIZATION AFFECT COLLABORATION? Video Analysis of Engagement and Discussions

Authors:

Mikko-Jussi Laakso, Niko Myller and Ari Korhonen

Abstract: In this paper, we report a study on the use of Algorithm Visualizations (AV) in collaborative learning. Our previous results have confirmed the hypothesis that students’ higher engagement has a positive effect on learning outcomes. Thus, we now analyze the students’ collaborative learning process in order to find phenomena that explain the learning improvements. Based on the study of the recorded screens and audio during the learning, we show that the amount of collaboration and discussion increases during the learning sessions when the level of engagement increases. Furthermore, the groups that used visualizations at higher levels of engagement discussed the learned topic on different levels of abstraction, whereas groups that used visualizations at lower levels of engagement tended to concentrate more on only one aspect of the topic. Therefore, we conclude that the level of engagement predicts not only the learning performance but also the amount of on-topic discussion in collaboration. Furthermore, we claim that the amount and quality of discussions explain the learning performance differences when students use visualizations in collaboration at different levels of engagement.
Download

Paper Nr: 93
Title:

Evaluation of Reputation Metric for the B2C E-Commerce Reputation System

Authors:

Anna Gutowska and Andy Sloane

Abstract: This paper evaluates a recently developed, novel and comprehensive reputation metric designed for a distributed multi-agent reputation system for Business-to-Consumer (B2C) E-commerce applications. To do that, an agent-based simulation framework was implemented which models different types of behaviours in the marketplace. The trustworthiness of different types of providers is investigated to establish whether the simulation models the behaviour of B2C E-commerce systems as they are expected to behave in real life.
Download

Short Papers
Paper Nr: 35
Title:

On the use of an on-line free-text scoring system individually or collaboratively

Authors:

Diana Perez-Marin, Ismael Pascual-Nieto and Pilar Rodriguez

Abstract: Willow is an adaptive web-based application that allows students to review course material. The system analyzes the students' free-text answers, providing immediate feedback to the students. In the past, Willow has been used by individuals working alone. However, the trend of improving learning performance by allowing students to cooperate inspired us to develop a collaborative version of Willow. Our hypothesis was that students working together can reach an understanding of ideas better than when working individually with Willow. Therefore, in this paper, we explore the collaborative use of the system. We describe, from a computer-science perspective, the minimum changes that have to be made to the system in order to permit a collaborative review. Furthermore, we provide the preliminary results of an experiment in which 22 students were given the possibility of using the individual or collaborative version of Willow.
Download

Paper Nr: 45
Title:

DYONIPOS: Redesigned Knowledge Management

Authors:

Silke Weiß, Josef Makolm and Doris Ipsmiller

Abstract: Traditional knowledge management often involves extra work to collect again information that is already electronically available. Another obstacle to be overcome is making the content of the collected information easily accessible. At present, conventional search tools provide only documents and not the meaning of the content. They are often based on searching for character strings, and deliver many unnecessary hits and little or no context information. DYONIPOS offers a new way. The research project DYONIPOS focuses on detecting the knowledge needs of knowledge workers and automatically providing the required knowledge just in time, while avoiding additional work and violations of the knowledge worker’s privacy. This knowledge is made available through semantic linkage of the relevant information from existing artifacts.
Download

Paper Nr: 48
Title:

EXTENDED GOVERNMENT: an interoperability point of view

Authors:

Claudio Biancalana, Francesco S. Profiti and Fabio Raimondi

Abstract: The widespread diffusion of e-Government services in recent years has allowed the realization of important cases of administrative simplification, mainly due to the direct interaction between the information systems of administrations in A2A modality. In this scenario, great importance is assumed by the concept of interoperability, intended as the set of technical rules necessary to define a common interface between administrations that need to exchange information in A2A modality, while protecting existing technological choices and organizational autonomy. The aim of this paper is to illustrate the state of the art of the project initiatives promoted by the Regione Lazio relating to interoperability, with particular reference to the concept of Extended Government. This concept is founded on the definition of Extended Enterprise. It has been used extensively in project initiatives of the Region, with the aim of reusing scientific research results in this field, mainly relating to the design and realization of Knowledge Management Systems.
Download

Paper Nr: 105
Title:

A Multidimensional Model to Analyze Social and Technical Factors in Computer-Mediated Communication

Authors:

Monique Janneck

Abstract: This paper proposes a multidimensional model to analyze problems in computer-mediated communication (CMC), which can serve as a framework to integrate existing CMC approaches and also offers guidelines for the selection, the design, and the social and organizational integration of CMC tools. The specific strength of the model is its clear distinction between social and technical factors influencing computer-mediated communication. A case study of groupware use is presented to demonstrate the usefulness of the model to analyze difficulties in CMC settings and decide whether to address a certain problem on a design level or a personal, social or organizational level.
Download

Paper Nr: 108
Title:

Normative Conflicts: Patterns, Detection and Resolution

Authors:

Georgios K. Giannikis and Aspassia Daskalopulu

Abstract: The analysis, representation and management of normative conflicts have been the focus of much research in recent years in commercial and business applications. In this paper we are concerned with normative conflicts that arise for agents engaging in electronic contracting. First, we identify a set of primitive conflict patterns and present some patterns that have not been identified in other proposals. Secondly, we use a representation of e-contracts as Default Theories, which afford us both detection and resolution of such conflict patterns.
Download

Paper Nr: 141
Title:

AccessFabrik: Researching and Developing New Tools for Collaborative Design and Communication

Authors:

Michael Murphy, Michael Dick and Michael Lawrie

Abstract: Through a review of literature and experimentation with enabling technologies, it was concluded that the Access Grid (an open-source, videoconferencing platform) held significant potential to facilitate the sharing of audiovisual material in the area of collaborative industrial design; however, it would have to be extended to allow for high-definition visualization and remote desktop control. As a result, researchers in Canada and Germany developed new tools to accomplish this, and utilized them to remotely manipulate industrial designs in real-time and with very low latency. Additionally, automated captioning and translation services were developed to better facilitate cross-cultural business-to-business collaboration. Future research directions for this project involve the continued prototyping of these tools, leading either to their deployment within industry or further improvement within the Access Grid Community.
Download

Paper Nr: 144
Title:

AN EMPIRICAL STUDY ON THE DETERMINANTS OF USER ACCEPTANCE OF E-GOVERNMENT IN PUBLIC SECTOR

Authors:

Sinawong Sang, Jeong-Dong LEE and Jong-Su LEE

Abstract: The purpose of this paper is to examine the determinants of user acceptance of e-Government in the public sector by using the technology acceptance model (TAM) as the base theoretical model. The model of e-Government acceptance in the public sector integrates constructs from TAM, the extended TAM (TAM2), the diffusion of innovation (DOI), and the trust literature. To empirically test the model, data were collected from 112 public officers in 10 ministries in Cambodia. The findings show that image and output quality are significant determinants of perceived usefulness. Perceived usefulness, relative advantage, and trust are significant determinants of the acceptance of e-Government usage in the public sector. These results have important policy and strategy implications for the public sector and policy makers seeking to increase the acceptance of e-Government usage in the public sector.
Download

Paper Nr: 198
Title:

Web Business and Development Opportunities: Learning from Community Networked Services

Authors:

David G. Messerschmitt, Juhana Peltonen and Mikko Laine

Abstract: In this position paper we analyze a category of internet service firms providing what we call “community networked services” (CNS), a concept often discussed under the broad umbrella of “Web 2.0”. In a CNS, members of a virtual community co-create value amongst themselves with the explicit facilitation of a service provider, often manifested by a continuously accumulating and usually open information repository capturing user-generated content. We discuss these characteristics and their operational and managerial implications to CNS firms, which include a smaller reliance on human workforces, a community-oriented innovation model, a stronger disconnect between revenue and recipients of value, and greater network externalities among users. Drawing on these observations, we argue that there is an opportunity for academic research that both understands and improves upon the processes used for the integration of business development, IS development, and user support in CNS firms. Not only can such research help improve the performance of CNS firms, but given the high risk tolerance and experimental nature of these consumer applications, it can also capture innovations and best practices applicable to incumbent service firms in e-commerce and enterprise software applications.
Download

Paper Nr: 25
Title:

E-BUSINESS MATURITY AND INFORMATION TECHNOLOGY

Authors:

Elisabete Morais, José Pires and Ramiro Gonçalves

Abstract: Maturity models describe the maturing of the use of information systems in organizations. They are a useful framework to describe an organization’s current position as well as a range of possible positions in the future in terms of its e-business maturity. The relationship between Information Technology (IT) and e-business maturity is examined. We used a model of e-business maturity, the Stages Of Growth for e-business (SOGe) model, to place an organization in a maturity stage. In our survey we presented a set of technologies and asked enterprises which of them are implemented, in development, planned or nonexistent. We concluded that there is a strong correlation between IT implementation and e-business maturity.
Download

Paper Nr: 30
Title:

INFORMATION FLOWS IN SUPPLY CHAIN MANAGEMENT – ARE ROAD TRANSPORT COMPANIES INVOLVED WITH SUPPLY CHAIN PLANNING PROCESSES?

Authors:

Jarkko Rantala

Abstract: Businesses are increasingly facing global competition and therefore meet a growing demand for cost efficiency and customer responsiveness. Time-based competition is more a precondition than a source of competitive advantage in the present business environment. At the same time, companies are concentrating on their core business and outsourcing supporting operations to network partners. Logistics and transport are typical examples of outsourced functions. Information management is one key element of effective and reliable supply network operations. This research clarified the role of road transport companies in supply chains and identified discontinuities in information management and partnership operations from the viewpoint of transport companies.
Download

Paper Nr: 31
Title:

EFFECTS OF E-BUSINESS ON LOGISTICS AND URBAN FREIGHT TRANSPORTATIONS

Authors:

Jarkko Rantala

Abstract: The information society has been anticipated to have many effects on the demand for transport. These effects have been assessed to be relatively complicated, indicating on the one hand a generative and addictive effect and on the other hand substitution and modification effects. Information technology and its many applications, such as electronic data transfer, have accelerated the globalisation and integration of markets and given rise to more complicated and sophisticated supply chain solutions. Many of these tendencies are likely to support longer transport distances, higher delivery frequencies, faster deliveries, and smaller delivery sizes. Electronic business models have strengthened these recent logistics trends, although in principle e-business solutions should lead to cost-effective and environmentally friendly supply chains.
Download

Paper Nr: 125
Title:

FROM PAPER TO BYTES: DIGITAL KNOWLEDGE SHARING. A multi level approach to digitalization of Dictionary of the Italian Resurgency by Michele Rosi for a collaborative discipline of the historical studie

Authors:

Letizia Bollini

Abstract: Archives hold many document collections relevant to research; however, the items within them are difficult to access or consult. The Web is an efficient tool for disseminating the content of documents that are rare or withheld from public view for preservation purposes. The research project on the “Dictionary of the Italian Resurgency” by Rosi has given the public and historians access to four volumes fundamental to the study of modern Italian history. The project was developed in collaboration with Civiche Raccolte Storiche di Milano (Milano Historical Downtown Collections), following a proposal in the framework of the Network of Historical Museums. The books were first converted into digital form; a web system was then implemented to publish and search the text, using both a textual format (indexed by Google and searchable on the Web) and the original printing (downloadable in JPG format). A search system using key criteria allows dates to be cross-correlated, overcoming labour-intensive traditional examination. A distributed system for data entry allows many students to collaborate in revising and updating the contents. The next step will be the creation of links between the contents of the first three volumes (People) and the content of the fourth (Events), in which the people cited in the dictionary were involved as main actors.
Download

Paper Nr: 135
Title:

THE EXISTENCE PROOF SERVICE OF THE WEB PAGES -New Web Service to Get Grounds of the Existence of the Web Pages-

Authors:

Yuka Obu, Masahiro Miyata, Juyeon Choe, Katsumi Terakado and Tatsuhiro Yonekura

Abstract: Recently, more people have been exchanging information via the Web. However, because information on the Web is electronic data, it can be deleted or changed frequently. It is therefore difficult to prove to a third party that content existed or that it has not been changed. Hence, we developed a Web service called “The existence proof service of the Web pages”, which creates a cache of a Web page and adds a time stamp as grounds that the cache has not been changed. With this service, a Web page can be proved to have existed at a certain URL at a certain time. This can be used to verify whether a Web page has been altered.
Download
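
The cache-plus-time-stamp idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' service: the record layout, the function names `make_record` and `matches`, and the use of SHA-256 are assumptions made for illustration, and a real service would also need a trusted party to sign the record so that the time stamp itself cannot be forged.

```python
import hashlib
from datetime import datetime, timezone


def make_record(url: str, content: bytes, when: datetime) -> dict:
    """Build a record binding a URL, a capture time and a content digest.

    Hypothetical layout: a real existence-proof service would additionally
    have a trusted authority sign this record.
    """
    return {
        "url": url,
        "captured_at": when.isoformat(),
        "sha256": hashlib.sha256(content).hexdigest(),
    }


def matches(record: dict, content: bytes) -> bool:
    """Return True if `content` hashes to the digest stored in the record."""
    return hashlib.sha256(content).hexdigest() == record["sha256"]


# Toy capture of a page (URL and content are placeholders).
captured = datetime(2009, 3, 23, 12, 0, tzinfo=timezone.utc)
record = make_record("http://example.com/", b"<html>hello</html>", captured)

print(matches(record, b"<html>hello</html>"))    # unchanged cache
print(matches(record, b"<html>changed</html>"))  # altered cache
```

Storing only a digest keeps the record small; the cached page itself still has to be kept somewhere if the content, and not just its integrity, must be shown later.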

Paper Nr: 168
Title:

CHARACTERIZATION OF E-BANKING TECHNOLOGICAL SOLUTIONS IN PORTUGAL

Authors:

Paulo Martins, Ramiro Gonçalves and Juliana Tavares

Abstract: Economic activities have always been part of everyday life. The banking sector is large and long established, and makes a major contribution to economic growth in Portugal. In this new millennium, the advent of the Internet has had a significant impact on the banking services that banks traditionally offer to customers. With the help of the Internet, customers can access their banking services anytime and anywhere, provided that Internet access is available. This service, called Electronic Banking (EB), is growing explosively in many countries and transforming traditional banking practices. In this paper, we intend to review, evaluate and characterize the technological solutions for EB in Portugal.
Download

Paper Nr: 173
Title:

Personalization in Virtual Enterprises

Authors:

Claudio Biancalana, Fabio Gasparetti and Alessandro Micarelli

Abstract: Every business collects, produces and exploits large amounts of information for its activities and goals. Most of the time, this knowledge constitutes the intellectual capital for creating value and innovation. Knowledge management (KM) systems aim at manipulating knowledge by storing and redistributing corporate information that is acquired from the organization’s members. In this context, Virtual Enterprises (VE) play a crucial role as non-permanent alliances of enterprises joined together to share resources and skills in order to better respond to business opportunities. The representation and retrieval of distributed knowledge is an important feature that information systems must provide in order to obtain advantages from this kind of enterprise. PVE (Personalized Virtual Enterprise) is an ongoing research project for developing a system able to extract, and give different business companies access to, the collective knowledge required to achieve particular shared goals. In this paper, we report the most important features of this system, especially in the context of distributed knowledge representation and retrieval.
Download

Paper Nr: 176
Title:

Using WebQual 4.0 in the Evaluation of the Russian B2C Cosmetic Web Sites

Authors:

Valeria Durova and Nadia Amin

Abstract: The rapid development of Russian e-commerce has been accompanied by the emergence of many different Web sites. Recent figures show an upward trend in online purchases in the country. This paper examines the results of a quality survey of three different types of cosmetic Web sites in Russia. This industry sector is of particular interest because of its rapid growth and the wide range of organizations involved in this business. The Web sites are examined in terms of design, usability, and information quality. The findings show the strengths and weaknesses of the sites and capture users’ impressions of their interaction with the Web sites.
Download

Area 4 - Web Intelligence

Full Papers
Paper Nr: 53
Title:

IDENTIFYING SIMILAR USERS BY THEIR SCIENTIFIC PUBLICATIONS TO REDUCE COLD START IN RECOMMENDER SYSTEMS

Authors:

Stanley Loh, Fabiana Lorenzi, Roger Granada, Daniel Lichtnow, Leandro K. Wives and José Palazzo M. de Oliveira

Abstract: This paper presents investigations on representing users’ profiles with information extracted from their scientific publications. The work assumes that scientific papers written by users can be used to represent their interests or expertise and that these representations can be used to find similar users. The goal is to support similarity evaluations between users in a model-based collaborative recommender. Representing users by their publications can help minimize the new user problem. The idea is to avoid the necessity of asking users to evaluate a set of items or give some information about their preferences, for example. In scientific communities, particularly in digital libraries and systems focused on the retrieval of scientific papers, this is an interesting feature. We have conducted some experiments to compare different techniques to represent the papers (title, keywords, abstract and complete text) and two kinds of text indexes: terms and concepts. Furthermore, two distinct similarity functions (Jaccard and a Fuzzy function) were applied to these representations and then compared with the goal of finding similar users.
Download
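
As a rough illustration of the Jaccard comparison named in the abstract above, the sketch below represents each user by the set of terms in their publication titles and compares two users by set overlap. The tokenizer, the helper name `term_profile` and the toy titles are assumptions, not the authors' indexing pipeline.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity |A & B| / |A | B|; 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def term_profile(texts: list) -> set:
    """Hypothetical profile builder: lowercase whitespace tokens of all texts."""
    return {tok for text in texts for tok in text.lower().split()}


# Toy publication titles, invented for illustration.
alice = term_profile(["Collaborative filtering for digital libraries"])
bob = term_profile(["Digital libraries and collaborative recommender systems"])
carol = term_profile(["Protein folding with molecular dynamics"])

print(jaccard(alice, bob))    # 3 shared terms out of 8 distinct -> 0.375
print(jaccard(alice, carol))  # no shared terms -> 0.0
```

A new user with publications but no ratings can be matched to existing users this way, which is the cold-start case the paper targets.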

Paper Nr: 55
Title:

Extracting object-relevant data from websites

Authors:

Jianqiang Li and Yu Zhao

Abstract: This paper proposes a method to identify object-relevant information that is distributed across multiple web pages in a website. Much research has been reported on page-level web data extraction, which assumes that the input web pages contain the data records of the objects of interest. However, in many cases of data mining from a website, the group of web pages describing an object is sparsely distributed in the website, which makes page-level solutions no longer applicable. This paper exploits the hierarchy model employed by the website builder for web page organization to solve the problem of website-level data extraction. A new resource, the Hierarchical Navigation Path (HNP), which can be discovered from the website structure, is introduced for filtering object-relevant web pages. The pages found are clustered based on URL and semantic hyperlink analysis, and then the entry page and the detailed profile pages of each object are identified. Empirical experiments show the effectiveness of the proposed approach.
Download

Paper Nr: 59
Title:

Anti-folksonomical Item Recommendation System Based on Similarities between Item Clusters in Social Bookmarking

Authors:

Akira Sasaki, Takamichi Miyata, Yasuhiro Inazumi, Yoshinori Sakai and Aki Kobayashi

Abstract: Web-based bookmark management services called social bookmarking have been in the spotlight recently. Social bookmarking allows users to add several keywords, called tags, to the items they bookmark. Many previous works on social bookmarking rely on the actual words used as tags, an approach called folksonomy. However, the essential information of tags is represented not in their names but in the classification of items by tags. Based on this assumption, we propose an anti-folksonomical recommendation system that calculates similarities between groups of items classified according to tags. In addition, we use hypothesis testing to improve these similarities based on statistical reliability. The experimental results show that our proposed system provides appropriate recommendations even if users tag items with different keywords.
Download

Paper Nr: 63
Title:

Classifying Structured Web Sources Using Aggressive Feature Selection

Authors:

Hieu Q. Le and Stefan Conrad

Abstract: This paper studies the problem of classifying structured data sources on the Web. While prior works use all features once they have been extracted from search interfaces, we further refine the feature set. In our research, each search interface is treated simply as a bag-of-words. We choose a subset of words suited to classifying web sources, using our feature selection methods with new metrics and a novel, simple ranking scheme. Using an aggressive feature selection approach together with a Gaussian process classifier, we obtained high classification performance in an evaluation over real web data.
Download

Paper Nr: 77
Title:

Faceted Ranking in Collaborative Tagging Systems

Authors:

José I. Orlicki, Pablo I. Fierens and Jose I. Alvarez-Hamelin

Abstract: Multimedia content is uploaded, tagged and recommended by users of collaborative systems such as YouTube and Flickr. These systems can be represented as tagged-graphs, where nodes correspond to users and tagged-links to recommendations. In this paper we analyze the online computation of user-rankings associated to a set of tags, called a facet. A simple approach to faceted ranking is to apply an algorithm that calculates a measure of node centrality, say, PageRank, to a subgraph associated with the given facet. This solution, however, is not feasible for online computation. We propose an alternative solution: (i) first, a ranking for each tag is computed offline on the basis of tag-related subgraphs; (ii) then, a faceted order is generated online by merging rankings corresponding to all the tags in the facet. Based on empirical observations, we show that step (i) is scalable. We also present efficient algorithms for step (ii), which are evaluated by comparing their results to those produced by the direct calculation of node centrality based on the facet-dependent graph.
Download
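
The two-step scheme in the abstract above (per-tag rankings computed offline, then merged online for a facet) can be sketched as follows. The merge rule used here, summing each user's per-tag scores, is an assumption chosen for simplicity; the abstract does not say which merging algorithms the authors evaluate, and the score table is invented.

```python
from collections import defaultdict

# Hypothetical offline output: one score table per tag, for example a
# centrality score computed on that tag's subgraph. Values are invented.
PER_TAG_SCORES = {
    "music": {"ann": 0.5, "bob": 0.25, "cat": 0.25},
    "live": {"bob": 0.5, "cat": 0.25, "dan": 0.25},
}


def faceted_ranking(facet, per_tag_scores):
    """Merge per-tag scores online by summing them (an assumed merge rule)."""
    totals = defaultdict(float)
    for tag in facet:
        for user, score in per_tag_scores.get(tag, {}).items():
            totals[user] += score
    # Sort by descending score, then by name for a deterministic order.
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))


ranking = faceted_ranking(["music", "live"], PER_TAG_SCORES)
print(ranking)
```

The online step only touches the precomputed tables for the tags in the facet, which is what makes it cheaper than running a centrality algorithm on the facet-dependent subgraph at query time.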

Paper Nr: 92
Title:

Crawling Deep Web Content Through Query Forms

Authors:

Jun Liu, Zhaohui Wu, Lu Jiang, Qinghua Zheng and Xiao Liu

Abstract: This paper proposes the concept of the Minimum Executable Pattern (MEP), and then presents a MEP generation method and a MEP-based adaptive query method for the deep web. The query method extends the query interface from a single textbox to a set of MEPs, and generates a locally optimal query by choosing a MEP and a keyword vector for that MEP. Our method overcomes, to a certain extent, the problem of “data islands” that results from the deficiencies of current methods. Experimental results on six real-world deep web sites show that our method outperforms existing methods in terms of query capability and applicability.
Download

Paper Nr: 100
Title:

USING DEPENDENCY PATHS FOR ANSWERING DEFINITION QUESTIONS ON THE WEB

Authors:

Alejandro Figueroa and John Atkinson

Abstract: This work presents a new approach to automatically answer definition questions from the Web. This approach learns n-gram language models from lexicalised dependency paths taken from abstracts provided by Wikipedia and uses context information to identify candidate descriptive sentences containing target answers. Results using a prototype of the model showed the effectiveness of lexicalised dependency paths as salient indicators for the presence of definitions in natural language texts.
Download

Paper Nr: 126
Title:

Classifying Web Pages by Genre: A Distance Function Approach

Authors:

Jane Mason, Michael Shepherd and Jack Duffy

Abstract: The research reported in this paper is part of a larger project on the automatic classification of Web pages by their genres, using a distance function classification model. In this paper, we investigate the effect of several commonly used data preprocessing steps, explore the use of byte and word n-grams, and test our classification model on three Web page data sets. Our approach is to represent each Web page by a profile that is composed of fixed-length n-grams and their normalized frequencies within the document. Similarly, each of the genres in a data set is represented by a profile that is constructed by combining the n-gram profiles for each exemplar Web page of that genre, forming a centroid profile for each Web page genre. We use a distance function approach to determine the similarity between two profiles, assigning each Web page the label of the genre profile to which its profile is most similar. Our results compare very favorably to those of other researchers.
Download
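
The profile-and-centroid approach described in the abstract above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' model: it uses character trigrams, averages exemplar profiles to form a centroid, and uses an L1 distance, whereas the paper's n-gram length, combination rule and distance function are not given in the abstract. The genre names and texts are invented.

```python
from collections import Counter, defaultdict


def ngram_profile(text: str, n: int = 3) -> dict:
    """Map each character n-gram to its normalized frequency in the text."""
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    total = len(grams)
    return {g: c / total for g, c in Counter(grams).items()} if total else {}


def centroid(profiles: list) -> dict:
    """Combine exemplar profiles by averaging frequencies (assumed rule)."""
    summed = defaultdict(float)
    for profile in profiles:
        for gram, freq in profile.items():
            summed[gram] += freq
    return {gram: value / len(profiles) for gram, value in summed.items()}


def distance(p: dict, q: dict) -> float:
    """L1 distance over the union of n-grams (an assumed distance function)."""
    return sum(abs(p.get(g, 0.0) - q.get(g, 0.0)) for g in set(p) | set(q))


def classify(page: str, genre_centroids: dict, n: int = 3) -> str:
    """Label a page with the genre whose centroid profile is closest."""
    profile = ngram_profile(page, n)
    return min(genre_centroids, key=lambda g: distance(profile, genre_centroids[g]))


# Toy exemplar pages per genre, invented for illustration.
exemplars = {
    "faq": ["how do i reset my password", "how do i change my email"],
    "shop": ["add to cart buy now", "buy now free shipping"],
}
centroids = {
    genre: centroid([ngram_profile(text) for text in texts])
    for genre, texts in exemplars.items()
}

print(classify("how do i delete my account", centroids))
print(classify("buy now and save", centroids))
```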

Short Papers
Paper Nr: 44
Title:

Topic extraction from divided document sets

Authors:

Takeru Yokoi and Hidekazu Yanagimoto

Abstract: We propose a method to extract topics from a large document set by combining the topics extracted from its divisions. To extract topics, Sparse Non-negative Matrix Factorization that imposes a sparseness constraint only on the basis matrix, which we call SNMF/L, is applied to the document sets. Combining topics from several small document sets is useful because topic extraction with SNMF/L takes a long time when the number of documents is large. In this paper, we shorten the processing time of topic extraction from a large document set by combining the topics extracted from each of the divided document sets. In addition, we evaluate the proposed method by examining the correspondence between the combined topics and the topics extracted directly from the large document set by SNMF/L, and by comparing the processing times of SNMF/L.
Download

Paper Nr: 163
Title:

TIMELINESS FOR DYNAMIC SOURCE SELECTION IN SITUATED PUBLIC DISPLAYS

Authors:

Fernando Ribeiro and Rui José

Abstract: Dynamic sources, which make regularly updated data available for use by other applications, are increasingly a key enabling feature of the web. They are extensively used in all sorts of social media applications where they are re-combined in multiple ways to generate new aggregate services. Public situated displays are an emergent area where dynamic sources can also play a key role in providing situated and frequently updated content. However, the specificities of public displays raise the need for automated selection of the most relevant sources to present. This study addresses relevance from the perspective of timeliness. We propose a timeliness model that supports the most common types of dynamic source. To validate that model, we set up an experiment with a public display exhibiting content from dynamic sources and receiving feedback from users on its timeliness. The results from this experiment suggest a reasonable match between our model and the users’ perspectives on timeliness. The results also show that the model is able to make comparative calculations of timeliness for different types of dynamic source. These results enable us to conclude that timeliness functions may help to significantly increase the relevance of content automatically selected from dynamic sources.
Download

Paper Nr: 166
Title:

AUTOMATIC GENERATION OF CONCEPT TAXONOMIES FROM WEB SEARCH DATA USING SUPPORT VECTOR MACHINE

Authors:

Robertas Damasevicius

Abstract: Ontologies and concept taxonomies are essential parts of the Semantic Web infrastructure. Since manual construction of taxonomies requires considerable effort, automated methods for taxonomy construction should be considered. In this paper, an approach for the automatic derivation of concept taxonomies from web search results is presented. The method is based on generating derivative features from web search data and applying machine learning techniques. A Support Vector Machine (SVM) classifier is trained with known hyponym-hypernym concept pairs, and the resulting classification model is used to predict new hyponymy (is-a) relations. Prediction results are used to generate concept taxonomies in OWL. The results of applying the approach to the construction of a colour taxonomy are presented.
Download

Paper Nr: 172
Title:

Finding Non-Obvious Profiles by Using Ant-Algorithms

Authors:

Sascha Kaufmann and Thomas Ambrosi

Abstract: Visitors on a website are usually on their own when they are moving around. First time visitors especially are guessing where to find the information they are looking for. In this paper we will show a way to combine the concepts of non-obvious profiles and ant-algorithms to come up with a set of paths, which tries to cover the user’s interests in a proper way. These paths can be used to give recommendations to visitors. While the profiles help to get a better understanding of the users’ interests, the concept of ant-algorithms is employed to determine recently and frequently used paths to lead the user to the desired information. We explain the basic idea of our approach, the current state of the prototype realized and first results.
Download
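
The role of ant-algorithms in the abstract above, favouring paths that are used both recently and frequently, can be sketched with a toy pheromone table. The class name, the evaporation and deposit parameters, and the update schedule are all assumptions made for illustration; the abstract does not describe the prototype's actual rules, and the non-obvious profiles are not modelled here.

```python
from collections import defaultdict
from typing import Optional


class PheromoneTrails:
    """Toy pheromone table over page-to-page links (hypothetical design)."""

    def __init__(self, evaporation: float = 0.5, deposit: float = 1.0):
        self.evaporation = evaporation  # fraction of each trail lost per session
        self.deposit = deposit          # amount added per traversed link
        self.trails = defaultdict(float)

    def record_session(self, path: list) -> None:
        """Evaporate all trails, then reinforce the links of this session."""
        for edge in self.trails:
            self.trails[edge] *= 1.0 - self.evaporation
        for src, dst in zip(path, path[1:]):
            self.trails[(src, dst)] += self.deposit

    def recommend(self, page: str) -> Optional[str]:
        """Return the successor of `page` with the strongest trail, if any."""
        candidates = {
            dst: weight for (src, dst), weight in self.trails.items() if src == page
        }
        return max(candidates, key=candidates.get) if candidates else None


# Toy sessions, invented for illustration.
trails = PheromoneTrails()
trails.record_session(["home", "docs", "api"])
trails.record_session(["home", "blog"])
trails.record_session(["home", "docs", "faq"])

print(trails.recommend("home"))  # "docs": used twice, once very recently
print(trails.recommend("docs"))  # "faq": more recent than "api"
```

Evaporation is what makes old sessions fade, so a link that was popular long ago loses out to one that is used now.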

Paper Nr: 104
Title:

A TWO-LEVEL APPROACH TO WEB GENRE CLASSIFICATION

Authors:

Ulli Waltinger, Alexander Mehler and Armin Wegner

Abstract: This paper presents an approach to the two-level categorization of web pages. In contrast to related approaches, the model additionally explores and categorizes functionally and thematically demarcated segments of the hypertext types to be categorized. By classifying these segments, conclusions can be drawn about the type of the corresponding compound web document.
Download

Paper Nr: 158
Title:

Towards Social Search: From Explicit to Implicit Collaboration to Predict Users' Interests

Authors:

Luca Longo, Pierpaolo Dondio and Barrett Stephen

Abstract: The concept of social search has been acquiring importance in the WWW as large-scale collaborative computing environments have become feasible. This field focuses on the reader’s perspective in order to assign relevance and trustworthiness to web pages. Although current web searching technologies tend to rely on explicit human recommendations, these techniques are hard to scale because feedback is hard to obtain. Implicit feedback techniques, on the other hand, can collect data indirectly. The challenge is in producing implicit web rankings by reasoning over users’ activity during a web search without recourse to explicit human intervention. This paper presents a comparison between users’ explicit and implicit feedback on web pages. An experiment was performed in which 25 volunteers explicitly evaluated the usefulness of 12 thematic web sites while their web browsing activity was gathered implicitly. The results obtained indicate a strong correlation between the explicit judgments and the generated implicit feedback.
Download

Paper Nr: 169
Title:

Concept based query and document expansion using hidden Markov model

Authors:

Jiuling Zhang, Zuoda Liu, Beixing Deng and Xing Li

Abstract: Query and document expansion techniques have been widely studied for improving the effectiveness of information retrieval. In this paper, we propose a method for concept-based query and document expansion employing the hidden Markov model (HMM). WordNet is adopted as the thesaurus of concepts and terms. Expanded query and document candidates are generated based on the concepts recovered from the original query or document term sequence by the hidden Markov model. Using 50000 web pages crawled from university homepages as our test collection and the Lemur Toolkit as our retrieval tool, a preliminary experiment on query expansion shows that the scores of the top 20 retrieved documents increase by 2.7113 on average. The number of documents with a score higher than a given value also increased significantly.
Download