
Integration of Distributed Learning Objects by Wrapper-Mediator Architecture


Integration of distributed Learning Objects by Wrapper-Mediator architecture

Angelo Chianese, Paolo Maresca, Vincenzo Moscato, Antonio Penta and Antonio Picariello
Dipartimento di Informatica e Sistemistica, Università Federico II, Napoli, Italy
{angchian,paomares,vmoscato,a.penta,picus}@unina.it

Abstract
The main problem addressed here is the definition of a strategy to improve the sharing and reusability of learning objects efficiently within an e-Learning community, reducing the cost of creating courses and other learning objects. In particular, in this paper we present a system for integrating heterogeneous multimedia LO repositories and retrieving distributed data. The system uses the LOM data model and a wrapper-mediator architecture to perform this task, thus providing a simple mechanism that enables integration while maintaining the independence of the repositories. Our system also supports a semantic search on the managed objects that makes the retrieval process more effective.

1 Introduction
With the advent of new technologies and the World Wide Web, enormous quantities of informative material are nowadays available on-line and, thanks to the convergence of services offered by ICT companies, the future will be increasingly characterized by the ability to provide and distribute multimedia information. In such a context, the integrated management of multimedia information such as images, graphics, video, audio and text is currently of great interest in many application fields, such as Information Retrieval, Office Automation, E-learning, Virtual Museums, Newspaper and Magazine production, Video and Cinema Editing, Medical Applications, Geographical Information Systems Management, Biometric Security Applications, and so on. Unfortunately, such media are not available in a single content container, but in heterogeneous and distributed repositories such as the World Wide Web, professional and personal databases of different kinds (Relational Databases, Object Databases, XML Databases, etc.), digital libraries and archives. The major challenges in this non-trivial task are due to the structural, syntactic and semantic heterogeneity of distributed multimedia data repositories, which makes the data management processes complex.

To solve such problems it is necessary to define a data model capable of representing the multimedia data in a single logical view that applications can use, inside an architecture able to support the management of such data efficiently. As for the data model, it is natural to model multimedia data as objects using the object-oriented paradigm, in order to capture both the variety of real data and the different related functionalities. As for the architecture, a well-known strategy for supporting distributed data integration is the adoption of wrapper/mediator middleware together with database technologies.

Modern e-Learning applications make increasingly extensive use of multimedia data to enhance and speed up the learning process. In the current view, one or more multimedia contents are assembled to generate learning resources, which are treated as Learning Objects (LOs). A Learning Object represents a small capsule of knowledge in a form suitable for didactic presentation and assimilation by learners. The standardization of LO metadata is expected to introduce a large degree of interoperability and re-use, promoting widespread investment in, and adoption of, this technology. Each learning object, being highly atomic and complete in capturing a concept or learning chunk, provides the opportunity to configure a large number of course variations. The resulting fine-grained course customization is expected to lead to just-in-time, just-enough, just-for-you training and performance support courseware. The IEEE Learning Technology Standards Committee (IEEE-LTSC P1484) has undertaken the initiative of drafting a set of standards, among which is a data model for Learning Object Metadata (LOM), useful for authoring and describing e-learning contents [7]. The main problem addressed here is the definition of a strategy to improve the sharing and reusability of LOs efficiently, reducing the cost of creating courses and other learning objects.

In particular, in this paper we present a system for integrating heterogeneous multimedia LO repositories and retrieving distributed data. The system uses the LOM data model and a wrapper-mediator architecture to perform this task, thus providing a simple mechanism that enables integration while maintaining the independence of the repositories. Our system also supports a semantic search on the managed objects that makes the retrieval process more effective. The paper is organized as follows. In Section 2 the main systems for learning object integration are described. In Section 3 we outline the data model for LO management and give a functional overview of the system architecture, based on the Wrapper/Mediator schema. In Section 4 we describe the retrieval process for LOs based on a semantic search, while in Section 5 some experimental results are discussed. Concluding remarks are given in Section 6.

2 Related Works
In the literature, several multimedia integration systems have been proposed. In the following we give a short description of the main general-purpose systems.

MediaLand [13] is a database system aiming to provide true support for multimedia data management. The objective of MediaLand is to provide an integrated framework that lets users with different levels of experience manage and search multimedia repositories easily, effectively, efficiently and intelligently. To meet these objectives, each multimedia item is represented as an object (described by dedicated metadata) and the correlation among different objects is expressed by means of links, so as to construct multimedia object graphs. In this way, the authors give a single conceptual structure for describing the multimedia data, which are then clustered in domains called media classes. The system has a 4-tier architecture and supports a multi-paradigm query approach to retrieval.

InfoSleuth [3] is an agent-based system that proposes a semantic approach to heterogeneous data integration. In particular, data integration is obtained by extracting a common view of the semantic content from multimedia repositories. This approach makes requests independent of the information structures, resolving the heterogeneity of data by means of a dedicated ontology. InfoSleuth also uses a specific language, KQML (Knowledge Query and Manipulation Language), for communication among agents.

Garlic [10] is a more complete and complex system that, similarly to some of the previous projects, uses an object-oriented approach to represent the data from different content servers in a uniform way. Unlike the other approaches, however, Garlic provides efficient query processing and data access layers for managing user queries. The data model is based on ODMG-93 and defines a dedicated language for data definition.

Impact [11] is an agent-based system capable of integrating heterogeneous information. The Impact architecture is based on two entities: Agents, software modules created by users or other agents and having high-level functions, and Impact Servers, representing the service infrastructure created by agents. A multi-agent paradigm makes it possible to integrate heterogeneous information using different agents with particular functions and services. A yellow-pages mechanism is used to manage service discovery.

At the same time, in the e-Learning domain, different systems for LO sharing and integration based on the P2P architecture have been developed. In the following the main proposals are briefly described.

LOMSTER [12] is a project that addresses the sharing of LOs on a P2P basis by using LOM fields both for indexing and searching data, and XML as representation and query language.

EDUTELLA [9] is a project that addresses the sharing and reusability of LOs on a P2P basis by combining RDF and the XML binding of LOM to support query, replication, mapping, mediation and clustering services.

ROSA-P2P [4] is a P2P distributed system which provides a physical environment to carry out the integration of LOs. The environment includes functionalities for aggregation, grouping definition, election, communication, balancing and redistribution of peers.

Finally, as regards the problem of retrieving multimedia data, in [2] a system based on the concept of a multimedia ontology (represented by the TAO XML language) is proposed to resolve the heterogeneity of the semantic content of the managed objects.

3 Overview of system functionalities


3.1 Data model
The LOM metadata describes a learning object and supports the reuse and search of such an object. According to the LOM standard, the metadata consists of nine categories:

General: this category groups the general information that describes the learning object as a whole, e.g. title, keywords and description.
Life Cycle: this category describes the history and the current state of the learning object and those entities that have affected it during its evolution.
Meta-Metadata: this category describes the metadata record itself: how the metadata instance can be identified, who created it, how, when, and with what references.
Technical: this category describes the technical requirements and characteristics of the learning object.
Educational: this category describes the key educational or pedagogic characteristics of the learning object.
Rights: this category describes the intellectual property rights and conditions of use for the learning object.
Relation: this category defines the relationship between the learning object and other learning objects.
Annotation: this category provides comments on the educational use of the learning object, and information on when and by whom the comments were created.
Classification: this category describes where the learning object falls within a particular classification system.
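As an illustration of the nine-category structure, the following minimal Python sketch models a LOM record and serializes it to XML. It is not the system's implementation: the class name `LomRecord`, the method names, the camelCase tag names and the sample values are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List
import xml.etree.ElementTree as ET

# The nine LOM categories, in the order used by the standard.
# The camelCase tag names are an assumption made for this sketch.
LOM_CATEGORIES = (
    "general", "lifeCycle", "metaMetadata", "technical", "educational",
    "rights", "relation", "annotation", "classification",
)


@dataclass
class LomRecord:
    """One metadata record: category name -> {element name: list of values}."""
    categories: Dict[str, Dict[str, List[str]]] = field(
        default_factory=lambda: {name: {} for name in LOM_CATEGORIES})

    def put(self, category: str, element: str, *values: str) -> None:
        """Store one or more values for an element of a known category."""
        if category not in self.categories:
            raise KeyError(f"unknown LOM category: {category}")
        self.categories[category][element] = list(values)

    def to_xml(self) -> str:
        """Serialize the record; categories with no elements are omitted."""
        root = ET.Element("lom")
        for name in LOM_CATEGORIES:
            elements = self.categories[name]
            if not elements:
                continue
            node = ET.SubElement(root, name)
            for element, values in elements.items():
                for value in values:
                    ET.SubElement(node, element).text = value
        return ET.tostring(root, encoding="unicode")


# Hypothetical sample data, for illustration only.
record = LomRecord()
record.put("general", "title", "Cache Memory")
record.put("general", "keyword", "cache", "memory hierarchy")
record.put("technical", "format", "video/mpeg")
xml_text = record.to_xml()
print(xml_text)
```

Keeping the category names in a fixed tuple makes the serialization order deterministic and rejects categories outside the nine listed above.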

3.2 Mediator/Wrapper Architecture


The proposed architecture is based on the classic Mediator-Wrapper schema, also used in [5], and tries to satisfy the main requirements of a multimedia database management system. In this kind of approach, the wrappers explore and examine the LO repositories and send the mediator an appropriate LOM XML description of the related information. On the other side, the mediator receives and organizes this information in order to create a single view over all repositories and so process the user queries. The system architecture has three functional layers: a client layer to submit queries, a mediator layer to manage data, and a wrapper layer to extract data.

The Mediator middleware (whose logical architecture is shown in Figure 1) has the following functionalities: classify and manage the LO XML descriptions sent by the wrappers; manage the user queries; manage the communication with the wrapper systems.

In the classification task, a STORAGE module takes the XML descriptions of the repositories' LOs from the wrappers and stores them in a dedicated database called METADATA DB, based on a native XML DBMS. A SEMANTIC MANAGER, based on a MULTIMEDIA KNOWLEDGE BASE, is also used after the storage stage to associate the repository data with a semantic concept, organized in learning semantic domains, to be stored in the METADATA DB. To this end, the module uses the information contained in some standard LOM metadata (e.g., title, keywords, description) of the multimedia objects.

During the query processing task, the user queries are submitted by means of a USER INTERFACE; this software component is also used to show the query results. The user queries are then processed by a QUERY ANALYZER, whose results and related information are stored in a system database called QUERY DB. The queries are then taken and compiled by a QUERY ENGINE, which uses the information stored in the METADATA DB and sends the query to the appropriate wrappers. The partial and global query results are then stored in another system database called RESULT DB. The results are managed by a RESULT HANDLER and by a TOP-K SELECTION module, which analyzes the results, chooses the best K of them by means of appropriate strategies, and reports them to an OBJECT RECOVERY module. In choosing the best results, semantic information about the objects (user keywords) is also considered by means of a SEMANTIC REFINING module, which uses a SEMANTIC NETWORK to discover hidden associations between objects having a similar semantic meaning. The SEMANTIC NETWORK is dynamically generated from a general knowledge base (in our case WordNet [8]) and is used for computing the semantic similarity metric described in Section 4. The communication with the wrappers is carried out by a dedicated component called WRAPPER INTERFACE, which sends the mediator requests and collects the meta-information and the related query results according to an XML-based protocol.

Figure 1. Mediator Architecture

The Wrapper middleware (whose logical architecture is shown in Figure 2), entirely developed using Java and XML technologies, has the following functionalities:

classify and manage the multimedia LOs defined by the repository administrators;
manage the mediator queries;
manage the communication with the mediator system.
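The mediator's query path described in this section (dispatch to the wrappers, collection of partial results, selection of the best K) can be sketched as follows. This is a deliberately simplified illustration, not the authors' implementation: the `Mediator` and `ScoredResult` names, the callable-based wrapper interface and the sample scores are assumptions, and the selection step is a plain sort on a single score rather than the Top-K strategy the paper refers to.

```python
import heapq
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class ScoredResult:
    lo_id: str        # identifier of the learning object (hypothetical field)
    repository: str   # repository that returned the object
    score: float      # relevance score reported by the wrapper


# In this sketch a wrapper is any callable: query text -> scored results.
Wrapper = Callable[[str], List[ScoredResult]]


class Mediator:
    """Keeps a registry of wrappers and merges their partial results."""

    def __init__(self) -> None:
        self._wrappers: Dict[str, Wrapper] = {}

    def register(self, name: str, wrapper: Wrapper) -> None:
        self._wrappers[name] = wrapper

    def query(self, text: str, k: int) -> List[ScoredResult]:
        partial: List[ScoredResult] = []
        for wrapper in self._wrappers.values():
            partial.extend(wrapper(text))      # collect partial results
        # Naive top-K: keep the K highest-scoring results overall.
        return heapq.nlargest(k, partial, key=lambda r: r.score)


# Two stand-in wrappers returning fixed, made-up results.
def relational_wrapper(text: str) -> List[ScoredResult]:
    return [ScoredResult("lo-1", "relational", 0.9),
            ScoredResult("lo-2", "relational", 0.4)]


def xml_wrapper(text: str) -> List[ScoredResult]:
    return [ScoredResult("lo-3", "xml", 0.7)]


mediator = Mediator()
mediator.register("relational", relational_wrapper)
mediator.register("xml", xml_wrapper)
top = mediator.query("cache memory", k=2)
print([r.lo_id for r in top])  # ['lo-1', 'lo-3']
```

Because the mediator only sees the wrapper interface, a repository can be added or replaced without touching the merge logic, which is the independence property the architecture aims for.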

Figure 2. Wrapper Architecture

In the classification task, the creation of the LO XML description is the responsibility of the LOM EXTRACTOR. All the information about the metadata defined on the created LO and used for query resolution is stored in a system database called WRAPPER DB. A WRAPPER ADMINISTRATOR INTERFACE is used to describe the repositories (i.e. whether they are a Relational Database, Object Database, Web Page, XML Database, etc.) and the location of the data. For example, in the case of a relational database, the administrator must indicate the tables and the stored or external procedures which must be exported for the class definition. This interface is also invoked every time an error occurs in the wrapper's automatic tasks. The query processing task is carried out by a QUERY EXECUTOR, activated when the wrapper receives a query request from the mediator. It analyzes the WRAPPER DB, takes the information about the LO and runs the query, in the optimal manner, on the repository. The communication with the mediator system is performed by a MEDIATOR INTERFACE module, which is composed of two fundamental sub-parts: the Communicator and the Analyzer. The Communicator has the task of managing the physical communication with the Mediator, while the Analyzer has the task of interpreting the mediator requests. Finally, as regards the supported multimedia queries, the following queries can be expressed with reference to the LOM description: metadata exact-matching queries; semantic keyword-based queries. The first class of queries can be solved by the classical SQL (XPath) approach, while the second requires techniques for multimedia data management.

4 Semantic Retrieval of LOs
In this section we describe an innovative metric which uses semantic information, based on textual annotations, to perform a more accurate retrieval process on LO repositories. The metric makes it possible to determine a degree of relatedness between the user's search keywords and the LOM object description by using a semantic net generated automatically from WordNet. The net is built dynamically from the information and the structure of WordNet (in WordNet the terms are organized through their linguistic properties: each term can have different meanings or senses (polysemy), depending on the topic area, and each sense is organized in synsets made up of synonyms). To this end, the system supplies an interface that helps the user choose the right sense of a term through the description retrieved from the WordNet structure (gloss). Once the sense and the appropriate synset have been chosen, a first core of the semantic net is built from all the terms contained in the synset; then, by exploiting the other WordNet linguistic properties related to the type of a given term (e.g., nouns, adjectives, verbs, adverbs), the semantic net is extended to obtain a strongly connected net, in which the relations between the different terms are labeled with normalized weights that take into account the strength of the relation. To measure the correlation between the terms in the LO metadata and the user data, the following metric is used:

$S_{t_i,t_j} = e^{-\alpha l} \cdot \frac{e^{\beta d} - e^{-\beta d}}{e^{\beta d} + e^{-\beta d}}$    (1)

where $\alpha \geq 0$ and $\beta > 0$ are two scaling parameters whose values have been defined by an experimental setup, and $i$ and $j$ are the indexes of the considered terms.

The above equation combines two quantities. The first is the normalized distance between the two terms, $l = \min_j \sum_{i=1}^{h_j} \frac{1}{\sigma_j}$, where $j$ spans all the paths between the two considered terms, $h_j$ is the number of hops in the $j$-th path and $\sigma_j$ is the weight assigned to the type of relations in the $j$-th path. The second is the depth $d$ of their subsumer from the root of the WordNet hierarchy ($d$ is computed using WordNet and considering the IS-A hyponymy-hyperonymy hierarchy only). Finally, the relatedness measure is obtained by combining the quantity $S_{t_i,t_j}$ with the weight of a given term $t_i$, calculated as $w_{t_i} = \frac{1}{poly(i)}$, where $poly(i)$ is the polysemy grade of $t_i$. More details about the construction of the semantic net and the metric used are reported in [1]. This approach is useful for resolving semantic heterogeneity in the description of the LOs, since the user can specify, in the query, keywords not directly present in the metadata but related to them by linguistic relations.
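The quantities above can be computed with a few lines of Python. This is a minimal sketch under stated assumptions: equation (1) is read as $S = e^{-\alpha l}\tanh(\beta d)$, since its second factor is exactly the hyperbolic tangent of $\beta d$; each path is given as a pair (number of hops, relation weight); the values of `alpha`, `beta` and the sample paths are placeholders, not the experimentally tuned values, which the paper does not report. The function names are hypothetical.

```python
import math
from typing import Iterable, Tuple


def path_length(paths: Iterable[Tuple[int, float]]) -> float:
    """Normalized distance l: the minimum over all paths of h_j / sigma_j.

    Each path is (h_j, sigma_j): its hop count and the weight of its
    relation type. Summing 1/sigma_j over h_j hops gives h_j / sigma_j.
    """
    return min(hops / sigma for hops, sigma in paths)


def relatedness(l: float, d: float, alpha: float, beta: float) -> float:
    """S = exp(-alpha * l) * tanh(beta * d), with alpha >= 0 and beta > 0."""
    return math.exp(-alpha * l) * math.tanh(beta * d)


def term_weight(polysemy: int) -> float:
    """w = 1 / poly: terms with many senses contribute less."""
    return 1.0 / polysemy


# Placeholder inputs: two candidate paths between the terms, subsumer depth 3.
l = path_length([(2, 1.0), (3, 0.8)])   # 2 / 1.0 is the shorter one
s = relatedness(l, d=3, alpha=0.2, beta=0.6)
w = term_weight(4)
print(l, round(s, 3), w)
```

The score decays exponentially with the weighted path length and saturates towards 1 as the common subsumer gets deeper, so closely linked terms that share a specific ancestor score highest.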

5 Experimental results
On one side, we consider two LO repositories containing 50 learning resources related to the Computer Science learning domain. In particular, the first repository is a MySQL relational database running on a Linux Slackware operating system. The second repository is a Tamino XML database running on a Windows XP operating system. The wrapper functionalities are provided through the USER INTERFACE. It is possible to: register a new wrapper and configure it for communication with the repository; describe the multimedia learning objects; set the parameters for communication with the mediator. Using the LOM EXTRACTOR, the wrapper administrator can export from the repository the LOM metadata for the various managed LOs. The LOM tree structure is then converted into an XML format, which is sent to the mediator together with the related data needed to associate a semantic meaning with the object. Finally, the description of the object is stored in the WRAPPER DB. The query processing to and from the mediator is managed by the QUERY EXECUTOR: this module uses the information in the WRAPPER DB to translate the mediator requests into the SQL or XPath format of the local DBMS. Finally, the query results and the related scores are sent to the mediator.

On the other side, the mediator functionalities are provided through two different graphical interfaces. By means of the USER INTERFACE, it is possible to: submit a query; view the query results; set some communication parameters in order to optimize the transmission flow towards the user. By means of the ADMINISTRATOR INTERFACE, it is possible to administer and configure the system. The main task of the mediator is to store the LOM descriptions produced by the two wrappers. If the classification succeeds, the mediator inserts a new object in the METADATA DB. The SEMANTIC MANAGER tries to associate a semantic concept with the wrapper objects, and to establish the semantic associations with the other objects, by using the described SEMANTIC NETWORK. The query processing is carried out in three phases: by means of the USER INTERFACE, all the data necessary for query execution (Top-K dimension, query weights) are collected and stored in the QUERY DB; the query stored in the QUERY DB is compiled by the QUERY ENGINE and sent to the wrappers; the query results are collected from the wrappers and, by means of the TOP-K SELECTION [6], the best ones are shown to the user. During the download, as specified in the user settings, the object resolution is adapted to the user's device type.

We have tested our semantic metric by performing 5 queries and calculating recall on the result set. In particular, we have used search keywords not present in the metadata of any managed LO, but related to the Computer Science learning domain. Table 1 summarizes the obtained results.

Table 1. Experimental Results
Query  Search keyword         TopK  Recall
1      Random Access Memory   50    98%
2      Cache Memory           50    92%
3      Operating System       50    95%
4      File System            50    89%

6 Conclusions
A learning object sharing and integration system has been presented. It provides a single unified database view over multiple LO repositories. The proposed data model is the IEEE LOM, while the system architecture is based on the Mediator/Wrapper schema in order to obtain a fine-grained description and organization of the data. The user query is processed by an ad hoc module and sent to the appropriate wrappers. The query results are computed taking into account their semantic meaning and using a dynamic multimedia semantic network. The reported preliminary results, with recall between 89% and 98% on the queries in Table 1, indicate the effectiveness of the proposed architecture.

References
[1] M. Albanese, A. Picariello, and A. M. Rinaldi, Semantic Search Engine for WEB Information Retrieval: an Approach Based on Dynamic Semantic Networks, Proc. of Semantic Web Workshop, ACM SIGIR 2004, Sheffield, UK, 2004.
[2] M. Albanese, P. Maresca, A. Picariello, and A. M. Rinaldi, Towards a Multimedia Ontology System: an Approach Using TAO XML, Proc. of Distributed Multimedia Systems Conference (DMS 2005), pp. 52-57, Banff, Canada, September 2005.
[3] R. J. Bayardo Jr. et al., InfoSleuth: Agent-Based Semantic Integration of Information in Open and Dynamic Environments, Proc. of ACM SIGMOD Int. Conf. on Management of Data, Vol. 26, Issue 2, 1997.
[4] G. A. D. D. Brito, ROSA - P2P: a Peer-to-Peer System for Learning Objects Integration on the Web, ACM SIGMOD Record, 2005.
[5] M. J. Carey, L. M. Haas, and P. M. Schwarz, Towards Heterogeneous Multimedia Information Systems: the Garlic Approach, Technical Report RJ 9911, IBM Almaden Research Center, 1994.
[6] R. Fagin, Fuzzy Queries in Multimedia Database Systems, Proc. of the 17th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, 1998.
[7] IEEE Learning Technology Standards Committee, IEEE LOM Working Draft 6.1.
[8] G. A. Miller, WordNet: A Lexical Database for English, Communications of the ACM, pp. 39-41, 1995.
[9] W. Nejdl and B. Wolf, EDUTELLA: A P2P Networking Infrastructure Based on RDF, Proceedings of WWW 2002, Honolulu (Hawaii, USA), 2002.
[10] M. T. Roth et al., The Garlic Project, Proceedings of ACM SIGMOD International Conference on Management of Data, Volume 25, Issue 2, 1996.
[11] V. S. Subrahmanian et al., IMPACT, Journal of Intelligent Information Systems, Volume 14, Issue 2-3, 2000.
[12] S. Ternier, E. Duval and P. Vandepitte, LOMster: Peer-to-Peer Learning Object Metadata, Proceedings of EdMedia-2002, pp. 1942-1943, Denver (Co., USA), 2002.
[13] J. Wen, Q. Li, W. Ma, and H. Zhang, A Multi-paradigm Querying Approach for a Generic Multimedia Database Management System, ACM SIGMOD Record, Volume 32, Issue 1, pp. 26-34, March 2003.
