Rough clustering of sequential data
This paper presents a new indiscernibility-based rough agglomerative hierarchical clustering algorithm for sequential data. In this approach, the indiscernibility relation has been extended to a tolerance relation with the transitivity property being ...
Reasoning and change management in modular ontologies
The benefits of modular representations are well known from many areas of computer science. While in software engineering modularization is mainly a vehicle for supporting distributed development and re-use, in knowledge representation, the main goal of ...
Mapping, indexing and querying of MPEG-7 descriptors in RDBMS with IXMDB
MPEG-7 is a promising standard for the description of multimedia content. A number of applications based on MPEG-7 media descriptions have been set up for research, commercial and industrial applications. Therefore, an efficient storage solution for ...
Towards efficient variables ordering for Bayesian networks classifier
Traditionally, the task of learning Bayesian Networks (BNs) from data has been treated as an NP-hard search problem. To overcome such difficulty in terms of computational complexity, several approximations have been designed, such as imposing a previous ...
A knowledge representation model for the nuclear power generation domain
A knowledge representation model for the nuclear power field is proposed. The model is a generalized production rule function inspired by a neural network approach that enables the representation of physical systems of nuclear power plants. The article ...
Better mobile client's cache reusability and data access time in a wireless broadcast environment
Data broadcasting is an efficient data dissemination method in a wireless client-server system. A data server broadcasts data items periodically, and mobile clients cache data items to save communication bandwidth, resource usage, and data access time. ...
Efficient top-k processing in large-scaled distributed environments
The rapid development of networking technologies has made it possible to construct a distributed database that involves a huge number of sites. Query processing in such a large-scaled system poses serious challenges beyond the scope of traditional ...
An empirical study on selective partitioning dimensions for partition-based similarity joins
Real-world application data are usually distributed sparsely and non-uniformly in the high dimensional space that is huge in size. Hence, selection of effective partitioning dimensions is crucial for partition-based similarity joins. In this paper, we ...
Privacy preserving decision tree learning over multiple parties
Data mining over multiple data sources has emerged as an important practical problem with applications in different areas such as data streams, data-warehouses, and bioinformatics. Although the data sources are willing to run data mining algorithms in ...
DR-NEGOTIATE - A system for automated agent negotiation with defeasible logic-based strategies
This paper reports on a system for automated agent negotiation, based on a formal and executable approach to capture the behavior of parties involved in a negotiation. It uses the JADE agent framework, and its major distinctive feature is the use of ...
Combining classifiers for word sense disambiguation based on Dempster-Shafer theory and OWA operators
In this paper, we discuss a framework for weighted combination of classifiers for word sense disambiguation (WSD). This framework is essentially based on Dempster-Shafer theory of evidence [G. Shafer, A Mathematical Theory of Evidence, Princeton ...
WeR-trees
The R-tree has been proven to be one of the most practical and well-behaved data structures for accommodating dynamic massive sets of low-dimensionality geometric objects and conducting a very diverse set of queries on such data sets in real-world ...
On new scheduling policy for the improvement of firm RTDBSs performances
Earliest deadline first (EDF) is one of the main scheduling policies used in real-time database systems (RTDBSs) for transaction processing. With EDF, prioritized transactions are not necessarily the most important in the system. Moreover, it is well-...
Parameterized pattern queries
We introduce parameterized pattern queries as a new paradigm to extend traditional pattern expressions over sequence databases. A parameterized pattern is essentially a string made of constant symbols or variables where variables can be matched against ...
Incremental procedures for partitioning highly intermixed multi-class datasets into hyper-spherical and hyper-ellipsoidal clusters
Two procedures for partitioning large collections of highly intermixed datasets of different classes into a number of hyper-spherical or hyper-ellipsoidal clusters are presented. The incremental procedures are to generate a minimum number of hyper-...
Adaptive similarity search in streaming time series with sliding windows
The challenge in a database of evolving time series is to provide efficient algorithms and access methods for query processing, taking into consideration the fact that the database changes continuously as new data become available. Traditional access ...
A k-mean clustering algorithm for mixed numeric and categorical data
Use of the traditional k-mean type algorithm is limited to numeric data. This paper presents a clustering algorithm based on the k-mean paradigm that works well for data with mixed numeric and categorical features. We propose a new cost function and distance ...
Cell trees: An adaptive synopsis structure for clustering multi-dimensional on-line data streams
To effectively trace the clusters of recently generated data elements in an on-line data stream, a sibling list and a cell tree are proposed in this paper. Initially, the multi-dimensional data space of a data stream is partitioned into mutually ...
Privacy-preserving distributed association rule mining via semi-trusted mixer
Distributed data mining applications, such as those dealing with health care, finance, counter-terrorism and homeland defence, use sensitive data from distributed databases held by different parties. This comes into direct conflict with an individual's ...
Extracting generalization hierarchies from relational databases: A reverse engineering approach
Relational Database Management Systems (RDBMS) are currently the most popular database management systems. The relational model is a simple and powerful model for representing real world applications. However, it lacks the expressiveness of conceptual ...