Our previous work suggests that fractal texture features are useful for detecting pediatric brain tumors in multimodal MRI. In this study, we systematically investigate the efficacy of several image features, such as intensity, fractal texture, and level-set shape, in the segmentation of posterior-fossa (PF) tumors in pediatric patients. We explore the effectiveness of four feature selection techniques and three segmentation techniques in discriminating tumor regions from normal tissue in multimodal brain MRI. We further study the selective fusion of these features for improved PF tumor segmentation. Our results suggest that the Kullback-Leibler divergence measure for feature ranking and selection, together with the expectation maximization algorithm for feature fusion and tumor segmentation, offers the best results for the patient data in this study. We show that for the T1 and fluid attenuation inversion recovery (FLAIR) MRI modalities, the best PF tumor segmentation is obtained using a texture feature such as multifractional Brownian motion (mBm), while for T2 MRI it is obtained by fusing level-set shape with intensity features. In multimodality fused MRI (T1, T2, and FLAIR), the mBm feature offers the best PF tumor segmentation performance. We use different similarity metrics to evaluate the quality and robustness of these selected features for PF tumor segmentation in MRI for ten pediatric patients.
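As an illustration of Kullback-Leibler divergence used for feature ranking, the sketch below scores each feature by how differently its values are distributed in tumor and normal samples. The abstract does not say how the divergence was estimated, so the histogram estimate, the symmetric form, the bin count, and the names `histogram`, `kl_divergence`, and `rank_features` are all assumptions made for this example, not the authors' procedure.

```python
import math


def histogram(values, bins, lo, hi, eps=1e-9):
    """Normalized histogram over [lo, hi]; eps keeps every bin nonzero."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)  # clamp v == hi into last bin
        counts[idx] += 1
    probs = [c / len(values) + eps for c in counts]
    norm = sum(probs)
    return [p / norm for p in probs]


def kl_divergence(p, q):
    """Discrete KL divergence D(p || q) in nats."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def rank_features(tumor, normal, bins=8):
    """Rank features by symmetric KL divergence between the two classes.

    tumor, normal: dict mapping a feature name to a list of sample values.
    Returns a list of (name, score) pairs, most discriminative first.
    """
    scores = {}
    for name in tumor:
        both = tumor[name] + normal[name]
        lo, hi = min(both), max(both)
        if hi == lo:  # constant feature: avoid a zero-width histogram
            hi = lo + 1.0
        p = histogram(tumor[name], bins, lo, hi)
        q = histogram(normal[name], bins, lo, hi)
        scores[name] = kl_divergence(p, q) + kl_divergence(q, p)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy values, not data from the study.
tumor_samples = {"texture": [0.8, 0.9, 0.85, 0.95], "intensity": [0.4, 0.5, 0.6, 0.5]}
normal_samples = {"texture": [0.1, 0.2, 0.15, 0.05], "intensity": [0.5, 0.4, 0.6, 0.5]}
for feature, score in rank_features(tumor_samples, normal_samples):
    print(feature, round(score, 3))
```

A feature whose class-conditional histograms barely overlap receives a high score, while a feature distributed identically in both classes scores zero and would be dropped first.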
In the On Line Analytical Processing (OLAP) context, exploration of huge and sparse data cubes is a tedious task which does not always lead to efficient results. In this paper, we couple OLAP with the Multiple Correspondence Analysis (MCA) in order to enhance ...
This paper presents a probabilistic framework, QARLA, for the evaluation of text summarisation systems. The input of the framework is a set of manual (reference) summaries, a set of baseline (automatic) summaries and a set of similarity metrics between summaries. It provides i) a measure to evaluate the quality of any set of similarity metrics, ii) a measure to evaluate the quality of a summary using an optimal set of similarity metrics, and iii) a measure to evaluate whether the set of baseline summaries is reliable or may produce biased results. Compared to previous approaches, our framework is able to combine different metrics and evaluate the quality of a set of metrics without any a priori weighting of their relative importance. We provide quantitative evidence about the effectiveness of the approach to improve the automatic evaluation of text summarisation systems by combining several similarity metrics.
Data clustering is a technique for partitioning a set of objects into a known number of groups. Several approaches are widely applied to data clustering so that objects within a cluster are similar and objects in different clusters are far away from each other. K-Means is a familiar center-based clustering algorithm, since it is easy to implement and converges quickly. However, the K-Means algorithm is sensitive to initialization and can therefore become trapped in local optima. The Flower Pollination Algorithm (FPA) is a global optimization technique that avoids becoming trapped in a local optimum. In this paper, a novel hybrid data clustering approach using the Flower Pollination Algorithm and K-Means (FPAKM) is proposed. The results of the proposed algorithm are compared with those of K-Means and FPA on eight datasets. The experimental results show that FPAKM performs better than FPA and K-Means.
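The initialization problem mentioned above can be seen with plain K-Means alone. The following is a minimal sketch of the standard Lloyd iteration, not the FPAKM hybrid (whose details the abstract does not give); the function names and the toy points are assumptions chosen so that two different starting centers converge to two different solutions.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, centers, max_iter=100):
    """Plain Lloyd's K-Means from the given initial centers.

    Returns (final_centers, sse), where sse is the sum of squared
    distances from each point to its nearest final center.
    """
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        new_centers = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centers[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centers == centers:  # converged
            break
        centers = new_centers
    sse = sum(min(sq_dist(p, c) for c in centers) for p in points)
    return centers, sse


# Four corners of a wide rectangle: the natural split is left vs. right.
points = [(0, 0), (0, 1), (4, 0), (4, 1)]
print(kmeans(points, [(0, 0.5), (4, 0.5)]))  # starts near the left/right split
print(kmeans(points, [(2, 0), (2, 1)]))      # starts on the bottom/top split
```

Both runs stop at a fixed point of the iteration, but the second one settles on the bottom/top split with a much larger error. This is the kind of local optimum that motivates pairing K-Means with a global search step.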
The transport sector emits a wide variety of gases and aerosols, with distinctly different characteristics, which influence climate directly and indirectly via chemical and physical processes. Tools that allow these emissions to be placed on some kind of common scale in terms of their impact on climate have a number of possible uses, such as: in agreements and emission trading schemes; when considering potential trade-offs between changes in emissions resulting from technological or operational developments; and/or for comparing the different environmental impacts of transport activities.
A new class of distances appropriate for measuring similarity relations between sequences, say one type of similarity per distance, is studied. We propose a new "normalized information distance," based on the noncomputable notion of Kolmogorov complexity, and show that it is in this class and it minorizes every computable distance in the class (that is, it is universal in that it discovers all computable similarities). We demonstrate that it is a metric and call it the similarity metric. This theory forms the foundation for a new practical tool. To evidence generality and robustness, we give two distinctive applications in widely divergent areas using standard compression programs like gzip and GenCompress. First, we compare whole mitochondrial genomes and infer their evolutionary history. This results in the first completely automatically computed whole mitochondrial phylogeny tree. Secondly, we fully automatically compute the language tree of 52 different languages.
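Because Kolmogorov complexity is noncomputable, practical use of this idea replaces it with the output length of a real compressor. The sketch below shows the commonly used compression-based approximation, (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed size in bytes. Using Python's `zlib` module (the DEFLATE algorithm that gzip also uses) in place of the gzip program is an assumption of this example, and the function names are hypothetical.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (stand-in for C(x))."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Compression-based approximation of the normalized information distance.

    Values near 0 mean the inputs share most of their information;
    values near 1 mean they share almost none.
    """
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


# Hypothetical toy inputs: two related sentences and an unrelated one.
a = b"the quick brown fox jumps over the lazy dog " * 20
b = b"the quick brown fox leaps over the lazy cat " * 20
c = bytes(range(256)) * 4
print(ncd(a, b), ncd(a, c))
```

Real compressors are only approximations, so the value can slightly exceed 1 or fail to be exactly 0 for identical inputs. With zlib the inputs should also stay well under its 32 KB window, otherwise shared content in the concatenation goes unnoticed.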
Visualizations of static networks in the form of node-link diagrams have evolved rapidly, though researchers are still grappling with how best to show evolution of nodes over time in these diagrams. This paper introduces NetVisia, a social network visualization system designed to support users in exploring temporal evolution in networks by using heat maps to display node attribute changes over time. NetVisia's novel contributions to network visualizations are to (1) cluster nodes in the heat map by similar metric values instead of by topological similarity, and (2) align nodes in the heat map by events. We compare NetVisia to existing systems and describe a formative user evaluation of a NetVisia prototype with four participants that emphasized the need for tooltips and coordinated views. Despite the presence of some usability issues, in 30-40 minutes the user evaluation participants discovered new insights about the data set which had not been discovered using other systems. We...
This paper investigates the identification of laryngeal disorders using cepstral parameters of the human voice. Mel-frequency cepstral coefficients (MFCCs), extracted from audio recordings of a patient’s voice, are further approximated using various strategies (sampling, averaging, and clustering by Gaussian mixture model). The effectiveness of similarity-based classification techniques in categorizing such pre-processed data into normal voice, nodular, and diffuse vocal fold lesion
Abstract: Several scene-change detection algorithms have been proposed in the literature. Most of them use fixed thresholds for the similarity metrics used to decide whether a change has occurred. These thresholds are obtained empirically or they must be calculated ...
This paper describes the development of a new Internet Information Agent (IIA) that uses similarity-based methods to search the Internet. The Agent works by analysing a sample of the type of text that is known to be of interest to the user. It then extracts a number of linguistic features and stores these as a feature vector that is used to describe the content of the document. This data is then used as input to a range of similarity metrics that allow the agent to compare new texts with the original and thereby acquire "more of the same".
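The feature-vector comparison described above can be sketched as follows. The abstract does not list the linguistic features or the similarity metrics the agent uses, so word counts and cosine similarity are stand-ins assumed for this example, and `feature_vector`, `cosine_similarity`, and `rank_candidates` are hypothetical names.

```python
import math
from collections import Counter


def feature_vector(text: str) -> Counter:
    """Bag-of-words vector: lowercase word counts (an assumed feature set)."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse count vectors (0.0 to 1.0)."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


def rank_candidates(sample: str, candidates: list) -> list:
    """Order candidate texts from most to least similar to the sample."""
    reference = feature_vector(sample)
    return sorted(
        candidates,
        key=lambda text: cosine_similarity(reference, feature_vector(text)),
        reverse=True,
    )


# Hypothetical texts standing in for the user's sample and retrieved pages.
sample = "similarity metrics for comparing text documents"
pages = [
    "recipes for baking bread at home",
    "comparing text documents with similarity metrics",
]
print(rank_candidates(sample, pages))
```

Any metric defined on the same vectors could be swapped in for `cosine_similarity`, which is how a range of similarity measures can be compared against the same sample.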
In an age of increasingly large data sets, investigators in many different disciplines have turned to clustering as a tool for data analysis and exploration. Existing clustering methods, however, typically depend on several nontrivial assumptions about the structure of data. ...
Content-based image retrieval enables the user to search a database for visually similar images. In these scenarios, the user submits an example that is compared to the images in the database by their low-level characteristics such as colour, texture and shape. While visual similarity is essential for a vast number of applications, there are cases where a user needs to
Case-based reasoning (CBR) is a modern approach adopted for designing knowledge-based expert systems. It relies heavily on stored experiences, which serve as cases that can be employed for solving new problems by retrieving similar cases from the system and utilizing their solutions. The approach thus solves problems by reviewing, processing, and applying past experiences. The present study sheds light on a CBR application: a new system for assisting Internet users and solving the problems they face. The study focuses on the case retrieval stage. It aims to design and build an experienced inquiry system that uses a CBR dialogue to solve any problem Internet users might face. The system developed by the researcher operates by displaying similar cases through a dialogue, using a well-developed algorithm and a review of the relevant previous studies. It was found that the success rate of the proposed system is high.
We address the problem of matching imperfectly documented schemas of data streams and large databases. Instance-level schema matching algorithms identify likely correspondences between attributes by quantifying the similarity of their corresponding values. However, exact calculation of these similarities requires processing of all database records, which is infeasible for data streams. We devise a fast matching algorithm that uses