Engelbert Mephu Nguifo

Université Blaise-Pascal, F- Clermont-Ferrand, Mathematics and Computer Science, Faculty Member

Co-authors

Sabeur Aridhi

Aalto University, School of Science

INSAT

Indian Council of Agricultural Research

Papers by Engelbert Mephu Nguifo

Domain knowledge-based model for phenotype prediction of ionizing-radiation-resistance in bacteria

by Sabeur Aridhi and Engelbert Mephu Nguifo

F1000posters, Oct 7, 2014

A new approach based on Galois lattice for concept learning

A perceptual hash function to store and retrieve large scale DNA sequences

This paper proposes a novel approach for storing and retrieving massive DNA sequences. The method is based on a perceptual hash function, commonly used to determine the similarity between digital images, which we adapted for DNA sequences. The perceptual hash function presented here is based on a Discrete Cosine Transform Sign Only (DCT-SO). Each nucleotide is encoded as a fixed gray-level intensity pixel, and the hash is calculated from its significant frequency characteristics. This results in a drastic data reduction between the sequence and the perceptual hash. Unlike cryptographic hash functions, perceptual hashes are not affected by the "avalanche effect" and can thus be compared. The similarity distance between two hashes is estimated with the Hamming distance, which is used to retrieve DNA sequences. Our experiments show that the approach is relevant for storing and retrieving massive DNA sequences.
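The hashing idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the gray-level mapping `GRAY`, the hash length `n_bits`, the choice to skip the DC coefficient, and the function names are all assumptions made for illustration. The sketch encodes each nucleotide as an intensity, keeps only the signs of the low-frequency DCT-II coefficients, and compares two hashes with the Hamming distance.

```python
import math

# Hypothetical gray-level mapping; the intensities used in the paper are not stated here.
GRAY = {"A": 0, "C": 85, "G": 170, "T": 255}


def dct_sign_hash(seq, n_bits=16):
    """Return n_bits sign bits taken from low-frequency DCT-II coefficients of seq."""
    if not seq:
        raise ValueError("sequence must not be empty")
    signal = [GRAY[base] for base in seq.upper()]
    n = len(signal)
    bits = []
    # Start at k = 1: the k = 0 (DC) term is never negative for non-negative
    # intensities, so its sign carries no information (an assumption of this sketch).
    for k in range(1, n_bits + 1):
        coeff = sum(
            x * math.cos(math.pi * (i + 0.5) * k / n)
            for i, x in enumerate(signal)
        )
        bits.append(1 if coeff >= 0 else 0)
    return bits


def hamming(a, b):
    """Count the positions at which two equal-length bit lists differ."""
    if len(a) != len(b):
        raise ValueError("hashes must have the same length")
    return sum(x != y for x, y in zip(a, b))


if __name__ == "__main__":
    h1 = dct_sign_hash("ACGTACGTTGCAACGTACGTTGCA")
    h2 = dct_sign_hash("ACGTACGTTGCAACGTACGATGCA")
    print(hamming(h1, h2))
```

A small Hamming distance between two hashes would then be read as a sign that the underlying sequences are similar; how well this holds in practice depends on the encoding and on the number of coefficients kept.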

LEarning with GALois Lattice : Un système d'apprentissage de concepts à partir d'exemples

Concevoir une abstraction à partir de ressemblances

http://www.theses.fr, 1993

A Multiparadigm Intelligent Tutoring System for Robotic Arm Training

IEEE Transactions on Learning Technologies, 2000

Machine Learning operators meet Dialogue operators

Une étude des algorithmes de construction d'architecture des réseaux de neurones multicouches

A comparative study of HMM classifier on protein sequence classification

F1000posters, Aug 25, 2011

Une étude des algorithmes de construction d'architecture des réseaux de neurones multicouches

F Egc, 2007

Une nouvelle méthode d'alignement et de visualisation d'ontologies OWL-Lite

F Egc, 2007

Efficient construction of the lattice of frequent closed patterns and simultaneous extraction of generic bases of rules

In the last few years, the amount of data collected in various computer science applications has grown considerably. These large volumes of data need to be analyzed in order to extract useful hidden knowledge. This work focuses on association rule extraction, one of the most popular techniques in data mining. However, the number of extracted association rules is often very high, and many of them are redundant. In this paper, we propose a new algorithm called PRINCE. Its main feature is the construction of a partially ordered structure for extracting subsets of association rules, called generic bases. These subsets represent the whole association rule set without loss of information. To reduce the cost of this construction, the partially ordered structure is built from the minimal generators associated with frequent closed patterns. The closed patterns are derived together with the generic bases through a simple bottom-up traversal of the obtained structure. The experiments we carried out on benchmark and "worst case" contexts showed the efficiency of the proposed algorithm compared to algorithms such as CLOSE, A-CLOSE and TITANIC.
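The notions used in this abstract (closed patterns, minimal generators, and exact generic rules) can be illustrated on a toy example. The sketch below is not PRINCE: it enumerates itemsets by brute force rather than building the partially ordered structure, and the transaction data, function names, and `min_support` value are assumptions chosen for illustration. It uses the standard definitions: the closure of an itemset is the intersection of all transactions containing it, and a minimal generator is an itemset whose proper subsets all have strictly larger support.

```python
from itertools import combinations

# Hypothetical toy context: each transaction is a set of items.
TRANSACTIONS = [
    {"a", "c", "d"},
    {"b", "c", "e"},
    {"a", "b", "c", "e"},
    {"b", "e"},
    {"a", "b", "c", "e"},
]


def support(itemset, transactions):
    """Number of transactions that contain every item of itemset."""
    return sum(1 for t in transactions if itemset <= t)


def closure(itemset, transactions):
    """Intersection of all transactions containing itemset (itself if none do)."""
    covering = [t for t in transactions if itemset <= t]
    if not covering:
        return frozenset(itemset)
    return frozenset(set.intersection(*(set(t) for t in covering)))


def is_minimal_generator(itemset, transactions):
    """True if removing any single item strictly increases the support."""
    s = support(itemset, transactions)
    return all(support(itemset - {i}, transactions) > s for i in itemset)


def exact_generic_rules(transactions, min_support):
    """Yield (generator, consequent) pairs: generator -> closure minus generator."""
    items = sorted(set().union(*transactions))
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            g = frozenset(combo)
            if support(g, transactions) < min_support:
                continue
            if not is_minimal_generator(g, transactions):
                continue
            c = closure(g, transactions)
            if c != g:
                yield g, c - g


if __name__ == "__main__":
    for premise, conclusion in exact_generic_rules(TRANSACTIONS, min_support=2):
        print(sorted(premise), "->", sorted(conclusion))
```

Each rule produced this way holds with confidence 1, since a generator and its closure have the same support by construction.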

Optimized Mining of a Concise Representation for Frequent Patterns based on Disjunctions Rather than Conjunctions

Twenty Third International Flairs Conference, 2010

Les itemsets essentiels fermés : une nouvelle représentation concise

F Egc, 2007

Cost Models for Distributed Pattern Mining in the Cloud

2015 IEEE Trustcom/BigDataSE/ISPA, 2015

Smoothing 3D protein structure motifs through graph mining and amino-acids similarities

One of the most powerful techniques for studying proteins is to look for recurrent fragments (also called substructures or spatial motifs) and then use them as patterns to characterize the proteins under study. An emerging trend consists in parsing the three-dimensional (3D) structures of proteins into graphs of amino acids. The search for recurrent substructures is then formulated as a process of frequent subgraph discovery, where each subgraph represents a 3D-motif. Several efficient approaches for frequent 3D-motif discovery have been proposed in the literature. However, the set of discovered 3D-motifs is too large to be efficiently analyzed and explored in any further process. In this paper, we propose a novel pattern selection approach that shrinks the large number of discovered frequent 3D-motifs by selecting the representative ones. Existing pattern selection approaches do not exploit domain knowledge. In our approach, we incorporate the evolutionary information of amino acids defined in substitution matrices in order to select the representative 3D-motifs. We show the effectiveness of our approach on a number of real datasets. The results of our experiments show that our approach detects relations between patterns that current subgraph selection approaches fail to detect, and that it considerably decreases the number of motifs while enhancing their interestingness.

Towards a constructive multilayer perceptron for regression task using non-parametric clustering. A case study of Photo-Z redshift reconstruction

The choice of architecture for an artificial neural network (ANN) is still a challenging task that users face every time, and it greatly affects the accuracy of the resulting network. In fact, there is no optimal method that is applicable to various implementations at the same time. In this paper, we propose a clustering-based method for constructing ANNs that resolves the problems of random and ad hoc approaches to multilayer ANN architecture. Our method can be applied to regression problems. Experimental results obtained with different datasets reveal the efficiency of our method.

CLANN: Concept Lattice-based Artificial Neural Network for Supervised Classification

CLA, 2007

Multi-layer neural networks have been successfully applied in a wide range of supervised and unsupervised learning applications. Because they often produce incomprehensible models, they are not widely used in data mining applications. To avoid this limitation, comprehensible models have previously been introduced that use a priori knowledge to build the network architecture. They allow neural network methods to earn a place in the toolboxes of data mining specialists. However, since a priori knowledge is not always available for every new dataset, we propose a novel approach that generates a concept semi-lattice from the initial dataset in order to build the neural network architecture directly. The experiments carried out showed the soundness and efficiency of our approach on various UCI datasets.

Discovering "Factual" and "Implicative" generic association rules

CAp, 2005

Mining Representative Unsubstituted Graph Patterns Using Prior Similarity Matrix

One of the most powerful techniques for studying protein structures is to look for recurrent fragments (also called substructures or spatial motifs) and then use them as patterns to characterize the proteins under study. An emerging trend consists in parsing the three-dimensional (3D) structures of proteins into graphs of amino acids. The search for recurrent spatial motifs is then formulated as a process of frequent subgraph discovery, where each subgraph represents a spatial motif. Several efficient approaches for frequent subgraph discovery have been proposed in the literature. However, the set of discovered frequent subgraphs is too large to be efficiently analyzed and explored in any further process. In this paper, we propose a novel pattern selection approach that shrinks the large number of discovered frequent subgraphs by selecting the representative ones. Existing pattern selection approaches do not exploit domain knowledge. In our approach, we incorporate the evolutionary information of amino acids defined in substitution matrices in order to select the representative subgraphs. We show the effectiveness of our approach on a number of real datasets. The results of our experiments show that our approach considerably decreases the number of motifs while enhancing their interestingness.
