Pattern mining is a valuable tool for exploratory data analysis, but identifying relevant patterns for a specific user is challenging. Various interestingness measures have been developed to evaluate patterns, but they may not efficiently estimate user-specific functions. Learning user-specific functions by ranking patterns has been proposed, but this requires significant time and training samples. In this paper, we present a solution that formulates the problem of learning pattern ranking functions as a multi-criteria decision-making problem. Our approach uses an analytic hierarchy process (AHP) to elicit weights for different interestingness measures based on user preference. We also propose an active learning mode with a sensitivity-based heuristic to minimize user ranking queries while still providing high-quality results. Experiments show that our approach significantly reduces running time and returns precise pattern ranking while being robust to user mistakes, compared to sta...
This paper addresses the problem of mining sequential patterns (SPM) from data represented as a set of sequences. In this work, we are interested in sequences of items in which each item is associated with its quantity. To the best of our knowledge, existing approaches cannot handle this kind of sequence under constraints. On the other hand, several proposals show the efficiency of constraint programming (CP) for solving SPM problems involving several kinds of constraints. In this paper, we propose the global constraint QSPM, which is an extension of the two CP-based approaches proposed in [5] and [7]. Experiments on real-life datasets show the efficiency of our approach, which allows many constraints to be specified, such as size, membership and regular expression constraints.
International Journal of Information Technology and Web Engineering
The Semantic Web uses ontologies to cope with the data heterogeneity problem. However, ontologies themselves become heterogeneous; this heterogeneity may occur at the syntactic, terminological, conceptual, and semantic levels. To solve this problem, alignments between entities of ontologies must be identified. This process is called ontology matching. In this paper, the authors propose a new method to extract alignments with multiple cardinalities using integer linear programming techniques. The authors conducted a series of experiments and compared the results with those of currently used methods. The obtained results show the efficiency of the proposed method.
Discovering relevant patterns for a particular user remains a challenging task in data mining. Several approaches have been proposed to learn user-specific pattern ranking functions. These approaches generalize well, but at the expense of running time. On the other hand, several measures are often used to evaluate the interestingness of patterns, in the hope of revealing a ranking that is as close as possible to the user-specific ranking. In this paper, we formulate the problem of learning pattern ranking functions as a multicriteria decision-making problem. Our approach aggregates different interestingness measures into a single weighted linear ranking function, using an interactive learning procedure that operates in either passive or active mode. A fast learning step is used to elicit the weights of all the measures by means of pairwise comparisons. This approach is based on the Analytic Hierarchy Process (AHP) and a set of user-ranked patterns to build a preference matrix, ...
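As an illustration of the weight elicitation step described above, here is a minimal sketch of how a pairwise preference matrix could be turned into weights and then into a weighted linear ranking. It is not the authors' implementation: the row geometric mean is used as a common approximation of the AHP priority vector, and the function names, the matrix and the pattern measure values are all hypothetical.

```python
import math

def ahp_weights(pairwise):
    """Approximate AHP priorities with the row geometric mean method.

    pairwise[i][j] says how much more important measure i is than measure j
    (so pairwise[j][i] is expected to be its reciprocal).
    """
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]

def rank_patterns(patterns, weights):
    """Sort pattern names by a weighted linear score, best first."""
    def score(name):
        return sum(w * m for w, m in zip(weights, patterns[name]))
    return sorted(patterns, key=score, reverse=True)

# Hypothetical preference matrix over three interestingness measures.
pairwise = [
    [1.0, 2.0, 6.0],
    [1 / 2, 1.0, 3.0],
    [1 / 6, 1 / 3, 1.0],
]
weights = ahp_weights(pairwise)

# Hypothetical patterns, each described by its three measure values.
patterns = {
    "p1": [0.9, 0.1, 0.2],
    "p2": [0.4, 0.8, 0.5],
    "p3": [0.2, 0.3, 0.9],
}
ranking = rank_patterns(patterns, weights)
print(weights, ranking)
```

The example matrix is perfectly consistent, so the geometric mean recovers the underlying weights exactly; with inconsistent user answers the result is only an approximation.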
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence
Constraint-based pattern mining is at the core of numerous data mining tasks. Unfortunately, the thresholds involved in these constraints cannot be easily chosen. This paper investigates a multi-objective optimization approach where several (often conflicting) functions need to be optimized at the same time. We introduce a new model for efficiently mining Pareto optimal patterns with constraint programming. Our model exploits condensed pattern representations to reduce the mining effort. To this end, we design a new global constraint for ensuring the closedness of patterns over a set of measures. We show how our approach can be applied to derive high-quality non-redundant association rules without the use of thresholds. Its added value is studied on both UCI datasets and a case study related to the analysis of gene expression data integrating multiple external gene annotations.
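To make the notion of Pareto optimal patterns concrete, the following sketch filters an already enumerated set of patterns down to those not dominated on any measure. This is a post-hoc illustration only, not the constraint programming model of the paper; the function names, the two measures and their values are hypothetical, and every measure is assumed to be maximized.

```python
def dominates(a, b):
    """True if measure vector a is at least as good as b everywhere
    and strictly better on at least one measure (all maximized)."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def pareto_front(patterns):
    """Keep the patterns that no other pattern dominates."""
    return {
        name: measures
        for name, measures in patterns.items()
        if not any(dominates(other, measures)
                   for other_name, other in patterns.items()
                   if other_name != name)
    }

# Hypothetical patterns described by (support, confidence).
patterns = {
    "p1": (10, 0.5),
    "p2": (8, 0.9),
    "p3": (7, 0.8),   # dominated by p2
    "p4": (10, 0.4),  # dominated by p1
}
front = pareto_front(patterns)
print(sorted(front))
```

No threshold on support or confidence is needed here: a pattern is kept purely because nothing else beats it on every measure at once.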
Conceptual clustering combines two long-standing machine learning tasks: the unsupervised grouping of similar instances and their description by symbolic concepts. In this paper, we decouple the problems of finding descriptions and forming clusters by first mining formal concepts (i.e. closed itemsets), and then searching for the best k clusters that can be described with those itemsets. Most existing approaches performing the two steps separately are of a heuristic nature and produce results of varying quality. Instead, we address the problem of finding an optimal constrained conceptual clustering by using integer linear programming techniques. Most other generic approaches for this problem tend to scale poorly. Our approach takes advantage of both techniques: the general framework of integer linear programming and the high-speed specialized approaches of data mining. Experiments performed on UCI datasets show that our approach efficiently finds clusterings of consistently high...
We propose an equitable conceptual clustering approach based on multi-agent optimization. In the context of conceptual clustering, each cluster is represented by an agent having its own satisfaction, and the problem consists in finding the best cumulative satisfaction while emphasizing a fair compromise between all individual agents. The fairness goal is achieved using an equitable formulation of the Ordered Weighted Averages (OWA) operator. Experiments performed on UCI datasets and on instances from a real ERP application show that our approach efficiently finds clusterings of consistently high quality.
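The role of the OWA operator in favouring fair compromises can be sketched as follows. The abstract does not give the exact equitable formulation, so this example assumes the usual convention: satisfactions are sorted from worst to best and combined with decreasing weights, so the worst-off agent counts most. The function name and all numbers are hypothetical.

```python
def owa(values, weights):
    """Ordered weighted average: weights are attached to positions in the
    sorted order of the values, not to particular agents."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    ordered = sorted(values)  # worst satisfaction first
    return sum(w * v for w, v in zip(weights, ordered))

# Decreasing weights: the least satisfied agent has the largest influence.
weights = [0.5, 0.3, 0.2]

balanced = owa([5, 5, 5], weights)    # every cluster equally satisfied
unbalanced = owa([9, 1, 5], weights)  # same total, unevenly distributed
print(balanced, unbalanced)
```

Both satisfaction vectors sum to 15, yet the balanced one receives the higher OWA value, which is the behaviour an equitable objective is meant to reward.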
In this paper, we study the use of Spearman’s rho correlation coefficient, first in a heuristic method, and then in combination with two exact methods based on constraint programming and mixed integer programming, to tackle the parameter elicitation problem in the lexicographic ordering (LO). Like all multicriteria optimization methods, the LO method has a parameter that should be fixed carefully. Indeed, the criteria usually conflict with each other, and thus finding appropriate parameter settings for a specific multicriteria method is challenging. This is why we propose elicitation methods in order to assist the Decision Maker (DM) in fixing the parameters. These methods require some prior knowledge that the DM can give easily. We present some numerical experiments, showing the effectiveness of our...
∗ This work is supported by the TASSILI research program 11MDU839 (France, Algeria) and by the research project PNR 70/TIC/2011 (Algeria).
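A small sketch may help show how Spearman's rho can guide parameter elicitation for a lexicographic ordering. It assumes the parameter is the priority order of the criteria and simply tries every order, keeping the one whose induced ranking correlates best with a ranking supplied by the DM. This brute-force search is an illustration only, not the heuristic or the exact methods of the paper; all names and data are hypothetical, and rankings are assumed to contain no ties.

```python
from itertools import permutations

def spearman_rho(rank_a, rank_b):
    """Spearman's rho for two tie-free rankings given as position lists."""
    n = len(rank_a)
    if n != len(rank_b) or n < 2:
        raise ValueError("rankings must have the same length, at least 2")
    d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))

def lex_ranking(alternatives, order):
    """Positions (1 = best) under a lexicographic order of the criteria,
    larger criterion values being preferred."""
    names = sorted(alternatives,
                   key=lambda a: tuple(alternatives[a][c] for c in order),
                   reverse=True)
    return {name: pos for pos, name in enumerate(names, start=1)}

def elicit_order(alternatives, dm_ranking):
    """Try every criteria order and keep the one closest to the DM's ranking."""
    names = list(alternatives)
    n_criteria = len(alternatives[names[0]])
    target = [dm_ranking[name] for name in names]

    def agreement(order):
        ranks = lex_ranking(alternatives, order)
        return spearman_rho([ranks[name] for name in names], target)

    return max(permutations(range(n_criteria)), key=agreement)

# Hypothetical alternatives with two criteria, and the DM's ranking of them.
alternatives = {"a": (3, 1), "b": (2, 3), "c": (1, 2)}
dm_ranking = {"b": 1, "c": 2, "a": 3}
best_order = elicit_order(alternatives, dm_ranking)
print(best_order)
```

Enumerating all orders is only feasible for a handful of criteria, which is one reason exact solver-based formulations are attractive for larger instances.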
Aircraft Engineering and Aerospace Technology, 2020
Purpose: Airspace sectorization is an important task, which has a significant impact on the everyday work of air control services. Especially in recent years, because of the constant increase in air traffic, existing airspace sectorization techniques have difficulty handling large air traffic volumes, creating imbalanced sectors and uneven workload distribution among sectors. The purpose of this paper is to propose a new approach to find an optimal airspace sectorization that balances the traffic controller workload between sectors, subject to airspace requirements. Design/methodology/approach: A constraint programming (CP) model called the equitable airspace sectorization problem (EQASP), which relies on ordered weighted averaging (OWA) multiagent optimization and a parallel portfolio architecture, has been developed; it integrates equity into an existing CP approach (Trandac et al., 2005). The EQASP was evaluated and compared with the method of Trandac et al. (2005), according to the q...
International Journal of Applied Metaheuristic Computing, 2015
This paper introduces a local search optimization technique for efficiently solving a financial portfolio design problem, which consists of assigning assets to portfolios while allowing a compromise between maximizing gains and minimizing losses. This practical problem usually arises in financial engineering, such as in the design of CDO-squared portfolios. This problem has been modeled by Flener et al., who proposed an exact method to solve it. It can be formulated as a quadratic program on the 0-1 domain. Exact solving approaches are known to be inefficient on difficult and large instances of quadratic integer programs. That is why this work has adopted a local search method. It proposes neighborhood and evaluation functions specialized for this problem. To boost the local search process, it also proposes a greedy algorithm to start the search with an optimized initial configuration. Experimental results on non-trivial instances of the problem show the effectiveness of...
In recent years, pattern mining has moved from a slow-moving repeated three-step process to a much more agile iterative/user-centric mining model. A vital ingredient of this framework is the ability to quickly present a set of diverse patterns to the user. In this paper, we use constraint programming (well suited to user-centric mining due to its rich constraint language) to efficiently mine a diverse set of closed patterns. Diversity is controlled through a threshold on the Jaccard similarity of pattern occurrences. We show that the Jaccard measure has no monotonicity property, which prevents usual pruning techniques and makes classical pattern mining unworkable. This is why we propose anti-monotonic lower and upper bound relaxations, which allow effective pruning, together with an efficient branching rule, boosting the whole search process. We show experimentally that our approach significantly reduces the number of patterns and is very efficient in terms of running times, particularly on de...
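The diversity criterion above can be illustrated with a short sketch that computes the Jaccard similarity between the occurrence sets of two patterns and greedily keeps only patterns that stay below a similarity threshold with respect to those already kept. This is a simple post-processing filter under assumed names and data, not the constraint programming approach with bound relaxations described in the abstract.

```python
def jaccard(cover_a, cover_b):
    """Jaccard similarity of two sets of transaction identifiers."""
    union = cover_a | cover_b
    if not union:
        return 0.0  # convention chosen here for two empty covers
    return len(cover_a & cover_b) / len(union)

def select_diverse(covers, jmax):
    """Greedily keep patterns whose similarity to every kept pattern
    does not exceed jmax (patterns are visited in insertion order)."""
    kept = []
    for name, cover in covers.items():
        if all(jaccard(cover, covers[k]) <= jmax for k in kept):
            kept.append(name)
    return kept

# Hypothetical patterns with the transactions in which they occur.
covers = {
    "A": {1, 2, 3, 4},
    "B": {1, 2, 3, 5},  # shares 3 of 5 transactions with A
    "C": {6, 7},        # disjoint from A and B
}
diverse = select_diverse(covers, jmax=0.5)
print(diverse)
```

Because the filter runs after enumeration, it cannot prune the search itself, which is exactly the limitation that bound-based pruning is meant to address.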
Extrapolation methods are used in numerical analysis to accelerate the convergence of real number sequences. Interval tightening algorithms produce interval vector sequences. Extrapolation can be applied directly to some of these sequences. Nevertheless, bounds are then no longer guaranteed. This paper investigates how to use extrapolation methods without losing solutions and reports some experimental results.
Papers by Yahia Lebbah