Search | arXiv e-print repository

Massive Dimensions Reduction and Hybridization with Meta-heuristics in Deep Learning

Authors: Rasa Khosrowshahli, Shahryar Rahnamayan, Beatrice Ombuki-Berman

Abstract: Deep learning is mainly based on utilizing gradient-based optimization for training Deep Neural Network (DNN) models. Although robust and widely used, gradient-based optimization algorithms are prone to getting stuck in local minima. In this modern deep learning era, the state-of-the-art DNN models have millions and billions of parameters, including weights and biases, making them huge-scale optimization problems in terms of search space. Tuning a huge number of parameters is a challenging task that causes vanishing/exploding gradients and overfitting; likewise, utilized loss functions do not exactly represent our targeted performance metrics. A practical solution to exploring large and complex solution space is meta-heuristic algorithms. Since DNNs exceed thousands and millions of parameters, even robust meta-heuristic algorithms, such as Differential Evolution, struggle to efficiently explore and converge in such huge-dimensional search spaces, leading to very slow convergence and high memory demand. To tackle the mentioned curse of dimensionality, the concept of blocking was recently proposed as a technique that reduces the search space dimensions by grouping them into blocks. In this study, we aim to introduce Histogram-based Blocking Differential Evolution (HBDE), a novel approach that hybridizes gradient-based and gradient-free algorithms to optimize parameters. Experimental results demonstrated that the HBDE could reduce the parameters in the ResNet-18 model from 11M to 3K during the training/optimizing phase by metaheuristics, namely, the proposed HBDE, which outperforms baseline gradient-based and parent gradient-free DE algorithms evaluated on CIFAR-10 and CIFAR-100 datasets showcasing its effectiveness with reduced computational demands for the very first time.
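
Editor's illustration (not from the paper): the abstract describes blocking as grouping parameters so that an optimizer searches over blocks instead of individual weights. A minimal sketch of one plausible reading follows, assuming equal-width histogram bins over the weight values and the bin mean as each block's representative. The function names `build_blocks` and `expand` and the binning rule are assumptions made for illustration; the actual HBDE procedure may differ.

```python
def build_blocks(weights, num_bins):
    """Group weights into equal-width histogram bins (assumed blocking rule)."""
    lo, hi = min(weights), max(weights)
    width = (hi - lo) / num_bins or 1.0  # guard against all-equal weights
    assignment = [min(int((w - lo) / width), num_bins - 1) for w in weights]
    sums = [0.0] * num_bins
    counts = [0] * num_bins
    for w, b in zip(weights, assignment):
        sums[b] += w
        counts[b] += 1
    # One representative value per block: the bin mean, or the bin midpoint if empty.
    centers = [
        sums[b] / counts[b] if counts[b] else lo + (b + 0.5) * width
        for b in range(num_bins)
    ]
    return assignment, centers


def expand(assignment, block_values):
    """Map a low-dimensional block vector back to a full weight vector."""
    return [block_values[b] for b in assignment]


weights = [0.0, 0.1, 0.9, 1.0]
assignment, centers = build_blocks(weights, num_bins=2)
# A gradient-free optimizer would search over vectors shaped like `centers`.
full = expand(assignment, centers)
```

Under this reading, the search space has `num_bins` dimensions regardless of how many weights the model holds, which is the kind of reduction the abstract reports.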

Submitted 13 August, 2024; originally announced August 2024.

Comments: 8 pages, 5 figures, 3 tables, accepted at IEEE CCECE 2024 (updated Fig. 1 and conclusion remarks)

arXiv:2407.17795 [pdf, other]

doi 10.1109/CEC60901.2024.10612084

Enhancing Diversity in Multi-objective Feature Selection

Authors: Sevil Zanjani Miyandoab, Shahryar Rahnamayan, Azam Asilian Bidgoli, Sevda Ebrahimi, Masoud Makrehchi

Abstract: Feature selection plays a pivotal role in the data preprocessing and model-building pipeline, significantly enhancing model performance, interpretability, and resource efficiency across diverse domains. In population-based optimization methods, the generation of diverse individuals holds utmost importance for adequately exploring the problem landscape, particularly in highly multi-modal multi-objective optimization problems. Our study reveals that, in line with findings from several prior research papers, commonly employed crossover and mutation operations lack the capability to generate high-quality diverse individuals and tend to become confined to limited areas around various local optima. This paper introduces an augmentation to the diversity of the population in the well-established multi-objective scheme of the genetic algorithm, NSGA-II. This enhancement is achieved through two key components: the genuine initialization method and the substitution of the worst individuals with new randomly generated individuals as a re-initialization approach in each generation. The proposed multi-objective feature selection method undergoes testing on twelve real-world classification problems, with the number of features ranging from 2,400 to nearly 50,000. The results demonstrate that replacing the last front of the population with an equivalent number of new random individuals generated using the genuine initialization method and featuring a limited number of features substantially improves the population's quality and, consequently, enhances the performance of the multi-objective algorithm.

Submitted 18 August, 2024; v1 submitted 25 July, 2024; originally announced July 2024.

Comments: 8 pages, 3 figures, published in IEEE WCCI 2024 conference, DOI added

MSC Class: 68T05 (Primary); 68T20; 68W20; 68W40; 90C29; 90C27; 62H30; 62H25 (Secondary) ACM Class: I.2.6; I.2.8; I.5; H.2.8

Journal ref: S. Z. Miyandoab, S. Rahnamayan, A. A. Bidgoli, S. Ebrahimi, and M. Makrehchi, "Enhancing Diversity in Multi-Objective Feature Selection," 2024 IEEE Congress on Evolutionary Computation (CEC), Yokohama, Japan, 2024, pp. 1-8

arXiv:2402.12646 [pdf, other]

doi 10.1109/SSCI52147.2023.10371958

Training Artificial Neural Networks by Coordinate Search Algorithm

Authors: Ehsan Rokhsatyazdi, Shahryar Rahnamayan, Sevil Zanjani Miyandoab, Azam Asilian Bidgoli, H. R. Tizhoosh

Abstract: Training Artificial Neural Networks poses a challenging and critical problem in machine learning. Despite the effectiveness of gradient-based learning methods, such as Stochastic Gradient Descent (SGD), in training neural networks, they do have several limitations. For instance, they require differentiable activation functions, and cannot optimize a model based on several independent non-differentiable loss functions simultaneously; for example, the F1-score, which is used during testing, can be used during training when a gradient-free optimization algorithm is utilized. Furthermore, the training in any DNN can be possible with a small size of the training dataset. To address these concerns, we propose an efficient version of the gradient-free Coordinate Search (CS) algorithm, an instance of General Pattern Search methods, for training neural networks. The proposed algorithm can be used with non-differentiable activation functions and tailored to multi-objective/multi-loss problems. Finding the optimal values for weights of ANNs is a large-scale optimization problem. Therefore, instead of finding the optimal value for each variable, which is the common technique in classical CS, we accelerate optimization and convergence by bundling the weights. In fact, this strategy is a form of dimension reduction for optimization problems. Based on the experimental results, the proposed method, in some cases, outperforms the gradient-based approach, particularly in situations with insufficient labeled training data. The performance plots demonstrate a high convergence rate, highlighting the capability of our suggested method to find a reasonable solution with fewer function calls. As of now, the only practical and efficient way of training ANNs with hundreds of thousands of weights is gradient-based algorithms such as SGD or Adam. In this paper, we introduce an alternative method for training ANNs.
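
Editor's illustration (not from the paper): the bundling idea can be sketched as a coordinate search that moves a whole group of weights by a shared step rather than probing each weight on its own. The function name `bundled_coordinate_search`, the fixed bundles, and the step-halving rule below are assumptions chosen for a small runnable example; the authors' efficient CS variant may bundle and update differently.

```python
def bundled_coordinate_search(loss, x, bundles, step=1.0, min_step=1e-6,
                              max_evals=10_000):
    """Gradient-free search that shifts every weight in a bundle together."""
    x = list(x)
    best = loss(x)
    evals = 1
    while step > min_step and evals < max_evals:
        improved = False
        for bundle in bundles:
            for delta in (step, -step):
                trial = list(x)
                for i in bundle:
                    trial[i] += delta  # same move for the whole bundle
                value = loss(trial)
                evals += 1
                if value < best:
                    x, best, improved = trial, value, True
                    break  # keep the improving move, go to the next bundle
        if not improved:
            step /= 2.0  # no bundle improved: refine the step size
    return x, best


# Toy loss with no gradient information used: squared distance to 3.0.
loss = lambda w: sum((wi - 3.0) ** 2 for wi in w)
weights, value = bundled_coordinate_search(loss, [0.0] * 4,
                                           bundles=[[0, 1], [2, 3]])
```

With two bundles over four weights, each sweep costs at most four loss evaluations instead of eight, which is the dimension-reduction effect the abstract refers to. Because only loss values are compared, `loss` could equally be a non-differentiable metric.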

Submitted 19 February, 2024; originally announced February 2024.

Comments: 7 pages, 9 figures

ACM Class: I.2.6

Journal ref: 2023 IEEE Symposium Series on Computational Intelligence (SSCI), pp. 1540-1546. IEEE, 2023

arXiv:2402.12625 [pdf, other]

doi 10.1109/SMC53992.2023.10394458

Compact NSGA-II for Multi-objective Feature Selection

Authors: Sevil Zanjani Miyandoab, Shahryar Rahnamayan, Azam Asilian Bidgoli

Abstract: Feature selection is an expensive challenging task in machine learning and data mining aimed at removing irrelevant and redundant features. This contributes to an improvement in classification accuracy, as well as the budget and memory requirements for classification, or any other post-processing task conducted after feature selection. In this regard, we define feature selection as a multi-objective binary optimization task with the objectives of maximizing classification accuracy and minimizing the number of selected features. In order to select optimal features, we have proposed a binary Compact NSGA-II (CNSGA-II) algorithm. Compactness represents the population as a probability distribution to enhance evolutionary algorithms not only to be more memory-efficient but also to reduce the number of fitness evaluations. Instead of holding two populations during the optimization process, our proposed method uses several Probability Vectors (PVs) to generate new individuals. Each PV efficiently explores a region of the search space to find non-dominated solutions instead of generating candidate solutions from a small population as is the common approach in most evolutionary algorithms. To the best of our knowledge, this is the first compact multi-objective algorithm proposed for feature selection. The reported results for expensive optimization cases with a limited budget on five datasets show that the CNSGA-II performs more efficiently than the well-known NSGA-II method in terms of the hypervolume (HV) performance metric requiring less memory. The proposed method and experimental results are explained and analyzed in detail.
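
Editor's illustration (not from the paper): a probability vector replaces a stored population by holding, per feature, the probability that the feature is selected. The sketch below follows the classic single-objective compact genetic algorithm update (sample two individuals, shift the vector toward the winner). The names `sample` and `update_pv`, the step size, and the toy winner rule are assumptions; CNSGA-II itself uses several PVs and non-dominated comparison, which is not reproduced here.

```python
import random


def sample(pv, rng):
    """Draw one binary feature mask from a probability vector."""
    return [1 if rng.random() < p else 0 for p in pv]


def update_pv(pv, winner, loser, step):
    """Shift each probability toward the winner where the two masks differ."""
    new_pv = []
    for p, w, l in zip(pv, winner, loser):
        if w != l:
            p += step if w == 1 else -step
        new_pv.append(min(1.0, max(0.0, p)))  # keep probabilities in [0, 1]
    return new_pv


rng = random.Random(0)
pv = [0.5] * 6  # start undecided about every feature
a, b = sample(pv, rng), sample(pv, rng)
# Toy single objective for illustration only: prefer fewer selected features.
winner, loser = (a, b) if sum(a) <= sum(b) else (b, a)
pv = update_pv(pv, winner, loser, step=0.1)
```

Memory use is one float per feature rather than a full population of masks, which is the compactness property the abstract highlights.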

Submitted 19 February, 2024; originally announced February 2024.

Comments: 8 pages, 2 figures

ACM Class: I.2.6

Journal ref: 2023 IEEE International Conference on Systems, Man, and Cybernetics (SMC), pp. 3868-3875. IEEE, 2023

arXiv:2402.12616 [pdf, other]

doi 10.1109/SMC53992.2023.10394067

Multi-objective Binary Coordinate Search for Feature Selection

Authors: Sevil Zanjani Miyandoab, Shahryar Rahnamayan, Azam Asilian Bidgoli

Abstract: A supervised feature selection method selects an appropriate but concise set of features to differentiate classes, which is highly expensive for large-scale datasets. Therefore, feature selection should aim at both minimizing the number of selected features and maximizing the accuracy of classification, or any other task. However, this crucial task is computationally highly demanding on many real-world datasets and requires a very efficient algorithm to reach a set of optimal features with a limited number of fitness evaluations. For this purpose, we have proposed the binary multi-objective coordinate search (MOCS) algorithm to solve large-scale feature selection problems. To the best of our knowledge, the proposed algorithm in this paper is the first multi-objective coordinate search algorithm. In this method, we generate new individuals by flipping a variable of the candidate solutions on the Pareto front. This enables us to investigate the effectiveness of each feature in the corresponding subset. In fact, this strategy can play the role of crossover and mutation operators to generate distinct subsets of features. The reported results indicate the significant superiority of our method over NSGA-II, on five real-world large-scale datasets, particularly when the computing budget is limited. Moreover, this simple hyper-parameter-free algorithm can solve feature selection much faster and more efficiently than NSGA-II.
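
Editor's illustration (not from the paper): the step "flip a variable of the candidate solutions on the Pareto front" can be sketched with two minimized objectives, classification error and number of selected features. The `evaluate` function below is a hypothetical stand-in for training a classifier, and the sweep loop, which flips every bit of every front member, is an assumption; the paper's selection of which variable to flip and its budget handling may differ.

```python
def flip_neighbors(mask):
    """All feature subsets reachable by flipping exactly one bit."""
    return [mask[:i] + (1 - mask[i],) + mask[i + 1:] for i in range(len(mask))]


def dominates(a, b):
    """True if objective vector a is no worse everywhere and better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(scored):
    """Keep only masks whose objectives are not dominated by any other."""
    return {m: f for m, f in scored.items()
            if not any(dominates(g, f) for g in scored.values())}


def evaluate(mask):
    # Hypothetical stand-in for a classifier: only features 0 and 2 reduce error.
    error = 1.0 - 0.4 * mask[0] - 0.4 * mask[2]
    return (error, sum(mask))


front = {(1, 1, 1): evaluate((1, 1, 1))}
for _ in range(3):  # a few sweeps over the current front
    candidates = dict(front)
    for mask in front:
        for neighbor in flip_neighbors(mask):
            candidates[neighbor] = evaluate(neighbor)
    front = pareto_front(candidates)
```

In this toy setting the redundant feature 1 is dropped after the first sweep, because the subset without it has the same error with fewer features.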

Submitted 19 February, 2024; originally announced February 2024.

Comments: 8 pages, 1 figure

ACM Class: I.2.6

Journal ref: 2023 IEEE International Conference on Systems, Man, and Cybernetics (SMC), pp. 4176-4183. IEEE, 2023

arXiv:2308.03936 [pdf, other]

ALFA -- Leveraging All Levels of Feature Abstraction for Enhancing the Generalization of Histopathology Image Classification Across Unseen Hospitals

Authors: Milad Sikaroudi, Maryam Hosseini, Shahryar Rahnamayan, H. R. Tizhoosh

Abstract: We propose an exhaustive methodology that leverages all levels of feature abstraction, targeting an enhancement in the generalizability of image classification to unobserved hospitals. Our approach incorporates augmentation-based self-supervision with common distribution shifts in histopathology scenarios serving as the pretext task. This enables us to derive invariant features from training images without relying on training labels, thereby covering different abstraction levels. Moving onto the subsequent abstraction level, we employ a domain alignment module to facilitate further extraction of invariant features across varying training hospitals. To represent the highly specific features of participating hospitals, an encoder is trained to classify hospital labels, independent of their diagnostic labels. The features from each of these encoders are subsequently disentangled to minimize redundancy and segregate the features. This representation, which spans a broad spectrum of semantic information, enables the development of a model demonstrating increased robustness to unseen images from disparate distributions. Experimental results from the PACS dataset (a domain generalization benchmark), a synthetic dataset created by applying histopathology-specific jitters to the MHIST dataset (defining different domains with varied distribution shifts), and a Renal Cell Carcinoma dataset derived from four image repositories from TCGA, collectively indicate that our proposed model is adept at managing varying levels of image granularity. Thus, it shows improved generalizability when faced with new, out-of-distribution hospital images.

Submitted 9 August, 2023; v1 submitted 7 August, 2023; originally announced August 2023.

Comments: Accepted for publication at ICCV 2023, Computer Vision for Automated Medical Diagnosis Workshop

arXiv:2304.08498 [pdf, other]

Ranking Loss and Sequestering Learning for Reducing Image Search Bias in Histopathology

Authors: Pooria Mazaheri, Azam Asilian Bidgoli, Shahryar Rahnamayan, H. R. Tizhoosh

Abstract: Recently, deep learning has started to play an essential role in healthcare applications, including image search in digital pathology. Despite the recent progress in computer vision, significant issues remain for image searching in histopathology archives. A well-known problem is AI bias and lack of generalization. A more particular shortcoming of deep models is the ignorance toward search functionality. The former affects every model, the latter only search and matching. Due to the lack of ranking-based learning, researchers must train models based on the classification error and then use the resultant embedding for image search purposes. Moreover, deep models appear to be prone to internal bias even if using a large image repository of various hospitals. This paper proposes two novel ideas to improve image search performance. First, we use a ranking loss function to guide feature extraction toward the matching-oriented nature of the search. By forcing the model to learn the ranking of matched outputs, the representation learning is customized toward image search instead of learning a class label. Second, we introduce the concept of sequestering learning to enhance the generalization of feature extraction. By excluding the images of the input hospital from the matched outputs, i.e., sequestering the input domain, the institutional bias is reduced. The proposed ideas are implemented and validated through the largest public dataset of whole slide images. The experiments demonstrate superior results compared to the state of the art.

Submitted 14 April, 2023; originally announced April 2023.

Comments: Under Review for publication

arXiv:2303.00943 [pdf, other]

doi 10.1109/TEVC.2022.3178299

Evolutionary Computation in Action: Feature Selection for Deep Embedding Spaces of Gigapixel Pathology Images

Authors: Azam Asilian Bidgoli, Shahryar Rahnamayan, Taher Dehkharghanian, Abtin Riasatian, H. R. Tizhoosh

Abstract: One of the main obstacles of adopting digital pathology is the challenge of efficient processing of hyperdimensional digitized biopsy samples, called whole slide images (WSIs). Exploiting deep learning and introducing compact WSI representations are urgently needed to accelerate image analysis and facilitate the visualization and interpretability of pathology results in a postpandemic world. In this paper, we introduce a new evolutionary approach for WSI representation based on large-scale multi-objective optimization (LSMOP) of deep embeddings. We start with patch-based sampling to feed KimiaNet, a histopathology-specialized deep network, and to extract a multitude of feature vectors. Coarse multi-objective feature selection uses the reduced search space strategy guided by the classification accuracy and the number of features. In the second stage, the frequent features histogram (FFH), a novel WSI representation, is constructed by multiple runs of coarse LSMOP. Fine evolutionary feature selection is then applied to find a compact (short-length) feature vector based on the FFH and contributes to a more robust deep-learning approach to digital pathology supported by the stochastic power of evolutionary algorithms. We validate the proposed schemes using The Cancer Genome Atlas (TCGA) images in terms of WSI representation, classification accuracy, and feature quality. Furthermore, a novel decision space for multicriteria decision making in the LSMOP field is introduced. Finally, a patch-level visualization approach is proposed to increase the interpretability of deep features. The proposed evolutionary algorithm finds a very compact feature vector to represent a WSI (almost 14,000 times smaller than the original feature vectors) with 8% higher accuracy compared to the codes provided by the state-of-the-art methods.

Submitted 18 April, 2023; v1 submitted 1 March, 2023; originally announced March 2023.

Journal ref: IEEE Transactions on Evolutionary Computation, vol. 27, no. 1, pp. 52-66, Feb. 2023

arXiv:2205.07274 [pdf, ps, other]

Variable Functioning and Its Application to Large Scale Steel Frame Design Optimization

Authors: Amir H Gandomi, Kalyanmoy Deb, Ronald C Averill, Shahryar Rahnamayan, Mohammad Nabi Omidvar

Abstract: To solve complex real-world problems, heuristics and concept-based approaches can be used in order to incorporate information into the problem. In this study, a concept-based approach called variable functioning ($Fx$) is introduced to reduce the optimization variables and narrow down the search space. In this method, the relationships among one or more subsets of variables are defined with functions using information prior to optimization; thus, instead of modifying the variables in the search process, the function variables are optimized. By using problem structure analysis technique and engineering expert knowledge, the $Fx$ method is used to enhance the steel frame design optimization process as a complex real-world problem. The proposed approach is coupled with particle swarm optimization and differential evolution algorithms and used for three case studies. The algorithms are applied to optimize the case studies by considering the relationships among column cross-section areas. The results show that $Fx$ can significantly improve both the convergence rate and the final design of a frame structure, even if it is only used for seeding.

Submitted 15 May, 2022; originally announced May 2022.

arXiv:2204.02404 [pdf, other]

Hospital-Agnostic Image Representation Learning in Digital Pathology

Authors: Milad Sikaroudi, Shahryar Rahnamayan, H. R. Tizhoosh

Abstract: Whole Slide Images (WSIs) in digital pathology are used to diagnose cancer subtypes. The difference in procedures to acquire WSIs at various trial sites gives rise to variability in the histopathology images, thus making consistent diagnosis challenging. These differences may stem from variability in image acquisition through multi-vendor scanners, variable acquisition parameters, and differences in staining procedure; as well, patient demographics may bias the glass slide batches before image acquisition. These variabilities are assumed to cause a domain shift in the images of different hospitals. It is crucial to overcome this domain shift because an ideal machine-learning model must be able to work on the diverse sources of images, independent of the acquisition center. A domain generalization technique is leveraged in this study to improve the generalization capability of a Deep Neural Network (DNN), to an unseen histopathology image set (i.e., from an unseen hospital/trial site) in the presence of domain shift. According to experimental results, the conventional supervised-learning regime generalizes poorly to data collected from different hospitals. However, the proposed hospital-agnostic learning can improve the generalization considering the low-dimensional latent space representation visualization, and classification accuracy results.

Submitted 5 April, 2022; originally announced April 2022.

Comments: Accepted for presentation at the 44th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC'22)

arXiv:2106.06623 [pdf, other]

Pay Attention with Focus: A Novel Learning Scheme for Classification of Whole Slide Images

Authors: Shivam Kalra, Mohammed Adnan, Sobhan Hemati, Taher Dehkharghanian, Shahryar Rahnamayan, Hamid Tizhoosh

Abstract: Deep learning methods such as convolutional neural networks (CNNs) are difficult to directly utilize to analyze whole slide images (WSIs) due to the large image dimensions. We overcome this limitation by proposing a novel two-stage approach. First, we extract a set of representative patches (called mosaic) from a WSI. Each patch of a mosaic is encoded to a feature vector using a deep network. The feature extractor model is fine-tuned using hierarchical target labels of WSIs, i.e., anatomic site and primary diagnosis. In the second stage, a set of encoded patch-level features from a WSI is used to compute the primary diagnosis probability through the proposed Pay Attention with Focus scheme, an attention-weighted averaging of predicted probabilities for all patches of a mosaic modulated by a trainable focal factor. Experimental results show that the proposed model can be robust, and effective for the classification of WSIs.

Submitted 11 June, 2021; originally announced June 2021.

Comments: Accepted in MICCAI, 2021

arXiv:2008.03553 [pdf, other]

Forming Local Intersections of Projections for Classifying and Searching Histopathology Images

Authors: Aditya Sriram, Shivam Kalra, Morteza Babaie, Brady Kieffer, Waddah Al Drobi, Shahryar Rahnamayan, Hany Kashani, Hamid R. Tizhoosh

Abstract: In this paper, we propose a novel image descriptor called Forming Local Intersections of Projections (FLIP) and its multi-resolution version (mFLIP) for representing histopathology images. The descriptor is based on the Radon transform wherein we apply parallel projections in small local neighborhoods of gray-level images. Using equidistant projection directions in each window, we extract unique and invariant characteristics of the neighborhood by taking the intersection of adjacent projections. Thereafter, we construct a histogram for each image, which we call the FLIP histogram. Various resolutions provide different FLIP histograms which are then concatenated to form the mFLIP descriptor. Our experiments included training common networks from scratch and fine-tuning pre-trained networks to benchmark our proposed descriptor. Experiments are conducted on the publicly available datasets KIMIA Path24 and KIMIA Path960. For both of these datasets, FLIP and mFLIP descriptors show promising results in all experiments. Using KIMIA Path24 data, FLIP outperformed non-fine-tuned Inception-v3 and fine-tuned VGG16, and mFLIP outperformed fine-tuned Inception-v3 in feature extraction.

Submitted 8 August, 2020; originally announced August 2020.

Comments: To appear in International Conference on AI in Medicine (AIME 2020)

arXiv:2007.12332 [pdf, other]

Image-Based Benchmarking and Visualization for Large-Scale Global Optimization

Authors: Kyle Robert Harrison, Azam Asilian Bidgoli, Shahryar Rahnamayan, Kalyanmoy Deb

Abstract: In the context of optimization, visualization techniques can be useful for understanding the behaviour of optimization algorithms and can even provide a means to facilitate human interaction with an optimizer. Towards this goal, an image-based visualization framework, without dimension reduction, that visualizes the solutions to large-scale global optimization problems as images is proposed. In the proposed framework, the pixels visualize decision variables while the entire image represents the overall solution quality. This framework affords a number of benefits over existing visualization techniques including enhanced scalability (in terms of the number of decision variables), facilitation of standard image processing techniques, providing nearly infinite benchmark cases, and explicit alignment with human perception. Furthermore, image-based visualization can be used to visualize the optimization process in real-time, thereby allowing the user to ascertain characteristics of the search process as it is progressing. To the best of the authors' knowledge, this is the first realization of a dimension-preserving, scalable visualization framework that embeds the inherent relationship between decision space and objective space. The proposed framework is utilized with 10 different mapping schemes on an image-reconstruction problem that encompass continuous, discrete, binary, combinatorial, constrained, dynamic, and multi-objective optimization. The proposed framework is then demonstrated on arbitrary benchmark problems with known optima. Experimental results elucidate the flexibility and demonstrate how valuable information about the search process can be gathered via the proposed visualization framework.

Submitted 23 July, 2020; originally announced July 2020.

Comments: Preprint submitted to Applied Intelligence. 43 pages, 30 figures

arXiv:2007.00449 [pdf, other]

Multi-objective Optimal Control of Dynamic Integrated Model of Climate and Economy: Evolution in Action

Authors: Mostapha Kalami Heris, Shahryar Rahnamayan

Abstract: One of the widely used models for studying the economics of climate change is the Dynamic Integrated model of Climate and Economy (DICE), which has been developed by Professor William Nordhaus, one of the laureates of the 2018 Nobel Memorial Prize in Economic Sciences. Originally, a single-objective optimal control problem was defined on DICE dynamics, aimed at maximizing social welfare. In this paper, a bi-objective optimal control problem is defined on the DICE model, the objectives of which are maximizing social welfare and minimizing the temperature deviation of the atmosphere. This multi-objective optimal control problem is solved using Non-Dominated Sorting Genetic Algorithm II (NSGA-II), and it is compared to previous works on the single-objective version of the problem. The resulting Pareto front rediscovers the previous results and generalizes to a wide range of non-dominated solutions to minimize the global temperature deviation while optimizing the economic welfare. The previously used single-objective approach is unable to create such a variety of possibilities; hence, its offered solution is limited in vision and reachable performance. Besides this, the resulting Pareto-optimal set reveals the fact that temperature deviation cannot go below a certain lower limit, unless we have significant technology advancement or positive change in global conditions.

Submitted 29 June, 2020; originally announced July 2020.

Comments: 8 pages, 6 figures, conference paper

arXiv:2003.03676

Towards Solving Large-scale Expensive Optimization Problems Efficiently Using Coordinate Descent Algorithm

Authors: Shahryar Rahnamayan, Seyed Jalaleddin Mousavirad

Abstract: Many real-world problems are categorized as large-scale problems, and metaheuristic algorithms are an alternative method for solving them; however, they need to evaluate many candidate solutions before converging, which is not affordable in practical applications since most of these problems are computationally expensive. In other words, these problems are not only large-scale but also computationally expensive, which makes them very difficult to solve. There is no efficient surrogate model to support large-scale expensive global optimization (LSEGO) problems. As a result, algorithms should address LSEGO problems using a limited computational budget to be applicable in real-world applications. The Coordinate Descent (CD) algorithm is an optimization strategy based on decomposing an n-dimensional problem into n one-dimensional problems. To the best of our knowledge, there is no significant study that assesses benchmark functions with various dimensions and landscape properties to investigate the CD algorithm. In this paper, we propose a modified Coordinate Descent algorithm (MCD) to tackle LSEGO problems with a limited computational budget. Our proposed algorithm benefits from two leading steps, namely, finding the region of interest and then shrinking the search space by folding it in half with exponential speed. One of the main advantages of the proposed algorithm is that it is free of any control parameters, which spares it the intricacies of the tuning process. The proposed algorithm is compared with cooperative co-evolution with delta grouping on 20 benchmark functions with dimension 1000. Also, we conducted some experiments on CEC-2017 with D=10, 30, 50, and 100 to investigate the behavior of the MCD algorithm in lower dimensions. The results show that MCD is beneficial not only in large-scale problems but also in low-scale optimization problems.

Submitted 11 September, 2020; v1 submitted 7 March, 2020; originally announced March 2020.

Comments: Accepted in IEEE International Conference On Systems, Man, and Cybernetics, 2020, Toronto, Canada
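The abstract above describes Coordinate Descent as decomposing an n-dimensional problem into n one-dimensional problems. The following is a minimal Python sketch of that generic idea only. It is not the authors' MCD: the region-of-interest and search-space folding steps are not specified in this listing, so a golden-section line search is swapped in as the one-dimensional solver. The names `golden_section` and `coordinate_descent`, the tolerance, and the number of sweeps are all assumptions made for illustration.

```python
import math


def golden_section(g, lo, hi, tol=1e-6):
    """Minimize a one-dimensional function g on [lo, hi] (assumed unimodal)."""
    inv_phi = (math.sqrt(5) - 1) / 2
    while hi - lo > tol:
        c = hi - inv_phi * (hi - lo)  # left interior probe
        d = lo + inv_phi * (hi - lo)  # right interior probe
        if g(c) < g(d):
            hi = d  # the minimum lies in [lo, d]
        else:
            lo = c  # the minimum lies in [c, hi]
    return (lo + hi) / 2


def coordinate_descent(f, x0, lower, upper, sweeps=5):
    """Plain coordinate descent: optimize one variable at a time, others fixed."""
    x = list(x0)
    for _ in range(sweeps):
        for i in range(len(x)):
            def along_axis(t, i=i):
                # Evaluate f with only coordinate i replaced by t.
                return f(x[:i] + [t] + x[i + 1:])
            x[i] = golden_section(along_axis, lower[i], upper[i])
    return x


def shifted_sphere(v):
    """Hypothetical test function with its minimum at (1, 1, ..., 1)."""
    return sum((vi - 1.0) ** 2 for vi in v)


result = coordinate_descent(shifted_sphere, [4.0, -3.0, 0.0], [-5.0] * 3, [5.0] * 3)
```

Each sweep costs n line searches, so the evaluation budget grows linearly with the dimension. On non-separable functions a fixed number of sweeps is not guaranteed to reach the optimum.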

arXiv:1710.01247

Learning Autoencoded Radon Projections

Authors: Aditya Sriram, Shivam Kalra, H. R. Tizhoosh, Shahryar Rahnamayan

Abstract: Autoencoders have been recently used for encoding medical images. In this study, we design and validate a new framework for retrieving medical images by classifying Radon projections, compressed in the deepest layer of an autoencoder. As the autoencoder reduces the dimensionality, a multilayer perceptron (MLP) can be employed to classify the images. The integration of MLP promotes a rather shallow learning architecture which makes the training faster. We conducted a comparative study to examine the capabilities of autoencoders for different inputs such as raw images, Histogram of Oriented Gradients (HOG) and normalized Radon projections. Our framework is benchmarked on the IRMA dataset containing $14,410$ x-ray images distributed across $57$ different classes. Experiments show an IRMA error of $313$ (equivalent to $\approx 82\%$ accuracy) outperforming state-of-the-art works on retrieval from the IRMA dataset using autoencoders.

Submitted 27 September, 2017; originally announced October 2017.

Comments: To appear in proceedings of The IEEE Symposium Series on Computational Intelligence (IEEE SSCI 2017), Honolulu, Hawaii, USA, Nov. 27 -- Dec 1, 2017

arXiv:1709.06909

Opposition based Ensemble Micro Differential Evolution

Authors: Hojjat Salehinejad, Shahryar Rahnamayan, Hamid R. Tizhoosh

Abstract: A differential evolution (DE) algorithm with a small population size is called Micro-DE (MDE). A small population size decreases the computational complexity but also reduces the exploration ability of DE by limiting the population diversity. In this paper, we propose the idea of combining ensemble mutation scheme selection and opposition-based learning concepts to enhance the diversity of the population in MDE at the mutation and selection stages. The proposed algorithm enhances the diversity of the population by generating a random mutation scale factor per individual and per dimension, randomly assigning a mutation scheme to each individual in each generation, and diversifying the selection of individuals using opposition-based learning. This approach is easy to implement and does not require setting the mutation scheme or the mutation scale factor. Experiments are conducted for a variety of objective functions with low and high dimensionality on the CEC Black-Box Optimization Benchmarking 2015 (CEC-BBOB 2015). The results show superior performance of the proposed algorithm compared to the other micro-DE algorithms.

Submitted 20 September, 2017; v1 submitted 7 September, 2017; originally announced September 2017.

Comments: This paper is accepted for presentation at IEEE Symposium Series on Computational Intelligence (IEEE SSCI 2017), Hawaii, USA, 2017

arXiv:1705.07522

Classification and Retrieval of Digital Pathology Scans: A New Dataset

Authors: Morteza Babaie, Shivam Kalra, Aditya Sriram, Christopher Mitcheltree, Shujin Zhu, Amin Khatami, Shahryar Rahnamayan, H. R. Tizhoosh

Abstract: In this paper, we introduce a new dataset, Kimia Path24, for image classification and retrieval in digital pathology. We use the whole scan images of 24 different tissue textures to generate 1,325 test patches of size 1000$\times$1000 (0.5mm$\times$0.5mm). Training data can be generated according to preferences of algorithm designer and can range from approximately 27,000 to over 50,000 patches if the preset parameters are adopted. We propose a compound patch-and-scan accuracy measurement that makes achieving high accuracies quite challenging. In addition, we set the benchmarking line by applying LBP, dictionary approach and convolutional neural nets (CNNs) and report their results. The highest accuracy was 41.80% for CNN.

Submitted 21 May, 2017; originally announced May 2017.

Comments: Accepted for presentation at Workshop for Computer Vision for Microscopy Image Analysis (CVMI 2017) @ CVPR 2017, Honolulu, Hawaii

arXiv:1609.05123

Learning Opposites Using Neural Networks

Authors: Shivam Kalra, Aditya Sriram, Shahryar Rahnamayan, H. R. Tizhoosh

Abstract: Many research works have successfully extended algorithms such as evolutionary algorithms, reinforcement agents and neural networks using "opposition-based learning" (OBL). Two types of the "opposites" have been defined in the literature, namely type-I and type-II. The former are linear in nature and applicable to the variable space, hence easy to calculate. On the other hand, type-II opposites capture the "oppositeness" in the output space. In fact, type-I opposites are considered a special case of type-II opposites where inputs and outputs have a linear relationship. However, in many real-world problems, inputs and outputs do in fact exhibit a nonlinear relationship. Therefore, type-II opposites are expected to be better in capturing the sense of "opposition" in terms of the input-output relation. In the absence of any knowledge about the problem at hand, there seems to be no intuitive way to calculate the type-II opposites. In this paper, we introduce an approach to learn type-II opposites from the given inputs and their outputs using the artificial neural networks (ANNs). We first perform opposition mining on the sample data, and then use the mined data to learn the relationship between input $x$ and its opposite $\breve{x}$. We have validated our algorithm using various benchmark functions to compare it against an evolving fuzzy inference approach that has been recently introduced. The results show the better performance of a neural approach to learn the opposites. This will create new possibilities for integrating oppositional schemes within existing algorithms promising a potential increase in convergence speed and/or accuracy.

Submitted 16 September, 2016; originally announced September 2016.

Comments: To appear in proceedings of the 23rd International Conference on Pattern Recognition (ICPR 2016), Cancun, Mexico, December 2016

arXiv:1605.06820

Automated Resolution Selection for Image Segmentation

Authors: Fares Al-Qunaieer, Hamid R. Tizhoosh, Shahryar Rahnamayan

Abstract: It is well-known in image processing that computational cost increases rapidly with the number and dimensions of the images to be processed. Several fields, such as medical imaging, routinely use numerous very large images, which might also be 3D and/or captured at several frequency bands, all adding to the computational expense. Multiresolution analysis is a method of increasing the efficiency of the segmentation process. One multiresolution approach is the coarse-to-fine segmentation strategy, whereby the segmentation starts at a coarse resolution and is then fine-tuned during subsequent steps. The starting resolution for segmentation is generally selected arbitrarily with no clear selection criteria. The research reported in this paper showed that starting from different resolutions for image segmentation results in different accuracies and computational times, even for images of the same category (depicting similar scenes or objects). An automated method for resolution selection for an input image would thus be beneficial. This paper introduces a framework for the automated selection of the best resolution for image segmentation. We propose a measure for defining the best resolution based on user/system criteria, offering a trade-off between accuracy and computation time. A learning approach is then introduced for the selection of the resolution, whereby extracted image features are mapped to the previously determined best resolution. In the learning process, class (i.e., resolution) distribution is generally imbalanced, making effective learning from the data difficult. Experiments conducted with three datasets using two different segmentation algorithms show that the resolutions selected through learning enable much faster segmentation than the original ones, while retaining at least the original accuracy.

Submitted 22 May, 2016; originally announced May 2016.

arXiv:1604.04673

Evolutionary Projection Selection for Radon Barcodes

Authors: Hamid R. Tizhoosh, Shahryar Rahnamayan

Abstract: Recently, Radon transformation has been used to generate barcodes for tagging medical images. The under-sampled image is projected in certain directions, and each projection is binarized using a local threshold. The concatenation of the thresholded projections creates a barcode that can be used for tagging or annotating medical images. A small number of equidistant projections, e.g., 4 or 8, is generally used to generate short barcodes. However, due to the diverse nature of digital images, and since we are only working with a small number of projections (to keep the barcode short), taking equidistant projections may not be the best course of action. In this paper, we propose to find $n$ optimal projections, where $n<180$, in order to increase the expressiveness of Radon barcodes. We show examples of the exhaustive search for the simple case in which we attempt to find the 4 best projections out of 16 equidistant projections, and compare it with the evolutionary approach in order to establish the benefit of the latter when operating on a small population size, as in the case of micro-DE. We randomly selected 10 different classes from the IRMA dataset (14,400 x-ray images in 58 classes) and further randomly selected 5 images per class for our tests.

Submitted 15 April, 2016; originally announced April 2016.

Comments: To appear in proceedings of The 2016 IEEE Congress on Evolutionary Computation (IEEE CEC 2016), July 24-29, 2016, Vancouver, Canada

arXiv:1512.07980

Diversity Enhancement for Micro-Differential Evolution

Authors: Hojjat Salehinejad, Shahryar Rahnamayan, Hamid R. Tizhoosh

Abstract: The differential evolution (DE) algorithm suffers from high computational time due to the slow nature of evaluation. In contrast, micro-DE (MDE) algorithms employ a very small population size, which can converge faster to a reasonable solution. However, these algorithms are vulnerable to premature convergence as well as to a high risk of stagnation. In this paper, an MDE algorithm with vectorized random mutation factor (MDEVM) is proposed, which exploits the benefit of a small population size while strengthening the exploration ability of the mutation factor by randomizing it at the decision-variable level. The idea is supported by analyzing the mutation factor using Monte-Carlo based simulations. To facilitate the usage of MDE algorithms with very small population sizes, new mutation schemes for population sizes less than four are also proposed. Furthermore, comprehensive comparative simulations and analysis of the performance of the MDE algorithms over various mutation schemes, population sizes, problem types (i.e., uni-modal, multi-modal, and composite), problem dimensionalities, and mutation factor ranges are conducted by considering population diversity analysis for stagnation and trapping in local optimum situations. The studies are conducted on 28 benchmark functions provided for the IEEE CEC-2013 competition. Experimental results demonstrate high performance and convergence speed of the proposed MDEVM algorithm.

Submitted 26 September, 2016; v1 submitted 25 December, 2015; originally announced December 2015.

Comments: Developed version is submitted for review to Applied soft computing
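The abstract above describes randomizing the mutation factor at the decision-variable level. As a rough illustration, the Python sketch below applies that idea to the standard DE/rand/1 mutation, drawing a separate scale factor for every dimension. The choice of DE/rand/1, the function name `vectorized_mutation`, and the factor range [0.1, 1.5] are assumptions made for this sketch. The listing does not give the authors' exact scheme or ranges.

```python
import random


def vectorized_mutation(base, a, b, f_low=0.1, f_high=1.5, rng=random):
    """DE/rand/1-style mutant with one random scale factor per dimension.

    Classic DE uses a single scalar F: base + F * (a - b).
    Here F is redrawn independently for every decision variable.
    The range [f_low, f_high] is an illustrative assumption.
    """
    return [
        x + rng.uniform(f_low, f_high) * (ai - bi)
        for x, ai, bi in zip(base, a, b)
    ]


rng = random.Random(0)  # seeded generator so the run is repeatable
mutant = vectorized_mutation([0.0, 0.0], [1.0, 2.0], [0.0, 0.0], rng=rng)
```

Because each dimension is scaled independently, the mutant is no longer confined to the line through `base` in the direction `a - b`, which is one way a very small population can produce more varied trial vectors.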

arXiv:1504.05619

Learning Opposites with Evolving Rules

Authors: Hamid R. Tizhoosh, Shahryar Rahnamayan

Abstract: The idea of opposition-based learning was introduced 10 years ago. Since then a noteworthy group of researchers has used some notions of oppositeness to improve existing optimization and learning algorithms. Among others, evolutionary algorithms, reinforcement agents, and neural networks have been reportedly extended into their opposition-based versions to become faster and/or more accurate. However, most works still use a simple notion of opposites, namely linear (or type-I) opposition, which for each $x\in[a,b]$ assigns its opposite as $\breve{x}_I=a+b-x$. This, of course, is a very naive estimate of the actual or true (non-linear) opposite $\breve{x}_{II}$, which has been called the type-II opposite in the literature. In the absence of any knowledge about a function $y=f(\mathbf{x})$ that we need to approximate, there seems to be no alternative to the naivety of type-I opposition if one intends to utilize oppositional concepts. But the question is: if we can obtain some increase in accuracy and time savings by using the naive opposite estimate $\breve{x}_I$, as the reports in the literature indicate, what would we be able to gain, in terms of even higher accuracy and further reduction in computational complexity, if we generated and employed true opposites? This work introduces an approach to approximate type-II opposites using evolving fuzzy rules after first performing opposition mining. We show with multiple examples that learning true opposites is possible when we mine the opposites from the training data to subsequently approximate $\breve{x}_{II}=f(\mathbf{x},y)$.

Submitted 21 April, 2015; originally announced April 2015.

Comments: Accepted for publication in The 2015 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE 2015), August 2-5, 2015, Istanbul, Turkey
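The type-I opposite defined in the abstract above, $\breve{x}_I=a+b-x$ for $x\in[a,b]$, can be written directly in code. The Python sketch below is only an illustration of that formula. The function names and the per-dimension vector form are assumptions, and nothing here reproduces the paper's type-II approximation.

```python
def type1_opposite(x, a, b):
    """Type-I (linear) opposite of x on the interval [a, b]: a + b - x."""
    if not a <= x <= b:
        raise ValueError("x must lie in [a, b]")
    return a + b - x


def type1_opposite_vector(xs, lower, upper):
    """Apply the type-I opposite per dimension, each with its own bounds."""
    return [type1_opposite(x, lo, hi) for x, lo, hi in zip(xs, lower, upper)]


opposite = type1_opposite(2.0, 0.0, 10.0)  # mirror of 2 about the midpoint 5
point = type1_opposite_vector([1.0, -2.0], [0.0, -4.0], [4.0, 4.0])
```

The mapping is a reflection about the interval midpoint $(a+b)/2$, so applying it twice returns the original value. It depends only on the bounds and never on the function being optimized, which is what the abstract calls naive.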

Showing 1–23 of 23 results for author: Rahnamayan, S