-
Memory-efficient Energy-adaptive Inference of Pre-Trained Models on Batteryless Embedded Systems
Authors:
Pietro Farina,
Subrata Biswas,
Eren Yıldız,
Khakim Akhunov,
Saad Ahmed,
Bashima Islam,
Kasım Sinan Yıldırım
Abstract:
Batteryless systems frequently face power failures, requiring extra runtime buffers to maintain inference progress and leaving only limited memory space for storing ultra-tiny deep neural networks (DNNs). Besides, making these models responsive to stochastic energy harvesting dynamics during inference requires a balance between inference accuracy, latency, and energy overhead. Recent works on compression mostly focus on time and memory, but often ignore energy dynamics or significantly reduce the accuracy of pre-trained DNNs. Existing energy-adaptive inference works modify the architecture of pre-trained models and have significant memory overhead. Thus, energy-adaptive and accurate inference of pre-trained DNNs on batteryless devices with extreme memory constraints is more challenging than on traditional microcontrollers. We combat these issues by proposing FreeML, a framework to optimize pre-trained DNN models for memory-efficient and energy-adaptive inference on batteryless systems. FreeML comprises (1) a novel compression technique to reduce the model footprint and runtime memory requirements simultaneously, making the models executable on extremely memory-constrained batteryless platforms; and (2) the first early exit mechanism that uses a single exit branch for all exit points to terminate inference at any time, making models energy-adaptive with minimal memory overhead. Our experiments showed that FreeML reduces the model sizes by up to $95 \times$, supports adaptive inference with $2.03$-$19.65 \times$ less memory overhead, and provides significant time and energy benefits with only a negligible accuracy drop compared to the state-of-the-art.
Submitted 16 May, 2024;
originally announced May 2024.
-
Adaptive Online Bayesian Estimation of Frequency Distributions with Local Differential Privacy
Authors:
Soner Aydin,
Sinan Yildirim
Abstract:
We propose a novel Bayesian approach for the adaptive and online estimation of the frequency distribution of a finite number of categories under the local differential privacy (LDP) framework. The proposed algorithm performs Bayesian parameter estimation via posterior sampling and adapts the randomization mechanism for LDP based on the obtained posterior samples. We propose a randomized mechanism for LDP which uses a subset of categories as an input and whose performance depends on the selected subset and the true frequency distribution. By using the posterior sample as an estimate of the frequency distribution, the algorithm performs a computationally tractable subset selection step to maximize the utility of the privatized response of the next user. We propose several utility functions related to well-known information metrics, such as (but not limited to) Fisher information matrix, total variation distance, and information entropy. We compare each of these utility metrics in terms of their computational complexity. We employ stochastic gradient Langevin dynamics for posterior sampling, a computationally efficient approximate Markov chain Monte Carlo method. We provide a theoretical analysis showing that (i) the posterior distribution targeted by the algorithm converges to the true parameter even for approximate posterior sampling, and (ii) the algorithm selects the optimal subset with high probability if posterior sampling is performed exactly. We also provide numerical results that empirically demonstrate the estimation accuracy of our algorithm where we compare it with nonadaptive and semi-adaptive approaches under experimental settings with various combinations of privacy parameters and population distribution parameters.
Submitted 11 May, 2024;
originally announced May 2024.
-
Deep learning-based blind image super-resolution with iterative kernel reconstruction and noise estimation
Authors:
Hasan F. Ates,
Suleyman Yildirim,
Bahadir K. Gunturk
Abstract:
Blind single image super-resolution (SISR) is a challenging task in image processing due to the ill-posed nature of the inverse problem. Complex degradations present in real life images make it difficult to solve this problem using naïve deep learning approaches, where models are often trained on synthetically generated image pairs. Most of the effort so far has been focused on solving the inverse problem under some constraints, such as for a limited space of blur kernels and/or assuming noise-free input images. Yet, there is a gap in the literature to provide a well-generalized deep learning-based solution that performs well on images with unknown and highly complex degradations. In this paper, we propose IKR-Net (Iterative Kernel Reconstruction Network) for blind SISR. In the proposed approach, kernel and noise estimation and high-resolution image reconstruction are carried out iteratively using dedicated deep models. The iterative refinement provides significant improvement in both the reconstructed image and the estimated blur kernel even for noisy inputs. IKR-Net provides a generalized solution that can handle any type of blur and level of noise in the input low-resolution image. IKR-Net achieves state-of-the-art results in blind SISR, especially for noisy images with motion blur.
Submitted 25 April, 2024;
originally announced April 2024.
-
Differential Privacy of Noisy (S)GD under Heavy-Tailed Perturbations
Authors:
Umut Şimşekli,
Mert Gürbüzbalaban,
Sinan Yıldırım,
Lingjiong Zhu
Abstract:
Injecting heavy-tailed noise to the iterates of stochastic gradient descent (SGD) has received increasing attention over the past few years. While various theoretical properties of the resulting algorithm have been analyzed mainly from learning theory and optimization perspectives, their privacy preservation properties have not yet been established. Aiming to bridge this gap, we provide differential privacy (DP) guarantees for noisy SGD, when the injected noise follows an $α$-stable distribution, which includes a spectrum of heavy-tailed distributions (with infinite variance) as well as the Gaussian distribution. Considering the $(ε, δ)$-DP framework, we show that SGD with heavy-tailed perturbations achieves $(0, \tilde{\mathcal{O}}(1/n))$-DP for a broad class of loss functions which can be non-convex, where $n$ is the number of data points. As a remarkable byproduct, contrary to prior work that necessitates bounded sensitivity for the gradients or clipping the iterates, our theory reveals that under mild assumptions, such a projection step is not actually necessary. We illustrate that the heavy-tailed noising mechanism achieves similar DP guarantees compared to the Gaussian case, which suggests that it can be a viable alternative to its light-tailed counterparts.
Submitted 4 March, 2024;
originally announced March 2024.
-
Fine-tuning Transformer-based Encoder for Turkish Language Understanding Tasks
Authors:
Savas Yildirim
Abstract:
Deep learning-based and, more recently, Transformer-based language models have dominated natural language processing research in recent years. Thanks to their accurate and fast fine-tuning characteristics, they have outperformed traditional machine learning-based approaches and achieved state-of-the-art results for many challenging natural language understanding (NLU) problems. Recent studies showed that Transformer-based models such as BERT (Bidirectional Encoder Representations from Transformers) have achieved impressive results on many tasks. Moreover, thanks to their transfer learning capacity, these architectures allow us to transfer pre-built models and fine-tune them to specific NLU tasks such as question answering. In this study, we provide a Transformer-based model and a baseline benchmark for the Turkish language. We successfully fine-tuned a Turkish BERT model, namely BERTurk, which is trained with base settings, on many downstream tasks and evaluated it with the Turkish Benchmark dataset. We showed that our models significantly outperformed other existing baseline approaches for Named-Entity Recognition, Sentiment Analysis, Question Answering and Text Classification in the Turkish language. We publicly released these four fine-tuned models and resources for reproducibility and with a view to supporting other Turkish researchers and applications.
Submitted 30 January, 2024;
originally announced January 2024.
-
The NUS-HLT System for ICASSP2024 ICMC-ASR Grand Challenge
Authors:
Meng Ge,
Yizhou Peng,
Yidi Jiang,
Jingru Lin,
Junyi Ao,
Mehmet Sinan Yildirim,
Shuai Wang,
Haizhou Li,
Mengling Feng
Abstract:
This paper summarizes our team's efforts in both tracks of the ICMC-ASR Challenge for in-car multi-channel automatic speech recognition. Our submitted systems for the ICMC-ASR Challenge include multi-channel front-end enhancement and diarization, training data augmentation, and speech recognition modeling with multi-channel branches. Tested on the official Eval1 and Eval2 sets, our best system achieves a relative 34.3% improvement in CER and a 56.5% improvement in cpCER compared to the official baseline system.
Submitted 26 December, 2023;
originally announced December 2023.
-
USED: Universal Speaker Extraction and Diarization
Authors:
Junyi Ao,
Mehmet Sinan Yıldırım,
Ruijie Tao,
Meng Ge,
Shuai Wang,
Yanmin Qian,
Haizhou Li
Abstract:
Speaker extraction and diarization are two enabling techniques for real-world speech applications. Speaker extraction aims to extract a target speaker's voice from a speech mixture, while speaker diarization demarcates speech segments by speaker, annotating `who spoke when'. Previous studies have typically treated the two tasks independently. In practical applications, it is more meaningful to have knowledge about `who spoke what and when', which is captured by the two tasks. The two tasks share a similar objective of disentangling speakers. Speaker extraction operates in the frequency domain, whereas diarization is in the temporal domain. It is logical to believe that speaker activities obtained from speaker diarization can benefit speaker extraction, while the extracted speech offers more accurate speaker activity detection than the speech mixture. In this paper, we propose a unified model called Universal Speaker Extraction and Diarization (USED) to address output inconsistency and scenario mismatch issues. It is designed to manage speech mixture with varying overlap ratios and variable number of speakers. We show that the USED model significantly outperforms the competitive baselines for speaker extraction and diarization tasks on LibriMix and SparseLibriMix datasets. We further validate the diarization performance on CALLHOME, a dataset based on real recordings, and experimental results indicate that our model surpasses recently proposed approaches.
Submitted 9 May, 2024; v1 submitted 19 September, 2023;
originally announced September 2023.
-
Few-shot learning for sentence pair classification and its applications in software engineering
Authors:
Robert Kraig Helmeczi,
Mucahit Cevik,
Savas Yıldırım
Abstract:
Few-shot learning, the ability to train models with access to limited data, has become increasingly popular in the natural language processing (NLP) domain, as large language models such as GPT and T0 have been empirically shown to achieve high performance in numerous tasks with access to just a handful of labeled examples. Smaller language models such as BERT and its variants have also been shown to achieve strong performance with just a handful of labeled examples when combined with few-shot learning algorithms like pattern-exploiting training (PET) and SetFit. The focus of this work is to investigate the performance of alternative few-shot learning approaches with BERT-based models. Specifically, vanilla fine-tuning, PET and SetFit are compared for numerous BERT-based checkpoints over an array of training set sizes. To facilitate this investigation, applications of few-shot learning are considered in software engineering. For each task, high-performance techniques and their associated model checkpoints are identified through detailed empirical analysis. Our results establish PET as a strong few-shot learning approach, and our analysis shows that with just a few hundred labeled examples it can achieve performance near that of fine-tuning on full-sized data sets.
Submitted 13 June, 2023;
originally announced June 2023.
-
Differentially Private Distributed Bayesian Linear Regression with MCMC
Authors:
Barış Alparslan,
Sinan Yıldırım,
Ş. İlker Birbil
Abstract:
We propose a novel Bayesian inference framework for distributed differentially private linear regression. We consider a distributed setting where multiple parties hold parts of the data and share certain summary statistics of their portions in privacy-preserving noise. We develop a novel generative statistical model for privately shared statistics, which exploits a useful distributional relation between the summary statistics of linear regression. Bayesian estimation of the regression coefficients is conducted mainly using Markov chain Monte Carlo algorithms, while we also provide a fast version to perform Bayesian estimation in one iteration. The proposed methods have computational advantages over their competitors. We provide numerical results on both real and simulated data, which demonstrate that the proposed algorithms provide well-rounded estimation and prediction.
Submitted 7 June, 2023; v1 submitted 31 January, 2023;
originally announced January 2023.
-
Differentially Private Online Bayesian Estimation With Adaptive Truncation
Authors:
Sinan Yıldırım
Abstract:
We propose a novel online and adaptive truncation method for differentially private Bayesian online estimation of a static parameter regarding a population. We assume that sensitive information from individuals is collected sequentially and the inferential aim is to estimate, on-the-fly, a static parameter regarding the population to which those individuals belong. We propose sequential Monte Carlo to perform online Bayesian estimation. When individuals provide sensitive information in response to a query, it is necessary to perturb it with privacy-preserving noise to ensure the privacy of those individuals. The amount of perturbation is proportional to the sensitivity of the query, which is determined usually by the range of the queried information. The truncation technique we propose adapts to the previously collected observations to adjust the query range for the next individual. The idea is that, based on previous observations, we can carefully arrange the interval into which the next individual's information is to be truncated before being perturbed with privacy-preserving noise. In this way, we aim to design predictive queries with small sensitivity, hence small privacy-preserving noise, enabling more accurate estimation while maintaining the same level of privacy. To decide on the location and the width of the interval, we use an exploration-exploitation approach a la Thompson sampling with an objective function based on the Fisher information of the generated observation. We show the merits of our methodology with numerical examples.
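The relation between the truncation interval and the noise level described in this abstract can be illustrated with a minimal sketch. It assumes the standard Laplace mechanism, which the abstract does not name, and the function names (`truncate`, `laplace_noise`, `private_response`) are hypothetical; the adaptive choice of the interval via Thompson sampling and Fisher information is not shown.

```python
import random

def truncate(x, lo, hi):
    """Clamp x into the query interval [lo, hi]."""
    return max(lo, min(hi, x))

def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two
    i.i.d. exponential variables with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_response(x, lo, hi, epsilon, rng):
    """Truncate x to [lo, hi], then perturb it (assumed Laplace mechanism).

    The sensitivity of the truncated value is the interval width,
    so a narrower interval needs less noise for the same epsilon.
    Returns the noisy response and the noise scale that was used.
    """
    sensitivity = hi - lo
    scale = sensitivity / epsilon
    return truncate(x, lo, hi) + laplace_noise(scale, rng), scale

rng = random.Random(0)
_, wide_scale = private_response(5.2, 0.0, 10.0, epsilon=1.0, rng=rng)
_, narrow_scale = private_response(5.2, 4.0, 6.0, epsilon=1.0, rng=rng)
print(wide_scale, narrow_scale)
```

With the same privacy parameter, the narrower interval yields a smaller noise scale, which is the effect the adaptive truncation aims to exploit.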
Submitted 19 January, 2023;
originally announced January 2023.
-
Transfer learning for conflict and duplicate detection in software requirement pairs
Authors:
Garima Malik,
Savas Yildirim,
Mucahit Cevik,
Ayse Bener,
Devang Parikh
Abstract:
Consistent and holistic expression of software requirements is important for the success of software projects. In this study, we aim to enhance the efficiency of the software development processes by automatically identifying conflicting and duplicate software requirement specifications. We formulate the conflict and duplicate detection problem as a requirement pair classification task. We design a novel transformer-based architecture, SR-BERT, which incorporates Sentence-BERT and Bi-encoders for the conflict and duplicate identification task. Furthermore, we apply supervised multi-stage fine-tuning to the pre-trained transformer models. We test the performance of different transfer models using four different datasets. We find that sequentially trained and fine-tuned transformer models perform well across the datasets, with SR-BERT achieving the best performance for larger datasets. We also explore the cross-domain performance of conflict detection models and adopt a rule-based filtering approach to validate the model classifications. Our analysis indicates that the sentence pair classification approach and the proposed transformer-based natural language processing strategies can contribute significantly to achieving automation in conflict and duplicate detection.
Submitted 9 January, 2023;
originally announced January 2023.
-
Adaptive Fine-tuning for Multiclass Classification over Software Requirement Data
Authors:
Savas Yildirim,
Mucahit Cevik,
Devang Parikh,
Ayse Basar
Abstract:
The analysis of software requirement specifications (SRS) using Natural Language Processing (NLP) methods has been an important study area in the software engineering field in recent years. Thanks especially to the advances brought by deep learning and transfer learning approaches in NLP, SRS data can be utilized for various learning tasks more easily. In this study, we employ a three-stage domain-adaptive fine-tuning approach for three prediction tasks regarding software requirements, which improves the model robustness on a real distribution shift. The multi-class classification tasks involve predicting the type, priority and severity of the requirement texts specified by the users. We compare our results with strong classification baselines such as word embedding pooling and Sentence BERT, and show that the adaptive fine-tuning leads to performance improvements across the tasks. We find that an adaptively fine-tuned model can be specialized to a particular data distribution, generating accurate results and learning from abundantly available textual data in software engineering task management systems.
Submitted 1 January, 2023;
originally announced January 2023.
-
A Prompt-based Few-shot Learning Approach to Software Conflict Detection
Authors:
Robert K. Helmeczi,
Mucahit Cevik,
Savas Yıldırım
Abstract:
A software requirement specification (SRS) document is an essential part of the software development life cycle which outlines the requirements that a software program in development must satisfy. This document is often specified by a diverse group of stakeholders and is subject to continual change, making the process of maintaining the document and detecting conflicts between requirements an essential task in software development. Notably, projects that do not address conflicts in the SRS document early on face considerable problems later in the development life cycle. These problems incur substantial costs in terms of time and money, and these costs often become insurmountable barriers that ultimately result in the termination of a software project altogether. As a result, early detection of SRS conflicts is critical to project sustainability. The conflict detection task is approached in numerous ways, many of which require a significant amount of manual intervention from developers, or require access to a large amount of labeled, task-specific training data. In this work, we propose using a prompt-based learning approach to perform few-shot learning for conflict detection. We compare our results to supervised learning approaches that use pretrained language models, such as BERT and its variants. Our results show that prompting with just 32 labeled examples can achieve a similar level of performance in many key metrics to that of supervised learning on training sets that are orders of magnitude larger. In contrast to many other conflict detection approaches, we make no assumptions about the type of underlying requirements, allowing us to analyze pairings of both functional and non-functional requirements. This allows us to omit the potentially expensive task of filtering out non-functional requirements from our dataset.
Submitted 4 November, 2022;
originally announced November 2022.
-
Harfang3D Dog-Fight Sandbox: A Reinforcement Learning Research Platform for the Customized Control Tasks of Fighter Aircrafts
Authors:
Muhammed Murat Özbek,
Süleyman Yıldırım,
Muhammet Aksoy,
Eric Kernin,
Emre Koyuncu
Abstract:
The advent of deep learning (DL) gave rise to significant breakthroughs in Reinforcement Learning (RL) research. Deep Reinforcement Learning (DRL) algorithms have reached super-human level skills when applied to vision-based control problems such as Atari 2600 games, where environment states were extracted from pixel information. Unfortunately, these environments are far from being applicable to highly dynamic and complex real-world tasks, such as autonomous control of a fighter aircraft, since they only involve a 2D representation of a visual world. Here, we present a semi-realistic flight simulation environment, Harfang3D Dog-Fight Sandbox, for fighter aircraft. It is intended to be a flexible toolbox for investigating the main challenges in aviation studies using Reinforcement Learning. The program provides easy access to the flight dynamics model, environment states, and aerodynamics of the plane, enabling the user to customize any specific task in order to build intelligent decision-making (control) systems via RL. The software also allows deployment of bot aircraft and development of multi-agent tasks. This way, multiple groups of aircraft can be configured as competitive or cooperative agents to perform complicated tasks, including Dog Fight. During the experiments, we carried out training for two different scenarios: navigating to a designated location and within visual range (WVR) combat, in short Dog Fight. Using Deep Reinforcement Learning techniques for both scenarios, we were able to train competent agents that exhibit human-like behaviours. Based on these results, it is confirmed that Harfang3D Dog-Fight Sandbox can be utilized as a 3D realistic RL research platform.
Submitted 13 October, 2022;
originally announced October 2022.
-
Token Classification for Disambiguating Medical Abbreviations
Authors:
Mucahit Cevik,
Sanaz Mohammad Jafari,
Mitchell Myers,
Savas Yildirim
Abstract:
Abbreviations are unavoidable yet critical parts of the medical text. Using abbreviations, especially in clinical patient notes, can save time and space, protect sensitive information, and help avoid repetitions. However, most abbreviations might have multiple senses, and the lack of a standardized mapping system makes disambiguating abbreviations a difficult and time-consuming task. The main objective of this study is to examine the feasibility of token classification methods for medical abbreviation disambiguation. Specifically, we explore the capability of token classification methods to deal with multiple unique abbreviations in a single text. We use two public datasets to compare and contrast the performance of several transformer models pre-trained on different scientific and medical corpora. Our proposed token classification approach outperforms the more commonly used text classification models for the abbreviation disambiguation task. In particular, the SciBERT model shows a strong performance for both token and text classification tasks over the two considered datasets. Furthermore, we find that abbreviation disambiguation performance for the text classification models becomes comparable to that of token classification only when postprocessing is applied to their predictions, which involves filtering possible labels for an abbreviation based on the training data.
Submitted 5 October, 2022;
originally announced October 2022.
-
Energy Optimization of Wind Turbines via a Neural Control Policy Based on Reinforcement Learning Markov Chain Monte Carlo Algorithm
Authors:
Vahid Tavakol Aghaei,
Arda Ağababaoğlu,
Biram Bawo,
Peiman Naseradinmousavi,
Sinan Yıldırım,
Serhat Yeşilyurt,
Ahmet Onat
Abstract:
This study focuses on the numerical analysis and optimal control of vertical-axis wind turbines (VAWT) using Bayesian reinforcement learning (RL). We specifically address small-scale wind turbines, which are well-suited to local and compact production of electrical energy on a small scale, such as urban and rural infrastructure installations. Existing literature concentrates on large-scale wind turbines which run in unobstructed, mostly constant wind profiles. However, urban installations generally must cope with rapidly changing wind patterns. To bridge this gap, we formulate and implement an RL strategy using the Markov chain Monte Carlo (MCMC) algorithm to optimize the long-term energy output of a wind turbine. Our MCMC-based RL algorithm is a model-free and gradient-free algorithm, in which the designer does not have to know the precise dynamics of the plant and its uncertainties. Our method addresses the uncertainties by using a multiplicative reward structure, in contrast with the additive reward used in conventional RL approaches. We have shown numerically that the method specifically overcomes the shortcomings typically associated with conventional solutions, including, but not limited to, component aging, modeling errors, and inaccuracies in the estimation of wind speed patterns. Our results show that the proposed method is especially successful in capturing power from wind transients by modulating the generator load, and hence the rotor torque load, so that the rotor tip speed quickly reaches the optimum value for the anticipated wind speed. This ratio of rotor tip speed to wind speed is known to be critical in wind power applications. The wind-to-load energy efficiency of the proposed method was shown to be superior to two other methods: the classical maximum power point tracking method and a generator controlled by the deep deterministic policy gradient (DDPG) method.
Submitted 12 March, 2023; v1 submitted 7 September, 2022;
originally announced September 2022.
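For reference, the ratio of rotor tip speed to wind speed mentioned in the abstract above is conventionally written as follows. The symbols $λ$, $ω$, $R$, $v$, $ρ$, $A$ and $C_p$ are standard textbook notation assumed here; they do not appear in the abstract, and the abstract does not state which power model the authors use.

$$λ = \frac{ω R}{v}, \qquad P = \tfrac{1}{2}\, ρ\, A\, v^{3}\, C_p(λ)$$

Here $ω$ is the rotor angular speed, $R$ the rotor radius, $v$ the wind speed, $ρ$ the air density, $A$ the swept area, and $C_p(λ)$ the power coefficient. Under this standard model, the captured power $P$ is largest when $λ$ sits at the maximiser of $C_p$, which is why a controller would steer the rotor speed toward the optimum for the anticipated wind speed.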
-
Reliable Transiently-Powered Communication
Authors:
Alessandro Torrisi,
Kasım Sinan Yıldırım,
Davide Brunelli
Abstract:
Frequent power failures can introduce significant packet losses during communication among energy harvesting batteryless wireless sensors. Nodes should be aware of the energy level of their neighbors to guarantee the success of communication and avoid wasting energy. This paper presents TRAP (TRAnsiently-powered Protocol), which allows nodes to communicate only if the energy availability on both sides of the communication channel is sufficient before packet transmission. TRAP relies on a novel modulator circuit, which operates without microcontroller intervention and transmits the energy status almost for free over the radio-frequency backscatter channel. Our experimental results showed that TRAP avoids failed transmissions caused by power failures and ensures reliable intermittent communication among batteryless sensors.
Submitted 16 March, 2022;
originally announced April 2022.
-
Statistic Selection and MCMC for Differentially Private Bayesian Estimation
Authors:
Baris Alparslan,
Sinan Yildirim
Abstract:
This paper concerns differentially private Bayesian estimation of the parameters of a population distribution, when a statistic of a sample from that population is shared in noise to provide differential privacy.
This work mainly addresses two problems: (1) What statistic of the sample should be shared privately? For the first question, i.e., the one about statistic selection, we promote using the Fisher information. We find that the statistic that is most informative in a non-privacy setting may not be the optimal choice under the privacy restrictions. We provide several examples to support that point. We consider several types of data sharing settings and propose several Monte Carlo-based numerical estimation methods for calculating the Fisher information for those settings. The second question concerns inference: (2) Based on the shared statistics, how could we perform effective Bayesian inference? We propose several Markov chain Monte Carlo (MCMC) algorithms for sampling from the posterior distribution of the parameter given the noisy statistic. The proposed MCMC algorithms can be preferred over one another depending on the problem. For example, when the shared statistic is additive and Gaussian noise is added to it, a simple Metropolis-Hastings algorithm that utilizes the central limit theorem is a decent choice. We propose more advanced MCMC algorithms for several other cases of practical relevance.
Our numerical examples involve comparing several candidate statistics to be shared privately. For each statistic, we perform Bayesian estimation based on the posterior distribution conditional on the privatized version of that statistic. We demonstrate that the relative performance of a statistic, in terms of the mean squared error of the Bayesian estimator based on the corresponding privatized statistic, is adequately predicted by the Fisher information of the privatized statistic.
Submitted 28 March, 2022; v1 submitted 24 March, 2022;
originally announced March 2022.
-
NORM: An FPGA-based Non-volatile Memory Emulation Framework for Intermittent Computing
Authors:
Simone Ruffini,
Luca Caronti,
Kasım Sinan Yıldırım,
Davide Brunelli
Abstract:
Intermittent computing systems operate by relying only on harvested energy accumulated in their tiny energy reservoirs, typically capacitors. An intermittent device dies due to a power failure when there is no energy in its capacitor and boots again when the harvested energy is sufficient to power its hardware components. Power failures prevent the forward progress of computation due to the frequent loss of computational state. To remedy this problem, intermittent computing systems comprise built-in fast non-volatile memories with high write endurance to store information that persists despite frequent power failures. However, the lack of design tools makes fast-prototyping these systems difficult. Even though FPGAs are common platforms for fast prototyping and behavioral verification of continuously-powered architectures, they do not target prototyping intermittent computing systems. This article introduces a new FPGA-based framework, named NORM (Non-volatile memORy eMulator), to emulate and verify the behavior of any intermittent computing system that exploits fast non-volatile memories. Our evaluation showed that NORM can be used to emulate and validate FeRAM-based transiently-powered hardware architectures successfully.
Submitted 16 February, 2022;
originally announced February 2022.
-
ETAP: Energy-aware Timing Analysis of Intermittent Programs
Authors:
Ferhat Erata,
Arda Goknil,
Eren Yıldız,
Kasım Sinan Yıldırım,
Ruzica Piskac,
Jakub Szefer,
Gökçin Sezgin
Abstract:
Energy harvesting battery-free embedded devices rely only on ambient energy harvesting that enables stand-alone and sustainable IoT applications. These devices execute programs when the harvested ambient energy in their energy reservoir is sufficient to operate and stop execution abruptly (and start charging) otherwise. These intermittent programs have varying timing behavior under different energy conditions, hardware configurations, and program structures. This paper presents Energy-aware Timing Analysis of intermittent Programs (ETAP), a probabilistic symbolic execution approach that analyzes the timing and energy behavior of intermittent programs at compile time. ETAP symbolically executes the given program while taking time and energy cost models for ambient energy and dynamic energy consumption into account. We evaluated ETAP on several intermittent programs and compared the compile-time analysis results with executions on real hardware. The results show that ETAP's normalized prediction accuracy is 99.5%, and it speeds up the timing analysis by at least two orders of magnitude compared to manual testing.
Submitted 3 February, 2022; v1 submitted 27 January, 2022;
originally announced January 2022.
-
Learning with Subset Stacking
Authors:
S. İlker Birbil,
Sinan Yildirim,
Kaya Gökalp,
M. Hakan Akyüz
Abstract:
We propose a new regression algorithm that learns from a set of input-output pairs. Our algorithm is designed for populations where the relation between the input variables and the output variable exhibits a heterogeneous behavior across the predictor space. The algorithm starts with generating subsets that are concentrated around random points in the input space. This is followed by training a local predictor for each subset. Those predictors are then combined in a novel way to yield an overall predictor. We call this algorithm ``LEarning with Subset Stacking'' or LESS, due to its resemblance to the method of stacking regressors. We compare the testing performance of LESS with state-of-the-art methods on several datasets. Our comparison shows that LESS is a competitive supervised learning method. Moreover, we observe that LESS is also efficient in terms of computation time and it allows a straightforward parallel implementation.
Submitted 30 October, 2023; v1 submitted 12 December, 2021;
originally announced December 2021.
-
Virtualizing Intermittent Computing
Authors:
Caglar Durmaz,
Kasim Sinan Yildirim,
Geylani Kardas
Abstract:
Intermittent computing requires custom programming models to ensure the correct execution of applications despite power failures. However, existing programming models lead to programs that are hardware-dependent and not reusable. This paper aims at virtualizing intermittent computing to remedy these problems. We introduce PureVM, a virtual machine that abstracts a transiently powered computer, and PureLANG, a continuation-passing-style programming language to develop programs that run on PureVM. This virtualization, for the first time, paves the way for portable and reusable transiently-powered applications.
Submitted 28 November, 2021;
originally announced November 2021.
-
Differentially Private Linear Optimization for Multi-Party Resource Sharing
Authors:
Utku Karaca,
Nursen Aydin,
Sinan Yildirim,
S. Ilker Birbil
Abstract:
This study examines a resource-sharing problem involving multiple parties that agree to use a set of capacities together. We start with modeling the whole problem as a mathematical program, where all parties are required to exchange information to obtain the optimal objective function value. This information bears private data from each party in terms of coefficients used in the mathematical program. Moreover, the parties also consider the individual optimal solutions as private. In this setting, the concern for the parties is the privacy of their data and their optimal allocations. We propose a two-step approach to meet the privacy requirements of the parties. In the first step, we obtain a reformulated model that is amenable to a decomposition scheme. Although this scheme eliminates almost all data exchanges, it does not provide a formal privacy guarantee. In the second step, we provide this guarantee with a locally differentially private algorithm, which does not need a trusted aggregator, at the expense of deviating slightly from the optimality. We provide bounds on this deviation and discuss the consequences of these theoretical results. We also propose a novel modification to increase the efficiency of the algorithm in terms of reducing the theoretical optimality gap. The study ends with a numerical experiment on a planning problem that demonstrates an application of the proposed approach. As we work with a general linear optimization model, our analysis and discussion can be used in different application areas including production planning, logistics, and revenue management.
Submitted 4 January, 2024; v1 submitted 20 October, 2021;
originally announced October 2021.
-
Metropolis-Hastings with Averaged Acceptance Ratios
Authors:
Christophe Andrieu,
Sinan Yıldırım,
Arnaud Doucet,
Nicolas Chopin
Abstract:
Markov chain Monte Carlo (MCMC) methods to sample from a probability distribution $π$ defined on a space $(Θ,\mathcal{T})$ consist of the simulation of realisations of Markov chains $\{θ_{n},n\geq1\}$ of invariant distribution $π$ and such that the distribution of $θ_{i}$ converges to $π$ as $i\rightarrow\infty$. In practice one is typically interested in the computation of expectations of functions, say $f$, with respect to $π$ and it is also required that averages $M^{-1}\sum_{n=1}^{M}f(θ_{n})$ converge to the expectation of interest. The iterative nature of MCMC makes it difficult to develop generic methods to take advantage of parallel computing environments when interested in reducing time to convergence. While numerous approaches have been proposed to reduce the variance of ergodic averages, including averaging over independent realisations of $\{θ_{n},n\geq1\}$ simulated on several computers, techniques to reduce the "burn-in" of MCMC are scarce. In this paper we explore a simple and generic approach to improve convergence to equilibrium of existing algorithms which rely on the Metropolis-Hastings (MH) update, the main building block of MCMC. The main idea is to use averages of the acceptance ratio w.r.t. multiple realisations of random variables involved, while preserving $π$ as invariant distribution. The methodology requires limited change to existing code, is naturally suited to parallel computing and is shown on our examples to provide substantial performance improvements both in terms of convergence to equilibrium and variance of ergodic averages. In some scenarios gains are observed even on a serial machine.
Submitted 29 December, 2020;
originally announced January 2021.
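As background for the abstract above, the plain Metropolis-Hastings update and the ergodic average $M^{-1}\sum_{n=1}^{M}f(θ_{n})$ can be sketched as follows. This is a minimal illustration of the standard random-walk MH baseline only, not the averaged-acceptance-ratio method proposed in the paper. The standard normal target, the Gaussian proposal, and all function names are assumptions chosen for the example.

```python
import math
import random


def log_target(theta):
    # Unnormalised log-density of a standard normal.
    # Illustrative stand-in for pi; not taken from the paper.
    return -0.5 * theta * theta


def metropolis_hastings(n_steps, step_size=1.0, theta0=0.0, seed=0):
    """Random-walk Metropolis-Hastings chain of length n_steps."""
    rng = random.Random(seed)
    theta = theta0
    chain = []
    for _ in range(n_steps):
        # Symmetric Gaussian proposal, so the proposal densities cancel
        # in the acceptance ratio.
        proposal = theta + rng.gauss(0.0, step_size)
        log_ratio = log_target(proposal) - log_target(theta)
        # Accept with probability min(1, ratio); otherwise stay put.
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            theta = proposal
        chain.append(theta)
    return chain


def ergodic_average(chain, f):
    """Average of f over the chain, estimating the expectation of f under pi."""
    return sum(f(t) for t in chain) / len(chain)
```

With a long enough chain, `ergodic_average(chain, lambda t: t)` should be close to 0 and `ergodic_average(chain, lambda t: t * t)` close to 1 for this target. The early iterations before the chain settles are the "burn-in" that the abstract refers to.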
-
PS-DeVCEM: Pathology-sensitive deep learning model for video capsule endoscopy based on weakly labeled data
Authors:
A. Mohammed,
I. Farup,
M. Pedersen,
S. Yildirim,
Ø Hovde
Abstract:
We propose a novel pathology-sensitive deep learning model (PS-DeVCEM) for frame-level anomaly detection and multi-label classification of different colon diseases in video capsule endoscopy (VCE) data. Our proposed model is capable of coping with the key challenge of colon apparent heterogeneity caused by several types of diseases. Our model is driven by attention-based deep multiple instance learning and is trained end-to-end on weakly labeled data using video labels instead of detailed frame-by-frame annotation. The spatial and temporal features are obtained through ResNet50 and residual long short-term memory (residual LSTM) blocks, respectively. Additionally, the learned temporal attention module provides the importance of each frame to the final label prediction. Moreover, we developed a self-supervision method to maximize the distance between classes of pathologies. We demonstrate through qualitative and quantitative experiments that our proposed weakly supervised learning model gives superior precision and F1-score, reaching 61.6% and 55.1%, respectively, compared to three state-of-the-art video analysis methods. We also show our model's ability to temporally localize frames with pathologies, without frame annotation information during training. Furthermore, we collected and annotated the first and largest VCE dataset with only video labels. The dataset contains 455 short video segments with 28,304 frames and 14 classes of colorectal diseases and artifacts. Dataset and code supporting this publication will be made available on our home page.
Submitted 22 November, 2020;
originally announced November 2020.
-
Differentially Private Accelerated Optimization Algorithms
Authors:
Nurdan Kuru,
Ş. İlker Birbil,
Mert Gurbuzbalaban,
Sinan Yildirim
Abstract:
We present two classes of differentially private optimization algorithms derived from the well-known accelerated first-order methods. The first algorithm is inspired by Polyak's heavy ball method and employs a smoothing approach to decrease the accumulated noise on the gradient steps required for differential privacy. The second class of algorithms is based on Nesterov's accelerated gradient method and its recent multi-stage variant. We propose a noise dividing mechanism for the iterations of Nesterov's method in order to improve the error behavior of the algorithm. Convergence rate analyses are provided for both the heavy ball method and Nesterov's accelerated gradient method with the help of dynamical system analysis techniques. Finally, we conclude with numerical experiments showing that the presented algorithms have advantages over well-known differentially private algorithms.
Submitted 5 August, 2020;
originally announced August 2020.
-
Process Knowledge Driven Change Point Detection for Automated Calibration of Discrete Event Simulation Models Using Machine Learning
Authors:
Suleyman Yildirim,
Alper Ekrem Murat,
Murat Yildirim,
Suzan Arslanturk
Abstract:
Initial development and subsequent calibration of discrete event simulation models for complex systems require accurate identification of dynamically changing process characteristics. Existing data-driven change point detection methods (DD-CPD) assume changes are extraneous to the system and thus cannot utilize available process knowledge. This work proposes a unified framework for process-driven multi-variate change point detection (PD-CPD) by combining change point detection models with machine learning and process-driven simulation modeling. The PD-CPD, after initializing with DD-CPD's change point(s), uses simulation models to generate system-level outputs as time-series data streams, which are then used to train neural network models to predict system characteristics and change points. The accuracy of the predictive models measures the likelihood that the actual process data conforms to the simulated change points in system characteristics. PD-CPD iteratively optimizes change points by repeating the simulation and predictive model building steps until the set of change point(s) with the maximum likelihood is identified. Using an emergency department case study, we show that PD-CPD significantly improves change point detection accuracy over DD-CPD estimates and is able to detect actual change points.
Submitted 21 September, 2020; v1 submitted 11 May, 2020;
originally announced May 2020.
-
Bayesian Allocation Model: Inference by Sequential Monte Carlo for Nonnegative Tensor Factorizations and Topic Models using Polya Urns
Authors:
Ali Taylan Cemgil,
Mehmet Burak Kurutmaz,
Sinan Yildirim,
Melih Barsbey,
Umut Simsekli
Abstract:
We introduce a dynamic generative model, the Bayesian allocation model (BAM), which establishes explicit connections between nonnegative tensor factorization (NTF), graphical models of discrete probability distributions and their Bayesian extensions, and topic models such as latent Dirichlet allocation. BAM is based on a Poisson process whose events are marked by using a Bayesian network, where the conditional probability tables of this network are then integrated out analytically. We show that the resulting marginal process turns out to be a Polya urn, an integer-valued self-reinforcing process. This urn process, which we name a Polya-Bayes process, obeys certain conditional independence properties that provide further insight about the nature of NTF. These insights also let us develop space-efficient simulation algorithms that respect the potential sparsity of data: we propose a class of sequential importance sampling algorithms for computing NTF and approximating their marginal likelihood, which would be useful for model selection. The resulting methods can also be viewed as a model scoring method for topic models and discrete Bayesian networks with hidden variables. The new algorithms have favourable properties in the sparse data regime when contrasted with variational algorithms that become more accurate when the total sum of the elements of the observed tensor goes to infinity. We illustrate the performance on several examples and numerically study the behaviour of the algorithms for various data regimes.
Submitted 11 March, 2019;
originally announced March 2019.
-
Multi-hop Backscatter Tag-to-Tag Networks
Authors:
Amjad Yousef Majid,
Michel Jansen,
Guillermo Ortas Delgado,
Kasım Sinan Yıldırım,
Przemysław Pawełczak
Abstract:
We characterize the performance of a backscatter tag-to-tag (T2T) multi-hop network. For this, we developed a discrete component-based backscatter T2T transceiver and a communication protocol suite. The protocol suite is composed of a novel (i) flooding-based link control tailored towards backscatter transmission, and (ii) low-power listening MAC. The MAC design is based on the new insight that backscatter reception is more energy-costly than transmission. Our experiments show that multi-hopping extends the coverage of backscatter networks by enabling longer backward T2T links (a tag far from the exciter sending to a tag close to the exciter). Four hops, for example, extend the communication range by a factor of two. Furthermore, we show that dead spots in multi-hop T2T networks are far less significant than those in single-hop T2T networks.
Submitted 29 January, 2019;
originally announced January 2019.
-
Scalable Monte Carlo inference for state-space models
Authors:
Sinan Yıldırım,
Christophe Andrieu,
Arnaud Doucet
Abstract:
We present an original simulation-based method to estimate likelihood ratios efficiently for general state-space models. Our method relies on a novel use of the conditional Sequential Monte Carlo (cSMC) algorithm introduced in Andrieu et al. (2010) and presents several practical advantages over standard approaches. The ratio is estimated using a unique source of randomness instead of estimating separately the two likelihood terms involved. Beyond the benefits in terms of variance reduction one may expect in general from this type of approach, an important point here is that the variance of this estimator decreases as the distance between the likelihood parameters decreases. We show how this can be exploited in the context of Markov chain Monte Carlo (MCMC) algorithms, leading to the development of a new class of exact-approximate MCMC methods to perform Bayesian static parameter inference in state-space models. We show through simulations that, in contrast to the Particle Marginal Metropolis-Hastings (PMMH) algorithm of Andrieu et al. (2010), the computational effort required by this novel MCMC scheme scales very favourably for large data sets.
Submitted 7 September, 2018;
originally announced September 2018.
-
Image Segmentation with Pseudo-marginal MCMC Sampling and Nonparametric Shape Priors
Authors:
Ertunc Erdil,
Sinan Yildirim,
Tolga Tasdizen,
Mujdat Cetin
Abstract:
In this paper, we propose an efficient pseudo-marginal Markov chain Monte Carlo (MCMC) sampling approach to draw samples from posterior shape distributions for image segmentation. The computation time of the proposed approach is independent of the size of the training set used to learn the shape prior distribution nonparametrically. Therefore, it scales well for very large data sets. Our approach is able to characterize the posterior probability density in the space of shapes through its samples, and to return multiple solutions, potentially from different modes of a multimodal probability density, which would be encountered, e.g., in segmenting objects from multiple shape classes. Experimental results demonstrate the potential of the proposed approach.
Submitted 3 September, 2018;
originally announced September 2018.
-
Ensemble of Convolutional Neural Networks for Dermoscopic Images Classification
Authors:
Tomáš Majtner,
Buda Bajić,
Sule Yildirim,
Jon Yngve Hardeberg,
Joakim Lindblad,
Nataša Sladoje
Abstract:
In this report, we present our automated prediction system for disease classification within dermoscopic images. The proposed solution is based on deep learning, where we employed a transfer learning strategy on the VGG16 and GoogLeNet architectures. The key feature of our solution is preprocessing based primarily on image augmentation and colour normalization. The solution was evaluated on Task 3: Lesion Diagnosis of the ISIC 2018: Skin Lesion Analysis Towards Melanoma Detection.
Submitted 15 August, 2018;
originally announced August 2018.
-
Y-Net: A deep Convolutional Neural Network for Polyp Detection
Authors:
Ahmed Mohammed,
Sule Yildirim,
Ivar Farup,
Marius Pedersen,
Øistein Hovde
Abstract:
Colorectal polyps are important precursors to colon cancer, the third most common cause of cancer mortality for both men and women. It is a disease where early detection is of crucial importance. Colonoscopy is commonly used for early detection of cancer and precancerous pathology. It is a demanding procedure requiring a significant amount of time from specialized physicians and nurses, and it suffers from a significant polyp miss rate by specialists. Automated polyp detection in colonoscopy videos has been demonstrated to be a promising way to handle this problem. However, polyp detection is a challenging problem due to the limited amount of available training data and the large appearance variations of polyps. To handle this problem, we propose a novel deep learning method, Y-Net, that consists of two encoder networks with a decoder network. Our proposed Y-Net method relies on efficient use of pre-trained and un-trained models with novel sum-skip-concatenation operations. Each encoder is trained with an encoder-specific learning rate along with the decoder. Compared with previous methods employing hand-crafted features or 2-D/3-D convolutional neural networks, our approach outperforms state-of-the-art methods for polyp detection with a 7.3% F1-score and 13% recall improvement.
Submitted 5 June, 2018;
originally announced June 2018.
-
On the utility of Metropolis-Hastings with asymmetric acceptance ratio
Authors:
Christophe Andrieu,
Arnaud Doucet,
Sinan Yıldırım,
Nicolas Chopin
Abstract:
The Metropolis-Hastings (MH) algorithm allows one to sample asymptotically from any probability distribution $π$. Much recent work has been devoted to the development of variants of the MH update which can handle scenarios where evaluating the density of $π$ is impossible, and yet are guaranteed to sample from $π$ asymptotically. The most popular approach to have emerged is arguably the pseudo-marginal (PM) MH algorithm, which substitutes an unbiased estimate of an unnormalised version of $π$ for $π$. Alternative pseudo-marginal algorithms relying instead on unbiased estimates of the MH acceptance ratio have also been proposed. These algorithms can have better properties than standard PM algorithms. Convergence properties of both classes of algorithms are known to depend on the variability of the estimators involved; reduced variability is guaranteed to decrease the asymptotic variance of ergodic averages and will shorten the burn-in period, or convergence to equilibrium, in most scenarios of interest. A simple approach to reduce variability, amenable to parallel computations, consists of averaging independent estimators. However, while averaging estimators of $π$ in a pseudo-marginal algorithm retains the guarantee of sampling from $π$ asymptotically, naive averaging of acceptance ratio estimates breaks detailed balance, leading to incorrect results. We propose an original methodology which allows for a correct implementation of this idea. We establish theoretical properties which parallel those available for standard PM algorithms and discussed above. We demonstrate the interest of the approach on various inference problems. In particular, we show that convergence to equilibrium can be significantly shortened, therefore offering the possibility to reduce a user's waiting time in a generic fashion when a parallel computing architecture is available.
Submitted 26 March, 2018;
originally announced March 2018.
-
Patient Specific Congestive Heart Failure Detection From Raw ECG signal
Authors:
Yakup Kutlu,
Apdullah Yayık,
Esen Yıldırım,
Mustafa Yeniad,
Serdar Yıldırım
Abstract:
In this study, in order to diagnose congestive heart failure (CHF) patients, non-linear second-order difference plots (SODPs) obtained from raw ECG records sampled at 256 Hz and windowed over different durations are used. All of the data rows are labelled with the record they belong to, so that classification is more realistic. SODPs are divided into quadrant regions of different radii, and the number of points falling in each region is computed in order to extract feature vectors. Fisher's linear discriminant, Naive Bayes, radial basis function, and artificial neural network classifiers are used. The results are evaluated with two validation methods: general k-fold cross-validation and patient-based cross-validation. As a result, it is shown that, using a neural network classifier with features obtained from SODPs, the constructed system could distinguish normal and CHF patients with a 100% accuracy rate.
Submitted 1 February, 2017;
originally announced March 2017.
-
MCMC Shape Sampling for Image Segmentation with Nonparametric Shape Priors
Authors:
Ertunc Erdil,
Sinan Yıldırım,
Müjdat Çetin,
Tolga Taşdizen
Abstract:
Segmenting images of low quality or with missing data is a challenging problem. Integrating statistical prior information about the shapes to be segmented can improve the segmentation results significantly. Most shape-based segmentation algorithms optimize an energy functional and find a point estimate for the object to be segmented. This does not provide a measure of the degree of confidence in that result, neither does it provide a picture of other probable solutions based on the data and the priors. With a statistical view, addressing these issues would involve the problem of characterizing the posterior densities of the shapes of the objects to be segmented. For such characterization, we propose a Markov chain Monte Carlo (MCMC) sampling-based image segmentation algorithm that uses statistical shape priors. In addition to better characterization of the statistical structure of the problem, such an approach would also have the potential to address issues with getting stuck at local optima, suffered by existing shape-based segmentation methods. Our approach is able to characterize the posterior probability density in the space of shapes through its samples, and to return multiple solutions, potentially from different modes of a multimodal probability density, which would be encountered, e.g., in segmenting objects from multiple shape classes. We present promising results on a variety of data sets. We also provide an extension for segmenting shapes of objects with parts that can go through independent shape variations. This extension involves the use of local shape priors on object parts and provides robustness to limitations in shape training data size.
Submitted 11 November, 2016;
originally announced November 2016.
-
On the Synchronization of Intermittently Powered Wireless Embedded Systems
Authors:
Kasım Sinan Yıldırım,
Henko Aantjes,
Amjad Yousef Majid,
Przemysław Pawełczak
Abstract:
Battery-free computational RFID platforms, such as WISP (Wireless Identification and Sensing Platform), are emerging intermittently powered devices designed for replacing existing battery-powered sensor networks. As their applications become increasingly complex, we anticipate that synchronization (among others) will appear as one of the crucial building blocks for collaborative and coordinated actions. With this paper we aim at providing initial observations regarding the synchronization of intermittently powered systems. In particular, we design and implement the first and very initial synchronization protocol for the WISP platform that provides explicit synchronization among individual WISPs that reside inside the communication range of a common RFID reader. Evaluations in our testbed showed that with our mechanism a synchronization error of approximately 1.5 milliseconds can be ensured between the RFID reader and a WISP tag.
Submitted 6 June, 2016;
originally announced June 2016.
-
On the Use of Penalty MCMC for Differential Privacy
Authors:
Sinan Yıldırım
Abstract:
We view the penalty algorithm of Ceperley and Dewing (1999), a Markov chain Monte Carlo (MCMC) algorithm for Bayesian inference, in the context of data privacy. Specifically, we study differential privacy of the penalty algorithm and advocate its use for data privacy. We show that in the simple model of independent observations the algorithm has desirable convergence and privacy properties that scale with data size. Two special cases are also investigated and privacy preserving schemes are proposed for those cases: (i) Data are distributed among several data owners who are interested in the inference of a common parameter while preserving their data privacy. (ii) The data likelihood belongs to an exponential family.
Submitted 25 April, 2016;
originally announced April 2016.
-
Safe and Secure Wireless Power Transfer Networks: Challenges and Opportunities in RF-Based Systems
Authors:
Qingzhi Liu,
Kasım Sinan Yıldırım,
Przemysław Pawełczak,
Martijn Warnier
Abstract:
RF-based wireless power transfer networks (WPTNs) are deployed to transfer power to embedded devices over the air via RF waves. Up until now, a considerable amount of effort has been devoted by researchers to designing WPTNs that optimize several objectives such as harvested power, energy outage and charging delay. However, inherent security and safety issues are generally overlooked, and these need to be solved if WPTNs are to become widespread. This article focuses on safety and security problems related to WPTNs and highlights their importance for the efficient and dependable operation of RF-based WPTNs. We provide an overview of new research opportunities in this emerging domain.
Submitted 11 February, 2016; v1 submitted 21 January, 2016;
originally announced January 2016.
-
Gradient Descent Algorithm Inspired Adaptive Time Synchronization in Wireless Sensor Networks
Authors:
Kasim Sinan Yildirim
Abstract:
Our motivation in this paper is to take another step forward from complex and heavyweight synchronization protocols to easy-to-implement and lightweight synchronization protocols in WSNs. To this end, we present GraDeS, a novel multi-hop time synchronization protocol based upon the gradient descent algorithm. We give details about our implementation of GraDeS and present its experimental evaluation in our testbed of MICAz sensor nodes. Our observations indicate that GraDeS is scalable and that it has identical memory and processing overhead, better convergence time, and comparable synchronization performance compared to existing lightweight solutions.
Submitted 9 December, 2015;
originally announced December 2015.
-
Proportional-Integral Clock Synchronization in Wireless Sensor Networks
Authors:
Kasım Sinan Yıldırım,
Ruggero Carli,
Luca Schenato
Abstract:
In this article, we present a new control-theoretic distributed time synchronization algorithm, named PISync, in order to synchronize sensor nodes in Wireless Sensor Networks (WSNs). The PISync algorithm is based on a Proportional-Integral (PI) controller. It applies a proportional feedback (P) and an integral feedback (I) on the locally measured synchronization errors to compensate for the differences between the clock offsets and clock speeds. We present practical flooding-based and fully distributed protocol implementations of the PISync algorithm, and we provide theoretical analysis to highlight the benefits of this approach in terms of improved steady state error and scalability as compared to existing synchronization algorithms. We show through real-world experiments and simulations that PISync protocols have several advantages over existing protocols in the WSN literature, namely no need for memory allocation, minimal CPU overhead and code size independent of network size and topology, and graceful performance degradation in terms of network size.
Submitted 29 October, 2014;
originally announced October 2014.
-
Adaptive Synchronization of Robotic Sensor Networks
Authors:
Kasım Sinan Yıldırım,
Önder Gürcan
Abstract:
The main focus of recent time synchronization research is developing power-efficient synchronization methods that meet pre-defined accuracy requirements. However, an aspect that has often been overlooked is the high dynamics of the network topology due to the mobility of the nodes. Employing existing flooding-based and peer-to-peer synchronization methods, are networked robots still able to adapt themselves and self-adjust their logical clocks under mobile network dynamics? In this paper, we present the application and the evaluation of the existing synchronization methods on robotic sensor networks. We show through simulations that Adaptive Value Tracking synchronization is robust and efficient under mobility. Hence, reducing the time synchronization problem in robotic sensor networks to a dynamic value searching problem is preferable to existing synchronization methods in the literature.
Submitted 28 October, 2014;
originally announced October 2014.
-
Bayesian tracking and parameter learning for non-linear multiple target tracking models
Authors:
Lan Jiang,
Sumeetpal S. Singh,
Sinan Yıldırım
Abstract:
We propose a new Bayesian tracking and parameter learning algorithm for non-linear non-Gaussian multiple target tracking (MTT) models. We design a Markov chain Monte Carlo (MCMC) algorithm to sample from the posterior distribution of the target states, birth and death times, and association of observations to targets, which constitutes the solution to the tracking problem, as well as the model parameters. In the numerical section, we present performance comparisons with several competing techniques and demonstrate significant performance improvements in all cases.
Submitted 8 October, 2014;
originally announced October 2014.
-
Obstructions of Turkish Public Organizations Getting ISO/IEC 27001 Certified
Authors:
Tolga Mataracioglu,
Sevgi Ozkan Yildirim
Abstract:
In this paper, the Articles contained in the ISO/IEC 27001 Standard are compared with the Articles of the Civil Servants Law No 657, which should be complied with by the personnel employed within public institutions in Turkey. The consistent Articles are emphasized, and the matters that public institutions intending to obtain the ISO/IEC 27001 certificate should pay attention to, regarding the Articles of the Civil Servants Law No 657 that are not consistent with the ISO/IEC 27001 certification process, are described. Furthermore, solutions are proposed in order to ensure that the mentioned Articles become consistent with the ISO/IEC 27001 certification process.
Submitted 8 July, 2014;
originally announced July 2014.
-
An Online Expectation-Maximisation Algorithm for Nonnegative Matrix Factorisation Models
Authors:
Sinan Yildirim,
A. Taylan Cemgil,
Sumeetpal S. Singh
Abstract:
In this paper we formulate the nonnegative matrix factorisation (NMF) problem as a maximum likelihood estimation problem for hidden Markov models and propose online expectation-maximisation (EM) algorithms to estimate the NMF and the other unknown static parameters. We also propose a sequential Monte Carlo approximation of our online EM algorithm. We show the performance of the proposed method with two numerical examples.
Submitted 10 January, 2014;
originally announced January 2014.
-
Parameter Estimation in Hidden Markov Models with Intractable Likelihoods Using Sequential Monte Carlo
Authors:
Sinan Yildirim,
Sumeetpal Singh,
Thomas Dean,
Ajay Jasra
Abstract:
We propose sequential Monte Carlo based algorithms for maximum likelihood estimation of the static parameters in hidden Markov models with an intractable likelihood using ideas from approximate Bayesian computation. The static parameter estimation algorithms are gradient based and cover both offline and online estimation. We demonstrate their performance by estimating the parameters of three intractable models, namely the alpha-stable distribution, g-and-k distribution, and the stochastic volatility model with alpha-stable returns, using both real and synthetic data.
Submitted 17 November, 2013;
originally announced November 2013.
-
Estimating the Static Parameters in Linear Gaussian Multiple Target Tracking Models
Authors:
Sinan Yildirim,
Lan Jiang,
Sumeetpal S. Singh,
Tom Dean
Abstract:
We present both offline and online maximum likelihood estimation (MLE) techniques for inferring the static parameters of a multiple target tracking (MTT) model with linear Gaussian dynamics. We present the batch and online versions of the expectation-maximisation (EM) algorithm for short and long data sets respectively, and we show how Monte Carlo approximations of these methods can be implemented. Performance is assessed in numerical examples using simulated data for various scenarios and a comparison with a Bayesian estimation procedure is also provided.
Submitted 4 December, 2012;
originally announced December 2012.
-
Relativistic Transport Approach to Collective Nuclear Dynamics
Authors:
S. Yildirim,
T. Gaitanos,
M. Di Toro,
V. Greco
Abstract:
The isoscalar giant monopole resonance (ISGMR) and isovector giant dipole resonance (IVGDR) in finite nuclei are studied in the framework of a relativistic transport approach. The kinetic equations are derived within an effective nucleon-meson field theory in the Relativistic Mean Field (RMF) scheme, even extended to density dependent vertices. Small amplitude oscillations are analysed using the Relativistic Vlasov (RV) approach, i.e. neglecting nucleon collision terms. The time evolution of the isoscalar monopole moment and isovector dipole moment and the corresponding Fourier power spectra are discussed. In the case of $^{208}$Pb we study in detail the dependence of the monopole response on the effective mass and symmetry energy at saturation given by the used covariant effective interaction. We show that a reduced $m^*$ and a larger $a_4$ can compensate the effect on the ISGMR energy centroid of a much larger compressibility modulus $K_{nm}$. This result is important in order to overcome the conflicting determination of the nuclear compressibility between non-relativistic and relativistic effective interactions. For the symmetry energy dynamical effects, we carefully analyze the influence of the inclusion of an effective isovector scalar channel, δ-meson field, with constant and density dependent couplings. We show the relevance of the slope (or pressure) of the symmetry energy at saturation on the ISGMR and IVGDR modes for neutron-rich systems. Density dependent vertices are not much affecting our conclusions. Following as a guidance some extended dispersion relations in nuclear matter, we see two main reasons for that, the smoothness of the density dependences around saturation and the presence of compensation effects coming from rearrangement terms.
Submitted 6 July, 2005;
originally announced July 2005.
-
Collisional Damping of Giant Monopole and Quadrupole Resonances
Authors:
S. Yildirim,
A. Gokalp,
O. Yilmaz,
S. Ayik
Abstract:
Collisional damping widths of giant monopole and quadrupole excitations for $^{120}$Sn and $^{208}$Pb at zero and finite temperatures are calculated within the Thomas-Fermi approximation by employing the microscopic in-medium cross-sections of Li and Machleidt and the phenomenological Skyrme and Gogny forces, and are compared with each other. The results for the collisional widths of giant monopole and quadrupole vibrations at zero temperature as a function of the mass number show that the collisional damping of giant monopole vibrations accounts for about 30-40% of the observed widths at zero temperature, while for giant quadrupole vibrations it accounts for only 20-30% of the observed widths at zero temperature.
Submitted 8 January, 2001;
originally announced January 2001.
-
On the Collisional Damping of Giant Dipole Resonance
Authors:
Osman Yilmaz,
Ahmet Gokalp,
Serbulent Yildirim,
Sakir Ayik
Abstract:
Collisional damping widths of giant dipole excitations are calculated in Thomas-Fermi approximation by employing the microscopic in-medium cross-sections of Li and Machleidt and the phenomenological Gogny force. The results obtained in both calculations compare well, but account for about 25-35% of the observed widths in $^{120}$Sn and $^{208}$Pb at finite temperatures.
Submitted 21 September, 1999;
originally announced September 1999.