Search | arXiv e-print repository

Towards Optimal Trade-offs in Knowledge Distillation for CNNs and Vision Transformers at the Edge

Authors: John Violos, Symeon Papadopoulos, Ioannis Kompatsiaris

Abstract: This paper discusses four facets of the Knowledge Distillation (KD) process for Convolutional Neural Networks (CNNs) and Vision Transformer (ViT) architectures, particularly when executed on edge devices with constrained processing capabilities. First, we conduct a comparative analysis of the KD process between CNNs and ViT architectures, aiming to elucidate the feasibility and efficacy of employi… ▽ More This paper discusses four facets of the Knowledge Distillation (KD) process for Convolutional Neural Networks (CNNs) and Vision Transformer (ViT) architectures, particularly when executed on edge devices with constrained processing capabilities. First, we conduct a comparative analysis of the KD process between CNNs and ViT architectures, aiming to elucidate the feasibility and efficacy of employing different architectural configurations for the teacher and student, while assessing their performance and efficiency. Second, we explore the impact of varying the size of the student model on accuracy and inference speed, while maintaining a constant KD duration. Third, we examine the effects of employing higher resolution images on the accuracy, memory footprint and computational workload. Last, we examine the performance improvements obtained by fine-tuning the student model after KD to specific downstream tasks. Through empirical evaluations and analyses, this research provides AI practitioners with insights into optimal strategies for maximizing the effectiveness of the KD process on edge devices. △ Less

Submitted 25 June, 2024; originally announced July 2024.

arXiv:2404.15385 [pdf, ps, other]

Sum of Group Error Differences: A Critical Examination of Bias Evaluation in Biometric Verification and a Dual-Metric Measure

Authors: Alaa Elobaid, Nathan Ramoly, Lara Younes, Symeon Papadopoulos, Eirini Ntoutsi, Ioannis Kompatsiaris

Abstract: Biometric Verification (BV) systems often exhibit accuracy disparities across different demographic groups, leading to biases in BV applications. Assessing and quantifying these biases is essential for ensuring the fairness of BV systems. However, existing bias evaluation metrics in BV have limitations, such as focusing exclusively on match or non-match error rates, overlooking bias on demographic… ▽ More Biometric Verification (BV) systems often exhibit accuracy disparities across different demographic groups, leading to biases in BV applications. Assessing and quantifying these biases is essential for ensuring the fairness of BV systems. However, existing bias evaluation metrics in BV have limitations, such as focusing exclusively on match or non-match error rates, overlooking bias on demographic groups with performance levels falling between the best and worst performance levels, and neglecting the magnitude of the bias present. This paper presents an in-depth analysis of the limitations of current bias evaluation metrics in BV and, through experimental analysis, demonstrates their contextual suitability, merits, and limitations. Additionally, it introduces a novel general-purpose bias evaluation measure for BV, the ``Sum of Group Error Differences (SEDG)''. Our experimental results on controlled synthetic datasets demonstrate the effectiveness of demographic bias quantification when using existing metrics and our own proposed measure. We discuss the applicability of the bias evaluation metrics in a set of simulated demographic bias scenarios and provide scenario-based metric recommendations. Our code is publicly available under \url{https://github.com/alaaobeid/SEDG}. △ Less

Submitted 23 April, 2024; originally announced April 2024.

arXiv:2308.11684 [pdf, other]

doi 10.1145/3394231.3397920

User Identity Linkage in Social Media Using Linguistic and Social Interaction Features

Authors: Despoina Chatzakou, Juan Soler-Company, Theodora Tsikrika, Leo Wanner, Stefanos Vrochidis, Ioannis Kompatsiaris

Abstract: Social media users often hold several accounts in their effort to multiply the spread of their thoughts, ideas, and viewpoints. In the particular case of objectionable content, users tend to create multiple accounts to bypass the combating measures enforced by social media platforms and thus retain their online identity even if some of their accounts are suspended. User identity linkage aims to re… ▽ More Social media users often hold several accounts in their effort to multiply the spread of their thoughts, ideas, and viewpoints. In the particular case of objectionable content, users tend to create multiple accounts to bypass the combating measures enforced by social media platforms and thus retain their online identity even if some of their accounts are suspended. User identity linkage aims to reveal social media accounts likely to belong to the same natural person so as to prevent the spread of abusive/illegal activities. To this end, this work proposes a machine learning-based detection model, which uses multiple attributes of users' online activity in order to identify whether two or more virtual identities belong to the same real natural person. The models efficacy is demonstrated on two cases on abusive and terrorism-related Twitter content. △ Less

Submitted 22 August, 2023; originally announced August 2023.

arXiv:2307.01210 [pdf]

AI and Non AI Assessments for Dementia

Authors: Mahboobeh Parsapoor, Hamed Ghodrati, Vincenzo Dentamaro, Christopher R. Madan, Ioulietta Lazarou, Spiros Nikolopoulos, Ioannis Kompatsiaris

Abstract: Current progress in the artificial intelligence domain has led to the development of various types of AI-powered dementia assessments, which can be employed to identify patients at the early stage of dementia. It can revolutionize the dementia care settings. It is essential that the medical community be aware of various AI assessments and choose them considering their degrees of validity, efficien… ▽ More Current progress in the artificial intelligence domain has led to the development of various types of AI-powered dementia assessments, which can be employed to identify patients at the early stage of dementia. It can revolutionize the dementia care settings. It is essential that the medical community be aware of various AI assessments and choose them considering their degrees of validity, efficiency, practicality, reliability, and accuracy concerning the early identification of patients with dementia (PwD). On the other hand, AI developers should be informed about various non-AI assessments as well as recently developed AI assessments. Thus, this paper, which can be readable by both clinicians and AI engineers, fills the gap in the literature in explaining the existing solutions for the recognition of dementia to clinicians, as well as the techniques used and the most widespread dementia datasets to AI engineers. It follows a review of papers on AI and non-AI assessments for dementia to provide valuable information about various dementia assessments for both the AI and medical communities. The discussion and conclusion highlight the most prominent research directions and the maturity of existing solutions. △ Less

Submitted 29 June, 2023; originally announced July 2023.

Comments: 49 pages

arXiv:2304.12053 [pdf, other]

doi 10.1145/3592572.3592846

Improving Synthetically Generated Image Detection in Cross-Concept Settings

Authors: Pantelis Dogoulis, Giorgos Kordopatis-Zilos, Ioannis Kompatsiaris, Symeon Papadopoulos

Abstract: New advancements for the detection of synthetic images are critical for fighting disinformation, as the capabilities of generative AI models continuously evolve and can lead to hyper-realistic synthetic imagery at unprecedented scale and speed. In this paper, we focus on the challenge of generalizing across different concept classes, e.g., when training a detector on human faces and testing on syn… ▽ More New advancements for the detection of synthetic images are critical for fighting disinformation, as the capabilities of generative AI models continuously evolve and can lead to hyper-realistic synthetic imagery at unprecedented scale and speed. In this paper, we focus on the challenge of generalizing across different concept classes, e.g., when training a detector on human faces and testing on synthetic animal images - highlighting the ineffectiveness of existing approaches that randomly sample generated images to train their models. By contrast, we propose an approach based on the premise that the robustness of the detector can be enhanced by training it on realistic synthetic images that are selected based on their quality scores according to a probabilistic quality estimation model. We demonstrate the effectiveness of the proposed approach by conducting experiments with generated images from two seminal architectures, StyleGAN2 and Latent Diffusion, and using three different concepts for each, so as to measure the cross-concept generalization ability. Our results show that our quality-based sampling method leads to higher detection performance for nearly all concepts, improving the overall effectiveness of the synthetic image detectors. △ Less

Submitted 24 April, 2023; originally announced April 2023.

arXiv:2304.07395 [pdf, other]

Investigation of ensemble methods for the detection of deepfake face manipulations

Authors: Nikolaos Giatsoglou, Symeon Papadopoulos, Ioannis Kompatsiaris

Abstract: The recent wave of AI research has enabled a new brand of synthetic media, called deepfakes. Deepfakes have impressive photorealism, which has generated exciting new use cases but also raised serious threats to our increasingly digital world. To mitigate these threats, researchers have tried to come up with new methods for deepfake detection that are more effective than traditional forensics and h… ▽ More The recent wave of AI research has enabled a new brand of synthetic media, called deepfakes. Deepfakes have impressive photorealism, which has generated exciting new use cases but also raised serious threats to our increasingly digital world. To mitigate these threats, researchers have tried to come up with new methods for deepfake detection that are more effective than traditional forensics and heavily rely on deep AI technology. In this paper, following up on encouraging prior work for deepfake detection with attribution and ensemble techniques, we explore and compare multiple designs for ensemble detectors. The goal is to achieve robustness and good generalization ability by leveraging ensembles of models that specialize in different manipulation categories. Our results corroborate that ensembles can achieve higher accuracy than individual models when properly tuned, while the generalization ability relies on access to a large number of training data for a diverse set of known manipulations. △ Less

Submitted 14 April, 2023; originally announced April 2023.

Comments: 16 pages, 4 figures

arXiv:2304.03378 [pdf, other]

Self-Supervised Video Similarity Learning

Authors: Giorgos Kordopatis-Zilos, Giorgos Tolias, Christos Tzelepis, Ioannis Kompatsiaris, Ioannis Patras, Symeon Papadopoulos

Abstract: We introduce S$^2$VS, a video similarity learning approach with self-supervision. Self-Supervised Learning (SSL) is typically used to train deep models on a proxy task so as to have strong transferability on target tasks after fine-tuning. Here, in contrast to prior work, SSL is used to perform video similarity learning and address multiple retrieval and detection tasks at once with no use of labe… ▽ More We introduce S$^2$VS, a video similarity learning approach with self-supervision. Self-Supervised Learning (SSL) is typically used to train deep models on a proxy task so as to have strong transferability on target tasks after fine-tuning. Here, in contrast to prior work, SSL is used to perform video similarity learning and address multiple retrieval and detection tasks at once with no use of labeled data. This is achieved by learning via instance-discrimination with task-tailored augmentations and the widely used InfoNCE loss together with an additional loss operating jointly on self-similarity and hard-negative similarity. We benchmark our method on tasks where video relevance is defined with varying granularity, ranging from video copies to videos depicting the same incident or event. We learn a single universal model that achieves state-of-the-art performance on all tasks, surpassing previously proposed methods that use labeled data. The code and pretrained models are publicly available at: https://github.com/gkordo/s2vs △ Less

Submitted 16 June, 2023; v1 submitted 6 April, 2023; originally announced April 2023.

arXiv:2301.05562 [pdf, ps, other]

Multilingual Alzheimer's Dementia Recognition through Spontaneous Speech: a Signal Processing Grand Challenge

Authors: Saturnino Luz, Fasih Haider, Davida Fromm, Ioulietta Lazarou, Ioannis Kompatsiaris, Brian MacWhinney

Abstract: This Signal Processing Grand Challenge (SPGC) targets a difficult automatic prediction problem of societal and medical relevance, namely, the detection of Alzheimer's Dementia (AD). Participants were invited to employ signal processing and machine learning methods to create predictive models based on spontaneous speech data. The Challenge has been designed to assess the extent to which predictive… ▽ More This Signal Processing Grand Challenge (SPGC) targets a difficult automatic prediction problem of societal and medical relevance, namely, the detection of Alzheimer's Dementia (AD). Participants were invited to employ signal processing and machine learning methods to create predictive models based on spontaneous speech data. The Challenge has been designed to assess the extent to which predictive models built based on speech in one language (English) generalise to another language (Greek). To the best of our knowledge no work has investigated acoustic features of the speech signal in multilingual AD detection. Our baseline system used conventional machine learning algorithms with Active Data Representation of acoustic features, achieving accuracy of 73.91% on AD detection, and 4.95 root mean squared error on cognitive score prediction. △ Less

Submitted 13 January, 2023; originally announced January 2023.

Comments: ICASSP 2023 SPGC description

MSC Class: 68T10 (Primary) 92C55 (Secondary) ACM Class: J.3; I.2.6; I.5.1

arXiv:2212.01128 [pdf, other]

A Multi-Stream Fusion Network for Image Splicing Localization

Authors: Maria Siopi, Giorgos Kordopatis-Zilos, Polychronis Charitidis, Ioannis Kompatsiaris, Symeon Papadopoulos

Abstract: In this paper, we address the problem of image splicing localization with a multi-stream network architecture that processes the raw RGB image in parallel with other handcrafted forensic signals. Unlike previous methods that either use only the RGB images or stack several signals in a channel-wise manner, we propose an encoder-decoder architecture that consists of multiple encoder streams. Each st… ▽ More In this paper, we address the problem of image splicing localization with a multi-stream network architecture that processes the raw RGB image in parallel with other handcrafted forensic signals. Unlike previous methods that either use only the RGB images or stack several signals in a channel-wise manner, we propose an encoder-decoder architecture that consists of multiple encoder streams. Each stream is fed with either the tampered image or handcrafted signals and processes them separately to capture relevant information from each one independently. Finally, the extracted features from the multiple streams are fused in the bottleneck of the architecture and propagated to the decoder network that generates the output localization map. We experiment with two handcrafted algorithms, i.e., DCT and Splicebuster. Our proposed approach is benchmarked on three public forensics datasets, demonstrating competitive performance against several competing methods and achieving state-of-the-art results, e.g., 0.898 AUC on CASIA. △ Less

Submitted 2 December, 2022; originally announced December 2022.

Comments: Accepted to the International Conference on MultiMedia Modeling (MMM 2023)

arXiv:2207.13458 [pdf, other]

VICTOR: Visual Incompatibility Detection with Transformers and Fashion-specific contrastive pre-training

Authors: Stefanos-Iordanis Papadopoulos, Christos Koutlis, Symeon Papadopoulos, Ioannis Kompatsiaris

Abstract: For fashion outfits to be considered aesthetically pleasing, the garments that constitute them need to be compatible in terms of visual aspects, such as style, category and color. Previous works have defined visual compatibility as a binary classification task with items in a garment being considered as fully compatible or fully incompatible. However, this is not applicable to Outfit Maker applica… ▽ More For fashion outfits to be considered aesthetically pleasing, the garments that constitute them need to be compatible in terms of visual aspects, such as style, category and color. Previous works have defined visual compatibility as a binary classification task with items in a garment being considered as fully compatible or fully incompatible. However, this is not applicable to Outfit Maker applications where users create their own outfits and need to know which specific items may be incompatible with the rest of the outfit. To address this, we propose the Visual InCompatibility TransfORmer (VICTOR) that is optimized for two tasks: 1) overall compatibility as regression and 2) the detection of mismatching items and utilize fashion-specific contrastive language-image pre-training for fine tuning computer vision neural networks on fashion imagery. We build upon the Polyvore outfit benchmark to generate partially mismatching outfits, creating a new dataset termed Polyvore-MISFITs, that is used to train VICTOR. A series of ablation and comparative analyses show that the proposed architecture can compete and even surpass the current state-of-the-art on Polyvore datasets while reducing the instance-wise floating operations by 88%, striking a balance between high performance and efficiency. We release our code at https://github.com/stevejpapad/Visual-InCompatibility-Transformer △ Less

Submitted 8 September, 2022; v1 submitted 27 July, 2022; originally announced July 2022.

Comments: 14 pages, 6 figures

arXiv:2205.10003 [pdf, other]

InDistill: Information flow-preserving knowledge distillation for model compression

Authors: Ioannis Sarridis, Christos Koutlis, Giorgos Kordopatis-Zilos, Ioannis Kompatsiaris, Symeon Papadopoulos

Abstract: In this paper we introduce InDistill, a model compression approach that combines knowledge distillation and channel pruning in a unified framework for the transfer of the critical information flow paths from a heavyweight teacher to a lightweight student. Such information is typically collapsed in previous methods due to an encoding stage prior to distillation. By contrast, InDistill leverages a p… ▽ More In this paper we introduce InDistill, a model compression approach that combines knowledge distillation and channel pruning in a unified framework for the transfer of the critical information flow paths from a heavyweight teacher to a lightweight student. Such information is typically collapsed in previous methods due to an encoding stage prior to distillation. By contrast, InDistill leverages a pruning operation applied to the teacher's intermediate layers reducing their width to the corresponding student layers' width. In that way, we force architectural alignment enabling the intermediate layers to be directly distilled without the need of an encoding stage. Additionally, a curriculum learning-based training scheme is adopted considering the distillation difficulty of each layer and the critical learning periods in which the information flow paths are created. The proposed method surpasses state-of-the-art performance on three standard benchmarks, i.e. CIFAR-10, CUB-200, and FashionMNIST by 3.08%, 14.27%, and 1% mAP, respectively, as well as on more challenging evaluation settings, i.e. ImageNet and CIFAR-100 by 1.97% and 5.65% mAP, respectively. △ Less

Submitted 16 June, 2023; v1 submitted 20 May, 2022; originally announced May 2022.

arXiv:2204.12902 [pdf, other]

A Graph Diffusion Scheme for Decentralized Content Search based on Personalized PageRank

Authors: Nikolaos Giatsoglou, Emmanouil Krasanakis, Symeon Papadopoulos, Ioannis Kompatsiaris

Abstract: Decentralization is emerging as a key feature of the future Internet. However, effective algorithms for search are missing from state-of-the-art decentralized technologies, such as distributed hash tables and blockchain. This is surprising, since decentralized search has been studied extensively in earlier peer-to-peer (P2P) literature. In this work, we adopt a fresh outlook for decentralized sear… ▽ More Decentralization is emerging as a key feature of the future Internet. However, effective algorithms for search are missing from state-of-the-art decentralized technologies, such as distributed hash tables and blockchain. This is surprising, since decentralized search has been studied extensively in earlier peer-to-peer (P2P) literature. In this work, we adopt a fresh outlook for decentralized search in P2P networks that is inspired by advancements in dense information retrieval and graph signal processing. In particular, we generate latent representations of P2P nodes based on their stored documents and diffuse them to the rest of the network with graph filters, such as personalized PageRank. We then use the diffused representations to guide search queries towards relevant content. Our preliminary approach is successful in locating relevant documents in nearby nodes but the accuracy declines sharply with the number of stored documents, highlighting the need for more sophisticated techniques. △ Less

Submitted 27 April, 2022; originally announced April 2022.

Comments: 7 pages, 3 figures, to be published in the Decentralized Internet, Networks, Protocols, and Systems (DINPS 2022) workshop hosted at the 42nd IEEE International Conference on Distributed Computing Systems (ICDCS 2022) - https://research.protocol.ai/sites/dinps/

arXiv:2204.12816 [pdf, other]

The MeVer DeepFake Detection Service: Lessons Learnt from Developing and Deploying in the Wild

Authors: Spyridon Baxevanakis, Giorgos Kordopatis-Zilos, Panagiotis Galopoulos, Lazaros Apostolidis, Killian Levacher, Ipek B. Schlicht, Denis Teyssou, Ioannis Kompatsiaris, Symeon Papadopoulos

Abstract: Enabled by recent improvements in generation methodologies, DeepFakes have become mainstream due to their increasingly better visual quality, the increase in easy-to-use generation tools and the rapid dissemination through social media. This fact poses a severe threat to our societies with the potential to erode social cohesion and influence our democracies. To mitigate the threat, numerous DeepFa… ▽ More Enabled by recent improvements in generation methodologies, DeepFakes have become mainstream due to their increasingly better visual quality, the increase in easy-to-use generation tools and the rapid dissemination through social media. This fact poses a severe threat to our societies with the potential to erode social cohesion and influence our democracies. To mitigate the threat, numerous DeepFake detection schemes have been introduced in the literature but very few provide a web service that can be used in the wild. In this paper, we introduce the MeVer DeepFake detection service, a web service detecting deep learning manipulations in images and video. We present the design and implementation of the proposed processing pipeline that involves a model ensemble scheme, and we endow the service with a model card for transparency. Experimental results show that our service performs robustly on the three benchmark datasets while being vulnerable to Adversarial Attacks. Finally, we outline our experience and lessons learned when deploying a research system into production in the hopes that it will be useful to other academic and industry teams. △ Less

Submitted 27 April, 2022; originally announced April 2022.

Comments: 10 pages, 6 figures

arXiv:2204.04014 [pdf, other]

Multimodal Quasi-AutoRegression: Forecasting the visual popularity of new fashion products

Authors: Stefanos I. Papadopoulos, Christos Koutlis, Symeon Papadopoulos, Ioannis Kompatsiaris

Abstract: Estimating the preferences of consumers is of utmost importance for the fashion industry as appropriately leveraging this information can be beneficial in terms of profit. Trend detection in fashion is a challenging task due to the fast pace of change in the fashion industry. Moreover, forecasting the visual popularity of new garment designs is even more demanding due to lack of historical data. T… ▽ More Estimating the preferences of consumers is of utmost importance for the fashion industry as appropriately leveraging this information can be beneficial in terms of profit. Trend detection in fashion is a challenging task due to the fast pace of change in the fashion industry. Moreover, forecasting the visual popularity of new garment designs is even more demanding due to lack of historical data. To this end, we propose MuQAR, a Multimodal Quasi-AutoRegressive deep learning architecture that combines two modules: (1) a multi-modal multi-layer perceptron processing categorical, visual and textual features of the product and (2) a quasi-autoregressive neural network modelling the "target" time series of the product's attributes along with the "exogenous" time series of all other attributes. We utilize computer vision, image classification and image captioning, for automatically extracting visual features and textual descriptions from the images of new products. Product design in fashion is initially expressed visually and these features represent the products' unique characteristics without interfering with the creative process of its designers by requiring additional inputs (e.g manually written texts). We employ the product's target attributes time series as a proxy of temporal popularity patterns, mitigating the lack of historical data, while exogenous time series help capture trends among interrelated attributes. We perform an extensive ablation analysis on two large scale image fashion datasets, Mallzee and SHIFT15m to assess the adequacy of MuQAR and also use the Amazon Reviews: Home and Kitchen dataset to assess generalisability to other domains. A comparative study on the VISUELLE dataset, shows that MuQAR is capable of competing and surpassing the domain's current state of the art by 4.65% and 4.8% in terms of WAPE and MAE respectively. △ Less

Submitted 5 May, 2022; v1 submitted 8 April, 2022; originally announced April 2022.

Comments: 14 pages, 4 figures

arXiv:2111.14837 [pdf, other]

doi 10.1109/ACCESS.2022.3159688

p2pGNN: A Decentralized Graph Neural Network for Node Classification in Peer-to-Peer Networks

Authors: Emmanouil Krasanakis, Symeon Papadopoulos, Ioannis Kompatsiaris

Abstract: In this work, we aim to classify nodes of unstructured peer-to-peer networks with communication uncertainty, such as users of decentralized social networks. Graph Neural Networks (GNNs) are known to improve the accuracy of simple classifiers in centralized settings by leveraging naturally occurring network links, but graph convolutional layers are challenging to implement in decentralized settings… ▽ More In this work, we aim to classify nodes of unstructured peer-to-peer networks with communication uncertainty, such as users of decentralized social networks. Graph Neural Networks (GNNs) are known to improve the accuracy of simple classifiers in centralized settings by leveraging naturally occurring network links, but graph convolutional layers are challenging to implement in decentralized settings when node neighbors are not constantly available. We address this problem by employing decoupled GNNs, where base classifier predictions and errors are diffused through graphs after training. For these, we deploy pre-trained and gossip-trained base classifiers and implement peer-to-peer graph diffusion under communication uncertainty. In particular, we develop an asynchronous decentralized formulation of diffusion that converges to centralized predictions in distribution and linearly with respect to communication rates. We experiment on three real-world graphs with node features and labels and simulate peer-to-peer networks with uniformly random communication frequencies; given a portion of known labels, our decentralized graph diffusion achieves comparable accuracy to centralized GNNs with minimal communication overhead (less than 3% of what gossip training already adds). △ Less

Submitted 15 March, 2022; v1 submitted 29 November, 2021; originally announced November 2021.

Comments: 12 pages, 4 figures, 2 tables, accepted manuscript, IEEE Access

arXiv:2110.09274 [pdf, other]

pygrank: A Python Package for Graph Node Ranking

Authors: Emmanouil Krasanakis, Symeon Papadopoulos, Ioannis Kompatsiaris, Andreas Symeonidis

Abstract: We introduce pygrank, an open source Python package to define, run and evaluate node ranking algorithms. We provide object-oriented and extensively unit-tested algorithm components, such as graph filters, post-processors, measures, benchmarks and online tuning. Computations can be delegated to numpy, tensorflow or pytorch backends and fit in back-propagation pipelines. Classes can be combined to d… ▽ More We introduce pygrank, an open source Python package to define, run and evaluate node ranking algorithms. We provide object-oriented and extensively unit-tested algorithm components, such as graph filters, post-processors, measures, benchmarks and online tuning. Computations can be delegated to numpy, tensorflow or pytorch backends and fit in back-propagation pipelines. Classes can be combined to define interoperable complex algorithms. Within the context of this paper we compare the package with related alternatives and demonstrate its flexibility and ease of use with code examples. △ Less

Submitted 18 October, 2021; originally announced October 2021.

Comments: 6 pages, 1 figure, 2 tables, 3 code snippets

arXiv:2108.12397 [pdf, other]

Prior Signal Editing for Graph Filter Posterior Fairness Constraints

Authors: Emmanouil Krasanakis, Symeon Papadopoulos, Ioannis Kompatsiaris, Andreas Symeonidis

Abstract: Graph filters are an emerging paradigm that systematizes information propagation in graphs as transformation of prior node values, called graph signals, to posterior scores. In this work, we study the problem of mitigating disparate impact, i.e. posterior score differences between a protected set of sensitive nodes and the rest, while minimally editing scores to preserve recommendation quality. To… ▽ More Graph filters are an emerging paradigm that systematizes information propagation in graphs as transformation of prior node values, called graph signals, to posterior scores. In this work, we study the problem of mitigating disparate impact, i.e. posterior score differences between a protected set of sensitive nodes and the rest, while minimally editing scores to preserve recommendation quality. To this end, we develop a scheme that respects propagation mechanisms by editing graph signal priors according to their posteriors and node sensitivity, where a small number of editing parameters can be tuned to constrain or eliminate disparate impact. We also theoretically explain that coarse prior editing can locally optimize posteriors objectives thanks to graph filter robustness. We experiment on a diverse collection of 12 graphs with varying number of nodes, where our approach performs equally well or better than previous ones in minimizing disparate impact and preserving posterior AUC under fairness constraints. △ Less

Submitted 27 August, 2021; originally announced August 2021.

Comments: 40 pages, 4 figures, 9 tables

arXiv:2107.07919 [pdf, other]

A Survey on Bias in Visual Datasets

Authors: Simone Fabbrizzi, Symeon Papadopoulos, Eirini Ntoutsi, Ioannis Kompatsiaris

Abstract: Computer Vision (CV) has achieved remarkable results, outperforming humans in several tasks. Nonetheless, it may result in significant discrimination if not handled properly as CV systems highly depend on the data they are fed with and can learn and amplify biases within such data. Thus, the problems of understanding and discovering biases are of utmost importance. Yet, there is no comprehensive s… ▽ More Computer Vision (CV) has achieved remarkable results, outperforming humans in several tasks. Nonetheless, it may result in significant discrimination if not handled properly as CV systems highly depend on the data they are fed with and can learn and amplify biases within such data. Thus, the problems of understanding and discovering biases are of utmost importance. Yet, there is no comprehensive survey on bias in visual datasets. Hence, this work aims to: i) describe the biases that might manifest in visual datasets; ii) review the literature on methods for bias discovery and quantification in visual datasets; iii) discuss existing attempts to collect bias-aware visual datasets. A key conclusion of our study is that the problem of bias discovery and quantification in visual datasets is still open, and there is room for improvement in terms of both methods and the range of biases that can be addressed. Moreover, there is no such thing as a bias-free dataset, so scientists and practitioners must become aware of the biases in their datasets and make them explicit. To this end, we propose a checklist to spot different types of bias during visual dataset collection. △ Less

Submitted 23 June, 2022; v1 submitted 16 July, 2021; originally announced July 2021.

arXiv:2106.13266 [pdf, other]

doi 10.1007/s11263-022-01651-3

DnS: Distill-and-Select for Efficient and Accurate Video Indexing and Retrieval

Authors: Giorgos Kordopatis-Zilos, Christos Tzelepis, Symeon Papadopoulos, Ioannis Kompatsiaris, Ioannis Patras

Abstract: In this paper, we address the problem of high performance and computationally efficient content-based video retrieval in large-scale datasets. Current methods typically propose either: (i) fine-grained approaches employing spatio-temporal representations and similarity calculations, achieving high performance at a high computational cost or (ii) coarse-grained approaches representing/indexing vide… ▽ More In this paper, we address the problem of high performance and computationally efficient content-based video retrieval in large-scale datasets. Current methods typically propose either: (i) fine-grained approaches employing spatio-temporal representations and similarity calculations, achieving high performance at a high computational cost or (ii) coarse-grained approaches representing/indexing videos as global vectors, where the spatio-temporal structure is lost, providing low performance but also having low computational cost. In this work, we propose a Knowledge Distillation framework, called Distill-and-Select (DnS), that starting from a well-performing fine-grained Teacher Network learns: a) Student Networks at different retrieval performance and computational efficiency trade-offs and b) a Selector Network that at test time rapidly directs samples to the appropriate student to maintain both high retrieval performance and high computational efficiency. We train several students with different architectures and arrive at different trade-offs of performance and efficiency, i.e., speed and storage requirements, including fine-grained students that store/index videos using binary representations. Importantly, the proposed scheme allows Knowledge Distillation in large, unlabelled datasets -- this leads to good students. We evaluate DnS on five public datasets on three different video retrieval tasks and demonstrate a) that our students achieve state-of-the-art performance in several cases and b) that the DnS framework provides an excellent trade-off between retrieval performance, computational speed, and storage space. In specific configurations, the proposed method achieves similar mAP with the teacher but is 20 times faster and requires 240 times less storage space. The collected dataset and implementation are publicly available: https://github.com/mever-team/distill-and-select. △ Less

Submitted 5 August, 2022; v1 submitted 24 June, 2021; originally announced June 2021.

Journal ref: International Journal of Computer Vision (2022)

arXiv:2105.07645 [pdf, ps, other]

Leveraging EfficientNet and Contrastive Learning for Accurate Global-scale Location Estimation

Authors: Giorgos Kordopatis-Zilos, Panagiotis Galopoulos, Symeon Papadopoulos, Ioannis Kompatsiaris

Abstract: In this paper, we address the problem of global-scale image geolocation, proposing a mixed classification-retrieval scheme. Unlike other methods that strictly tackle the problem as a classification or retrieval task, we combine the two practices in a unified solution leveraging the advantages of each approach with two different modules. The first leverages the EfficientNet architecture to assign i… ▽ More In this paper, we address the problem of global-scale image geolocation, proposing a mixed classification-retrieval scheme. Unlike other methods that strictly tackle the problem as a classification or retrieval task, we combine the two practices in a unified solution leveraging the advantages of each approach with two different modules. The first leverages the EfficientNet architecture to assign images to a specific geographic cell in a robust way. The second introduces a new residual architecture that is trained with contrastive learning to map input images to an embedding space that minimizes the pairwise geodesic distance of same-location images. For the final location estimation, the two modules are combined with a search-within-cell scheme, where the locations of most similar images from the predicted geographic cell are aggregated based on a spatial clustering scheme. Our approach demonstrates very competitive performance on four public datasets, achieving new state-of-the-art performance in fine granularity scales, i.e., 15.0% at 1km range on Im2GPS3k. △ Less

Submitted 17 May, 2021; originally announced May 2021.

arXiv:2105.05515 [pdf, other]

Operation-wise Attention Network for Tampering Localization Fusion

Authors: Polychronis Charitidis, Giorgos Kordopatis-Zilos, Symeon Papadopoulos, Ioannis Kompatsiaris

Abstract: In this work, we present a deep learning-based approach for image tampering localization fusion. This approach is designed to combine the outcomes of multiple image forensics algorithms and provides a fused tampering localization map, which requires no expert knowledge and is easier to interpret by end users. Our fusion framework includes a set of five individual tampering localization methods for… ▽ More In this work, we present a deep learning-based approach for image tampering localization fusion. This approach is designed to combine the outcomes of multiple image forensics algorithms and provides a fused tampering localization map, which requires no expert knowledge and is easier to interpret by end users. Our fusion framework includes a set of five individual tampering localization methods for splicing localization on JPEG images. The proposed deep learning fusion model is an adapted architecture, initially proposed for the image restoration task, that performs multiple operations in parallel, weighted by an attention mechanism to enable the selection of proper operations depending on the input signals. This weighting process can be very beneficial for cases where the input signal is very diverse, as in our case where the output signals of multiple image forensics algorithms are combined. Evaluation in three publicly available forensics datasets demonstrates that the performance of the proposed approach is competitive, outperforming the individual forensics techniques as well as another recently proposed fusion framework in the majority of cases. △ Less

Submitted 13 May, 2021; v1 submitted 12 May, 2021; originally announced May 2021.

Comments: 6 pages, 3 figures, cbmi

arXiv:2010.08737 [pdf, ps, other]

Audio-based Near-Duplicate Video Retrieval with Audio Similarity Learning

Authors: Pavlos Avgoustinakis, Giorgos Kordopatis-Zilos, Symeon Papadopoulos, Andreas L. Symeonidis, Ioannis Kompatsiaris

Abstract: In this work, we address the problem of audio-based near-duplicate video retrieval. We propose the Audio Similarity Learning (AuSiL) approach that effectively captures temporal patterns of audio similarity between video pairs. For the robust similarity calculation between two videos, we first extract representative audio-based video descriptors by leveraging transfer learning based on a Convolutio… ▽ More In this work, we address the problem of audio-based near-duplicate video retrieval. We propose the Audio Similarity Learning (AuSiL) approach that effectively captures temporal patterns of audio similarity between video pairs. For the robust similarity calculation between two videos, we first extract representative audio-based video descriptors by leveraging transfer learning based on a Convolutional Neural Network (CNN) trained on a large scale dataset of audio events, and then we calculate the similarity matrix derived from the pairwise similarity of these descriptors. The similarity matrix is subsequently fed to a CNN network that captures the temporal structures existing within its content. We train our network following a triplet generation process and optimizing the triplet loss function. To evaluate the effectiveness of the proposed approach, we have manually annotated two publicly available video datasets based on the audio duplicity between their videos. The proposed approach achieves very competitive results compared to three state-of-the-art methods. Also, unlike the competing methods, it is very robust to the retrieval of audio duplicates generated with speed transformations. △ Less

Submitted 11 January, 2021; v1 submitted 17 October, 2020; originally announced October 2020.

arXiv:2009.00945 [pdf, other]

doi 10.1016/j.asoc.2020.106685

LAVARNET: Neural Network Modeling of Causal Variable Relationships for Multivariate Time Series Forecasting

Authors: Christos Koutlis, Symeon Papadopoulos, Manos Schinas, Ioannis Kompatsiaris

Abstract: Multivariate time series forecasting is of great importance to many scientific disciplines and industrial sectors. The evolution of a multivariate time series depends on the dynamics of its variables and the connectivity network of causal interrelationships among them. Most of the existing time series models do not account for the causal effects among the system's variables and even if they do the… ▽ More Multivariate time series forecasting is of great importance to many scientific disciplines and industrial sectors. The evolution of a multivariate time series depends on the dynamics of its variables and the connectivity network of causal interrelationships among them. Most of the existing time series models do not account for the causal effects among the system's variables and even if they do they rely just on determining the between-variables causality network. Knowing the structure of such a complex network and even more specifically knowing the exact lagged variables that contribute to the underlying process is crucial for the task of multivariate time series forecasting. The latter is a rather unexplored source of information to leverage. In this direction, here a novel neural network-based architecture is proposed, termed LAgged VAriable Representation NETwork (LAVARNET), which intrinsically estimates the importance of lagged variables and combines high dimensional latent representations of them to predict future values of time series. Our model is compared with other baseline and state of the art neural network architectures on one simulated data set and four real data sets from meteorology, music, solar activity, and finance areas. The proposed architecture outperforms the competitive architectures in most of the experiments. △ Less

Submitted 2 September, 2020; originally announced September 2020.

arXiv:2007.02135 [pdf]

doi 10.1145/3405962.3405979

Towards Semantic Detection of Smells in Cloud Infrastructure Code

Authors: Indika Kumara, Zoe Vasileiou, Georgios Meditskos, Damian A. Tamburri, Willem-Jan Van Den Heuvel, Anastasios Karakostas, Stefanos Vrochidis, Ioannis Kompatsiaris

Abstract: Automated deployment and management of Cloud applications relies on descriptions of their deployment topologies, often referred to as Infrastructure Code. As the complexity of applications and their deployment models increases, developers inadvertently introduce software smells to such code specifications, for instance, violations of good coding practices, modular structure, and more. This paper p… ▽ More Automated deployment and management of Cloud applications relies on descriptions of their deployment topologies, often referred to as Infrastructure Code. As the complexity of applications and their deployment models increases, developers inadvertently introduce software smells to such code specifications, for instance, violations of good coding practices, modular structure, and more. This paper presents a knowledge-driven approach enabling developers to identify the aforementioned smells in deployment descriptions. We detect smells with SPARQL-based rules over pattern-based OWL 2 knowledge graphs capturing deployment models. We show the feasibility of our approach with a prototype and three case studies. △ Less

Submitted 4 July, 2020; originally announced July 2020.

Comments: 5 pages, 6 figures. The 10 th International Conference on Web Intelligence, Mining and Semantics (WIMS 2020)

Journal ref: In The 10th International Conference on Web Intelligence, Mining and Semantics (WIMS 2020), June 30-July 3, 2020, Biarritz, France. ACM, New York, NY, USA, 5 pages

arXiv:2006.07084 [pdf, other]

Investigating the Impact of Pre-processing and Prediction Aggregation on the DeepFake Detection Task

Authors: Polychronis Charitidis, Giorgos Kordopatis-Zilos, Symeon Papadopoulos, Ioannis Kompatsiaris

Abstract: Recent advances in content generation technologies (widely known as DeepFakes) along with the online proliferation of manipulated media content render the detection of such manipulations a task of increasing importance. Even though there are many DeepFake detection methods, only a few focus on the impact of dataset preprocessing and the aggregation of frame-level to video-level prediction on model… ▽ More Recent advances in content generation technologies (widely known as DeepFakes) along with the online proliferation of manipulated media content render the detection of such manipulations a task of increasing importance. Even though there are many DeepFake detection methods, only a few focus on the impact of dataset preprocessing and the aggregation of frame-level to video-level prediction on model performance. In this paper, we propose a pre-processing step to improve the training data quality and examine its effect on the performance of DeepFake detection. We also propose and evaluate the effect of video-level prediction aggregation approaches. Experimental results show that the proposed pre-processing approach leads to considerable improvements in the performance of detection models, and the proposed prediction aggregation scheme further boosts the detection efficiency in cases where there are multiple faces in a video. △ Less

Submitted 19 October, 2020; v1 submitted 12 June, 2020; originally announced June 2020.

arXiv:2004.14858 [pdf, other]

MuSe 2020 -- The First International Multimodal Sentiment Analysis in Real-life Media Challenge and Workshop

Authors: Lukas Stappen, Alice Baird, Georgios Rizos, Panagiotis Tzirakis, Xinchen Du, Felix Hafner, Lea Schumann, Adria Mallol-Ragolta, Björn W. Schuller, Iulia Lefter, Erik Cambria, Ioannis Kompatsiaris

Abstract: Multimodal Sentiment Analysis in Real-life Media (MuSe) 2020 is a Challenge-based Workshop focusing on the tasks of sentiment recognition, as well as emotion-target engagement and trustworthiness detection by means of more comprehensively integrating the audio-visual and language modalities. The purpose of MuSe 2020 is to bring together communities from different disciplines; mainly, the audio-vis… ▽ More Multimodal Sentiment Analysis in Real-life Media (MuSe) 2020 is a Challenge-based Workshop focusing on the tasks of sentiment recognition, as well as emotion-target engagement and trustworthiness detection by means of more comprehensively integrating the audio-visual and language modalities. The purpose of MuSe 2020 is to bring together communities from different disciplines; mainly, the audio-visual emotion recognition community (signal-based), and the sentiment analysis community (symbol-based). We present three distinct sub-challenges: MuSe-Wild, which focuses on continuous emotion (arousal and valence) prediction; MuSe-Topic, in which participants recognise domain-specific topics as the target of 3-class (low, medium, high) emotions; and MuSe-Trust, in which the novel aspect of trustworthiness is to be predicted. In this paper, we provide detailed information on MuSe-CaR, the first of its kind in-the-wild database, which is utilised for the challenge, as well as the state-of-the-art features and modelling approaches applied. For each sub-challenge, a competitive baseline for participants is set; namely, on test we report for MuSe-Wild a combined (valence and arousal) CCC of .2568, for MuSe-Topic a score (computed as 0.34$\cdot$ UAR + 0.66$\cdot$F1) of 76.78 % on the 10-class topic and 40.64 % on the 3-class emotion prediction, and for MuSe-Trust a CCC of .4359. △ Less

Submitted 9 July, 2020; v1 submitted 30 April, 2020; originally announced April 2020.

Comments: Baseline Paper MuSe 2020, MuSe Workshop Challenge, ACM Multimedia

arXiv:2001.09762 [pdf, other]

Bias in Data-driven AI Systems -- An Introductory Survey

Authors: Eirini Ntoutsi, Pavlos Fafalios, Ujwal Gadiraju, Vasileios Iosifidis, Wolfgang Nejdl, Maria-Esther Vidal, Salvatore Ruggieri, Franco Turini, Symeon Papadopoulos, Emmanouil Krasanakis, Ioannis Kompatsiaris, Katharina Kinder-Kurlanda, Claudia Wagner, Fariba Karimi, Miriam Fernandez, Harith Alani, Bettina Berendt, Tina Kruegel, Christian Heinze, Klaus Broelemann, Gjergji Kasneci, Thanassis Tiropanis, Steffen Staab

Abstract: AI-based systems are widely employed nowadays to make decisions that have far-reaching impacts on individuals and society. Their decisions might affect everyone, everywhere and anytime, entailing concerns about potential human rights issues. Therefore, it is necessary to move beyond traditional AI algorithms optimized for predictive performance and embed ethical and legal principles in their desig… ▽ More AI-based systems are widely employed nowadays to make decisions that have far-reaching impacts on individuals and society. Their decisions might affect everyone, everywhere and anytime, entailing concerns about potential human rights issues. Therefore, it is necessary to move beyond traditional AI algorithms optimized for predictive performance and embed ethical and legal principles in their design, training and deployment to ensure social good while still benefiting from the huge potential of the AI technology. The goal of this survey is to provide a broad multi-disciplinary overview of the area of bias in AI systems, focusing on technical challenges and solutions as well as to suggest new research directions towards approaches well-grounded in a legal frame. In this survey, we focus on data-driven AI, as a large part of AI is powered nowadays by (big) data and powerful Machine Learning (ML) algorithms. If otherwise not specified, we use the general term bias to describe problems related to the gathering or processing of data that might result in prejudiced decisions on the bases of demographic features like race, sex, etc. △ Less

Submitted 14 January, 2020; originally announced January 2020.

Comments: 19 pages, 1 figure

arXiv:1908.07410 [pdf, other]

ViSiL: Fine-grained Spatio-Temporal Video Similarity Learning

Authors: Giorgos Kordopatis-Zilos, Symeon Papadopoulos, Ioannis Patras, Ioannis Kompatsiaris

Abstract: In this paper we introduce ViSiL, a Video Similarity Learning architecture that considers fine-grained Spatio-Temporal relations between pairs of videos -- such relations are typically lost in previous video retrieval approaches that embed the whole frame or even the whole video into a vector descriptor before the similarity estimation. By contrast, our Convolutional Neural Network (CNN)-based app… ▽ More In this paper we introduce ViSiL, a Video Similarity Learning architecture that considers fine-grained Spatio-Temporal relations between pairs of videos -- such relations are typically lost in previous video retrieval approaches that embed the whole frame or even the whole video into a vector descriptor before the similarity estimation. By contrast, our Convolutional Neural Network (CNN)-based approach is trained to calculate video-to-video similarity from refined frame-to-frame similarity matrices, so as to consider both intra- and inter-frame relations. In the proposed method, pairwise frame similarity is estimated by applying Tensor Dot (TD) followed by Chamfer Similarity (CS) on regional CNN frame features - this avoids feature aggregation before the similarity calculation between frames. Subsequently, the similarity matrix between all video frames is fed to a four-layer CNN, and then summarized using Chamfer Similarity (CS) into a video-to-video similarity score -- this avoids feature aggregation before the similarity calculation between videos and captures the temporal similarity patterns between matching frame sequences. We train the proposed network using a triplet loss scheme and evaluate it on five public benchmark datasets on four different video retrieval problems where we demonstrate large improvements in comparison to the state of the art. The implementation of ViSiL is publicly available. △ Less

Submitted 20 August, 2019; originally announced August 2019.

arXiv:1809.04094 [pdf, other]

doi 10.1109/TMM.2019.2905741

FIVR: Fine-grained Incident Video Retrieval

Authors: Giorgos Kordopatis-Zilos, Symeon Papadopoulos, Ioannis Patras, Ioannis Kompatsiaris

Abstract: This paper introduces the problem of Fine-grained Incident Video Retrieval (FIVR). Given a query video, the objective is to retrieve all associated videos, considering several types of associations that range from duplicate videos to videos from the same incident. FIVR offers a single framework that contains several retrieval tasks as special cases. To address the benchmarking needs of all such ta… ▽ More This paper introduces the problem of Fine-grained Incident Video Retrieval (FIVR). Given a query video, the objective is to retrieve all associated videos, considering several types of associations that range from duplicate videos to videos from the same incident. FIVR offers a single framework that contains several retrieval tasks as special cases. To address the benchmarking needs of all such tasks, we construct and present a large-scale annotated video dataset, which we call FIVR-200K, and it comprises 225,960 videos. To create the dataset, we devise a process for the collection of YouTube videos based on major news events from recent years crawled from Wikipedia and deploy a retrieval pipeline for the automatic selection of query videos based on their estimated suitability as benchmarks. We also devise a protocol for the annotation of the dataset with respect to the four types of video associations defined by FIVR. Finally, we report the results of an experimental study on the dataset comparing five state-of-the-art methods developed based on a variety of visual descriptors, highlighting the challenges of the current problem. △ Less

Submitted 24 March, 2019; v1 submitted 11 September, 2018; originally announced September 2018.

Journal ref: IEEE Transactions on Multimedia 2019

arXiv:1710.08528 [pdf, other]

A Two-Level Classification Approach for Detecting Clickbait Posts using Text-Based Features

Authors: Olga Papadopoulou, Markos Zampoglou, Symeon Papadopoulos, Ioannis Kompatsiaris

Abstract: The emergence of social media as news sources has led to the rise of clickbait posts attempting to attract users to click on article links without informing them on the actual article content. This paper presents our efforts to create a clickbait detector inspired by fake news detection algorithms, and our submission to the Clickbait Challenge 2017. The detector is based almost exclusively on text… ▽ More The emergence of social media as news sources has led to the rise of clickbait posts attempting to attract users to click on article links without informing them on the actual article content. This paper presents our efforts to create a clickbait detector inspired by fake news detection algorithms, and our submission to the Clickbait Challenge 2017. The detector is based almost exclusively on text-based features taken from previous work on clickbait detection, our own work on fake post detection, and features we designed specifically for the challenge. We use a two-level classification approach, combining the outputs of 65 first-level classifiers in a second-level feature vector. We present our exploratory results with individual features and their combinations, taken from the post text and the target article title, as well as feature selection. While our own blind tests with the dataset led to an F-score of 0.63, our final evaluation in the Challenge only achieved an F-score of 0.43. We explore the possible causes of this, and lay out potential future steps to achieve more successful results. △ Less

Submitted 23 October, 2017; originally announced October 2017.

Comments: Clickbait Challenge 2017

arXiv:1701.01142 [pdf]

The Dem@Care Experiments and Datasets: a Technical Report

Authors: Anastasios Karakostas, Alexia Briassouli, Konstantinos Avgerinakis, Ioannis Kompatsiaris, Magda Tsolaki

Abstract: The objective of Dem@Care is the development of a complete system providing personal health services to people with dementia, as well as medical professionals and caregivers, by using a multitude of sensors, for context-aware, multi-parametric monitoring of lifestyle, ambient environment, and health parameters. Multi-sensor data analysis, combined with intelligent decision making mechanisms, will… ▽ More The objective of Dem@Care is the development of a complete system providing personal health services to people with dementia, as well as medical professionals and caregivers, by using a multitude of sensors, for context-aware, multi-parametric monitoring of lifestyle, ambient environment, and health parameters. Multi-sensor data analysis, combined with intelligent decision making mechanisms, will allow an accurate representation of the person's current status and will provide the appropriate feedback, both to the person and the associated caregivers, enhancing the standard clinical workflow. Within the project framework, several data collection activities have taken place to assist technical development and evaluation tasks. In all these activities, particular attention has been paid to adhere to ethical guidelines and preserve the participants' privacy. This technical report describes shorty the (a) the main objectives of the project, (b) the main ethical principles and (c) the datasets that have been already created. △ Less

Submitted 17 December, 2016; originally announced January 2017.

Comments: 4pages 2figures

arXiv:1610.01209 [pdf, other]

Towards Air Quality Estimation Using Collected Multimodal Environmental Data

Authors: Anastasia Moumtzidou, Symeon Papadopoulos, Stefanos Vrochidis, Ioannis Kompatsiaris, Konstantinos Kourtidis, George Hloupis, Ilias Stavrakas, Konstantina Papachristopoulou, Christodoulos Keratidis

Abstract: This paper presents an open platform, which collects multimodal environmental data related to air quality from several sources including official open sources, social media and citizens. Collecting and fusing different sources of air quality data into a unified air quality indicator is a highly challenging problem, leveraging recent advances in image analysis, open hardware, machine learning and d… ▽ More This paper presents an open platform, which collects multimodal environmental data related to air quality from several sources including official open sources, social media and citizens. Collecting and fusing different sources of air quality data into a unified air quality indicator is a highly challenging problem, leveraging recent advances in image analysis, open hardware, machine learning and data fusion and is expected to result in increased geographical coverage and temporal granularity of air quality data. △ Less

Submitted 3 October, 2016; originally announced October 2016.

arXiv:1602.02057 [pdf, ps, other]

A Generalised Differential Framework for Measuring Signal Sparsity

Authors: Anastasios Maronidis, Elisavet Chatzilari, Spiros Nikolopoulos, Ioannis Kompatsiaris

Abstract: The notion of signal sparsity has been gaining increasing interest in information theory and signal processing communities. As a consequence, a plethora of sparsity metrics has been presented in the literature. The appropriateness of these metrics is typically evaluated against a set of objective criteria that has been proposed for assessing the credibility of any sparsity metric. In this paper, w… ▽ More The notion of signal sparsity has been gaining increasing interest in information theory and signal processing communities. As a consequence, a plethora of sparsity metrics has been presented in the literature. The appropriateness of these metrics is typically evaluated against a set of objective criteria that has been proposed for assessing the credibility of any sparsity metric. In this paper, we propose a Generalised Differential Sparsity (GDS) framework for generating novel sparsity metrics whose functionality is based on the concept that sparsity is encoded in the differences among the signal coefficients. We rigorously prove that every metric generated using GDS satisfies all the aforementioned criteria and we provide a computationally efficient formula that makes GDS suitable for high-dimensional signals. The great advantage of GDS is its flexibility to offer sparsity metrics that can be well-tailored to certain requirements stemming from the nature of the data and the problem to be solved. This is in contrast to current state-of-the-art sparsity metrics like Gini Index (GI), which is actually proven to be only a specific instance of GDS, demonstrating the generalisation power of our framework. In verifying our claims, we have incorporated GDS in a stochastic signal recovery algorithm and experimentally investigated its efficacy in reconstructing randomly projected sparse signals. As a result, it is proven that GDS, in comparison to GI, both loosens the bounds of the assumed sparsity of the original signals and reduces the minimum number of projected dimensions, required to guarantee an almost perfect reconstruction of heavily compressed signals. The superiority of GDS over GI in conjunction with the fact that the latter is considered as a standard in numerous scientific domains, prove the great potential of GDS as a general purpose framework for measuring sparsity. △ Less

Submitted 5 February, 2016; originally announced February 2016.

Comments: 14 pages, 4 figures, The abstract field cannot be longer than 1,920 characters: The abstract appearing here is slightly shorter than the one in the pdf

arXiv:1602.00904 [pdf, other]

Comparative evaluation of state-of-the-art algorithms for SSVEP-based BCIs

Authors: Vangelis P. Oikonomou, Georgios Liaros, Kostantinos Georgiadis, Elisavet Chatzilari, Katerina Adam, Spiros Nikolopoulos, Ioannis Kompatsiaris

Abstract: Brain-computer interfaces (BCIs) have been gaining momentum in making human-computer interaction more natural, especially for people with neuro-muscular disabilities. Among the existing solutions the systems relying on electroencephalograms (EEG) occupy the most prominent place due to their non-invasiveness. However, the process of translating EEG signals into computer commands is far from trivial… ▽ More Brain-computer interfaces (BCIs) have been gaining momentum in making human-computer interaction more natural, especially for people with neuro-muscular disabilities. Among the existing solutions the systems relying on electroencephalograms (EEG) occupy the most prominent place due to their non-invasiveness. However, the process of translating EEG signals into computer commands is far from trivial, since it requires the optimization of many different parameters that need to be tuned jointly. In this report, we focus on the category of EEG-based BCIs that rely on Steady-State-Visual-Evoked Potentials (SSVEPs) and perform a comparative evaluation of the most promising algorithms existing in the literature. More specifically, we define a set of algorithms for each of the various different parameters composing a BCI system (i.e. filtering, artifact removal, feature extraction, feature selection and classification) and study each parameter independently by keeping all other parameters fixed. The results obtained from this evaluation process are provided together with a dataset consisting of the 256-channel, EEG signals of 11 subjects, as well as a processing toolbox for reproducing the results and supporting further experimentation. In this way, we manage to make available for the community a state-of-the-art baseline for SSVEP-based BCIs that can be used as a basis for introducing novel methods and approaches. △ Less

Submitted 3 February, 2016; v1 submitted 2 February, 2016; originally announced February 2016.

arXiv:1502.01753 [pdf, other]

doi 10.1109/IJCNN.2015.7280766

Monitoring Term Drift Based on Semantic Consistency in an Evolving Vector Field

Authors: Peter Wittek, Sándor Darányi, Efstratios Kontopoulos, Theodoros Moysiadis, Ioannis Kompatsiaris

Abstract: Based on the Aristotelian concept of potentiality vs. actuality allowing for the study of energy and dynamics in language, we propose a field approach to lexical analysis. Falling back on the distributional hypothesis to statistically model word meaning, we used evolving fields as a metaphor to express time-dependent changes in a vector space model by a combination of random indexing and evolving… ▽ More Based on the Aristotelian concept of potentiality vs. actuality allowing for the study of energy and dynamics in language, we propose a field approach to lexical analysis. Falling back on the distributional hypothesis to statistically model word meaning, we used evolving fields as a metaphor to express time-dependent changes in a vector space model by a combination of random indexing and evolving self-organizing maps (ESOM). To monitor semantic drifts within the observation period, an experiment was carried out on the term space of a collection of 12.8 million Amazon book reviews. For evaluation, the semantic consistency of ESOM term clusters was compared with their respective neighbourhoods in WordNet, and contrasted with distances among term vectors by random indexing. We found that at 0.05 level of significance, the terms in the clusters showed a high level of semantic consistency. Tracking the drift of distributional patterns in the term space across time periods, we found that consistency decreased, but not at a statistically significant level. Our method is highly scalable, with interpretations in philosophy. △ Less

Submitted 5 February, 2015; originally announced February 2015.

Comments: 8 pages, 1 figure. Code used to conduct the experiments is available at https://github.com/peterwittek/concept_drifts

Journal ref: Proceedings of IJCNN-15, International Joint Conference on Neural Networks, pages 1--8, 2015

Showing 1–35 of 35 results for author: Kompatsiaris, I