Search | arXiv e-print repository

Tuning Magnetism in Ising-type van der Waals Magnet FePS3 by Lithium Intercalation

Authors: Dinesh Upreti, Rabindra Basnet, M. M. Sharma, Santosh Karki Chhetri, Gokul Acharya, Md Rafique Un Nabi, Josh Sakon, Bo Da, Mansour Mortazavi, Jin Hu

Abstract: Recently, the layered transition metal thiophosphates MPX3 (M = transition metals, X = S or Se) have gained significant attention because of their rich magnetic, optical, and electronic properties. Specifically, the diverse magnetic structures and the robustness of magnetism in the two-dimensional limit have made them prominent candidates to study two-dimensional magnetism. Numerous efforts such as substitutions and interlayer intercalations have been made to tune the properties of these materials, which has greatly deepened the understanding of the underlying mechanisms that govern the properties. In this work, we focus on modifying the magnetism of Ising-type antiferromagnet FePS3 using electrochemical lithium intercalation. Our work unveils the effectiveness of electrochemical intercalation as a controllable tool for modulating magnetism, including tuning the magnetic ordering temperature and inducing a low-temperature spin-glass state, offering an approach for implementing this material into applications.

Submitted 17 July, 2024; originally announced July 2024.

arXiv:2407.03267 [pdf]

Insulator-to-Metal Transition and Isotropic Gigantic Magnetoresistance in Layered Magnetic Semiconductors

Authors: Gokul Acharya, Bimal Neupane, Chia-Hsiu Hsu, Xian P. Yang, David Graf, Eun Sang Choi, Krishna Pandey, Md Rafique Un Nabi, Santosh Karki Chhetri, Rabindra Basnet, Sumaya Rahman, Jian Wang, Zhengxin Hu, Bo Da, Hugh Churchill, Guoqing Chang, M. Zahid Hasan, Yuanxi Wang, Jin Hu

Abstract: Magnetotransport, the response of electrical conduction to external magnetic field, acts as an important tool to reveal fundamental concepts behind exotic phenomena and plays a key role in enabling spintronic applications. Magnetotransport is generally sensitive to magnetic field orientations. In contrast, efficient and isotropic modulation of electronic transport, which is useful in technology applications such as omnidirectional sensing, is rarely seen, especially for pristine crystals. Here we propose a strategy to realize extremely strong modulation of electron conduction by magnetic field which is independent of field direction. GdPS, a layered antiferromagnetic semiconductor with resistivity anisotropies, supports a field-driven insulator-to-metal transition with a paradoxically isotropic gigantic negative magnetoresistance insensitive to magnetic field orientations. This isotropic magnetoresistance originates from the combined effects of a near-zero spin-orbit coupling of Gd3+-based half-filling f-electron system and the strong on-site f-d exchange coupling in Gd atoms. Our results not only provide a novel material system with extraordinary magnetotransport that offers a missing block for antiferromagnet-based ultrafast and efficient spintronic devices, but also demonstrate the key ingredients for designing magnetic materials with desired transport properties for advanced functionalities.

Submitted 3 July, 2024; originally announced July 2024.

Comments: 44 pages, 18 figures

arXiv:2404.15241 [pdf]

doi 10.1103/PhysRevB.109.184429

Evolution of Magnetism in Magnetic Topological Semimetal NdSb$_x$Te$_{2-x+δ}$

Authors: Santosh Karki Chhetri, Rabindra Basnet, Jian Wang, Krishna Pandey, Gokul Acharya, Md Rafique Un Nabi, Dinesh Upreti, Josh Sakon, Mansour Mortazavi, Jin Hu

Abstract: Magnetic topological semimetals LnSbTe (Ln = Lanthanide) have attracted intensive attention because of the interplay between magnetism, topology, and electron correlations depending on the choices of magnetic Ln elements. Recently, varying Sb-Te composition has been found to effectively control the electronic and magnetic states in LnSb$_x$Te$_{2-x}$. With this motivation, we report the evolution of magnetic properties with Sb-Te substitution in NdSb$_x$Te$_{2-x+δ}$. Our work reveals an interesting non-monotonic change in magnetic ordering temperature with varying composition stoichiometry. In addition, reducing the Sb content x drives the reorientation of moments from the in-plane (ab-plane) to the out-of-plane (c-axis) direction, which results in distinct magnetic structures for the two end compounds NdTe$_2$ ($x = 0$) and NdSbTe ($x = 1$). Furthermore, the moment orientation in NdSb$_x$Te$_{2-x+δ}$ is also found to be strongly tunable upon application of weak magnetic field, leading to rich magnetic phases depending on the composition stoichiometry, temperature, and magnetic field. Such strong tuning of magnetism in this material establishes it as a promising platform for investigating tunable topological states and correlated topological physics.

Submitted 23 April, 2024; originally announced April 2024.

Comments: 35 pages, 5 figures

Report number: BK14576

Journal ref: PRB 2024

arXiv:2404.02091 [pdf]

doi 10.1103/PhysRevB.109.184405

Field-induced spin polarization in lightly Cr-substituted layered antiferromagnet NiPS3

Authors: Rabindra Basnet, Dinesh Upreti, Taksh Patel, Santosh Karki Chhetri, Gokul Acharya, Md Rafique Un Nabi, Manish Mani Sharma, Josh Sakon, Mansour Mortazavi, Jin Hu

Abstract: Tuning magnetic properties in layered magnets is an important route to realize novel phenomena related to two-dimensional (2D) magnetism. Recently, tuning antiferromagnetic (AFM) properties through substitution and intercalation techniques has been widely studied in MPX3 compounds. Interesting phenomena, such as diverse AFM structures and even the signatures of ferrimagnetism, have been reported. However, long-range ferromagnetic (FM) ordering has remained elusive. In this work, we explored the magnetic properties of the previously unreported Cr-substituted NiPS3. We found that Cr substitution is extremely efficient in controlling spin orientation in NiPS3. Our study reveals a field-induced spin polarization in lightly (9%) Cr-substituted NiPS3, which is likely attributed to the attenuation of AFM interactions and magnetic anisotropy due to Cr doping. Our work provides a possible strategy to achieve FM phase in AFM MPX3, which could be useful for investigating 2D magnetism as well as potential device applications.

Submitted 2 April, 2024; originally announced April 2024.

Journal ref: Physical Review B 109, 184405 (2024)

arXiv:2402.11087 [pdf]

Distinct Composition-Dependent Topological Hall Effect in Mn2-xZnxSb

Authors: Md Rafique Un Nabi, Yue Li, Suzanne G E te Velthuis, Santosh Karki Chhetri, Dinesh Upreti, Rabindra Basnet, Gokul Acharya, Charudatta Phatak, Jin Hu

Abstract: Spintronics, an evolving interdisciplinary field at the intersection of magnetism and electronics, explores innovative applications of electron charge and spin properties for advanced electronic devices. The topological Hall effect (THE), a key component in spintronics, has gained significance due to emerging theories surrounding noncoplanar chiral spin textures. This study focuses on Mn2-xZnxSb, a material crystallizing in a centrosymmetric space group with rich magnetic phases tunable by Zn content. Through comprehensive magnetic and transport characterizations, we found that the high-Zn (x>0.6) samples display a THE that is enhanced with decreasing temperature, while the THE in the low-Zn (x<0.6) samples shows the opposite trend. The coexistence of those distinct temperature dependences for THE suggests very different magnetic interactions/structure for different compositions and underscores the strong coupling between magnetism and transport in Mn2-xZnxSb. Our findings contribute to understanding topological magnetism in centrosymmetric tetragonal lattices, establishing Mn2-xZnxSb as a unique platform for exploring tunable transport effects and opening avenues for further exploration in the realm of spintronics.

Submitted 16 February, 2024; originally announced February 2024.

Comments: 23 pages, 6 Figures, Accepted

Report number: Article ID: APXR12184

Journal ref: Advanced Physics Research, 2024

arXiv:2402.10155 [pdf]

doi 10.1002/aelm.202300738

Understanding and tuning magnetism in layered Ising-type antiferromagnet FePSe3 for potential 2D magnet

Authors: Rabindra Basnet, Taksh Patel, Jian Wang, Dinesh Upreti, Santosh Karki Chhetri, Gokul Acharya, Md Rafique Un Nabi, Josh Sakon, Jin Hu

Abstract: Recent developments in two-dimensional (2D) magnetic materials have motivated the search for new van der Waals magnetic materials, especially Ising-type magnets with strong magnetic anisotropy. Fe-based MPX3 (M = transition metal, X = chalcogen) compounds such as FePS3 and FePSe3 both exhibit an Ising-type magnetic order, but FePSe3 has received much less attention than FePS3. This work focuses on establishing a strategy to engineer magnetic anisotropy and exchange interactions in this less-explored compound. Through chalcogen and metal substitutions, the magnetic anisotropy is found to be immune against S substitution for Se whereas tunable only with heavy Mn substitution for Fe. In particular, Mn substitution leads to a continuous rotation of magnetic moments from the out-of-plane direction towards in-plane. Furthermore, the magnetic ordering temperature displays non-monotonic doping dependence for both chalcogen and metal substitutions but due to different mechanisms. These findings provide deeper insight into the Ising-type magnetism in this important van der Waals material, shedding light on the study of other Ising-type magnetic systems as well as discovering novel 2D magnets for potential applications in spintronics.

Submitted 15 February, 2024; originally announced February 2024.

Journal ref: Adv. Electron. Mater. 2024, 2300738

arXiv:2401.08491 [pdf, other]

Contrastive Perplexity for Controlled Generation: An Application in Detoxifying Large Language Models

Authors: Tassilo Klein, Moin Nabi

Abstract: The generation of undesirable and factually incorrect content of large language models poses a significant challenge and remains largely an unsolved issue. This paper studies the integration of a contrastive learning objective for fine-tuning LLMs for implicit knowledge editing and controlled text generation. Optimizing the training objective entails aligning text perplexities in a contrastive fashion. To facilitate training the model in a self-supervised fashion, we leverage an off-the-shelf LLM for training data generation. We showcase applicability in the domain of detoxification. Herein, the proposed approach leads to a significant decrease in the generation of toxic content while preserving general utility for downstream tasks such as commonsense reasoning and reading comprehension. The proposed approach is conceptually simple but empirically powerful.

Submitted 24 January, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

arXiv:2312.15371 [pdf, other]

New three-dimensional dispersion in the type-II Dirac semimetals PtTe$_2$ and PdTe$_2$ revealed through Angle Resolved Photoemission Spectroscopy

Authors: Ivan Pelayo, Derek Bergner, Archibald J. Williams, Jiayuwen Qi, Penghao Zhu, Mahfuzun Nabi, Warren L. B. Huey, Luca Moreschini, Ziling Deng, Jonathan Denlinger, Alessandra Lanzara, Yuan-Ming Lu, Wolfgang Windl, Joshua Goldberger, Claudia Ojeda-Aristizabal

Abstract: PtTe$_2$ and PdTe$_2$ are among the first transition metal dichalcogenides that were predicted to host type-II Dirac fermions, exotic particles prohibited in free space. These materials are layered and air-stable, which makes them top candidates for technological applications that take advantage of their anisotropic magnetotransport properties. Here, we provide a detailed characterization of the electronic structure of PtTe$_2$ and PdTe$_2$ using Angle Resolved Photoemission Spectroscopy (ARPES) and Density Functional Theory (DFT) calculations, unveiling a new three-dimensional dispersion in these materials. Through the use of circularly polarized light, we report a different behavior of such dispersion in PdTe$_2$ compared to PtTe$_2$, that we relate to a symmetry analysis of the dipole matrix element. Such analysis reveals a link between the observed circular dichroism and the different momentum-dependent terms in the dispersion of these two compounds, despite their close similarity in crystal structure. Additionally, our data shows a clear difference in the circular dichroic signal for the type-II Dirac cones characteristic of these materials, compared to their topologically protected surface states.
Our work provides a useful reference for the ARPES characterization of other transition metal dichalcogenides with topological properties and illustrates the use of circular dichroism as a guide to identify the topological character of two otherwise equivalent band dispersions, and to recognize different attributes in the band structure of similar materials.

Submitted 16 May, 2024; v1 submitted 23 December, 2023; originally announced December 2023.

Comments: 14 pages, 9 figures

arXiv:2310.15309 [pdf, other]

Lead-free Magnetic Double Perovskites for Photovoltaic and Photocatalysis Applications

Authors: Muskan Nabi, Sanika S. Padelkar, Jacek J. Jasieniak, Alexandr N. Simonov, Aftab Alam

Abstract: The magnetic spin degrees of freedom in magnetic materials serve as an additional capability to tune materials properties, thereby invoking magneto-optical response. Herein, we report the magneto-optoelectronic properties of a family of lead-free magnetic double perovskites Cs$_{2}$AgTX$_{6}$ (T = Sc, Ti, V, Cr, Mn, Fe, Co, Ni, Cu; X = Cl, Br, I). This turns out to provide an extremely fertile series, giving rise to potential candidate materials for photovoltaic (PV) applications. In conjunction with high absorption coefficient and high simulated power conversion efficiency for PV applications, a few compounds in this series exhibit novel magnetic character useful for spintronic applications. The interaction between magnetism and light can have far-reaching results on the photovoltaic properties as a consequence of the shift in the defect energy levels due to the Zeeman effect. This subsequently affects the recombination rate of minority carriers, and hence the photoconversion efficiency. Moreover, the distinct ferromagnetic and anti-ferromagnetic ordering driven by hybridization and super-exchange mechanism can play a significant role in breaking the time-reversal and/or inversion symmetry. Such a coalescence of magnetism and efficient optoelectronic response has the potential to trigger magnetic/spin anomalous photovoltaic (non-linear optical) effect in this Cs$_{2}$AgTX$_{6}$ family. These insights can thus channelize the advancement of lead-free double perovskites in magnetic/spin anomalous photovoltaic field as well.

Submitted 23 October, 2023; originally announced October 2023.

Comments: 9 pages, 5 figures, 1 table

arXiv:2308.10764 [pdf]

doi 10.1016/j.actamat.2023.119251

Coupling Between Magnetic and Transport Properties in Magnetic Layered Material Mn2-xZnxSb

Authors: Md Rafique Un Nabi, Rabindra Basnet, Krishna Pandey, Santosh Karki Chhetri, Dinesh Upreti, Gokul Acharya, Fei Wang, Arash Fereidouni, Hugh O. H. Churchill, Yingdong Guan, Zhiqiang Mao, Jin Hu

Abstract: We synthesized single crystals of Mn2-xZnxSb and studied their magnetic and electronic transport properties. This material system displays rich magnetic phases tunable with temperature and Zn composition. In addition, two groups of distinct magnetic and electronic properties, separated by a critical Zn composition of x = 0.6, are discovered. The Zn-less samples are metallic and characterized by a resistivity jump at the magnetic ordering temperature, while the Zn-rich samples lose metallicity and show a metal-to-insulator transition-like feature tunable by magnetic field. Our findings establish Mn2-xZnxSb as a promising material platform that offers opportunities to study how the coupling of spin, charge, and lattice degrees of freedom governs interesting transport properties in 2D magnets, which is currently a topic of broad interest.

Submitted 21 August, 2023; originally announced August 2023.

Comments: 23 pages, 5 figures, Figures are at the end of the manuscript

Journal ref: Acta Materialia (2023): 119251

arXiv:2306.07483 [pdf, other]

Semi-supervised learning made simple with self-supervised clustering

Authors: Enrico Fini, Pietro Astolfi, Karteek Alahari, Xavier Alameda-Pineda, Julien Mairal, Moin Nabi, Elisa Ricci

Abstract: Self-supervised learning models have been shown to learn rich visual representations without requiring human annotations. However, in many real-world scenarios, labels are partially available, motivating a recent line of work on semi-supervised methods inspired by self-supervised principles. In this paper, we propose a conceptually simple yet empirically powerful approach to turn clustering-based self-supervised methods such as SwAV or DINO into semi-supervised learners. More precisely, we introduce a multi-task framework merging a supervised objective using ground-truth labels and a self-supervised objective relying on clustering assignments with a single cross-entropy loss. This approach may be interpreted as imposing the cluster centroids to be class prototypes. Despite its simplicity, we provide empirical evidence that our approach is highly effective and achieves state-of-the-art performance on CIFAR100 and ImageNet.

Submitted 12 June, 2023; originally announced June 2023.

Comments: CVPR 2023 - Code available at https://github.com/pietroastolfi/suave-daino

Journal ref: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2023) 3187-3197

arXiv:2212.05102 [pdf, other]

A soft nearest-neighbor framework for continual semi-supervised learning

Authors: Zhiqi Kang, Enrico Fini, Moin Nabi, Elisa Ricci, Karteek Alahari

Abstract: Despite significant advances, the performance of state-of-the-art continual learning approaches hinges on the unrealistic scenario of fully labeled data. In this paper, we tackle this challenge and propose an approach for continual semi-supervised learning--a setting where not all the data samples are labeled. A primary issue in this scenario is the model forgetting representations of unlabeled data and overfitting the labeled samples. We leverage the power of nearest-neighbor classifiers to nonlinearly partition the feature space and flexibly model the underlying data distribution thanks to its non-parametric nature. This enables the model to learn a strong representation for the current task, and distill relevant information from previous tasks. We perform a thorough experimental evaluation and show that our method outperforms all the existing approaches by large margins, setting a solid state of the art on the continual semi-supervised learning paradigm. For example, on CIFAR-100 we surpass several others even when using at least 30 times less supervision (0.8% vs. 25% of annotations). Finally, our method works well on both low and high resolution images and scales seamlessly to more complex datasets such as ImageNet-100. The code is publicly available on https://github.com/kangzhiq/NNCSL

Submitted 11 September, 2023; v1 submitted 9 December, 2022; originally announced December 2022.

Comments: Accepted at ICCV 2023

arXiv:2211.04928 [pdf, other]

miCSE: Mutual Information Contrastive Learning for Low-shot Sentence Embeddings

Authors: Tassilo Klein, Moin Nabi

Abstract: This paper presents miCSE, a mutual information-based contrastive learning framework that significantly advances the state-of-the-art in few-shot sentence embedding. The proposed approach imposes alignment between the attention pattern of different views during contrastive learning. Learning sentence embeddings with miCSE entails enforcing the structural consistency across augmented views for every sentence, making contrastive self-supervised learning more sample efficient. As a result, the proposed approach shows strong performance in the few-shot learning domain. While it achieves superior results compared to state-of-the-art methods on multiple benchmarks in few-shot learning, it is comparable in the full-shot scenario. This study opens up avenues for efficient self-supervised learning methods that are more robust than current contrastive methods for sentence embedding.

Submitted 23 May, 2023; v1 submitted 9 November, 2022; originally announced November 2022.

Comments: Accepted to ACL 2023

arXiv:2204.05229 [pdf, other]

Mixture-of-experts VAEs can disregard variation in surjective multimodal data

Authors: Jannik Wolff, Tassilo Klein, Moin Nabi, Rahul G. Krishnan, Shinichi Nakajima

Abstract: Machine learning systems are often deployed in domains that entail data from multiple modalities, for example, phenotypic and genotypic characteristics describe patients in healthcare. Previous works have developed multimodal variational autoencoders (VAEs) that generate several modalities. We consider surjective data, where single datapoints from one modality (such as class labels) describe multiple datapoints from another modality (such as images). We theoretically and empirically demonstrate that multimodal VAEs with a mixture of experts posterior can struggle to capture variability in such surjective data.

Submitted 11 April, 2022; originally announced April 2022.

Comments: Accepted at the NeurIPS 2021 workshop on Bayesian Deep Learning

arXiv:2203.14098 [pdf, other]

doi 10.1109/TPAMI.2022.3163806

Uncertainty-aware Contrastive Distillation for Incremental Semantic Segmentation

Authors: Guanglei Yang, Enrico Fini, Dan Xu, Paolo Rota, Mingli Ding, Moin Nabi, Xavier Alameda-Pineda, Elisa Ricci

Abstract: A fundamental and challenging problem in deep learning is catastrophic forgetting, i.e. the tendency of neural networks to fail to preserve the knowledge acquired from old tasks when learning new tasks. This problem has been widely investigated in the research community and several Incremental Learning (IL) approaches have been proposed in the past years. While earlier works in computer vision have mostly focused on image classification and object detection, more recently some IL approaches for semantic segmentation have been introduced. These previous works showed that, despite its simplicity, knowledge distillation can be effectively employed to alleviate catastrophic forgetting. In this paper, we follow this research direction and, inspired by recent literature on contrastive learning, we propose a novel distillation framework, Uncertainty-aware Contrastive Distillation (UCD). In a nutshell, UCD operates by introducing a novel distillation loss that takes into account all the images in a mini-batch, enforcing similarity between features associated to all the pixels from the same classes, and pulling apart those corresponding to pixels from different classes. In order to mitigate catastrophic forgetting, we contrast features of the new model with features extracted by a frozen model learned at the previous incremental step.
Our experimental results demonstrate the advantage of the proposed distillation technique, which can be used in synergy with previous IL approaches, and leads to state-of-the-art performance on three commonly adopted benchmarks for incremental semantic segmentation. The code is available at https://github.com/ygjwd12345/UCD.

Submitted 20 May, 2022; v1 submitted 26 March, 2022; originally announced March 2022.

Comments: TPAMI

arXiv:2203.07847 [pdf, other]

SCD: Self-Contrastive Decorrelation for Sentence Embeddings

Authors: Tassilo Klein, Moin Nabi

Abstract: In this paper, we propose Self-Contrastive Decorrelation (SCD), a self-supervised approach. Given an input sentence, it optimizes a joint self-contrastive and decorrelation objective. Learning a representation is facilitated by leveraging the contrast arising from the instantiation of standard dropout at different rates. The proposed method is conceptually simple yet empirically powerful. It achieves comparable results with state-of-the-art methods on multiple benchmarks without using contrastive pairs. This study opens up avenues for efficient self-supervised learning methods that are more robust than current contrastive methods.

Submitted 15 March, 2022; originally announced March 2022.

Comments: To appear at ACL 2022

arXiv:2112.00805 [pdf, other]

doi 10.1103/PhysRevB.104.174419

Giant topological Hall effect in centrosymmetric tetragonal Mn2-xZnxSb

Authors: Md Rafique Un Nabi, Aaron Wegner, Fei Wang, Yanglin Zhu, Yingdong Guan, Arash Fereidouni, Krishna Pandey, Rabindra Basnet, Gokul Acharya, Hugh O. H. Churchill, Zhiqiang Mao, Jin Hu

Abstract: Topological magnetism typically appears in noncentrosymmetric compounds or compounds with geometric frustration. Here we report the effective tuning of magnetism in centrosymmetric tetragonal Mn2-xZnxSb by Zn substitution. The magnetism is found to be closely coupled to the transport properties, giving rise to a very large topological Hall effect that is possibly associated with topological magnetism. Our finding suggests that Mn2-xZnxSb is a candidate material for a centrosymmetric tetragonal topological magnetic system, offering opportunities for studying and tuning spin textures and developing near-room-temperature spin-based devices.

Submitted 1 December, 2021; originally announced December 2021.

Comments: 15 pages, 4 Figures and already published in Physical Review B

Journal ref: Physical Review B 104, 174419 (2021)

arXiv:2109.05108 [pdf, other]

Attention-based Contrastive Learning for Winograd Schemas

Authors: Tassilo Klein, Moin Nabi

Abstract: Self-supervised learning has recently attracted considerable attention in the NLP community for its ability to learn discriminative features using a contrastive objective. This paper investigates whether contrastive learning can be extended to Transformer attention to tackle the Winograd Schema Challenge. To this end, we propose a novel self-supervised framework, leveraging a contrastive loss directly at the level of self-attention. Experimental analysis of our attention-based models on multiple datasets demonstrates superior commonsense reasoning capabilities. The proposed approach outperforms all comparable unsupervised approaches while occasionally surpassing supervised ones.

Submitted 10 September, 2021; originally announced September 2021.

Comments: To appear at EMNLP 2021 (findings)

arXiv:2109.05105 [pdf, other]

Towards Zero-shot Commonsense Reasoning with Self-supervised Refinement of Language Models

Authors: Tassilo Klein, Moin Nabi

Abstract: Can we get existing language models and refine them for zero-shot commonsense reasoning? This paper presents an initial study exploring the feasibility of zero-shot commonsense reasoning for the Winograd Schema Challenge by formulating the task as self-supervised refinement of a pre-trained language model. In contrast to previous studies that rely on fine-tuning annotated datasets, we seek to boost conceptualization via loss landscape refinement. To this end, we propose a novel self-supervised learning approach that refines the language model utilizing a set of linguistic perturbations of similar concept relationships. Empirical analysis of our conceptually simple framework demonstrates the viability of zero-shot commonsense reasoning on multiple benchmarks.

Submitted 10 September, 2021; originally announced September 2021.

Comments: To appear at EMNLP 2021

arXiv:2108.10983 [pdf]

doi 10.1002/qute.202100063

Magnetic Topological Semimetal Phase with Electronic Correlation Enhancement in SmSbTe

Authors: Krishna Pandey, Debashis Mondal, John William Villanova, Joseph Roll, Rabindra Basnet, Aaron Wegner, Gokul Acharya, Md Rafique Un Nabi, Barun Ghosh, Jun Fujii, Jian Wang, Bo Da, Amit Agarwal, Ivana Vobornik, Antonio Politano, Salvador Barraza-Lopez, Jin Hu

Abstract: The ZrSiS family of compounds hosts various exotic quantum phenomena due to the presence of both topological nonsymmorphic Dirac fermions and nodal-line fermions. In this material family, the LnSbTe (Ln= lanthanide) compounds are particularly interesting owing to the intrinsic magnetism from magnetic Ln which leads to new properties and quantum states. In this work, the authors focus on the previously unexplored compound SmSbTe. The studies reveal a rare combination of a few functional properties in this material, including antiferromagnetism with possible magnetic frustration, electron correlation enhancement, and Dirac nodal-line fermions. These properties enable SmSbTe as a unique platform to explore exotic quantum phenomena and advanced functionalities arising from the interplay between magnetism, topology, and electronic correlations.

Submitted 24 August, 2021; originally announced August 2021.

Comments: 23 pages, 5 figures

Journal ref: Adv. Quantum Technol. 2021, 2100063

arXiv:2108.09632 [pdf, other]

Mathematical Analysis of Modified BEM-FEM Coupling Approach for 3D Electromagnetic Levitation Problem

Authors: Sayan Sarkar, Amit Jena, Mashuq un Nabi

Abstract: In electromagnetic analysis, the finite element and boundary element methods, jointly known as 'FEM-BEM coupling', are applied to numerically solve levitation problems based on eddy currents. The main focus behind this coupled analysis method is to determine the dynamic characteristic of the levitating body in the presence of a magnetic field. An innovative 3D structure is developed that couples Lagrangian description and BEM-FEM coupling method for this purpose. The coupling methodology is based on the boundary conditions on the common boundaries between FEM and BEM sub-domains. Subsequent coding has been developed to simulate the problem in the MATLAB environment. An example similar to TEAM (Testing Electromagnetic Analysis Methods) workshop problem 28 has been used to study the efficiency of code for computationally inexpensive analysis.

Submitted 22 August, 2021; originally announced August 2021.

Comments: 9 pages, multiple figures

arXiv:2108.08536 [pdf, other]

A Unified Objective for Novel Class Discovery

Authors: Enrico Fini, Enver Sangineto, Stéphane Lathuilière, Zhun Zhong, Moin Nabi, Elisa Ricci

Abstract: In this paper, we study the problem of Novel Class Discovery (NCD). NCD aims at inferring novel object categories in an unlabeled set by leveraging prior knowledge of a labeled set containing different, but related classes. Existing approaches tackle this problem by considering multiple objective functions, usually involving specialized loss terms for the labeled and the unlabeled samples respectively, and often requiring auxiliary regularization terms. In this paper, we depart from this traditional scheme and introduce a UNified Objective function (UNO) for discovering novel classes, with the explicit purpose of favoring synergy between supervised and unsupervised learning. Using a multi-view self-labeling strategy, we generate pseudo-labels that can be treated homogeneously with ground truth labels. This leads to a single classification objective operating on both known and unknown classes. Despite its simplicity, UNO outperforms the state of the art by a significant margin on several benchmarks (~+10% on CIFAR-100 and +8% on ImageNet). The project page is available at: https://ncd-uno.github.io.

Submitted 29 September, 2021; v1 submitted 19 August, 2021; originally announced August 2021.

Comments: ICCV 2021 (Oral)

arXiv:2108.01775 [pdf, other]

Solo-learn: A Library of Self-supervised Methods for Visual Representation Learning

Authors: Victor G. Turrisi da Costa, Enrico Fini, Moin Nabi, Nicu Sebe, Elisa Ricci

Abstract: This paper presents solo-learn, a library of self-supervised methods for visual representation learning. Implemented in Python, using PyTorch and PyTorch Lightning, the library fits both research and industry needs by featuring distributed training pipelines with mixed-precision, faster data loading via NVIDIA DALI, online linear evaluation for better prototyping, and many additional training tricks. Our goal is to provide an easy-to-use library comprising a large number of Self-supervised Learning (SSL) methods that can be easily extended and fine-tuned by the community. solo-learn opens up avenues for exploiting large-budget SSL solutions on inexpensive smaller infrastructures and seeks to democratize SSL by making it accessible to all. The source code is available at https://github.com/vturrisi/solo-learn.

Submitted 4 February, 2022; v1 submitted 3 August, 2021; originally announced August 2021.

Comments: Accepted to JMLR

arXiv:2103.12837 [pdf]

An Approach for the Automation of IaaS Cloud Upgrade

Authors: Mina Nabi, Ferhat Khendek, Maria Toeroe

Abstract: An Infrastructure as a Service (IaaS) cloud provider is committed to each tenant by a service level agreement (SLA) which indicates the terms of commitment, e.g. the level of availability of the IaaS cloud service. The different resources providing this IaaS cloud service may need to be upgraded several times throughout their life-cycle, and these upgrades may affect the service delivered by the IaaS layer. This may violate the SLAs towards the tenants and result in penalties, as the upgrades impact the tenant services relying on the IaaS. Therefore, it is important to handle upgrades properly with respect to the SLAs. The upgrade of IaaS cloud systems inherits all the challenges of clustered systems and faces other, cloud-specific challenges, such as size and dynamicity due to elasticity. In this paper, we propose a novel approach to automatically upgrade an IaaS cloud system under SLA constraints such as availability and elasticity. In this approach, the upgrade methods and actions appropriate for each upgrade request are identified, scheduled, and applied automatically in an iterative manner based on the vendors' descriptions of the infrastructure components, the tenant SLAs, and the status of the system. The proposed approach allows new upgrade requests during ongoing upgrades, which makes it suitable for continuous delivery. In addition, it also handles failures of upgrade actions through localized retry and undo operations automatically.

Submitted 23 March, 2021; originally announced March 2021.

Comments: 20 pages, 11 figures

MSC Class: 90B25 ACM Class: D.2.9; D.4.5

arXiv:2011.08899 [pdf, other]

Multimodal Prototypical Networks for Few-shot Learning

Authors: Frederik Pahde, Mihai Puscas, Tassilo Klein, Moin Nabi

Abstract: Although providing exceptional results for many computer vision tasks, state-of-the-art deep learning algorithms catastrophically struggle in low data scenarios. However, if data in additional modalities exist (e.g. text) this can compensate for the lack of data and improve the classification results. To overcome this data scarcity, we design a cross-modal feature generation framework capable of enriching the low populated embedding space in few-shot scenarios, leveraging data from the auxiliary modality. Specifically, we train a generative model that maps text data into the visual feature space to obtain more reliable prototypes. This allows exploiting data from additional modalities (e.g. text) during training while the ultimate task at test time remains classification with exclusively visual data. We show that in such cases nearest neighbor classification is a viable approach and outperforms state-of-the-art single-modal and multimodal few-shot learning methods on the CUB-200 and Oxford-102 datasets.

Submitted 17 November, 2020; originally announced November 2020.

Comments: To appear at WACV 2021

arXiv:2010.11369 [pdf, other]

Learning Graph-Based Priors for Generalized Zero-Shot Learning

Authors: Colin Samplawski, Jannik Wolff, Tassilo Klein, Moin Nabi

Abstract: The task of zero-shot learning (ZSL) requires correctly predicting the label of samples from classes which were unseen at training time. This is achieved by leveraging side information about class labels, such as label attributes or word embeddings. Recently, attention has shifted to the more realistic task of generalized ZSL (GZSL) where test sets consist of seen and unseen samples. Recent approaches to GZSL have shown the value of generative models, which are used to generate samples from unseen classes. In this work, we incorporate an additional source of side information in the form of a relation graph over labels. We leverage this graph in order to learn a set of prior distributions, which encourage an aligned variational autoencoder (VAE) model to learn embeddings which respect the graph structure. Using this approach we are able to achieve improved performance on the CUB and SUN benchmarks over a strong baseline.

Submitted 21 October, 2020; originally announced October 2020.

Comments: Presented at AAAI 2020 Workshop on Deep Learning on Graphs: Methodologies and Applications (DLGMA'20)

arXiv:2008.01510 [pdf, other]

Online Continual Learning under Extreme Memory Constraints

Authors: Enrico Fini, Stéphane Lathuilière, Enver Sangineto, Moin Nabi, Elisa Ricci

Abstract: Continual Learning (CL) aims to develop agents emulating the human ability to sequentially learn new tasks while being able to retain knowledge obtained from past experiences. In this paper, we introduce the novel problem of Memory-Constrained Online Continual Learning (MC-OCL) which imposes strict constraints on the memory overhead that a possible algorithm can use to avoid catastrophic forgetting. As most, if not all, previous CL methods violate these constraints, we propose an algorithmic solution to MC-OCL: Batch-level Distillation (BLD), a regularization-based CL approach, which effectively balances stability and plasticity in order to learn from data streams, while preserving the ability to solve old tasks through distillation. Our extensive experimental evaluation, conducted on three publicly available benchmarks, empirically demonstrates that our approach successfully addresses the MC-OCL problem and achieves comparable accuracy to prior distillation methods requiring higher memory overhead.

Submitted 12 January, 2022; v1 submitted 4 August, 2020; originally announced August 2020.

Comments: ECCV 2020

arXiv:2006.05536 [pdf]

doi 10.1103/PhysRevB.101.235161

Electronic and Magnetic Properties of Topological Semimetal Candidate NdSbTe

Authors: Krishna Pandey, Rabindra Basnet, Aaron Wegner, Gokul Acharya, Md Rafique Un Nabi, Jiangwei Liu, Jian Wang, Yukiko Takahashi, Bo Da, Jin Hu

Abstract: ZrSiS-type materials represent a large material family with unusual coexistence of topological nonsymmorphic Dirac fermions and nodal-line fermions. As a special group of ZrSiS-family, LnSbTe (Ln = Lanthanide rare earth) compounds provide a unique opportunity to explore new quantum phases due to the intrinsic magnetism induced by Ln. Here we report the single crystal growth and characterization of NdSbTe, a previously unexplored LnSbTe compound. NdSbTe has an antiferromagnetic ground state with field-driven metamagnetic transitions similar to other known LnSbTe, but exhibits distinct enhanced electronic correlations characterized by a large Sommerfeld coefficient of 115 mJ/mol K$^2$, which is the highest among the known LnSbTe compounds. Furthermore, our transport studies have revealed the coupling with magnetism and signatures of Kondo localization. All these findings establish NdSbTe as a new platform for observing novel phenomena arising from the interplay between magnetism, topology, and electron correlations.

Submitted 9 June, 2020; originally announced June 2020.

Journal ref: Phys. Rev. B 101, 235161 (2020)

arXiv:2005.00669 [pdf, other]

Contrastive Self-Supervised Learning for Commonsense Reasoning

Authors: Tassilo Klein, Moin Nabi

Abstract: We propose a self-supervised method to solve Pronoun Disambiguation and Winograd Schema Challenge problems. Our approach exploits the characteristic structure of training corpora related to so-called "trigger" words, which are responsible for flipping the answer in pronoun disambiguation. We achieve such commonsense reasoning by constructing pair-wise contrastive auxiliary predictions. To this end, we leverage a mutual exclusive loss regularized by a contrastive margin. Our architecture is based on the recently introduced transformer networks, BERT, that exhibits strong performance on many NLP benchmarks. Empirical results show that our method alleviates the limitation of current supervised approaches for commonsense reasoning. This study opens up avenues for exploiting inexpensive self-supervision to achieve performance gain in commonsense reasoning tasks.

Submitted 1 May, 2020; originally announced May 2020.

Comments: To appear at ACL2020

arXiv:1912.05396 [pdf, other]

Multimodal Self-Supervised Learning for Medical Image Analysis

Authors: Aiham Taleb, Christoph Lippert, Tassilo Klein, Moin Nabi

Abstract: Self-supervised learning approaches leverage unlabeled samples to acquire generic knowledge about different concepts, hence allowing for annotation-efficient downstream task learning. In this paper, we propose a novel self-supervised method that leverages multiple imaging modalities. We introduce the multimodal puzzle task, which facilitates rich representation learning from multiple image modalities. The learned representations allow for subsequent fine-tuning on different downstream tasks. To achieve that, we learn a modality-agnostic feature embedding by confusing image modalities at the data-level. Together with the Sinkhorn operator, with which we formulate the puzzle solving optimization as permutation matrix inference instead of classification, they allow for efficient solving of multimodal puzzles with varying levels of complexity. In addition, we also propose to utilize cross-modal generation techniques for multimodal data augmentation used for training self-supervised tasks. In other words, we exploit synthetic images for self-supervised pretraining, instead of downstream tasks directly, in order to circumvent quality issues associated with synthetic images, while improving data-efficiency and representations quality. Our experimental results, which assess the gains in downstream performance and data-efficiency, show that solving our multimodal puzzles yields better semantic representations, compared to treating each modality independently. Our results also highlight the benefits of exploiting synthetic images for self-supervised pretraining.
We showcase our approach on four downstream tasks: Brain tumor segmentation and survival days prediction using four MRI modalities, Prostate segmentation using two MRI modalities, and Liver segmentation using unregistered CT and MRI modalities. We outperform many previous solutions, and achieve results competitive to state-of-the-art.

Submitted 25 October, 2020; v1 submitted 11 December, 2019; originally announced December 2019.

Comments: NeurIPS 2019 Workshops

arXiv:1912.00200 [pdf, other]

Pruning at a Glance: Global Neural Pruning for Model Compression

Authors: Abdullah Salama, Oleksiy Ostapenko, Tassilo Klein, Moin Nabi

Abstract: Deep Learning models have become the dominant approach in several areas due to their high performance. Unfortunately, the size and hence computational requirements of operating such models can be considerably high. Therefore, this constitutes a limitation for deployment on memory and battery constrained devices such as mobile phones or embedded systems. To address these limitations, we propose a novel and simple pruning method that compresses neural networks by removing entire filters and neurons according to a global threshold across the network without any pre-calculation of layer sensitivity. The resulting model is compact, non-sparse, with the same accuracy as the non-compressed model, and most importantly requires no special infrastructure for deployment. We prove the viability of our method by producing highly compressed models, namely VGG-16, ResNet-56, and ResNet-110 respectively on CIFAR10 without losing any performance compared to the baseline, as well as ResNet-34 and ResNet-50 on ImageNet without a significant loss of accuracy. We also provide a well-retrained 30% compressed ResNet-50 that slightly surpasses the base model accuracy. Additionally, we compress more than 56% and 97% of AlexNet and LeNet-5, respectively. Interestingly, the resulting models' pruning patterns are highly similar to those of other methods that use a layer sensitivity pre-calculation step. Our method not only exhibits good performance but is also easy to implement.

Submitted 3 December, 2019; v1 submitted 30 November, 2019; originally announced December 2019.

Comments: Extended version of the ICASSP paper (https://ieeexplore.ieee.org/document/8683224)

arXiv:1911.02365 [pdf, other]

Learning to Answer by Learning to Ask: Getting the Best of GPT-2 and BERT Worlds

Authors: Tassilo Klein, Moin Nabi

Abstract: Automatic question generation aims at the generation of questions from a context, with the corresponding answers being sub-spans of the given passage. Whereas most methods rely on heuristic rules to generate questions, neural network approaches have been proposed more recently. In this work, we propose a variant of the self-attention Transformer network architectures model to generate meaningful and diverse questions. To this end, we propose an easy-to-use model consisting of the conjunction of the Transformer decoder GPT-2 model with Transformer encoder BERT for the downstream task for question answering. The model is trained in an end-to-end fashion, where the language model is trained to produce a question-answer-aware input representation that facilitates generating an answer-focused question. Our result of neural question generation from text on the SQuAD 1.1 dataset suggests that our method can produce semantically correct and diverse questions. Additionally, we assessed the performance of our proposed method for the downstream task of question answering. The analysis shows that our proposed generation & answering collaboration framework relatively improves both tasks and is particularly powerful in the semi-supervised setup. The results further suggest a robust and comparably lean pipeline facilitating question generation in the small-data regime.

Submitted 6 November, 2019; originally announced November 2019.

arXiv:1909.00626 [pdf, other]

Uncertainty-Driven Semantic Segmentation through Human-Machine Collaborative Learning

Authors: Mahdyar Ravanbakhsh, Tassilo Klein, Kayhan Batmanghelich, Moin Nabi

Abstract: Deep learning-based approaches achieve state-of-the-art performance in the majority of image segmentation benchmarks. However, training of such models requires a sizable amount of manual annotations. In order to reduce this effort, we propose a method based on conditional Generative Adversarial Network (cGAN), which addresses segmentation in a semi-supervised setup and in a human-in-the-loop fashion. More specifically, we use the discriminator to identify unreliable slices for which expert annotation is required and use the generator in the GAN to synthesize segmentations on unlabeled data for which the model is confident. The quantitative results on a conventional standard benchmark show that our method is comparable with the state-of-the-art fully supervised methods in slice-level evaluation requiring far less annotated data.

Submitted 2 September, 2019; originally announced September 2019.

Comments: MIDL 2019 [arXiv:1907.08612]

Report number: MIDL/2019/ExtendedAbstract/rkgnwY04cV

arXiv:1905.13497 [pdf, other]

Attention Is (not) All You Need for Commonsense Reasoning

Authors: Tassilo Klein, Moin Nabi

Abstract: The recently introduced BERT model exhibits strong performance on several language understanding benchmarks. In this paper, we describe a simple re-implementation of BERT for commonsense reasoning. We show that the attentions produced by BERT can be directly utilized for tasks such as the Pronoun Disambiguation Problem and Winograd Schema Challenge. Our proposed attention-guided commonsense reasoning method is conceptually simple yet empirically powerful. Experimental analysis on multiple datasets demonstrates that our proposed system performs remarkably well on all cases while outperforming the previously reported state of the art by a margin. While results suggest that BERT seems to implicitly learn to establish complex relationships between entities, solving commonsense reasoning tasks might require more than unsupervised models learned from huge text corpora.

Submitted 31 May, 2019; originally announced May 2019.

Comments: to appear at ACL 2019

arXiv:1905.06242 [pdf, other]

doi 10.1109/ICCV.2019.00047

Budget-Aware Adapters for Multi-Domain Learning

Authors: Rodrigo Berriel, Stéphane Lathuilière, Moin Nabi, Tassilo Klein, Thiago Oliveira-Santos, Nicu Sebe, Elisa Ricci

Abstract: Multi-Domain Learning (MDL) refers to the problem of learning a set of models derived from a common deep architecture, each one specialized to perform a task in a certain domain (e.g., photos, sketches, paintings). This paper tackles MDL with a particular interest in obtaining domain-specific models with an adjustable budget in terms of the number of network parameters and computational complexity. Our intuition is that, as in real applications the number of domains and tasks can be very large, an effective MDL approach should not only focus on accuracy but also on having as few parameters as possible. To implement this idea we derive specialized deep models for each domain by adapting a pre-trained architecture but, differently from other methods, we propose a novel strategy to automatically adjust the computational complexity of the network. To this aim, we introduce Budget-Aware Adapters that select the most relevant feature channels to better handle data from a novel domain. Some constraints on the number of active switches are imposed in order to obtain a network respecting the desired complexity budget. Experimentally, we show that our approach leads to recognition accuracy competitive with state-of-the-art approaches but with much lighter networks both in terms of storage and computation.

Submitted 8 December, 2020; v1 submitted 15 May, 2019; originally announced May 2019.

Comments: ICCV 2019

arXiv:1904.03137 [pdf, other]

Learning to Remember: A Synaptic Plasticity Driven Framework for Continual Learning

Authors: Oleksiy Ostapenko, Mihai Puscas, Tassilo Klein, Patrick Jähnichen, Moin Nabi

Abstract: Models trained in the context of continual learning (CL) should be able to learn from a stream of data over an undefined period of time. The main challenges herein are: 1) maintaining old knowledge while simultaneously benefiting from it when learning new tasks, and 2) guaranteeing model scalability with a growing amount of data to learn from. In order to tackle these challenges, we introduce Dynamic Generative Memory (DGM) - a synaptic plasticity driven framework for continual learning. DGM relies on conditional generative adversarial networks with learnable connection plasticity realized with neural masking. Specifically, we evaluate two variants of neural masking: applied to (i) layer activations and (ii) to connection weights directly. Furthermore, we propose a dynamic network expansion mechanism that ensures sufficient model capacity to accommodate for continually incoming tasks. The amount of added capacity is determined dynamically from the learned binary mask. We evaluate DGM in the continual class-incremental setup on visual classification tasks.

Submitted 2 December, 2019; v1 submitted 5 April, 2019; originally announced April 2019.

Comments: CVPR 2019

arXiv:1902.09455 [pdf, ps, other]

Evaluation, Modeling and Optimization of Coverage Enhancement Methods of NB-IoT

Authors: Sahithya Ravi, Pouria Zand, Mohieddine El Soussi, Majid Nabi

Abstract: Narrowband Internet of Things (NB-IoT) is a new Low Power Wide Area Network (LPWAN) technology released by 3GPP. The primary goals of NB-IoT are improved coverage, massive capacity, low cost, and long battery life. In order to improve coverage, NB-IoT has promising solutions, such as increasing transmission repetitions, decreasing bandwidth, and adapting the Modulation and Coding Scheme (MCS). In this paper, we present an implementation of coverage enhancement features of NB-IoT in NS-3, an end-to-end network simulator. The resource allocation and link adaptation in NS-3 are modified to comply with the new features of NB-IoT. Using the developed simulation framework, the influence of the new features on network reliability and latency is evaluated. Furthermore, an optimal hybrid link adaptation strategy based on all three features is proposed. To achieve this, we formulate an optimization problem that has an objective function based on latency, and constraint based on the Signal to Noise Ratio (SNR). Then, we propose several algorithms to minimize latency and compare them with respect to accuracy and speed. The best hybrid solution is chosen and implemented in the NS-3 simulator by which the latency formulation is verified. The numerical results show that the proposed optimization algorithm for hybrid link adaptation is eight times faster than the exhaustive search approach and yields similar latency.

Submitted 19 February, 2019; originally announced February 2019.

arXiv:1901.01868 [pdf, other]

Low-Shot Learning from Imaginary 3D Model

Authors: Frederik Pahde, Mihai Puscas, Jannik Wolff, Tassilo Klein, Nicu Sebe, Moin Nabi

Abstract: Since the advent of deep learning, neural networks have demonstrated remarkable results in many visual recognition tasks, constantly pushing the limits. However, the state-of-the-art approaches are largely unsuitable in scarce data regimes. To address this shortcoming, this paper proposes employing a 3D model, which is derived from training images. Such a model can then be used to hallucinate novel viewpoints and poses for the scarce samples of the few-shot learning scenario. A self-paced learning approach allows for the selection of a diverse set of high-quality images, which facilitates the training of a classifier. The performance of the proposed approach is showcased on the fine-grained CUB-200-2011 dataset in a few-shot setting and significantly improves our baseline accuracy.

Submitted 4 January, 2019; originally announced January 2019.

Comments: To appear at WACV 2019. arXiv admin note: text overlap with arXiv:1811.09192

arXiv:1811.09192 [pdf, other]

Self Paced Adversarial Training for Multimodal Few-shot Learning

Authors: Frederik Pahde, Oleksiy Ostapenko, Patrick Jähnichen, Tassilo Klein, Moin Nabi

Abstract: State-of-the-art deep learning algorithms yield remarkable results in many visual recognition tasks. However, they still fail to provide satisfactory results in scarce data regimes. To a certain extent this lack of data can be compensated by multimodal information. Missing information in one modality of a single data point (e.g. an image) can be made up for in another modality (e.g. a textual description). Therefore, we design a few-shot learning task that is multimodal during training (i.e. image and text) and single-modal during test time (i.e. image). In this regard, we propose a self-paced class-discriminative generative adversarial network incorporating multimodality in the context of few-shot learning. The proposed approach builds upon the idea of cross-modal data generation in order to alleviate the data sparsity problem. We improve few-shot learning accuracies on the finegrained CUB and Oxford-102 datasets.

Submitted 22 November, 2018; originally announced November 2018.

Comments: To appear at WACV 2019

arXiv:1809.04344 [pdf, other]

The Wisdom of MaSSeS: Majority, Subjectivity, and Semantic Similarity in the Evaluation of VQA

Authors: Shailza Jolly, Sandro Pezzelle, Tassilo Klein, Andreas Dengel, Moin Nabi

Abstract: We introduce MASSES, a simple evaluation metric for the task of Visual Question Answering (VQA). In its standard form, the VQA task is operationalized as follows: Given an image and an open-ended question in natural language, systems are required to provide a suitable answer. Currently, model performance is evaluated by means of a somehow simplistic metric: If the predicted answer is chosen by at least 3 human annotators out of 10, then it is 100% correct. Though intuitively valuable, this metric has some important limitations. First, it ignores whether the predicted answer is the one selected by the Majority (MA) of annotators. Second, it does not account for the quantitative Subjectivity (S) of the answers in the sample (and dataset). Third, information about the Semantic Similarity (SES) of the responses is completely neglected. Based on such limitations, we propose a multi-component metric that accounts for all these issues. We show that our metric is effective in providing a more fine-grained evaluation both on the quantitative and qualitative level.

Submitted 12 September, 2018; originally announced September 2018.

Comments: 10 pages, 7 figures

arXiv:1806.05147 [pdf, other]

Cross-modal Hallucination for Few-shot Fine-grained Recognition

Authors: Frederik Pahde, Patrick Jähnichen, Tassilo Klein, Moin Nabi

Abstract: State-of-the-art deep learning algorithms generally require large amounts of data for model training. Lack thereof can severely deteriorate the performance, particularly in scenarios with fine-grained boundaries between categories. To this end, we propose a multimodal approach that facilitates bridging the information gap by means of meaningful joint embeddings. Specifically, we present a benchmark that is multimodal during training (i.e. images and texts) and single-modal in testing time (i.e. images), with the associated task to utilize multimodal data in base classes (with many samples), to learn explicit visual classifiers for novel classes (with few samples). Next, we propose a framework built upon the idea of cross-modal data hallucination. In this regard, we introduce a discriminative text-conditional GAN for sample generation with a simple self-paced strategy for sample selection. We show the results of our proposed discriminative hallucinated method for 1-, 2-, and 5- shot learning on the CUB dataset, where the accuracy is improved by employing multimodal data.

Submitted 14 June, 2018; v1 submitted 13 June, 2018; originally announced June 2018.

Comments: CVPR 2018 Workshop on Fine-Grained Visual Categorization

arXiv:1712.07557 [pdf, ps, other]

Differentially Private Federated Learning: A Client Level Perspective

Authors: Robin C. Geyer, Tassilo Klein, Moin Nabi

Abstract: Federated learning is a recent advance in privacy protection. In this context, a trusted curator aggregates parameters optimized in decentralized fashion by multiple clients. The resulting model is then distributed back to all clients, ultimately converging to a joint representative model without explicitly having to share the data. However, the protocol is vulnerable to differential attacks, which could originate from any party contributing during federated optimization. In such an attack, a client's contribution during training and information about their data set is revealed through analyzing the distributed model. We tackle this problem and propose an algorithm for client sided differential privacy preserving federated optimization. The aim is to hide clients' contributions during training, balancing the trade-off between privacy loss and model performance. Empirical studies suggest that given a sufficiently large number of participating clients, our proposed procedure can maintain client-level differential privacy at only a minor cost in model performance.

Submitted 1 March, 2018; v1 submitted 20 December, 2017; originally announced December 2017.

Comments: NIPS 2017 Workshop: Machine Learning on the Phone and other Consumer Devices

arXiv:1708.09644 [pdf, other]

Abnormal Event Detection in Videos using Generative Adversarial Nets

Authors: Mahdyar Ravanbakhsh, Moin Nabi, Enver Sangineto, Lucio Marcenaro, Carlo Regazzoni, Nicu Sebe

Abstract: In this paper we address the abnormality detection problem in crowded scenes. We propose to use Generative Adversarial Nets (GANs), which are trained using normal frames and corresponding optical-flow images in order to learn an internal representation of the scene normality. Since our GANs are trained with only normal data, they are not able to generate abnormal events. At testing time the real data are compared with both the appearance and the motion representations reconstructed by our GANs and abnormal areas are detected by computing local differences. Experimental results on challenging abnormality detection datasets show the superiority of the proposed method compared to the state of the art in both frame-level and pixel-level abnormality detection tasks.

Submitted 31 August, 2017; originally announced August 2017.

Comments: Best Paper / Student Paper Award Finalist, IEEE International Conference on Image Processing (ICIP), 2017

arXiv:1706.07680 [pdf, other]

Training Adversarial Discriminators for Cross-channel Abnormal Event Detection in Crowds

Authors: Mahdyar Ravanbakhsh, Enver Sangineto, Moin Nabi, Nicu Sebe

Abstract: Abnormal crowd behaviour detection attracts a large interest due to its importance in video surveillance scenarios. However, the ambiguity and the lack of sufficient abnormal ground truth data makes end-to-end training of large deep networks hard in this domain. In this paper we propose to use Generative Adversarial Nets (GANs), which are trained to generate only the normal distribution of the data. During the adversarial GAN training, a discriminator (D) is used as a supervisor for the generator network (G) and vice versa. At testing time we use D to solve our discriminative task (abnormality detection), where D has been trained without the need of manually-annotated abnormal data. Moreover, in order to prevent G learn a trivial identity function, we use a cross-channel approach, forcing G to transform raw-pixel data in motion information and vice versa. The quantitative results on standard benchmarks show that our method outperforms previous state-of-the-art methods in both the frame-level and the pixel-level evaluation.

Submitted 26 November, 2018; v1 submitted 23 June, 2017; originally announced June 2017.

Comments: To appear at WACV 2019

arXiv:1705.01359 [pdf, other]

doi 10.18653/v1/P17-1024

FOIL it! Find One mismatch between Image and Language caption

Authors: Ravi Shekhar, Sandro Pezzelle, Yauhen Klimovich, Aurelie Herbelot, Moin Nabi, Enver Sangineto, Raffaella Bernardi

Abstract: In this paper, we aim to understand whether current language and vision (LaVi) models truly grasp the interaction between the two modalities. To this end, we propose an extension of the MSCOCO dataset, FOIL-COCO, which associates images with both correct and "foil" captions, that is, descriptions of the image that are highly similar to the original ones, but contain one single mistake ("foil word"). We show that current LaVi models fall into the traps of this data and perform badly on three tasks: a) caption classification (correct vs. foil); b) foil word detection; c) foil word correction. Humans, in contrast, have near-perfect performance on those tasks. We demonstrate that merely utilising language cues is not enough to model FOIL-COCO and that it challenges the state-of-the-art by requiring a fine-grained understanding of the relation between text and image.

Submitted 3 May, 2017; originally announced May 2017.

Comments: To appear at ACL 2017

arXiv:1611.06764 [pdf, other]

Efficient Convolutional Neural Network with Binary Quantization Layer

Authors: Mahdyar Ravanbakhsh, Hossein Mousavi, Moin Nabi, Lucio Marcenaro, Carlo Regazzoni

Abstract: In this paper we introduce a novel method for segmentation that can benefit from general semantics of Convolutional Neural Network (CNN). Our segmentation proposes visually and semantically coherent image segments. We use binary encoding of CNN features to overcome the difficulty of the clustering on the high-dimensional CNN feature space. These binary encoding can be embedded into the CNN as an extra layer at the end of the network. This results in real-time segmentation. To the best of our knowledge our method is the first attempt on general semantic image segmentation using CNN. All the previous papers were limited to few number of category of the images (e.g. PASCAL VOC). Experiments show that our segmentation algorithm outperform the state-of-the-art non-semantic segmentation methods by a large margin.

Submitted 21 November, 2016; originally announced November 2016.

Comments: Workshop on Efficient Methods for Deep Neural Networks (EMDNN), NIPS 2016, Barcelona, Spain. arXiv admin note: substantial text overlap with arXiv:1609.09220

arXiv:1610.00307 [pdf, other]

Plug-and-Play CNN for Crowd Motion Analysis: An Application in Abnormal Event Detection

Authors: Mahdyar Ravanbakhsh, Moin Nabi, Hossein Mousavi, Enver Sangineto, Nicu Sebe

Abstract: Most of the crowd abnormal event detection methods rely on complex hand-crafted features to represent the crowd motion and appearance. Convolutional Neural Networks (CNN) have shown to be a powerful tool with excellent representational capacities, which can leverage the need for hand-crafted features. In this paper, we show that keeping track of the changes in the CNN feature across time can facilitate capturing the local abnormality. We specifically propose a novel measure-based method which allows measuring the local abnormality in a video by combining semantic information (inherited from existing CNN models) with low-level Optical-Flow. One of the advantage of this method is that it can be used without the fine-tuning costs. The proposed method is validated on challenging abnormality detection datasets and the results show the superiority of our method compared to the state-of-the-art methods.

Submitted 26 January, 2018; v1 submitted 2 October, 2016; originally announced October 2016.

Comments: To appear at WACV 2018

arXiv:1609.09220 [pdf, other]

CNN-aware Binary Map for General Semantic Segmentation

Authors: Mahdyar Ravanbakhsh, Hossein Mousavi, Moin Nabi, Mohammad Rastegari, Carlo Regazzoni

Abstract: In this paper we introduce a novel method for general semantic segmentation that can benefit from general semantics of Convolutional Neural Network (CNN). Our segmentation proposes visually and semantically coherent image segments. We use binary encoding of CNN features to overcome the difficulty of the clustering on the high-dimensional CNN feature space. These binary codes are very robust against noise and non-semantic changes in the image. These binary encoding can be embedded into the CNN as an extra layer at the end of the network. This results in real-time segmentation. To the best of our knowledge our method is the first attempt on general semantic image segmentation using CNN. All the previous papers were limited to few number of category of the images (e.g. PASCAL VOC). Experiments show that our segmentation algorithm outperform the state-of-the-art non-semantic segmentation methods by large margin.

Submitted 29 September, 2016; originally announced September 2016.

Comments: ICIP 2016 Best Paper / Student Paper Finalist

arXiv:1607.07646 [pdf, other]

Emotion-Based Crowd Representation for Abnormality Detection

Authors: Hamidreza Rabiee, Javad Haddadnia, Hossein Mousavi, Moin Nabi, Vittorio Murino, Nicu Sebe

Abstract: In crowd behavior understanding, a model of crowd behavior need to be trained using the information extracted from video sequences. Since there is no ground-truth available in crowd datasets except the crowd behavior labels, most of the methods proposed so far are just based on low-level visual features. However, there is a huge semantic gap between low-level motion/appearance features and high-level concept of crowd behaviors. In this paper we propose an attribute-based strategy to alleviate this problem. While similar strategies have been recently adopted for object and action recognition, as far as we know, we are the first showing that the crowd emotions can be used as attributes for crowd behavior understanding. The main idea is to train a set of emotion-based classifiers, which can subsequently be used to represent the crowd motion. For this purpose, we collect a big dataset of video clips and provide them with both annotations of "crowd behaviors" and "crowd emotions". We show the results of the proposed method on our dataset, which demonstrate that the crowd emotions enable the construction of more descriptive models for crowd behaviors. We aim at publishing the dataset with the article, to be used as a benchmark for the communities.

Submitted 26 July, 2016; originally announced July 2016.

arXiv:1605.07651 [pdf, other]

doi 10.1109/TPAMI.2018.2804907

Self Paced Deep Learning for Weakly Supervised Object Detection

Authors: Enver Sangineto, Moin Nabi, Dubravko Culibrk, Nicu Sebe

Abstract: In a weakly-supervised scenario object detectors need to be trained using image-level annotation alone. Since bounding-box-level ground truth is not available, most of the solutions proposed so far are based on an iterative, Multiple Instance Learning framework in which the current classifier is used to select the highest-confidence boxes in each image, which are treated as pseudo-ground truth in the next training iteration. However, the errors of an immature classifier can make the process drift, usually introducing many of false positives in the training dataset. To alleviate this problem, we propose in this paper a training protocol based on the self-paced learning paradigm. The main idea is to iteratively select a subset of images and boxes that are the most reliable, and use them for training. While in the past few years similar strategies have been adopted for SVMs and other classifiers, we are the first showing that a self-paced approach can be used with deep-network-based classifiers in an end-to-end training pipeline. The method we propose is built on the fully-supervised Fast-RCNN architecture and can be applied to similar architectures which represent the input image as a bag of boxes. We show state-of-the-art results on Pascal VOC 2007, Pascal VOC 2010 and ILSVRC 2013. On ILSVRC 2013 our results based on a low-capacity AlexNet network outperform even those weakly-supervised approaches which are based on much higher-capacity networks.

Submitted 21 February, 2018; v1 submitted 24 May, 2016; originally announced May 2016.

Comments: To appear at IEEE Transactions on PAMI

Showing 1–50 of 51 results for author: Nabi, M