A Collaborative Alignment Framework of Transferable Knowledge Extraction for Unsupervised Domain Adaptation
Unsupervised domain adaptation (UDA) aims to utilize knowledge from a label-rich source domain to understand a similar yet distinct unlabeled target domain. Notably, global distribution statistics across domains and local semantic characteristics across ...
A Data-Driven Approach for Scheduling Bus Services Subject to Demand Constraints
Passenger satisfaction is extremely important for the success of a public transportation system. Many studies have shown that passenger satisfaction strongly depends on the time they have to wait at the bus stop (waiting time) to get on a bus. To be ...
A Graph and Attentive Multi-Path Convolutional Network for Traffic Prediction
Traffic prediction is an important and yet highly challenging problem due to the complexity and constantly changing nature of traffic systems. To address the challenges, we propose a graph and attentive multi-path convolutional network (...
A Hybrid Spiking Neurons Embedded LSTM Network for Multivariate Time Series Learning Under Concept-Drift Environment
Complicated temporal patterns can provide important information for accurate time series forecasting. Existing long short-term memory (LSTM) models with attention mechanisms have achieved significant performance. However, the exponential decay of long-term ...
A Survey of Context-Aware Recommender Systems: From an Evaluation Perspective
In recent years, context-aware recommender systems (CARSs), which incorporate contextual information to achieve better recommendations, have become a hot topic in the domain of recommender systems. Many context-aware recommendation methods have been proposed ...
A Survey on Dropout Methods and Experimental Verification in Recommendation
Overfitting, a common problem in machine learning, means that the model fits the training data too closely while performing poorly on the test data. Among various methods of coping with overfitting, dropout is one of the representative ways. From ...
Adaptive Generalized Multi-View Canonical Correlation Analysis for Incrementally Update Multiblock Data
One of the major problems in real-life multiblock dynamic data analysis is that all the available modalities may not be relevant. Some of them may provide noisy or even inconsistent information with respect to other modalities. So, it is necessary to ...
An Experimental Survey of Missing Data Imputation Algorithms
Due to the ubiquity of missing data, data imputation has received extensive attention in the past decades. It is a well-recognized problem impacting almost all fields of scientific study. Existing imputation algorithms differ in problem settings, model ...
An Investigation of SMOTE Based Methods for Imbalanced Datasets With Data Complexity Analysis
Many binary-class datasets in real-life applications are affected by the class imbalance problem. Data complexities such as noisy examples, class overlap, and small disjuncts are observed to play a key role in producing poor classification performance. ...
AutoSrh: An Embedding Dimensionality Search Framework for Tabular Data Prediction
Prediction over tabular data is often a crucial task in many real-life applications. Recent advances in deep learning give rise to various deep models for tabular data prediction. A common and essential step in these models is to vectorize raw input ...
Beyond Low-Pass Filtering: Graph Convolutional Networks With Automatic Filtering
Graph convolutional networks are becoming indispensable for deep learning from graph-structured data. Most of the existing graph convolutional networks share two big shortcomings. First, they are essentially low-pass filters, thus the potentially useful ...
BGNN-XML: Bilateral Graph Neural Networks for Extreme Multi-Label Text Classification
Extreme multi-label text classification (XMTC) aims to tag a text instance with the most relevant subset of labels from an extremely large label set. XMTC has attracted much recent attention due to massive label sets yielded by modern applications, such ...
Bloom Filter With Noisy Coding Framework for Multi-Set Membership Testing
This article is on designing a compact data structure for multi-set membership testing that allows fast set querying. Multi-set membership testing is a fundamental operation for computing systems. Most existing schemes for multi-set membership testing are ...
Classification-Labeled Continuousization and Multi-Domain Spatio-Temporal Fusion for Fine-Grained Urban Crime Prediction
Fine-grained urban crime prediction is of great significance to urban management and public safety. Previous crime prediction work has been done at a relatively coarse time granularity, which may suffer from two issues for fine-grained crime prediction. 1)...
Collecting Geospatial Data Under Local Differential Privacy With Improving Frequency Estimation
Geospatial data provides a lot of benefits for personalized services. However, since the geospatial data contains sensitive information about personal activities, collecting the raw data has a potential risk of leaking private information from the ...
Collecting Preference Rankings Under Local Differential Privacy
With the deep penetration of the Internet and mobile devices, preference rankings are being collected on a massive scale by diverse data collectors for various business demands. However, users’ preference rankings in many applications are highly ...
ConPhrase: Enhancing Context-Aware Phrase Mining From Text Corpora
Phrase mining is an essential step when transforming unstructured text into structured information, in which the aim is to extract high-quality phrases from given corpora automatically. Existing statistics-based methods have achieved state-of-the-art ...
DDRM: A Continual Frequency Estimation Mechanism With Local Differential Privacy
Many applications rely on continual data collection to provide real-time information services, e.g., real-time road traffic forecasts. However, the collection of original data brings risks to user privacy. Recently, local differential privacy (LDP) has ...
Deep Cross-Modal Proxy Hashing
Due to their high retrieval efficiency and low storage cost for cross-modal search tasks, cross-modal hashing methods have attracted considerable attention from researchers. For the supervised cross-modal hashing methods, how to make the learned hash ...
Deep Generative Networks Coupled With Evidential Reasoning for Dynamic User Preferences Using Short Texts
Seeking an efficient solution for the problem of dynamic user preferences on social networks is challenging because the input data are short texts and user preferences usually change over time. This work proposes a novel ...
Development of Fully Convolutional Neural Networks Based on Discretization in Time Series Classification
Time Series Classification (TSC) is a crucial area in machine learning. Although applications of Deep Neural Networks (DNNs) in this area have led to relatively good results, classifying this kind of data is a major challenge. This issue is due to the ...
Discovery of Cross Joins
A cross join between two attribute sets holds on a relation whenever its projection onto the union of the attribute sets is the cross join between its projections on the first and second attribute set. Hence, the cross join is a fundamental operator on ...
Distantly-Supervised Long-Tailed Relation Extraction Using Constraint Graphs
Label noise and long-tailed distributions are two major challenges in distantly supervised relation extraction. Recent studies have shown great progress on denoising, but paid little attention to the problem of long-tailed relations. In this paper, we ...
Edge-Cloud Polarization and Collaboration: A Comprehensive Survey for AI
Jiangchao Yao, Shengyu Zhang, Yang Yao, Feng Wang, Jianxin Ma, Jianwei Zhang, Yunfei Chu, Luo Ji, Kunyang Jia, Tao Shen, Anpeng Wu, Fengda Zhang, Ziqi Tan, Kun Kuang, Chao Wu, Fei Wu, Jingren Zhou, Hongxia Yang
Influenced by the great success of deep learning via cloud computing and the rapid development of edge chips, research in artificial intelligence (AI) has shifted to both of the computing paradigms, i.e., cloud computing and edge computing. In recent ...
Efficient Multi-View K-Means Clustering With Multiple Anchor Graphs
Multi-view clustering has attracted a lot of attention due to its ability to integrate information from distinct views, but how to improve efficiency is still a hot research topic. Anchor graph-based methods and k-means-based methods are two current ...
Explainable Discrete Collaborative Filtering
Using hashing to learn the binary codes of users and items significantly improves the efficiency and reduces the space consumption of the recommender system. However, existing hashing-based recommender systems remain black boxes without any explainable ...
Explicit Message-Passing Heterogeneous Graph Neural Network
Graph neural networks (GNNs) have shown prominent performance in graph representation learning, but they have not been fully explored for heterogeneous graphs, which contain more complex structures and rich semantics. The rich semantic information of ...
Fast Flexible Bipartite Graph Model for Co-Clustering
Co-clustering methods make use of the correlation between samples and attributes to explore the co-occurrence structure in data. These methods have played a significant role in gene expression analysis, image segmentation, and document clustering. In ...
Generalized Divergence-Based Decision Making Method With an Application to Pattern Classification
In decision-making systems, how to address uncertainty plays an important role for the improvement of system performance in uncertainty reasoning. Dempster–Shafer evidence (DSE) theory is an effective method to address uncertainty in decision-...
Geo-Ellipse-Indistinguishability: Community-Aware Location Privacy Protection for Directional Distribution
Directional distribution analysis has long served as a fundamental functionality in abstracting dispersion and orientation of spatial datasets. Spatial datasets that describe sensitive information of individuals such as health status and home addresses ...