Positive unlabeled learning for deceptive reviews detection
Proceedings of the 2014 conference on empirical methods in natural …, 2014•aclanthology.org
Abstract
Deceptive reviews detection has attracted significant attention from both the business and research communities. However, because supervised learning depends on human labeling, which is difficult to obtain, the problem remains highly challenging. This paper approaches the problem from a novel angle by modeling it as PU (positive unlabeled) learning. A semi-supervised model, called mixing population and individual property PU learning (MPIPUL), is proposed. First, some reliable negative examples are identified from the unlabeled dataset. Second, some representative positive and negative examples are generated based on LDA (Latent Dirichlet Allocation). Third, the remaining unlabeled examples (we call them spy examples), which cannot be explicitly identified as positive or negative, are each assigned two similarity weights, which express the probabilities that a spy example belongs to the positive class and to the negative class. Finally, the spy examples and their similarity weights are incorporated into an SVM (Support Vector Machine) to build an accurate classifier. Experiments on a gold-standard dataset demonstrate the effectiveness of MPIPUL, which outperforms the state-of-the-art baselines.
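The first and third steps of the abstract can be illustrated with a minimal sketch. This is not the MPIPUL algorithm itself: the abstract does not say how reliable negatives are selected or how the similarity weights are computed, so the sketch assumes a simple heuristic (cosine similarity between bag-of-words vectors and class centroids). The function name `pu_weights`, its parameter `n_reliable_neg`, and the helper functions are hypothetical, and the LDA and SVM steps are omitted.

```python
import math
from collections import Counter


def bow(text):
    """Bag-of-words term counts for a lowercased, whitespace-tokenized text."""
    return Counter(text.lower().split())


def centroid(vectors):
    """Average term counts over a non-empty list of bag-of-words vectors."""
    total = Counter()
    for vec in vectors:
        total.update(vec)
    return {term: count / len(vectors) for term, count in total.items()}


def cosine(a, b):
    """Cosine similarity between two sparse term-weight mappings."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def pu_weights(positives, unlabeled, n_reliable_neg):
    """Return (reliable negative indices, {spy index: (w_pos, w_neg)}).

    ASSUMPTION: reliable negatives are the unlabeled texts least similar to
    the positive centroid, and the two weights are normalized cosine
    similarities. The source abstract does not specify either choice.
    """
    if not 1 <= n_reliable_neg <= len(unlabeled):
        raise ValueError("n_reliable_neg must be between 1 and len(unlabeled)")

    pos_centroid = centroid([bow(t) for t in positives])
    unl_vecs = [bow(t) for t in unlabeled]

    # Step 1 (assumed heuristic): rank unlabeled texts by similarity to the
    # positive centroid and take the least similar ones as reliable negatives.
    ranked = sorted(range(len(unlabeled)),
                    key=lambda i: cosine(unl_vecs[i], pos_centroid))
    reliable = set(ranked[:n_reliable_neg])
    neg_centroid = centroid([unl_vecs[i] for i in reliable])

    # Step 3 (assumed heuristic): each remaining "spy" example gets two
    # weights that sum to 1, one per class.
    weights = {}
    for i, vec in enumerate(unl_vecs):
        if i in reliable:
            continue
        sim_pos = cosine(vec, pos_centroid)
        sim_neg = cosine(vec, neg_centroid)
        total = sim_pos + sim_neg
        weights[i] = (0.5, 0.5) if total == 0 else (sim_pos / total, sim_neg / total)
    return sorted(reliable), weights


positives = ["best hotel ever amazing stay", "amazing hotel best luxury ever"]
unlabeled = ["amazing best hotel ever",
             "room was small and noisy",
             "hotel room was small"]
reliable_negatives, spy_weights = pu_weights(positives, unlabeled, n_reliable_neg=1)
```

One possible way to use such weights, again as an assumption rather than the paper's formulation, is to add each spy example to the training set twice, once per class label, and pass the corresponding weight through the `sample_weight` argument of scikit-learn's `SVC.fit`.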