- invited-talk, October 2015
Vision-enhanced Immersive Interaction and Remote Collaboration with Large Touch Displays
MM '15: Proceedings of the 23rd ACM international conference on Multimedia, Pages 3–4, https://doi.org/10.1145/2733373.2817845
Large displays are becoming commodity, and more and more, they are touch-enabled. In this keynote, we describe a system called ViiBoard (Vision-enhanced Immersive Interaction with touch Board) that enables natural interaction and immersive remote ...
- research-article, October 2015
Who are the Devils Wearing Prada in New York City?
MM '15: Proceedings of the 23rd ACM international conference on Multimedia, Pages 177–180, https://doi.org/10.1145/2733373.2809930
Fashion is a perpetual topic in human social life, and the mass has the penchant to emulate what large city residents and celebrities wear. Undeniably, New York City is such a bellwether large city with all kinds of fashion leadership. Consequently, to ...
- abstract, October 2015
Captioning Images Using Different Styles
MM '15: Proceedings of the 23rd ACM international conference on Multimedia, Pages 665–668, https://doi.org/10.1145/2733373.2807998
I develop techniques that can be used to incorporate stylistic objectives into existing image captioning systems. Style is generally a very tricky concept to define, thus I concentrate on two specific components of style. First I develop a technique for ...
- demonstration, October 2015
AR in Hand: Egocentric Palm Pose Tracking and Gesture Recognition for Augmented Reality Applications
MM '15: Proceedings of the 23rd ACM international conference on Multimedia, Pages 743–744, https://doi.org/10.1145/2733373.2807972
Wearable devices such as Microsoft Hololens and Google glass are highly popular in recent years. As traditional input hardware is difficult to use on such platforms, vision-based hand pose tracking and gesture control techniques are more suitable ...
- abstract, October 2015
ImmersiveMe'15: 3rd ACM International Workshop on Immersive Media Experiences
MM '15: Proceedings of the 23rd ACM international conference on Multimedia, Pages 1339–1340, https://doi.org/10.1145/2733373.2806410
This ACM International Workshop on Immersive Media Experiences is in its 3rd edition. Since 2013 in Barcelona, it has been a meeting point of researchers, students, media producers, service providers and industry players in the area of immersive media ...
- short-paper, October 2015
Vision-Inertial Hybrid Tracking for Robust and Efficient Augmented Reality on Smartphones
MM '15: Proceedings of the 23rd ACM international conference on Multimedia, Pages 1039–1042, https://doi.org/10.1145/2733373.2806396
This paper aims at robust and efficient pose tracking for augmented reality on modern smartphones. Existing methods, relying on either vision analysis or motion sensing, are either too computationally expensive to achieve real-time performance on a ...
- short-paper, October 2015
Deep People Counting in Extremely Dense Crowds
MM '15: Proceedings of the 23rd ACM international conference on Multimedia, Pages 1299–1302, https://doi.org/10.1145/2733373.2806337
People counting in extremely dense crowds is an important step for video surveillance and anomaly warning. The problem becomes especially more challenging due to the lack of training samples, severe occlusions, cluttered scenes and variation of ...
- short-paper, October 2015
Exclusive Constrained Discriminative Learning for Weakly-Supervised Semantic Segmentation
MM '15: Proceedings of the 23rd ACM international conference on Multimedia, Pages 1251–1254, https://doi.org/10.1145/2733373.2806329
How to import image-level labels as weak supervision to direct the region-level labeling task is the core task of weakly-supervised semantic segmentation. In this paper, we focus on designing an effective but simple weakly-supervised constraint, and ...
- short-paper, October 2015
Semi- and Weakly- Supervised Semantic Segmentation with Deep Convolutional Neural Networks
MM '15: Proceedings of the 23rd ACM international conference on Multimedia, Pages 1223–1226, https://doi.org/10.1145/2733373.2806322
Successful semantic segmentation methods typically rely on the training datasets containing a large number of pixel-wise labeled images. To alleviate the dependence on such a fully annotated training dataset, in this paper, we propose a semi- and weakly-...
- short-paper, October 2015
GPU Accelerated Generalised Subclass Discriminant Analysis for Event and Concept Detection in Video
MM '15: Proceedings of the 23rd ACM international conference on Multimedia, Pages 1219–1222, https://doi.org/10.1145/2733373.2806321
In this paper a discriminant analysis (DA) technique called accelerated generalised subclass discriminant analysis (AGSDA) and its GPU implementation are presented. This method identifies a discriminant subspace of the input space in three steps: a) ...
- short-paper, October 2015
Human Action Recognition With Trajectory Based Covariance Descriptor In Unconstrained Videos
MM '15: Proceedings of the 23rd ACM international conference on Multimedia, Pages 1175–1178, https://doi.org/10.1145/2733373.2806310
Human action recognition from realistic videos plays a key role in multimedia event detection and understanding. In this paper, a novel Trajectory Based Covariance (TBC) descriptor is proposed, which is formulated along the dense trajectories. To map ...
- short-paper, October 2015
Online Object Tracking Based on CNN with Metropolis-Hasting Re-Sampling
MM '15: Proceedings of the 23rd ACM international conference on Multimedia, Pages 1163–1166, https://doi.org/10.1145/2733373.2806307
Tracking-by-learning strategies have been effective in solving many challenging problems in visual tracking, in which the learning sample generation and labeling play important roles for final performance. Since the concern of deep learning based ...
- short-paper, October 2015
Hyperspectral Image Classification with Convolutional Neural Networks
MM '15: Proceedings of the 23rd ACM international conference on Multimedia, Pages 1159–1162, https://doi.org/10.1145/2733373.2806306
Hyperspectral image (HSI) classification is one of the most widely used methods for scene analysis from hyperspectral imagery. In the past, many different engineered features have been proposed for the HSI classification problem. In this paper, however, ...
- short-paper, October 2015
Spatio-Temporal Triangular-Chain CRF for Activity Recognition
MM '15: Proceedings of the 23rd ACM international conference on Multimedia, Pages 1151–1154, https://doi.org/10.1145/2733373.2806304
Understanding human activities in video is a fundamental problem in computer vision. In real life, human activities are composed of temporal and spatial arrangement of actions. Understanding such complex activities requires recognizing not only each ...
- short-paper, October 2015
Predicting Image Memorability by Multi-view Adaptive Regression
MM '15: Proceedings of the 23rd ACM international conference on Multimedia, Pages 1147–1150, https://doi.org/10.1145/2733373.2806303
The images we encounter throughout our lives make different impressions on us: Some are remembered at first glance, while others are forgotten. This phenomenon is caused by the intrinsic memorability of images revealed by recent studies [5,6]. In this ...
- short-paper, October 2015
3D Person Tracking In World Coordinates and Attribute Estimation with PDR
MM '15: Proceedings of the 23rd ACM international conference on Multimedia, Pages 1139–1142, https://doi.org/10.1145/2733373.2806301
In this paper, we propose an online 3D person tracking method and an attribute estimation method with pedestrian dead reckoning (PDR). For person tracking, we employ a structured prediction approach, which extends the Struck algorithm. Although the main ...
- short-paper, October 2015
Weak Labeled Multi-Label Active Learning for Image Classification
MM '15: Proceedings of the 23rd ACM international conference on Multimedia, Pages 1127–1130, https://doi.org/10.1145/2733373.2806298
In order to achieve better classification performance with even fewer labeled images, active learning is suitable for these situations. Several active learning methods have been proposed for multi-label image classification, but all of them assume that ...
- short-paper, October 2015
Local Depth Patterns for Tracking in Depth Videos
MM '15: Proceedings of the 23rd ACM international conference on Multimedia, Pages 1115–1118, https://doi.org/10.1145/2733373.2806295
Conventional video tracking operates over RGB or grey-level data which contain significant clues for the identification of the targets. While this is often desirable in a video surveillance context, use of video tracking in privacy-sensitive ...
- short-paper, October 2015
A Probabilistic Approach for Image Retrieval Using Descriptive Textual Queries
MM '15: Proceedings of the 23rd ACM international conference on Multimedia, Pages 1091–1094, https://doi.org/10.1145/2733373.2806289
We address the problem of image retrieval using textual queries. In particular, we focus on descriptive queries that can be either in the form of simple captions (e.g., "a brown cat sleeping on a sofa"), or even long descriptions with multiple ...
- short-paper, October 2015
Detecting Salient Objects via Spatial and Appearance Compactness Hypotheses
MM '15: Proceedings of the 23rd ACM international conference on Multimedia, Pages 1087–1090, https://doi.org/10.1145/2733373.2806288
Object-level saliency detection has been attracting a lot of attention, due to its potential enhancement in many high-level vision tasks. Many previous methods are based on the contrast hypothesis which regards the regions with high contrast in a ...