Xiang Fu

San Jose, California, United States
1K followers 500+ connections

About

My interest is to explore and solve novel and practical problems in the era of big…

Experience

  • Apple

    San Francisco Bay Area

  • -

    San Francisco Bay Area

  • -

    Greater Los Angeles Area

  • -

    Greater Los Angeles Area

  • -

    Beijing City, China

  • -

    Shanghai City, China

  • -

    Shanghai City, China

  • -

    Shanghai City, China

Education

  • University of Southern California

    -

    Supervised by Prof. C.-C. Jay Kuo (http://mcl.usc.edu/people/cckuo/)
    Research Assistant: Visual Data Segmentation, Video Object Representation and Segmentation, Visual Tracking, and Video Matting.

  • -

    Computer Vision, Computer Graphics, Data Mining, Machine Learning, Multimedia

  • -

    Digital Signal Processing, Digital Image Processing, Pattern Recognition, Advanced DSP Lab

  • -

    Information Engineering

  • -

Volunteer Experience

  • Volunteer

    SHANGHAI JIAO TONG UNIVERSITY ALUMNI ASSOC OF SILICON VALLEY

    - Present 7 years 7 months

    Science and Technology

    Recognized as one of the outstanding contributors in organizing the SJTU Entrepreneur Summit Silicon Valley 2017.

  • Volunteer

    2006 Special Olympic Shanghai Invitational Games

    - Present 18 years 4 months

    Social Services

    Organized, guided, and accompanied special athletes in the stadium.

  • Volunteer

    Shanghai International Marathon

    - Present 16 years 3 months

    Social Services

    Provided service support for the athletes.

Publications

  • Exploring Confusing Scene Classes for the Places Dataset: Insights and Solutions

    APSIPA ASC 2017

    Scene classification is more challenging than object classification due to higher ambiguity in scene labels. In this work, we propose to use the filter weights at the last stage of a CNN model trained on the Places dataset, also known as the "scene anchor vector (SAV)", to explain the source of confusion. An SAV points to a cluster of images. If two anchor vectors form a small angle, their image clusters overlap, leading to a set of confusing classes. To overcome this, we propose to merge images associated with confusing anchor vectors into a confusion set and split that set in an unsupervised fashion to create multiple subsets. This is called the "automatic subset clustering (ASC)" process. Each of these subsets contains scene images with strong visual similarity. After the ASC process, we train a random forest (RF) classifier for each confusion subset to allow better scene classification. The ASC/RF scheme can be added on top of any existing scene-classification CNN as a post-processing module with little extra training effort. Extensive experimental results show that, for a given baseline CNN, the ASC/RF scheme offers a significant performance gain.

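    A minimal NumPy sketch of the anchor-vector idea above (not the paper's code; the class count, dimensions, and 0.5 threshold are illustrative assumptions): treat the last-layer filter weights as one anchor vector per scene class and flag class pairs whose vectors form a small angle as a candidate confusion set.

        import numpy as np

        # anchors: one row per scene class, taken from the last stage of a CNN
        # trained on Places (random data stands in for the real weights here).
        num_classes, dim = 5, 512
        rng = np.random.default_rng(0)
        anchors = rng.normal(size=(num_classes, dim))

        # Cosine similarity between scene anchor vectors (SAVs): a large value
        # means a small angle, i.e. overlapping image clusters and likely confusion.
        unit = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
        similarity = unit @ unit.T

        threshold = 0.5  # hypothetical cut-off, not a value from the paper
        confusing_pairs = [
            (i, j)
            for i in range(num_classes)
            for j in range(i + 1, num_classes)
            if similarity[i, j] > threshold
        ]
        print(confusing_pairs)
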
  • Image Segmentation using Contour, Surface, and Depth Cues

    ICIP 2017 (oral)

    We address the problem of automatic image segmentation. Although 1D contour and 2D surface cues have been widely utilized in existing work, the 3D depth information of an image, a necessary cue according to human visual perception, is overlooked in automatic image segmentation. In this paper, we study how to fully utilize 1D contour, 2D surface, and 3D depth cues for image segmentation. First, three elementary segmentation modules are developed for these cues respectively. The proposed 3D depth cue is able to segment differently textured regions even when they have similar color, and also to merge similarly textured areas, which cannot be achieved using state-of-the-art approaches. Then, a content-dependent spectral (CDS) graph is proposed for layered affinity models to produce the final segmentation. CDS is designed to build a more reliable relationship between neighboring surface nodes based on the three elementary cues in the spectral graph. Extensive experiments not only show the superior performance of the proposed algorithm over state-of-the-art approaches, but also verify the necessity of these three cues in image segmentation.

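    The content-dependent spectral graph itself is not reproduced here, but the toy sketch below (NumPy plus scikit-learn, an illustration rather than the published implementation) shows the general pattern of fusing two cues into one affinity matrix and handing it to a spectral clustering step; the features, cue weights, and sigma values are made up.

        import numpy as np
        from sklearn.cluster import SpectralClustering

        # Toy "superpixels": a color feature and a depth value per node (made-up data).
        color = np.array([[0.10], [0.12], [0.80], [0.82], [0.50]])
        depth = np.array([[1.00], [1.10], [3.00], [3.10], [1.05]])

        def gaussian_affinity(feat, sigma):
            # Pairwise affinity exp(-||fi - fj||^2 / (2 sigma^2)).
            d2 = np.sum((feat[:, None, :] - feat[None, :, :]) ** 2, axis=-1)
            return np.exp(-d2 / (2.0 * sigma ** 2))

        # Fuse the cues into one layered affinity (weights are illustrative only).
        W = 0.6 * gaussian_affinity(color, 0.2) + 0.4 * gaussian_affinity(depth, 0.5)

        labels = SpectralClustering(n_clusters=2, affinity="precomputed",
                                    random_state=0).fit_predict(W)
        print(labels)
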
  • Video

    Encyclopedia of Database Systems. Springer, New York, NY

    Video, which means “I see” in Latin, is an electronic representation of a sequence of images or frames, put together to simulate motion and interactivity. From the producer’s perspective, a video delivers information created from the recording of real events to be processed simultaneously by a viewer’s eyes and ears. Most of the time, a video also contains other forms of media such as text or audio.
    Video also refers to a storage format for moving pictures, as compared to text, image, audio, graphics, and animation.

  • Hierarchical Supervoxel Graph for Interactive Video Object Representation and Segmentation

    ACCV 2016

    In this paper, we study the problem of how to represent and segment objects in a video. To handle the motion and variations of the internal regions of objects, we present an interactive hierarchical supervoxel representation for video object segmentation. First, a hierarchical supervoxel graph with various granularities is built based on local clustering and region merging to represent the video, in which both color histograms and motion information are leveraged in the feature space, and visual saliency is also taken into account as merging guidance to build the graph. Then, a supervoxel selection algorithm is introduced to choose supervoxels with diverse granularities to represent the object(s) labeled by the user. Finally, based on the above representations, an interactive video object segmentation framework is proposed to handle complex and diverse scenes with large motion and occlusions. The experimental results show the effectiveness of the proposed algorithms in supervoxel graph construction and video object segmentation.

  • Robust Image Segmentation Using Contour-guided Color Palettes

    ICCV 2015

    The contour-guided color palette (CCP) is proposed for robust image segmentation. It efficiently integrates the contour and color cues of an image. To find the representative colors of an image, we collect color samples from both sides of long contours and apply the mean-shift (MS) algorithm in the sampled color space to define an image-dependent color palette. This color palette provides a preliminary segmentation in the spatial domain, which is further fine-tuned by post-processing techniques such as leakage avoidance, fake boundary removal, and small-region merging. The segmentation performances of CCP and MS are compared and analyzed. While CCP offers an acceptable standalone segmentation result, it can be further integrated into the framework of layered spectral segmentation to produce a more robust segmentation. The superior performance of the CCP-based segmentation algorithm is demonstrated by experiments on the Berkeley Segmentation Dataset.

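    A small sketch of the palette step above, assuming scikit-learn's MeanShift as the mean-shift implementation; the image, the contour-side sampling, and the bandwidth are placeholders rather than the authors' settings.

        import numpy as np
        from sklearn.cluster import MeanShift

        # Placeholder RGB image and placeholder "contour-side" samples; in CCP the
        # samples would be collected from both sides of long detected contours.
        rng = np.random.default_rng(0)
        image = rng.random((32, 32, 3))
        samples = image.reshape(-1, 3)[::7]  # stand-in for contour-guided sampling

        # Mean shift in the sampled color space defines the image-dependent palette.
        palette = MeanShift(bandwidth=0.2).fit(samples).cluster_centers_

        # Preliminary segmentation: label each pixel with its nearest palette color.
        pixels = image.reshape(-1, 3)
        dists = np.linalg.norm(pixels[:, None, :] - palette[None, :, :], axis=2)
        labels = dists.argmin(axis=1).reshape(image.shape[:2])
        print(palette.shape, labels.shape)
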
  • Hierarchical Bag-of-Words Model for Joint Multi-View Object Representation and Classification

    APSIPA ASC 2012 (oral)

    Multi-view object classification is a challenging problem in image retrieval. One common approach is to apply the visual bag-of-words (BoW) model to all view representations of each object class and compare them with the representation of the query image one by one so as to determine the closest view of the object class. This approach offers good matching performance, yet it demands a large amount of computation and storage space. To address these issues, we propose a novel hierarchical BoW model that provides a concise representation of each object class with multiple views. When the higher-level BoW representation does not match that of the query instance, further comparisons can be skipped. We can also combine similar views to reduce the storage space. We conduct experiments on a dataset of 3D object classes and show that the proposed approach achieves higher efficiency, in terms of lower computational complexity and storage space, while preserving good matching performance.

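    The coarse-to-fine matching idea in the abstract above can be sketched as follows (NumPy only, with illustrative data and a hypothetical prune threshold, not the published implementation): a query histogram is first compared against a class-level BoW representation, and the per-view comparison is skipped whenever the coarse match is already poor.

        import numpy as np

        def hist_sim(a, b):
            # Histogram intersection between two L1-normalized BoW histograms.
            return np.minimum(a, b).sum()

        rng = np.random.default_rng(0)
        vocab = 64

        # Hypothetical object classes; each keeps several per-view histograms plus
        # a coarse class-level histogram (their average) at the top of the hierarchy.
        classes = {}
        for name in ["car", "bicycle", "chair"]:
            views = rng.random((8, vocab))
            views /= views.sum(axis=1, keepdims=True)
            classes[name] = {"views": views, "coarse": views.mean(axis=0)}

        query = rng.random(vocab)
        query /= query.sum()
        coarse_threshold = 0.5  # illustrative prune level, not a published value

        best = None
        for name, rep in classes.items():
            if hist_sim(query, rep["coarse"]) < coarse_threshold:
                continue  # coarse mismatch: skip the expensive per-view comparison
            score = max(hist_sim(query, v) for v in rep["views"])
            if best is None or score > best[1]:
                best = (name, score)
        print(best)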

Courses

  • Advanced DSP Design Laboratory

    EE-586L

  • Analysis of Algorithms

    CSCI-570

  • Computer Graphics

    CSCI-480

  • Computer Vision

    CSCI-574

  • Data Mining and Statistical Inference

    CSCI-599

  • Database Systems

    CSCI-585

  • Directed Research

    EE-590

  • Introduction to Digital Image Processing

    EE-569

  • Introduction to Digital Signal Processing

    EE-483

  • Mathematical Pattern Recognition

    EE-559

  • Multimedia System Design

    CSCI-576

  • Probability Theory for Engineers

    EE-464

  • Random Processes in Engineering

    EE-562A

  • Speech Recognition and Processing for Multimedia

    EE-519

  • Web Technologies

    CSCI-571

Projects

  • Course Project: CSCI-576 Multimedia System Design

    -

    Group Project: Media-Based Querying and Searching
    Established a three-layer checking system to find the location of the query object in the search image

  • Course Project: CSCI-571 Web Technologies

    -

    Built the company's web-based information system for employees using PHP, JavaScript, HTML, and MySQL
    Built the company's web-based information system for customers using CodeIgniter and jQuery

  • Course Project: CSCI-574 Computer Vision

    -

    Implemented projects on panoramic stitching, planar augmented reality, and 3D face recognition in C++
    Received the prize award for CSCI 574 projects (given to 3 of 60+ students)

  • Open Course Project: CSCI-599 Data Mining and Statistical Inference

    -

    TRECVID conference group project: Multimedia Event Detection – Content Understanding
    Constructed a hierarchical BoW model for classification and achieved 45.5% accuracy using MATLAB

  • Course Project: EE-569 Digital Image Processing

    -

    Projects: Image Enhancement, Noise Removal, Edge Detection, Morphology, Halftoning, Geometric Manipulation, Texture Analysis and Segmentation, Optical Character Recognition, Media Retargeting
    Implemented and improved the NLM filter, Canny edge detector, face warping, and seam carving in C++
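
    The course work above was done in C++; as a rough Python/NumPy sketch of the seam-carving part only (a simplification, not the course code), the dynamic program below finds one minimal-energy vertical seam from a gradient-magnitude energy map.

        import numpy as np

        def find_vertical_seam(gray):
            # Energy: gradient magnitude of the grayscale image.
            gy, gx = np.gradient(gray.astype(float))
            energy = np.abs(gx) + np.abs(gy)

            h, w = energy.shape
            cost = energy.copy()
            # Dynamic programming: cumulative minimal energy from the top row down.
            for i in range(1, h):
                left = np.r_[np.inf, cost[i - 1, :-1]]
                up = cost[i - 1]
                right = np.r_[cost[i - 1, 1:], np.inf]
                cost[i] += np.minimum(np.minimum(left, up), right)

            # Backtrack the seam from the minimal entry in the last row.
            seam = np.zeros(h, dtype=int)
            seam[-1] = int(cost[-1].argmin())
            for i in range(h - 2, -1, -1):
                j = seam[i + 1]
                lo, hi = max(j - 1, 0), min(j + 2, w)
                seam[i] = lo + int(cost[i, lo:hi].argmin())
            return seam

        print(find_vertical_seam(np.random.rand(6, 8)))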

  • Directed Research, USC Institute for Creative Technologies

    -

    Group Project: LED tracking – Depth Estimator
    Tracked multiple targets with pose information using C++ with OpenCV, OpenGL and GSL

  • Open Course Project: CSCI-480 Computer Graphics

    -

    Group Project: Asteroids Facebook App Using WebGL (http://apps.facebook.com/331271226889802/)
    Established the graphics rendering framework, particle systems and models, and gameplay logic

  • Open Course Project: EE-586L Advanced DSP Lab.

    -

    Group Project: Real-time Environment Exchange System using TMS320C6416T with VM3224K
    Implemented and improved Gaussian mixture models and graph cut to extract the foreground in C

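    The project above ran in C on a TI DSP board; a hedged desktop analogue of its GMM foreground-extraction stage is sketched below in Python with OpenCV's MOG2 background subtractor (itself a per-pixel Gaussian mixture model). The synthetic frames stand in for the real camera feed, and the graph-cut refinement stage is omitted.

        import numpy as np
        import cv2

        # MOG2 maintains a Gaussian mixture model per pixel to separate foreground
        # from a learned background, conceptually similar to the project's GMM stage.
        subtractor = cv2.createBackgroundSubtractorMOG2(history=50, varThreshold=16)

        rng = np.random.default_rng(0)
        background = (rng.random((120, 160, 3)) * 30).astype(np.uint8)

        for t in range(60):
            frame = background.copy()
            if t > 40:
                # Synthetic "foreground" square entering the scene after frame 40.
                frame[40:80, 60:100] = 255
            mask = subtractor.apply(frame)

        print("foreground pixels in last frame:", int((mask > 0).sum()))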

Honors & Awards

  • Membership of The Honor Society of Phi Kappa Phi

    Phi Kappa Phi Chapter at USC

  • Academic Achievement Award for International Student

    The Office of International Services at USC

  • 2007 - 2008 "Three Good" Student

    School of Electronic Information and Electrical Engineering at SJTU

  • 2007- 2008 Academic Excellence Scholarship

    School of Electronic Information and Electrical Engineering at SJTU

  • 2006 - 2007 Academic Excellence Scholarship

    School of Electronic Information and Electrical Engineering at SJTU

  • 2005 - 2006 Academic Excellence Scholarship

    School of Electronic Information and Electrical Engineering at SJTU

Languages

  • English

    Full professional proficiency

  • Mandarin

    Native or bilingual proficiency

Organizations

  • APSIPA: Asia-Pacific Signal and Information Processing Association

    Student Member

  • IEEE: Institute of Electrical and Electronics Engineers

    Student Member

  • SIAM: Society for Industrial and Applied Mathematics

    Student Member
