Many image retrieval systems, and the evaluation methodologies of these systems, make use of either visual or textual information only. Only a few combine textual and visual features for retrieval and evaluation. If text is used, it often relies upon having a standardised and complete annotation schema for the entire collection. This, in combination with high-level semantic queries, makes visual/textual
We present an approach using Gaussian mixture models for part-based object recognition where spatial relationships of the parts are explicitly modeled and parameters of the generative model are tuned discriminatively. These extensions lead to great improvements in classification accuracy. Furthermore, we evaluate several improvements over our baseline system which incrementally improve the obtained results, which compare favorably
At the moment Google image search is probably the only widely known way to search the World Wide Web for images. Google's search engine works based on text retrieval: the images are not indexed by their appearance but by text which can be found in the context of the image. To achieve enhancements for the user we propose to reorder
This paper describes the medical image retrieval and the medical annotation tasks of ImageCLEF 2006. These tasks are described in a separate paper from the other tasks to reduce the size of the overview paper. These two medical tasks are described separately with respect to the goals, databases used, topics created and distributed among participants, results and techniques used. The
... Computer Vision Laboratory, ETH Zurich, Switzerland {deselaers,ferrari}@vision.ee.ethz.ch ... 1). More precisely, we study the following aspects: (i) We analyze how the visual variability within a category changes with depth in the hierarchy, i.e. the size of its semantic domain. ...
In this paper we propose a method to retrieve images based on the persons shown. The method aims at retrieving, from images showing groups of people, those that depict the same persons as the query image. It is experimentally shown that this aim is achieved for rather simple tasks and that improvements over baseline methods are possible
We introduce the use of appearance-based features in hidden Markov model emission probabilities to recognize dynamic gestures. Tangent distance and the image distortion model are used to directly model image variability in videos. Neither explicit hand models nor segmentation of the hand is necessary. Different appearance-based features are investigated and the invariant distance measures are systematically evaluated.
In this paper we describe our current work on automatic continuous sign language recognition. We present an automatic sign language recognition system that is based on a large-vocabulary speech recognition system and adopts many of the approaches that are conventionally applied in the recognition of spoken language. Furthermore, we present a set of freely available databases that can
We present an approach to automatically recognize sign language and translate it into a spoken language. A system to address these tasks is created based on state-of-the-art techniques from statistical machine translation, speech recognition, and image processing research. Such a system is necessary for communication between deaf and hearing people. The communication is otherwise nearly impossible due to missing
Papers by Thomas Deselaers