Mariella Dimiccoli

Wearable cameras can capture naturally occurring activities from a first-person (egocentric) point of view, across multiple environments, and over varying durations. The automatic recognition of egocentric activities has several potential health applications, such as assistive and rehabilitative technology. The characteristic difficulty of egocentric activity recognition is that the camera wearer is only partially visible in the images, mainly through the hands. As a consequence, activity recognition must rely solely on the user's interactions with objects, other people, and the scene. Over the last seven years, there has been growing interest in analyzing human daily activities from data collected by wearable cameras [1, 5, 11, 12, 13, 15]. Most of these works model activities as a combination of egocentric features such as hand pose [1, 5], head motion and gaze [11], manipulated objects [13, 15], or all of them [2, 12]. These methods rely strongly on motion features, neglecting th...
Event boundaries play a crucial role as a pre-processing step for the detection, localization, and recognition of human activities in videos. Typically, despite their intrinsic subjectivity, temporal bounds are provided manually as input for training action recognition algorithms. However, their role in activity recognition in the domain of egocentric photostreams has so far been neglected. In this paper, we provide insights into how automatically computed boundaries can impact activity recognition results in the emerging domain of egocentric photostreams. Furthermore, we collected a new annotated dataset acquired by 15 people with a wearable photo-camera, and we used it to show the generalization capabilities of several deep-learning-based architectures to unseen users.
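
As a rough illustration of the kind of automatic boundary computation discussed above (not the method evaluated in the paper), event boundaries in a photostream can be proposed wherever the visual features of consecutive frames differ by more than a threshold. The function name, feature choice, and threshold below are all assumptions.

```python
import numpy as np

def detect_event_boundaries(features, threshold=0.5):
    """Propose event boundaries in a photostream.

    features: (n_frames, d) array of per-frame descriptors,
    e.g. CNN embeddings. Returns the indices of frames that
    start a new event. Illustrative sketch only.
    """
    # Normalize rows so the dot product below is a cosine similarity.
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    # Cosine distance between each frame and its predecessor.
    dist = 1.0 - np.sum(f[1:] * f[:-1], axis=1)
    # Declare a boundary where the visual change is large.
    return np.flatnonzero(dist > threshold) + 1
```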
In this paper, we give an overview of the emerging trend of the digitized self, focusing on visual lifelogging through wearable cameras: continuously recording one's life from a first-person view by wearing a camera that passively captures images. On the one hand, visual lifelogging has opened the door to a large number of applications, including health; on the other, it has raised new challenges in data analysis as well as new ethical concerns. While increasing efforts are currently being devoted to exploiting lifelogging data to improve personal well-being, we believe there are still many interesting applications to explore, ranging from tourism to the digitization of human behavior.
Semantic image retrieval from large amounts of egocentric visual data requires leveraging powerful techniques to bridge the semantic gap. This paper introduces LEMoRe, a Lifelog Engine for Moments Retrieval, developed in the context of the Lifelog Semantic Access Task (LSAT) of the NTCIR-12 challenge, and discusses how its performance varies across different trials. LEMoRe integrates classical image descriptors with high-level semantic concepts extracted by Convolutional Neural Networks (CNNs), powered by a graphical user interface that uses natural language processing. Although this is only a first attempt at interactive image retrieval from large egocentric datasets, and there is ample room for improvement in both the system components and the user interface, the structure of the system itself and the way its components cooperate are very promising.
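
A minimal sketch of the late-fusion idea behind such a system, under stated assumptions: the descriptors, CNN concept scores, fusion weight `alpha`, and function names below are illustrative, not LEMoRe's actual implementation.

```python
import numpy as np

def cosine_similarity(query, matrix):
    """Cosine similarity between one query vector and each row of matrix."""
    q = query / np.linalg.norm(query)
    m = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)
    return m @ q

def rank_images(query_desc, query_concepts, img_descs, img_concepts, alpha=0.5):
    """Rank lifelog images by fusing a classical descriptor cue with a
    CNN semantic-concept cue (hypothetical weighting scheme)."""
    scores = (alpha * cosine_similarity(query_desc, img_descs)
              + (1 - alpha) * cosine_similarity(query_concepts, img_concepts))
    return np.argsort(-scores)  # image indices, best match first
```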
We live in a society where the large majority of the population has a camera-equipped smartphone. In addition, hard drives and cloud storage keep getting cheaper, leading to tremendous growth in stored personal photos. Unlike photo collections captured by a digital camera, which are typically pre-processed by the user, who organizes them into event-related folders, smartphone pictures are automatically stored in the cloud. As a consequence, photo collections captured by a smartphone are highly unstructured, and because smartphones are ubiquitous, they present larger variability than pictures captured by a digital camera. To address the need to organize large smartphone photo collections automatically, we propose a new methodology for hierarchical photo organization into topics and topic-related categories. Our approach successfully estimates latent topics in the pictures by applying probabilistic Latent Semantic Analysis, and automatically assigns a name to ...
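
Since the abstract names probabilistic Latent Semantic Analysis explicitly, here is a bare-bones EM implementation of pLSA on a photo-by-concept count matrix. This is a generic textbook sketch, not the paper's pipeline, and the variable names are assumptions.

```python
import numpy as np

def plsa(counts, n_topics, n_iter=50, seed=0):
    """Bare-bones pLSA fitted with EM on a (docs x words) count matrix.

    Here "documents" would be photos and "words" visual or semantic
    concepts detected in them. Returns P(z|d) and P(w|z).
    """
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    p_z_d = rng.random((n_docs, n_topics))   # P(topic | photo)
    p_w_z = rng.random((n_topics, n_words))  # P(concept | topic)
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # E-step: responsibilities P(z|d,w), shape (docs, topics, words).
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]
        resp = joint / (joint.sum(axis=1, keepdims=True) + 1e-12)
        # M-step: re-estimate both distributions from expected counts.
        weighted = counts[:, None, :] * resp
        p_w_z = weighted.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + 1e-12
        p_z_d = weighted.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + 1e-12
    return p_z_d, p_w_z
```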
This paper presents an algorithm for tree-based representation of single images and its applications to segmentation and filtering with depth. In our recent work, we addressed the problem of segmentation with depth by incorporating depth-ordering information into a region-merging algorithm and by reasoning about depth relations through a graph model. In this paper, we extend that work with a two-fold contribution. First, we propose to model each pixel statistically by its probability distribution instead of deterministically by its color value. Second, we propose a depth-oriented filter, which makes it possible to remove foreground regions and replace them with a plausible background. Experimental results are satisfactory.
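
As a toy analogue of the depth-oriented filter described above, one can mask out foreground regions and fill them with a plausible background using off-the-shelf inpainting. The paper's filter instead reasons about depth relations, so the OpenCV call below is a stand-in, not the authors' method.

```python
import cv2
import numpy as np

def remove_foreground(image, foreground_mask):
    """Remove foreground regions and fill them with plausible background.

    image: 8-bit BGR image; foreground_mask: nonzero where a region
    should be removed. Diffusion-based inpainting is used here as a
    simple stand-in for the depth-oriented filter.
    """
    mask = (foreground_mask > 0).astype(np.uint8) * 255
    # 5-pixel inpainting radius, Telea's fast marching method.
    return cv2.inpaint(image, mask, 5, cv2.INPAINT_TELEA)
```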