ACM Transactions on Multimedia Computing, Communications, and Applications
We consider the task of temporal human action localization in lifestyle vlogs. We introduce a novel dataset consisting of manual annotations of temporal localization for 13,000 narrated actions in 1,200 video clips. We present an extensive analysis of this data, which allows us to better understand how the language and visual modalities interact throughout the videos. We propose a simple yet effective method to localize the narrated actions based on their expected duration. Through several experiments and analyses, we show that our method provides information complementary to previous methods and leads to improvements over prior work on the task of temporal action localization.
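The duration-based localization idea described above could be sketched roughly as follows. This is a hypothetical illustration, not the paper's implementation: the verb list, the duration values, and the heuristic of centering a window on the narration time are all assumptions made for the example.

```python
# Hypothetical sketch of duration-prior action localization.
# Assumption: each action is narrated near the moment it happens,
# so we center a window of the verb's expected duration on that time.
from dataclasses import dataclass

@dataclass
class NarratedAction:
    verb: str              # e.g. "chop"
    narration_time: float  # second in the video at which the action is narrated

# Illustrative per-verb expected durations (seconds); values are invented.
EXPECTED_DURATION = {"chop": 8.0, "pour": 4.0, "stir": 10.0}
DEFAULT_DURATION = 6.0

def localize(action: NarratedAction) -> tuple[float, float]:
    """Predict a (start, end) interval centered on the narration time."""
    d = EXPECTED_DURATION.get(action.verb, DEFAULT_DURATION)
    start = max(0.0, action.narration_time - d / 2)
    return (start, start + d)
```

For example, an action narrated as "chop" at second 20 would be localized to the interval (16.0, 24.0) under these assumed priors.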
Events and situations unfold quickly in our modern world, generating streams of Internet articles, photos, and videos. The ability to automatically sort through this wealth of information would allow us to identify which pieces of information are most important and credible, and how trends unfold over time. In this paper, we present the first piece of a system to sort through large amounts of political data from the web. Our system takes in raw multimodal input (e.g., text, images, and videos) and generates a knowledge graph connecting entities, events, and relations in meaningful ways. This work is part of the DARPA-funded Active Interpretation of Disparate Alternatives (AIDA) project, which aims to automatically build a knowledge base that can be queried to strategically generate hypotheses about different aspects of an event. We are participating in this project as a TA1 team, building the first step of the overall system. Our approach is outlined in Figure 1.
Understanding current world events in real time involves sifting through news articles, tweets, photos, and videos from many different perspectives. The goal of the DARPA-funded AIDA project is to automate much of this process, building a knowledge base that can be queried to strategically generate hypotheses about different aspects of an event. We are participating in this project as a TA1 team, building the first step of the overall system. Given raw multimodal input (e.g., text, images, video), our goal is to generate a knowledge graph with entities, events, and relations. Figure 1 shows an overview of our pipeline. The first stage is pre-processing: translating all the raw documents, as well as transcribing and translating audio and video data. All the translated information is input to our main processing module, which extracts entities, events, and relations. Entities are extracted from both text and video data. The final stage of the pipeline is output generation.
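The pipeline stages named in the abstract (pre-processing, extraction, output generation) could be sketched as a skeleton like the one below. The stage names follow the abstract; every function body is a placeholder assumption, and the example triple is invented purely to show the output shape.

```python
# Illustrative three-stage pipeline skeleton; not the AIDA TA1 system itself.

def preprocess(doc: dict) -> dict:
    """Stage 1 (stub): translation and audio/video transcription would run here."""
    return {"text": doc.get("text", ""), "frames": doc.get("frames", [])}

def extract(doc: dict) -> list[tuple[str, str, str]]:
    """Stage 2 (stub): entity/event/relation extraction would run here.
    Returns (subject, relation, object) triples; this one is invented."""
    return [("protester", "attends", "rally")]

def build_graph(docs: list[dict]) -> list[tuple[str, str, str]]:
    """Stage 3: collect extracted triples into one knowledge-graph edge list."""
    graph: list[tuple[str, str, str]] = []
    for doc in docs:
        graph.extend(extract(preprocess(doc)))
    return graph
```

The edge-list representation here is only one possible output format; a real system would emit a richer graph with typed nodes and provenance.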
2016 IEEE 12th International Conference on Intelligent Computer Communication and Processing (ICCP), 2016
The paper introduces a novel and efficient algorithm for determining the free space in road driving assistance scenarios. The input data for the algorithm is gathered from a stereo camera and is processed as a disparity image. Each column of the disparity image is segmented based on its relative extreme points. The idea is inspired by a time-series compression article that presents a method for segmenting data measured at equal intervals of time (time series): electrocardiograms, monthly stock-exchange data, etc. The novelty of the method consists in adapting an idea from a different area of interest to an image recognition purpose. Compared to existing algorithms in the driving assistance field that share the same goal, the proposed method achieves great adaptability and linear time complexity. The adaptability of the method is worth mentioning, as it gives good results both on precise data gathered with a lidar scanner and on noisy disparity maps inferred from a stereo camera. The algorithm filters most of the measurement errors while preserving the points of interest that delimit the road, objects, or sky. Because the filtering steps preserve the data of interest, additional post-processing steps are no longer required, which keeps the running time low.
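The column-wise segmentation by relative extreme points could look roughly like the sketch below. This is a guess at the general technique described in the abstract, not the paper's algorithm: the extremum test and the treatment of column endpoints are assumptions, and real disparity data would need the noise-filtering steps the paper describes.

```python
# Hypothetical sketch: segment one disparity-image column at its local extrema.

def relative_extrema(column: list[int]) -> list[int]:
    """Indices of the column's relative extreme points (local minima/maxima).
    The endpoints are included so segments cover the whole column (assumption)."""
    idx = [0]
    for i in range(1, len(column) - 1):
        left, mid, right = column[i - 1], column[i], column[i + 1]
        if (mid > left and mid > right) or (mid < left and mid < right):
            idx.append(i)
    idx.append(len(column) - 1)
    return idx

def segment_column(column: list[int]) -> list[tuple[int, int]]:
    """Split the column into segments between consecutive extreme points.
    One pass per column, so the whole image is processed in linear time."""
    ex = relative_extrema(column)
    return [(ex[i], ex[i + 1]) for i in range(len(ex) - 1)]
```

For instance, a column of disparities [5, 3, 1, 2, 4, 4, 2] has a local minimum at index 2, giving the two segments (0, 2) and (2, 6).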
Papers by Oana Ignat