User profiles for Viorica Patraucean
Viorica Patraucean, Google DeepMind. Verified email at google.com. Cited by 2213
Understanding real world indoor scenes with synthetic data
A Handa, V Patraucean… - Proceedings of the …, 2016 - openaccess.thecvf.com
Scene understanding is a prerequisite to many high level tasks for any automated intelligent
machine operating in real world environments. Recent attempts with supervised learning …
Spatio-temporal video autoencoder with differentiable memory
We describe a new spatio-temporal video autoencoder, based on a classic spatial image
autoencoder and a novel nested temporal autoencoder. The temporal encoder is represented …
State of research in automatic as-built modelling
Building Information Models (BIMs) are becoming the official standard in the construction
industry for encoding, reusing, and exchanging information about structural assets. …
Active acquisition for multimodal temporal data: A challenging decision-making task
We introduce a challenging decision-making task that we call active acquisition for
multimodal temporal data (A2MT). In many real-world scenarios, input features are not readily …
Perception test: A diagnostic benchmark for multimodal video models
We propose a novel multimodal video benchmark, the Perception Test, to evaluate the perception
and reasoning skills of pre-trained multimodal models (e.g. Flamingo, BEiT-3, or GPT-4). …
Broaden your views for self-supervised video learning
…, M Malinowski, V Pătrăucean… - Proceedings of the …, 2021 - openaccess.thecvf.com
Most successful self-supervised learning methods are trained to align the representations of
two independent views from the data. State-of-the-art methods in video are inspired by …
A simple recipe for contrastively pre-training video-first encoders beyond 16 frames
…, J Chiu, J Heyward, V Patraucean… - Proceedings of the …, 2024 - openaccess.thecvf.com
Understanding long real-world videos requires modeling of long-range visual dependencies.
To this end we explore video-first architectures building on the common paradigm of …
A parameterless line segment and elliptical arc detector with enhanced ellipse fitting
We propose a combined line segment and elliptical arc detector, which formally guarantees
the control of the number of false positives and requires no parameter tuning. The accuracy …
gvnn: Neural network library for geometric computer vision
We introduce gvnn, a neural network library in Torch aimed towards bridging the gap between
classic geometric computer vision and deep learning. Inspired by the recent success of …
SceneNet: An annotated model generator for indoor scene understanding
We introduce SceneNet, a framework for generating high-quality annotated 3D scenes to
aid indoor scene understanding. SceneNet leverages manually-annotated datasets of real …