-
Iterative Occlusion-Aware Light Field Depth Estimation using 4D Geometrical Cues
Authors:
Rui Lourenço,
Lucas Thomaz,
Eduardo A. B. Silva,
Sergio M. M. Faria
Abstract:
Light field cameras and multi-camera arrays have emerged as promising solutions for accurately estimating depth by passively capturing light information. This is possible because the 3D information of a scene is embedded in the 4D light field geometry. Commonly, depth estimation methods extract this information relying on gradient information, heuristic-based optimisation models, or learning-based approaches. This paper focuses mainly on explicitly understanding and exploiting 4D geometrical cues for light field depth estimation. Thus, a novel method is proposed, based on a non-learning-based optimisation approach for depth estimation that explicitly considers surface normal accuracy and occlusion regions by utilising a fully explainable 4D geometric model of the light field. The 4D model performs depth/disparity estimation by determining the orientations and analysing the intersections of key 2D planes in 4D space, which are the images of 3D-space points in the 4D light field. Experimental results show that the proposed method outperforms both learning-based and non-learning-based state-of-the-art methods in terms of surface normal angle accuracy, achieving a Median Angle Error on planar surfaces, on average, 26.3% lower than the state-of-the-art, and still being competitive with state-of-the-art methods in terms of Mean Squared Error $\times$ 100 and Badpix 0.07.
Submitted 4 March, 2024;
originally announced March 2024.
-
Cloud Classification with Unsupervised Deep Learning
Authors:
Takuya Kurihana,
Ian Foster,
Rebecca Willett,
Sydney Jenkins,
Kathryn Koenig,
Ruby Werman,
Ricardo Barros Lourenco,
Casper Neo,
Elisabeth Moyer
Abstract:
We present a framework for cloud characterization that leverages modern unsupervised deep learning technologies. While previous neural network-based cloud classification models have used supervised learning methods, unsupervised learning allows us to avoid restricting the model to artificial categories based on historical cloud classification schemes and enables the discovery of novel, more detailed classifications. Our framework learns cloud features directly from radiance data produced by NASA's Moderate Resolution Imaging Spectroradiometer (MODIS) satellite instrument, deriving cloud characteristics from millions of images without relying on pre-defined cloud types during the training process. We present preliminary results showing that our method extracts physically relevant information from radiance data and produces meaningful cloud classes.
Submitted 30 September, 2022;
originally announced September 2022.
-
AlphaD3M: Machine Learning Pipeline Synthesis
Authors:
Iddo Drori,
Yamuna Krishnamurthy,
Remi Rampin,
Raoni de Paula Lourenco,
Jorge Piazentin Ono,
Kyunghyun Cho,
Claudio Silva,
Juliana Freire
Abstract:
We introduce AlphaD3M, an automatic machine learning (AutoML) system based on meta reinforcement learning using sequence models with self play. AlphaD3M is based on edit operations performed over machine learning pipeline primitives providing explainability. We compare AlphaD3M with state-of-the-art AutoML systems: Autosklearn, Autostacker, and TPOT, on OpenML datasets. AlphaD3M achieves competitive performance while being an order of magnitude faster, reducing computation time from hours to minutes, and is explainable by design.
Submitted 3 November, 2021;
originally announced November 2021.
-
DataExposer: Exposing Disconnect between Data and Systems
Authors:
Sainyam Galhotra,
Anna Fariha,
Raoni Lourenço,
Juliana Freire,
Alexandra Meliou,
Divesh Srivastava
Abstract:
As data is a central component of many modern systems, the cause of a system malfunction may reside in the data, and, specifically, particular properties of the data. For example, a health-monitoring system that is designed under the assumption that weight is reported in imperial units (lbs) will malfunction when encountering weight reported in metric units (kilograms). Similar to software debugging, which aims to find bugs in the mechanism (source code or runtime conditions), our goal is to debug the data to identify potential sources of disconnect between the assumptions about the data and the systems that operate on that data. Specifically, we seek which properties of the data cause a data-driven system to malfunction. We propose DataExposer, a framework to identify data properties, called profiles, that are the root causes of performance degradation or failure of a system that operates on the data. Such identification is necessary to repair the system and resolve the disconnect between data and system. Our technique is based on causal reasoning through interventions: when a system malfunctions for a dataset, DataExposer alters the data profiles and observes changes in the system's behavior due to the alteration. Unlike statistical observational analysis that reports mere correlations, DataExposer reports causally verified root causes, in terms of data profiles, of the system malfunction. We empirically evaluate DataExposer on three real-world and several synthetic data-driven systems that fail on datasets due to a diverse set of reasons. In all cases, DataExposer identifies the root causes precisely while requiring orders of magnitude fewer interventions than prior techniques.
Submitted 12 May, 2021;
originally announced May 2021.
-
BugDoc: Algorithms to Debug Computational Processes
Authors:
Raoni Lourenço,
Juliana Freire,
Dennis Shasha
Abstract:
Data analysis for scientific experiments and enterprises, large-scale simulations, and machine learning tasks all entail the use of complex computational pipelines to reach quantitative and qualitative conclusions. If some of the activities in a pipeline produce erroneous outputs, the pipeline may fail to execute or produce incorrect results. Inferring the root cause(s) of such failures is challenging, usually requiring time and much human thought, while still being error-prone. We propose a new approach that makes use of iteration and provenance to automatically infer the root causes and derive succinct explanations of failures. Through a detailed experimental evaluation, we assess the cost, precision, and recall of our approach compared to the state of the art. Our experimental data and processing software are available for use, reproducibility, and enhancement.
Submitted 12 April, 2020;
originally announced April 2020.
-
Debugging Machine Learning Pipelines
Authors:
Raoni Lourenço,
Juliana Freire,
Dennis Shasha
Abstract:
Machine learning tasks entail the use of complex computational pipelines to reach quantitative and qualitative conclusions. If some of the activities in a pipeline produce erroneous or uninformative outputs, the pipeline may fail or produce incorrect results. Inferring the root cause of failures and unexpected behavior is challenging, usually requiring much human thought, and is both time-consuming and error-prone. We propose a new approach that makes use of iteration and provenance to automatically infer the root causes and derive succinct explanations of failures. Through a detailed experimental evaluation, we assess the cost, precision, and recall of our approach compared to the state of the art. Our source code and experimental data will be available for reproducibility and enhancement.
Submitted 11 February, 2020;
originally announced February 2020.
-
Automatic Machine Learning by Pipeline Synthesis using Model-Based Reinforcement Learning and a Grammar
Authors:
Iddo Drori,
Yamuna Krishnamurthy,
Raoni Lourenco,
Remi Rampin,
Kyunghyun Cho,
Claudio Silva,
Juliana Freire
Abstract:
Automatic machine learning is an important problem in the forefront of machine learning. The strongest AutoML systems are based on neural networks, evolutionary algorithms, and Bayesian optimization. Recently AlphaD3M reached state-of-the-art results with an order of magnitude speedup using reinforcement learning with self-play. In this work we extend AlphaD3M by using a pipeline grammar and a pre-trained model which generalizes from many different datasets and similar tasks. Our results demonstrate improved performance compared with our earlier work and existing methods on AutoML benchmark datasets for classification and regression tasks. In the spirit of reproducible research we make our data, models, and code publicly available.
Submitted 24 May, 2019;
originally announced May 2019.
-
Running the Network Harder: Connection Provisioning under Resource Crunch
Authors:
Rafael B. R. Lourenco,
Massimo Tornatore,
Charles U. Martel,
Biswanath Mukherjee
Abstract:
Traditionally, networks operate at a small fraction of their capacities; however, recent technologies, such as Software-Defined Networking, may let operators run their networks harder (i.e., at higher utilization levels). Higher utilization can increase the network operator's revenue, but this gain comes at a cost: daily traffic fluctuations and failures might occasionally overload the network. We call such situations Resource Crunch. Dealing with Resource Crunch requires certain types of flexibility in the system. We focus on scenarios with flexible bandwidth requirements, e.g., some connections can tolerate lower bandwidth allocation. This may free capacity to provision new requests that would otherwise be blocked. For that, the network operator needs to make an informed decision, since reducing the bandwidth of a high-paying connection to allocate a low-value connection is not sensible. We propose a strategy to decide whether or not to provision a request (and which other connections to degrade) focusing on maximizing profits during Resource Crunch. To address this problem, we use an abstraction of the network state, called a Connection Adjacency Graph (CAG). We propose PROVISIONER, which integrates our CAG solution with an efficient Linear Program (LP). We compare our method to existing greedy approaches and to LP-only solutions, and show that our method outperforms them during Resource Crunch.
Submitted 25 April, 2018; v1 submitted 29 September, 2017;
originally announced October 2017.