Showing 1–16 of 16 results for author: Shasha, D

Searching in archive cs.
  1. arXiv:2405.13271  [pdf, other]

    cs.PL cs.LO

    Verifying Lock-free Search Structure Templates

    Authors: Nisarg Patel, Dennis Shasha, Thomas Wies

    Abstract: We present and verify template algorithms for lock-free concurrent search structures that cover a broad range of existing implementations based on lists and skiplists. Our linearizability proofs are fully mechanized in the concurrent separation logic Iris. The proofs are modular and cover the broader design space of the underlying algorithms by parameterizing the verification over aspects such as…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: Extended version of an article to appear in ECOOP'24

  2. arXiv:2312.05456  [pdf, other]

    cs.LG physics.soc-ph q-bio.PE

    On the calibration of compartmental epidemiological models

    Authors: Nikunj Gupta, Anh Mai, Azza Abouzied, Dennis Shasha

    Abstract: Epidemiological compartmental models are useful for understanding infectious disease propagation and directing public health policy decisions. Calibration of these models is an important step in offering accurate forecasts of disease dynamics and the effectiveness of interventions. In this study, we present an overview of calibrating strategies that can be employed, including several optimization…

    Submitted 8 December, 2023; originally announced December 2023.

  3. arXiv:2301.12802  [pdf, other]

    cs.LG

    Planning Multiple Epidemic Interventions with Reinforcement Learning

    Authors: Anh Mai, Nikunj Gupta, Azza Abouzied, Dennis Shasha

    Abstract: Combating an epidemic entails finding a plan that describes when and how to apply different interventions, such as mask-wearing mandates, vaccinations, school or workplace closures. An optimal plan will curb an epidemic with minimal loss of life, disease burden, and economic cost. Finding an optimal plan is an intractable computational problem in realistic settings. Policy-makers, however, would g…

    Submitted 7 June, 2023; v1 submitted 30 January, 2023; originally announced January 2023.

  4. arXiv:2212.07876  [pdf, other]

    cs.LG

    Forgetful Forests: high performance learning data structures for streaming data under concept drift

    Authors: Zhehu Yuan, Yinqi Sun, Dennis Shasha

    Abstract: Database research can help machine learning performance in many ways. One way is to design better data structures. This paper combines the use of incremental computation and sequential and probabilistic filtering to enable "forgetful" tree-based learning algorithms to cope with concept drift data (i.e., data whose function from input to classification changes over time). The forgetful algorithms…

    Submitted 15 December, 2022; originally announced December 2022.

    Comments: 21 pages, 12 Figures, 7 algorithms

    ACM Class: E.1; B.8.m

  5. arXiv:2112.08851  [pdf, other]

    stat.ML cs.CV cs.LG

    Classification Under Ambiguity: When Is Average-K Better Than Top-K?

    Authors: Titouan Lorieul, Alexis Joly, Dennis Shasha

    Abstract: When many labels are possible, choosing a single one can lead to low precision. A common alternative, referred to as top-$K$ classification, is to choose some number $K$ (commonly around 5) and to return the $K$ labels with the highest scores. Unfortunately, for unambiguous cases, $K>1$ is too many and, for very ambiguous cases, $K \leq 5$ (for example) can be too small. An alternative sensible st…

    Submitted 16 December, 2021; originally announced December 2021.

    Comments: 53 pages, 21 figures

  6. arXiv:2109.05631  [pdf, other]

    cs.PL cs.LO

    Verifying Concurrent Multicopy Search Structures

    Authors: Nisarg Patel, Siddharth Krishna, Dennis Shasha, Thomas Wies

    Abstract: Multicopy search structures such as log-structured merge (LSM) trees are optimized for high insert/update/delete (collectively known as upsert) performance. In such data structures, an upsert on key $k$, which adds $(k,v)$ where $v$ can be a value or a tombstone, is added to the root node even if $k$ is already present in other nodes. Thus there may be multiple copies of $k$ in the search structur…

    Submitted 12 September, 2021; originally announced September 2021.

    Comments: Extended version of an article to appear in OOPSLA'21

  7. arXiv:2004.08368  [pdf, other]

    cs.RO cs.CV

    Robotic Room Traversal using Optical Range Finding

    Authors: Cole Smith, Eric Lin, Dennis Shasha

    Abstract: Consider the goal of visiting every part of a room that is not blocked by obstacles. Doing so efficiently requires both sensors and planning. Our findings suggest a method of inexpensive optical range finding for robotic room traversal. Our room traversal algorithm relies upon the approximate distance from the robot to the nearest obstacle in 360 degrees. We then choose the path with the furthest…

    Submitted 17 April, 2020; originally announced April 2020.

    Comments: Technical Report TR2018-991

    Report number: TR2018-991

  8. BugDoc: Algorithms to Debug Computational Processes

    Authors: Raoni Lourenço, Juliana Freire, Dennis Shasha

    Abstract: Data analysis for scientific experiments and enterprises, large-scale simulations, and machine learning tasks all entail the use of complex computational pipelines to reach quantitative and qualitative conclusions. If some of the activities in a pipeline produce erroneous outputs, the pipeline may fail to execute or produce incorrect results. Inferring the root cause(s) of such failures is challen…

    Submitted 12 April, 2020; originally announced April 2020.

    Comments: To appear in SIGMOD 2020. arXiv admin note: text overlap with arXiv:2002.04640

  9. arXiv:2003.11546  [pdf, other]

    cs.DB

    MultiRI: Fast Subgraph Matching in Labeled Multigraphs

    Authors: Giovanni Micale, Vincenzo Bonnici, Alfredo Ferro, Dennis Shasha, Rosalba Giugno, Alfredo Pulvirenti

    Abstract: The Subgraph Matching (SM) problem consists of finding all the embeddings of a given small graph, called the query, into a large graph, called the target. The SM problem has been widely studied for simple graphs, i.e. graphs where there is exactly one edge between two nodes and nodes have single labels, but few approaches have been devised for labeled multigraphs, i.e. graphs having possibly multi…

    Submitted 25 March, 2020; originally announced March 2020.

    Comments: Submitted for publication in January 2019

    ACM Class: E.1; F.2.0

  10. arXiv:2002.04640  [pdf, other]

    cs.LG cs.DB stat.ML

    Debugging Machine Learning Pipelines

    Authors: Raoni Lourenço, Juliana Freire, Dennis Shasha

    Abstract: Machine learning tasks entail the use of complex computational pipelines to reach quantitative and qualitative conclusions. If some of the activities in a pipeline produce erroneous or uninformative outputs, the pipeline may fail or produce incorrect results. Inferring the root cause of failures and unexpected behavior is challenging, usually requiring much human thought, and is both time-consumin…

    Submitted 11 February, 2020; originally announced February 2020.

    Comments: 10 pages

    Journal ref: Proceedings of the 3rd International Workshop on Data Management for End-to-End Machine Learning, June 2019, Article No.: 3

  11. arXiv:1711.03272  [pdf, other]

    cs.LO cs.DS cs.PL

    Go with the Flow: Compositional Abstractions for Concurrent Data Structures (Extended Version)

    Authors: Siddharth Krishna, Dennis Shasha, Thomas Wies

    Abstract: Concurrent separation logics have helped to significantly simplify correctness proofs for concurrent data structures. However, a recurring problem in such proofs is that data structure abstractions that work well in the sequential setting are much harder to reason about in a concurrent setting due to complex sharing and overlays. To solve this problem, we propose a novel approach to abstracting re…

    Submitted 9 November, 2017; originally announced November 2017.

    Comments: This is an extended version of a POPL 2018 conference paper

  12. A Collaborative Approach to Computational Reproducibility

    Authors: Fernando Chirigati, Rebecca Capone, Dennis Shasha, Remi Rampin, Juliana Freire

    Abstract: Although a standard in natural science, reproducibility has been only episodically applied in experimental computer science. Scientific papers often present a large number of tables, plots and pictures that summarize the obtained results, but then loosely describe the steps taken to derive them. Not only can the methods and the implementation be complex, but also their configuration may require se…

    Submitted 9 August, 2017; originally announced September 2017.

    Journal ref: The Journal of Information Systems, Volume 59, Pages 95-97, ISSN 0306-4379 (2016)

  13. arXiv:1708.06425  [pdf, other]

    cs.LG cs.AI math.ST

    SafePredict: A Meta-Algorithm for Machine Learning That Uses Refusals to Guarantee Correctness

    Authors: Mustafa A. Kocak, David Ramirez, Elza Erkip, Dennis E. Shasha

    Abstract: SafePredict is a novel meta-algorithm that works with any base prediction algorithm for online data to guarantee an arbitrarily chosen correctness rate, $1-ε$, by allowing refusals. Allowing refusals means that the meta-algorithm may refuse to emit a prediction produced by the base algorithm on occasion so that the error rate on non-refused predictions does not exceed $ε$. The SafePredict error bo…

    Submitted 8 November, 2017; v1 submitted 21 August, 2017; originally announced August 2017.

    Comments: Submitted to IEEE Transactions on Pattern Analysis and Machine Intelligence, August 2017

  14. arXiv:1703.02638  [pdf, other]

    cs.DB

    Constellation Queries over Big Data

    Authors: Fabio Porto, Amir Khatibi, João R. Nobre, Eduardo Ogasawara, Patrick Valduriez, Dennis Shasha

    Abstract: A geometrical pattern is a set of points with all pairwise distances (or, more generally, relative distances) specified. Finding matches to such patterns has applications to spatial data in seismic, astronomical, and transportation contexts. For example, a particularly interesting geometric pattern in astronomy is the Einstein cross, which is an astronomical phenomenon in which a single quasar is…

    Submitted 7 March, 2017; originally announced March 2017.

    ACM Class: H.2.4; H.2.8; H.3.1

  15. arXiv:1401.2000  [pdf, other]

    cs.CE cond-mat.stat-mech physics.comp-ph

    A model project for reproducible papers: critical temperature for the Ising model on a square lattice

    Authors: M. Dolfi, J. Gukelberger, A. Hehn, J. Imriška, K. Pakrouski, T. F. Rønnow, M. Troyer, I. Zintchenko, F. Chirigati, J. Freire, D. Shasha

    Abstract: In this paper we present a simple, yet typical simulation in statistical physics, consisting of large scale Monte Carlo simulations followed by an involved statistical analysis of the results. The purpose is to provide an example publication to explore tools for writing reproducible papers. The simulation estimates the critical temperature where the Ising model on the square lattice becomes magnet…

    Submitted 9 January, 2014; originally announced January 2014.

    Comments: Authors are listed in alphabetical order by institution and name. 5 pages, 4 figures

  16. arXiv:1304.1835  [pdf, other]

    cs.PL

    Locality Optimization for Data Parallel Programs

    Authors: Eric Hielscher, Alex Rubinsteyn, Dennis Shasha

    Abstract: Productivity languages such as NumPy and Matlab make it much easier to implement data-intensive numerical algorithms. However, these languages can be intolerably slow for programs that don't map well to their built-in primitives. In this paper, we discuss locality optimizations for our system Parakeet, a just-in-time compiler and runtime system for an array-oriented subset of Python. Parakeet dyna…

    Submitted 5 April, 2013; originally announced April 2013.

    Report number: NYU CS TR2013-955