-
Orientable and negative orientable sequences
Authors:
Chris J Mitchell,
Peter R Wild
Abstract:
Analogously to de Bruijn sequences, orientable sequences have application in automatic position-location applications and, until recently, studies of these sequences focused on the binary case. In recent work by Alhakim et al., a range of methods of construction were described for orientable sequences over arbitrary finite alphabets; some of these methods involve using negative orientable sequences as a building block. In this paper we describe three techniques for generating such negative orientable sequences, as well as upper bounds on their period. We then go on to show how these negative orientable sequences can be used to generate orientable sequences with period close to the maximum possible for every non-binary alphabet size and for every tuple length. In doing so we use two closely related approaches described by Alhakim et al.
Submitted 1 September, 2024;
originally announced September 2024.
-
Report on the Advanced Linear Collider Study Group (ALEGRO) Workshop 2024
Authors:
J. Vieira,
B. Cros,
P. Muggli,
I. A. Andriyash,
O. Apsimon,
M. Backhouse,
C. Benedetti,
S. S. Bulanov,
A. Caldwell,
Min Chen,
V. Cilento,
S. Corde,
R. D'Arcy,
S. Diederichs,
E. Ericson,
E. Esarey,
J. Farmer,
L. Fedeli,
A. Formenti,
B. Foster,
M. Garten,
C. G. R. Geddes,
T. Grismayer,
M. J. Hogan,
S. Hooker
, et al. (19 additional authors not shown)
Abstract:
The workshop focused on the application of ANAs to particle physics keeping in mind the ultimate goal of a collider at the energy frontier (10\,TeV, e$^+$/e$^-$, e$^-$/e$^-$, or $γγ$). The development of ANAs is conducted at universities and national laboratories worldwide. The community is thematically broad and diverse, in particular since lasers suitable for ANA research (multi-hundred-terawatt peak power, a few tens of femtosecond-long pulses) and acceleration of electrons to hundreds of mega electron volts to multi giga electron volts became commercially available. The community spans several continents (Europe, America, Asia), including more than 62 laboratories in more than 20 countries. It is among the missions of the ICFA-ANA panel to feature the amazing progress made with ANAs, to provide international coordination and to foster international collaborations towards a future HEP collider. The scope of this edition of the workshop was to discuss the recent progress and necessary steps towards realizing a linear collider for particle physics based on novel-accelerator technologies (laser or beam driven in plasma or structures). Updates on the relevant aspects of the European Strategy for Particle Physics (ESPP) Roadmap Process as well as of the P5 (in the US) were presented, and ample time was dedicated to discussions. The major outcome of the workshop is the decision for ALEGRO to coordinate efforts in Europe, in the US, and in Asia towards a pre-CDR for an ANA-based, 10\,TeV CM collider. The goal of this coordination is to lead to a funding proposal to be submitted to both EU and EU/US funding agencies. This document presents a summary of the workshop, as seen by the co-chairs, as well as short 'one-pagers' written by the presenters at the workshop.
Submitted 15 August, 2024; v1 submitted 6 August, 2024;
originally announced August 2024.
-
Orientable sequences over non-binary alphabets
Authors:
Abbas Alhakim,
Chris J. Mitchell,
Janusz Szmidt,
Peter R. Wild
Abstract:
We describe new, simple, recursive methods of construction for orientable sequences over an arbitrary finite alphabet, i.e. periodic sequences in which any sub-sequence of n consecutive elements occurs at most once in a period in either direction. In particular we establish how two variants of a generalised Lempel homomorphism can be used to recursively construct such sequences, generalising previous work on the binary case. We also derive an upper bound on the period of an orientable sequence.
Submitted 22 August, 2024; v1 submitted 20 July, 2024;
originally announced July 2024.
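The definition quoted in the abstract above (any sub-sequence of n consecutive elements occurs at most once in a period in either direction) can be illustrated with a brute-force check. The sketch below is not taken from the paper: the function name `is_orientable` is hypothetical, and it assumes the usual reading of the definition, under which a window that equals its own reversal also violates the condition.

```python
def is_orientable(seq, n):
    """Brute-force test of the orientability condition for one period `seq`.

    Assumed reading of the definition: every cyclic window of length n must
    differ from every other window and from the reversal of every window,
    including its own reversal (so palindromic windows are rejected).
    """
    m = len(seq)
    seen = set()
    for i in range(m):
        # Window of n consecutive elements, wrapping around the period.
        w = tuple(seq[(i + j) % m] for j in range(n))
        r = w[::-1]
        if w in seen or r in seen or w == r:
            return False
        seen.add(w)
    return True


if __name__ == "__main__":
    print(is_orientable([0, 1, 2], 2))     # windows 01, 12, 20
    print(is_orientable([0, 1, 0, 2], 2))  # 01 and 10 are reversals
    print(is_orientable([0, 0, 1, 2], 2))  # 00 is its own reversal
```

The check costs on the order of m windows of length n per period, which is only practical for small examples; the recursive constructions described in the abstract are what make long sequences obtainable.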
-
Parameter Estimation and Identifiability in Kinetic Flux Profiling Models of Metabolism
Authors:
Breanna Guppy,
Colleen Mitchell,
Eric Taylor
Abstract:
Metabolic fluxes are the rates of life-sustaining chemical reactions within a cell and metabolites are the components. Determining the changes in these fluxes is crucial to understanding diseases with metabolic causes and consequences. Kinetic flux profiling (KFP) is a method for estimating flux that utilizes data from isotope tracing experiments. In these experiments, the isotope-labeled nutrient is metabolized through a pathway and integrated into the downstream metabolite pools. Measurements of proportion labeled for each metabolite in the pathway are taken at multiple time points and used to fit an ordinary differential equations model with fluxes as parameters. We begin by generalizing the process of converting diagrams of metabolic pathways into mathematical models composed of differential equations and algebraic constraints. The scaled differential equations for proportions of unlabeled metabolite contain parameters related to the metabolic fluxes in the pathway. We investigate flux parameter identifiability given data collected only at the steady state of the differential equation. Next, we give criteria for valid parameter estimations in the case of a large separation of timescales with fast-slow analysis. Bayesian parameter estimation on simulated data from KFP experiments containing both irreversible and reversible reactions illustrates the accuracy and reliability of flux estimations. These analyses provide constraints that serve as guidelines for the design of KFP experiments to estimate metabolic fluxes.
Submitted 11 July, 2024;
originally announced July 2024.
-
Integrity-protecting block cipher modes -- Untangling a tangled web
Authors:
Chris J Mitchell
Abstract:
This paper re-examines the security of three related block cipher modes of operation designed to provide authenticated encryption. These modes, known as PES-PCBC, IOBC and EPBC, were all proposed in the mid-1990s. However, analyses of security of the latter two modes were published more recently. In each case one or more papers describing security issues with the schemes were eventually published, although a flaw in one of these analyses (of EPBC) was subsequently discovered - this means that until now EPBC had no known major issues. This paper establishes that, despite this, all three schemes possess defects which should prevent their use - especially as there are a number of efficient alternative schemes possessing proofs of security.
Submitted 17 June, 2024; v1 submitted 6 March, 2024;
originally announced March 2024.
-
Synthesizing Particle-in-Cell Simulations Through Learning and GPU Computing for Hybrid Particle Accelerator Beamlines
Authors:
Ryan T. Sandberg,
Remi Lehe,
Chad E. Mitchell,
Marco Garten,
Andrew Myers,
Ji Qiang,
Jean-Luc Vay,
Axel Huebl
Abstract:
Particle accelerator modeling is an important field of research and development, essential to investigating, designing and operating some of the most complex scientific devices ever built. Kinetic simulations of relativistic, charged particle beams and advanced plasma accelerator elements are often performed with high-fidelity particle-in-cell simulations, some of which fill the largest GPU supercomputers. Start-to-end modeling of a particle accelerator includes many elements and it is desirable to integrate and model advanced accelerator elements fast, in effective models. Traditionally, analytical and reduced-physics models fill this role. The vast data from high-fidelity simulations and power of GPU-accelerated computation open a new opportunity to complement traditional modeling without approximations: surrogate modeling through machine learning. In this paper, we implement, present and benchmark such a data-driven workflow, synthesising a fully GPU-accelerated, conventional-surrogate simulation for hybrid particle accelerator beamlines.
Submitted 30 April, 2024; v1 submitted 27 February, 2024;
originally announced February 2024.
-
De-amplifying Bias from Differential Privacy in Language Model Fine-tuning
Authors:
Sanjari Srivastava,
Piotr Mardziel,
Zhikhun Zhang,
Archana Ahlawat,
Anupam Datta,
John C Mitchell
Abstract:
Fairness and privacy are two important values machine learning (ML) practitioners often seek to operationalize in models. Fairness aims to reduce model bias for social/demographic sub-groups. Privacy via differential privacy (DP) mechanisms, on the other hand, limits the impact of any individual's training data on the resulting model. The trade-offs between privacy and fairness goals of trustworthy ML pose a challenge to those wishing to address both. We show that DP amplifies gender, racial, and religious bias when fine-tuning large language models (LLMs), producing models more biased than ones fine-tuned without DP. We find the cause of the amplification to be a disparity in convergence of gradients across sub-groups. Through the case of binary gender bias, we demonstrate that Counterfactual Data Augmentation (CDA), a known method for addressing bias, also mitigates bias amplification by DP. As a consequence, DP and CDA together can be used to fine-tune models while maintaining both fairness and privacy.
Submitted 6 February, 2024;
originally announced February 2024.
-
Serberus: Protecting Cryptographic Code from Spectres at Compile-Time
Authors:
Nicholas Mosier,
Hamed Nemati,
John C. Mitchell,
Caroline Trippel
Abstract:
We present Serberus, the first comprehensive mitigation for hardening constant-time (CT) code against Spectre attacks (involving the PHT, BTB, RSB, STL and/or PSF speculation primitives) on existing hardware. Serberus is based on three insights. First, some hardware control-flow integrity (CFI) protections restrict transient control-flow to the extent that it may be comprehensively considered by software analyses. Second, conformance to the accepted CT code discipline permits two code patterns that are unsafe in the post-Spectre era. Third, once these code patterns are addressed, all Spectre leakage of secrets in CT programs can be attributed to one of four classes of taint primitives--instructions that can transiently assign a secret value to a publicly-typed register. We evaluate Serberus on cryptographic primitives in the OpenSSL, Libsodium, and HACL* libraries. Serberus introduces 21.3% runtime overhead on average, compared to 24.9% for the next closest state-of-the-art software mitigation, which is less secure.
Submitted 10 September, 2023;
originally announced September 2023.
-
An explainable three dimension framework to uncover learning patterns: A unified look in variable sulci recognition
Authors:
Michail Mamalakis,
Heloise de Vareilles,
Atheer AI-Manea,
Samantha C. Mitchell,
Ingrid Arartz,
Lynn Egeland Morch-Johnsen,
Jane Garrison,
Jon Simons,
Pietro Lio,
John Suckling,
Graham Murray
Abstract:
The significant features identified in a representative subset of the dataset during the learning process of an artificial intelligence model are referred to as a 'global' explanation. Three-dimensional (3D) global explanations are crucial in neuroimaging where a complex representational space demands more than basic two-dimensional interpretations. Currently, studies in the literature lack accurate, low-complexity, and 3D global explanations in neuroimaging and beyond. To fill this gap, we develop a novel explainable artificial intelligence (XAI) 3D-Framework that provides robust, faithful, and low-complexity global explanations. We evaluated our framework on various 3D deep learning networks trained, validated, and tested on a well-annotated cohort of 596 MRI images. The focus of detection was on the presence or absence of the paracingulate sulcus, a highly variable feature of brain topology associated with symptoms of psychosis. Our proposed 3D-Framework outperformed traditional XAI methods in terms of faithfulness for global explanations. As a result, these explanations uncovered new patterns that not only enhance the credibility and reliability of the training process but also reveal the broader developmental landscape of the human cortex. Our XAI 3D-Framework proposes, for the first time, a way to utilize global explanations to discover the context in which detection of specific features is embedded, opening our understanding of normative brain development and atypical trajectories that can lead to the emergence of mental illness.
Submitted 8 July, 2024; v1 submitted 2 September, 2023;
originally announced September 2023.
-
Stack-sorting simplices: geometry and lattice-point enumeration
Authors:
Eon Lee,
Carson Mitchell,
Andrés R. Vindas-Meléndez
Abstract:
We study the polytopes that arise from the convex hulls of stack-sorting on particular permutations. We show that they are simplices and proceed to study their geometry and lattice-point enumeration. First, we prove some enumerative results on $Ln1$ permutations, i.e., permutations of length $n$ whose penultimate and last entries are $n$ and $1$, respectively. Additionally, we then focus on a specific permutation, which we call $L'n1$, and show that the convex hull of all its iterations through the stack-sorting algorithm share the same lattice-point enumerator as that of the $(n-1)$-dimensional unit cube and lecture-hall simplex. Lastly, we detail some results on the real lattice-point enumerator for variations of the simplices arising from stack-sorting $L'n1$ permutations. This then allows us to show that $L'n1$ simplices are Gorenstein of index $2$.
Submitted 31 August, 2023;
originally announced August 2023.
-
Laser-Plasma Ion Beam Booster Based on Hollow-Channel Magnetic Vortex Acceleration
Authors:
Marco Garten,
Stepan S. Bulanov,
Sahel Hakimi,
Lieselotte Obst-Huebl,
Chad E. Mitchell,
Carl Schroeder,
Eric Esarey,
Cameron G. R. Geddes,
Jean-Luc Vay,
Axel Huebl
Abstract:
Laser-driven ion acceleration provides ultra-short, high-charge, low-emittance beams, which are desirable for a wide range of high-impact applications. Yet after decades of research, a significant increase in maximum ion energy is still needed. This work introduces a quality-preserving staging concept for ultra-intense ion bunches that is seamlessly applicable from the non-relativistic plasma source to the relativistic regime. Full 3D particle-in-cell simulations prove robustness and capture of a high-charge proton bunch, suitable for readily available and near-term laser facilities.
Submitted 5 May, 2024; v1 submitted 9 August, 2023;
originally announced August 2023.
-
Tight information bounds for spontaneous emission lifetime resolution of quantum sources with varied spectral purity
Authors:
Cheyenne S. Mitchell,
Mikael P. Backlund
Abstract:
We generalize the theory of resolving a mixture of two closely spaced spontaneous emission lifetimes to include pure dephasing contributions to decoherence, leading to the resurgence of Rayleigh's Curse at small lifetime separations. Considerable resolution enhancement remains possible when lifetime broadening is more significant than that due to pure dephasing. In the limit that lifetime broadening dominates, one can achieve super-resolution either by a tailored one-photon measurement or Hong-Ou-Mandel interferometry. We describe conditions for which either choice is superior.
Submitted 9 May, 2023;
originally announced May 2023.
-
AI Models Close to your Chest: Robust Federated Learning Strategies for Multi-site CT
Authors:
Edward H. Lee,
Brendan Kelly,
Emre Altinmakas,
Hakan Dogan,
Maryam Mohammadzadeh,
Errol Colak,
Steve Fu,
Olivia Choudhury,
Ujjwal Ratan,
Felipe Kitamura,
Hernan Chaves,
Jimmy Zheng,
Mourad Said,
Eduardo Reis,
Jaekwang Lim,
Patricia Yokoo,
Courtney Mitchell,
Golnaz Houshmand,
Marzyeh Ghassemi,
Ronan Killeen,
Wendy Qiu,
Joel Hayden,
Farnaz Rafiee,
Chad Klochko,
Nicholas Bevins
, et al. (5 additional authors not shown)
Abstract:
While it is well known that population differences from genetics, sex, race, and environmental factors contribute to disease, AI studies in medicine have largely focused on locoregional patient cohorts with less diverse data sources. Such limitation stems from barriers to large-scale data share and ethical concerns over data privacy. Federated learning (FL) is one potential pathway for AI development that enables learning across hospitals without data share. In this study, we show the results of various FL strategies on one of the largest and most diverse COVID-19 chest CT datasets: 21 participating hospitals across five continents that comprise >10,000 patients with >1 million images. We also propose an FL strategy that leverages synthetically generated data to overcome class and size imbalances. We also describe the sources of data heterogeneity in the context of FL, and show how even among the correctly labeled populations, disparities can arise due to these biases.
Submitted 13 April, 2023; v1 submitted 23 March, 2023;
originally announced March 2023.
-
From Compact Plasma Particle Sources to Advanced Accelerators with Modeling at Exascale
Authors:
Axel Huebl,
Remi Lehe,
Edoardo Zoni,
Olga Shapoval,
Ryan T. Sandberg,
Marco Garten,
Arianna Formenti,
Revathi Jambunathan,
Prabhat Kumar,
Kevin Gott,
Andrew Myers,
Weiqun Zhang,
Ann Almgren,
Chad E. Mitchell,
Ji Qiang,
David Grote,
Alexander Sinn,
Severin Diederichs,
Maxence Thevenet,
Luca Fedeli,
Thomas Clark,
Neil Zaim,
Henri Vincenti,
Jean-Luc Vay
Abstract:
Developing complex, reliable advanced accelerators requires a coordinated, extensible, and comprehensive approach in modeling, from source to the end of beam lifetime. We present highlights in Exascale Computing to scale accelerator modeling software to the requirements set for contemporary science drivers. In particular, we present the first laser-plasma modeling on an exaflop supercomputer using the US DOE Exascale Computing Project WarpX. Leveraging developments for Exascale, the new DOE SCIDAC-5 Consortium for Advanced Modeling of Particle Accelerators (CAMPA) will advance numerical algorithms and accelerate community modeling codes in a cohesive manner: from beam source, over energy boost, transport, injection, storage, to application or interaction. Such start-to-end modeling will enable the exploration of hybrid accelerators, with conventional and advanced elements, as the next step for advanced accelerator modeling. Following open community standards, we seed an open ecosystem of codes that can be readily combined with each other and machine learning frameworks. These will cover ultrafast to ultraprecise modeling for future hybrid accelerator design, even enabling virtual test stands and twins of accelerators that can be used in operations.
Submitted 18 April, 2023; v1 submitted 22 March, 2023;
originally announced March 2023.
-
A Topological Deep Learning Framework for Neural Spike Decoding
Authors:
Edward C. Mitchell,
Brittany Story,
David Boothe,
Piotr J. Franaszczuk,
Vasileios Maroulas
Abstract:
The brain's spatial orientation system uses different neuron ensembles to aid in environment-based navigation. Two of the ways brains encode spatial information are through head direction cells and grid cells. Brains use head direction cells to determine orientation whereas grid cells consist of layers of decked neurons that overlay to provide environment-based navigation. These neurons fire in ensembles where several neurons fire at once to activate a single head direction or grid. We want to capture this firing structure and use it to decode head direction grid cell data. Understanding, representing, and decoding these neural structures requires models that encompass higher order connectivity, more than the 1-dimensional connectivity that traditional graph-based models provide. To that end, in this work, we develop a topological deep learning framework for neural spike train decoding. Our framework combines unsupervised simplicial complex discovery with the power of deep learning via a new architecture we develop herein called a simplicial convolutional recurrent neural network. Simplicial complexes, topological spaces that use not only vertices and edges but also higher-dimensional objects, naturally generalize graphs and capture more than just pairwise relationships. Additionally, this approach does not require prior knowledge of the neural activity beyond spike counts, which removes the need for similarity measurements. The effectiveness and versatility of the simplicial convolutional neural network is demonstrated on head direction and trajectory prediction via head direction and grid cell datasets.
Submitted 6 September, 2023; v1 submitted 1 December, 2022;
originally announced December 2022.
-
Performance and utility trade-off in interpretable sleep staging
Authors:
Irfan Al-Hussaini,
Cassie S. Mitchell
Abstract:
Recent advances in deep learning have led to the development of models approaching the human level of accuracy. However, healthcare remains an area lacking in widespread adoption. The safety-critical nature of healthcare results in a natural reticence to put these black-box deep learning models into practice. This paper explores interpretable methods for a clinical decision support system called sleep staging, an essential step in diagnosing sleep disorders. Clinical sleep staging is an arduous process requiring manual annotation for each 30s of sleep using physiological signals such as electroencephalogram (EEG). Recent work has shown that sleep staging using simple models and an exhaustive set of features can perform nearly as well as deep learning approaches but only for some specific datasets. Moreover, the utility of those features from a clinical standpoint is ambiguous. On the other hand, the proposed framework, NormIntSleep demonstrates exceptional performance across different datasets by representing deep learning embeddings using normalized features. NormIntSleep performs 4.5% better than the exhaustive feature-based approach and 1.5% better than other representation learning approaches. An empirical comparison between the utility of the interpretations of these models highlights the improved alignment with clinical expectations when performance is traded-off slightly. NormIntSleep paired with a clinically meaningful set of features can best balance this trade-off by providing reliable, clinically relevant interpretation with robust performance.
Submitted 5 February, 2023; v1 submitted 6 November, 2022;
originally announced November 2022.
-
CCS Explorer: Relevance Prediction, Extractive Summarization, and Named Entity Recognition from Clinical Cohort Studies
Authors:
Irfan Al-Hussaini,
Davi Nakajima An,
Albert J. Lee,
Sarah Bi,
Cassie S. Mitchell
Abstract:
Clinical Cohort Studies (CCS), such as randomized clinical trials, are a great source of documented clinical research. Ideally, a clinical expert inspects these articles for exploratory analysis ranging from drug discovery for evaluating the efficacy of existing drugs in tackling emerging diseases to the first test of newly developed drugs. However, more than 100 articles are published daily on a single prevalent disease like COVID-19 in PubMed. As a result, it can take days for a physician to find articles and extract relevant information. Can we develop a system to sift through the long list of these articles faster and document the crucial takeaways from each of these articles? In this work, we propose CCS Explorer, an end-to-end system for relevance prediction of sentences, extractive summarization, and patient, outcome, and intervention entity detection from CCS. CCS Explorer is packaged in a web-based graphical user interface where the user can provide any disease name. CCS Explorer then extracts and aggregates all relevant information from articles on PubMed based on the results of an automatically generated query produced on the back-end. For each task, CCS Explorer fine-tunes pre-trained language representation models based on transformers with additional layers. The models are evaluated using two publicly available datasets. CCS Explorer obtains a recall of 80.2%, AUC-ROC of 0.843, and an accuracy of 88.3% on sentence relevance prediction using BioBERT and achieves an average Micro F1-Score of 77.8% on Patient, Intervention, Outcome detection (PIO) using PubMedBERT. Thus, CCS Explorer can reliably extract relevant information to summarize articles, saving time by $\sim \text{660}\times$.
Submitted 15 November, 2022; v1 submitted 31 October, 2022;
originally announced November 2022.
-
Agent swarms: cooperation and coordination under stringent communications constraint
Authors:
Paul Kinsler,
Sean Holman,
Andrew Elliott,
Cathryn N. Mitchell,
R. Eddie Wilson
Abstract:
Here we consider the communications tactics appropriate for a group of agents that need to "swarm" together in a highly adversarial environment. Specfically, whilst they need to cooperate by exchanging information with each other about their location and their plans; at the same time they also need to keep such communications to an absolute minimum. This might be due to a need for stealth, or otherwise be relevant to situations where communications are signficantly restricted. Complicating this process is that we assume each agent has (a) no means of passively locating others, (b) it must rely on being updated by reception of appropriate messages; and if no such update messages arrive, (c) then their own beliefs about other agents will gradually become out of date and increasingly inaccurate. Here we use a geometry-free multi-agent model that is capable of allowing for message-based information transfer between agents with different intrinsic connectivities, as would be present in a spatial arrangement of agents. We present agent-centric performance metrics that require only minimal assumptions, and show how simulated outcome distributions, risks, and connectivities depend on the ratio of information gain to loss. We also show that checking for too-long round-trip times can be an effective minimal-information filter for determining which agents to no longer target with messages.
△ Less
Submitted 6 April, 2023; v1 submitted 3 October, 2022;
originally announced October 2022.
-
SERF: Interpretable Sleep Staging using Embeddings, Rules, and Features
Authors:
Irfan Al-Hussaini,
Cassie S. Mitchell
Abstract:
The accuracy of recent deep learning based clinical decision support systems is promising. However, lack of model interpretability remains an obstacle to widespread adoption of artificial intelligence in healthcare. Using sleep as a case study, we propose a generalizable method to combine clinical interpretability with high accuracy derived from black-box deep learning. Clinician-determined sleep stages from polysomnogram (PSG) remain the gold standard for evaluating sleep quality. However, PSG manual annotation by experts is expensive and time-prohibitive. We propose SERF, interpretable Sleep staging using Embeddings, Rules, and Features to read PSG. SERF provides interpretation of classified sleep stages through meaningful features derived from the AASM Manual for the Scoring of Sleep and Associated Events. In SERF, the embeddings obtained from a hybrid of convolutional and recurrent neural networks are transposed to the interpretable feature space. These representative interpretable features are used to train simple models like a shallow decision tree for classification. Model results are validated on two publicly available datasets. SERF surpasses the current state-of-the-art for interpretable sleep staging by 2%. Using Gradient Boosted Trees as the classifier, SERF obtains 0.766 $κ$ and 0.870 AUC-ROC, within 2% of the current state-of-the-art black-box models.
Submitted 25 September, 2022; v1 submitted 21 September, 2022;
originally announced September 2022.
-
Next Generation Computational Tools for the Modeling and Design of Particle Accelerators at Exascale
Authors:
Axel Huebl,
Remi Lehe,
Chad E. Mitchell,
Ji Qiang,
Robert D. Ryne,
Ryan T. Sandberg,
Jean-Luc Vay
Abstract:
Particle accelerators are among the largest, most complex devices. To meet the challenges of increasing energy, intensity, accuracy, compactness, complexity and efficiency, increasingly sophisticated computational tools are required for their design and optimization. It is key that contemporary software take advantage of the latest advances in computer hardware and scientific software engineering practices, delivering speed, reproducibility and feature composability for the aforementioned challenges. A new open source software stack is being developed at the heart of the Beam pLasma Accelerator Simulation Toolkit (BLAST) by LBNL and collaborators, providing new particle-in-cell modeling codes capable of exploiting the power of GPUs on Exascale supercomputers. Combined with advanced numerical techniques, such as mesh-refinement, and intrinsic support for machine learning, these codes are primed to provide ultrafast to ultraprecise modeling for future accelerator design and operations.
Submitted 9 August, 2022; v1 submitted 3 August, 2022;
originally announced August 2022.
-
Extracting particle size distribution from laser speckle with a physics-enhanced autocorrelation-based estimator (PEACE)
Authors:
Qihang Zhang,
Janaka C. Gamekkanda,
Ajinkya Pandit,
Wenlong Tang,
Charles Papageorgiou,
Chris Mitchell,
Yihui Yang,
Michael Schwaerzler,
Tolutola Oyetunde,
Richard D. Braatz,
Allan S. Myerson,
George Barbastathis
Abstract:
Extracting quantitative information about highly scattering surfaces from an imaging system is challenging because the phase of the scattered light undergoes multiple folds upon propagation, resulting in complex speckle patterns. One specific application is the drying of wet powders in the pharmaceutical industry, where quantifying the particle size distribution (PSD) is of particular interest. A non-invasive and real-time monitoring probe in the drying process is required, but there is no suitable candidate for this purpose. In this report, we develop a theoretical relationship from the PSD to the speckle image and describe a physics-enhanced autocorrelation-based estimator (PEACE) machine learning algorithm for speckle analysis to measure the PSD of a powder surface. This method solves both the forward and inverse problems together and enjoys increased interpretability, since the machine learning approximator is regularized by the physical law.
Submitted 2 March, 2023; v1 submitted 20 April, 2022;
originally announced April 2022.
-
Using Kernel-Based Statistical Distance to Study the Dynamics of Charged Particle Beams in Particle-Based Simulation Codes
Authors:
Chad E. Mitchell,
Robert D. Ryne,
Kilean Hwang
Abstract:
Measures of discrepancy between probability distributions (statistical distance) are widely used in the fields of artificial intelligence and machine learning. We describe how certain measures of statistical distance can be implemented as numerical diagnostics for simulations involving charged-particle beams. Related measures of statistical dependence are also described. The resulting diagnostics provide sensitive measures of dynamical processes important for beams in nonlinear or high-intensity systems, which are otherwise difficult to characterize. The focus is on kernel-based methods such as Maximum Mean Discrepancy, which have a well-developed mathematical foundation and reasonable computational complexity. Several benchmark problems and examples involving intense beams are discussed. While the focus is on charged-particle beams, these methods may also be applied to other many-body systems such as plasmas or gravitational systems.
Submitted 8 April, 2022;
originally announced April 2022.
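As a rough illustration of the kernel-based statistical distance named in the abstract above, the sketch below computes the biased (V-statistic) estimate of the squared Maximum Mean Discrepancy between two small samples. The Gaussian kernel, the bandwidth `sigma`, and the function names are assumptions made for illustration; the abstract does not state which kernel or estimator the authors use.

```python
import math


def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian (RBF) kernel between two points given as equal-length tuples."""
    squared_distance = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


def mmd_squared(xs, ys, sigma=1.0):
    """Biased estimate of squared MMD between samples xs and ys.

    MMD^2 = E[k(x, x')] + E[k(y, y')] - 2 E[k(x, y)],
    with each expectation replaced by an average over all sample pairs.
    """
    def mean_kernel(us, vs):
        total = sum(gaussian_kernel(u, v, sigma) for u in us for v in vs)
        return total / (len(us) * len(vs))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)
```

Comparing a sample with itself gives zero, and the value grows as the two samples separate. The double loop costs a number of kernel evaluations proportional to the product of the sample sizes, so a sketch like this is only practical for modest particle counts.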
-
Insights for post-pandemic pedagogy across one CS department
Authors:
Maxwell Bigman,
Yosefa Gilon,
Jenny Han,
John C Mitchell
Abstract:
Adaptive remote instruction has led to important lessons for the future, including rediscovery of known pedagogical principles in new contexts and new insights for supporting remote learning. Studying one computer science department that serves residential and remote undergraduate and graduate students, we conducted interviews with stakeholders in the department (n=26) and ran a department-wide student survey (n=102) during the four academic quarters from spring 2020 to spring 2021. Our case study outlines what the instructors did, summarizes what instructors and students say about courses during this period, and provides recommendations for CS departments with similar scope going forward. Specific insights address: (1) how instructional components are best structured for students; (2) how students are assessed for their learning; and (3) how students are supported in student-initiated components of learning. The institution is a large U.S. research university that has a history of online programs including online enrollment in regular on-campus courses and large-scale open enrollment courses. Our recommendations to instructors across the scope of this department may also be applicable to other institutions that provide technology-supported in-person instruction, remote enrollment, and hybrid courses combining both modalities.
Submitted 16 March, 2022;
originally announced March 2022.
-
Quantum limits to resolution and discrimination of spontaneous emission lifetimes
Authors:
Cheyenne S. Mitchell,
Mikael P. Backlund
Abstract:
In this work we investigate the quantum information theoretical limits to several tasks related to lifetime estimation and discrimination of a two-level spontaneous optical emitter. We focus in particular on the model problem of resolving two mutually incoherent exponential decays with highly overlapping temporal probability profiles. Mirroring recent work on quantum-inspired super-resolution of point emitters, we find that direct lifetime measurement suffers from an analogue of "Rayleigh's Curse" when the time constants of the two decay channels approach one another. We propose alternative measurement schemes that circumvent this limit, and also demonstrate superiority to direct measurement for a related binary hypothesis test. Our findings add to a growing list of examples in which a quantum analysis uncovers significant information gains for certain tasks in opto-molecular metrology that do not rely on multiphoton interference, but evidently do benefit from a more thorough exploitation of the coherence properties of single photons.
Submitted 8 February, 2022;
originally announced February 2022.
-
Ability-Based Methods for Personalized Keyboard Generation
Authors:
Claire L. Mitchell,
Gabriel J. Cler,
Susan K. Fager,
Paola Contessa,
Serge H. Roy,
Gianluca De Luca,
Joshua C. Kline,
Jennifer M. Vojtech
Abstract:
This study introduces an ability-based method for personalized keyboard generation, wherein an individual's own movement and human-computer interaction data are used to automatically compute a personalized virtual keyboard layout. Our approach integrates a multidirectional point-select task to characterize cursor control over time, distance, and direction. The characterization is automatically employed to develop a computationally efficient keyboard layout that prioritizes each user's movement abilities through capturing directional constraints and preferences. We evaluated our approach in a study involving 16 participants using inertial sensing and facial electromyography as an access method, resulting in significantly increased communication rates using the personalized keyboard (52.0 bits/min) when compared to a generically optimized keyboard (47.9 bits/min). Our results demonstrate the ability to effectively characterize an individual's movement abilities to design a personalized keyboard for improved communication. This work underscores the importance of integrating a user's motor abilities when designing virtual interfaces.
Submitted 3 August, 2022; v1 submitted 12 January, 2022;
originally announced January 2022.
-
Privacy-Preserving Biometric Matching Using Homomorphic Encryption
Authors:
Gaëtan Pradel,
Chris Mitchell
Abstract:
Biometric matching involves storing and processing sensitive user information. Maintaining the privacy of this data is thus a major challenge, and homomorphic encryption offers a possible solution. We propose a privacy-preserving biometrics-based authentication protocol based on fully homomorphic encryption, where the biometric sample for a user is gathered by a local device but matched against a biometric template by a remote server operating solely on encrypted data. The design ensures that 1) the user's sensitive biometric data remains private, and 2) the user and client device are securely authenticated to the server. A proof-of-concept implementation building on the TFHE library is also presented, which includes the underlying basic operations needed to execute the biometric matching. Performance results from the implementation show how complex it is to make FHE practical in this context, but it appears that, with implementation optimisations and improvements, the protocol could be used for real-world applications.
Submitted 24 November, 2021;
originally announced November 2021.
-
Simulations of Future Particle Accelerators: Issues and Mitigations
Authors:
D. Sagan,
M. Berz,
N. M. Cook,
Y. Hao,
G. Hoffstaetter,
A. Huebl,
C. -K. Huang,
M. H. Langston,
C. E. Mayes,
C. E. Mitchell,
C. -K. Ng,
J. Qiang,
R. D. Ryne,
A. Scheinker,
E. Stern,
J. -L. Vay,
D. Winklehner,
H. Zhang
Abstract:
The ever increasing demands placed upon machine performance have resulted in the need for more comprehensive particle accelerator modeling. Computer simulations are key to the success of particle accelerators. Many aspects of particle accelerators rely on computer modeling at some point, sometimes requiring complex simulation tools and massively parallel supercomputing. Examples include the modeling of beams at extreme intensities and densities (toward the quantum degeneracy limit), and with ultra-fine control (down to the level of individual particles). In the future, adaptively tuned models might also be relied upon to provide beam measurements beyond the resolution of existing diagnostics. Much time and effort has been put into creating accelerator software tools, some of which are highly successful. However, there are also shortcomings such as the general inability of existing software to be easily modified to meet changing simulation needs. In this paper possible mitigating strategies are discussed for issues faced by the accelerator community as it endeavors to produce better and more comprehensive modeling tools. This includes lack of coordination between code developers, lack of standards to make codes portable and/or reusable, lack of documentation, among others.
Submitted 24 August, 2021;
originally announced August 2021.
-
Constructing orientable sequences
Authors:
Chris J Mitchell,
Peter R Wild
Abstract:
This paper describes new, simple, recursive methods of construction for orientable sequences, i.e. periodic binary sequences in which any n-tuple occurs at most once in a period in either direction. As has been previously described, such sequences have potential applications in automatic position-location systems, where the sequence is encoded onto a surface and a reader needs only examine n consecutive encoded bits to determine its location and orientation on the surface. The only previously described method of construction (due to Dai et al.) is somewhat complex, whereas the new techniques are simple to both describe and implement. The methods of construction cover both the standard `infinite periodic' case, and also the aperiodic, finite sequence, case. Both the new methods build on the Lempel homomorphism, first introduced as a means of recursively generating de Bruijn sequences.
Submitted 7 January, 2022; v1 submitted 6 August, 2021;
originally announced August 2021.
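The definition quoted in the abstract above (any n-tuple occurs at most once in a period in either direction) can be checked directly for a candidate sequence. The sketch below is only a brute-force test of that property, not one of the paper's construction methods; the function name `is_orientable` and the list-of-symbols representation are assumptions made for illustration.

```python
def is_orientable(seq, n):
    """Check whether one period `seq` of a periodic sequence is orientable of order n.

    Every cyclic window of length n must be distinct from every other window,
    from the reverse of every other window, and from its own reverse.
    """
    period = len(seq)
    seen = set()
    for start in range(period):
        # Read the window cyclically, since the sequence is periodic.
        window = tuple(seq[(start + k) % period] for k in range(n))
        reverse = window[::-1]
        if window == reverse:
            return False  # a palindromic window reads the same in both directions
        if window in seen or reverse in seen:
            return False  # repeated forwards, or matches an earlier window backwards
        seen.add(window)
    return True
```

For instance, `is_orientable([0, 0, 1, 0, 1, 1], 5)` evaluates to `True`, whereas the same period with `n = 3` fails because the window `0 1 0` is a palindrome.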
-
First-Generation Inference Accelerator Deployment at Facebook
Authors:
Michael Anderson,
Benny Chen,
Stephen Chen,
Summer Deng,
Jordan Fix,
Michael Gschwind,
Aravind Kalaiah,
Changkyu Kim,
Jaewon Lee,
Jason Liang,
Haixin Liu,
Yinghai Lu,
Jack Montgomery,
Arun Moorthy,
Satish Nadathur,
Sam Naghshineh,
Avinash Nayak,
Jongsoo Park,
Chris Petersen,
Martin Schatz,
Narayanan Sundaram,
Bangsheng Tang,
Peter Tang,
Amy Yang,
Jiecao Yu
, et al. (90 additional authors not shown)
Abstract:
In this paper, we provide a deep dive into the deployment of inference accelerators at Facebook. Many of our ML workloads have unique characteristics, such as sparse memory accesses, large model sizes, as well as high compute, memory and network bandwidth requirements. We co-designed a high-performance, energy-efficient inference accelerator platform based on these requirements. We describe the inference accelerator platform ecosystem we developed and deployed at Facebook: both hardware, through Open Compute Platform (OCP), and software framework and tooling, through PyTorch/Caffe2/Glow. A characteristic of this ecosystem from the start is its openness to enable a variety of AI accelerators from different vendors. This platform, with six low-power accelerator cards alongside a single-socket host CPU, allows us to serve models of high complexity that cannot be easily or efficiently run on CPUs. We describe various performance optimizations, at both platform and accelerator level, which enable this platform to serve production traffic at Facebook. We also share deployment challenges and lessons learned during performance optimization, and provide guidance for future inference hardware co-design.
Submitted 4 August, 2021; v1 submitted 8 July, 2021;
originally announced July 2021.
-
Extracting Dynamical Frequencies from Invariants of Motion in Finite-Dimensional Nonlinear Integrable Systems
Authors:
Chad E. Mitchell,
Robert D. Ryne,
Kilean Hwang,
Sergei Nagaitsev,
Timofey Zolkin
Abstract:
Integrable dynamical systems play an important role in many areas of science, including accelerator and plasma physics. An integrable dynamical system with $n$ degrees of freedom (DOF) possesses $n$ nontrivial integrals of motion, and can be solved, in principle, by covering the phase space with one or more charts in which the dynamics can be described using action-angle coordinates. To obtain the frequencies of motion, both the transformation to action-angle coordinates and its inverse must be known in explicit form. However, no general algorithm exists for constructing this transformation explicitly from a set of $n$ known (and generally coupled) integrals of motion. In this paper we describe how one can determine the dynamical frequencies of the motion as functions of these $n$ integrals in the absence of explicitly-known action-angle variables, and we provide several examples.
Submitted 10 June, 2021; v1 submitted 4 June, 2021;
originally announced June 2021.
-
Design of double- and multi-bend achromat lattices with large dynamic aperture and approximate invariants
Authors:
Yongjun Li,
Kilean Hwang,
Chad Mitchell,
Robert Rainer,
Robert Ryne,
Victor Smaluk
Abstract:
A numerical method to design nonlinear double- and multi-bend achromat (DBA and MBA) lattices with approximate invariants of motion is investigated. The search for such nonlinear lattices is motivated by Fermilab's Integrable Optics Test Accelerator (IOTA), whose design is based on an integrable Hamiltonian system with two invariants of motion. While it may not be possible to design an achromatic lattice for a dedicated synchrotron light source storage ring with one or more exact invariants of motion, it is possible to tune the sextupoles and octupoles in existing DBA and MBA lattices to produce approximate invariants. In our procedure, the lattice is tuned while minimizing the turn-by-turn fluctuations of the Courant-Snyder actions $J_x$ and $J_y$ at several distinct amplitudes, while simultaneously minimizing diffusion of the on-energy betatron tunes. The resulting lattices share some important features with integrable ones, such as a large dynamic aperture, trajectories confined to invariant tori, robustness to resonances and errors, and a large amplitude-dependent tune-spread. Compared to the nominal NSLS-II lattice, the single- and multi-bunch instability thresholds are increased and the bunch-by-bunch feedback gain can be reduced.
Submitted 23 November, 2021; v1 submitted 10 March, 2021;
originally announced March 2021.
-
QUAREP-LiMi: A community-driven initiative to establish guidelines for quality assessment and reproducibility for instruments and images in light microscopy
Authors:
Glyn Nelson,
Ulrike Boehm,
Steve Bagley,
Peter Bajcsy,
Johanna Bischof,
Claire M Brown,
Aurelien Dauphin,
Ian M Dobbie,
John E Eriksson,
Orestis Faklaris,
Julia Fernandez-Rodriguez,
Alexia Ferrand,
Laurent Gelman,
Ali Gheisari,
Hella Hartmann,
Christian Kukat,
Alex Laude,
Miso Mitkovski,
Sebastian Munck,
Alison J North,
Tobias M Rasse,
Ute Resch-Genger,
Lucas C Schuetz,
Arne Seitz,
Caterina Strambio-De-Castillia
, et al. (75 additional authors not shown)
Abstract:
In April 2020, the QUality Assessment and REProducibility for Instruments and Images in Light Microscopy (QUAREP-LiMi) initiative was formed. This initiative comprises imaging scientists from academia and industry who share a common interest in achieving a better understanding of the performance and limitations of microscopes and improved quality control (QC) in light microscopy. The ultimate goal of the QUAREP-LiMi initiative is to establish a set of common QC standards, guidelines, metadata models, and tools, including detailed protocols, with the ultimate aim of improving reproducible advances in scientific research. This White Paper 1) summarizes the major obstacles identified in the field that motivated the launch of the QUAREP-LiMi initiative; 2) identifies the urgent need to address these obstacles in a grassroots manner, through a community of stakeholders including researchers, imaging scientists, bioimage analysts, bioimage informatics developers, corporate partners, funding agencies, standards organizations, scientific publishers, and observers of such; 3) outlines the current actions of the QUAREP-LiMi initiative; and 4) proposes future steps that can be taken to improve the dissemination and acceptance of the proposed guidelines to manage QC. To summarize, the principal goal of the QUAREP-LiMi initiative is to improve the overall quality and reproducibility of light microscope image data by introducing broadly accepted standard practices and accurately captured image data metrics.
Submitted 27 January, 2021; v1 submitted 21 January, 2021;
originally announced January 2021.
-
The (in)security of some recently proposed lightweight key distribution schemes
Authors:
Chris J Mitchell
Abstract:
Two recently published papers propose some very simple key distribution schemes designed to enable two or more parties to establish a shared secret key with the aid of a third party. Unfortunately, as we show, most of the schemes are inherently insecure and all are incompletely specified - moreover, claims that the schemes are inherently lightweight are shown to be highly misleading. We also briefly critique a somewhat related very recent paper by the same authors that uses similar techniques to achieve what are claimed to be secure multiparty computations.
Submitted 13 March, 2021; v1 submitted 20 January, 2021;
originally announced January 2021.
-
Denoising Multi-Source Weak Supervision for Neural Text Classification
Authors:
Wendi Ren,
Yinghao Li,
Hanting Su,
David Kartchner,
Cassie Mitchell,
Chao Zhang
Abstract:
We study the problem of learning neural text classifiers without using any labeled data, but only easy-to-provide rules as multiple weak supervision sources. This problem is challenging because rule-induced weak labels are often noisy and incomplete. To address these two challenges, we design a label denoiser, which estimates the source reliability using a conditional soft attention mechanism and then reduces label noise by aggregating rule-annotated weak labels. The denoised pseudo labels then supervise a neural classifier to predict soft labels for unmatched samples, which addresses the rule coverage issue. We evaluate our model on five benchmarks for sentiment, topic, and relation classifications. The results show that our model outperforms state-of-the-art weakly-supervised and semi-supervised methods consistently, and achieves comparable performance with fully-supervised methods even without any labeled data. Our code can be found at https://github.com/weakrules/Denoise-multi-weak-sources.
Submitted 9 October, 2020;
originally announced October 2020.
-
Two closely related insecure noninteractive group key establishment schemes
Authors:
Chris J Mitchell
Abstract:
Serious weaknesses in two very closely related group authentication and group key establishment schemes are described. Simple attacks against the group key establishment part of the schemes are described, which strongly suggest that the schemes should not be used.
Submitted 7 March, 2021; v1 submitted 19 September, 2020;
originally announced September 2020.
-
Bayesian EWMA and CUSUM Control Charts Under Different Loss Functions
Authors:
Chelsea Mitchell,
Abdel-Salam Abdel-Salam,
D'Arcy Mays
Abstract:
The Exponentially Weighted Moving Average (EWMA) and Cumulative Sum (CUSUM) control charts have been used in profile monitoring to track drift shifts that occur in a monitored process. We construct Bayesian EWMA and Bayesian CUSUM charts informed by posterior and posterior predictive distributions using different loss functions, prior distributions, and likelihood distributions. A simulation study…
▽ More
The Exponentially Weighted Moving Average (EWMA) and Cumulative Sum (CUSUM) control charts have been used in profile monitoring to track drift shifts that occur in a monitored process. We construct Bayesian EWMA and Bayesian CUSUM charts informed by posterior and posterior predictive distributions using different loss functions, prior distributions, and likelihood distributions. A simulation study is performed, and the performance of the charts is evaluated via average run length (ARL), standard deviation of the run length (SDRL), average time to signal (ATS), and standard deviation of time to signal (SDTS). A sensitivity analysis is conducted using choices for the smoothing parameter, out-of-control shift size, and hyper-parameters of the distribution. Based on the obtained results, we provide recommendations for use of the Bayesian EWMA and Bayesian CUSUM control charts.
Submitted 19 July, 2020;
originally announced July 2020.
-
Model Checking Bitcoin and other Proof-of-Work Consensus Protocols
Authors:
Max DiGiacomo-Castillo,
Yiyun Liang,
Advay Pal,
John C. Mitchell
Abstract:
The Bitcoin Backbone Protocol [GKL15] is an abstraction of the bitcoin proof-of-work consensus protocol. We use a model-checking tool (UPPAALSMC) to examine the concrete security of proof-of-work consensus by varying protocol parameters and using an adversary that leverages the selfish mining strategy introduced in [GKL15]. We provide insights into modeling proof-of-work protocols and demonstrate tradeoffs between operating parameters. Applying this methodology to protocol design options, we show that the uniform tie-breaking rule from [ES18] decreases the failure rate of the chain quality property, but increases the failure rate of the common prefix property. This tradeoff illustrates how design decisions affect protocol properties, within a range of concrete operating conditions, in a manner that is not evident from prior asymptotic analysis.
Submitted 16 July, 2020;
originally announced July 2020.
-
Provably insecure group authentication: Not all security proofs are what they claim to be
Authors:
Chris J Mitchell
Abstract:
A paper presented at the ICICS 2019 conference describes what is claimed to be a 'provably secure group authentication [protocol] in the asynchronous communication model'. We show here that this is far from being the case, as the protocol is subject to serious attacks. To try to explain this troubling case, an earlier (2013) scheme on which the ICICS 2019 protocol is based was also examined and found to possess even more severe flaws - this latter scheme was previously known to be subject to attack, but not in quite as fundamental a way as is shown here. Examination of the security theorems provided in both the 2013 and 2019 papers reveals that in neither case are they exactly what they seem to be at first sight; the issues raised by this are also briefly discussed.
Submitted 9 June, 2021; v1 submitted 11 May, 2020;
originally announced May 2020.
-
How not to secure wireless sensor networks revisited: Even if you say it twice it's still not secure
Authors:
Chris J Mitchell
Abstract:
Two recent papers describe almost exactly the same group key establishment protocol for wireless sensor networks. Quite apart from the duplication issue, we show that both protocols are insecure and should not be used - a member of a group can successfully impersonate the key generation centre and persuade any other group member to accept the wrong key value. This breaks the stated objectives of the schemes.
Submitted 20 November, 2020; v1 submitted 9 May, 2020;
originally announced May 2020.
-
Who Needs Trust for 5G?
Authors:
Chris J Mitchell
Abstract:
There has been much recent discussion of the criticality of the 5G infrastructure, and whether certain vendors should be able to supply 5G equipment. The key issue appears to be about trust, namely to what degree the security and reliability properties of 5G equipment and systems need to be trusted, and by whom, and how the necessary level of trust might be obtained. In this paper, by considering existing examples such as the Internet, the possible need for trust is examined in a systematic way, and possible routes to gaining trust are described. The issues that arise when a security and/or reliability failure actually occurs are also discussed. The paper concludes with a discussion of possible future ways of enabling all parties to gain the assurances they need in a cost-effective and harmonised way.
Submitted 2 May, 2020;
originally announced May 2020.
-
How not to secure wireless sensor networks: A plethora of insecure polynomial-based key pre-distribution schemes
Authors:
Chris J Mitchell
Abstract:
Three closely-related polynomial-based group key pre-distribution schemes have recently been proposed, aimed specifically at wireless sensor networks. The schemes enable any subset of a predefined set of sensor nodes to establish a shared secret key without any communications overhead. It is claimed that these schemes are both secure and lightweight, making them particularly appropriate for network scenarios where nodes have limited computational and storage capabilities. Further papers have built on these schemes, e.g. to propose secure routing protocols for wireless sensor networks. Unfortunately, as we show in this paper, all three schemes are completely insecure; whilst the details of their operation vary, they share common weaknesses. In every case we show that an attacker equipped with the information built into at most two sensor nodes can compute group keys for all possible groups of which the attacked nodes are not members, which breaks a fundamental design objective. The attacks can also be achieved by an attacker armed with the information from a single node together with a single group key to which this sensor node is not entitled. Repairing the schemes appears difficult, if not impossible. The existence of major flaws is not surprising given the complete absence of any rigorous proofs of security for the proposed schemes. A further recent paper proposes a group membership authentication and key establishment scheme based on one of the three key pre-distribution schemes analysed here; as we demonstrate, this scheme is also insecure, as the attack we describe on the corresponding pre-distribution scheme enables the authentication process to be compromised.
Submitted 5 October, 2020; v1 submitted 12 April, 2020;
originally announced April 2020.
-
Resources: A Safe Language Abstraction for Money
Authors:
Sam Blackshear,
David L. Dill,
Shaz Qadeer,
Clark W. Barrett,
John C. Mitchell,
Oded Padon,
Yoni Zohar
Abstract:
Smart contracts are programs that implement potentially sophisticated transactions on modern blockchain platforms. In the rapidly evolving blockchain environment, smart contract programming languages must allow users to write expressive programs that manage and transfer assets, yet provide strong protection against sophisticated attacks. Addressing this need, we present flexible and reliable abstractions for programming with digital currency in the Move language [Blackshear et al. 2019]. Move uses novel linear [Girard 1987] resource types with semantics drawing on C++11 [Stroustrup 2013] and Rust [Matsakis and Klock 2014]: when a resource value is assigned to a new memory location, the location previously holding it must be invalidated. In addition, a resource type can only be created or destroyed by procedures inside its declaring module. We present an executable bytecode language with resources and prove that it enjoys resource safety, a conservation property for program values that is analogous to conservation of mass in the physical world.
Submitted 23 July, 2020; v1 submitted 10 April, 2020;
originally announced April 2020.
-
Yet another insecure group key distribution scheme using secret sharing
Authors:
Chris J Mitchell
Abstract:
A recently proposed group key distribution scheme known as UMKESS, based on secret sharing, is shown to be insecure. Not only is it insecure, but it does not always work, and the rationale for its design is unsound. UMKESS is the latest in a long line of flawed group key distribution schemes based on secret sharing techniques.
Submitted 18 November, 2020; v1 submitted 31 March, 2020;
originally announced March 2020.
-
The impact of quantum computing on real-world security: A 5G case study
Authors:
Chris J Mitchell
Abstract:
This paper provides a detailed analysis of the impact of quantum computing on the security of 5G mobile telecommunications. This involves considering how cryptography is used in 5G, and how the security of the system would be affected by the advent of quantum computing. This leads naturally to the specification of a series of simple, phased, recommended changes intended to ensure that the security of 5G (as well as 3G and 4G) is not badly damaged if and when large scale quantum computing becomes a practical reality. By exploiting backwards-compatibility features of the 5G security system design, we are able to propose a novel multi-phase approach to upgrading security that allows for a simple and smooth migration to a post-quantum-secure system.
Submitted 13 December, 2019; v1 submitted 18 November, 2019;
originally announced November 2019.
-
Complex Representation of Potentials and Fields for the Nonlinear Magnetic Insert of the Integrable Optics Test Accelerator
Authors:
Chad Mitchell
Abstract:
An alternative representation for the vector potential of the nonlinear magnetic insert for the Integrable Optics Test Accelerator (IOTA), first described in Sec. V.A. of the paper of Danilov and Nagaitsev, is determined from first principles using standard complex variable methods. In particular, it is shown that the coupled system consisting of the 2D Laplace equation and the Bertrand-Darboux equation is equivalent to a single ordinary differential equation in the complex plane, and a simple solution is constructed. The results are consistent with the paper of Danilov and Nagaitsev, and this concise representation provides computational advantages for particle tracking through the nonlinear insert by avoiding numerical errors caused by small denominators that appear when evaluating transverse derivatives of the vector potential near the midplane. A similar representation is provided for the spatial dependence of the two invariants of motion.
Submitted 31 July, 2019;
originally announced August 2019.
-
The Saeed-Liu-Tian-Gao-Li authenticated key agreement protocol is insecure
Authors:
Chris J Mitchell
Abstract:
A recently proposed authenticated key agreement protocol is shown to be insecure. In particular, one of the two parties is not authenticated, allowing an active man-in-the-middle opponent to replay old messages. The protocol is essentially an authenticated Diffie-Hellman key agreement scheme, and the lack of authentication allows an attacker to replay old messages and have them accepted. Moreover, if the ephemeral key used to compute a protocol message is ever compromised, then the key established using the replayed message will also be compromised. Fixing the problem is simple - there are many provably secure and standardised protocols which are just as efficient as the flawed scheme.
Submitted 21 June, 2019;
originally announced June 2019.
-
RF design of APEX2 two-cell continuous-wave normal conducting photoelectron gun cavity based on multi-objective genetic algorithm
Authors:
T. Luo,
H. Feng,
D. Filippetto,
M. Johnson,
A. Lambert,
D. Li,
C. Mitchell,
F. Sannibale,
J. Staples,
S. Virostek,
R. Wells
Abstract:
High brightness, high repetition rate electron beams are key components for optimizing the performance of next generation scientific instruments, such as MHz-class X-ray Free Electron Laser (XFEL) and Ultra-fast Electron Diffraction/Microscopy (UED/UEM). In the Advanced Photo-injector EXperiment (APEX) at Berkeley Lab, a photoelectron gun based on a 185.7 MHz normal conducting re-entrant RF cavity has been proven to be a feasible solution for providing high brightness, high repetition rate electron beams for both XFEL and UED/UEM. Based on the success of APEX, a new electron gun system, named APEX2, has been under development to further improve the electron beam brightness. For APEX2, we have designed a new 162.5 MHz two-cell photoelectron gun and achieved a significant increase in the cathode launching field and the beam exit energy. For a fixed charge per bunch, these improvements will allow for emittance reduction and hence an increased beam brightness. The design of the APEX2 gun cavity is a complex problem with multiple design goals and restrictions, some even competing with each other. For a systematic and comprehensive search for the optimized cavity geometry, we have developed and implemented a novel optimization method based on the Multi-Objective Genetic Algorithm (MOGA).
Submitted 28 May, 2019; v1 submitted 25 May, 2019;
originally announced May 2019.
-
Beyond Cookie Monster Amnesia: Real World Persistent Online Tracking
Authors:
Nasser Mohammed Al-Fannah,
Wanpeng Li,
Chris J Mitchell
Abstract:
Browser fingerprinting is a relatively new method of uniquely identifying browsers that can be used to track web users. In some ways it is more privacy-threatening than tracking via cookies, as users have no direct control over it. A number of authors have considered the wide variety of techniques that can be used to fingerprint browsers; however, relatively little information is available on how widespread browser fingerprinting is, and what information is collected to create these fingerprints in the real world. To help address this gap, we crawled the 10,000 most popular websites; this gave insights into the number of websites that are using the technique, which websites are collecting fingerprinting information, and exactly what information is being retrieved. We found that approximately 69% of websites are, potentially, involved in first-party or third-party browser fingerprinting. We further found that third-party browser fingerprinting, which is potentially more privacy-damaging, appears to be predominant in practice. We also describe FingerprintAlert, a freely available browser extension we developed that detects and, optionally, blocks fingerprinting attempts by visited websites.
Submitted 23 May, 2019;
originally announced May 2019.
-
OAuthGuard: Protecting User Security and Privacy with OAuth 2.0 and OpenID Connect
Authors:
Wanpeng Li,
Chris J Mitchell,
Thomas Chen
Abstract:
Millions of users routinely use Google to log in to websites supporting OAuth 2.0 or OpenID Connect; the security of OAuth 2.0 and OpenID Connect is therefore of critical importance. As revealed in previous studies, in practice RPs often implement OAuth 2.0 incorrectly, and so many real-world OAuth 2.0 and OpenID Connect systems are vulnerable to attack. However, users of such flawed systems are typically unaware of these issues, and so are at risk of attacks which could result in unauthorised access to the victim user's account at an RP. In order to address this threat, we have developed OAuthGuard, an OAuth 2.0 and OpenID Connect vulnerability scanner and protector, that works with RPs using Google OAuth 2.0 and OpenID Connect services. It protects user security and privacy even when RPs do not implement OAuth 2.0 or OpenID Connect correctly. We used OAuthGuard to survey the 1000 top-ranked websites supporting Google sign-in for the possible presence of five OAuth 2.0 or OpenID Connect security and privacy vulnerabilities, of which one has not previously been described in the literature. Of the 137 sites in our study that employ Google Sign-in, 69 were found to suffer from at least one serious vulnerability. OAuthGuard was able to protect user security and privacy for 56 of these 69 RPs, and for the other 13 was able to warn users that they were using an insecure implementation.
Submitted 24 January, 2019;
originally announced January 2019.
-
The Hsu-Harn-Mu-Zhang-Zhu group key establishment protocol is insecure
Authors:
Chris J Mitchell
Abstract:
A significant security vulnerability in a recently published group key establishment protocol is described. This vulnerability allows a malicious insider to fraudulently establish a group key with an innocent victim, with the key chosen by the attacker. This shortcoming is sufficiently serious that the protocol should not be used.
Submitted 16 March, 2018; v1 submitted 14 March, 2018;
originally announced March 2018.