-
A Robot Walks into a Bar: Can Language Models Serve as Creativity Support Tools for Comedy? An Evaluation of LLMs' Humour Alignment with Comedians
Authors:
Piotr Wojciech Mirowski,
Juliette Love,
Kory W. Mathewson,
Shakir Mohamed
Abstract:
We interviewed twenty professional comedians who perform live shows in front of audiences and who use artificial intelligence in their artistic process as part of 3-hour workshops on ``AI x Comedy'' conducted at the Edinburgh Festival Fringe in August 2023 and online. The workshop consisted of a comedy writing session with large language models (LLMs), a human-computer interaction questionnaire to assess the Creativity Support Index of AI as a writing tool, and a focus group interrogating the comedians' motivations for and processes of using AI, as well as their ethical concerns about bias, censorship and copyright. Participants noted that existing moderation strategies used in safety filtering and instruction-tuned LLMs reinforced hegemonic viewpoints by erasing minority groups and their perspectives, and qualified this as a form of censorship. At the same time, most participants felt the LLMs did not succeed as a creativity support tool, by producing bland and biased comedy tropes, akin to ``cruise ship comedy material from the 1950s, but a bit less racist''. Our work extends scholarship about the subtle difference between, on the one hand, harmful speech, and on the other hand, ``offensive'' language as a practice of resistance, satire and ``punching up''. We also interrogate the global value alignment behind such language models, and discuss the importance of community-based value alignment and data ownership to build AI tools that better suit artists' needs.
Submitted 3 June, 2024; v1 submitted 31 May, 2024;
originally announced May 2024.
-
Two-layer retrieval augmented generation framework for low-resource medical question-answering: proof of concept using Reddit data
Authors:
Sudeshna Das,
Yao Ge,
Yuting Guo,
Swati Rajwal,
JaMor Hairston,
Jeanne Powell,
Drew Walker,
Snigdha Peddireddy,
Sahithi Lakamana,
Selen Bozkurt,
Matthew Reyna,
Reza Sameni,
Yunyu Xiao,
Sangmi Kim,
Rasheeta Chandler,
Natalie Hernandez,
Danielle Mowery,
Rachel Wightman,
Jennifer Love,
Anthony Spadaro,
Jeanmarie Perrone,
Abeed Sarker
Abstract:
Retrieval augmented generation (RAG) provides the capability to constrain generative model outputs, and mitigate the possibility of hallucination, by providing relevant in-context text. The number of tokens a generative large language model (LLM) can incorporate as context is finite, thus limiting the volume of knowledge from which to generate an answer. We propose a two-layer RAG framework for query-focused answer generation and evaluate a proof-of-concept for this framework in the context of query-focused summary generation from social media forums, focusing on emerging drug-related information. The evaluations demonstrate the effectiveness of the two-layer framework in resource constrained settings to enable researchers in obtaining near real-time data from users.
Submitted 29 May, 2024;
originally announced May 2024.
-
CARE-SD: Classifier-based analysis for recognizing and eliminating stigmatizing and doubt marker labels in electronic health records: model development and validation
Authors:
Drew Walker,
Annie Thorne,
Sudeshna Das,
Jennifer Love,
Hannah LF Cooper,
Melvin Livingston III,
Abeed Sarker
Abstract:
Objective: To detect and classify features of stigmatizing and biased language in intensive care electronic health records (EHRs) using natural language processing techniques. Materials and Methods: We first created a lexicon and regular expression lists from literature-driven stem words for linguistic features of stigmatizing patient labels, doubt markers, and scare quotes within EHRs. The lexicon was further extended using Word2Vec and GPT 3.5, and refined through human evaluation. These lexicons were used to search for matches across 18 million sentences from the de-identified Medical Information Mart for Intensive Care-III (MIMIC-III) dataset. For each linguistic bias feature, 1000 sentence matches were sampled, labeled by expert clinical and public health annotators, and used to train supervised learning classifiers. Results: Lexicon development from expanded literature stem-word lists resulted in a doubt marker lexicon containing 58 expressions, and a stigmatizing labels lexicon containing 127 expressions. Classifiers for doubt markers and stigmatizing labels had the highest performance, with macro F1-scores of .84 and .79, positive-label recall and precision values ranging from .71 to .86, and accuracies aligning closely with human annotator agreement (.87). Discussion: This study demonstrated the feasibility of supervised classifiers in automatically identifying stigmatizing labels and doubt markers in medical text, and identified trends in stigmatizing language use in an EHR setting. Additional labeled data may help improve lower scare quote model performance. Conclusions: Classifiers developed in this study showed high model performance and can be applied to identify patterns and target interventions to reduce stigmatizing labels and doubt markers in healthcare systems.
Submitted 8 May, 2024;
originally announced May 2024.
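The lexicon-matching step described in the CARE-SD abstract above (stem words expanded into regular expressions, then searched across sentences) can be illustrated with a minimal sketch. The stems, the helper names `build_pattern` and `find_matches`, and the example sentences are all hypothetical and chosen only for illustration; they are not the paper's lexicon or code.

```python
import re

# Hypothetical doubt-marker stems, for illustration only.
# These are NOT the 58-expression lexicon reported in the abstract.
DOUBT_STEMS = ["claim", "insist", "alleged", "supposed"]

def build_pattern(stems):
    # Match each stem at a word boundary and allow any suffix
    # (so "claim" also matches "claims" and "claimed").
    alternation = "|".join(re.escape(s) for s in stems)
    return re.compile(r"\b(?:" + alternation + r")\w*", re.IGNORECASE)

def find_matches(sentences, pattern):
    # Return (sentence index, matched text) pairs for every hit.
    return [
        (i, m.group(0))
        for i, sentence in enumerate(sentences)
        for m in pattern.finditer(sentence)
    ]

# Made-up example sentences, not taken from MIMIC-III.
sentences = [
    "Patient claims the pain started yesterday.",
    "Vital signs stable overnight.",
    "She insisted that she took her medication.",
]
matches = find_matches(sentences, build_pattern(DOUBT_STEMS))
```

In a pipeline like the one the abstract outlines, matched sentences of this kind would then be sampled and labeled by annotators before any classifier is trained; the sketch covers only the search step.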
-
Vision-based robot manipulation of transparent liquid containers in a laboratory setting
Authors:
Daniel Schober,
Ronja Güldenring,
James Love,
Lazaros Nalpantidis
Abstract:
Laboratory processes involving small volumes of solutions and active ingredients are often performed manually due to challenges in automation, such as high initial costs, semi-structured environments and protocol variability. In this work, we develop a flexible and cost-effective approach to address this gap by introducing a vision-based system for liquid volume estimation and a simulation-driven pouring method particularly designed for containers with small openings. We evaluate both components individually, followed by an applied real-world integration of cell culture automation using a UR5 robotic arm. Our work is fully reproducible: we share our code at https://github.com/DaniSchober/LabLiquidVision and the newly introduced dataset LabLiquidVolume is available at https://data.dtu.dk/articles/dataset/LabLiquidVision/25103102.
Submitted 25 April, 2024;
originally announced April 2024.
-
Minimum Description Feature Selection for Complexity Reduction in Machine Learning-based Wireless Positioning
Authors:
Myeung Suk Oh,
Anindya Bijoy Das,
Taejoon Kim,
David J. Love,
Christopher G. Brinton
Abstract:
Recently, deep learning approaches have provided solutions to difficult problems in wireless positioning (WP). Although these WP algorithms have attained excellent and consistent performance against complex channel environments, the computational complexity coming from processing high-dimensional features can be prohibitive for mobile applications. In this work, we design a novel positioning neural network (P-NN) that utilizes the minimum description features to substantially reduce the complexity of deep learning-based WP. P-NN's feature selection strategy is based on maximum power measurements and their temporal locations to convey information needed to conduct WP. We improve P-NN's learning ability by intelligently processing two different types of inputs: sparse image and measurement matrices. Specifically, we implement a self-attention layer to reinforce the training ability of our network. We also develop a technique to adapt feature space size, optimizing over the expected information gain and the classification capability quantified with information-theoretic measures on signal bin selection. Numerical results show that P-NN achieves a significant advantage in performance-complexity tradeoff over deep learning baselines that leverage the full power delay profile (PDP). In particular, we find that P-NN achieves a large improvement in performance for low SNR, as unnecessary measurements are discarded in our minimum description features.
Submitted 21 April, 2024;
originally announced April 2024.
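The abstract above bases its feature selection on maximum power measurements and their temporal locations. One plausible reading, sketched below, keeps the k strongest bins of a power delay profile together with their bin indices. The function name `min_description_features`, the example profile, and the choice k=2 are assumptions for illustration, not the paper's implementation.

```python
def min_description_features(pdp, k):
    # Indices of the k largest power values, strongest first.
    top = sorted(range(len(pdp)), key=lambda i: pdp[i], reverse=True)[:k]
    # Pair each retained power measurement with its temporal location (bin index).
    return [(pdp[i], i) for i in top]

# Made-up power delay profile with six delay bins.
pdp = [0.1, 0.9, 0.3, 0.05, 0.7, 0.2]
features = min_description_features(pdp, k=2)
```

Keeping only k (power, index) pairs instead of the full profile is what reduces the input dimension; how k should be chosen is the adaptive feature-space sizing the abstract mentions and is not modeled here.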
-
Multi-Agent Hybrid SAC for Joint SS-DSA in CRNs
Authors:
David R. Nickel,
Anindya Bijoy Das,
David J. Love,
Christopher G. Brinton
Abstract:
Opportunistic spectrum access has the potential to increase the efficiency of spectrum utilization in cognitive radio networks (CRNs). In CRNs, both spectrum sensing and resource allocation (SSRA) are critical to maximizing system throughput while minimizing collisions of secondary users with the primary network. However, many works in dynamic spectrum access do not consider the impact of imperfect sensing information such as mis-detected channels, which the additional information available in joint SSRA can help remediate. In this work, we examine joint SSRA as an optimization which seeks to maximize a CRN's net communication rate subject to constraints on channel sensing, channel access, and transmit power. Given the non-trivial nature of the problem, we leverage multi-agent reinforcement learning to enable a network of secondary users to dynamically access unoccupied spectrum via only local test statistics, formulated under the energy detection paradigm of spectrum sensing. In doing so, we develop a novel multi-agent implementation of hybrid soft actor critic, MHSAC, based on the QMIX mixing scheme. Through experiments, we find that our SSRA algorithm, HySSRA, is successful in maximizing the CRN's utilization of spectrum resources while also limiting its interference with the primary network, and outperforms the current state-of-the-art by a wide margin. We also explore the impact of wireless variations such as coherence time on the efficacy of the system.
Submitted 22 April, 2024;
originally announced April 2024.
-
RecurrentGemma: Moving Past Transformers for Efficient Open Language Models
Authors:
Aleksandar Botev,
Soham De,
Samuel L Smith,
Anushan Fernando,
George-Cristian Muraru,
Ruba Haroun,
Leonard Berrada,
Razvan Pascanu,
Pier Giuseppe Sessa,
Robert Dadashi,
Léonard Hussenot,
Johan Ferret,
Sertan Girgin,
Olivier Bachem,
Alek Andreev,
Kathleen Kenealy,
Thomas Mesnard,
Cassidy Hardin,
Surya Bhupatiraju,
Shreya Pathak,
Laurent Sifre,
Morgane Rivière,
Mihir Sanjay Kale,
Juliette Love,
Pouya Tafti
, et al. (37 additional authors not shown)
Abstract:
We introduce RecurrentGemma, an open language model which uses Google's novel Griffin architecture. Griffin combines linear recurrences with local attention to achieve excellent performance on language. It has a fixed-sized state, which reduces memory use and enables efficient inference on long sequences. We provide a pre-trained model with 2B non-embedding parameters, and an instruction tuned variant. Both models achieve comparable performance to Gemma-2B despite being trained on fewer tokens.
Submitted 11 April, 2024;
originally announced April 2024.
-
Gemma: Open Models Based on Gemini Research and Technology
Authors:
Gemma Team,
Thomas Mesnard,
Cassidy Hardin,
Robert Dadashi,
Surya Bhupatiraju,
Shreya Pathak,
Laurent Sifre,
Morgane Rivière,
Mihir Sanjay Kale,
Juliette Love,
Pouya Tafti,
Léonard Hussenot,
Pier Giuseppe Sessa,
Aakanksha Chowdhery,
Adam Roberts,
Aditya Barua,
Alex Botev,
Alex Castro-Ros,
Ambrose Slone,
Amélie Héliou,
Andrea Tacchetti,
Anna Bulanova,
Antonia Paterson,
Beth Tsai,
Bobak Shahriari
, et al. (83 additional authors not shown)
Abstract:
This work introduces Gemma, a family of lightweight, state-of-the-art open models built from the research and technology used to create Gemini models. Gemma models demonstrate strong performance across academic benchmarks for language understanding, reasoning, and safety. We release two sizes of models (2 billion and 7 billion parameters), and provide both pretrained and fine-tuned checkpoints. Gemma outperforms similarly sized open models on 11 out of 18 text-based tasks, and we present comprehensive evaluations of safety and responsibility aspects of the models, alongside a detailed description of model development. We believe the responsible release of LLMs is critical for improving the safety of frontier models, and for enabling the next wave of LLM innovations.
Submitted 16 April, 2024; v1 submitted 13 March, 2024;
originally announced March 2024.
-
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context
Authors:
Gemini Team,
Petko Georgiev,
Ving Ian Lei,
Ryan Burnell,
Libin Bai,
Anmol Gulati,
Garrett Tanzer,
Damien Vincent,
Zhufeng Pan,
Shibo Wang,
Soroosh Mariooryad,
Yifan Ding,
Xinyang Geng,
Fred Alcober,
Roy Frostig,
Mark Omernick,
Lexi Walker,
Cosmin Paduraru,
Christina Sorokin,
Andrea Tacchetti,
Colin Gaffney,
Samira Daruki,
Olcan Sercinoglu,
Zach Gleicher,
Juliette Love
, et al. (1092 additional authors not shown)
Abstract:
In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February version on the great majority of capabilities and benchmarks; (2) Gemini 1.5 Flash, a more lightweight variant designed for efficiency with minimal regression in quality. Gemini 1.5 models achieve near-perfect recall on long-context retrieval tasks across modalities, improve the state-of-the-art in long-document QA, long-video QA and long-context ASR, and match or surpass Gemini 1.0 Ultra's state-of-the-art performance across a broad set of benchmarks. Studying the limits of Gemini 1.5's long-context ability, we find continued improvement in next-token prediction and near-perfect retrieval (>99%) up to at least 10M tokens, a generational leap over existing models such as Claude 3.0 (200k) and GPT-4 Turbo (128k). Finally, we highlight real-world use cases, such as Gemini 1.5 collaborating with professionals on completing their tasks achieving 26 to 75% time savings across 10 different job categories, as well as surprising new capabilities of large language models at the frontier; when given a grammar manual for Kalamang, a language with fewer than 200 speakers worldwide, the model learns to translate English to Kalamang at a similar level to a person who learned from the same content.
Submitted 14 June, 2024; v1 submitted 8 March, 2024;
originally announced March 2024.
-
Complexity Reduction in Machine Learning-Based Wireless Positioning: Minimum Description Features
Authors:
Myeung Suk Oh,
Anindya Bijoy Das,
Taejoon Kim,
David J. Love,
Christopher G. Brinton
Abstract:
A recent line of research has been investigating deep learning approaches to wireless positioning (WP). Although these WP algorithms have demonstrated high accuracy and robust performance against diverse channel conditions, they also have a major drawback: they require processing high-dimensional features, which can be prohibitive for mobile applications. In this work, we design a positioning neural network (P-NN) that substantially reduces the complexity of deep learning-based WP through carefully crafted minimum description features. Our feature selection is based on maximum power measurements and their temporal locations to convey information needed to conduct WP. We also develop a novel methodology for adaptively selecting the size of feature space, which optimizes over balancing the expected amount of useful information and classification capability, quantified using information-theoretic measures on the signal bin selection. Numerical results show that P-NN achieves a significant advantage in performance-complexity tradeoff over deep learning baselines that leverage the full power delay profile (PDP).
Submitted 14 February, 2024;
originally announced February 2024.
-
Simulation-Enhanced Data Augmentation for Machine Learning Pathloss Prediction
Authors:
Ahmed P. Mohamed,
Byunghyun Lee,
Yaguang Zhang,
Max Hollingsworth,
C. Robert Anderson,
James V. Krogmeier,
David J. Love
Abstract:
Machine learning (ML) offers a promising solution to pathloss prediction. However, its effectiveness can be degraded by the limited availability of data. To alleviate these challenges, this paper introduces a novel simulation-enhanced data augmentation method for ML pathloss prediction. Our method integrates synthetic data generated from a cellular coverage simulator and independently collected real-world datasets. These datasets were collected through an extensive measurement campaign in different environments, including farms, hilly terrains, and residential areas. This comprehensive data collection provides vital ground truth for model training. A set of channel features was engineered, including geographical attributes derived from LiDAR datasets. These features were then used to train our prediction model, incorporating the highly efficient and robust gradient boosting ML algorithm, CatBoost. The integration of synthetic data, as demonstrated in our study, significantly improves the generalizability of the model in different environments, achieving a remarkable improvement of approximately 12dB in terms of mean absolute error for the best-case scenario. Moreover, our analysis reveals that even a small fraction of measurements added to the simulation training set, with proper data balance, can significantly enhance the model's performance.
Submitted 5 February, 2024; v1 submitted 2 February, 2024;
originally announced February 2024.
-
Coding for Gaussian Two-Way Channels: Linear and Learning-Based Approaches
Authors:
Junghoon Kim,
Taejoon Kim,
Anindya Bijoy Das,
Seyyedali Hosseinalipour,
David J. Love,
Christopher G. Brinton
Abstract:
Although user cooperation cannot improve the capacity of Gaussian two-way channels (GTWCs) with independent noises, it can improve communication reliability. In this work, we aim to enhance and balance the communication reliability in GTWCs by minimizing the sum of error probabilities via joint design of encoders and decoders at the users. We first formulate general encoding/decoding functions, where the user cooperation is captured by the coupling of user encoding processes. The coupling effect renders the encoder/decoder design non-trivial, requiring effective decoding to capture this effect, as well as efficient power management at the encoders within power constraints. To address these challenges, we propose two different two-way coding strategies: linear coding and learning-based coding. For linear coding, we propose optimal linear decoding and discuss new insights on encoding regarding user cooperation to balance reliability. We then propose an efficient algorithm for joint encoder/decoder design. For learning-based coding, we introduce a novel recurrent neural network (RNN)-based coding architecture, where we propose interactive RNNs and a power control layer for encoding, and we incorporate bi-directional RNNs with an attention mechanism for decoding. Through simulations, we show that our two-way coding methodologies outperform conventional channel coding schemes (that do not utilize user cooperation) significantly in sum-error performance. We also demonstrate that our linear coding excels at high signal-to-noise ratios (SNRs), while our RNN-based coding performs best at low SNRs. We further investigate our two-way coding strategies in terms of power distribution, two-way coding benefit, different coding rates, and block-length gain.
Submitted 31 December, 2023;
originally announced January 2024.
-
Modeling and Analysis of GEO Satellite Networks
Authors:
Dong-Hyun Jung,
Hongjae Nam,
Junil Choi,
David J. Love
Abstract:
The extensive coverage offered by satellites makes them effective in enhancing service continuity for users on dynamic airborne and maritime platforms, such as airplanes and ships. In particular, geosynchronous Earth orbit (GEO) satellites ensure stable connectivity for terrestrial users due to their stationary characteristics when observed from Earth. This paper introduces a novel approach to model and analyze GEO satellite networks using stochastic geometry. We model the distribution of GEO satellites in the geostationary orbit according to a binomial point process (BPP) and examine satellite visibility depending on the terminal's latitude. Then, we identify potential distribution cases for GEO satellites and derive case probabilities based on the properties of the BPP. We also obtain the distance distributions between the terminal and GEO satellites and derive the coverage probability of the network. We further approximate the derived expressions using the Poisson limit theorem. Monte Carlo simulations are performed to validate the analytical findings, demonstrating a strong alignment between the analyses and simulations. The simplified analytical results can be used to estimate the coverage performance of GEO satellite networks by effectively modeling the positions of GEO satellites.
Submitted 26 December, 2023;
originally announced December 2023.
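The dependence of GEO satellite visibility on terminal latitude, mentioned in the abstract above, can be illustrated with a small geometric sketch. It assumes a spherical Earth, a zero minimum elevation angle, and satellites placed independently and uniformly in longitude on the geostationary ring (one simple instance of a binomial point process). The constants and function names are illustrative assumptions, not the paper's derivation.

```python
import math

R_EARTH_KM = 6371.0   # mean Earth radius (assumed spherical Earth)
R_GEO_KM = 42164.0    # geostationary orbit radius measured from Earth's center

def visible_fraction(lat_deg):
    # A satellite at relative longitude lam is above the horizon when
    # cos(lat) * cos(lam) > R_EARTH / R_GEO (zero minimum elevation assumed).
    threshold = R_EARTH_KM / R_GEO_KM
    c = math.cos(math.radians(lat_deg))
    if c <= threshold:
        return 0.0  # no point of the geostationary ring is above the horizon
    # Visible longitudes satisfy |lam| < acos(threshold / c), out of a half-range of pi.
    return math.acos(threshold / c) / math.pi

def prob_visible_count(n_sats, k, lat_deg):
    # With n_sats independent uniform satellites, the number of visible
    # satellites is Binomial(n_sats, visible_fraction(lat_deg)).
    p = visible_fraction(lat_deg)
    return math.comb(n_sats, k) * p**k * (1 - p) ** (n_sats - k)
```

Under these assumptions, `visible_fraction(0)` is a little under one half, and the fraction shrinks to zero at high latitudes, which is the latitude dependence the abstract refers to.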
-
Cooperative Federated Learning over Ground-to-Satellite Integrated Networks: Joint Local Computation and Data Offloading
Authors:
Dong-Jun Han,
Seyyedali Hosseinalipour,
David J. Love,
Mung Chiang,
Christopher G. Brinton
Abstract:
While network coverage maps continue to expand, many devices located in remote areas remain unconnected to terrestrial communication infrastructures, preventing them from getting access to the associated data-driven services. In this paper, we propose a ground-to-satellite cooperative federated learning (FL) methodology to facilitate machine learning service management over remote regions. Our methodology orchestrates satellite constellations to provide the following key functions during FL: (i) processing data offloaded from ground devices, (ii) aggregating models within device clusters, and (iii) relaying models/data to other satellites via inter-satellite links (ISLs). Due to the limited coverage time of each satellite over a particular remote area, we facilitate satellite transmission of trained models and acquired data to neighboring satellites via ISL, so that the incoming satellite can continue conducting FL for the region. We theoretically analyze the convergence behavior of our algorithm, and develop a training latency minimizer which optimizes over satellite-specific network resources, including the amount of data to be offloaded from ground devices to satellites and satellites' computation speeds. Through experiments on three datasets, we show that our methodology can significantly speed up the convergence of FL compared with terrestrial-only and other satellite baseline approaches.
Submitted 23 December, 2023;
originally announced December 2023.
-
Gemini: A Family of Highly Capable Multimodal Models
Authors:
Gemini Team,
Rohan Anil,
Sebastian Borgeaud,
Jean-Baptiste Alayrac,
Jiahui Yu,
Radu Soricut,
Johan Schalkwyk,
Andrew M. Dai,
Anja Hauth,
Katie Millican,
David Silver,
Melvin Johnson,
Ioannis Antonoglou,
Julian Schrittwieser,
Amelia Glaese,
Jilin Chen,
Emily Pitler,
Timothy Lillicrap,
Angeliki Lazaridou,
Orhan Firat,
James Molloy,
Michael Isard,
Paul R. Barham,
Tom Hennigan,
Benjamin Lee
, et al. (1325 additional authors not shown)
Abstract:
This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultra model advances the state of the art in 30 of 32 of these benchmarks - notably being the first model to achieve human-expert performance on the well-studied exam benchmark MMLU, and improving the state of the art in every one of the 20 multimodal benchmarks we examined. We believe that the new capabilities of the Gemini family in cross-modal reasoning and language understanding will enable a wide variety of use cases. We discuss our approach toward post-training and deploying Gemini models responsibly to users through services including Gemini, Gemini Advanced, Google AI Studio, and Cloud Vertex AI.
Submitted 17 June, 2024; v1 submitted 18 December, 2023;
originally announced December 2023.
-
Preserving Sparsity and Privacy in Straggler-Resilient Distributed Matrix Computations
Authors:
Anindya Bijoy Das,
Aditya Ramamoorthy,
David J. Love,
Christopher G. Brinton
Abstract:
Existing approaches to distributed matrix computations involve allocating coded combinations of submatrices to worker nodes, to build resilience to stragglers and/or enhance privacy. In this study, we consider the challenge of preserving input sparsity in such approaches to retain the associated computational efficiency enhancements. First, we find a lower bound on the weight of coding, i.e., the number of submatrices to be combined to obtain coded submatrices to provide resilience to the maximum possible number of stragglers (for a given number of nodes and their storage constraints). Next, we propose a distributed matrix computation scheme which meets this exact lower bound on the weight of the coding. Further, we develop a controllable trade-off between worker computation time and the privacy constraint for sparse input matrices in settings where the worker nodes are honest but curious. Numerical experiments conducted in Amazon Web Services (AWS) validate our assertions regarding straggler mitigation and computation speed for sparse matrices.
Submitted 8 August, 2023;
originally announced August 2023.
-
Derandomizing Codes for the Binary Adversarial Wiretap Channel of Type II
Authors:
Eric Ruzomberka,
Homa Nikbakht,
Christopher G. Brinton,
David J. Love,
H. Vincent Poor
Abstract:
We revisit the binary adversarial wiretap channel (AWTC) of type II in which an active adversary can read a fraction $r$ and flip a fraction $p$ of codeword bits. The semantic-secrecy capacity of the AWTC II is partially known, where the best-known lower bound is non-constructive, proven via a random coding argument that uses a large number (that is exponential in blocklength $n$) of random bits to seed the random code. In this paper, we establish a new derandomization result in which we match the best-known lower bound of $1-H_2(p)-r$ where $H_2(\cdot)$ is the binary entropy function via a random code that uses a small seed of only $O(n^2)$ bits. Our random code construction is a novel application of pseudolinear codes -- a class of non-linear codes that have $k$-wise independent codewords when picked at random where $k$ is a design parameter. As the key technical tool in our analysis, we provide a soft-covering lemma in the flavor of Goldfeld, Cuff and Permuter (Trans. Inf. Theory 2016) that holds for random codes with $k$-wise independent codewords.
Submitted 9 July, 2023;
originally announced July 2023.
-
Reliability and repeatability of ISO 3382-3 metrics based on repeated acoustic measurements in open-plan offices
Authors:
Manuj Yadav,
Densil Cabrera,
James Love,
Jungsoo Kim,
Jonothan Holmes,
Hugo Caldwell,
Richard de Dear
Abstract:
This paper investigates variability in the key ISO 3382-3:2012 metrics, based primarily on the repeatability and reliability of these metrics, using repeated measurements in open-plan offices. Two types of repeated measurements were performed in offices, Type1 (n=36), where the same path over workstations was measured from opposite ends, and Type2 (n=7), where two different measurement paths were measured. Overall, most of the Type1 results seem reasonable considering repeats were conducted in complicated room acoustic environments, while Type2 repeats would benefit from larger sample sizes in future studies. Some recommendations are outlined for the ISO 3382-3 methodology vis-a-vis Type1 and Type2 repeats, including future research directions that go beyond increased sample sizes. (This is an abridged version of the abstract. Please see the paper for the full abstract)
Submitted 17 June, 2023;
originally announced June 2023.
-
Adversarial Channels with O(1)-Bit Partial Feedback
Authors:
Eric Ruzomberka,
Yongkyu Jang,
David J. Love,
H. Vincent Poor
Abstract:
We consider point-to-point communication over $q$-ary adversarial channels with partial noiseless feedback. In this setting, a sender Alice transmits $n$ symbols from a $q$-ary alphabet over a noisy forward channel to a receiver Bob, while Bob sends feedback to Alice over a noiseless reverse channel. In the forward channel, an adversary can inject both symbol errors and erasures up to an error fraction $p \in [0,1]$ and erasure fraction $r \in [0,1]$, respectively. In the reverse channel, Bob's feedback is partial such that he can send at most $B(n) \geq 0$ bits during the communication session.
As a case study on minimal partial feedback, we initiate the study of the $O(1)$-bit feedback setting in which $B$ is $O(1)$ in $n$. As our main result, we provide a tight characterization of zero-error capacity under $O(1)$-bit feedback for all $q \geq 2$, $p \in [0,1]$ and $r \in [0,1]$, which we prove via novel achievability and converse schemes inspired by recent studies of causal adversarial channels without feedback. Perhaps surprisingly, we show that $O(1)$ bits of feedback are sufficient to achieve the zero-error capacity of the $q$-ary adversarial error channel with full feedback when the error fraction $p$ is sufficiently small.
Submitted 23 May, 2023;
originally announced May 2023.
-
Towards Cooperative Federated Learning over Heterogeneous Edge/Fog Networks
Authors:
Su Wang,
Seyyedali Hosseinalipour,
Vaneet Aggarwal,
Christopher G. Brinton,
David J. Love,
Weifeng Su,
Mung Chiang
Abstract:
Federated learning (FL) has been promoted as a popular technique for training machine learning (ML) models over edge/fog networks. Traditional implementations of FL have largely neglected the potential for inter-network cooperation, treating edge/fog devices and other infrastructure participating in ML as separate processing elements. Consequently, FL has been vulnerable to several dimensions of network heterogeneity, such as varying computation capabilities, communication resources, data qualities, and privacy demands. We advocate for cooperative federated learning (CFL), a cooperative edge/fog ML paradigm built on device-to-device (D2D) and device-to-server (D2S) interactions. Through D2D and D2S cooperation, CFL counteracts network heterogeneity in edge/fog networks through enabling a model/data/resource pooling mechanism, which will yield substantial improvements in ML model training quality and network resource consumption. We propose a set of core methodologies that form the foundation of D2D and D2S cooperation and present preliminary experiments that demonstrate their benefits. We also discuss new FL functionalities enabled by this cooperative framework such as the integration of unlabeled data and heterogeneous device privacy into ML model training. Finally, we describe some open research directions at the intersection of cooperative edge/fog and FL.
Submitted 15 March, 2023;
originally announced March 2023.
-
Coded Matrix Computations for D2D-enabled Linearized Federated Learning
Authors:
Anindya Bijoy Das,
Aditya Ramamoorthy,
David J. Love,
Christopher G. Brinton
Abstract:
Federated learning (FL) is a popular technique for training a global model on data distributed across client devices. Like other distributed training techniques, FL is susceptible to straggler (slower or failed) clients. Recent work has proposed to address this through device-to-device (D2D) offloading, which introduces privacy concerns. In this paper, we propose a novel straggler-optimal approach for coded matrix computations which can significantly reduce the communication delay and privacy issues introduced from D2D data transmissions in FL. Moreover, our proposed approach leads to a considerable improvement of the local computation speed when the generated data matrix is sparse. Numerical evaluations confirm the superiority of our proposed method over baseline approaches.
Submitted 23 February, 2023;
originally announced February 2023.
-
Propagation Measurements and Analyses at 28 GHz via an Autonomous Beam-Steering Platform
Authors:
Bharath Keshavamurthy,
Yaguang Zhang,
Christopher R. Anderson,
Nicolo Michelusi,
James V. Krogmeier,
David J. Love
Abstract:
This paper details the design of an autonomous alignment and tracking platform to mechanically steer directional horn antennas in a sliding correlator channel sounder setup for 28 GHz V2X propagation modeling. A pan-and-tilt subsystem facilitates uninhibited rotational mobility along the yaw and pitch axes, driven by open-loop servo units and orchestrated via inertial motion controllers. A geo-positioning subsystem augmented in accuracy by real-time kinematics enables navigation events to be shared between a transmitter and receiver over an Apache Kafka messaging middleware framework with fault tolerance. Herein, our system demonstrates a 3D geo-positioning accuracy of 17 cm, an average principal axes positioning accuracy of 1.1 degrees, and an average tracking response time of 27.8 ms. Crucially, fully autonomous antenna alignment and tracking facilitates continuous series of measurements, a unique yet critical necessity for millimeter wave channel modeling in vehicular networks. The power-delay profiles, collected along routes spanning urban and suburban neighborhoods on the NSF POWDER testbed, are used in pathloss evaluations involving the 3GPP TR38.901 and ITU-R M.2135 standards. Empirically, we demonstrate that these models fail to accurately capture the 28 GHz pathloss behavior in urban foliage and suburban radio environments. In addition to RMS direction-spread analyses for angles-of-arrival via the SAGE algorithm, we perform signal decoherence studies wherein we derive exponential models for the spatial/angular autocorrelation coefficient under distance and alignment effects.
Submitted 16 February, 2023;
originally announced February 2023.
-
Distributed Matrix Computations with Low-weight Encodings
Authors:
Anindya Bijoy Das,
Aditya Ramamoorthy,
David J. Love,
Christopher G. Brinton
Abstract:
Straggler nodes are well-known bottlenecks of distributed matrix computations which induce reductions in computation/communication speeds. A common strategy for mitigating such stragglers is to incorporate Reed-Solomon based MDS (maximum distance separable) codes into the framework; this can achieve resilience against an optimal number of stragglers. However, these codes assign dense linear combinations of submatrices to the worker nodes. When the input matrices are sparse, these approaches increase the number of non-zero entries in the encoded matrices, which in turn adversely affects the worker computation time. In this work, we develop a distributed matrix computation approach where the assigned encoded submatrices are random linear combinations of a small number of submatrices. In addition to being well suited for sparse input matrices, our approach continues to have the optimal straggler resilience in a certain range of problem parameters. Moreover, compared to recent sparse matrix computation approaches, the search for a "good" set of random coefficients to promote numerical stability in our method is much more computationally efficient. We show that our approach can efficiently utilize partial computations done by slower worker nodes in a heterogeneous system, which can enhance the overall computation speed. Numerical experiments conducted through Amazon Web Services (AWS) demonstrate up to 30% reduction in per worker node computation time and 100x faster encoding compared to the available methods.
Submitted 22 August, 2023; v1 submitted 30 January, 2023;
originally announced January 2023.
-
Linear Coding for Gaussian Two-Way Channels
Authors:
Junghoon Kim,
Seyyedali Hosseinalipour,
Taejoon Kim,
David J. Love,
Christopher G. Brinton
Abstract:
We consider linear coding for Gaussian two-way channels (GTWCs), in which each user generates the transmit symbols by linearly encoding both its message and the past received symbols (i.e., the feedback information) from the other user. In Gaussian one-way channels (GOWCs), Butman has proposed a well-developed model for linear encoding that encapsulates feedback information into transmit signals. However, such a model for GTWCs has not been well studied since the coupling of the encoding processes at the users in GTWCs renders the encoding design non-trivial and challenging. In this paper, we aim to fill this gap in the literature by extending the existing signal models in GOWCs to GTWCs. With our developed signal model for GTWCs, we formulate an optimization problem to jointly design the encoding/decoding schemes for both the users, aiming to minimize the weighted sum of their transmit powers under signal-to-noise ratio constraints. First, we derive an optimal form of the linear decoding schemes under any arbitrary encoding schemes employed at the users. Further, we provide new insights on the encoding design for GTWCs. In particular, we show that it is optimal that one of the users (i) does not transmit the feedback information to the other user at the last channel use, and (ii) transmits its message only over the last channel use. With these solution behaviors, we further simplify the problem and solve it via an iterative two-way optimization scheme. We numerically demonstrate that our proposed scheme for GTWCs achieves a better performance in terms of the transmit power compared to the existing counterparts, such as the non-feedback scheme and one-way optimization scheme.
Submitted 29 October, 2022;
originally announced October 2022.
-
Massive MIMO Channel Prediction Via Meta-Learning and Deep Denoising: Is a Small Dataset Enough?
Authors:
Hwanjin Kim,
Junil Choi,
David J. Love
Abstract:
Accurate channel knowledge is critical in massive multiple-input multiple-output (MIMO), which motivates the use of channel prediction. Machine learning techniques for channel prediction hold much promise, but current schemes are limited in their ability to adapt to changes in the environment because they require large training overheads. To accurately predict wireless channels for new environments with reduced training overhead, we propose a fast adaptive channel prediction technique based on a meta-learning algorithm for massive MIMO communications. We exploit the model-agnostic meta-learning (MAML) algorithm to achieve quick adaptation with a small amount of labeled data. Also, to improve the prediction accuracy, we adopt the denoising process for the training data by using deep image prior (DIP). Numerical results show that the proposed MAML-based channel predictor can improve the prediction accuracy with only a few fine-tuning samples. The DIP-based denoising process gives an additional gain in channel prediction, especially in low signal-to-noise ratio regimes.
Submitted 17 October, 2022;
originally announced October 2022.
-
A Primer on Rate-Splitting Multiple Access: Tutorial, Myths, and Frequently Asked Questions
Authors:
Bruno Clerckx,
Yijie Mao,
Eduard A. Jorswieck,
Jinhong Yuan,
David J. Love,
Elza Erkip,
Dusit Niyato
Abstract:
Rate-Splitting Multiple Access (RSMA) has emerged as a powerful multiple access, interference management, and multi-user strategy for next generation communication systems. In this tutorial, we depart from the orthogonal multiple access (OMA) versus non-orthogonal multiple access (NOMA) discussion held in 5G, and the conventional multi-user linear precoding approach used in space-division multiple access (SDMA), multi-user and massive MIMO in 4G and 5G, and show how multi-user communications and multiple access design for 6G and beyond should be intimately related to the fundamental problem of interference management. We start from foundational principles of interference management and rate-splitting, and progressively delineate RSMA frameworks for downlink, uplink, and multi-cell networks. We show that, in contrast to past generations of multiple access techniques (OMA, NOMA, SDMA), RSMA offers numerous benefits. We then discuss how those benefits translate into numerous opportunities for RSMA in over forty different applications and scenarios of 6G. We finally address common myths and answer frequently asked questions, opening the discussions to interesting future research avenues. Supported by the numerous benefits and applications, the tutorial concludes on the underpinning role played by RSMA in next generation networks, which should inspire future research, development, and standardization of RSMA-aided communication for 6G.
Submitted 10 January, 2023; v1 submitted 1 September, 2022;
originally announced September 2022.
-
Deep Reinforcement Learning-Based Adaptive IRS Control with Limited Feedback Codebooks
Authors:
Junghoon Kim,
Seyyedali Hosseinalipour,
Andrew C. Marcum,
Taejoon Kim,
David J. Love,
Christopher G. Brinton
Abstract:
Intelligent reflecting surfaces (IRS) consist of configurable meta-atoms, which can alter the wireless propagation environment through design of their reflection coefficients. We consider adaptive IRS control in the practical setting where (i) the IRS reflection coefficients are attained by adjusting tunable elements embedded in the meta-atoms, (ii) the IRS reflection coefficients are affected by the incident angles of the incoming signals, (iii) the IRS is deployed in multi-path, time-varying channels, and (iv) the feedback link from the base station (BS) to the IRS has a low data rate. Conventional optimization-based IRS control protocols, which rely on channel estimation and conveying the optimized variables to the IRS, are not practical in this setting due to the difficulty of channel estimation and the low data rate of the feedback channel. To address these challenges, we develop a novel adaptive codebook-based limited feedback protocol to control the IRS. We propose two solutions for adaptive IRS codebook design: (i) random adjacency (RA), which utilizes correlations across the channel realizations, and (ii) deep neural network policy-based IRS control (DPIC), which is based on deep reinforcement learning. Numerical evaluations show that the data rate and average data rate over one coherence time are improved substantially by the proposed schemes.
Submitted 7 May, 2022;
originally announced May 2022.
-
Multi-Edge Server-Assisted Dynamic Federated Learning with an Optimized Floating Aggregation Point
Authors:
Bhargav Ganguly,
Seyyedali Hosseinalipour,
Kwang Taik Kim,
Christopher G. Brinton,
Vaneet Aggarwal,
David J. Love,
Mung Chiang
Abstract:
We propose cooperative edge-assisted dynamic federated learning (CE-FL). CE-FL introduces a distributed machine learning (ML) architecture, where data collection is carried out at the end devices, while the model training is conducted cooperatively at the end devices and the edge servers, enabled via data offloading from the end devices to the edge servers through base stations. CE-FL also introduces a floating aggregation point, where the local models generated at the devices and the servers are aggregated at an edge server, which varies from one model training round to another to cope with the network evolution in terms of data distribution and users' mobility. CE-FL considers the heterogeneity of network elements in terms of communication/computation models and the proximity to one another. CE-FL further presumes a dynamic environment with online variation of data at the network devices, which causes a drift in the ML model performance. We model the processes taken during CE-FL, and conduct analytical convergence analysis of its ML model training. We then formulate network-aware CE-FL, which aims to adaptively optimize all the network elements via tuning their contribution to the learning process, which turns out to be a non-convex mixed integer problem. Motivated by the large scale of the system, we propose a distributed optimization solver to break down the computation of the solution across the network elements. We finally demonstrate the effectiveness of our framework with the data collected from a real-world testbed.
Submitted 22 October, 2022; v1 submitted 25 March, 2022;
originally announced March 2022.
-
Latency Optimization for Blockchain-Empowered Federated Learning in Multi-Server Edge Computing
Authors:
Dinh C. Nguyen,
Seyyedali Hosseinalipour,
David J. Love,
Pubudu N. Pathirana,
Christopher G. Brinton
Abstract:
In this paper, we study a new latency optimization problem for blockchain-based federated learning (BFL) in multi-server edge computing. In this system model, distributed mobile devices (MDs) communicate with a set of edge servers (ESs) to handle both machine learning (ML) model training and block mining simultaneously. To assist the ML model training for resource-constrained MDs, we develop an offloading strategy that enables MDs to transmit their data to one of the associated ESs. We then propose a new decentralized ML model aggregation solution at the edge layer based on a consensus mechanism to build a global ML model via peer-to-peer (P2P)-based blockchain communications. Blockchain builds trust among MDs and ESs to facilitate reliable ML model sharing and cooperative consensus formation, and enables rapid elimination of manipulated models caused by poisoning attacks. We formulate latency-aware BFL as an optimization aiming to minimize the system latency via joint consideration of the data offloading decisions, MDs' transmit power, channel bandwidth allocation for MDs' data offloading, MDs' computational allocation, and hash power allocation. Given the mixed action space of discrete offloading and continuous allocation variables, we propose a novel deep reinforcement learning scheme with a parameterized advantage actor critic algorithm. We theoretically characterize the convergence properties of BFL in terms of the aggregation delay, mini-batch size, and number of P2P communication rounds. Our numerical evaluation demonstrates the superiority of our proposed scheme over baselines in terms of model training efficiency, convergence rate, system latency, and robustness against model poisoning attacks.
Submitted 3 July, 2022; v1 submitted 17 March, 2022;
originally announced March 2022.
-
CAFQA: A classical simulation bootstrap for variational quantum algorithms
Authors:
Gokul Subramanian Ravi,
Pranav Gokhale,
Yi Ding,
William M. Kirby,
Kaitlin N. Smith,
Jonathan M. Baker,
Peter J. Love,
Henry Hoffmann,
Kenneth R. Brown,
Frederic T. Chong
Abstract:
This work tackles the problem of finding a good ansatz initialization for Variational Quantum Algorithms (VQAs), by proposing CAFQA, a Clifford Ansatz For Quantum Accuracy. The CAFQA ansatz is a hardware-efficient circuit built with only Clifford gates. In this ansatz, the parameters for the tunable gates are chosen by searching efficiently through the Clifford parameter space via classical simulation. The resulting initial states always equal or outperform traditional classical initialization (e.g., Hartree-Fock), and enable high-accuracy VQA estimations. CAFQA is well-suited to classical computation because: a) Clifford-only quantum circuits can be exactly simulated classically in polynomial time, and b) the discrete Clifford space is searched efficiently via Bayesian Optimization.
For the Variational Quantum Eigensolver (VQE) task of molecular ground state energy estimation (up to 18 qubits), CAFQA's Clifford Ansatz achieves a mean accuracy of nearly 99% and recovers as much as 99.99% of the molecular correlation energy that is lost in Hartree-Fock initialization. CAFQA achieves mean accuracy improvements of 6.4x and 56.8x, over the state-of-the-art, on different metrics. The scalability of the approach allows for preliminary ground state energy estimation of the challenging chromium dimer (Cr$_2$) molecule. With CAFQA's high-accuracy initialization, the convergence of VQAs is shown to accelerate by 2.5x, even for small molecules.
Furthermore, preliminary exploration of allowing a limited number of non-Clifford (T) gates in the CAFQA framework shows that as much as 99.9% of the correlation energy can be recovered at bond lengths for which Clifford-only CAFQA accuracy is relatively limited, while remaining classically simulable.
Submitted 29 September, 2023; v1 submitted 25 February, 2022;
originally announced February 2022.
-
Parallel Successive Learning for Dynamic Distributed Model Training over Heterogeneous Wireless Networks
Authors:
Seyyedali Hosseinalipour,
Su Wang,
Nicolo Michelusi,
Vaneet Aggarwal,
Christopher G. Brinton,
David J. Love,
Mung Chiang
Abstract:
Federated learning (FedL) has emerged as a popular technique for distributing model training over a set of wireless devices, via iterative local updates (at devices) and global aggregations (at the server). In this paper, we develop parallel successive learning (PSL), which expands the FedL architecture along three dimensions: (i) Network, allowing decentralized cooperation among the devices via device-to-device (D2D) communications. (ii) Heterogeneity, interpreted at three levels: (ii-a) Learning: PSL considers heterogeneous number of stochastic gradient descent iterations with different mini-batch sizes at the devices; (ii-b) Data: PSL presumes a dynamic environment with data arrival and departure, where the distributions of local datasets evolve over time, captured via a new metric for model/concept drift. (ii-c) Device: PSL considers devices with different computation and communication capabilities. (iii) Proximity, where devices have different distances to each other and the access point. PSL considers the realistic scenario where global aggregations are conducted with idle times in-between them for resource efficiency improvements, and incorporates data dispersion and model dispersion with local model condensation into FedL. Our analysis sheds light on the notion of cold vs. warmed up models, and model inertia in distributed machine learning. We then propose network-aware dynamic model tracking to optimize the model learning vs. resource efficiency tradeoff, which we show is an NP-hard signomial programming problem. We finally solve this problem through proposing a general optimization solver. Our numerical results reveal new findings on the interdependencies between the idle times in-between the global aggregations, model/concept drift, and D2D cooperation configuration.
Submitted 14 June, 2023; v1 submitted 7 February, 2022;
originally announced February 2022.
-
Channel Capacity for Adversaries with Computationally Bounded Observations
Authors:
Eric Ruzomberka,
Chih-Chun Wang,
David J. Love
Abstract:
We study reliable communication over point-to-point adversarial channels in which the adversary can observe the transmitted codeword via some function that takes the $n$-bit codeword as input and computes an $rn$-bit output for some given $r \in [0,1]$. We consider the scenario where the $rn$-bit observation is computationally bounded -- the adversary is free to choose an arbitrary observation function as long as the function can be computed using a polynomial amount of computational resources. This observation-based restriction differs from conventional channel-based computational limitations, where in the latter case, the resource limitation applies to the computation of the (adversarial) channel error. For all $r \in [0,1-H(p)]$ where $H(\cdot)$ is the binary entropy function and $p$ is the adversary's error budget, we characterize the capacity of the above channel. For this range of $r$, we find that the capacity is identical to the completely oblivious setting ($r=0$). This result can be viewed as a generalization of known results on myopic adversaries and channels with active eavesdroppers for which the observation process depends on a fixed distribution and fixed-linear structure, respectively, that cannot be chosen arbitrarily by the adversary.
Submitted 4 November, 2023; v1 submitted 6 February, 2022;
originally announced February 2022.
-
Practical Distributed Reception for Wireless Body Area Networks Using Supervised Learning
Authors:
Jihoon Cha,
Junil Choi,
David J. Love
Abstract:
Medical applications have driven many areas of engineering to optimize diagnostic capabilities and convenience. In the near future, wireless body area networks (WBANs) are expected to have widespread impact in medicine. To achieve this impact, however, significant advances in research are needed to cope with the changes of the human body's state, which make coherent communications difficult or even impossible. In this paper, we consider a realistic noncoherent WBAN system model where transmissions and receptions are conducted without any channel state information due to the fast-varying channels of the human body. Using distributed reception, we propose several symbol detection approaches where on-off keying (OOK) modulation is exploited, among which a supervised-learning-based approach is developed to overcome the noncoherent system issue. Through simulation results, we compare and verify the performance of the proposed techniques for noncoherent WBANs with OOK transmissions. We show that the well-defined detection techniques with a supervised-learning-based approach enable robust communications for noncoherent WBAN systems.
Submitted 14 December, 2021;
originally announced December 2021.
-
A Robotic Antenna Alignment and Tracking System for Millimeter Wave Propagation Modeling
Authors:
Bharath Keshavamurthy,
Yaguang Zhang,
Christopher R. Anderson,
Nicolo Michelusi,
James V. Krogmeier,
David J. Love
Abstract:
In this paper, we discuss the design of a sliding-correlator channel sounder for 28 GHz propagation modeling on the NSF POWDER testbed in Salt Lake City, UT. Beam-alignment is mechanically achieved via a fully autonomous robotic antenna tracking platform, designed using commercial off-the-shelf components. Equipped with an Apache Zookeeper/Kafka managed fault-tolerant publish-subscribe framework, we demonstrate tracking response times of 27.8 ms, in addition to superior scalability over state-of-the-art mechanical beam-steering systems. Enhanced with real-time kinematic correction streams, our geo-positioning subsystem achieves a 3D accuracy of 17 cm, while our principal axes positioning subsystem achieves an average accuracy of 1.1 degrees across yaw and pitch movements. Finally, by facilitating remote orchestration (via managed containers), uninhibited rotation (via encapsulation), and real-time positioning visualization (via Dash/MapBox), we exhibit a proven prototype well-suited for V2X measurements.
Submitted 13 October, 2021;
originally announced October 2021.
-
Limitations of Local Quantum Algorithms on Random Max-k-XOR and Beyond
Authors:
Chi-Ning Chou,
Peter J. Love,
Juspreet Singh Sandhu,
Jonathan Shi
Abstract:
We introduce a notion of \emph{generic local algorithm} which strictly generalizes existing frameworks of local algorithms such as \emph{factors of i.i.d.} by capturing local \emph{quantum} algorithms such as the Quantum Approximate Optimization Algorithm (QAOA).
Motivated by a question of Farhi et al. [arXiv:1910.08187, 2019] we then show limitations of generic local algorithms including QAOA on random instances of constraint satisfaction problems (CSPs). Specifically, we show that any generic local algorithm whose assignment to a vertex depends only on a local neighborhood with $o(n)$ other vertices (such as the QAOA at depth less than $ε\log(n)$) cannot arbitrarily-well approximate boolean CSPs if the problem satisfies a geometric property from statistical physics called the coupled overlap-gap property (OGP) [Chen et al., Annals of Probability, 47(3), 2019]. We show that the random MAX-k-XOR problem has this property when $k\geq4$ is even by extending the corresponding result for diluted $k$-spin glasses.
Our concentration lemmas confirm a conjecture of Brandao et al. [arXiv:1812.04170, 2018] asserting that the landscape independence of QAOA extends to logarithmic depth -- in other words, for every fixed choice of QAOA angle parameters, the algorithm at logarithmic depth performs almost equally well on almost all instances. One of these concentration lemmas is a strengthening of McDiarmid's inequality, applicable when the random variables have a highly biased distribution, and may be of independent interest.
Submitted 21 February, 2022; v1 submitted 12 August, 2021;
originally announced August 2021.
-
Spatial Analysis of Physical Reservoir Computers
Authors:
Jake Love,
Jeroen Mulkers,
Robin Msiska,
George Bourianoff,
Jonathan Leliaert,
Karin Everschor-Sitte
Abstract:
Physical reservoir computing is a computational framework that implements spatiotemporal information processing directly within physical systems. By exciting nonlinear dynamical systems and creating linear models from their state, we can create highly energy-efficient devices capable of solving machine learning tasks without building a modular system consisting of millions of neurons interconnected by synapses. To act as an effective reservoir, the chosen dynamical system must have two desirable properties: nonlinearity and memory. We present task agnostic spatial measures to locally measure both of these properties and exemplify them for a specific physical reservoir based upon magnetic skyrmion textures. In contrast to typical reservoir computing metrics, these metrics can be resolved spatially and in parallel from a single input signal, allowing for efficient parameter search to design efficient and high-performance reservoirs. Additionally, we show the natural trade-off between memory capacity and nonlinearity in our reservoir's behaviour, both locally and globally. Finally, by balancing the memory and nonlinearity in a reservoir, we can improve its performance for specific tasks.
Submitted 14 November, 2022; v1 submitted 3 August, 2021;
originally announced August 2021.
-
Stochastic-Adversarial Channels : Online Adversaries With Feedback Snooping
Authors:
Vinayak Suresh,
Eric Ruzomberka,
David J. Love
Abstract:
The growing need for reliable communication over untrusted networks has caused a renewed interest in adversarial channel models, which often behave much differently than traditional stochastic channel models. Of particular practical use is the assumption of a \textit{causal} or \textit{online} adversary who is limited to causal knowledge of the transmitted codeword. In this work, we consider stochastic-adversarial mixed noise models. In the set-up considered, a transmit node (Alice) attempts to communicate with a receive node (Bob) over a binary erasure channel (BEC) or binary symmetric channel (BSC) in the presence of an online adversary (Calvin) who can erase or flip up to a certain number of bits at the input of the channel. Calvin knows the encoding scheme and has causal access to Bob's reception through \textit{feedback snooping}. For erasures, we provide a complete capacity characterization with and without transmitter feedback. For bit-flips, we provide interesting converse and achievability bounds.
Submitted 14 April, 2021;
originally announced April 2021.
-
Channel Estimation via Successive Denoising in MIMO OFDM Systems: A Reinforcement Learning Approach
Authors:
Myeung Suk Oh,
Seyyedali Hosseinalipour,
Taejoon Kim,
Christopher G. Brinton,
David J. Love
Abstract:
In general, reliable communication via multiple-input multiple-output (MIMO) orthogonal frequency division multiplexing (OFDM) requires accurate channel estimation at the receiver. The existing literature largely focuses on denoising methods for channel estimation that depend on either (i) channel analysis in the time-domain with prior channel knowledge or (ii) supervised learning techniques which require large pre-labeled datasets for training. To address these limitations, we present a frequency-domain denoising method based on a reinforcement learning framework that does not need a priori channel knowledge and pre-labeled data. Our methodology includes a new successive channel denoising process based on channel curvature computation, for which we obtain a channel curvature magnitude threshold to identify unreliable channel estimates. Based on this process, we formulate the denoising mechanism as a Markov decision process, where we define the actions through a geometry-based channel estimation update, and the reward function based on a policy that reduces mean squared error (MSE). We then resort to Q-learning to update the channel estimates. Numerical results verify that our denoising algorithm can successfully mitigate noise in channel estimates. In particular, our algorithm provides a significant improvement over the practical least squares (LS) estimation method and provides performance that approaches that of the ideal linear minimum mean square error (LMMSE) estimation with perfect knowledge of channel statistics.
Submitted 27 March, 2024; v1 submitted 25 January, 2021;
originally announced January 2021.
-
Is NOMA Efficient in Multi-Antenna Networks? A Critical Look at Next Generation Multiple Access Techniques
Authors:
Bruno Clerckx,
Yijie Mao,
Robert Schober,
Eduard Jorswieck,
David J. Love,
Jinhong Yuan,
Lajos Hanzo,
Geoffrey Ye Li,
Erik G. Larsson,
Giuseppe Caire
Abstract:
In this paper, we take a critical and fresh look at the downlink multi-antenna NOMA literature. Instead of contrasting NOMA with OMA, we contrast NOMA with two other baselines. The first is conventional Multi-User Linear Precoding (MULP). The second is Rate-Splitting Multiple Access (RSMA) based on multi-antenna Rate-Splitting (RS) and SIC. We show that there is some confusion about the benefits of NOMA, and we dispel the associated misconceptions. First, we highlight why NOMA is inefficient in multi-antenna settings based on basic multiplexing gain analysis. We stress that the issue lies in how the NOMA literature has been hastily applied to multi-antenna setups, resulting in a misuse of spatial dimensions and therefore loss in multiplexing gains and rate. Second, we show that NOMA incurs a severe multiplexing gain loss despite an increased receiver complexity due to an inefficient use of SIC receivers. Third, we emphasize that many of the perceived merits of NOMA are due to the constant comparison to OMA instead of comparing it to MULP and RS baselines. We then expose the pivotal design constraint that multi-antenna NOMA requires one user to fully decode the messages of the other users. This design constraint is responsible for the multiplexing gain erosion, rate loss, and inefficient use of SIC receivers in multi-antenna settings. Our results confirm that NOMA should not be applied blindly to multi-antenna settings, highlight the scenarios where MULP outperforms NOMA and vice versa, and demonstrate the inefficiency, performance loss and complexity disadvantages of NOMA compared to RS. The first takeaway message is that NOMA is not beneficial in most multi-antenna deployments. The second takeaway message is that other non-orthogonal transmission frameworks, such as RS, exist which fully exploit the multiplexing gain and the benefits of SIC to boost the rate in multi-antenna settings.
Submitted 12 January, 2021;
originally announced January 2021.
-
Multi-IRS-assisted Multi-Cell Uplink MIMO Communications under Imperfect CSI: A Deep Reinforcement Learning Approach
Authors:
Junghoon Kim,
Seyyedali Hosseinalipour,
Taejoon Kim,
David J. Love,
Christopher G. Brinton
Abstract:
Applications of intelligent reflecting surfaces (IRSs) in wireless networks have attracted significant attention recently. Most of the relevant literature is focused on the single cell setting where a single IRS is deployed and perfect channel state information (CSI) is assumed. In this work, we develop a novel methodology for multi-IRS-assisted multi-cell networks in the uplink. We consider the scenario in which (i) channels are dynamic and (ii) only partial CSI is available at each base station (BS); specifically, scalar effective channel powers from only a subset of user equipments (UE). We formulate the sum-rate maximization problem aiming to jointly optimize the IRS reflect beamformers, BS combiners, and UE transmit powers. In casting this as a sequential decision making problem, we propose a multi-agent deep reinforcement learning algorithm to solve it, where each BS acts as an independent agent in charge of tuning the local UE transmit powers, the local IRS reflect beamformer, and its combiners. We introduce an efficient information-sharing scheme that requires limited information exchange among neighboring BSs to cope with the non-stationarity caused by the coupling of actions taken by multiple BSs. Our numerical results show that our method obtains substantial improvement in average data rate compared to baseline approaches, e.g., fixed UE transmit power and maximum ratio combining.
Submitted 1 April, 2021; v1 submitted 2 November, 2020;
originally announced November 2020.
-
Frequency-based Automated Modulation Classification in the Presence of Adversaries
Authors:
Rajeev Sahay,
Christopher G. Brinton,
David J. Love
Abstract:
Automatic modulation classification (AMC) aims to improve the efficiency of crowded radio spectrums by automatically predicting the modulation constellation of wireless RF signals. Recent work has demonstrated the ability of deep learning to achieve robust AMC performance using raw in-phase and quadrature (IQ) time samples. Yet, deep learning models are highly susceptible to adversarial interference, which causes intelligent prediction models to misclassify received samples with high confidence. Furthermore, adversarial interference is often transferable, allowing an adversary to attack multiple deep learning models with a single perturbation crafted for a particular classification network. In this work, we present a novel receiver architecture consisting of deep learning models capable of withstanding transferable adversarial interference. Specifically, we show that adversarial attacks crafted to fool models trained on time-domain features are not easily transferable to models trained using frequency-domain features. In this capacity, we demonstrate classification performance improvements greater than 30% on recurrent neural networks (RNNs) and greater than 50% on convolutional neural networks (CNNs). We further demonstrate that our frequency feature-based classification models achieve accuracies greater than 99% in the absence of attacks.
Submitted 19 February, 2021; v1 submitted 2 November, 2020;
originally announced November 2020.
-
Noncoherent OOK Symbol Detection with Supervised-Learning Approach for BCC
Authors:
Jihoon Cha,
Junil Choi,
David J. Love
Abstract:
There has been a continuing demand for improving the accuracy and ease of use of medical devices used on or around the human body. Communication is critical to medical applications, and wireless body area networks (WBANs) have the potential to revolutionize diagnosis. Despite its importance, WBAN technology is still in its infancy and requires much research. We consider body channel communication (BCC), which uses the whole body as well as the skin as a medium for communication. BCC is sensitive to the body's natural circulation and movement, which requires a noncoherent model for wireless communication. To accurately handle practical applications for electronic devices working on or inside a human body, we configure a realistic system model for BCC with on-off keying (OOK) modulation. We propose novel detection techniques for OOK symbols and improve the performance by exploiting distributed reception and supervised-learning approaches. Numerical results show that the proposed techniques are valid for noncoherent OOK transmissions for BCC.
Submitted 19 August, 2020;
originally announced August 2020.
-
Minimum Overhead Beamforming and Resource Allocation in D2D Edge Networks
Authors:
Junghoon Kim,
Taejoon Kim,
Morteza Hashemi,
Christopher G. Brinton,
David J. Love
Abstract:
Device-to-device (D2D) communications is expected to be a critical enabler of distributed computing in edge networks at scale. A key challenge in providing this capability is the requirement for judicious management of the heterogeneous communication and computation resources that exist at the edge to meet processing needs. In this paper, we develop an optimization methodology that considers the network topology jointly with device and network resource allocation to minimize total D2D overhead, which we quantify in terms of time and energy required for task processing. Variables in our model include task assignment, CPU allocation, subchannel selection, and beamforming design for multiple-input multiple-output (MIMO) wireless devices. We propose two methods to solve the resulting non-convex mixed integer program: semi-exhaustive search optimization, which represents a "best-effort" at obtaining the optimal solution, and efficient alternate optimization, which is more computationally efficient. As a component of these two methods, we develop a novel coordinated beamforming algorithm which we show obtains the optimal beamformer for a common receiver characteristic. Through numerical experiments, we find that our methodology yields substantial improvements in network overhead compared with local computation and partially optimized methods, which validates our joint optimization approach. Further, we find that the efficient alternate optimization scales well with the number of nodes, and thus can be a practical solution for D2D computing in large networks.
Submitted 16 August, 2022; v1 submitted 25 July, 2020;
originally announced July 2020.
-
Multi-Stage Hybrid Federated Learning over Large-Scale D2D-Enabled Fog Networks
Authors:
Seyyedali Hosseinalipour,
Sheikh Shams Azam,
Christopher G. Brinton,
Nicolo Michelusi,
Vaneet Aggarwal,
David J. Love,
Huaiyu Dai
Abstract:
Federated learning has generated significant interest, with nearly all works focused on a "star" topology where nodes/devices are each connected to a central server. We migrate away from this architecture and extend it through the network dimension to the case where there are multiple layers of nodes between the end devices and the server. Specifically, we develop multi-stage hybrid federated learning (MH-FL), a hybrid of intra- and inter-layer model learning that considers the network as a multi-layer cluster-based structure. MH-FL considers the topology structures among the nodes in the clusters, including local networks formed via device-to-device (D2D) communications, and presumes a semi-decentralized architecture for federated learning. It orchestrates the devices at different network layers in a collaborative/cooperative manner (i.e., using D2D interactions) to form local consensus on the model parameters and combines it with multi-stage parameter relaying between layers of the tree-shaped hierarchy. We derive the upper bound of convergence for MH-FL with respect to parameters of the network topology (e.g., the spectral radius) and the learning algorithm (e.g., the number of D2D rounds in different clusters). We obtain a set of policies for the D2D rounds at different clusters to guarantee either a finite optimality gap or convergence to the global optimum. We then develop a distributed control algorithm for MH-FL to tune the D2D rounds in each cluster over time to meet specific convergence criteria. Our experiments on real-world datasets verify our analytical results and demonstrate the advantages of MH-FL in terms of resource utilization metrics.
Submitted 12 January, 2022; v1 submitted 18 July, 2020;
originally announced July 2020.
-
A File System For Write-Once Media
Authors:
Simson L. Garfinkel,
J. Spencer Love
Abstract:
A file system standard for use with write-once media such as digital compact disks is proposed. The file system is designed to work with any operating system and a variety of physical media. Although the implementation is simple, it provides a full-featured and high-performance alternative to conventional file systems on traditional, multiple-write media such as magnetic disks.
Submitted 30 March, 2020;
originally announced April 2020.
-
Joint Optimization of Signal Design and Resource Allocation in Wireless D2D Edge Computing
Authors:
Junghoon Kim,
Taejoon Kim,
Morteza Hashemi,
Christopher G. Brinton,
David J. Love
Abstract:
In this paper, we study the distributed computational capabilities of device-to-device (D2D) networks. A key characteristic of D2D networks is that their topologies are reconfigurable to cope with network demands. For distributed computing, resource management is challenging due to limited network and communication resources, leading to inter-channel interference. To overcome this, recent research has addressed the problems of wireless scheduling, subchannel allocation, power allocation, and multiple-input multiple-output (MIMO) signal design, but has not considered them jointly. In this paper, unlike previous mobile edge computing (MEC) approaches, we propose a joint optimization of wireless MIMO signal design and network resource allocation to maximize energy efficiency. Given that the resulting problem is a non-convex mixed integer program (MIP) which is prohibitive to solve at scale, we decompose its solution into two parts: (i) a resource allocation subproblem, which optimizes the link selection and subchannel allocations, and (ii) MIMO signal design subproblem, which optimizes the transmit beamformer, transmit power, and receive combiner. Simulation results using wireless edge topologies show that our method yields substantial improvements in energy efficiency compared with cases of no offloading and partially optimized methods and that the efficiency scales well with the size of the network.
Submitted 3 March, 2020; v1 submitted 26 February, 2020;
originally announced February 2020.
-
Supersingular Curves With Small Non-integer Endomorphisms
Authors:
Jonathan Love,
Dan Boneh
Abstract:
We introduce a special class of supersingular curves over $\mathbb{F}_{p^2}$, characterized by the existence of non-integer endomorphisms of small degree. A number of properties of this set are proved. Most notably, we show that this set partitions into subsets in such a way that curves within each subset have small-degree isogenies between them, but curves in distinct subsets have no small-degree isogenies between them. Despite this, we show that isogenies between these curves can be computed efficiently, giving a technique for computing isogenies between certain prescribed curves that cannot be reasonably connected by searching on $\ell$-isogeny graphs.
Submitted 23 June, 2020; v1 submitted 7 October, 2019;
originally announced October 2019.
-
Prospective Multiple Antenna Technologies for Beyond 5G
Authors:
Jiayi Zhang,
Emil Björnson,
Michail Matthaiou,
Derrick Wing Kwan Ng,
Hong Yang,
David J. Love
Abstract:
Multiple antenna technologies have attracted large research interest for several decades and have gradually made their way into mainstream communication systems. Two main benefits are adaptive beamforming gains and spatial multiplexing, leading to high data rates per user and per cell, especially when large antenna arrays are used. Now that multiple antenna technology has become a key component of the fifth-generation (5G) networks, it is time for the research community to look for new multiple antenna applications to meet the immensely higher data rate, reliability, and traffic demands in the beyond 5G era. We need radically new approaches to achieve orders-of-magnitude improvements in these metrics, and this will be connected to large technical challenges, many of which are yet to be identified. In this paper, we survey three new multiple antenna related research directions that might play a key role in beyond 5G networks: cell-free massive multiple-input multiple-output (MIMO), beamspace massive MIMO, and intelligent reflecting surfaces. More specifically, the fundamental motivation and key characteristics of these new technologies are introduced. Recent technical progress is also presented. Finally, we provide a list of other prospective future research directions.
Submitted 24 March, 2020; v1 submitted 30 September, 2019;
originally announced October 2019.
-
On the Energy Efficiency of MIMO Hybrid Beamforming for Millimeter Wave Systems with Nonlinear Power Amplifiers
Authors:
Nima N. Moghadam,
Gábor Fodor,
Mats Bengtsson,
David J. Love
Abstract:
Multiple-input multiple-output (MIMO) millimeter wave (mmWave) systems are vulnerable to hardware impairments due to operating at high frequencies and employing a large number of radio-frequency (RF) hardware components. In particular, nonlinear power amplifiers (PAs) employed at the transmitter distort the signal when operated close to saturation due to energy efficiency considerations. In this paper, we study the performance of a MIMO mmWave hybrid beamforming scheme in the presence of nonlinear PAs. First, we develop a statistical model for the transmitted signal in such systems and show that the spatial direction of the inband distortion is shaped by the beamforming filter. This suggests that even in the large antenna regime, where narrow beams can be steered toward the receiver, the impact of nonlinear PAs should not be ignored. Then, by employing a realistic power consumption model for the PAs, we investigate the trade-off between spectral and energy efficiency in such systems. Our results show that increasing the transmit power level when the number of transmit antennas grows large can be counter-effective in terms of energy efficiency. Furthermore, using numerical simulation, we show that when the transmit power is large, analog beamforming leads to higher spectral and energy efficiency compared to digital and hybrid beamforming schemes.
Submitted 5 June, 2018;
originally announced June 2018.
-
Adaptive Beam Tracking with the Unscented Kalman Filter for Millimeter Wave Communication
Authors:
Stephen G. Larew,
David J. Love
Abstract:
Millimeter wave (mmWave) communication links for 5G cellular technology require high beamforming gain to overcome channel impairments and achieve high throughput. While much work has focused on estimating mmWave channels and designing beamforming schemes, the time dynamic nature of mmWave channels quickly renders estimates stale and increases sounding overhead. We model the underlying time dynamic state space of mmWave channels and design sounding beamformers suitable for tracking in a Kalman filtering framework. Given an initial channel estimate, filtering efficiently leads to refined estimates and allows forward prediction for higher sustained beamforming gain during data transmission. From tracked prior channel estimates, adaptively chosen optimal and constrained suboptimal beams reduce sounding overhead while minimizing estimation error.
Submitted 23 April, 2018;
originally announced April 2018.