-
Data Shapley in One Training Run
Authors:
Jiachen T. Wang,
Prateek Mittal,
Dawn Song,
Ruoxi Jia
Abstract:
Data Shapley provides a principled framework for attributing data's contribution within machine learning contexts. However, existing approaches require re-training models on different data subsets, which is computationally intensive, foreclosing their application to large-scale models. Furthermore, they produce the same attribution score for any models produced by running the learning algorithm, meaning they cannot perform targeted attribution towards a specific model obtained from a single run of the algorithm. This paper introduces In-Run Data Shapley, which addresses these limitations by offering scalable data attribution for a target model of interest. In its most efficient implementation, our technique incurs negligible additional runtime compared to standard model training. This dramatic efficiency improvement makes it possible to perform data attribution for the foundation model pretraining stage for the first time. We present several case studies that offer fresh insights into pretraining data's contribution and discuss their implications for copyright in generative AI and pretraining data curation.
Submitted 29 June, 2024; v1 submitted 16 June, 2024;
originally announced June 2024.
-
A Deep Learning Approach to Operational Flare Forecasting
Authors:
Yasser Abduallah,
Jason T. L. Wang
Abstract:
Solar flares are explosions on the Sun. They happen when energy stored in magnetic fields around solar active regions (ARs) is suddenly released. In this paper, we present a transformer-based framework, named SolarFlareNet, for predicting whether an AR would produce a gamma-class flare within the next 24 to 72 hours. We consider three gamma classes, namely the >=M5.0 class, the >=M class and the >=C class, and build three transformers separately, each corresponding to a gamma class. Each transformer is used to make predictions of its corresponding gamma-class flares. The crux of our approach is to model data samples in an AR as time series and to use transformers to capture the temporal dynamics of the data samples. Each data sample consists of magnetic parameters taken from Space-weather HMI Active Region Patches (SHARP) and related data products. We survey flare events that occurred from May 2010 to December 2022 using the Geostationary Operational Environmental Satellite X-ray flare catalogs provided by the National Centers for Environmental Information (NCEI), and build a database of flares with identified ARs in the NCEI flare catalogs. This flare database is used to construct labels of the data samples suitable for machine learning. We further extend the deterministic approach to a calibration-based probabilistic forecasting method. The SolarFlareNet system is fully operational and is capable of making near real-time predictions of solar flares on the Web.
Submitted 25 May, 2024;
originally announced May 2024.
-
Rethinking Data Shapley for Data Selection Tasks: Misleads and Merits
Authors:
Jiachen T. Wang,
Tianji Yang,
James Zou,
Yongchan Kwon,
Ruoxi Jia
Abstract:
Data Shapley provides a principled approach to data valuation and plays a crucial role in data-centric machine learning (ML) research. Data selection is considered a standard application of Data Shapley. However, its data selection performance has been shown to be inconsistent across settings in the literature. This study aims to deepen our understanding of this phenomenon. We introduce a hypothesis testing framework and show that Data Shapley's performance can be no better than random selection without specific constraints on utility functions. We identify a class of utility functions, monotonically transformed modular functions, within which Data Shapley optimally selects data. Based on this insight, we propose a heuristic for predicting Data Shapley's effectiveness in data selection tasks. Our experiments corroborate these findings, adding new insights into when Data Shapley may or may not succeed.
Submitted 6 May, 2024;
originally announced May 2024.
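The abstract above refers to modular utility functions and their monotone transformations. As a rough illustration only (not the authors' code or experimental setup), the sketch below computes exact Shapley values by brute-force subset enumeration for a toy three-point dataset. The weights, the square-root transform, and all function names are assumptions chosen for the example.

```python
from itertools import combinations
from math import factorial, sqrt


def exact_shapley(n, utility):
    """Exact Shapley values for players 0..n-1 by enumerating all subsets.

    Cost is exponential in n, so this is only suitable for toy examples.
    """
    values = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):
            # Probability weight of a coalition of size k that excludes player i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (utility(s | {i}) - utility(s))
        values.append(total)
    return values


# Hypothetical per-point "quality" weights (illustrative values only).
weights = [1.0, 4.0, 9.0]


def modular(s):
    """Modular utility: the sum of the weights of the selected points."""
    return sum(weights[j] for j in s)


def transformed(s):
    """A monotone (square-root) transform of the modular utility."""
    return sqrt(modular(s))


def ranking(values):
    """Indices sorted from highest to lowest value."""
    return sorted(range(len(values)), key=lambda j: values[j], reverse=True)


phi_modular = exact_shapley(3, modular)
phi_transformed = exact_shapley(3, transformed)
```

For the purely modular utility, each point's Shapley value equals its own weight. Under the monotone transform the values change, but in this toy case the ranking of the points by value still follows the ranking of the weights.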
-
An Economic Solution to Copyright Challenges of Generative AI
Authors:
Jiachen T. Wang,
Zhun Deng,
Hiroaki Chiba-Okabe,
Boaz Barak,
Weijie J. Su
Abstract:
Generative artificial intelligence (AI) systems are trained on large data corpora to generate new pieces of text, images, videos, and other media. There is growing concern that such systems may infringe on the copyright interests of training data contributors. To address the copyright challenges of generative AI, we propose a framework that compensates copyright owners proportionally to their contributions to the creation of AI-generated content. The metric for contributions is quantitatively determined by leveraging the probabilistic nature of modern generative AI models and using techniques from cooperative game theory in economics. This framework enables a platform where AI developers benefit from access to high-quality training data, thus improving model performance. Meanwhile, copyright owners receive fair compensation, driving the continued provision of relevant data for generative model training. Experiments demonstrate that our framework successfully identifies the most relevant data sources used in artwork generation, ensuring a fair and interpretable distribution of revenues among copyright owners.
Submitted 24 April, 2024; v1 submitted 22 April, 2024;
originally announced April 2024.
-
Super-Resolution of SOHO/MDI Magnetograms of Solar Active Regions Using SDO/HMI Data and an Attention-Aided Convolutional Neural Network
Authors:
Chunhui Xu,
Jason T. L. Wang,
Haimin Wang,
Haodi Jiang,
Qin Li,
Yasser Abduallah,
Yan Xu
Abstract:
Image super-resolution has been an important subject in image processing and recognition. Here, we present an attention-aided convolutional neural network (CNN) for solar image super-resolution. Our method, named SolarCNN, aims to enhance the quality of line-of-sight (LOS) magnetograms of solar active regions (ARs) collected by the Michelson Doppler Imager (MDI) on board the Solar and Heliospheric Observatory (SOHO). The ground-truth labels used for training SolarCNN are the LOS magnetograms collected by the Helioseismic and Magnetic Imager (HMI) on board the Solar Dynamics Observatory (SDO). Solar ARs consist of strong magnetic fields in which magnetic energy can suddenly be released to produce extreme space weather events, such as solar flares, coronal mass ejections, and solar energetic particles. SOHO/MDI covers Solar Cycle 23, which is stronger with more eruptive events than Cycle 24. Enhanced SOHO/MDI magnetograms allow for better understanding and forecasting of violent events of space weather. Experimental results show that SolarCNN improves the quality of SOHO/MDI magnetograms in terms of the structural similarity index measure (SSIM), Pearson's correlation coefficient (PCC), and the peak signal-to-noise ratio (PSNR).
Submitted 27 March, 2024;
originally announced March 2024.
-
Prediction of the SYM-H Index Using a Bayesian Deep Learning Method with Uncertainty Quantification
Authors:
Yasser Abduallah,
Khalid A. Alobaid,
Jason T. L. Wang,
Haimin Wang,
Vania K. Jordanova,
Vasyl Yurchyshyn,
Huseyin Cavus,
Ju Jing
Abstract:
We propose a novel deep learning framework, named SYMHnet, which employs a graph neural network and a bidirectional long short-term memory network to cooperatively learn patterns from solar wind and interplanetary magnetic field parameters for short-term forecasts of the SYM-H index based on 1-minute and 5-minute resolution data. SYMHnet takes, as input, the time series of the parameters' values provided by NASA's Space Science Data Coordinated Archive and predicts, as output, the SYM-H index value at time point t + w hours for a given time point t where w is 1 or 2. By incorporating Bayesian inference into the learning framework, SYMHnet can quantify both aleatoric (data) uncertainty and epistemic (model) uncertainty when predicting future SYM-H indices. Experimental results show that SYMHnet works well at quiet time and storm time, for both 1-minute and 5-minute resolution data. The results also show that SYMHnet generally performs better than related machine learning methods. For example, SYMHnet achieves a forecast skill score (FSS) of 0.343 compared to the FSS of 0.074 of a recent gradient boosting machine (GBM) method when predicting SYM-H indices (1 hour in advance) in a large storm (SYM-H = -393 nT) using 5-minute resolution data. When predicting the SYM-H indices (2 hours in advance) in the large storm, SYMHnet achieves an FSS of 0.553 compared to the FSS of 0.087 of the GBM method. In addition, SYMHnet can provide results for both data and model uncertainty quantification, whereas the related methods cannot.
Submitted 26 February, 2024;
originally announced February 2024.
-
Language Models as Science Tutors
Authors:
Alexis Chevalier,
Jiayi Geng,
Alexander Wettig,
Howard Chen,
Sebastian Mizera,
Toni Annala,
Max Jameson Aragon,
Arturo Rodríguez Fanlo,
Simon Frieder,
Simon Machado,
Akshara Prabhakar,
Ellie Thieu,
Jiachen T. Wang,
Zirui Wang,
Xindi Wu,
Mengzhou Xia,
Wenhan Xia,
Jiatong Yu,
Jun-Jie Zhu,
Zhiyong Jason Ren,
Sanjeev Arora,
Danqi Chen
Abstract:
NLP has recently made exciting progress toward training language models (LMs) with strong scientific problem-solving skills. However, model development has not focused on real-life use-cases of LMs for science, including applications in education that require processing long scientific documents. To address this, we introduce TutorEval and TutorChat. TutorEval is a diverse question-answering benchmark consisting of questions about long chapters from STEM textbooks, written by experts. TutorEval helps measure real-life usability of LMs as scientific assistants, and it is the first benchmark combining long contexts, free-form generation, and multi-disciplinary scientific knowledge. Moreover, we show that fine-tuning base models with existing dialogue datasets leads to poor performance on TutorEval. Therefore, we create TutorChat, a dataset of 80,000 long synthetic dialogues about textbooks. We use TutorChat to fine-tune Llemma models with 7B and 34B parameters. These LM tutors specialized in math have a 32K-token context window, and they excel at TutorEval while performing strongly on GSM8K and MATH. Our datasets build on open-source materials, and we release our models, data, and evaluations.
Submitted 21 July, 2024; v1 submitted 16 February, 2024;
originally announced February 2024.
-
Campana conjecture for coverings of toric surfaces over function fields
Authors:
Carlo Gasbarri,
Ji Guo,
Julie Tzu-Yueh Wang
Abstract:
We first prove Vojta's abc conjecture over function fields for Campana points on projective toric surfaces with high multiplicity along the boundary. As a consequence, we show a version of Campana's conjecture on finite coverings of projective toric surfaces over function fields.
Submitted 23 January, 2024;
originally announced January 2024.
-
Efficient Data Shapley for Weighted Nearest Neighbor Algorithms
Authors:
Jiachen T. Wang,
Prateek Mittal,
Ruoxi Jia
Abstract:
This work aims to address an open problem in the data valuation literature concerning the efficient computation of Data Shapley for the weighted $K$ nearest neighbor algorithm (WKNN-Shapley). By considering the accuracy of hard-label KNN with discretized weights as the utility function, we reframe the computation of WKNN-Shapley into a counting problem and introduce a quadratic-time algorithm, presenting a notable improvement over $O(N^K)$, the best result in the existing literature. We develop a deterministic approximation algorithm that further improves computational efficiency while maintaining the key fairness properties of the Shapley value. Through extensive experiments, we demonstrate WKNN-Shapley's computational efficiency and its superior performance in discerning data quality compared to its unweighted counterpart.
Submitted 19 January, 2024;
originally announced January 2024.
-
DP-OPT: Make Large Language Model Your Privacy-Preserving Prompt Engineer
Authors:
Junyuan Hong,
Jiachen T. Wang,
Chenhui Zhang,
Zhangheng Li,
Bo Li,
Zhangyang Wang
Abstract:
Large Language Models (LLMs) have emerged as dominant tools for various tasks, particularly when tailored for a specific target by prompt tuning. Nevertheless, concerns surrounding data privacy present obstacles due to the tuned prompts' dependency on sensitive private information. A practical solution is to host a local LLM and optimize a soft prompt privately using data. Yet, hosting a local model becomes problematic when model ownership is protected. Alternative methods, like sending data to the model's provider for training, intensify these privacy issues when the provider is untrusted. In this paper, we present a novel solution called Differentially-Private Offsite Prompt Tuning (DP-OPT) to address this challenge. Our approach involves tuning a discrete prompt on the client side and then applying it to the desired cloud models. We demonstrate that prompts suggested by LLMs themselves can be transferred without compromising performance significantly. To ensure that the prompts do not leak private information, we introduce the first private prompt generation mechanism, by a differentially-private (DP) ensemble of in-context learning with private demonstrations. With DP-OPT, generating privacy-preserving prompts by Vicuna-7b can yield competitive performance compared to non-private in-context learning on GPT3.5 or local private prompt tuning. Code is available at https://github.com/VITA-Group/DP-OPT .
Submitted 17 March, 2024; v1 submitted 26 November, 2023;
originally announced December 2023.
-
Estimating Coronal Mass Ejection Mass and Kinetic Energy by Fusion of Multiple Deep-learning Models
Authors:
Khalid A. Alobaid,
Yasser Abduallah,
Jason T. L. Wang,
Haimin Wang,
Shen Fan,
Jialiang Li,
Huseyin Cavus,
Vasyl Yurchyshyn
Abstract:
Coronal mass ejections (CMEs) are massive solar eruptions, which have a significant impact on Earth. In this paper, we propose a new method, called DeepCME, to estimate two properties of CMEs, namely, CME mass and kinetic energy. Being able to estimate these properties helps better understand CME dynamics. Our study is based on the CME catalog maintained at the Coordinated Data Analysis Workshops (CDAW) Data Center, which contains all CMEs manually identified since 1996 using the Large Angle and Spectrometric Coronagraph (LASCO) on board the Solar and Heliospheric Observatory (SOHO). We use LASCO C2 data in the period between January 1996 and December 2020 to train, validate and test DeepCME through 10-fold cross validation. The DeepCME method is a fusion of three deep learning models, including ResNet, InceptionNet, and InceptionResNet. Our fusion model extracts features from LASCO C2 images, effectively combining the learning capabilities of the three component models to jointly estimate the mass and kinetic energy of CMEs. Experimental results show that the fusion model yields a mean relative error (MRE) of 0.013 (0.009, respectively) compared to the MRE of 0.019 (0.017, respectively) of the best component model InceptionResNet (InceptionNet, respectively) in estimating the CME mass (kinetic energy, respectively). To our knowledge, this is the first time that deep learning has been used for CME mass and kinetic energy estimations.
Submitted 4 December, 2023;
originally announced December 2023.
-
A Machine Learning Approach to Understanding the Physical Properties of Magnetic Flux Ropes in the Solar Wind at 1 AU
Authors:
Hameedullah Farooki,
Yasser Abduallah,
Sung Jun Noh,
Hyomin Kim,
George Bizos,
Youra Shin,
Jason T. L. Wang,
Haimin Wang
Abstract:
Interplanetary magnetic flux ropes (MFRs) are commonly observed structures in the solar wind, categorized as magnetic clouds (MCs) and small-scale MFRs (SMFRs) depending on whether they are associated with coronal mass ejections. We apply machine learning to systematically compare SMFRs, MCs, and ambient solar wind plasma properties. We construct a dataset of 3-minute averaged sequential data points of the solar wind's instantaneous bulk fluid plasma properties using about twenty years of measurements from the Wind spacecraft. We label samples by the presence and type of MFRs containing them using a catalog based on Grad-Shafranov (GS) automated detection for SMFRs and NASA's catalog for MCs (with samples in neither labeled non-MFRs). We apply the random forest machine learning algorithm to find which categories can be more easily distinguished and by what features. MCs were distinguished from non-MFRs with an AUC of 94% and from SMFRs with an AUC of 89%, and had distinctive plasma properties. In contrast, while SMFRs were distinguished from non-MFRs with an AUC of 86%, this appears to rely solely on the $\langle B \rangle$ > 5 nT threshold applied by the GS catalog. The results indicate that SMFRs have virtually the same plasma properties as the ambient solar wind, unlike the distinct plasma regimes of MCs. We interpret our findings as additional evidence that most SMFRs at 1 au are generated within the solar wind; furthermore, they suggest that SMFRs should be considered a salient feature of the solar wind's magnetic structure rather than transient events.
Submitted 15 November, 2023;
originally announced November 2023.
-
A Closer Look at Small-Scale Magnetic Flux Ropes in the Solar Wind at 1 AU: Results from Improved Automated Detection
Authors:
Hameedullah Farooki,
Sung Jun Noh,
Jeongwoo Lee,
Haimin Wang,
Hyomin Kim,
Yasser Abduallah,
Jason T. L. Wang,
Yu Chen,
Sergio Servidio,
Francesco Pecora
Abstract:
Small-scale interplanetary magnetic flux ropes (SMFRs) are similar to ICMEs in magnetic structure, but are smaller and do not exhibit ICME plasma signatures. We present a computationally efficient and GPU-powered version of the single-spacecraft automated SMFR detection algorithm based on the Grad-Shafranov (GS) technique. Our algorithm is capable of processing higher resolution data, eliminates selection bias caused by a fixed $\langle B \rangle$ threshold, has improved detection criteria demonstrated to have better results on an MHD simulation, and recovers full 2.5D cross sections using GS reconstruction. We used it to detect 512,152 SMFRs from 27 years (1996 to 2022) of 3-second cadence Wind measurements. Our novel findings are: (1) the radial density of SMFRs at 1 au (${\sim}1$ per $10^6$ km) and filling factor (${\sim}$35%) are independent of solar activity, distance to the heliospheric current sheet (HCS), and solar wind plasma type, although the minority of SMFRs with diameters greater than ${\sim}$0.01 au have a strong solar activity dependence; (2) SMFR diameters follow a log-normal distribution that peaks below the resolved range ($\gtrsim 10^4$ km), although the filling factor is dominated by SMFRs between $10^5$ and $10^6$ km; (3) most SMFRs at 1 au have strong field-aligned flows like those from PSP measurements; (4) in terms of diameter $d$, SMFR poloidal flux $\propto d^{1.2}$, axial flux $\propto d^{2.0}$, average twist number $\propto d^{-0.8}$, current density $\propto d^{-0.8}$, and helicity $\propto d^{3.2}$. Implications for the origin of SMFRs and switchbacks are briefly discussed. The new algorithm and SMFR dataset are made freely available.
Submitted 12 November, 2023;
originally announced November 2023.
-
Threshold KNN-Shapley: A Linear-Time and Privacy-Friendly Approach to Data Valuation
Authors:
Jiachen T. Wang,
Yuqing Zhu,
Yu-Xiang Wang,
Ruoxi Jia,
Prateek Mittal
Abstract:
Data valuation aims to quantify the usefulness of individual data sources in training machine learning (ML) models, and is a critical aspect of data-centric ML research. However, data valuation faces significant yet frequently overlooked privacy challenges despite its importance. This paper studies these challenges with a focus on KNN-Shapley, one of the most practical data valuation methods currently available. We first emphasize the inherent privacy risks of KNN-Shapley, and demonstrate the significant technical difficulties in adapting KNN-Shapley to accommodate differential privacy (DP). To overcome these challenges, we introduce TKNN-Shapley, a refined variant of KNN-Shapley that is privacy-friendly, allowing for straightforward modifications to incorporate a DP guarantee (DP-TKNN-Shapley). We show that DP-TKNN-Shapley has several advantages and offers a superior privacy-utility tradeoff compared to naively privatized KNN-Shapley in discerning data quality. Moreover, even non-private TKNN-Shapley achieves performance comparable to KNN-Shapley. Overall, our findings suggest that TKNN-Shapley is a promising alternative to KNN-Shapley, particularly for real-world applications involving sensitive data.
Submitted 25 November, 2023; v1 submitted 29 August, 2023;
originally announced August 2023.
-
Simply connectedness and hyperbolicity
Authors:
Erwan Rousseau,
Carlo Gasbarri,
Amos Turchet,
Julie Tzu-Yueh Wang
Abstract:
We generalize to arbitrary dimension our previous construction of simply connected weakly-special but not special varieties. We show that they satisfy the function field and complex analytic part of Campana's conjecture. Moreover, we give the first examples, in any dimension, of smooth simply connected nonisotrivial projective varieties of general type that satisfy the function field Lang's conjecture.
Submitted 25 August, 2023;
originally announced August 2023.
-
BaDExpert: Extracting Backdoor Functionality for Accurate Backdoor Input Detection
Authors:
Tinghao Xie,
Xiangyu Qi,
Ping He,
Yiming Li,
Jiachen T. Wang,
Prateek Mittal
Abstract:
We present a novel defense against backdoor attacks on Deep Neural Networks (DNNs), wherein adversaries covertly implant malicious behaviors (backdoors) into DNNs. Our defense falls within the category of post-development defenses that operate independently of how the model was generated. The proposed defense is built upon a novel reverse engineering approach that can directly extract the backdoor functionality of a given backdoored model into a backdoor expert model. The approach is straightforward: finetuning the backdoored model over a small set of intentionally mislabeled clean samples, such that it unlearns the normal functionality while still preserving the backdoor functionality, thus resulting in a model (dubbed a backdoor expert model) that can only recognize backdoor inputs. Based on the extracted backdoor expert model, we show the feasibility of devising highly accurate backdoor input detectors that filter out the backdoor inputs during model inference. Further augmented by an ensemble strategy with a finetuned auxiliary model, our defense, BaDExpert (Backdoor Input Detection with Backdoor Expert), effectively mitigates 17 SOTA backdoor attacks while minimally impacting clean utility. The effectiveness of BaDExpert has been verified on multiple datasets (CIFAR10, GTSRB and ImageNet) across various model architectures (ResNet, VGG, MobileNetV2 and Vision Transformer).
Submitted 5 October, 2023; v1 submitted 23 August, 2023;
originally announced August 2023.
-
Privacy-Preserving In-Context Learning for Large Language Models
Authors:
Tong Wu,
Ashwinee Panda,
Jiachen T. Wang,
Prateek Mittal
Abstract:
In-context learning (ICL) is an important capability of Large Language Models (LLMs), enabling these models to dynamically adapt based on specific, in-context exemplars, thereby improving accuracy and relevance. However, LLMs' responses may leak the sensitive private information contained in in-context exemplars. To address this challenge, we propose Differentially Private In-context Learning (DP-ICL), a general paradigm for privatizing ICL tasks. The key idea of the DP-ICL paradigm is to generate differentially private responses through a noisy consensus among an ensemble of LLM responses based on disjoint exemplar sets. Based on the general paradigm of DP-ICL, we instantiate several techniques showing how to privatize ICL for text classification and language generation. We evaluate DP-ICL on four text classification benchmarks and two language generation tasks, and our empirical results show that DP-ICL achieves a strong utility-privacy tradeoff.
Submitted 30 September, 2023; v1 submitted 2 May, 2023;
originally announced May 2023.
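The "noisy consensus among an ensemble" idea in the abstract above can be pictured, for the text classification case, as a noisy majority vote. The sketch below is a generic report-noisy-max illustration under stated assumptions, not the paper's actual mechanism: it assumes each ensemble member (one per disjoint exemplar set) has already produced a single label, and it adds Laplace noise to the vote counts before taking the argmax. The function names and the noise scale `2 / epsilon` are assumptions, and a real deployment would need proper privacy accounting.

```python
import random
from collections import Counter


def laplace(rng, scale):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_consensus(votes, labels, epsilon, rng):
    """Return the label with the largest noisy vote count.

    votes   -- one predicted label per ensemble member; each member is assumed
               to have seen a disjoint exemplar set, so changing one exemplar
               changes at most one vote
    labels  -- the full candidate label set (must not depend on private data)
    epsilon -- privacy parameter; smaller values mean more noise
    """
    counts = Counter(votes)
    # Assumed noise scale: one changed vote can move two counts by 1 each.
    scale = 2.0 / epsilon
    noisy = {label: counts.get(label, 0) + laplace(rng, scale) for label in labels}
    return max(labels, key=lambda label: noisy[label])


rng = random.Random(0)
votes = ["positive"] * 9 + ["negative"]
result = noisy_consensus(votes, ["positive", "negative"], epsilon=100.0, rng=rng)
```

With a large `epsilon` the noise is tiny and the clear majority label wins. As `epsilon` shrinks, the outcome becomes increasingly random, which is the utility-privacy tradeoff the abstract mentions.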
-
Ensemble Learning for CME Arrival Time Prediction
Authors:
Khalid A. Alobaid,
Jason T. L. Wang
Abstract:
The Sun constantly releases radiation and plasma into the heliosphere. Sporadically, the Sun launches solar eruptions such as flares and coronal mass ejections (CMEs). CMEs carry away a huge amount of mass and magnetic flux with them. An Earth-directed CME can cause serious consequences to the human system. It can destroy power grids/pipelines, satellites, and communications. Therefore, accurately monitoring and predicting CMEs is important to minimize damages to the human system. In this study we propose an ensemble learning approach, named CMETNet, for predicting the arrival time of CMEs from the Sun to the Earth. We collect and integrate eruptive events from two solar cycles, #23 and #24, from 1996 to 2021 with a total of 363 geoeffective CMEs. The data used for making predictions include CME features, solar wind parameters and CME images obtained from the SOHO/LASCO C2 coronagraph. Our ensemble learning framework comprises regression algorithms for numerical data analysis and a convolutional neural network for image processing. Experimental results show that CMETNet performs better than existing machine learning methods reported in the literature, with a Pearson product-moment correlation coefficient of 0.83 and a mean absolute error of 9.75 hours.
Submitted 29 April, 2023;
originally announced May 2023.
-
LAVA: Data Valuation without Pre-Specified Learning Algorithms
Authors:
Hoang Anh Just,
Feiyang Kang,
Jiachen T. Wang,
Yi Zeng,
Myeongseob Ko,
Ming Jin,
Ruoxi Jia
Abstract:
Traditionally, data valuation (DV) is posed as a problem of equitably splitting the validation performance of a learning algorithm among the training data. As a result, the calculated data values depend on many design choices of the underlying learning algorithm. However, this dependence is undesirable for many DV use cases, such as setting priorities over different data sources in a data acquisition process and informing pricing mechanisms in a data marketplace. In these scenarios, data needs to be valued before the actual analysis and the choice of the learning algorithm is still undetermined then. Another side-effect of the dependence is that to assess the value of individual points, one needs to re-run the learning algorithm with and without a point, which incurs a large computation burden. This work leapfrogs over the current limits of data valuation methods by introducing a new framework that can value training data in a way that is oblivious to the downstream learning algorithm. Our main results are as follows. (1) We develop a proxy for the validation performance associated with a training set based on a non-conventional class-wise Wasserstein distance between training and validation sets. We show that the distance characterizes the upper bound of the validation performance for any given model under certain Lipschitz conditions. (2) We develop a novel method to value individual data based on the sensitivity analysis of the class-wise Wasserstein distance. Importantly, these values can be directly obtained for free from the output of off-the-shelf optimization solvers when computing the distance. (3) We evaluate our new data valuation framework over various use cases related to detecting low-quality data and show that, surprisingly, the learning-agnostic feature of our framework enables a significant improvement over SOTA performance while being orders of magnitude faster.
Submitted 19 December, 2023; v1 submitted 28 April, 2023;
originally announced May 2023.
-
A Randomized Approach for Tight Privacy Accounting
Authors:
Jiachen T. Wang,
Saeed Mahloujifar,
Tong Wu,
Ruoxi Jia,
Prateek Mittal
Abstract:
Bounding privacy leakage over compositions, i.e., privacy accounting, is a key challenge in differential privacy (DP). The privacy parameter ($\epsilon$ or $\delta$) is often easy to estimate but hard to bound. In this paper, we propose a new differential privacy paradigm called estimate-verify-release (EVR), which addresses the challenges of providing a strict upper bound for the privacy parameter in DP compositions by converting an estimate of the privacy parameter into a formal guarantee. The EVR paradigm first estimates the privacy parameter of a mechanism, then verifies whether it meets this guarantee, and finally releases the query output based on the verification result. The core component of the EVR is privacy verification. We develop a randomized privacy verifier using the Monte Carlo (MC) technique. Furthermore, we propose an MC-based DP accountant that outperforms existing DP accounting techniques in terms of accuracy and efficiency. Our empirical evaluation shows the newly proposed EVR paradigm improves the utility-privacy tradeoff for privacy-preserving machine learning.
Submitted 20 November, 2023; v1 submitted 16 April, 2023;
originally announced April 2023.
-
A Note on "Efficient Task-Specific Data Valuation for Nearest Neighbor Algorithms"
Authors:
Jiachen T. Wang,
Ruoxi Jia
Abstract:
Data valuation is a growing research field that studies the influence of individual data points for machine learning (ML) models. Data Shapley, inspired by cooperative game theory and economics, is an effective method for data valuation. However, it is well-known that the Shapley value (SV) can be computationally expensive. Fortunately, Jia et al. (2019) showed that for K-Nearest Neighbors (KNN) models, the computation of Data Shapley is surprisingly simple and efficient.
In this note, we revisit the work of Jia et al. (2019) and propose a more natural and interpretable utility function that better reflects the performance of KNN models. We derive the corresponding calculation procedure for the Data Shapley of KNN classifiers/regressors with the new utility functions. Our new approach, dubbed soft-label KNN-SV, achieves the same time complexity as the original method. We further provide an efficient approximation algorithm for soft-label KNN-SV based on locality sensitive hashing (LSH). Our experimental results demonstrate that Soft-label KNN-SV outperforms the original method on most datasets in the task of mislabeled data detection, making it a better baseline for future work on data valuation.
Submitted 25 November, 2023; v1 submitted 9 April, 2023;
originally announced April 2023.
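For context on why the KNN case is cheap, here is a minimal sketch of the hard-label recursion commonly attributed to Jia et al. (2019) for an unweighted KNN classifier and a single test point. It is not the soft-label KNN-SV proposed in the note above, and it assumes the utility of a subset is the fraction of its K nearest members (out of K) whose label matches the test label. The function name and argument layout are illustrative.

```python
def knn_shapley_one_test(distances, labels, test_label, k):
    """Shapley values of training points for one test point (hard-label KNN utility).

    distances[i] is the distance from training point i to the test point.
    Runs in O(n log n) because only a sort and one backward pass are needed.
    """
    n = len(labels)
    order = sorted(range(n), key=lambda i: distances[i])  # nearest first
    match = [1.0 if labels[i] == test_label else 0.0 for i in order]

    sorted_values = [0.0] * n
    sorted_values[n - 1] = match[n - 1] / n  # farthest point
    for pos in range(n - 2, -1, -1):
        rank = pos + 1  # 1-based rank of this point in the sorted order
        sorted_values[pos] = (
            sorted_values[pos + 1]
            + (match[pos] - match[pos + 1]) / k * min(k, rank) / rank
        )

    # Map the values back to the original training-point indices.
    values = [0.0] * n
    for pos, i in enumerate(order):
        values[i] = sorted_values[pos]
    return values


if __name__ == "__main__":
    print(knn_shapley_one_test([0.1, 0.2, 0.3], ["b", "a", "a"], "a", k=1))
```

Values for a whole validation set would be obtained by averaging this per-test-point result over the test points.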
-
A Note on "Towards Efficient Data Valuation Based on the Shapley Value"
Authors:
Jiachen T. Wang,
Ruoxi Jia
Abstract:
The Shapley value (SV) has emerged as a promising method for data valuation. However, computing or estimating the SV is often computationally expensive. To overcome this challenge, Jia et al. (2019) propose an advanced SV estimation algorithm called "Group Testing-based SV estimator" which achieves favorable asymptotic sample complexity. In this technical note, we present several improvements in the analysis and design choices of this SV estimator. Moreover, we point out that the Group Testing-based SV estimator does not fully reuse the collected samples. Our analysis and insights contribute to a better understanding of the challenges in developing efficient SV estimation algorithms for data valuation.
Submitted 22 February, 2023;
originally announced February 2023.
-
Uncovering Adversarial Risks of Test-Time Adaptation
Authors:
Tong Wu,
Feiran Jia,
Xiangyu Qi,
Jiachen T. Wang,
Vikash Sehwag,
Saeed Mahloujifar,
Prateek Mittal
Abstract:
Recently, test-time adaptation (TTA) has been proposed as a promising solution for addressing distribution shifts. It allows a base model to adapt to an unforeseen distribution during inference by leveraging the information from the batch of (unlabeled) test data. However, we uncover a novel security vulnerability of TTA based on the insight that predictions on benign samples can be impacted by malicious samples in the same batch. To exploit this vulnerability, we propose Distribution Invading Attack (DIA), which injects a small fraction of malicious data into the test batch. DIA causes models using TTA to misclassify benign and unperturbed test data, providing an entirely new capability for adversaries that is infeasible in canonical machine learning pipelines. Through comprehensive evaluations, we demonstrate the high effectiveness of our attack on multiple benchmarks across six TTA methods. In response, we investigate two countermeasures to robustify the existing insecure TTA implementations, following the principle of "security by design". Together, we hope our findings can make the community aware of the utility-security tradeoffs in deploying TTA and provide valuable insights for developing robust TTA approaches.
Submitted 4 February, 2023; v1 submitted 29 January, 2023;
originally announced January 2023.
-
A Deep Learning Approach to Generating Photospheric Vector Magnetograms of Solar Active Regions for SOHO/MDI Using SDO/HMI and BBSO Data
Authors:
Haodi Jiang,
Qin Li,
Zhihang Hu,
Nian Liu,
Yasser Abduallah,
Ju Jing,
Genwei Zhang,
Yan Xu,
Wynne Hsu,
Jason T. L. Wang,
Haimin Wang
Abstract:
Solar activity is usually caused by the evolution of solar magnetic fields. Magnetic field parameters derived from photospheric vector magnetograms of solar active regions have been used to analyze and forecast eruptive events such as solar flares and coronal mass ejections. Unfortunately, the most recent solar cycle 24 was relatively weak with few large flares, though it is the only solar cycle in which consistent time-sequence vector magnetograms have been available through the Helioseismic and Magnetic Imager (HMI) on board the Solar Dynamics Observatory (SDO) since its launch in 2010. In this paper, we look into another major instrument, namely the Michelson Doppler Imager (MDI) on board the Solar and Heliospheric Observatory (SOHO) from 1996 to 2010. The data archive of SOHO/MDI covers the more active solar cycle 23 with many large flares. However, SOHO/MDI data only has line-of-sight (LOS) magnetograms. We propose a new deep learning method, named MagNet, to learn from combined LOS magnetograms, Bx and By taken by SDO/HMI along with H-alpha observations collected by the Big Bear Solar Observatory (BBSO), and to generate vector components Bx' and By', which would form vector magnetograms with observed LOS data. In this way, we can expand the availability of vector magnetograms to the period from 1996 to the present. Experimental results demonstrate the good performance of the proposed method. To our knowledge, this is the first time that deep learning has been used to generate photospheric vector magnetograms of solar active regions for SOHO/MDI using SDO/HMI and H-alpha data.
Submitted 4 November, 2022;
originally announced November 2022.
-
Inferring Line-of-Sight Velocities and Doppler Widths from Stokes Profiles of GST/NIRIS Using Stacked Deep Neural Networks
Authors:
Haodi Jiang,
Qin Li,
Yan Xu,
Wynne Hsu,
Kwangsu Ahn,
Wenda Cao,
Jason T. L. Wang,
Haimin Wang
Abstract:
Obtaining high-quality magnetic and velocity fields through Stokes inversion is crucial in solar physics. In this paper, we present a new deep learning method, named Stacked Deep Neural Networks (SDNN), for inferring line-of-sight (LOS) velocities and Doppler widths from Stokes profiles collected by the Near InfraRed Imaging Spectropolarimeter (NIRIS) on the 1.6 m Goode Solar Telescope (GST) at the Big Bear Solar Observatory (BBSO). The training data of SDNN is prepared by a Milne-Eddington (ME) inversion code used by BBSO. We quantitatively assess SDNN, comparing its inversion results with those obtained by the ME inversion code and related machine learning (ML) algorithms such as multiple support vector regression, multilayer perceptrons and a pixel-level convolutional neural network. Major findings from our experimental study are summarized as follows. First, the SDNN-inferred LOS velocities are highly correlated to the ME-calculated ones with the Pearson product-moment correlation coefficient being close to 0.9 on average. Second, SDNN is faster, while producing smoother and cleaner LOS velocity and Doppler width maps, than the ME inversion code. Third, the maps produced by SDNN are closer to ME's maps than those from the related ML algorithms, demonstrating the better learning capability of SDNN than the ML algorithms. Finally, comparison between the inversion results of ME and SDNN based on GST/NIRIS and those from the Helioseismic and Magnetic Imager on board the Solar Dynamics Observatory in flare-prolific active region NOAA 12673 is presented. We also discuss extensions of SDNN for inferring vector magnetic fields with empirical evaluation.
Submitted 8 October, 2022;
originally announced October 2022.
-
Solar Flare Index Prediction Using SDO/HMI Vector Magnetic Data Products with Statistical and Machine Learning Methods
Authors:
Hewei Zhang,
Qin Li,
Yanxing Yang,
Ju Jing,
Jason T. L. Wang,
Haimin Wang,
Zuofeng Shang
Abstract:
Solar flares, especially the M- and X-class flares, are often associated with coronal mass ejections (CMEs). They are the most important sources of space weather effects, which can severely impact the near-Earth environment. Thus it is essential to forecast flares (especially the M- and X-class ones) to mitigate their destructive and hazardous consequences. Here, we introduce several statistical and machine learning approaches to the prediction of an AR's Flare Index (FI), which quantifies the flare productivity of an active region (AR) by taking into account the numbers of different class flares within a certain time interval. Specifically, our sample includes 563 ARs that appeared on the solar disk from May 2010 to Dec 2017. The 25 magnetic parameters, provided by the Space-weather HMI Active Region Patches (SHARP) from the Helioseismic and Magnetic Imager (HMI) on board the Solar Dynamics Observatory (SDO), characterize coronal magnetic energy stored in ARs by proxy and are used as the predictors. We investigate the relationship between these SHARP parameters and the FI of ARs with a machine-learning algorithm (spline regression) and a resampling method (Synthetic Minority Over-Sampling Technique for Regression with Gaussian Noise, or SMOGN for short). Based on the established relationship, we are able to predict the value of the FI for a given AR within the next 1-day period. Compared with 4 other popular machine learning algorithms, our methods improve the accuracy of FI prediction, especially for large FI. In addition, we rank the importance of the SHARP parameters using the Borda count method, calculated from the ranks rendered by 9 different machine learning methods.
Submitted 1 December, 2022; v1 submitted 27 September, 2022;
originally announced September 2022.
-
A complex case of Vojta's general abc conjecture and cases of Campana's orbifold conjecture
Authors:
Ji Guo,
Julie Tzu-Yueh Wang
Abstract:
We proved a truncated second main theorem of level one with explicit exceptional sets for analytic maps into $\mathbb P^2$ intersecting the coordinate lines with sufficiently high multiplicities. As applications, we studied some cases of Campana's orbifold conjecture for $\mathbb P^2$ and finite ramified covers of $\mathbb P^2$ with three components admitting sufficiently large multiplicities.
Submitted 22 June, 2023; v1 submitted 23 September, 2022;
originally announced September 2022.
-
Renyi Differential Privacy of Propose-Test-Release and Applications to Private and Robust Machine Learning
Authors:
Jiachen T. Wang,
Saeed Mahloujifar,
Shouda Wang,
Ruoxi Jia,
Prateek Mittal
Abstract:
Propose-Test-Release (PTR) is a differential privacy framework that works with local sensitivity of functions, instead of their global sensitivity. This framework is typically used for releasing robust statistics such as median or trimmed mean in a differentially private manner. While PTR is a common framework introduced over a decade ago, using it in applications such as robust SGD where we need many adaptive robust queries is challenging. This is mainly due to the lack of Renyi Differential Privacy (RDP) analysis, an essential ingredient underlying the moments accountant approach for differentially private deep learning. In this work, we generalize the standard PTR and derive the first RDP bound for it when the target function has bounded global sensitivity. We show that our RDP bound for PTR yields tighter DP guarantees than the directly analyzed $(\epsilon, \delta)$-DP. We also derive the algorithm-specific privacy amplification bound of PTR under subsampling. We show that our bound is much tighter than the general upper bound and close to the lower bound. Our RDP bounds enable tighter privacy loss calculation for the composition of many adaptive runs of PTR. As an application of our analysis, we show that PTR and our theoretical results can be used to design differentially private variants for byzantine robust training algorithms that use robust statistics for gradient aggregation. We conduct experiments on the settings of label, feature, and gradient corruption across different datasets and architectures. We show that the PTR-based private and robust training algorithm significantly improves the utility compared with the baseline.
Submitted 16 September, 2022;
originally announced September 2022.
-
Turning a Curse into a Blessing: Enabling In-Distribution-Data-Free Backdoor Removal via Stabilized Model Inversion
Authors:
Si Chen,
Yi Zeng,
Jiachen T. Wang,
Won Park,
Xun Chen,
Lingjuan Lyu,
Zhuoqing Mao,
Ruoxi Jia
Abstract:
Many backdoor removal techniques in machine learning models require clean in-distribution data, which may not always be available due to proprietary datasets. Model inversion techniques, often considered privacy threats, can reconstruct realistic training samples, potentially eliminating the need for in-distribution data. Prior attempts to combine backdoor removal and model inversion yielded limited results. Our work is the first to provide a thorough understanding of leveraging model inversion for effective backdoor removal by addressing key questions about reconstructed samples' properties, perceptual similarity, and the potential presence of backdoor triggers.
We establish that relying solely on perceptual similarity is insufficient for robust defenses, and the stability of model predictions in response to input and parameter perturbations is also crucial. To tackle this, we introduce a novel bi-level optimization-based framework for model inversion, promoting stability and visual quality. Interestingly, we discover that reconstructed samples from a pre-trained generator's latent space are backdoor-free, even when utilizing signals from a backdoored model. We provide a theoretical analysis to support this finding. Our evaluation demonstrates that our stabilized model inversion technique achieves state-of-the-art backdoor removal performance without clean in-distribution data, matching or surpassing performance using the same amount of clean samples.
Submitted 23 March, 2023; v1 submitted 14 June, 2022;
originally announced June 2022.
-
2022 Review of Data-Driven Plasma Science
Authors:
Rushil Anirudh,
Rick Archibald,
M. Salman Asif,
Markus M. Becker,
Sadruddin Benkadda,
Peer-Timo Bremer,
Rick H. S. Budé,
C. S. Chang,
Lei Chen,
R. M. Churchill,
Jonathan Citrin,
Jim A Gaffney,
Ana Gainaru,
Walter Gekelman,
Tom Gibbs,
Satoshi Hamaguchi,
Christian Hill,
Kelli Humbird,
Sören Jalas,
Satoru Kawaguchi,
Gon-Ho Kim,
Manuel Kirchen,
Scott Klasky,
John L. Kline,
Karl Krushelnick
, et al. (38 additional authors not shown)
Abstract:
Data science and technology offer transformative tools and methods to science. This review article highlights the latest developments and progress in the interdisciplinary field of data-driven plasma science (DDPS). A large amount of data and machine learning algorithms go hand in hand. Most plasma data, whether experimental, observational or computational, are generated or collected by machines today. It is now becoming impractical for humans to analyze all the data manually. Therefore, it is imperative to train machines to analyze and interpret (eventually) such data as intelligently as humans but far more efficiently in quantity. Despite the recent impressive progress in applications of data science to plasma science and technology, the emerging field of DDPS is still in its infancy. Fueled by some of the most challenging problems such as fusion energy, plasma processing of materials, and fundamental understanding of the universe through observable plasma phenomena, DDPS is expected to continue to benefit significantly from the interdisciplinary marriage between plasma science and data science into the foreseeable future.
Submitted 31 May, 2022;
originally announced May 2022.
-
Data Banzhaf: A Robust Data Valuation Framework for Machine Learning
Authors:
Jiachen T. Wang,
Ruoxi Jia
Abstract:
Data valuation has wide use cases in machine learning, including improving data quality and creating economic incentives for data sharing. This paper studies the robustness of data valuation to noisy model performance scores. Particularly, we find that the inherent randomness of the widely used stochastic gradient descent can cause existing data value notions (e.g., the Shapley value and the Leave-one-out error) to produce inconsistent data value rankings across different runs. To address this challenge, we introduce the concept of safety margin, which measures the robustness of a data value notion. We show that the Banzhaf value, a famous value notion that originated from cooperative game theory literature, achieves the largest safety margin among all semivalues (a class of value notions that satisfy crucial properties entailed by ML applications and include the famous Shapley value and Leave-one-out error). We propose an algorithm to efficiently estimate the Banzhaf value based on the Maximum Sample Reuse (MSR) principle. Our evaluation demonstrates that the Banzhaf value outperforms the existing semivalue-based data value notions on several ML tasks such as learning with weighted samples and noisy label detection. Overall, our study suggests that when the underlying ML algorithm is stochastic, the Banzhaf value is a promising alternative to the other semivalue-based data value schemes given its computational advantage and ability to robustly differentiate data quality.
Submitted 18 December, 2023; v1 submitted 30 May, 2022;
originally announced May 2022.
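The Maximum Sample Reuse (MSR) principle mentioned in the Data Banzhaf abstract above can be sketched as follows, assuming the estimator takes the usual form: draw subsets with each point included independently with probability 1/2, evaluate the utility once per subset, and reuse every evaluation for every point. The function `msr_banzhaf` and the additive toy utility are illustrative assumptions, not the authors' code.

```python
import random


def msr_banzhaf(n, utility, num_samples=2000, seed=0):
    """Estimate Banzhaf values of n points from shared random subsets.

    For point i the estimate is
        mean utility of sampled subsets containing i
      - mean utility of sampled subsets not containing i,
    so each utility evaluation contributes to all n estimates.
    """
    rng = random.Random(seed)
    sum_in, cnt_in = [0.0] * n, [0] * n
    sum_out, cnt_out = [0.0] * n, [0] * n
    for _ in range(num_samples):
        subset = [i for i in range(n) if rng.random() < 0.5]
        members = set(subset)
        u = utility(subset)  # one (expensive) evaluation per sample
        for i in range(n):
            if i in members:
                sum_in[i] += u
                cnt_in[i] += 1
            else:
                sum_out[i] += u
                cnt_out[i] += 1
    values = []
    for i in range(n):
        if cnt_in[i] == 0 or cnt_out[i] == 0:
            values.append(0.0)  # not enough samples on one side
        else:
            values.append(sum_in[i] / cnt_in[i] - sum_out[i] / cnt_out[i])
    return values


if __name__ == "__main__":
    weights = [1.0, 2.0, 5.0]  # toy additive utility: true Banzhaf value is the weight
    print(msr_banzhaf(3, lambda s: sum(weights[i] for i in s), num_samples=20000))
```

In a real data valuation setting, `utility` would train a model on the subset and return a validation score, which is why reusing each evaluation across all points matters.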
-
Towards A Proactive ML Approach for Detecting Backdoor Poison Samples
Authors:
Xiangyu Qi,
Tinghao Xie,
Jiachen T. Wang,
Tong Wu,
Saeed Mahloujifar,
Prateek Mittal
Abstract:
Adversaries can embed backdoors in deep learning models by introducing backdoor poison samples into training datasets. In this work, we investigate how to detect such poison samples to mitigate the threat of backdoor attacks. First, we uncover a post-hoc workflow underlying most prior work, where defenders passively allow the attack to proceed and then leverage the characteristics of the post-attacked model to uncover poison samples. We reveal that this workflow does not fully exploit defenders' capabilities, and defense pipelines built on it are prone to failure or performance degradation in many scenarios. Second, we suggest a paradigm shift by promoting a proactive mindset in which defenders engage proactively with the entire model training and poison detection pipeline, directly enforcing and magnifying distinctive characteristics of the post-attacked model to facilitate poison detection. Based on this, we formulate a unified framework and provide practical insights on designing detection pipelines that are more robust and generalizable. Third, we introduce the technique of Confusion Training (CT) as a concrete instantiation of our framework. CT applies an additional poisoning attack to the already poisoned dataset, actively decoupling benign correlation while exposing backdoor patterns to detection. Empirical evaluations on 4 datasets and 14 types of attacks validate the superiority of CT over 14 baseline defenses.
Submitted 17 June, 2023; v1 submitted 26 May, 2022;
originally announced May 2022.
-
Characterization of Kepler targets based on medium-resolution LAMOST spectra analyzed with ROTFIT
Authors:
A. Frasca,
J. Molenda-Zakowicz,
J. Alonso-Santiago,
G. Catanzaro,
P. De Cat,
J. N. Fu,
W. Zong,
J. X. Wang,
T. Cang,
J. T. Wang
Abstract:
In this work we present the results of our analysis of 16,300 medium-resolution LAMOST spectra of late-type stars in the Kepler field with the aim of determining the stellar parameters, activity level, lithium atmospheric content, and binarity. We have used a version of the code ROTFIT specifically developed for these spectra. We provide a catalog with the atmospheric parameters (Teff, log(g), and [Fe/H]), radial velocity (RV), and projected rotation velocity (vsini). For cool stars (Teff < 6500 K), we also calculated the H-alpha and LiI-6708 equivalent widths, which are important indicators of chromospheric activity and evolutionary stage, respectively. We have derived the RV and atmospheric parameters for 14,300 spectra of 7443 stars. Literature data were used for a quality control of the results. The Teff and log(g) values are in good agreement with the literature. The [Fe/H] values appear to be overestimated for metal-poor stars. We propose a relation to correct the [Fe/H] values derived with ROTFIT. We were able to identify double-lined binaries, stars with variable RVs, lithium-rich giants, and emission-line objects. Based on the H-alpha flux, we found 327 active stars. We detected the LiI-6708 line and measured its equivalent width for 1657 stars, both giants and stars on the main sequence. Regarding the latter, we performed a discrete age classification based on the atmospheric lithium abundance and the upper envelopes of a few open clusters. Among the giants, we found 195 Li-rich stars, 161 of which are reported here for the first time. No relationship is found between stellar rotation and lithium abundance, which allows us to rule out merger scenarios as the predominant explanation of the enrichment of Li in our sample. The fraction of Li-rich giants, about 4%, is higher than expected.
Submitted 10 May, 2022;
originally announced May 2022.
-
A Deep Learning Approach to Dst Index Prediction
Authors:
Yasser Abduallah,
Jason T. L. Wang,
Prianka Bose,
Genwei Zhang,
Firas Gerges,
Haimin Wang
Abstract:
The disturbance storm time (Dst) index is an important and useful measurement in space weather research. It has been used to characterize the size and intensity of a geomagnetic storm. A negative Dst value means that the Earth's magnetic field is weakened, which happens during storms. In this paper, we present a novel deep learning method, called the Dst Transformer, to perform short-term, 1-6 hour ahead, forecasting of the Dst index based on the solar wind parameters provided by the NASA Space Science Data Coordinated Archive. The Dst Transformer combines a multi-head attention layer with Bayesian inference, which is capable of quantifying both aleatoric uncertainty and epistemic uncertainty when making Dst predictions. Experimental results show that the proposed Dst Transformer outperforms related machine learning methods in terms of the root mean square error and R-squared. Furthermore, the Dst Transformer can produce both data and model uncertainty quantification results, which cannot be done by the existing methods. To our knowledge, this is the first time that Bayesian deep learning has been used for Dst index forecasting.
Submitted 5 May, 2022;
originally announced May 2022.
-
Predicting Solar Energetic Particles Using SDO/HMI Vector Magnetic Data Products and a Bidirectional LSTM Network
Authors:
Yasser Abduallah,
Vania K. Jordanova,
Hao Liu,
Qin Li,
Jason T. L. Wang,
Haimin Wang
Abstract:
Solar energetic particles (SEPs) are an essential source of space radiation, which are hazards for humans in space, spacecraft, and technology in general. In this paper we propose a deep learning method, specifically a bidirectional long short-term memory (biLSTM) network, to predict if an active region (AR) would produce an SEP event given that (i) the AR will produce an M- or X-class flare and a coronal mass ejection (CME) associated with the flare, or (ii) the AR will produce an M- or X-class flare regardless of whether or not the flare is associated with a CME. The data samples used in this study are collected from the Geostationary Operational Environmental Satellite's X-ray flare catalogs provided by the National Centers for Environmental Information. We select M- and X-class flares with identified ARs in the catalogs for the period between 2010 and 2021, and find the associations of flares, CMEs and SEPs in the Space Weather Database of Notifications, Knowledge, Information during the same period. Each data sample contains physical parameters collected from the Helioseismic and Magnetic Imager on board the Solar Dynamics Observatory. Experimental results based on different performance metrics demonstrate that the proposed biLSTM network is better than related machine learning algorithms for the two SEP prediction tasks studied here. We also discuss extensions of our approach for probabilistic forecasting and calibration with empirical evaluation.
Submitted 27 March, 2022;
originally announced March 2022.
-
Revisiting the Solar Research Cyberinfrastructure Needs: A White Paper of Findings and Recommendations
Authors:
Gelu Nita,
Azim Ahmadzadeh,
Serena Criscuoli,
Alisdair Davey,
Dale Gary,
Manolis Georgoulis,
Neal Hurlburt,
Irina Kitiashvili,
Dustin Kempton,
Alexander Kosovichev,
Piet Martens,
Ryan McGranaghan,
Vincent Oria,
Kevin Reardon,
Viacheslav Sadykov,
Ryan Timmons,
Haimin Wang,
Jason T. L. Wang
Abstract:
Solar and Heliosphere physics are areas of remarkable data-driven discoveries. Recent advances in high-cadence, high-resolution multiwavelength observations, growing amounts of data from realistic modeling, and operational needs for uninterrupted science-quality data coverage generate the demand for a solar metadata standardization and overall healthy data infrastructure. This white paper is prepared as an effort of the working group "Uniform Semantics and Syntax of Solar Observations and Events" created within the "Towards Integration of Heliophysics Data, Modeling, and Analysis Tools" EarthCube Research Coordination Network (@HDMIEC RCN), with primary objectives to discuss current advances and identify future needs for the solar research cyberinfrastructure. The white paper summarizes presentations and discussions held during the special working group session at the EarthCube Annual Meeting on June 19th, 2020, as well as community contributions gathered during a series of preceding workshops and subsequent RCN working group sessions. The authors provide examples of the current standing of the solar research cyberinfrastructure, and describe the problems related to current data handling approaches. The list of the top-level recommendations agreed upon by the authors of the current white paper is presented at the beginning of the paper.
Submitted 17 March, 2022;
originally announced March 2022.
-
ModelPred: A Framework for Predicting Trained Model from Training Data
Authors:
Yingyan Zeng,
Jiachen T. Wang,
Si Chen,
Hoang Anh Just,
Ran Jin,
Ruoxi Jia
Abstract:
In this work, we propose ModelPred, a framework that helps to understand the impact of changes in training data on a trained model. This is critical for building trust in various stages of a machine learning pipeline: from cleaning poor-quality samples and tracking important ones to be collected during data preparation, to calibrating uncertainty of model prediction, to interpreting why certain behaviors of a model emerge during deployment. Specifically, ModelPred learns a parameterized function that takes a dataset $S$ as the input and predicts the model obtained by training on $S$. Our work differs from the recent work of Datamodels [1] as we aim for predicting the trained model parameters directly instead of the trained model behaviors. We demonstrate that a neural network-based set function class is capable of learning the complex relationships between the training data and model parameters. We introduce novel global and local regularization techniques to prevent overfitting and we rigorously characterize the expressive power of neural networks (NN) in approximating the end-to-end training process. Through extensive empirical investigations, we show that ModelPred enables a variety of applications that boost the interpretability and accountability of machine learning (ML), such as data valuation, data selection, memorization quantification, and model calibration.
Submitted 23 December, 2022; v1 submitted 24 November, 2021;
originally announced November 2021.
-
Matching-invariant running quark masses in Quantum Chromodynamics
Authors:
H. M. Chen,
L. M. Liu,
J. T. Wang,
M. Waqas,
G. X. Peng
Abstract:
The conventional quark mass is not continuous at thresholds. In this paper, we derive matching-invariant quark masses which are continuous everywhere. They are expanded as an obvious function of the logarithmic Lambda scaled energy. The expansion coefficients are related to the original gamma and beta functions, with concretization to four loop level. The results show that the new expressions for the quark masses converge indeed much faster.
Submitted 22 October, 2021;
originally announced October 2021.
-
Deep Learning Based Reconstruction of Total Solar Irradiance
Authors:
Yasser Abduallah,
Jason T. L. Wang,
Yucong Shen,
Khalid A. Alobaid,
Serena Criscuoli,
Haimin Wang
Abstract:
The Earth's primary source of energy is the radiant energy generated by the Sun, which is referred to as solar irradiance, or total solar irradiance (TSI) when all of the radiation is measured. A minor change in the solar irradiance can have a significant impact on the Earth's climate and atmosphere. As a result, studying and measuring solar irradiance is crucial in understanding climate changes and solar variability. Several methods have been developed to reconstruct total solar irradiance for long and short periods of time; however, they are physics-based and rely on the availability of data, which does not go beyond 9,000 years. In this paper we propose a new method, called TSInet, to reconstruct total solar irradiance by deep learning for short and long periods of time that span beyond the physical models' data availability. On the data that are available, our method agrees well with the state-of-the-art physics-based reconstruction models. To our knowledge, this is the first time that deep learning has been used to reconstruct total solar irradiance for more than 9,000 years.
Submitted 23 July, 2021;
originally announced July 2021.
-
Tracing Halpha Fibrils through Bayesian Deep Learning
Authors:
Haodi Jiang,
Ju Jing,
Jiasheng Wang,
Chang Liu,
Qin Li,
Yan Xu,
Jason T. L. Wang,
Haimin Wang
Abstract:
We present a new deep learning method, dubbed FibrilNet, for tracing chromospheric fibrils in Halpha images of solar observations. Our method consists of a data pre-processing component that prepares training data from a threshold-based tool, a deep learning model implemented as a Bayesian convolutional neural network for probabilistic image segmentation with uncertainty quantification to predict fibrils, and a post-processing component containing a fibril-fitting algorithm to determine fibril orientations. The FibrilNet tool is applied to high-resolution Halpha images from an active region (AR 12665) collected by the 1.6 m Goode Solar Telescope (GST) equipped with high-order adaptive optics at the Big Bear Solar Observatory (BBSO). We quantitatively assess the FibrilNet tool, comparing its image segmentation algorithm and fibril-fitting algorithm with those employed by the threshold-based tool. Our experimental results and major findings are summarized as follows. First, the image segmentation results (i.e., detected fibrils) of the two tools are quite similar, demonstrating the good learning capability of FibrilNet. Second, FibrilNet finds more accurate and smoother fibril orientation angles than the threshold-based tool. Third, FibrilNet is faster than the threshold-based tool and the uncertainty maps produced by FibrilNet not only provide a quantitative way to measure the confidence on each detected fibril, but also help identify fibril structures that are not detected by the threshold-based tool but are inferred through machine learning. Finally, we apply FibrilNet to full-disk Halpha images from other solar observatories and additional high-resolution Halpha images collected by BBSO/GST, demonstrating the tool's usability in diverse datasets.
Submitted 16 July, 2021;
originally announced July 2021.
-
Vojta's abc Conjecture for algebraic tori and applications over function fields
Authors:
Ji Guo,
Khoa D. Nguyen,
Chia-Liang Sun,
Julie Tzu-Yueh Wang
Abstract:
We prove Vojta's generalized abc conjecture for algebraic tori over function fields with exceptional sets that can be determined effectively. Additionally, we establish a version of the conjecture for toric varieties. As an application, we investigate the Lang-Vojta Conjecture for varieties of log general type that are ramified covers of $\mathbb G_m^n$ over function fields. In particular, we consider the case of $\mathbb P^n\setminus D$, where $D$ is an algebraic curve over a function field in $\mathbb P^n$ with $n+1$ irreducible components and $\deg D\ge n+2$. Our methods also apply to the complex situation, enabling us to find explicit exceptional sets for the corresponding case of Vojta's general abc conjecture (complex version) and the Green-Griffiths-Lang conjecture.
Submitted 18 October, 2023; v1 submitted 30 June, 2021;
originally announced June 2021.
-
Divisibility of polynomials and degeneracy of integral points
Authors:
Erwan Rousseau,
Julie Tzu-Yueh Wang,
Amos Turchet
Abstract:
We prove several statements about arithmetic hyperbolicity of certain blow-up varieties. As a corollary we obtain multiple examples of simply connected quasi-projective varieties that are pseudo-arithmetically hyperbolic. This generalizes results of Corvaja and Zannier obtained in dimension 2 to arbitrary dimension. The key input is an application of the Ru-Vojta strategy. We also obtain the analogous results for function fields and Nevanlinna theory with the goal of applying them in a future paper in the context of Campana's conjectures.
Submitted 21 June, 2021;
originally announced June 2021.
-
A truncated second main theorem for algebraic tori with moving targets and applications
Authors:
Ji Guo,
Chia-Liang Sun,
Julie Tzu-Yueh Wang
Abstract:
We establish a second main theorem for algebraic tori with slow growth moving targets with truncation to level 1. As the first application of this result, we prove the Green-Griffiths-Lang conjecture for projective spaces with $n+1$ components in the context of moving targets of slow growth. Then we discuss the integrability of the ring of exponential polynomials in the ring of entire functions as another application.
Submitted 30 March, 2021;
originally announced March 2021.
-
The Ru-Vojta result for subvarieties
Authors:
Min Ru,
Julie Tzu-Yueh Wang
Abstract:
In their recent article, Min Ru and Paul Vojta, among other things, proved the so-called general theorem (arithmetic part) which can be viewed as an extension of Schmidt's subspace theorem. In this note, we extend their result by replacing the divisors by closed subschemes.
Submitted 3 March, 2021;
originally announced March 2021.
-
On Pisot's $d$-th root conjecture for function fields and related GCD estimates
Authors:
Ji Guo,
Chia-Liang Sun,
Julie Tzu-Yueh Wang
Abstract:
We propose a function-field analog of Pisot's $d$-th root conjecture on linear recurrences, and prove it under some "non-triviality" assumption. Besides a recent result of Pasten-Wang on Büchi's $d$-th power problem, our main tool, which is also developed in this paper, is a function-field analog of a GCD estimate in a recent work of Levin and Levin-Wang. As an easy corollary of such a GCD estimate, we also obtain an asymptotic result.
Submitted 5 October, 2020;
originally announced October 2020.
-
DeepSun: Machine-Learning-as-a-Service for Solar Flare Prediction
Authors:
Yasser Abduallah,
Jason T. L. Wang,
Yang Nie,
Chang Liu,
Haimin Wang
Abstract:
Solar flare prediction plays an important role in understanding and forecasting space weather. The main goal of the Helioseismic and Magnetic Imager (HMI), one of the instruments on NASA's Solar Dynamics Observatory, is to study the origin of solar variability and characterize the Sun's magnetic activity. HMI provides continuous full-disk observations of the solar vector magnetic field with high cadence data that lead to reliable predictive capability; yet, solar flare prediction effort utilizing these data is still limited. In this paper, we present a machine-learning-as-a-service (MLaaS) framework, called DeepSun, for predicting solar flares on the Web based on HMI's data products. Specifically, we construct training data by utilizing the physical parameters provided by the Space-weather HMI Active Region Patches (SHARP) and categorize solar flares into four classes, namely B, C, M, X, according to the X-ray flare catalogs available at the National Centers for Environmental Information (NCEI). Thus, the solar flare prediction problem at hand is essentially a multi-class (i.e., four-class) classification problem. The DeepSun system employs several machine learning algorithms to tackle this multi-class prediction problem and provides an application programming interface (API) for remote programming users. To our knowledge, DeepSun is the first MLaaS tool capable of predicting solar flares through the Internet.
Submitted 3 September, 2020;
originally announced September 2020.
-
Identifying and Tracking Solar Magnetic Flux Elements with Deep Learning
Authors:
Haodi Jiang,
Jiasheng Wang,
Chang Liu,
Ju Jing,
Hao Liu,
Jason T. L. Wang,
Haimin Wang
Abstract:
Deep learning has drawn a lot of interest in recent years due to its effectiveness in processing big and complex observational data gathered from diverse instruments. Here we propose a new deep learning method, called SolarUnet, to identify and track solar magnetic flux elements or features in observed vector magnetograms based on the Southwest Automatic Magnetic Identification Suite (SWAMIS). Our method consists of a data pre-processing component that prepares training data from the SWAMIS tool, a deep learning model implemented as a U-shaped convolutional neural network for fast and accurate image segmentation, and a post-processing component that prepares tracking results. SolarUnet is applied to data from the 1.6 meter Goode Solar Telescope at the Big Bear Solar Observatory. When compared to the widely used SWAMIS tool, SolarUnet is faster while agreeing mostly with SWAMIS on feature size and flux distributions, and complementing SWAMIS in tracking long-lifetime features. Thus, the proposed physics-guided deep learning-based tool can be considered as an alternative method for solar magnetic tracking.
Submitted 27 August, 2020;
originally announced August 2020.
-
Inferring Vector Magnetic Fields from Stokes Profiles of GST/NIRIS Using a Convolutional Neural Network
Authors:
Hao Liu,
Yan Xu,
Jiasheng Wang,
Ju Jing,
Chang Liu,
Jason T. L. Wang,
Haimin Wang
Abstract:
We propose a new machine learning approach to Stokes inversion based on a convolutional neural network (CNN) and the Milne-Eddington (ME) method. The Stokes measurements used in this study were taken by the Near InfraRed Imaging Spectropolarimeter (NIRIS) on the 1.6 m Goode Solar Telescope (GST) at the Big Bear Solar Observatory. By learning the latent patterns in the training data prepared by the physics-based ME tool, the proposed CNN method is able to infer vector magnetic fields from the Stokes profiles of GST/NIRIS. Experimental results show that our CNN method produces smoother and cleaner magnetic maps than the widely used ME method. Furthermore, the CNN method is 4~6 times faster than the ME method, and is able to produce vector magnetic fields in near real-time, which is essential to space weather forecasting. Specifically, it takes ~50 seconds for the CNN method to process an image of 720 x 720 pixels comprising Stokes profiles of GST/NIRIS. Finally, the CNN-inferred results are highly correlated to the ME-calculated results and are closer to the ME's results with the Pearson product-moment correlation coefficient (PPMCC) being closer to 1 on average than those from other machine learning algorithms such as multiple support vector regression and multilayer perceptrons (MLP). In particular, the CNN method outperforms the current best machine learning method (MLP) by 2.6% on average in PPMCC according to our experimental study. Thus, the proposed physics-assisted deep learning-based CNN tool can be considered as an alternative, efficient method for Stokes inversion for high resolution polarimetric observations obtained by GST/NIRIS.
Submitted 8 May, 2020;
originally announced May 2020.
-
Non-Archimedean analytic curves in the complements of hypersurface divisors
Authors:
Ta Thi Hoai An,
J. T. -Y. Wang,
P. -M. Wong
Abstract:
We study the degeneration dimension of non-archimedean analytic maps into the complement of hypersurface divisors of smooth projective varieties. We also show that there exist no non-archimedean analytic maps into $P^n\setminus\cup_{i=1}^n D_i$ where $D_i, 1\le i\le n$, are hypersurfaces of degree at least 2 in general position and intersecting transversally. Moreover, we prove that there exist no non-archimedean analytic maps into $P^2\setminus\cup_{i=1}^2 D_i$ when $D_1, D_2$ are generic plane curves with $\deg D_1+\deg D_2\ge 4$.
Submitted 22 April, 2020;
originally announced April 2020.
-
Strong uniqueness polynomials: the complex case
Authors:
Ta Thi Hoai An,
Julie T-Y Wang,
Pit-Mann Wong
Abstract:
The theory of strong uniqueness polynomials, satisfying the separation condition (first introduced by Fujimoto \cite{Fuj1}), for complex meromorphic functions is quite complete. We construct examples of strong uniqueness polynomials which do not necessarily satisfy the separation condition by constructing regular 1-forms of Wronskian type, a method introduced in \cite{AWW}. We also use this method to produce a much easier proof in establishing the necessary and sufficient conditions for a polynomial, satisfying the separation condition, to be a strong uniqueness polynomial for meromorphic functions and rational functions.
Submitted 22 April, 2020;
originally announced April 2020.