
Showing 1–50 of 81 results for author: Wang, J T

  1. arXiv:2406.11011  [pdf, other]

    cs.LG cs.CL stat.ML

    Data Shapley in One Training Run

    Authors: Jiachen T. Wang, Prateek Mittal, Dawn Song, Ruoxi Jia

    Abstract: Data Shapley provides a principled framework for attributing data's contribution within machine learning contexts. However, existing approaches require re-training models on different data subsets, which is computationally intensive, foreclosing their application to large-scale models. Furthermore, they produce the same attribution score for any models produced by running the learning algorithm, m…

    Submitted 29 June, 2024; v1 submitted 16 June, 2024; originally announced June 2024.

  2. arXiv:2405.16080  [pdf, other]

    astro-ph.SR

    A Deep Learning Approach to Operational Flare Forecasting

    Authors: Yasser Abduallah, Jason T. L. Wang

    Abstract: Solar flares are explosions on the Sun. They happen when energy stored in magnetic fields around solar active regions (ARs) is suddenly released. In this paper, we present a transformer-based framework, named SolarFlareNet, for predicting whether an AR would produce a gamma-class flare within the next 24 to 72 hours. We consider three gamma classes, namely the >=M5.0 class, the >=M class and the >…

    Submitted 25 May, 2024; originally announced May 2024.

    Comments: 12 pages, 3 figures

  3. arXiv:2405.03875  [pdf, other]

    cs.LG stat.ML

    Rethinking Data Shapley for Data Selection Tasks: Misleads and Merits

    Authors: Jiachen T. Wang, Tianji Yang, James Zou, Yongchan Kwon, Ruoxi Jia

    Abstract: Data Shapley provides a principled approach to data valuation and plays a crucial role in data-centric machine learning (ML) research. Data selection is considered a standard application of Data Shapley. However, its data selection performance has been shown to be inconsistent across settings in the literature. This study aims to deepen our understanding of this phenomenon. We introduce a hypothesis te…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: ICML 2024

  4. arXiv:2404.13964  [pdf, other]

    cs.LG econ.GN stat.ME

    An Economic Solution to Copyright Challenges of Generative AI

    Authors: Jiachen T. Wang, Zhun Deng, Hiroaki Chiba-Okabe, Boaz Barak, Weijie J. Su

    Abstract: Generative artificial intelligence (AI) systems are trained on large data corpora to generate new pieces of text, images, videos, and other media. There is growing concern that such systems may infringe on the copyright interests of training data contributors. To address the copyright challenges of generative AI, we propose a framework that compensates copyright owners proportionally to their cont…

    Submitted 24 April, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

  5. arXiv:2403.18302  [pdf, other]

    astro-ph.SR cs.LG

    Super-Resolution of SOHO/MDI Magnetograms of Solar Active Regions Using SDO/HMI Data and an Attention-Aided Convolutional Neural Network

    Authors: Chunhui Xu, Jason T. L. Wang, Haimin Wang, Haodi Jiang, Qin Li, Yasser Abduallah, Yan Xu

    Abstract: Image super-resolution has been an important subject in image processing and recognition. Here, we present an attention-aided convolutional neural network (CNN) for solar image super-resolution. Our method, named SolarCNN, aims to enhance the quality of line-of-sight (LOS) magnetograms of solar active regions (ARs) collected by the Michelson Doppler Imager (MDI) on board the Solar and Heliospheric…

    Submitted 27 March, 2024; originally announced March 2024.

    Comments: 17 pages, 7 figures

  6. arXiv:2402.17196  [pdf, other]

    astro-ph.IM cs.LG

    Prediction of the SYM-H Index Using a Bayesian Deep Learning Method with Uncertainty Quantification

    Authors: Yasser Abduallah, Khalid A. Alobaid, Jason T. L. Wang, Haimin Wang, Vania K. Jordanova, Vasyl Yurchyshyn, Huseyin Cavus, Ju Jing

    Abstract: We propose a novel deep learning framework, named SYMHnet, which employs a graph neural network and a bidirectional long short-term memory network to cooperatively learn patterns from solar wind and interplanetary magnetic field parameters for short-term forecasts of the SYM-H index based on 1-minute and 5-minute resolution data. SYMHnet takes, as input, the time series of the parameters' values p…

    Submitted 26 February, 2024; originally announced February 2024.

    Comments: 28 pages, 8 figures

  7. arXiv:2402.11111  [pdf, other]

    cs.CL

    Language Models as Science Tutors

    Authors: Alexis Chevalier, Jiayi Geng, Alexander Wettig, Howard Chen, Sebastian Mizera, Toni Annala, Max Jameson Aragon, Arturo Rodríguez Fanlo, Simon Frieder, Simon Machado, Akshara Prabhakar, Ellie Thieu, Jiachen T. Wang, Zirui Wang, Xindi Wu, Mengzhou Xia, Wenhan Xia, Jiatong Yu, Jun-Jie Zhu, Zhiyong Jason Ren, Sanjeev Arora, Danqi Chen

    Abstract: NLP has recently made exciting progress toward training language models (LMs) with strong scientific problem-solving skills. However, model development has not focused on real-life use-cases of LMs for science, including applications in education that require processing long scientific documents. To address this, we introduce TutorEval and TutorChat. TutorEval is a diverse question-answering bench…

    Submitted 21 July, 2024; v1 submitted 16 February, 2024; originally announced February 2024.

    Comments: 8 pages without bibliography and appendix, 26 pages total

  8. arXiv:2401.13186  [pdf, ps, other]

    math.AG math.NT

    Campana conjecture for coverings of toric surfaces over function fields

    Authors: Carlo Gasbarri, Ji Guo, Julie Tzu-Yueh Wang

    Abstract: We first proved Vojta's abc conjecture over function fields for Campana points on projective toric surfaces with high multiplicity along the boundary. As a consequence, we show a version of Campana's conjecture on finite covering of projective toric surfaces over function fields.

    Submitted 23 January, 2024; originally announced January 2024.

    Comments: arXiv admin note: text overlap with arXiv:2106.15881

    MSC Class: 11J97; 14H05 and 11J87

  9. arXiv:2401.11103  [pdf, other]

    cs.DS cs.LG stat.ML

    Efficient Data Shapley for Weighted Nearest Neighbor Algorithms

    Authors: Jiachen T. Wang, Prateek Mittal, Ruoxi Jia

    Abstract: This work aims to address an open problem in the data valuation literature concerning the efficient computation of Data Shapley for the weighted $K$ nearest neighbor algorithm (WKNN-Shapley). By considering the accuracy of hard-label KNN with discretized weights as the utility function, we reframe the computation of WKNN-Shapley into a counting problem and introduce a quadratic-time algorithm, presenting…

    Submitted 19 January, 2024; originally announced January 2024.

    Comments: AISTATS 2024 Oral

  10. arXiv:2312.03724  [pdf, other]

    cs.CL cs.AI

    DP-OPT: Make Large Language Model Your Privacy-Preserving Prompt Engineer

    Authors: Junyuan Hong, Jiachen T. Wang, Chenhui Zhang, Zhangheng Li, Bo Li, Zhangyang Wang

    Abstract: Large Language Models (LLMs) have emerged as dominant tools for various tasks, particularly when tailored for a specific target by prompt tuning. Nevertheless, concerns surrounding data privacy present obstacles due to the tuned prompts' dependency on sensitive private information. A practical solution is to host a local LLM and optimize a soft prompt privately using data. Yet, hosting a local mod…

    Submitted 17 March, 2024; v1 submitted 26 November, 2023; originally announced December 2023.

    Comments: Accepted to ICLR'24 Spotlight (updated version)

  11. arXiv:2312.01691  [pdf, other]

    astro-ph.SR cs.LG physics.space-ph

    Estimating Coronal Mass Ejection Mass and Kinetic Energy by Fusion of Multiple Deep-learning Models

    Authors: Khalid A. Alobaid, Yasser Abduallah, Jason T. L. Wang, Haimin Wang, Shen Fan, Jialiang Li, Huseyin Cavus, Vasyl Yurchyshyn

    Abstract: Coronal mass ejections (CMEs) are massive solar eruptions, which have a significant impact on Earth. In this paper, we propose a new method, called DeepCME, to estimate two properties of CMEs, namely, CME mass and kinetic energy. Being able to estimate these properties helps better understand CME dynamics. Our study is based on the CME catalog maintained at the Coordinated Data Analysis Workshops…

    Submitted 4 December, 2023; originally announced December 2023.

    Comments: 10 pages, 7 figures

    Journal ref: The Astrophysical Journal Letters, 958:L34, 2023

  12. arXiv:2311.09345  [pdf, other]

    physics.space-ph astro-ph.SR physics.plasm-ph

    A Machine Learning Approach to Understanding the Physical Properties of Magnetic Flux Ropes in the Solar Wind at 1 AU

    Authors: Hameedullah Farooki, Yasser Abduallah, Sung Jun Noh, Hyomin Kim, George Bizos, Youra Shin, Jason T. L. Wang, Haimin Wang

    Abstract: Interplanetary magnetic flux ropes (MFRs) are commonly observed structures in the solar wind, categorized as magnetic clouds (MCs) and small-scale MFRs (SMFRs) depending on whether they are associated with coronal mass ejections. We apply machine learning to systematically compare SMFRs, MCs, and ambient solar wind plasma properties. We construct a dataset of 3-minute averaged sequential data poin…

    Submitted 15 November, 2023; originally announced November 2023.

    Comments: Accepted for publication to ApJ

  13. arXiv:2311.06759  [pdf, other]

    physics.space-ph astro-ph.SR

    A Closer Look at Small-Scale Magnetic Flux Ropes in the Solar Wind at 1 AU: Results from Improved Automated Detection

    Authors: Hameedullah Farooki, Sung Jun Noh, Jeongwoo Lee, Haimin Wang, Hyomin Kim, Yasser Abduallah, Jason T. L. Wang, Yu Chen, Sergio Servidio, Francesco Pecora

    Abstract: Small-scale interplanetary magnetic flux ropes (SMFRs) are similar to ICMEs in magnetic structure, but are smaller and do not exhibit ICME plasma signatures. We present a computationally efficient and GPU-powered version of the single-spacecraft automated SMFR detection algorithm based on the Grad-Shafranov (GS) technique. Our algorithm is capable of processing higher resolution data, eliminates s…

    Submitted 12 November, 2023; originally announced November 2023.

  14. arXiv:2308.15709  [pdf, other]

    cs.LG cs.CR cs.GT stat.ML

    Threshold KNN-Shapley: A Linear-Time and Privacy-Friendly Approach to Data Valuation

    Authors: Jiachen T. Wang, Yuqing Zhu, Yu-Xiang Wang, Ruoxi Jia, Prateek Mittal

    Abstract: Data valuation aims to quantify the usefulness of individual data sources in training machine learning (ML) models, and is a critical aspect of data-centric ML research. However, data valuation faces significant yet frequently overlooked privacy challenges despite its importance. This paper studies these challenges with a focus on KNN-Shapley, one of the most practical data valuation methods nowad…

    Submitted 25 November, 2023; v1 submitted 29 August, 2023; originally announced August 2023.

    Comments: NeurIPS 2023 Spotlight

  15. arXiv:2308.13240  [pdf, ps, other]

    math.AG math.CV math.NT

    Simply connectedness and hyperbolicity

    Authors: Erwan Rousseau, Carlo Gasbarri, Amos Turchet, Julie Tzu-Yueh Wang

    Abstract: We generalize to arbitrary dimension our previous construction of simply connected weakly-special but not special varieties. We show that they satisfy the function field and complex analytic part of Campana's conjecture. Moreover, we give the first examples, in any dimension, of smooth simply connected nonisotrivial projective varieties of general type that satisfy the function field Lang's conjec…

    Submitted 25 August, 2023; originally announced August 2023.

  16. arXiv:2308.12439  [pdf, other]

    cs.CR cs.AI cs.CV cs.LG

    BaDExpert: Extracting Backdoor Functionality for Accurate Backdoor Input Detection

    Authors: Tinghao Xie, Xiangyu Qi, Ping He, Yiming Li, Jiachen T. Wang, Prateek Mittal

    Abstract: We present a novel defense against backdoor attacks on Deep Neural Networks (DNNs), wherein adversaries covertly implant malicious behaviors (backdoors) into DNNs. Our defense falls within the category of post-development defenses that operate independently of how the model was generated. The proposed defense is built upon a novel reverse engineering approach that can directly extract backdoor fu…

    Submitted 5 October, 2023; v1 submitted 23 August, 2023; originally announced August 2023.

  17. arXiv:2305.01639  [pdf, other]

    cs.LG cs.AI cs.CR

    Privacy-Preserving In-Context Learning for Large Language Models

    Authors: Tong Wu, Ashwinee Panda, Jiachen T. Wang, Prateek Mittal

    Abstract: In-context learning (ICL) is an important capability of Large Language Models (LLMs), enabling these models to dynamically adapt based on specific, in-context exemplars, thereby improving accuracy and relevance. However, LLMs' responses may leak the sensitive private information contained in in-context exemplars. To address this challenge, we propose Differentially Private In-context Learning (DP-…

    Submitted 30 September, 2023; v1 submitted 2 May, 2023; originally announced May 2023.

  18. arXiv:2305.00258  [pdf, other]

    astro-ph.SR astro-ph.IM cs.LG physics.space-ph

    Ensemble Learning for CME Arrival Time Prediction

    Authors: Khalid A. Alobaid, Jason T. L. Wang

    Abstract: The Sun constantly releases radiation and plasma into the heliosphere. Sporadically, the Sun launches solar eruptions such as flares and coronal mass ejections (CMEs). CMEs carry away a huge amount of mass and magnetic flux with them. An Earth-directed CME can cause serious consequences to the human system. It can destroy power grids/pipelines, satellites, and communications. Therefore, accurately…

    Submitted 29 April, 2023; originally announced May 2023.

    Comments: 13 pages, 8 figures

  19. arXiv:2305.00054  [pdf, other]

    cs.LG cs.AI stat.ML

    LAVA: Data Valuation without Pre-Specified Learning Algorithms

    Authors: Hoang Anh Just, Feiyang Kang, Jiachen T. Wang, Yi Zeng, Myeongseob Ko, Ming Jin, Ruoxi Jia

    Abstract: Traditionally, data valuation (DV) is posed as a problem of equitably splitting the validation performance of a learning algorithm among the training data. As a result, the calculated data values depend on many design choices of the underlying learning algorithm. However, this dependence is undesirable for many DV use cases, such as setting priorities over different data sources in a data acquisit…

    Submitted 19 December, 2023; v1 submitted 28 April, 2023; originally announced May 2023.

    Comments: ICLR 2023 Spotlight Latest Updated Version: 2023/12/19

  20. arXiv:2304.07927  [pdf, other]

    cs.CR cs.DS cs.LG

    A Randomized Approach for Tight Privacy Accounting

    Authors: Jiachen T. Wang, Saeed Mahloujifar, Tong Wu, Ruoxi Jia, Prateek Mittal

    Abstract: Bounding privacy leakage over compositions, i.e., privacy accounting, is a key challenge in differential privacy (DP). The privacy parameter ($\epsilon$ or $\delta$) is often easy to estimate but hard to bound. In this paper, we propose a new differential privacy paradigm called estimate-verify-release (EVR), which addresses the challenges of providing a strict upper bound for privacy parameter in DP compo…

    Submitted 20 November, 2023; v1 submitted 16 April, 2023; originally announced April 2023.

    Comments: NeurIPS 2023

  21. arXiv:2304.04258  [pdf, ps, other]

    stat.ML cs.LG

    A Note on "Efficient Task-Specific Data Valuation for Nearest Neighbor Algorithms"

    Authors: Jiachen T. Wang, Ruoxi Jia

    Abstract: Data valuation is a growing research field that studies the influence of individual data points for machine learning (ML) models. Data Shapley, inspired by cooperative game theory and economics, is an effective method for data valuation. However, it is well-known that the Shapley value (SV) can be computationally expensive. Fortunately, Jia et al. (2019) showed that for K-Nearest Neighbors (KNN) m…

    Submitted 25 November, 2023; v1 submitted 9 April, 2023; originally announced April 2023.

    Comments: Technical Note

  22. arXiv:2302.11431  [pdf, ps, other]

    stat.ML cs.LG

    A Note on "Towards Efficient Data Valuation Based on the Shapley Value"

    Authors: Jiachen T. Wang, Ruoxi Jia

    Abstract: The Shapley value (SV) has emerged as a promising method for data valuation. However, computing or estimating the SV is often computationally expensive. To overcome this challenge, Jia et al. (2019) propose an advanced SV estimation algorithm called "Group Testing-based SV estimator" which achieves favorable asymptotic sample complexity. In this technical note, we present several improvements in…

    Submitted 22 February, 2023; originally announced February 2023.

  23. arXiv:2301.12576  [pdf, other]

    cs.LG cs.CR

    Uncovering Adversarial Risks of Test-Time Adaptation

    Authors: Tong Wu, Feiran Jia, Xiangyu Qi, Jiachen T. Wang, Vikash Sehwag, Saeed Mahloujifar, Prateek Mittal

    Abstract: Recently, test-time adaptation (TTA) has been proposed as a promising solution for addressing distribution shifts. It allows a base model to adapt to an unforeseen distribution during inference by leveraging the information from the batch of (unlabeled) test data. However, we uncover a novel security vulnerability of TTA based on the insight that predictions on benign samples can be impacted by ma…

    Submitted 4 February, 2023; v1 submitted 29 January, 2023; originally announced January 2023.

  24. A Deep Learning Approach to Generating Photospheric Vector Magnetograms of Solar Active Regions for SOHO/MDI Using SDO/HMI and BBSO Data

    Authors: Haodi Jiang, Qin Li, Zhihang Hu, Nian Liu, Yasser Abduallah, Ju Jing, Genwei Zhang, Yan Xu, Wynne Hsu, Jason T. L. Wang, Haimin Wang

    Abstract: Solar activity is usually caused by the evolution of solar magnetic fields. Magnetic field parameters derived from photospheric vector magnetograms of solar active regions have been used to analyze and forecast eruptive events such as solar flares and coronal mass ejections. Unfortunately, the most recent solar cycle 24 was relatively weak with few large flares, though it is the only solar cycle i…

    Submitted 4 November, 2022; originally announced November 2022.

    Comments: 15 pages, 6 figures

    Journal ref: Solar Physics, 2023

  25. arXiv:2210.04122  [pdf, other]

    astro-ph.SR astro-ph.IM cs.LG

    Inferring Line-of-Sight Velocities and Doppler Widths from Stokes Profiles of GST/NIRIS Using Stacked Deep Neural Networks

    Authors: Haodi Jiang, Qin Li, Yan Xu, Wynne Hsu, Kwangsu Ahn, Wenda Cao, Jason T. L. Wang, Haimin Wang

    Abstract: Obtaining high-quality magnetic and velocity fields through Stokes inversion is crucial in solar physics. In this paper, we present a new deep learning method, named Stacked Deep Neural Networks (SDNN), for inferring line-of-sight (LOS) velocities and Doppler widths from Stokes profiles collected by the Near InfraRed Imaging Spectropolarimeter (NIRIS) on the 1.6 m Goode Solar Telescope (GST) at th…

    Submitted 8 October, 2022; originally announced October 2022.

    Comments: 16 pages, 8 figures

    Journal ref: The Astrophysical Journal, 2022

  26. arXiv:2209.13779  [pdf]

    astro-ph.SR stat.ML

    Solar Flare Index Prediction Using SDO/HMI Vector Magnetic Data Products with Statistical and Machine Learning Methods

    Authors: Hewei Zhang, Qin Li, Yanxing Yang, Ju Jing, Jason T. L. Wang, Haimin Wang, Zuofeng Shang

    Abstract: Solar flares, especially the M- and X-class flares, are often associated with coronal mass ejections (CMEs). They are the most important sources of space weather effects, which can severely impact the near-Earth environment. Thus it is essential to forecast flares (especially the M- and X-class ones) to mitigate their destructive and hazardous consequences. Here, we introduce several statistical and…

    Submitted 1 December, 2022; v1 submitted 27 September, 2022; originally announced September 2022.

    Journal ref: The Astrophysical Journal Supplement Series (2022), Volume 263, Number 2

  27. arXiv:2209.11434  [pdf, ps, other]

    math.CV

    A complex case of Vojta's general abc conjecture and cases of Campana's orbifold conjecture

    Authors: Ji Guo, Julie Tzu-Yueh Wang

    Abstract: We proved a truncated second main theorem of level one with explicit exceptional sets for analytic maps into $\mathbb P^2$ intersecting the coordinate lines with sufficiently high multiplicities. As applications, we studied some cases of Campana's orbifold conjecture for $\mathbb P^2$ and finite ramified covers of $\mathbb P^2$ with three components admitting sufficiently large multiplicities.

    Submitted 22 June, 2023; v1 submitted 23 September, 2022; originally announced September 2022.

  28. arXiv:2209.07716  [pdf, other]

    cs.CR cs.LG

    Renyi Differential Privacy of Propose-Test-Release and Applications to Private and Robust Machine Learning

    Authors: Jiachen T. Wang, Saeed Mahloujifar, Shouda Wang, Ruoxi Jia, Prateek Mittal

    Abstract: Propose-Test-Release (PTR) is a differential privacy framework that works with local sensitivity of functions, instead of their global sensitivity. This framework is typically used for releasing robust statistics such as median or trimmed mean in a differentially private manner. While PTR is a common framework introduced over a decade ago, using it in applications such as robust SGD where we need…

    Submitted 16 September, 2022; originally announced September 2022.

    Comments: NeurIPS 2022

  29. arXiv:2206.07018  [pdf, other]

    cs.CV

    Turning a Curse into a Blessing: Enabling In-Distribution-Data-Free Backdoor Removal via Stabilized Model Inversion

    Authors: Si Chen, Yi Zeng, Jiachen T. Wang, Won Park, Xun Chen, Lingjuan Lyu, Zhuoqing Mao, Ruoxi Jia

    Abstract: Many backdoor removal techniques in machine learning models require clean in-distribution data, which may not always be available due to proprietary datasets. Model inversion techniques, often considered privacy threats, can reconstruct realistic training samples, potentially eliminating the need for in-distribution data. Prior attempts to combine backdoor removal and model inversion yielded limit…

    Submitted 23 March, 2023; v1 submitted 14 June, 2022; originally announced June 2022.

    Comments: Because of an equation and author informational error, this paper has been withdrawn by the submitter

  30. 2022 Review of Data-Driven Plasma Science

    Authors: Rushil Anirudh, Rick Archibald, M. Salman Asif, Markus M. Becker, Sadruddin Benkadda, Peer-Timo Bremer, Rick H. S. Budé, C. S. Chang, Lei Chen, R. M. Churchill, Jonathan Citrin, Jim A Gaffney, Ana Gainaru, Walter Gekelman, Tom Gibbs, Satoshi Hamaguchi, Christian Hill, Kelli Humbird, Sören Jalas, Satoru Kawaguchi, Gon-Ho Kim, Manuel Kirchen, Scott Klasky, John L. Kline, Karl Krushelnick , et al. (38 additional authors not shown)

    Abstract: Data science and technology offer transformative tools and methods to science. This review article highlights latest development and progress in the interdisciplinary field of data-driven plasma science (DDPS). A large amount of data and machine learning algorithms go hand in hand. Most plasma data, whether experimental, observational or computational, are generated or collected by machines today.…

    Submitted 31 May, 2022; originally announced May 2022.

    Comments: 112 pages (including 700+ references), 44 figures, submitted to IEEE Transactions on Plasma Science as a part of the IEEE Golden Anniversary Special Issue

    Report number: Los Alamos Report number LA-UR-22-24834

    Journal ref: IEEE Transactions on Plasma Science 51, 1750 - 1838 (2023)

  31. arXiv:2205.15466  [pdf, other]

    cs.LG cs.GT stat.ML

    Data Banzhaf: A Robust Data Valuation Framework for Machine Learning

    Authors: Jiachen T. Wang, Ruoxi Jia

    Abstract: Data valuation has wide use cases in machine learning, including improving data quality and creating economic incentives for data sharing. This paper studies the robustness of data valuation to noisy model performance scores. Particularly, we find that the inherent randomness of the widely used stochastic gradient descent can cause existing data value notions (e.g., the Shapley value and the Leave…

    Submitted 18 December, 2023; v1 submitted 30 May, 2022; originally announced May 2022.

    Comments: AISTATS 2023 Oral

    Journal ref: AISTATS 2023

  32. arXiv:2205.13616  [pdf, other]

    cs.LG cs.CR cs.CV

    Towards A Proactive ML Approach for Detecting Backdoor Poison Samples

    Authors: Xiangyu Qi, Tinghao Xie, Jiachen T. Wang, Tong Wu, Saeed Mahloujifar, Prateek Mittal

    Abstract: Adversaries can embed backdoors in deep learning models by introducing backdoor poison samples into training datasets. In this work, we investigate how to detect such poison samples to mitigate the threat of backdoor attacks. First, we uncover a post-hoc workflow underlying most prior work, where defenders passively allow the attack to proceed and then leverage the characteristics of the post-atta…

    Submitted 17 June, 2023; v1 submitted 26 May, 2022; originally announced May 2022.

    Comments: USENIX Security 2023

  33. arXiv:2205.04757  [pdf, other]

    astro-ph.SR astro-ph.GA

    Characterization of Kepler targets based on medium-resolution LAMOST spectra analyzed with ROTFIT

    Authors: A. Frasca, J. Molenda-Zakowicz, J. Alonso-Santiago, G. Catanzaro, P. De Cat, J. N. Fu, W. Zong, J. X. Wang, T. Cang, J. T. Wang

    Abstract: In this work we present the results of our analysis of 16,300 medium-resolution LAMOST spectra of late-type stars in the Kepler field with the aim of determining the stellar parameters, activity level, lithium atmospheric content, and binarity. We have used a version of the code ROTFIT specifically developed for these spectra. We provide a catalog with the atmospheric parameters (Teff, log(g), and…

    Submitted 10 May, 2022; originally announced May 2022.

    Comments: 32 pages, 34 figures; accepted for publication in Astronomy & Astrophysics

    Journal ref: A&A 664, A78 (2022)

  34. arXiv:2205.02447  [pdf, other]

    cs.LG

    A Deep Learning Approach to Dst Index Prediction

    Authors: Yasser Abduallah, Jason T. L. Wang, Prianka Bose, Genwei Zhang, Firas Gerges, Haimin Wang

    Abstract: The disturbance storm time (Dst) index is an important and useful measurement in space weather research. It has been used to characterize the size and intensity of a geomagnetic storm. A negative Dst value means that the Earth's magnetic field is weakened, which happens during storms. In this paper, we present a novel deep learning method, called the Dst Transformer, to perform short-term, 1-6 hou…

    Submitted 5 May, 2022; originally announced May 2022.

    Comments: 7 pages, 4 figures

  35. arXiv:2203.14393  [pdf, other]

    astro-ph.SR cs.LG

    Predicting Solar Energetic Particles Using SDO/HMI Vector Magnetic Data Products and a Bidirectional LSTM Network

    Authors: Yasser Abduallah, Vania K. Jordanova, Hao Liu, Qin Li, Jason T. L. Wang, Haimin Wang

    Abstract: Solar energetic particles (SEPs) are an essential source of space radiation, which are hazards for humans in space, spacecraft, and technology in general. In this paper we propose a deep learning method, specifically a bidirectional long short-term memory (biLSTM) network, to predict if an active region (AR) would produce an SEP event given that (i) the AR will produce an M- or X-class flare and a…

    Submitted 27 March, 2022; originally announced March 2022.

    Comments: 22 pages, 6 figures, 8 tables

  36. arXiv:2203.09544  [pdf, ps, other]

    astro-ph.IM astro-ph.EP astro-ph.SR

    Revisiting the Solar Research Cyberinfrastructure Needs: A White Paper of Findings and Recommendations

    Authors: Gelu Nita, Azim Ahmadzadeh, Serena Criscuoli, Alisdair Davey, Dale Gary, Manolis Georgoulis, Neal Hurlburt, Irina Kitiashvili, Dustin Kempton, Alexander Kosovichev, Piet Martens, Ryan McGranaghan, Vincent Oria, Kevin Reardon, Viacheslav Sadykov, Ryan Timmons, Haimin Wang, Jason T. L. Wang

    Abstract: Solar and Heliosphere physics are areas of remarkable data-driven discoveries. Recent advances in high-cadence, high-resolution multiwavelength observations, growing amounts of data from realistic modeling, and operational needs for uninterrupted science-quality data coverage generate the demand for a solar metadata standardization and overall healthy data infrastructure. This white paper is prepa…

    Submitted 17 March, 2022; originally announced March 2022.

    Comments: White Paper

  37. arXiv:2111.12545  [pdf, other]

    cs.LG stat.CO

    ModelPred: A Framework for Predicting Trained Model from Training Data

    Authors: Yingyan Zeng, Jiachen T. Wang, Si Chen, Hoang Anh Just, Ran Jin, Ruoxi Jia

    Abstract: In this work, we propose ModelPred, a framework that helps to understand the impact of changes in training data on a trained model. This is critical for building trust in various stages of a machine learning pipeline: from cleaning poor-quality samples and tracking important ones to be collected during data preparation, to calibrating uncertainty of model prediction, to interpreting why certain be…

    Submitted 23 December, 2022; v1 submitted 24 November, 2021; originally announced November 2021.

  38. Matching-invariant running quark masses in Quantum Chromodynamics

    Authors: H. M. Chen, L. M. Liu, J. T. Wang, M. Waqas, G. X. Peng

    Abstract: The conventional quark mass is not continuous at thresholds. In this paper, we derive matching-invariant quark masses which are continuous everywhere. They are expanded as an obvious function of the logarithmic Lambda scaled energy. The expansion coefficients are related to the original gamma and beta functions, with concretization to four loop level. The results show that the new expressions for t…

    Submitted 22 October, 2021; originally announced October 2021.

  39. arXiv:2107.11042  [pdf, other]

    astro-ph.SR cs.LG

    Deep Learning Based Reconstruction of Total Solar Irradiance

    Authors: Yasser Abduallah, Jason T. L. Wang, Yucong Shen, Khalid A. Alobaid, Serena Criscuoli, Haimin Wang

    Abstract: The Earth's primary source of energy is the radiant energy generated by the Sun, which is referred to as solar irradiance, or total solar irradiance (TSI) when all of the radiation is measured. A minor change in the solar irradiance can have a significant impact on the Earth's climate and atmosphere. As a result, studying and measuring solar irradiance is crucial in understanding climate changes a…

    Submitted 23 July, 2021; originally announced July 2021.

    Comments: 8 pages, 11 figures

  40. arXiv:2107.07886  [pdf, other]

    astro-ph.SR cs.LG

    Tracing Halpha Fibrils through Bayesian Deep Learning

    Authors: Haodi Jiang, Ju Jing, Jiasheng Wang, Chang Liu, Qin Li, Yan Xu, Jason T. L. Wang, Haimin Wang

    Abstract: We present a new deep learning method, dubbed FibrilNet, for tracing chromospheric fibrils in Halpha images of solar observations. Our method consists of a data pre-processing component that prepares training data from a threshold-based tool, a deep learning model implemented as a Bayesian convolutional neural network for probabilistic image segmentation with uncertainty quantification to predict…

    Submitted 16 July, 2021; originally announced July 2021.

    Comments: 20 pages, 12 figures

  41. arXiv:2106.15881  [pdf, ps, other

    math.NT

    Vojta's abc Conjecture for algebraic tori and applications over function fields

    Authors: Ji Guo, Khoa D. Nguyen, Chia-Liang Sun, Julie Tzu-Yueh Wang

    Abstract: We prove Vojta's generalized abc conjecture for algebraic tori over function fields with exceptional sets that can be determined effectively. Additionally, we establish a version of the conjecture for toric varieties. As an application, we investigate the Lang-Vojta Conjecture for varieties of log general type that are ramified covers of $\mathbb G_m^n$ over function fields. In particular, we cons…

    Submitted 18 October, 2023; v1 submitted 30 June, 2021; originally announced June 2021.

    MSC Class: 11J97

  42. arXiv:2106.11337  [pdf, ps, other

    math.NT math.AG

    Divisibility of polynomials and degeneracy of integral points

    Authors: Erwan Rousseau, Julie Tzu-Yueh Wang, Amos Turchet

    Abstract: We prove several statements about arithmetic hyperbolicity of certain blow-up varieties. As a corollary we obtain multiple examples of simply connected quasi-projective varieties that are pseudo-arithmetically hyperbolic. This generalizes results of Corvaja and Zannier obtained in dimension 2 to arbitrary dimension. The key input is an application of the Ru-Vojta strategy. We also obtain the ana…

    Submitted 21 June, 2021; originally announced June 2021.

    Comments: 26 pages. Comments welcome

    MSC Class: 11J87; 11J97; 14G05; 32A22

  43. arXiv:2103.16087  [pdf, ps, other

    math.CV

    A truncated second main theorem for algebraic tori with moving targets and applications

    Authors: Ji Guo, Chia-Liang Sun, Julie Tzu-Yueh Wang

    Abstract: We establish a second main theorem for algebraic tori with slow growth moving targets with truncation to level 1. As the first application of this result, we prove the Green-Griffiths-Lang conjecture for projective spaces with $n+1$ components in the context of moving targets of slow growth. Then we discuss the integrality of the ring of exponential polynomials in the ring of entire functions as…

    Submitted 30 March, 2021; originally announced March 2021.

    Comments: 20 pages

    MSC Class: 30D35 (Primary); 30A70; 11J97(Secondary)

  44. arXiv:2103.02775  [pdf, ps, other

    math.NT

    The Ru-Vojta result for subvarieties

    Authors: Min Ru, Julie Tzu-Yueh Wang

    Abstract: In their recent article, Min Ru and Paul Vojta, among other things, proved the so-called general theorem (arithmetic part) which can be viewed as an extension of Schmidt's subspace theorem. In this note, we extend their result by replacing the divisors by closed subschemes.

    Submitted 3 March, 2021; originally announced March 2021.

    Comments: arXiv admin note: text overlap with arXiv:1608.05382

    MSC Class: 11J97

  45. arXiv:2010.01896  [pdf, ps, other

    math.NT

    On Pisot's $d$-th root conjecture for function fields and related GCD estimates

    Authors: Ji Guo, Chia-Liang Sun, Julie Tzu-Yueh Wang

    Abstract: We propose a function-field analog of Pisot's $d$-th root conjecture on linear recurrences, and prove it under some "non-triviality" assumption. Besides a recent result of Pasten-Wang on Büchi's $d$-th power problem, our main tool, which is also developed in this paper, is a function-field analog of a GCD estimate in a recent work of Levin and Levin-Wang. As an easy corollary of such GCD estima…

    Submitted 5 October, 2020; originally announced October 2020.

    MSC Class: 11D61; 14H05 and 11B37

  46. DeepSun: Machine-Learning-as-a-Service for Solar Flare Prediction

    Authors: Yasser Abduallah, Jason T. L. Wang, Yang Nie, Chang Liu, Haimin Wang

    Abstract: Solar flare prediction plays an important role in understanding and forecasting space weather. The main goal of the Helioseismic and Magnetic Imager (HMI), one of the instruments on NASA's Solar Dynamics Observatory, is to study the origin of solar variability and characterize the Sun's magnetic activity. HMI provides continuous full-disk observations of the solar vector magnetic field with high c…

    Submitted 3 September, 2020; originally announced September 2020.

    Comments: 8 pages, 6 figures

  47. arXiv:2008.12080  [pdf, other

    astro-ph.SR cs.LG

    Identifying and Tracking Solar Magnetic Flux Elements with Deep Learning

    Authors: Haodi Jiang, Jiasheng Wang, Chang Liu, Ju Jing, Hao Liu, Jason T. L. Wang, Haimin Wang

    Abstract: Deep learning has drawn a lot of interest in recent years due to its effectiveness in processing big and complex observational data gathered from diverse instruments. Here we propose a new deep learning method, called SolarUnet, to identify and track solar magnetic flux elements or features in observed vector magnetograms based on the Southwest Automatic Magnetic Identification Suite (SWAMIS). Our…

    Submitted 27 August, 2020; originally announced August 2020.

    Comments: 17 pages, 12 figures

    Journal ref: The Astrophysical Journal Supplement Series, 250:5 (13pp), 2020

  48. arXiv:2005.03945  [pdf, other

    astro-ph.SR cs.LG

    Inferring Vector Magnetic Fields from Stokes Profiles of GST/NIRIS Using a Convolutional Neural Network

    Authors: Hao Liu, Yan Xu, Jiasheng Wang, Ju Jing, Chang Liu, Jason T. L. Wang, Haimin Wang

    Abstract: We propose a new machine learning approach to Stokes inversion based on a convolutional neural network (CNN) and the Milne-Eddington (ME) method. The Stokes measurements used in this study were taken by the Near InfraRed Imaging Spectropolarimeter (NIRIS) on the 1.6 m Goode Solar Telescope (GST) at the Big Bear Solar Observatory. By learning the latent patterns in the training data prepared by the…

    Submitted 8 May, 2020; originally announced May 2020.

    Comments: 24 pages, 9 figures

    Journal ref: The Astrophysical Journal, 894:70, 2020

  49. arXiv:2004.10625  [pdf, ps, other

    math.NT

    Non-Archimedean analytic curves in the complements of hypersurface divisors

    Authors: Ta Thi Hoai An, J. T. -Y. Wang, P. -M. Wong

    Abstract: We study the degeneration dimension of non-archimedean analytic maps into the complement of hypersurface divisors of smooth projective varieties. We also show that there exist no non-archimedean analytic maps into $P^n\setminus\cup_{i= 1}^n D_i$ where $D_i, 1\le i\le n$, are hypersurfaces of degree at least 2 in general position and intersecting transversally. Moreover, we prove that there exist n…

    Submitted 22 April, 2020; originally announced April 2020.

    Journal ref: Journal of Number Theory 128 (8), 2275-2281, 2008

  50. arXiv:2004.10609  [pdf, ps, other

    math.CV

    Strong uniqueness polynomials: the complex case

    Authors: Ta Thi Hoai An, Julie T-Y Wang, Pit-Mann Wong

    Abstract: The theory of strong uniqueness polynomials, satisfying the separation condition (first introduced by Fujimoto \cite{Fuj1}), for complex meromorphic functions is quite complete. We construct examples of strong uniqueness polynomials which do not necessarily satisfy the separation condition by constructing regular 1-forms of Wronskian type, a method introduced in \cite{AWW}. We also use this method t…

    Submitted 22 April, 2020; originally announced April 2020.