
Showing 1–5 of 5 results for author: Chignell, M

  1. arXiv:2405.07440  [pdf, other]

    cs.HC cs.CR cs.LG

    Maximizing Information Gain in Privacy-Aware Active Learning of Email Anomalies

    Authors: Mu-Huan Miles Chung, Sharon Li, Jaturong Kongmanee, Lu Wang, Yuhong Yang, Calvin Giang, Khilan Jerath, Abhay Raman, David Lie, Mark Chignell

    Abstract: Redacted emails satisfy most privacy requirements but they make it more difficult to detect anomalous emails that may be indicative of data exfiltration. In this paper we develop an enhanced method of Active Learning using an information gain maximizing heuristic, and we evaluate its effectiveness in a real world setting where only redacted versions of email could be labeled by human analysts due…

    Submitted 12 May, 2024; originally announced May 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2303.00870

  2. arXiv:2307.08782  [pdf, other]

    cs.LG

    Unsupervised Learning of Distributional Properties can Supplement Human Labeling and Increase Active Learning Efficiency in Anomaly Detection

    Authors: Jaturong Kongmanee, Mark Chignell, Khilan Jerath, Abhay Raman

    Abstract: Exfiltration of data via email is a serious cybersecurity threat for many organizations. Detecting data exfiltration (anomaly) patterns typically requires labeling, most often done by a human annotator, to reduce the high number of false alarms. Active Learning (AL) is a promising approach for labeling data efficiently, but it needs to choose an efficient order in which cases are to be labeled, an…

    Submitted 13 July, 2023; originally announced July 2023.

  3. arXiv:2303.00870  [pdf, other]

    cs.HC cs.CR cs.LG

    Implementing Active Learning in Cybersecurity: Detecting Anomalies in Redacted Emails

    Authors: Mu-Huan Chung, Lu Wang, Sharon Li, Yuhong Yang, Calvin Giang, Khilan Jerath, Abhay Raman, David Lie, Mark Chignell

    Abstract: Research on email anomaly detection has typically relied on specially prepared datasets that may not adequately reflect the type of data that occurs in industry settings. In our research, at a major financial services company, privacy concerns prevented inspection of the bodies of emails and attachment details (although subject headings and attachment filenames were available). This made labeling…

    Submitted 2 March, 2023; v1 submitted 1 March, 2023; originally announced March 2023.

  4. arXiv:2103.03436  [pdf, other]

    cs.LG

    MD-MTL: An Ensemble Med-Multi-Task Learning Package for DiseaseScores Prediction and Multi-Level Risk Factor Analysis

    Authors: Lu Wang, Haoyan Jiang, Mark Chignell

    Abstract: While many machine learning methods have been used for medical prediction and risk factor analysis on healthcare data, most prior research has involved single-task learning (STL) methods. However, healthcare research often involves multiple related tasks. For instance, implementation of disease scores prediction and risk factor analysis in multiple subgroups of patients simultaneously and risk fac…

    Submitted 4 March, 2021; originally announced March 2021.

    Comments: 14 pages, 8 figures

  5. arXiv:2011.04838  [pdf, other]

    cs.DB

    Answer Graph: Factorization Matters in Large Graphs

    Authors: Zahid Abul-Basher, Nikolay Yakovets, Parke Godfrey, Stanley Clark, Mark Chignell

    Abstract: Our answer-graph method to evaluate SPARQL conjunctive queries (CQs) finds a factorized answer set first, an answer graph, and then finds the embedding tuples from this. This approach can reduce greatly the cost to evaluate CQs. This affords a second advantage: we can construct a cost-based planner. We present the answer-graph approach, and overview our prototype system, Wireframe. We then offer p…

    Submitted 9 November, 2020; originally announced November 2020.