Showing 1–33 of 33 results for author: Bagnall, A

Searching in archive cs.
  1. arXiv:2406.14231 [pdf, other]

    cs.LG

    aeon: a Python toolkit for learning from time series

    Authors: Matthew Middlehurst, Ali Ismail-Fawaz, Antoine Guillaume, Christopher Holder, David Guijo Rubio, Guzal Bulatova, Leonidas Tsaprounis, Lukasz Mentel, Martin Walter, Patrick Schäfer, Anthony Bagnall

    Abstract: aeon is a unified Python 3 library for all machine learning tasks involving time series. The package contains modules for time series forecasting, classification, extrinsic regression and clustering, as well as a variety of utilities, transformations and distance measures designed for time series data. aeon also has a number of experimental modules for tasks such as anomaly detection, similarity s…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: 10 pages

  2. arXiv:2306.10084 [pdf, other]

    cs.LG

    Convolutional and Deep Learning based techniques for Time Series Ordinal Classification

    Authors: Rafael Ayllón-Gavilán, David Guijo-Rubio, Pedro Antonio Gutiérrez, Anthony Bagnall, César Hervás-Martínez

    Abstract: Time Series Classification (TSC) covers the supervised learning problem where input data is provided in the form of series of values observed through repeated measurements over time, and whose objective is to predict the category to which they belong. When the class values are ordinal, classifiers that take this into account can perform better than nominal classifiers. Time Series Ordinal Classifi…

    Submitted 13 July, 2024; v1 submitted 16 June, 2023; originally announced June 2023.

    Comments: 12 pages, 9 figures, 2 tables

  3. arXiv:2305.01429 [pdf, other]

    cs.LG stat.ML

    Unsupervised Feature Based Algorithms for Time Series Extrinsic Regression

    Authors: David Guijo-Rubio, Matthew Middlehurst, Guilherme Arcencio, Diego Furtado Silva, Anthony Bagnall

    Abstract: Time Series Extrinsic Regression (TSER) involves using a set of training time series to form a predictive model of a continuous response variable that is not directly related to the regressor series. The TSER archive for comparing algorithms was released in 2022 with 19 problems. We increase the size of this archive to 63 problems and reproduce the previous comparison of baseline algorithms. We th…

    Submitted 2 May, 2023; originally announced May 2023.

    Comments: 19 pages, 21 figures, 6 tables. Appendix included

  4. Bake off redux: a review and experimental evaluation of recent time series classification algorithms

    Authors: Matthew Middlehurst, Patrick Schäfer, Anthony Bagnall

    Abstract: In 2017, a research paper compared 18 Time Series Classification (TSC) algorithms on 85 datasets from the University of California, Riverside (UCR) archive. This study, commonly referred to as a 'bake off', identified that only nine algorithms performed significantly better than the Dynamic Time Warping (DTW) and Rotation Forest benchmarks that were used. The study categorised each algorithm by th…

    Submitted 8 May, 2024; v1 submitted 25 April, 2023; originally announced April 2023.

  5. arXiv:2301.09802 [pdf, ps, other]

    cs.LO

    Inductive Reasoning for Coinductive Types

    Authors: Alexander Bagnall, Gordon Stewart, Anindya Banerjee

    Abstract: We present AlgCo (Algebraic Coinductives), a practical framework for inductive reasoning over commonly used coinductive types such as conats, streams, and infinitary trees with finite branching factor. The key idea is to exploit the notion of algebraic complete partial order from domain theory to define continuous operations over coinductive types via primitive recursion on "dense" collections o…

    Submitted 6 April, 2023; v1 submitted 23 January, 2023; originally announced January 2023.

  6. arXiv:2211.06747 [pdf, other]

    cs.PL

    Formally Verified Samplers From Probabilistic Programs With Loops and Conditioning

    Authors: Alexander Bagnall, Gordon Stewart, Anindya Banerjee

    Abstract: We present Zar: a formally verified compiler pipeline from discrete probabilistic programs with unbounded loops in the conditional probabilistic guarded command language (cpGCL) to proved-correct executable samplers in the random bit model. We exploit the key idea that all discrete probability distributions can be reduced to unbiased coin-flipping schemes. The compiler pipeline first translates a…

    Submitted 20 April, 2023; v1 submitted 12 November, 2022; originally announced November 2022.

  7. A Review and Evaluation of Elastic Distance Functions for Time Series Clustering

    Authors: Chris Holder, Matthew Middlehurst, Anthony Bagnall

    Abstract: Time series clustering is the act of grouping time series data without recourse to a label. Algorithms that cluster time series can be classified into two groups: those that employ a time series specific distance measure; and those that derive features from time series. Both approaches usually rely on traditional clustering algorithms such as $k$-means. Our focus is on distance based time series t…

    Submitted 26 April, 2023; v1 submitted 30 May, 2022; originally announced May 2022.

  8. arXiv:2201.12048 [pdf, other]

    cs.LG

    The FreshPRINCE: A Simple Transformation Based Pipeline Time Series Classifier

    Authors: Matthew Middlehurst, Anthony Bagnall

    Abstract: There have recently been significant advances in the accuracy of algorithms proposed for time series classification (TSC). However, a commonly asked question by real world practitioners and data scientists less familiar with the research topic, is whether the complexity of the algorithms considered state of the art is really necessary. Many times the first approach suggested is a simple pipeline o…

    Submitted 28 January, 2022; originally announced January 2022.

  9. The Temporal Dictionary Ensemble (TDE) Classifier for Time Series Classification

    Authors: Matthew Middlehurst, James Large, Gavin Cawley, Anthony Bagnall

    Abstract: Using bag of words representations of time series is a popular approach to time series classification. These algorithms involve approximating and discretising windows over a series to form words, then forming a count of words over a given dictionary. Classifiers are constructed on the resulting histograms of word counts. A 2017 evaluation of a range of time series classifiers found the bag of symb…

    Submitted 9 May, 2021; originally announced May 2021.

    Comments: arXiv admin note: text overlap with arXiv:1911.12008

    Journal ref: ECML PKDD 2020: Machine Learning and Knowledge Discovery in Databases, pages 660-676, 2020

  10. arXiv:2104.07551 [pdf, other]

    cs.LG

    HIVE-COTE 2.0: a new meta ensemble for time series classification

    Authors: Matthew Middlehurst, James Large, Michael Flynn, Jason Lines, Aaron Bostrom, Anthony Bagnall

    Abstract: The Hierarchical Vote Collective of Transformation-based Ensembles (HIVE-COTE) is a heterogeneous meta ensemble for time series classification. HIVE-COTE forms its ensemble from classifiers of multiple domains, including phase-independent shapelets, bag-of-words based dictionaries and phase-dependent intervals. Since it was first proposed in 2016, the algorithm has remained state of the art for ac…

    Submitted 15 April, 2021; originally announced April 2021.

  11. The Canonical Interval Forest (CIF) Classifier for Time Series Classification

    Authors: Matthew Middlehurst, James Large, Anthony Bagnall

    Abstract: Time series classification (TSC) is home to a number of algorithm groups that utilise different kinds of discriminatory patterns. One of these groups describes classifiers that predict using phase dependent intervals. The time series forest (TSF) classifier is one of the most well known interval methods, and has demonstrated strong performance as well as relative speed in training and predictions.…

    Submitted 20 August, 2020; originally announced August 2020.

    Journal ref: In proceedings of the IEEE International Conference on Big Data (Big Data), pages 188-195, 2020

  12. Benchmarking Multivariate Time Series Classification Algorithms

    Authors: Alejandro Pasos Ruiz, Michael Flynn, Anthony Bagnall

    Abstract: Time Series Classification (TSC) involves building predictive models for a discrete target variable from ordered, real valued, attributes. Over recent years, a new set of TSC algorithms have been developed which have made significant improvement over the previous state of the art. The main focus has been on univariate TSC, i.e. the problem where each case has a single series and a class label. In…

    Submitted 26 April, 2023; v1 submitted 26 July, 2020; originally announced July 2020.

    Comments: Data Min Knowl Disc (2020)

    Journal ref: The great multivariate time series classification bake off: a review and experimental evaluation of recent algorithmic advances, Data Mining and Knowledge Discovery volume 35, pages 401 to 449 (2021)

  13. arXiv:2005.02163 [pdf, other]

    cs.CV cs.LG stat.ML

    Detecting Electric Devices in 3D Images of Bags

    Authors: Anthony Bagnall, Paul Southam, James Large, Richard Harvey

    Abstract: The aviation and transport security industries face the challenge of screening high volumes of baggage for threats and contraband in the minimum time possible. Automation and semi-automation of this procedure offers the potential to increase security by detecting more threats and improve the customer experience by speeding up the process. Traditional 2D x-ray images are often extremely difficult t…

    Submitted 25 April, 2020; originally announced May 2020.

  14. A tale of two toolkits, report the third: on the usage and performance of HIVE-COTE v1.0

    Authors: Anthony Bagnall, Michael Flynn, James Large, Jason Lines, Matthew Middlehurst

    Abstract: The Hierarchical Vote Collective of Transformation-based Ensembles (HIVE-COTE) is a heterogeneous meta ensemble for time series classification. Since it was first proposed in 2016, the algorithm has undergone some minor changes and there is now a configurable, scalable and easy to use version available in two open source repositories. We present an overview of the latest stable HIVE-COTE, version…

    Submitted 26 April, 2023; v1 submitted 13 April, 2020; originally announced April 2020.

    Journal ref: On the Usage and Performance of the Hierarchical Vote Collective of Transformation-Based Ensembles Version 1.0 (HIVE-COTE v1.0), Lecture Notes in Computer Science book series (LNAI, volume 12588), 2020

  15. arXiv:1911.12008 [pdf, other]

    cs.LG stat.ML

    A tale of two toolkits, report the second: bake off redux. Chapter 1. dictionary based classifiers

    Authors: Anthony Bagnall, James Large, Matthew Middlehurst

    Abstract: Time series classification (TSC) is the problem of learning labels from time dependent data. One class of algorithms is derived from a bag of words approach. A window is run along a series, the subseries is shortened and discretised to form a word, then features are formed from the histogram of frequency of occurrence of words. We call this type of approach to TSC dictionary based classification.…

    Submitted 27 November, 2019; originally announced November 2019.

  16. arXiv:1909.07872 [pdf, ps, other]

    cs.LG stat.ML

    sktime: A Unified Interface for Machine Learning with Time Series

    Authors: Markus Löning, Anthony Bagnall, Sajaysurya Ganesh, Viktor Kazakov, Jason Lines, Franz J. Király

    Abstract: We present sktime -- a new scikit-learn compatible Python library with a unified interface for machine learning with time series. Time series data gives rise to various distinct but closely related learning tasks, such as forecasting and time series classification, many of which can be solved by reducing them to related simpler tasks. We discuss the main rationale for creating a unified interface,…

    Submitted 17 September, 2019; originally announced September 2019.

  17. arXiv:1909.05738 [pdf, other]

    cs.LG stat.ML

    A tale of two toolkits, report the first: benchmarking time series classification algorithms for correctness and efficiency

    Authors: Anthony Bagnall, Franz Király, Markus Löning, Matthew Middlehurst, George Oastler

    Abstract: sktime is an open source, Python based, sklearn compatible toolkit for time series analysis developed by researchers at the University of East Anglia (UEA), University College London and the Alan Turing Institute. A key initial goal for sktime was to provide time series classification functionality equivalent to that available in a related Java package, tsml, also developed at UEA. We describe the…

    Submitted 7 October, 2019; v1 submitted 12 September, 2019; originally announced September 2019.

  18. Scalable Dictionary Classifiers for Time Series Classification

    Authors: Matthew Middlehurst, William Vickers, Anthony Bagnall

    Abstract: Dictionary based classifiers are a family of algorithms for time series classification (TSC), that focus on capturing the frequency of pattern occurrences in a time series. The ensemble based Bag of Symbolic Fourier Approximation Symbols (BOSS) was found to be a top performing TSC algorithm in a recent evaluation, as well as the best performing dictionary based classifier. A recent addition to the…

    Submitted 26 July, 2019; originally announced July 2019.

    Journal ref: In proceedings of Intelligent Data Engineering and Automated Learning, pages 11-19. 2019

  19. arXiv:1811.00894 [pdf, other]

    cs.LG stat.ML

    Can automated smoothing significantly improve benchmark time series classification algorithms?

    Authors: James Large, Paul Southam, Anthony Bagnall

    Abstract: tl;dr: no, it cannot, at least not on average on the standard archive problems. We assess whether using six smoothing algorithms (moving average, exponential smoothing, Gaussian filter, Savitzky-Golay filter, Fourier approximation and a recursive median sieve) could be automatically applied to time series classification problems as a preprocessing step to improve the performance of three benchmark…

    Submitted 1 November, 2018; originally announced November 2018.

  20. arXiv:1811.00075 [pdf, other]

    cs.LG stat.ML

    The UEA multivariate time series classification archive, 2018

    Authors: Anthony Bagnall, Hoang Anh Dau, Jason Lines, Michael Flynn, James Large, Aaron Bostrom, Paul Southam, Eamonn Keogh

    Abstract: In 2002, the UCR time series classification archive was first released with sixteen datasets. It gradually expanded, until 2015 when it increased in size from 45 datasets to 85 datasets. In October 2018 more datasets were added, bringing the total to 128. The new archive contains a wide range of problems, including variable length series, but it still only contains univariate time series classific…

    Submitted 31 October, 2018; originally announced November 2018.

  21. arXiv:1810.07758 [pdf, other]

    cs.LG stat.ML

    The UCR Time Series Archive

    Authors: Hoang Anh Dau, Anthony Bagnall, Kaveh Kamgar, Chin-Chia Michael Yeh, Yan Zhu, Shaghayegh Gharghabi, Chotirat Ann Ratanamahatana, Eamonn Keogh

    Abstract: The UCR Time Series Archive, introduced in 2002, has become an important resource in the time series data mining community, with at least one thousand published papers making use of at least one data set from the archive. The original incarnation of the archive had sixteen data sets but since that time, it has gone through periodic expansions. The last expansion took place in the summer of 2015 w…

    Submitted 8 September, 2019; v1 submitted 17 October, 2018; originally announced October 2018.

  22. arXiv:1809.06751 [pdf, other]

    cs.LG stat.ML

    From BOP to BOSS and Beyond: Time Series Classification with Dictionary Based Classifiers

    Authors: James Large, Anthony Bagnall, Simon Malinowski, Romain Tavenard

    Abstract: A family of algorithms for time series classification (TSC) involve running a sliding window across each series, discretising the window to form a word, forming a histogram of word counts over the dictionary, then constructing a classifier on the histograms. A recent evaluation of two of this type of algorithm, Bag of Patterns (BOP) and Bag of Symbolic Fourier Approximation Symbols (BOSS) found a…

    Submitted 18 September, 2018; originally announced September 2018.

  23. arXiv:1809.06705 [pdf, other]

    cs.LG stat.ML

    Is rotation forest the best classifier for problems with continuous features?

    Authors: A. Bagnall, M. Flynn, J. Large, J. Lines, A. Bostrom, G. Cawley

    Abstract: In short, our experiments suggest that yes, on average, rotation forest is better than the most common alternatives when all the attributes are real-valued. Rotation forest is a tree based ensemble that performs transforms on subsets of attributes prior to constructing each tree. We present an empirical comparison of classifiers for problems with only real-valued features. We evaluate classifiers…

    Submitted 25 April, 2020; v1 submitted 18 September, 2018; originally announced September 2018.

  24. arXiv:1712.06428 [pdf, other]

    cs.LG

    A Shapelet Transform for Multivariate Time Series Classification

    Authors: Aaron Bostrom, Anthony Bagnall

    Abstract: Shapelets are phase independent subsequences designed for time series classification. We propose three adaptations to the Shapelet Transform (ST) to capture multivariate features in multivariate time series classification. We create a unified set of data to benchmark our work on, and compare with three other algorithms. We demonstrate that multivariate shapelets are not significantly worse than ot…

    Submitted 18 December, 2017; originally announced December 2017.

  25. arXiv:1712.04006 [pdf, ps, other]

    cs.LG cs.CR cs.CV

    Training Ensembles to Detect Adversarial Examples

    Authors: Alexander Bagnall, Razvan Bunescu, Gordon Stewart

    Abstract: We propose a new ensemble method for detecting and classifying adversarial examples generated by state-of-the-art attacks, including DeepFool and C&W. Our method works by training the members of an ensemble to have low classification error on random benign examples while simultaneously minimizing agreement on examples outside the training distribution. We evaluate on both MNIST and CIFAR-10, again…

    Submitted 11 December, 2017; originally announced December 2017.

  26. The Heterogeneous Ensembles of Standard Classification Algorithms (HESCA): the Whole is Greater than the Sum of its Parts

    Authors: James Large, Jason Lines, Anthony Bagnall

    Abstract: Building classification models is an intrinsically practical exercise that requires many design decisions prior to deployment. We aim to provide some guidance in this decision making process. Specifically, given a classification problem with real valued attributes, we consider which classifier or family of classifiers should one use. Strong contenders are tree based homogeneous ensembles, support…

    Submitted 25 October, 2017; originally announced October 2017.

    Journal ref: Data Min Knowl Disc 33, 1674-1709 (2019)

  27. arXiv:1703.09480 [pdf, other]

    cs.LG stat.ML

    Simulated Data Experiments for Time Series Classification Part 1: Accuracy Comparison with Default Settings

    Authors: Anthony Bagnall, Aaron Bostrom, James Large, Jason Lines

    Abstract: There is now a broad range of time series classification (TSC) algorithms designed to exploit different representations of the data. These have been evaluated on a range of problems hosted at the UCR-UEA TSC Archive (www.timeseriesclassification.com), and there have been extensive comparative studies. However, our understanding of why one algorithm outperforms another is still anecdotal at best.…

    Submitted 28 March, 2017; originally announced March 2017.

  28. arXiv:1703.06777 [pdf, other]

    cs.LG stat.ML

    On the Use of Default Parameter Settings in the Empirical Evaluation of Classification Algorithms

    Authors: Anthony Bagnall, Gavin C. Cawley

    Abstract: We demonstrate that, for a range of state-of-the-art machine learning algorithms, the differences in generalisation performance obtained using default parameter settings and using parameters tuned via cross-validation can be similar in magnitude to the differences in performance observed between state-of-the-art and uncompetitive learning systems. This means that fair and rigorous evaluation of ne…

    Submitted 20 March, 2017; originally announced March 2017.

  29. arXiv:1602.01711 [pdf, other]

    cs.LG

    The Great Time Series Classification Bake Off: An Experimental Evaluation of Recently Proposed Algorithms. Extended Version

    Authors: Anthony Bagnall, Aaron Bostrom, James Large, Jason Lines

    Abstract: In the last five years there have been a large number of new time series classification algorithms proposed in the literature. These algorithms have been evaluated on subsets of the 47 data sets in the University of California, Riverside time series classification archive. The archive has recently been expanded to 85 data sets, over half of which have been donated by researchers at the University…

    Submitted 4 February, 2016; originally announced February 2016.

  30. arXiv:1409.4936 [pdf, ps, other]

    cs.LG cs.AI stat.ML

    Ensembles of Random Sphere Cover Classifiers

    Authors: Anthony Bagnall, Reda Younsi

    Abstract: We propose and evaluate alternative ensemble schemes for a new instance based learning classifier, the Randomised Sphere Cover (RSC) classifier. RSC fuses instances into spheres, then bases classification on distance to spheres rather than distance to instances. The randomised nature of RSC makes it ideal for use in ensembles. We propose two ensemble methods tailored to the RSC classifier; $αβ$RSE…

    Submitted 17 September, 2014; originally announced September 2014.

  31. arXiv:1407.3685 [pdf, other]

    cs.LG cs.DB

    Finding Motif Sets in Time Series

    Authors: Anthony Bagnall, Jon Hills, Jason Lines

    Abstract: Time-series motifs are representative subsequences that occur frequently in a time series; a motif set is the set of subsequences deemed to be instances of a given motif. We focus on finding motif sets. Our motivation is to detect motif sets in household electricity-usage profiles, representing repeated patterns of household usage. We propose three algorithms for finding motif sets. Two are gree…

    Submitted 14 July, 2014; originally announced July 2014.

    Report number: CMPC14-03

  32. arXiv:1406.4781 [pdf, ps, other]

    cs.LG physics.med-ph

    Predictive Modelling of Bone Age through Classification and Regression of Bone Shapes

    Authors: Anthony Bagnall, Luke Davis

    Abstract: Bone age assessment is a task performed daily in hospitals worldwide. This involves a clinician estimating the age of a patient from a radiograph of the non-dominant hand. Our approach to automated bone age assessment is to modularise the algorithm into the following three stages: segment and verify hand outline; segment and verify bones; use the bone outlines to construct models of age. In this…

    Submitted 18 June, 2014; originally announced June 2014.

    Report number: CMPC14-02

  33. arXiv:1406.4757 [pdf, other]

    cs.LG

    An Experimental Evaluation of Nearest Neighbour Time Series Classification

    Authors: Anthony Bagnall, Jason Lines

    Abstract: Data mining research into time series classification (TSC) has focussed on alternative distance measures for nearest neighbour classifiers. It is standard practice to use 1-NN with Euclidean or dynamic time warping (DTW) distance as a straw man for comparison. As part of a wider investigation into elastic distance measures for TSC [lines14elastic], we perform a series of experiments to test w…

    Submitted 18 June, 2014; originally announced June 2014.

    Report number: CMP-C14-01