
Marcus Frean

Network data constantly evolves with new network applications and protocols. There is a need for robust techniques to detect anomalous behaviour. Offline models trained with static data lose validity when new variants of traffic emerge. They require retraining, but the need for ground truth and lengthy training times make this challenging. Meanwhile, online models which detect outliers in streaming data are susceptible to the curse of dimensionality and natural variability. Today's anomalies may be tomorrow's new traffic, and existing methods do not provide a way to differentiate between them. We propose a framework that makes the most of both approaches: an offline deep learning model extracts features of normal traffic and provides a bias for an online outlier detection model to select data for training. The online model retains its previously learnt knowledge and retrains itself with new data. Online thresholds are updated in a drifting manner, and the Mann-Whitney U test is incorporated to prevent inaccurate updates. We analyse the resulting scores, develop heuristics to detect new traffic, and evaluate using three deep learning models and four outlier detection methods on the UNSW-NB15 and CTU-13 datasets. The framework improves upon any individual offline or online model in isolation.
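
As a rough illustration of the drifting-threshold idea mentioned above (not the authors' implementation), the sketch below updates an online anomaly threshold from a sliding window of scores, and only drifts it when a Mann-Whitney U test cannot distinguish the recent window from the reference window. The window size, quantile and significance level are arbitrary illustrative choices.

```python
# Sketch: drift-aware threshold update guarded by a Mann-Whitney U test.
# Assumes anomaly scores arrive as a stream; all constants are illustrative.
from collections import deque
import numpy as np
from scipy.stats import mannwhitneyu

WINDOW = 200          # size of the sliding score windows
QUANTILE = 0.99       # threshold = this quantile of "normal" scores
ALPHA = 0.01          # significance level for the distribution-shift test

reference = deque(maxlen=WINDOW)   # scores the current threshold is based on
recent = deque(maxlen=WINDOW)      # most recent scores

def update_threshold(score, threshold):
    """Process one anomaly score and return the (possibly updated) threshold."""
    recent.append(score)
    if len(reference) < WINDOW:                      # warm-up phase
        reference.append(score)
        return np.quantile(reference, QUANTILE)
    if len(recent) == WINDOW:
        # Only drift the threshold if the recent scores look like a plausible
        # sample from the same distribution as the reference scores.
        _, p = mannwhitneyu(reference, recent, alternative="two-sided")
        if p > ALPHA:
            reference.extend(recent)                 # adopt the recent window
            recent.clear()
            threshold = np.quantile(reference, QUANTILE)
    return threshold
```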
Internet Service Providers need to deploy and maintain many wireless sites in isolated or inaccessible terrain to provide Internet connectivity to rural communities. Addressing failures at such sites can be very expensive, both in identifying the fault and in repairing or rectifying it. Data monitoring can be useful for spotting anomalies and predicting a fault (and possibly pre-empting it altogether), or for locating and isolating it quickly once it causes an issue for the network. There might be hundreds of variables to be monitored in principle, but only a few of significance for detecting faults. Here, in a case study involving a Wireless Internet Service Provider (WISP) in a rural area, we first illustrate a bottom-up approach to identifying variables likely to be of use in an automatic anomaly detector. For the purpose of this study, the detector consists of an autoencoder neural network with weights optimized by machine learning (ML). We then show how the cause of an anomaly can be derived from indirect measurements, and use the model to learn relationships between certain variables.
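
A minimal sketch of the kind of autoencoder-based detector the abstract describes, assuming the monitoring data are numeric feature vectors. The architecture, the stand-in telemetry, and the threshold rule are illustrative only, not the study's choices.

```python
# Sketch: reconstruction-error anomaly detector for site-monitoring metrics.
# An MLPRegressor trained to reproduce its input acts as a small autoencoder.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X_train = rng.normal(size=(1000, 12))          # stand-in for "normal" telemetry

scaler = StandardScaler().fit(X_train)
Xs = scaler.transform(X_train)

autoencoder = MLPRegressor(hidden_layer_sizes=(8, 3, 8),
                           activation="tanh", max_iter=2000, random_state=0)
autoencoder.fit(Xs, Xs)                        # learn to reconstruct normal data

train_err = np.mean((autoencoder.predict(Xs) - Xs) ** 2, axis=1)
threshold = np.quantile(train_err, 0.995)      # flag the worst 0.5% as anomalous

def is_anomalous(x):
    """Return True if a new telemetry vector reconstructs poorly."""
    xs = scaler.transform(x.reshape(1, -1))
    err = np.mean((autoencoder.predict(xs) - xs) ** 2)
    return err > threshold
```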
Helping strangers at a cost to oneself is a hallmark of many human interactions, but difficult to justify from the viewpoint of natural selection, particularly in anonymous one-shot interactions. Reputational scoring can provide the necessary motivation via “indirect reciprocity,” but maintaining reliable scores requires close oversight to prevent cheating. We show that in the absence of such supervision, it is possible that scores might be managed by mutual consent between the agents themselves instead of by third parties. The space of possible strategies for such “consented” score changes is very large but, using a simple cooperation game, we search it, asking what kinds of agreement can i) invade a population from rare and ii) resist invasion once common. We prove mathematically and demonstrate computationally that score mediation by mutual consent does enable cooperation without oversight. Moreover, the most invasive and stable strategies belong to one family and ground the conc...
Particle filtering provides an approximate representation of a tracked posterior density which converges asymptotically to the true posterior as the number of particles used increases. The greater the number of particles, the higher the computational complexity; this load can be handled by implementing the particle filter on parallel architectures. However, the resampling step in the particle filter requires a high level of synchronization and extensive information interchange between the particles, which impedes the use of parallel hardware systems. This paper establishes a new perspective for understanding particle filtering: particle filtering can be achieved by adopting the principles of information exchange within a network, the nodes of which are now the particles in the particle filter. We propose to connect particles via a minimally connected network and resample each particle locally. This strategy facilitates full information exchange among the particles, but with each particle communicating with only a small fixed set of other particles, thus leading to minimal communication overhead. The key benefit is that this approach facilitates the use of many particles for accurate posterior approximation and tracking accuracy.
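
The following toy illustrates local resampling, not the paper's algorithm: each particle is connected to a small fixed neighbourhood on a ring, and resampling for each particle draws only from that neighbourhood, so no global normalisation or all-to-all communication is required. The ring topology and neighbourhood size are arbitrary.

```python
# Sketch: local resampling on a fixed ring network of particles.
import numpy as np

rng = np.random.default_rng(1)

def local_resample(particles, weights, k=2):
    """Resample each particle from {itself} plus k neighbours on each side."""
    n = len(particles)
    new_particles = np.empty_like(particles)
    for i in range(n):
        neighbourhood = [(i + d) % n for d in range(-k, k + 1)]
        w = weights[neighbourhood]
        w = w / w.sum()                       # normalise locally, not globally
        choice = rng.choice(neighbourhood, p=w)
        new_particles[i] = particles[choice]
    return new_particles, np.full(n, 1.0 / n)

# Usage: after the weight-update step of an ordinary particle filter,
# particles, weights = local_resample(particles, weights)
```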
Identifying, contacting and engaging missing shareholders constitutes an enormous challenge for Māori incorporations, iwi and hapū across Aotearoa New Zealand. Without accurate data or tools to harmonise existing fragmented or conflicting data sources, issues around land succession, opportunities for economic development, and maintenance of whānau relationships are all negatively impacted. This unique three-way research collaboration between Victoria University of Wellington (VUW), Parininihi ki Waitotara Incorporation (PKW), and University of Auckland funded by the National Science Challenge | Science for Technological Innovation catalyses innovation through new digital humanities-inflected data science modelling and analytics with the kaupapa of reconnecting missing Māori shareholders for a prosperous economic, cultural, and socially revitalised future. This paper provides an overview of VUW's culturally-embedded social network approach to the project, discusses the challenge...
Multiple problems in robotics, vision, and graphics can be considered as optimization problems, in which the loss surface can be evaluated only at a collection of sample locations and the problem is regularized with an implicit or explicit prior. In some problems, however, samples are expensive to obtain. This motivates consideration of sample-efficient optimization. A successful approach has been to choose new points to evaluate by considering a distribution over plausible surfaces conditioned on all previous points and their evaluations. To do this the distribution must be updated as each new evaluation is acquired, which has motivated the development of Bayesian methods that update a prior to a posterior over functions. By far, the most common prior distribution in use for this application is the Gaussian process. Here, we consider another family of priors, namely Bayesian Neural Networks. We argue that these exhibit strengths that are different or complementary to Gaussian processes and show that they are competitive or superior on a wide range of test optimization problems.
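
A compressed sketch of the sample-efficient optimisation loop described above. Since the abstract's focus is on Bayesian neural network surrogates, an ensemble of small networks is used here as a crude stand-in for a distribution over plausible surfaces; the toy objective, ensemble size and lower-confidence-bound acquisition rule are illustrative assumptions, not the paper's method.

```python
# Sketch: sample-efficient optimisation with a small-network ensemble surrogate.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
objective = lambda x: np.sin(3 * x) + 0.1 * x ** 2      # expensive in practice

X = rng.uniform(-3, 3, size=(4, 1))                     # a few initial samples
y = objective(X).ravel()
candidates = np.linspace(-3, 3, 400).reshape(-1, 1)

for step in range(20):
    # "Posterior" over surfaces: an ensemble of independently trained nets.
    preds = []
    for seed in range(5):
        net = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000,
                           random_state=seed).fit(X, y)
        preds.append(net.predict(candidates))
    preds = np.stack(preds)
    mean, std = preds.mean(axis=0), preds.std(axis=0)
    # Acquisition: lower confidence bound (we are minimising).
    x_next = candidates[np.argmin(mean - 2.0 * std)].reshape(1, -1)
    X = np.vstack([X, x_next])
    y = np.append(y, objective(x_next).ravel())

print("best x found:", X[np.argmin(y)], "value:", y.min())
```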
One of the most important biological theories of religion is also the most controversial. Here we describe and partially defend David Sloan Wilson’s group selectionist model. According to Wilson, religions are best explained as ‘superorganisms’ adapted to succeed in competition against others. The evolutionary history of religion is a battle of these titans.
Different approaches to detecting objects have used either perceptually nonuniform or human-perception-based colour spaces to mimic human vision in machines. In this paper, we compare perceptually nonuniform, RGB-derived opponencies with the HSV colour space. The main motivation for this particular comparison is to improve the quality of saliency detection in challenging situations such as lighting change. Here, Precision-Recall curves are used to compare the colour spaces using bottom-up pyramidal and top-down non-pyramidal saliency models. Our study concludes that combining the Saturation-Value or Hue-Value channels improves the detection of salient objects and gives a higher Precision-Recall curve. We also show that the RGB colour space gives a low precision score as it detects the object using the colour information present in an image.
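
A minimal sketch of this kind of comparison setup, assuming a float RGB image in [0, 1] and a binary ground-truth mask. The "saliency" here is only a toy Saturation-Value channel combination, not the pyramidal models evaluated in the paper; the PR computation uses standard library routines.

```python
# Sketch: scoring a simple Saturation-Value saliency proxy with a PR curve.
import numpy as np
from matplotlib.colors import rgb_to_hsv
from sklearn.metrics import precision_recall_curve

def sv_saliency(rgb_image):
    """Toy saliency map: product of the HSV Saturation and Value channels."""
    hsv = rgb_to_hsv(rgb_image)              # expects RGB floats in [0, 1]
    return hsv[..., 1] * hsv[..., 2]

def pr_curve(saliency_map, ground_truth_mask):
    """Precision-recall curve of a real-valued map against a binary mask."""
    precision, recall, _ = precision_recall_curve(
        ground_truth_mask.ravel().astype(int), saliency_map.ravel())
    return precision, recall
```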
Humans invest in fantastic stories -- mythologies. Recent evolutionary theories suggest that cultural selection may favour moralising stories that motivate prosocial behaviours. A key challenge is to explain the emergence of mythologies that lack explicit moral exemplars or directives. Here, we resolve this puzzle with an evolutionary model in which arbitrary mythologies transform a collection of egoistic individuals into a cooperative. Importantly, in finite populations of sizes comparable to those of contemporary hunter-gatherer groups, the model is robust to the cognitive costs of adopting fictions. This approach resolves a fundamental problem across the human sciences by explaining the evolution of otherwise puzzling amoral, nonsensical, and fictional narratives as exquisitely functional coordination devices.
When completed, the Square Kilometre Array (SKA) will feature an unprecedented rate of image generation. While previous generations of telescopes have relied on human expertise to extract scientifically interesting information from the images, the sheer volume of data will now make this impractical. Additionally, the rate at which data are accrued will not allow traditional imaging products to be stored indefinitely for later inspection, meaning there is a strong imperative to discard uninteresting data in pseudo-real time. Here we outline components of the SKA science analysis pipeline being developed to produce a series of data products including continuum images, spectral cubes and Faraday depth spectra. We discuss a scheme to automatically extract value from these products and discard scientifically uninteresting data. This pipeline is thus expected to give both an increase in scientific productivity, and offers the possibility of reduced data archive size producing a con...
The simplest method takes part of the available data and sets it aside. Once the network has been trained on the remaining data it can be 'validated' by seeing how well it performs on the withheld data, thus giving an estimate of how well it will generalize. This estimate won't be ...
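
A minimal sketch of the hold-out procedure this excerpt describes, using scikit-learn conventions; the dataset, model and split fraction are placeholders.

```python
# Sketch: hold-out validation -- set part of the data aside, train on the rest,
# and use performance on the withheld part as a generalisation estimate.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))                    # placeholder data
y = (X[:, 0] + X[:, 1] > 0).astype(int)           # placeholder labels

X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2,
                                                  random_state=0)
net = MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000,
                    random_state=0).fit(X_train, y_train)
print("estimated generalisation accuracy:", net.score(X_val, y_val))
```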
... randomly generated population, each iteration (generation) of the EA involves the application of a set of genetic operators (e.g., selection, recombination and mutation) to evolve the population for the following generation. In this way an EA explores ...
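
A skeletal evolutionary-algorithm generation loop of the kind this excerpt outlines. The bit-string representation, tournament selection, one-point crossover and mutation rate are generic choices, not anything specific to the cited work.

```python
# Sketch: one generic EA loop (selection, recombination, mutation) on bit strings.
import numpy as np

rng = np.random.default_rng(0)
POP, GENES, GENERATIONS, MUT = 50, 20, 100, 0.02

fitness = lambda pop: pop.sum(axis=1)                # toy objective: count of ones
population = rng.integers(0, 2, size=(POP, GENES))   # randomly generated population

for generation in range(GENERATIONS):
    fit = fitness(population)
    # Selection: binary tournaments.
    a, b = rng.integers(0, POP, size=(2, POP))
    parents = population[np.where(fit[a] >= fit[b], a, b)]
    # Recombination: one-point crossover between consecutive parents.
    cut = rng.integers(1, GENES, size=POP)
    mates = np.roll(parents, 1, axis=0)
    mask = np.arange(GENES) < cut[:, None]
    offspring = np.where(mask, parents, mates)
    # Mutation: flip each bit with small probability.
    offspring ^= (rng.random(offspring.shape) < MUT).astype(offspring.dtype)
    population = offspring

print("best fitness:", fitness(population).max())
```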
Strangers routinely cooperate and exchange goods without any knowledge of one another in one-off encounters without recourse to a third party, an interaction that is fundamental to most human societies. However, this act of reciprocal exchange entails the risk of the other agent defecting with both goods. We examine the choreography for safe exchange between strangers, and identify the minimum requirement, which is a shared hold, either of an object, or the other party; we show that competing agents will settle on exchange as a local optimum in the space of payoffs. Truly safe exchanges are rarely seen in practice, even though unsafe exchange could mean that risk-averse agents might avoid such interactions. We show that an ‘implicit’ hold, whereby an actor believes that they could establish a hold if the other agent looked to be defecting, is sufficient to enable the simple swaps that are the hallmark of human interactions and presumably provide an acceptable trade-off between risk ...
A long-standing obstacle to progress in deep learning is the problem of vanishing and exploding gradients. Although the problem has largely been overcome via carefully constructed initializations and batch normalization, architectures incorporating skip-connections such as highway networks and resnets perform much better than standard feedforward architectures despite well-chosen initialization and batch normalization. In this paper, we identify the shattered gradients problem. Specifically, we show that the correlation between gradients in standard feedforward networks decays exponentially with depth, resulting in gradients that resemble white noise, whereas the gradients in architectures with skip-connections are far more resistant to shattering, decaying sublinearly. Detailed empirical evidence is presented in support of the analysis, on both fully-connected networks and convnets. Finally, we present a new "looks linear" (LL) initialization that prevents shattering,...
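
A small numpy experiment loosely in the spirit of the analysis described above, not a reproduction of it: it measures how the cosine similarity between input gradients at two nearby points falls off as a plain ReLU network gets deeper. Widths, weight scales and the perturbation size are arbitrary assumptions.

```python
# Sketch: gradient similarity at nearby inputs vs. depth in a plain ReLU net.
# He-style weight scale; all sizes are arbitrary.
import numpy as np

rng = np.random.default_rng(0)
WIDTH = 200

def input_gradient(x, hidden_weights, w_out):
    """Gradient of the scalar output of a deep ReLU net w.r.t. its input."""
    a, masks = x, []
    for W in hidden_weights:
        z = W @ a
        masks.append((z > 0).astype(float))
        a = np.maximum(z, 0.0)
    grad = w_out
    for W, m in zip(reversed(hidden_weights), reversed(masks)):
        grad = (grad * m) @ W          # backprop through relu(W a)
    return grad

for depth in (1, 5, 10, 25, 50):
    sims = []
    for trial in range(20):
        hidden = [rng.normal(scale=np.sqrt(2.0 / WIDTH), size=(WIDTH, WIDTH))
                  for _ in range(depth)]
        w_out = rng.normal(scale=np.sqrt(1.0 / WIDTH), size=WIDTH)
        x = rng.normal(size=WIDTH)
        g1 = input_gradient(x, hidden, w_out)
        g2 = input_gradient(x + 0.05 * rng.normal(size=WIDTH), hidden, w_out)
        sims.append(g1 @ g2 / (np.linalg.norm(g1) * np.linalg.norm(g2) + 1e-12))
    print(f"depth {depth:3d}: mean gradient cosine similarity {np.mean(sims):.3f}")
```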
We present a PHD filtering approach to estimate the state of an unknown number of persons in a video sequence. Persons are represented by moving blobs, which are tracked across different frames using a first-order moment approximation to the posterior density. The PHD filter is a good alternative to standard multi-target tracking algorithms, since it avoids making explicit associations between measurements and person locations. The recursive method has linear complexity in the number of targets, so it also has the potential benefit of scaling well with a large number of persons being tracked. The PHD filter achieves interesting results for the multiple-person tracking problem, albeit discarding useful information from higher-order interactions. Nevertheless, a backward state-space representation using PHD smoothing can be used to refine the filtered estimates. In this paper, we present two smoothing strategies for improving PHD filter estimates in multiple persons tracki...
In an evolving population, network structure can have striking effects on the survival probability of a mutant allele and on the rate at which it spreads. In networks with ‘hubs’ (representing geographic or other constraints), the heightened probability of an initially rare mutant has led to the prediction that such networks act to amplify the effects of selection over drift. But selection and mutation interact in a subtle way in such populations: hubs also slow the mutant’s rate of invasion, so that if multiple mutants are allowed to spread at the same time, more of them may be present. In other words it might be misleading to consider only the fixation probability, because new mutants spread at different rates in these networks. Instead of following a single mutation to fixation, we give a very simple model that allows for a stream of mutations, leading to a dynamic equilibrium. In this way we take account of ongoing evolution rather than simply following a single mutant to fixat...
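
A toy Moran-style birth-death simulation on a hub ("star") network with an ongoing stream of new mutants, meant only to illustrate the kind of dynamic equilibrium discussed above; the graph, fitness advantage and mutation rate are arbitrary assumptions, not the paper's model.

```python
# Sketch: birth-death dynamics on a star network with a stream of mutations,
# tracking the mutant fraction rather than single fixations.
import numpy as np

rng = np.random.default_rng(0)
N, STEPS, MU, S = 50, 100_000, 1e-4, 0.05    # size, steps, mutation rate, advantage

# Star graph: node 0 is the hub, connected to all leaves; leaves see only the hub.
neighbours = [list(range(1, N))] + [[0]] * (N - 1)

mutant = np.zeros(N, dtype=bool)
fractions = []
for t in range(STEPS):
    if rng.random() < MU:                    # ongoing stream of new mutations
        mutant[rng.integers(N)] = True
    fitness = np.where(mutant, 1.0 + S, 1.0)
    parent = rng.choice(N, p=fitness / fitness.sum())   # reproduce prop. to fitness
    target = rng.choice(neighbours[parent])              # offspring replaces a neighbour
    mutant[target] = mutant[parent]
    fractions.append(mutant.mean())

print("long-run mean mutant fraction:", np.mean(fractions[STEPS // 2:]))
```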
Background: Attention-Deficit/Hyperactivity Disorder (ADHD) is a neurodevelopmental condition characterized by disturbances of executive function (EF) dynamics. Notwithstanding current advances in translational neuroscience, no objective, clinically useful diagnostic marker for ADHD is available to date. Objectives: Using a customized definition of EF and a new clinical paradigm, we performed a prospective diagnostic accuracy trial to assess the diagnostic value of several fractal measures from the thinking processes or inferences in a cohort of ADHD children and typically developing controls. Method: We included children from age five to twelve diagnosed with a reference standard based on case history, physical and neurological examination, Conners 3rd Edition, and DSM-V™. The index test consisted of a computer-based inference task with a set of eight different instances of the “Battleships” game to be solved. A consecutive series of 18 cases and 18 controls (n = 36) recruited at the primary pa...
In this paper, we explore the concept of sequential learning and the efficacy of global and local neural network learning algorithms on a sequential learning task. Pseudorehearsal, a method developed by Robins [19] to solve the catastrophic forgetting problem which arises from the excessive plasticity of neural networks, is significantly more effective than other local learning algorithms for the sequential task. We further consider the concept of local learning and suggest that pseudorehearsal is so effective because it works directly at the level of the learned function, and not indirectly on the representation of the function within the network. We also briefly explore the effect of local learning on generalization within the task.
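
A minimal sketch of the pseudorehearsal idea with a small regression network: before training on a new task, random "pseudo-items" are passed through the existing network to record its current behaviour, and those pairs are mixed into the new training data. The network, item counts and the two toy tasks are placeholders, not the cited experiments.

```python
# Sketch: pseudorehearsal for sequential learning with a small MLP regressor.
# Pseudo-items are random inputs labelled by the *current* network, so the old
# function is rehearsed alongside the new task without storing old data.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
net = MLPRegressor(hidden_layer_sizes=(32,), max_iter=3000,
                   warm_start=True, random_state=0)

# Task A: learn sin(x) on the left half of the interval.
xa = rng.uniform(-3, 0, size=(200, 1))
net.fit(xa, np.sin(xa).ravel())

# Generate pseudo-items capturing the network's current input-output function.
x_pseudo = rng.uniform(-3, 3, size=(200, 1))
y_pseudo = net.predict(x_pseudo)

# Task B: new data on the right half, mixed with the pseudo-items before
# further training, so behaviour on Task A is (approximately) retained.
xb = rng.uniform(0, 3, size=(200, 1))
X_mix = np.vstack([xb, x_pseudo])
y_mix = np.concatenate([np.sin(xb).ravel(), y_pseudo])
net.fit(X_mix, y_mix)

print("task A error after task B:",
      np.mean((net.predict(xa) - np.sin(xa).ravel()) ** 2))
```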
The particle filter approximation to the posterior density converges to the true posterior as the number of particles used increases. The greater the number of particles, the higher the computational load, which can be handled by implementing the particle filter on parallel architectures. However, the resampling stage in the particle filter requires synchronisation, extensive interchange and routing of particle information, and thus impedes the use of parallel hardware systems. This paper presents a novel resampling technique using a fixed random network. This idea relaxes the synchronisation constraints and significantly reduces particle interaction. Using simulations we demonstrate the validity of our technique to track targets in linear and non-linear sensing scenarios.
Gaussian processes compare favourably with backpropagation neural networks as a tool for regression, and Bayesian neural networks have Gaussian process behaviour when the number of hidden neurons tends to infinity. We describe a simple recurrent neural network with connection weights trained by one-shot Hebbian learning. This network amounts to a dynamical system which relaxes to a stable state in which it generates predictions identical to those of Gaussian process regression. In effect an infinite number of hidden units in a feed-forward architecture can be replaced by a merely finite number, together with recurrent connections.
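
For reference, the standard Gaussian process regression prediction that the recurrent network described above is said to reproduce at its fixed point; the RBF kernel, noise level and placeholder data are generic choices rather than the paper's settings.

```python
# Sketch: GP regression posterior mean and variance in standard form.
import numpy as np

def rbf(A, B, lengthscale=1.0):
    """Squared-exponential kernel between two sets of points."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale ** 2)

def gp_predict(X_train, y_train, X_test, noise=0.1):
    K = rbf(X_train, X_train) + noise ** 2 * np.eye(len(X_train))
    Ks = rbf(X_test, X_train)
    Kss = rbf(X_test, X_test)
    mean = Ks @ np.linalg.solve(K, y_train)
    cov = Kss - Ks @ np.linalg.solve(K, Ks.T)
    return mean, np.diag(cov)

# Usage with placeholder data:
X = np.linspace(-3, 3, 20).reshape(-1, 1)
y = np.sin(X).ravel() + 0.1 * np.random.default_rng(0).normal(size=20)
mu, var = gp_predict(X, y, np.linspace(-3, 3, 100).reshape(-1, 1))
```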
