A workload model using the infinite source Poisson model for bursts is combined with the on-off model for within-burst activity. Burst durations and on-off durations are assumed to have heavy-tailed distributions with infinite variance and finite mean. Since the number of bursts is random, one can consider limiting results based on "random centering" of a random sum for the total workload from all sources. Convergence results are shown to depend on the tail indices of both the on-off duration and the lifetime distributions. Moreover, the results can be separated into cases depending on those tail indices. In one case where all distributions are heavy-tailed, it is shown that the limiting result is Brownian motion. In another case, convergence to fractional Brownian motion is shown, where the Hurst parameter depends on the heavy-tail indices of the distributions of the on, off and burst durations.
Models of infectious disease increasingly seek to incorporate heterogeneity of social interactions to more accurately characterise disease spread. We measured attributes of social encounters in two areas of Greater Melbourne, using a telephone survey. A market research company conducted computer-assisted telephone interviews (CATIs) of residents of the Boroondara and Hume local government areas (LGAs), which differ markedly in ethnic composition, age distribution and household socioeconomic status. Survey items included household demographic and socioeconomic characteristics, locations visited during the preceding day, and social encounters involving two-way conversation or physical contact. Descriptive summary measures were reported and compared using weight-adjusted Wald tests of group means. The overall response rate was 37.6%, higher in Boroondara [n = 650 (46%)] than Hume [n = 657 (32%)]. Survey conduct through the CATI format was challenging, with implications for representativeness and data quality. Marked heterogeneity of encounter profiles was observed across age groups and locations. Household settings afforded greatest opportunity for prolonged close contact, particularly between women and children. Young and middle-aged men reported more age-assortative mixing, often with non-household members. Preliminary comparisons between LGAs suggested that mixing occurred in different settings. In addition, gender differences in mixing with household and non-household members, including strangers, were observed by area. Survey administration by CATI was challenging, but rich data were obtained, revealing marked heterogeneity of social behaviour. Marked dissimilarities in patterns of prolonged close mixing were demonstrated by gender. In addition, preliminary observations of between-area differences in socialisation warrant further evaluation.
We compare two broad types of empirically grounded random network models in terms of their abilities to capture both network features and simulated Susceptible-Infected-Recovered (SIR) epidemic dynamics. The types of network models are exponential random graph models (ERGMs) and extensions of the configuration model. We use three kinds of empirical contact networks, chosen to provide both variety and realistic patterns of human contact: a highly clustered network, a bipartite network and a snowball-sampled network of a "hidden population". In the case of the snowball-sampled network we present a novel method for fitting an edge-triangle model. In our results, ERGMs consistently capture clustering as well as or better than configuration-type models, but the latter models better capture the node degree distribution. Despite the additional computational requirements to fit ERGMs to empirical networks, the use of ERGMs provides only a slight improvement in the ability of the mode...
We present statistical tests for the continuous martingale hypothesis, that is, for whether an observed process is a continuous local martingale, or equivalently a continuous time-changed Brownian motion. Our technique is based on the concept of the crossing tree. Simulation experiments are used to assess the power of the tests, which is generally higher than that of recently proposed tests using the estimated quadratic variation (i.e., realised volatility). In particular, the crossing tree shows significantly more power with shorter datasets. We then show results from applying the methodology to high-frequency currency exchange rate data. We show that in 2003, for the AUD-USD, GBP-USD, JPY-USD and EUR-USD rates, at small timescales (less than 15 minutes or so) the continuous martingale hypothesis is rejected, but not so at larger timescales. For 2003 EUR-GBP data, the hypothesis is rejected at small timescales and some moderate timescales, but not all.
In the branch of probability called "large deviations," rates of convergence (e.g. of the sample mean) are considered. The theory makes use of the moment generating function. So, particularly for sums of independent and identically distributed random variables, the theory can be made accessible to senior undergraduates after a first course in stochastic processes. This paper describes a directed independent study in large deviations offered to a strong senior, providing a sample outline and discussion of resources. Learning points are also highlighted.
In this paper we provide a framework for analyzing network traffic traces through trace-driven queueing. We also introduce several queueing metrics together with the associated visualization tools (some novel) that provide insight into the traffic features and facilitate comparisons between traces. Some techniques for non-stationary data are discussed. Applying our framework to both real and synthetic traces we (i) illustrate how to compare traces using trace-driven queueing, and (ii) show that traces that look “similar” under various statistical measures (such as the Hurst index) can exhibit rather different behavior under queueing simulation.
We present a new test for the “continuous martingale hypothesis”, that is, a test for the hypothesis that observed data are from a process which is a continuous local martingale. The basis of the test is an embedded random walk at first passage times, obtained from the well-known representation of a continuous local martingale as a continuous time-change of Brownian motion. With a variety of simulated diffusion processes our new test shows higher power than existing tests using either the crossing tree or the quadratic variation, including the situation where non-negligible drift is present. The power of the test in the presence of jumps is also explored with a variety of simulated jump diffusion processes. The test is also applied to two sequences of high-frequency foreign exchange trade-by-trade data. In both cases the continuous martingale hypothesis is rejected at time scales shorter than hourly, and we identify significant dependence in price movements at these small scales.
We present statistical tests for the continuous martingale hypothesis; that is, for whether an observed process is a continuous local martingale, or equivalently a continuous time-changed Brownian motion. Our technique is based on the concept of the crossing tree. Simulation experiments are used to assess the power of the tests, which is generally higher than that of recently proposed tests using the estimated quadratic variation (i.e. realized volatility). In particular, the crossing tree shows significantly higher power with shorter data sets. We then show results from applying the methodology to five high-frequency currency exchange rate data sets from 2003. For four of them we show that at small time-scales (less than 15 minutes or so) the continuous martingale hypothesis is rejected, but not so at larger time-scales. For the fifth, the hypothesis is rejected at small time-scales and at some moderate time-scales, but not all.
This study uses social network analysis to model a contact network of people who inject drugs (PWID) relevant for investigating the spread of an infectious disease (hepatitis C). Using snowball sample data, parameters for an exponential random graph model (ERGM) including social circuit dependence and four attributes (location, age, injecting frequency, gender) are estimated using a conditional estimation approach that respects the structure of snowball sample designs. Those network parameter estimates are then used to create a novel, model-dependent estimate of network size. Simulated PWID contact networks are created and compared with Bernoulli graphs. Location, age and injecting frequency are shown to be statistically significant attribute parameters in the ERGM. Simulated ERGM networks are shown to fit the collected data very well across a number of metrics. In comparison with Bernoulli graphs, simulated networks are shown to have longer paths and more clustering. Results from t...
The thesis proposes models for aggregate data network traffic which incorporate the additional randomness arising from the randomness in the number of data sources. A conditionally-Gaussian scale mixture process is shown to be a limit for the cumulative work from a random superposition of alternating on-off processes. Sub-Fractional Brownian Motion is shown to be the limit in a particular
After a first course in probability, what might be said about calculus that couldn’t be said before? Probability density functions must integrate to one. Can we use this to advantage? Under independence, the expected value of a product of random variables is the product of the expected values? How might we use this? Reinterpreting certain integrands as recognizable probability density functions, we show how calculating certain integrals can become quite easy. Building on this idea, we then introduce Monte Carlo integration and give some history of the method.
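As a rough illustration of the idea in the abstract above (not taken from the paper itself), the integral of x² e⁻ˣ over [0, ∞) can be read as E[X²] for X exponentially distributed with rate 1, which equals 2. The sketch below estimates it by Monte Carlo integration; the function name `mc_integrate`, the sample size and the seed are illustrative assumptions.

```python
import random


def mc_integrate(f, sampler, n, seed=0):
    """Estimate E[f(X)] by averaging f over n draws from sampler.

    f       -- function of one sample (the integrand divided by the density)
    sampler -- callable taking a random.Random and returning one draw of X
    n       -- number of samples (hypothetical default choices below)
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    total = 0.0
    for _ in range(n):
        total += f(sampler(rng))
    return total / n


# Integrand x^2 * exp(-x) = x^2 times the Exp(1) density,
# so the integral is E[X^2] with X ~ Exp(1); the exact value is 2.
estimate = mc_integrate(
    lambda x: x * x,
    lambda rng: rng.expovariate(1.0),
    200_000,
)
print(estimate)  # should be close to 2
```

The same helper works for any integrand that factors as a function times a density one can sample from; accuracy improves roughly as one over the square root of the sample size.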
Australian & New Zealand Journal of Statistics, 2011
We present statistical tests for the continuous martingale hypothesis; that is, for whether an observed process is a continuous local martingale, or equivalently a continuous time-changed Brownian motion. Our technique is based on the concept of the crossing tree. Simulation experiments are used to assess the power of the tests, which is generally higher than that of recently proposed tests using the estimated quadratic variation (i.e. realized volatility). In particular, the crossing tree shows significantly higher power with shorter data sets. We then show results from applying the methodology to five high-frequency currency exchange rate data sets from 2003. For four of them we show that at small time-scales (less than 15 minutes or so) the continuous martingale hypothesis is rejected, but not so at larger time-scales. For the fifth, the hypothesis is rejected at small time-scales and at some moderate time-scales, but not all.
Papers by David Rolls