Intelligent techniques, including artificial intelligence and deep learning, normally operate on complete data without missing values. Multiple imputation is indispensable for addressing missing data, producing unbiased estimates and handling uncertainty to yield more valid results. Most state-of-the-art techniques focus on moderate missing rates (around 50%-60%) and short missing gaps, while imputation for high missing gaps and high missing rates remains an important challenge for multivariate time-series data generated through the Internet of Things (IoT). Hence, we propose a Lightweight Window Portion-based Multiple Imputation (LWPMI) using correlation, data fusion, regression, and multiple imputation. We conduct extensive experiments by generating high missing gaps and missing rates ranging from 10% to 90% on sensor-generated datasets with different characteristics (highly correlated, weakly correlated, and a mixture of highly and weakly correlated data). All the obtained results show that LWPMI outperforms baseline techniques in preserving pattern, structure, and trend, even at missing gaps and missing rates as high as 90%.
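As a hedged illustration of the regression-based multiple-imputation idea underlying this line of work (this is not the LWPMI algorithm itself; its window-portion, correlation, and fusion steps are omitted), the sketch below uses scikit-learn's IterativeImputer to produce several stochastic regression imputations of a multivariate series with a long missing gap and pools them:

```python
# Minimal sketch of regression-based multiple imputation on a multivariate
# toy series. Illustrates the general idea only; it is NOT the LWPMI
# algorithm (window portions and correlation-based fusion are omitted).
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3)).cumsum(axis=0)      # toy multivariate series
X_missing = X.copy()
X_missing[60:120, 0] = np.nan                     # a long missing gap

# Multiple imputation: several stochastic regression imputations, pooled.
imputations = [
    IterativeImputer(sample_posterior=True, random_state=s).fit_transform(X_missing)
    for s in range(5)
]
X_imputed = np.mean(imputations, axis=0)          # pooled estimate
print(X_imputed[60:65, 0])
```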
Purpose: This study evaluates Occupational Rehabilitation (OR) initiatives with respect to return to work (RTW) and sustaining at work following work-related injuries. It also identifies the predictors of, and predicts the likelihoods of, RTW and sustainability for OR users. Methods: The study is conducted on compensation claim data for people injured at work in the state of Victoria, Australia. Claims which commenced OR services between 1 July 2012 and the end of June 2015 are included. Claims which used original employer services (OES) are separated from claims which used new employer services (NES). We investigated a range of predictors categorised into four groups: claimant, injury, and employment characteristics, and claim management. RTW and sustaining at work are the outcomes of interest. To evaluate the predictors, we use the Chi-squared test and logistic regression modelling. We also prioritised the predictors using the Akaike Information Criterion (AIC) and cross-validation error. Four predictive models are developed using significant predictors for OES and NES users to predict RTW and sustainability. We examined the multicollinearity of the developed models using the Variance Inflation Factor (VIF). Results: About 75% of OES users achieved RTW and 60% were sustained at work, whilst only approximately 30% of NES users were placed with a new employer and 25% were sustained at work. The predictors most associated with OES and NES outcomes are the use of psychiatric services and age group, respectively. We found that having a mental condition is an important indicator for allocating injured workers into OES or NES initiatives. Our study shows that injured workers with mental health issues do not always have a lower RTW rate; they simply need special consideration. Conclusion: Understanding the predictors of RTW and sustainability helps to develop interventions that ensure sustained RTW. This study will assist decision makers to improve the design and implementation of OR services and tailor services to clients' needs.
Introduction: Injuries and illnesses that occur at work impose substantial personal, social and economic burdens on society [1]. These injuries may lead to disability, morbidity or even mortality. Many injured workers may have long-term healthcare issues; as such, return to work (RTW) for these workers becomes quite complicated. The longer injured workers are away from work, the lower the likelihood of a successful RTW [2]. Various physical, psychological and social factors influence the process of RTW. Five stages have been identified for an injured worker to be ready for RTW in the 'Readiness for Return to Work' (RRTW) model. These stages, which are aligned with the stages of change, are precontemplation, contemplation, preparation for action, action and maintenance [3, 4]. In the precontemplation stage, the injured worker has not started to think about RTW because recovery is the priority. In the contemplation stage, as the injured worker recovers, they begin to consider RTW but are not yet engaged in any practical plans for it. The preparation for action stage involves finding information regarding RTW, evaluating the capability for RTW, making plans and getting involved in assistive RTW programs.
In the action stage, the injured worker converts the plan into action and returns to work with varying levels of capacity. The goal of the maintenance stage is to retain returned workers at work. To reach this goal, strategies such as receiving support from assistive programs, increasing the workload gradually, applying specific safety policies and strengthening exercises can be considered [5].
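To make the analysis pattern described in the Methods above concrete (Chi-squared testing of a candidate predictor, a logistic regression for the RTW outcome, and AIC for model comparison), here is a minimal sketch; the column names are hypothetical placeholders, not the study's actual variables:

```python
# Sketch of the described analysis pattern: a Chi-squared test of a
# candidate predictor, a logistic regression for the RTW outcome, and AIC
# for ranking candidate models. Column names are hypothetical placeholders.
import pandas as pd
import statsmodels.api as sm
from scipy.stats import chi2_contingency

df = pd.DataFrame({
    "rtw":           [1, 0, 1, 1, 0, 1, 0, 1],
    "age_group":     [0, 1, 0, 1, 1, 0, 1, 0],
    "psych_service": [0, 1, 0, 0, 1, 0, 1, 0],
})

# Chi-squared test of association between a predictor and the outcome.
chi2, p, _, _ = chi2_contingency(pd.crosstab(df["psych_service"], df["rtw"]))

# Logistic regression; the fitted model exposes AIC for model comparison.
model = sm.Logit(df["rtw"], sm.add_constant(df[["age_group", "psych_service"]])).fit(disp=0)
print(p, model.aic)
```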
Investigation of household electricity usage patterns, and matching those patterns to behaviours, is an important area of research given the centrality of such patterns in addressing the needs of the electricity industry. Additional knowledge of household behaviours will allow more effective targeting of demand side management (DSM) techniques. This paper addresses the question of whether a reasonable number of meaningful motifs, each representing a regular activity within a domestic household, can be identified solely from household-level electricity meter data. Using UK data collected from several hundred households in Spring 2011, monitored at five-minute intervals, a process for finding repeating short patterns (motifs) is defined. Different ways of representing the motifs exist, and a qualitative approach is presented that allows choosing between the options based on the number of regular behaviours detected (neither too few nor too many).
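A minimal sketch of the motif idea follows; it slides a short window over a meter series, z-normalises each window, and counts windows whose shapes nearly repeat. The window length and distance threshold are illustrative assumptions, not the paper's settings:

```python
# Minimal motif-counting sketch over a toy meter series. Window length and
# threshold are illustrative; a real motif finder would also exclude
# trivially overlapping matches.
import numpy as np

def find_motifs(series, window=12, threshold=1.0):
    wins = np.lib.stride_tricks.sliding_window_view(series, window).astype(float)
    wins = (wins - wins.mean(axis=1, keepdims=True)) / (wins.std(axis=1, keepdims=True) + 1e-9)
    counts = []
    for w in wins:
        dists = np.linalg.norm(wins - w, axis=1)
        # other windows with nearly the same shape (overlapping matches included)
        counts.append(int(np.sum(dists < threshold)) - 1)
    return counts

usage = np.tile([0.1, 0.1, 0.5, 1.2, 0.9, 0.2], 20) \
        + np.random.default_rng(1).normal(0, 0.05, 120)
print(max(find_motifs(usage)))
```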
Stochastic service network designs with uncertain demand represented by a set of scenarios can be modelled as a large-scale two-stage stochastic mixed-integer program (SMIP). The progressive hedging algorithm (PHA) is a decomposition method for solving the resulting SMIP. The computational performance of the PHA can be greatly enhanced by decomposing according to scenario bundles instead of individual scenarios. At the heart of bundle-based decomposition is the method for grouping the scenarios into bundles. In this paper, we present a fuzzy c-means-based scenario bundling method to address this problem. Rather than having full membership of a single bundle, as is typical in existing scenario bundling strategies such as k-means, a scenario in our method has partial membership in each of the bundles and can be assigned to more than one bundle. Since multiple bundle membership induces overlap between the bundles, we empirically investigate whether and how the amount of overlap, controlled by the fuzzy exponent, affects the performance of the PHA. Experimental results for a less-than-truckload transportation network optimization problem show that the number of iterations required by the PHA to achieve convergence reduces dramatically with large fuzzy exponents, whereas the computation time increases significantly. Further experiments were conducted to identify a fuzzy exponent that strikes a good trade-off between solution quality and computation time. Keywords: service network design, progressive hedging, fuzzy c-means, stochastic mixed-integer program.
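The clustering at the heart of this bundling method is standard fuzzy c-means. A minimal numpy sketch is below; the fuzzy exponent m controls how soft the memberships are, and thresholding memberships yields overlapping bundles. The membership cutoff and toy data are assumptions for illustration:

```python
# Minimal numpy fuzzy c-means; scenarios with membership above a cutoff can
# be placed into several bundles, producing the overlap discussed above.
import numpy as np

def fuzzy_c_means(X, c=3, m=2.0, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)              # memberships sum to 1
    for _ in range(iters):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-9
        # u_ik = 1 / sum_j (d_ik / d_ij)^(2/(m-1))
        U = 1.0 / (d ** (2 / (m - 1)) * np.sum(d ** (-2 / (m - 1)), axis=1, keepdims=True))
    return U, centers

X = np.random.default_rng(2).normal(size=(50, 4))  # toy demand scenarios
U, _ = fuzzy_c_means(X, c=3, m=2.0)
bundles = [np.where(U[:, k] > 0.3)[0] for k in range(3)]  # overlapping bundles
print([len(b) for b in bundles])
```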
Our long-term research goal is to develop data-mining methodologies that are robust to changes in data and to uncertainty. By robust we mean solutions that remain 'optimal' when things change, or that are easily repaired. Broadly, this robustness can be achieved in two ways: one, by having 'slack' in the solution, or two, by constructing the solution such that it is easily repairable, e.g. so that failures are isolated.
In the public goods game, players can be classified into different types according to their participation in the game. It is an important issue for economists to be able to measure players' strategy changes over time, which can be considered as concept drift. In this study, we present a method for measuring changes in items' cluster membership in temporal data. The method consists of three steps. In the first step, the temporal data are transformed into a discrete series of time points; in the second step, each time point is clustered separately. In the last step, the items' membership in the clusters is compared with a reference of behaviour to determine the amount of behavioural change at each time point. Different external cluster validity indices and the area under the curve are used to measure these changes. Rather than directly comparing cluster labels, we use these indices in a new way to compare clusterings against reference points. Three categories of reference behaviour are used: (1) the first time point, (2) the previous time point, and (3) the general overall behaviour of the items. For the public goods game, our results indicate that the players change over time, but the change is smooth and relatively constant between any two time points.
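A short sketch of this measurement loop, assuming k-means as the per-time-point clusterer and the adjusted Rand index as the external validity index (both illustrative choices consistent with, but not necessarily identical to, the paper's):

```python
# Sketch: cluster each time point separately, then score each clustering
# against a reference labelling with an external validity index.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

rng = np.random.default_rng(3)
# toy temporal data: 30 players observed over 5 time points, 2 features each
time_points = [rng.normal(size=(30, 2)) for _ in range(5)]
labels = [KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(tp)
          for tp in time_points]

reference = labels[0]                       # reference: the first time point
changes = [adjusted_rand_score(reference, lb) for lb in labels]
print(changes)  # values near 1 mean little change relative to the reference
```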
In this paper we present a case study demonstrating how dynamic and uncertain criteria can be incorporated into a multi-criteria analysis with the help of discrete event simulation. The simulation-guided multi-criteria analysis can include both monetary and non-monetary criteria that are static or dynamic, whereas standard multi-criteria analysis only deals with static criteria and cost-benefit analysis only deals with static monetary criteria. The dynamic and uncertain criteria are incorporated by using simulation to explore how the decision options perform; the results of the simulation are then fed into the multi-criteria analysis. By enabling the incorporation of dynamic and uncertain criteria, the dynamic multi-criteria analysis was able to take a unique perspective on the problem. The highest-ranked option returned by the dynamic multi-criteria analysis differed from those of the other decision aid techniques. The results suggest that dynamic multi-criteria analysis may be highly suitable for decisions that require long-term evaluation, as this is often when uncertainty is introduced.
An important role carried out by cyber-security experts is the assessment of proposed computer systems during their design stage. This task is fraught with difficulty and uncertainty, making the knowledge provided by human experts essential for successful assessment. Today, the increasing number of progressively more complex systems has led to an urgent need for tools that support the expert-led process of system-security assessment. In this research, we use Weighted Averages (WAs) and Ordered Weighted Averages (OWAs) with Evolutionary Algorithms (EAs) to create aggregation operators that model parts of the assessment process. We show how individual overall ratings for security components can be produced from ratings of their characteristics, and how these individual overall ratings can be aggregated to produce overall rankings of potential attacks on a system. As well as identifying salient attacks and weak points in a prospective system, the proposed method also highlights which factors and security components contribute most to a component's difficulty and to an attack's ranking, respectively. A real-world scenario is used in which experts were asked to rank a set of technical attacks and to answer a series of questions about the security components that are the subject of the attacks. The work shows how finding good aggregation operators, and identifying important components and factors of a cyber-security problem, can be automated. The resulting operators have the potential for use as decision aids for systems designers and cyber-security experts, increasing the amount of assessment that can be achieved with the limited resources available.
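The two operator families are easy to state in code. A WA attaches weights to specific characteristics, whereas an OWA sorts the ratings first, so weights attach to rank positions. The ratings and weights below are illustrative; in the work described above the weights are found by an EA:

```python
# Sketch of a Weighted Average (WA) versus an Ordered Weighted Average (OWA).
import numpy as np

def wa(ratings, weights):
    return float(np.dot(ratings, weights))

def owa(ratings, weights):
    # weights apply to rank positions after sorting in descending order
    return float(np.dot(sorted(ratings, reverse=True), weights))

ratings = np.array([0.9, 0.2, 0.6])  # expert ratings of three characteristics
weights = np.array([0.5, 0.3, 0.2])  # illustrative weights summing to 1
print(wa(ratings, weights), owa(ratings, weights))
```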
This study optimises manually derived rule-based expert system classification of objects according to changes in their properties over time. One of the key challenges this study addresses is how to classify objects that exhibit changes in their behaviour over time, for example how to classify companies' share price stability over a period of time, or how to classify students' preferences for subjects as they progress through school. The specific case the paper considers is the strategy of players in public goods games (common in economics) across multiple consecutive games. Initial classification starts from expert definitions specifying class allocation for players based on aggregated attributes of the temporal data. Based on these initial classifications, the optimisation process tries to find an improved classifier which produces the most compact possible classes of objects (players) at every time point in the temporal data. The compactness of the classes is measured by a cost function based on internal cluster indices such as the Dunn Index, distance measures such as Euclidean distance, or statistically derived measures such as standard deviation. The paper discusses the approach in the context of incorporating changing player strategies in the aforementioned public goods games, where common classification approaches so far do not consider such changes in behaviour resulting from learning or in-game experience. By using the proposed process for classifying temporal data together with the actual players' contributions during the games, we aim to produce a more refined classification which in turn may inform the interpretation of public goods game data.
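As a sketch of the kind of cost function described (here the standard-deviation variant; the Dunn Index and distance-based variants would slot in the same way), with toy contribution data and a candidate class allocation as assumptions:

```python
# Per-time-point compactness cost: lower within-class standard deviation of
# contributions means more compact classes; summing over time points gives
# the quantity the optimiser minimises.
import numpy as np

def compactness_cost(contributions, classes):
    """contributions: (players, time_points); classes: per-player labels."""
    cost = 0.0
    for t in range(contributions.shape[1]):
        for c in np.unique(classes):
            members = contributions[classes == c, t]
            if len(members) > 1:
                cost += members.std()   # std-based compactness per class
    return cost

rng = np.random.default_rng(4)
contributions = rng.integers(0, 21, size=(12, 6)).astype(float)
classes = rng.integers(0, 3, size=12)   # a candidate class allocation
print(compactness_cost(contributions, classes))
```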
Uncertain data streams are widely generated in many Web applications. The uncertainty in data streams makes anomaly detection from sensor data streams far more challenging. In this paper, we present a novel framework that supports anomaly detection in uncertain data streams. The proposed framework adopts an efficient uncertainty pre-processing procedure to identify and eliminate uncertainties in data streams. Based on the corrected data streams, we develop effective period pattern recognition and feature extraction techniques to improve computational efficiency. We use classification methods for anomaly detection in the corrected data streams. We also empirically show that the proposed approach achieves high anomaly-detection accuracy on a number of real datasets.
Novelty detection in news events has long been a difficult problem. A number of models perform well on specific data streams, but certain issues are far from solved, particularly in large data streams from the WWW where the unpredictability of new terms requires adaptation of the vector space model. We present a novel event detection system based on Incremental Term Frequency-Inverse Document Frequency (TF-IDF) weighting incorporated with Locality Sensitive Hashing (LSH). Our system can efficiently and effectively adapt to changes within the data streams, incorporating any new terms through continual updates to the vector space model. In terms of miss probability, our proposed novelty detection framework outperforms a recognised baseline system by approximately 16% when evaluated on a benchmark dataset from Google News.
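The incremental TF-IDF idea can be sketched in a few lines: document frequencies are updated as each document arrives, so IDF values adapt to previously unseen terms without rebuilding the model. This is a generic illustration of the weighting scheme, not the paper's full system (the LSH stage is omitted):

```python
# Incremental TF-IDF sketch: update document frequencies per arriving
# document; IDF is always computed against the stream seen so far.
import math
from collections import Counter, defaultdict

df_counts = defaultdict(int)   # document frequency per term
n_docs = 0

def add_and_vectorise(tokens):
    global n_docs
    n_docs += 1
    for term in set(tokens):
        df_counts[term] += 1
    tf = Counter(tokens)
    return {t: (tf[t] / len(tokens)) * math.log(n_docs / df_counts[t])
            for t in tf}

print(add_and_vectorise("breaking news about the election".split()))
print(add_and_vectorise("new election results announced".split()))
```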
The use of artificial immune systems in intrusion detection is an appealing concept for two reasons. Firstly, the human immune system provides the human body with a high level of protection from invading pathogens in a robust, self-organised and distributed manner. Secondly, current techniques used in computer security are not able to cope with the dynamic and increasingly complex nature of computer systems and their security. It is hoped that biologically inspired approaches in this area, including the use of immune-based systems, will be able to meet this challenge. Here we collate the algorithms used, the development of the systems and the outcomes of their implementation, providing an introduction to and review of the key developments within this field, in addition to making suggestions for future research.
As one of the solutions to intrusion detection problems, Artificial Immune Systems (AIS) have shown their advantages. Unlike genetic algorithms, there is no single archetypal AIS; instead, there are four major paradigms. Among them, the Dendritic Cell Algorithm (DCA) has produced promising results in various applications. The aim of this chapter is to demonstrate the potential of the DCA as a suitable candidate for intrusion detection problems. We review some of the commonly used AIS paradigms for intrusion detection problems and demonstrate the advantages of one particular algorithm, the DCA. In order to clearly describe the algorithm, the background to its development and a formal definition are given. In addition, improvements to the original DCA are presented and their implications discussed, including previous work on an online analysis component with segmentation and ongoing work on automated data preprocessing. Based on preliminary results, both improvements appear promising for online anomaly-based intrusion detection.
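At the core of the DCA is a signal-fusion step in which each simulated dendritic cell combines PAMP, danger, and safe input signals into output signals via a weighted sum. The sketch below illustrates only that step; the weight values are illustrative assumptions, not the published DCA weight matrix:

```python
# Sketch of DCA-style signal fusion. Weights are assumed, illustrative
# values; the published algorithm specifies its own weight matrix.
import numpy as np

# rows: csm (costimulation), semi-mature, mature output signals
# cols: PAMP, danger, safe input signals
W = np.array([[2.0, 1.0, 2.0],
              [0.0, 0.0, 3.0],
              [2.0, 1.0, -3.0]])

def dc_update(pamp, danger, safe):
    csm, semi, mature = W @ np.array([pamp, danger, safe])
    return csm, semi, mature

csm, semi, mature = dc_update(pamp=0.8, danger=0.5, safe=0.1)
# a cell presenting more 'mature' than 'semi-mature' signal leans anomalous
print(mature > semi)
```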
How do technology users effectively transition from having zero knowledge about a technology to making the best use of it after an authoritative technology adoption? This post-adoption user learning has received little research attention in the technology management literature. In this paper we investigate user learning in authoritative technology adoption by developing an agent-based model using the case of council-led smart meter deployment in the UK city of Leeds. Energy consumers gain experience in using smart meters based on the learning curve in behavioural learning. With the agent-based model we carry out experiments to validate the model and test different energy interventions that local authorities can use to facilitate energy consumers' learning and maintain their continuous use of the technology. Our results show that the more easily energy consumers gain experience, the more energy-efficient they are and the more energy savings they achieve; that encouraging energy consumers' contacts via various informational means can facilitate their learning; and that developing and maintaining a positive attitude toward smart metering can enable them to use the technology continuously. Contributions and energy policy/intervention implications are discussed.
Purpose: To develop a framework for identifying and incorporating candidate confounding interaction terms into a regularised Cox regression analysis, in order to refine adverse drug reaction signals obtained from longitudinal observational data. Methods: We considered six drug families that are commonly associated with myocardial infarction in observational healthcare data, but for which the causal-relationship ground truth is known (adverse drug reaction or not). We applied emergent pattern mining to find itemsets of drugs and medical events that are associated with the development of myocardial infarction; these are the candidate confounding interaction terms. We then implemented a cohort study design using regularised Cox regression that incorporated and accounted for the candidate confounding interaction terms. Results: The methodology was able to account for signals generated due to confounding, and a Cox regression with elastic net regularisation correctly ranked the drug families known to be true adverse drug reactions above those that are not. This was not the case without the inclusion of the candidate confounding interaction terms, where confounding led to a non-adverse drug reaction being ranked highest. Conclusions: The methodology is efficient, can identify high-order confounding interactions, and does not require expert input to specify outcome-specific confounders, so it can be applied to any outcome of interest to quickly refine its signals. The proposed method shows excellent potential to overcome some forms of confounding and therefore reduce the false positive rate of signal analysis using longitudinal data.
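A minimal sketch of the modelling step, assuming the lifelines library for the elastic-net-penalised Cox fit; the column names and the confounding interaction term are hypothetical placeholders, not the study's variables:

```python
# Cox regression with elastic-net regularisation via lifelines. Column
# names and the interaction term are hypothetical placeholders.
import pandas as pd
from lifelines import CoxPHFitter

df = pd.DataFrame({
    "time_to_mi":   [5.0, 12.0, 3.5, 8.0, 20.0, 2.0, 15.0, 9.0],
    "mi_observed":  [1, 0, 1, 1, 0, 1, 0, 1],
    "drug_family":  [1, 0, 1, 1, 0, 1, 0, 0],
    "confound_int": [0, 0, 1, 1, 0, 1, 0, 0],  # candidate confounding interaction
})

# penalizer > 0 with 0 < l1_ratio < 1 gives an elastic-net penalty
cph = CoxPHFitter(penalizer=0.1, l1_ratio=0.5)
cph.fit(df, duration_col="time_to_mi", event_col="mi_observed")
print(cph.params_)
```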
Multi-agent systems offer a new and exciting way of understanding the world of work. We apply agent-based modeling and simulation to investigate a set of problems in a retail context. Specifically, we are working to understand the relationship between people management practices on the shop-floor and retail performance. Although we are working within a relatively novel and complex domain, it is clear that an agent-based approach offers great potential for improving organizational capabilities in the future. Our multidisciplinary research team has worked closely with one of the UK's top ten retailers to collect data and build an understanding of shop-floor operations and the key actors in a department (customers, staff, and managers). Based on this case study we have built and tested the first version of a retail branch agent-based simulation model, focusing on how to simulate the effects of people management practices on customer satisfaction and sales. In our experiments we have looked at employee development and cashier empowerment as two examples of shop-floor management practices. In this paper we describe the underlying conceptual ideas and the features of our simulation model. We present a selection of experiments conducted to validate the simulation model and to show its potential for answering "what-if" questions in a retail context. We also introduce a novel performance measure, created to quantify customers' satisfaction with service based on their individual shopping experiences.
Jerne's idiotypic-network theory postulates that the immune response involves inter-antibody stimulation and suppression, as well as matching to antigens. The theory has proved the most popular artificial immune system (AIS) model for incorporation into behavior-based robotics, but guidelines for implementing idiotypic selection are scarce. Furthermore, the direct effects of employing the technique have not been demonstrated in the form of a comparison with non-idiotypic systems. This paper aims to address these issues. A method for integrating an idiotypic AIS network with a reinforcement-learning (RL)-based control system is described, and the mechanisms underlying antibody stimulation and suppression are explained in detail. Some hypotheses that account for the network advantage are put forward and tested using three systems of increasing idiotypic complexity: a basic RL scheme, a simplified hybrid AIS-RL that implements idiotypic selection independently of derived concentration levels, and a full hybrid AIS-RL scheme. The test bed takes the form of a simulated Pioneer robot that is required to navigate through maze worlds, detecting and tracking door markers. Index Terms: Artificial immune system (AIS), behavior arbitration mechanism, idiotypic-network theory, reinforcement learning (RL).
This article proposes a novel framework for detecting redundancy in supervised sentence categorisation. Unlike traditional singleton neural networks, our model incorporates a character-aware convolutional neural network (Char-CNN) with a character-aware recurrent neural network (Char-RNN) to form a convolutional recurrent neural network (CRNN). Our model benefits from the Char-CNN in that only salient features are selected and fed into the integrated Char-RNN, which effectively learns long-sequence semantics via a sophisticated update mechanism. We compare our framework against state-of-the-art text classification algorithms on four popular benchmark corpora. Our model achieves competitive precision, recall, and F1 score on the Google News dataset. For the 20-newsgroups data stream, our algorithm obtains the best precision, recall, and F1 score. For the Brown Corpus, our framework obtains the best F1 score with precision and recall almost equivalent to the top competitor. For the question classification collection, the CRNN produces the best recall and F1 score and comparable precision. We also analyse the impact of three different RNN hidden recurrent cells on performance and their runtime efficiency, observing that the MGU achieves the best runtime with performance comparable to the GRU and LSTM. For TF-IDF-based algorithms, we experiment with word2vec, GloVe, and sent2vec embeddings and report their performance differences.
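A minimal Keras sketch of this kind of convolutional recurrent architecture follows: a character-level Conv1D front end selects salient local features, which a recurrent layer then consumes for sequence semantics. All sizes and hyperparameters are illustrative assumptions, and a GRU stands in for the recurrent cell (Keras has no built-in MGU):

```python
# Sketch of a Char-CNN + RNN (CRNN-style) classifier. Sizes are assumed
# illustrative values; GRU is used as the recurrent cell.
import tensorflow as tf

num_chars, seq_len, num_classes = 70, 200, 4

model = tf.keras.Sequential([
    tf.keras.Input(shape=(seq_len,)),
    tf.keras.layers.Embedding(num_chars, 16),          # character embeddings
    tf.keras.layers.Conv1D(64, kernel_size=5, activation="relu"),
    tf.keras.layers.MaxPooling1D(pool_size=2),         # keep salient features
    tf.keras.layers.GRU(64),                           # long-sequence semantics
    tf.keras.layers.Dense(num_classes, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```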
In computing the similarity of intervals, current similarity measures such as the commonly used Jaccard and Dice measures are at times not sensitive to changes in the width of intervals, producing equal similarities for substantially different pairs of intervals. To address this, we propose a new similarity measure that uses a bi-directional approach to determine interval similarity. For each direction, the overlapping ratio of the given interval in a pair with the other interval is used as a measure of uni-directional similarity. We show that the proposed measure satisfies all common properties of a similarity measure, while also being invariant with respect to multiplication of the interval endpoints and exhibiting linear growth with respect to linearly increasing overlap. Further, we compare the behavior of the proposed measure with the highly popular Jaccard and Dice similarity measures, highlighting that the proposed approach is more sensitive to changes in interval widths. Finally, we show that the proposed similarity is bounded by the Jaccard and the Dice similarity, thus providing a reliable alternative.
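A minimal sketch of the bi-directional idea: each direction measures the overlap relative to the width of the "source" interval, and the two directions are then aggregated. The minimum is used as the aggregation here; that operator is an assumption, chosen because it is consistent with the Jaccard/Dice bounds stated in the abstract.

```python
def directional(a, b):
    """Overlap of interval a = (a1, a2) with b, as a fraction of a's width."""
    overlap = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    return overlap / (a[1] - a[0])

def bidirectional_sim(a, b):
    # aggregate the two uni-directional similarities (minimum assumed here)
    return min(directional(a, b), directional(b, a))

def jaccard(a, b):
    # intersection over union (valid for overlapping intervals)
    overlap = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    return overlap / (max(a[1], b[1]) - min(a[0], b[0]))

# Two pairs with equal Jaccard (0.5) but different widths get different
# bidirectional scores, illustrating the width sensitivity:
print(jaccard((0, 2), (0, 4)), bidirectional_sim((0, 2), (0, 4)))  # 0.5  0.5
print(jaccard((0, 3), (1, 4)), bidirectional_sim((0, 3), (1, 4)))  # 0.5  0.667
```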
Quality of life assessment represents a key process in deciding treatment success and viability. As such, patients' perceptions of their functional status and well-being are important inputs for impairment assessment. Given that patient-completed questionnaires are often used to assess patient status and determine future treatment options, it is important to know the level of agreement between the words used by patients and different groups of medical professionals. In this paper, we propose a measure called the Agreement Ratio which provides a ratio of overall agreement when modelling words through fuzzy sets (FSs). The measure has been specifically designed for assessing this agreement in fuzzy sets which are generated from data such as patient responses. The measure relies on using the Jaccard similarity measure for comparing the different levels of agreement in the generated FSs. Synthetic examples are provided in order to show how to calculate the measure for given fuzzy sets. An application to real-world data is provided, as well as a discussion of the results and the potential of the proposed measure.
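The Jaccard similarity for discretised fuzzy sets, which the Agreement Ratio builds on, is straightforward to compute: the sum of pointwise minima of the membership functions over the sum of pointwise maxima. A minimal sketch follows; the triangular example sets are synthetic and the construction of the full ratio from pairwise agreements is elided.

```python
import numpy as np

def fuzzy_jaccard(mu_a, mu_b):
    """Jaccard similarity of two membership functions sampled on one domain."""
    return np.minimum(mu_a, mu_b).sum() / np.maximum(mu_a, mu_b).sum()

x = np.linspace(0, 10, 101)                       # shared discretised domain
patient = np.clip(1 - np.abs(x - 4) / 2, 0, 1)    # triangular set centred on 4
clinician = np.clip(1 - np.abs(x - 5) / 2, 0, 1)  # triangular set centred on 5
print(f"agreement: {fuzzy_jaccard(patient, clinician):.2f}")
```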
Fuzzy Rule-Based Classification Systems (FRBCSs) have the potential to provide so-called interpretable classifiers, i.e. classifiers which can be introspected, understood, validated and augmented by human experts by relying on fuzzy-set based rules. This paper builds on prior work on interval type-2 fuzzy set based FRBCSs where the fuzzy sets and rules of the classifier are generated using an initial clustering stage. By introducing subtractive clustering in order to identify multiple cluster prototypes, the proposed approach has the potential to deliver improved classification performance while maintaining good interpretability, i.e. without resulting in an excessive number of rules. The paper provides a detailed overview of the proposed FRBCS framework, followed by a series of exploratory experiments on both linearly and non-linearly separable datasets, comparing results to existing rule-based and SVM approaches. Overall, initial results indicate that the approach enables classification performance comparable to non-rule-based classifiers such as SVMs, while often achieving this with a very small number of rules.
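A minimal sketch of subtractive clustering (in the style of Chiu's method) used to identify multiple cluster prototypes is shown below. The radii and stopping threshold are illustrative assumptions, and the subsequent generation of interval type-2 fuzzy sets and rules from the prototypes is elided.

```python
import numpy as np

def subtractive_clustering(X, ra=0.5, rb=0.75, eps=0.15):
    """Return cluster prototypes for data X of shape (n_points, n_dims)."""
    alpha, beta = 4 / ra**2, 4 / rb**2
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)   # pairwise sq. dists
    potential = np.exp(-alpha * d2).sum(1)                # density potential
    centres, p_first = [], potential.max()
    while potential.max() >= eps * p_first:               # simple stop rule
        i = potential.argmax()
        centres.append(X[i])
        # suppress potential around the newly selected prototype so the
        # next prototype is drawn from a different dense region
        potential -= potential[i] * np.exp(-beta * d2[i])
        potential = np.clip(potential, 0.0, None)
    return np.array(centres)
```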
In the context of cancer treatment and surgery, quality of life assessment is a crucial part of determining treatment success and viability. In order to assess it, patient-completed questionnaires which employ words to capture aspects of patients' well-being are the norm. As the results of these questionnaires are often used to assess patient progress and to determine future treatment options, it is important to establish that the words used are interpreted in the same way by both patients and medical professionals. In this paper, we capture and model patients' perceptions, and the associated uncertainty, about the words used to describe their level of physical function in the Toronto Extremity Salvage Score (TESS) questionnaire, which is widely used in Sarcoma Services. The paper provides detail about the interval-valued data capture as well as the subsequent modelling of the data using fuzzy sets. Based on an initial sample of participants, we use Jaccard similarity on the resulting word models to show that there may be considerable differences in the interpretation of commonly used questionnaire terms, thus presenting a very real risk of miscommunication between patients and medical professionals as well as within the group of medical professionals.
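One common way to turn interval-valued responses into a fuzzy-set word model, sketched below, is to set the membership at each point of the scale to the proportion of respondents whose interval covers that point (in the spirit of interval agreement modelling). The intervals here are synthetic placeholders, not TESS data, and the exact modelling pipeline in the paper may differ.

```python
import numpy as np

def interval_agreement(intervals, domain):
    """Membership at each domain point = fraction of intervals covering it."""
    covers = [(domain >= lo) & (domain <= hi) for lo, hi in intervals]
    return np.mean(covers, axis=0)                # agreement in [0, 1]

scale = np.linspace(0, 10, 101)                   # assumed 0-10 response scale
responses = [(2, 5), (3, 6), (2.5, 4), (4, 7)]    # one interval per respondent
mu = interval_agreement(responses, scale)         # fuzzy-set model of the word
```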
This study addresses the classification of objects according to changes in their properties over time. One of the key challenges is how to classify objects that exhibit changes in their behaviour over time, for example how to classify companies' share price stability over a period of time, or how to classify students' preferences for subjects while they are progressing through school. A specific case the paper considers is the strategy of players in public goods games (common in economics) across multiple consecutive games. Initial classification starts from expert definitions specifying class allocation for players based on aggregated attributes of the temporal data. Based on these initial classifications, the optimisation process tries to find an improved classifier which produces the most compact possible classes of objects (players) for every time point in the temporal data. The compactness of the classes is measured by a cost function based on internal cluster indices such as the Dunn Index, distance measures such as Euclidean distance, or statistically derived measures such as the standard deviation. The paper discusses the approach in the context of incorporating changing player strategies in the aforementioned public goods games, where common classification approaches so far do not consider such changes in behaviour resulting from learning or in-game experience. By using the proposed process for classifying temporal data together with the actual players' contributions during the games, we aim to produce a more refined classification which in turn may inform the interpretation of public goods game data.
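A minimal sketch of the kind of per-timepoint compactness cost described above follows. Standard deviation is used as the compactness measure here; the Dunn Index or a distance-based measure could be substituted. The function name, data shapes, and aggregation are assumptions rather than the study's exact formulation.

```python
import numpy as np

def compactness_cost(contributions, labels):
    """Sum of within-class spreads across all time points.

    contributions -- array of shape (n_players, n_timepoints)
    labels        -- class id per player (one candidate classification)
    """
    cost = 0.0
    for t in range(contributions.shape[1]):
        for c in np.unique(labels):
            members = contributions[labels == c, t]
            if len(members) > 1:
                cost += members.std()   # lower spread = more compact class
    return cost

# The optimisation step would search over candidate label assignments,
# e.g. perturbing the expert-defined allocation, keeping the lowest cost.
```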