In the search for new anti-cancer therapies, kinase enzymes are important biological targets, since many are intimately connected to cell division and other important maintenance functions. Scientists use a method known as Quantitative Structure-Activity Relationships (QSAR) [5] to mine experimental data for patterns that relate the chemical structure of a drug to its kinase activity.
Abstract: This paper describes how cloud computing has been used to reduce the time taken to generate chemical activity models from years to weeks. Chemists use Quantitative Structure-Activity Relationship (QSAR) models to predict the activity of molecules. Existing Discovery Bus software builds these models automatically from datasets containing known molecular activities, using a “panel of experts” algorithm.
Understanding patient activity levels is important for assessing key lifestyle variables linked to obesity, diabetes and cardiovascular disease. The MOVEeCloud project [1] uses wrist-worn accelerometers to measure movement over three axes at approximately 80 Hz. Once the data are collected, the analysis procedure assigns each acceleration signal to one of several categories: Sedentary, Light Activity, Walking and Running [2].
Abstract: The AMUC (Associated Motion capture User Categories) project consisted of building a prototype sketch retrieval client for exploring motion capture archives. High-dimensional datasets reflect the dynamic process of motion capture and comprise high-rate sampled data of a performer's joint angles; in response to multiple query criteria, these data can potentially yield different kinds of information.
Method. Fortunately, the cloud computing approach fits the presented problem very well. The modelling process is a combination of task- and data-based parallelism and can be run effectively on a cluster of machines. Moreover, after initial processing of the whole ChEMBLdb database with the available model building algorithms, further QSAR analysis will require far fewer resources.
Abstract: This paper describes the e-Science Central (e-SC) cloud data processing system and its application to a number of e-Science projects. e-SC provides both software as a service (SaaS) and platform as a service for scientific data management, analysis and collaboration. It is a portable system and can be deployed on both private (e.g. Eucalyptus) and public clouds (Amazon AWS and Microsoft Windows Azure).
Many scientific research projects face severe challenges in extracting value from the increasingly large volumes of data they generate. From our work on a large number of e-science projects, we have identified four activities that are of prime importance for the scientists we have collaborated with:
Abstract: This paper presents a description of the architecture, development and deployment of the Middleware developed as part of the GOLD project. This Middleware has been derived from the requirement to accelerate the chemical process development lifecycle through the enablement of highly dynamic Virtual Organisations. The generic design of the Middleware will allow its application to a wide variety of additional domains.
Abstract: This paper describes the design of a cloud computing platform, e-Science Central (e-SC), which provides both Software and Platform as a Service for scientific data management, analysis and collaboration. e-SC can be deployed on both private (e.g. Eucalyptus) and public clouds (Amazon AWS and Microsoft Windows Azure). The SaaS application allows scientists to upload data, edit and run workflows, and share results in the cloud.
Abstract: One of the foundations of science is that researchers must publish the methodology used to achieve their results so that others can attempt to reproduce them. This has the added benefit of allowing methods to be adopted and adapted for other purposes. In the field of e-Science, services, often choreographed through workflow, process data to generate results.
Principal components analysis (PCA) is a standard statistical technique that is frequently employed in the analysis of large, highly correlated data sets. As it stands, PCA is a linear technique, which can limit its relevance to the non-linear systems frequently encountered in the chemical process industries. Several attempts have been made to extend linear PCA to cover non-linear data sets, and these are briefly reviewed in this paper. We propose a symbolically oriented technique for non-linear PCA, which is based on the genetic programming (GP) paradigm. Its applicability is demonstrated using two simple non-linear systems and data collected from an industrial distillation column.
This paper describes a case study in which multivariate statistical procedures have been developed to assist in the supervision of an industrial fed-batch fermentation process operated by Biochemie in Austria. The procedures have been developed to enhance the monitoring capabilities of the current system by interfacing directly with the present G2 real-time knowledge-based supervisory system. While the G2 rule-based system is useful for detecting deviations in single variables, it has been found to be unable to detect some of the more subtle deviations caused by the complex interactions between the process variables. Multivariate statistical techniques have been utilised in this study to provide early indications of deviations from nominal batch behaviour. The cause of these deviations can subsequently be determined by interrogating the information produced by these algorithms. Although the multivariate statistical techniques adopted in this paper are not new, their integration within the industrial supervisory system and their on-line application to the industrial fermentation process are novel.
This paper summarises the results of a 2-year study focusing on the development of a condition monitoring system for a fed-batch fermentation system operated by Biochemie Ltd. in Austria. Consumer pressure has resulted in a greater emphasis in industry on product quality. As a direct consequence, the importance of accurate process monitoring has increased steadily in recent years. This paper demonstrates the application of multivariate statistical routines to provide process operators with a monitoring tool capable of detecting process abnormalities.
Papers by Hugo Hiden