This paper presents the rationale for a new architecture to support a significant increase in the scale of data integration and data mining. It proposes the composition into one framework of (1) data mining and (2) data access and integration. We name the combined activity DMI. It supports enactment of DMI processes across heterogeneous and distributed data resources and data
In this paper we address two research questions concerning workflows: 1) how do we abstract and catalogue recurring workflow patterns?; and 2) how do we facilitate optimisation of the mapping from workflow patterns to actual resources at runtime? Our aim here is to explore techniques that are applicable to large-scale workflow compositions, where the resources could change dynamically during the
... A Bandwidth-Aware Job Grouping-Based Scheduling on Grid Environment TF Ang, WK Ng, TC Ling, LY Por and CS Liew Department of ... In this approach, an association comprising K link-disjoint multi-hops are considered as IIb II2, ..., Ilk and the transmission time of packets on ...
To facilitate data mining and integration (DMI) processes in a generic way, we investigate a parallel pipeline streaming model. We model a DMI task as a streaming data-flow graph: a directed acyclic graph (DAG) of Processing Elements (PEs). The composition mechanism links PEs via data streams, which may be in memory, buffered via disks or inter-computer data-flows. This makes it
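The streaming data-flow graph described in this abstract can be illustrated with a minimal sketch. It assumes that each Processing Element is a plain Python generator and that every stream is an in-memory iterator; the names `source`, `square`, `double`, `add` and `build_graph` are hypothetical and are not taken from the paper or from any DMI implementation.

```python
import itertools
from typing import Iterable, Iterator


def source(values: Iterable[int]) -> Iterator[int]:
    """Hypothetical source PE: emits the input values one at a time."""
    yield from values


def square(stream: Iterator[int]) -> Iterator[int]:
    """Hypothetical PE: squares each item as it arrives."""
    for item in stream:
        yield item * item


def double(stream: Iterator[int]) -> Iterator[int]:
    """Hypothetical PE: doubles each item as it arrives."""
    for item in stream:
        yield 2 * item


def add(left: Iterator[int], right: Iterator[int]) -> Iterator[int]:
    """Hypothetical merge PE: sums items pairwise from two input streams."""
    for a, b in zip(left, right):
        yield a + b


def build_graph(values: Iterable[int]) -> Iterator[int]:
    """Wire the PEs into a small DAG: source -> (square, double) -> add.

    itertools.tee duplicates the source stream so that two downstream
    PEs can consume it independently (an in-memory stream only).
    """
    branch_a, branch_b = itertools.tee(source(values))
    return add(square(branch_a), double(branch_b))


if __name__ == "__main__":
    # Items are pulled through the graph lazily, one at a time.
    for result in build_graph([1, 2, 3]):
        print(result)
```

Because generators are lazy, each item flows through the whole graph before the next one is read, which loosely mirrors pipelined streaming. Disk-buffered or inter-computer streams, which the abstract also mentions, are outside the scope of this sketch.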
This paper presents a data-intensive architecture that demonstrates the ability to support applications from a wide range of application domains, and support the different types of users involved in defining, designing and executing data-intensive processing tasks. The prototype architecture is introduced, and the pivotal role of DISPEL as a canonical language is explained. The architecture promotes the exploration and exploitation of distributed and heterogeneous data and spans the complete knowledge discovery process, from data preparation, to analysis, to evaluation and reiteration. The architecture evaluation included large-scale applications from astronomy, cosmology, hydrology, functional genetics, imaging processing and seismology.
Papers by Chee Liew