DOI: 10.1145/2542050.2542063
research-article

Stretch optimization for virtual screening on multi-user pilot-agent platforms on grid/cloud

Published: 05 December 2013
  • Abstract

    Virtual screening has proven very effective on grid infrastructures, where large-scale deployments have led to the identification of active inhibitors for biological targets of interest against malaria, SARS, and diabetes. Operating a dedicated virtual screening platform on grid resources requires optimizing the scheduling policy. Scheduling can be done at two levels: site level and platform level. Site scheduling is done independently at each site, and each site is autonomous in its choice of job scheduling policy; it allocates time slots to different groups of users. Platform scheduling is done at group level: inside a time slot, jobs from many users are allocated. Pilot agents are sent to sites and act as containers for the actual user jobs. They pick up user jobs from a central queue, where the second stage of scheduling is done. In this paper, we focus on a pilot-agent platform shared by many virtual screening users. Such a platform needs a suitable scheduling algorithm to ensure a certain fairness between users. We studied the scheduling of user jobs inside the central queue and examined the relevance and impact of different scheduling policies (FIFO, Shortest Processing Time (SPT), Longest Processing Time (LPT), and Round Robin) on the user experience. The optimization criterion used in our research is the stretch, a measure of user experience on the platform. In a first step, we simulated the operation of virtual screening applications on the pilot-agent platform in order to compare the scheduling policies. In simulation, the SPT algorithm significantly improved scheduling performance. In a second step, the SPT and LPT scheduling policies were implemented on a DIRAC pilot-agent platform at IFI in Hanoi and tested on the EGI Biomed Virtual Organization. Experimental results are in good agreement with the simulation and confirm that the SPT algorithm significantly improves user experience.
    The relevance of our conclusions also extends to cloud computing. Indeed, cloud infrastructures are also characterized by limited machine availability.
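The comparison of queue policies described in the abstract can be illustrated with a small sketch. It is not the authors' simulator (they used SimGrid and DIRAC) and makes several simplifying assumptions: a single pilot agent processes jobs one after another, all jobs are released at time 0, processing times are known in advance, and stretch is taken in its usual per-job sense (time spent in the system divided by processing time), which may differ in detail from the definition used in the paper. The names `Job`, `fifo`, `spt`, `lpt`, `round_robin` and `mean_stretch` are hypothetical.

```python
from collections import deque
from typing import Dict, List, NamedTuple


class Job(NamedTuple):
    user: str
    duration: float  # processing time, assumed known in advance


def fifo(jobs: List[Job]) -> List[Job]:
    """Keep submission order."""
    return list(jobs)


def spt(jobs: List[Job]) -> List[Job]:
    """Shortest Processing Time first."""
    return sorted(jobs, key=lambda j: j.duration)


def lpt(jobs: List[Job]) -> List[Job]:
    """Longest Processing Time first."""
    return sorted(jobs, key=lambda j: j.duration, reverse=True)


def round_robin(jobs: List[Job]) -> List[Job]:
    """Alternate between users, keeping each user's own submission order."""
    queues: Dict[str, deque] = {}
    for job in jobs:
        queues.setdefault(job.user, deque()).append(job)
    order: List[Job] = []
    while queues:
        for user in list(queues):
            order.append(queues[user].popleft())
            if not queues[user]:
                del queues[user]
    return order


def mean_stretch(order: List[Job]) -> float:
    """Average of (completion time / processing time) on one worker.

    Assumes every job is released at t=0, so the time in the system
    equals the completion time.
    """
    clock = 0.0
    total = 0.0
    for job in order:
        clock += job.duration
        total += clock / job.duration
    return total / len(order)


if __name__ == "__main__":
    # Illustrative workload only; these are not data from the paper.
    jobs = [Job("alice", 8), Job("bob", 1), Job("alice", 4), Job("bob", 2)]
    for policy in (fifo, spt, lpt, round_robin):
        print(policy.__name__, mean_stretch(policy(jobs)))
```

On this toy workload, SPT gives the lowest mean stretch and LPT the highest, because short jobs stuck behind long ones accumulate a large waiting time relative to their own length. The sketch ignores job arrivals over time and limited machine availability, both of which matter on a real pilot-agent platform.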

    Cited By

    • (2017) Towards effective scheduling policies for many-task applications: Practice and experience based on HTCaaS. Concurrency and Computation: Practice and Experience 29:21 (e4242). DOI: 10.1002/cpe.4242. Online publication date: 24-Aug-2017

      Published In

      SoICT '13: Proceedings of the 4th Symposium on Information and Communication Technology
      December 2013
      345 pages
      ISBN:9781450324540
      DOI:10.1145/2542050
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Sponsors

      • SOICT: School of Information and Communication Technology - HUST
      • NAFOSTED: The National Foundation for Science and Technology Development
      • ACM Vietnam Chapter: ACM Vietnam Chapter
      • Danang Univ. of Technol.: Danang University of Technology

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. SimGrid
      2. cloud computing
      3. fairness
      4. grid computing
      5. online-algorithm
      6. scheduling
      7. stretch
      8. virtual screening

      Qualifiers

      • Research-article

      Acceptance Rates

      SoICT '13 Paper Acceptance Rate 40 of 80 submissions, 50%;
      Overall Acceptance Rate 147 of 318 submissions, 46%
