EUBra-BIGSEA, A Cloud-Centric Big Data Scientific Research Platform
Ignacio Blanquer, Universitat Politècnica de València
Wagner Meira Jr., Universidade Federal De Minas Gerais
Regina Morais, Universidade de Campinas
www.eubra-bigsea.eu | @bigsea_eubr
EUBra-BIGSEA
• A European-Brazilian Consortium for
― Developing a framework, a platform
and a library to ease the development
of highly-scalable, privacy-aware data
analytic applications running on top of
Quality of Service cloud infrastructures.
― EUBra-BIGSEA targets data scientists in
general. Within the project timeline, it
has been demonstrated through a set of
applications for analysing public
transportation data.
EUBra-BIGSEA assets
• One QoS Data Analytics Platform.
• A platform with 6 layers and 15 new developments (3 infrastructure
components, 2 Big Data services, 1 programming framework, 3 security
components, 4 high-level services and 2 applications).
• 5 legacy components improved & integrated (IM, EC3, Ophidia, COMPSs,
Melhor Busão).
BIGSEA Architecture
A QoS platform
• Defined using Standard recipes
― Convenient deployment on on-premise
and public clouds (github repo)
― A resource configuration estimation
service for meeting a given deadline.
― Horizontal elasticity at job level
― Automatic reconfiguration of the virtual
infrastructure.
― Vertical elasticity
― Both at the level of the Mesos
framework and the CPU and I/O CAP.
― Tuning application performance
without interruptions.
― Applicable to any application
― Monitoring/controlling based on
plugins, simplifying customization
to other frameworks or infrastructure.
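As a rough illustration of what "estimating a resource configuration for a given deadline" can mean, here is a minimal sketch. It is not the EUBra-BIGSEA estimation service: the runtime model (a fixed serial part plus a perfectly divisible parallel part) and the names `workers_for_deadline` and `estimated_runtime` are assumptions made for this example only.

```python
import math


def estimated_runtime(serial_s, parallel_s, n):
    """Assumed model: the serial part is fixed, the parallel part divides by n."""
    return serial_s + parallel_s / n


def workers_for_deadline(serial_s, parallel_s, deadline_s, max_workers=64):
    """Return the smallest worker count whose estimated runtime fits the deadline.

    Returns None when the deadline cannot be met with at most max_workers
    (hypothetical cap standing in for the size of the infrastructure).
    """
    if deadline_s <= serial_s:
        return None  # no amount of parallelism removes the serial part
    n = max(1, math.ceil(parallel_s / (deadline_s - serial_s)))
    return n if n <= max_workers else None


# A job with 60 s of serial work and 3600 s of parallel work, due in 300 s.
needed = workers_for_deadline(60, 3600, 300)
```

Under this model, a horizontal-elasticity controller could re-run the estimate with the remaining work and remaining time while the job runs, and add or remove workers when the answer changes.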
Programming Models
• Lemonade* is an analytics platform that supports, through
workflows, intuitive definition of tasks for knowledge discovery,
mining, and learning from large amounts of data that come from a
wide spectrum of scenarios.
- Lemonade generates Spark and
COMPSs code that uses all the
functionality provided by
BIGSEA.
- Provides high-level abstractions
and visualizations for data
scientists, students and
researchers.
- Transparently defines QoS and
AAA execution parameters (WP3
& WP6).
- Supports the implementation of
batch, streaming and interactive
applications (web services).
* Live Exploration and Mining Of a Non-trivial Amount of Data from Everywhere
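To make the idea of defining analytics tasks "through workflows" concrete, the sketch below models a workflow as a small graph of tasks and runs each task after the tasks it depends on. This is a toy illustration only: the `run_workflow` function, the task names and the dictionary layout are hypothetical and do not reflect Lemonade's actual workflow format or the Spark/COMPSs code it generates.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_workflow(tasks, deps):
    """Run each task after its upstream tasks and return all results.

    tasks: dict mapping a task name to a callable that takes the list
           of upstream results
    deps:  dict mapping a task name to the list of upstream task names
    """
    graph = {name: deps.get(name, []) for name in tasks}
    results = {}
    # static_order() yields every task after all of its predecessors.
    for name in TopologicalSorter(graph).static_order():
        inputs = [results[d] for d in deps.get(name, [])]
        results[name] = tasks[name](inputs)
    return results


# Hypothetical three-step workflow: read data, filter it, aggregate it.
tasks = {
    "read": lambda _: [3, 8, 15, 4],
    "filter": lambda ins: [x for x in ins[0] if x > 3],
    "mean": lambda ins: sum(ins[0]) / len(ins[0]),
}
deps = {"filter": ["read"], "mean": ["filter"]}
results = run_workflow(tasks, deps)
```

Keeping the workflow as plain data (tasks plus dependencies) is what lets a tool translate the same definition into code for different execution back ends.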
Supporting Privacy
Conclusions
- EUBra-BIGSEA has developed a Big Data application development
framework that comprises three primary assets that are not
available in the market:
- The QoS Data Analytics Platform based on cloud
services for enabling a running application to meet
an execution deadline.
- The Open-source Data Analytics applications
development framework that provides a graphical
interface to build up data-analytics workflows that
include automatic discovery of parallelism, OLAP
functions, Privacy annotation, quality assurance
and Entity Matching.
- A toolbox of 8 Descriptive and Predictive models for building traffic data analysis
applications.
- The maturity of the components has been assessed externally
- The minimum TRL score was 4, and the average TRL score was 6.25.
- Components released under Open Source licenses (Apache 2 and GPLv3).