New digital business models facilitated by containers require collecting and analyzing device data. Apache Mesos removes the need to build separate stacks and combines optimized application containers and data analytics into a single platform. In this session, we will explore new approaches to data analytics using REX-Ray as a container persistence tool and the SMACK stack (Spark, Mesos, Akka, Cassandra, Kafka), a set of tools for building data and messaging layers for digital engagement apps.
Data Analytics Using Container Persistence Through SMACK - Manny Rodriguez-Perez and Kendrick Coleman - Dell EMC World 2017
1. Data Analytics Using Container Persistence Through SMACK
Manny Rodriguez-Perez
FL Enterprise Advisory SE
@MannyRodP
{code} by Dell EMC
Kendrick Coleman
Developer Advocate
@KendrickColeman
{code} by Dell EMC
Add talking points to each point
OSS projects: Docker, Mesos, Kubernetes, Cloud Foundry
My first job out of college was at a company that analyzed Sprint call center statistics. One of my first responsibilities, handed over to me from other people, required going to 15 or so different internal sites, pulling data, putting it into an Excel spreadsheet, and having it ready for the call center management staff by 8:00am every day before the phones opened. Because it was a manual process, I had to get to the office at 6am. Over the course of one month, I automated everything through VBA macros and got an extra hour and a half of sleep, since it took 5 minutes to collect the data instead of two hours. However, everyone was still analyzing day-old data. Things have progressed, and people now need lag times of less than a few seconds.
To be a modern software-enabled company you have to embrace the validated learning innovation loop, made popular by Eric Ries in his book The Lean Startup. In addition, we believe the companies or teams that operate the loop at the fastest cycles will excel beyond competitors and find new markets. These faster innovation cycles are based on faster software development and deployments. The modern technology and practices that enable faster software development and deployments form a Cloud Native Strategy. Being cloud native means that the applications and their supporting ecosystem allow the business to pivot as insights are gained.
… [teams] that accelerate this feedback loop quickly accumulate validated learning while minimizing wasted time, money and effort.
- Eric Ries, The Lean Startup
Innovations in analytics processing in the last 10 years have served to reduce the cycle time of the validated learning loop and fundamentally change the questions a business can ask.
Let's look at a brief history of analytics processing in terms of the technology, capabilities and business alignment…
When I started in IT in the 1990s, the focus was on modeling, organizing, and ETL-ing data for the sake of reporting: a collect-then-analyze process.
As data volume, velocity, and variability increased, and to take advantage of increased per-node host and storage performance, parallel processing architectures such as Hadoop gained popularity.
Hadoop and others provided a faster collection and analysis process, reducing the cycle time considerably. However, it was still a collect-then-analyze model and at best a micro-batch process. While some companies have built near-real-time engagement models with technology like Hadoop, it's still very difficult to do at scale and cost-efficiently.
To get to a more prescriptive model and respond to devices and customers in real time, we can no longer wait for our Hadoop batch jobs to load and process. We need to look at the data as it comes in, process it in real time, and make a decision.
This is a new world of event-driven analytics, which helps us answer: What should we do?
This is a paradigm shift from collect-analyze to analyze-collect.
Data analysis needs to occur almost in tandem with data collection.
I say analyze, then collect, because it's not like we are going to throw away the legacy analytics capabilities.
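The analyze-then-collect idea can be illustrated with a minimal Python sketch. It is not taken from the talk: the function name, the window size, the threshold, and the sample readings are all hypothetical, and the in-memory list only stands in for a durable store such as Cassandra. Each event is checked against a short window of recent values the moment it arrives, and only afterwards is it kept for historical analytics.

```python
from collections import deque
from statistics import mean


def analyze_then_collect(events, window=3, threshold=1.5):
    """Decide on each event as it arrives, then keep it for later analysis.

    All names and parameters here are illustrative assumptions.
    """
    recent = deque(maxlen=window)  # short in-memory window of recent values
    history = []                   # stand-in for a durable historical store
    alerts = []                    # decisions made in real time

    for device_id, value in events:
        # Analyze first: compare the new reading with the recent average.
        if len(recent) == window and value > threshold * mean(recent):
            alerts.append((device_id, value))
        # Collect second: retain the event for batch-style analytics.
        recent.append(value)
        history.append((device_id, value))

    return alerts, history


# Hypothetical device readings: (device id, measured value).
readings = [("a", 10), ("a", 11), ("a", 9), ("a", 30), ("a", 10)]
alerts, history = analyze_then_collect(readings)
print(alerts)
print(len(history))
```

The decision happens inside the loop, before the event is stored, while the full history is still available afterwards for the legacy collect-then-analyze reports.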
Because the world is not batch oriented, the world is streaming live … right now … from billions of people and devices. Imagine 50B streams, each with its own message and uniqueness, waiting to be heard and engaged.
Just imagine one application, like Twitter, which has a potential for 320MM streams of data at any one time.
[Need actual data on how many streams certain companies, web scale and not web scale, see/process, as an example]
Perishable insights
So we need a way of building applications and analytics capabilities that can run our cloud native applications, handle complex event processing, and store historical data, all while
Key Apache project
Key Apache project
Benefits of shared storage
Flexibility
Class of service
Consistent operations across multiple storage endpoints
Enterprise storage features – snapshots, etc.
Simplify recovery from downed node
Easily Add/Remove additional nodes
Docker Volume Driver Isolator Module
No remote storage capability using Mesos before September 2015
{code} was first to bring an abstracted persistent storage model
Freedom to use the Mesos Containerizer with ANY Docker volume driver (that means you don’t need Docker itself)
Merged upstream into Mesos 1.0 to provide storage platform support for any vendor who has written a Docker volume driver, including REX-Ray
Every company that boasts storage integration with Mesos uses the {code} contributed module.
DVDCLI
DC/OS 1.7 Release ships with REX-Ray embedded
DC/OS 1.8 Release ships with DVDI embedded for Mesos 1.0
Featured in the documentation - https://dcos.io/docs/1.8/usage/storage/external-storage/
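As a rough illustration of how an application might request a REX-Ray backed volume through DVDI, here is a small Python sketch that builds a Marathon-style app definition. The field names (`external`, `provider`, `dvdi/driver`) follow my recollection of the DC/OS external storage documentation linked above and should be checked against it; the app id, volume name, container path, and command are hypothetical placeholders.

```python
import json


def external_volume_app(app_id, volume_name, container_path, driver="rexray"):
    """Build an app definition that mounts an external volume via DVDI.

    The structure is an assumption based on the DC/OS external storage docs;
    verify it against the documentation for your DC/OS version.
    """
    return {
        "id": app_id,
        "instances": 1,  # an external volume is attached to a single task
        "cpus": 0.1,
        "mem": 32,
        "cmd": "/usr/bin/tail -f /dev/null",  # placeholder long-running command
        "container": {
            "type": "MESOS",  # Mesos containerizer, no Docker engine involved
            "volumes": [
                {
                    "containerPath": container_path,
                    "mode": "RW",
                    "external": {
                        "name": volume_name,
                        "provider": "dvdi",
                        "options": {"dvdi/driver": driver},
                    },
                }
            ],
        },
    }


# Hypothetical names, for illustration only.
app = external_volume_app("hello", "my-test-vol", "test-rexray-volume")
print(json.dumps(app, indent=2))
```

The printed JSON could then be submitted to Marathon. If the task is restarted on another node, the volume driver is expected to reattach the same named volume there, which is the "simplify recovery from downed node" benefit listed earlier.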
Kenny
Key Apache project
First, a few things about the team that has made this possible.
The Dell EMC {code} team is a team made up of open source software engineers and developer advocates, focused on making EMC a well-known name within the open source community.
We will focus on one of their projects, REX-Ray, in this presentation.