Microservices vs Hadoop ecosystem
Marton Elek
February 2017
2 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Microservice definition
“An approach to developing a single application as a suite of small services, each running in its own process and communicating with lightweight mechanisms, often an HTTP resource API. These services are built around business capabilities and independently deployable by fully automated deployment machinery.”
– https://martinfowler.com/articles/microservices.html
Hadoop cluster
▪ The definition is almost true for a Hadoop cluster as well
Dockerized Hadoop cluster
▪ How can we use the tools from microservice architecture in the Hadoop ecosystem?
▪ A possible approach to installing a cluster (Hadoop, Spark, Kafka, Hive) based on
– separate Docker containers
– smart configuration management (using well-known tooling from microservice architectures)
▪ Goal: rapid prototyping platform
▪ Easy switching between
– versions (official HDP, snapshot build, Apache build)
– configurations (HA, Kerberos, metrics, HTrace…)
▪ Developers/Ops tool
– Easy does not mean easy for a user who does not know the tool
▪ Not a goal:
– replacing current management platforms (e.g. Ambari)

What are microservices? (Theory)
A collection of patterns and best practices, for example from the Twelve-Factor App (http://12factor.net):
▪ II. Dependencies
– Explicitly declare and isolate dependencies
▪ III. Config
– Store config in the environment
▪ VI. Processes
– Execute the app as one or more stateless processes
▪ VIII. Concurrency
– Scale out via the process model
▪ XII. Admin processes
– Run admin/management tasks as one-off processes
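Factor III ("store config in the environment") can be illustrated with a small plain-Java sketch. It reads a setting from an environment-style map and falls back to a default when the variable is missing. The class `EnvConfig`, its methods, and the `SERVER_PORT` variable are hypothetical names chosen for illustration; they are not taken from the talk.

```java
import java.util.Map;

// Minimal sketch of factor III: configuration comes from the environment.
// EnvConfig, its methods, and SERVER_PORT are hypothetical names.
final class EnvConfig {
    private EnvConfig() {}

    // Returns the value of a setting, or the default when it is unset or blank.
    static String setting(Map<String, String> env, String key, String defaultValue) {
        String value = env.get(key);
        return (value == null || value.isBlank()) ? defaultValue : value;
    }

    // Parses an integer setting such as a port number.
    static int intSetting(Map<String, String> env, String key, int defaultValue) {
        return Integer.parseInt(setting(env, key, Integer.toString(defaultValue)));
    }

    public static void main(String[] args) {
        // System.getenv() is the real process environment; tests can pass any map.
        int port = intSetting(System.getenv(), "SERVER_PORT", 8080);
        System.out.println("server.port=" + port);
    }
}
```

Passing the environment as a map keeps the lookup logic testable without changing the real process environment.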
What are microservices? (Practice)
▪ Spring started as a
– dependency injection framework
▪ Spring Boot ecosystem
– Easy-to-use starter projects
– Lego bricks for various problems
• JDBC access
• Database access
• REST
• Health check
▪ Spring Cloud: elements to build microservices (based on the Netflix stack)
– API gateway
– Service registry
– Configuration server
– Distributed tracing
– Client-side load balancing
public class TimeStarter {
    // Injected by Spring's dependency injection
    @Autowired
    TimeService timerService;

    public Date now() {
        return timerService.now();
    }
}
Microservices with Spring Cloud
Monolith application
▪ Monolith but modular application example
[Diagram: auth service, timer service, upload service, report service; Rest call]

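import java.util.Date;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.EnableAutoConfiguration;
import org.springframework.context.annotation.ComponentScan;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@EnableAutoConfiguration
@RestController
@ComponentScan
public class TimeStarter {
    @Autowired
    TimeService timerService;

    // Exposes the timer service over HTTP at /now
    @RequestMapping("/now")
    public Date now() {
        return timerService.now();
    }

    public static void main(String[] args) {
        SpringApplication.run(TimeStarter.class, args);
    }
}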
Microservice version
▪ First problem: how can the frontend find the right backend port?
[Diagram: auth service, timer service, upload service, report service; one Rest call per service]
Solution: API Gateway
▪ First problem: how can the frontend find the right backend port?
[Diagram: auth service, timer service, upload service, report service; API gateway; Rest call]
API Gateway
▪ Goal: hide the available microservices behind a service facade
– Routing, authorization
– Deployment handling, canary testing, blue/green deployment
– Logging, SLA, auditing
▪ Implementation examples:
– Spring Cloud API gateway (based on Netflix Zuul)
– Netflix Zuul based implementation
– Twitter Finagle based implementation
– Amazon API Gateway
– Simple Nginx reverse proxy configuration
– Traefik, Kong
▪ Usage in the Hadoop ecosystem
– For prototyping: only needed if the scheduler/orchestrator starts the service on a random host
– For security: Apache Knox
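The routing goal can be sketched in plain Java as a longest-prefix lookup from request path to backend. This only illustrates the idea and is not how Zuul or any of the listed gateways is implemented; the class `GatewayRoutes`, the path prefixes, and the backend addresses are assumptions.

```java
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

// Sketch of the routing part of an API gateway: map a path prefix to a backend.
// Prefixes and backend URLs are hypothetical; both are expected to end with "/".
final class GatewayRoutes {
    private final Map<String, String> routes = new TreeMap<>();

    void addRoute(String pathPrefix, String backendBaseUrl) {
        routes.put(pathPrefix, backendBaseUrl);
    }

    // Resolves a request path to a backend URL using the longest matching prefix.
    Optional<String> resolve(String requestPath) {
        String bestPrefix = null;
        for (String prefix : routes.keySet()) {
            if (requestPath.startsWith(prefix)
                    && (bestPrefix == null || prefix.length() > bestPrefix.length())) {
                bestPrefix = prefix;
            }
        }
        if (bestPrefix == null) {
            return Optional.empty();
        }
        // Replace the matched prefix with the backend base URL.
        return Optional.of(routes.get(bestPrefix) + requestPath.substring(bestPrefix.length()));
    }
}
```

With `/timer/` mapped to `http://timer:6767/`, a request for `/timer/now` would be forwarded to `http://timer:6767/now`, so the frontend only needs to know the gateway address.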

Service registry
▪ Problem: how can the API gateway be configured to route to all the services automatically?
[Diagram: auth service, timer service, upload service, report service; API gateway; Rest call; the link from the gateway to the services is marked with "?"]
Service registry
▪ Solution: use a service registry
– Components should be registered in the service registry automatically
[Diagram: auth service, timer service, upload service, report service; Rest call; service registry; API gateway]
Service registry
▪ Goal: store the location and state of the available services
– Health check
– DNS interface
▪ Implementation examples:
– Spring Cloud: Eureka
– Netflix Eureka based implementation
– Consul.io
– etcd, ZooKeeper
– Simple workaround: DNS or hosts file
▪ Usage in the Hadoop ecosystem
– Most of the components need information about the location of the nameserver(s) and other master components
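What "store the location and state of the available services" means can be shown with a minimal in-memory sketch in plain Java. It is not the API of Eureka, Consul, or any other listed tool; the class `ServiceRegistry`, the service names, and the addresses are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// In-memory sketch of a service registry: it stores the location and health
// state of service instances. All names and addresses are hypothetical.
final class ServiceRegistry {
    private static final class Instance {
        final String address;
        volatile boolean healthy = true;

        Instance(String address) {
            this.address = address;
        }
    }

    private final Map<String, List<Instance>> services = new ConcurrentHashMap<>();

    // Called by a component (or an agent on its behalf) when it starts.
    void register(String serviceName, String address) {
        services.computeIfAbsent(serviceName, name -> new CopyOnWriteArrayList<>())
                .add(new Instance(address));
    }

    // Records the result of a health check for one instance.
    void reportHealth(String serviceName, String address, boolean healthy) {
        for (Instance instance : services.getOrDefault(serviceName, List.of())) {
            if (instance.address.equals(address)) {
                instance.healthy = healthy;
            }
        }
    }

    // Returns the addresses of all healthy instances, in registration order.
    List<String> lookup(String serviceName) {
        List<String> result = new ArrayList<>();
        for (Instance instance : services.getOrDefault(serviceName, List.of())) {
            if (instance.healthy) {
                result.add(instance.address);
            }
        }
        return result;
    }
}
```

A client that needs a master component would call `lookup` with the service name instead of relying on a hard-coded host, and instances that fail their health check drop out of the result.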
Configuration server
▪ Problem: how can we configure multiple components?
– “Store config in the environment” (12factor)
[Diagram: auth service, timer service, upload service, report service; Rest call; service registry; API gateway; the config of each service is marked with "?"]

Configuration server
▪ Problem: how can we configure multiple components?
[Diagram: auth service, timer service, upload service, report service; Rest call; service registry; API gateway; configuration; config server]
Config server
▪ Goal: one common place for all of the configuration
– Versioning
– Auditing
– Multiple environment support: use (almost) the same configuration from the DEV to the PROD environment
– A solution for sensitive data
▪ Solution examples:
– Spring Cloud Config server
– ZooKeeper
– Most service registries have a key-value store (Consul, etcd)
– Any persistent datastore (but versioning is an open question)
▪ For the Hadoop ecosystem:
– Most painful point: the same configuration elements (e.g. core-site.xml) are needed at multiple locations
– Ambari and other management tools try to solve the problem (but not with a focus on rapid prototyping)
Config server – configuration management
▪ Config server structure: [branch]/name-profile.extension
▪ Properties are merged for name=timer and profile (environment)=dev
▪ URL from the config server
– http://config:8888/timer-dev.properties
• server.port=6767
• aws.secret.key=zzz
• exit.code=-23
▪ Local file system structure (master branch)
– timer.properties
• server.port=6767
– dev.properties
• aws.secret.key=xxx
– application.properties
• exit.code=-23
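The merge step can be sketched in plain Java with `java.util.Properties`: several property sets are combined, and a key from a later (more specific) set overrides the same key from an earlier one. The precedence order, the class `ConfigMerge`, and the default `server.port=8080` are assumptions made for this sketch, not details confirmed by the slides.

```java
import java.util.Properties;

// Sketch of merging layered configuration: later sources override earlier ones.
// The precedence order and the 8080 default are assumptions, not from the slides.
final class ConfigMerge {
    private ConfigMerge() {}

    // Merges property sets in order; a key in a later set wins.
    static Properties merge(Properties... sources) {
        Properties merged = new Properties();
        for (Properties source : sources) {
            merged.putAll(source);
        }
        return merged;
    }

    // Small helper to build a Properties object from key/value pairs.
    static Properties of(String... keyValues) {
        Properties properties = new Properties();
        for (int i = 0; i + 1 < keyValues.length; i += 2) {
            properties.setProperty(keyValues[i], keyValues[i + 1]);
        }
        return properties;
    }

    public static void main(String[] args) {
        // Shared defaults first, then the service-specific file.
        Properties application = of("exit.code", "-23", "server.port", "8080");
        Properties timer = of("server.port", "6767");
        Properties merged = merge(application, timer);
        System.out.println("server.port=" + merged.getProperty("server.port"));
        System.out.println("exit.code=" + merged.getProperty("exit.code"));
    }
}
```

In this sketch the service-specific value `server.port=6767` replaces the shared default, while `exit.code=-23` is inherited unchanged from the shared file.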
Summary
▪ Tools used in microservice architectures
▪ Key components:
– Config server
– Service registry
– API gateway
▪ Configuration server
– Versioning
– One common place to distribute configuration
– Configuration preprocessing!
• transformation
• the content of the configuration should be defined; the format can be independent of it
• but the final configuration should be visible

Docker-based Hadoop cluster
Contents of apache-hadoop-X.X.tar.gz:
▪ bin
– hdfs
– yarn
– mapred
▪ etc/hadoop
– core-site.xml
– mapred-site.xml
– hdfs-site.xml
▪ include
▪ lib
▪ libexec
▪ sbin
▪ share
Microservice architecture elements:
1. Configuration server
2. Service registry
3. API gateway
How to do it with Hadoop?
Microservice architecture elements:
1. Configuration server
2. Service registry
3. API gateway
4. +1: Packaging
Do it with Hadoop
Packaging: Docker
▪ Packaging: Docker
– Docker Engine: a portable, lightweight runtime and packaging tool
– Docker Hub: a cloud service for sharing applications
– Docker Compose: predefined recipes (environment variables, network, …)
▪ My Docker containers: http://hub.docker.com/elek/

Docker decisions
▪ One application per container
– More flexible
– Simpler (configuration preprocessing + start)
– One deployable unit
▪ Microservice-like: prefer more, similar units over fewer but bigger ones
▪ Using the host network for clusters
[Diagram: bridge network, where containers have their own 172.13.0.x addresses on the hosts 10.8.0.5 and 10.8.0.6, compared with host network, where the containers use the host addresses 10.8.0.5 and 10.8.0.6 directly]
Repositories
▪ elek/bigdata-docker:
– example configuration
– docker-compose files
– Ansible scripts
– getting started
entrypoint
▪ elek/docker-bigdata-base (base image for all the containers)
– Contains all the configuration loading (and some documentation)
– Use the CONFIG_TYPE environment variable to select the configuration method
• CONFIG_TYPE=simple (configuration from environment variables, for a local environment)
• CONFIG_TYPE=consul (configuration from Consul, for a distributed environment)
▪ elek/docker-…. (hadoop/spark/hive/...)
– Docker images for the components
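One way configuration from environment variables could be turned into a Hadoop configuration file is sketched below in plain Java. The `<configuration>`/`<property>` layout is the standard Hadoop XML format, but the `CORE-SITE.XML_` variable prefix, the class `EnvToHadoopXml`, and the example values are hypothetical; the slides do not say how the base image actually maps variables to files.

```java
import java.util.Map;
import java.util.TreeMap;

// Sketch of rendering a Hadoop-style XML configuration from environment
// variables. The "CORE-SITE.XML_" prefix convention is a hypothetical example.
final class EnvToHadoopXml {
    private EnvToHadoopXml() {}

    // Collects the variables that start with the prefix and strips the prefix.
    static Map<String, String> select(Map<String, String> env, String prefix) {
        Map<String, String> selected = new TreeMap<>();
        for (Map.Entry<String, String> entry : env.entrySet()) {
            if (entry.getKey().startsWith(prefix)) {
                selected.put(entry.getKey().substring(prefix.length()), entry.getValue());
            }
        }
        return selected;
    }

    // Escapes the characters that are special in XML text content.
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Renders <configuration><property>...</property></configuration>.
    static String render(Map<String, String> properties) {
        StringBuilder xml = new StringBuilder("<configuration>\n");
        for (Map.Entry<String, String> entry : properties.entrySet()) {
            xml.append("  <property><name>").append(escape(entry.getKey()))
               .append("</name><value>").append(escape(entry.getValue()))
               .append("</value></property>\n");
        }
        return xml.append("</configuration>\n").toString();
    }
}
```

A container entrypoint could call `render(select(System.getenv(), "CORE-SITE.XML_"))` and write the result to etc/hadoop/core-site.xml before starting the service, which keeps the image itself free of environment-specific files.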
Local demo
▪ Local run, using host network
– More configuration is needed
– Auto scaling is supported
– https://github.com/elek/bigdata-docker/tree/master/compose
[Diagram: bridge network with containers at 172.13.0.1, 172.13.0.5, and 172.13.0.2]
Components:
1. Packaging
2. Configuration server
3. Service registry
4. API gateway
Do it with Hadoop

Service registry/configuration server
▪ Service registry
– Health check support
– DNS support
▪ Key-value store
– Binary data is supported
▪ Based on agents and servers
▪ Easy-to-use REST API
▪ Raft-based consensus protocol
Service registry/configuration server
▪ Git2Consul
– Mirrors git repositories to Consul
▪ Consul Template
– Advanced template engine
– Renders a template (configuration file) based on the information from Consul
– Runs/restarts a process on change
▪ Registrator
– Listens on the Docker event stream
– Registers new components in Consul
[Diagram: Configuration (git), git2consul, Consul, consul-template, hdfs-namenode, hdfs-datanode (three datanodes), Registrator, docker event stream]
Weave Scope
▪ Agents to monitor
– network connections between components
– CPU
– memory
▪ Supports Docker, Swarm, Weave network, …
▪ Easy to install
▪ Transparent
▪ Pluggable
▪ Only problem:
– Temporary Docker containers
Distributed demo
▪ Distributed run with host network
– https://github.com/elek/bigdata-docker/tree/master/consul
– Configuration is hosted in a Consul instance
– Dynamic update
[Diagram: four containers sharing the host address 10.8.0.5]

TODO
 More profiles and configuration sets
– Ready to use kerberos/HA environments
– On the fly keytab/keystore generation (?)
 Scripting/tool improvement
– Autorestart in case of service registration change
 Configuration for more orchestration/scheduling
– Nomad?
– Docker Swarm?
 Easy image creation for specific builds
 Improve docker images
– Predefined volume/port definition
– Consolidate default values
Thank You

@Call @Girls Bandra phone 9920874524 You Are Serach A Beautyfull Dolle come here@Call @Girls Bandra phone 9920874524 You Are Serach A Beautyfull Dolle come here
@Call @Girls Bandra phone 9920874524 You Are Serach A Beautyfull Dolle come here
 
❻❸❼⓿❽❻❷⓿⓿❼ SATTA MATKA DPBOSS KALYAN MATKA RESULTS KALYAN CHART KALYAN MATKA ...
❻❸❼⓿❽❻❷⓿⓿❼ SATTA MATKA DPBOSS KALYAN MATKA RESULTS KALYAN CHART KALYAN MATKA ...❻❸❼⓿❽❻❷⓿⓿❼ SATTA MATKA DPBOSS KALYAN MATKA RESULTS KALYAN CHART KALYAN MATKA ...
❻❸❼⓿❽❻❷⓿⓿❼ SATTA MATKA DPBOSS KALYAN MATKA RESULTS KALYAN CHART KALYAN MATKA ...
 
Cómo hemos implementado semántica de "Exactly Once" en nuestra base de datos ...
Cómo hemos implementado semántica de "Exactly Once" en nuestra base de datos ...Cómo hemos implementado semántica de "Exactly Once" en nuestra base de datos ...
Cómo hemos implementado semántica de "Exactly Once" en nuestra base de datos ...
 
❻❸❼⓿❽❻❷⓿⓿❼ SATTA MATKA DPBOSS KALYAN MATKA RESULTS KALYAN CHART KALYAN MATKA ...
❻❸❼⓿❽❻❷⓿⓿❼ SATTA MATKA DPBOSS KALYAN MATKA RESULTS KALYAN CHART KALYAN MATKA ...❻❸❼⓿❽❻❷⓿⓿❼ SATTA MATKA DPBOSS KALYAN MATKA RESULTS KALYAN CHART KALYAN MATKA ...
❻❸❼⓿❽❻❷⓿⓿❼ SATTA MATKA DPBOSS KALYAN MATKA RESULTS KALYAN CHART KALYAN MATKA ...
 
Maruti Wagon R on road price in Faridabad - CarDekho
Maruti Wagon R on road price in Faridabad - CarDekhoMaruti Wagon R on road price in Faridabad - CarDekho
Maruti Wagon R on road price in Faridabad - CarDekho
 
Hiranandani Gardens @Call @Girls Whatsapp 9833363713 With High Profile Offer
Hiranandani Gardens @Call @Girls Whatsapp 9833363713 With High Profile OfferHiranandani Gardens @Call @Girls Whatsapp 9833363713 With High Profile Offer
Hiranandani Gardens @Call @Girls Whatsapp 9833363713 With High Profile Offer
 
Bangalore @Call @Girls 0000000000 Riya Khan Beautiful And Cute Girl any Time
Bangalore @Call @Girls 0000000000 Riya Khan Beautiful And Cute Girl any TimeBangalore @Call @Girls 0000000000 Riya Khan Beautiful And Cute Girl any Time
Bangalore @Call @Girls 0000000000 Riya Khan Beautiful And Cute Girl any Time
 
@Call @Girls Kolkata 0000000000 Shivani Beautiful Girl any Time
@Call @Girls Kolkata 0000000000 Shivani Beautiful Girl any Time@Call @Girls Kolkata 0000000000 Shivani Beautiful Girl any Time
@Call @Girls Kolkata 0000000000 Shivani Beautiful Girl any Time
 
❻❸❼⓿❽❻❷⓿⓿❼ SATTA MATKA DPBOSS KALYAN FAST RESULTS CHART KALYAN MATKA MATKA RE...
❻❸❼⓿❽❻❷⓿⓿❼ SATTA MATKA DPBOSS KALYAN FAST RESULTS CHART KALYAN MATKA MATKA RE...❻❸❼⓿❽❻❷⓿⓿❼ SATTA MATKA DPBOSS KALYAN FAST RESULTS CHART KALYAN MATKA MATKA RE...
❻❸❼⓿❽❻❷⓿⓿❼ SATTA MATKA DPBOSS KALYAN FAST RESULTS CHART KALYAN MATKA MATKA RE...
 
❻❸❼⓿❽❻❷⓿⓿❼ SATTA MATKA DPBOSS KALYAN FAST RESULTS CHART KALYAN MATKA MATKA RE...
❻❸❼⓿❽❻❷⓿⓿❼ SATTA MATKA DPBOSS KALYAN FAST RESULTS CHART KALYAN MATKA MATKA RE...❻❸❼⓿❽❻❷⓿⓿❼ SATTA MATKA DPBOSS KALYAN FAST RESULTS CHART KALYAN MATKA MATKA RE...
❻❸❼⓿❽❻❷⓿⓿❼ SATTA MATKA DPBOSS KALYAN FAST RESULTS CHART KALYAN MATKA MATKA RE...
 
BIGPPTTTTTTTTtttttttttttttttttttttt.pptx
BIGPPTTTTTTTTtttttttttttttttttttttt.pptxBIGPPTTTTTTTTtttttttttttttttttttttt.pptx
BIGPPTTTTTTTTtttttttttttttttttttttt.pptx
 
[D3T1S03] Amazon DynamoDB design puzzlers
[D3T1S03] Amazon DynamoDB design puzzlers[D3T1S03] Amazon DynamoDB design puzzlers
[D3T1S03] Amazon DynamoDB design puzzlers
 
@Call @Girls Mira Bhayandar phone 9920874524 You Are Serach A Beautyfull Doll...
@Call @Girls Mira Bhayandar phone 9920874524 You Are Serach A Beautyfull Doll...@Call @Girls Mira Bhayandar phone 9920874524 You Are Serach A Beautyfull Doll...
@Call @Girls Mira Bhayandar phone 9920874524 You Are Serach A Beautyfull Doll...
 
@Call @Girls Saharanpur 0000000000 Priya Sharma Beautiful And Cute Girl any Time
@Call @Girls Saharanpur 0000000000 Priya Sharma Beautiful And Cute Girl any Time@Call @Girls Saharanpur 0000000000 Priya Sharma Beautiful And Cute Girl any Time
@Call @Girls Saharanpur 0000000000 Priya Sharma Beautiful And Cute Girl any Time
 
iot paper presentation FINAL EDIT by kiran.pptx
iot paper presentation FINAL EDIT by kiran.pptxiot paper presentation FINAL EDIT by kiran.pptx
iot paper presentation FINAL EDIT by kiran.pptx
 

Micro services vs hadoop

  • 2. Microservice definition
      © Hortonworks Inc. 2011 – 2017. All Rights Reserved
      “An approach to developing a single application as a suite of small services, each running in its own process and communicating with lightweight mechanisms, often an HTTP resource API. These services are built around business capabilities and independently deployable by fully automated deployment machinery.”
        – https://martinfowler.com/articles/microservices.html
  • 3. Hadoop cluster
      The definition is almost true for a Hadoop cluster as well
  • 4. Dockerized Hadoop cluster
      How can we use the tools from microservice architecture in the Hadoop ecosystem?
      A possible approach to install a cluster (hadoop, spark, kafka, hive) based on
        – separated docker containers
        – Smart configuration management (using well-known tooling from microservices architectures)
      Goal: rapid prototyping platform
      Easy switch between
        – versions (official HDP, snapshot build, apache build)
        – configuration (ha, kerberos, metrics, htrace…)
      Developers/Ops tool
        – Easy != easy for any user without knowledge about the tool
      Not goal:
        – replace current management platforms (e.g. Ambari)
  • 5. What are the Microservices (Theory)
      Collection of patterns/best practices: 12 Factor apps (http://12factor.net)
        – II. Dependencies: Explicitly declare and isolate dependencies
        – III. Config: Store config in the environment
        – VI. Processes: Execute the app as one or more stateless processes
        – VIII. Concurrency: Scale out via the process model
        – XII. Admin processes: Run admin/management tasks as one-off processes
  • 6. What are the Microservices (Practice)
      Spring started as a
        – Dependency injection framework
      Spring Boot ecosystem
        – Easy to use starter projects
        – Lego bricks for various problems
            • JDBC access
            • Database access
            • REST
            • Health check
      Spring Cloud: elements to build microservices (based on Netflix stack)
        – API gateway
        – Service registry
        – Configuration server
        – Distributed tracing
        – Client side load balancing
      Example:
        public class TimeStarter {
            // Injected by Spring's dependency injection
            @Autowired
            TimeService timerService;

            public Date now() {
                return timerService.now();
            }
        }
  • 7. Microservices with Spring Cloud
  • 8. Monolith application
      Monolith but modular application example
      [Diagram: auth service, timer service, upload service, report service; REST call]
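  • 9. Monolith application
      Monolith but modular application example
      [Diagram: auth service, timer service, upload service, report service; REST call]
      Example:
        import java.util.Date;
        import org.springframework.beans.factory.annotation.Autowired;
        import org.springframework.boot.SpringApplication;
        import org.springframework.boot.autoconfigure.EnableAutoConfiguration;
        import org.springframework.context.annotation.ComponentScan;
        import org.springframework.web.bind.annotation.RequestMapping;
        import org.springframework.web.bind.annotation.RestController;

        @EnableAutoConfiguration
        @RestController
        @ComponentScan
        public class TimeStarter {
            @Autowired
            TimeService timerService;

            // Exposes the current time at /now
            @RequestMapping("/now")
            public Date now() {
                return timerService.now();
            }

            public static void main(String[] args) {
                SpringApplication.run(TimeStarter.class, args);
            }
        }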
  • 10. Microservice version
      First problem: how can we find the right backend port from the frontend?
      [Diagram: auth service, timer service, upload service, report service; one REST call to each service]
  • 11. Solution: API Gateway
      First problem: how can we find the right backend port from the frontend?
      [Diagram: REST call to the API gateway, which sits in front of auth service, timer service, upload service, report service]
  • 12. API Gateway
      Goals: Hide available microservices behind a service facade pattern
        – Routing, Authorization
        – Deployment handling, Canary testing, Blue/Green deployment
        – Logging, SLA, Auditing
      Implementation examples:
        – Spring Cloud API gateway (based on Netflix Zuul)
        – Netflix Zuul based implementation
        – Twitter Finagle based implementation
        – Amazon API gateway
        – Simple Nginx reverse proxy configuration
        – Traefik, Kong
      Usage in Hadoop ecosystem
        – For prototyping: Only if the scheduler/orchestrator starts the service on a random host
        – For security: Apache Knox
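The routing goal above can be illustrated with a minimal sketch of a gateway routing table. This is not how Zuul, Knox, or any of the listed gateways are implemented; the class name, path prefixes, and backend addresses are all hypothetical and chosen only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical routing table: maps a path prefix to a backend base URL.
class GatewayRoutes {
    private final Map<String, String> routes = new LinkedHashMap<>();

    void addRoute(String pathPrefix, String backend) {
        routes.put(pathPrefix, backend);
    }

    // Returns the backend URL for a request path, or empty if no prefix matches.
    // A prefix only matches on a path-segment boundary ("/timer" does not match "/timers").
    Optional<String> resolve(String requestPath) {
        for (Map.Entry<String, String> route : routes.entrySet()) {
            String prefix = route.getKey();
            if (requestPath.equals(prefix) || requestPath.startsWith(prefix + "/")) {
                return Optional.of(route.getValue() + requestPath.substring(prefix.length()));
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        GatewayRoutes gateway = new GatewayRoutes();
        // Assumed ports; the frontend only needs to know the gateway address.
        gateway.addRoute("/timer", "http://localhost:6767");
        gateway.addRoute("/auth", "http://localhost:6768");
        System.out.println(gateway.resolve("/timer/now"));
        System.out.println(gateway.resolve("/unknown"));
    }
}
```

With a table like this the frontend talks to a single address, and the backend ports can change without touching the frontend.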
  • 13. Service registry
      Problem: how to configure the API gateway to automatically route to all the services?
      [Diagram: REST call to the API gateway; open question how it reaches auth service, timer service, upload service, report service]
  • 14. Service registry
      Solution: Use service registry
        – Components should be registered to the service registry automatically
      [Diagram: REST call to the API gateway; auth service, timer service, upload service, report service register in the service registry]
  • 15. Service registry
      Goal: Store the location and state of the available services
        – Health check
        – DNS interface
      Implementation examples:
        – Spring Cloud: Eureka
        – Netflix Eureka based implementation
        – Consul.io
        – etcd, zookeeper
        – Simple workaround: DNS or hosts file
      Usage in Hadoop ecosystem
        – Most of the components need info about the location of nameserver(s) and other master components
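As a rough illustration of "store the location and state of the available services", the following in-memory sketch keeps a list of instances per service name and only returns the healthy ones. It does not model Eureka, Consul, or etcd; the class, the service names, and the addresses and ports are assumptions made up for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical in-memory registry: service name -> registered instances.
class ServiceRegistry {
    // One registered instance: its location and the result of its last health check.
    static final class Instance {
        final String host;
        final int port;
        volatile boolean healthy = true;

        Instance(String host, int port) {
            this.host = host;
            this.port = port;
        }
    }

    private final Map<String, List<Instance>> services = new ConcurrentHashMap<>();

    Instance register(String name, String host, int port) {
        Instance instance = new Instance(host, port);
        services.computeIfAbsent(name, k -> new CopyOnWriteArrayList<Instance>()).add(instance);
        return instance;
    }

    // Returns "host:port" for every instance whose last health check passed.
    List<String> lookup(String name) {
        List<String> addresses = new ArrayList<>();
        for (Instance instance : services.getOrDefault(name, Collections.<Instance>emptyList())) {
            if (instance.healthy) {
                addresses.add(instance.host + ":" + instance.port);
            }
        }
        return addresses;
    }

    public static void main(String[] args) {
        ServiceRegistry registry = new ServiceRegistry();
        // Service names and ports are illustrative assumptions.
        Instance namenode = registry.register("hdfs-namenode", "10.8.0.5", 8020);
        registry.register("hdfs-datanode", "10.8.0.6", 9866);
        System.out.println(registry.lookup("hdfs-namenode"));
        namenode.healthy = false; // simulate a failed health check
        System.out.println(registry.lookup("hdfs-namenode"));
    }
}
```

A real registry would run the health checks itself and usually also exposes the same data through DNS, as the slide notes.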
  • 16. Configuration server
      Problem: how can we configure multiple components?
        – “Store config in the environment” (12factor)
      [Diagram: REST call, API gateway, service registry; auth service, timer service, upload service, report service, each with an open "Config?" question]
  • 17. Configuration server
      Problem: how can we configure multiple components?
      [Diagram: REST call, API gateway, service registry; auth service, timer service, upload service, report service receive their configuration from the config server]
  • 18. Config server
      Goals: One common place for all of the configuration
        – Versioning
        – Auditing
        – Multiple environment support: Use (almost) the same configuration from DEV to PROD environment
        – Solution for sensitive data
      Solution examples:
        – Spring Cloud config service
        – Zookeeper
        – Most service registries have a key->value store (Consul, etcd)
        – Any persistent datastore (but the versioning is a question)
      For Hadoop ecosystem:
        – Most painful point: the same configuration elements (e.g. core-site.xml) are needed at multiple locations
        – Ambari and other management tools try to solve the problem (but not with the focus of rapid prototyping)
  • 19. Config server – configuration management
      Config server structure: [branch]/name-profile.extension
      Merge properties for name=timer and profile(environment)=dev
      URL from the config server
        – http://config:8888/timer-dev.properties
            • server.port=6767
            • aws.secret.key=zzz
            • exit.code=-23
      Local file system structure (master branch)
        – timer.properties
            • server.port=6767
        – dev.properties
            • aws.secret.key=xxx
        – application.properties
            • exit.code=-23
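The merge step on this slide can be sketched with plain `java.util.Properties`. This is only an illustration of layering several property sources, not the Spring Cloud Config implementation; the precedence order used here (application, then name, then profile, with later sources winning) is an assumption. The sketch uses the file contents from the slide as-is, so the merged secret is the `xxx` value from dev.properties; whatever produces the different value shown in the URL output is not modeled.

```java
import java.util.Properties;

// Hypothetical helper that layers property sources; later sources override earlier ones.
class ConfigMerge {
    static Properties merge(Properties... sources) {
        Properties merged = new Properties();
        for (Properties source : sources) {
            merged.putAll(source);
        }
        return merged;
    }

    // Builds a Properties object from alternating key/value arguments.
    static Properties props(String... keyValues) {
        Properties p = new Properties();
        for (int i = 0; i + 1 < keyValues.length; i += 2) {
            p.setProperty(keyValues[i], keyValues[i + 1]);
        }
        return p;
    }

    public static void main(String[] args) {
        Properties application = props("exit.code", "-23");   // application.properties
        Properties timer = props("server.port", "6767");      // timer.properties
        Properties dev = props("aws.secret.key", "xxx");      // dev.properties
        Properties merged = merge(application, timer, dev);
        merged.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```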
  • 20. Summary
      Tools used in microservice architecture
      Key components:
        – Config server
        – Service registry
        – API gateway
      Configuration server
        – Versioning
        – One common place to distribute configuration
        – Configuration preprocessing!
            • transformation
            • the content of the configuration should be defined; it could be format independent
            • but the final configuration should be visible
  • 21. Docker based Hadoop cluster
  • 22. How to do it with Hadoop?
      Contents of apache-hadoop-X.X.tar.gz:
        – bin
            • hdfs
            • yarn
            • mapred
        – etc/hadoop
            • core-site.xml
            • mapred-site.xml
            • hdfs-site.xml
        – include
        – lib
        – libexec
        – sbin
        – share
      Microservice architecture elements:
        1. Configuration server
        2. Service registry
        3. API gateway
  • 23. Do it with Hadoop
      Contents of apache-hadoop-X.X.tar.gz:
        – bin
            • hdfs
            • yarn
            • mapred
        – etc/hadoop
            • core-site.xml
            • mapred-site.xml
            • hdfs-site.xml
        – include
        – lib
        – libexec
        – sbin
        – share
      Microservice architecture elements:
        1. Configuration server
        2. Service registry
        3. API gateway
        4. +1 Packaging
  • 24. Packaging: Docker
      Packaging: Docker
        – Docker Engine:
            • a portable,
            • lightweight runtime and
            • packaging tool
        – Docker Hub:
            • a cloud service for sharing applications
        – Docker Compose:
            • Predefined recipes (environment variables, network, …)
      My docker containers: http://hub.docker.com/elek/
  • 25. Docker decisions
      One application per container
        – More flexible
        – Simpler (configuration preprocessing + start)
        – One deployable unit
      Microservice-like: prefer a larger number of similar units over fewer, bigger ones
      Using host network for clusters
      [Figure: bridge network (container addresses 172.13.0.x on hosts 10.8.0.5 and 10.8.0.6) vs. host network (containers use the host addresses 10.8.0.5 and 10.8.0.6)]
  • 26. Repositories
      elek/bigdata-docker:
        – example configuration
        – docker-compose files
        – ansible scripts
        – getting started entrypoint
      elek/docker-bigdata-base (base image for all the containers)
        – Contains all the configuration loading (and some documentation)
        – Use CONFIG_TYPE environment variable to select configuration method
            • CONFIG_TYPE=simple (configuration from environment variables, for local env)
            • CONFIG_TYPE=consul (configuration from consul, for distributed environment)
      elek/docker-…. (hadoop/spark/hive/...)
        – Docker images for the components
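To make the CONFIG_TYPE switch concrete, here is a small Java sketch of selecting a configuration method from an environment variable. It is not the code in elek/docker-bigdata-base; the class and enum names are hypothetical, and falling back to `simple` when the variable is unset is an assumption of this sketch.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical selector for the configuration method named by CONFIG_TYPE.
class ConfigTypeSelector {
    enum ConfigType { SIMPLE, CONSUL }

    // Reads CONFIG_TYPE from the given environment map.
    // Assumption: an unset variable means "simple"; an unknown value throws IllegalArgumentException.
    static ConfigType fromEnvironment(Map<String, String> env) {
        String raw = env.getOrDefault("CONFIG_TYPE", "simple");
        return ConfigType.valueOf(raw.trim().toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        ConfigType type = fromEnvironment(System.getenv());
        switch (type) {
            case SIMPLE:
                System.out.println("Reading configuration from environment variables");
                break;
            case CONSUL:
                System.out.println("Reading configuration from consul");
                break;
        }
    }
}
```

Passing the environment in as a map keeps the selection logic testable without changing the process environment.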
  • 27. Local demo
      Local run, using host network
        – More configuration is needed
        – Auto scaling is supported
        – https://github.com/elek/bigdata-docker/tree/master/compose
      [Figure: bridge network with addresses 172.13.0.1, 172.13.0.5, 172.13.0.2]
  • 28. Do it with Hadoop
      Contents of apache-hadoop-X.X.tar.gz:
        – bin
            • hdfs
            • yarn
            • mapred
        – etc/hadoop
            • core-site.xml
            • mapred-site.xml
            • hdfs-site.xml
        – include
        – lib
        – libexec
        – sbin
        – share
      Components:
        1. Packaging
        2. Configuration server
        3. Service registry
        4. API gateway
  • 29. Service registry/configuration server
      Service registry
        – Health check support
        – DNS support
      Key-value store
        – Binary data is supported
      Based on agents and servers
      Easy to use REST API
      RAFT based consensus protocol
  • 30. Service registry/configuration server
      Git2Consul
        – Mirror git repositories to consul
      Consul template
        – Advanced template engine
        – Renders a template (configuration file) based on the information from consul
        – Run/restart a process on change
      Registrator
        – Listen on docker event stream
        – Register new components to consul
      [Diagram: configuration (git) is mirrored to Consul by git2consul; consul-template runs in the hdfs-namenode and hdfs-datanode containers; Registrator feeds the docker event stream of the datanodes into Consul]
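The idea of rendering a configuration file from key-value data can be sketched in a few lines of Java. This does not use consul-template or its real template syntax; the `{{ key }}` placeholder form, the class name, and the example key and value are assumptions chosen for illustration, and the key-value map stands in for data that would come from consul.

```java
import java.util.Collections;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical renderer: replaces {{ key }} placeholders with values from a key-value map.
class TemplateRenderer {
    // Simplified placeholder syntax, not the syntax of consul-template.
    private static final Pattern PLACEHOLDER =
            Pattern.compile("\\{\\{\\s*([A-Za-z0-9_./-]+)\\s*\\}\\}");

    static String render(String template, Map<String, String> keyValues) {
        Matcher matcher = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String key = matcher.group(1);
            String value = keyValues.get(key);
            if (value == null) {
                throw new IllegalArgumentException("missing key: " + key);
            }
            // quoteReplacement keeps '$' and '\' in values from being treated as group references
            matcher.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String template = "<property><name>fs.defaultFS</name><value>{{ fs.defaultFS }}</value></property>";
        // Assumed value; in the setup described above it would be read from the key-value store.
        Map<String, String> keyValues = Collections.singletonMap("fs.defaultFS", "hdfs://namenode:9000");
        System.out.println(render(template, keyValues));
    }
}
```

Re-running the render whenever a key changes, and restarting the process afterwards, is the part the slide attributes to consul-template.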
  • 31. Weave scope
      Agents to monitor
        – network connections between components
        – cpu
        – memory
      Supports Docker, Swarm, Weave network, …
      Easy install
      Transparent
      Pluggable
      Only problems:
        – Temporary docker containers
  • 32. Distributed demo
      Distributed run with host network
        – https://github.com/elek/bigdata-docker/tree/master/consul
        – Configuration is hosted in a consul instance
        – Dynamic update
      [Figure: host network, containers sharing the host address 10.8.0.5]
  • 33. TODO
      More profiles and configuration sets
        – Ready to use kerberos/HA environments
        – On the fly keytab/keystore generation (?)
      Scripting/tool improvement
        – Autorestart in case of service registration change
      Configuration for more orchestration/scheduling
        – Nomad?
        – Docker Swarm?
      Easy image creation for specific builds
      Improve docker images
        – Predefined volume/port definition
        – Consolidate default values
  • 34. Thank You