Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
SlideShare a Scribd company logo
Spark day 2017 - Spark on Kubernetes
• Senior Software Engineer of SK Telecom
• Commercial Products
• Big Data Discovery Solution (~’17)
• Hadoop DW (~’15)
• PaaS(CloudFoundry) (~’13)
• Iaas (OpenStack) (~’13)
• Mail to : jerryjung@apache.org
2
Kubernetes
Spark deployment using Kubernetes
Spark on Kubernetes
Demo
3
Open Source
Automation
Framework for
deploying,
managing, and
scaling
applications.
4

Recommended for you

Cassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart LabsCassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart Labs

This document discusses using Docker containers to run Cassandra clusters at Walmart. It proposes transforming existing Cassandra hardware into containers to better utilize unused compute. It also suggests building new Cassandra clusters in containers and migrating old clusters to double capacity on existing hardware and save costs. Benchmark results show Docker containers outperforming virtual machines on OpenStack and Azure in terms of reads, writes, throughput and latency for an in-house application.

cassandraapache cassandrawalmart
Scylla Summit 2022: What’s New in ScyllaDB Operator for Kubernetes
Scylla Summit 2022: What’s New in ScyllaDB Operator for KubernetesScylla Summit 2022: What’s New in ScyllaDB Operator for Kubernetes
Scylla Summit 2022: What’s New in ScyllaDB Operator for Kubernetes

This document summarizes the Scylla Operator for Kubernetes, including its developers, features, releases, and roadmap. Key points include: - The Scylla Operator manages and automates tasks for Scylla clusters on Kubernetes. - Features include seedless mode, security enhancements, performance tuning, and improved stability. - It follows a rapid 6-week release cycle and supports the latest two releases. - Future plans include additional performance optimizations, persistent storage support, TLS encryption, and multi-datacenter capabilities.

big data databasedatabasenosql
Spark Pipelines in the Cloud with Alluxio with Gene Pang
Spark Pipelines in the Cloud with Alluxio with Gene PangSpark Pipelines in the Cloud with Alluxio with Gene Pang
Spark Pipelines in the Cloud with Alluxio with Gene Pang

Organizations commonly use Apache Spark to gain actionable insight from their large amounts of data. Often, these analytics are in the form of data processing pipelines, where there are a series of processing stages, and each stage performs a particular function, and the output of one stage is the input of the next stage. There are several examples of pipelines, such as log processing, IoT pipelines, and machine learning. The common attribute among different pipelines is the sharing of data between stages. It is also common for Spark pipelines to process data stored in the public cloud, such as Amazon S3, Microsoft Azure Blob Storage, or Google Cloud Storage. The global availability and cost effectiveness of these public cloud storage services make them the preferred storage for data. However, running pipeline jobs while sharing data via cloud storage can be expensive in terms of increased network traffic, and slower data sharing and job completion times. Using Alluxio, a memory speed virtual distributed storage system, enables sharing data between different stages or jobs at memory speed. By reading and writing data in Alluxio, the data can stay in memory for the next stage of the pipeline, and this result in great performance gains. In this talk, we discuss how Alluxio can be deployed and used with a Spark data processing pipeline in the cloud. We show how pipeline stages can share data with Alluxio memory for improved performance benefits, and how Alluxio can improves completion times and reduces performance variability for Spark pipelines in the cloud.

apache sparkspark summit
Kubernetes provides a common API and
self-healing framework which
automatically handles machine failures
and application deployments, logging,
and monitoring.
5
6
https://github.com/kubernetes/kubernetes/blob/master/docs/design/architecture.md
https://thenewstack.io/kubernetes-an-overview/
7
https://thenewstack.io/kubernetes-an-overview/
8

Recommended for you

Gocd – Kubernetes/Nomad Continuous Deployment
Gocd – Kubernetes/Nomad Continuous DeploymentGocd – Kubernetes/Nomad Continuous Deployment
Gocd – Kubernetes/Nomad Continuous Deployment

Automate and streamline your build-test-release cycle for reliable, continuous delivery of your product

continuous deliverytasksenvironment
Getting Started Running Apache Spark on Apache Mesos
Getting Started Running Apache Spark on Apache MesosGetting Started Running Apache Spark on Apache Mesos
Getting Started Running Apache Spark on Apache Mesos

This document provides an overview of Apache Mesos and how to run Apache Spark on a Mesos cluster. It describes Mesos as a distributed systems kernel that allows sharing compute resources across applications. It then gives step-by-step instructions for launching a Mesos cluster in AWS, configuring and running Spark jobs on the cluster, and where to find example Spark jobs and further Mesos resources.

mesospherecluster computinggoogle
Dev ops for big data cluster management tools
Dev ops for big data  cluster management toolsDev ops for big data  cluster management tools
Dev ops for big data cluster management tools

What are the tools that we can find to day to manage Hadoop cluster and its ecosystem? There are two tools ready today: Cloudera Manager and Ambari from Hortonworks. In this presentation I explain what they do and why to use them, as well as Pros. and Cons.

ambarihortonwrkscloudera
9
https://thenewstack.io/kubernetes-an-overview/
Clusters - set of compute, storage, network
resource
Pods - colocated group of application containers
that share volumes and a networking stack
Replication Controllers - ensure a specific number
of pods, manage pods, status updates
Services - cluster wide service discovery
10
Node #1 192.168.0.2
Pod #1
10.0.0.2
Node #5 192.168.0.6
Volume
Network
Pod #2
10.0.0.3
Volume
Network
Pod #8
10.0.0.9
Volume
Network
8080 8080 8080
11
Support for Event Stream Processing
Fast Data Queries in Real Time
Improved Programmer Productivity
Fast Batch Processing of Large Data Set
12

Recommended for you

A Journey through the JDKs (Java 9 to Java 11)
A Journey through the JDKs (Java 9 to Java 11)A Journey through the JDKs (Java 9 to Java 11)
A Journey through the JDKs (Java 9 to Java 11)

This presentation gives an overview of the most relevant updates and features going from Java 9 to Java 11 from a developer's perspective.

javajdkjpms
Homologous Apache Spark Clusters Using Nomad with Alex Dadgar
Homologous Apache Spark Clusters Using Nomad with Alex DadgarHomologous Apache Spark Clusters Using Nomad with Alex Dadgar
Homologous Apache Spark Clusters Using Nomad with Alex Dadgar

- Nomad is a cluster scheduler that makes deploying Spark clusters easy for developers and operationally simple. It allows Spark jobs to be deployed across multiple datacenters and regions. - Currently, Nomad allows running Spark in production environments without compromising functionality. It enables shared clusters for batch and streaming workloads with higher efficiency. It also integrates with Vault for secure secrets management. - Future enhancements may include preempting lower priority Spark executors, implementing quotas and chargebacks, enabling GPU acceleration, and allowing over-subscription of resources to improve cluster utilization. Nomad aims to make deploying and running Spark easier and more cost effective at scale.

spark summitapache spark
Real-Time Distributed and Reactive Systems with Apache Kafka and Apache Accumulo
Real-Time Distributed and Reactive Systems with Apache Kafka and Apache AccumuloReal-Time Distributed and Reactive Systems with Apache Kafka and Apache Accumulo
Real-Time Distributed and Reactive Systems with Apache Kafka and Apache Accumulo

In this talk we will walk through how Apache Kafka and Apache Accumulo can be used together to orchestrate a de-coupled, real-time distributed and reactive request/response system at massive scale. Multiple data pipelines can perform complex operations for each message in parallel at high volumes with low latencies. The final result will be inline with the initiating call. The architecture gains are immense. They allow for the requesting system to receive a response without the need for direct integration with the data pipeline(s) that messages must go through. By utilizing Apache Kafka and Apache Accumulo, these gains sustain at scale and allow for complex operations of different messages to be applied to each response in real-time.

kafkastreamsmesos
Driver Process that contains the SparkContext
Executor Process that executes one or more Spark tasks
Master Process that manages applications across the cluster
Worker Process that manages executors on a particular node
13
http://spark.apache.org/docs/latest/cluster-overview.html
Driver Program
SparkContext
Cluster Manager
Worker Node
Executor
Worker Node
Executor
Worker Node
Executor
14
http://freecontent.manning.com/running-spark-an-overview-of-sparks-runtime-architecture/
15
cluster mode client mode
https://www.slideshare.net/grahaindia/new-features-of-kubernetes-v120-beta
DaemonSet
16

Recommended for you

K8S in prod
K8S in prodK8S in prod
K8S in prod

This document discusses Kubernetes usage at VMware SAAS. It covers dynamic provisioning of applications on Kubernetes, monitoring tools used like DataDog and Log Insight, and best practices for upgrading Kubernetes clusters. Key points include using stateless applications where possible, service discovery using Kubernetes services, dynamic provisioning using an onboarding service, and performing rolling upgrades for stateful applications to minimize downtime.

vmwarekubernetessaas
Deploy an Elastic, Resilient, Load-Balanced Cluster in 5 Minutes with Senlin
Deploy an Elastic, Resilient, Load-Balanced Cluster in 5 Minutes with SenlinDeploy an Elastic, Resilient, Load-Balanced Cluster in 5 Minutes with Senlin
Deploy an Elastic, Resilient, Load-Balanced Cluster in 5 Minutes with Senlin

This is a talk from the Austin OpenStack summit. It demonstrates how a resilient, elastic and load-balanced cluster can be deployed using senlin, heat, ceilometer, lbaas v2, nova.

haresiliencysenlin
Make 2016 your year of SMACK talk
Make 2016 your year of SMACK talkMake 2016 your year of SMACK talk
Make 2016 your year of SMACK talk

You’ve heard all of the hype, but how can SMACK work for you? In this all-star lineup, you will learn how to create a reactive, scaling, resilient and performant data processing powerhouse. Bringing Akka, Kafka and Mesos together provides a foundation to develop and operate an elastically scalable actor system. We will go through the basics of Akka, Kafka and Mesos and then deep dive into putting them together in an end2end (and back again) distrubuted transaction. Distributed transactions mean producers waiting for one or more of consumers to respond. We'll also go through automated ways to failure induce these systems (using LinkedIn Simoorg) and trace them from start to stop through each component (using Twitters Zipkin). Finally, you will see how Apache Cassandra and Spark can be combined to add the incredibly scaling storage and data analysis needed in fast data pipelines. With these technologies as a foundation, you have the assurance that scale is never a problem and uptime is default.

akkasparknosql
StatefulSets
http://blog.kubernetes.io/2017/01/running-mongodb-on-kubernetes-with-statefulsets.html
17
$(statefulset name)-$(ordinal)
https://github.com/Comcast/kube-yarn
18
Node #1 …. #n
HDFS
DN
HDFS
NN
HDFS
DN……………
YARN
NM
YARN
RM
YARN
NM……………
zeppelin
pod
spark submit
19
2020

Recommended for you

Suning OpenStack Cloud and Heat
Suning OpenStack Cloud and HeatSuning OpenStack Cloud and Heat
Suning OpenStack Cloud and Heat

An experience sharing of the OpenStack deployment at Suning.com, a large online retailer in China. The talk presents the challenges and opportunities on orchestrating the enterprise workloads using Heat.

orchestrationheatcloud
Handling Redis failover with ZooKeeper
Handling Redis failover with ZooKeeperHandling Redis failover with ZooKeeper
Handling Redis failover with ZooKeeper

This document discusses using ZooKeeper to automatically handle Redis failover. ZooKeeper is an open-source tool that provides primitives for building distributed applications and handles tasks like leader election and quorum management. The presenter describes how his redis_failover Ruby gem uses ZooKeeper to monitor Redis servers, detect failures, and automatically inform clients so they reconnect to the new master, preventing downtime during a failover. Several companies already use this approach with redis_failover to make their Redis infrastructure more robust and fault-tolerant.

zookeeperrubyredis
Hadoop on-mesos
Hadoop on-mesosHadoop on-mesos
Hadoop on-mesos

The document discusses Hadoop on Mesos, beginning with a short history of distributed computing. It describes how Mesos provides an operating system for clusters that allows applications like Hadoop to run as distributed frameworks. The document outlines challenges in running Hadoop on Mesos and how these were addressed, including using Mesos schedulers and reservations. It also presents a case study of how Airbnb migrated its Hadoop infrastructure from Amazon EMR to run on Mesos, improving availability, performance, and customer satisfaction.

Node #1 …. #n
HDFS
DN
HDFS
NN
HDFS
DN……………
YARN
NM
YARN
RM
YARN
NM……………
zeppelin
X
pod
spark submit
21
22
23
24

Recommended for you

April 2016 HUG: The latest of Apache Hadoop YARN and running your docker apps...
April 2016 HUG: The latest of Apache Hadoop YARN and running your docker apps...April 2016 HUG: The latest of Apache Hadoop YARN and running your docker apps...
April 2016 HUG: The latest of Apache Hadoop YARN and running your docker apps...

Apache Hadoop YARN is a modern resource-management platform that handles resource scheduling, isolation and multi-tenancy for a variety of data processing engines that can co-exist and share a single data-center in a cost-effective manner. In the first half of the talk, we are going to give a brief look into some of the big efforts cooking in the Apache Hadoop YARN community. We will then dig deeper into one of the efforts - supporting Docker runtime in YARN. Docker is an application container engine that enables developers and sysadmins to build, deploy and run containerized applications. In this half, we'll discuss container runtimes in YARN, with a focus on using the DockerContainerRuntime to run various docker applications under YARN. Support for container runtimes (including the docker container runtime) was recently added to the Linux Container Executor (YARN-3611 and its sub-tasks). We’ll walk through various aspects of running docker containers under YARN - resource isolation, some security aspects (for example container capabilities, privileged containers, user namespaces) and other work in progress features like image localization and support for different networking modes. Speakers: Vinod Kumar Vavilapalli is the Hadoop YARN and MapReduce guy at Hortonworks. He is a long term Hadoop contributor at Apache, Hadoop committer and a member of the Apache Hadoop PMC. He has a Bachelors degree from Indian Institute of Technology Roorkee in Computer Science and Engineering. He has been working on Hadoop for nearly 9 years and he still has fun doing it. Straight out of college, he joined the Hadoop team at Yahoo! Bangalore, before Hortonworks happened. He is passionate about using computers to change the world for better, bit by bit. Sidharta Seethana is a software engineer at Hortonworks. He works on the YARN team, focussing on bringing new kinds of workloads to YARN. Prior to joining Hortonworks, Sidharta spent 10 years at Yahoo! Inc., working on a variety of large scale distributed systems for core platforms/web services, search and marketplace properties, developer network and personalization.

yahoohadoophug
War Stories: DIY Kafka
War Stories: DIY KafkaWar Stories: DIY Kafka
War Stories: DIY Kafka

(Nina Hanzlikova, Zalando) Kafka Summit SF 2018 My team at Zalando fell in love with KStreams and their programming model straight out of the gate. However, as a small team of developers, building out and supporting our infrastructure while still trying to deliver solutions for our business has not always resulted in a smooth journey. Can a small team of a couple of developers run their own Kafka infrastructure confidently and still spend most of their time developing code? In this talk, we will dive into some of the problems we experienced while running Kafka brokers and Kafka Streams applications, as well as the consultations we had with other teams around this matter. We will outline some of the pragmatic decisions we made regarding backups, monitoring and operations to minimize our time spent administering our Kafka brokers and various stream applications.

apachekafkasummit
Cloud-native .NET Microservices mit Kubernetes
Cloud-native .NET Microservices mit KubernetesCloud-native .NET Microservices mit Kubernetes
Cloud-native .NET Microservices mit Kubernetes

Mario-Leander Reimer presented on building cloud-native .NET microservices with Kubernetes. He discussed key principles of cloud native applications including designing for distribution, performance, automation, resiliency and elasticity. He also covered containerization with Docker, composing services with Kubernetes and common concepts like deployments, services and probes. Reimer provided examples of Dockerfiles, Kubernetes definitions and using tools like Steeltoe and docker-compose to develop cloud native applications.

.netmicroserviceskubernetes
https://github.com/kubernetes/kubernetes/issues/34377
https://issues.apache.org/jira/browse/SPARK-18278
25
https://spark-summit.org/2017/events/apache-spark-on-kubernetes/
26
https://spark-summit.org/2017/events/apache-spark-on-kubernetes/
27
SPARK-18278 - Spark on Kubernetes Design Proposal.pdf
28

Recommended for you

Simplify and Boost Spark 3 Deployments with Hypervisor-Native Kubernetes
Simplify and Boost Spark 3 Deployments with Hypervisor-Native KubernetesSimplify and Boost Spark 3 Deployments with Hypervisor-Native Kubernetes
Simplify and Boost Spark 3 Deployments with Hypervisor-Native Kubernetes

In 2020, two significant IT platforms converge. On the one hand, Spark 3 becomes available with the support of Kubernetes as a scheduler

spark + ai summit
Kubernetes extensibility
Kubernetes extensibilityKubernetes extensibility
Kubernetes extensibility

Kubernetes is designed to be an extensible system. But what is the vision for Kubernetes Extensibility? Do you know the difference between webhooks and cloud providers, or between CRI, CSI, and CNI? In this talk we will explore what extension points exist, how they have evolved, and how to use them to make the system do new and interesting things. We’ll give our vision for how they will probably evolve in the future, and talk about the sorts of things we expect the broader Kubernetes ecosystem to build with them.

black belt trackdockercon 2018dockercon
Extending DevOps to Big Data Applications with Kubernetes
Extending DevOps to Big Data Applications with KubernetesExtending DevOps to Big Data Applications with Kubernetes
Extending DevOps to Big Data Applications with Kubernetes

DevOps, continuous delivery and modern architectural trends can incredibly speed up the software development process. Big Data applications cannot be an exception and need to keep the same pace.

spark streamingdevopskubernetes
29
Node Manager # 1…N
external
shuffle plugin
RDD

(IntermediateFile)
RDD

(IntermediateFile)
External Shuffle
30
Executor
Long-Running ETL jobs
Interactive application or Server
Any application with large shuffles
Executor
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle,spark_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
<value>org.apache.spark.network.yarn.YarnShuffleService</value>
</property>
1. shuffle plugin add jar
2. yarn-site.xml add plugin
31
spark.dynamicAllocation.enabled true
spark.shuffle.service.enabled true
spark.dynamicAllocation.minExecutors 50
spark.dynamicAllocation.maxExecutors 100
spark.dynamicAllocation.initialExecutors 50
spark.dynamicAllocation.schedulerBacklogTimeout 5s
spark.dynamicAllocation.executorIdleTimeout 60
http://spark.apache.org/docs/latest/job-scheduling.html#dynamic-resource-allocation
cf) Mesos - Coarse-Grained Mode
3. edit spark-default.conf
32

Recommended for you

Fast Data Analytics with Spark and Python
Fast Data Analytics with Spark and PythonFast Data Analytics with Spark and Python
Fast Data Analytics with Spark and Python

In this one day workshop, we will introduce Spark at a high level context. Spark is fundamentally different than writing MapReduce jobs so no prior Hadoop experience is needed. You will learn how to interact with Spark on the command line and conduct rapid in-memory data analyses. We will then work on writing Spark applications to perform large cluster-based analyses including SQL-like aggregations, machine learning applications, and graph algorithms. The course will be conducted in Python using PySpark.

distributed computingdistrict data labspython
[OpenStack Days Korea 2016] Track1 - Red Hat enterprise Linux OpenStack Platform
[OpenStack Days Korea 2016] Track1 - Red Hat enterprise Linux OpenStack Platform[OpenStack Days Korea 2016] Track1 - Red Hat enterprise Linux OpenStack Platform
[OpenStack Days Korea 2016] Track1 - Red Hat enterprise Linux OpenStack Platform

This document discusses Red Hat's OpenStack platform. It provides an overview of OpenStack and what it is used for. It then discusses why Red Hat is well suited to provide an OpenStack platform, including that it is optimized to run on Red Hat Enterprise Linux and benefits from Red Hat's engineering resources and long term support. Key features of Red Hat's OpenStack platform are also summarized, such as performance, availability, security and manageability.

red hatopenstack days koreaopenstack
Kubernetes for the PHP developer
Kubernetes for the PHP developerKubernetes for the PHP developer
Kubernetes for the PHP developer

Kubernetes is exploding in popularity right now and has all the buzz and cargo-culting that Docker enjoyed just a few years ago. But what even is Kubernetes? How do I run my PHP apps in it? Should I run my PHP apps in it ?

kubernetesdevopsphp
https://spark-summit.org/2017/events/apache-spark-on-kubernetes/
3333
1
2 3
https://www.slideshare.net/cfregly/spark-on-kubernetes-advanced-spark-and-tensorflow-
meetup-jan-19-2017-anirudh-ramanthan-from-google-kubernetes-team
34
1
bin/spark-submit 
--deploy-mode cluster 
--class org.apache.spark.examples.SparkPi 
--master k8s://https://{k8s address}
--kubernetes-namespace default 
--conf spark.executor.instances=5 
--conf spark.app.name=spark-pi 
--conf spark.kubernetes.driver.docker.image=kubespark/spark-driver:v2.1.0-kubernetes-0.2.0 
--conf spark.kubernetes.executor.docker.image=kubespark/spark-executor:v2.1.0-kubernetes-0.2.0 
--conf spark.kubernetes.initcontainer.docker.image=kubespark/spark-init:v2.1.0-kubernetes-0.2.0 
local:///opt/spark/examples/jars/spark-examples_2.11-2.1.0-k8s-0.2.0-SNAPSHOT.jar
3535
https://www.slideshare.net/cfregly/spark-on-kubernetes-advanced-spark-and-tensorflow-
meetup-jan-19-2017-anirudh-ramanthan-from-google-kubernetes-team
36
2

Recommended for you

The Big Cloud native FaaS Lebowski
The Big Cloud native FaaS Lebowski The Big Cloud native FaaS Lebowski
The Big Cloud native FaaS Lebowski

Linux-Stammtisch Juli 2019, Munich: Talk by Mario-Leander Reimer (@LeanderReimer, Principal Software Architect at QAware) === Please download slides if blurred! === Abstract: Only a few years ago the move towards microservice architecture was the first big disruption in software engineering: instead of running monoliths, systems were now build, composed and run as autonomous services. But this came at the price of added development and infrastructure complexity. Serverless and FaaS seem to be the next disruption, they are the logical evolution trying to address some of the inherent technology complexity we are currently faced when building cloud native apps. FaaS frameworks are currently popping up like mushrooms: Knative, Kubeless, OpenFn, Fission, OpenFaas or Open Whisk are just a few to name. But which one of these is safe to pick and use in your next project? Let's find out. This session will start off by briefly explaining the essence of Serverless application architecture. We will then define a criteria catalog for FaaS frameworks and continue by comparing and showcasing the most promising ones.

application architecturecloud computingcloud native applications
Spark Study Notes
Spark Study NotesSpark Study Notes
Spark Study Notes

This document discusses Apache Spark, an open-source cluster computing framework. It provides an overview of Spark, including its main concepts like RDDs (Resilient Distributed Datasets) and transformations. Spark is presented as a faster alternative to Hadoop for iterative jobs and machine learning through its ability to keep data in-memory. Example code is shown for Spark's programming model in Scala and Python. The document concludes that Spark offers a rich API to make data analytics fast, achieving speedups of up to 100x over Hadoop in real applications.

spark
Best Practices for Running Kafka on Docker Containers
Best Practices for Running Kafka on Docker ContainersBest Practices for Running Kafka on Docker Containers
Best Practices for Running Kafka on Docker Containers

Docker containers provide an ideal foundation for running Kafka-as-a-Service on-premises or in the public cloud. However, using Docker containers in production environments for Big Data workloads using Kafka poses some challenges – including container management, scheduling, network configuration and security, and performance. In this session at Kafka Summit in August 2017, Nanda Vijyaydev of BlueData shared lessons learned from implementing Kafka-as-a-Service with Docker containers. https://kafka-summit.org/sessions/kafka-service-docker-containers

kafkaapache kafkadocker
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
labels:
app: spark-shuffle-service
spark-version: 2.1.0
name: shuffle
spec:
template:
metadata:
labels:
app: spark-shuffle-service
spark-version: 2.1.0
spec:
volumes:
- name: temp-volume
hostPath:
path: '/var/tmp' # change this path according to your cluster configuration.
containers:
- name: shuffle
image: kubespark/spark-shuffle:v2.1.0-kubernetes-0.2.0
37
bin/spark-submit 
--deploy-mode cluster 
--class org.apache.spark.examples.GroupByTest 
--master k8s://https://{k8s address} 
--kubernetes-namespace default 
--conf spark.app.name=group-by-test 
--conf spark.kubernetes.driver.docker.image=kubespark/spark-driver:v2.1.0-kubernetes-0.2.0 
--conf spark.kubernetes.executor.docker.image=kubespark/spark-executor:v2.1.0-kubernetes-0.2.0 
--conf spark.kubernetes.initcontainer.docker.image=kubespark/spark-init:v2.1.0-kubernetes-0.2.0 
--conf spark.dynamicAllocation.enabled=true 
--conf spark.shuffle.service.enabled=true 
--conf spark.kubernetes.shuffle.namespace=default 
--conf spark.kubernetes.shuffle.labels="app=spark-shuffle-service,spark-version=2.1.0" 
local:///opt/spark/examples/jars/spark-examples_2.11-2.1.0-k8s-0.2.0-SNAPSHOT.jar 10 40000 2
38
https://spark-summit.org/2017/events/apache-spark-on-kubernetes/
39
3
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: spark-resource-staging-server
spec:
replicas: 1
---
apiVersion: v1
kind: Service
metadata:
name: spark-resource-staging-service
spec:
type: NodePort
selector:
resource-staging-server-instance: default
ports:
- protocol: TCP
port: 10000
targetPort: 10000
nodePort: 31000
40

Recommended for you

Spark to DocumentDB connector
Spark to DocumentDB connectorSpark to DocumentDB connector
Spark to DocumentDB connector

Combining the awesomeness of Apache Spark and Azure DocumentDB together for real-time machine learning on globally-distributed data.

machine learningapache sparkazure documentdb
Kubernetes Java Operator
Kubernetes Java OperatorKubernetes Java Operator
Kubernetes Java Operator

Once you're familiar with Kubernetes and Helm, and still need more integration, what are your options ? Can Java integrate well with Kubernetes ?

kubernetesjavahelm
G rpc talk with intel (3)
G rpc talk with intel (3)G rpc talk with intel (3)
G rpc talk with intel (3)

Google and Intel speak on NFV and SFC service delivery The slides are as presented at the meet up "Out of Box Network Developers" sponsored by Intel Networking Developer Zone Here is the Agenda of the slides: How DPDK, RDT and gRPC fit into SDI/SDN, NFV and OpenStack Key Platform Requirements for SDI SDI Platform Ingredients: DPDK, IntelⓇRDT gRPC Service Framework IntelⓇ RDT and gRPC service framework

sdnrdtnfv
bin/spark-submit 
--deploy-mode cluster 
--class org.apache.spark.examples.SparkPi 
--master k8s://{k8s address} 
--kubernetes-namespace default 
--conf spark.executor.instances=5 
--conf spark.app.name=spark-pi 
--conf spark.kubernetes.driver.docker.image=kubespark/spark-driver:v2.1.0-kubernetes-0.2.0 
--conf spark.kubernetes.executor.docker.image=kubespark/spark-executor:v2.1.0-kubernetes-0.2.0 
--conf spark.kubernetes.initcontainer.docker.image=kubespark/spark-init:v2.1.0-kubernetes-0.2.0 
--conf spark.kubernetes.resourceStagingServer.uri=http://{node ip}:31000 
examples/jars/spark-examples_2.11-2.1.0-k8s-0.2.0-SNAPSHOT.jar
41
4242
43
https://spark-summit.org/2017/events/hdfs-on-kubernetes-lessons-learned/
4444

Recommended for you

Jörg Schad - Hybrid Cloud (Kubernetes, Spark, HDFS, …)-as-a-Service - Codemot...
Jörg Schad - Hybrid Cloud (Kubernetes, Spark, HDFS, …)-as-a-Service - Codemot...Jörg Schad - Hybrid Cloud (Kubernetes, Spark, HDFS, …)-as-a-Service - Codemot...
Jörg Schad - Hybrid Cloud (Kubernetes, Spark, HDFS, …)-as-a-Service - Codemot...

The overall evolution towards microservices has caused a lot of IT leaders to radically rethink architectures and platforms. One can hardly keep up with the rapid onslaught on new distributed technologies. The same people who just asked yesterday "how can we deploy Docker containers?", are now asking "how can we operate Kubernetes-as-a-Service on-premise?", and are about to start asking "how can we operate the open source frameworks of our choice, such as Spark, TensorFlow, HDFS, and more, as a service across hybrid clouds?”. This session will discuss: Challenges of orchestrating and operating

codemotioncodemotion berlin 2018tech conference
Jörg Schad - Hybrid Cloud (Kubernetes, Spark, HDFS, …)-as-a-Service - Codemot...
Jörg Schad - Hybrid Cloud (Kubernetes, Spark, HDFS, …)-as-a-Service - Codemot...Jörg Schad - Hybrid Cloud (Kubernetes, Spark, HDFS, …)-as-a-Service - Codemot...
Jörg Schad - Hybrid Cloud (Kubernetes, Spark, HDFS, …)-as-a-Service - Codemot...

The overall evolution towards microservices has caused a lot of IT leaders to radically rethink architectures and platforms. One can hardly keep up with the rapid onslaught on new distributed technologies. The same people who just asked yesterday "how can we deploy Docker containers?", are now asking "how can we operate Kubernetes-as-a-Service on-premise?", and are about to start asking "how can we operate the open source frameworks of our choice, such as Spark, TensorFlow, HDFS, and more, as a service across hybrid clouds?”. This session will discuss: Challenges of orchestrating and operating.

codemotioncodemotion berlin 2018technology
Reliable Performance at Scale with Apache Spark on Kubernetes
Reliable Performance at Scale with Apache Spark on KubernetesReliable Performance at Scale with Apache Spark on Kubernetes
Reliable Performance at Scale with Apache Spark on Kubernetes

Kubernetes is an open-source containerization framework that makes it easy to manage applications in isolated environments at scale. In Apache Spark 2.3, Spark introduced support for native integration with Kubernetes. Palantir has been deeply involved with the development of Spark’s Kubernetes integration from the beginning, and our largest production deployment now runs an average of ~5 million Spark pods per day, as part of tens of thousands of Spark applications. Over the course of our adventures in migrating deployments from YARN to Kubernetes, we have overcome a number of performance, cost, & reliability hurdles: differences in shuffle performance due to smaller filesystem caches in containers; Kubernetes CPU limits causing inadvertent throttling of containers that run many Java threads; and lack of support for dynamic allocation leading to resource wastage. We intend to briefly describe our story of developing & deploying Spark-on-Kubernetes, as well as lessons learned from deploying containerized Spark applications in production. We will also describe our recently open-sourced extension (https://github.com/palantir/k8s-spark-scheduler) to the Kubernetes scheduler to better support Spark workloads & facilitate Spark-aware cluster autoscaling; our limited implementation of dynamic allocation on Kubernetes; and ongoing work that is required to support dynamic resource management & stable performance at scale (i.e., our work with the community on a pluggable external shuffle service API). Our hope is that our lessons learned and ongoing work will help other community members who want to use Spark on Kubernetes for their own workloads.

* 
apache spark

 *big data

 *ai

 *
https://spark-summit.org/2017/events/hdfs-on-kubernetes-lessons-learned/
4545
https://spark-summit.org/2017/events/hdfs-on-kubernetes-lessons-learned/
4646
https://spark-summit.org/2017/events/hdfs-on-kubernetes-lessons-learned/
4747
https://github.com/apache-spark-on-k8s/kubernetes-HDFS
48

Recommended for you

betterCode Workshop: Effizientes DevOps-Tooling mit Go
betterCode Workshop:  Effizientes DevOps-Tooling mit GobetterCode Workshop:  Effizientes DevOps-Tooling mit Go
betterCode Workshop: Effizientes DevOps-Tooling mit Go

bettterCode, 24.06.2021, Online: Workshop of Mario-Leander Reimer (@LeanderReimer, Principal Software Architect at QAware) & Markus Zimmermann (@markus_zm, Senior Software Engineer at QAware) == Please download slides in case they are blurred! === Use the right tool and language for the job! Especially in the DevOps tooling area, Go has established itself as a simple, reliable and efficient programming language. In this workshop, we learned about suitable application areas and implementing quite a few tools.

godevops toolingcloud native
MongoDB.local DC 2018: MongoDB Ops Manager + Kubernetes
MongoDB.local DC 2018: MongoDB Ops Manager + KubernetesMongoDB.local DC 2018: MongoDB Ops Manager + Kubernetes
MongoDB.local DC 2018: MongoDB Ops Manager + Kubernetes

MongoDB Ops Manager is an enterprise-grade end-to-end database management, monitoring, and backup solution. Kubernetes has clearly won the orchestration-platform "wars". In this session we'll take a deep dive on how you can leverage both these technologies to host your MongoDB deployments within your Kubernetes infrastructure whether that's OpenShift, PKS, Azure AKS, or just upstream. This talk will review the core technologies, such as containers, Kubernetes, and MongoDB Ops Manager. You'll also have a chance to see real-live demos of MongoDB running on Kubernetes and managed with MongoDB Ops Manager with the MongoDB Enterprise Kubernetes Operator.

mongodbmongodb.local
Dayta AI Seminar - Kubernetes, Docker and AI on Cloud
Dayta AI Seminar - Kubernetes, Docker and AI on CloudDayta AI Seminar - Kubernetes, Docker and AI on Cloud
Dayta AI Seminar - Kubernetes, Docker and AI on Cloud

Kubernetes is an open-source system for automating deployment, scaling, and management of containerized applications. It groups containers that make up an application into logical units for easy management and discovery. Kubernetes services expose these units to enable dynamic load balancing while maintaining session affinity. It also provides self-healing capabilities by restarting containers that fail, replacing them, and killing containers that don't respond to their health check.

artificial intelligencedaytaaidocker
49
50

More Related Content

What's hot

Scaling spark on kubernetes at Lyft
Scaling spark on kubernetes at LyftScaling spark on kubernetes at Lyft
Scaling spark on kubernetes at Lyft
Li Gao
 
PaaSTA: Autoscaling at Yelp
PaaSTA: Autoscaling at YelpPaaSTA: Autoscaling at Yelp
PaaSTA: Autoscaling at Yelp
Nathan Handler
 
February 2016 HUG: Running Spark Clusters in Containers with Docker
February 2016 HUG: Running Spark Clusters in Containers with DockerFebruary 2016 HUG: Running Spark Clusters in Containers with Docker
February 2016 HUG: Running Spark Clusters in Containers with Docker
Yahoo Developer Network
 
Cassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart LabsCassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart Labs
DataStax Academy
 
Scylla Summit 2022: What’s New in ScyllaDB Operator for Kubernetes
Scylla Summit 2022: What’s New in ScyllaDB Operator for KubernetesScylla Summit 2022: What’s New in ScyllaDB Operator for Kubernetes
Scylla Summit 2022: What’s New in ScyllaDB Operator for Kubernetes
ScyllaDB
 
Spark Pipelines in the Cloud with Alluxio with Gene Pang
Spark Pipelines in the Cloud with Alluxio with Gene PangSpark Pipelines in the Cloud with Alluxio with Gene Pang
Spark Pipelines in the Cloud with Alluxio with Gene Pang
Spark Summit
 
Gocd – Kubernetes/Nomad Continuous Deployment
Gocd – Kubernetes/Nomad Continuous DeploymentGocd – Kubernetes/Nomad Continuous Deployment
Gocd – Kubernetes/Nomad Continuous Deployment
Leandro Totino Pereira
 
Getting Started Running Apache Spark on Apache Mesos
Getting Started Running Apache Spark on Apache MesosGetting Started Running Apache Spark on Apache Mesos
Getting Started Running Apache Spark on Apache Mesos
Paco Nathan
 
Dev ops for big data cluster management tools
Dev ops for big data  cluster management toolsDev ops for big data  cluster management tools
Dev ops for big data cluster management tools
Ran Silberman
 
A Journey through the JDKs (Java 9 to Java 11)
A Journey through the JDKs (Java 9 to Java 11)A Journey through the JDKs (Java 9 to Java 11)
A Journey through the JDKs (Java 9 to Java 11)
Markus Günther
 
Homologous Apache Spark Clusters Using Nomad with Alex Dadgar
Homologous Apache Spark Clusters Using Nomad with Alex DadgarHomologous Apache Spark Clusters Using Nomad with Alex Dadgar
Homologous Apache Spark Clusters Using Nomad with Alex Dadgar
Databricks
 
Real-Time Distributed and Reactive Systems with Apache Kafka and Apache Accumulo
Real-Time Distributed and Reactive Systems with Apache Kafka and Apache AccumuloReal-Time Distributed and Reactive Systems with Apache Kafka and Apache Accumulo
Real-Time Distributed and Reactive Systems with Apache Kafka and Apache Accumulo
Joe Stein
 
K8S in prod
K8S in prodK8S in prod
Deploy an Elastic, Resilient, Load-Balanced Cluster in 5 Minutes with Senlin
Deploy an Elastic, Resilient, Load-Balanced Cluster in 5 Minutes with SenlinDeploy an Elastic, Resilient, Load-Balanced Cluster in 5 Minutes with Senlin
Deploy an Elastic, Resilient, Load-Balanced Cluster in 5 Minutes with Senlin
Qiming Teng
 
Make 2016 your year of SMACK talk
Make 2016 your year of SMACK talkMake 2016 your year of SMACK talk
Make 2016 your year of SMACK talk
DataStax Academy
 
Suning OpenStack Cloud and Heat
Suning OpenStack Cloud and HeatSuning OpenStack Cloud and Heat
Suning OpenStack Cloud and Heat
Qiming Teng
 
Handling Redis failover with ZooKeeper
Handling Redis failover with ZooKeeperHandling Redis failover with ZooKeeper
Handling Redis failover with ZooKeeper
ryanlecompte
 
Hadoop on-mesos
Hadoop on-mesosHadoop on-mesos
Hadoop on-mesos
Henry Cai 蔡明航
 
April 2016 HUG: The latest of Apache Hadoop YARN and running your docker apps...
April 2016 HUG: The latest of Apache Hadoop YARN and running your docker apps...April 2016 HUG: The latest of Apache Hadoop YARN and running your docker apps...
April 2016 HUG: The latest of Apache Hadoop YARN and running your docker apps...
Yahoo Developer Network
 
War Stories: DIY Kafka
War Stories: DIY KafkaWar Stories: DIY Kafka
War Stories: DIY Kafka
confluent
 

What's hot (20)

Scaling spark on kubernetes at Lyft
Scaling spark on kubernetes at LyftScaling spark on kubernetes at Lyft
Scaling spark on kubernetes at Lyft
 
PaaSTA: Autoscaling at Yelp
PaaSTA: Autoscaling at YelpPaaSTA: Autoscaling at Yelp
PaaSTA: Autoscaling at Yelp
 
February 2016 HUG: Running Spark Clusters in Containers with Docker
February 2016 HUG: Running Spark Clusters in Containers with DockerFebruary 2016 HUG: Running Spark Clusters in Containers with Docker
February 2016 HUG: Running Spark Clusters in Containers with Docker
 
Cassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart LabsCassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart Labs
 
Scylla Summit 2022: What’s New in ScyllaDB Operator for Kubernetes
Scylla Summit 2022: What’s New in ScyllaDB Operator for KubernetesScylla Summit 2022: What’s New in ScyllaDB Operator for Kubernetes
Scylla Summit 2022: What’s New in ScyllaDB Operator for Kubernetes
 
Spark Pipelines in the Cloud with Alluxio with Gene Pang
Spark Pipelines in the Cloud with Alluxio with Gene PangSpark Pipelines in the Cloud with Alluxio with Gene Pang
Spark Pipelines in the Cloud with Alluxio with Gene Pang
 
Gocd – Kubernetes/Nomad Continuous Deployment
Gocd – Kubernetes/Nomad Continuous DeploymentGocd – Kubernetes/Nomad Continuous Deployment
Gocd – Kubernetes/Nomad Continuous Deployment
 
Getting Started Running Apache Spark on Apache Mesos
Getting Started Running Apache Spark on Apache MesosGetting Started Running Apache Spark on Apache Mesos
Getting Started Running Apache Spark on Apache Mesos
 
Dev ops for big data cluster management tools
Dev ops for big data  cluster management toolsDev ops for big data  cluster management tools
Dev ops for big data cluster management tools
 
A Journey through the JDKs (Java 9 to Java 11)
A Journey through the JDKs (Java 9 to Java 11)A Journey through the JDKs (Java 9 to Java 11)
A Journey through the JDKs (Java 9 to Java 11)
 
Homologous Apache Spark Clusters Using Nomad with Alex Dadgar
Homologous Apache Spark Clusters Using Nomad with Alex DadgarHomologous Apache Spark Clusters Using Nomad with Alex Dadgar
Homologous Apache Spark Clusters Using Nomad with Alex Dadgar
 
Real-Time Distributed and Reactive Systems with Apache Kafka and Apache Accumulo
Real-Time Distributed and Reactive Systems with Apache Kafka and Apache AccumuloReal-Time Distributed and Reactive Systems with Apache Kafka and Apache Accumulo
Real-Time Distributed and Reactive Systems with Apache Kafka and Apache Accumulo
 
K8S in prod
K8S in prodK8S in prod
K8S in prod
 
Deploy an Elastic, Resilient, Load-Balanced Cluster in 5 Minutes with Senlin
Deploy an Elastic, Resilient, Load-Balanced Cluster in 5 Minutes with SenlinDeploy an Elastic, Resilient, Load-Balanced Cluster in 5 Minutes with Senlin
Deploy an Elastic, Resilient, Load-Balanced Cluster in 5 Minutes with Senlin
 
Make 2016 your year of SMACK talk
Make 2016 your year of SMACK talkMake 2016 your year of SMACK talk
Make 2016 your year of SMACK talk
 
Suning OpenStack Cloud and Heat
Suning OpenStack Cloud and HeatSuning OpenStack Cloud and Heat
Suning OpenStack Cloud and Heat
 
Handling Redis failover with ZooKeeper
Handling Redis failover with ZooKeeperHandling Redis failover with ZooKeeper
Handling Redis failover with ZooKeeper
 
Hadoop on-mesos
Hadoop on-mesosHadoop on-mesos
Hadoop on-mesos
 
April 2016 HUG: The latest of Apache Hadoop YARN and running your docker apps...
April 2016 HUG: The latest of Apache Hadoop YARN and running your docker apps...April 2016 HUG: The latest of Apache Hadoop YARN and running your docker apps...
April 2016 HUG: The latest of Apache Hadoop YARN and running your docker apps...
 
War Stories: DIY Kafka
War Stories: DIY KafkaWar Stories: DIY Kafka
War Stories: DIY Kafka
 

Similar to Spark day 2017 - Spark on Kubernetes

Cloud-native .NET Microservices mit Kubernetes
Cloud-native .NET Microservices mit KubernetesCloud-native .NET Microservices mit Kubernetes
Cloud-native .NET Microservices mit Kubernetes
QAware GmbH
 
Simplify and Boost Spark 3 Deployments with Hypervisor-Native Kubernetes
Simplify and Boost Spark 3 Deployments with Hypervisor-Native KubernetesSimplify and Boost Spark 3 Deployments with Hypervisor-Native Kubernetes
Simplify and Boost Spark 3 Deployments with Hypervisor-Native Kubernetes
Databricks
 
Kubernetes extensibility
Kubernetes extensibilityKubernetes extensibility
Kubernetes extensibility
Docker, Inc.
 
Extending DevOps to Big Data Applications with Kubernetes
Extending DevOps to Big Data Applications with KubernetesExtending DevOps to Big Data Applications with Kubernetes
Extending DevOps to Big Data Applications with Kubernetes
Nicola Ferraro
 
Fast Data Analytics with Spark and Python
Fast Data Analytics with Spark and PythonFast Data Analytics with Spark and Python
Fast Data Analytics with Spark and Python
Benjamin Bengfort
 
[OpenStack Days Korea 2016] Track1 - Red Hat enterprise Linux OpenStack Platform
[OpenStack Days Korea 2016] Track1 - Red Hat enterprise Linux OpenStack Platform[OpenStack Days Korea 2016] Track1 - Red Hat enterprise Linux OpenStack Platform
[OpenStack Days Korea 2016] Track1 - Red Hat enterprise Linux OpenStack Platform
OpenStack Korea Community
 
Kubernetes for the PHP developer
Kubernetes for the PHP developerKubernetes for the PHP developer
Kubernetes for the PHP developer
Paul Czarkowski
 
The Big Cloud native FaaS Lebowski
The Big Cloud native FaaS Lebowski The Big Cloud native FaaS Lebowski
The Big Cloud native FaaS Lebowski
QAware GmbH
 
Spark Study Notes
Spark Study NotesSpark Study Notes
Spark Study Notes
Richard Kuo
 
Best Practices for Running Kafka on Docker Containers
Best Practices for Running Kafka on Docker ContainersBest Practices for Running Kafka on Docker Containers
Best Practices for Running Kafka on Docker Containers
BlueData, Inc.
 
Spark to DocumentDB connector
Spark to DocumentDB connectorSpark to DocumentDB connector
Spark to DocumentDB connector
Denny Lee
 
Kubernetes Java Operator
Kubernetes Java OperatorKubernetes Java Operator
Kubernetes Java Operator
Anthony Dahanne
 
G rpc talk with intel (3)
G rpc talk with intel (3)G rpc talk with intel (3)
G rpc talk with intel (3)
Intel
 
Jörg Schad - Hybrid Cloud (Kubernetes, Spark, HDFS, …)-as-a-Service - Codemot...
Jörg Schad - Hybrid Cloud (Kubernetes, Spark, HDFS, …)-as-a-Service - Codemot...Jörg Schad - Hybrid Cloud (Kubernetes, Spark, HDFS, …)-as-a-Service - Codemot...
Jörg Schad - Hybrid Cloud (Kubernetes, Spark, HDFS, …)-as-a-Service - Codemot...
Codemotion
 
Jörg Schad - Hybrid Cloud (Kubernetes, Spark, HDFS, …)-as-a-Service - Codemot...
Jörg Schad - Hybrid Cloud (Kubernetes, Spark, HDFS, …)-as-a-Service - Codemot...Jörg Schad - Hybrid Cloud (Kubernetes, Spark, HDFS, …)-as-a-Service - Codemot...
Jörg Schad - Hybrid Cloud (Kubernetes, Spark, HDFS, …)-as-a-Service - Codemot...
Codemotion
 
Reliable Performance at Scale with Apache Spark on Kubernetes
Reliable Performance at Scale with Apache Spark on KubernetesReliable Performance at Scale with Apache Spark on Kubernetes
Reliable Performance at Scale with Apache Spark on Kubernetes
Databricks
 
betterCode Workshop: Effizientes DevOps-Tooling mit Go
betterCode Workshop:  Effizientes DevOps-Tooling mit GobetterCode Workshop:  Effizientes DevOps-Tooling mit Go
betterCode Workshop: Effizientes DevOps-Tooling mit Go
QAware GmbH
 
MongoDB.local DC 2018: MongoDB Ops Manager + Kubernetes
MongoDB.local DC 2018: MongoDB Ops Manager + KubernetesMongoDB.local DC 2018: MongoDB Ops Manager + Kubernetes
MongoDB.local DC 2018: MongoDB Ops Manager + Kubernetes
MongoDB
 
Dayta AI Seminar - Kubernetes, Docker and AI on Cloud
Dayta AI Seminar - Kubernetes, Docker and AI on CloudDayta AI Seminar - Kubernetes, Docker and AI on Cloud
Dayta AI Seminar - Kubernetes, Docker and AI on Cloud
Jung-Hong Kim
 
Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...
Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...
Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...
Anant Corporation
 

Similar to Spark day 2017 - Spark on Kubernetes (20)

Cloud-native .NET Microservices mit Kubernetes
Cloud-native .NET Microservices mit KubernetesCloud-native .NET Microservices mit Kubernetes
Cloud-native .NET Microservices mit Kubernetes
 
Simplify and Boost Spark 3 Deployments with Hypervisor-Native Kubernetes
Simplify and Boost Spark 3 Deployments with Hypervisor-Native KubernetesSimplify and Boost Spark 3 Deployments with Hypervisor-Native Kubernetes
Simplify and Boost Spark 3 Deployments with Hypervisor-Native Kubernetes
 
Kubernetes extensibility
Kubernetes extensibilityKubernetes extensibility
Kubernetes extensibility
 
Extending DevOps to Big Data Applications with Kubernetes
Extending DevOps to Big Data Applications with KubernetesExtending DevOps to Big Data Applications with Kubernetes
Extending DevOps to Big Data Applications with Kubernetes
 
Fast Data Analytics with Spark and Python
Fast Data Analytics with Spark and PythonFast Data Analytics with Spark and Python
Fast Data Analytics with Spark and Python
 
[OpenStack Days Korea 2016] Track1 - Red Hat enterprise Linux OpenStack Platform
[OpenStack Days Korea 2016] Track1 - Red Hat enterprise Linux OpenStack Platform[OpenStack Days Korea 2016] Track1 - Red Hat enterprise Linux OpenStack Platform
[OpenStack Days Korea 2016] Track1 - Red Hat enterprise Linux OpenStack Platform
 
Kubernetes for the PHP developer
Kubernetes for the PHP developerKubernetes for the PHP developer
Kubernetes for the PHP developer
 
The Big Cloud native FaaS Lebowski
The Big Cloud native FaaS Lebowski The Big Cloud native FaaS Lebowski
The Big Cloud native FaaS Lebowski
 
Spark Study Notes
Spark Study NotesSpark Study Notes
Spark Study Notes
 
Best Practices for Running Kafka on Docker Containers
Best Practices for Running Kafka on Docker ContainersBest Practices for Running Kafka on Docker Containers
Best Practices for Running Kafka on Docker Containers
 
Spark to DocumentDB connector
Spark to DocumentDB connectorSpark to DocumentDB connector
Spark to DocumentDB connector
 
Kubernetes Java Operator
Kubernetes Java OperatorKubernetes Java Operator
Kubernetes Java Operator
 
G rpc talk with intel (3)
G rpc talk with intel (3)G rpc talk with intel (3)
G rpc talk with intel (3)
 
Jörg Schad - Hybrid Cloud (Kubernetes, Spark, HDFS, …)-as-a-Service - Codemot...
Jörg Schad - Hybrid Cloud (Kubernetes, Spark, HDFS, …)-as-a-Service - Codemot...Jörg Schad - Hybrid Cloud (Kubernetes, Spark, HDFS, …)-as-a-Service - Codemot...
Jörg Schad - Hybrid Cloud (Kubernetes, Spark, HDFS, …)-as-a-Service - Codemot...
 
Jörg Schad - Hybrid Cloud (Kubernetes, Spark, HDFS, …)-as-a-Service - Codemot...
Jörg Schad - Hybrid Cloud (Kubernetes, Spark, HDFS, …)-as-a-Service - Codemot...Jörg Schad - Hybrid Cloud (Kubernetes, Spark, HDFS, …)-as-a-Service - Codemot...
Jörg Schad - Hybrid Cloud (Kubernetes, Spark, HDFS, …)-as-a-Service - Codemot...
 
Reliable Performance at Scale with Apache Spark on Kubernetes
Reliable Performance at Scale with Apache Spark on KubernetesReliable Performance at Scale with Apache Spark on Kubernetes
Reliable Performance at Scale with Apache Spark on Kubernetes
 
betterCode Workshop: Effizientes DevOps-Tooling mit Go
betterCode Workshop:  Effizientes DevOps-Tooling mit GobetterCode Workshop:  Effizientes DevOps-Tooling mit Go
betterCode Workshop: Effizientes DevOps-Tooling mit Go
 
MongoDB.local DC 2018: MongoDB Ops Manager + Kubernetes
MongoDB.local DC 2018: MongoDB Ops Manager + KubernetesMongoDB.local DC 2018: MongoDB Ops Manager + Kubernetes
MongoDB.local DC 2018: MongoDB Ops Manager + Kubernetes
 
Dayta AI Seminar - Kubernetes, Docker and AI on Cloud
Dayta AI Seminar - Kubernetes, Docker and AI on CloudDayta AI Seminar - Kubernetes, Docker and AI on Cloud
Dayta AI Seminar - Kubernetes, Docker and AI on Cloud
 
Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...
Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...
Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...
 

More from Yousun Jeong

Stsg17 speaker yousunjeong
Stsg17 speaker yousunjeongStsg17 speaker yousunjeong
Stsg17 speaker yousunjeong
Yousun Jeong
 
Druid meetup 4th_sql_on_druid
Druid meetup 4th_sql_on_druidDruid meetup 4th_sql_on_druid
Druid meetup 4th_sql_on_druid
Yousun Jeong
 
Kubernetes on aws
Kubernetes on awsKubernetes on aws
Kubernetes on aws
Yousun Jeong
 
Data Analytics with Druid
Data Analytics with DruidData Analytics with Druid
Data Analytics with Druid
Yousun Jeong
 
IEEE International Conference on Data Engineering 2015
IEEE International Conference on Data Engineering 2015IEEE International Conference on Data Engineering 2015
IEEE International Conference on Data Engineering 2015
Yousun Jeong
 
Spark streaming , Spark SQL
Spark streaming , Spark SQLSpark streaming , Spark SQL
Spark streaming , Spark SQL
Yousun Jeong
 
Big Telco Real-Time Network Analytics
Big Telco Real-Time Network AnalyticsBig Telco Real-Time Network Analytics
Big Telco Real-Time Network Analytics
Yousun Jeong
 
Enterprise 환경에서의 오픈소스 기반 아키텍처 적용 사례
Enterprise 환경에서의 오픈소스 기반 아키텍처 적용 사례Enterprise 환경에서의 오픈소스 기반 아키텍처 적용 사례
Enterprise 환경에서의 오픈소스 기반 아키텍처 적용 사례
Yousun Jeong
 
2012 07 28_cloud_reference_architecture_openplatform
2012 07 28_cloud_reference_architecture_openplatform2012 07 28_cloud_reference_architecture_openplatform
2012 07 28_cloud_reference_architecture_openplatform
Yousun Jeong
 

More from Yousun Jeong (9)

Stsg17 speaker yousunjeong
Stsg17 speaker yousunjeongStsg17 speaker yousunjeong
Stsg17 speaker yousunjeong
 
Druid meetup 4th_sql_on_druid
Druid meetup 4th_sql_on_druidDruid meetup 4th_sql_on_druid
Druid meetup 4th_sql_on_druid
 
Kubernetes on aws
Kubernetes on awsKubernetes on aws
Kubernetes on aws
 
Data Analytics with Druid
Data Analytics with DruidData Analytics with Druid
Data Analytics with Druid
 
IEEE International Conference on Data Engineering 2015
IEEE International Conference on Data Engineering 2015IEEE International Conference on Data Engineering 2015
IEEE International Conference on Data Engineering 2015
 
Spark streaming , Spark SQL
Spark streaming , Spark SQLSpark streaming , Spark SQL
Spark streaming , Spark SQL
 
Big Telco Real-Time Network Analytics
Big Telco Real-Time Network AnalyticsBig Telco Real-Time Network Analytics
Big Telco Real-Time Network Analytics
 
Enterprise 환경에서의 오픈소스 기반 아키텍처 적용 사례
Enterprise 환경에서의 오픈소스 기반 아키텍처 적용 사례Enterprise 환경에서의 오픈소스 기반 아키텍처 적용 사례
Enterprise 환경에서의 오픈소스 기반 아키텍처 적용 사례
 
2012 07 28_cloud_reference_architecture_openplatform
2012 07 28_cloud_reference_architecture_openplatform2012 07 28_cloud_reference_architecture_openplatform
2012 07 28_cloud_reference_architecture_openplatform
 

Recently uploaded

Kolkata @Call @Girls 🐱‍🐉 XXXXXXXXXX 🐱‍🐉 Genuine WhatsApp Number for Real Meet
Kolkata @Call @Girls 🐱‍🐉  XXXXXXXXXX 🐱‍🐉 Genuine WhatsApp Number for Real MeetKolkata @Call @Girls 🐱‍🐉  XXXXXXXXXX 🐱‍🐉 Genuine WhatsApp Number for Real Meet
Kolkata @Call @Girls 🐱‍🐉 XXXXXXXXXX 🐱‍🐉 Genuine WhatsApp Number for Real Meet
lovelykumarilk789
 
Mumbai @Call @Girls Whatsapp 9930687706 With High Profile Service
Mumbai @Call @Girls Whatsapp 9930687706 With High Profile ServiceMumbai @Call @Girls Whatsapp 9930687706 With High Profile Service
Mumbai @Call @Girls Whatsapp 9930687706 With High Profile Service
kolkata dolls
 
How we built TryBoxLang in under 48 hours
How we built TryBoxLang in under 48 hoursHow we built TryBoxLang in under 48 hours
How we built TryBoxLang in under 48 hours
Ortus Solutions, Corp
 
mobile-app-development-company-in-noida.pdf
mobile-app-development-company-in-noida.pdfmobile-app-development-company-in-noida.pdf
mobile-app-development-company-in-noida.pdf
Mobile App Development Company in Noida - Drona Infotech
 
NYC 26-Jun-2024 Combined Presentations.pdf
NYC 26-Jun-2024 Combined Presentations.pdfNYC 26-Jun-2024 Combined Presentations.pdf
NYC 26-Jun-2024 Combined Presentations.pdf
AUGNYC
 
Intro to Amazon Web Services (AWS) and Gen AI
Intro to Amazon Web Services (AWS) and Gen AIIntro to Amazon Web Services (AWS) and Gen AI
Intro to Amazon Web Services (AWS) and Gen AI
Ortus Solutions, Corp
 
Ghatkopar @Call @Girls 🛴 9930687706 🛴 Aaradhaya Best High Class Mumbai Available
Ghatkopar @Call @Girls 🛴 9930687706 🛴 Aaradhaya Best High Class Mumbai AvailableGhatkopar @Call @Girls 🛴 9930687706 🛴 Aaradhaya Best High Class Mumbai Available
Ghatkopar @Call @Girls 🛴 9930687706 🛴 Aaradhaya Best High Class Mumbai Available
aviva54
 
Bhiwandi @Call @Girls Whatsapp 000000000 With Best And No 1
Bhiwandi @Call @Girls Whatsapp 000000000 With Best And No 1Bhiwandi @Call @Girls Whatsapp 000000000 With Best And No 1
Bhiwandi @Call @Girls Whatsapp 000000000 With Best And No 1
arvindkumarji156
 
@ℂall @Girls Kolkata ꧁❤ 000000000 ❤꧂@ℂall @Girls Service Vip Top Model Safe
@ℂall @Girls Kolkata  ꧁❤ 000000000 ❤꧂@ℂall @Girls Service Vip Top Model Safe@ℂall @Girls Kolkata  ꧁❤ 000000000 ❤꧂@ℂall @Girls Service Vip Top Model Safe
@ℂall @Girls Kolkata ꧁❤ 000000000 ❤꧂@ℂall @Girls Service Vip Top Model Safe
Misti Soneji
 
Abortion pills in Fujairah *((+971588192166*)☎️)¥) **Effective Abortion Pills...
Abortion pills in Fujairah *((+971588192166*)☎️)¥) **Effective Abortion Pills...Abortion pills in Fujairah *((+971588192166*)☎️)¥) **Effective Abortion Pills...
Abortion pills in Fujairah *((+971588192166*)☎️)¥) **Effective Abortion Pills...
Medical / Health Care (+971588192166) Mifepristone and Misoprostol tablets 200mg
 
Chennai @Call @Girls 🐱‍🐉 XXXXXXXXXX 🐱‍🐉 Genuine WhatsApp Number for Real Meet
Chennai @Call @Girls 🐱‍🐉  XXXXXXXXXX 🐱‍🐉 Genuine WhatsApp Number for Real MeetChennai @Call @Girls 🐱‍🐉  XXXXXXXXXX 🐱‍🐉 Genuine WhatsApp Number for Real Meet
Chennai @Call @Girls 🐱‍🐉 XXXXXXXXXX 🐱‍🐉 Genuine WhatsApp Number for Real Meet
lovelykumarilk789
 
COMPSAC 2024 D&I Panel: Charting a Course for Equity: Strategies for Overcomi...
COMPSAC 2024 D&I Panel: Charting a Course for Equity: Strategies for Overcomi...COMPSAC 2024 D&I Panel: Charting a Course for Equity: Strategies for Overcomi...
COMPSAC 2024 D&I Panel: Charting a Course for Equity: Strategies for Overcomi...
Hironori Washizaki
 
Migrate your Infrastructure to the AWS Cloud
Migrate your Infrastructure to the AWS CloudMigrate your Infrastructure to the AWS Cloud
Migrate your Infrastructure to the AWS Cloud
Ortus Solutions, Corp
 
dachnug51 - HCL Sametime 12 as a Software Appliance.pdf
dachnug51 - HCL Sametime 12 as a Software Appliance.pdfdachnug51 - HCL Sametime 12 as a Software Appliance.pdf
dachnug51 - HCL Sametime 12 as a Software Appliance.pdf
DNUG e.V.
 
dachnug51 - HCL Domino Roadmap .pdf
dachnug51 - HCL Domino Roadmap      .pdfdachnug51 - HCL Domino Roadmap      .pdf
dachnug51 - HCL Domino Roadmap .pdf
DNUG e.V.
 
Software development... for all? (keynote at ICSOFT'2024)
Software development... for all? (keynote at ICSOFT'2024)Software development... for all? (keynote at ICSOFT'2024)
Software development... for all? (keynote at ICSOFT'2024)
miso_uam
 
Seamless PostgreSQL to Snowflake Data Transfer in 8 Simple Steps
Seamless PostgreSQL to Snowflake Data Transfer in 8 Simple StepsSeamless PostgreSQL to Snowflake Data Transfer in 8 Simple Steps
Seamless PostgreSQL to Snowflake Data Transfer in 8 Simple Steps
Estuary Flow
 
@Call @Girls in Tirunelveli 🐱‍🐉 XXXXXXXXXX 🐱‍🐉 Tanisha Sharma Best High Cla...
 @Call @Girls in Tirunelveli 🐱‍🐉  XXXXXXXXXX 🐱‍🐉 Tanisha Sharma Best High Cla... @Call @Girls in Tirunelveli 🐱‍🐉  XXXXXXXXXX 🐱‍🐉 Tanisha Sharma Best High Cla...
@Call @Girls in Tirunelveli 🐱‍🐉 XXXXXXXXXX 🐱‍🐉 Tanisha Sharma Best High Cla...
JoyaBansal
 
Folding Cheat Sheet #7 - seventh in a series
Folding Cheat Sheet #7 - seventh in a seriesFolding Cheat Sheet #7 - seventh in a series
Folding Cheat Sheet #7 - seventh in a series
Philip Schwarz
 
Kolkata @ℂall @Girls ꧁❤ 000000000 ❤꧂@ℂall @Girls Service Vip Top Model Safe
Kolkata @ℂall @Girls ꧁❤ 000000000 ❤꧂@ℂall @Girls Service Vip Top Model SafeKolkata @ℂall @Girls ꧁❤ 000000000 ❤꧂@ℂall @Girls Service Vip Top Model Safe
Kolkata @ℂall @Girls ꧁❤ 000000000 ❤꧂@ℂall @Girls Service Vip Top Model Safe
Misti Soneji
 

Recently uploaded (20)

Kolkata @Call @Girls 🐱‍🐉 XXXXXXXXXX 🐱‍🐉 Genuine WhatsApp Number for Real Meet
Kolkata @Call @Girls 🐱‍🐉  XXXXXXXXXX 🐱‍🐉 Genuine WhatsApp Number for Real MeetKolkata @Call @Girls 🐱‍🐉  XXXXXXXXXX 🐱‍🐉 Genuine WhatsApp Number for Real Meet
Kolkata @Call @Girls 🐱‍🐉 XXXXXXXXXX 🐱‍🐉 Genuine WhatsApp Number for Real Meet
 
Mumbai @Call @Girls Whatsapp 9930687706 With High Profile Service
Mumbai @Call @Girls Whatsapp 9930687706 With High Profile ServiceMumbai @Call @Girls Whatsapp 9930687706 With High Profile Service
Mumbai @Call @Girls Whatsapp 9930687706 With High Profile Service
 
How we built TryBoxLang in under 48 hours
How we built TryBoxLang in under 48 hoursHow we built TryBoxLang in under 48 hours
How we built TryBoxLang in under 48 hours
 
mobile-app-development-company-in-noida.pdf
mobile-app-development-company-in-noida.pdfmobile-app-development-company-in-noida.pdf
mobile-app-development-company-in-noida.pdf
 
NYC 26-Jun-2024 Combined Presentations.pdf
NYC 26-Jun-2024 Combined Presentations.pdfNYC 26-Jun-2024 Combined Presentations.pdf
NYC 26-Jun-2024 Combined Presentations.pdf
 
Intro to Amazon Web Services (AWS) and Gen AI
Intro to Amazon Web Services (AWS) and Gen AIIntro to Amazon Web Services (AWS) and Gen AI
Intro to Amazon Web Services (AWS) and Gen AI
 
Ghatkopar @Call @Girls 🛴 9930687706 🛴 Aaradhaya Best High Class Mumbai Available
Ghatkopar @Call @Girls 🛴 9930687706 🛴 Aaradhaya Best High Class Mumbai AvailableGhatkopar @Call @Girls 🛴 9930687706 🛴 Aaradhaya Best High Class Mumbai Available
Ghatkopar @Call @Girls 🛴 9930687706 🛴 Aaradhaya Best High Class Mumbai Available
 
Bhiwandi @Call @Girls Whatsapp 000000000 With Best And No 1
Bhiwandi @Call @Girls Whatsapp 000000000 With Best And No 1Bhiwandi @Call @Girls Whatsapp 000000000 With Best And No 1
Bhiwandi @Call @Girls Whatsapp 000000000 With Best And No 1
 
@ℂall @Girls Kolkata ꧁❤ 000000000 ❤꧂@ℂall @Girls Service Vip Top Model Safe
@ℂall @Girls Kolkata  ꧁❤ 000000000 ❤꧂@ℂall @Girls Service Vip Top Model Safe@ℂall @Girls Kolkata  ꧁❤ 000000000 ❤꧂@ℂall @Girls Service Vip Top Model Safe
@ℂall @Girls Kolkata ꧁❤ 000000000 ❤꧂@ℂall @Girls Service Vip Top Model Safe
 
Abortion pills in Fujairah *((+971588192166*)☎️)¥) **Effective Abortion Pills...
Abortion pills in Fujairah *((+971588192166*)☎️)¥) **Effective Abortion Pills...Abortion pills in Fujairah *((+971588192166*)☎️)¥) **Effective Abortion Pills...
Abortion pills in Fujairah *((+971588192166*)☎️)¥) **Effective Abortion Pills...
 
Chennai @Call @Girls 🐱‍🐉 XXXXXXXXXX 🐱‍🐉 Genuine WhatsApp Number for Real Meet
Chennai @Call @Girls 🐱‍🐉  XXXXXXXXXX 🐱‍🐉 Genuine WhatsApp Number for Real MeetChennai @Call @Girls 🐱‍🐉  XXXXXXXXXX 🐱‍🐉 Genuine WhatsApp Number for Real Meet
Chennai @Call @Girls 🐱‍🐉 XXXXXXXXXX 🐱‍🐉 Genuine WhatsApp Number for Real Meet
 
COMPSAC 2024 D&I Panel: Charting a Course for Equity: Strategies for Overcomi...
COMPSAC 2024 D&I Panel: Charting a Course for Equity: Strategies for Overcomi...COMPSAC 2024 D&I Panel: Charting a Course for Equity: Strategies for Overcomi...
COMPSAC 2024 D&I Panel: Charting a Course for Equity: Strategies for Overcomi...
 
Migrate your Infrastructure to the AWS Cloud
Migrate your Infrastructure to the AWS CloudMigrate your Infrastructure to the AWS Cloud
Migrate your Infrastructure to the AWS Cloud
 
dachnug51 - HCL Sametime 12 as a Software Appliance.pdf
dachnug51 - HCL Sametime 12 as a Software Appliance.pdfdachnug51 - HCL Sametime 12 as a Software Appliance.pdf
dachnug51 - HCL Sametime 12 as a Software Appliance.pdf
 
dachnug51 - HCL Domino Roadmap .pdf
dachnug51 - HCL Domino Roadmap      .pdfdachnug51 - HCL Domino Roadmap      .pdf
dachnug51 - HCL Domino Roadmap .pdf
 
Software development... for all? (keynote at ICSOFT'2024)
Software development... for all? (keynote at ICSOFT'2024)Software development... for all? (keynote at ICSOFT'2024)
Software development... for all? (keynote at ICSOFT'2024)
 
Seamless PostgreSQL to Snowflake Data Transfer in 8 Simple Steps
Seamless PostgreSQL to Snowflake Data Transfer in 8 Simple StepsSeamless PostgreSQL to Snowflake Data Transfer in 8 Simple Steps
Seamless PostgreSQL to Snowflake Data Transfer in 8 Simple Steps
 
@Call @Girls in Tirunelveli 🐱‍🐉 XXXXXXXXXX 🐱‍🐉 Tanisha Sharma Best High Cla...
 @Call @Girls in Tirunelveli 🐱‍🐉  XXXXXXXXXX 🐱‍🐉 Tanisha Sharma Best High Cla... @Call @Girls in Tirunelveli 🐱‍🐉  XXXXXXXXXX 🐱‍🐉 Tanisha Sharma Best High Cla...
@Call @Girls in Tirunelveli 🐱‍🐉 XXXXXXXXXX 🐱‍🐉 Tanisha Sharma Best High Cla...
 
Folding Cheat Sheet #7 - seventh in a series
Folding Cheat Sheet #7 - seventh in a seriesFolding Cheat Sheet #7 - seventh in a series
Folding Cheat Sheet #7 - seventh in a series
 
Kolkata @ℂall @Girls ꧁❤ 000000000 ❤꧂@ℂall @Girls Service Vip Top Model Safe
Kolkata @ℂall @Girls ꧁❤ 000000000 ❤꧂@ℂall @Girls Service Vip Top Model SafeKolkata @ℂall @Girls ꧁❤ 000000000 ❤꧂@ℂall @Girls Service Vip Top Model Safe
Kolkata @ℂall @Girls ꧁❤ 000000000 ❤꧂@ℂall @Girls Service Vip Top Model Safe
 

Spark day 2017 - Spark on Kubernetes