www.hashicorp.com
@hashicorp
hello@hashicorp.com
nic@hashicorp.com
@sheriffjackson
NIC JACKSON
AGENDA
HASHICORP
SCHEDULING
NOMAD
Overview
Fundamentals
Job Configuration
Scheduling
Demo
HASHICORP
OVERVIEW
COMPANY OVERVIEW
FOUNDED: 2012 by Mitchell Hashimoto and Armon Dadgar
MISSION: We enable organizations to provision, secure, and run any infrastructure for any application
INVESTORS: Mayfield Fund, GGV Capital, Redpoint and True Ventures
KEY PRODUCTS: Vagrant, Packer, Terraform, Vault, Nomad, Consul
OSS TO ENTERPRISE
SOFTWARE INNOVATORS TECHNOLOGY PARTNERS
PRODUCT SUITE
NOMAD
Nomad
SCHEDULING
OVERVIEW
Schedulers map a set of work to a set of resources
CPU SCHEDULER
[Diagram: the KERNEL's CPU SCHEDULER maps processes (APACHE, REDIS, BASH) onto four COREs]
[Diagram: the same CPU SCHEDULER and processes, shown with two COREs]
SCHEDULERS IN THE WILD
Type               Work              Resources
CPU Scheduler      Threads           Physical Cores
EC2 / Nova         Virtual Machines  Hypervisors
Hadoop YARN        MapReduce Jobs    Client Nodes
Cluster Scheduler  Applications      Machines
SCHEDULER ADVANTAGES
Higher Resource Utilization
• Bin Packing
• Over-Subscription
• Job Queueing
Decouple Work from Resources
• Abstraction
• API Contracts
• Standardization
Better Quality of Service
• Priorities
• Resource Isolation
• Pre-emption
NOT A NEW CONCEPT
BASED ON RESEARCH
NOMAD
OVERVIEW
NOMAD DESIGN PRINCIPLES
HashiCorp confidential do not distribute
Integrated scheduler and cluster manager
Distributed, shared state, optimistically concurrent
Agent-based, client/server
No dependencies
NOMAD CHARACTERISTICS
Multi-datacenter and multi-region
Highly performant and highly available
Hybrid workloads with multiple schedulers and drivers
Seamlessly integrates with HashiCorp ecosystem
NOMAD
FUNDAMENTALS
SINGLE REGION DEPLOYMENT
[Diagram: three SERVERs (one LEADER, two FOLLOWERs) linked by REPLICATION and FORWARDING; CLIENTs in DC1, DC2 and DC3 reach the servers over RPC]
MULTI REGION DEPLOYMENT
[Diagram: REGION A and REGION B each run three SERVERs (one LEADER, two FOLLOWERs) linked by REPLICATION and FORWARDING; the regions are connected by GOSSIP and REGION FORWARDING]
SERVER ARCHITECTURE
Omega Class Scheduler
Pluggable Logic
Internal Coordination and State
Multi-Region / Multi-Datacenter
CLIENT ARCHITECTURE
Broad OS Support
Host Fingerprinting
Pluggable Drivers
Job restarts and lifecycle management
CLIENT DRIVERS
Containerized
• Docker
• rkt
• Windows Server Containers
Virtualized
• Qemu / KVM
• Hyper-V
• Xen
Standalone
• Java Jar
• C#
• Static Binaries
CLIENT FINGERPRINTING
Type                 Examples
Operating System     Kernel, OS, Version
Hardware             CPU, Memory, Disk
Apps (Capabilities)  Docker, Java, Consul
Environment          AWS, GCE
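
To make the fingerprinting table concrete, here is a minimal Python sketch of a client collecting a few host attributes. It is an illustration only, not Nomad's fingerprinter: the function name `fingerprint` and the attribute keys are assumptions chosen for this example, and the cloud-environment row (AWS, GCE) is left out because it would need a metadata service.

```python
import os
import platform
import shutil


def fingerprint() -> dict:
    """Collect a few host attributes, loosely following the table's rows.

    The key names are hypothetical and only illustrate the idea.
    """
    attrs = {
        # Operating system: kernel name and release as reported by the host.
        "kernel.name": platform.system().lower(),
        "kernel.version": platform.release(),
        # Hardware: logical CPU count (falls back to 1) and architecture.
        "cpu.numcores": os.cpu_count() or 1,
        "cpu.arch": platform.machine(),
    }
    # Apps (capabilities): a capability is present if its binary is on PATH.
    for binary in ("docker", "java", "consul"):
        attrs[f"driver.{binary}"] = shutil.which(binary) is not None
    return attrs


if __name__ == "__main__":
    for key, value in sorted(fingerprint().items()):
        print(f"{key} = {value}")
```

A scheduler can then match job requirements, such as a required driver, against these attributes.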
NOMAD
JOB CONFIGURATION
JOB FILE
Declarative
Scheduler, driver, and resource needs
Lifecycle behavior
Constraints
Versioned
JOB FILE
redis.nomad
job "redis" {
  datacenters = ["us-east-1"]
  task "redis" {
    driver = "docker"
    config { image = "redis:v13" }
    resources {
      cpu    = 500 # MHz
      memory = 256 # MB
      network {
        mbits         = 10
        dynamic_ports = ["redis"]
      }
    }
  }
}
JOB FILE: TASK GROUPS
job "app" {
  group "app" {
    task "redis" {
      # ...
    }
    task "app" {
      # ...
    }
  }
}
JOB FILE: CONSTRAINTS
job "redis" {
  constraint {
    attribute = "${attr.kernel.version}"
    operator  = "version"
    value     = "> 3.19"
  }
  constraint {
    attribute = "${attr.platform.aws.instance-type}"
    value     = "p2.16xlarge"
  }
  task "redis" {
    # ...
  }
}
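
The two constraints above can be read as a version comparison and an equality check against node attributes. The Python sketch below shows one way such checks could be evaluated. It is not Nomad's implementation: the helper names (`parse_version`, `version_constraint_ok`, `node_passes`) and the plain-dict attribute format are assumptions made for this example. Versions are compared numerically, because as strings "3.2" would wrongly sort above "3.19".

```python
import operator
import re

_OPS = {">": operator.gt, ">=": operator.ge, "<": operator.lt,
        "<=": operator.le, "=": operator.eq}


def parse_version(text: str) -> tuple:
    """Turn '4.4.0-72-generic' into (4, 4, 0); any suffix is ignored."""
    match = re.match(r"\d+(?:\.\d+)*", text.strip())
    if match is None:
        raise ValueError(f"not a version: {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def version_constraint_ok(node_value: str, constraint: str) -> bool:
    """Evaluate a constraint such as '> 3.19' against a node's version."""
    op_text, wanted = constraint.split(None, 1)
    left, right = parse_version(node_value), parse_version(wanted)
    # Pad with zeros so that 3.19 and 3.19.0 compare as equal.
    width = max(len(left), len(right))
    left += (0,) * (width - len(left))
    right += (0,) * (width - len(right))
    return _OPS[op_text](left, right)


def node_passes(attrs: dict, constraints: list) -> bool:
    """Return True only if the node satisfies every constraint."""
    for c in constraints:
        value = attrs.get(c["attribute"])
        if value is None:
            return False
        if c.get("operator") == "version":
            if not version_constraint_ok(value, c["value"]):
                return False
        elif value != c["value"]:  # default: plain equality
            return False
    return True
```

For example, a node reporting kernel version "4.4.0-72-generic" and instance type "p2.16xlarge" would pass both constraints, while a node on kernel "3.13.0" would be filtered out.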
JOB FILE: CONSUL SERVICE DISCOVERY
job "redis" {
  task "redis" {
    # ...
    service {
      port = "redis"
      check {
        type     = "tcp"
        interval = "10s"
      }
    }
  }
}
JOB FILE: CONSUL CONFIGURATION
job "redis" {
  task "redis" {
    # ...
    template {
      data = <<EOH
bind_port: {{ env "NOMAD_PORT_db" }}
scratch_dir: {{ env "NOMAD_TASK_DIR" }}
service_key: {{ key "service/my-key" }}
EOH
      destination = "local/file.yml"
    }
  }
}
JOB FILE: VAULT INTEGRATION
job "redis" {
  task "redis" {
    # ...
    template {
      data = <<EOH
{{ with secret "secret/credentials" }}
username: {{ .Data.username }}
password: {{ .Data.password }}
{{ end }}
EOH
      destination = "local/file.yml"
    }
  }
}
JOB FILE: PARAMETERIZED
job "encode" {
  type = "batch"
  parameterized {
    payload       = "required"
    meta_required = ["s3-input", "s3-output", ...]
  }
  # ...
  task "ffmpeg" {
    driver = "exec"
    config {
      command = "ffmpeg"
      # When dispatched, the payload is written to a file that is then
      # read by the created task upon startup
      args = ["-config=${NOMAD_TASK_DIR}/config.json"]
      # ...
    }
  }
}
JOB FILE: PARAMETERIZED
$ nomad job dispatch encode video-config.json
$
$ cat video-config.json
{
  "s3-input": "https://s3-us-west-1.com/video-bucket/cb31dabb1",
  "s3-output": "https://s3-us-west-1.com/video-bucket/a149adbe3",
  "input-codec": "mp4",
  "output-codec": "webm",
  "quality": "1080p"
}
NOMAD
MULTI-CLOUD
Why Multi-Cloud?
• High Availability
• Redundancy
• Burstable Workload
• Cloud Migration
• Because we can
[Diagram: GOOGLE CLOUD and AWS each run a CONSUL cluster and a NOMAD cluster (one LEADER and two FOLLOWER servers per cluster, linked by REPLICATION and FORWARDING), plus NODE A, NODE B and LOAD BALANCERs. Other labels: NATS CLOUD MESSAGING; REGION FORWARDING (VPN) between the two clouds.]
NOMAD
SCHEDULING
SCHEDULING
Schedulers process evaluations and generate allocation plans.
Placement is determined using the relevant scheduler.
Scheduling involves feasibility checking and ranking.
Feasibility checking filters out nodes that are missing necessary drivers or fail the specified constraints.
Ranking scores the feasible nodes to find the best fit (bin packing).
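
The two phases can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not Nomad's scheduler: the `Node` and `Task` classes, the function names, and the scoring formula (average CPU and memory utilization after placement) are all invented for this example, and the resource-fit check is folded into the feasibility step for brevity.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    drivers: set
    cpu_total: int      # MHz
    mem_total: int      # MB
    cpu_used: int = 0
    mem_used: int = 0


@dataclass
class Task:
    driver: str
    cpu: int            # MHz
    memory: int         # MB


def feasible(node: Node, task: Task) -> bool:
    """Feasibility: the node has the driver and enough free resources."""
    return (task.driver in node.drivers
            and node.cpu_used + task.cpu <= node.cpu_total
            and node.mem_used + task.memory <= node.mem_total)


def score(node: Node, task: Task) -> float:
    """Ranking: utilization after placement; higher means a tighter pack."""
    cpu = (node.cpu_used + task.cpu) / node.cpu_total
    mem = (node.mem_used + task.memory) / node.mem_total
    return (cpu + mem) / 2


def place(nodes: list, task: Task):
    """Filter to feasible nodes, then pick the best-scoring one (or None)."""
    candidates = [n for n in nodes if feasible(n, task)]
    if not candidates:
        return None
    return max(candidates, key=lambda n: score(n, task))
```

Preferring the fullest node that still fits is what makes this bin packing: busy nodes fill up first and emptier nodes stay free for larger tasks.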
SCHEDULER TYPES
Service: Long-running applications and services
Batch: Short-lived data processing jobs (benefit from fast placement)
System: Lower-level jobs that run on all clients (logging, monitoring)
SCHEDULING: PLAN
$ nomad plan example.nomad
+ Job: "example"
+ Task Group: "cache" (1 create)
  + Task: "redis" (forces create)
Scheduler dry-run:
- All tasks successfully allocated.
$
SCHEDULING: PLAN
$ nomad plan example.nomad.java
+ Job: "example"
+ Task Group: "web" (1 create)
  + Task: "tomcat" (forces create)
Scheduler dry-run:
- WARNING: Failed to place all allocations.
  Task Group "web" (failed to place 1 allocation):
    * Constraint "missing drivers" filtered 2 nodes
$
SCHEDULING: RUN
$ nomad run example.nomad
==> Monitoring evaluation "4b8b7779"
    Evaluation triggered by job "example"
    Allocation "38720b8e" created: node "ec2f0830", group "cache"
    Evaluation status changed: "pending" -> "complete"
==> Evaluation "4b8b7779" finished with status "complete"
$
SCHEDULING: RUN DIFFERENT REGION
$ nomad run -region=gcp events.nomad
==> Monitoring evaluation "e2a8dfe6"
    Evaluation triggered by job "events"
    Allocation "6615b39f" modified: node "0d6a6103", group "pubsub"
    Evaluation status changed: "pending" -> "complete"
==> Evaluation "e2a8dfe6" finished with status "complete"
$
SCHEDULING: STATUS
$ nomad status example
ID          = example
Name        = example
Type        = service
Priority    = 50
Datacenters = us-west-1
Status      = running
Summary
Task Group  Queued  Starting  Running  Failed  Complete  Lost
cache       0       0         1        0       0         0
Allocations
ID        Eval ID   Node ID   Task Group  Desired  Status   Created At
38720b8e  4b8b7779  ec2f0830  cache       run      running  04/26/17 ...
$
DEMO!
Nomad
Million Container
Challenge
1,000 Jobs
1,000 Tasks per Job
5,000 Hosts on GCE
1,000,000 Containers
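
The headline figure follows directly from the numbers on the slide. A small Python check of the arithmetic is below; the per-host average is an added calculation that assumes containers are spread evenly across hosts, which the slide does not state.

```python
jobs = 1_000
tasks_per_job = 1_000
hosts = 5_000

# Each task runs as one container, so the total is jobs x tasks per job.
containers = jobs * tasks_per_job

# Average density, assuming an even spread across all hosts.
per_host = containers / hosts

print(f"{containers:,} containers, {per_host:.0f} per host on average")
```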
MILLION CONTAINER CHALLENGE
“640 KB ought to be enough for anybody.” – Bill Gates
REAL WORLD SCALE
2nd Largest Hedge Fund
18K Cores
5 Hours
2,200 Containers/second
Q/A
AND HASHICONF
SEPTEMBER 18-20
AUSTIN, TEXAS
www.hashiconf.com
#hashiconf
Links:
https://www.nomadproject.io
https://github.com/nicholasjackson/terraform-nomad-multi-cloud

Living the Nomadic life - Nic Jackson