Two self-managed Docker clusters are deployed on public clouds and fight each other in a ruthless battle. One has been designed to resist any form of threat. The other one's only aim is to destroy the first one. Who's going to win?
Although it is presented as entertainment, this talk shows off two serious platforms built on different principles. Beyond the technical aspects covered (Swarm/Kubernetes orchestration, IaaS clouds, and tools such as Terraform, kops, or Helm), it is an opportunity to discuss broader architecture topics such as immutable infrastructure, hybridization, microservices, etc.
3. Tweet us on #skynetvsapes ! !
@lpiot
@jpetazzo
@adrienblind
@laurentgrangeau
Prologue
Teaser: Deliver smart entertainment
“Two self-managed Docker clusters are deployed on public clouds and fight
each other in a ruthless battle.
One has been designed to resist any form of threat.
The other one's only aim is to destroy the first one.
Who's going to win?”
Skynet: autonomy first
▪ Hybrid: IaaS from Azure & AWS
▪ Self-healed
▪ Religion: Docker Swarm
▪ Custom chaos monkey
VS
Apes: services first
▪ Various services from AWS
▪ Cloud-healed
▪ Religion: Kubernetes
▪ BBVA’s chaos monkey
Season 01 timeline
▪ S01E01, 05/04, Devoxx BOF: Trailer
▪ S01E02, 19/04, Breizhcamp: Grand Premier
▪ S01E02bis, 11/05, RivieraDev: Rise of the machines
▪ S01E02ter, 15/05, Cloud Europe: Rise of Planet of Apes
▪ S01E03, 22/06, Voxxed Lux: Skynet Returns!
▪ S01E05, 09/11, Devops D-Day: Grand finale
Previously in SkynetVsApes:
● Proof of concept
● Clusters creation
● Decentralised storage : test Infinit.sh
● Netflix’s Chaos monkey test
Apes army
chaosmonkey principles
▪ Main objective: targets and terminates instances in a region
▪ When: randomly in a given range of time
▪ Frequency: one instance every 2 days (configurable… 😈)
▪ How: identifies instances running a given app through Spinnaker
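The principles above can be sketched in a few lines of Python. This is only an illustration of the idea (pick one instance of a given app, at a random moment inside a time window), not chaosmonkey's actual code. The inventory dictionary, the instance IDs, and the function names are hypothetical, and the inventory stands in for what Spinnaker would report.

```python
import random
from datetime import datetime, timedelta


def pick_victim(instances_by_app, app, rng):
    """Pick one running instance of the given app, or None if there is none."""
    candidates = instances_by_app.get(app, [])
    return rng.choice(candidates) if candidates else None


def pick_kill_time(window_start, window_end, rng):
    """Pick a random moment inside [window_start, window_end]."""
    span = (window_end - window_start).total_seconds()
    return window_start + timedelta(seconds=rng.uniform(0, span))


if __name__ == "__main__":
    rng = random.Random()
    # Hypothetical inventory; a real setup would query Spinnaker instead.
    inventory = {"brain-app": ["i-aaa", "i-bbb", "i-ccc"]}
    start = datetime(2000, 1, 1, 9, 0)
    end = datetime(2000, 1, 1, 15, 0)
    print(pick_victim(inventory, "brain-app", rng), pick_kill_time(start, end, rng))
```

Passing the random generator in as a parameter keeps the selection reproducible when a seeded generator is used, which helps when rehearsing a demo.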
chaosmonkey architecture
chaosmonkey
chaosmonkey install = WTF !
chaosmonkey install
chaosmonkey install = easy!
oh wait!
kubernetes on AWS = kops
chaosmonkey install process
[Diagram, labels in reading order: AWS account; KOPS; K8s on AWS; Helm; Spinnaker pkg; Spinnaker custom pkg; mySQL pkg; chaosmonkey pod; chaosmonkey docker image]
Demo time!
Skynet
Skynet-storage (Infinit.sh)
[Diagram: three servers joined in one Infinit network. Each server provides an Infinit silo (10GB, 30GB, and 5GB), has a user with a passport for the network, and runs a Docker registry container on top of a Docker volume plugin backed by the Infinit volume.]
Skynet-provisioning (Terraform)
▪ Using Terraform to automate cluster creation
▪ Leveraging Terraform's multiple providers
▪ AWS
▪ Azure
▪ GCE (soon…)
▪ On manager nodes
▪ docker swarm join --token token-manager manager-address:2377
▪ On worker nodes
▪ docker swarm join --token token-worker manager-address:2377
Skynet-resilience
▪ Focused on the platform’s resilience
▪ Apps’ resilience is provided by Swarm’s orchestrator
▪ Beware of apps architecture !
▪ A simple docker image / service
▪ Encloses Terraform provisioning scripts
▪ Deployed as a global service on every node
▪ Small introspector script checks whether the underlying Docker engine is the cluster leader and, if so, triggers Terraform regularly
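A minimal Python sketch of such an introspector, assuming the container can reach the local Docker engine and that the `docker` and `terraform` binaries are available inside it. It is not the script used in the talk. The function names are hypothetical, and the `runner` parameter exists only so the logic can be exercised without a real cluster.

```python
import subprocess

# Ask the local engine whether it is the swarm leader (prints "true" or "false").
LEADER_CMD = ["docker", "node", "inspect", "self",
              "--format", "{{ .ManagerStatus.Leader }}"]
# Re-apply the enclosed Terraform scripts without an interactive prompt.
TERRAFORM_CMD = ["terraform", "apply", "-auto-approve"]


def run(cmd):
    """Run a command and return (exit_code, stdout)."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode, result.stdout


def is_leader(runner=run):
    """True only if the local engine reports itself as the swarm leader."""
    code, out = runner(LEADER_CMD)
    return code == 0 and out.strip() == "true"


def heal_once(runner=run):
    """Trigger Terraform if, and only if, this node is the leader."""
    if not is_leader(runner):
        return False
    code, _ = runner(TERRAFORM_CMD)
    return code == 0
```

Because the service is global, every node runs the same check, but only the current leader ends up calling Terraform. Calling `heal_once()` from a loop with a sleep between passes would give the "regularly" part.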
Skynet-terminator
▪ Chaos monkey to destroy Apes
▪ A simple docker image / service
▪ Encloses a script to shoot Apes’ instances
▪ Deployed as a unique replica
Demo time!
Moving further
Some next steps to strengthen the setup
▪ Skynet
▪ Finalize decentralized storage for Skynet
▪ Build our own dedicated, purpose-built, immutable OS with LinuxKit
▪ Let’s be serious: achieve Skynet’s self-healing with InfraKit
▪ Unleash the cloud with edge computing: push armed nodes onto Raspberry Pis
▪ Implement the least privilege model to enhance resiliency
▪ Apes
▪ Unleash its own chaos monkey
▪ Federate several clusters
▪ These are platforms: we still have to deliver a brain app
Deep dive:
Least Privilege Model
What’s “Least Privilege” ?
▪ Give each agent* in the system the exact information it needs; no more, no less
▪ Similar to compartmentalization of classified information
▪ In the Manhattan Project, most teams working on the first A-bomb didn’t have access to the big picture
▪ In information security: if a service doesn’t need a particular password, token, or permission – it shouldn’t have it!
*Process, user, service, node...
Least Privilege in SwarmKit
▪ In a SwarmKit cluster, the only nodes with full access are the manager nodes
▪ Worker nodes do not have access to the Raft log
▪ Worker nodes only know the addresses of the manager nodes
▪ Worker nodes only have their own private key (and CA cert)
▪ Communication between two nodes uses a session key (different at each connection; rotated every 12 hours)
Worker nodes are like Jon Snow
▪ A worker node has access to detailed information about a container only if it is designated to run this container
▪ A worker node knows about an overlay network (and the associated keys, if it’s an encrypted overlay) only if it is supposed to run a container attached to this network
▪ Overlay networks are resilient to MITM attacks due to the top-down approach of their configuration (see this DockerCon talk by @lbernail for details)
▪ Secrets (a special SwarmKit construct) are only pushed to worker nodes who need them, and never written to their disk
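As a toy model of this "need to know" rule (not SwarmKit's dispatcher; the service, network, secret, and node names are all hypothetical), the set of things a worker learns can be derived purely from the tasks assigned to it:

```python
def visible_to_worker(node, tasks, services):
    """Return the networks and secrets a worker needs for its assigned tasks."""
    networks, secrets = set(), set()
    for task in tasks:
        if task["node"] != node:
            continue  # tasks scheduled elsewhere stay invisible to this worker
        spec = services[task["service"]]
        networks.update(spec["networks"])
        secrets.update(spec["secrets"])
    return {"networks": networks, "secrets": secrets}


# Hypothetical cluster state, as a manager would hold it.
services = {
    "web": {"networks": {"frontend"}, "secrets": set()},
    "db": {"networks": {"backend"}, "secrets": {"db_password"}},
}
tasks = [
    {"service": "web", "node": "worker-1"},
    {"service": "db", "node": "worker-2"},
]
```

In this model `worker-1` never hears about the `backend` network or `db_password`, and a worker with no tasks learns nothing at all.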
SwarmKit clusters are resilient
▪ Least privilege reduces the splash damage of node compromise
▪ Compromising a node only compromises the containers on that node; not the whole cluster
▪ Enables effective definition of security perimeters through tags
▪ Services can be restricted to specific perimeters through placement constraints
docker service create --constraint node.labels.security==low ...
Enforcement through authorization
▪ By default, the Docker API is all or nothing
▪ Authorization plugins* let you vet each API request
▪ Examples:
▪ deny deployment requests lacking a placement constraint specifying a security tag
▪ only allow deployment in a given security zone if the authenticated user is within the appropriate group
▪ Authorization plugins can be cascaded
*Available on Docker CE & EE. UCP is an authorization plugin.
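The two example policies could be sketched as a pure decision function. This is a simplified illustration, not a complete plugin: it omits the HTTP server a real authorization plugin needs. The request and response field names (`User`, `RequestMethod`, `RequestUri`, `RequestBody`, `Allow`, `Msg`) follow my reading of the Docker authorization plugin protocol and should be checked against its documentation. The `user_zones` mapping is a hypothetical stand-in for group membership.

```python
import base64
import json

PREFIX = "node.labels.security=="


def authorize(authz_request, user_zones):
    """Return an AuthZ-style response dict for one Docker API request."""
    uri = authz_request.get("RequestUri") or authz_request.get("RequestURI") or ""
    if authz_request.get("RequestMethod") != "POST" or not uri.endswith("/services/create"):
        return {"Allow": True}  # this sketch only vets service creation

    # The request body is assumed to arrive base64-encoded.
    raw = base64.b64decode(authz_request.get("RequestBody") or "")
    body = json.loads(raw) if raw else {}
    placement = (body.get("TaskTemplate") or {}).get("Placement") or {}
    constraints = placement.get("Constraints") or []

    # Collect the security zones named by placement constraints.
    zones = {c.replace(" ", "")[len(PREFIX):]
             for c in constraints if c.replace(" ", "").startswith(PREFIX)}
    if not zones:
        return {"Allow": False, "Msg": "missing node.labels.security placement constraint"}

    allowed = user_zones.get(authz_request.get("User", ""), set())
    if not zones <= allowed:
        return {"Allow": False, "Msg": "user not allowed in requested security zone"}
    return {"Allow": True}
```

Keeping the decision separate from the transport makes the policy easy to test on its own, and several such functions could be chained to mimic cascaded plugins.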
Skynet
▪ An AI like Skynet could leverage different security levels …
▪ Core services
▪ managers, storage, self-healing routines
▪ self-provisioned instances on well-protected, well-funded IaaS accounts
▪ Compute nodes
▪ machine learning, deep learning
▪ instances hacked a long time ago, or deployed with fragile funds
▪ Honeypots
▪ scamming and phishing operations, quarantine
▪ anything goes!
Deep dive:
Self-Healing Infrastructure
From imperative to declarative
▪ Example: managing VM deployment with IaaS
▪ Imperative style
▪ I create VMs directly
▪ I use a web console, CLI, API …
▪ when things break, I have to find out what and fix it
▪ Declarative style
▪ I describe what I want (with a CloudFormation template, Terraform plan…)
▪ I run a tool to reconcile my infrastructure with the description I wrote
▪ when things break, I just run the tool again
▪ enables infrastructure as code
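The declarative style boils down to comparing a description with reality. Here is a minimal Python sketch of that comparison, with VM names as plain strings and an in-memory set standing in for the cloud. The function names are hypothetical and this is not how Terraform or CloudFormation are implemented.

```python
def plan(desired, actual):
    """Compare desired and actual VM sets; return what to create and destroy."""
    return {
        "create": sorted(desired - actual),
        "destroy": sorted(actual - desired),
    }


def apply(desired, actual):
    """Apply the plan to an in-memory 'cloud' and return the new actual state."""
    steps = plan(desired, actual)
    return (actual | set(steps["create"])) - set(steps["destroy"])
```

Running `apply` a second time against the result yields an empty plan, which is why "when things break, I just run the tool again" is safe.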
From declarative to self-healing
▪ Self-healing infrastructure continuously reconciles (fixes) itself
▪ Examples:
▪ AWS Auto-Scaling Groups
▪ Convox (leverages AWS Lambda)
▪ InfraKit (cross-platform approach)
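What separates self-healing from a declarative tool run by hand is that the reconciliation runs in a loop, with nobody triggering it. A minimal sketch follows, assuming hypothetical `observe`, `create`, and `destroy` callbacks that would wrap a cloud API. None of the tools listed above is implemented this way in detail.

```python
def self_heal(desired, observe, create, destroy, rounds):
    """Run `rounds` reconciliation passes; return how many passes made a change."""
    repairs = 0
    for _ in range(rounds):
        actual = observe()
        missing, extra = desired - actual, actual - desired
        for name in sorted(missing):
            create(name)   # bring back what should exist
        for name in sorted(extra):
            destroy(name)  # remove what should not exist
        if missing or extra:
            repairs += 1
    return repairs


if __name__ == "__main__":
    cloud = {"node-1"}  # hypothetical in-memory stand-in for an IaaS account
    wanted = {"node-1", "node-2", "node-3"}
    print(self_heal(wanted, lambda: set(cloud), cloud.add, cloud.discard, rounds=3))
    print(sorted(cloud))
```

A real controller would sleep between passes and run forever; the fixed number of rounds here only keeps the sketch finite.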
Conclusion
Work in (perpetual) progress !
Propose cool hacks: pull requests on the repo … and be credited for your participation!
A story in which you can be the hero