Kakaocorp
From unmanned Datacenter
To Algorithmic Economy using
Openstack
Andrew Yongjoon Kong
andrew.kong@kakaocorp.com
LTHlab
Andrew Yongjoon Kong
• Cloud Technical Advisory for Government Broadcast Agency
• Adjunct Prof., Ajou Univ.
• Korea Data Base Agency, Acting Professor for Big Data
• Member of National Information Agency Big Data Advisory Committee
• KT cloudware tech lead (ex)
• Kakao → Daum Kakao → Kakaocorp, Cloud Computing Cell lead
Supervised, Korean edition
Korean edition coming soon.
Our vision.
Our culture.
Trust, Conflicts, Commitment
What is Cloud?
From our side
• Cloud == “Programmable Resource Management”
• What is programmable?
• What is a resource?
• What is management?
• NOP!
• Cloud is one of the ways of managing and deploying resources
• Basically, it’s culture.
• Technology can support this culture
• Our culture is “Automation”
Some Numbers
5xxx VMs are running.
We already revealed this last February at OpenStack Community Days, Korea
Some Numbers
964 tenants
455 pull requests since September 2014
136 VMs are created/deleted per day
Some information about Kakao OpenStack
OpenStack releases: from Grizzly to Kilo
3 regions in total
Additional services: Heat/Trove/Sahara
Unmanned Data Center
Self-managed computational resource: Krane
• No dedicated staff at a front desk to take orders
• The API is open 24 x 7 (we try to)
• Using the OpenStack API is the user’s job.
• Maintaining the OpenStack cloud is our job.
• We do not control anything at all.
Success or Issue?
The unit is “krane” (virtual money).
Just for fun; not actually charged.
From only one region.
Critical Volume vs Controlled Volume
When something goes over its critical volume,
the point of view has to change!
Cloud: we do adopt DevOps culture: KField
Controlled volume
Genesis:
- Krane was based on “left-over or out-of-warranty resources”
- Some hypervisors (not VMs) had only 16 GB.
- The interconnect was only 1 Gbps
- It was only for “dev” stage services.
Exodus:
- more than 128 GB
- 10 Gbps
- SSD
It needs to have control.
The easiest way:
- Set quotas like everyone does.
Cloud: we do have SDN, no OpenFlow, and no L2 network either
Cloud: we do have SDN, no OpenFlow, nothing else
[Diagram labels: compute node with eth0, nova-compute, neutron-linuxbridge-agent, neutron-dhcp-agent, a Linux bridge, and a VM (IP 10.10.100.2/32, gateway 10.10.100.1); routing table entry “10.100.10.2/32 via 192.1.1.201”; BGP, 192.1.1.202; virtual router (neutron-l3-agent) with two VLANs, a service route table, and a management route table. Legend: virtual switch block, process block.]
Practice Frugality to Boost Creativity
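The routing table entry on the slide suggests that each VM is reached through a /32 host route pointing at its compute node, rather than through a shared L2 segment. A minimal sketch of that idea follows, using only Python's standard `ipaddress` module. The `HostRouteTable` class and its method names are hypothetical and are not part of Neutron or any BGP daemon; pairing the VM address 10.10.100.2 with the next hop 192.1.1.201 is an assumption drawn from the diagram labels.

```python
import ipaddress


class HostRouteTable:
    """Toy routing table holding one /32 route per VM (illustration only)."""

    def __init__(self):
        # Maps a /32 network to the next hop (assumed: the compute node address).
        self.routes = {}

    def announce(self, vm_ip, next_hop):
        """Record a host route, as a routing daemon on the node might announce it."""
        prefix = ipaddress.ip_network(f"{vm_ip}/32")
        self.routes[prefix] = ipaddress.ip_address(next_hop)

    def withdraw(self, vm_ip):
        """Remove the host route, for example when the VM is deleted."""
        self.routes.pop(ipaddress.ip_network(f"{vm_ip}/32"), None)

    def lookup(self, dst_ip):
        """Return the next hop for dst_ip, or None if no host route matches."""
        dst = ipaddress.ip_address(dst_ip)
        for prefix, next_hop in self.routes.items():
            if dst in prefix:
                return next_hop
        return None


table = HostRouteTable()
table.announce("10.10.100.2", "192.1.1.201")
print(table.lookup("10.10.100.2"))  # next hop for the VM
print(table.lookup("10.10.100.3"))  # no route for a neighbouring address
```

Because every route covers exactly one address, a neighbouring address gets no match, which is one way to read "no L2 network": nothing is reachable merely by sharing a subnet.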
Why we want to rethink quota
What doesn’t exist at Kakao:
• No live migration
• No H/A for compute nodes
• No mirroring of the system disk
• No bonding for the compute node network
• No extra interface for service and storage
→ Technically, nothing extra for failure
→ What if a server goes down? Software will take care of it.
→ We recommend that users be ready for some failures.
→ We do have LB, volume, and object storage for stability
Culture: Trust
We do understand that our developers
are in a harsh environment.
Adding quota on top of this would make
developers more stressed.
Culture 2: Commitment
So, we want our users to have the freedom
to create resources.
But we also want our users to take
responsibility for deleting unused resources.
We understand this is quite a tedious job,
so we decided to find unused VMs instead.
Simple and difficult at the same time
It looks simple, but it is quite difficult to define
what an “unused” vs. “low usage” resource is.
The initial DC-scale garbage collection
Anyway, somehow, we have to define some guidelines for unused resources.
And it should be done in an algorithmic way.
[Diagram labels: CPU, load, traffic, login, IO, top process; every resource data model; analysis; notification.]
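One way to picture an algorithmic guideline built from the signals named on this slide (CPU, load, traffic, login, IO, top process) is a small rule-based classifier. The sketch below is only an illustration: the `VmUsage` fields, the threshold values, and the list of "idle" processes are assumptions made for the example, not the rules Kakao actually used.

```python
from dataclasses import dataclass


@dataclass
class VmUsage:
    """Hypothetical per-VM summary over an observation window."""
    name: str
    cpu_percent: float    # average CPU utilisation
    load_avg: float       # average system load
    traffic_kbps: float   # average network traffic
    io_ops: float         # average disk I/O operations per second
    logins: int           # number of interactive logins
    top_process: str      # busiest process observed


# Illustrative thresholds only; real values would have to be tuned.
THRESHOLDS = {"cpu_percent": 1.0, "load_avg": 0.05, "traffic_kbps": 1.0, "io_ops": 1.0}
IDLE_PROCESSES = {"", "systemd", "sshd"}


def classify(vm: VmUsage) -> str:
    """Return 'unused', 'low usage', or 'in use' from simple rules."""
    quiet = [getattr(vm, key) < limit for key, limit in THRESHOLDS.items()]
    if all(quiet) and vm.logins == 0 and vm.top_process in IDLE_PROCESSES:
        return "unused"
    if sum(quiet) >= 3:
        return "low usage"
    return "in use"


def notify(vms):
    """Build one notification message per VM classified as unused."""
    return [f"{vm.name}: looks unused, please delete or confirm"
            for vm in vms if classify(vm) == "unused"]


fleet = [
    VmUsage("vm-a", 0.2, 0.01, 0.1, 0.0, 0, "sshd"),
    VmUsage("vm-b", 0.5, 0.02, 0.3, 0.2, 4, "sshd"),
    VmUsage("vm-c", 35.0, 1.2, 900.0, 50.0, 2, "java"),
]
for vm in fleet:
    print(vm.name, classify(vm))
print(notify(fleet))
```

Here `vm-b` is quiet on every metric but has recent logins, so it lands in "low usage" rather than "unused". That gap is exactly the distinction the previous slide calls hard to define.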
Unified Data System
First of all, a unified metric store/retrieve system is needed to detect certain usage levels
of computing resources.
• Has to gather/retrieve data in a unified way
• Has to cover all resources, from physical machines to virtual machines and
network switches.
• Has to interface with the configuration management database (CMDB)
• Has to interface with the internal ERP
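The requirement to gather and retrieve metrics "in a unified way" can be illustrated with a single record type shared by physical machines, VMs, and switches. The sketch below is an assumption-laden example: the `Metric` class, its field names, and the tag keys are hypothetical, and the rendered line only imitates the general shape of a time-series "put" line (metric, timestamp, value, tags) rather than reproducing any particular store's protocol.

```python
from dataclasses import dataclass, field


@dataclass
class Metric:
    """One data point in a hypothetical unified format."""
    name: str         # e.g. "cpu.util"
    timestamp: int    # Unix seconds
    value: float
    tags: dict = field(default_factory=dict)  # resource identity and kind

    def to_line(self) -> str:
        """Render as 'put <name> <ts> <value> k=v ...' with tags sorted by key."""
        tag_text = " ".join(f"{k}={v}" for k, v in sorted(self.tags.items()))
        return f"put {self.name} {self.timestamp} {self.value} {tag_text}".rstrip()


# The same record shape covers a hypervisor, a VM, and a switch port.
points = [
    Metric("cpu.util", 1450000000, 3.5, {"kind": "physical", "host": "hv-01"}),
    Metric("cpu.util", 1450000000, 0.2, {"kind": "vm", "host": "vm-a"}),
    Metric("if.in_octets", 1450000000, 1200.0,
           {"kind": "switch", "host": "sw-01", "port": "1"}),
]
for point in points:
    print(point.to_line())
```

Keeping the resource identity in tags, instead of in separate schemas per device type, is what would let a single query path join the data against a CMDB or ERP by host name.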
Integrated Information Service Bus & EIP: Code Name Crow
Based on open-source components
• Kafka
• Samza
• Camel
• Storm
• Gobblin
• YARN
• HDFS
• etcd
• OpenTSDB
• HBase
• Tajo
• Grafana
Integrated Information Service Bus & EIP: Code Name Crow
Enterprise Integration
• Topic-based data ETL
• Can cover every computing resource (physical server, virtual instance, container, public cloud)
• Abstracts a “Data Center Information layer”
• Enables deep engineering experience over every resource.
[Diagram labels: physical servers, virtual instances, containers, external clouds, others (switches, logs); monitoring; CROW; IMS (Kakao CMDB API); SB; rule engine; notification; ETL; Data Center Information abstraction layer; API; predicting; scheduling; control; OpenStack; Heat; other service APIs; data center (or service) management activity.]
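The flow described for Crow (topic-based ETL, CMDB enrichment, a rule engine, and notification) can be sketched with a tiny in-memory publish/subscribe bus. This is an illustration under stated assumptions, not Crow's implementation: the `Bus` class stands in for a real message broker, and the topic names, the `CMDB` lookup table, and the 1% CPU rule are all invented for the example.

```python
from collections import defaultdict


class Bus:
    """In-memory stand-in for a topic-based message bus (not a real broker)."""

    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self.subscribers[topic].append(handler)

    def publish(self, topic, message):
        for handler in self.subscribers[topic]:
            handler(message)


bus = Bus()
notifications = []

# Hypothetical CMDB lookup table: host name -> owning team.
CMDB = {"vm-a": "team-search", "hv-01": "team-infra"}


def enrich(raw):
    """ETL step: attach the owner from the CMDB, then forward the event."""
    event = dict(raw, owner=CMDB.get(raw["host"], "unknown"))
    bus.publish("enriched.metrics", event)


def rule_engine(event):
    """Rule step: flag hosts whose CPU is under an illustrative 1% threshold."""
    if event["metric"] == "cpu.util" and event["value"] < 1.0:
        bus.publish("notifications", f"{event['host']} ({event['owner']}): idle CPU")


bus.subscribe("raw.metrics", enrich)
bus.subscribe("enriched.metrics", rule_engine)
bus.subscribe("notifications", notifications.append)

bus.publish("raw.metrics", {"host": "vm-a", "metric": "cpu.util", "value": 0.2})
bus.publish("raw.metrics", {"host": "hv-01", "metric": "cpu.util", "value": 42.0})
print(notifications)
```

Each stage only knows the topic it reads from and the topic it writes to, so a new consumer (for example a scheduler or a predictor) could subscribe to `enriched.metrics` without touching the existing stages.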
Result
We started by finding unused resources
– We created a unified monitoring system
– It can replace the pre-existing system monitoring system
– It can extract/analyze more information
– We ended up creating a brand-new resource information center
We targeted 10% as potential candidates.
More than 40% of them were real “abandoned VMs”,
left behind by structural changes (that is, not left on purpose).
Result: Add another controlled-volume subsystem
Q&A
P.S. We’re hiring, always!
http://www.kakaocorp.com/recruit
