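13. Dynamic scale-in/scale-out in Hadoop
Deploy a separate compute cluster for each tenant, all sharing a single HDFS
Increase or decrease the number of TaskTrackers according to priority and available resources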
[Figure: two tenants, "Ad hoc data mining" and "Production recommendation engine", each shown with a Job Tracker and a set of compute VMs drawn from a dynamic resource pool. Compute layer: compute VMs on the virtualization platform. Data layer: HDFS spanning the hosts.]
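The scaling rule on this slide (grow or shrink each tenant's TaskTracker count by priority and available resources) can be sketched as a small allocation function. This is a minimal illustration, not the scheduler of any particular product: the `Tenant` fields, the per-tenant minimum, and the "higher number wins" priority convention are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Tenant:
    name: str
    priority: int      # assumed convention: higher value is served first
    demand: int        # TaskTracker VMs the tenant currently wants
    minimum: int = 1   # assumed floor so no compute cluster shrinks to zero


def plan_task_trackers(tenants, pool_size):
    """Return {tenant name: TaskTracker VM count} for a shared resource pool."""
    # Give every tenant its floor first, so low-priority work still progresses.
    plan = {t.name: min(t.minimum, t.demand) for t in tenants}
    free = pool_size - sum(plan.values())
    if free < 0:
        raise ValueError("pool too small for the tenants' minimums")
    # Hand out the remaining VMs in priority order, up to each tenant's demand.
    for t in sorted(tenants, key=lambda t: t.priority, reverse=True):
        extra = min(t.demand - plan[t.name], free)
        plan[t.name] += extra
        free -= extra
    return plan


tenants = [
    Tenant("production-recommendation-engine", priority=10, demand=12),
    Tenant("ad-hoc-data-mining", priority=1, demand=8),
]
print(plan_task_trackers(tenants, pool_size=16))
```

Re-running the plan whenever demand changes gives the scale-in/scale-out behaviour: if the production tenant's demand drops, the VMs it releases flow to the ad hoc cluster on the next pass.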
14. Virtualization is the best approach to multi-tenant consolidation
Aspect | Physical approach | Virtualized approach
Resource sharing | Yes: users share a common Hadoop cluster | Yes: users share common physical servers across different Hadoop clusters
Data sharing | Yes: users share a common Hadoop cluster | Yes: different compute clusters share a common HDFS cluster
Performance isolation | Weak: by slot count | Strong: by CPU, RAM, and disk I/O
Failure isolation | None: a bad job fails the entire cluster | Strong: a failure impacts only one cluster
Configuration isolation | None: same configuration, same distro, same version | Yes: free to use a different distro, version, and configuration
Security isolation | Weak: enforced by Hadoop authentication and authorization | Strong: cluster-level isolation
Scalability | A single master node's capacity becomes a bottleneck | As many NameNodes and JobTrackers as needed
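The "data sharing" and "configuration isolation" rows can be made concrete with a short sketch that renders per-tenant Hadoop configuration files: every tenant points at the same HDFS, while each has its own JobTracker. The property names `fs.default.name` and `mapred.job.tracker` are the Hadoop 1.x settings for the default file system and the JobTracker address; the host names and ports below are hypothetical values chosen for illustration.

```python
from xml.sax.saxutils import escape

# Hypothetical NameNode address of the shared HDFS cluster.
SHARED_HDFS = "hdfs://namenode.example.com:8020"


def site_xml(properties):
    """Render a dict of properties as a Hadoop *-site.xml document."""
    rows = "".join(
        f"  <property><name>{escape(k)}</name><value>{escape(v)}</value></property>\n"
        for k, v in properties.items()
    )
    return f'<?xml version="1.0"?>\n<configuration>\n{rows}</configuration>\n'


def tenant_config(job_tracker_host, job_tracker_port=8021):
    """Build the config files for one tenant's compute cluster."""
    # Same file system for every tenant: this is the shared data layer.
    core = {"fs.default.name": SHARED_HDFS}
    # A separate JobTracker per tenant: this is the isolated compute layer.
    mapred = {"mapred.job.tracker": f"{job_tracker_host}:{job_tracker_port}"}
    return {"core-site.xml": site_xml(core), "mapred-site.xml": site_xml(mapred)}


# Hypothetical JobTracker hosts, one per tenant.
production = tenant_config("jt-production.example.com")
ad_hoc = tenant_config("jt-adhoc.example.com")
print(production["mapred-site.xml"])
```

Because each tenant gets its own set of files, other settings (and, per the table, even the distro and version) can differ per cluster without affecting the shared data layer.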