
Amazon EMR Security: © 2018, Amazon Web Services, Inc. or Its Affiliates. All Rights Reserved


Amazon EMR Security

© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Key takeaways

• Amazon EMR Security


• Authentication
• Authorization
• Data Protection
• Audit
• Central management
• Best Practices

Amazon EMR Security

Security needs are continuously evolving
• Authentication
• Authenticate users and systems
• Authorization
• Provision access to data
• Data Protection
• Protect data at rest and in transit
• Audit
• Maintain a record of data access
• Administration
• Central management and consistent security
Security today in Hadoop with EMR 5.x
• Authentication: Who am I?
• Linux users
• Adding public keys
• Kerberos
• Authorization: What can I do?
• POSIX-based authorization
• Audit: What did I do?
• Log analysis
• Data Protection: Can data be encrypted at rest and over the wire?
• Encryption at rest
• Encryption in transit

Authentication
• LDAP: HiveServer2, Presto Coordinator, Spark Thrift Server, Hue Server, Zeppelin Server
• EC2 key pair: SSH as "hadoop"
• AWS credentials: EMR Step (EMR API)
New – Authentication with Kerberos

[Diagram labels: Users; KDC; Master Node; service principals for all cluster nodes; Microsoft Active Directory; DoAs; YARN RM]
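The Kerberos setup named on this slide (a KDC, with Microsoft Active Directory alongside it) is typically expressed in an EMR security configuration. The following is a minimal sketch, assuming a cluster-dedicated KDC with a cross-realm trust to Active Directory. The realm, domain, and server names are hypothetical placeholders, and the key names should be checked against the current EMR documentation before use.

```python
import json


def kerberos_security_configuration(realm, domain, admin_server, kdc_server,
                                    ticket_lifetime_hours=24):
    """Build a security configuration dict for a cluster-dedicated KDC.

    All argument values are supplied by the caller; nothing here is taken
    from the slide itself.
    """
    return {
        "AuthenticationConfiguration": {
            "KerberosConfiguration": {
                "Provider": "ClusterDedicatedKdc",
                "ClusterDedicatedKdcConfiguration": {
                    "TicketLifetimeInHours": ticket_lifetime_hours,
                    # Trust between the cluster KDC and an Active Directory realm.
                    "CrossRealmTrustConfiguration": {
                        "Realm": realm,
                        "Domain": domain,
                        "AdminServer": admin_server,
                        "KdcServer": kdc_server,
                    },
                },
            }
        }
    }


# Hypothetical Active Directory realm and hosts, for illustration only.
config = kerberos_security_configuration(
    realm="AD.EXAMPLE.COM",
    domain="ad.example.com",
    admin_server="ad.example.com",
    kdc_server="ad.example.com",
)
print(json.dumps(config, indent=2))
```

The printed JSON is the kind of document that would be supplied when creating a security configuration; the cluster itself would additionally need Kerberos attributes such as the realm and KDC admin password at launch.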
Authorization
• Storage-based
• EMRFS/S3
• HDFS
• HiveServer2 and Presto (SQL-based)
• HBase
• YARN queues
• Fine-grained access control by cluster tag (IAM)
• Apache Ranger on edge node (using CloudFormation)
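As one illustration of fine-grained access control by cluster tag, the sketch below builds an IAM policy that allows a few EMR actions only on clusters carrying a given tag. The tag key `department`, the tag value `analytics`, and the particular action list are assumptions chosen for the example; they do not come from the slides.

```python
import json

# Hypothetical tag key and value, chosen only for illustration.
TAG_KEY = "department"
TAG_VALUE = "analytics"


def cluster_tag_policy(tag_key: str, tag_value: str) -> dict:
    """Return an IAM policy dict scoped to clusters with the given tag."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "elasticmapreduce:DescribeCluster",
                    "elasticmapreduce:ListSteps",
                    "elasticmapreduce:AddJobFlowSteps",
                ],
                "Resource": "*",
                # The condition restricts the actions to tagged clusters.
                "Condition": {
                    "StringEquals": {
                        f"elasticmapreduce:ResourceTag/{tag_key}": tag_value
                    }
                },
            }
        ],
    }


policy = cluster_tag_policy(TAG_KEY, TAG_VALUE)
print(json.dumps(policy, indent=2))
```

Attaching a policy like this to an IAM user or role would limit those actions to clusters tagged `department=analytics`; untagged clusters or clusters with other values would not match the condition.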

New – EMRFS fine-grained authorization
• Context IAM role: analytics_prod (User: aduser, Group: analyst)
• Context IAM role: analytics_dev (User: aduser2, Group: dev)

IAM roles can be mapped to a user, a group, or an S3 prefix.

EMRFS Security Configuration - Example
{
  "AuthorizationConfiguration": {
    "EmrFsConfiguration": {
      "RoleMappings": [{
        "Role": "arn:aws:iam::123456789101:role/allow_EMRFS_access_for_user1",
        "IdentifierType": "User",
        "Identifiers": [ "user1" ]
      },{
        "Role": "arn:aws:iam::123456789101:role/allow_EMRFS_access_to_MyBuckets",
        "IdentifierType": "Prefix",
        "Identifiers": [ "s3://MyBucket/", "s3://MyOtherBucket/" ]
      },{
        "Role": "arn:aws:iam::123456789101:role/allow_EMRFS_access_for_AdminGroup",
        "IdentifierType": "Group",
        "Identifiers": [ "AdminGroup" ]
      }]
    }
  }
}
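To make the role mappings above easier to reason about, here is a small sketch of how a request could be matched against them. It assumes the mappings are evaluated top-down with the first match winning, and that no match means falling back to the cluster's default role (represented here as `None`). The `resolve_role` helper is hypothetical and is not part of EMRFS.

```python
from typing import Iterable, List, Optional

# The three mappings from the example configuration, in the same order.
ROLE_MAPPINGS: List[dict] = [
    {
        "Role": "arn:aws:iam::123456789101:role/allow_EMRFS_access_for_user1",
        "IdentifierType": "User",
        "Identifiers": ["user1"],
    },
    {
        "Role": "arn:aws:iam::123456789101:role/allow_EMRFS_access_to_MyBuckets",
        "IdentifierType": "Prefix",
        "Identifiers": ["s3://MyBucket/", "s3://MyOtherBucket/"],
    },
    {
        "Role": "arn:aws:iam::123456789101:role/allow_EMRFS_access_for_AdminGroup",
        "IdentifierType": "Group",
        "Identifiers": ["AdminGroup"],
    },
]


def resolve_role(mappings: List[dict], user: str, groups: Iterable[str],
                 s3_path: str) -> Optional[str]:
    """Return the first mapped role that matches, or None if nothing matches.

    Assumption: mappings are checked in list order and the first hit wins.
    """
    group_set = set(groups)
    for mapping in mappings:
        kind = mapping["IdentifierType"]
        identifiers = mapping["Identifiers"]
        if kind == "User" and user in identifiers:
            return mapping["Role"]
        if kind == "Group" and group_set.intersection(identifiers):
            return mapping["Role"]
        if kind == "Prefix" and any(s3_path.startswith(p) for p in identifiers):
            return mapping["Role"]
    return None  # caller would fall back to the cluster's default role


print(resolve_role(ROLE_MAPPINGS, "user1", ["analyst"], "s3://Elsewhere/x"))
print(resolve_role(ROLE_MAPPINGS, "bob", ["AdminGroup"], "s3://MyBucket/data"))
print(resolve_role(ROLE_MAPPINGS, "bob", [], "s3://Elsewhere/x"))
```

Under the first-match assumption, the order of entries matters: in the second call the prefix mapping is chosen even though the user also belongs to AdminGroup, because the prefix entry appears earlier in the list.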
Security - Authentication and authorization
Apache Ranger

• Plug-ins for Hive, HBase, YARN, and HDFS
• Row-level authorization for Hive (with data masking)
• Full auditing capabilities with embedded search
• Run Ranger on an edge node; see the AWS Big Data Blog
Encryption

• Spark
• Tez
• MapReduce
• Presto
• HBase
• Hive
• Pig

Security - Governance and auditing

• Custom AMIs
• AWS CloudTrail for EMR APIs
• S3 access logs for cluster S3 access
• YARN and application logs
• Ranger UI for application-level auditing
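For the CloudTrail item above, a rough sketch of what auditing EMR API calls can look like: CloudTrail log files are JSON documents with a top-level `Records` list, and EMR API calls carry the event source `elasticmapreduce.amazonaws.com`. The sample records, account ID, and user names below are invented for illustration.

```python
import json

EMR_EVENT_SOURCE = "elasticmapreduce.amazonaws.com"

# Hypothetical CloudTrail log content; real files contain many more fields.
SAMPLE_LOG = json.dumps({
    "Records": [
        {
            "eventTime": "2018-06-01T12:00:00Z",
            "eventSource": "elasticmapreduce.amazonaws.com",
            "eventName": "RunJobFlow",
            "userIdentity": {"arn": "arn:aws:iam::123456789101:user/alice"},
        },
        {
            "eventTime": "2018-06-01T12:05:00Z",
            "eventSource": "s3.amazonaws.com",
            "eventName": "CreateBucket",
            "userIdentity": {"arn": "arn:aws:iam::123456789101:user/bob"},
        },
        {
            "eventTime": "2018-06-01T13:00:00Z",
            "eventSource": "elasticmapreduce.amazonaws.com",
            "eventName": "TerminateJobFlows",
            "userIdentity": {"arn": "arn:aws:iam::123456789101:user/alice"},
        },
    ]
})


def emr_api_calls(log_text: str) -> list:
    """Return (time, caller ARN, API name) for each EMR event in a log file."""
    records = json.loads(log_text).get("Records", [])
    return [
        (
            record.get("eventTime"),
            record.get("userIdentity", {}).get("arn"),
            record.get("eventName"),
        )
        for record in records
        if record.get("eventSource") == EMR_EVENT_SOURCE
    ]


for when, who, what in emr_api_calls(SAMPLE_LOG):
    print(when, who, what)
```

In practice the log text would be read from the gzip-compressed files that CloudTrail delivers to S3; that retrieval step is left out here to keep the sketch self-contained.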

Custom AMIs

• Benefits
• Reduction of cluster start time
• Prevent unexpected bootstrap action failures
• Support for Amazon EBS root volume encryption

Custom AMIs

• Requirements
• Must be an Amazon Linux AMI
• Must be an HVM AMI
• Must be an EBS-backed AMI
• Must not have multiple EBS volumes
• Must be a 64-bit AMI
• Must not have users with the same name as applications (for example, hadoop, hdfs, yarn, or spark)
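Several of the requirements above can be checked from the image attributes that EC2 reports for an AMI. The sketch below is a hypothetical pre-flight check over a dict shaped like one entry of an EC2 DescribeImages response. It treats `x86_64` as the 64-bit architecture, which is an assumption, and it does not cover the Amazon Linux or user-name requirements, since those cannot be read reliably from these attributes.

```python
from typing import List


def check_custom_ami(image: dict) -> List[str]:
    """Return a list of requirement violations; an empty list means none found."""
    problems = []
    if image.get("VirtualizationType") != "hvm":
        problems.append("must be an HVM AMI")
    if image.get("RootDeviceType") != "ebs":
        problems.append("must be an EBS-backed AMI")
    if image.get("Architecture") != "x86_64":
        problems.append("must be a 64-bit AMI")
    # Count only mappings that describe EBS volumes (not instance store).
    ebs_volumes = [m for m in image.get("BlockDeviceMappings", []) if "Ebs" in m]
    if len(ebs_volumes) > 1:
        problems.append("must not have multiple EBS volumes")
    return problems


# Hypothetical image descriptions, for illustration only.
good_image = {
    "VirtualizationType": "hvm",
    "RootDeviceType": "ebs",
    "Architecture": "x86_64",
    "BlockDeviceMappings": [{"DeviceName": "/dev/xvda", "Ebs": {"VolumeSize": 10}}],
}
bad_image = {
    "VirtualizationType": "paravirtual",
    "RootDeviceType": "instance-store",
    "Architecture": "i386",
    "BlockDeviceMappings": [
        {"DeviceName": "/dev/sda1", "Ebs": {"VolumeSize": 8}},
        {"DeviceName": "/dev/sdb", "Ebs": {"VolumeSize": 100}},
    ],
}

print(check_custom_ami(good_image))
print(check_custom_ami(bad_image))
```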
