Security overview for big data applications
In this section, we shift our focus to the security features of AWS big data applications. More specifically, we discuss the features and options for securing EMR clusters and serverless applications.
Securing the EMR cluster
In this section, we provide a quick overview of EMR security, including encryption, authentication, and authorization.
Encryption
EMR supports end-to-end encryption across a variety of frameworks, and you can configure all of it in a couple of clicks. For data at rest, you can use S3 server-side or client-side encryption and encrypt all the local disks of your cluster, so that any executor spills or HDFS blocks written to the local disk filesystem are encrypted. For data in transit, you can encrypt the data blocks in flight for Spark, Tez, MapReduce, HBase, Hive, Presto, and Pig.
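The same options can also be set programmatically, because EMR accepts encryption settings as a JSON security configuration. The following is a minimal sketch, not a definitive setup: the helper names build_encryption_config and register_security_configuration, the KMS key ARN, and the S3 path to the certificate bundle are all hypothetical placeholders, and the key names should be checked against the AWS documentation before use. It assumes SSE-S3 for S3 data, an AWS KMS key for local disks, and PEM certificates for in-transit encryption.

```python
import json


def build_encryption_config(kms_key_arn, tls_cert_s3_uri):
    """Build an EMR security configuration with at-rest and in-transit encryption.

    Both arguments are placeholders supplied by the caller; nothing is validated here.
    """
    return {
        "EncryptionConfiguration": {
            "EnableAtRestEncryption": True,
            "EnableInTransitEncryption": True,
            "AtRestEncryptionConfiguration": {
                # Server-side encryption with S3-managed keys (assumed choice).
                "S3EncryptionConfiguration": {"EncryptionMode": "SSE-S3"},
                # Local disk encryption (executor spills, HDFS blocks) with a KMS key.
                "LocalDiskEncryptionConfiguration": {
                    "EncryptionKeyProviderType": "AwsKms",
                    "AwsKmsKey": kms_key_arn,
                },
            },
            "InTransitEncryptionConfiguration": {
                # Zipped PEM certificates stored in S3 (assumed choice).
                "TLSCertificateConfiguration": {
                    "CertificateProviderType": "PEM",
                    "S3Object": tls_cert_s3_uri,
                }
            },
        }
    }


def register_security_configuration(name, config, region="us-east-1"):
    """Register the configuration with EMR. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder works without boto3 installed

    emr = boto3.client("emr", region_name=region)
    return emr.create_security_configuration(
        Name=name, SecurityConfiguration=json.dumps(config)
    )


if __name__ == "__main__":
    # Hypothetical identifiers, for illustration only.
    config = build_encryption_config(
        kms_key_arn="arn:aws:kms:us-east-1:111122223333:key/example-key-id",
        tls_cert_s3_uri="s3://example-bucket/certs/my-certs.zip",
    )
    print(json.dumps(config, indent=2))
```

Running the script only prints the JSON document. Calling register_security_configuration would create the security configuration in your account, after which it can be referenced by name when launching a cluster.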
Note
For more details on encrypting data at rest and data in transit, refer to the AWS documentation: https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-data-encryption.html...