Integrating RADOS Gateway with Hadoop S3A plugin
For data analytics applications that require Hadoop Distributed File System (HDFS) access, the Ceph object gateway can be accessed using the Apache S3A connector for Hadoop. The S3A connector is an open source tool that presents S3-compatible object storage as an HDFS file system, giving applications HDFS read and write semantics while the data itself is stored in the Ceph object gateway.
Ceph object gateway Jewel version 10.2.9 is fully compatible with the S3A connector that ships with Hadoop 2.7.3.
How to do it...
You can use client-node1 to configure the Hadoop S3A client.
- Install the Java packages on client-node1:
# yum install java* -y

- Download the Hadoop .tar file from https://archive.apache.org/dist/hadoop/core/hadoop-2.7.3/hadoop-2.7.3.tar.gz:

Note
We have also uploaded the hadoop-2.7.3.tar.gz file to GitHub at https://github.com/PacktPublishing/Ceph-Cookbook-Second-Edition/raw/master/hadoop-2.7.3.tar.gz in case it is removed from http...
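The first two steps can be sketched as a small shell helper, run as root on client-node1. This is only an illustration, not the recipe's own commands: the function name fetch_hadoop, the use of curl as the download tool, and the INSTALL_DIR location (/usr/local by default) are assumptions. The download URL is the one given in the step above.

```shell
#!/bin/sh
# Sketch only: check for Java, then download and unpack Hadoop 2.7.3.

HADOOP_VERSION="2.7.3"
HADOOP_TARBALL="hadoop-${HADOOP_VERSION}.tar.gz"
HADOOP_URL="https://archive.apache.org/dist/hadoop/core/hadoop-${HADOOP_VERSION}/${HADOOP_TARBALL}"

# Assumption: unpack under /usr/local unless the caller sets INSTALL_DIR.
INSTALL_DIR="${INSTALL_DIR:-/usr/local}"

# fetch_hadoop is a hypothetical helper name used for this illustration.
fetch_hadoop() {
    # Hadoop needs a Java runtime, so stop early if none is on the PATH.
    if ! command -v java >/dev/null 2>&1; then
        echo "java not found; install the Java packages first" >&2
        return 1
    fi

    # -f: fail on HTTP errors, -L: follow redirects, -O: keep the remote file name.
    curl -fLO "$HADOOP_URL" || return 1

    # Unpack the archive into the chosen install directory.
    tar -xzf "$HADOOP_TARBALL" -C "$INSTALL_DIR"
}
```

After sourcing or pasting the snippet, calling fetch_hadoop should leave a hadoop-2.7.3 directory under the install location. To use the GitHub copy mentioned in the note instead, set HADOOP_URL to that address before calling the function.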