
Tuesday, April 17, 2018

Hadoop V2 - HttpFS setup

In this post I walk through setting up HttpFS in Hadoop.

In Hadoop, HttpFS:
1. Acts as a proxy server that handles REST requests.
2. Acts as a single point of contact for all clients.
   Clients do not need connectivity to the DataNodes, as they do with WebHDFS.
3. Can work on a multi-NameNode cluster, unlike WebHDFS.
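
To make point 2 concrete, here is a minimal sketch of how the REST endpoint differs between the two. The host name `httpfs-host` is a hypothetical placeholder, `nn` is the NameNode host used later in this post, and the ports are the Hadoop 2.x defaults (50070 for the NameNode HTTP address, 14000 for HttpFS); adjust them if your cluster overrides the defaults.

```shell
#!/bin/sh
# Sketch: build the REST URL for the same operation via WebHDFS and via HttpFS.
# Host names are placeholders; ports are the Hadoop 2.x defaults.

# webhdfs_url HOST PORT PATH OP USER -> prints the REST URL
webhdfs_url() {
    printf 'http://%s:%s/webhdfs/v1%s?op=%s&user.name=%s\n' "$1" "$2" "$3" "$4" "$5"
}

# Direct WebHDFS: the client talks to the NameNode and, for reads and writes,
# is redirected to a DataNode, so it needs network access to the DataNodes too.
webhdfs_url nn 50070 /tmp LISTSTATUS hdfs

# HttpFS: the client only ever talks to the proxy host.
webhdfs_url httpfs-host 14000 /tmp LISTSTATUS hdfs
```

The path and query string are identical in both cases; only the host and port change.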

(All steps are run as user root on the server that will act as the proxy node or edge node, except where mentioned.)

1. Create User httpfs
groupadd -g 1000 hadoop
useradd -u 1010  -g hadoop httpfs
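
A quick way to confirm the account ended up in the right group, as a minimal sketch. The helper name `user_in_group` is made up for this example; it relies only on the standard `id` command.

```shell
#!/bin/sh
# user_in_group USER GROUP -> exit status 0 if USER belongs to GROUP
user_in_group() {
    id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qx "$2"
}

# Example check for the account created above:
if user_in_group httpfs hadoop; then
    echo "httpfs is a member of hadoop"
else
    echo "httpfs is missing or not in hadoop"
fi
```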

2. Setup Java and Hadoop
rpm -Uvh /tmp/jdk-8u152-linux-x64.rpm
scp -r root@nn:/usr/local/hadoop-2.7.5 /usr/local/
rm -rf /usr/local/hadoop-2.7.5/etc/hadoop
mkdir -p /etc/hadoop
scp -r nn:/etc/hadoop/conf /etc/hadoop
chmod -R 755 /etc/hadoop/conf

Create Soft Links
ln -s /usr/local/hadoop-2.7.5 /usr/local/hadoop
ln -s /etc/hadoop/conf /usr/local/hadoop-2.7.5/etc/hadoop
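
One detail worth checking here: if `/usr/local/hadoop-2.7.5/etc/hadoop` still exists as a real directory when the second link is created, `ln -s` places the link inside that directory instead of replacing it. The sketch below shows one way to guard against that. `link_conf_dir` is a hypothetical helper, and moving the old directory aside (rather than deleting it) is my own choice, not part of the original steps.

```shell
#!/bin/sh
# link_conf_dir TARGET LINK
# Make LINK a symlink to TARGET. If LINK already exists as a real directory,
# move it aside first; otherwise `ln -s` would create the link *inside* it.
link_conf_dir() {
    target=$1
    link=$2
    if [ -d "$link" ] && [ ! -L "$link" ]; then
        mv "$link" "$link.orig" || return 1
    fi
    # -f replaces an existing link, -n avoids following it into the old target
    ln -sfn "$target" "$link"
}

# Intended use for this setup (run as root):
# link_conf_dir /usr/local/hadoop-2.7.5 /usr/local/hadoop
# link_conf_dir /etc/hadoop/conf /usr/local/hadoop-2.7.5/etc/hadoop
# readlink /usr/local/hadoop-2.7.5/etc/hadoop
```

After running it, `readlink` on the link path should show the configuration directory it points to.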

3. Setup Profile
scp root@nn:/tmp/ /etc/profile.d
source /etc/profile.d/

4. Setup Sudo
Add the line below to the sudoers configuration (edit it with visudo):
httpfs        ALL=(ALL)       NOPASSWD: ALL

5. Create Directories
mkdir -p /opt/HDPV2/logs /opt/HDPV2/pids /opt/HDPV2/1 /opt/HDPV2/2 /opt/HDPV2/tmp /opt/HDPV2/temp
chown -R httpfs:hadoop /opt/HDPV2/logs /opt/HDPV2/pids /opt/HDPV2/1 /opt/HDPV2/2 /opt/HDPV2/tmp /opt/HDPV2/temp
chmod -R 755 /opt/HDPV2

chmod 0755 /usr/local/hadoop/share/hadoop/httpfs/tomcat/conf/*
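
Before moving on, it can help to confirm that the httpfs user can actually write to these directories. This is a minimal sketch; `check_httpfs_dirs` is a hypothetical helper, and it should be run as the httpfs user for the writability test to be meaningful.

```shell
#!/bin/sh
# check_httpfs_dirs BASE
# Report every expected directory under BASE that is missing or not writable
# by the current user. Returns non-zero if any problem was found.
check_httpfs_dirs() {
    base=$1
    rc=0
    for d in logs pids 1 2 tmp temp; do
        if [ ! -d "$base/$d" ]; then
            echo "missing: $base/$d"
            rc=1
        elif [ ! -w "$base/$d" ]; then
            echo "not writable: $base/$d"
            rc=1
        fi
    done
    return $rc
}

# Usage (as user httpfs):
# check_httpfs_dirs /opt/HDPV2 && echo "all directories OK"
```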

6. Change - [as hdfs - on NN, SNN and httpfs server]

Stop and start the NameNode and SecondaryNameNode:
stop secondarynamenode
start secondarynamenode
stop namenode
start namenode

7. Edit [as httpfs on httpfs server]
cd $CONF
sudo vi
Add the lines below:

export HTTPFS_LOG=/opt/HDPV2/logs  #Custom
export HTTPFS_TEMP=/opt/HDPV2/temp #Custom
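
Once the file you edited has been sourced, a small check like the one below can confirm that both variables resolve to real directories. This is only a sketch: `check_env_dir` is a hypothetical helper, and it assumes the two exports above are visible in the current shell.

```shell
#!/bin/sh
# check_env_dir NAME
# Succeeds if the environment variable called NAME is set and points at an
# existing directory. NAME must be a plain variable name (it is passed to eval).
check_env_dir() {
    eval "val=\${$1:-}"
    if [ -z "$val" ]; then
        echo "$1 is not set"
        return 1
    fi
    if [ ! -d "$val" ]; then
        echo "$1=$val is not a directory"
        return 1
    fi
    echo "$1=$val OK"
}

# Usage (as user httpfs, after sourcing the edited file):
# check_env_dir HTTPFS_LOG
# check_env_dir HTTPFS_TEMP
```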

8. Start HttpFS [as httpfs on httpfs server]
start

Test it; your HttpFS should now be ready.
curl -sS ''

You can use the same REST API as with WebHDFS, except that requests now go to the proxy host.
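
As an illustration, here is a minimal sketch of a few WebHDFS operations sent through the proxy. The address `http://httpfs-host:14000` is a hypothetical placeholder (14000 is the HttpFS default port), `httpfs_call` is a made-up helper, and the `user.name` query parameter assumes the default pseudo authentication rather than Kerberos.

```shell
#!/bin/sh
# Placeholder proxy address and user; override via the environment.
HTTPFS_URL="${HTTPFS_URL:-http://httpfs-host:14000}"
HTTPFS_USER="${HTTPFS_USER:-httpfs}"

# httpfs_call METHOD PATH OP
# Runs the request with curl, or just prints the command when DRY_RUN=1.
httpfs_call() {
    url="${HTTPFS_URL}/webhdfs/v1$2?op=$3&user.name=${HTTPFS_USER}"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf 'curl -sS -X %s %s\n' "$1" "'$url'"
    else
        curl -sS -X "$1" "$url"
    fi
}

# Usage examples (uncomment once the proxy is reachable):
# httpfs_call GET /          GETHOMEDIRECTORY
# httpfs_call PUT /tmp/demo  MKDIRS
# httpfs_call GET /tmp       LISTSTATUS
```

Setting `DRY_RUN=1` prints the curl command instead of running it, which is handy for checking the URL before pointing it at a live cluster.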
