
Wednesday, April 25, 2018

HDPV2 - Services Review

A review of all Services Running

This blog discusses all the services / Java processes running and configured so far as part of the Hadoop V2 configuration.

First, let me list all the processes, and then review what runs on each node.

[root@nn ~]# for i in $(cat /tmp/all_hosts) ; do ssh ${i} 'hostname; jps | grep -vi jps; echo' ;  done;    

nn.novalocal
4230 NameNode
4471 DFSZKFailoverController


NameNode - runs 2 services
1. NameNode itself
2. ZKFC (Failover Controller)


snn.novalocal
3192 NameNode
3514 DFSZKFailoverController


The Standby NameNode runs the same services as the NameNode.

rm.novalocal
3297 Master
2870 ResourceManager
3211 JobHistoryServer


The ResourceManager node runs 3 services
1. Master - Spark's Master (running as spark)
2. ResourceManager - YARN's ResourceManager (running as yarn)
3. JobHistoryServer - MapReduce Job History Server (running as mapred)


d1n.novalocal
2529 QuorumPeerMain
2819 NodeManager
2598 DataNode
2986 Worker
2700 JournalNode

All 3 DataNodes run the following 5 services (a quick check of the owning OS users is sketched after the d3n listing below)
1. QuorumPeerMain - ZooKeeper's quorum service (running as zkfc)
2. NodeManager - YARN's NodeManager (running as yarn)
3. DataNode - HDFS DataNode (running as hdfs)
4. Worker - Spark worker process (running as spark)
5. JournalNode - JournalNode for the HA configuration (running as hdfs)

d2n.novalocal
2465 QuorumPeerMain
2738 NodeManager
2517 DataNode
2619 JournalNode
2907 Worker

d3n.novalocal
2513 DataNode
2901 Worker
2615 JournalNode
2730 NodeManager
2461 QuorumPeerMain
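
To confirm the OS user each daemon runs as (the users noted in parentheses above), a quick check like the one below can help. This is a minimal sketch; run it on any node, and it maps each jps PID to its owner:

jps | grep -vi jps | while read pid name; do
  echo "${name} runs as $(ps -o user= -p ${pid})"
done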

Hadoop V2 - Fair Scheduler Configuration

In this blog I discuss how to configure the Fair Scheduler.

The Fair Scheduler is another scheduler commonly used in production environments.
In my words, it shares cluster resources among queues more fairly than the Capacity Scheduler covered in my last post.

Step 1.  Enable Fair Scheduler

Configure (Append/modify)  yarn-site.xml as in Appendix


Step 2. Configure fair-scheduler.xml file as in Appendix

YARN automatically reloads the fair-scheduler configuration file roughly every 10 seconds.
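
The allocation file is picked up automatically, but queues can also be refreshed on demand with the rmadmin command (a small sketch; run as the yarn user on the rm node, and optional here since the reload is automatic):

yarn rmadmin -refreshQueues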


Step 3. Restart the ResourceManager daemon
yarn-daemon.sh stop resourcemanager
yarn-daemon.sh start resourcemanager



Step 4. Verify the configured queues on the RM Web UI.

rm:8088



Or use the command line to verify:

[yarn@rm ~]$ hadoop queue -list
DEPRECATED: Use of this script to execute mapred command is deprecated.
Instead use the mapred command for it.

18/04/25 02:35:15 INFO client.RMProxy: Connecting to ResourceManager at rm/192.168.2.102:8032
======================
Queue Name : root.data_science
Queue State : running
Scheduling Info : Capacity: 0.0, MaximumCapacity: UNDEFINED, CurrentCapacity: 0.0
    ======================
    Queue Name : root.data_science.best_effort
    Queue State : running
    Scheduling Info : Capacity: 0.0, MaximumCapacity: UNDEFINED, CurrentCapacity: 0.0
    ======================
    Queue Name : root.data_science.priority
    Queue State : running
    Scheduling Info : Capacity: 0.0, MaximumCapacity: UNDEFINED, CurrentCapacity: 0.0
======================
Queue Name : root.default
Queue State : running
Scheduling Info : Capacity: 0.0, MaximumCapacity: UNDEFINED, CurrentCapacity: 0.0
======================
Queue Name : root.marketing
Queue State : running
Scheduling Info : Capacity: 0.0, MaximumCapacity: UNDEFINED, CurrentCapacity: 0.0
    ======================
    Queue Name : root.marketing.reports
    Queue State : running
    Scheduling Info : Capacity: 0.0, MaximumCapacity: UNDEFINED, CurrentCapacity: 0.0
    ======================
    Queue Name : root.marketing.website
    Queue State : running
    Scheduling Info : Capacity: 0.0, MaximumCapacity: UNDEFINED, CurrentCapacity: 0.0
======================
Queue Name : root.sales
Queue State : running
Scheduling Info : Capacity: 0.0, MaximumCapacity: UNDEFINED, CurrentCapacity: 0.0
    ======================
    Queue Name : root.sales.asia
    Queue State : running
    Scheduling Info : Capacity: 0.0, MaximumCapacity: UNDEFINED, CurrentCapacity: 0.0
    ======================
    Queue Name : root.sales.europe
    Queue State : running
    Scheduling Info : Capacity: 0.0, MaximumCapacity: UNDEFINED, CurrentCapacity: 0.0
    ======================
    Queue Name : root.sales.northamerica
    Queue State : running
    Scheduling Info : Capacity: 0.0, MaximumCapacity: UNDEFINED, CurrentCapacity: 0.0
======================
Queue Name : root.sqoop
Queue State : running
Scheduling Info : Capacity: 0.0, MaximumCapacity: UNDEFINED, CurrentCapacity: 0.0
    ======================
    Queue Name : root.sqoop.sql
    Queue State : running
    Scheduling Info : Capacity: 0.0, MaximumCapacity: UNDEFINED, CurrentCapacity: 0.0
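
As the deprecation warning above suggests, the same listing can also be obtained directly with the mapred command (equivalent output assumed):

mapred queue -list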
 

Appendix
yarn-site.xml


<property>
  <name>yarn.resourcemanager.scheduler.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>

<property>
    <name>yarn.scheduler.fair.allocation.file</name>
    <value>/etc/hadoop/conf/fair-scheduler.xml</value>
</property>

<property>
    <name>yarn.scheduler.fair.preemption</name>
    <value>true</value>
</property>



fair-scheduler.xml
<?xml version="1.0"?>
<allocations>

<queueMaxAMShareDefault>0.5</queueMaxAMShareDefault>
<queue name="root">
 <queue name="sales">
   <queue name="northamerica" />
   <queue name="europe" />
   <queue name="asia" />
 </queue>
 <queue name="marketing">
   <queue name="reports" />
   <queue name="website" />
 </queue>
 <queue name="data_science">
   <queue name="priority">
     <weight>100.0</weight>
   </queue>
   <queue name="best_effort">
     <weight>0.0</weight>
   </queue>
 </queue>
<queue name="sqoop">
    <minResources>10000 mb,0vcores</minResources>
    <maxResources>90000 mb,0vcores</maxResources>
    <maxRunningApps>50</maxRunningApps>
    <maxAMShare>0.1</maxAMShare>
    <weight>2.0</weight>
    <schedulingPolicy>fair</schedulingPolicy>
    <queue name="sql">
      <aclSubmitApps>charlie</aclSubmitApps>
      <minResources>5000 mb,0vcores</minResources>
    </queue>
  </queue>
</queue>
<queuePlacementPolicy>
    <rule name="specified" />
    <rule name="primaryGroup" create="false" />
    <rule name="nestedUserQueue">
        <rule name="secondaryGroupExistingQueue" create="false" />
    </rule>
    <rule name="default" queue="default_queue"/>
  </queuePlacementPolicy>
</allocations>
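
To check that a job actually lands in one of these queues, it can be submitted with an explicit queue name. A sketch using the stock MapReduce examples jar (the jar path matches the install layout used in this series and is an assumption for your environment):

yarn jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.5.jar pi \
-Dmapreduce.job.queuename=root.sales.asia 5 10

The queue column on rm:8088 should then show root.sales.asia for this application.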

Monday, April 23, 2018

Hadoop V2 - Capacity Scheduler Configuration


In this blog I discuss how to configure the Capacity Scheduler for Hadoop 2.

I will design the queues and capacities as shown below: root is split into research (with analytics and data), support (with training and services), and production.

(All the detailed configuration is present in the end of the blog)



Steps [on rm node]
1. Make Backup of Capacity Scheduler File
sudo cp capacity-scheduler.xml capacity-scheduler.xml.bkp

2. Configure /etc/hadoop/conf/capacity-scheduler.xml as in the Appendix

3. Configure properties in yarn-site.xml
yarn.resourcemanager.scheduler.class=org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
(enables the Capacity Scheduler)
yarn.resourcemanager.scheduler.monitor.enable=true
(enables preemption)
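
In yarn-site.xml property form (mirroring the snippets used elsewhere in this series), these two settings look like this:

<property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
</property>

<property>
    <name>yarn.resourcemanager.scheduler.monitor.enable</name>
    <value>true</value>
</property>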

4. Stop and Start yarn-services

stop-yarn.sh
start-yarn.sh

5. Run any application
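
For example, the bundled pi job will do (the jar path is an assumption about your install). Per the queue mappings in the Appendix, a job submitted as the hdfs user should land in the production queue:

yarn jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.5.jar pi 5 10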


6. Verify from rm cluster UI applications, queues





7. Check Queues from RM using CMD


[yarn@rm ]$ hadoop queue -list

18/04/23 03:06:36 INFO client.RMProxy: Connecting to ResourceManager at rm/192.168.2.102:8032
======================
Queue Name : research
Queue State : running
Scheduling Info : Capacity: 30.000002, MaximumCapacity: 30.000002, CurrentCapacity: 0.0
    ======================
    Queue Name : analytics
    Queue State : running
    Scheduling Info : Capacity: 40.0, MaximumCapacity: 60.000004, CurrentCapacity: 0.0
    ======================
    Queue Name : data
    Queue State : running
    Scheduling Info : Capacity: 60.000004, MaximumCapacity: 60.000004, CurrentCapacity: 0.0
======================
Queue Name : support
Queue State : running
Scheduling Info : Capacity: 40.0, MaximumCapacity: 50.0, CurrentCapacity: 0.0
    ======================
    Queue Name : services
    Queue State : running
    Scheduling Info : Capacity: 40.0, MaximumCapacity: 40.0, CurrentCapacity: 0.0
    ======================
    Queue Name : training
    Queue State : running
    Scheduling Info : Capacity: 60.000004, MaximumCapacity: 70.0, CurrentCapacity: 0.0
======================
Queue Name : production
Queue State : running
Scheduling Info : Capacity: 30.000002, MaximumCapacity: 100.0, CurrentCapacity: 22.222223



Appendix
capacity-scheduler.xml - this file is created on the rm (ResourceManager) node in /etc/hadoop/conf, with permissions 0755 and owner root.

This file governs the configuration of queues on the RM.

<configuration>
<property>
        <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
        <value>1.0</value>
</property>

<property>
        <name>yarn.scheduler.capacity.maximum-applications</name>
        <value>2000</value>
</property>

<property>
        <name>yarn.scheduler.capacity.root.acl_administer_queue</name>
        <value>*</value>
</property>

<property>
        <name>yarn.scheduler.capacity.resource-calculator</name>
        <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
</property>

<property>
        <name>yarn.scheduler.capacity.root.queues</name>
        <value>research,support,production</value>
</property>


<property>
        <name>yarn.scheduler.capacity.root.research.capacity</name>
        <value>30</value>
</property>

<property>
        <name>yarn.scheduler.capacity.root.research.maximum-capacity</name>
        <value>30</value>
</property>

<property>
        <name>yarn.scheduler.capacity.root.research.state</name>
        <value>RUNNING</value>
</property>

<property>
        <name>yarn.scheduler.capacity.root.research.user-limit-factor</name>
        <value>1</value>
</property>

<property>
        <name>yarn.scheduler.capacity.root.research.minimum-user-limit-percent</name>
        <value>80</value>
</property>



<property>
        <name>yarn.scheduler.capacity.root.research.analytics.capacity</name>
        <value>40</value>
</property>

<property>
        <name>yarn.scheduler.capacity.root.research.analytics.maximum-capacity</name>
        <value>60</value>
</property>

<property>
        <name>yarn.scheduler.capacity.root.research.analytics.state</name>
        <value>RUNNING</value>
</property>

<property>
        <name>yarn.scheduler.capacity.root.research.analytics.user-limit-factor</name>
        <value>1</value>
</property>

<property>
        <name>yarn.scheduler.capacity.root.research.analytics.minimum-user-limit-percent</name>
        <value>20</value>
</property>

<property>
        <name>yarn.scheduler.capacity.root.research.queues</name>
        <value>analytics,data</value>
</property>



<property>
        <name>yarn.scheduler.capacity.root.research.data.capacity</name>
        <value>60</value>
</property>

<property>
        <name>yarn.scheduler.capacity.root.research.data.maximum-capacity</name>
        <value>60</value>
</property>

<property>
        <name>yarn.scheduler.capacity.root.research.data.state</name>
        <value>RUNNING</value>
</property>

<property>
        <name>yarn.scheduler.capacity.root.research.data.user-limit-factor</name>
        <value>1</value>
</property>

<property>
        <name>yarn.scheduler.capacity.root.research.data.minimum-user-limit-percent</name>
        <value>20</value>
</property>




<property>
        <name>yarn.scheduler.capacity.root.production.capacity</name>
        <value>30</value>
</property>

<property>
        <name>yarn.scheduler.capacity.root.production.maximum-capacity</name>
        <value>100</value>
</property>

<property>
        <name>yarn.scheduler.capacity.root.production.state</name>
        <value>RUNNING</value>
</property>

<property>
        <name>yarn.scheduler.capacity.root.production.user-limit-factor</name>
        <value>1</value>
</property>

<property>
        <name>yarn.scheduler.capacity.root.production.minimum-user-limit-percent</name>
        <value>20</value>
</property>



<property>
        <name>yarn.scheduler.capacity.root.support.capacity</name>
        <value>40</value>
</property>

<property>
        <name>yarn.scheduler.capacity.root.support.maximum-capacity</name>
        <value>50</value>
</property>

<property>
        <name>yarn.scheduler.capacity.root.support.state</name>
        <value>RUNNING</value>
</property>

<property>
        <name>yarn.scheduler.capacity.root.support.user-limit-factor</name>
        <value>1</value>
</property>

<property>
        <name>yarn.scheduler.capacity.root.support.minimum-user-limit-percent</name>
        <value>20</value>
</property>

<property>
        <name>yarn.scheduler.capacity.root.support.queues</name>
        <value>training,services</value>
</property>



<property>
        <name>yarn.scheduler.capacity.root.support.training.capacity</name>
        <value>60</value>
</property>

<property>
        <name>yarn.scheduler.capacity.root.support.training.maximum-capacity</name>
        <value>70</value>
</property>

<property>
        <name>yarn.scheduler.capacity.root.support.training.state</name>
        <value>RUNNING</value>
</property>

<property>
        <name>yarn.scheduler.capacity.root.support.training.user-limit-factor</name>
        <value>1</value>
</property>

<property>
        <name>yarn.scheduler.capacity.root.support.training.minimum-user-limit-percent</name>
        <value>20</value>
</property>




<property>
        <name>yarn.scheduler.capacity.root.support.services.capacity</name>
        <value>40</value>
</property>

<property>
        <name>yarn.scheduler.capacity.root.support.services.maximum-capacity</name>
        <value>40</value>
</property>

<property>
        <name>yarn.scheduler.capacity.root.support.services.state</name>
        <value>RUNNING</value>
</property>

<property>
        <name>yarn.scheduler.capacity.root.support.services.user-limit-factor</name>
        <value>1</value>
</property>

<property>
        <name>yarn.scheduler.capacity.root.support.services.minimum-user-limit-percent</name>
        <value>20</value>
</property>
<property>
        <name>yarn.scheduler.capacity.queue-mappings</name>
        <value>u:sqoop:production,u:hdfs:production,g:hadoop:services,u:%user:%user,g:analytics:analytics,g:data:data,g:training:training,g:services:services</value>
</property>
</configuration>

Hadoop V2 - Sqoop Import from Oracle Database 12c

In this blog I will demonstrate how to import data using sqoop from Oracle to HDFS

If you have followed my last blog, you have your sqoop installation ready.


Step 1 - Download Jar
Download ojdbc8.jar from the link below (this is for release 12c, so use the driver jar for your release)

http://www.oracle.com/technetwork/database/features/jdbc/jdbc-ucp-122-3110062.html

Step 2 - Copy Jar to lib
[root@oem13cr2 conf]# cd /usr/local/sqoop/lib
[root@oem13cr2 lib]# cp /tmp/ojdbc8.jar .

Provide read permissions to others if required

Step 3 - Create directory for import [As hdfs on nn]
hdfs dfs -mkdir /sqoop
hdfs dfs -chown sqoop:admingroup /sqoop


Step 4 - Import

sqoop-import --connect jdbc:oracle:thin:@192.168.1.71:6633:EMPRD \
--table SCOTT.EMP \
--fields-terminated-by '\t' --lines-terminated-by '\n' \
--username SCOTT --password TIGER \
--target-dir /sqoop/scott       


18/04/20 05:37:51 INFO mapreduce.Job:  map 75% reduce 0%
18/04/20 05:37:56 INFO mapreduce.Job:  map 100% reduce 0%
18/04/20 05:37:57 INFO mapreduce.Job: Job job_1524137551614_0005 completed successfully
18/04/20 05:37:57 INFO mapreduce.Job: Counters: 31

        File Output Format Counters
                Bytes Written=817
18/04/20 05:37:57 INFO mapreduce.ImportJobBase: Transferred 817 bytes in 50.9334 seconds (16.0406 bytes/sec)
18/04/20 05:37:57 INFO mapreduce.ImportJobBase: Retrieved 14 records.

 hdfs dfs -ls /sqoop/scott/
Found 5 items
-rw-r--r--   3 sqoop admingroup          0 2018-04-20 05:37 /sqoop/scott/_SUCCESS
-rw-r--r--   3 sqoop admingroup        115 2018-04-20 05:37 /sqoop/scott/part-m-00000
-rw-r--r--   3 sqoop admingroup        117 2018-04-20 05:37 /sqoop/scott/part-m-00001
-rw-r--r--   3 sqoop admingroup        238 2018-04-20 05:37 /sqoop/scott/part-m-00002
-rw-r--r--   3 sqoop admingroup        347 2018-04-20 05:37 /sqoop/scott/part-m-00003


Import All Tables Sqoop

sqoop import-all-tables --connect jdbc:oracle:thin:@192.168.1.71:6633:EMPRD \
--username SCOTT --password TIGER -m 1 -as-avrodatafile



hdfs dfs -ls /user/sqoop
Found 6 items
drwx------   - sqoop admingroup          0 2018-04-20 05:35 /user/sqoop/.Trash
drwx------   - sqoop admingroup          0 2018-04-20 06:01 /user/sqoop/.staging
drwxr-xr-x   - sqoop admingroup          0 2018-04-20 06:00 /user/sqoop/BONUS
drwxr-xr-x   - sqoop admingroup          0 2018-04-20 06:00 /user/sqoop/DEPT
drwxr-xr-x   - sqoop admingroup          0 2018-04-20 06:00 /user/sqoop/EMP
drwxr-xr-x   - sqoop admingroup          0 2018-04-20 06:01 /user/sqoop/SALGRADE



How does it all happen?
1. Sqoop connects to the RDBMS and fetches the table's metadata.
2. It generates a Java class, compiles it, connects to the Hadoop cluster (ResourceManager) and then submits an import job.
3. The MR job does the actual work and the data is imported into HDFS directories.

Sqoop itself only monitors the job and oversees its completion.
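
To see the generated class for yourself, Sqoop's codegen tool can be run standalone. This is a sketch reusing the connection details from above; it prints where the generated .java and .jar files were written:

sqoop codegen --connect jdbc:oracle:thin:@192.168.1.71:6633:EMPRD \
--username SCOTT --password TIGER --table SCOTT.EMP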

Import into specific datafile type
Avro : -as-avrodatafile
sqoop  import-all-tables -Dmapreduce.job.user.classpath.first=true --connect jdbc:oracle:thin:@192.168.1.71:6633:EMPRD \
--username SCOTT --password TIGER -m 1 -as-avrodatafile


The data is now imported in Avro format.

[hdfs@nn ~]$ hdfs dfs -ls /user/sqoop/EMP
Found 2 items
-rw-r--r--   3 sqoop admingroup          0 2018-04-20 09:29 /user/sqoop/EMP/_SUCCESS
-rw-r--r--   3 sqoop admingroup       1525 2018-04-20 09:29 /user/sqoop/EMP/part-m-00000.avro

Sequence: -as-sequencefile
Text: -as-textfile


Import Using Options file
All the options can be placed in an options file and re-used for easier management.

Create Options file with content as below
--connect
jdbc:oracle:thin:@192.168.1.71:6633:EMPRD
--username
SCOTT
--password
TIGER
-m
1
-as-avrodatafile


Import using Options file

sqoop  import-all-tables -Dmapreduce.job.user.classpath.first=true \
--options-file /tmp/sqoop_import.txt

Friday, April 20, 2018

Hadoop V2 - Sqoop - NoSuchMethodError


Symptoms - 
The Sqoop job fails when importing in Avro format.

Container logs show 

org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.NoSuchMethodError: org.apache.avro.reflect.ReflectData.addLogicalTypeConversion(Lorg/apache/avro/Conversion;)V
 

Solution - Run the job by adding the following argument to the command line:
-Dmapreduce.job.user.classpath.first=true

sqoop  import-all-tables -Dmapreduce.job.user.classpath.first=true --connect jdbc:oracle:thin:@192.168.1.71:6633:EMPRD \
--username SCOTT --password TIGER -m 1 -as-avrodatafile



This makes sure the right classes are accessible to the Sqoop import and the MR job.

Thursday, April 19, 2018

Hadoop V2 - Exception from container-launch

In this blog I will discuss how to find out issues with container launch when you are running your jobs / applications.

Look at the error trace below, from an MR job on the command line.

18/04/19 07:48:51 INFO mapreduce.Job: Task Id : attempt_1524137551614_0001_m_000002_2, Status : FAILED
Exception from container-launch.
Container id: container_1524137551614_0001_01_000015
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:585)
        at org.apache.hadoop.util.Shell.run(Shell.java:482)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:776)
        at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)


Container exited with a non-zero exit code 1




Assuming you have turned on YARN log aggregation
(yarn.log-aggregation-enable set to true in yarn-site.xml),

go to the directory specified in yarn.nodemanager.remote-app-log-dir and look for the application's run and the node name.

Open the job log file using

hdfs dfs  -cat <file> | less
and look for the error
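
Alternatively, with log aggregation enabled, the yarn logs command can pull the same aggregated logs directly; the application id can be read off the container id shown above (run as the user who submitted the job):

yarn logs -applicationId application_1524137551614_0001 | less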

In my case the error was quite trivial:

Error occurred during initialization of VM
Too small initial heap


This was fixed and the job ran fine.


The properties which were tuned were (see the sketch after this list):


  1.         mapreduce.map.memory.mb
  2.         mapreduce.reduce.memory.mb
  3.         mapreduce.map.java.opts
  4.         mapreduce.reduce.java.opts
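
As an illustration, the heap requested via java.opts must fit inside the container size set by memory.mb (roughly 80% is a common rule of thumb). The values below are assumptions to show the shape of the fix; tune them for your cluster (mapred-site.xml):

<property>
    <name>mapreduce.map.memory.mb</name>
    <value>2048</value>
</property>
<property>
    <name>mapreduce.map.java.opts</name>
    <value>-Xmx1638m</value>
</property>
<property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>4096</value>
</property>
<property>
    <name>mapreduce.reduce.java.opts</name>
    <value>-Xmx3276m</value>
</property>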


 

Hadoop V2 - Sqoop Install

In this blog I discuss Sqoop deployment. Sqoop stands for SQL-to-Hadoop.

Sqoop is a tool which can import / export data from / to an RDBMS.

Sqoop
- comes bundled with special connectors for many RDBMSs
- is not a cluster; it can be installed on a single node
- should be installed on an edge node and not on any of the cluster nodes


On Edge Node
1. Download Sqoop [As root]
 curl http://www-eu.apache.org/dist/sqoop/1.4.7/sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz -o /tmp/sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz

2. Add user [As root]

# groupadd -g 1000 hadoop
# useradd -u 1012  -g  hadoop sqoop
# passwd sqoop


3. Untar sqoop [As root]
cd /usr/local
tar -xzf /tmp/sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz


4. Create Soft Link

# ln -s /usr/local/sqoop-1.4.7.bin__hadoop-2.6.0 /usr/local/sqoop

5. Set environment variables
vi /etc/profile.d/profile.sh (append)

export SQOOP_HOME=/usr/local/sqoop ;
export PATH=$PATH:$SQOOP_HOME/bin


source /etc/profile.d/profile.sh

6. Set environment for sqoop [As root]


cd $SQOOP_HOME/conf

mv sqoop-env-template.sh sqoop-env.sh

Append the following to sqoop-env.sh:
export HADOOP_COMMON_HOME=/usr/local/hadoop
export HADOOP_MAPRED_HOME=/usr/local/hadoop


7. Verify Install and find out sqoop version. [As sqoop]


sqoop-version
18/04/19 05:47:24 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
Sqoop 1.4.7
git commit id 2328971411f57f0cb683dfb79d19d4d19d185dd8
Compiled by maugli on Thu Dec 21 15:59:58 STD 2017
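
With the Oracle JDBC driver in place under $SQOOP_HOME/lib (see the import post above), connectivity can be smoke-tested with list-tables. The connection details below are taken from that post and are assumptions for your environment:

sqoop list-tables --connect jdbc:oracle:thin:@192.168.1.71:6633:EMPRD \
--username SCOTT --password TIGER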

Wednesday, April 18, 2018

Hadoop V2 - QJM (Automatic)

In this blog I discuss my configuration of Automatic Failover using QJM.

This is a continuation of my previous QJM blog on manual configuration.

Automatic failover is configured using ZKFC (ZooKeeper Failover Controller) on the NameNodes
and ZooKeeper processes on the quorum nodes.

1. Set Automatic failover in hdfs-site.xml

<property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
</property>

2. Configure the quorum services parameter (core-site.xml)
<property>
    <name>ha.zookeeper.quorum</name>
    <value>d1.novalocal.com:2181,d2.novalocal.com:2181,d3.novalocal.com:2181</value>
</property>


Scp files to snn
scp core-site.xml hdfs-site.xml snn:/etc/hadoop/conf

3. Create user zkfc [d1n,d2n,d3n] and set its password to hadoop
    useradd -u 1011 -g hadoop zkfc
    passwd zkfc

4. Download Zookeeper [As root - d1n]
[root@d1n tmp]# curl http://www-eu.apache.org/dist/zookeeper/zookeeper-3.4.10/zookeeper-3.4.10.tar.gz -o zookeeper-3.4.10.tar.gz

Untar Zookeeper
cd /usr/local
tar -xzf /tmp/zookeeper-3.4.10.tar.gz



5. Do the configuration [on d1n]
cd /usr/local/zookeeper-3.4.10/conf
cp zoo_sample.cfg zoo.cfg

 
Edit zoo.cfg and make below change
dataDir=/opt/HDPV2/zookeeper
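
Note - for a true three-node ensemble (rather than three standalone ZooKeepers), zoo.cfg would normally also list the quorum members and each node needs a myid file under dataDir. This is standard ZooKeeper configuration, shown here as a sketch with ZooKeeper's default ports:

server.1=d1n:2888:3888
server.2=d2n:2888:3888
server.3=d3n:2888:3888

and on each node write its id into dataDir, e.g. on d1n:
echo 1 > /opt/HDPV2/zookeeper/myid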

6. Create a hosts file
Create all_hosts in the /tmp directory:
cat /tmp/all_hosts
d1n
d2n
d3n


7. Set up Passwordless connection
[As root and zkfc on d1n]
ssh-keygen
ssh-copy-id root@d1n
ssh-copy-id root@d2n
ssh-copy-id root@d3n


8. Create directories
[As root on d1n]
#for i in $(cat /tmp/all_hosts) ;do ssh ${i} mkdir -p /opt/HDPV2/zookeeper;  done
#for i in $(cat /tmp/all_hosts) ;do ssh ${i} chmod 775  /opt/HDPV2/zookeeper ;  done
#for i in $(cat /tmp/all_hosts) ;do ssh ${i} chown zkfc:hadoop /opt/HDPV2/zookeeper; done

# scp -r /usr/local/zookeeper-3.4.10 d2n:/usr/local
# scp -r /usr/local/zookeeper-3.4.10 d3n:/usr/local


# for i in $(cat /tmp/all_hosts) ;do ssh ${i} ln -s /usr/local/zookeeper-3.4.10/ /usr/local/zookeeper ; done


9. Start the ZooKeeper quorum processes
[As zkfc on d1n]
for i in $(cat /tmp/all_hosts) ;do ssh ${i} /usr/local/zookeeper/bin/zkServer.sh start /usr/local/zookeeper/conf/zoo.cfg ; done
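
To confirm each quorum peer came up, the same loop can be run with the status sub-command (a quick sketch):

for i in $(cat /tmp/all_hosts) ;do ssh ${i} /usr/local/zookeeper/bin/zkServer.sh status /usr/local/zookeeper/conf/zoo.cfg ; done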


10. Stop the NameNode on nn and snn
hadoop-daemon.sh stop namenode


11. Start Namenodes
hadoop-daemon.sh start namenode

12. Format the ZKFC state in ZooKeeper (once, from one NameNode), then start zkfc on both NameNodes
hdfs zkfc -formatZK
hadoop-daemon.sh start zkfc

Whichever node's ZKFC is started first will become the active node.

You can now kill the process of the active NameNode and watch the other one become active, verifying the ZKFC auto-failover configuration (see the sketch below).
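
A minimal sketch of that test, assuming the jps output shown in the services review post (run the kill on the currently active NameNode host):

kill -9 $(jps | grep -w NameNode | awk '{print $1}')
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2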


Hadoop V2 - HAAdmin


In this blog I discuss usage of the haadmin command.
I have already set up manual HA configuration using QJM in the last blog.

The haadmin command supports failing over and switching over in a manually configured HA setup.

Below is the help output of haadmin:


hdfs haadmin
Usage: haadmin
    [-transitionToActive [--forceactive] <serviceId>]
    [-transitionToStandby <serviceId>]
    [-failover [--forcefence] [--forceactive] <serviceId> <serviceId>]
    [-getServiceState <serviceId>]
    [-checkHealth <serviceId>]
    [-help <command>]


Failover Example
hdfs haadmin -failover -forcefence -forceactive nn2 nn1
Failover from nn2 to nn1 successful   

Whenever a failover is forced, the other node is force-killed (fenced) by the node which is going to become active.
   
   
Graceful Switchover Example

[hdfs@nn ~]$ hdfs haadmin -getServiceState nn1
standby
[hdfs@nn ~]$ hdfs haadmin -getServiceState nn2
standby
[hdfs@nn ~]$ hdfs haadmin -transitionToActive nn2
[hdfs@nn ~]$ hdfs haadmin -getServiceState nn2
active

Hadoop V2 - QJM (Manual Configuration)

In this blog I discuss setting up Namenode High availability using QJM (Quorum Journal Manager)

Functioning of QJM
1. 3 (or 5, or any odd number of) nodes run the JournalNode service
2. The NN writes edit logs to these nodes
3. The SNN (standby) reads and applies changes from these edit logs
4. ZKFC (ZooKeeper Failover Controller) runs on the NN and the SNN
5. Whenever a failure is detected, a new active NN is elected

Failover can be manual or automatic
In this blog we only do manual failover; in the following blogs I will add the configuration for automatic failover.

Stop cluster and spark
[nn-hdfs] stop-dfs.sh
[rm-yarn] - stop-yarn.sh
[rm-spark] - stop-all.sh
[rm-mapred] - mr-jobhistory-daemon.sh stop historyserver



Setting Up Instructions
hdfs-site.xml
1. Create logical nameservice
<property>
    <name>dfs.nameservices</name>
    <value>devcluster</value>
</property>


2. Setup path prefix (core-site.xml)
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://devcluster</value>
</property>


3. Create NameNode identifiers
<property>
    <name>dfs.ha.namenodes.devcluster</name>
    <value>nn1,nn2</value>
</property>
   
4. Set Journal Nodes
<property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://d1.novalocal:8485;d2.novalocal:8485;d3.novalocal:8485/devcluster</value>
</property>


Here make sure that hosts are fully qualified and resolvable.
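
A quick way to check that from the NameNode (a small sketch):

for h in d1.novalocal d2.novalocal d3.novalocal; do getent hosts $h; done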


5. Provide the shared edits directory setting for the JournalNodes
<property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/opt/HDPV2/journal/node/local/data</value>
</property>


This is the folder on the nodes which will act as JournalNodes.

6. Configure RPC Address


<property>
    <name>dfs.namenode.rpc-address.devcluster.nn1</name>
    <value>nn:8020</value>
</property>
<property>
    <name>dfs.namenode.rpc-address.devcluster.nn2</name>
    <value>snn:8020</value>
</property>


7. Configure HTTP Listen addresses

<property>
    <name>dfs.namenode.http-address.devcluster.nn1</name>
    <value>nn:50070</value>
</property>
<property>
    <name>dfs.namenode.http-address.devcluster.nn2</name>
    <value>snn:50070</value>
</property>


8. Configure the Java class for failover
This class helps clients determine which NameNode is active when they contact the NameNode service.
<property>
    <name>dfs.client.failover.proxy.provider.devcluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>


9. Configure Fencing
<property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
</property>

Note - you must install fuser (package psmisc)
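
On a yum-based distro (as used elsewhere in this series; an assumption for your environment), that is:

yum install -y psmisc        # on nn and snn; provides fuser, used by sshfence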

9.1 Specify Fencing keys
<property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/home/hdfs/.ssh/id_rsa</value>
</property>



10. Finally, remove the properties below from your hdfs-site.xml file. They are no longer required, as these addresses have already been provided per NameNode via the nameservice and NameNode IDs.

(Also remove the old RPC configuration if it was set.)
dfs.http.address
dfs.secondary.http.address

11. Now scp hdfs-site.xml and core-site.xml to snn, d1n, d2n, d3n and rm
[As root]
cd $CONF
for i in $(cat /tmp/all_hosts) ;do scp hdfs-site.xml core-site.xml ${i}:/etc/hadoop/conf/ ; done
for i in $(cat /tmp/all_hosts) ;do ssh ${i} chmod -R 755 /etc/hadoop/conf ; done;


12. Setup Journal Nodes d1n,d2n and d3n

[As root]
mkdir -p /opt/HDPV2/journal/node/local/data
chown -R hdfs:hadoop /opt/HDPV2/journal


[As hdfs] hadoop-daemon.sh start journalnode

13. Format ZKFC [As hdfs on nn]
hdfs zkfc -formatZK -force

(As the output below shows, this step fails with a FATAL message because automatic failover is not yet enabled; that is covered in the automatic failover blog.)
[hdfs@nn conf]$ hdfs zkfc -formatZK -force
18/04/18 04:58:40 INFO tools.DFSZKFailoverController: Failover controller configured for NameNode NameNode at nn/192.168.2.101:8020
18/04/18 04:58:40 FATAL ha.ZKFailoverController: Automatic failover is not enabled for NameNode at nn/192.168.2.101:8020. Please ensure that automatic failover is enabled in the configuration before running the ZK failover controller.



14. Start NameNode [As hdfs on nn]
 hadoop-daemon.sh start namenode

15. Bootstrap the standby [As hdfs on snn]

hdfs namenode -bootstrapStandby
=====================================================
About to bootstrap Standby ID nn2 from:
           Nameservice ID: devcluster
        Other Namenode ID: nn1
  Other NN's HTTP address: http://nn:50070
  Other NN's IPC  address: nn/192.168.2.101:8020
             Namespace ID: 1878602801
            Block pool ID: BP-1256862429-192.168.2.101-1523506309307
               Cluster ID: CID-65c51f92-e1c0-4a17-a921-ae0e61f3251f
           Layout version: -63
       isUpgradeFinalized: true



16. Start the standby NameNode [As hdfs on snn]
 hadoop-daemon.sh start namenode

17. Stop both namenodes [As hdfs on nn and snn]
hadoop-daemon.sh stop namenode

17.1 Stop all journal daemons. [As hdfs on d1n,d2n and d3n]
 hadoop-daemon.sh stop journalnode


18. Start the Cluster [As hdfs on nn]

start-dfs.sh

19. Force Transition of one node to Active

This is an important point: when you first start, you will see that both your nodes are in standby mode.
This is because we have not yet configured automatic failover and the ZooKeeper failover controller processes.
I will cover that next, but for now let's focus on the task at hand.

Check Service State

[hdfs@nn ~]$ hdfs haadmin -getServiceState nn1
standby
[hdfs@nn ~]$ hdfs haadmin -getServiceState nn2
standby


hdfs haadmin -failover -forcefence -forceactive nn2 nn1
Failover from nn2 to nn1 successful


Remember: when you do a forced failover, the other node must be fenced, so it will be shut down automatically. You can start it again manually.

Now check the Service State

[hdfs@nn ~]$ hdfs haadmin -getServiceState nn1
active
[hdfs@nn ~]$ hdfs haadmin -getServiceState nn2
standby


You can look at the Web UI of each NameNode to confirm the status.




 

Get configuration of namenodes
[hdfs@nn ~]$ hdfs getconf -namenodes
nn snn


Start Cluster
[rm-yarn] - start-yarn.sh
[rm-spark] - start-all.sh
[rm-mapred] - mr-jobhistory-daemon.sh start historyserver



The directory structure of the QJM directory is similar to the NameNode metadata directory; however, it holds only edits, as these are all the standby node needs to roll forward its in-memory fsimage.

/opt/HDPV2/journal/node/local/data/devcluster/current
[root@d1n current]# ls -rlht
total 1.1M
-rw-r--r--. 1 hdfs hadoop  155 Apr 18 04:39 VERSION
-rw-r--r--. 1 hdfs hadoop   42 Apr 18 04:39 edits_0000000000000000701-0000000000000000702
-rw-r--r--. 1 hdfs hadoop   42 Apr 18 05:23 edits_0000000000000000703-0000000000000000704
-rw-r--r--. 1 hdfs hadoop   42 Apr 18 05:25 edits_0000000000000000705-0000000000000000706
-rw-r--r--. 1 hdfs hadoop   42 Apr 18 05:26 edits_0000000000000000707-0000000000000000708
-rw-r--r--. 1 hdfs hadoop    2 Apr 18 05:26 last-promised-epoch
drwxr-xr-x. 2 hdfs hadoop    6 Apr 18 05:26 paxos
-rw-r--r--. 1 hdfs hadoop    2 Apr 18 05:26 last-writer-epoch
-rw-r--r--. 1 hdfs hadoop   42 Apr 18 05:28 edits_0000000000000000709-0000000000000000710
-rw-r--r--. 1 hdfs hadoop   42 Apr 18 05:30 edits_0000000000000000711-0000000000000000712
-rw-r--r--. 1 hdfs hadoop   42 Apr 18 05:32 edits_0000000000000000713-0000000000000000714
-rw-r--r--. 1 hdfs hadoop   42 Apr 18 05:34 edits_0000000000000000715-0000000000000000716
-rw-r--r--. 1 hdfs hadoop 1.0M Apr 18 05:34 edits_inprogress_0000000000000000717
-rw-r--r--. 1 hdfs hadoop    8 Apr 18 05:34 committed-txid
 



Hadoop V2 - GetConf

In this short blog I will discuss the getconf utility of hdfs.

This utility lists configuration values as defined in the configuration files.

It can be used to get details on namenodes, secondarynamenode, backupnode etc

General syntax of the utility is as follows

hdfs getconf <item_to_get>

Below is the help

[hdfs@nn logs]$ hdfs getconf
hdfs getconf is utility for getting configuration information from the config file.

hadoop getconf
        [-namenodes]                    gets list of namenodes in the cluster.
        [-secondaryNameNodes]                   gets list of secondary namenodes in the cluster.
        [-backupNodes]                  gets list of backup nodes in the cluster.
        [-includeFile]                  gets the include file path that defines the datanodes that can join the cluster.
        [-excludeFile]                  gets the exclude file path that defines the datanodes that need to decommissioned.
        [-nnRpcAddresses]                       gets the namenode rpc addresses
        [-confKey [key]]                        gets a specific key from the configuration

       
       
Examples
1. Get Namenodes list
[hdfs@nn logs]$ hdfs getconf -namenodes
nn


2. Get SecondaryNamenodes list
[hdfs@nn logs]$ hdfs getconf -secondaryNameNodes
snn


3. Get Details of include file
[hdfs@nn logs]$ hdfs getconf -includeFile
/etc/hadoop/conf/dfs.hosts.include


4. Get specific key - replication
[hdfs@nn logs]$ hdfs getconf -confKey dfs.replication
3

 

5. Get specific key - blocksize
[hdfs@nn logs]$ hdfs getconf -confKey dfs.blocksize
134217728

Hadoop V2 - Safe Mode

In this blog I will discuss about Safe Mode in Namenode

Safe Mode is a special mode of Hadoop in which HDFS is read-only (clients can still connect and read) and no changes are allowed.
No block replications / deletions are allowed either.


Safe Mode is -
- entered automatically at startup and is left only when enough replicas are found
- entered automatically when the NameNode disk is running out of space
- entered automatically when free disk space falls below the threshold (dfs.namenode.resource.du.reserved)
(In this situation you have to clear the disk and exit safe mode manually)
- entered manually for maintenance and administrative operations


Manual Safe Mode
You can put name node in safe mode manually
1. Get Status of Safe Mode
[hdfs@nn ~]$ hdfs dfsadmin -safemode get
Safe mode is OFF

2. Enter Safe Mode
[hdfs@nn ~]$ hdfs dfsadmin -safemode enter
Safe mode is ON


3. Leave Safe Mode
[hdfs@nn ~]$ hdfs dfsadmin -safemode leave
Safe mode is OFF


4. Safe Mode Wait
hdfs dfsadmin -safemode wait
Waits until safe mode is left, i.e. until enough block replicas have been reported.
The NameNode should already be in safe mode when you run this command.

Using Safe Mode to Backup Namenode metadata

1. Enter Safe Mode
2. Backup (metasave and saveNamespace)
3. Exit Safe Mode


[hdfs@nn logs]$ hdfs dfsadmin -safemode enter
Safe mode is ON

[hdfs@nn logs]$ hdfs dfsadmin -saveNamespace
Save namespace successful

(A new FSImage, edits file and FSImage MD5 file are created)

[hdfs@nn logs]$ hdfs dfsadmin -metasave metadata.txt
Created metasave file metadata.txt in the log directory of namenode hdfs://nn:8020

[hdfs@nn logs]$ hdfs dfsadmin -safemode leave
Safe mode is OFF




Hadoop V2 - Namenode Parameters


This blog covers details of Namenode and Namenode related parameters

Checkpoint Frequency

Checkpointing can be configured by setting


- dfs.namenode.checkpoint.period. This parameter controls the time between 2 checkpoints.
- dfs.namenode.checkpoint.txns. This parameter controls the number of edit log transactions between 2 checkpoints.
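
In hdfs-site.xml property form these look like the following; the values shown are the stock defaults (1 hour and 1,000,000 transactions), given purely for illustration:

<property>
    <name>dfs.namenode.checkpoint.period</name>
    <value>3600</value>
</property>
<property>
    <name>dfs.namenode.checkpoint.txns</name>
    <value>1000000</value>
</property>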

 

Checkpoint Performance

- dfs.image.transfer.bandwidthPerSec. This parameter throttles the bandwidth (bytes/sec) allowed for transferring edit logs and the FSImage from the NameNode during the checkpointing process.
The default value is '0', which implies no throttling is done.
   
- dfs.image.transfer.timeout. This parameter controls the socket timeout (in milliseconds) for transferring the image.
Default value - 60,000 milliseconds

Safeguard Parameters 

 
- dfs.namenode.max.extra.edits.segments.retained. This controls the number of extra edit log segments which should be retained beyond what is required to restart the NameNode.
Default value - 10,000

- dfs.namenode.num.extra.edits.retained. This controls the number of extra transactions to be retained beyond what is required to restart the NameNode.
Default value - 1,000,000




Tuesday, April 17, 2018

Hadoop V2 - HttpFS setup

In this blog I discuss setup of HttpFS in Hadoop

In Hadoop, HttpFS
1. acts as a proxy server catering to REST requests
2. acts as a single point of contact for all clients
(clients do not need connectivity to the DataNodes, as they do with WebHDFS)
3. can work on a multi-NameNode cluster, unlike WebHDFS

(All the steps are run on the server which will be the proxy / edge node, as user root, except where mentioned.)

1. Create User httpfs
groupadd -g 1000 hadoop
useradd -u 1010  -g hadoop httpfs



2. Setup Java and Hadoop
rpm -Uvh /tmp/jdk-8u152-linux-x64.rpm
scp -r root@nn:/usr/local/hadoop-2.7.5 /usr/local/
rm -rf /usr/local/hadoop-2.7.5/etc/hadoop
mkdir -p /etc/hadoop
scp -r nn:/etc/hadoop/conf /etc/hadoop
chmod -R 755 /etc/hadoop/conf


Create Soft Links
ln -s /usr/local/hadoop-2.7.5 /usr/local/hadoop
ln -s /etc/hadoop/conf /usr/local/hadoop-2.7.5/etc/hadoop


3. Setup Profile
scp root@nn:/tmp/profile.sh /etc/profile.d
source /etc/profile.d/profile.sh


4. Setup sudo (add to /etc/sudoers, e.g. via visudo)
httpfs        ALL=(ALL)       NOPASSWD: ALL

5. Create Directories
mkdir -p /opt/HDPV2/logs /opt/HDPV2/pids /opt/HDPV2/1 /opt/HDPV2/2 /opt/HDPV2/tmp /opt/HDPV2/temp
chown -R httpfs:hadoop /opt/HDPV2/logs /opt/HDPV2/pids /opt/HDPV2/1 /opt/HDPV2/2 /opt/HDPV2/tmp /opt/HDPV2/temp
chmod -R 755 /opt/HDPV2

chmod 0755 /usr/local/hadoop/share/hadoop/httpfs/tomcat/conf/*


6. Change core-site.xml [as hdfs - on NN, SNN and the httpfs server]
<property>
    <name>hadoop.proxyuser.httpfs.hosts</name>
    <value>192.168.1.71</value>
</property>
<property>
    <name>hadoop.proxyuser.httpfs.groups</name>
    <value>*</value>
</property>


Stop and start the NameNode and SecondaryNameNode
hadoop-daemon.sh stop secondarynamenode
hadoop-daemon.sh start secondarynamenode
hadoop-daemon.sh stop namenode
hadoop-daemon.sh start namenode


7.  Edit httpfs-env.sh [as httpfs on httpfs server]
 cd $CONF
 Add below
 sudo vi httpfs-env.sh

export HTTPFS_LOG=/opt/HDPV2/logs  #Custom
export HTTPFS_TEMP=/opt/HDPV2/temp #Custom



8. Start httpfs [as httpfs on httpfs server]
 httpfs.sh start

Test it, and your HttpFS should be ready:
curl -sS 'http://192.168.1.71:14000/webhdfs/v1?op=gethomedirectory&user.name=hdfs'
{"Path":"\/user\/hdfs"}


You can use the same API as in webhdfs, except now you are using a proxy host.
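
For example, any other WebHDFS operation works the same way through the proxy; listing a directory looks like this (the /data/conf path is reused from the WebHDFS post below and is an assumption for your cluster):

curl -sS 'http://192.168.1.71:14000/webhdfs/v1/data/conf?op=LISTSTATUS&user.name=hdfs'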

Monday, April 16, 2018

Hadoop V2 - WebHdfs

In this blog I discuss how to setup WebHDFS.


Set up the property below on the NameNode in hdfs-site.xml:

<property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
</property>

Distribute this to all nodes.

This will need a bounce of all DataNodes and the NameNode (all NameNodes in the case of a federated / standby cluster).

[As hdfs on namenode]
stop-dfs.sh
start-dfs.sh


Now, in any browser of your choice, try to open a file using the format below:
 http://<namenode>:<port>/webhdfs/v1/<path_to_file>?op=OPEN&user.name=<username>

where username is a user which has the right permissions to read the file.
 

http://nn:50070/webhdfs/v1/data/conf/hosts?op=OPEN&user.name=hdfs 

You will see the request is redirected to one of the DataNodes which can serve your file
(this is why you need the bounce).
  


In order to do the same using the CLI / curl:


[hdfs@nn tmp]$ curl -i -L "http://nn:50070/webhdfs/v1/data/conf/hosts?op=OPEN&user.name=hdfs"
HTTP/1.1 307 TEMPORARY_REDIRECT
Cache-Control: no-cache
Expires: Mon, 16 Apr 2018 09:20:33 GMT
Date: Mon, 16 Apr 2018 09:20:33 GMT
Pragma: no-cache
Expires: Mon, 16 Apr 2018 09:20:33 GMT
Date: Mon, 16 Apr 2018 09:20:33 GMT
Pragma: no-cache
Content-Type: application/octet-stream
Set-Cookie: hadoop.auth="u=hdfs&p=hdfs&t=simple&e=1523906433975&s=gI+p66RzmNMV1f7DKQM1oZ4aEoE="; Path=/; Expires=Mon, 16-Apr-2018 19:20:33 GMT; HttpOnly
Location: http://d1.novalocal:50075/webhdfs/v1/data/conf/hosts?op=OPEN&user.name=hdfs&namenoderpcaddress=nn:8020&offset=0
Content-Length: 0
Server: Jetty(6.1.26)

HTTP/1.1 200 OK
Access-Control-Allow-Methods: GET
Access-Control-Allow-Origin: *
Content-Type: application/octet-stream
Connection: close
Content-Length: 416

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.2.101 nn.novalocal      nn
192.168.2.102 rm.novalocal      rm
192.168.2.103 snn.novalocal     snn
192.168.2.104 d1.novalocal        d1n
192.168.2.105 d2.novalocal        d2n
192.168.2.106 d3.novalocal        d3n
192.168.2.107 d4.novalocal        d4n


Now, to list a directory's status, use:


curl -i -L "http://nn:50070/webhdfs/v1/data/conf/?op=LISTSTATUS"
HTTP/1.1 200 OK
Cache-Control: no-cache
Expires: Mon, 16 Apr 2018 09:23:47 GMT
Date: Mon, 16 Apr 2018 09:23:47 GMT
Pragma: no-cache
Expires: Mon, 16 Apr 2018 09:23:47 GMT
Date: Mon, 16 Apr 2018 09:23:47 GMT
Pragma: no-cache
Content-Type: application/json
Transfer-Encoding: chunked
Server: Jetty(6.1.26)

{"FileStatuses":{"FileStatus":[
{"accessTime":1523854423644,"blockSize":134217728,"childrenNum":0,"fileId":16420,"group":"admingroup","length":4436,"modificationTime":1523854423825,"owner":"hdfs","pathSuffix":"capacity-scheduler.xml","permission":"644","replication":3,"storagePolicy":0,"type":"FILE"},
{"accessTime":1523854423837,"blockSize":134217728,"childrenNum":0,"fileId":16421,"group":"admingroup","length":1335,"modificationTime":1523854423863,"owner":"hdfs","pathSuffix":"configuration.xsl","permission":"644","replication":3,"storagePolicy":0,"type":"FILE"},
{"accessTime":1523854423870,"blockSize":134217728,"childrenNum":0,"fileId":16422,"group":"admingroup","length":318,"modificationTime":1523854423890,"owner":"hdfs","pathSuffix":"container-executor.cfg","permission":"644","replication":3,"storagePolicy":0,"type":"FILE"},
.
.
.
]}} 



You can get the full list of operations here (adjust for the version you use):

https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/WebHDFS.html

Hadoop V2 - FSCK

FSCK is one of the key commands used for monitoring HDFS.

FSCK
1. Similar to Linux fsck, it finds block corruption and issues with the file system
2. Makes changes only when told to, and only checks by default
3. Is a metadata-only operation, i.e. only the NameNode FSImage is processed
4. Can be run on the complete FS or on a specific directory
5. Should be run as the superuser only


Options -

fsck /: Performs an HDFS file system check
fsck / -files: Displays files being checked
fsck / -files -blocks: Displays files and blocks
fsck / -files -blocks -locations: Displays files, blocks and their locations
fsck / -files -blocks -locations -racks: Displays files, blocks, locations and racks
fsck / -locations: Shows locations for every block
fsck / -move: Moves corrupted files to the /lost+found directory
fsck / -delete: Deletes corrupt files
fsck / -list-corruptfileblocks: Lists missing blocks and the files they belong to

Example - Running FSCK for a directory and display everything

hdfs fsck  /data/conf -files -blocks -locations -racks
.
.
.
/data/conf/yarn-env.sh 4760 bytes, 1 block(s):  OK
0. BP-1256862429-192.168.2.101-1523506309307:blk_1073741860_1036 len=4760 repl=3 [/default-rack/192.168.2.104:50010, /default-rack/192.168.2.105:50010, /default-rack/192.168.2.106:50010]

Status: HEALTHY
 Total size:    78390 B
 Total dirs:    1
 Total files:   31
 Total symlinks:                0
 Total blocks (validated):      30 (avg. block size 2613 B)
 Minimally replicated blocks:   30 (100.0 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       0 (0.0 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    3
 Average block replication:     3.0
 Corrupt blocks:                0
 Missing replicas:              0 (0.0 %)
 Number of data-nodes:          3
 Number of racks:               1
FSCK ended at Mon Apr 16 02:01:20 EDT 2018 in 3 milliseconds


The filesystem under path '/data/conf' is HEALTHY

Hadoop V2 - Snapshot

In this blog I discuss HDFS snapshot feature

HDFS Snapshot -
1. A feature to take snapshots of a directory to protect against errors
2. It is used to query old versions of data
3. Directories are not enabled for snapshots by default
4. Only the NameNode knows about snapshots, as it maintains the metadata information; DataNodes have no knowledge of them
5. Names are unique, i.e. you cannot create two snapshots with the same name for a given directory

6. No data copying happens; only the block list and file size are recorded by the snapshot, and normal operations go on as usual
7. You cannot delete files from HDFS snapshot directories; they can only be listed and copied



1. Enabling Snapshot
    hdfs dfsadmin -allowSnapshot <path>
   [hdfs@nn ~]$ hdfs dfsadmin -allowSnapshot /data/conf

   
    Allowing snaphot on /data/conf succeeded

2. Create Snapshot
    hdfs dfs -createSnapshot <path> [<snapshotname>]
    [hdfs@nn ~]$ hdfs dfs -createSnapshot /data/conf
    Created snapshot /data/conf/.snapshot/s20180416-005602.358

    If you do not specify any name, a system-generated name is created for the snapshot.
    [hdfs@nn ~]$ hdfs dfs -createSnapshot /data/conf Snap1
    Created snapshot /data/conf/.snapshot/Snap1


3. Deleting Snapshot
    hdfs dfs -deleteSnapshot <path> <snapshotname>
    [hdfs@nn ~]$ hdfs dfs -deleteSnapshot /data/conf s20180416-005602.358

   

4. Listing Snapshots   
    hdfs dfs -ls <path/.snapshot>
    [hdfs@nn ~]$ hdfs dfs -ls /data/conf/.snapshot
    Found 1 items
    drwxr-xr-x   - hdfs admingroup          0 2018-04-16 00:56 /data/conf/.snapshot/Snap1

   
5. List Directories on which Snapshots are enabled
    [hdfs@nn ~]$ hdfs lsSnapshottableDir
    drwxr-xr-x 0 hdfs admingroup 0 2018-04-16 00:53 0 65536 /data1
    drwxr-xr-x 0 hdfs admingroup 0 2018-04-16 00:56 1 65536 /data/conf


6. Difference in 2 Snapshots
    hdfs snapshotDiff <path> <Snap1_name> <Snap2_name>
    [hdfs@nn ~]$ hdfs snapshotDiff /data/conf Snap1 Snap2
    Difference between snapshot Snap1 and snapshot Snap2 under directory /data/conf:
    M       .
    +       ./hosts
    -       ./ssl-server.xml.example
    -       ./yarn-site.xml
    where '+' means a file was added, '-' removed, and 'R' renamed


7.  Listing Old Contents
    hdfs dfs -ls /data/conf/.snapshot/Snap2
    You can list the old content by listing files in the snapshot directory.
   
8. Recovering
    Files can be recovered by copying them from the snapshot directory.

9. Deleting
    A directory can be deleted only if there are no snapshots present, so delete all snapshots first using
     hdfs dfs -deleteSnapshot <path> <snapshotname>
   

Hadoop V2 - Trash

In this blog I discuss about HDFS Trash Feature.

Trash is a feature provided by HDFS similar to the Windows Recycle Bin. However, there are a few differences:


1. It is a client-side feature, i.e. files are moved to trash only when deleted using the 'hdfs dfs' command.
Files removed programmatically are deleted permanently by default
(to move them to trash programmatically you need to use the Trash class's moveToTrash() method).
2. Trash is disabled by default and is enabled by setting the property fs.trash.interval (in minutes).
The example below sets it to 2 days:

<property>
    <name>fs.trash.interval</name>
    <value>2880</value>
</property>

3. After the specified period, files are deleted permanently.

4. fs.trash.checkpoint.interval defines how often (in minutes) the trash directory is checked for files older than fs.trash.interval.

5. fs.trash.interval should be set on all the nodes which are going to be client nodes, not just the NameNode, so different clients can have different settings.

Deleting File
[hdfs@nn conf]$ hdfs dfs -rm /user/hosts
18/04/16 00:27:43 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 1440 minutes, Emptier interval = 0 minutes.
18/04/16 00:27:43 INFO fs.TrashPolicyDefault: Moved: 'hdfs://nn:8020/user/hosts' to trash at: hdfs://nn:8020/user/hdfs/.Trash/Current/user/hosts
Moved: 'hdfs://nn:8020/user/hosts' to trash at: hdfs://nn:8020/user/hdfs/.Trash/Current


Restoring File
Copy
[hdfs@nn conf]$ hdfs dfs -cp /user/hdfs/.Trash/Current/user/hosts /user/
or Move
[hdfs@nn conf]$ hdfs dfs -mv /user/hdfs/.Trash/Current/user/hosts /user/

Deleting Trash
Expunge command will only remove trash files of the user who is running the command.
[mapred@nn ~]$ hdfs dfs -expunge
18/04/16 00:31:56 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 1440 minutes, Emptier interval = 0 minutes.
18/04/16 00:31:56 INFO fs.TrashPolicyDefault: Created trash checkpoint: /user/mapred/.Trash/180416003156


Skipping Trash
[hdfs@nn ~]$ hdfs dfs -rm -skipTrash /user/hosts/
Deleted /user/hosts



Note

  1. The trash directory is created automatically on delete; make sure the user has permission to write under the /user directory.
  2. The checkpoint directory carries the timestamp of when the first file was deleted.
  3. The deleted file keeps the timestamp of when it was created.
  4. If you delete 2 files with the same name, the second file is numbered with a timestamp.
  5. Trash enabled --> delete file; trash disabled --> refresh configuration --> file still present. This means that even if you disable trash and restart, the already-trashed files remain.
  6. /user/<username>/.Trash/Current is where the trash files are.