
Thursday, February 22, 2018

Hadoop V1 Install - Mapred Configuration

MapReduce - Basic Configuration to Start Hadoop Daemons


Configuration of mapred-site.xml
### Only the properties are shown; they go inside the <configuration> element ###


<property>
        <name>mapred.job.tracker</name>
        <value>nn:8021</value>
</property>
<property>
        <name>mapred.local.dir</name>
        <value>/opt/HDPV1/1/mr1,/opt/HDPV1/1/mr2</value>
</property>
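
Since only the properties are listed above, here is a minimal sketch of writing them out as a complete file with the <configuration> root element that Hadoop expects. The function name write_mapred_site and the idea of passing the output path as an argument are assumptions for illustration, not part of the original setup.

```shell
#!/bin/sh
# Hypothetical helper: writes a complete mapred-site.xml to the path given as $1.
# The two properties are the same ones listed in the basic configuration.
write_mapred_site() {
    cat > "$1" <<'EOF'
<?xml version="1.0"?>
<configuration>
<property>
        <name>mapred.job.tracker</name>
        <value>nn:8021</value>
</property>
<property>
        <name>mapred.local.dir</name>
        <value>/opt/HDPV1/1/mr1,/opt/HDPV1/1/mr2</value>
</property>
</configuration>
EOF
}
```

Running write_mapred_site mapred-site.xml in the working directory should produce a file that can be copied to the nodes as-is.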


Copy the Configuration to all the nodes
[As root]
# for i in $(cat /tmp/hosts) ;do scp mapred-site.xml ${i}:/etc/hadoop/conf/ ; done

[As root - Give Permissions]
# for i in $(cat /tmp/hosts) ;do ssh ${i} chmod -R 755 /etc/hadoop ; done;

[As root - Create the mapred.local.dir Directories]
# for i in $(cat /tmp/hosts) ;do ssh ${i} chmod 775 /opt/HDPV1/1/ ; done;
# for i in $(cat /tmp/hosts) ;do ssh ${i} mkdir /opt/HDPV1/1/mr1   ; done;
# for i in $(cat /tmp/hosts) ;do ssh ${i} mkdir /opt/HDPV1/1/mr2   ; done;
# for i in $(cat /tmp/hosts) ;do ssh ${i} chown mapred:hadoop   /opt/HDPV1/1/mr1 ; done;
# for i in $(cat /tmp/hosts) ;do ssh ${i} chown mapred:hadoop   /opt/HDPV1/1/mr2 ; done;
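
The directory steps above can also be folded into one remote session per host. This is a sketch only, assuming the same /tmp/hosts file and root SSH access as the loops above. The function name prepare_local_dirs and the REMOTE variable are hypothetical. It uses mkdir -p (so a rerun does not fail on existing directories) and creates the directories before changing permissions.

```shell
#!/bin/sh
# REMOTE is a hypothetical knob: the command used to reach a host.
# It defaults to ssh; set REMOTE=echo to print the commands instead (dry run).
REMOTE=${REMOTE:-ssh}

# Hypothetical helper: prepares both mapred.local.dir directories on every
# host listed in the file given as $1, using one remote session per host.
prepare_local_dirs() {
    for host in $(cat "$1"); do
        $REMOTE "$host" "mkdir -p /opt/HDPV1/1/mr1 /opt/HDPV1/1/mr2 && chmod 775 /opt/HDPV1/1 && chown mapred:hadoop /opt/HDPV1/1/mr1 /opt/HDPV1/1/mr2"
    done
}
```

A dry run with REMOTE=echo followed by prepare_local_dirs /tmp/hosts prints one command line per host, which is a cheap way to review the commands before running them as root.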


[As mapred - on the namenode]
Start the MapReduce daemons
start-mapred.sh


Verify that the daemons are running on all the nodes
for i in $(cat /tmp/hosts) ; do ssh ${i} 'hostname; jps | grep -vi jps; echo' ;  done;


namenode.cluster.com
29378 JobTracker


d1node.cluster.com
4931 TaskTracker

d2node.cluster.com
7712 TaskTracker

d3node.cluster.com
2359 TaskTracker

d4node.cluster.com
17635 TaskTracker
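
On a larger cluster it is easy to overlook a node where the daemon did not come up. The sketch below reads a listing in the same shape as the output above (a hostname line followed by "PID daemon" lines) and prints the hosts that show neither a JobTracker nor a TaskTracker. The function name hosts_missing_mapred is hypothetical, and the parsing assumes hostnames contain no spaces.

```shell
#!/bin/sh
# Hypothetical helper: reads the "hostname / PID daemon" listing on stdin and
# prints every host for which no JobTracker or TaskTracker line was seen.
hosts_missing_mapred() {
    awk '
        # A single-field line is treated as a hostname.
        NF == 1 { host = $1; if (!(host in seen)) { seen[host] = 0; order[++n] = host } }
        # A two-field line is "PID daemon"; mark the current host if it is a MapReduce daemon.
        NF == 2 && ($2 == "JobTracker" || $2 == "TaskTracker") { seen[host] = 1 }
        # Report hosts in the order they appeared.
        END { for (i = 1; i <= n; i++) if (!seen[order[i]]) print order[i] }
    '
}
```

Piping the verification loop into hosts_missing_mapred should print nothing when every node runs its daemon.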




To optimize performance, you can use the configuration below for mapred-site.xml. Copy it to all the nodes, then restart the daemons with stop-mapred.sh followed by start-mapred.sh.


MapReduce - Performance Configuration File


<property>
        <name>mapred.job.tracker</name>
        <value>nn:8021</value>
</property>
<property>
        <name>mapred.local.dir</name>
        <value>/opt/HDPV1/1/mr1,/opt/HDPV1/1/mr2</value>
</property>
<property>
        <name>mapred.child.java.opts</name>
        <value>-Xmx1024m</value>
</property>
<property>
        <name>mapred.child.ulimit</name>
        <value>1572864</value>
</property>
<property>
        <name>mapred.tasktracker.map.tasks.maximum</name>
        <value>4</value>
</property>
<property>
        <name>mapred.tasktracker.reduce.tasks.maximum</name>
        <value>2</value>
</property>
<property>
        <name>io.sort.mb</name>
        <value>200</value>
</property>

<property>
        <name>io.sort.factor</name>
        <value>32</value>
</property>
<property>
        <name>mapred.compress.map.output</name>
        <value>true</value>
</property>
<property>
        <name>mapred.map.output.compression.codec</name>
        <value>org.apache.hadoop.io.compress.SnappyCodec</value>
</property>
<property>
        <name>mapred.jobtracker.taskScheduler</name>
        <value>org.apache.hadoop.mapred.FairScheduler</value>
</property>
<property>
        <name>mapred.reduce.tasks</name>
        <value>8</value>
</property>
<property>
        <name>mapred.reduce.slowstart.completed.maps</name>
        <value>0.7</value>
</property>
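
A few of the values above depend on each other, so it can help to sanity-check them before restarting. The sketch below does the arithmetic only. The function names are hypothetical, and the count of four TaskTracker nodes is taken from the jps listing earlier in this post. It assumes mapred.child.ulimit is expressed in KB, as in Hadoop 1.

```shell
#!/bin/sh
# Worst-case child JVM heap on one TaskTracker, in MB:
# (map slots + reduce slots) * heap per task.
child_heap_per_node_mb() { echo $(( ($1 + $2) * $3 )); }

# mapred.child.ulimit is given in KB; convert it to MB so it can be
# compared with the -Xmx value (the ulimit should not be smaller).
ulimit_mb() { echo $(( $1 / 1024 )); }

# Total reduce slots in the cluster: TaskTracker nodes * reduce slots per node.
cluster_reduce_slots() { echo $(( $1 * $2 )); }
```

With the values from the file, child_heap_per_node_mb 4 2 1024 gives 6144 (each node needs roughly 6 GB free for task JVMs), ulimit_mb 1572864 gives 1536 (above the 1024 MB heap), and cluster_reduce_slots 4 2 gives 8, which matches mapred.reduce.tasks.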

 
