MapReduce - Basic Configuration to Start Hadoop Daemons
Configuration of mapred-site.xml
### Only the properties are shown; they go inside the <configuration> element of mapred-site.xml ###
<property>
<name>mapred.job.tracker</name>
<value>nn:8021</value>
</property>
<property>
<name>mapred.local.dir</name>
<value>/opt/HDPV1/1/mr1,/opt/HDPV1/1/mr2</value>
</property>
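Since only the properties are listed above, here is a minimal sketch of how the complete file could be generated. The function name write_mapred_site and the idea of passing the output path as an argument are illustrative assumptions, not part of the original setup; the property names and values are the ones shown above.

```shell
#!/bin/sh
# write_mapred_site OUT_FILE (hypothetical helper)
# Writes a complete mapred-site.xml with the two basic properties
# wrapped in the <configuration> root element that Hadoop expects.
write_mapred_site() {
    out="$1"
    cat > "$out" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>nn:8021</value>
  </property>
  <property>
    <name>mapred.local.dir</name>
    <value>/opt/HDPV1/1/mr1,/opt/HDPV1/1/mr2</value>
  </property>
</configuration>
EOF
}
```

For example, `write_mapred_site ./mapred-site.xml` creates the file in the current directory, ready to be copied to the nodes.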
Copy the Configuration to all the nodes
[As root]
# for i in $(cat /tmp/hosts) ;do scp mapred-site.xml ${i}:/etc/hadoop/conf/ ; done
[As root - Give Permissions]
# for i in $(cat /tmp/hosts) ;do ssh ${i} chmod -R 755 /etc/hadoop ; done;
# for i in $(cat /tmp/hosts) ;do ssh ${i} chmod 775 /opt/HDPV1/1/ ; done;
# for i in $(cat /tmp/hosts) ;do ssh ${i} mkdir /opt/HDPV1/1/mr1 ; done;
# for i in $(cat /tmp/hosts) ;do ssh ${i} mkdir /opt/HDPV1/1/mr2 ; done;
# for i in $(cat /tmp/hosts) ;do ssh ${i} chown mapred:hadoop /opt/HDPV1/1/mr1 ; done;
# for i in $(cat /tmp/hosts) ;do ssh ${i} chown mapred:hadoop /opt/HDPV1/1/mr2 ; done;
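The directory steps above open one ssh session per command per host. As a rough alternative, assuming the same hosts file and directories, they could be folded into a single session per host. The function name prepare_local_dirs and the DRY_RUN switch are hypothetical additions for illustration; mkdir -p is used so a rerun does not fail on directories that already exist.

```shell
#!/bin/sh
# Remote command run on every node: create both local dirs and hand
# them to mapred:hadoop, as in the individual commands above.
REMOTE_CMD='mkdir -p /opt/HDPV1/1/mr1 /opt/HDPV1/1/mr2 && chown mapred:hadoop /opt/HDPV1/1/mr1 /opt/HDPV1/1/mr2'

# prepare_local_dirs HOSTS_FILE (hypothetical helper)
# With DRY_RUN=1 the ssh commands are printed instead of executed.
prepare_local_dirs() {
    hosts_file="$1"
    while read -r host; do
        [ -n "$host" ] || continue        # skip blank lines
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "ssh $host $REMOTE_CMD"
        else
            # -n keeps ssh from swallowing the rest of the hosts file
            ssh -n "$host" "$REMOTE_CMD" || echo "FAILED on $host" >&2
        fi
    done < "$hosts_file"
}
```

Run as root, `DRY_RUN=1 prepare_local_dirs /tmp/hosts` shows what would be executed, and dropping DRY_RUN performs the changes.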
[As mapred - on namenode]
Start the MapReduce daemons
$ start-mapred.sh
Check the running daemons on every node
$ for i in $(cat /tmp/hosts) ; do ssh ${i} 'hostname; jps | grep -vi jps; echo' ; done;
namenode.cluster.com
29378 JobTracker
d1node.cluster.com
4931 TaskTracker
d2node.cluster.com
7712 TaskTracker
d3node.cluster.com
2359 TaskTracker
d4node.cluster.com
17635 TaskTracker
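Reading the jps listing by eye works for five nodes. For a scripted check, a small filter along these lines could be used. The function name has_daemon is a hypothetical helper; it only assumes the two-column "PID ClassName" output format shown above.

```shell
#!/bin/sh
# has_daemon NAME (hypothetical helper)
# Reads jps output on stdin and succeeds only if a process whose class
# name (second column) is exactly NAME appears in the listing.
has_daemon() {
    awk -v name="$1" '$2 == name { found = 1 } END { exit !found }'
}
```

A possible use is `ssh d1node.cluster.com jps | has_daemon TaskTracker || echo "TaskTracker missing on d1node"`, repeated per host from the same loop as above.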
To optimize performance, use the configuration below for mapred-site.xml, copy it to all the nodes, and restart the daemons with stop-mapred.sh followed by start-mapred.sh.
MapReduce - Performance Configuration File
<property>
<name>mapred.job.tracker</name>
<value>nn:8021</value>
</property>
<property>
<name>mapred.local.dir</name>
<value>/opt/HDPV1/1/mr1,/opt/HDPV1/1/mr2</value>
</property>
<property>
<name>mapred.child.java.opts</name>
<value>-Xmx1024m</value>
</property>
<property>
<name>mapred.child.ulimit</name>
<value>1572864</value>
</property>
<property>
<name>mapred.tasktracker.map.tasks.maximum</name>
<value>4</value>
</property>
<property>
<name>mapred.tasktracker.reduce.tasks.maximum</name>
<value>2</value>
</property>
<property>
<name>io.sort.mb</name>
<value>200</value>
</property>
<property>
<name>io.sort.factor</name>
<value>32</value>
</property>
<property>
<name>mapred.compress.map.output</name>
<value>true</value>
</property>
<property>
<name>mapred.map.output.compression.codec</name>
<value>org.apache.hadoop.io.compress.SnappyCodec</value>
</property>
<property>
<name>mapred.jobtracker.taskScheduler</name>
<value>org.apache.hadoop.mapred.FairScheduler</value>
</property>
<property>
<name>mapred.reduce.tasks</name>
<value>8</value>
</property>
<property>
<name>mapred.reduce.slowstart.completed.maps</name>
<value>0.7</value>
</property>
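Before rolling these values out, it may help to check that they fit the hardware. The sketch below only does arithmetic on the numbers from the file above. The count of four TaskTrackers is taken from the jps output earlier in this post, and the variable names are made up for illustration. Note that mapred.child.ulimit is given in kilobytes and has to be at least as large as the child JVM heap.

```shell
#!/bin/sh
# Values copied from the performance configuration above.
MAP_SLOTS=4          # mapred.tasktracker.map.tasks.maximum
REDUCE_SLOTS=2       # mapred.tasktracker.reduce.tasks.maximum
CHILD_HEAP_MB=1024   # -Xmx1024m in the child JVM options
ULIMIT_KB=1572864    # mapred.child.ulimit (kilobytes)
TASKTRACKERS=4       # d1node..d4node in the jps listing

# Worst-case child heap on one node if every slot is busy.
max_heap_mb=$(( (MAP_SLOTS + REDUCE_SLOTS) * CHILD_HEAP_MB ))
# Per-task virtual memory limit converted to MB.
ulimit_mb=$(( ULIMIT_KB / 1024 ))
# Reduce slots available across the whole cluster.
cluster_reduce_slots=$(( TASKTRACKERS * REDUCE_SLOTS ))

echo "max child heap per node: ${max_heap_mb} MB"
echo "per-task ulimit: ${ulimit_mb} MB"
echo "cluster reduce slots: ${cluster_reduce_slots}"
```

With these inputs each node needs roughly 6 GB for task heaps alone, in addition to the TaskTracker and DataNode daemons. The eight cluster-wide reduce slots line up with mapred.reduce.tasks set to 8, so a job's reducers can all run in a single wave.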