I will design the queues and their capacities as shown in the diagram below.
(The complete configuration is given in the Appendix at the end of this post.)
Steps [on the RM node]
1. Back up the Capacity Scheduler file
sudo cp capacity-scheduler.xml capacity-scheduler.xml.bkp
2. Configure /etc/hadoop/conf/capacity-scheduler.xml as in the Appendix
3. Configure the following properties in yarn-site.xml
yarn.resourcemanager.scheduler.class=org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
(Enable Capacity Scheduler)
yarn.resourcemanager.scheduler.monitor.enable = true
(Enable Preemption)
4. Stop and start the YARN services
stop-yarn.sh
start-yarn.sh
5. Run any application
6. Verify the applications and queues in the RM cluster UI
7. Check the queues from the RM using the command line
[yarn@rm ]$ hadoop queue -list
18/04/23 03:06:36 INFO client.RMProxy: Connecting to ResourceManager at rm/192.168.2.102:8032
======================
Queue Name : research
Queue State : running
Scheduling Info : Capacity: 30.000002, MaximumCapacity: 30.000002, CurrentCapacity: 0.0
======================
Queue Name : analytics
Queue State : running
Scheduling Info : Capacity: 40.0, MaximumCapacity: 60.000004, CurrentCapacity: 0.0
======================
Queue Name : data
Queue State : running
Scheduling Info : Capacity: 60.000004, MaximumCapacity: 60.000004, CurrentCapacity: 0.0
======================
Queue Name : support
Queue State : running
Scheduling Info : Capacity: 40.0, MaximumCapacity: 50.0, CurrentCapacity: 0.0
======================
Queue Name : services
Queue State : running
Scheduling Info : Capacity: 40.0, MaximumCapacity: 40.0, CurrentCapacity: 0.0
======================
Queue Name : training
Queue State : running
Scheduling Info : Capacity: 60.000004, MaximumCapacity: 70.0, CurrentCapacity: 0.0
======================
Queue Name : production
Queue State : running
Scheduling Info : Capacity: 30.000002, MaximumCapacity: 100.0, CurrentCapacity: 22.222223
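The two yarn-site.xml settings from step 3 are written there as property elements. A minimal sketch, assuming they are placed inside the existing <configuration> element of /etc/hadoop/conf/yarn-site.xml alongside whatever properties are already present:

```xml
<!-- Fragment for yarn-site.xml; goes inside the existing <configuration> element -->
<property>
  <!-- Enable the Capacity Scheduler -->
  <name>yarn.resourcemanager.scheduler.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
</property>
<property>
  <!-- Enable the scheduling monitor, which is what drives preemption -->
  <name>yarn.resourcemanager.scheduler.monitor.enable</name>
  <value>true</value>
</property>
```

The comments inside the fragment are only annotations and can be dropped.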
Appendix
capacity-scheduler.xml: this file is created on the YARN RM node in /etc/hadoop/conf, owned by root with permissions 0755.
It governs the configuration of the queues on the RM.
<configuration>
<property>
<name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
<value>1.0</value>
</property>
<property>
<name>yarn.scheduler.capacity.maximum-applications</name>
<value>2000</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.acl_administer_queue</name>
<value>*</value>
</property>
<property>
<name>yarn.scheduler.capacity.resource-calculator</name>
<value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.queues</name>
<value>research,support,production</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.research.capacity</name>
<value>30</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.research.maximum-capacity</name>
<value>30</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.research.state</name>
<value>RUNNING</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.research.user-limit-factor</name>
<value>1</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.research.minimum-user-limit-percent</name>
<value>80</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.research.analytics.capacity</name>
<value>40</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.research.analytics.maximum-capacity</name>
<value>60</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.research.analytics.state</name>
<value>RUNNING</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.research.analytics.user-limit-factor</name>
<value>1</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.research.analytics.minimum-user-limit-percent</name>
<value>20</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.research.queues</name>
<value>analytics,data</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.research.data.capacity</name>
<value>60</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.research.data.maximum-capacity</name>
<value>60</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.research.data.state</name>
<value>RUNNING</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.research.data.user-limit-factor</name>
<value>1</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.research.data.minimum-user-limit-percent</name>
<value>20</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.production.capacity</name>
<value>30</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.production.maximum-capacity</name>
<value>100</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.production.state</name>
<value>RUNNING</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.production.user-limit-factor</name>
<value>1</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.production.minimum-user-limit-percent</name>
<value>20</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.support.capacity</name>
<value>40</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.support.maximum-capacity</name>
<value>50</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.support.state</name>
<value>RUNNING</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.support.user-limit-factor</name>
<value>1</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.support.minimum-user-limit-percent</name>
<value>20</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.support.queues</name>
<value>training,services</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.support.training.capacity</name>
<value>60</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.support.training.maximum-capacity</name>
<value>70</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.support.training.state</name>
<value>RUNNING</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.support.training.user-limit-factor</name>
<value>1</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.support.training.minimum-user-limit-percent</name>
<value>20</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.support.services.capacity</name>
<value>40</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.support.services.maximum-capacity</name>
<value>40</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.support.services.state</name>
<value>RUNNING</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.support.services.user-limit-factor</name>
<value>1</value>
</property>
<property>
<name>yarn.scheduler.capacity.root.support.services.minimum-user-limit-percent</name>
<value>20</value>
</property>
<property>
<name>yarn.scheduler.capacity.queue-mappings</name>
<value>u:sqoop:production,u:hdfs:production,g:hadoop:services,u:%user:%user,g:analytics:analytics,g:data:data,g:training:training,g:services:services</value>
</property>
</configuration>
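A note on reading these numbers: as I understand the Capacity Scheduler, each capacity value is a percentage of the parent queue, not of the whole cluster, and the capacities of sibling queues must add up to 100 at every level. The worked arithmetic below derives each leaf queue's share of the cluster from the values in this file. The symbol C_abs is my own notation for "absolute capacity", not a property name.

```latex
\[
\begin{aligned}
C_{\text{abs}}(\text{analytics})  &= 0.30 \times 0.40 = 0.12 \\
C_{\text{abs}}(\text{data})       &= 0.30 \times 0.60 = 0.18 \\
C_{\text{abs}}(\text{training})   &= 0.40 \times 0.60 = 0.24 \\
C_{\text{abs}}(\text{services})   &= 0.40 \times 0.40 = 0.16 \\
C_{\text{abs}}(\text{production}) &= 0.30 \\
\text{total} &= 0.12 + 0.18 + 0.24 + 0.16 + 0.30 = 1.00
\end{aligned}
\]
```

So, for example, the analytics queue is guaranteed roughly 12% of the cluster, even though its configured capacity is 40.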