In this blog post I will discuss how to diagnose container-launch failures when running your YARN jobs / applications.
Consider the error trace below, taken from an MR job on the command line:
18/04/19 07:48:51 INFO mapreduce.Job: Task Id : attempt_1524137551614_0001_m_000002_2, Status : FAILED
Exception from container-launch.
Container id: container_1524137551614_0001_01_000015
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:585)
at org.apache.hadoop.util.Shell.run(Shell.java:482)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:776)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Container exited with a non-zero exit code 1
Assuming you have turned on YARN log aggregation (yarn.log-aggregation-enable set to true in yarn-site.xml), go to the directory specified by yarn.nodemanager.remote-app-log-dir and look for your application's directory and the node name.
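For example, with the stock defaults the aggregated logs land under /tmp/logs (the default value of yarn.nodemanager.remote-app-log-dir) in a per-user, per-application layout. A minimal sketch assuming those defaults; <user> is a placeholder for the user who submitted the job, and the application ID is taken from the trace above:

# Default layout: <remote-app-log-dir>/<user>/logs/<application-id>,
# with one aggregated file per NodeManager host. Adjust the paths
# if your cluster overrides the defaults.
hdfs dfs -ls /tmp/logs/<user>/logs/application_1524137551614_0001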
Open the job log file using
hdfs dfs -cat <file> | less
and look for the error
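Alternatively, when log aggregation is enabled you can skip locating the file by hand and let the yarn CLI fetch the aggregated logs for you. The application ID below is derived from the container ID in the trace above (container_1524137551614_0001_01_000015 belongs to application_1524137551614_0001):

# Fetch all aggregated logs for the application and page through them.
yarn logs -applicationId application_1524137551614_0001 | less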
In my case the error was quite trivial:
Error occurred during initialization of VM
Too small initial heap
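This message comes from the JVM itself, not from YARN: the heap requested for the task did not fit within the configured limits, so the task JVM refused to start and the container exited with code 1. You can usually reproduce the same message locally with a deliberately tiny maximum heap (the exact wording may vary across JVM vendors and versions):

# A 1 MB max heap is below what HotSpot needs for its initial heap.
java -Xmx1m -version
# Error occurred during initialization of VM
# Too small initial heap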
Once this was fixed, the job ran fine. The properties that were tuned were:
- mapreduce.map.memory.mb
- mapreduce.reduce.memory.mb
- mapreduce.map.java.opts
- mapreduce.reduce.java.opts
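As a rough illustration only, since the right values depend on your workload and cluster: if the job's driver uses ToolRunner / GenericOptionsParser, these properties can be overridden per job on the command line (the jar, class, input, and output below are placeholders). A common rule of thumb is to keep the -Xmx in java.opts around 75-80% of the matching memory.mb, so the heap plus JVM overhead fits inside the container that YARN allocates:

# Hypothetical job invocation; tune the numbers for your cluster.
hadoop jar my-job.jar com.example.MyJob \
  -D mapreduce.map.memory.mb=2048 \
  -D mapreduce.map.java.opts=-Xmx1638m \
  -D mapreduce.reduce.memory.mb=4096 \
  -D mapreduce.reduce.java.opts=-Xmx3276m \
  <input> <output>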