Friday, May 4, 2018

HDPV2 - Oozie Setup

In this post I am going to discuss setting up Oozie in your cluster.

I am going to use one of the edge nodes for this, not any of the nodes that are part of my cluster.

I have already set up Sqoop on this node, and I already have ojdbc8.jar for Oracle.

Setup
Step 1 - Create user oozie [As root]
1. Create User Oozie (this should also be done on both NameNodes)
groupadd -g 1000 hadoop
useradd -u 1013 -g hadoop oozie
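
Running groupadd or useradd a second time fails if the group or user is already there. Below is a minimal sketch of an idempotent variant, assuming local accounts in /etc/group and /etc/passwd. The helper names ensure_group and ensure_user and the RUN dry-run variable are my own illustration, not part of any tool.

```shell
# Hypothetical helpers; set RUN=echo to print the commands instead of running them.
RUN=${RUN:-}

# Create the group only if /etc/group has no entry for it.
ensure_group() {
    # $1 = group name, $2 = gid
    if ! grep -q "^$1:" /etc/group; then
        $RUN groupadd -g "$2" "$1"
    fi
}

# Create the user only if `id` cannot resolve it.
ensure_user() {
    # $1 = user name, $2 = uid, $3 = primary group
    if ! id -u "$1" >/dev/null 2>&1; then
        $RUN useradd -u "$2" -g "$3" "$1"
    fi
}

# Usage (as root), with the ids used in this post:
#   ensure_group hadoop 1000
#   ensure_user oozie 1013 hadoop
```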


2. Setup Java [As root]
rpm -Uvh /tmp/jdk-8u152-linux-x64.rpm

3. Download Oozie and Maven into /tmp [As root]
cd /tmp
curl http://www-eu.apache.org/dist/oozie/5.0.0/oozie-5.0.0.tar.gz -o oozie-5.0.0.tar.gz
curl http://mirror.olnevhost.net/pub/apache/maven/maven-3/3.0.5/binaries/apache-maven-3.0.5-bin.tar.gz -o apache-maven-3.0.5-bin.tar.gz
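
A mirror can return an HTML error page instead of the archive, and curl will happily save it under the .tar.gz name. Below is a small sketch that lists each archive without extracting it, so a bad download shows up before the unzip steps. is_valid_tarball is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds only if the file is a readable gzip-compressed tar archive.
is_valid_tarball() {
    # $1 = path to a .tar.gz file
    tar -tzf "$1" >/dev/null 2>&1
}

# Usage, from the directory holding the downloads:
#   is_valid_tarball oozie-5.0.0.tar.gz || echo "oozie download is corrupt" >&2
#   is_valid_tarball apache-maven-3.0.5-bin.tar.gz || echo "maven download is corrupt" >&2
```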


4a. Unzip Maven
cd /usr/local
tar -xf /tmp/apache-maven-3.0.5-bin.tar.gz
ln -s /usr/local/apache-maven-3.0.5 /usr/local/apache-maven


4b. Unzip Oozie [As root]

cd /tmp
tar -xvf oozie-5.0.0.tar.gz



Building Oozie with Maven can be tricky, and you will almost certainly run into errors that make little sense at first.

Be patient and resolve the errors one at a time as they appear on the command line.


5. Export Maven Variables (.bashrc and .bash_profile)
    Add the lines below, then log out and log in again
    export M2_HOME=/usr/local/apache-maven
    export M2=$M2_HOME/bin
    export PATH=$M2:$PATH
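
Once you are logged in again, it is worth confirming that the variables took effect before starting the build. Below is a minimal sketch. path_contains is a hypothetical helper, not a Maven command.

```shell
# Hypothetical helper: succeeds if directory $1 is one of the colon-separated entries of PATH.
path_contains() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage after logging in again:
#   path_contains "$M2_HOME/bin" && mvn -version
```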


Make the changes below in pom.xml (in the untarred oozie directory).
If you are using Java 8, add the profile below to the profiles section of your pom.xml
        <profile>
            <id>disable-doclint</id>
            <activation>
                <jdk>[1.8,)</jdk>
            </activation>
            <build>
                <plugins>
                    <plugin>
                        <groupId>org.apache.maven.plugins</groupId>
                        <artifactId>maven-javadoc-plugin</artifactId>
                        <configuration>
                            <additionalparam>-Xdoclint:none</additionalparam>
                        </configuration>
                    </plugin>
                </plugins>
            </build>
        </profile>


6. Build oozie

cd /tmp/oozie-5.0.0

bin/mkdistro.sh -DskipTests  -Puber -Dhadoop.version=2.7.5
This command needs an internet connection because Maven downloads the dependencies from remote repositories,
so run it on a machine with internet connectivity.


Oozie distro created, DATE[2018.05.02-09:31:53GMT] VC-REV[unavailable], available at [/tmp/oozie-5.0.0/distro/target]
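
The build prints a lot of output and can fail deep inside one of the Maven modules, so it helps to keep the full log on disk and only look at the tail when something breaks. Below is a sketch. run_logged is a hypothetical wrapper name and the log path in the usage line is just an example.

```shell
# Hypothetical wrapper: run a command, capture all output in a log file,
# and print the last lines of the log if the command fails.
run_logged() {
    # $1 = log file, remaining arguments = command to run
    rl_log=$1
    shift
    if "$@" >"$rl_log" 2>&1; then
        echo "OK, full log in $rl_log"
    else
        rl_status=$?
        tail -n 40 "$rl_log" >&2
        return "$rl_status"
    fi
}

# Usage from /tmp/oozie-5.0.0:
#   run_logged /tmp/oozie-build.log bin/mkdistro.sh -DskipTests -Puber -Dhadoop.version=2.7.5
```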



7. Make the changes required for configuration
 cd /tmp/oozie-5.0.0/distro/target/oozie-5.0.0-distro/oozie-5.0.0
 mkdir libext


 cd /tmp/oozie-5.0.0/sharelib
 find -name '*.jar' -exec cp  -f '{}'  /tmp/oozie-5.0.0/distro/target/oozie-5.0.0-distro/oozie-5.0.0/libext \;

  cd /tmp/oozie-5.0.0/distro/target/oozie-5.0.0-distro/oozie-5.0.0/
 find -name '*.jar' -exec cp -f '{}'  /tmp/oozie-5.0.0/distro/target/oozie-5.0.0-distro/oozie-5.0.0/libext \;

 cd libext
 curl https://ext4all.com/ext/download/ext-2.2.zip -o ext-2.2.zip

 #cp /usr/local/sqoop/lib/ojdbc8.jar .
 zip ojdbc.zip /usr/local/sqoop/lib/ojdbc8.jar

 mkdir ../lib
 cd ../lib
 cp /usr/local/sqoop/lib/ojdbc8.jar .
 cp -n ../libext/* .

 cd ..


bin/oozie-setup.sh

INFO: Oozie is ready to be started
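
The find commands in step 7 gather every jar into one flat libext directory. Below is a sketch of the same operation as a reusable function that also reports how many jars ended up in the target, which is a quick sanity check before running oozie-setup.sh. collect_jars is a hypothetical name.

```shell
# Hypothetical helper: copy every *.jar found under $1 into the flat directory $2,
# overwriting same-named files, then print the number of jars now in $2.
# $2 should not be a directory inside $1.
collect_jars() {
    mkdir -p "$2"
    find "$1" -name '*.jar' -type f -exec cp -f '{}' "$2" \;
    find "$2" -name '*.jar' -type f | wc -l | tr -d ' '
}

# Usage with the paths from this post:
#   collect_jars /tmp/oozie-5.0.0/sharelib /tmp/oozie-5.0.0/distro/target/oozie-5.0.0-distro/oozie-5.0.0/libext
```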



8. Add the properties below to hdfs-core-site.xml on the NameNode, SNN and RM.
Then restart the NameNode and ResourceManager using
hadoop-daemon.sh stop namenode
hadoop-daemon.sh start namenode

yarn-daemon.sh stop resourcemanager
yarn-daemon.sh start resourcemanager

<property>
    <name>hadoop.proxyuser.oozie.hosts</name>
    <value>192.168.1.71</value>
</property>
<property>
    <name>hadoop.proxyuser.oozie.groups</name>
    <value>hadoop</value>
</property>
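
A quick grep can confirm that both proxy-user properties are present in the file on each host. Below is a sketch. has_property is a hypothetical helper, and the CONF_FILE variable in the usage lines is a placeholder for wherever your configuration file lives.

```shell
# Hypothetical helper: succeeds if the XML file $1 contains a <name> element equal to $2.
has_property() {
    grep -Fq "<name>$2</name>" "$1"
}

# Usage (CONF_FILE is a placeholder; point it at your own configuration file):
#   for p in hadoop.proxyuser.oozie.hosts hadoop.proxyuser.oozie.groups; do
#       has_property "$CONF_FILE" "$p" || echo "missing: $p" >&2
#   done
```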

9. Copy Binaries
cd /usr/local/

cp -R /tmp/oozie-5.0.0/distro/target/oozie-5.0.0-distro/oozie-5.0.0/ .


10. Provide Permissions

chown oozie:hadoop -R oozie*

 
11. Update config files [As oozie user]
Update the oozie-site.xml file (conf/oozie-site.xml in the Oozie directory)
<property>
     <name>oozie.service.JPAService.jdbc.driver</name>
     <value>oracle.jdbc.driver.OracleDriver</value>
</property>

<property>
     <name>oozie.service.JPAService.jdbc.url</name>
     <value>jdbc:oracle:thin:@192.168.1.71:6633:EMPRD</value>
</property>

<property>
     <name>oozie.service.JPAService.jdbc.username</name>
     <value>oozie</value>
</property>

<property>
     <name>oozie.service.JPAService.jdbc.password</name>
     <value>oozie</value>
</property>
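
The JDBC URL above uses the SID form of the Oracle thin driver, jdbc:oracle:thin:@host:port:SID. Below is a small sketch that assembles it from its parts, which makes a wrong port or SID easier to spot. oracle_thin_url is a hypothetical helper.

```shell
# Hypothetical helper: print an Oracle thin-driver JDBC URL in SID form.
oracle_thin_url() {
    # $1 = host, $2 = listener port, $3 = SID
    printf 'jdbc:oracle:thin:@%s:%s:%s\n' "$1" "$2" "$3"
}

# Usage with the values from this post:
#   oracle_thin_url 192.168.1.71 6633 EMPRD
```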


12. Create oozie user in Database (Oracle)

create user oozie identified by oozie default tablespace users temporary tablespace temp;

grant alter any index to oozie;
grant alter any table to oozie;
grant alter database link to oozie;
grant create any index to oozie;
grant create any sequence to oozie;
grant create database link to oozie;
grant create session to oozie;
grant create table to oozie;
grant drop any sequence to oozie;
grant select any dictionary to oozie;
grant drop any table to oozie;
grant create procedure to oozie;
grant create trigger to oozie;

alter user oozie default tablespace users;
alter user oozie quota unlimited on users;


13.  Validate oozie DB Connection [As oozie user]

cd /usr/local/oozie-5.0.0
bin/ooziedb.sh version

  setting CATALINA_OPTS="$CATALINA_OPTS -Xmx1024m"

Oozie DB tool version: 4.2.0

Validate DB Connection
DONE
DB schema does not exist

Error: Oozie DB doesn't exist
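
The error is expected at this stage: the tool reached the database (Validate DB Connection, DONE) but found no schema yet. Below is a sketch that tells the situations apart from captured output. db_state is a hypothetical helper and the matched strings are taken from the output shown above. Treating a validated connection without the missing-schema message as "schema present" is my assumption.

```shell
# Hypothetical helper: classify captured `ooziedb.sh version` output read from stdin.
db_state() {
    ds_out=$(cat)
    case $ds_out in
        *"DB schema does not exist"*) echo "connected, schema missing" ;;
        # Assumption: connection validated and no missing-schema message means the schema exists.
        *"Validate DB Connection"*DONE*) echo "connected, schema present" ;;
        *) echo "connection not validated" ;;
    esac
}

# Usage:
#   bin/ooziedb.sh version 2>&1 | db_state
```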



14. Write the Oozie DB schema to a SQL file

 bin/ooziedb.sh create -sqlfile oozie.sql

15. Create oozie schema in database

bin/ooziedb.sh create -sqlfile oozie.sql -run

Validate DB Connection
DONE
DB schema does not exist
Check OOZIE_SYS table does not exist
DONE
Create SQL schema
DONE
Create OOZIE_SYS table
DONE

Oozie DB has been created for Oozie version '5.0.0'



The SQL commands have been written to: oozie.sql

16. Validate connection using
bin/ooziedb.sh version


17.  Finalize Installation

ln -s /usr/local/oozie-5.0.0/ /usr/local/oozie
chown -R oozie:hadoop /usr/local/oozie*


18. Start Oozie
su - oozie
cd /usr/local/oozie
bin/oozied.sh start


19. Validate Admin Interface
bin/oozie admin -oozie http://localhost:11000/oozie -status

Status should be NORMAL.
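
The server can take a little while to come up after oozied.sh start, so the first status call may fail. Below is a sketch of a retry loop. wait_for_normal is a hypothetical helper, and it only assumes that the status command prints a line containing NORMAL once Oozie is ready.

```shell
# Hypothetical helper: re-run a status command until its output contains NORMAL.
wait_for_normal() {
    # $1 = max attempts, $2 = seconds between attempts, remaining arguments = status command
    wn_tries=$1
    wn_delay=$2
    shift 2
    wn_i=1
    while [ "$wn_i" -le "$wn_tries" ]; do
        if "$@" 2>/dev/null | grep -q NORMAL; then
            echo "Oozie is up (attempt $wn_i)"
            return 0
        fi
        sleep "$wn_delay"
        wn_i=$((wn_i + 1))
    done
    echo "Oozie did not report NORMAL after $wn_tries attempts" >&2
    return 1
}

# Usage:
#   wait_for_normal 12 5 bin/oozie admin -oozie http://localhost:11000/oozie -status
```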


