Installing and Running Apache NiFi on your HDP Cluster

[Image: NiFi Workspace]
Hey everyone,

I learned today about a cool ETL/data pipeline/make-your-life-easier tool that was recently released by the NSA (not kidding) as a way to manage the flow of data in and out of a system: Apache NiFi. To me, that functionality seems to match PERFECTLY with what people like to do with Hadoop. This guide will just set up NiFi, not do anything with it (that’ll come later!).

Things you’ll need:

  • Maven 3.1 or newer

And to use with Hadoop, obviously you’ll need:

  • HDP > 2.1
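If you're not sure which Maven you have, here's a rough sketch of a check. The helper name version_at_least is made up for this post, and it assumes your sort supports -V (GNU coreutils does) and that both versions are written with three parts, like 3.1.0.

```shell
# version_at_least HAVE WANT: succeeds when version HAVE is >= WANT.
# Hypothetical helper; relies on `sort -V` (version sort) from GNU coreutils.
version_at_least() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Only run the check when mvn is actually on the PATH.
if command -v mvn >/dev/null 2>&1; then
  # The first line of `mvn -version` looks like "Apache Maven 3.3.9 (...)".
  maven_version=$(mvn -version 2>/dev/null | awk '/^Apache Maven/ { print $3; exit }')
  if version_at_least "$maven_version" 3.1.0; then
    echo "Maven $maven_version is new enough"
  else
    echo "Maven $maven_version is too old, need 3.1 or newer"
  fi
else
  echo "mvn not found on PATH"
fi
```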

You don’t even need root access!


Here’s how to get it running on your HDP 2.3 cluster:

First we need to actually get the source code:


wget https://github.com/apache/nifi/archive/master.zip

unzip master.zip

cd nifi-master

export MAVEN_OPTS="-Xms1024m -Xmx3076m -XX:MaxPermSize=256m"

mvn -T C2.0 clean install

(Takes about 8 minutes to run all the tests)

After that’s done, unpack the build and open the config file:

cd nifi-assembly/target

tar -zxvf nifi-0.3.1-SNAPSHOT-bin.tar.gz

cd nifi-0.3.1-SNAPSHOT
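Since we built from master, the version in the tarball name may not be 0.3.1-SNAPSHOT by the time you try this. Here's a small sketch that finds whatever tarball the build produced instead of hard-coding the name. The function find_nifi_tarball is just something I made up for illustration, and it assumes you run it from the nifi-master directory.

```shell
# find_nifi_tarball DIR: print the first nifi-*-bin.tar.gz found in DIR, or fail.
# Hypothetical helper, not part of NiFi.
find_nifi_tarball() {
  for candidate in "$1"/nifi-*-bin.tar.gz; do
    if [ -f "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# Run from the nifi-master checkout after the build; prints a hint otherwise.
if tarball=$(find_nifi_tarball nifi-assembly/target); then
  tar -zxf "$tarball" -C nifi-assembly/target
  # Assumption: the archive unpacks to its own name minus "-bin.tar.gz".
  echo "Unpacked into ${tarball%-bin.tar.gz}"
else
  echo "No nifi-*-bin.tar.gz under nifi-assembly/target (has the build finished?)"
fi
```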

vi conf/nifi.properties

On line 106 (:106 in vim), set the web port:

nifi.web.http.port=9000
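If you'd rather not hunt for the line in vim (the line number can move between versions), something like this should make the same change from the shell. It's only a sketch: set_nifi_http_port is a name I invented, and it assumes the property already exists in the file on a line of its own.

```shell
# set_nifi_http_port FILE PORT: rewrite the nifi.web.http.port line in FILE.
# Hypothetical helper. Keeps a backup at FILE.bak; `-i.bak` works with GNU and BSD sed.
set_nifi_http_port() {
  sed -i.bak "s/^nifi\.web\.http\.port=.*/nifi.web.http.port=$2/" "$1"
}

# Run from the unpacked NiFi directory; skipped if the file is not there.
if [ -f conf/nifi.properties ]; then
  set_nifi_http_port conf/nifi.properties 9000
  # Show the resulting line so you can confirm the change.
  grep '^nifi\.web\.http\.port=' conf/nifi.properties
fi
```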

Then, if you’re root, install NiFi as a service and start it:

bin/nifi.sh install

service nifi start

or, if you’re not root, just start it directly:

bin/nifi.sh start

Then navigate to http://localhost:9000/nifi
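NiFi can take a little while to come up after the start command returns, so the page might not load right away. Here's a rough sketch of a loop that polls the URL until it answers. The function wait_for_url and the 120 second default are my own choices, and it assumes curl is installed.

```shell
# wait_for_url URL [TIMEOUT_SECONDS]: poll URL until it answers or the timeout expires.
# Hypothetical helper; the 120 second default is an arbitrary choice.
wait_for_url() {
  url=$1
  timeout=${2:-120}
  elapsed=0
  while [ "$elapsed" -lt "$timeout" ]; do
    # -s: quiet, -f: treat HTTP errors (4xx/5xx) as failure, -o: discard the body
    if curl -sf -o /dev/null --max-time 5 "$url"; then
      return 0
    fi
    sleep 2
    elapsed=$((elapsed + 2))
  done
  return 1
}

# Example usage (uncomment to run against your own NiFi instance):
# wait_for_url http://localhost:9000/nifi 120 && echo "NiFi is up"
```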

There you have it! With any luck, you now have NiFi installed!
