Getting Started with Hortonworks Data Platform 2.3

For my first post, I’m going to walk through setting up Hortonworks Data Platform (HDP) 2.3. HDP is very nice because it is free to use on a cluster of any size, from curious developers with virtual environments to Fortune 50 companies with 100+ node clusters. The only cost comes if you want paid support for Hortonworks’ software. To get your very own Hadoop cluster going, read on!

So there are a few reasons for this post:

  1. I’m familiar with developing on and ‘administering’ an HDP 2.2 VM and want to see the differences.
  2. I want to do something I know I can accomplish on my first post.
  3. Finally, I’ve found that getting started with a new technology is often the most difficult part. Hopefully this will help someone stuck on their first steps into the world of Big Data.

So here we go:

  1. Get yourself a hot new copy of VirtualBox at Oracle’s website: https://www.virtualbox.org/wiki/Downloads (any VM hosting environment you please will do, but this guide uses VirtualBox)
    • I’m using VirtualBox 5.0 on Windows 8.1 at the time of this post.
    • Install it using the wizard
  2. Make your way over to Hortonworks’ website and download the .ova file for their VirtualBox environment: http://hortonworks.com/hdp/downloads/
    • I got HDP 2.3, again for Windows 8.1
    • It’s a large file (~7GB), so make sure you have enough space in your Downloads area
  3. Once that’s downloaded, start up VirtualBox. Navigate your way through any “Welcome to VirtualBox” splash screens until you find yourself looking at something like this:
    [Screenshot: VirtualBox welcome screen]

    • To import your HDP 2.3 image, do the following:
      1. File > Import Appliance…
      2. Navigate to where your HDP 2.3 .ova file was saved and select it
      3. Select Next > Import
    • It might take some time to import it; that’s normal, don’t worry.
    • Once it’s imported, your VirtualBox window should look something like this:
      [Screenshot: VirtualBox Manager after adding the VM]
    • Double-click the VM’s icon in the VirtualBox tray; it should bring up a command prompt window and boot up the machine. After it boots, the window should look like this:
      [Screenshot: VM console window after boot]
    • Do as it says and open a browser and navigate to http://127.0.0.1:8888/
  4. Log in to your cluster’s Ambari interface (http://127.0.0.1:8080, Username: admin; Password: admin) to begin administering your cluster.
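
If you’d rather skip the clicking, the import and boot steps above can probably be scripted with VirtualBox’s VBoxManage command-line tool. Here’s a rough Python sketch of that idea. The file name and VM name are made up for illustration (use whatever your download is actually called), and it assumes VBoxManage is on your PATH. By default it only prints the commands instead of running them.

```python
import subprocess


def build_import_commands(ova_path, vm_name):
    """Return VBoxManage commands that roughly mirror the GUI steps."""
    return [
        # Roughly File > Import Appliance..., naming the imported VM
        ["VBoxManage", "import", str(ova_path), "--vsys", "0", "--vmname", vm_name],
        # Roughly double-clicking the VM to boot it in a window
        ["VBoxManage", "startvm", vm_name, "--type", "gui"],
    ]


def run_commands(commands, dry_run=True):
    """Print each command; only execute it when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical file and VM names, not the real download name
    commands = build_import_commands("HDP_2.3_sandbox.ova", "hdp-2.3-sandbox")
    run_commands(commands, dry_run=True)
```

Flip dry_run to False once the printed commands look right for your machine. The import will still take a while, same as in the GUI.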

Congratulations, you now have your very own Hadoop cluster set up in a VirtualBox environment, courtesy of Hortonworks and their robust Hortonworks Data Platform.
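
If the browser pages don’t load right away, it can help to check whether the two ports from this guide (8888 for the welcome page, 8080 for Ambari) are accepting connections yet. Below is a small Python sketch that does just that. The function names are my own, and it only tests that something is listening on the port, not that the services behind it are healthy.

```python
import socket

# Ports mentioned in this guide
SANDBOX_PORTS = {"welcome page": 8888, "Ambari": 8080}


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts
        return False


def check_sandbox(host="127.0.0.1"):
    """Map each service name to whether its port accepts connections."""
    return {name: is_port_open(host, port) for name, port in SANDBOX_PORTS.items()}


if __name__ == "__main__":
    for name, ok in check_sandbox().items():
        print(f"{name}: {'reachable' if ok else 'not reachable yet'}")
```

Services inside the VM can take a few minutes to come up after boot, so a “not reachable yet” right after starting the machine isn’t necessarily a problem.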

Please leave a comment if you have any questions or concerns!

-James
