Ambari Views: Introduction

If you’ve spent any time with a Hortonworks Data Platform cluster, you’re familiar with Ambari. It’s one of the finest open source cluster management tools out there, letting you easily launch a cluster, add or remove nodes, change configurations, and add services. Using Ambari takes a lot of the guesswork out of managing a Hadoop cluster, and I absolutely love it.

The one downside of Ambari is that it can be tedious to add functionality to the core client. For that reason, the smart people building the tool at Apache decided to add something called an Ambari View. An Ambari View is a way to extend the functionality of Ambari without going down the rabbit hole of modifying Ambari’s source code. Views are essentially plug-and-play extensions that only require a restart of the Ambari Server (not your whole cluster) to start working.
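To give you a taste of how little ceremony is involved, deploying a packaged View usually amounts to something like this (a rough sketch assuming a default Ambari install; the JAR name is a placeholder and the views path can vary by version):

    # copy your packaged View into Ambari's views directory, then restart the Ambari Server
    cp my-view.jar /var/lib/ambari-server/resources/views/
    ambari-server restart

That’s it: no rebuilding Ambari, and no touching the cluster services themselves.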

In the following blog post, I’ll discuss getting your first View off the ground and share several tips on actually using Views.

Next Post: Apache Ambari: Hello World!

Guide to Apache Falcon #4: Process Entity Definitions

This series is designed to be the ultimate guide to Apache Falcon, a data pipeline and governance engine for Hadoop. Falcon excels at giving you control over workflow scheduling, data retention and replication, and data lineage. This guide will (hopefully) excel at helping you understand and use Falcon effectively. Continue reading

Guide to Apache Falcon #3: Feed Entity Definitions

This series is designed to be the ultimate guide to Apache Falcon, a data pipeline and governance engine for Hadoop. Falcon excels at giving you control over workflow scheduling, data retention and replication, and data lineage. This guide will (hopefully) excel at helping you understand and use Falcon effectively. Continue reading

Guide to Apache Falcon #1: Introducing Falcon

This series is designed to be the ultimate guide to Apache Falcon, a data pipeline and governance engine for Hadoop. Falcon excels at giving you control over workflow scheduling, data retention and replication, and data lineage. This guide will (hopefully) excel at helping you understand and use Falcon effectively. Continue reading

Moving Data Within or Between Hadoop Clusters with DistCP

Copying small chunks of data around Hadoop is relatively trivial, but moving larger datasets can be time-consuming or needlessly complicated. Sometimes you even want to move data between Hadoop clusters (if you have two or more). In this article, I’ll show you a great way to handle all of these scenarios: DistCP. A quick taste of the command is just below. Continue reading
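The basic shape of a DistCP run looks something like this (the paths and the nn1/nn2 NameNode hosts are placeholders for your own environment):

    # within a single cluster: a MapReduce-powered bulk copy
    hadoop distcp /data/source /data/target

    # between two clusters: point at each cluster's NameNode
    hadoop distcp hdfs://nn1:8020/data/source hdfs://nn2:8020/data/target

The full post walks through when and how to use it.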

Getting Started with Hortonworks Data Platform 2.3

For my first post, I’m going to walk through setting up Hortonworks Data Platform (HDP) 2.3. HDP is very nice because it is free to use at any scale, from curious developers with virtual environments to Fortune 50 companies with 100+ node clusters; the cost only comes in if you want paid support from Hortonworks. To get your very own Hadoop cluster going, read on!
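As a preview of where the post is headed, the broad strokes on a CentOS/RHEL node look roughly like this (the repo URL is a placeholder; grab the right one for your OS and Ambari version from the Hortonworks docs):

    # install and start the Ambari Server, which drives the rest of the cluster install
    wget -nv <ambari-repo-url> -O /etc/yum.repos.d/ambari.repo
    yum install -y ambari-server
    ambari-server setup -s     # -s accepts the setup defaults
    ambari-server start        # then open the Ambari web UI (port 8080 by default) and run the install wizard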
Continue reading

Installing and Running Apache NiFi on your HDP Cluster

[Screenshot: the NiFi workspace]
Hey everyone,

I learned today about a cool ETL / data pipeline / make-your-life-easier tool that was recently released by the NSA (not kidding) as a way to manage the flow of data in and out of systems: Apache NiFi. To me, that functionality seems to match PERFECTLY with what people like to do with Hadoop. This guide will just set up NiFi, not do anything with it (that’ll come later!).
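For the impatient, the bare-bones install looks something like this (the version number is a placeholder, and the port is NiFi’s default, configurable in conf/nifi.properties):

    # unpack a NiFi binary release and start it
    tar -xzf nifi-x.y.z-bin.tar.gz
    cd nifi-x.y.z
    bin/nifi.sh start      # web UI comes up at http://<host>:8080/nifi by default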

Continue reading