If you’ve been running Spark applications for a few months, you might start to notice some odd behavior with the history server (default port 18080). Specifically, it’ll take forever to load the page, show links to applications that don’t exist or even crash. Three parameters take care of this once and for all. Continue reading
If you’ve spent any time with a Hortonworks Data Platform cluster, you’re familiar with Ambari. It’s one of the finest, open source cluster management tools that allows you to easily first launch a cluster, add or remove nodes, change configurations and add services to your cluster. Using Ambari takes a lot of the guesswork out of managing a hadoop cluster and I absolutely love it.
The one downside of Ambari is that it can be tedious to add functionality to the core client. For that reason, the smart people building the tool in Apache decided to add something called an Ambari View. An Ambari View is a way to extend the functionality of Ambari without going down the rabbit hole of modifying Ambari’s source code. Views are essentially plug-and-play tools that only require restarting your cluster to work.
In the following blog post, I’ll discuss getting your View off the ground and show you several tips about actually using them.
Next Post: Apache Ambari: Hello World!
With your data now in HDFS in an “analytic-ready” format (it’s all cleaned and in common formats), you can now put a Hive table on top of it.
Apache Hive is a RDBMS-like layer for data in HDFS that allows you to run batch or ad-hoc queries in a SQL-like language. This post will go over what you need to know about Apache Hive in preparation for the HDPCD Exam. Continue reading
I learned today about a cool ETL/data pipeline/make your life easier tool that was recently released by the NSA (not kidding) as a way to manage the flow of data in and out of system: Apache NiFi. To me, that functionality seems to match PERFECTLY with what people like to do with Hadoop. This guide will just set up NiFi, not do anything with it (that’ll come later!)