Guide to Apache Oozie #2: Understanding Workflows

This series is designed to be a “get off the ground” guide to Apache Oozie, a job scheduling framework for Hadoop. Oozie offers multi-action workflow scheduling, the ability to run actions in parallel, and great APIs. This guide is designed to help you answer your Oozie technical questions.

Guide to Apache Oozie #1: Introducing Oozie

This series is designed to be a “get off the ground” guide to Apache Oozie, a job scheduling framework for Hadoop. Oozie offers multi-action workflow scheduling, the ability to run actions in parallel, and great APIs. This guide is designed to help you answer your Oozie technical questions.

Guide to Apache Falcon #7: Updating Jobs

This series is designed to be the ultimate guide on Apache Falcon, a data governance pipeline for Hadoop. Falcon excels at giving you control over workflow scheduling, data retention and replication, and data lineage. This guide will (hopefully) excel at helping you understand and use Falcon effectively.

Guide to Apache Falcon #6: Monitoring Jobs

This series is designed to be the ultimate guide on Apache Falcon, a data governance pipeline for Hadoop. Falcon excels at giving you control over workflow scheduling, data retention and replication, and data lineage. This guide will (hopefully) excel at helping you understand and use Falcon effectively.

How to Run a Jar in Oozie with Java Actions

You probably know how jars work. Jars, short for Java Archives, are zipped-up packages of Java class files, with or without their dependencies included. In most cases, a jar holds just your application code, while its dependencies live elsewhere and are added to the classpath. We’ll cover that topic another day; for now, let’s focus on the task at hand: getting your jar running in Oozie.