How to Sqoop an RDBMS Source Directly to a Hive Table In Any Format

This tutorial will accomplish a few key feats that make ingesting data to Hive far less painless. In this writeup, you will learn not only how to Sqoop a source table directly to a Hive table, but also how to Sqoop a source table in any desired format (ORC, for example) instead of just plain old text.

Continue reading

Apache Crunch Tutorial 9: Reading & Ingesting Orc Files

LITTLE_CRUNCH

This post is the ninth in a hopefully substantive and informative series of posts about Apache Crunch, a framework for enabling Java developers to write Map-Reduce programs more easily for Hadoop.

Continue reading

Benefits of the Orc File Format in Hadoop, And Using it in Hive

logo

As a developer/engineer in the Hadoop and Big Data space, you tend to hear a lot about file formats. All have their own benefits and trade-offs: storage savings, split-ability, compression time, decompression time, and much more. All of these factors play a huge role in what file formats you use for your projects, or as a team or company-wide standard. Continue reading