Welcome to Cascalog for the Impatient, a series of tutorials and Cascalog code examples to get you started. This series is a fork of Cascading for the Impatient.
This set of progressive coding examples starts with a simple file copy and builds up to a MapReduce implementation of the TF-IDF algorithm.
Clone this repository and head over to the Wiki to work through the 6-part tutorial.
Install the following:
- Hadoop (see Apache's instructions on setting up a local node)
- Leiningen build tool for Clojure
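
Once both are installed, a quick sanity check along these lines can confirm they are on your `PATH`. This is only a sketch, assuming a POSIX shell; `check_tool` is a hypothetical helper name introduced here for illustration and is not part of this repository.

```shell
# check_tool is a hypothetical helper, not part of this repository.
# It reports whether a command is available on the PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: missing"
    return 1
  fi
}

# Print each tool's version only if the tool is installed.
if check_tool hadoop; then hadoop version; fi
if check_tool lein; then lein version; fi
```

If either tool is reported as missing, revisit the corresponding install step before starting the tutorial.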
Some basic knowledge of Clojure and of using Leiningen would be helpful.