Today I will describe the setup of a Hadoop/HDFS multi-node cluster on Debian Lenny with a redundant Namenode (using DRBD and Heartbeat), four Datanodes with Tasktrackers, a Backup/Checkpoint Node, and rack awareness.
Hadoop Cluster Setup on Debian Lenny

purposes
This article describes how to set up a Hadoop (version 0.21.0) cluster on Debian Lenny (version 5.x). I will not describe how to use MapReduce.
general
Hadoop is a framework for distributed computing written in Java. The project includes the following subprojects:
- HDFS: A distributed file system
- MapReduce: A framework for distributed processing of large data sets
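
Both subprojects are configured through XML files. As a first taste of what the setup below involves, a minimal core-site.xml that points every daemon and client at the Namenode might look like this (the hostname namenode and port 9000 are placeholders, not values from this cluster):

```xml
<?xml version="1.0"?>
<!-- core-site.xml: tells all Hadoop daemons and clients
     where to find the HDFS Namenode.
     "namenode" and port 9000 are placeholder values. -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://namenode:9000</value>
  </property>
</configuration>
```

The same file must be distributed to every node in the cluster so that Datanodes and Tasktrackers register with the correct Namenode.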