Assignment 11: Raspberry Pi Version
You can work on this assignment in small groups, but write up your submissions individually. Groups will need to take turns with the hardware: no sympathy will be available if you all wait until the last minute. Alternate versions: with Docker containers; with virtual machines.
The actual cluster that we have for this course is nice, but it has one notable limitation: it actually works.
Part of the selling point of the Hadoop technologies is that they are fault tolerant. It's one thing to read that in the docs, but another to see it actually play out. James will be quite cross if we go into the server room to start unplugging nodes to see what happens.
So, I built an alternate cluster to meet this need (and let's be honest, because it was fun)…
Raspberry Pi Cluster
The cluster consists of seven Raspberry Pi 2 model Bs, each with an 8GB microSD card (boot, OS, home) and 32GB USB flash drive (HDFS). That gives a total of 28 cores, and 280GB of storage. Full parts list and configuration details are on GitHub, if you're interested.
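Those headline totals are just arithmetic over the parts list, if you want to check them (per-node figures as given above):

```python
# Per-node hardware, from the description above.
nodes = 7               # master + hadoop1..hadoop6
cores_per_node = 4      # Raspberry Pi 2 model B is quad-core
sd_gb, usb_gb = 8, 32   # microSD (boot/OS/home) + USB flash drive (HDFS)

total_cores = nodes * cores_per_node          # 28 cores
total_storage_gb = nodes * (sd_gb + usb_gb)   # 280 GB, counting SD + USB
print(total_cores, total_storage_gb)  # 28 280
```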
Working with the cluster
The computer you're working on needs to be on the internal network for the cluster. Connect a laptop to one of the LAN (yellow) ethernet ports on the router.
TODO: or connect to the wifi? or use wifi for bridge to outside internet?
That should be all the configuration necessary: plug in the cluster's power bar and turn it on.
The nodes are named master.local (NameNode, ResourceManager, login) and hadoop1.local to hadoop6.local (DataNodes and NodeManagers). [I will adopt the convention of putting a computername> prompt on commands for this assignment: there are too many to keep track of otherwise.]
Once you have the Pis running, you should be able to SSH to the master node:
yourcomputer> ssh pi@master.local
The password for the pi account is on a note in the case.
You need to start the services on the nodes (this starts HDFS and YARN):
master> start-all
You can copy code by SCP to master.local and then start jobs like on the real cluster with spark-submit, or start a Spark shell with pyspark.
Web Frontends
- HDFS NameNode: http://master.local:50070/
- YARN ResourceManager: http://master.local:8088/
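If you would rather script your checks than click around, the NameNode also serves its metrics as JSON at http://master.local:50070/jmx. A minimal sketch of pulling the live-DataNode count out of that response, assuming the standard Hadoop 2 bean name; the sample response below is invented for illustration:

```python
import json

def live_datanodes(jmx_text):
    """Extract NumLiveDataNodes from a NameNode /jmx response, e.g. from
    http://master.local:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystemState"""
    for bean in json.loads(jmx_text)["beans"]:
        if bean.get("name", "").endswith("FSNamesystemState"):
            return bean["NumLiveDataNodes"]
    return None

# Invented sample showing the shape of the response:
sample = ('{"beans": [{"name": '
          '"Hadoop:service=NameNode,name=FSNamesystemState", '
          '"NumLiveDataNodes": 6}]}')
print(live_datanodes(sample))  # 6
```

Polling this while a node reboots is a convenient way to watch it drop out of, and rejoin, the cluster.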
Notes on the Simulation
Rather than actually destroying disks and nodes, the instructions below do a couple of things to simulate failure. These are:
- Restarting a node (sudo reboot) stops all of the Hadoop processes and effectively makes the node disappear from the cluster. It is as if the node failed completely (until we restart the Hadoop processes).
- The script clear-dfs deletes all of the HDFS files on the node. Doing this (while HDFS is not running) is effectively the same as a disk failing and being replaced.
Expanding the cluster
Start by shutting down everything:
master> halt-all
Unplug the power from node 6: we will be starting with a 6 node cluster (5 + master) and seeing how easy it is to add nodes to Hadoop.
Power up the cluster and start everything (minus node 6):
yourcomputer> ssh pi@master.local
master> start-all
[It will fail to start on hadoop6, but otherwise should be okay.] Give it some time: these are not fast machines. After 30 seconds or so, you should be able to visit the HDFS NameNode and YARN ResourceManager (URLs above) and see 5 active DataNodes/NodeManagers.
We are imagining that hadoop6 is a brand new node being added to expand capacity on the cluster. Plug it in and wait for it to boot. You should soon be able to connect to it. Clear any existing HDFS data on it:
yourcomputer> ssh pi@hadoop6.local
hadoop6> clear-dfs
The clear-dfs script deletes all of the HDFS data in /hadoop/* on that computer. Effectively, hadoop6 is now a new node with a newly-formatted hard drive.
To add it to the cluster, on master simply run:
master> start-all
This will fail on nodes 1–5 (because the services are already running), but will bring services up on hadoop6. Check in the web frontends that it has appeared as a DataNode and NodeManager. The overall capacity (storage, memory, cores) should have increased appropriately as well.
Drive Failure
HDFS stores redundant copies (by default three copies of each block). There are two reasons: (1) so the content is available on multiple nodes as input to jobs, and (2) to provide protection against disk failure.
So, what actually happens when a disk fails?
Find a file in the HDFS filesystem: go to the web frontend (http://master.local:50070/) → Utilities → Browse the filesystem. When you click a filename, a popup window gives you some metadata, including the list of nodes where that file is stored.
Pick one of the nodes where that file is stored. We're going to simulate failure of that disk. For the commands below, I'm going to assume it's hadoop5 that we're failing.
SSH to the sacrificial node. Restart it: this will stop the HDFS processes and probably would have been necessary to replace the drive anyway:
yourcomputer> ssh pi@hadoop5.local
hadoop5> sudo reboot
Have a look at the list of DataNodes in the web frontend. (Keep reloading the page to get updated statuses.) What happens?
After HDFS gives up on the “failed” node, have a look at the number of “Under-Replicated Blocks” on the NameNode front page. What happens to the file you were watching? Was the replication maintained? [❓]
Note: the dead-node timeouts have been made very short on the Pi cluster. They are typically much longer so something like a network switch reboot doesn't declare everything behind it dead.
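The re-replication you just watched can be sketched as a toy model. This is only an illustration of the idea, not HDFS's actual block-placement algorithm:

```python
import random

def place_blocks(blocks, nodes, replication=3):
    """Assign each block to `replication` distinct nodes, as HDFS does."""
    return {b: set(random.sample(nodes, replication)) for b in blocks}

def fail_node(placement, dead):
    """Replicas on a dead node are gone; some blocks become under-replicated."""
    return {b: reps - {dead} for b, reps in placement.items()}

def re_replicate(placement, live_nodes, replication=3):
    """The NameNode copies under-replicated blocks onto other live nodes."""
    for reps in placement.values():
        while len(reps) < replication:
            reps.add(random.choice([n for n in live_nodes if n not in reps]))
    return placement

nodes = ["hadoop%d" % i for i in range(1, 7)]
placement = place_blocks(range(20), nodes)

placement = fail_node(placement, "hadoop5")
under = sum(1 for reps in placement.values() if len(reps) < 3)
print("under-replicated after failure:", under)

live = [n for n in nodes if n != "hadoop5"]
placement = re_replicate(placement, live)
print("after re-replication:",
      sum(1 for reps in placement.values() if len(reps) < 3))  # 0
```

With three replicas and a single failed node, no block is lost: every block still has at least two live copies, and the cluster quietly works its way back to full replication.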
Replacement Drive
Now we will wipe the HDFS data from the disk, as we would see on a newly-provisioned drive:
yourcomputer> ssh pi@hadoop5.local
hadoop5> clear-dfs
As before, get things started from the master node:
yourcomputer> ssh pi@master.local
master> start-all
In the web frontend, you should see the node with the “replaced” drive back in the cluster. It should have more free space than the other nodes. (If that really bothers you, you can run hdfs balancer -threshold 2 to rebalance.)
Aside: You might have noticed that the “add a new node” and “replace a drive” procedures are quite similar. To me that makes it feel like what's in a Hadoop cluster is very ephemeral: the cluster consists of whatever resources happen to be there just at the moment. The YARN/HDFS machinery takes care of the rest.
Computation Failure
Now we will simulate failure of a node during a job's execution.
Make sure the cluster is up and running. Start a job that will take long enough to give you some time: this can even be the pyspark shell with:
sc.range(1000000000, numSlices=100).sum()
Or you can start a Scala Spark shell with spark-shell and try a calculation there:
sc.range(0, 1000000000, 1, 100).reduce((a, b) => a+b)
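Both expressions compute the same value, and you can predict what the job should eventually print with the closed form for a sum of consecutive integers (0 + 1 + ⋯ + (n−1) = n(n−1)/2), checked here against an actual sum on a small range first:

```python
def range_sum(n):
    """Sum of 0 + 1 + ... + (n-1), via the closed form n*(n-1)/2."""
    return n * (n - 1) // 2

# Sanity-check the formula on a range small enough to sum directly:
assert range_sum(1000) == sum(range(1000))

# The value the Spark job should report:
print(range_sum(1000000000))  # 499999999500000000
```

Knowing the expected answer in advance makes it easy to confirm that the job still produced a correct result after you kill a node out from under it.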
Start the job, and once the executors start getting something done, look at the YARN frontend (http://master.local:8088/). Select the running application → ApplicationMaster.
Find a node that's doing some work (i.e. one listed in the “Executors” tab with tasks running). Log into that node and restart it: this will rudely stop the NodeManager and leave any work unfinished. If node 3 is the one being killed:
yourcomputer> ssh pi@hadoop3.local
hadoop3> sudo reboot
Keep watching the attempt page in the frontend (reloading as necessary), and the output of your application in the terminal. How does YARN respond to the node failing? [❓]
Other Things To Try
There are many things our little cluster can do (if slowly). These aren't required for the assignment, but may be interesting.
Pick a (small) file and decrease its replication factor to one:
master> hdfs dfs -setrep -w 1 /user/pi/output/part-r-00000
What happens if the drive it's on “fails”? Or if you bring up the cluster without that node?
More experiments
If you'd like to experiment more, feel free. I'll just ask that if you play with any config files, you keep backups and restore them when you're done. Please also leave the /user/pi/wordcount and /user/pi/euler-input HDFS directories in place for the next group.
Shutting Down
Please shut the system down gracefully when you're done:
master> stop-all # somewhat optional, but nice
master> halt-all
After a few seconds for the machines to shut down, you can power off/unplug.
Questions
In a text file answers.txt, answer these questions:
- What happened to the HDFS file when one of the nodes it was stored on failed?
- How did YARN/MapReduce behave when one of the compute nodes disappeared?
- Were there more things you'd like to try with this cluster, or anything you did try that you think should have been in this assignment?
Submitting
Submit your answers.txt to the A11 activity in CourSys.