Assignment 3: Blockchains with Rust

Introduction

For this assignment, you will be building functionality to create a simple blockchain system. Our blockchain will include a proof-of-work, so we will have to do some computation to create valid blocks.

I have provided a code skeleton with some structure and hints.

Blockchain Basics

A single block in a blockchain is fundamentally a data structure that contains some information (more on what information later). That data is hashed with a cryptographic hash function. As long as the hash function matches the contents, we will say that it's a valid hash (but we'll refine the idea under “Proof-Of-Work” below).

Blocks become a blockchain when they are joined together. This happens because one of the pieces of data in each block is the hash value from the previous block.

That means that if you have the block \(n\) from the chain and believe that its hash is valid, then you can also verify block \(n-1\). Block \(n-1\) must hash to the “previous hash” from block \(n\). Similarly, you can verify blocks \(n-2\), \(n-3\)….

As a result, we need a very small amount of information (one block) to be able to certify all steps in the chain since the beginning of time.

Proof-Of-Work Basics

If just verifying a sequence of blocks is all you want, then we're done. The proof-of-work forces certain amount of computation to be required before someone can create a “valid” block.

The “proof” part of the proof-of-work requires that we have some value that is difficult to produce/find, but easy to verify once it is found. Any calculation with that property will do, but we will use the hash values we already need in the blockchain.

We will add a proof value to each block which can be any integer. We will select a difficulty, \(d\) and insist that the last \(d\) bits of the hash value (including the proof in the data that is hashed) be zero. A block can only be considered valid if its hash ends with \(d\) zero bits.

That means in order to find a valid block, we have to iterate through possible “proof” values until we find one that makes the block's hash end in zero bits. That work is (1) unavoidable as long as the hashing algorithm is good, and (2) easy to verify, since we just have to re-hash the block to check that it was done.

In a (simplified) diagram with \(d=16\) (so the last 2 bytes = 4 hex characters are zeros), we end up with a blockchain that looks like this:

In each case, we had to choose a “proof” value that made all of the block's data (in the dashed box) hash to a value ending in \(d\) zero bits. That hash becomes part of the data for the next block, and the chain continues.

Work Queue

We need to be able to compute the proof-of-work in parallel in order to calculate it as quickly as possible. We also need to be able to stop the calculations and return a valid proof when we find one, without doing a bunch of (now useless) calculations.

I think this is a situation where solving the more general problem is easier than doing it in the context of the blockchain mining.

See the provided queue.rs for an outline of how we need a work queue to behave for this assignment. The architecture is:

tasks have an Output type that they return on success; running a task produces an Option<Output>: None if no result was found or Some(x) if it found a result x;
we will have several worker threads (created with thread::spawn) waiting for tasks that they should do;
a single-producer multiple-consumer channel (spmc from the spmc crate) for the tasks: they will by enqueued by the main thread, and be received by workers;
a multiple-producer single-consumer channel (std::sync::mpsc) for results: they will be sent by workers as they are produced, and received by the main thread.

When a new queue is created, it needs to have channels created for jobs going into the queue, and results coming out. (Our work queue doesn't distinguish between results from the tasks: they come out in whatever order they are produced.) Worker threads also need to be created which will watch the .recv_tasks queue for work to do.

When a job (Task instance) is put into the .send_tasks channel, it should be started (by calling .run()) as soon as possible by one of the worker threads. Its result should be sent into the mpsc channel so it can be retrieved by whatever code is using the results.

The provided WorkQueue.recv() method can be used by outside code to retrieve the first available result. Presumably, after a user of the work queue receives a result, they will call .shutdown (or destroy the queue, which will call .shutdown because of the provided .drop implementation).

Stopping the Queue

We will need our work queue to do a slightly non-standard trick: once we find a result (proof-of-work in this case), we want to stop processing work because all subsequent calculations are unnecessary.

For a worker (the WorkQueue::run function), that will mean processing jobs until the spmc::Sender is destroyed so no more tasks can be sent. It should then exit.

When the queue's .shutdown method is called,

the .send_tasks channel should be destroyed (reassigned to None);
the worker threads should start to exit, because there can be no more work to do;
any messages waiting in .recv_tasks should be quickly received (and discarded, not run) by .shutdown;
ensure that the worker threads have exited (by calling each JoinHandle's .join method).

Tests

The provided queue_tests.rs contains tests for the work queue. You do not need to add any additional tests to queue_tests.rs. The provided tests should check all required functionality as described above.

Blocks

In block.rs, we will start creating blocks. A struct that holds all of the data we need has been provided.

The first block in a blockchain in a little special. It will have generation 0, difficulty as specified in the argument to Block::initial, the empty string for data, and a previous hash of 32 zero bytes (0u8). Create a function Block::initial that generates the initial block for the chain.

To continue the chain, we will call a method on the last block on the chain. If we have a block b, calling Block::next(b, message) should create and return a new block that can go in the chain after b. That is, it will have one higher generation; the same difficulty; data as given in the argument; previous hash equal to b's hash. Create a function Block::next that creates a next-in-chain block.

Note that these functions will not set the block's proof-of-work. That comes later…

Valid Blocks

Recall: a block is valid if it has a “proof” value that makes it hash to a value that ends in .difficulty zero bits. We need a few things to check this: the actual data to hash, a function calculate the hash, and a function to check if the hash ends in zero bits.

Hashing Blocks

These fields will go into our hash string, separated by colons. The integers will be in the usual decimal integer representation (i.e. the number ten is "10"):

the previous hash, encoded to a string in lower-case hexadecimal;
the generation;
the difficulty;
the data string;
the proof-of-work.

After this code…

let mut b0 = Block::initial(16);
b0.mine(1); // or b0.set_proof(56231);
let mut b1 = Block::next(&b0, String::from("message"));
b1.mine(1); // or b1.set_proof(2159);

… my two hash strings are:

"0000000000000000000000000000000000000000000000000000000000000000:0:16::56231"
"6c71ff02a08a22309b7dbbcee45d291d4ce955caa32031c50d941e3e9dbd0000:1:16:message:2159"

[You might not have mining working as you read this, but you will soon. The .set_proof calls produce the same results in these examples.]

We will use the sha2 crate to hash these strings. Write a method .hash_for_proof(p) that calculates the SHA256 hash value a block would have if we filled in .proof=p.

My hashes with the code above (hex-encoded) are:

"6c71ff02a08a22309b7dbbcee45d291d4ce955caa32031c50d941e3e9dbd0000"
"9b4417b36afa6d31c728eed7abc14dd84468fdb055d8f3cbe308b0179df40000"

For the hex conversion the {:02x} format of an array of bytes is the format we need. Here's a hint:

use std::fmt::Write;
let mut output = String::new();
// fmt::Write for String always returns Ok() and never Err.
write!(&mut output, "{:02x}", b0.hash()).unwrap();

Valid Hashes

We have a “valid” proof of work a block if the block's hash ends in .difficulty zero bits. We need to check that.

Write a method .is_valid_for_proof that takes a candidate proof value, and returns true if (and only if) the block's hash ends would end in .difficulty zero bits if we used that proof value for this block.

The hash value will be a byte array. The easiest way to check for difficulty zero bits at the end of the slice is probably something like:

n_bytes = difficulty/8
n_bits = difficulty%8
Check each of the last n_bytes bytes are 0u8.
Check that the next byte (from the end) is divisible by 1<<n_bits (one left-shifted n_bits times is equal to \(2^\mathit{n\_bits}\)).

In the example above, we can see that if we create an initial block with b0 = Block::initial(16) then b0.is_valid_for_proof(0) should return false, but b0.is_valid_for_proof(56231) should return true. Then if we b0.set_proof(56231), we should have a valid block: b0.is_valid() returns true.

Another example, with this setup, b1 is valid and has exactly 19 zero bits at the end of its hash:

let mut b0 = Block::initial(19);
b0.set_proof(87745);
let mut b1 = Block::next(&b0, String::from("hash example 1234"));
b1.set_proof(1407891);

But if we did this, the block it not valid. It has only 18 zero bits at the end of its hash:

b1.set_proof(346082);

Mining

Now that we have the basic structure for blocks, we need to do some mining.

The process of mining a block is conceptually easy: check proof-of-work values until you find for which block.is_valid_for_proof() returns true. Once you have found it, set block.proof to that value to make the block valid.

A function block.mine_serial is provided that loops over proof values starting a 0 until a valid one is found. You don't need this, but it completely avoids the need for the work queue, and may be convenient for testing the more complex implementation we're about to create…

A mining task will be a range of possible proof-of-work values: we want to check those value and return any one of them that makes the block's hash valid.

Mining Tasks

We will use our work queue for this. We will need a struct that implements the Task trait, i.e. has an .Output type and a .run() method. We can also use the struct to store any other information needed to do the mining we want done.

The .run() call must return an Option<Output> value where None indicates no valid proof was found, and Some(p) is returned if p is a valid proof.

Once you have that, you can use it in block.mine_range. This should check proof values for this block, from start to end (inclusive). The calculation should be done in parallel by the given number of workers, and dividing the work into chunks approximately-equal parts.

This must be done using the work queue created above. If that is implemented correctly, it should be fairly easy to do this work in parallel, and to stop checking proof values (soon) after a valid proof is found.

Example Blocks

Here is an example of valid hashes being mined, and the proof values I get:

let mut b0 = Block::initial(7);
b0.mine(1);
println!("{}", b0.hash_string());
println!("{:02x}", b0.hash());
let mut b1 = Block::next(&b0, String::from("this is an interesting message"));
b1.mine(1);
println!("{}", b1.hash_string());
println!("{:02x}", b1.hash());
let mut b2 = Block::next(&b1, String::from("this is not interesting"));
b2.mine(1);
println!("{}", b2.hash_string());
println!("{:02x}", b2.hash());

This outputs:

0000000000000000000000000000000000000000000000000000000000000000:0:7::385
379bf2fb1a558872f09442a45e300e72f00f03f2c6f4dd29971f67ea4f3d5300
379bf2fb1a558872f09442a45e300e72f00f03f2c6f4dd29971f67ea4f3d5300:1:7:this is an interesting message:20
4a1c722d8021346fa2f440d7f0bbaa585e632f68fd20fed812fc944613b92500
4a1c722d8021346fa2f440d7f0bbaa585e632f68fd20fed812fc944613b92500:2:7:this is not interesting:40
ba2f9bf0f9ec629db726f1a5fe7312eb76270459e3f5bfdc4e213df9e47cd380

There are other valid proof values for these blocks, but these are the ones my code finds (and the numerically-smallest). Changing to difficulty 20 in the above code, it outputs:

0000000000000000000000000000000000000000000000000000000000000000:0:20::1209938
19e2d3b3f0e2ebda3891979d76f957a5d51e1ba0b43f4296d8fb37c470600000
19e2d3b3f0e2ebda3891979d76f957a5d51e1ba0b43f4296d8fb37c470600000:1:20:this is an interesting message:989099
a42b7e319ee2dee845f1eb842c31dac60a94c04432319638ec1b9f989d000000
a42b7e319ee2dee845f1eb842c31dac60a94c04432319638ec1b9f989d000000:2:20:this is not interesting:1017262
6c589f7a3d2df217fdb39cd969006bc8651a0a3251ffb50470cbc9a0e4d00000

Tests

The provided block_tests.rs doesn't actually contain any (useful) tests, but it should.

Write some tests that check the core functionality of the Block. Your tests should ensure that at least some of the functionality described above is correct. There is no requirement for complete code coverage or testing every single thing mentioned.

You should write at least three or four tests (depending on the complexity of each test, of course) and cover a reasonable variety of the requirements for the assignment. At least some tests should be on examples not directly from the assignment description.

Summary

All of the functions provided in the code skeleton should be complete with arguments and return values as specified. We will expect them to be there when testing your implementation.

These structs/functions/methods as in the provided code skeleton should be completed to work as described above:

WorkQueue::new and WorkQueue::run
WorkQueue.enqueue and WorkQueue.shutdown
Block::initial and Block::next
Block.hash_string_for_proof, Block.hash_for_proof, Block.mine_range
MiningTask implementation
Some tests in block_tests.rs.

Where there are stub implementations in the provided hints, you aren't required to use them, as long as the functions listed here work as specified.

Questions

In a file answers.txt, briefly answer these questions:

In the Task trait definition, the type Output is specified as Send. What does this do, and why is this necessary?
The hint for .shutdown suggests using a vector's .drain method. Why can we not iterate with .iter() like usual?
Why must we .clone the Block when we put it in the Arc? Why not Arc<&Block>?

Blockchains Revisited

There is a lingering question here: is this a “good” blockchain implementation?

The answer is clear: no. You are dangerously bad at cryptography, and so am I. But, it does contain and illustrate some of the core ideas.

Submitting

Submit your files through CourSys for Assignment 3.

Updated Fri July 28 2023, 07:40 by ggbaker.