Assignment 1: Rainbow Tables with Haskell
Introduction
The Problem
Think of all of the systems that know some password for you. Of course, those systems have to somehow store the password so you can log in later. But, it is insecure to store your password in plaintext (i.e. store the string “myPassword” directly in the database) since if someone can view the database, they can see all of the passwords.
Each system should run your password through a hash function before it is stored. The hash function creates a hash value which is stored in the database. When you enter your password later, it is hashed with the same function and the results are compared: same results mean you must have entered the correct password. [There's actually a bunch more to safely storing passwords. That is to say, what we do below shouldn't work in practice.]
The hash function is meant to be very hard to reverse. That is, even if somebody has access to the password database, they can't practically convert the hash values back to the passwords they need to enter to log in.
Herein lies the problem: the ne'er-do-wells of the world would like to be able to reverse hash functions and figure out the corresponding passwords.
Rainbow Tables
For more info: How rainbow tables work and Rainbow tables in Wikipedia.
A rainbow table is a compromise between hashing every possible password until we find a match (which takes too long) and storing every possible password/hash pair (which would require too big a file).
To construct a rainbow table, we need the hash function that the system uses, as well as a reduce function that maps a hash value to an arbitrary password. Note that the reduce function does not reverse the hash, it just provides a deterministic way to transform a hash value to some legal password.
If we start with some randomly generated passwords (the “PW 0” column in the table), we can repeatedly apply the hash and reduce functions to get several chains of values:
PW 0 | Hash 0 | PW 1 | Hash 1 | PW 2 | Hash 2 |
---|---|---|---|---|---|
dccdecee | 839310862 | daecbecc | -1003698034 | cdbdaddb | 1802158710 |
cdeccaed | -723447964 | eebdbbcb | 504323679 | abdcecae | -716915723 |
acbcaeec | 1384205608 | ceadeebd | -2138650808 | abbddcdc | 1305042273 |
eeeeaebd | 537636003 | bddcdaad | 1446259833 | cadaddbd | 1629794057 |
ccdccbeb | -1131047593 | ccedebbc | 170163192 | dacbacdc | -1862546954 |
To generate this table, we used our hash function to transform “PW (n)” into “Hash (n)”, and the reduce function to transform “Hash (n)” into “Password (n+1)”.
Instead of storing the whole table, we will only need to store the first and last columns. If we have those values, we can reconstruct the rest of the table by using the hash/reduce functions as necessary.
Cracking a Password with a Rainbow Table
Realistically, rainbow tables would be several orders of magnitude larger than the above. They are large enough that (1) you would usually download them, not calculate them and (2) are “wide” enough that storing only the first and last column uses much less space than storing everything in the table.
So now, imagine that you can only see the “Password 0” and “Hash 2” columns of the above table (since that's what is stored).
PW 0 | Hash 2 |
---|---|
dccdecee | 1802158710 |
cdeccaed | -716915723 |
acbcaeec | 1305042273 |
eeeeaebd | 1629794057 |
ccdccbeb | -1862546954 |
You are trying to crack a hashed password value 1446259833.
The hash value you're looking for isn't in the table, but by applying the reduce and hash functions (once) to the hash value, you will get the new hash value 1629794057: the password you're looking for is almost certainly somewhere in fourth row of the table.
Once you realize that, you can look back at the initial password for row four, “eeeeaebd”. From there, it's just a matter of recalculating the chain until you find the value you're looking for: the password “bddcdaad” led to the hash value 1446259833.
You have successfully cracked a password and can log into the system.
Restrictions
The computation time needed to create a genuinely useful rainbow table is unrealistic for this assignment: there's a reason rainbow tables are usually downloaded, not computed. With a restricted password set, we should be able to get somewhere, but the real power of a rainbow table comes when you put days of processor time into creating the table, and then distribute it to all of your cracker friends. We should get far enough to see that once the table has been created, it can be applied to a hash value relatively quickly.
For this assignment, we will restrict ourselves to a limited set of passwords. We will say that all passwords:
- have exactly the same length (e.g. for length 4,
"abcd"
will be a valid password, but"abc"
will not). - contain only lowercase letters (and usually only the first
n
letters of the alphabet, depending how hard we want to make the problem).
The tables above were generated with 5 letters (a–e) and length 8. We will continue to use those values in the first examples below.
Basically: the techniques here are somewhat realistic, but the numbers aren't.
The Module
In order to get you started, you should use the provided RainbowAssign
module which contains some functions that you'll find useful. The functions in the module will be described below as you need them.
Save the module in the same directory as your program. You can then import the module into your program like this:
import RainbowAssign
The following types are defined for hashes and passwords: you should use them as appropriate to make it clear what kind of values your functions are using.
type Hash = Int32
type Passwd = String
Since the Hash
type is really an Int32
, you may find yourself wanting to sometimes do arithmetic on it like an Int
. To convert an Int32
to an Int
in the obvious way, use fromEnum
.
You'll need the System.Random
module. It is already in CSIL, or one of these commands should do the job:
apt-get install libghc-random-dev
cabal install random
Parameters
We need to define some constants that represent the parameters of the system. For testing (and to get results like my examples), I'd suggest something like this:
pwLength, nLetters, width, height :: Int
filename :: FilePath
pwLength = 8 -- length of each password
nLetters = 5 -- number of letters to use in passwords: 5 -> a-e
width = 40 -- length of each chain in the table
height = 1000 -- number of "rows" in the table
filename = "table.txt" -- filename to store the table
When testing your assignment, we will change these values, so make sure you don't assume they are always as above.
Hashing & Reducing
The first things we need for rainbow tables are functions that implement the two basic operations:
pwHash
: convert a password (string) to a hash value.pwReduce
: convert a hash value back to a possible password.
The pwHash
function would typically be implied by the passwords we are trying to crack: we need to use the same hash function as the system that stored the passwords. For this assignment we will simply use a convenient hash function that is in the Haskell standard library. The pwHash
function is provided in the RainbowAssign
module. It maps strings to a (signed) 32-bit integer, so has type pwHash
:: Passwd -> Hash
.
We also need the pwReduce
function to map a hash back into a possible password.
A good pwReduce
function shouldn't produce too many collisions. That is, if h1/=h2
, it should be very likely that pwReduce h1 /= pwReduce h2
. (We can't necessarily guarantee this last identity, but we can do our best.)
For our reduce function, we will use a simple base conversion. That is, if we are creating passwords with the first n
lowercase letters, we can convert the hash value to base n
, and use the “digits” of that value to choose letters for our password. Here is an example, using passwords of length 8 and 5 characters (a–e):
Operation | Value |
---|---|
some initial password | "abcdeabc" |
apply pwHash |
1726491528 (base 10 integer) |
convert to base nLetters |
…000012013440212103 (left-padded with 0s) |
take pwLength least significant digits |
40212103 |
convert digits to letters | "eacbcbad" |
The last three lines above describe the pwReduce
function. You don't have to follow exactly those steps to get the value, but should get the same result. So, pwReduce 1726491528 == "eacbcbad"
.
For the last step, the RainbowAssign
module contains a function toLetter
that converts an integer to the corresponding letter. For example, toLetter 3 == 'd'
.
The negative hash values may cause you stress. If you have done the base conversion in the standard straightforward way, you shouldn't have to worry about them. (i.e. find the lowest-order digit by taking the hash value mod nLetters
, then recurse to find the rest of the digits.) You should get pwReduce (-1726491528::Hash) == "aecdcdec"
.
I have included more examples of base conversion to clear up any confusion. Hint: it's easier to generate the base-nLetters
value in reverse order as an infinite list (just leave out the base case), and simply take
the number of values you need.
Complete the pwReduce
function as described above.
The Map Data Structure
In order to make our table useful, we will need a data structure that can do more efficient lookups than the built-in list (which does linear searches). The Haskell standard library contains a type Data.Map
which is analogous to the Python dictionary, or C++ std::map, or the Java TreeMap. Because the Data.Map
module contains many functions with common names (like map
and lookup
), you should probably import it this way, to keep those functions out of your main namespace:
import qualified Data.Map as Map
Then you can access any of the functions by prefacing with “Map.
”. For example the module's keys function can be accessed as Map.keys
.
The easiest way to create a map (from values of type A
to values of type B
) is to create a list of pairs (of type [(A,B)]
) and then give that to the Map.fromList
function to construct the map.
Building the Table
To create the actual rainbow table, you need to take a list of initial passwords, and use the hash/reduce functions to map the initial passwords onto a corresponding (distant) hash value. I will use the terms height to refer to the number of initial passwords (and thus the number of chains in the table), and width to refer to the number of hash/reduce operations applied to each chain.
Create a function rainbowTable
that takes two arguments: the “width” of the table, and the list of initial passwords. It should return a Map.Map
that maps the final Hash
values onto the Passwd
values at the start of the chain.
With the same parameters as the above example, we should get:
*Main> :t rainbowTable
rainbowTable :: Int -> [Passwd] -> Map.Map Hash Passwd
*Main> rainbowTable 0 ["abcdeabc"]
fromList [(1726491528,"abcdeabc")]
*Main> rainbowTable 1 ["abcdeabc"]
fromList [(1477708406,"abcdeabc")]
*Main> rainbowTable 40 ["abcdeabc"]
fromList [(-1993856529,"abcdeabc")]
*Main> rainbowTable 40 ["abcdeabc", "aabbccdd", "eeeeeeee"]
fromList [(-1993856529,"abcdeabc"),(1781092264,"aabbccdd"),(2135711886,"eeeeeeee")]
*Main> rainbowTable 2 ["dccdecee","cdeccaed","acbcaeec","eeeeaebd","ccdccbeb"]
fromList [(-1862546954,"ccdccbeb"),(-716915723,"cdeccaed"),(1305042273,"acbcaeec"),(1629794057,"eeeeaebd"),(1802158710,"dccdecee")]
(The last example is the example table from the start of the assignment.)
Creating, Reading, and Writing Tables
To create a rainbow table based on random initial passwords, you can use the provided buildTable
function which takes these arguments: your rainbowTable
function as described above, the number of letters being used, the length of each password, and the width and height of the table. For example, it might return something like this:
*Main> buildTable rainbowTable nLetters pwLength width height
fromList [(-1860304949,"bddaaaeb"),(-1422325927,"accabecb"),(-857324890,"aaaababb"),(454267643,"dbaacded")]
Because the buildTable
function requires input (a seed for the random number generator), the result is an IO
object (actually, a Map.Map
wrapped in an IO
object). There are some special rules for working with IO
objects that we won't go into.
Since the calculating a rainbow table can take a long time, the RainbowAssign
module contains functions that write a table to a file and read it back again. The easiest thing to do with buildTable
is to immediately write the results to a file. You can define a function like this, and then just call generateTable
when you want to generate a new table and save it to disk:
generateTable :: IO ()
generateTable = do
table <- buildTable rainbowTable nLetters pwLength width height
writeTable table filename
To read the table back, you can use the readTable
function. Again, since the results of readTable
are dependant on input (contents of the file), you must deal with the IO
object. You can probably get by with this recipe:
test1 = do
table <- readTable filename
return (Map.lookup 0 table)
Put the expression you want to test in the parens on the last line: calling the test function will give you the result of evaluating that expression.
Reversing Hashes
Now that we have our rainbow tables created, we can try to reverse the hash function.
In order to do this, we take a hash value, and repeatedly apply pwReduce
and pwHash
to it. At each point, we check to see if the hash value is a key in the table. If it is, we start with the corresponding password and iterate until we either find the corresponding password or reach the end of the chain.
Note that it is possible to get false positives in the initial lookup: you may find a generated hash value as a key in the table, but not find the original hash value in the chain. This is an unavoidable consequence of the possible collisions in the hash and reduce functions. Make sure you check every row of the table that is a potential match, not just the first one.
Create a function findPassword
that takes three arguments: the rainbow table, the “width” of the table, and the hash value you are trying to reverse.
The function should return a Maybe Passwd
. In Haskell, Maybe
values can either take on a value of the corresponding type, or Nothing
. You should return Just
the password if it was found, and Nothing
otherwise. The Data.Maybe
module contains some functions that may simplify your life when working with Maybe
values.
Here are some examples:
*Main> let table = rainbowTable 40 ["abcdeabc", "aabbccdd", "eeeeeeee"]
*Main> findPassword table 40 1726491528
Just "abcdeabc"
*Main> findPassword table 40 (-206342227)
Just "dbddecab"
*Main> findPassword table 40 1726491529
Nothing
*Main> findPassword table 40 0
Nothing
More Examples
I have put some more examples of the functions doing what they do on another page of examples.
Experimenting
At this point, you're probably wondering just how effective your password cracking system is. With letters a–e and length 8, we have only 5^8 = 390625 possible passwords: it shouldn't be that hard to crack them.
This function tries to crack n
randomly generated passwords, and reports the results:
test2 :: Int -> IO ([Passwd], Int)
test2 n = do
table <- readTable filename
pws <- randomPasswords nLetters pwLength n
let hs = map pwHash pws
let result = Maybe.mapMaybe (findPassword table width) hs
return (result, length result)
This uses the mapMaybe
function from the Data.Maybe
module to return only the passwords that could be cracked. With the sample parameters from above, I successfully reverse about 7.5% of hashes: not bad for such a toy-sized table.
Choosing a good value for width
is surprisingly important: too small and not enough hashes are represented in your table; too large and you get too many chain collisions (decreasing the effective height), causing more false-positives in the lookup, and much longer times to reverse a hash. For given values of nLetters
and pwLength
, there seems to be a “sweet spot” for the value of width where the success rate stops increasing, but you are still not getting many collisions.
For height
, bigger is better.
Please feel free to experiment further. You should find that as the number of potential passwords (nLetters
and pwLength
) increases, the best value for width increases as well. So, as the situation gets more realistic, the rainbow table becomes more and more space-efficient, with cracking times increasing relatively little.
Notes and Summary
You should create the following functions as described above:
pwReduce
: maps a hash value to an arbitrary password.rainbowTable
: generates a rainbow table, given a list of initial passwords.findPassword
: reverses a hash to the corresponding password, if possible.
All functions you define (the above and any others you need to get there) should have explicit type declarations (the ::
thing).
Coding style/readability/maintainability count. Using the -Wall
argument at the GHC/GHCi command line might help you find a few style problems (but certainly not all).
Compiling Your Code
You can likely work entirely at the GHCi prompt, but if you do want to compile your code, you first need to define a main function, perhaps like this:
main :: IO ()
main = do
generateTable
res <- test2 10000
print res
Then, the command line will be like this:
ghc -O2 --make -Wall rainbow.hs
./rainbow
Wait… are my passwords really that insecure?
Not if the person implementing the system did it right in the first place. Also keep in mind that the password and key spaces are much larger in reality than in our examples. Salting the hash helps defeat any precomputed-table attack as well.
Submitting
Submit your files through CourSys for Assignment 1.