Assignment 1: Rainbow Tables with Haskell
Think of all of the systems that know some password for you. Of course, those systems have to somehow store the password so you can log in later. But, it is insecure to store your password in plaintext (i.e. store the string “myPassword” directly in the database) since if someone can view the database, they can see all of the passwords.
Each system should run your password through a hash function before it is stored. The hash function creates a hash value which is stored in the database. When you enter your password later, it is hashed with the same function and the results are compared: same results mean you must have entered the correct password. [There's actually a bunch more to safely storing passwords. That is to say, what we do below shouldn't work in practice.]
The hash function is meant to be very hard to reverse. That is, even if somebody has access to the password database, they can't practically convert the hash values back to the passwords they need to enter to log in.
Herein lies the problem: the ne'er-do-wells of the world would like to be able to reverse hash functions and figure out the corresponding passwords.
A rainbow table is a compromise between hashing every possible password until we find a match (which takes too long) and storing every possible password/hash pair (which would require too big a file).
To construct a rainbow table, we need the hash function that the system uses, as well as a reduce function that maps a hash value to an arbitrary password. Note that the reduce function does not reverse the hash, it just provides a deterministic way to transform a hash value to some legal password.
If we start with some randomly generated passwords (the “PW 0” column in the table), we can repeatedly apply the hash and reduce functions to get several chains of values:
|PW 0||Hash 0||PW 1||Hash 1||PW 2||Hash 2|
To generate this table, we used our hash function to transform “PW (n)” into “Hash (n)”, and the reduce function to transform “Hash (n)” into “Password (n+1)”.
Instead of storing the whole table, we will only need to store the first and last columns. If we have those values, we can reconstruct the rest of the table by using the hash/reduce functions as necessary.
Cracking a Password with a Rainbow Table
Realistically, rainbow tables would be several orders of magnitude larger than the above. They are large enough that (1) you would usually download them, not calculate them and (2) are “wide” enough that storing only the first and last column uses much less space than storing everything in the table.
So now, imagine that you can only see the “Password 0” and “Hash 2” columns of the above table (since that's what is stored).
|PW 0||Hash 2|
You are trying to crack a hashed password value 1446259833.
The hash value you're looking for isn't in the table, but by applying the reduce and hash functions (once) to the hash value, you will get the new hash value 1629794057: the password you're looking for is almost certainly somewhere in fourth row of the table.
Once you realize that, you can look back at the initial password for row four, “eeeeaebd”. From there, it's just a matter of recalculating the chain until you find the value you're looking for: the password “bddcdaad” led to the hash value 1446259833.
You have successfully cracked a password and can log into the system.
The computation time needed to create a genuinely useful rainbow table is unrealistic for this assignment: there's a reason rainbow tables are usually downloaded, not computed. With a restricted password set, we should be able to get somewhere, but the real power of a rainbow table comes when you put days of processor time into creating the table, and then distribute it to all of your cracker friends. We should get far enough to see that once the table has been created, it can be applied to a hash value relatively quickly.
For this assignment, we will restrict ourselves to a limited set of passwords. We will say that all passwords:
- have exactly the same length (e.g. for length 4,
"abcd"will be a valid password, but
- contain only lowercase letters (and usually only the first
nletters of the alphabet, depending how hard we want to make the problem).
The tables above were generated with 5 letters (a–e) and length 8. We will continue to use those values in the first examples below.
Basically: the techniques here are somewhat realistic, but the numbers aren't.
In order to get you started, you should use the provided
RainbowAssign module which contains some functions that you'll find useful. The functions in the module will be described below as you need them.
Save the module in the same directory as your program. You can then import the module into your program like this:
The following types are defined for hashes and passwords: you should use them as appropriate to make it clear what kind of values your functions are using.
type Hash = Int32 type Passwd = String
Hash type is really an
Int32, you may find yourself wanting to sometimes do arithmetic on it like an
Int. To convert an
Int32 to an
Int in the obvious way, use
You'll need the
System.Random module. It is already in CSIL, or one of these commands should do the job:
apt-get install libghc-random-dev cabal install random
We need to define some constants that represent the parameters of the system. For testing (and to get results like my examples), I'd suggest something like this:
pwLength, nLetters, width, height :: Int filename :: FilePath pwLength = 8 -- length of each password nLetters = 5 -- number of letters to use in passwords: 5 -> a-e width = 40 -- length of each chain in the table height = 1000 -- number of "rows" in the table filename = "table.txt" -- filename to store the table
When testing your assignment, we will change these values, so make sure you don't assume they are always as above.
Hashing & Reducing
The first things we need for rainbow tables are functions that implement the two basic operations:
pwHash: convert a password (string) to a hash value.
pwReduce: convert a hash value back to a possible password.
pwHash function would typically be implied by the passwords we are trying to crack: we need to use the same hash function as the system that stored the passwords. For this assignment we will simply use a convenient hash function that is in the Haskell standard library. The
pwHash function is provided in the
RainbowAssign module. It maps strings to a (signed) 32-bit integer, so has type
Passwd -> Hash.
We also need the
pwReduce function to map a hash back into a possible password.
pwReduce function shouldn't produce too many collisions. That is, if
h1/=h2, it should be very likely that
pwReduce h1 /= pwReduce h2. (We can't necessarily guarantee this last identity, but we can do our best.)
For our reduce function, we will use a simple base conversion. That is, if we are creating passwords with the first
n lowercase letters, we can convert the hash value to base
n, and use the “digits” of that value to choose letters for our password. Here is an example, using passwords of length 8 and 5 characters (a–e):
|some initial password||
||1726491528 (base 10 integer)|
|convert to base
||…000012013440212103 (left-padded with 0s)|
|convert digits to letters||"eacbcbad"|
The last three lines above describe the
pwReduce function. You don't have to follow exactly those steps to get the value, but should get the same result. So,
pwReduce 1726491528 == "eacbcbad".
For the last step, the
RainbowAssign module contains a function
toLetter that converts an integer to the corresponding letter. For example,
toLetter 3 == 'd'.
The negative hash values may cause you stress. If you have done the base conversion in the standard straightforward way, you shouldn't have to worry about them. (i.e. find the lowest-order digit by taking the hash value mod
nLetters, then recurse to find the rest of the digits.) You should get
pwReduce (-1726491528::Hash) == "aecdcdec".
I have included more examples of base conversion to clear up any confusion. Hint: it's easier to generate the base-
nLetters value in reverse order as an infinite list (just leave out the base case), and simply
take the number of values you need.
pwReduce function as described above.
The Map Data Structure
In order to make our table useful, we will need a data structure that can do more efficient lookups than the built-in list (which does linear searches). The Haskell standard library contains a type
Data.Map which is analogous to the Python dictionary, or C++ std::map, or the Java TreeMap. Because the
Data.Map module contains many functions with common names (like
lookup), you should probably import it this way, to keep those functions out of your main namespace:
import qualified Data.Map as Map
Then you can access any of the functions by prefacing with “
Map.”. For example the module's keys function can be accessed as
The easiest way to create a map (from values of type
A to values of type
B) is to create a list of pairs (of type
[(A,B)]) and then give that to the
Map.fromList function to construct the map.
Building the Table
To create the actual rainbow table, you need to take a list of initial passwords, and use the hash/reduce functions to map the initial passwords onto a corresponding (distant) hash value. I will use the terms height to refer to the number of initial passwords (and thus the number of chains in the table), and width to refer to the number of hash/reduce operations applied to each chain.
Create a function
rainbowTable that takes two arguments: the “width” of the table, and the list of initial passwords. It should return a
Map.Map that maps the final
Hash values onto the
Passwd values at the start of the chain.
With the same parameters as the above example, we should get:
*Main> :t rainbowTable rainbowTable :: Int -> [Passwd] -> Map.Map Hash Passwd *Main> rainbowTable 0 ["abcdeabc"] fromList [(1726491528,"abcdeabc")] *Main> rainbowTable 1 ["abcdeabc"] fromList [(1477708406,"abcdeabc")] *Main> rainbowTable 40 ["abcdeabc"] fromList [(-1993856529,"abcdeabc")] *Main> rainbowTable 40 ["abcdeabc", "aabbccdd", "eeeeeeee"] fromList [(-1993856529,"abcdeabc"),(1781092264,"aabbccdd"),(2135711886,"eeeeeeee")] *Main> rainbowTable 2 ["dccdecee","cdeccaed","acbcaeec","eeeeaebd","ccdccbeb"] fromList [(-1862546954,"ccdccbeb"),(-716915723,"cdeccaed"),(1305042273,"acbcaeec"),(1629794057,"eeeeaebd"),(1802158710,"dccdecee")]
(The last example is the example table from the start of the assignment.)
Creating, Reading, and Writing Tables
To create a rainbow table based on random initial passwords, you can use the provided
buildTable function which takes these arguments: your
rainbowTable function as described above, the number of letters being used, the length of each password, and the width and height of the table. For example, it might return something like this:
*Main> buildTable rainbowTable nLetters pwLength width height fromList [(-1860304949,"bddaaaeb"),(-1422325927,"accabecb"),(-857324890,"aaaababb"),(454267643,"dbaacded")]
buildTable function requires input (a seed for the random number generator), the result is an
IO object (actually, a
Map.Map wrapped in an
IO object). There are some special rules for working with
IO objects that we won't go into.
Since the calculating a rainbow table can take a long time, the
RainbowAssign module contains functions that write a table to a file and read it back again. The easiest thing to do with
buildTable is to immediately write the results to a file. You can define a function like this, and then just call
generateTable when you want to generate a new table and save it to disk:
generateTable :: IO () generateTable = do table <- buildTable rainbowTable nLetters pwLength width height writeTable table filename
To read the table back, you can use the
readTable function. Again, since the results of
readTable are dependant on input (contents of the file), you must deal with the
IO object. You can probably get by with this recipe:
test1 = do table <- readTable filename return (Map.lookup 0 table)
Put the expression you want to test in the parens on the last line: calling the test function will give you the result of evaluating that expression.
Now that we have our rainbow tables created, we can try to reverse the hash function.
In order to do this, we take a hash value, and repeatedly apply
pwHash to it. At each point, we check to see if the hash value is a key in the table. If it is, we start with the corresponding password and iterate until we either find the corresponding password or reach the end of the chain.
Note that it is possible to get false positives in the initial lookup: you may find a generated hash value as a key in the table, but not find the original hash value in the chain. This is an unavoidable consequence of the possible collisions in the hash and reduce functions. Make sure you check every row of the table that is a potential match, not just the first one.
Create a function
findPassword that takes three arguments: the rainbow table, the “width” of the table, and the hash value you are trying to reverse.
The function should return a
Maybe Passwd. In Haskell,
Maybe values can either take on a value of the corresponding type, or
Nothing. You should return
Just the password if it was found, and
Nothing otherwise. The
Data.Maybe module contains some functions that may simplify your life when working with
Here are some examples:
*Main> let table = rainbowTable 40 ["abcdeabc", "aabbccdd", "eeeeeeee"] *Main> findPassword table 40 1726491528 Just "abcdeabc" *Main> findPassword table 40 (-206342227) Just "dbddecab" *Main> findPassword table 40 1726491529 Nothing *Main> findPassword table 40 0 Nothing
I have put some more examples of the functions doing what they do on another page of examples.
At this point, you're probably wondering just how effective your password cracking system is. With letters a–e and length 8, we have only 58 = 390625 possible passwords: it shouldn't be that hard to crack them.
This function tries to crack
n randomly generated passwords, and reports the results:
test2 :: Int -> IO ([Passwd], Int) test2 n = do table <- readTable filename pws <- randomPasswords nLetters pwLength n let hs = map pwHash pws let result = Maybe.mapMaybe (findPassword table width) hs return (result, length result)
This uses the
mapMaybe function from the
Data.Maybe module to return only the passwords that could be cracked. With the sample parameters from above, I successfully reverse about 7.5% of hashes: not bad for such a toy-sized table.
Choosing a good value for
width is surprisingly important: too small and not enough hashes are represented in your table; too large and you get too many chain collisions (decreasing the effective height), causing more false-positives in the lookup, and much longer times to reverse a hash. For given values of
pwLength, there seems to be a “sweet spot” for the value of width where the success rate stops increasing, but you are still not getting many collisions.
height, bigger is better.
Please feel free to experiment further. You should find that as the number of potential passwords (
pwLength) increases, the best value for width increases as well. So, as the situation gets more realistic, the rainbow table becomes more and more space-efficient, with cracking times increasing relatively little.
Notes and Summary
You should create the following functions as described above:
pwReduce: maps a hash value to an arbitrary password.
rainbowTable: generates a rainbow table, given a list of initial passwords.
findPassword: reverses a hash to the corresponding password, if possible.
All functions you define (the above and any others you need to get there) should have explicit type declarations (the
Coding style/readability/maintainability count. Using the
-Wall argument at the GHC/GHCi command line might help you find a few style problems (but certainly not all).
Compiling Your Code
You can likely work entirely at the GHCi prompt, but if you do want to compile your code, you first need to define a main function, perhaps like this:
main :: IO () main = do generateTable res <- test2 10000 print res
Then, the command line will be like this:
ghc -O2 --make -Wall rainbow.hs ./rainbow
Wait… are my passwords really that insecure?
Not if the person implementing the system did it right in the first place. Also keep in mind that the password and key spaces are much larger in reality than in our examples. Salting the hash helps defeat any precomputed-table attack as well.
Submit your files through CourSys for Assignment 1.