Interactive Session with the Simpleton Matcher
Interactive Session with the Simpleton Matcher
Recall our matching program, stored in RE1.hs:
data RE = Epsilon | Ch Char | Seq RE RE | Alt RE RE | Star RE | Group RE deriving Show match :: RE -> [Char] -> Bool splits :: [Char] -> [([Char], [Char])] combine_all :: Char -> [([Char], [Char])] -> [([Char], [Char])] match_any_split :: RE -> RE -> [([Char], [Char])] -> Bool match_any_nonempty_split :: RE -> RE -> [([Char], [Char])] -> Bool match Epsilon s = s == "" match (Ch a) "" = False match (Ch a) (c : more_chars) = a == c && more_chars == [] match (Alt r1 r2) string = match r1 string || match r2 string match (Seq r1 r2) string = match_any_split r1 r2 (splits string) match (Star r1) "" = True match (Star r1) s = match_any_nonempty_split r1 (Star r1) (splits s) splits "" = [("", "")] splits (c : chars) = combine_all c (splits chars) combine_all c [] = [] combine_all c (("", s): more) = ([c], s) : ("", c:s) : (combine_all c more) combine_all c ((a:as, s): more) = (c:a:as, s) : (combine_all c more) match_any_split r1 r2 [] = False match_any_split r1 r2 ((s1, s2) : more_splits) | match r1 s1 && match r2 s2 = True | otherwise = match_any_split r1 r2 more_splits match_any_nonempty_split r1 r2 [] = False match_any_nonempty_split r1 r2 ((s1, s2) : more_splits) | s1 /= "" && match r1 s1 && match r2 s2 = True | otherwise = match_any_nonempty_split r1 r2 more_splits
Starting Our Session
We run ghci
the Glasgow Haskell Interpreter from the command line.
Robs-MacBook-Pro:RE1 cameron$ ghci GHCi, version 7.8.3: http://www.haskell.org/ghc/ :? for help Loading package ghc-prim ... linking ... done. Loading package integer-gmp ... linking ... done. Loading package base ... linking ... done. Prelude> :load RE1.hs [1 of 1] Compiling Main ( RE1.hs, interpreted ) Ok, modules loaded: Main. *Main>
At this point, the interpreter has started up and we have typed in a command to load our program file RE1.hs. The interpreter tells us that everything is OK and then enters a "read-eval-print" (repl) loop. That is the interpreter will read an expression that we type in, evaluate the expression, print out its value and then be ready for another expression to be input.
As a first test, we can try our splits
function to see if it does what we think.
*Main> splits "string" [("string",""),("strin","g"),("stri","ng"),("str","ing"),("st","ring"),("s","tring"),("","string")] *Main>
Good! The splits function is returning the list of pairs consisting of all ways to split up a given string.
Matching with Epsilon
Now we might try some matching tests. Let's begin with the empty regexp Epsilon
. What kind of strings does it match?
*Main> match Epsilon "abc" False *Main> match Epsilon "c" False *Main> match Epsilon "" True *Main> match Epsilon [] True *Main> match Epsilon [3] <interactive>:9:16: No instance for (Num Char) arising from the literal ‘3’ In the expression: 3 In the second argument of ‘match’, namely ‘[3]’ In the expression: match Epsilon [3] *Main>
Here we see that Epsilon
matches the empty string whether represented as ""
or []
. We also see that Haskell complains about type errors if we give it the wrong kind of data.
Matching Individual Characters
Now let us consider matching an individual character with an input.
*Main> match (Ch 'w') "" False *Main> match (Ch 'w') "w" True *Main> match (Ch 'w') ['w'] True *Main> match (Ch 'w') "wd" False *Main> match (Ch 'w') 'w' <interactive>:15:16: Couldn't match expected type ‘[Char]’ with actual type ‘Char’ In the second argument of ‘match’, namely ‘'w'’ In the expression: match (Ch 'w') 'w' In an equation for ‘it’: it = match (Ch 'w') 'w' *Main>
As expected, a character as a regular expression matches only a string of length 1 containing that character. Note that we can specify the string as a list of characters or using string notation. Also note that it is a type error to use a single character (rather than a string containing a single character) as the second argument to match
.
Matching Compound Expressions
Now we can see how compound expressions with sequences, alternatives and star-expressions are matched.
*Main> match (Seq (Ch 'p') (Ch ':')) "" False *Main> match (Seq (Ch 'p') (Ch ':')) ['p', ':'] True *Main> match (Seq (Ch 'p') (Ch ':')) "p:" True *Main> match (Seq (Ch 'p') (Ch ':')) "p:x" False *Main> match (Alt (Ch 'p') (Ch ':')) "" False *Main> match (Alt (Ch 'p') (Ch ':')) ":" True *Main> match (Alt (Ch 'p') (Ch ':')) "q" False *Main> match (Star (Alt (Ch 'p') (Ch ':'))) ":" True *Main> match (Star (Alt (Ch 'p') (Ch ':'))) "" True *Main> match (Star (Alt (Ch 'p') (Ch ':'))) "pp::pppp:pp" True *Main> match (Star (Alt (Ch 'p') (Ch ':'))) "pp::ppZ" False *Main>