Not logged in. Login

Interactive Session with the Simpleton Matcher

Interactive Session with the Simpleton Matcher

Recall our matching program, stored in RE1.hs:

data RE = Epsilon | Ch Char | Seq RE RE | Alt RE RE | Star RE | Group RE deriving Show

match :: RE -> [Char] -> Bool
splits :: [Char] -> [([Char], [Char])]
combine_all :: Char -> [([Char], [Char])] -> [([Char], [Char])]
match_any_split :: RE -> RE -> [([Char], [Char])] -> Bool
match_any_nonempty_split :: RE -> RE -> [([Char], [Char])] -> Bool

match Epsilon s = s == ""
match (Ch a) "" = False
match (Ch a) (c : more_chars) = a == c && more_chars == []
match (Alt r1 r2) string = match r1 string || match r2 string
match (Seq r1 r2) string = match_any_split r1 r2 (splits string)
match (Star r1) "" = True
match (Star r1) s = match_any_nonempty_split r1 (Star r1) (splits s)

splits "" = [("", "")]
splits (c : chars) = combine_all c (splits chars)

combine_all c [] = []
combine_all c (("", s): more) = ([c], s) : ("", c:s) : (combine_all c more)
combine_all c ((a:as, s): more) = (c:a:as, s) : (combine_all c more)

match_any_split r1 r2 [] = False
match_any_split r1 r2 ((s1, s2) : more_splits) 
   | match r1 s1 && match r2 s2 = True
   | otherwise = match_any_split r1 r2 more_splits 

match_any_nonempty_split r1 r2 [] = False
match_any_nonempty_split r1 r2 ((s1, s2) : more_splits) 
   | s1 /= "" && match r1 s1 && match r2 s2 = True
   | otherwise = match_any_nonempty_split r1 r2 more_splits 

Starting Our Session

We run ghci the Glasgow Haskell Interpreter from the command line.

Robs-MacBook-Pro:RE1 cameron$ ghci
GHCi, version 7.8.3: http://www.haskell.org/ghc/  :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
Prelude> :load RE1.hs
[1 of 1] Compiling Main             ( RE1.hs, interpreted )
Ok, modules loaded: Main.
*Main> 

At this point, the interpreter has started up and we have typed in a command to load our program file RE1.hs. The interpreter tells us that everything is OK and then enters a "read-eval-print" (repl) loop. That is the interpreter will read an expression that we type in, evaluate the expression, print out its value and then be ready for another expression to be input.

As a first test, we can try our splits function to see if it does what we think.

*Main> splits "string"
[("string",""),("strin","g"),("stri","ng"),("str","ing"),("st","ring"),("s","tring"),("","string")]
*Main> 

Good! The splits function is returning the list of pairs consisting of all ways to split up a given string.

Matching with Epsilon

Now we might try some matching tests. Let's begin with the empty regexp Epsilon. What kind of strings does it match?

*Main> match Epsilon "abc"
False
*Main> match Epsilon "c"
False
*Main> match Epsilon ""
True
*Main> match Epsilon []
True
*Main> match Epsilon [3]

<interactive>:9:16:
    No instance for (Num Char) arising from the literal ‘3’
    In the expression: 3
    In the second argument of ‘match’, namely ‘[3]’
    In the expression: match Epsilon [3]
*Main> 

Here we see that Epsilon matches the empty string whether represented as "" or []. We also see that Haskell complains about type errors if we give it the wrong kind of data.

Matching Individual Characters

Now let us consider matching an individual character with an input.

*Main> match (Ch 'w') ""
False
*Main> match (Ch 'w') "w"
True
*Main> match (Ch 'w') ['w']
True
*Main> match (Ch 'w') "wd"
False
*Main> match (Ch 'w') 'w'

<interactive>:15:16:
    Couldn't match expected type ‘[Char]’ with actual type ‘Char’
    In the second argument of ‘match’, namely ‘'w'’
    In the expression: match (Ch 'w') 'w'
    In an equation for ‘it’: it = match (Ch 'w') 'w'
*Main> 

As expected, a character as a regular expression matches only a string of length 1 containing that character. Note that we can specify the string as a list of characters or using string notation. Also note that it is a type error to use a single character (rather than a string containing a single character) as the second argument to match.

Matching Compound Expressions

Now we can see how compound expressions with sequences, alternatives and star-expressions are matched.

*Main> match (Seq (Ch 'p') (Ch ':')) "" 
False
*Main> match (Seq (Ch 'p') (Ch ':')) ['p', ':']
True
*Main> match (Seq (Ch 'p') (Ch ':')) "p:"
True
*Main> match (Seq (Ch 'p') (Ch ':')) "p:x"
False
*Main> match (Alt (Ch 'p') (Ch ':')) "" 
False
*Main> match (Alt (Ch 'p') (Ch ':')) ":"
True
*Main> match (Alt (Ch 'p') (Ch ':')) "q"
False
*Main> match (Star (Alt (Ch 'p') (Ch ':'))) ":"
True
*Main> match (Star (Alt (Ch 'p') (Ch ':'))) ""
True
*Main> match (Star (Alt (Ch 'p') (Ch ':'))) "pp::pppp:pp"
True
*Main> match (Star (Alt (Ch 'p') (Ch ':'))) "pp::ppZ"
False
*Main> 
Updated Thu Sept. 06 2018, 11:14 by cameron.