Not logged in. Login

Assignment 1 Solutions - Matching Semantics

Extending the Matching Semantics

Now that we have decided how to represent our extensions and updated our parser to be able to parse the concrete syntax, we now just have to update the semantics of matching REs.

Escaped Metacharacters

Because the processing of backslash escapes is handled by the parser and there are no changes to our abstract syntax tree structure, we do not need any changes to the matching functions to handle escaped characters.

Any Metacharacter

To represent the "." metacharacter, we added a new alternative Any to our RE data type.

The changes to the match function are straightforward, simply add anew equation for matching Any. We can use a structure very similar to the equation for matching single characters.

match (Ch a) "" = False
match (Ch a) (c : more_chars) = a == c && more_chars == []

However, we don't need to worry about which specific character occurs as the first element of the string, so we just match using the anonymous pattern variable _.

match Any "" = False
match Any (_ : more_chars) = more_chars == []

Option Metacharacter

To represent the "?" metacharacter, we translate occurrences of R? to

Alt R Epsilon

Since we already have definitions for the semantics of both Alt and Epsilon, no further work is needed.

If instead, we had created a form Option RE, we would instead require new equations.

match (Option r) "" = True
match (Option r) s = match r s

One-or-More Metacharacter

match (Plus r) "" = match r ""
match (Plus r) s = match_any_split r (Star r) s

Character Class Expressions

Most of the work of the work for character class expressions is handled by the parser, which gives us a new representation for Ch regular expressions, using the form Ch Bool [Char].

How do our equations for matching a Char need to be updated? Here are the originals.

match (Ch a) "" = False
match (Ch a) (c : more_chars) = a == c && more_chars == []

Now we need to change the patterns so that we have a first field that represents whether the character class is negated or not, and a second field which is a list of characters.

match (Ch _ _) "" = False
match (Ch True chars) (c : more_chars) = c in chars && more_chars == []
match (Ch False chars) (c : more_chars) = !(c in chars) && more_chars == []
Updated Sat Sept. 01 2018, 18:24 by cameron.