lexer

Function summary
collect-char-class	lexer
end-of-string-p	lexer
fail	lexer
get-number	lexer &key (radix 10) max-length no-whitespace-p
get-quantifier	lexer
get-token	lexer
looking-at-p	lexer chr
make-char-from-code	number error-pos
make-lexer	string
map-char-to-special-char-class	chr
maybe-parse-flags	lexer
next-char	lexer
next-char-non-extended	lexer
parse-register-name-aux	lexer
start-of-subexpr-p	lexer
try-number	lexer &key (radix 10) max-length no-whitespace-p
unescape-char	lexer
unget-token	lexer

map-char-to-special-char-class

chr

[Function]

Maps escaped characters like "\d" to the tokens which represent their associated character classes.
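
As an illustration of the kind of mapping described here, the following sketch dispatches on the character that follows the backslash. The function name and the exact token keywords are assumptions made for this example and may not match what the library returns.

```lisp
;; Hypothetical sketch: map the character after a backslash to a keyword
;; token. The function name and the keywords are assumptions.
(defun sketch-special-char-class (chr)
  "Return a token for the character class named by CHR, or NIL."
  (case chr
    (#\d :digit-class)
    (#\D :non-digit-class)
    (#\w :word-char-class)
    (#\W :non-word-char-class)
    (#\s :whitespace-char-class)
    (#\S :non-whitespace-char-class)
    (t nil)))
```

Returning NIL for every other character lets a caller fall back to ordinary escape handling.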

make-lexer

string

[Function]

end-of-string-p

lexer

[Function]

Tests whether we're at the end of the regex string.

looking-at-p

lexer chr

[Function]

Tests whether the next character the lexer would see is CHR. Does not respect extended mode.

next-char-non-extended

lexer

[Function]

Returns the next character which is to be examined and updates the POS slot. Does not respect extended mode.
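
The difference between peeking (LOOKING-AT-P) and consuming (NEXT-CHAR-NON-EXTENDED) can be sketched with a toy lexer. The structure, its slot names, and all TOY- functions below are hypothetical stand-ins, not the library's definitions; returning NIL at the end of the string is likewise an assumption.

```lisp
;; Hypothetical toy lexer: a string plus a current position.
(defstruct (toy-lexer (:constructor make-toy-lexer (str)))
  (str "" :type string)
  (pos 0 :type fixnum))

(defun toy-end-of-string-p (lexer)
  "True when POS has reached the end of the regex string."
  (>= (toy-lexer-pos lexer) (length (toy-lexer-str lexer))))

(defun toy-looking-at-p (lexer chr)
  "Peek: true if the next character is CHR. POS is left unchanged."
  (and (not (toy-end-of-string-p lexer))
       (char= (char (toy-lexer-str lexer) (toy-lexer-pos lexer)) chr)))

(defun toy-next-char-non-extended (lexer)
  "Return the next character and advance POS, or NIL at end of string."
  (unless (toy-end-of-string-p lexer)
    (prog1 (char (toy-lexer-str lexer) (toy-lexer-pos lexer))
      (incf (toy-lexer-pos lexer)))))
```

With (make-toy-lexer "ab"), a call to TOY-LOOKING-AT-P leaves the position at 0, while each call to TOY-NEXT-CHAR-NON-EXTENDED moves it forward by one.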

next-char

lexer

[Function]

Returns the next character which is to be examined and updates the POS slot. Respects extended mode, i.e. whitespace, comments, and also nested comments are skipped if applicable.
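
A simplified sketch of what respecting extended mode can look like: before a character is returned, runs of whitespace and #-to-end-of-line comments are skipped. It works on a plain string and position rather than a lexer object, all names are hypothetical, and nested comments are deliberately left out.

```lisp
;; Hypothetical sketch of extended-mode skipping on a plain string.
(defun sketch-whitespace-p (chr)
  "True if CHR is one of the whitespace characters this sketch skips."
  (member chr '(#\Space #\Tab #\Newline #\Return)))

(defun sketch-skip-extended (string pos)
  "Return the first position at or after POS that is neither whitespace
nor part of a comment running from a hash sign to the end of the line."
  (let ((len (length string)))
    (loop
      (cond ((>= pos len) (return pos))
            ((sketch-whitespace-p (char string pos)) (incf pos))
            ((char= (char string pos) #\#)
             ;; Jump to the newline (or to the end of the string).
             (setf pos (or (position #\Newline string :start pos) len)))
            (t (return pos))))))

(defun sketch-next-char (string pos extended-p)
  "Return the next significant character and the position after it.
At the end of the string, return NIL and the final position."
  (when extended-p
    (setf pos (sketch-skip-extended string pos)))
  (if (< pos (length string))
      (values (char string pos) (1+ pos))
      (values nil pos)))
```

With EXTENDED-P set to NIL the sketch degenerates to the non-extended behaviour and returns whatever character sits at POS, including whitespace.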

fail

lexer

[Function]

Moves (LEXER-POS LEXER) back to the last position stored in (LEXER-LAST-POS LEXER) and pops the LAST-POS stack.
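
The backtracking idea can be sketched with a position and a stack of saved positions. The structure and the MARK helper below are hypothetical; only the POS and LAST-POS slot names are borrowed from the description above.

```lisp
;; Hypothetical sketch of a lexer that remembers earlier positions.
(defstruct (stack-lexer (:constructor make-stack-lexer (str)))
  (str "" :type string)
  (pos 0 :type fixnum)
  (last-pos '() :type list))

(defun stack-lexer-mark (lexer)
  "Push the current position onto the LAST-POS stack (assumed helper)."
  (push (stack-lexer-pos lexer) (stack-lexer-last-pos lexer)))

(defun stack-lexer-fail (lexer)
  "Restore POS from the top of the LAST-POS stack and pop that entry.
This sketch assumes the stack is not empty."
  (setf (stack-lexer-pos lexer) (pop (stack-lexer-last-pos lexer))))
```

A caller would mark before attempting to read a token and call the FAIL function if the attempt turns out not to match, so the next attempt starts from the same place.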

get-number

lexer &key (radix 10) max-length no-whitespace-p

[Function]

Reads and consumes the number the lexer is currently looking at and returns it. Returns NIL if no number could be identified. RADIX is used as in PARSE-INTEGER. If MAX-LENGTH is not NIL, we'll read at most the next MAX-LENGTH characters. If NO-WHITESPACE-P is not NIL, we don't tolerate whitespace in front of the number.

try-number

lexer &key (radix 10) max-length no-whitespace-p

[Function]

Like GET-NUMBER but won't consume anything if no number is seen.
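
The "consume nothing on failure" contract can be sketched on a plain string and position. The function below is a hypothetical stand-in that uses only standard Common Lisp; it never accepts leading whitespace, which roughly corresponds to calling with NO-WHITESPACE-P set.

```lisp
;; Hypothetical sketch: read an integer at POS without side effects.
(defun sketch-try-number (string pos &key (radix 10) max-length)
  "Return the integer starting at POS and the position after it,
or NIL and the unchanged POS if no digit is found there."
  (let* ((limit (if max-length
                    (min (length string) (+ pos max-length))
                    (length string)))
         ;; END is the first position that does not hold a digit in RADIX.
         (end (or (position-if-not (lambda (chr) (digit-char-p chr radix))
                                   string :start pos :end limit)
                  limit)))
    (if (> end pos)
        (values (parse-integer string :start pos :end end :radix radix) end)
        (values nil pos))))
```

Because the new position is returned as a second value instead of being stored, a failed attempt leaves the caller's position untouched by construction.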

make-char-from-code

number error-pos

[Function]

Creates a character from the char-code NUMBER. NUMBER can be NIL, which is interpreted as 0. ERROR-POS is the position where the corresponding number started within the regex string.

unescape-char

lexer

[Function]

Converts the character(s) following a backslash into a token which is returned. This function is to be called when the backslash has already been consumed. Special character classes like \W are handled elsewhere.

collect-char-class

lexer

[Function]

Reads and consumes characters from the regex string until a right bracket is seen. Assembles them into a list (which is returned) of characters, character ranges like (:RANGE #\A #\E) for A-E, and tokens representing special character classes.
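
A much-reduced sketch of this assembly step is shown below. It handles only literal characters and simple ranges. Escapes, negation, and special character classes are omitted, and the function name is hypothetical.

```lisp
;; Hypothetical sketch: collect the items of a character class.
;; POS is assumed to point just after the opening bracket.
(defun sketch-collect-char-class (string pos)
  "Return the list of items up to the closing bracket and the position
after that bracket. Signals an error if the bracket is missing."
  (let ((items '())
        (len (length string)))
    (loop
      (when (>= pos len)
        (error "Missing right bracket in ~S" string))
      (let ((chr (char string pos)))
        (cond ((char= chr #\])
               (return (values (nreverse items) (1+ pos))))
              ;; "x-y" forms a range unless the hyphen is the last
              ;; character before the closing bracket.
              ((and (< (+ pos 2) len)
                    (char= (char string (1+ pos)) #\-)
                    (char/= (char string (+ pos 2)) #\]))
               (push (list :range chr (char string (+ pos 2))) items)
               (incf pos 3))
              (t (push chr items)
                 (incf pos)))))))
```

A hyphen directly before the closing bracket is kept as a literal character in this sketch, so the input a-] yields the two characters a and -.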

maybe-parse-flags

lexer

[Function]

Reads a sequence of modifiers (including #\- to reverse their meaning) and returns a corresponding list of "flag" tokens. The "x" modifier is treated specially in that it dynamically modifies the behaviour of the lexer itself via the special variable *EXTENDED-MODE-P*.
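
The effect of #\- on the modifiers that follow it can be sketched as below. The function name and the (character . enabled-p) result format are assumptions made for this example; the real function returns flag tokens, and the special treatment of the "x" modifier is not modelled here.

```lisp
;; Hypothetical sketch: read modifier letters starting at POS.
(defun sketch-parse-flags (string pos)
  "Return a list of (CHAR . ENABLED-P) pairs and the position of the
first character that is not a modifier."
  (let ((flags '())
        (on t))
    (loop while (< pos (length string))
          do (let ((chr (char string pos)))
               (case chr
                 ;; A hyphen reverses the meaning of what follows.
                 (#\- (setf on (not on)))
                 ((#\i #\m #\s #\x) (push (cons chr on) flags))
                 (t (return)))
               (incf pos)))
    (values (nreverse flags) pos)))
```

For the input i-sx) the sketch reports i as enabled and s and x as disabled, and stops at the closing parenthesis.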

get-quantifier

lexer

[Function]

Returns a list of two values (min max) if what the lexer is looking at can be interpreted as a quantifier. Otherwise returns NIL and resets the lexer to its old position.
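
The following sketch shows one way a (min max) pair could be derived from the usual quantifier syntax. It operates on a plain string and position, uses NIL as max for "unbounded" (an assumption), and signals failure by returning NIL together with the original position. All names are hypothetical.

```lisp
;; Hypothetical sketch of quantifier recognition on a plain string.
(defun sketch-read-digits (string pos)
  "Return the non-negative integer at POS and the position after it,
or NIL and POS if there is no digit."
  (let ((end (or (position-if-not #'digit-char-p string :start pos)
                 (length string))))
    (if (> end pos)
        (values (parse-integer string :start pos :end end) end)
        (values nil pos))))

(defun sketch-get-quantifier (string pos)
  "Return (MIN MAX) and the position after the quantifier,
or NIL and the unchanged POS if no quantifier starts at POS."
  (if (>= pos (length string))
      (values nil pos)
      (case (char string pos)
        (#\* (values (list 0 nil) (1+ pos)))
        (#\+ (values (list 1 nil) (1+ pos)))
        (#\? (values (list 0 1) (1+ pos)))
        (#\{ (multiple-value-bind (min after-min)
                 (sketch-read-digits string (1+ pos))
               (flet ((at (i chr)
                        (and (< i (length string))
                             (char= (char string i) chr))))
                 (cond ((null min) (values nil pos))
                       ;; Exact count: {n}
                       ((at after-min #\})
                        (values (list min min) (1+ after-min)))
                       ((not (at after-min #\,)) (values nil pos))
                       ;; Open or closed interval: {n,} or {n,m}
                       (t (multiple-value-bind (max after-max)
                              (sketch-read-digits string (1+ after-min))
                            (if (at after-max #\})
                                (values (list min max) (1+ after-max))
                                (values nil pos))))))))
        (t (values nil pos)))))
```

Returning the original position on failure mirrors the "resets the lexer to its old position" behaviour: a brace that does not form a valid quantifier can then be read again as an ordinary character.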

parse-register-name-aux

lexer

[Function]

Reads and returns the name in a named register group. It is assumed that the starting #\< character has already been read. The closing #\> will also be consumed.
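
A minimal sketch of this step, assuming the name is simply everything up to the closing angle bracket. The function is hypothetical and performs no validation of the characters in the name.

```lisp
;; Hypothetical sketch: read a register name from a plain string.
;; POS is assumed to point just after the opening angle bracket.
(defun sketch-read-register-name (string pos)
  "Return the name and the position after the closing angle bracket.
Signals an error if the closing bracket is missing."
  (let ((end (position #\> string :start pos)))
    (unless end
      (error "Missing closing angle bracket in ~S" string))
    (values (subseq string pos end) (1+ end))))
```

The second return value points past the closing bracket, matching the statement that the bracket is consumed along with the name.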

get-token

lexer

[Function]

Returns and consumes the next token from the regex string (or NIL).

unget-token

lexer

[Function]

Moves the lexer back to the last position stored in the LAST-POS stack.

start-of-subexpr-p

lexer

[Function]

Tests whether the next token can start a valid sub-expression, i.e. a stand-alone regex.