Function summary | |
---|---|
collect-char-class | lexer |
end-of-string-p | lexer |
fail | lexer |
get-number | lexer &key (radix 10) max-length no-whitespace-p |
get-quantifier | lexer |
get-token | lexer |
looking-at-p | lexer chr |
make-char-from-code | number error-pos |
make-lexer | string |
map-char-to-special-char-class | chr |
maybe-parse-flags | lexer |
next-char | lexer |
next-char-non-extended | lexer |
parse-register-name-aux | lexer |
start-of-subexpr-p | lexer |
try-number | lexer &key (radix 10) max-length no-whitespace-p |
unescape-char | lexer |
unget-token | lexer |
map-char-to-special-char-class: Maps escaped characters like "\d" to the tokens which represent their associated character classes.
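A minimal sketch of such a mapping, for illustration only. The function name SPECIAL-CHAR-CLASS-SKETCH is hypothetical, and the keyword tokens are assumptions modelled on common regex class names rather than a confirmed list of the tokens the lexer returns.

```lisp
;; Illustrative only: the token names below are assumptions.
(defun special-char-class-sketch (chr)
  "Return a token for the character following a backslash, or NIL
if CHR does not name a special character class."
  (case chr
    (#\d :digit-class)
    (#\D :non-digit-class)
    (#\w :word-char-class)
    (#\W :non-word-char-class)
    (#\s :whitespace-char-class)
    (#\S :non-whitespace-char-class)))
```

CASE compares with EQL, so #\d and #\D are distinct keys, and any other character falls through to NIL.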
looking-at-p: Tests whether the next character the lexer would see is CHR. Does not respect extended mode.
next-char-non-extended: Returns the next character which is to be examined and updates the POS slot. Does not respect extended mode.
next-char: Returns the next character which is to be examined and updates the POS slot. Respects extended mode, i.e. whitespace, comments, and also nested comments are skipped if applicable.
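The difference between peeking, reading, and reading in extended mode can be sketched with a toy lexer. Everything here (TOY-LEXER and the TOY- functions) is hypothetical and not the library's structure. The extended-mode branch is simplified: it assumes Perl-style comments that run from # to the end of the line and does not handle embedded or nested comments.

```lisp
;; Hypothetical toy lexer: a string plus a read position.
(defstruct (toy-lexer (:constructor make-toy-lexer (str)))
  (str "" :type string)
  (pos 0 :type fixnum))

(defun toy-looking-at-p (lexer chr)
  "True if the next character is CHR. Does not advance POS."
  (let ((pos (toy-lexer-pos lexer))
        (str (toy-lexer-str lexer)))
    (and (< pos (length str))
         (char= (char str pos) chr))))

(defun toy-next-char-non-extended (lexer)
  "Return the next character and advance POS, or NIL at end of string."
  (let ((pos (toy-lexer-pos lexer))
        (str (toy-lexer-str lexer)))
    (when (< pos (length str))
      (incf (toy-lexer-pos lexer))
      (char str pos))))

(defun toy-next-char (lexer extended-p)
  "Like TOY-NEXT-CHAR-NON-EXTENDED, but if EXTENDED-P is true, first
skip whitespace and comments running from # to the end of the line."
  (let ((str (toy-lexer-str lexer)))
    (when extended-p
      (loop while (< (toy-lexer-pos lexer) (length str))
            do (let ((chr (char str (toy-lexer-pos lexer))))
                 (cond ((member chr '(#\Space #\Tab #\Newline))
                        (incf (toy-lexer-pos lexer)))
                       ((char= chr #\#)
                        ;; Jump to the newline, or to the end of the string.
                        (setf (toy-lexer-pos lexer)
                              (or (position #\Newline str
                                            :start (toy-lexer-pos lexer))
                                  (length str))))
                       (t (return))))))
    (toy-next-char-non-extended lexer)))
```

In this sketch the extended flag is passed as an argument to keep the example self-contained; the docstring for maybe-parse-flags indicates the real lexer consults a special variable instead.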
fail: Moves (LEXER-POS LEXER) back to the last position stored in (LEXER-LAST-POS LEXER) and pops the LAST-POS stack.
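The position stack described above can be sketched as follows. POS-LEXER, MARK-POSITION, and RESTORE-POSITION are hypothetical names; only the idea of pushing positions before a speculative read and popping them on failure is taken from the docstring.

```lisp
;; Hypothetical lexer with a stack of remembered positions.
(defstruct (pos-lexer (:constructor make-pos-lexer (str)))
  (str "" :type string)
  (pos 0 :type fixnum)
  (last-pos nil :type list))

(defun mark-position (lexer)
  "Remember the current position before a speculative read."
  (push (pos-lexer-pos lexer) (pos-lexer-last-pos lexer)))

(defun restore-position (lexer)
  "Move POS back to the most recently remembered position and pop it.
Always returns NIL so it can be used as the value of a failed read."
  (unless (pos-lexer-last-pos lexer)
    (error "No stored position to go back to."))
  (setf (pos-lexer-pos lexer) (pop (pos-lexer-last-pos lexer)))
  nil)
```

Because the positions form a stack, nested speculative reads unwind in reverse order: the innermost mark is restored first.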
get-number: Reads and consumes the number the lexer is currently looking at and returns it. Returns NIL if no number could be identified. RADIX is used as in PARSE-INTEGER. If MAX-LENGTH is not NIL, at most the next MAX-LENGTH characters are read. If NO-WHITESPACE-P is not NIL, whitespace in front of the number is not tolerated.
try-number: Like GET-NUMBER but won't consume anything if no number is seen.
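A rough sketch of how the three keyword arguments interact, written against a plain string and start index instead of a lexer object. READ-NUMBER-AT is a hypothetical name. The sketch relies on the standard PARSE-INTEGER with :JUNK-ALLOWED, which also accepts a leading sign, so negative results are rejected explicitly.

```lisp
(defun read-number-at (string start
                       &key (radix 10) max-length no-whitespace-p)
  "Return the non-negative integer found at START and the index after
it as two values, or NIL if no number could be read."
  ;; Refuse leading whitespace if the caller asked for that.
  (when (and no-whitespace-p
             (< start (length string))
             (member (char string start) '(#\Space #\Tab #\Newline)))
    (return-from read-number-at nil))
  ;; MAX-LENGTH limits how far PARSE-INTEGER is allowed to look.
  (let ((end (if max-length
                 (min (length string) (+ start max-length))
                 (length string))))
    (multiple-value-bind (number next-pos)
        (parse-integer string :start start :end end
                              :radix radix :junk-allowed t)
      (and number
           (>= number 0)
           (values number next-pos)))))
```

A caller that wants the "try" behaviour only needs to leave its own position untouched when the first value is NIL, since this function never mutates anything.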
make-char-from-code: Creates a character from char-code NUMBER. NUMBER can be NIL, which is interpreted as 0. ERROR-POS is the position where the corresponding number started within the regex string.
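A sketch of the conversion using only standard functions. CHAR-FROM-CODE-SKETCH is a hypothetical name, and the error message is invented for illustration; the library may signal a different condition.

```lisp
(defun char-from-code-sketch (number error-pos)
  "Return the character with code NUMBER, treating NIL as 0.
Signals an error mentioning ERROR-POS if there is no such character."
  (let ((code (or number 0)))
    ;; CODE-CHAR may return NIL, and codes at or above CHAR-CODE-LIMIT
    ;; are not valid arguments, so check the range first.
    (or (and (< code char-code-limit)
             (code-char code))
        (error "No character for code ~A (number starts at position ~A)."
               code error-pos))))
```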
unescape-char: Converts the character(s) following a backslash into a token which is returned. This function is to be called when the backslash has already been consumed. Special character classes like \W are handled elsewhere.
collect-char-class: Reads and consumes characters from regex string until a right bracket is seen. Assembles them into a list (which is returned) of characters, character ranges, like (:RANGE #\A #\E) for a-e, and tokens representing special character classes.
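The shape of the returned list can be illustrated with a simplified sketch. COLLECT-CHAR-CLASS-SKETCH is a hypothetical name, and the sketch deliberately ignores escapes, special character classes, and the rule that a right bracket directly after the opening bracket is a literal.

```lisp
(defun collect-char-class-sketch (string start)
  "Parse a character class body beginning at START, i.e. just after
the opening bracket. Returns the list of items and the index after
the closing bracket as two values."
  (let ((items '())
        (pos start)
        (len (length string)))
    (loop
      (when (>= pos len)
        (error "Missing right bracket in ~S." string))
      (let ((chr (char string pos)))
        (cond ((char= chr #\])
               (return (values (nreverse items) (1+ pos))))
              ;; A hyphen between two characters forms a range,
              ;; unless the hyphen is the last character of the class.
              ((and (< (+ pos 2) len)
                    (char= (char string (1+ pos)) #\-)
                    (char/= (char string (+ pos 2)) #\]))
               (push (list :range chr (char string (+ pos 2))) items)
               (incf pos 3))
              (t
               (push chr items)
               (incf pos)))))))
```

For the input "[a-ex-]" with START 1, the items are a range from a to e, the character x, and a literal hyphen.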
maybe-parse-flags: Reads a sequence of modifiers (including #\- to reverse their meaning) and returns a corresponding list of "flag" tokens. The "x" modifier is treated specially in that it dynamically modifies the behaviour of the lexer itself via the special variable *EXTENDED-MODE-P*.
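A sketch of modifier parsing under stated assumptions: the function name, the flag keywords, and the variable *TOY-EXTENDED-MODE-P* are all hypothetical, and the sketch assumes that a hyphen negates every modifier that follows it and that "x" produces no token of its own.

```lisp
(defvar *toy-extended-mode-p* nil
  "Hypothetical stand-in for the special variable named in the docstring.")

(defun parse-flags-sketch (string start)
  "Read modifier letters beginning at START. Returns the list of flag
tokens and the index of the first character that is not a modifier."
  (let ((tokens '())
        (negate-p nil)
        (pos start))
    (loop while (< pos (length string))
          do (case (char string pos)
               ;; Assumption: everything after a hyphen is negated.
               (#\- (setf negate-p t))
               (#\i (push (if negate-p :not-case-insensitive :case-insensitive)
                          tokens))
               (#\m (push (if negate-p :not-multi-line :multi-line)
                          tokens))
               (#\s (push (if negate-p :not-single-line :single-line)
                          tokens))
               ;; "x" changes lexer behaviour instead of yielding a token.
               (#\x (setf *toy-extended-mode-p* (not negate-p)))
               (t (return)))
             (incf pos))
    (values (nreverse tokens) pos)))
```

Binding the special variable with LET around a parse keeps the change to extended mode local to that dynamic extent.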
get-quantifier: Returns a list of two values (min max) if what the lexer is looking at can be interpreted as a quantifier. Otherwise returns NIL and resets the lexer to its old position.
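The (min max) convention can be sketched against a plain string. QUANTIFIER-AT and READ-DIGITS are hypothetical names. The sketch assumes the usual Perl-style forms *, +, ?, {n}, {n,} and {n,m}, and uses NIL as the maximum for "no upper bound".

```lisp
(defun read-digits (string pos)
  "Return the decimal number at POS and the index after it,
or NIL and POS if there is no digit at POS."
  (let ((end (or (position-if-not #'digit-char-p string :start pos)
                 (length string))))
    (if (> end pos)
        (values (parse-integer string :start pos :end end) end)
        (values nil pos))))

(defun quantifier-at (string pos)
  "Return a list (MIN MAX) and the index after the quantifier as two
values, or NIL if there is no quantifier at POS."
  (when (>= pos (length string))
    (return-from quantifier-at nil))
  (case (char string pos)
    (#\* (values (list 0 nil) (1+ pos)))
    (#\+ (values (list 1 nil) (1+ pos)))
    (#\? (values (list 0 1) (1+ pos)))
    (#\{
     (multiple-value-bind (min pos) (read-digits string (1+ pos))
       (cond ((null min) nil)
             ((>= pos (length string)) nil)
             ;; {n} means exactly n repetitions.
             ((char= (char string pos) #\})
              (values (list min min) (1+ pos)))
             ;; {n,} or {n,m}: MAX stays NIL if no digits follow the comma.
             ((char= (char string pos) #\,)
              (multiple-value-bind (max pos) (read-digits string (1+ pos))
                (and (< pos (length string))
                     (char= (char string pos) #\})
                     (values (list min max) (1+ pos)))))
             (t nil))))
    (t nil)))
```

Since the function never mutates its input, a NIL result leaves the caller's position untouched, which mirrors the "resets the lexer to its old position" behaviour without needing a position stack.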
parse-register-name-aux: Reads and returns the name in a named register group. It is assumed that the starting #\< character has already been read. The closing #\> will also be consumed.
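A minimal sketch of reading a register name, assuming the name is simply everything up to the closing angle bracket. READ-REGISTER-NAME-SKETCH is a hypothetical name, and the sketch performs no validation of the characters inside the name.

```lisp
(defun read-register-name-sketch (string start)
  "Return the register name beginning at START, i.e. just after the
opening angle bracket, and the index after the closing angle bracket."
  (let ((end (position #\> string :start start)))
    (unless end
      (error "Missing closing angle bracket in ~S." string))
    ;; Returning (1+ END) reflects that the closing bracket is consumed.
    (values (subseq string start end) (1+ end))))
```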
unget-token: Moves the lexer back to the last position stored in the LAST-POS stack.