public class E4GLLexer extends antlr.CharScanner implements E4GLParserTokenTypes, antlr.TokenStream
Most HTML constructs are NOT directly matched. Instead, this only
matches those constructs needed to unambiguously parse embedded 4GL
in an HTML page and translate that input into a valid 4GL program. No
4GL constructs are matched here as the parser is not aware of any 4GL
syntax. Since the parser is only a filter (preprocessor), it only needs
to match on a small subset of HTML syntax such that it can make its
state transitions and so forth. Generally, there is a top-level rule
or artificial token type for every critical parser decision. Many of
these are single character matches but some more complicated
multi-character constructs are matched as needed. Every possible ASCII
and extended ASCII character can be matched by itself. For any character
that does not match a unique top-level rule or artificial token type, the
character can be returned as part of mWS(boolean) or mUNKNOWN(boolean).
There are some places in the parser that match on WS, but the content
(token text) is not really important. Anything that is matched as UNKNOWN
is simply passed through by the parser grammar rules, though on output the
parser may do some transformations as needed to represent 4GL source code
(e.g. unsafe character escaping).
Whitespace processing takes into account the platform-specific newline string and maintains the newline counter as appropriate. In the case of a platform's newline of "\r\n", the '\r' character immediately preceding a '\n' character will be dropped. This allows the parser to match on a simple '\n' in that case and on output the full platform-specific newline will be inserted as needed. This also allows the '\r' to be dropped from output if run on a system where the newline string is "\n".
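The newline handling described above can be sketched in isolation. This is a minimal illustration, not the lexer's actual code: the class `NewlineSketch` and its methods are hypothetical names, and the sketch assumes the input is held in a plain `String` rather than read through ANTLR's character scanner. It covers the "\r\n" case only (not the CR-only mode controlled by `newlineCR`).

```java
/** Hypothetical helper illustrating the CR-dropping behaviour; not part of E4GLLexer. */
final class NewlineSketch {
    /** Normalized text (bare '\n' newlines) plus the number of newlines seen. */
    static final class Result {
        final String text;
        final int newlines;

        Result(String text, int newlines) {
            this.text = text;
            this.newlines = newlines;
        }
    }

    /** Drops every '\r' that immediately precedes a '\n' and counts the newlines. */
    static Result normalize(String input) {
        StringBuilder out = new StringBuilder(input.length());
        int newlines = 0;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '\r' && i + 1 < input.length() && input.charAt(i + 1) == '\n') {
                continue; // the '\n' that follows carries the newline
            }
            if (c == '\n') {
                newlines++;
            }
            out.append(c);
        }
        return new Result(out.toString(), newlines);
    }

    /** Re-expands each '\n' to the given platform newline string on output. */
    static String emit(String normalized, String platformNewline) {
        return normalized.replace("\n", platformNewline);
    }
}
```

Normalizing first means downstream matching only ever has to look for '\n'; the platform-specific form is restored only when text is written out.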
Symbol processing is very simplistic. It leverages the built-in literals
testing (case-insensitively) of ANTLR. The key factor is that any text
that must be converted into an artificial token type must first match via
some other top-level lexer rule (see mSYMBOL(boolean)). The end of the
nextToken() rule then does a case-insensitive literal text lookup
against the list of hard-coded string literals that are associated with
artificial token types. If a match is found, the token type of the
SYMBOL is changed to the corresponding artificial type.
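The retyping step can be pictured as a map lookup on the lower-cased token text. The following is a sketch under stated assumptions, not the generated lexer code: the class `LiteralLookupSketch`, the numeric token type values, and the literal strings "script" and "meta" are all illustrative. The real values come from E4GLParserTokenTypes and the literals table that ANTLR populates.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical stand-in for the literal lookup done at the end of nextToken(). */
final class LiteralLookupSketch {
    // Illustrative token type numbers; the real ones live in E4GLParserTokenTypes.
    static final int SYMBOL = 1;
    static final int SCRIPT = 2;
    static final int META = 3;

    // Assumed literal strings, stored in lower case for case-insensitive lookup.
    private static final Map<String, Integer> LITERALS = new HashMap<>();
    static {
        LITERALS.put("script", SCRIPT);
        LITERALS.put("meta", META);
    }

    /** Returns the artificial token type for a SYMBOL whose text is a known literal. */
    static int retype(int matchedType, String text) {
        if (matchedType != SYMBOL) {
            return matchedType; // only SYMBOL tokens are candidates for retyping
        }
        Integer literalType = LITERALS.get(text.toLowerCase(Locale.ROOT));
        return literalType != null ? literalType : matchedType;
    }
}
```

Because the lookup happens only after a top-level rule has already matched, text that never matches as a SYMBOL can never be promoted to an artificial type.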
Modifier and Type | Field and Description |
---|---|
static antlr.collections.impl.BitSet |
_tokenSet_0 |
static antlr.collections.impl.BitSet |
_tokenSet_1 |
static antlr.collections.impl.BitSet |
_tokenSet_2 |
static antlr.collections.impl.BitSet |
_tokenSet_3 |
static antlr.collections.impl.BitSet |
_tokenSet_4 |
private boolean |
dropCR
Flag to force dropping CR characters from newlines.
|
private boolean |
newlineCR
Flag to force maintaining newline based solely on CR characters.
|
Fields inherited from class antlr.CharScanner: _returnToken, caseSensitive, caseSensitiveLiterals, commitToPath, EOF_CHAR, hashString, inputState, literals, saveConsumedInput, tabsize, text, tokenObjectClass, traceDepth
Fields inherited from interface E4GLParserTokenTypes: BACK_TICK, CLOSE_COMMENT, CLOSE_CURLY_EQ, CLOSE_PCT, CLOSE_QUESTION, CLOSE_TAG, CLOSE_TAG_NO_CONTENT, COLON, CONTENT, DIGIT, DOT, DQUOTE, ENCODED_CHAR, EOF, EQUALS, EXCLAIM, GT, HEX_DIGIT, HTML_DSTRING, HTML_SSTRING, HTTP_EQUIV, HYPHEN, JUNK, LANGUAGE, LEFT_CURLY, LETTER, LT, META, NAME, NULL_TREE_LOOKAHEAD, OPEN_COMMENT, OPEN_CURLY_EQ, OPEN_END_TAG, OPEN_PCT, OPEN_PCT_EQ, OPEN_QUESTION, OPEN_START_TAG, OTHER, PERCENT, QUESTION, RIGHT_CURLY, SCRIPT, SERVER, SLASH, SQUOTE, SYMBOL, UNDERSCORE, UNKNOWN, WS, WS4GL, WSE, WSMETA, WSPEED, WSS
Constructor and Description |
---|
E4GLLexer(antlr.InputBuffer ib) |
E4GLLexer(java.io.InputStream in) |
E4GLLexer(antlr.LexerSharedInputState state) |
E4GLLexer(java.io.Reader in) |
Modifier and Type | Method and Description |
---|---|
static void |
main(java.lang.String[] args)
Command line driver to allow lexer testing.
|
void |
mBACK_TICK(boolean _createToken)
Matches a single '`' (back-tick) character.
|
void |
mCLOSE_COMMENT(boolean _createToken)
Matches the HTML close comment "-->".
|
void |
mCLOSE_CURLY_EQ(boolean _createToken)
Match '=}' used to close one form of expression escape.
|
void |
mCLOSE_PCT(boolean _createToken)
Match '%>' used to close the escapes opened by OPEN_PCT and OPEN_PCT_EQ.
|
void |
mCLOSE_QUESTION(boolean _createToken)
Match the delimiter used to close the statement escape opened by '<?'.
|
void |
mCLOSE_TAG_NO_CONTENT(boolean _createToken)
Matches a '/' followed by a '>' normally used to close an HTML element
that has no content.
|
void |
mCLOSE_TAG(boolean _createToken)
Matches a single '>' character normally used to close an HTML element.
|
protected void |
mCOLON(boolean _createToken)
Matches a single ':' character.
|
protected void |
mDIGIT(boolean _createToken)
Matches any numeric digit: 0 - 9.
|
protected void |
mDOT(boolean _createToken)
Matches a single '.' character.
|
void |
mDQUOTE(boolean _createToken)
Matches the double quote character.
|
void |
mENCODED_CHAR(boolean _createToken)
Match '%' followed by 2 hexadecimal digits.
|
void |
mEQUALS(boolean _createToken)
Matches a single '=' character.
|
protected void |
mEXCLAIM(boolean _createToken)
Matches a single '!' character.
|
protected void |
mGT(boolean _createToken)
Matches a single '>' character.
|
protected void |
mHEX_DIGIT(boolean _createToken)
Matches any hexadecimal digit: 0 - 9 and a - f.
|
protected void |
mHTML_DSTRING(boolean _createToken)
String delimited by 2 double-quote characters.
|
protected void |
mHTML_SSTRING(boolean _createToken)
String delimited by 2 single-quote characters.
|
protected void |
mHYPHEN(boolean _createToken)
Matches a single '-' character.
|
protected void |
mJUNK(boolean _createToken)
Matches non-visible ASCII codes that should be silently ignored.
|
private static long[] |
mk_tokenSet_0() |
private static long[] |
mk_tokenSet_1() |
private static long[] |
mk_tokenSet_2() |
private static long[] |
mk_tokenSet_3() |
private static long[] |
mk_tokenSet_4() |
protected void |
mLEFT_CURLY(boolean _createToken)
Matches a single '{' character.
|
protected void |
mLETTER(boolean _createToken)
Matches any alphabetic character: a - z.
|
protected void |
mLT(boolean _createToken)
Matches a single '<' character.
|
void |
mOPEN_COMMENT(boolean _createToken)
Matches the HTML open comment "<!--".
|
void |
mOPEN_CURLY_EQ(boolean _createToken)
Match '{=' used to open one form of expression escape.
|
void |
mOPEN_END_TAG(boolean _createToken)
Matches a single '<' character followed by a '/', normally
used to start a closing HTML element.
|
void |
mOPEN_PCT_EQ(boolean _createToken)
Match '<%=' used to open one form of expression escape.
|
void |
mOPEN_PCT(boolean _createToken)
Match '<%' used to open one form of statement escape.
|
void |
mOPEN_QUESTION(boolean _createToken)
Match '<?' used to open one form of statement escape.
|
void |
mOPEN_START_TAG(boolean _createToken)
Matches a single '<' character, normally used to start an opening HTML
element.
|
protected void |
mOTHER(boolean _createToken)
Matches all other unknown, non-whitespace characters, including the
extended ASCII range.
|
protected void |
mPERCENT(boolean _createToken)
Matches a single '%' character.
|
protected void |
mQUESTION(boolean _createToken)
Matches a single '?' character.
|
protected void |
mRIGHT_CURLY(boolean _createToken)
Matches a single '}' character.
|
protected void |
mSLASH(boolean _createToken)
Matches a single '/' character.
|
void |
mSQUOTE(boolean _createToken)
Matches a single quote character.
|
void |
mSYMBOL(boolean _createToken)
Match any valid HTML name or id.
|
protected void |
mUNDERSCORE(boolean _createToken)
Matches a single '_' character.
|
void |
mUNKNOWN(boolean _createToken)
Match everything else (this is the inverse of all other top-level lexer
rules) as unknown text.
|
void |
mWS(boolean _createToken)
Matches any amount of whitespace in a program.
|
antlr.Token |
nextToken() |
Methods inherited from class antlr.CharScanner: append, append, commit, consume, consumeUntil, consumeUntil, getCaseSensitive, getCaseSensitiveLiterals, getColumn, getCommitToPath, getFilename, getInputBuffer, getInputState, getLine, getTabSize, getText, getTokenObject, LA, makeToken, mark, match, match, match, matchNot, matchRange, newline, panic, panic, reportError, reportError, reportWarning, resetText, rewind, setCaseSensitive, setColumn, setCommitToPath, setFilename, setInputState, setLine, setTabSize, setText, setTokenObjectClass, tab, testLiteralsTable, testLiteralsTable, toLower, traceIn, traceIndent, traceOut, uponEOF
private boolean dropCR
private boolean newlineCR
public static final antlr.collections.impl.BitSet _tokenSet_0
public static final antlr.collections.impl.BitSet _tokenSet_1
public static final antlr.collections.impl.BitSet _tokenSet_2
public static final antlr.collections.impl.BitSet _tokenSet_3
public static final antlr.collections.impl.BitSet _tokenSet_4
public E4GLLexer(java.io.InputStream in)
public E4GLLexer(java.io.Reader in)
public E4GLLexer(antlr.InputBuffer ib)
public E4GLLexer(antlr.LexerSharedInputState state)
public static void main(java.lang.String[] args)
args - List of command line arguments.
public antlr.Token nextToken() throws antlr.TokenStreamException
Specified by: nextToken in interface antlr.TokenStream
antlr.TokenStreamException
public final void mWS(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
Spaces, tabs, carriage returns and line feeds (newlines) are all matched.
Non-visible ASCII codes decimal 127 and below are matched here as well.
This is a top-level lexer rule, which means that there is an associated
WS token.
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
protected final void mJUNK(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
public final void mDQUOTE(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
public final void mSQUOTE(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
public final void mSYMBOL(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
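mSYMBOL is summarized as matching any valid HTML name or id, and the helper rules that follow it (LETTER, DIGIT, HYPHEN, UNDERSCORE, COLON, DOT) line up with the HTML 4 definition of ID and NAME tokens: a letter followed by any number of letters, digits, hyphens, underscores, colons and periods. A minimal sketch of that shape is shown below; `HtmlNameSketch` is a hypothetical class, and whether the generated rule follows the HTML 4 definition exactly is an assumption.

```java
/** Hypothetical matcher for an HTML 4 style name; not the generated mSYMBOL code. */
final class HtmlNameSketch {
    static boolean isLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    /** Returns the length of the name starting at pos, or 0 if no name starts there. */
    static int matchName(String s, int pos) {
        if (pos < 0 || pos >= s.length() || !isLetter(s.charAt(pos))) {
            return 0; // a name must begin with a letter
        }
        int i = pos + 1;
        while (i < s.length()) {
            char c = s.charAt(i);
            boolean nameChar = isLetter(c) || isDigit(c)
                    || c == '-' || c == '_' || c == ':' || c == '.';
            if (!nameChar) {
                break;
            }
            i++;
        }
        return i - pos;
    }
}
```

For example, `matchName("http-equiv=", 0)` consumes the ten characters of `http-equiv` and stops at the '='.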
protected final void mLETTER(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
Matches any alphabetic character: a - z.
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
protected final void mDIGIT(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
Matches any numeric digit: 0 - 9.
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
protected final void mHYPHEN(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
protected final void mUNDERSCORE(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
protected final void mCOLON(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
protected final void mDOT(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
public final void mENCODED_CHAR(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
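mENCODED_CHAR is summarized as matching a '%' followed by 2 hexadecimal digits. A self-contained sketch of that check follows. `EncodedCharSketch` is a hypothetical class; accepting upper-case A - F is an assumption made here for convenience, since the HEX_DIGIT rule is documented only as 0 - 9 and a - f, and the `decode` helper is an illustrative extra rather than something the lexer is known to do.

```java
/** Hypothetical matcher for a percent-encoded character such as "%20". */
final class EncodedCharSketch {
    static boolean isHexDigit(char c) {
        // Upper-case digits are accepted here as an assumption.
        return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
    }

    /** True if s has '%' at pos followed by two hexadecimal digits. */
    static boolean matchesAt(String s, int pos) {
        return pos >= 0
                && pos + 2 < s.length()
                && s.charAt(pos) == '%'
                && isHexDigit(s.charAt(pos + 1))
                && isHexDigit(s.charAt(pos + 2));
    }

    /** Decodes the encoded character at pos; callers should check matchesAt first. */
    static char decode(String s, int pos) {
        if (!matchesAt(s, pos)) {
            throw new IllegalArgumentException("no encoded character at position " + pos);
        }
        return (char) Integer.parseInt(s.substring(pos + 1, pos + 3), 16);
    }
}
```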
protected final void mPERCENT(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
protected final void mHEX_DIGIT(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
Matches any hexadecimal digit: 0 - 9 and a - f.
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
public final void mOPEN_PCT_EQ(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
protected final void mLT(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
public final void mEQUALS(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
public final void mOPEN_PCT(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
public final void mCLOSE_PCT(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
protected final void mGT(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
public final void mOPEN_QUESTION(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
protected final void mQUESTION(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
public final void mCLOSE_QUESTION(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
public final void mOPEN_END_TAG(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
public final void mOPEN_CURLY_EQ(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
protected final void mLEFT_CURLY(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
public final void mCLOSE_CURLY_EQ(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
protected final void mRIGHT_CURLY(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
public final void mUNKNOWN(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
protected final void mSLASH(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
protected final void mEXCLAIM(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
protected final void mOTHER(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
public final void mBACK_TICK(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
public final void mOPEN_COMMENT(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
public final void mOPEN_START_TAG(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
public final void mCLOSE_COMMENT(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
public final void mCLOSE_TAG_NO_CONTENT(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
public final void mCLOSE_TAG(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
protected final void mHTML_DSTRING(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
protected final void mHTML_SSTRING(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
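mHTML_DSTRING and mHTML_SSTRING are summarized as strings delimited by 2 double-quote or 2 single-quote characters respectively. The sketch below shows one way such a match could work for either quote character. `QuotedStringSketch` is a hypothetical class, and the sketch assumes there is no escape mechanism inside the string (plain HTML attribute values have none); the generated rules may treat embedded content differently.

```java
/** Hypothetical matcher for a quote-delimited HTML string; not the generated code. */
final class QuotedStringSketch {
    /**
     * Returns the index just past the closing quote, or -1 if no string
     * starts at pos or the string is never closed.
     */
    static int matchQuoted(String s, int pos, char quote) {
        if (pos < 0 || pos >= s.length() || s.charAt(pos) != quote) {
            return -1; // must start on the opening quote
        }
        int close = s.indexOf(quote, pos + 1);
        return close < 0 ? -1 : close + 1;
    }
}
```

Note that the other quote character is ordinary content inside the string, so `'a"b'` is a complete single-quoted string.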
private static final long[] mk_tokenSet_0()
private static final long[] mk_tokenSet_1()
private static final long[] mk_tokenSet_2()
private static final long[] mk_tokenSet_3()
private static final long[] mk_tokenSet_4()