public class HQLLexer extends antlr.CharScanner implements HQLParserTokenTypes, antlr.TokenStream
HQLParser
.
Each token has an integer token type by which the parser references and matches tokens; using integers keeps the parser fast. To make this work, the lexer must create a token object whenever it matches a top level rule. This token object includes the token type and the actual text found in the original character stream. The lexer's main job is to look ahead into the character stream, switch into the top level rule that matches some set of characters and, when this is complete, create a valid token object and return it to the parser. The lexer thus implements a token stream interface from the parser's perspective.
Please note that this is a generated file, produced by ANTLR 2.7.4 from the grammar specified in hql.g.
Symbol resolution is the process by which the proper token type is assigned to a token created from a match to the generic symbol rule. In the lexer, the only symbol resolution that can be done is to match with reserved keywords. Reserved keywords are given preference over other symbol types in any case where there is a conflict.
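The keyword preference described above can be sketched in plain Java. This is only an illustration, not the generated code: the class name `KeywordResolver`, the integer values and the three-entry keyword table are hypothetical. Only the names SYMBOL, SELECT, FROM and WHERE come from HQLParserTokenTypes, and the lower-casing assumes the case-insensitive symbol matching documented for mSYMBOL.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class KeywordResolver {
    // Hypothetical token type values; the real ones are generated into HQLParserTokenTypes.
    static final int SYMBOL = 1;
    static final int SELECT = 2;
    static final int FROM = 3;
    static final int WHERE = 4;

    // Map of keywords to token types (a small illustrative subset only).
    private static final Map<String, Integer> KEYWORDS = new HashMap<>();
    static {
        KEYWORDS.put("select", SELECT);
        KEYWORDS.put("from", FROM);
        KEYWORDS.put("where", WHERE);
    }

    /** Returns the keyword's token type if the text is a reserved keyword, else SYMBOL. */
    static int resolve(String symbolText) {
        Integer type = KEYWORDS.get(symbolText.toLowerCase(Locale.ROOT));
        return type != null ? type : SYMBOL;
    }

    public static void main(String[] args) {
        System.out.println(resolve("SELECT") == SELECT);   // true
        System.out.println(resolve("customer") == SYMBOL); // true
    }
}
```

Because the lookup runs only after a generic symbol has been matched, a name such as `selection` stays a SYMBOL even though it starts with a keyword.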
Other key design points:
nextToken: this method has a for-loop that uses up to 3 characters of lookahead to identify which of the top level token rules to call. To do this, each token rule's left-most match characters are "rolled up" and tested in this top level rule.
All token types are defined as integer constants in HQLParserTokenTypes, which is an interface that the parser, lexer and other related classes all implement. This allows all of these classes to directly refer to the token types and share this common set of definitions. All top-level lexer rules generate a token of the same name in the HQLParserTokenTypes interface.
There is a tokens { } section in the parser where artificial tokens are defined (tokens that are not backed by a lexer rule). These tokens are also added to the HQLParserTokenTypes interface and can thus be referenced directly by the lexer and parser.
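The first two design points can be illustrated with a small, self-contained sketch. It is not the ANTLR-generated code: `LookaheadDemo`, `TokenTypes`, `MiniLexer`, `la` and `nextTokenType` are hypothetical names, the integer values are invented, and only two characters of lookahead are used. The operator spellings ('>', '>=', '<', '<=', '=') are the ones listed in the method summary.

```java
public class LookaheadDemo {
    // Stand-in for the shared constants interface; the values are hypothetical.
    interface TokenTypes {
        int EOF = 1, GT = 2, GTE = 3, LT = 4, LTE = 5, EQUALS = 6;
    }

    // The lexer implements the interface, so it can name token types directly.
    static class MiniLexer implements TokenTypes {
        private final String input;
        private int pos = 0;

        MiniLexer(String input) {
            this.input = input;
        }

        // Lookahead: the i-th upcoming character (1-based), or '\0' at end of input.
        private char la(int i) {
            int idx = pos + i - 1;
            return idx < input.length() ? input.charAt(idx) : '\0';
        }

        // Picks a rule from the left-most characters, longest alternative first.
        int nextTokenType() {
            while (la(1) == ' ') {
                pos++; // skip whitespace
            }
            if (la(1) == '\0') return EOF;
            if (la(1) == '>' && la(2) == '=') { pos += 2; return GTE; }
            if (la(1) == '<' && la(2) == '=') { pos += 2; return LTE; }
            if (la(1) == '>') { pos++; return GT; }
            if (la(1) == '<') { pos++; return LT; }
            if (la(1) == '=') { pos++; return EQUALS; }
            throw new IllegalStateException("unexpected char: " + la(1));
        }
    }

    public static void main(String[] args) {
        MiniLexer lexer = new MiniLexer("> >= <=");
        for (int t = lexer.nextTokenType(); t != TokenTypes.EOF; t = lexer.nextTokenType()) {
            System.out.println(t);
        }
    }
}
```

Testing the two-character alternatives before the one-character ones is what lets '>=' win over '>' when both start with the same character.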
Modifier and Type | Field and Description |
---|---|
static antlr.collections.impl.BitSet | _tokenSet_0 |
static antlr.collections.impl.BitSet | _tokenSet_1 |
private static java.util.Map | keywords - Map of keywords to symbol tokens |
Fields inherited from class antlr.CharScanner:
_returnToken, caseSensitive, caseSensitiveLiterals, commitToPath, EOF_CHAR, hashString, inputState, literals, saveConsumedInput, tabsize, text, tokenObjectClass, traceDepth
Fields inherited from interface HQLParserTokenTypes:
ALIAS, AND, AS, BOOL_FALSE, BOOL_TRUE, CASE, CAST, COMMA, CONCAT, DEC_LITERAL, DIGIT, DIVIDE, DMO, DOT, ELSE, END, EOF, EQUALS, ESCAPE, FROM, FUNCTION, GT, GTE, IN, INDEX, IS, IS_NULL, LBRACKET, LETTER, LIKE, LONG_LITERAL, LPARENS, LT, LTE, MINUS, MULTIPLY, NOT, NOT_EQ, NOT_NULL, NULL, NULL_TREE_LOOKAHEAD, NUM_LITERAL, OR, PLUS, PROPERTY, RBRACKET, RPARENS, SELECT, STRING, SUBSCRIPT, SUBSELECT, SUBST, SYM_CHAR, SYMBOL, TERNARY, THEN, UN_MINUS, VALID_1ST_IDENT, VALID_SYM_CHAR, WHEN, WHERE, WS
Constructor and Description |
---|
HQLLexer(antlr.InputBuffer ib) |
HQLLexer(java.io.InputStream in) |
HQLLexer(antlr.LexerSharedInputState state) |
HQLLexer(java.io.Reader in) |
Modifier and Type | Method and Description |
---|---|
static void | main(java.lang.String[] args) - Provides a command line interface for an end user to drive and/or test the HQLLexer class. |
void | mCOMMA(boolean _createToken) - Matches the ',' character. |
void | mCONCAT(boolean _createToken) - Matches the '\|\|' concatenation operator. |
protected void | mDIGIT(boolean _createToken) - Matches any numeric digit: 0 - 9 |
void | mDIVIDE(boolean _createToken) - Matches the '/' character. |
void | mEQUALS(boolean _createToken) - Matches the "=" string. |
void | mGT(boolean _createToken) - Matches the '>' character. |
void | mGTE(boolean _createToken) - Matches the '>=' character sequence. |
private static long[] | mk_tokenSet_0() |
private static long[] | mk_tokenSet_1() |
void | mLBRACKET(boolean _createToken) - Matches the '[' character. |
protected void | mLETTER(boolean _createToken) - Matches any alphabetic character: a - z |
void | mLPARENS(boolean _createToken) - Matches the '(' character. |
void | mLT(boolean _createToken) - Matches the '<' character. |
void | mLTE(boolean _createToken) - Matches the '<=' character sequence. |
void | mMINUS(boolean _createToken) - Matches the '-' character. |
void | mMULTIPLY(boolean _createToken) - Matches the '*' character. |
void | mNOT_EQ(boolean _createToken) - Matches the inequality string. |
void | mNUM_LITERAL(boolean _createToken) - Matches all forms of valid integer literals and decimal literals. |
void | mPLUS(boolean _createToken) - Matches the '+' character. |
void | mRBRACKET(boolean _createToken) - Matches the ']' character. |
void | mRPARENS(boolean _createToken) - Matches the ')' character. |
void | mSTRING(boolean _createToken) - Matches an opening single quote, arbitrary contents and an ending single quote. |
void | mSUBST(boolean _createToken) - Matches the query substitution parameter placeholder character. |
protected void | mSYM_CHAR(boolean _createToken) - Matches all characters that can be made part of a valid user-defined symbol (except alphabetic and numeric characters). |
void | mSYMBOL(boolean _createToken) - Match a valid symbol name. |
protected void | mVALID_1ST_IDENT(boolean _createToken) - Matches any character that can appear in the 1st position of a symbol (matches any LETTER or SYM_CHAR). |
protected void | mVALID_SYM_CHAR(boolean _createToken) - Matches any single character that can appear in the 2nd or later position in a symbol (matches any LETTER, DIGIT or SYM_CHAR). |
void | mWS(boolean _createToken) - Matches any amount of whitespace in an expression and sets the token type to SKIP. |
antlr.Token | nextToken() |
Methods inherited from class antlr.CharScanner:
append, append, commit, consume, consumeUntil, consumeUntil, getCaseSensitive, getCaseSensitiveLiterals, getColumn, getCommitToPath, getFilename, getInputBuffer, getInputState, getLine, getTabSize, getText, getTokenObject, LA, makeToken, mark, match, match, match, matchNot, matchRange, newline, panic, panic, reportError, reportError, reportWarning, resetText, rewind, setCaseSensitive, setColumn, setCommitToPath, setFilename, setInputState, setLine, setTabSize, setText, setTokenObjectClass, tab, testLiteralsTable, testLiteralsTable, toLower, traceIn, traceIndent, traceOut, uponEOF
private static java.util.Map keywords
public static final antlr.collections.impl.BitSet _tokenSet_0
public static final antlr.collections.impl.BitSet _tokenSet_1
public HQLLexer(java.io.InputStream in)
public HQLLexer(java.io.Reader in)
public HQLLexer(antlr.InputBuffer ib)
public HQLLexer(antlr.LexerSharedInputState state)
public static void main(java.lang.String[] args)
Provides a command line interface for an end user to drive and/or test the HQLLexer class.
Syntax:
java HQLLexer <expression>
args - List of command line arguments.
public antlr.Token nextToken() throws antlr.TokenStreamException
nextToken in interface antlr.TokenStream
antlr.TokenStreamException
public final void mWS(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
Matches any amount of whitespace in an expression and sets the token type to SKIP. Spaces, tabs, carriage returns and line feeds (newlines) are all matched.
This is a top level lexer rule, which means that there is an associated WS token.
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
public final void mSYMBOL(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
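Match a valid symbol name. A symbol starts with a character accepted by VALID_1ST_IDENT and continues with characters accepted by VALID_SYM_CHAR; the non-alphanumeric symbol characters are _ and $.
Symbols are matched case-insensitively.
Once a symbol has been found, a keyword lookup occurs. If matched, the token's type is overridden from the default (SYMBOL) to the artificial token type associated with the keyword.
This is a top level lexer rule, which means that there is an associated SYMBOL token.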
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
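The character classes behind symbol matching can be sketched as plain predicates. This is an illustration under stated assumptions, not the generated rule bodies: the class `SymbolRules` and its method names are hypothetical, letters are assumed to be ASCII only, upper case is accepted because symbols are matched case-insensitively, and the overall shape (one VALID_1ST_IDENT character followed by any number of VALID_SYM_CHAR characters) is inferred from the helper rule descriptions.

```java
public class SymbolRules {
    // LETTER: a - z (upper case accepted too; ASCII-only is an assumption).
    static boolean isLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    // DIGIT: 0 - 9
    static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    // SYM_CHAR: $ and _
    static boolean isSymChar(char c) {
        return c == '$' || c == '_';
    }

    // VALID_1ST_IDENT: any LETTER or SYM_CHAR.
    static boolean isValidFirst(char c) {
        return isLetter(c) || isSymChar(c);
    }

    // VALID_SYM_CHAR: any LETTER, DIGIT or SYM_CHAR.
    static boolean isValidSymChar(char c) {
        return isLetter(c) || isDigit(c) || isSymChar(c);
    }

    // Assumed SYMBOL shape: one valid first character, then zero or more valid symbol characters.
    static boolean isSymbol(String text) {
        if (text.isEmpty() || !isValidFirst(text.charAt(0))) {
            return false;
        }
        for (int i = 1; i < text.length(); i++) {
            if (!isValidSymChar(text.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isSymbol("_total$1")); // true
        System.out.println(isSymbol("1abc"));     // false
    }
}
```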
protected final void mVALID_1ST_IDENT(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
Matches any character that can appear in the 1st position of a symbol (matches any LETTER or SYM_CHAR).
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
protected final void mVALID_SYM_CHAR(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
Matches any single character that can appear in the 2nd or later position in a symbol (matches any LETTER, DIGIT or SYM_CHAR). This is simply a helper rule to make references easier.
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
public final void mCONCAT(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
Matches the '||' concatenation operator. This is a top level rule, creating a CONCAT token type.
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
public final void mCOMMA(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
Matches the ',' character. This is a top level rule, creating a COMMA token type.
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
public final void mEQUALS(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
Matches the "=" string. This is a top level rule, creating an EQUALS token type.
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
public final void mNOT_EQ(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
Matches the inequality string. This is a top level rule, creating a NOT_EQ token type.
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
public final void mGT(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
Matches the '>' character. This is a top level rule, creating a GT token type.
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
public final void mLT(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
Matches the '<' character. This is a top level rule, creating an LT token type.
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
public final void mGTE(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
Matches the '>=' character sequence. This is a top level rule, creating a GTE token type.
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
public final void mLTE(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
Matches the '<=' character sequence. This is a top level rule, creating an LTE token type.
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
public final void mLBRACKET(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
Matches the '[' character. This is a top level rule, creating an LBRACKET token type.
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
public final void mRBRACKET(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
Matches the ']' character. This is a top level rule, creating an RBRACKET token type.
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
public final void mLPARENS(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
Matches the '(' character. This is a top level rule, creating an LPARENS token type.
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
public final void mRPARENS(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
Matches the ')' character. This is a top level rule, creating an RPARENS token type.
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
public final void mPLUS(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
Matches the '+' character. This is a top level rule, creating a PLUS token type.
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
public final void mMINUS(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
Matches the '-' character. This is a top level rule, creating a MINUS token type.
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
public final void mMULTIPLY(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
Matches the '*' character. This is a top level rule, creating a MULTIPLY token type.
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
public final void mDIVIDE(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
Matches the '/' character. This is a top level rule, creating a DIVIDE token type.
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
public final void mSUBST(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
Matches the query substitution parameter placeholder character (?). This is a top level rule, creating a SUBST token type.
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
public final void mNUM_LITERAL(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
Matches all forms of valid integer literals and decimal literals.
Integer literals create a NUM_LITERAL token type. If a NUM_LITERAL literal cannot be parsed by Long.parseLong then it is too large to fit in a 64-bit variable, so it will be matched as DEC_LITERAL.
The decimal literal forms each create a DEC_LITERAL token type.
This is a top level rule. The token type defaults to NUM_LITERAL and is overridden by specific actions depending on the scenario matched.
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
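The Long.parseLong fallback described for NUM_LITERAL can be sketched as follows. This is a hedged illustration rather than the generated action code: `NumericLiteralKind`, `classify` and the integer constants are hypothetical, and treating any text that contains a decimal point as DEC_LITERAL is an assumption about the decimal forms. The input is assumed to be text the lexer has already matched as a numeric literal.

```java
public class NumericLiteralKind {
    // Hypothetical token type values; the real ones live in HQLParserTokenTypes.
    static final int NUM_LITERAL = 1;
    static final int DEC_LITERAL = 2;

    /** Classifies already-matched numeric text as NUM_LITERAL or DEC_LITERAL. */
    static int classify(String text) {
        // Assumption: a decimal point marks one of the decimal literal forms.
        if (text.indexOf('.') >= 0) {
            return DEC_LITERAL;
        }
        try {
            Long.parseLong(text); // succeeds only if the value fits in 64 bits
            return NUM_LITERAL;
        } catch (NumberFormatException e) {
            // Too large for a long, so override the default type.
            return DEC_LITERAL;
        }
    }

    public static void main(String[] args) {
        System.out.println(classify("42") == NUM_LITERAL);                  // true
        System.out.println(classify("9223372036854775808") == DEC_LITERAL); // true
    }
}
```

The second call uses Long.MAX_VALUE plus one, the smallest integer literal that no longer fits in a 64-bit variable.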
protected final void mDIGIT(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
Matches any numeric digit: 0 - 9
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
public final void mSTRING(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
Matches an opening single quote, arbitrary contents and an ending single quote. Any newlines inside the string are identified and the lexer's internal newline counter is properly maintained.
The greedy option does not need to be disabled here as the closure rule termination is built into the subrule itself: it accepts anything that isn't an unescaped single quote character (see above).
Tabs and spaces are maintained inside strings. Opening and closing single quote characters are dropped.
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
protected final void mLETTER(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
Matches any alphabetic character: a - z
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
protected final void mSYM_CHAR(boolean _createToken) throws antlr.RecognitionException, antlr.CharStreamException, antlr.TokenStreamException
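Matches all characters that can be made part of a valid user-defined symbol (except alphabetic and numeric characters): $ _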
antlr.RecognitionException
antlr.CharStreamException
antlr.TokenStreamException
private static final long[] mk_tokenSet_0()
private static final long[] mk_tokenSet_1()