Sorting Query Results¶
- Sorting Query Results
- Database Indexes and Query Sorting
- Single versus Multiple Index Selection
- General Sorting Behavior
- Implicit WHERE Clause Components
- Single Index Selection Rules
- Conversion Examples
- Example1: Default Index Selection
- Example 2: Implicit Index Selection
- Example 3: Explicit Index Selection (rule 1)
- Example 4: BY Clause Index Selection
- Example 5: Criteria Index Selection
- Example 6: One Index For Fully Qualified Unique Option (Rule 2 )
- Example 7: One Unique Index For Two Fully Qualified Options (Rule 2)
- Example 8: One Unique Index For Two Fully Qualified Options Plus Sorting (Rule 2)
- Example 9: Most Sequential Leading Equality Matches (Rule 4)
- Example 10. On Same Number of Matches BEGINS Is Used To Make the Pick (Rule 4a)
- Example 11. On Same Number of Matches* the Range Match Is Used To Make the Pick (Rule 4b)*
- Example 12. An Index with its First Component Involved in a Range or BEGINS Match (Rule 5)
- Example 13. Range Versus Sort Matches* (Rule 5)*
- Example 14. Most Sequential Leading Sort Matches (Rule 6)
- Example 15. Primary Versus Alphabetical Tie Break (Rule 7)
- Example 16. Two Equal Range Selections with a Find (Rule 8)
- Example 17. Operators and Functions with No Impact On Index Selection
- Example 18. Lookup Function
- Example 19. ROW-ID
Database Indexes and Query Sorting¶
In order to accurately duplicate the behavior of the Progress 4GL single record reading approach, one must dynamically select each record of the specified table in the same traversal order which Progress would generate. This problem is complicated by the feature provided by Progress where the same record may be visited more than once if its data is changed in a manner that reorders it later (or earlier if the walk is backwards) in the sequence of records. This is very different from SQL's set oriented approach where one can be sure that the set returned is of a static size. Besides the obvious danger of creating infinite loops or otherwise non-intuitive behavior, this requirement means that each "next" or "previous" record retrieval must be based on a relative movement between records instead of a walk of a static list.
In order to duplicate the traversal order of Progress, the relative movement from one record to the next/previous must be based on knowledge of the exact sorting criteria used by Progress. As with many of the behaviors of the Progress language, there is a mixture of explicit and implicit factors that determine the exact sorting criteria used for each query.
The Progress database accesses all records using an index, via the combined use of multiple indexes or via sequential table scan. When a single index is used, if the sorting criteria isn't specified or in some cases where the sort criteria doesn't fully specify an unambiguous ordering (but is compatible with the index), the index used determines the order in which the records are returned. In other words, the index in use can and often does determine (or partially determine) the real sort order.
In the multiple index and table scan approaches, if no explicit sort order is specified, the sort order is undefined. In addition, even in cases where the sort order is explicitly specified, it is possible that records can be ambiguously ordered at a more granular level that the explicit sort order specifies. When this occurs, the "secondary" sort order is undefined.
The BY
phrase is used to specify a sort order explicitly. It is especially important to note that the sort order can be:
- completely implicit (if no
BY
clause is provided); - partially implicit (if the
BY
clause does not unambiguously specify an exact ordering); - completely explicit (if the
BY
clause results in an unambiguous order).
For this reason, it is possible (and even likely) that a given Progress application will have ordering dependencies built in, possibly without the specific intention of the original author. This may appear as the order of records in a report or the order in which records are displayed in the user interface.
Implicit sort behavior in the single index case is dependent upon the selected index. For this reason, one must duplicate the implicit index selection that the Progress compiler makes. This is driven by analysis of query predicates (WHERE
, OF
and USING
clauses), sort specification (BY
clause) and explicit index specifications (USE-INDEX
clause). These rules may be used for all query types: standalone FIND
queries, queries which allow the iteration of results preselected by DO/REPEAT PRESELECT
blocks, FOR
loops and queries opened by the OPEN QUERY
statement. Since there are differences in what can be done with these query types, some rules may not apply to all of them. For instance, BY
clauses cannot occur in standalone FIND
statements; i.e., sort order always is based only on the selected index.
This chapter describes the rules by which the sorting criteria are determined during the FWD conversion process, which follows, to the extent possible, the behavior of the Progress 4GL environment.
Single versus Multiple Index Selection¶
As noted above, some ambiguity in sorting is resolved using the index selected for record reading. This point really assumes that a single index is in use. Multiple indexes can be selected and are used together to read the specified records. However, in the multiple index case, all non-explicit sorting behavior is undefined. According to 4GL documentation, in the single index case, the 4GL developer can sometimes rely upon secondary sort behavior based on the index. So, in order to provide more predictable sorting, the FWD conversion always assumes that a single index is used. This approach is designed to approximate the 4GL environment's sorting to some extent, though it is not expected to completely match the undefined behavior.
All of the queries presented in 4GL use multiple indexes whenever possible (assuming the -v6q
command line option is not active). This multiple index support appeared in Progress v7 and applications that had a dependency on single index selection (an implicit sorting dependency) needed the v6 query compatibility mode (the -v6q
command line option).
Even though certain language constructs are enabled for multiple indexes, it is the query predicate (i.e., record selection criteria) which determines whether a single index or multiple indexes are chosen. In particular:
- Having
OR
criteria makes multiple index use much more likely. For block queries, if anOR
operator is involved, multiple indexes will be selected even if there is only a subset of the index components that are involved in equality or range matches. - With
AND
criteria an index must have all index components referenced in the predicate in equality matches (see below). This means that indexes that have only one component and predicates that reference a superset of fields in multiple indexes are more likely to trigger this condition. - Query predicate role is most important when there is no
BY
clause.
General Sorting Behavior¶
As discussed earlier, when multiple indexes are selected any implicit sort order is undefined. 4GL documentation indicates that the 4GL developer should not rely upon any particular sort order that may be implicitly provided in the multiple index case, since this behavior may change in future releases. One is expected to explicitly specify a BY
clause in this case, or to write the code in such a manner that it is not dependent upon of the resulting sort order.
When a single index is used and there is no explicit sort order, the sort order is defined by the index. This is behavior upon which 4GL developers can rely.
To the extent a BY
clause does not specify an unambiguous sort order, the sorting within the "buckets" defined in the BY
clause is usually undefined. In other words, if multiple records have the same data in the given set of fields specified in the BY
clause, the sort order of these records will be undefined even though the overall sort order of the query is defined. This is called the secondary sort order. For example, given the BY
clause:
BY a1 BY a2
the sort order may be:
a1 a2 a3 --- --- --- 1 1 1 1 1 3 1 1 2 1 2 1 1 2 2
as well as:
a1 a2 a3 --- --- --- 1 1 3 1 1 2 1 1 1 1 2 2 1 2 1
Thus, the secondary sort order of this data set using the given BY
clause generally is undefined, with the following exception: if a single index is used for record access and the BY
clause matches that selected index, then that index defines the secondary sort order, to the extent the index is more specific than the BY
clause. The term matches in this case means that the sort string (i.e., BY
clause) components match the leading components of the index or the sort string matches the index completely. An index can be matched in either the forward direction (if sorting directions match for each matching field) or the inverse direction (if sorting directions are opposite for each matching field). Consider that we have a table with the fields a1
, a2
and a3
and the index a1 a2
:
def temp-table a field a1 as integer field a2 as integer field a3 as integer index idxa a1 a2.
The following sort strings all match the index idxa
:
a1 a1 a2 a1 desc a1 desc a2 desc
The following sort strings do not:
a3 a3 a1 a2 a1 a2 a3 a1 desc a2 a1 a2 desc
Index matching usually occurs when the index is selected based on sort matches (see Index Selection Rules below). In this case, the index and the sort criteria are compatible and the sorting is completely specified (and implemented) by the index on the 4GL database server. This means that in such cases, there will be a reliable secondary sort ordering based on the additional (more specific) fields in the index that were not part of the BY
clause.
General sorting rules for both single and multiple index cases can be summarized in this way:
- The
BY
clause is honored as the primary sort criteria. - The selected indexes can affect the secondary sorts. Other than the exceptions noted above (single index with matching or missing
BY
clause), the rules by which this occurs are undefined. - In some cases a secondary sort is based on the record create or recid/rowid order, but one should not rely upon this kind of sorting for the following reasons:
- It is not definitely known whether create order or rowid order is used. Tests have been performed with the sets of records where records that had been added sequentially and therefore the create order is the same as rowid order. In the future, a more active set of data with volumes of deletes and adds might be able to differentiate this undetermined behavior.
- It is not known to which extent indexes and create/rowid order affect the sort. It is also unknown how these two methods may be chosen separately or used together.
- One may delete a record and then create an identical record and the resulting record may have a different rowid than the original. Since rowid is a physical pointer to the location of the record in the database, anything that changes the physical order of records in the database may change the secondary sort order in these undefined cases. This means that reloading the database, exporting/importing and compacting activities can change sorting.
Implicit WHERE Clause Components¶
Query predicate (i.e. record selection criteria) plays a role in index selection mechanism and therefore in record sorting. As with many things in Progress, a query predicate can be specified explicitly (WHERE
clause) or implicitly (through OF
or USING
clauses and through the special unqualified constant form of FIND
). Before analyzing the query predicate, one must add any implicit portions to get the complete expression used to select records.
Unqualified Constant¶
A FIND
with a single constant (literal value) instead of a standard record phrase is the same as making an addition to the WHERE
clause referencing a primary index (which must have only a single field) in an equality match. Assuming the customer
table has a primary key which has the single index component cust-num
, this:
FIND customer 1 WHERE country = "USA".
is actually the same as:
FIND customer WHERE country = "USA" AND cust-num = 1.
OF Clause¶
Usage of the OF
clause causes an addition of an equality match for each field in the common index (an index that is common between the two tables and unique in at least one of them). These equality matches are connected via an AND
operator. If the common index between two tables is comprised of fields A
, B
and C
:
FIND customer OF order WHERE name = "Bogus".
is converted into:
FIND customer WHERE name = "Bogus" AND customer.A = order.A AND customer.B = order.B AND customer.C = order.C.
Direct Logical Field References¶
A logical field that is directly referenced as the entire predicate or as a direct operand of a logical operator is actually considered to be the same as an equality comparison with the literal true
(thus it will affect the index selection process described below). This means that:
WHERE logicalField
is the same as
WHERE logicalField = true
However the use of the NOT
operator in this case does not get translated into the corresponding equality comparison (so it does not affect index selection). For instance:
WHERE NOT logicalField
is not the same as
WHERE logicalField = false
USING Clause¶
The usage of USING
clause causes an addition of an equality match for each referenced field with the instance of that field in the default or specified (using FRAME
option) screen buffer. So:
PROMPT-FOR customer.cust-num customer.seq. FIND customer USING cust-num AND seq WHERE country = "Finland".
is the same as:
PROMPT-FOR customer.cust-num customer.seq. FIND customer WHERE country = "Finland" AND cust-num = INPUT customer.cust-num AND seq = INPUT customer.seq.
An alternate form of USING
usage exists for any field that is defined as an abbreviated field in an index (abbreviate index fields are character fields that lets you conveniently search for a partial match based on the first few characters of a field). In this case:
PROMPT-FOR customer.cust-name. FIND customer USING cust-name WHERE country = "Finland".
is the same as:
PROMPT-FOR customer.cust-name. FIND customer WHERE country = "Finland" AND cust-name BEGINS INPUT customer.cust-name.
The difference is that the BEGINS
operator is used instead of equality operator.
Single Index Selection Rules¶
Since the selected index affects the resulting sorting order, it is important to know which index will be selected. FWD always assumes that a single is index is used (or none). In this section the selection process of this single index will be covered.
Indexes are selected using explicit and implicit criteria. In the absence of an explicit index specification (USE-INDEX
), the Progress compiler makes an implicit index selection. This selection is based on an analysis of the query predicate (WHERE
/ OF
/ USING
clauses) and sorting specifications (BY
clause). The idea is that the Progress database server is completely driven by index processing (or table scans if no index can be used). There are a very limited set of operations that can be specified in a predicate which can be translated into index operations. Anything that can be translated into an index operation can be resolved by the database server without the Progress client evaluating expressions to determine if a given record matches.
In other words, a single query predicate can be processed completely on the client, completely on the database server or in both places. The degree with which the predicate can be processed on the server has a great impact on the performance of the database access since it is highly inefficient to read many records from the server only to reject them at the client when they do not match the predicate. This is a statement of data transfer volume (a less visible factor in shared memory implementations than on network implementations) and, more importantly, it is a statement of the efficiency of general purpose interpreted 4GL expression performance compared to a database server that is tuned for such processing.
Operands Which Affect Index Selection¶
Only database fields that are components of an index of the current table affect index selection. Consider the following table:
DEF TEMP-TABLE tt FIELD f1 AS INTEGER FIELD f2 AS INTEGER FIELD f3 AS INTEGER FIELD f4 AS INTEGER FIELD f5 AS INTEGER INDEX i1 f1 INDEX i2 f2 f3 INDEX i3 IS PRIMARY f1 f3 f4.
In this case fields f1
, f2
, f3
and f4
if encountered in a WHERE
clause will have an impact on the index selection process, while field f5
will not.
The following operands are ignored during analysis:
- literals;
- variable references;
- widget attributes references;
- field references to fields of the current table which are not components of any index;
- field references to fields external to the current table.
Operators Which Affect Index Selection¶
The set of operators known by the Progress database server is limited: EQ
(=
), LT
(<
), GT
(>
), LE
(<=
), GE
(>=
), BEGINS
and CONTAINS
. The CONTAINS
operator works only for OPEN QUERY
/ FOR
statements and if the referenced field is contained in a word index. When any of these operators are encountered, the compiler is allowed to examine the two operands to determine if one of them is an index component. If one of the operands (it does not have to be on the left side only) is an index component, then this may modify the index selected (based on the rules provided below).
As an example, the following predicates affect index selection:
indexField1 = 5 aVariable > indexField1 indexField2 begins “some”
Any operator than is not noted in the list above is unknown to the database server. Furthermore, any operand of the operators unknown to database server will be ignored and it (and its children) will not have any impact on the index selection.
Here is the list of operators and functions which do not participate in the index selection process:
- Comparison operators which are unknown to the database server:
NE
(<>
),NOT
,MATCHES
. For example, these predicates will be ignored:indexField <> 5 not (indexField = 5)
- Other operators with no functional counterpart on the server (e.g., arithmetic operators). For example, these predicates will be ignored:
indexField + 1 = 5 indexField1 + indexField2 > 5
- Methods, constructions, built-in and user-defined functions. For example, these predicates will be ignored:
substring(indexField, 3) = “some string” someFunction(indexField) > 5 IF indexField1 THEN indexField2 ELSE indexField3 = 5
The only other operators that are recognized during this analysis are the conjunction (AND
) and precedence (parentheses) operators.
In the single index case, the AND
operator allows the compiler to analyze both operands for index selection purposes, because the best fit index will naturally be able to reduce the result set based on an AND
condition. However OR
operator does not allow operand analysis, because it is significantly harder (or sometimes impossible) to ensure that the results of a single index can fulfill all possible conditions joined via an OR
operator, so FWD ignores the OR
operator and its children during analysis. In the multiple index case, the AND
and OR
operators may both have some support, however this has not been explored or documented.
Parentheses neither hinder nor help the analysis process since they merely ensure that the structure/precedence of the expression is forced into a desired pattern. This resulting structure is what is analyzed, after the parentheses already have been applied.
Referencing Fields¶
There are two ways that a table field (either index or non-index) can be a reference in an operand of an operator known to Progress database server:
- Direct reference. In this case a field is the operand of an operator known to Progress database server. The following example contains two direct references:
field1 <= field2
- Nested reference. In this case a field participates in an expression (at an arbitrary nesting level) which represents an operand. The following example contains two nested references:
field1 + 5 = someFunction(field2)
The sub-expression rooted at an operator known by Progress database server will affect index selection if one of the operands is a direct reference to an index field and the other operand does not contain any references to the fields of the current table. This happens because in other cases (including the case when both operands are direct references), the sub-expression is evaluated on the Progress client after the query has been executed on the database server. Note that an index field which is an operand of a BEGINS
or CONTAINS
operator affects index selection only if it is the first operand. Also, the CONTAINS
operator requires that the first operand is contained in a word index.
The following sub-expressions affect index selection:
indexField1 = 5 5 = indexField1
While the following do not affect it:
indexField1 = indexField2 indexField1 = indexField2 + 5 indexField1 = entry(1, indexField2)
Allowed Nesting Level¶
In order to affect index selection, a sub-expression (rooted by an operator known to Progress database server) should be the root expression, or should be an operand of a top-level AND operator. For instance, all sub-expressions participating in the expressions below affect index selection:
indexField1 = 5 indexField1 = 5 AND indexField2 > 6 AND (indexField3 < 7 AND indexField4 = 8)
while these do not:
(indexField1 = 5) = (indexField2 > 6) (indexField1 = 5 AND indexField2 > 6) OR indexField3 < 7
Summary of Expressions That Affect Index Selection¶
Summarizing information about operations, types of operands, and the index field referencing rules provided above, we can form the rules below.
A sub-expression affects index selection only if:
- It is represented by an operator known to the Progress database server (
EQ
(=
),LT
(<
),GT
(>
),LE
(<=
),GE
(>=
),BEGINS
andCONTAINS
). - One of the operands is a direct reference to an index field and the other operand doesn't contain any references to the fields of the current table.
- The sub-expression is the root expression of a
WHERE
clause or an operand of a top-levelAND
operator.
Let's take a look at some sample WHERE
clauses.
Example 1:
indexField1 + 5 = 6 AND 3 > indexField2 AND indexField3 <> 1
indexField1 + 5
is ignored because of the arithmetic operator +
. indexField3 <> 1
is ignored because of the <>
operator, which is unknown to the database server. So, the part which affects the index selection process is:
> indexField2
Example 2:
(indexField1 = variable1 + 5) AND someFunction(indexField2) = 10
someFunction(indexField2) = 10
is ignored because a function is used. The part that affects the index selection process is:
indexField1 =
Example 3:
indexField4 BEGINS “some string” AND indexField2 = indexField3 AND indexField1 > nonIndexField1 + 1
indexField2 = indexField3
and indexField1 > nonIndexField1 + 1
are ignored because both sides of operands contain references to the fields of the current table. The part that affects the index selection process is:
indexField4 BEGINS
Example 4:
indexField1 = 5 OR indexField2 = 1
Because of the top-level OR
operator, FWD ignores the whole expression and it will not affect index selection process (however Progress may use single or multiple indexes that contain index fields indexField1
or indexField2
).
Example 5:
(indexField1 > 5) = true
In this case there is no part that can affect analysis because the indexField1 > 5
sub-expression is neither the root expression nor an operand of a top-level AND
operator.
Types of Matches¶
Each operator known by the Progress database server can be classified by the type of match it provides. The table below specifies this correspondence. Additionally, rows in the table are ordered by the relative weight of match types, which is defined by the performance value of different types of index traversals (rows are ordered from the fastest operation to the slowest). This order in fact formed the order of index selection rules provided further.
Operator(s) | Match type |
---|---|
EQ (= ) |
Equality match |
LT (< ), GT (> ), LE (<= ), GE (>= ) |
Range match |
BEGINS |
Begins match |
CONTAINS |
Word match |
Sort Match¶
A sort match is a special type of match that is determined based on analysis of a BY
clause rather than by analyzing a WHERE
predicate. Identifying sort matches is performed using these rules:
- There is a specificity to the order in which the
BY
clauses occur. The firstBY
clause is the least specific component of the sort order (the least granular which forces the overall ordering of the results). Each subsequentBY
clause only changes the sort order within the larger/broader categories created by the previous sort specification. Thus, subsequentBY
clauses are increasingly specific sort criteria. This order must be preserved in the list of sort matches and any comparisons must take this order into account. Consider that we have the indexindexField1, indexField2
. Then we will have the following number of matches for the specified sort strings:BY indexField1 BY indexField2 /* 2 matches */ BY indexField1 /* 1 match */ BY indexField3 BY indexField1 BY indexField2 /* no matches */ BY indexField2 BY indexField1 /* no matches */ BY indexField1 BY indexField2 BY indexField3 /* 2 matches */ BY indexField1 BY indexField3 BY indexField4 /* 1 match */
- There is a direction component to the
BY
clause (implicitly this isASCENDING
, but it can be explicitly specified asDESCENDING
). If aBY
clause references an index component field, BUT the sorting direction is not compatible with the index component's definition, Progress still counts this as a sort match (this statement is based on index selections reported in the compiler's XREF output). However in this case the Progress client would be forced to ignore the ordering provided by the server index and handle sorting itself in this case (unless EACH matched component of the sort string have sorting direction that is opposite the corresponding index component). Consider that we have the indexindexFiled1, indexField2
. Then we will have the following number of matches for the specified sort strings:BY indexField1 DESC BY indexField2 /* 2 matches */ BY indexField1 DESC /* 1 match */ BY indexField1 BY indexField2 DESC BY indexField3 /* 2 matches */
- The expression in a
BY
clause can be more complex than a simple field reference (e.g., expressions likeBY SUBSTRING(field1)
orBY filed2 + filed3
are allowed). Only a simple field reference can be counted as a sort match as more complicated expressions must be resolved on the client at runtime (and thus they cannot be used with an index, so they do not affect index selection).
Single Index Selection Rules¶
For each of the match types (equality match, range match, begins match, word match and sort match) a set of statistics is recorded by index component that was matched. This information is then used with the rules described in this section to pick the best index.
This list of rules is specified in priority order (1 is the highest priority). Each priority level is checked in turn, from highest to lowest. At the start of processing, there is a "selectable" list that includes all possible indexes for the given table. As each priority level is processed, indexes that are selectable based on the criteria at that level will remain in the selectable list and all other indexes will be removed. If the selectable list contains a single matching index after a given priority level is evaluated, that index will be the selected index. If there are multiple matching indexes at a given priority level, then the reduced selectable list is used in subsequent priority levels to provide a "tie breaking" process to determine which index is selected. This general rule does have some exceptions, which are noted below.
This example table will be used in the following examples:
define temp-table tt field f1 as integer field f2 as integer field f3 as integer field f4 as integer field f5 as integer field f6 as integer index idx1 is primary f1 f2 index idx2 is unique f3 f4 index idx3 is unique f3 f5 index idx4 f3 f6 s1 index idx5 f3 f6 s2 index idx6 f3 f1 f2 index idx7 f1 f4 index idx8 f4 f1 index word1 is word-index s1 index word2 is word-index s2.
Rules:
1. An index that was specified in USE-INDEX
*.* The use of the USE-INDEX
phrase overrides any compiler efforts at selecting the index(es) to use and limits the selection to that single index. This is the (one and only) explicit index case. Only one USE-INDEX
clause can be present in any record phrase, so there is never a need for a tie breaker for this priority level. Sample WHERE
clause:
f1 = 1 AND f2 = 1 USE-INDEX idx2
Selected index: idx2
.
2. A unique index with all components in equality matches. When each and every index field of an unique index is involved in an equality match that index will be selectable. Since there can be more than one unique index and the predicate could cause the selection of multiple unique indexes, the unique index that has the most components will be the one chosen. If there are multiple unique indexes that are selectable with the same number of components, then this list of selectable choices will be further processed starting at rule 7 below (this is a tie breaker). Sample WHERE
clause:
f3 = 1 AND f4 = 3
Selected index: idx2
f3 = 1 AND f4 = 3 AND f5 = 5
Because idx2
and idx3
have the same number of matching components, no index will be selected at this step and processing goes to the step 7.
3. A word index referenced through CONTAINS
*.* When a word index is referenced through a CONTAINS
operator, that index is selectable (this operator can take as the first operand only a field referenced in a word index, and a word index always contains a single index field). If only one word index is selectable, that index is used. If there are multiple word indexes that are selectable, then this list of selectable choices will be further processed starting at rule 7 below (this is a tie breaker). Sample WHERE
clause:
s1 CONTAINS “item”
Selected index: word1
.
s1 CONTAINS “item” AND s2 CONTAINS “work”
Because indexes word1
and word2
have the same number of matching components, no index will be selected at this step and processing goes to the step 7.
Note that the described order of steps: first step 2 then step 3 matches FWD behavior, however Progress documentation tells that the step 3 should go before the step 2, while analysis of .xrf files provided by Progress compiler shows that in the case when an unique index with all components in equality matches occurs along with a word index match, both indexes are selected (according to Progress documentation the word index should have preference). Also note that so far FWD does not completely support neither word indexes nor CONTAINS
operator.
4. Most sequential leading equality matches. Due to the special nature of rules 2 and 3, when this rule is evaluated, it always processes starting with the full list of all possible indexes as the potential indexes. The indexes that have the most sequential leading components with equality matches are selectable. Most is interpreted in an absolute sense, not a relative or percentage sense. For example, if one index has the first 4 index components all involved in equality matches, and no other index has 4 or more leading index components in equality matches then this index is chosen. Sample WHERE
clause:
f3 = 1 AND f1 = 1
Selected index: idx6
.
If no indexes were selectable via this rule, then the full list of all possible indexes will be checked using the next rule (№5). If more than one index has the same number of leading index components in equality matches, then this reduced list of selectable indexes further refined:
- The next index component (after those already known to be involved in an equality match) of each selectable index must be evaluated to see if it has a begins match in this component. For example, an index with its first 3 components with equality matches and the 4th component as a begins match will be selected over an index with 3 equality matches (only).
The equality matches must all occur before the begins matches and all equality/begins matches must still be sequential (the first index component without any match ends the calculation). Only one component after the equality matches is checked for a begins match.
If only one index is selectable after the begins match check, this index is chosen.
If more than one index is still selectable after the begins test (i.e. if multiple indexes have the next component as a begins match), then this reduced list of selectable indexes is processed further starting at the rule 6.
If none of the selectable indexes were matched via this begins tie breaker, processing of the selected indexes (all with the same number of equality matches) continues in the next sub-rule (b).
SampleWHERE
clause:f3 = 1 AND f6 = 1 AND s2 BEGINS “item”
Selected index:idx5
. - The next index component (after those already known to be involved in an equality match) of each selectable index must be evaluated to see if it has a range match in this component. Note that despite the fact that Progress documentation refers "most range matches" as an index selection criteria, only one range match is ever considered.
If only one index is selectable after the range match check, this index is chosen.
If more than one selectable index is still selectable after the range test (i.e. if multiple indexes have the next component as a range match), then this reduced list of selectable indexes is processed further starting at the rule 6.
If none of the selectable indexes were matched via this range tie breaker, processing of the selected indexes (all with the same number of equality matches) continues in rule 6.
SampleWHERE
clause:f3 = 1 AND f5 > 10
Selected index:idx3
.
5. An index with its first component involved in a range or BEGINS
match. All indexes that have their leading component in a range or begins match are selectable. Note that despite the fact that Progress documentation refers "most range/begins matches" as an index selection criteria, only one range/begins match is ever considered.
If only one index is selectable after the range/begins match check, this index is chosen.
If more than one index is selectable, then the reduced list of selectable indexes is evaluated by the next rule (№6).
Sample WHERE
clause:
f1 > 0 AND f4 < 10
Because idx1
, idx7
and idx8
have the first component involved in a range match, no index will be selected at this step and processing goes to the step 6.
6. Most sequential leading sort matches. The indexes that have the most sequential leading components with sort matches (an index component directly specified in a BY
clause rather than the WHERE
clause) are selectable. Most is interpreted in an absolute sense, not a relative or percentage sense. For example, if one index has the first 4 index components all involved in sort matches, and no other index has 4 or more leading index components in sort matches then this index is chosen. You can read more about matching rules in the Sort Match section above.
If only one index is selectable after the sort match check, this index is chosen.
If more than one index has the same number of leading index components in sort matches, then this reduced list of selectable indexes is evaluated by the next rule (№7).
Sample case (no WHERE
clause specified):
BY f1 BY f4
Selected index: idx7
.
7. The primary index. This rule is special because it can be "directly" reached from the rules №2 and №3 above. When this rule is evaluated, the selectable list may include all possible indexes (if no rule above was matched) or it may include a reduced set of selectable indexes (if one or more of the rules above selected indexes). Whichever is the case, this list is searched for the primary index.
If the primary index is selectable, then this index is used.
There can be only one primary index, so there can never be a "tie" here and in the case where the primary index is not selectable, the next rule (№8) is used.
Sample WHERE
clause:
f1 > 0 AND f4 < 10
For this case, on the step №5 the list of selected indexes has been refined to idx1
, idx7
and idx8
. On this step the primary index was selected: idx1
.
8. The index with the alphabetically first name. In general, the index name (from the selectable list) that is lexicographically first (case insensitive) is the index chosen. The only exception is if this is being used as a tie breaker for unique indexes (rule №2), where the lexicographically last (case insensitive) index name is the index chosen. Since all index names must be different (using case insensitive comparison) within a given table, this rule will always result in a single index choice.
Sample WHERE
clause:
s1 CONTAINS “item” AND s2 CONTAINS “work”
For this case, on the step №3 the list of selected indexes has been refined to word1
and word2
. On this step the lexicographically first index was selected: word1
.
Index Selection Special Case¶
A FIND
lookup by recid/rowid does not use an index.
Conversion Examples¶
The index selection examples are grouped in three sections:
- general index criteria selection examples;
- single index selection rules examples;
- operators and functions with no impact on index selection.
The following examples will use the person
temporary table, as defined in person-temp-table.i
:
/* person-temp-table.i */ define temp-table person field site-id as integer field emp-num as integer field first-name as character field last-name as character field middle-init as character field ssn as character field pay-type as character field hire-date as date field term-date as date field comments as character field area-code as character field phone-number as character field extension as character field schedule as character field hours as integer index site-emp is primary unique site-id emp-num index comments is word-index comments index emp-ssn is unique ssn index exten is unique extension index hire-date hire-date index name last-name first-name middle-init index pay-type pay-type index phone area-code phone-number extension index term-date term-date.
Example1: Default Index Selection¶
4GL source:
{person-temp-table.i} find first person no-lock no-error. if available(person) then display person. else message "No person available".
Converted example:
... new FindQuery(person, (String) null, null, "person.siteId asc, person.empNum asc", LockType.NONE).first(); ...
Details:
This example illustrates simple default index selection. The default index used is the primary index as there are no criteria to work with.
Example 2: Implicit Index Selection¶
4GL source:
{person-temp-table.i} for each person no-lock where person.hire-date = 01/01/2000 : display person. end.
Converted example:
... query0 = new AdaptiveQuery(person, "person.hireDate = '2000-01-01'", null, "person.hireDate asc", LockType.NONE);
Details:
FIND
uses an implicit index. Since the hire-date
field is used in an equality match, and no USE-INDEX
clause is specified, the hire-date
index is selected.
Example 3: Explicit Index Selection (rule 1)¶
4GL source:
{person-temp-table.i} for each person no-lock where person.hire-date = 01/01/2000 use-index name: display person. end.
Converted example:
... query1 = new AdaptiveQuery(person, "person.hireDate = '2000-01-01'", null, "person.lastName asc, person.firstName asc, person.middleInit asc", LockType.NONE); ...
Details:
This time USE-INDEX
overrides default index selection and the name
index is selected.
Example 4: BY Clause Index Selection¶
4GL source:
{person-temp-table.i} for each person no-lock by person.last-name: ... end.
Converted example:
... query0 = new AdaptiveQuery(person, (String) null, null, "person.lastName asc, person.firstName asc, person.middleInit asc", LockType.NONE); ...
Details:
This example illustrates index selection based on a BY
clause. The additional sort criteria in the converted code result from the selection of the name
index as a sort match for the less specific BY
clause (see Sort Match section above).
Example 5: Criteria Index Selection¶
4GL source:
{person-temp-table.i} for each person no-lock where person.site-id = 1 by person.last-name: ... end.
Converted example:
... query0 = new PreselectQuery(person, "person.siteId = 1", null, "person.lastName asc, person.siteId asc, person.empNum asc", LockType.NONE); ...
Details:
This example illustrates criteria index selection combined with a simple sort specified by a BY
clause. The primary index site-emp
is selected, based on the equality match against the site-id
field in the WHERE
clause. However, since the BY
clause calls for an explicit sort on the last-name
field, lastName
becomes the leading component in the HQL order by clause passed to PreselectQuery
's constructor. The components of the site-emp
index (site-id
and emp-num
) are added to the HQL order by clause (converted to siteId
and empNum
, respectively), to ensure that the sort order of records with identical last-name
values matches as closely as possible the (undefined) sort order provided by the 4GL.
Example 6: One Index For Fully Qualified Unique Option (Rule 2 )¶
4GL source:
{person-temp-table.i} for each person no-lock where person.site-id = 1 and person.ssn = "111223333": ... end.
Converted example:
... query2 = new AdaptiveQuery(person, "person.siteId = 1 and upper(person.ssn) = '111223333'", null, "person.ssn asc", LockType.NONE); ...
Details:
The example illustrates the case of a* unique index with all components in equality matches. *The selected index is the fully qualified unique index emp-ssn
.
Example 7: One Unique Index For Two Fully Qualified Options (Rule 2)¶
4GL source:
{person-temp-table.i} for each person no-lock where person.site-id = 1 and person.emp-num = 123 and person.ssn = "111223333": ... end.
Converted example:
... query3 = new AdaptiveQuery(person, "person.siteId = 1 and person.empNum = 123 and "+ "upper(person.ssn) = '111223333'", null, "person.siteId asc, person.empNum asc", LockType.NONE); ...
Details:
Both site-emp
and emp-ssn
have all components in equality matches. Rule number 2 applies, with the emp-ssn
index selected based on the fact that it has the most components.
Example 8: One Unique Index For Two Fully Qualified Options Plus Sorting (Rule 2)¶
4GL source:
{person-temp-table.i} for each person no-lock where person.site-id = 1 and person.emp-num = 123 and person.ssn = "111223333" by person.ssn: ... end.
Converted example:
... query4 = new PreselectQuery(person, "person.siteId = 1 and person.empNum = 123 and " + "upper(person.ssn) = '111223333'", null, "person.ssn asc, person.siteId asc, person.empNum asc", LockType.NONE); ...
Details:
Both site-emp
and emp-ssn
have all components in equality matches. Here rule number 2 applies, with the emp-ssn
index selected based on the fact that it has the most components. The sorting on the ssn
field was added to because of the BY
clause.
Example 9: Most Sequential Leading Equality Matches (Rule 4)¶
4GL source:
{person-temp-table.i} find first person no-lock where person.area-code = "111" and person.phone-number = "2223333" and person.last-name = "Jones" no-error. if available(person) then do: display person. end.
Converted example:
... new FindQuery(person, "upper(person.areaCode) = '111' and upper(person.phoneNumber) = '2223333' and " + "upper(person.lastName) = 'JONES'", null, "person.areaCode asc, person.phoneNumber asc, person.extension asc", LockType.NONE).first(); ...
Details:
In this case no unique indexes are in the list of candidates. Both phone
and name
indexes have all components in equality matches. The phone
index is selected, since it has the most components.
Example 10. On Same Number of Matches BEGINS Is Used To Make the Pick (Rule 4a)¶
4GL source:
{person-temp-table.i} find first person no-lock where person.area-code = "111" and person.last-name = "Jones" and person.first-name begins "R" no-error.
Converted example:
... new FindQuery(person, "upper(person.areaCode) = '111' and upper(person.lastName) = 'JONES' and " + "upper(person.firstName) like 'R%'", null, "person.lastName asc, person.firstName asc, person.middleInit asc", LockType.NONE).first(); ...
Details:
In this case no unique indexes are in the list of candidates. Both phone
and name
indexes have equal number of components in equality matches but not all of them. The name
index is selected, since the BEGINS
expression contains the next available component of the name
index not already used in an equality match.
Example 11. On Same Number of Matches* the Range Match Is Used To Make the Pick (Rule 4b)*¶
4GL source:
{person-temp-table.i} find first person no-lock where person.area-code = "111" and person.phone-number > "2220000" and person.last-name = "Jones" no-error.
Converted example:
... new FindQuery(person, "upper(person.areaCode) = '111' and upper(person.phoneNumber) > '2220000' and " + "upper(person.lastName) = 'JONES'", null, "person.areaCode asc, person.phoneNumber asc, person.extension asc", LockType.NONE).first(); ...
Details:
In this case, no unique indexes are in the list of candidates. Both the phone
and name
indexes have an equal number of components in equality matches, but not all of them. The phone
index is selected, since the range expression contains the next available component of the phone
index not already used in an equality match.
Example 12. An Index with its First Component Involved in a Range or BEGINS Match (Rule 5)¶
4GL source:
{person-temp-table.i} find first person no-lock where person.hire-date >= 01/01/2000 no-error.
Converted example:
... new FindQuery(person, "person.hireDate >= '2000-01-01'", null, "person.hireDate asc", LockType.NONE).first(); ...
Details:
In this case there are no equality matches. The hire-date
index will be selected based on the index component reference in the range expression.
Example 13. Range Versus Sort Matches* (Rule 5)*¶
4GL source:
{person-temp-table.i} for each person no-lock where person.hire-date >= 01/01/2000 and person.hire-date < 01/01/2001 by person.last-name by person.first-name: ... end.
Converted example:
... query5 = new PreselectQuery(person, "person.hireDate >= '2000-01-01' and person.hireDate < '2001-01-01'", null, "person.lastName asc, person.firstName asc, person.hireDate asc", LockType.NONE); ...
Details:
In this case there are no equality matches. The hire-date
index is selected based on the index component reference in the range expression with no regard to sort matches. The BY
clauses are honored as the primary sort criteria.
Example 14. Most Sequential Leading Sort Matches (Rule 6)¶
4GL source:
{person-temp-table.i} for each person no-lock where person.phone-number > "2220000" and person.last-name > "Jooooo" by person.last-name by person.first-name by person.area-code: ... end.
Converted example:
... query6 = new PreselectQuery(person, "upper(person.phoneNumber) > '2220000' and " + "upper(person.lastName) > 'JOOOOO'", null, "person.lastName asc, person.firstName asc, " + "person.areaCode asc, person.middleInit asc", LockType.NONE); ...
Details:
The name
index is selected in this case because more components of this index match the BY
clauses.
Example 15. Primary Versus Alphabetical Tie Break (Rule 7)¶
4GL source:
{person-temp-table.i} for each person no-lock where person.site-id = 1 and person.last-name = "Jones": ... end.
Converted example:
... query7 = new AdaptiveQuery(person, "person.siteId = 1 and upper(person.lastName) = 'JONES'", null, "person.siteId asc, person.empNum asc", LockType.NONE); ...
Details:
Both site-emp
and name
indexes have an equal number of components in equality matches. The primary index site-emp
is selected in this case.
Example 16. Two Equal Range Selections with a Find (Rule 8)¶
4GL source:
{person-temp-table.i} find first person no-lock where person.hire-date >= 01/01/2000 and person.hire-date < 01/01/2002 and person.term-date >= 01/01/2004 and person.term-date < 01/01/2005 no-error.
Converted example:
... new FindQuery(person, "person.hireDate >= '2000-01-01' and person.hireDate < '2002-01-01' and " + "person.termDate >= '2004-01-01' and person.termDate < '2005-01-01'", null, "person.hireDate asc", LockType.NONE).first(); ...
Details:
The hire-date index is used as alphabetical choice.
Example 17. Operators and Functions with No Impact On Index Selection¶
4GL source:
{person-temp-table.i} find first person no-lock where not (person.site-id = 1) and person.last-name <> "Gaither" and person.last-name matches "R*" and person.first-name = entry(1,"Rod,George,Frank") and person.term-date = person.hire-date and person.term-date + 5 = 01/01/2003 no-error.
Converted example:
... new FindQuery(person, "not (person.siteId = 1) and " + "upper(person.lastName) != 'GAITHER' and " + "upper(person.lastName) like 'R%' and " + "upper(person.firstName) = ? and " + "person.termDate = person.hireDate and " + "plus(person.termDate, 5) = '2003-01-01'", null, "person.siteId asc, person.empNum asc", new Object[] { toUpperCase(entry(1, "Rod,George,Frank")) }, LockType.NONE).first(); ...
Details:
The operators and functions used do not participate in the index selection process and as a result the index selected is the default, primary index.
Example 18. Lookup Function¶
4GL source:
{person-temp-table.i} find first person no-lock where lookup(person.pay-type,"H,S") = 1 no-error.
Converted example:
... new FindQuery(person, "lookup(upper(person.payType), 'H,S') = 1", null, "person.siteId asc, person.empNum asc", LockType.NONE).first(); ...
Details:
The primary index is selected as a default.
Example 19. ROW-ID¶
4GL source:
{person-temp-table.i} define buffer b-person for person. define variable t-rowid as rowid no-undo. find first person no-lock. assign t-rowid = rowid(person). find b-person no-lock where rowid(b-person) = t-rowid no-error.
Converted example:
... new FindQuery(person, (String) null, null, "person.siteId asc, person.empNum asc", LockType.NONE).first(); tRowid.assign(RecordBuffer.rowID(person)); ErrorManager.silentErrorEnable(); new FindQuery(bPerson, "bPerson.id = ?", null, "bPerson.siteId asc, bPerson.empNum asc", new Object[] { tRowid }, LockType.NONE).unique(); ErrorManager.silentErrorDisable(); ...
Details:
FIND
with ROWID
selection shows no impact on index selection. The default primary index is used. Of course, since a unique record is being found here, a sort order essentially is meaningless.*
© 2004-2017 Golden Code Development Corporation. ALL RIGHTS RESERVED.