Bug #2552
bad hql caching prevents queries to position on the correct rows
100%
History
#1 Updated by Ovidiu Maxiniuc about 9 years ago
- Status changed from New to WIP
- Target version set to Milestone 11
- Start date deleted (
04/17/2015)
When a query iterates over multiple tables joined by a non-mandatory filed, sometimes it will fail to locate the correct rows.
Here is the testcase:
DEFINE TEMP-TABLE tt1 FIELD f1 AS INT FIELD f2 AS DATE. CREATE tt1. ASSIGN tt1.f1 = 0 tt1.f2 = ?. CREATE tt1. ASSIGN tt1.f1 = 1 tt1.f2 = TODAY. DEFINE TEMP-TABLE tt2 FIELD f1 AS INT FIELD f2 AS DATE. CREATE tt2. ASSIGN tt2.f1 = 0 tt2.f2 = 12/12/12. CREATE tt2. ASSIGN tt2.f1 = 1 tt2.f2 = TODAY. DEFINE VARIABLE qh AS HANDLE NO-UNDO. CREATE QUERY qh. qh:SET-BUFFERS(BUFFER tt1:HANDLE, BUFFER tt2:HANDLE). qh:QUERY-PREPARE("FOR EACH tt1, EACH tt2 WHERE tt2.f1 EQ tt1.f1 AND tt2.f2 EQ tt1.f2"). qh:QUERY-OPEN. qh:GET-NEXT(NO-LOCK). IF NOT qh:QUERY-OFF-END THEN DO: DISPLAY tt1. DISPLAY tt2. END. ELSE MESSAGE "QUERY FAILED.". qh:QUERY-CLOSE(). DELETE OBJECT qh.
The output from P4GL is:
┌───────────────────────────────────────┐ │ f1 f2 f1 f2 │ │────────── ──────── ────────── ────────│ │ 1 17/04/15 1 17/04/15│ └───────────────────────────────────────┘
P2J will fail to locate the records that are sought and displays
QUERY FAILED.
message.
The cause of the issue is the fact that PreselectQuery.Component
keeps an private filed hqlPreprocessor
that 'preprocessed' one-time-only.
The preprocessing involves replacing the "= NULL"
with "NOT NULL" phrases. In a join where the common field is unknown/null in the first iteration will cause the entire query to get stuck with "not null" even if next row of the components are not null any more.
In the protected PreselectQuery.Component.getHQLPreprocessor()
, hqlPreprocessor
is lazy initialized and kept for subsequent usages, but it should not be cached at all. Or at least it should re-processed with each set of resolvedArgs
.
#2 Updated by Eric Faulhaber about 9 years ago
- Assignee set to Ovidiu Maxiniuc
- Priority changed from Normal to High
#3 Updated by Ovidiu Maxiniuc about 9 years ago
Eric,
In HQLPreprocessor.get()
there is a check (line ~454) if the current arguments contain a null/unknown value.
I believe that there was some kind of handling of this issue or something similar. I understand that at this moment, only HQLPreprocessor
with not-null arguments are cached.
For a quick solution, I wonder: what if we cache the all the objects ?
Another thing that is worthy to be mentioned here: it is express noted that the cache is simplified of inlined HQLs. I'm thinking that replacing = null
with correct syntax is also some king of inlining - the reason the HQL is not seek in the cache. I discovered this bug when I observed that the HQL contained a "not null" when there shouldn't be any. What if other inline operations causes the same bug? I mean: 1st time the HQL is processed, inlined, not saved in cache, but saved in PreselectQuery.Component
as a private field. If the arguments that were inlined change during the loop iteration of the previous Component
, wouldn't that render the HQL stored in PreselectQuery.Component
member field invalid ?
#4 Updated by Eric Faulhaber about 9 years ago
Ovidiu Maxiniuc wrote:
Eric,
InHQLPreprocessor.get()
there is a check (line ~454) if the current arguments contain a null/unknown value.
I believe that there was some kind of handling of this issue or something similar. I understand that at this moment, onlyHQLPreprocessor
with not-null arguments are cached.
For a quick solution, I wonder: what if we cache the all the objects ?
Please explain how this would solve the issue. Also, I'm concerned this would make the cache huge.
Another thing that is worthy to be mentioned here: it is express noted that the cache is simplified of inlined HQLs. I'm thinking that replacing
= null
with correct syntax is also some king of inlining - the reason the HQL is not seek in the cache. I discovered this bug when I observed that the HQL contained a "not null" when there shouldn't be any. What if other inline operations causes the same bug? I mean: 1st time the HQL is processed, inlined, not saved in cache, but saved inPreselectQuery.Component
as a private field. If the arguments that were inlined change during the loop iteration of the previousComponent
, wouldn't that render the HQL stored inPreselectQuery.Component
member field invalid ?
I think I tried to address this with PreselectQuery$Component.resetArgs()
. I have not debugged this in a long time...perhaps that mechanism is flawed or is not called at the right time?
#5 Updated by Ovidiu Maxiniuc about 9 years ago
Indeed, storing all inlined HQLPreprocessor
is not a good idea. Many of them won't ever be reused. The difficulty is identifying when the HQLPreprocessor
should be dropped or reset. I had a short look over the PreselectQuery$Component.resetArgs()
and this looks like the right solution. I am investigating now why this is not called at the right moment or is not doing its job right.
Thanks!
#6 Updated by Eric Faulhaber about 9 years ago
Ovidiu Maxiniuc wrote:
The difficulty is identifying when the
HQLPreprocessor
should be dropped or reset.
The original idea was to cache only those which had NOT had specific arguments inlined, since those would be useful only with those specific substitution parameters (and the HQL we use as part of the key would have been changed).
The implementation of inlining was expanded over time to address other issues as we found them. Nevertheless, as long as the HQLPreprocessor
instance is aware it had been inlined (i.e., wasInlined()
returns true
), this should still work. IIRC, the inlining of null substitution parameters as an is [not] null
phrase was one of the first types of inlining to be implemented, so I would expect that to work, but it is worth confirming this mechanism has not been compromised by later changes.
Even if the wasInlined()
part is still working properly, your findings suggest the code which uses that information may be flawed.
#7 Updated by Ovidiu Maxiniuc about 9 years ago
wasInlined()
seems to work correctly but resetArgs()
is not called when it should.
I checked and it is only called from PreselectQuery.reset()
. This is not correct. The argument should be reset after each iteration from the loop. At this moment CompoundQuery.processComponent()
calls reset()
but with argument resolveArgs = false
because resolveOnce = true
.
This is NOT how it is documented in the javadoc of CompoundQuery
:
FOR {EACH | FIRST | LAST} resolveOnce = false
#8 Updated by Ovidiu Maxiniuc about 9 years ago
I believe that I found why resolveOnce = true
.
The converted java code is:
qh.unwrapQuery().prepare("FOR EACH tt1, EACH tt2 WHERE tt2.f1 EQ tt1.f1 AND tt2.f2 EQ tt1.f2");
I was expected to apply the Progress construct from my previous note. This is not the case because
QueryProcessor.preparePredicate()
will rewrite the predicate like OPEN QUERY DynGenQuery FOR EACH tt1, EACH tt2 WHERE tt2.f1 EQ tt1.f1 AND tt2.f2 EQ tt1.f2
. So the converter will go for 1st choice listed in the CompoundQuery
javadoc:OPEN QUERY FOR resolveOnce = true
#9 Updated by Eric Faulhaber about 9 years ago
So, the conversion semantic is being honored correctly, but there is a flaw with our runtime implementation? To help me understand this better, would you please post the converted form of the dynamic query? I'd like to see what the HQL and substitution parameter(s) look like.
#10 Updated by Ovidiu Maxiniuc about 9 years ago
I've done some investigations, rewriting the dynamic query from the note 1 as a static iteration and also in a Progress query. Here are the P4GL sources:
FOR EACH tt1, EACH tt2 WHERE tt2.f1 EQ tt1.f1 AND tt2.f2 EQ tt1.f2: DISPLAY tt1. DISPLAY tt2. END.
DEFINE QUERY qq FOR tt1, tt2. OPEN QUERY qq FOR EACH tt1, EACH tt2 WHERE tt2.f1 EQ tt1.f1 AND tt2.f2 EQ tt1.f2. GET FIRST qq. DO WHILE AVAILABLE tt1: DISPLAY tt1. DISPLAY tt2. GET NEXT qq. END.
and java sources respectively:
query0 = new CompoundQuery(false, false); query0.addComponent(new AdaptiveQuery(tt1, (String) null, null, "tt1.id asc")); RecordBuffer.prepare(tt2); query0.addComponent(new AdaptiveQuery(tt2, "tt2.f1 = ? and tt2.f2 = ?", null, "tt2.id asc", new Object[] { new FieldReference(tt1, "f1"), new FieldReference(tt1, "f2") })); public void body() { query0.iterate(); [...] }
query0.assign(new CompoundQuery(true, false)); query0.addComponent(new AdaptiveQuery(tt1, (String) null, null, "tt1.id asc")); RecordBuffer.prepare(tt2); query0.addComponent(new AdaptiveQuery(tt2, "tt2.f1 = ? and tt2.f2 = ?", null, "tt2.id asc", new Object[] { new FieldReference(tt1, "f1"), new FieldReference(tt1, "f2") })); query0.open(); query0.first(); while (tt1._available()) { [...] query0.next(); }
Note that the general structure is the same, one first major difference is the QUERY version has a precondition (in the while
loop) while the 1st form depends on QueryOffEndException
to leave the loop when the query is iterate()
-ed. The second difference is in the CompoundQuery
constructor arguments: false, false
for FOR EACH
construct and true, false
for the QUERY
. They are the in accord with the CompoundQuery
javadoc.
There is one problem though: in P4GL, both codes will print out the same output:
┌───────────────────────────────────────┐ │ f1 f2 f1 f2 │ │────────── ──────── ────────── ────────│ │ 1 20/04/15 1 20/04/15│ └───────────────────────────────────────┘
(this is the same as the dynamic part in fact), while running converted Java of the
QUERY
version will not print any matching rows.
The dynamic P2J is generated using the OPEN QUERY
syntax, so the behavior became quite expected with these new informations.
I tried to force the resolveOnce
to false
for the query, too, and the java version gave the expected result. I checked the flow of that value and I saw that it is used only for reset()
-ting the query when the next component is processed.
At this moment I think we should always add components of multiple joined tables with resolveOnce
set to false
. This was thought as an optimization (to skip re-evaluating the arguments and adjusting the hql when possible) but if the query iterates multiple tables joined this way, it won't work.
#11 Updated by Eric Faulhaber about 9 years ago
Ovidiu Maxiniuc wrote:
At this moment I think we should always add components of multiple joined tables with
resolveOnce
set tofalse
. This was thought as an optimization (to skip re-evaluating the arguments and adjusting the hql when possible) but if the query iterates multiple tables joined this way, it won't work.
So, is your conclusion limited to the [OPEN QUERY] FOR EACH form of CompoundQuery
? Have you tested with [OPEN QUERY] PRESELECT forms as well?
What about this logic in PreselectQuery$Component.resetArgs
:
if (hqlPreprocessor != null && hqlPreprocessor.wasInlined()) { resetHQLPreprocessor(); }
Can this go wrong in the case where the HQLPreprocessor
was not inlined, because the first substitution parameter was not null
, but in a subsequent pass of the loop is null
? In this case, wouldn't the non-inlined preprocessor incorrectly be re-used?
If so, we need to find a way to correct this which does not throw away a good HQLPreprocessor
unnecessarily, since these are quite expensive to create.
#12 Updated by Ovidiu Maxiniuc about 9 years ago
- File hql-issue.p added
Eric Faulhaber wrote:
Ovidiu Maxiniuc wrote:
At this moment I think we should always add components of multiple joined tables with
resolveOnce
set tofalse
. This was thought as an optimization (to skip re-evaluating the arguments and adjusting the hql when possible) but if the query iterates multiple tables joined this way, it won't work.So, is your conclusion limited to the [OPEN QUERY] FOR EACH form of
CompoundQuery
? Have you tested with [OPEN QUERY] PRESELECT forms as well?
I tested right now. In my testcase the generated code for PRESELECT-ed query looks like
query0.assign(new PreselectQuery("tt1.id asc, tt2.id asc"));
so it does not use
CompoundQuery
constructor. When it is looped, the hql (select tt1.id, tt2.id from TempRecord1Impl as tt1, TempRecord2Impl as tt2 where (tt1._multiplex = ?0) and ((tt2._multiplex = ?1) and (tt2.f1 = tt1.f1 and tt2.f2 = tt1.f2)) order by tt1.id asc, tt1._multiplex asc, tt2.id asc
) is feed directly to Hibernate Session
to produce & scroll a Query
. There is one more bug here! In Progress, tt2.f2 = tt1.f2
stands even if both are nulls/unknown. If this hql is used in hibernate without pre-processing, rows that satisfy tt2.f2 = tt1.f2 = null
will not be selected!
What about this logic in
PreselectQuery$Component.resetArgs
:
[...]
Can this go wrong in the case where theHQLPreprocessor
was not inlined, because the first substitution parameter was notnull
, but in a subsequent pass of the loop isnull
? In this case, wouldn't the non-inlined preprocessor incorrectly be re-used?
If so, we need to find a way to correct this which does not throw away a goodHQLPreprocessor
unnecessarily, since these are quite expensive to create.
Indeed, if the first loop does not need "is [not] null" inlining, the query gets stuck so future iterations. I believe we need to always reset the hql or we should at least use a different flag for is [not] null
inlineing. It's preferable to be slow and correct rather than fast and wrong.
Here are my findings in these scenarios:
- with P2J untouched (bzr 10840) it will print (3 10 3 10), half of the records are clearly ignored;
- if
resolveOnce = false
forOPEN QUERY
we get (10 10 10 10) - slightly better, but the unknown pairs (record with id 3) are not selected because hql got stuck withinline = false
; - otherwise, always resetting the
hqlPreprocessor
, we get (3 10 3 10), the same as the initial test. SinceresolveOnce = true
, thePreselectQuery.Component.resetArgs()
does not get called so thehqlPreprocessor
is kept; - if
resolveOnce = false
forOPEN QUERY
andPreselectQuery.Component.resetArgs()
always resets thehqlPreprocessor
we get (30 30 30 10).
We can see that the last result (for preselect query) is clearly unaffected by these changes. The wrong values is the result of the missing records with id = 3
because both these records have f2=?
- see my 1st answer in this note.
#13 Updated by Ovidiu Maxiniuc about 9 years ago
inline
flag. The HQLPreprocessor.get()
is aware of the iterations with unknown parameters and will not cache these instances. There are two important steps:
- creation of the
HQLPreprocessor
. At this moment a 'fingerprint' of the argument used for creating the custom hql is saved (boolean array withtrue
on unknown position). This has the same lifespan as thehqlPreprocessor
in thePreselectQuery.Component
. - when the
Component
is about to be reset. If thehqlPreprocessor
was inlined (other cases than [not] null substitution), then it is dropped (and the fingerprint with it). Next, the current parameters are checked against the fingerprint from previous iteration. If they match, thehqlPreprocessor
stays, otherwise it is discarded because the hql contains another configuration of unknown fields.
cache
) only when really needed. The price is composed from:
- the fingerprint (array of N booleans). Usually they are only a couple of them, I never saw more than a dozen in the customer server project. They are discarded on each iteration so the memory is collected immediately.
- a linear iteration on the parameter list to check and store if they are null when the
hqlPreprocessor
is created and one iteration on each loop to see if the fingerprint changed in the new list of parameters. And only if thehqlPreprocessor
was not otherwise inlined case in which it is dropped beforehand.
#14 Updated by Eric Faulhaber about 9 years ago
This seems like a well-reasoned, practical approach. Please post an update for review when you are satisfied with your local testing.
#15 Updated by Ovidiu Maxiniuc about 9 years ago
- File hql-issue.p added
- File om_upd20150421a.zip added
The attached update fixes the management of the preprocessed hql. There are some things I am not completely satisfied so some part of the old code remained in place.
Regarding the remaining bug. Normally a query like tt2.f1 = tt1.f1
in SQL will select records with respective fields f1
not nulls. To mimic the Progress semantics, the SQL/hql should be rewritten into something like: (tt2.f1 = tt1.f1 or (tt2.f1 is null and tt1.f1 is null))
.
However, the syntax change should be doe not only for EQ
operator, but for all of the relational operators. I believe that the easiest solution is to use the functions exported from Operators
that return the correct expected relations between database fields.
#16 Updated by Eric Faulhaber about 9 years ago
Ovidiu Maxiniuc wrote:
Regarding the remaining bug. Normally a query like
tt2.f1 = tt1.f1
in SQL will select records with respective fieldsf1
not nulls. To mimic the Progress semantics, the SQL/hql should be rewritten into something like:(tt2.f1 = tt1.f1 or (tt2.f1 is null and tt1.f1 is null))
.
However, the syntax change should be doe not only forEQ
operator, but for all of the relational operators. I believe that the easiest solution is to use the functions exported fromOperators
that return thecorrectexpected relations between database fields.
The problem with this approach is the database query planner. We don't use the Operators
UDFs (I've tried before) because they are opaque to the database's query planner and cause the database to choose very inefficient query plans (i.e., lots of table scans). This is a non-starter, since it effectively disables the use of database indexes in queries and thus absolutely kills query performance.
#17 Updated by Ovidiu Maxiniuc about 9 years ago
Then the only solution is to rewrite the relational operators applied on two fields in the hqls as mentioned in note 15.
#18 Updated by Eric Faulhaber about 9 years ago
Agreed, we should try this. Hopefully, the or
in the where clause does not cause a performance problem as well.
I think we probably should implement this in the HQLPreprocessor
, to keep the converted HQL cleaner. On the other hand, this approach is a bit opaque, in that the behavior of the HQL the developer sees is changed at runtime. What are your thoughts?
#19 Updated by Eric Faulhaber about 9 years ago
Code review 0421a:
Looks good, please regression test. It seems the change to the CompoundQuery(boolean, boolean)
constructor be accompanied (perhaps in a separate pass) by a conversion change which eliminates the resolveOnce
parameter? If it is always false
now, why not eliminate it altogether?
#20 Updated by Ovidiu Maxiniuc about 9 years ago
- File hql-issue2.p added
- File om_upd20150422a.zip added
Eric Faulhaber wrote:
Please find attached the testcase and the update that fixes it. As you can see, without the update the result is bad (it outputs the row results too, in a text file):Agreed, we should try this. Hopefully, the
or
in the where clause does not cause a performance problem as well.I think we probably should implement this in the
HQLPreprocessor
, to keep the converted HQL cleaner. On the other hand, this approach is a bit opaque, in that the behavior of the HQL the developer sees is changed at runtime. What are your thoughts?
EQ NE LT GT LTE GTE Expected: 30,008 180,741 107 512 30,115 30,520 Computed: 5 619 107 512 112 517
The adjustments done are the following:
EQ |
tt2.f2 = tt1.f2 | tt2.f2 = tt1.f2 or tt2.f2 is null and tt1.f2 is null |
NE |
tt2.f2 != tt1.f2 | tt2.f2 != tt1.f2 or tt2.f2 is null and tt1.f2 is not null or tt2.f2 is not null and tt1.f2 is null |
LT |
tt2.f2 < tt1.f2 | changes not required |
GT |
tt2.f2 > tt1.f2 | changes not required |
LTE |
tt2.f2 <= tt1.f2 | tt2.f2 <= tt1.f2 or tt2.f2 is null and tt1.f2 is null |
GTE |
tt2.f2 >= tt1.f2 | tt2.f2 >= tt1.f2 or tt2.f2 is null and tt1.f2 is null |
The most complicated is the NE
, I don't know if there is a shorter form for the expression. The good part is that formed of OR
-s too.
All these changes happen in the HQLPreprocessor
. I understand that the queries might confuse peoples, but they are required to match the results P4GL returns using HQL/SQL dialect. I suppose a knowledgeable programmer does not expect to have P4GL simply work in SQL as well. If needed, we can add some warnings in the docs/javadoc.
#21 Updated by Ovidiu Maxiniuc about 9 years ago
Eric Faulhaber wrote:
Code review 0421a:
Looks good, please regression test. It seems the change to the
CompoundQuery(boolean, boolean)
constructor be accompanied (perhaps in a separate pass) by a conversion change which eliminates theresolveOnce
parameter? If it is alwaysfalse
now, why not eliminate it altogether?
The CompoundQuery
remained the same because I wanted to be sure the runtime works fine before making more changes. Unfortunately, there were many (hundreds) regressions in the test. It's clear something is wrong in there.
#22 Updated by Eric Faulhaber about 9 years ago
Ovidiu Maxiniuc wrote:
Eric Faulhaber wrote:
Code review 0421a:
Looks good, please regression test. It seems the change to the
CompoundQuery(boolean, boolean)
constructor be accompanied (perhaps in a separate pass) by a conversion change which eliminates theresolveOnce
parameter? If it is alwaysfalse
now, why not eliminate it altogether?The
CompoundQuery
remained the same because I wanted to be sure the runtime works fine before making more changes. Unfortunately, there were many (hundreds) regressions in the test. It's clear something is wrong in there.
Yes, I tested this update with the passing and failing tests overnight and it went badly (did not finish, lots of timeouts and failures). I noticed some new failures in the primary key search tests, so I then ran just the passing search tests. There were 30+ new failures. Unfortunately, my archive of the results was overwritten, so I need to run the tests again. I'll post the before and after results once that's done.
Also, you've added a SEVERE-level log message about the OUTER-JOIN to INNER-JOIN override. AFAIR, this is what Progress does (please double-check this), and I was just mimicking the behavior, so I don't think this deserves an error message -- I'm pretty sure Progress does not give you one. There are many of these messages in the server.log and it is unnecessarily alarming. I think we should (a) reduce the logging severity to FINE (since I believe this is Progress-compatible behavior); and (b) make the message more informative, so the reader can actually associated it with a query.
#23 Updated by Ovidiu Maxiniuc about 9 years ago
Yes, I tested this update with the passing and failing tests overnight and it went badly (did not finish, lots of timeouts and failures). I noticed some new failures in the primary key search tests, so I then ran just the passing search tests. There were 30+ new failures. Unfortunately, my archive of the results was overwritten, so I need to run the tests again. I'll post the before and after results once that's done.
I am really sorry I broke your test results. My tests were quite focused on the task, I did not fully test it to see any side effects.
Also, you've added a SEVERE-level log message about the OUTER-JOIN to INNER-JOIN override. AFAIR, this is what Progress does (please double-check this), and I was just mimicking the behavior, so I don't think this deserves an error message -- I'm pretty sure Progress does not give you one. There are many of these messages in the server.log and it is unnecessarily alarming. I think we should (a) reduce the logging severity to FINE (since I believe this is Progress-compatible behavior); and (b) make the message more informative, so the reader can actually associated it with a query.
I will certainly remove the message, it was not meant to reach the repository in this form. In fact the only thing that really needs logged is when the conversion from OUTER-JOIN to INNER-JOIN truly happens. So far I did not observe this, I just wanted to be sure it is logged if ever the case.
#24 Updated by Eric Faulhaber about 9 years ago
Ovidiu Maxiniuc wrote:
I haven't reviewed the update yet, but to respond to one of your items...
The most complicated is theNE
, I don't know if there is a shorter form for the expression. The good part is that formed ofOR
-s too.
...
NE
tt2.f2 != tt1.f2 tt2.f2 != tt1.f2 or tt2.f2 is null and tt1.f2 is not null or tt2.f2 is not null and tt1.f2 is null
I'm going to use A and B (and parentheses), because it's easier for me to understand. The equivalent of your expansion is:
(B != A) or (B is null and A is not null) or (B is not null and A is null)
Doesn't this simplify to the following?
(B != A) or (B is null != A is null)
Is that valid HQL/SQL? If not, we go with your original version.
#25 Updated by Eric Faulhaber about 9 years ago
Ovidiu Maxiniuc wrote:
I am really sorry I broke your test results. My tests were quite focused on the task, I did not fully test it to see any side effects.
No apology necessary. The purpose was to regression test your update.
I will certainly remove the message, it was not meant to reach the repository in this form. In fact the only thing that really needs logged is when the conversion from OUTER-JOIN to INNER-JOIN truly happens. So far I did not observe this, I just wanted to be sure it is logged if ever the case.
I didn't notice it either, though I did not comb through all the messages (there were a lot). I think leaving the inner message (perhaps as INFO) is good, but since you are removing the more general message before it, make the prefix something like, "Converting OUTER-JOIN to INNER-JOIN: ".
#26 Updated by Ovidiu Maxiniuc about 9 years ago
Doesn't this simplify to the following?
(B != A) or (B is null != A is null)
Is that valid HQL/SQL? If not, we go with your original version.
You are right. I hardcoded it (it also have the advantage of the similarity with =
form, except for central node).
I tested it in H2 query tester and the results are very good (full parentheses were required). I quickly added into HQLPreprocessor
:
{tt2.f2 != tt1.f2} HQLPreprocessor => [tt2.f2 != tt1.f2 or tt2.f2 is null <> tt1.f2 is null]
Unfortunately, it does not work. The result were different. The only way to see why was to enable SQL output, and here it is:
select temprecord0_.id as col_0_0_, temprecord1_.id as col_1_0_ from tt1 temprecord0_ cross join tt2 temprecord1_ where temprecord0_._multiplex=? and temprecord1_._multiplex=? and (temprecord1_.f2<>temprecord0_.f2 or (temprecord1_.f2=null)<>temprecord0_.f2 is null) order by temprecord0_.id asc, temprecord0_._multiplex asc, temprecord1_.id asc
As you can see there is a big failure to properly convert it. An educated guess is that we need to patch Hibernate again because is not able to handle this kind of hql-s properly.
#27 Updated by Eric Faulhaber about 9 years ago
Ovidiu Maxiniuc wrote:
As you can see there is a big failure to properly convert it. An educated guess is that we need to patch Hibernate again because is not able to handle this kind of hql-s properly.
Let's not take this on if we don't have to. If your form works and doesn't cause a significant performance penalty, let's use it. The application developer won't see this complexity anyway, unless they have to log the expanded form.
#28 Updated by Ovidiu Maxiniuc about 9 years ago
Let's not take this on if we don't have to. If your form works and doesn't cause a significant performance penalty, let's use it. The application developer won't see this complexity anyway, unless they have to log the expanded form.
OK, I will document this in the code, leaving the second tree construction commented for a future optimization.
#29 Updated by Eric Faulhaber about 9 years ago
See #1868, note 51 for the primary key search test results with and without the 0421a update. There are 33 new failures which may help track down the problem.
#30 Updated by Eric Faulhaber about 9 years ago
It is difficult to review the 0422a update, because it is built on top of the 0421a update. Since we know we're not keeping the latter due to regressions, we either need a combined update which fixes the regressions from 0421a, or a clean version of 0422a which is not mixed with the 0421a changes.
#31 Updated by Ovidiu Maxiniuc almost 9 years ago
- File om_upd20150603a.zip added
Eric,
I found the probable cause for the failures in regression testing of the ?
relational semantics. I was working with HQLAst
tree and I was expecting that if I create an OR
node under an AND
node, the emit()
will correctly walk and create the hql using parentheses where needed. Unfortunately, emit()
method is not smart enough so the LPARENTS
nodes must be forced in this case. The attached update fixes this issue.
I know that the file is a little messed mostly because of the createAstNode
helper method. Please ignore the 'intermediary' System.out
s. I will remove them at commit time, but now they log the changes I apply in code. If there are issues with Majic, I can see immediately the bad queries. The most important changes are the blocks starting at lines 1308 and 1426. The rest are small code improvements.
I have a small question: at lines 1599, shouldn't we also handle the NOT_EQ
operator?
#32 Updated by Eric Faulhaber almost 9 years ago
Code review 0603a:
Looks good, nice find. Agreed, will look better with the println statements removed. Do you want to leave these behind as FINEST-level log messages, or are you concerned about the overhead?
at lines 1599, shouldn't we also handle the
NOT_EQ
operator?
That already evaluates to the BOOL_FALSE result. Is that an incorrect outcome?
#33 Updated by Ovidiu Maxiniuc almost 9 years ago
Looks good, nice find. Agreed, will look better with the println statements removed. Do you want to leave these behind as FINEST-level log messages, or are you concerned about the overhead?
I was going to remove them. I don't think processing of the relational operators worth getting to repo. Maybe the cache PUT operation, logged with lowest priority, eventually.
at lines 1599, shouldn't we also handle the
NOT_EQ
operator?That already evaluates to the BOOL_FALSE result. Is that an incorrect outcome?
In P4GL two unknowns are equals, contrary to standard SQL. So yes, NOT_EQ
operator must return No
when comparing two ?
.
#34 Updated by Eric Faulhaber almost 9 years ago
Ovidiu, was your 0603a update meant to be applied on top of the 0421a update?
I tested P2J 10871 + 0603a (without 0421a) against the full set of search unit tests. It does not fix any additional tests on its own, but it doesn't seem to regress anything either. I guess we need to correct the HQLPreprocessor
caching as well to potentially see an improvement? The good news is, there was no noticeable performance degradation with 0603a -- I was worried about this, given the greater complexity of the HQL.
#35 Updated by Ovidiu Maxiniuc almost 9 years ago
0603a is an independent update. I intentionally removed any relation to previous updates in order to isolate the issue at hand.
From my analysis, the impact of the rewritten SQL should be minimal on performance: the location and HQL changes are processed quickly, no iterations involved, there are rare cases when the additional boolean terms are evaluated.
I could count only 7 occurrences in the PRIMARY_KEY searches on the customer's server but there are 1227 queries affected in Majic tests. There are quite many of them and I wonder how this unknown issue is not affecting any of the tests. Probably there are not so many NULLs in database (?).
BTW, with the exception of two failed testcases in gso_ctrlc_tests
because of a client crash, the update passed the regression testing.
#36 Updated by Eric Faulhaber almost 9 years ago
I have posted my full search test results at #1868, note 60 (including the stdout.log
this time), for your reference.
Even though no new tests passed, I agree that this update produces more correct HQL than we had before. Perhaps we will see a different result when the original, HQL preprocessor caching issue is worked out.
Please very carefully remove the debug statements or convert them to proper logging as discussed above. After confirming the code compiles after these changes, please commit and distribute the revised update.
#37 Updated by Ovidiu Maxiniuc almost 9 years ago
- File om_upd20150604a.zip added
Eric Faulhaber wrote:
I have posted my full search test results at #1868, note 60 (including the
stdout.log
this time), for your reference.
Thanks, you read my mind.
Even though no new tests passed, I agree that this update produces more correct HQL than we had before. Perhaps we will see a different result when the original, HQL preprocessor caching issue is worked out.
Please very carefully remove the debug statements or convert them to proper logging as discussed above. After confirming the code compiles after these changes, please commit and distribute the revised update.
I removed the traces to System console and even ran a quick testset against the customer server project. The attached update was committed to bzr (10873) and distributed by mail, as usually.
#38 Updated by Ovidiu Maxiniuc almost 9 years ago
- File om_upd20150608c.zip added
This update should handle the main issue of this task: the caching of the hql-s. It will put into cache only the queries that have no null parameters. Also the constructor for CompoundQuery
will ignore the resolveOnce
parameter because the hql needs to be re-evaluated in the eventuality that unknown
fields were inlined with in not null
.
It a first run of regression testing it has failed some testcases but I cannot clearly state if it passed or not.
#39 Updated by Eric Faulhaber almost 9 years ago
Code review 0608c:
I think the changes are OK. Please post your regression test results when finished. I'll test this update tonight against the full set of search unit tests.
#40 Updated by Eric Faulhaber almost 9 years ago
Eric Faulhaber wrote:
I'll test this update tonight against the full set of search unit tests.
The test went well. A handful fewer tests failed than my last run, but it's hard to make a conclusion about whether this was due to the update or the instability of the current version of the test environment (my last few runs before this have had varying results, even though the code under test and runtime were the same). If it passes regular regression testing, please check this update in and distribute.
#41 Updated by Ovidiu Maxiniuc almost 9 years ago
The 0608c update passed the regression testing, was committed to bzr as revno 10876 and distributed by mail.
#42 Updated by Eric Faulhaber almost 9 years ago
- Status changed from WIP to Closed
- % Done changed from 0 to 100
#43 Updated by Eric Faulhaber almost 9 years ago
- Status changed from Closed to WIP
I've re-opened this task to address a defect: it is possible for the HQLPreprocessor
to create invalid ASTs when the new, NULL-handling logic affects fields of type character
. This is due to the fact that by the time this tree manipulation is done, the <alias>.<property>
references may already be enclosed by functions (like rtrim
). The fix is to confirm the top of the branch is an ALIAS node and if not, to traverse down the branch until a descendant node of type ALIAS is found. As an optimization, we only check the child at index 0 for each descendant level, since we know which functions can be in use and the order of their parameters.
The proposed fix is in 2552a/10888 (branched from trunk 10887). I currently am testing it with the set of unit tests which exposed this defect. If that goes well, I'll regression test.
#44 Updated by Eric Faulhaber almost 9 years ago
- Status changed from WIP to Closed
The change fixed the unit tests I was targeting.
I rebased 2552a to trunk rev. 10888 to pick up Hynek's redirected output fix, which put this fix at 2552a task rev. 10889.
The update passed regression testing and was merged into the trunk as rev. 10889. I'm re-closing this task as a result.
#45 Updated by Greg Shah over 7 years ago
- Target version changed from Milestone 11 to Cleanup and Stablization for Server Features