Bug #7474
FindQuery provides the wrong result because the record is not updated in the database
History
#1 Updated by Radu Apetrii almost 1 year ago
- Subject changed from Final sort is done even if a reverse index is used to FindQuery provides the wrong result because the record is not updated in the database
```
define temp-table tt
   field f1 as int
   field f2 as int
   index idx1 as unique primary f1.

define buffer b1 for tt.
define buffer b2 for tt.

do transaction:
   create b1.
   b1.f1 = 1.
   b1.f2 = 1.
end.

do transaction:
   b1.f1 = 2.
   create b2.
   b2.f1 = 1.
   b2.f2 = 2.

   define buffer b3 for tt.
   find first b3 where b3.f1 = 1.
   message b3.f1 b3.f2.
end.
```

After `RandomAccessQuery.honorRecordNursery` is executed, the table `tt` has two records with `f1 = 1`:
- one with `f1 = 1` and `f2 = 1`, which should have had `f1 = 2`;
- one with `f1 = 1` and `f2 = 2`.

This means that when the query is actually executed, the first record is found and not the second one (which should be the only one with `f1 = 1`). Thus, the result in FWD is `2 1`, while in 4GL it is `1 2`.
It seems to me that `RecordNursery` doesn't update the records that are already in the database.
#2 Updated by Radu Apetrii 12 months ago
- At line `create b2. b2.f1 = 1. b2.f2 = 2.`, there is no flush happening for the first record, because different buffers are in use and we are in a transaction.
- At line `find first b3 where b3.f1 = 1.`, the first record doesn't get updated because:
  - the buffer it is using is neither `b3` nor the default buffer for `tt`;
  - `RecordNursery` (I think) handles only new records, while the first record is already in the database; it's just not updated.
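To make the failure mode concrete, here is a minimal, self-contained Java model of the state split described above. All names (`db`, `session`, `findFirstWhereF1`) are hypothetical stand-ins, not FWD's actual API: the query filters on the stale database row, but the record handed back reflects the unflushed in-memory state, reproducing the `2 1` result.

```java
import java.util.*;

// Illustrative model, not FWD's actual API: committed rows live in "db",
// while live record state lives in "session". The assign b1.f1 = 2 is
// applied only in the session; the db row keeps the stale key f1 = 1.
public class StaleFlushDemo {
   static final Map<Long, int[]> db = new LinkedHashMap<>();   // id -> {f1, f2}
   static final Map<Long, int[]> session = new HashMap<>();    // in-memory state
   static long nextId = 0;

   static long create(int f1, int f2) {
      long id = nextId++;
      int[] row = {f1, f2};
      db.put(id, row.clone());   // creates are flushed
      session.put(id, row);      // live record stays in memory
      return id;
   }

   static void assignNoFlush(long id, int newF1) {
      session.get(id)[0] = newF1;   // memory only; db row untouched
   }

   static int[] findFirstWhereF1(int f1) {
      for (Map.Entry<Long, int[]> e : db.entrySet()) {
         if (e.getValue()[0] == f1) {       // filter on the STALE db value
            return session.get(e.getKey()); // but hydrate from memory
         }
      }
      return null;
   }

   public static void main(String[] args) {
      long b1 = create(1, 1);    // b1: f1 = 1, f2 = 1 (flushed)
      assignNoFlush(b1, 2);      // b1.f1 = 2, never flushed
      create(1, 2);              // b2: f1 = 1, f2 = 2 (flushed)
      int[] found = findFirstWhereF1(1);
      System.out.println(found[0] + " " + found[1]);   // 2 1, not the 4GL's 1 2
   }
}
```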
#4 Updated by Alexandru Lungu 12 months ago
- Status changed from New to WIP
- Assignee set to Radu Apetrii
- direct-access is non-deterministically finding any stale record
- lazy isn't able to catch up with the in-memory changes
As a summary, I think we shall have `AdaptiveQuery` use `RecordNursery` if it is in lazy mode (to start with). EDIT: not only when in lazy mode. Even if we invalidate, the underlying RAQ or `CompoundQuery` will face duplicate records in the database.
#5 Updated by Alexandru Lungu 12 months ago
I feel this vulnerability we have here is quite concerning, as it might generate inconsistent data from `_temp`. There are many ways to trigger it (I just tested several scenarios), so as long as a user is working with more than one buffer over a temporary table, there are lots of ways to exploit this issue.
At this stage, I am not quite sure why we "refrain" from eagerly flushing in the temporary tables. They aren't part of concurrency issues, so any updates/inserts there should be done eagerly. The only goal of the transactional environment is the possibility of roll-back; from a data consistency point of view, the only user able to view the changes is the one that triggered the transaction anyway. Please correct me if I am wrong.
Therefore, I don't feel like storing too much data in memory makes sense when working with `_temp`. This causes the problems we have in #7474-1 (what is in memory vs. what is in the database): if we do `where b3.f1 = 1`, we get a record with `f1` set to `2` - this is highly counter-intuitive.
I agree that we need to batch the changes as long as we can, so we don't spam the database with `update` and `insert`, but as soon as the table is queried, we need to make sure it is in an up-to-date state.
I will reiterate the fact that `RecordNursery` is already handling the "new records". We need to extend this to "updated records", otherwise queries will fail for records that were freshly updated in the database. I will need some advice here:
- make `RecordNursery` (or another similar class) responsible for updates (at least for `_temp`)
- make sure that `makeVisible` (or another similar method) also flushes the updates (at least for `_temp`)
- extend the use of `RecordNursery` for lazy `AdaptiveQuery` when iterating
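The proposal above can be sketched in a few lines. This is a hedged illustration only: the names (`registerCreate`, `registerUpdate`, `makeVisible`) mirror the discussion, not the real `RecordNursery` implementation. The nursery batches both new and updated records and flushes them just before a query reads the table, so the query never sees a stale row.

```java
import java.util.*;

// Standalone sketch of a nursery that tracks UPDATED records in addition to
// NEW ones, flushing both in makeVisible() before any query touches the table.
public class UpdateAwareNursery {
   static final class Rec {
      long id; int f1, f2;
      Rec(long id, int f1, int f2) { this.id = id; this.f1 = f1; this.f2 = f2; }
   }

   final Map<Long, Rec> table = new LinkedHashMap<>();          // "database" state
   final Map<Long, Rec> pendingNew = new LinkedHashMap<>();     // unflushed creates
   final Map<Long, Rec> pendingUpdated = new LinkedHashMap<>(); // unflushed updates

   void registerCreate(Rec r) { pendingNew.put(r.id, r); }

   void registerUpdate(Rec r) {
      // a record still pending as "new" will be flushed with its latest state anyway
      if (!pendingNew.containsKey(r.id)) {
         pendingUpdated.put(r.id, r);
      }
   }

   // flush pending inserts AND updates so the table is in an up-to-date state
   void makeVisible() {
      for (Rec r : pendingNew.values())     table.put(r.id, new Rec(r.id, r.f1, r.f2));
      for (Rec r : pendingUpdated.values()) table.put(r.id, new Rec(r.id, r.f1, r.f2));
      pendingNew.clear();
      pendingUpdated.clear();
   }

   Rec findFirstWhereF1(int f1) {
      makeVisible();   // queries always see flushed state
      for (Rec r : table.values()) {
         if (r.f1 == f1) return r;
      }
      return null;
   }
}
```

Replaying the testcase from note #1 against this sketch, the find returns the record with `f2 = 2`, matching the 4GL result.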
#6 Updated by Ovidiu Maxiniuc 12 months ago
Isn't this the same issue as #7373?
As noted in my #7373-35, the problem can apparently be fixed (it did for that testcase, but extended tests are required) by using `buffer.isTemporary()` as the 2nd parameter of `buffer.validateMaybeFlush()` to force the flush of `_temp` table records.
However, the problem is not limited to the `_temp` database: if you construct the same testcase using a permanent database, the issue is still present.
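The shape of that workaround can be sketched as follows. This is an illustration under stated assumptions, not the real FWD code: `validateMaybeFlush(force)` and `isTemporary()` here are stand-ins modeling the behavior described above, where passing `isTemporary()` as the force flag makes `_temp` records flush eagerly while permanent records keep the deferred path.

```java
import java.util.*;

// Hedged sketch of the idea in note #6: a forced validateMaybeFlush() writes
// the staged change immediately; an unforced one defers it for batching.
public class MaybeFlushDemo {
   final List<String> db = new ArrayList<>();   // stand-in for flushed rows
   String pending = null;                       // staged, unflushed change
   final boolean temporary;

   MaybeFlushDemo(boolean temporary) { this.temporary = temporary; }

   boolean isTemporary() { return temporary; }

   void assign(String change) { pending = change; }

   void validateMaybeFlush(boolean force) {
      // record validation would happen here; flush only when forced
      if (force && pending != null) {
         db.add(pending);
         pending = null;
      }
   }
}
```

With `buffer.validateMaybeFlush(buffer.isTemporary())`, a `_temp` buffer flushes on every assignment while a permanent-table buffer keeps batching, which is exactly the eager-vs-deferred trade-off debated in the next comments.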
#7 Updated by Alexandru Lungu 12 months ago
Ovidiu Maxiniuc wrote:
> Isn't this the same issue as #7373?
> As noted in my #7373-35, the problem can apparently be fixed (it did for that testcase, but extended tests are required) by using `buffer.isTemporary()` as the 2nd parameter of `buffer.validateMaybeFlush()` to force the flush of `_temp` table records.
> However, the problem is not limited to the `_temp` database: if you construct the same testcase using a permanent database, the issue is still present.
It is the same. However, I am not sure that the solution of "force" flush is ideal, as it may spam the database with small updates instead of batching. I mean, if we "stack" the changes and flush them at the latest point in time possible, we can have fewer updates (less SQL), right? That is why I am still trying to hang on to `RecordNursery`, which does that for creates. Anyway, I read #7373 and there are other major problems to be tackled first (persistent database).
Even so, I am long past the point where I handle the `_temp` database the same as the persistent database. This is mostly because we already have some low-level implementation over H2, which allows us to work with temporary data with virtually no overhead. I mean, for the `_temp` database there is close to no difference between the in-memory FWD state and the FWD-H2 database state, so we can compromise on aggressively flushing `_temp` tables alone and find another solution for the persistent database. Also, for the `_temp` database we have some low-level interaction (LAZY, DIRECT-ACCESS) which requires consistent data. For LAZY at least, having changes both in memory and in the database is really bad.
#8 Updated by Ovidiu Maxiniuc 12 months ago
I also had some thoughts on performance. Ideally, we should delay the flush as long as possible, as long as the queries perform the same as in 4GL (provide the same result sets). But now this is a problem, so apparently we cannot wait that long. Correctness prevails, right?
The batching solution is good. If we know that a burst of record flushes will occur, we can use it (like in a 4GL `assign` statement, or when a procedure ends and buffers reach the end of their scope). Otherwise, we need to stick to grouping the flushes at transaction level.
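The gain from batching can be illustrated with a minimal sketch (hypothetical names, not FWD code): assignments are staged per record, and one flush at a batch boundary (end of an `assign` statement, buffer scope end, or transaction end) issues one write per dirty record rather than one write per assignment.

```java
import java.util.*;

// Illustrative only: staged field assignments are grouped per record id, and
// flushBatch() collapses them into a single UPDATE per dirty record.
public class BatchedFlush {
   // recId -> (field -> latest value); the inner map coalesces re-assignments
   final Map<Long, Map<String, Object>> staged = new LinkedHashMap<>();
   int statementsIssued = 0;   // stand-in for the number of SQL UPDATEs sent

   void stage(long recId, String field, Object value) {
      staged.computeIfAbsent(recId, k -> new LinkedHashMap<>()).put(field, value);
   }

   // called at a batch boundary: one statement per dirty record
   void flushBatch() {
      statementsIssued += staged.size();
      staged.clear();
   }
}
```

Three staged assignments touching two records produce two statements instead of three; an eager per-assignment flush would have produced three.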
#11 Updated by Tomasz Domin 4 months ago
I've hit the issue while working on intra-session dirty share: if a record is modified in a transaction different from the one in which it was created, its changes are not reflected in the database, thus queries (`RandomAccessQuery`) do not work correctly.
I am working on the issue, it is not solved yet. Please see #7288-42.
#13 Updated by Tomasz Domin 4 months ago
I think I've implemented a fix for this.
The issue was related to the fact that `RecordNursery` was only involved in a transaction when the record was created and `IndexState` was empty.
In `6667i` I've implemented record index tracking in `RecordNursery` regardless of the `IndexState`.
```diff
--- old/src/com/goldencode/p2j/persist/orm/Validation.java  2024-01-16 13:46:19 +0000
+++ new/src/com/goldencode/p2j/persist/orm/Validation.java  2024-03-08 19:48:20 +0000
@@ -46,6 +46,7 @@
 ** 008 DDF 20231127 Used a dialect specific method to handle unique constraint violation.
 ** 009 RAA 20230724 Replaced FunctionWithException with QueryStatement.
 **     TJD 20240110 Fixes for DirtyShare cross session.
+**     TJD 20240308 Allow RecordNursery to work over records modified in another transaction
 */

 /*
@@ -359,24 +360,25 @@
       // * any indices for a new record were updated
-      if (dmo.getIndexState().isEmpty())
```