Bug #6923
inner transaction block reverts too much
0%
History
#1 Updated by Radu Apetrii over 1 year ago
//define temp-table test field f1 as integer field f3 as integer. for each test: delete test. end. do transaction: create test. test.f1 = 999. test.f3 = 2. create test. test.f1 = 998. test.f3 = 5. end. do transaction on error undo, leave: create test. test.f1 = 997. test.f3 = 7. create test. test.f1 = 996. test.f3 = 11. test.f3 = 13. do transaction: test.f3 = 17. undo, leave. end. end. find last test. message test.f3. define query q for test scrolling. open query q preselect each test. get first q. do while available(test): message test.f1 test.f3. get next q. end. for each test: delete test. end.
If I test with temporary tables, here are the results:
4GL | FWD |
---|---|
13 | 7 |
999 2 | 999 2 |
998 5 | 998 5 |
997 7 | 997 7 |
996 13 |
4GL | FWD |
---|---|
2 | 2 |
996 13 | 997 7 |
997 7 | 998 5 |
998 5 | 999 2 |
999 2 |
I want to point out that my persistent table called test
has 2 fields (f1
and f3
, both of type integer), but, the field f1
is part of an index (I don't think it matters, but just in case).
The problem seems to be inside the inner transaction when encountering undo, leave
. Instead of undo-ing/reverting only the value of test.f3
to 13, the program deletes the whole record of test
with test.f1 = 996
.
#2 Updated by Radu Apetrii over 1 year ago
Update: If you replace the middle transaction block with the code below, the program works fine. It reverts only the value of the test.f3
, (re)making it equal to 13, and no other places are affected by the undo, leave
.
do transaction on error undo, leave: create test. test.f1 = 997. test.f3 = 7. create test. test.f1 = 996. test.f3 = 11. test.f3 = 13. create test. test.f1 = 995. test.f3 = 19. find prev test. do transaction: test.f3 = 17. undo, leave. end. end.
Also, maybe it was not clear enough, but the definition of the temp-table is commented out because I also have a persistent table called
test
. If I were to test this example with a temporary table, I would simply uncomment the first line.#3 Updated by Radu Apetrii over 1 year ago
- Start date deleted (
11/10/2022) - Assignee set to Radu Apetrii
- Status changed from New to WIP
I narrowed down the code, in order to describe what I have found until this moment. This is the most minimalistic example that does not work (I found others too):
define temp-table test field f1 as integer field f3 as integer. do transaction: create test. test.f1 = 995. test.f3 = 13. do transaction: test.f3 = 17. undo, leave. end. end. find first test.In 4GL there is no error given, while in FWD it says that it cannot find the first value of
test
.Each record has a
RecordState
that defines the state that it is currently in. This is done as follows:
- In the first
do transaction
, the record of the tabletest
will receive the stateNEW CHANGED TRACKED
, which is fine because the intent is to flush it to the database if no errors appear. - In the nested
do transaction
, at linetest.f3 = 17.
, the record will receive again the stateNEW CHANGED TRACKED
, which is fine again, due to the same reason as before. - At line
undo, leave
, the program reachesChangeSet.rollback
, more specifically,liveState.state = baselineState
. This means that the record will reset its state to the baseline one.
NEW CHANGED TRACKED
because we still have the outside transaction (the first one), but, that is not the case. The value of baselineState
is DEFAULT
, which means that when the flushing to the database happens, the record would remain untouched, because there is nothing to do with it (this is the DEFAULT
behavior). It will exist in the buffer, but not in the database.After that, I have investigated why the value of
baselineState
is DEFAULT
at that moment and I have come to these conclusions:
- Its value is only modified inside the constructor of
ChangeSet
. - The code that deals with creating new
ChangeSet
s is located inBaseRecord.prepareChangeSet
:
// create a change set if one with the same transaction ID does not already exist if (changeSet == null || !changeSet.verify(session, txId)) { changeSet = new ChangeSet(session, txId, data, state.state, txLevel); }
- For this example, there should be 2
ChangeSet
s, because there are 2 transactions that start. The second one is not created because the transaction ID is the same as the one before (and they should not be equal). txId
is obtained by taking the last 32 bits fromtxNestingId
.txNestingId
is obtained inSession.getTxNestingId
, and it consists of 2 parts: the first 32 bits represent the block nesting level and the last 32 bits are obtained fromtxCount
.- I see that
txCount
is increased insideSession.beginTransaction
, but that only happens once (for the firstdo transaction
) and it seems to me that this method does not take into consideration nesteddo transaction
s. Thus, the value of both ids is the same which leads to not making a newChangeSet
.
I've seen this behavior only inside nested do transaction
s. Also, I have tested with 3821c, rev. 14361.
Should I continue to investigate this further? If so, can you give me some information on how txCount
is supposed to work? Thank you.
#4 Updated by Eric Faulhaber over 1 year ago
When you nest transactions like:
do transaction: ... do transaction: ... end. ... end.
...this does not represent two full transactions. It represents a full transaction with an enclosed sub-transaction. When OE opens a full transaction, any block further down the stack which normally would open its own, full transaction becomes a sub-transaction block. Think of a sub-transaction as a savepoint, where anything done in that block can be committed or rolled back. If it is rolled back, it does not affect work done outside of it, in the context of the full transaction. Sub-transactions can be nested arbitrarily deeply (there may be a practical limit, but I mean this conceptually).
In FWD, we only open a full database transaction at the outermost, full transaction block we encounter. It stays open and is committed or rolled back when that block ends or iterates. This is not ideal, as SQL best practices encourage transactions to be short-lived, and we don't know how long such a block will take to finish its work, but we follow the 4GL semantic in this regard.
As we encounter sub-transactions within that outermost, open transaction, we use:
- savepoints to represent these sub-transaction units of work on the database side, and
ChangeSet
levels to track the in-memory state of a record.
Both the database and in-memory state need to remain consistent w.r.t. each transaction and sub-transaction level, but they are not necessarily in sync with each other at a given point, because of the timing with which in-memory changes are flushed to the database. So, we use these two independent but related mechanisms.
Session.txCount
is the count of full transactions opened by a particular Session
instance. The purpose of this variable is only to produce a unique identifier for a particular, full transaction within that scope/context. It is not incremented for sub-transactions. Individual sub-transactions are identified by their level within a given full transaction.
Session.getTxNestingId
provides both these values in a 64-bit long
. The javadoc for this method is wrong (I must have changed the implementation without updating the javadoc). The method stores the current block's sub-transaction nesting level in the high-order 4 bytes of the return value, and the full transaction identifier (i.e., txCount
) in the low-order 4 bytes.
There should only ever be 0 or 1 ChangeSet
instances per record at any moment. There is never more than 1, because only one full transaction can be active at a time. So, I believe that part is working correctly. The bug could be in the restoration of state for a sub-transaction, but that is based on intuition only; I have not debugged into your test cases. OTOH, the fact that the results are different in your first test case between temp-table and persistent table use cases suggests some difference in the savepoint processing. Are you using H2 or PostgreSQL as the back-end for your persistent table use case?
Please continue with the investigation, taking my description of the intended design into account, and document your further findings here.
#5 Updated by Eric Faulhaber over 1 year ago
Eric Faulhaber wrote:
...the fact that the results are different in your first test case between temp-table and persistent table use cases suggests some difference in the savepoint processing.
I just realized the results of the 4GL programs are different between these two cases as well.
I think this is explained by the difference in transaction behavior between the temp-table and persistent table use cases.
For the persistent-table case, I believe the for each test
loops at the start and end of the program which delete all the records in the table cause an implicit, full transaction to be scoped to the external procedure block. So, the two do transaction
blocks are treated as sub-transactions.
For the temp-table case, I believe the delete loops do NOT cause an implicit, full transaction to be scoped to the external procedure block. So, the two do transaction
blocks each are treated as full transactions.
You can create a listing of the original program in both forms using the COMPILE statement to confirm this.
Regardless, FWD's results still differ, so we need to understand why.
#6 Updated by Radu Apetrii over 1 year ago
Thank you for the explanations!
I've taken them on board and I have seen that the bug is in fact about the restoration of the state for a sub-transaction. Unfortunately, it seems that there are more factors to this than just assigning the correct state. I'll use two examples to illustrate this point.
- When entering the full transaction, the record will be given the
NEW CHANGED TRACKED
state. Also, this means that inside theChangeSet
class, the arraystates
will haveCHANGED
on the first position. - When entering the sub-transaction, the state of the record will remain unchanged, but, the array
states
will haveCHANGED
on the second position as well. - At the line with
undo, leave
, the program would executeChangeSet.rollback
, which means that the record would be given the stateDEFAULT
. - When the sub-transaction ends, the program would try to flush the record to the database, which won't work because of the
DEFAULT
state that indicates that there is nothing to do with the record. This is fine because we do not want to flush the record when the sub-transaction ends. - When the outer transaction ends, the program would try to flush the record to the database, which won't work due to the same reason: the record's state is
DEFAULT
. The problem is that the record should've been committed to the database.
liveState.state = baselineState
inside ChangeSet.rollback
, I tried this:if (inner > 0) { liveState.state = states[inner - 1]; } else { liveState.state = baselineState; }This means that instead of assigning the baseline state, the record would receive the stored state (the one from
states
) of the "parent" transaction block. This leads to the following steps:
- The first two steps remain the same as in case 1.
- At the line with
undo, leave
, the program would executeChangeSet.rollback
, which means that the record would receive the stateCHANGED
, taken from the arraystates
. This state corresponds to the full transaction's stored state inside the array. - When the sub-transaction ends, the program would enter the
flush
method, but will not commit the record to the database. This happens because a transaction is already active (the full-transaction), and it is good, because this is not the moment to commit. But, inside theValidation.flush
method, the record will change its state, becomingCACHED
. - When the outer transaction ends, the program would try to flush the record to the database, but this will not work because the state
CACHED
does not suggest that it needs flushing.
In the second case, the record has a good state for committing (
CHANGED
) at a given moment, but that moment is not the right one. In order to commit the record, its state should be CHANGED
when the full transaction ends and the flushing is tried.There are 2 solutions that come to my mind, but I'm not sure if any of them is actually intended:
- The changing of the record's state could be done in two steps (at the moment, it happens only in the
ChangeSet.rollback
method): first, in therollback
method, make the state equal toDEFAULT
so when the flushing part happens (from the sub-transaction), nothing would be done. Then, after the flushing from the sub-transaction, make the state equal to what it is supposed to be. That means that if we are still inside a sub-transaction, make the state equal tostates[inner - 1]
, or, if we are in the full transaction block, make it equal tobaselineState
. Thus, the commit will only happen at the end of the full transaction, which I think was the intent anyways (with accent on "I think"). - The other solution, perhaps not as "clean" as the other one, is an extension of the second case. The
if (inner > 0)
would remain insideChangeSet.rollback
, which means that the record's state will be updated (in this case it will becomeCHANGED
). But, inside the flush method, instead of making the state equal toCACHED
, maybe another check could be done to see if indeed theCACHED
state should be given to the record or maybe another state (or remain unchanged, in this caseCHANGED
). This way, the state would remain the same until the record is finally committed to the database, and then it would be changed.
I will continue to investigate this, but I just wanted to provide an update. The main test is the same as in the #6923-3, the branch and revision are 3821c, rev. 14361.
Eric Faulhaber wrote:
Are you using H2 or PostgreSQL as the back-end for your persistent table use case?
I am using PostgreSQL to test with persistent tables.
For the persistent-table case, I believe the for each test loops at the start and end of the program which delete all the records in the table cause an implicit, full transaction to be scoped to the external procedure block. So, the two do transaction blocks are treated as sub-transactions.
For the temp-table case, I believe the delete loops do NOT cause an implicit, full transaction to be scoped to the external procedure block. So, the two do transaction blocks each are treated as full transactions.
You can create a listing of the original program in both forms using the COMPILE statement to confirm this.
I have done so. In my findings, the for each
loops at the start and end do not affect the full transaction scope, so those two do transaction
blocks are not treated as sub-transactions. A case where this would happen would be if the program would start with create test
. Then, the whole program would be considered a large, full-transaction.
Still, I will provide further updates on this issue.
#7 Updated by Radu Apetrii over 1 year ago
- File 6923-2.patch added
- In a
rollback
situation, instead of updating a record's state with thebaselineState
in all cases, I made the state equal to the stored state of the "parent" transaction, if the "parent" transaction is a sub-transaction, or equal to thebaselineState
otherwise. This happens inChangeSet.rollback
. - When the
states
array is being updated, instead of storing the valueflags
, I thought that the intent was to store the state of the record. So I modified the parameter fromflags
tostate.state
. This happens inBaseRecord.dataChanged
and inBaseRecord.bulkDataChanged
.
This is the test with the problem:
define temp-table test field f1 as integer field f3 as integer. do transaction: create test. test.f1 = 995. test.f3 = 13. create test. test.f1 = 996. test.f3 = 14. do transaction: find first test. test.f3 = 17. undo, leave. end. end. find first test. message test.f1 test.f3.What happens here is:
- Right before the creation of the second record (the one with
test.f1 = 996.
), the first record is committed to the database. This means that the record changed its state, becomingCACHED TRACKED
fromNEW CHANGED TRACKED
. This is fine and it makes sense. - The second record was created and right before the sub-transaction begins, the
states
array is updated with the stateNEW CHANGED TRACKED
on level 0 (corresponding to the full transaction) for that record. - Inside the sub-transaction, right before
find first test.
, the record withf1 = 996
is committed to the database and its state changes fromNEW CHANGED TRACKED
toCACHED TRACKED
. Also, thestates
array for that record, on level 1 (because we are in the sub-transaction) becomesCACHED TRACKED
. - When the
rollback
happens, the second record "reverts" its state, becomingNEW CHANGED TRACKED
again, because that was the value it last had in the full transaction. - When the
flush
happens, because the second record has the stateNEW CHANGED TRACKED
, it is committed to the database (or at least tried). This breaks the program, because the record already exists there and I cannot insert one with the sameprimaryKey
.
I'll pend this issue for the moment, because I'm not sure when I should revert the state and when not. Maybe you can give me some information on this, if that's OK.
Also, these changes were tested with 3821c, rev.14373.