Bug #5625

changing a before-table reserved field does not flush the change in the after-table

Added by Constantin Asofiei almost 3 years ago. Updated almost 3 years ago.

Status:
WIP
Priority:
Normal
Target version:
-
Start date:
Due date:
% Done:

0%

billable:
No
vendor_id:
GCD
case_num:
version_reported:
version_resolved:

History

#1 Updated by Constantin Asofiei almost 3 years ago

In OE, the error-string attribute is kept in sync between the before-table record and the after-table record. In FWD, the code responsible for keeping them in sync is in TemporaryBuffer.setupBeforeFlagImpl:

         if (updatePeer && record._peerRowid() != null)
         {
            Class<? extends Record> peerClass = 
                  ((BufferImpl) getDMOProxy()).peerBuffer.buffer().getDMOImplementationClass();
            Session session = getPersistence().getSession();
            TempRecord peerRecord = (TempRecord) session.get(peerClass, record._peerRowid());
            TempRecord peerSnapshot = (TempRecord) peerRecord.deepCopy();
            updater.update(peerRecord);
            peerRecord.updateState(null, DmoState.CHANGED, true);

            // reusing props
            changeBroker.stateChanged(RecordChangeEvent.Type.UPDATE, this, peerRecord, peerSnapshot, props);
         }

I've changed it to pass the proper TemporaryBuffer instance to the stateChanged call, like this:

            TemporaryBuffer peerBuf = (TemporaryBuffer) ((BufferImpl) getDMOProxy()).peerBuffer.buffer();
            Class<? extends Record> peerClass = peerBuf.getDMOImplementationClass();
            Session session = getPersistence().getSession();
            TempRecord peerRecord = (TempRecord) session.get(peerClass, record._peerRowid());
            TempRecord peerSnapshot = (TempRecord) peerRecord.deepCopy();
            updater.update(peerRecord);
            peerRecord.updateState(null, DmoState.CHANGED, true);

            // reusing props
            changeBroker.stateChanged(RecordChangeEvent.Type.UPDATE, peerBuf, peerRecord, peerSnapshot, props); // changed here

but it still doesn't work.

The problem here is that FWD keeps the cached copy and never flushes the record.

This test duplicates this problem:

def var i as int.
def temp-table tt1 before-table btt1 field f1 as int.
def dataset ds1 for tt1.

temp-table tt1:tracking-changes = true.

do transaction:
   create tt1.
   tt1.f1 = 10.
   release tt1.

   // buffer tt1:error = true.
   // buffer tt1:error-string = "a".

   buffer btt1:error = true.
   buffer btt1:error-string = "b".
   buffer btt1:rejected = true.
   release btt1.
end.

temp-table tt1:tracking-changes = false.
// flush the cache
do i = 1 to 1200:
    create tt1.
    tt1.f1 = - i.
    release tt1.
end.

find first tt1 where tt1.f1 = 10.
find first btt1.

message buffer tt1:error-string buffer tt1:error.  // b yes
message buffer btt1:error-string buffer btt1:error. // b yes

If the cache isn't flushed by the 1200 created records, then FWD behaves OK because the record remains in the cache! But it is actually never flushed to the database. This affects cases where fast-copy is used (e.g. to copy a dataset temp-table from a parameter), because the FWD cache is bypassed there (this standalone test just forces the stale record out of the cache).
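
The loss can be illustrated with a minimal, self-contained sketch (a hypothetical Rec class and plain maps, not FWD's actual Session/cache classes): an access-ordered LRU cache that silently discards evicted entries drops the unflushed error-string change.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class LruLossDemo
{
   // a record with one schema field and one runtime attribute (illustrative only)
   static class Rec
   {
      int f1;
      String errorString;
      boolean changed;

      Rec(int f1) { this.f1 = f1; }
   }

   static String demo()
   {
      Map<Integer, Rec> database = new HashMap<>();   // stands in for the persisted table
      final int capacity = 4;

      // access-ordered LRU cache which silently drops the eldest entry,
      // ignoring any unsaved CHANGED state it may carry
      Map<Integer, Rec> cache = new LinkedHashMap<Integer, Rec>(16, 0.75f, true)
      {
         @Override
         protected boolean removeEldestEntry(Map.Entry<Integer, Rec> eldest)
         {
            return size() > capacity;
         }
      };

      // create record 10 and flush it (the persisted copy has no error-string)
      Rec r = new Rec(10);
      database.put(10, new Rec(10));
      cache.put(10, r);

      // change a runtime attribute on the cached copy only; nothing is flushed
      r.errorString = "b";
      r.changed = true;

      // creating more records pushes record 10 out of the cache
      for (int i = 1; i <= 8; i++)
      {
         cache.put(-i, new Rec(-i));
      }

      // record 10 expired; an SQL-level reader sees only the stale database copy
      Rec visible = cache.containsKey(10) ? cache.get(10) : database.get(10);
      return visible.errorString;   // null: the change was lost
   }

   public static void main(String[] args)
   {
      System.out.println(demo());
   }
}
```

The 1200-record loop in the 4GL test plays the role of the filler loop here: it only makes the eviction happen deterministically.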

If the first release tt1. is removed, FWD gets an NPE on the buffer btt1:error = true. call. I think this test needs to be checked both with and without a transaction.

Ovidiu/Eric: can someone look at this? It is a show-stopper for #5505.

#3 Updated by Ovidiu Maxiniuc almost 3 years ago

Constantin Asofiei wrote:

Ovidiu/Eric: can someone look at this? It is a show-stopper for #5505.

I will do it tomorrow. Thank you for the testcase and detailed information.

#4 Updated by Ovidiu Maxiniuc almost 3 years ago

I identified two issues here.

The first was the NPE. This happened when the peer record was not found in the cache, which happens quite easily. If the record is not in the session cache, it must be a transient record in the peer buffer. Picking the record up from that location cures the exception.

I hoped fixing it would automatically fix the second issue. That problem is a bit different, although also related to the session's cache. If a record attribute is changed while the record is in the cache but NOT loaded in a buffer, the flags change as expected (the CHANGED flag is set). If the record is loaded in a buffer, the issue fixes itself, as the record is saved when the buffer is released. The problem appears, as Constantin noticed, when the cache is full and our record expires. In this case, the record is simply discarded (and its CACHED flag reset), regardless of the CHANGED flag. No other action is performed, so the changes to the record are lost.

The solution is to check the CHANGED flag of the records processed in cacheExpired and either:
  • allow them to live longer (already too late at this point, since this is only a notification that the record was dropped from the cache), or
  • issue an additional update for these records (which are probably not loaded in any buffer).
    Once the record is saved at this event, the procedure prints the expected result.
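
The second option can be sketched as a write-back eviction hook (names like cacheExpired are reused for illustration only; this is not FWD's actual cache implementation):

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class WriteBackCache
{
   static class Rec
   {
      int id;
      String errorString;
      boolean changed;

      Rec(int id) { this.id = id; }
   }

   final Map<Integer, Rec> database = new HashMap<>();
   final int capacity;
   final Map<Integer, Rec> cache;

   WriteBackCache(int cap)
   {
      this.capacity = cap;
      this.cache = new LinkedHashMap<Integer, Rec>(16, 0.75f, true)
      {
         @Override
         protected boolean removeEldestEntry(Map.Entry<Integer, Rec> eldest)
         {
            if (size() > capacity)
            {
               cacheExpired(eldest.getValue());   // notify before dropping
               return true;
            }
            return false;
         }
      };
   }

   // eviction notification: issue an additional update for CHANGED records
   void cacheExpired(Rec rec)
   {
      if (rec.changed)
      {
         database.put(rec.id, rec);
         rec.changed = false;
      }
   }

   static String demo()
   {
      WriteBackCache wb = new WriteBackCache(2);
      Rec r = new Rec(10);
      wb.database.put(10, new Rec(10));   // flushed copy, no error-string yet
      wb.cache.put(10, r);

      r.errorString = "b";                // changed only in the cache
      r.changed = true;

      wb.cache.put(1, new Rec(1));
      wb.cache.put(2, new Rec(2));        // evicts record 10, triggering write-back

      return wb.database.get(10).errorString;
   }

   public static void main(String[] args)
   {
      System.out.println(demo());         // b
   }
}
```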

Committed revision: 12878.

#5 Updated by Constantin Asofiei almost 3 years ago

  • Assignee set to Ovidiu Maxiniuc
  • Status changed from New to WIP

Ovidiu, this doesn't solve my issue. The original scenario involved using SQL statements like insert into ... select from to copy one temp-table to another directly (fast-copy). I don't see how your solution, which still keeps the record in the cache without flushing it, solves this problem.

Try disabling the cache completely and see what happens. This reproduces the failure:

def var i as int.
def temp-table tt1 before-table btt1 field f1 as int.
def dataset ds1 for tt1.

temp-table tt1:tracking-changes = true.

do transaction:
   create tt1.
   tt1.f1 = 10.
   release tt1.

   // buffer tt1:error = true.
   // buffer tt1:error-string = "a".

   buffer btt1:error = true.
   buffer btt1:error-string = "b".
   buffer btt1:rejected = true.
   release btt1.
end.

def var hds as handle.
hds = dataset ds1:handle.

// message tt1.f1 buffer tt1:error-string.  CA: do not use this
run b4state2b.p (input-output dataset-handle hds).

second program:

def input-output parameter dataset-handle hds.

def var hb1 as handle.
hb1 = hds:get-buffer-handle("tt1").
hb1:find-first().

message hb1::f1 hb1:error-string.

I think the correct solution would be this:
  • if the peer record (from before-table or after-table) is associated with a buffer, update the record and don't commit/flush the change. I think this should follow the normal flushing rules (i.e. when buffer is released).
  • otherwise, commit the change immediately, don't keep it in the cache.
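
A rough sketch of that two-branch rule, using hypothetical Rec/map stand-ins rather than FWD's Record/Session types:

```java
import java.util.HashMap;
import java.util.Map;

public class PeerUpdateRule
{
   static class Rec
   {
      int id;
      String errorString;
      boolean changed;

      Rec(int id) { this.id = id; }
   }

   // defer when a buffer will flush the record later; commit immediately
   // (and don't keep it cached) when no buffer holds it
   static void applyPeerChange(Rec peer, boolean loadedInBuffer,
                               Map<Integer, Rec> database, Map<Integer, Rec> cache)
   {
      if (loadedInBuffer)
      {
         peer.changed = true;             // normal flushing rules on buffer release
      }
      else
      {
         database.put(peer.id, peer);     // commit the change immediately
         cache.remove(peer.id);           // don't leave a changed copy in the cache
         peer.changed = false;
      }
   }

   static String demo()
   {
      Map<Integer, Rec> database = new HashMap<>();
      Map<Integer, Rec> cache = new HashMap<>();

      Rec peer = new Rec(10);
      database.put(10, new Rec(10));      // stale persisted copy
      cache.put(10, peer);

      peer.errorString = "b";
      applyPeerChange(peer, false, database, cache);   // not loaded in any buffer

      return database.get(10).errorString;   // now visible to SQL-level readers
   }

   public static void main(String[] args)
   {
      System.out.println(demo());
   }
}
```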

#6 Updated by Ovidiu Maxiniuc almost 3 years ago

Your second test program works fine for me, except that it lacks one line. When message tt1.f1 buffer tt1:error-string. is processed, tt1 is empty (it was released within the transaction). I added back the find first tt1 where tt1.f1 = 10. from the first note.

However, I investigated the construction further and I think there might be other causes for which the customer's application fails to run:
  1. if the second procedure (b4state2b.p) is called in a different user context, i.e. with a different Session. Since the original record is still in the cache and not flushed, it cannot be found from the second context. This seems natural, but we need to test how 4GL behaves in this case before jumping to implementation.
  2. a single procedure, but the after-image (which was stored in tt1 and then released) was dropped from the cache before the error attribute was set on the before-image. In this case the peer record will not be found in memory and the persistence layer must fetch it back from the database. It will be altered (and the CHANGED flag set) but not stored in any buffer, just in the current session's cache. At this point the testcase seems similar to the one above.

#7 Updated by Constantin Asofiei almost 3 years ago

Sorry, I had a message tt1.f1 buffer tt1:error-string. in the first program, which confused you; it should not have been there.

I've edited the post.

#8 Updated by Ovidiu Maxiniuc almost 3 years ago

Indeed, the extra line caused the testcase to work correctly: it hid the flaw by reloading the CHANGED record into a buffer, so everything appeared to work fine for me.

The problem is that we are working on two distinct levels. The first is a high level, where records are stored in buffers and, consequently, in the session cache. This is the closest to 4GL record management. The second is a low level, where we do some specific optimizations directly at the SQL level. Normally these two must be kept in sync; otherwise the low-level SQL operations will fail to see pending changes in buffers and the session cache, or, the other way around, the system may process stale cached data while the persisted data was altered by low-level SQL statements.

This issue is of the former type: the special before-table attributes are written to the after-image but not necessarily saved. If the record is stored in a buffer, we do not need to do anything, because when the record is released the system will do the flush automatically and correctly. In fact, we must NOT do this except at the end of a transaction, otherwise we get a potentially unwanted data leak. However, these attributes are only encountered for temp-tables. In this case I think it is safe to force the record to be flushed if its CHANGED flag was not already set, regardless of whether the record is loaded in any buffer.

There is a mirrored case: when the attributes are set on the after-image and must be copied to the before-image, if any. The handling is identical; practically, the operations are done via the symmetrical peer field.

Eric, do you find any flaw with this reasoning?

#9 Updated by Eric Faulhaber almost 3 years ago

If the issue is that the bulk copy does not pick up changes in process in related buffers, can we use the ChangeBroker to notify of the imminent "bulk copy", so that buffers for the same backing table know to flush their pending changes before the copy? Listeners would perform a validate/flush. Or is that too abstract (i.e., if we know the specific buffers which require the flush)?

#10 Updated by Eric Faulhaber almost 3 years ago

Or is the issue that there is a version of the record which represents a "snapshot" of the state (i.e., not the current state), and we don't want this version associated with the session or the cache? Or something else?

#11 Updated by Constantin Asofiei almost 3 years ago

Eric Faulhaber wrote:

If the issue is that the bulk copy does not pick up changes in process in related buffers, can we use the ChangeBroker to notify of the imminent "bulk copy", so that buffers for the same backing table know to flush their pending changes before the copy? Listeners would perform a validate/flush. Or is that too abstract (i.e., if we know the specific buffers which require the flush)?

The issue is that the peer record is not loaded in any buffer at all when the error-string is changed, so we update a record which ends up only in the session cache and is never loaded in a buffer (to be flushed/committed at some point).

Next, we have fast-copy usage, which performs SQL-level operations (insert into ... select from) that completely bypass the session-level cache. So the changed, cached record, which exists in the session cache and is not loaded in any buffer, will remain there and will not be copied by the SQL operation to the target table.

I think a better phrasing may be this: the session cache must not end up with changed records which are not loaded in a buffer. Or, if it does, SQL-level operations need to be aware of such records and flush them first.
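
The second option (SQL-level operations flushing such records first) could be sketched like this (hypothetical types; a real implementation would hook into FWD's fast-copy path):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PreFlushDemo
{
   static class Rec
   {
      int id;
      String errorString;
      boolean changed;
      boolean inBuffer;

      Rec(int id) { this.id = id; }
   }

   // before a bulk SQL operation, flush cached records that are CHANGED
   // but not loaded in any buffer, so the SQL statement sees current data
   static void flushUnbuffered(List<Rec> cache, Map<Integer, Rec> database)
   {
      for (Rec r : cache)
      {
         if (r.changed && !r.inBuffer)
         {
            database.put(r.id, r);
            r.changed = false;
         }
      }
   }

   // fast-copy stand-in: "insert into ... select from" reads only persisted rows
   static Map<Integer, Rec> fastCopy(Map<Integer, Rec> database)
   {
      return new HashMap<>(database);
   }

   static String demo()
   {
      Map<Integer, Rec> database = new HashMap<>();
      List<Rec> cache = new ArrayList<>();

      Rec r = new Rec(10);
      database.put(10, new Rec(10));      // stale flushed copy, no error-string
      r.errorString = "b";
      r.changed = true;                   // changed only in the cache, no buffer
      cache.add(r);

      flushUnbuffered(cache, database);   // without this call, the copy misses "b"
      Map<Integer, Rec> target = fastCopy(database);
      return target.get(10).errorString;
   }

   public static void main(String[] args)
   {
      System.out.println(demo());         // b
   }
}
```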

I think the inverse is possible in FWD, too: SQL-level operations update a record directly, bypassing any version of it in the session cache (whether loaded in a buffer or not).

#12 Updated by Eric Faulhaber almost 3 years ago

Constantin Asofiei wrote:

Eric Faulhaber wrote:

If the issue is that the bulk copy does not pick up changes in process in related buffers, can we use the ChangeBroker to notify of the imminent "bulk copy", so that buffers for the same backing table know to flush their pending changes before the copy? Listeners would perform a validate/flush. Or is that too abstract (i.e., if we know the specific buffers which require the flush)?

The issue is that the peer record is not loaded in any buffer at all when the error-string is changed, so we update a record which ends up only in the session cache and is never loaded in a buffer (to be flushed/committed at some point).

Is the peer record supposed to follow the usual validation/flushing rules, which are based on index updates and/or (normally) release from a buffer? If not in a buffer, how is the peer record updated? We have a lot of logic in the invocation handler, Record, and BaseRecord for updates, which should not be bypassed if this needs to be treated like a "normal" record.

Next, we have fast-copy usage, which performs SQL-level operations (insert into ... select from) that completely bypass the session-level cache. So the changed, cached record, which exists in the session cache and is not loaded in any buffer, will remain there and will not be copied by the SQL operation to the target table.

I think a better phrasing may be this: the session cache must not end up with changed records which are not loaded in a buffer. Or, if it does, SQL-level operations need to be aware of such records and flush them first.

Yes, any records in the session cache must be subject to the validation and flushing logic, so they do not fall out of sync with the database when they need to be retrieved from it, for whatever purpose (query, copy, etc.).

I think the inverse is possible in FWD, too: SQL-level operations update a record directly, bypassing any version of it in the session cache (whether loaded in a buffer or not).

The intended design was that "normal" records would not be updated outside the invocation handler and Record/BaseRecord architecture. We made an exception for fast copy, but this was meant to copy the entire state of a record, not to create a different version of the same record. We should not be updating normal records through low-level SQL, as this bypasses validation, flushing, undo handling...

#13 Updated by Constantin Asofiei almost 3 years ago

Eric Faulhaber wrote:

Is the peer record supposed to follow the usual validation/flushing rules, which are based on index updates and/or (normally) release from a buffer? If not in a buffer, how is the peer record updated? We have a lot of logic in the invocation handler, Record, and BaseRecord for updates, which should not be bypassed if this needs to be treated like a "normal" record.

Our problem here is that the ERROR-STRING is kept in sync between the after and before record (you set it on one record, and the other record is updated automatically, regardless of whether it is loaded in a buffer or not).

In FWD, this 'other record' is the peer, which, when not loaded in a buffer, remains in the session cache; the change is not flushed, and fast-copy will not see it.

I think the inverse may be possible, too, in FWD: sql-level operations update a record directly, bypassing any version of it in the session cache (loaded in a buffer or not).

The intended design was that "normal" records would not be updated outside the invocation handler and Record/BaseRecord architecture. We made an exception for fast copy, but this was meant to copy the entire state of a record, not to create a different version of the same record. We should not be updating normal records through low-level SQL, as this bypasses validation, flushing, undo handling...

Yes, I understand this. But this is a special case where we really need to update the peer, regardless of whether it is loaded in a buffer or not. This applies only to ERROR-STRING and some other fields (in other words, fields which are part of the OE runtime, and not of the application schema).

From what I can find, Persistence.executeSQL is used for UPDATE in only one place: see BufferImpl.acceptChanges, which runs an SQL UPDATE to "execute a simple query to update the [_rowState], [_originRowid] and [_peerRowid] fields". This may be problematic, too, and should be reviewed.

Besides fast-copy, there is TemporaryBuffer.removeRecords, which executes an SQL DELETE; I can't tell what happens if records that were deleted are still in the session cache.

#14 Updated by Eric Faulhaber almost 3 years ago

Constantin Asofiei wrote:

Eric Faulhaber wrote:

The intended design was that "normal" records would not be updated outside the invocation handler and Record/BaseRecord architecture. We made an exception for fast copy, but this was meant to copy the entire state of a record, not to create a different version of the same record. We should not be updating normal records through low-level SQL, as this bypasses validation, flushing, undo handling...

Yes, I understand this. But this is a special case where we really need to update the peer, regardless if is loaded in a buffer or not. This is only for ERROR-STRING and some other fields (in other words, fields which are part of the OE runtime, and not from the application schema).

OK, then in this case (and any other which bypasses the normal/legacy validation/flush processing), we will have to take special care to ensure the record is flushed before any other SQL runs which needs the database to be in sync with the state of the record in the runtime.

From what I can find, Persistence.executeSQL is used only in one place for UPDATE, see BufferImpl.acceptChanges which does an SQL UPDATE to "execute a simple query to update the [_rowState], [_originRowid] and [_peerRowid] fields". This may be problematic, too, and should be reviewed.

Beside fast-copy, there is TemporaryBuffer.removeRecords which executes a SQL DELETE - I can't tell what happens if there are records which were deleted still in the session cache.

I'll look at these, thanks.

#15 Updated by Ovidiu Maxiniuc almost 3 years ago

I changed the code which sets these attributes like this:

+           // check if the peer record was changed since it was saved or fetched from database
+           boolean noSave = peerRecord.checkState(DmoState.NEW) ||
+                            peerRecord.checkState(DmoState.CHANGED);

            TempRecord peerSnapshot = (TempRecord) peerRecord.deepCopy();
            updater.update(peerRecord);
            peerRecord.updateState(null, DmoState.CHANGED, true);

            // reusing props
            changeBroker.stateChanged(RecordChangeEvent.Type.UPDATE,
                                      peerBuffer,
                                      peerRecord,
                                      peerSnapshot,
                                      props);
+           if (!noSave)
+           {
+              // now the record is evidently changed, must flush it back
+              session.save(peerRecord, true);
+           }

Instead of checking whether the record is loaded in a buffer, the NEW and CHANGED flags are tested. If the record is transient or was changed relative to the version in the database, it must be loaded in a buffer, so the save is postponed until the buffer is released. If the record was 'pristine', I force the save operation directly. These attributes are not related to any index, so validation is not a problem. If the record is not loaded in a buffer, once it is saved the low-level operations will work correctly, and the record will be loaded with the change already applied when it is needed in a buffer. If the record is stored in a buffer, it will also be updated in the database, and the buffer will still hold the up-to-date instance of the record, even though it was not actually informed about the event.

With this change, the testcase from #5625-5 works as expected.

#16 Updated by Ovidiu Maxiniuc almost 3 years ago

The above changes were committed in revision 12898/3821c, after testing them for a few days in my local environment.
They address only a particular case but, hopefully, will help Constantin advance with his task.

#17 Updated by Constantin Asofiei almost 3 years ago

Unfortunately this makes things worse: try the test without a full transaction present and you will get an "Attempting DML which cannot be committed!" error for session.save(peerRecord, true);.

#18 Updated by Ovidiu Maxiniuc almost 3 years ago

This happened because there was no session / transaction open at that moment. I created a short-lived micro-transaction and now the save operation works fine.
See r12902.
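
The micro-transaction idea can be sketched as follows (a toy Session class, not FWD's; the real r12902 change lives in the persistence layer):

```java
public class MicroTxDemo
{
   static class Session
   {
      boolean txActive;
      boolean saved;

      void save(Object rec)
      {
         if (!txActive)
         {
            throw new IllegalStateException("Attempting DML which cannot be committed!");
         }
         saved = true;
      }
   }

   // open a short-lived transaction only when none is active, then save
   static void saveWithMicroTx(Session session, Object rec)
   {
      boolean ownTx = !session.txActive;
      if (ownTx)
      {
         session.txActive = true;         // begin the micro-transaction
      }
      try
      {
         session.save(rec);
      }
      finally
      {
         if (ownTx)
         {
            session.txActive = false;     // commit / close it
         }
      }
   }

   static boolean demo()
   {
      Session s = new Session();
      saveWithMicroTx(s, new Object());   // succeeds even with no open transaction
      return s.saved;
   }

   public static void main(String[] args)
   {
      System.out.println(demo());         // true
   }
}
```

If a transaction is already active, the wrapper leaves it untouched and the save participates in it as before.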

#19 Updated by Constantin Asofiei almost 3 years ago

Ovidiu Maxiniuc wrote:

This happened because there was no session / transaction open at that moment. I created a short-lived micro-transaction and now the save operation works fine.
See r12902.

Yes, this solves the issue in the customer application.

#20 Updated by Greg Shah almost 3 years ago

Is there any work remaining in this task?

#21 Updated by Greg Shah almost 3 years ago

Can we close this?

#22 Updated by Constantin Asofiei almost 3 years ago

The original issue is fixed. But I can't tell for sure what happens with the fix in #5625-15 if the record is just loaded in a buffer and not marked CHANGED or NEW; this will force a 'save' without going through the validation/index-check logic. The other unknown is what happens with a NO-UNDO table if the change is rolled back.

Also, the notes in #5625-13 should be investigated.

#23 Updated by Ovidiu Maxiniuc almost 3 years ago

Constantin Asofiei wrote:

The original issue is fixed. But I can't tell for sure what happens with the fix in #5625-15 if the record is just loaded in a buffer and not marked CHANGED or NEW; this will force a 'save' without going through the validation/index-check logic.

This should be fine. There is nothing to validate: there is no index built on the before-attribute fields and none of them is mandatory.

The other unknown is what happens with a NO-UNDO table if the change is rolled back.

I have just done a test in this regard. The results show it works fine.

Also, the notes in #5625-13 should be investigated.

I did not test those. They need more elaborate testcases than minor adaptations of the current code.
