Bug #6812
fix FastFindCache to use a shared instance per persistent database
90%
History
#2 Updated by Constantin Asofiei over 1 year ago
RecordIdentifier
in FastFindCache
having the table identifier a simple string with the record's implementation class. This is not correct - you can have the same table in multiple databases (think meta table, like _file
), and the result will be incorrect:
- run a FIND on
_file
for a database where no record exists; this gets cached - run a FIND on
_file
for a database where a record does exist - FWD will incorrectly use the cached value for the previous FIND, as the cache has no knowledge about the record's database.
This needs to be fixed in 6129a.
#3 Updated by Eric Faulhaber over 1 year ago
Does it make more sense to qualify the table name with database name, or to have a separate cache for each database?
#4 Updated by Constantin Asofiei over 1 year ago
Eric Faulhaber wrote:
... or to have a separate cache for each database?
Good point. It needs to be per database and get invalidated if a database disconnects.
#5 Updated by Constantin Asofiei over 1 year ago
Eric, the FastFindCache looks like is only for temp-tables, and not other databases (like meta). So my assumption is incorrect...
The customer reported a test like this:
def var tname as char. def var fname as char. tname = "book". fname = "isbn". find fwd._file where fwd._file._file-name = tname no-lock no-error. if available (fwd._file) then do: find fwd._field of fwd._file where fwd._field._field-name = fname no-lock no-error. if available fwd._field then message "1 found" fwd._field._label. else message "1 not found". end. else message "1 not found". find fwd._field where fwd._field._field-name = fname no-lock no-error. if available fwd._field then message "2 found" fwd._field._label. else message "2 not found".
where in the first case the
find fwd._field of fwd._file where fwd._field._field-name = fname no-lock no-error.
sometimes returns a record and sometimes not. And sometimes even _file
meta table does not find a record.
My assumption was that this being intermitent, the FastFindCache may be at fault... but this is only for temp-tables.
Do you have any idea of any other kind of cache which would interfere with this? In customer's case, the database is referenced via an alias, and the program is called for different databases.
#6 Updated by Constantin Asofiei over 1 year ago
Constantin Asofiei wrote:
Eric, the FastFindCache looks like is only for temp-tables, and not other databases (like meta). So my assumption is incorrect...
I take this back, I read the code wrong - FastFindCache is not being used for meta databases, in RAQ.initialize
:
if (!dmoMeta.isMeta() && index != 0) { this.ffCache = FastFindCache.getInstance(buffer.isTemporary()); }
#7 Updated by Greg Shah over 1 year ago
FastFindCache looks like is only for temp-tables
It isn't used for permanent databases?
#8 Updated by Ovidiu Maxiniuc over 1 year ago
The factory makes sure the permanent databases use a commonly shared instance of FastFindCache
while for the temp-tables a context-local is used.
There should be no problems with temp-tables instances. Maybe we should do the same for the _meta?
#9 Updated by Constantin Asofiei over 1 year ago
FastFindCache is now created for both temp-table and permanent databases. The problem is that for permanent (or meta, if we enable it) databases, the cache does not differentiate between tables with the same structure from different databases.
Regarding the original test in note #6812-5 - I don't see how FastFindCache can be involved, as this is not enabled for meta tables. So, does anyone have any idea what other cache would be used in this test?
#10 Updated by Eric Faulhaber over 1 year ago
Ovidiu Maxiniuc wrote:
The factory makes sure the permanent databases use a commonly shared instance of
FastFindCache
while for the temp-tables a context-local is used.There should be no problems with temp-tables instances. Maybe we should do the same for the _meta?
Igor explicitly disabled FFC for _meta
due to some problematic behavior. I wonder now whether that behavior essentially was this defect. Igor, do you recall what the issue was that led you to disable the cache for metadata tables?
#11 Updated by Eric Faulhaber over 1 year ago
- Subject changed from fix FastFindCache to use the database.table as key instead of just table to fix FastFindCache to use a shared instance per persistent database
#12 Updated by Eric Faulhaber about 1 year ago
- Assignee set to Boris Schegolev
Boris, please look at the FastFindCache
implementation and ask any questions you may have as it pertains to this task.
#13 Updated by Boris Schegolev about 1 year ago
- File 6812-fast-find-cache.diff added
I am attaching a suggested solution. It's somewhat raw (I had to make some elements public), but I can fix that for the final version. For now I just need to know if that approach makes sense at all and if it potentially solves the problem described. Please, let me know. Thank you!
#14 Updated by Eric Faulhaber about 1 year ago
Please create a 6812a branch from trunk and apply the diff. This will make it easier to review in context.
I've only done a cursory review so far, but please use the Database
object instead of the database name. The database name will be the same in multi-tenant (database-per-tenant) situations.
#15 Updated by Boris Schegolev about 1 year ago
I pushed the patch as revision 6812a/14485, working on better implementation (Database
object as key, cleanup, etc.)
#16 Updated by Boris Schegolev about 1 year ago
- Status changed from New to WIP
I converted the code to use Database
object directly and reverted changes I made initially to visibility modifiers. See revision 6812a/14486.
I just need to figure out one call in SavepointManager
. I didn't find any suitable object here that would give me the Database
.
#17 Updated by Boris Schegolev about 1 year ago
- Status changed from WIP to Review
- % Done changed from 0 to 90
I prepared a change for the last missing bit (SavepointManager
) - see revision 6812a/14487. It needs more testing, but the code can reviewed already.
#18 Updated by Eric Faulhaber about 1 year ago
Code review 6812a/14487:
Sorry for the delay in reviewing. A lot of changes have gone in to trunk in the meantime; please rebase. Hopefully, this doesn't impact your changes too much.
Please:
- add missing files header entries;
- add missing javadoc for any new constructs;
- update the
FastFindCache
class javadoc to describe the organization of cache instances by permanent and temporary databases.
I don't think we need to store a reference to a Database
instance in each SavepointManager$Block
instance. Instead, why not just pass the Session
's Database
instance into SavepointManager$Block.invalidateFastFind
from SavepointManager.rollback
? This is the only place the Database
is needed.
#21 Updated by Greg Shah 12 months ago
I'll let Eric review this. In the future, please do not create a new branch instead of rebasing. That typically loses all continuity with the work from the branch (it is "flattened" into a single revision and it doesn't map to any discussion in the task). Rebsing is quite easy to do. Use it.
#22 Updated by Boris Schegolev 12 months ago
Greg Shah wrote:
I'll let Eric review this. In the future, please do not create a new branch instead of rebasing. That typically loses all continuity with the work from the branch (it is "flattened" into a single revision and it doesn't map to any discussion in the task). Rebsing is quite easy to do. Use it.
I understand the consequences and that it's easy to do (usually). It didn't work out in my case and it was just an implementation commit + review fixes, I thought it would be more efficient this way.
#23 Updated by Eric Faulhaber 12 months ago
Code review 6812a/14567:
Looks good. I did a little format cleanup and committed to rev 14568. I did not update the file headers for these minor edits.
Since trunk has moved ahead today, another rebase is needed.
What testing has been done? The changes are pretty straightforward, but this code is hit very often and I don't like committing anything to trunk without some regression testing.
#24 Updated by Boris Schegolev 12 months ago
Eric Faulhaber wrote:
Since trunk has moved ahead today, another rebase is needed.
Rebased, conflicts resolved, headers updated. Revision is 6812b/14573.
What testing has been done? The changes are pretty straightforward, but this code is hit very often and I don't like committing anything to trunk without some regression testing.
Just testing and debugging on a running application. I haven't done any load testing to validate performance and concurrency. On the other hand, the code uses ConcurrentHashMap with atomic operations, so there should be no issues here.
#26 Updated by Eric Faulhaber 5 months ago
- Status changed from Review to Internal Test
Constantin, does it make sense to regression test this with the large POC?