Project

General

Profile

Bug #5912

improve FastFindCache memory consumption

Added by Constantin Asofiei over 2 years ago. Updated over 2 years ago.

Status:
Closed
Priority:
Normal
Target version:
-
Start date:
Due date:
% Done:

100%

billable:
No
vendor_id:
GCD
case_num:
version_reported:
version_resolved:

History

#1 Updated by Constantin Asofiei over 2 years ago

The cache in FastFindCache is a basic map, which keeps accumulating; I think some invalidation cases were missed.

For example, once a temp-table is deleted (static or dynamic), all cache should be dropped for it. I don't see code related to this in FWD. This may be a memory leak.

#2 Updated by Constantin Asofiei over 2 years ago

  • Start date deleted (12/21/2021)

Ovidiu, I think we need to call ffCache.invalidate(getDmoInfo().getId(), multiplexID); in case of temp-tables, in doCloseMultiplexScope().

#3 Updated by Constantin Asofiei over 2 years ago

Another problem is reverseLookup - the average cache size is 10k, and the size is the same as the modCount value. How should this be invalidated? As it doesn't get garbage collected.

#4 Updated by Eric Faulhaber over 2 years ago

Constantin Asofiei wrote:

Another problem is reverseLookup - the average cache size is 10k, and the size is the same as the modCount value. How should this be invalidated? As it doesn't get garbage collected.

See the "important implementation note" in the ReverseLookup class javadoc. As long as the primary cache entries are invalidated, the reverse cache will clean itself when memory is needed, as it is a WeakHashMap. I followed the logic through for a number of use cases in the debugger when developing this feature and it worked. But if we have a leak with the primary cache, the ReverseLookup.cache keys still will be strongly referenced. If we fix the leak you found with the primary cache, this should resolve itself.

#5 Updated by Constantin Asofiei over 2 years ago

Eric Faulhaber wrote:

Constantin Asofiei wrote:

Another problem is reverseLookup - the average cache size is 10k, and the size is the same as the modCount value. How should this be invalidated? As it doesn't get garbage collected.

See the "important implementation note" in the ReverseLookup class javadoc. As long as the primary cache entries are invalidated, the reverse cache will clean itself when memory is needed, as it is a WeakHashMap.

Hmm... this finding was with the fix in note 2 applied. FastFindCache.cache.remove(combine(dmoUid, multiplex) does not affect the l3cache, it does not remove entries from there. The RecordIdentifier instance remains referenced by reverseLookup (as a weak key) and the l3cache (as a strong reference).

Maybe in your tests you filled the l3cache so entries get removed from it, and only the key in reverseLookup remained?

#6 Updated by Eric Faulhaber over 2 years ago

Constantin Asofiei wrote:

The RecordIdentifier instance remains referenced by reverseLookup (as a weak key) and the l3cache (as a strong reference).

Hm, I must have misunderstood what I was seeing in the debugger, then.

In the ReverseLookup.cache, the map keys (RecordIdentifier) are weakly referenced, but I suppose the fact that the map values have strong reference chains to the canonical RecordIdentifier objects via the L3 caches in the Node lists) is problematic. I think Node.l3Cache needs to be reimplemented as WeakReference<Cache<L2Key, RecordIdentifier<String>>>. This should break the strong reference chain from the canonical RecordIdentifier to the GC roots. We will need to add a null check in ReverseLookup.invalidate(BitSet) to avoid NPE, if the weakly referenced L3 cache has been garbage collected in the meantime:

// remove the entry from the L3 cache
Cache<L2Key, RecordIdentifier<String>> l3 = node.l3Cache.get();
if (l3 != null)
{
   l3.remove(node.l2Key);
}

Alternatively, we could use a ReferenceQueue to remove nodes whose L3 caches have been GC'd, but I think that would complicate it more. I hate to make it this complicated already, but the whole point of the FFC is to speed things up, and if we were to clean up the cache more directly (i.e., w/o ReverseCache) as we were doing before, we would lose a lot of the performance gain.

Please consider this idea and let me know if you see a logic flaw.

#7 Updated by Eric Faulhaber over 2 years ago

Constantin, have you tested my proposed change to see if it resolves the leak?

#8 Updated by Constantin Asofiei over 2 years ago

Eric Faulhaber wrote:

Constantin, have you tested my proposed change to see if it resolves the leak?

I haven't yet, I'm doing this now.

#9 Updated by Constantin Asofiei over 2 years ago

  • Status changed from New to WIP
  • Assignee set to Constantin Asofiei
  • % Done changed from 0 to 100

So the fix works, no more leak at FastFindCache. Added to 3821c/13337.

#10 Updated by Eric Faulhaber over 2 years ago

  • Status changed from WIP to Closed

Also available in: Atom PDF