Project

General

Profile

Bug #7167

Associating records from opened buffers to new sessions is slow

Added by Alexandru Lungu about 1 year ago. Updated 4 months ago.

Status:
Closed
Priority:
Normal
Target version:
-
Start date:
Due date:
% Done:

100%

billable:
No
vendor_id:
GCD
case_num:
version:

Related issues

Related to Database - Feature #4021: better use of session-related resources such as PreparedStatements and caches New
Related to Database - Bug #7334: Reclaim used sessions to improve performance Test
Related to Database - Bug #8196: Reduce number of AbstractTempTable._hasRecords calls to avoid BufferManager.activeBuffers Test

History

#1 Updated by Alexandru Lungu about 1 year ago

There are several moments when Persistence$Context.getSession is used (usually by query execution, persistence record loading or transaction starting).

The performance hit was already noticed in #5919-59 and #5237-5, but FWD still faces today a full BufferManager.activeBuffers iteration to retrieve the records which are not associated with a new session. In a large customer application POC:
  • ~1.700 sessions are created
  • ~40.000 active buffers actually have a record which should be associated with the new session
  • ~250.000 active buffers don't have a record at all, so no need to associate
  • ~2.250.000 active buffers are not matching the persistence (e.g. are temporary buffers and the session is for a permanent database)
For the application I am testing, this means:
  • 98% of the active buffers are iterated, but not associated.
  • on average, there are ~1500 active buffers on a session creation
  • on average, 23 buffers should be found and associated
  • on average, 147 buffers are empty
  • on average, 1323 buffers are not for the right persistence

Having 1.700 sessions closed and re-created may be a problem in the first place (also noted in #5919-59), as we are losing record caches. Note that the session to the temp database has a very long lifespan (having a volatile nature). This issue mostly refers to the sessions over the persistence database.

#2 Updated by Alexandru Lungu about 1 year ago

  • Related to Feature #4021: better use of session-related resources such as PreparedStatements and caches added

#3 Updated by Alexandru Lungu about 1 year ago

After a bit of investigation, the database sessions com.goldencode.p2j.persist.orm.Session are closed quite eagerly. Also, the ones closed are mostly sessions for the persistent database. Each time the session count reaches 0, the session is closed. A session close also clears the cache and closes the underlying connection.

Eric, I see #4021 is something near this problem, implying that the session shouldn't be closed that often. This way, we shouldn't address the BufferManager.activeBuffers iteration bottleneck, but the session opening/closing issue. I will test if FWD is doing better now without closing sessions, comparing to your empirical test in #4021.

#4 Updated by Alexandru Lungu 10 months ago

  • Status changed from New to WIP
  • Assignee set to Dănuț Filimon

This is basically fixed by the reclaimable session Danut implemented recently in #7334. This is in trunk now.

#5 Updated by Eric Faulhaber 6 months ago

Can this task be closed?

#6 Updated by Alexandru Lungu 6 months ago

Eric, in theory, this can be closed. However, I am waiting for some performance tests to check that BufferManager.activeBuffers is no longer a problem. Reaching you back asap.

#7 Updated by Alexandru Lungu 6 months ago

  • % Done changed from 0 to 100
  • Status changed from WIP to Internal Test

With some slim analysis, I still see BufferManager.activeBuffers in top 50 performance items (being called ~100 times in a POC). Most probably, we need to increase the session reclaiming time. By default, the reclaiming time is 1s. I will increase the time to 2s / 5s / 10s and see if there are any major changes. In theory, this can be configurable from directory.xml, so there is nothing more to be worked here per-se.

Still (I think this was in another task I will search), we need to see if we can reclaim Sessions but switching connections to avoid starving.

#8 Updated by Alexandru Lungu 6 months ago

Danut, please re-test the large POC and see if why we are still having spikes on BufferManager.activeBuffers. Please attempt to check with 10s of session lifespan first. After, do some debugging and understand why are we still expiring sessions - are they so sparsely used?

#9 Updated by Alexandru Lungu 6 months ago

Also, please remake the statistic on #7167-1 now with 1s of lifespan vs 10s of lifespan.

#10 Updated by Dănuț Filimon 6 months ago

Alexandru Lungu wrote:

Danut, please re-test the large POC and see if why we are still having spikes on BufferManager.activeBuffers. Please attempt to check with 10s of session lifespan first. After, do some debugging and understand why are we still expiring sessions - are they so sparsely used?

I retested the POC and got the following results for the session lifespan:
1s lifespan (ms) 10s lifespan (ms) Difference (%)
average of last 20 runs 16200.6 15305.4 -5.525%
total average 15799 14789.8 -6.387%

EDIT: The results of the tests are the average of 5 cold runs.

Also, please remake the statistic on #7167-1 now with 1s of lifespan vs 10s of lifespan.

I also retested the scenario for the case when the reclaiming is disabled and obtained:
Counts Reclaiming disabled 1s lifespan 10s lifespan
No matching persistence 285708435 5874605 3564224
No records to associate 31051569 26227813 26226169
Records associated 2517852 614027 613197
Sessions created 124912 1127 89
Sessions reclaimed 0 123767 124805

#11 Updated by Alexandru Lungu 4 months ago

Danut, I don't think that the shift to 10s looks OK with these statistics. Mind that we "lock" a connection from the pool while we keep the session "reclaimable", so 1s is enough in the current context to avoid starving. Mind that we already faced starving in some applications with 1s (locking the connection pool). If we are to increase the lifespan to 10s, we need to take care of connection releasing first. I think there is a task for it already - close connection when session enters the "reclaiming" state (please search and relate). We can expect for some performance downgrade from "unpreparing" the statements, but we still have the DMO cache preserved. This is lead 1

It is curious why "No records to associate" and "Records associated" are similar, even if there are ~90% less sessions created. After some investigation, activeBuffers is also used in AbstractTempTable._hasRecords. I don't have a tracker on this path, so please investigate further how often is activeBuffers hit from AbstractTempTable._hasRecords. If you see heavy traffic there, please create a new task on that (as it is not related to this caption: "Associating records from opened buffers to new sessions is slow". This is lead 2

#12 Updated by Dănuț Filimon 4 months ago

I've got the results of how often is activeBuffers hit from AbstractTempTable._hasRecords, and how often it is hit from other sources.
Counts from AbstractTempTable from BufferManager / other
Number of calls 11312 1188
No matching persistence 3534544 2457786
No records to associate 26237166 1974
Records associated 601546 1123

Most of the records are associated through AbstractTempTable._hasRecords, as seen above.

EDIT: The test was done with POC with default session lifespan.

#13 Updated by Dănuț Filimon 4 months ago

  • Related to Bug #7334: Reclaim used sessions to improve performance added

#14 Updated by Dănuț Filimon 4 months ago

  • Related to Bug #8196: Reduce number of AbstractTempTable._hasRecords calls to avoid BufferManager.activeBuffers added

#15 Updated by Eric Faulhaber 4 months ago

  • Status changed from Internal Test to Closed

I understand the work begun here is being continued in #8196.

Also available in: Atom PDF