Project

General

Profile

Bug #7132

Eliminate for-each loops and lambda in FWD-H2 code

Added by Alexandru Lungu over 1 year ago. Updated over 1 year ago.

Status:
WIP
Priority:
Normal
Assignee:
Target version:
-
Start date:
Due date:
% Done:

50%

billable:
No
vendor_id:
GCD
case_num:
version_reported:
version_resolved:

History

#1 Updated by Alexandru Lungu over 1 year ago

There is an overhead in using foreach loops: for (Object obj : objs). This is because an iterator object is generated in the background. This object is defined in the heap-memory, so the memory allocation and GC overhead affect performance. Consider replacing all for-each loops with classical for loops: for (int i = 0; i < objs.size(); i++) where possible (List or Java arrays). This should give us some modest, yet honest performance boost. Please address this right away!

Second, there is an overhead with using lambdas, especially the ones which reference local variables and arguments. Please report with a list of lambda from the FWD-H2 code and review their use one-by-one. If there are tens of lambda in the FWD-H2 code, we should change the strategy (maybe group them by use). Please address this as soon as possible

Lastly, I think there may be an overhead by using synchronized even in single-user H2 instances. Note that the default now is private temp-tables with LOCK-MODE 0: every single H2 instance is private to the user, so it is single-user. However, the H2 code is written for multi-user environments, so there are a lot of synchronized blocks (on session and database). Note that LOCK-MODE 0 just avoids doing the table logical locking using Java Locks. For this matter, consider adapting the FWD-H2 code to honor LOCK-MODE even for database and session synchronizing - maybe replace synchronized blocks with Java locks? Before doing that, please check that there are no daemon threads which analyze the H2 structure or something that really depends on having the session or database synchronized beyond the obvious multi-user environments. Please consider this with care, as we don't want to have concurrency regressions as they are hard to spot

#2 Updated by Radu Apetrii over 1 year ago

Alexandru Lungu wrote:

Second, there is an overhead with using lambdas, especially the ones which reference local variables and arguments. Please report with a list of lambda from the FWD-H2 code and review their use one-by-one. If there are tens of lambda in the FWD-H2 code, we should change the strategy (maybe group them by use). Please address this as soon as possible

I've gone through the H2 code and I did not find any lambdas. I think this point can be considered done.

#3 Updated by Alexandru Lungu over 1 year ago

Created 7132a_h2 from trunk rev. 11. Committed rev. 12 including lots of replacements from for-each loops to classical for loops. I am planning to do a rev. 13 ASAP including other for-each loops rewritting and profile 7132a_h2.

#4 Updated by Alexandru Lungu over 1 year ago

Committed rev. 13 finishing replacements from for-each loops to classical for loops. I've done some tests, but there is no visible difference (in 3 test rounds: 1 shown +0.2%, 1 shown a +0.1%, 1 shown -0.1%). Unfortunately, this is not a consistent improvement. I will keep 7132a_h2 for reference for now. I will move on with the synchronization part.

#5 Updated by Radu Apetrii over 1 year ago

Alexandru Lungu wrote:

Lastly, I think there may be an overhead by using synchronized even in single-user H2 instances. Note that the default now is private temp-tables with LOCK-MODE 0: every single H2 instance is private to the user, so it is single-user. However, the H2 code is written for multi-user environments, so there are a lot of synchronized blocks (on session and database). Note that LOCK-MODE 0 just avoids doing the table logical locking using Java Locks. For this matter, consider adapting the FWD-H2 code to honor LOCK-MODE even for database and session synchronizing - maybe replace synchronized blocks with Java locks? Before doing that, please check that there are no daemon threads which analyze the H2 structure or something that really depends on having the session or database synchronized beyond the obvious multi-user environments. Please consider this with care, as we don't want to have concurrency regressions as they are hard to spot

I've committed some changes to 7132a_h2, rev. 14. I've added some checks to potentially remove the synchronized keyword for some blocks. Mainly, if LOCK_MODE is set to 0, then synchronized should not be needed.

I've tested these changes with ./build.sh testFWD and a large customer application and got no errors.

Also, as a side note, I did not manage to rewrite all the synchronized blocks in H2 since there are quite a lot. I've tried to focus on the more frequent ones.

#6 Updated by Alexandru Lungu over 1 year ago

  • Status changed from New to WIP
  • % Done changed from 0 to 50

This is a review for 7132a_h2, rev. 14

There are many changes in several files, so I've done a skim review of the technique you used. I am conceptually OK with the changes. However, I have some non-deterministic regressions on a large customer application (some POC runs are OK, some crash due to a FWD persistence issue and some are showing an warning regarding H2 transactions).

It is hard to detect the issue; most probably it is due to a race condition, which alters the H2 state explaining the non-deterministic behavior. I will like to drop some of the changes for now to focus only on a subset of classes which are really stressed by FWD:
  • Set: it is used only at the very beginning to usually set some configurations. Although it can be used at the run-time, it is not that common. We can drop the changes here for the moment.
  • Command: utterly relevant. Please double check the changes here!
  • Parser: even if it is widely use, you changes only target CTE views (especially recursive). These are very rare in FWD, so we can drop the changes.
  • ConstraintCheck: there is only a small change here and it look alright. Please avoid wildcard import
  • Database: relevant, but has many changes. Maybe we can reduce the number of synchronizations here and keep only for releaseDatabaseObjectIds, addSchemaObject, addDatabaseObject, allocateObjectId, getTempTableName, commit, flush and sync
  • FunctionAlias: slim number of uses; drop it for now
  • Session relevant and has only some changes. Please double-check.
  • PageBtreeIndex and PageDataLeaf AFAIK these are only for persistent databases. We won't ever use LOCK_MODE 0 for such databases, so drop them.
  • PageStore relevant and has only some changes. Please double-check.
  • Schema relevant. Please double-check.
  • Column changes are not oftenly hit. We can drop them for safety.
  • Recover, TableLink and TableView can be dropped

The whole CompareMode lack of synchronization mechanism is quite complex. Double-check that for temp-tables there is only one compare-mode per database/session/user and that is not synchronized. Check that dirty and meta still have a synchronized cache.

Please make a new commit which reverts the changes for that less used classes. I would like to test such an intermediate solution and guarantee that we have a subset of changes which are stable enough. At that point we can do baby steps to integrate your other changes.

#7 Updated by Radu Apetrii over 1 year ago

Alexandru Lungu wrote:

Please make a new commit which reverts the changes for that less used classes. I would like to test such an intermediate solution and guarantee that we have a subset of changes which are stable enough. At that point we can do baby steps to integrate your other changes.

I've considered what you said and I reverted some of the changes. Also, I've dealt with those wildcard imports.
The new revision is on 7132a_h2, rev. 15.

#8 Updated by Alexandru Lungu over 1 year ago

Done several testing iterations with 7132a_h2/rev. 15 and the issue is solved. The bad news are that I couldn't notice any consistent performance improvement (<0.1%). In this state, 7132a_h2 is stable enough to be merged, but we may consider some further optimizations to make it worth it.

Also available in: Atom PDF