Project

General

Profile

Feature #7416

Reiterate DatabaseStatistics and leverage it with other database profiling tools

Added by Alexandru Lungu about 1 year ago. Updated 6 months ago.

Status:
Test
Priority:
Normal
Assignee:
Target version:
-
Start date:
Due date:
% Done:

100%

billable:
No
vendor_id:
GCD
version_reported:
version_resolved:

QueryAnalyzer.java (3.42 KB) Radu Apetrii, 08/02/2023 04:18 AM


Related issues

Related to Database - Feature #7960: Allow query logs to be collected in a H2 database Review

History

#1 Updated by Alexandru Lungu about 1 year ago

DatabaseStatistics is a good solution for query profiling on the FWD part. This was implemented before Hibernate and became deprecated in the meantime. It was refactored but still needs some attention. Please check if this can be used with large customer applications now. Please coordinate this with the effort in #6970.

#2 Updated by Alexandru Lungu about 1 year ago

Radu, please make the documentation of the #6970 SQL profiler and the #7416 query profiler (leveraging DatabaseStatistics) your top priority. We will need to have this in place in the coming days so that customers can extract valuable hot-spots for us to analyze.

Note that the first things to consider are:
  • The loggers don't fatally crash, of course.
  • The output is easy to access and analyze. I want to have a system where a customer can simply send us the profiling results. Based on that, we shall be able to track and reproduce slow-downs. Basically, we need times, queries, stack traces and any hints we can use to identify slow-down causes.
  • Ensure that no sensible data (query parameters) is logged in either profiler.
  • Ensure that the FWD query profiler captures cases where a single query (RAQ / CompoundQuery) is iterated several times. Maybe there are hotspots where a CompoundQuery executes several (let's say 1,000) SQL statements, but this won't show in the SQL profiler, because each of the 1,000 statements runs quite fast.
  • Don't introduce unacceptable slow-downs. Make sure you can run a customer application with this setting enabled without spending more than a minute opening a single window.

#3 Updated by Radu Apetrii about 1 year ago

Alexandru Lungu wrote:

  • Ensure that no sensible data (query parameters) is logged in either profiler.
I have been looking into this because something seemed a bit off. Basically, in order to print the statement, I would simply use receiver.toString(), where receiver is the actual statement.
The problem is that receiver.toString() prints the parameters too, no matter which dialect I use (PostgreSQL, MariaDB, H2). PostgreSQL, for example, replaces the values inside the statement, while H2 adds them at the end.
Two sample statement logs on a persistent table test and a temporary table tt that I created look like:
23/06/12 13:51:40.404+0300 | WARNING | null | fwd: Statement:  com.mchange.v2.c3p0.impl.NewProxyPreparedStatement@4f96f7fe [wrapping: insert into test (f1, f2, recid) values (31, 0, 4092)] Batch size: 1; Stack tracing: com.goldencode.dataset.Start.lambda$null$6(Start.java:100)
and
23/06/12 14:40:28.279+0300 | WARNING | null | _temp: Statement:  SqlStmt{prep571: select tt_2_1__im0_.* from tt12 tt_2_1__im0_ where tt_2_1__im0_._multiplex = ? order by tt_2_1__im0_._multiplex asc, tt_2_1__im0_.f1 asc nulls last lazy {1: 1}} Batch size: 1; Stack tracing: com.goldencode.dataset.Start.lambda$null$3(Start.java:235)
There are two possible solutions that come to my mind:
  • (The quicker one) In SQLExecutor and SQLStatementLogger, instead of receiving just the Statement, another parameter can be received: a String that represents the SQL used in the statement. Most of the places that call SQLExecutor.getInstance().execute have access to the SQL, which means that very little code will be added.
  • (The cleaner one) Instead of passing the Statement to the SQLExecutor, the code can be refactored so that only the SQL is given. This means that SQLExecutor will create its own Statement or PreparedStatement and will take care of assigning the parameter values. The bright side is that there are lots of places in which a Statement or a PreparedStatement is being created; with this change a lot of code will be deleted, because there will be a central point, SQLExecutor, that creates the Statements.

Also, in SQLQuery I noticed a few log(() -> "FWD ORM: " + stmt); calls. This worries me slightly because the statements that I'm trying to avoid printing are simply logged at FINE level. The problem is that by logging them, the parameter values will also be printed. Was this done on purpose, or is it perhaps subject to change?

Can someone advise me on this, please? I'm not sure which path to follow.

#4 Updated by Alexandru Lungu about 1 year ago

I have two ideas here:

1. For H2, we always proxy the statements through TempTableDataSourceProvider, so all statements we execute are in fact UnclosablePreparedStatement. You can extend UnclosablePreparedStatement to store the SQL (which is secure) and override toString to print the string which was used for preparing. The String used here is the same as the one used by the cache, so we don't add memory overhead (at worst, the SQL string will outlive the statement a bit).
2. For persistent databases, over which we use c3p0, I have seen the following being reported when a statement is cached multiple times:

com.mchange.v2.c3p0.stmt.GooGooStatementCache | ThreadName:Conversation, Session:00000016, Thread:0000004C, User:bogus | Multiply-cached PreparedStatement:
select   [...]
from     [...]
where    [...] = ? and [...] < '9'
order by [...]
limit    ?

My point is that it uses ?, so it prints the prepared statement. I guess this happens because the log is done before any parameters are set. However, it strikes me that c3p0 can perhaps be configured or used in some way to do this kind of printing for us. Radu, please implement point 1 (this is safe) and do some investigation on point 2.
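Point 1 could be sketched outside FWD with a JDK dynamic proxy standing in for the UnclosablePreparedStatement subclass (SafeStatementProxy and the SafeStmt prefix are made-up names for illustration, not FWD code): keep the SQL text used for preparing and report it from toString, so logs never see driver-rendered parameter values.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.sql.PreparedStatement;

// Illustrative sketch only; the real fix extends UnclosablePreparedStatement.
public class SafeStatementProxy
{
   public static PreparedStatement wrap(PreparedStatement delegate, String sql)
   {
      InvocationHandler handler = (proxy, method, args) ->
      {
         // Intercept toString() only; everything else goes to the driver.
         if ("toString".equals(method.getName()) && method.getParameterCount() == 0)
         {
            return "SafeStmt[" + sql + "]";
         }
         return method.invoke(delegate, args);
      };

      return (PreparedStatement) Proxy.newProxyInstance(
         PreparedStatement.class.getClassLoader(),
         new Class<?>[] { PreparedStatement.class },
         handler);
   }
}
```

The wrapped statement then logs the placeholder form (with ?) no matter what the driver's own toString would render.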

#5 Updated by Alexandru Lungu about 1 year ago

Created 7416a.

#6 Updated by Radu Apetrii about 1 year ago

Alexandru Lungu wrote:

1. For H2, we always proxy the statements through TempTableDataSourceProvider, so all statements we execute are in fact UnclosablePreparedStatement. You can extend UnclosablePreparedStatement to store the SQL (which is secure) and override toString to print the string which was used for preparing. The String used here is the same as the one used by the cache, so we don't add memory overhead (at worst, the SQL string will outlive the statement a bit).

I've added this and committed to 7416a, rev. 14630. Also tested with a large customer application and noticed that the output looks "safe" now.

2. For persistent databases, over which we use c3p0, I have seen the following being reported when a statement is cached multiple times:

I've investigated this quite a bit. The problem is that the Statement stored in c3p0 doesn't have the SQL as a field, so it is not as straightforward as simply printing it. The only place where the SQL is stored is in the key for the GooGooStatementCache, but the key is used only internally and there is no way to access it through FWD.
If any ideas come to my mind, I will post updates.

#7 Updated by Alexandru Lungu about 1 year ago

Radu, please make sure you don't miss the dirty databases as part of the SQL logging / query logging. I think they can inherit the "per-database" configuration. If you dig into this, you could also consider adding something like an ignore_dirty flag at some point (much like ignore_meta). At best, you can also think of adding a %dirty placeholder (<empty-string> or dirty) if you want to separate the logs.
This is not a top priority, as I couldn't find any noticeable bottlenecks with the dirty database so far.

#8 Updated by Alexandru Lungu about 1 year ago

Radu Apetrii wrote:

If any ideas come to my mind, I will post updates.

I think we can adopt a naive form of censorship for now:
Add a sensible flag to the configuration. If it is set, output only: sql.substring(0, 10) + "... (length: " + sql.length() + ")". We already have the stack trace, so it can help us do the digging. Please consider integrating this asap (ideally before allowing #6970 tests). I am planning to merge it today.
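A minimal sketch of the naive censorship described above (SqlCensor is a hypothetical helper name, not the actual FWD class): print only a short prefix of the SQL plus its length, leaving the stack trace to pinpoint the call site.

```java
// Hypothetical helper illustrating the proposed censorship.
public class SqlCensor
{
   /** Return the SQL truncated to prefixLen characters, annotated with its length. */
   public static String censor(String sql, int prefixLen)
   {
      if (sql.length() <= prefixLen)
      {
         return sql;
      }
      return sql.substring(0, prefixLen) + "... (length: " + sql.length() + ")";
   }
}
```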

#10 Updated by Radu Apetrii about 1 year ago

Alexandru Lungu wrote:

Add a sensible flag to the configuration. If it is set, output only: sql.substring(0, 10) + "... (length: " + sql.length() + ")". We already have the stack trace, so it can help us do the digging. Please consider integrating this asap (ideally before allowing #6970 tests). I am planning to merge it today.

I added a flag to SQLStatementLoggerConfiguration called ALLOW_SENSIBLE. This can be configured in directory.xml by adding the following attribute in the list of configurations (for example, after output-to-file):

            <node class="boolean" name="allow-sensible">
              <node-attribute name="value" value="FALSE"/>
            </node>
If this flag is set to TRUE, then statements will be printed exactly as before (with potential sensible data). Setting it to FALSE means that only the first 10 characters of each statement will be printed, followed by the length.

Note: This flag is always TRUE for the _temp database since the statements that affect that database do not leak sensible data.
The commit is on 7416a, rev. 14631.

#11 Updated by Greg Shah about 1 year ago

Add a sensible flag to the configuration.

Please use sensitive instead of sensible.

#12 Updated by Radu Apetrii about 1 year ago

Greg Shah wrote:

Add a sensible flag to the configuration.

Please use sensitive instead of sensible.

Done. I knew something sounded weird, but I couldn't make out what it was.
The change is on 7416a, rev. 14632.

Edit: This also means that the attribute name in directory.xml needs to change from allow-sensible to allow-sensitive.

#13 Updated by Radu Apetrii about 1 year ago

I've done two more changes that can be found on 7416a, rev. 14633:
  • I've increased the number of characters printed from the statement from 10 to 100. This allows us to see which type of statement was executed (SELECT, INSERT, etc.) without exposing the rest.
  • SQLExecutor now receives the SQL used in statements (where it is possible). It then passes the SQL to SQLStatementLogger so that it can be printed in case it is necessary.
Basically, when logging is enabled, there are 3 possible cases:
  • If ALLOW_SENSITIVE is true, then the whole statement is printed (including parameter values).
  • If ALLOW_SENSITIVE is false, but the SQL was received, then the SQL is printed (this means without parameter values).
  • If ALLOW_SENSITIVE is false, and no SQL was received, then the first 100 characters from the statement are printed.
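The three cases above could be sketched roughly like this (StatementLogSketch and its parameters are illustrative stand-ins, not the actual SQLStatementLogger code; stmtText stands for receiver.toString(), which may contain parameter values):

```java
public class StatementLogSketch
{
   /** Decide what to print for a statement, following the three cases. */
   public static String render(boolean allowSensitive, String sql, String stmtText)
   {
      if (allowSensitive)
      {
         // Case 1: full statement, parameter values included.
         return stmtText;
      }
      if (sql != null)
      {
         // Case 2: the parameterized SQL text, placeholders only.
         return sql;
      }
      // Case 3: a bounded 100-character prefix of the statement's text.
      return stmtText.length() <= 100 ? stmtText : stmtText.substring(0, 100) + "...";
   }
}
```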

#14 Updated by Alexandru Lungu about 1 year ago

This sounds good to me. There are cases in which a query from a stack trace is basically a dynamic query. Therefore, we should get the full SQL to make some diagnostics; otherwise we will pin-point something generic like DynamicQuery.open instead of the converted procedure.

This is mostly good to go. I need to do a smart rebase now, because 7416a was branched with some unwanted commits from the trunk.
Radu, please retest with the rebased version first thing tomorrow morning. I will do the same. If everything is fine, #6970 is finally good to go.

#15 Updated by Alexandru Lungu about 1 year ago

Greg Shah wrote:

Add a sensible flag to the configuration.

Please use sensitive instead of sensible.

Hmm, I am the one that started this trend - my bad :/

Done the "smart rebase": I basically uncommitted back to the latest trunk merge and reapplied Radu's changes. I can proudly say that it compiles, but I didn't get the chance to test.
Greg, if everything is alright with you, after doing a skim test, I will merge 7416a.

#16 Updated by Greg Shah about 1 year ago

Code Review Task Branch 7416a Revision 14626

No objections. You can merge to trunk.

#17 Updated by Alexandru Lungu about 1 year ago

  • % Done changed from 0 to 30

Committed 7416a to trunk as rev. 14626.

#18 Updated by Radu Apetrii about 1 year ago

Alexandru Lungu wrote:

DatabaseStatistics is a good solution for query profiling on the FWD part.

I started to investigate this, but I've reached a point in which I have no idea how to continue.

I've managed to enable the DatabaseStatistics by setting in directory.xml the tag persistence/collect-statistics/enabled to true.
More than that, I noticed how queries were registered with DatabaseStatistics throughout the execution of the program.
The problem that I have is that I do not know when/how the program will reach the following code in StandardServer, in order to actually collect the statistics:

         @Override
         public void terminate(Throwable t)
         {
            if (DatabaseStatistics.isEnabled())
            {
               DatabaseStatistics.get().collectStatistics();
            }
         }
Does it work if I close the server myself with Ctrl + C? Also, will the output be printed somewhere, or do I have to specify this?
Ovidiu, Constantin, can one of you please enlighten me on how to do this properly (the intended way)? Thank you in advance!

#19 Updated by Ovidiu Maxiniuc about 1 year ago

The normal way of shutting down the server (and collecting the database statistics) is to 'ask' it nicely, using the -k parameter. For example, if you started the server with

.../your-project/deploy/server$ ./server.sh -d &
you will use
.../your-project/deploy/server$ ./server.sh -k
to shut it down and the termination hooks (StandardServer.hookTerminate()) are called along the way. This allows the termination of the server even if you do not have it running in a console.

The SIGINT (CTRL+C) is the alternative 'hard' way.

#20 Updated by Alexandru Lungu about 1 year ago

Ovidiu Maxiniuc wrote:

The SIGINT (CTRL+C) is the alternative 'hard' way.

This may be an amateur question on my part: should we capture CTRL-C on the server to trigger a nice termination? Or should we not bother with this 'hard' way and stick with -k whenever possible? For my part, I always find it easier (in my GUI environment) to block the bash with server.sh and do CTRL-C when finished. I think it is easy to mishandle the server termination using CTRL-C (or kill) and reach a (possibly) invalid state.

#21 Updated by Radu Apetrii about 1 year ago

Ovidiu Maxiniuc wrote:

The normal way of shutting down the server (and collecting the database statistics) is to 'ask' it nicely, using the -k parameter.

Thank you for the response, it worked!

After doing some more investigation, I've reached the conclusion that DatabaseStatistics can't be used in its current form. In order to actually collect the statistics, the program needs to call DatabaseStatistics.registerDatabase, which is commented out because it depends heavily upon Hibernate. I tried to replace that code, but I was not able to find any equivalent in FWD, nor anything else that collects statistics.

Although both SQLStatementLogger and P2JQueryLogger need some more tuning, I believe they are the way to go forward with this statistics-collecting process. I will focus my attention on them so that they will be ready soon.
Alex, if you do not have something to add, maybe we can close this task and post further updates on #6970?

#22 Updated by Alexandru Lungu about 1 year ago

Makes sense. However, I still feel there is some valuable logic behind DatabaseStatistics that can be reused; not necessarily DatabaseStatistics itself, but more exactly the places where it collects data. AFAIK, your query logger captures some information on RAQ and Compound. I see DatabaseStatistics being used in AbstractQuery, Preselect and RAQ.

I feel like there is some interesting reasoning behind DatabaseStatistics that allows collecting data per query, using queryCtorLocs with Location. I think the goal of this task is to "merge" the effort of P2J query logging and database statistics. Therefore, we won't close this as long as DatabaseStatistics doesn't work / is not removed. So you either decide to move the logic from DatabaseStatistics to P2J query logging and remove the statistics, or the other way around. We need the best from both worlds.

DatabaseStatistics has better data grouping and identification: we get a query only once, with its statistics (time, execution count) and its location in code. In P2J query logging, we have intertwined information and partial data. Therefore, we can print all data on server close and/or keep queries as weak references in DatabaseStatistics. We would periodically check the weak references and eventually dump the data once a query has been GC-ed. Maybe make a configurable dump-period so that every 100 iterations we check the set for all queries that were GC-ed. It is your call how you merge these two features.
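The weak-reference idea could look roughly like this (all names here are hypothetical; the real implementation would flush a query's statistics at the point marked in the comment):

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Illustrative sketch: statistics hold queries weakly, and a periodic sweep
// dumps and drops entries whose query was garbage-collected.
public class WeakStatsRegistry<Q>
{
   private final List<WeakReference<Q>> live = new ArrayList<>();
   private final int dumpPeriod;
   private int iterations = 0;

   public WeakStatsRegistry(int dumpPeriod)
   {
      this.dumpPeriod = dumpPeriod;
   }

   public void register(Q query)
   {
      live.add(new WeakReference<>(query));
   }

   /** Returns the number of GC-ed queries dumped this sweep (0 if not due yet). */
   public int tick()
   {
      if (++iterations % dumpPeriod != 0) return 0;
      int dumped = 0;
      for (Iterator<WeakReference<Q>> it = live.iterator(); it.hasNext(); )
      {
         if (it.next().get() == null)
         {
            // Here the real implementation would flush this query's stats.
            it.remove();
            dumped++;
         }
      }
      return dumped;
   }
}
```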

Final expectation: all kinds of queries should be tested and pass through such logging (including CompoundQuery, RAQ, Adaptive, Preselect, PresortQuery, FindQuery, etc.), but keep only one functional P2J query profiling mechanism. I will attach an inspiring quote from The Zen of Python:

There should be one-- and preferably only one --obvious way to do it.

#23 Updated by Alexandru Lungu 12 months ago

Radu, I have the following warning when building:

[ant:javadoc] /home/al2/gcd/7156b/src/com/goldencode/p2j/persist/orm/SQLExecutor.java:206: warning - @param argument "sql" is not a parameter name.
[ant:javadoc] /home/al2/gcd/7156b/src/com/goldencode/p2j/persist/orm/SQLStatementLogger.java:156: warning - @param argument "log" is not a parameter name.

Please address them in one of your current branches.

#24 Updated by Alexandru Lungu 12 months ago

Radu, I just encountered the following in Persistence:

   /** Milliseconds a scrolling query may take before being reported */
   private final long scrollThreshold;

   /** Milliseconds a list query may take before being reported */
   private final long listThreshold;

   /** Milliseconds a load query may take before being reported */
   private final long loadThreshold;

   /** Milliseconds a load-by-id query may take before being reported */
   private final long loadByIDThreshold;

These seem to be a previous attempt at implementing thresholds for query profiling, added by Eric a long time ago. Please consider removing them, as they conflict with your recent query logging changes. I am concerned about when this profiling is done:

      boolean warn = (trace || LOG.isLoggable(Level.WARNING));
      long start = (warn ? System.currentTimeMillis() : 0L);

which is basically always, as we usually use WARNING logging.

I created 7416b. Include any leverage you may see fit here. The intent is to remove DatabaseStatistics for good, but integrate whatever was of interest there into your logging process.

#25 Updated by Radu Apetrii 11 months ago

I've added some changes to 7416b, rev. 14681. They include the following:
  • Fixed the javadoc mentioned in #7416-23.
  • Removed the logging attempt from Persistence, as discussed in #7416-24.
  • Refactored the logging process so that it matches the behavior/format from #6970-82 and #6970-84.
  • Added the possibility of having a per-database configuration (the same as SQLStatementLogger).
  • Added the layer of P2JQueryExecutor to all P2J queries, so they can all be logged if desired.
  • Mixed the logic from P2JQueryLogger with the query logging logic from DatabaseStatistics:
    • Kept the Location mechanism from DatabaseStatistics. This means that there are two Maps now: one storing the Location based on a query, and one storing the actual statistics of a query based on a key (Location + FQL).
    • The last point implies that a query can also log its location (the place in the FWD code from where it is executed), as opposed to the already existing stackTrace, which is used to log the place of the query in the non-FWD code.
    • A query now displays the number of times it has been used. More precisely, how many times the exact FQL was executed in the exact Location.
Unfortunately:
  • I couldn't extract any useful (and working) logic from the database logging process from DatabaseStatistics, as it was "too dependent" on Hibernate. Some methods of reference are: registerDatabase, collectStatistics.
  • For a query, I wanted to display how many records were processed, but it seems that some more work needs to be done in this department. If this proves to be useful, I will try to add this feature. Note: some queries might not process all the records in one go (e.g. with lazy), so this number might not be that accurate.

A possible configuration for directory.xml, under the persistence tag, can be:

          <node class="container" name="p2j-query-logging">
            <node class="string" name="enabled">
              <node-attribute name="value" value="TRUE"/>
            </node>
            <node class="boolean" name="stack-tracing">
              <node-attribute name="value" value="TRUE"/>
            </node>
            <node class="boolean" name="location">
              <node-attribute name="value" value="TRUE"/>
            </node>
            <node class="integer" name="processed-components">
              <node-attribute name="value" value="1"/>
            </node>
            <node class="boolean" name="profiling">
              <node-attribute name="value" value="TRUE"/>
            </node>
            <node class="double" name="min-time-threshold">
              <node-attribute name="value" value="0.01"/>
            </node>
            <node class="string" name="output-to-file">
              <node-attribute name="value" value="p2j-query_%db.log"/>
            </node>
          </node>
One can also configure the P2JQueryLogger for each database, this example being equivalent to a "global" configuration.

What does this configuration mean and what will happen:
  • enabled on TRUE means that the intent is to log details about P2J queries.
  • stack-tracing on TRUE means that the place from the non-FWD code will be displayed.
  • location on TRUE means that the place from the FWD code will be displayed.
  • processed-components with value 1 means that every time a query is executed it will be logged. For another value "x", a specific query will be logged every "x"-th time.
  • profiling on TRUE means that the execution time of the query will be displayed.
  • min-time-threshold means that only queries slower than this value will be logged; the other ones will be skipped (but still executed, of course).
  • output-to-file marks the name of the file responsible for storing the logging result. It can accept placeholders, such as %db.
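Taken together, processed-components and min-time-threshold might combine into a single logging decision roughly like this (QueryLogFilter is an illustrative name, a sketch of the described semantics rather than the actual P2JQueryLogger code):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the per-query sampling and threshold semantics.
public class QueryLogFilter
{
   private final int everyNth;                 // "processed-components"
   private final double minTimeMs;             // "min-time-threshold"
   private final Map<String, Integer> counts = new HashMap<>();

   public QueryLogFilter(int everyNth, double minTimeMs)
   {
      this.everyNth = everyNth;
      this.minTimeMs = minTimeMs;
   }

   /** Log the query only on every n-th execution and only if slow enough. */
   public boolean shouldLog(String fql, double elapsedMs)
   {
      int n = counts.merge(fql, 1, Integer::sum);
      return n % everyNth == 0 && elapsedMs >= minTimeMs;
   }
}
```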

There are quite a lot of changes, but I hope that I didn't miss anything important. Still, I need a second pair of eyes to look at the changes.

#26 Updated by Radu Apetrii 11 months ago

I have also created a script to analyze the following things:
  • How many times each type of query was executed.
  • What was the slowest query of them all.
  • What was the most used query.
In order to use it, one must:
  • Run the script with one argument, the name of the .log file that contains the logging result.
  • Have a .log file generated by the P2JQueryLogger in the same directory as the script.

The result will be shown in a separate file, with the .txt extension and having the keyword analyzed in its name.
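The kind of tallying such a script performs can be sketched like this (LogSummarySketch is a made-up name; the attached QueryAnalyzer.java is the real implementation and may differ):

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Illustrative sketch of counting query types from logged FQL strings.
public class LogSummarySketch
{
   /** Tally how many times each statement type (first FQL token) appears. */
   public static Map<String, Integer> countTypes(List<String> fqls)
   {
      Map<String, Integer> counts = new LinkedHashMap<>();
      for (String fql : fqls)
      {
         String type = fql.trim().split("\\s+")[0].toUpperCase(Locale.ROOT);
         counts.merge(type, 1, Integer::sum);
      }
      return counts;
   }
}
```

The most used query and the slowest query follow the same pattern, keying the map on the full FQL (or FQL + elapsed time) instead of the first token.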

#27 Updated by Alexandru Lungu 10 months ago

  • Assignee changed from Radu Apetrii to Andrei Iacob

#28 Updated by Alexandru Lungu 10 months ago

  • Assignee changed from Andrei Iacob to Radu Apetrii

Sorry, my bad.

#29 Updated by Alexandru Lungu 9 months ago

  • % Done changed from 30 to 100
  • Status changed from WIP to Review

Radu, please rebase 7416b to the latest trunk. "Rereview" the changes and test them with a project you already have set up, and ensure it still works properly. Thank you!

#30 Updated by Radu Apetrii 9 months ago

Alexandru Lungu wrote:

Radu, please rebase 7416b to the latest trunk. "Rereview" the changes and test them with a project you already have set up, and ensure it still works properly. Thank you!

Rebased 7416b with trunk rev. 14767. 7416b is now at rev. 14769.
I will "rereview" the changes.

#31 Updated by Radu Apetrii 9 months ago

  • Status changed from Review to WIP
  • % Done changed from 100 to 60
I tried to run ./gradlew.all in the 7416b branch, but I encountered some errors. Thus, I've committed a fix to 7416b, rev. 14770, so that the branch can be used now.
There were two issues:
  • Even though DatabaseStatistics.java was removed, its usages were not. I don't know how I've managed to miss this and I'm not sure why Eclipse was so casual about it (there were no errors, no notifications).
  • Some dependencies in SQLQuery got mixed up.

I noticed that I was quite careless when implementing those features, so there will probably be a thorough "rereview" soon.

#32 Updated by Radu Apetrii 9 months ago

I've done another commit in which I've fixed some of the things that I noticed during the "rereview":
  • There was a case in which sensitive data would not be printed even though the ALLOW_SENSITIVE flag was true.
  • Fixed some javadoc.
  • databaseName in CompoundQuery was not used for anything, it was just assigned.

Branch 7416b is now at rev. 14771. I've tested with a large customer application and the log files generated were pretty informative.

The changes are ready for a proper review, I think.

#33 Updated by Radu Apetrii 9 months ago

  • Status changed from WIP to Review
  • % Done changed from 60 to 100

#34 Updated by Alexandru Lungu 9 months ago

Review:

  • Add history numbers to Validation (and all other classes that have changes on this branch exclusively - 7416b)
  • Rename RANDOMACCESS into RANDOM_ACCESS
  • I see that the legacy DatabaseStatistics had a perfdb target which was populated with the statistics. Right now, we are logging into a file, dropping the support for "logging" into a H2 database. Since the previous feature was not working (due to the switch from Hibernate), I guess it was quite obsolete, but I still think we can have such support. As 7416b is needed asap by #6970, I would rather merge 7416b and open a new task on "allowing the query logs to be collected in a H2 database".
With this being said, I am OK with the changes. I would like to have Andrei Iacob check out this branch and stress-test it a bit with some configurations. The goal in #6970 is to have an optimal configuration for testing, but until then, we should have 4-5 configurations for specific matters:
  • general configuration for catching slow queries
  • general configuration for printing almost all queries
  • etc.

Eventually, we will move this to a Wiki. By preparing these configurations, we automatically test the code.
I can do a set of regression tests, but most probably next week, just before we merge this.

#35 Updated by Alexandru Lungu 9 months ago

Radu, please let me know when you finish with the changes here. I would like to have this reviewed, tested, and merged soon so we can resume the work on #6970 properly.

#36 Updated by Radu Apetrii 8 months ago

Alexandru Lungu wrote:

  • Add history numbers to Validation (and all other classes that have changes on this branch exclusively - 7416b)
  • Rename RANDOMACCESS into RANDOM_ACCESS

Committed to 7416b, rev. 14772. The changes contain the addition of a history entry number in Validation and the change of name from RANDOMACCESS into RANDOM_ACCESS.

  • I would rather merge 7416b and open a new task on "allowing the query logs to be collected in a H2 database".

I will do this right now.

#37 Updated by Alexandru Lungu 8 months ago

  • Related to Feature #7960: Allow query logs to be collected in a H2 database added

#38 Updated by Alexandru Lungu 8 months ago

Review of 7416b, rev. 14772

  • I am OK with the changes. These are not functionally changing FWD, but are quite intrusive in the persistence layer and may alter performance.
  • I am looking forward to a second review here (Eric / Ovidiu, are you available to tackle this?)
  • I will run some (performance and regression) tests with these changes, but it will take a bit as I already queued some branches to test. Hopefully, I will manage to do some tests by this Friday. Anyway, I would prefer to have the review done before that time, so I can test with the latest version.

Radu, please make a really short summary of the changes you've made so it will be easier for an upcoming reviewer to understand the changes and their motivation.

#39 Updated by Radu Apetrii 8 months ago

Branch 7416b, rev. 14767-14772 summary:
  • Removed DatabaseStatistics and its usages.
  • In P2JQueryStatistics:
    • Integrated the Location logic from DatabaseStatistics.
  • Added P2JQueryStatisticsConfiguration with the purpose of storing a configuration for each database.
  • Refactored P2JQueryExecutor so that the cases for LOGGING, PROFILING and NONE are more clear.
  • Removed the query execution logic from P2JQueryLogger and moved it to P2JQueryExecutor.
  • SQLExecutor follows the same design as P2JQueryExecutor.
  • In CompoundQuery:
    • Added two functions which are intended only for logging: getCombinedFqls, which returns the FQLs concatenated from each component, and getCombinedExecutionTime which returns the total time of execution of that CompoundQuery (the sum of its components).
    • Obtained the database and passed it as an argument to P2JQueryExecutor.getInstance().execute.
  • In RandomAccessQuery I added getBundleFql, a function used for logging, which returns the FQL of the whole bundle.
The goals of this task were to:
  • Remove DatabaseStatistics since it became unusable, but salvage anything that might be useful.
  • Have a logger for P2JQueries which provides more information than just the query strings themselves (e.g. time, stack trace).
  • Separate and have different configurations for logging queries depending on the database. For example, queries over temporary databases are generally faster and we might intend on having a higher threshold than queries executed over a persistent database.

#40 Updated by Alexandru Lungu 8 months ago

I profiled this, and the change has a reasonable time (it fits the time boundaries). It is neither faster nor slower.

Eric / Ovidiu, are you available to review this?

The tool is quite valuable in enabling us to do SQL / query logging; it would be nice if it is integrated soon and maintained with the trunk.

#41 Updated by Ovidiu Maxiniuc 8 months ago

Wow, this is a much larger change-set than I expected. Things seem more structured now. Good job.
Here are my notes:

  • StandardServer.java:1347. Method terminate() can be dropped completely as it has the same content as its super from AbstractInitTermListener;
  • CompoundQuery.java:
    • line 1919: please use indexed loop as in getCombinedFqls() method above;
    • line 2842: I understand your intention of optimization by caching the database from one component to another. Although the buffers of a component may belong to the same database, cross-database queries can be created (for example: for each book, first ttAuthor where book.author-id eq ttAuthor.auth-id:). Unfortunately, we cannot do this optimization here;
  • P2JQueryStatistics.java:
    • please use 'bulk' import;
    • Location class:
      • line 264: Throwable.getStackTrace() returns an array of StackTraceElement. Please use an indexed loop for better performance. Even if this code is not supposed to run in production, it may affect our profiling;
      • all fields are final. A good optimization is to pre-compute the hash value instead of doing the calculation each time it is needed;
      • in the constructor the String fields are intern()-ed. In this case the test for equality can be performed with the == operator instead of the equals() method;
    • Key class, hash can also be precomputed;
    • I am not sure whether location and fql are mandatory fields of P2JQueryStatistics. Each of these objects is accessed by a Key object, so you already have this information from Key.location and Key.fql.
  • Persistence.java: a single H entry is required for a merge to trunk.
  • RandomAccessQuery.java: taking into consideration how it is used, does it make sense to cache the result of getBundleFql()?
  • Loader.java, Persister.java, SQLExecutor.java, SQLStatementLogger.java, SQLStatementLoggerConfiguration.java: add a new number to your H entry (005, 008, etc, I guess).
  • P2JQueryExecutor.java
    • I especially liked this from the H note: Query execution can no longer happen in P2JQueryLogger. :-)
    • line 445: why create a new QueryOffEndException with the same content as the one caught? The only difference is that some of the initial exception's elements are lost;
  • P2JQueryLogger.java, method log():
    • the fql may be null. For example, FIND FIRST book. is converted to new FindQuery(book, (String) null, null, "book.bookId asc").first();
    • it looks a bit verbose to me. As I read it, it prints multiple pieces of information for each executed query. Shouldn't it collect the data and only print the final statistics?
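The precomputed-hash / intern() pattern suggested for the Location class could look roughly like this. This is a minimal sketch to illustrate the two points, not FWD's actual P2JQueryStatistics.Location; the field names are assumptions.

```java
/**
 * Sketch: an immutable key whose String fields are intern()-ed in the
 * constructor, so equals() can use identity comparison, and whose hash
 * is computed once because all fields are final.
 */
final class Location
{
   private final String className;
   private final String methodName;
   private final int    line;
   private final int    hash; // pre-computed; fields are final, so it never changes

   Location(String className, String methodName, int line)
   {
      this.className  = className.intern();
      this.methodName = methodName.intern();
      this.line       = line;
      this.hash       = 31 * (31 * className.hashCode() + methodName.hashCode()) + line;
   }

   @Override
   public boolean equals(Object o)
   {
      if (!(o instanceof Location))
      {
         return false;
      }
      Location other = (Location) o;

      // interned strings: reference comparison is sufficient (and cheap)
      return className == other.className   &&
             methodName == other.methodName &&
             line == other.line;
   }

   @Override
   public int hashCode()
   {
      return hash; // no per-call recomputation
   }
}
```

Since such keys are looked up in a map on every recorded query, both the identity-based equals() and the cached hash save work on the profiling hot path.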

#42 Updated by Alexandru Lungu 8 months ago

Radu, please address the review as soon as you have the chance.

#43 Updated by Alexandru Lungu 8 months ago

As the review is quite detailed, Andrei [ai], can you manage clearing as many of the minor points above as possible? If there are functional things, you can ask Radu / leave them for Radu to solve. I want to have some visible progress here.

#44 Updated by Radu Apetrii 8 months ago

Alexandru Lungu wrote:

As the review is quite detailed, Andrei [ai], can you manage clearing as many of the minor points above as possible? If there are functional things, you can ask Radu / leave them for Radu to solve. I want to have some visible progress here.

Alex, I already started working on these points. Currently I'm handling a commit for #8012 which I need to document as well, but after that I should be clear to continue working on this.

Taking this into consideration, should Andrei still take over from this point?

#45 Updated by Alexandru Lungu 8 months ago

Okay then, please continue working on the review. Andrei, disregard my comment.

#46 Updated by Radu Apetrii 7 months ago

I've committed the changes suggested in #7416-41 to 7416b rev. 14773, with two exceptions: the two points related to P2JQueryLogger.java.

the fql may be null.

I don't think this can happen. Wouldn't the fql contain the from clause in all cases? Even in the example with book, the fql would be something similar to: select book.recid from Book__Impl__ as book order by .... Even assuming that book is a temporary table, the fql would contain at least the from clause and an order by multiplex clause. Maybe I'm misunderstanding; please correct me if so.

Shouldn't it collect and only print the final statistics?

From the start, my intention was for this to be a "live logger", so one can see in real time, for example, how much slower a prev() is than a next(). Regarding final statistics, there is a task, #7960, which is supposed to handle this exact case: collect statistics about queries and print them at the end, or store them in a database.

#47 Updated by Alexandru Lungu 7 months ago

  • Status changed from Review to Internal Test

Let's move this into Internal Test and have it completely tested and merged. Even if these are only refactorings, we still need to make sure we don't break anything in the performance layer.

First:
  • [Radu] rebase 7416b
Second:
  • [Radu] POC performance tests
  • [Radu] large customer application smoke tests
  • [Alexandru] large customer application regression tests
  • [Alexandru] ChUI regression testing

#48 Updated by Radu Apetrii 7 months ago

I'm starting the rebase now. I'll update you when it is done so we can both do the tests.

#49 Updated by Radu Apetrii 7 months ago

Radu Apetrii wrote:

I'm starting the rebase now. I'll update you when it is done so we can both do the tests.

Done. Branch 7416b was rebased with trunk rev. 14863 and reached rev. 14869.

#50 Updated by Radu Apetrii 7 months ago

Alexandru Lungu wrote:

large customer application smoke tests

Done. Worked without any issue in all cases (sql logging on/off, query logging on/off).

POC performance tests

The tests show a ~3.5% decrease in performance. To be fair, a slowdown was expected, but I'm not sure we are willing to accept such a value.
Alex, shall we have another look at the solution performance-wise?

#51 Updated by Greg Shah 7 months ago

The tests show a ~3.5% decrease in performance.

No, we can't move ahead with this.

#52 Updated by Alexandru Lungu 7 months ago

Greg Shah wrote:

The tests show a ~3.5% decrease in performance.

No, we can't move ahead with this.

Radu, first "batch" of logging refactoring didn't show any downgrade. I didn't expect to see any downgrade with this. We need to do better analysis of the code to see if we don't regress.

#53 Updated by Eric Faulhaber 7 months ago

The original implementation (when Hibernate was involved) had a very high overhead. I don't know the exact number, but it was unusable for a production system. As such, it was never meant to be enabled in production, just on test systems. However, when it was passive/disabled, there was negligible overhead (just a static variable check, IIRC).

Are you saying there is a 3.5% penalty when the implementation is passive, or when it is actively recording performance data?

#54 Updated by Radu Apetrii 7 months ago

Eric Faulhaber wrote:

Are you saying there is a 3.5% penalty when the implementation is passive, or when it is actively recording performance data?

That 3.5% penalty is with the implementation passive (all logging features disabled). I forgot to mention that in the last post, sorry.

#55 Updated by Radu Apetrii 7 months ago

Alexandru Lungu wrote:

Radu, first "batch" of logging refactoring didn't show any downgrade. I didn't expect to see any downgrade with this. We need to do better analysis of the code to see if we don't regress.

I'll take another look at the code and then re-run the performance tests. Maybe it was a misconfiguration on my part. I'll double-check everything.

#56 Updated by Radu Apetrii 7 months ago

Radu Apetrii wrote:

Alexandru Lungu wrote:

Radu, first "batch" of logging refactoring didn't show any downgrade. I didn't expect to see any downgrade with this. We need to do better analysis of the code to see if we don't regress.

I'll take another look at the code and then re-run the performance tests. Maybe it was a misconfiguration on my part. I'll double-check everything.

Regarding the hardware discussion from #5768, I'm also seeing differences in timing depending on my location (home vs. office), which I find very strange. At first I thought it was a coincidence, but I've redone the experiment to confirm it. The difference is not huge, but it certainly plays a part in testing. That's why that 3.5% slowdown seemed so massive: I ran the 7156b baseline at home (the faster environment) and 7156b + 7416b at the office.

I'll post the results once the tests are finished (from the same location this time).

#57 Updated by Radu Apetrii 7 months ago

After re-running the performance tests for both the baseline and the changed version, I got the following result:
  • 7156b with the 7416b changes on top is ~0.23% faster than plain 7156b (a ~-0.23% slowdown, if you like).

I believe this is closer to the result Alex got. Shall we ask someone else to test this (as we've seen some mismatches between our results lately), or are we good to go?

#58 Updated by Greg Shah 7 months ago

You can go ahead.

#59 Updated by Radu Apetrii 7 months ago

Greg Shah wrote:

You can go ahead.

I'll quickly rebase the branch and then proceed with the merge.

#60 Updated by Radu Apetrii 7 months ago

Done. Branch 7416b was merged into trunk as rev. 14872 and archived.

#61 Updated by Alexandru Lungu 6 months ago

  • Status changed from Internal Test to Test
