Project

General

Profile

4GL Database Access Performance Tips

It is difficult to make bullet-proof, blanket recommendations in this area, since there are so many variables at play, and the runtime is evolving. As we make improvements to address underlying performance issues in different areas, some recommendations may become obsolete over time. At this time, the following tips may be considered best practices.

Schema

Fields

  • Mandatory fields help query performance.
    • This allows FWD to create simpler queries downstream, which are more likely to get better query plans at the database.

Temp-Tables and Datasets

  • Prefer statically defined to dynamically defined. Dynamic definition incurs a runtime cost.
    • Dynamic temp-tables must be converted at runtime.
    • Definitions are cached, so the conversion only occurs once, but subsequent lookups still have a cost.
    • Dynamic datasets require more runtime setup than statically defined datasets.

Indices

  • Use indices for performance, not sorting.
    • Creating indices just for sorting can lead to very wide indices, often largely redundant.
      • Updating indices has a performance cost.
      • PostgreSQL does not compress indices, so they take up more disk space.
  • The fewer index components, the better.
    • Wide, multi-component indices make FIND NEXT/PREV operations on those indices slower.
      • If you need very granular sorting, use BY clauses which overlap an existing, compact index.
    • Use mandatory fields as index components, if possible.

Record Buffers

Buffer Creation

  • Prefer static definition to dynamic creation.
    • Dynamic buffer creation has a higher runtime cost.
  • Be minimalist: don't define every buffer you may possibly need in a shared include file.
    • Defining a buffer has a runtime cost (time and memory).
    • Only define buffers you use.
  • Define buffers close to where they are used.
    • Processing buffers moving in and out of block scopes has a runtime cost.

Record Retrieval

General Advice

  • Minimize round trips to the database.
    • FWD does not use shared memory like Progress; round trips cost time.

Loops

  • Methods of record retrieval
    • Bad
      • FIND NEXT/PREV (fully dynamic record navigation)
      • This may be mitigated; AdaptiveFind will be used, if the FIND NEXT/PREV statement is not conditional.
    • Better
      • FOR EACH (if you might exit the loop early, or modify the index being walked).
    • Best
      • REPEAT PRESELECT (if you know you won't exit the loop early!).

Queries / WHERE Clauses

  • Prefer statically defined queries to dynamically defined. Dynamic queries incur a runtime cost.
    • Dynamic queries must be converted at runtime.
    • Definitions are cached, so the conversion only occurs once, but subsequent lookups still have a cost.
  • Mandatory Fields
    • Avoid WHERE clause comparisons between two non-mandatory, indexed fields.
      • FIELD1 = FIELD2 must be rewritten as (FIELD1 = FIELD2) OR (FIELD1 IS NULL AND FIELD2 IS NULL).
      • The PostgreSQL query planner will not be able to use an index on FIELD1 or FIELD2.
  • Functions in a WHERE clause
    • ABL built-in functions are functionally OK to use in a WHERE clause, but some are not as performant as others
      • These are implemented as database server-side User Defined Functions (UDFs)
      • The more complex implementations may be opaque to the PostgreSQL query planner. See udf/postgresql/udfs.sql. Generally speaking...
        • if the LANGUAGE is sql, the query planner may be able to optimize
        • if the LANGUAGE is plpgsql, the query planner will not be able to optimize
    • Invoking a function defined in a 4GL program is OK as long as it does not reference the current buffer. For example:
      • OK: FOR EACH customer WHERE customer.name = my-4GL-func1()
        • my-4GL-func1() is refactored to a query substitution parameter, which is executed once at query execution time
      • Not OK: FOR EACH customer WHERE my-4GL-func2(customer.name)
        • my-4GL-func2() is refactored to a "client-side" WHERE clause expression, which is executed once per record found by the remaining WHERE clause criteria
  • Nested CAN-FIND in a WHERE clause
    • Avoid nesting CAN-FIND with a LOCK in a WHERE clause
      • FWD refactors nested CAN-FIND to a sub-query, but we can't do this if a lock check is needed
    • Avoid nesting CAN-FIND more than 1 (maybe 2) levels deep
      • Not necessarily a performance issue (though it can be), but it will give the conversion logic fits
  • Joins between temp-tables and database tables
    • Temp-tables are processed using an in-memory H2 database (inside the application server JVM).
    • Database tables are usually processed using an external DBMS (PostgreSQL, MariaDB...).
    • Avoid joins between tables that are in-JVM and tables that are in an external DBMS.
    • FWD attempts to process joins at the database server but joining temp-tables and database tables cannot be processed at the DBMS server.
    • Instead, the FWD Java runtime must handle the join, which is slower than letting the DBMS handle it.