Bug #6627
improve conversion memory footprint for AstKey persistent maps
0%
Related issues
History
#2 Updated by Constantin Asofiei almost 2 years ago
There are these persistent (in the H2 conversion database) maps:
frame_generator_pre.xml: 117: <rule>frameInterfaces = createMapFrameAstKeyToString("frameInterfaces")</rule> p2o_pre.xml 139: <action>tmpTabNames = createMapAstKeyToString("pre_tmpTabNames")</action> p2o.xml (2 matches) 537: <action>tmpTabNames = createMapAstKeyToString("tmpTabNames")</action> 538: <action>tmpTabNodes = createMapAstKeyToAst('tmpTabNodes')</action>These work with a
AstKey
or FrameAstKey
. The problem here is that they keep the AST in memory, even if it is used only for equals
(which AFAIK the map will call it only if there is a collision on the hashcode). My idea here is to:
- keep the AST id as an instance field,
astId
- the
ast
field is kept asnull
- all access to this field is replaced with a getter, which will load the AST (if not already loaded), using theastId
field. This will load the AST only when needed.
I have some changes in testing which allow this 'lazy loading' of the AST only when reading the map from the H2 database. During the map creation, the AST is still kept in memory. I'll see how this conversion goes, and if is OK with a smaller heap, I'll make the ast
fully lazy, even for new map entries. The point is, let the AST be loaded and remain in memory in the AstKey field only if equals
is ever called.
The above should allow for a smaller memory footprint for these maps, but with some penalty on the total conversion time. Considering that these maps are used in conversion phases which are not that time-consuming (5 minutes on a large project for P2O processing and temp-table Java code generation), the impact should not be massive.
#3 Updated by Greg Shah almost 2 years ago
It is a good plan and a reasonable trade-off.
#4 Updated by Constantin Asofiei almost 2 years ago
Experimenting showed that the impact of these changes didn't have the magnitude I first assumed. The changes I have may improve some performance for incremental conversion, but I will not release them until I have a chance to test them more.
As a side note, what the customer was experiencing was a problem with a conversion heap allocated too high, and the operating system was terminating the Java process with error 137, as the machine was out of resource (this was not an OutOfMemoryException
thrown by the JVM). After experimenting with lower heap values, I could find a setting where the conversion time was minimally impacted and allowed the customer to convert it with that machine's RAM (the conversion heap was reduced with ~25%).
- a part of the memory is occupied during parsing by the legacy 4GL classes -
ClassDefinition
structures are kept in memory, and these can be in the 10s of thousands, each pinning to memory the entire AST. These instances remain in memory for the entire conversion. - there are
.schema
files which can be in the 100MB range on disk - some analysis of theSchemaDictionary
memory footprint should be performed, this can be determined easily by converting only the .df files (without any 4GL code) and looking at the used heap (after garbage collection) in VisualVM.
#5 Updated by Constantin Asofiei almost 2 years ago
- Related to Bug #6642: improve parsing memory footprint added
#6 Updated by Constantin Asofiei about 1 year ago
- Status changed from New to Review
- lazy loading of the AST at the map's value.
- when a sub-AST is required at the key or the value, leave the entire AST file in the registry - otherwise, the same AST file will be re-loaded 10s or 100s of times, depending on how many sub-ASTs are required from that file.
In short, the problem above was with the temp-table ASTs - the previous approach was to reload the full AST each time a temp-table AST is required. But, for temp-tables, in the end the full AST is required to be loaded, and this resulted in re-loading the same AST 100's of times, impacting the overall performance. The solution was to load the temp-table AST fully, only once.
This same approach can't be applied to frame ASTs, as these reside in the main .ast file, and we can't keep in memory all the .ast files.