Bug #7924
add support for creating an empty database
100%
Related issues
History
#1 Updated by Alexandru Lungu 9 months ago
- Assignee set to Dănuț Filimon
- Status changed from New to WIP
A summary from a customer task - mostly from Eric:
- Empty database: A database with a fully working schema (including UDFs), but no pre-existing data (other than perhaps converted _User records, if needed)
- We can create the tables using the generated
ddl/schema_table_<name>_<dialect>.sql
- Indices will be missing, since currently, it is the import process which creates these (we use unique indices during the import to detect duplicate records; however, non-unique indices are not created until after the import, to allow bulk inserts to execute as quickly as possible).
- I don't think the p2j_id_generator_sequence will be created without the import. This sequence is created after all records have been inserted.
- If there needs to be authentication of one or more users, we will need those user records imported into the meta_user table.
- Two ways of enabling this:
- development environment: ant task named
create.empty.db
- development environment: ant task named
Problems:
1. From Ovidiu:
The indices are created during the import of each table. If there is no dump file, the table is not touched. If the .d file exists but is empty (its size is 0), it is listed in import log but actually ignored. The workaround for having the indices created is to have .d files for each table without any records, but just the PCS footer. It would look like this, for all files: . PSC records=0000000000000 timestamp=2016/12/07-14:55:06 numformat=44,46 dateformat=dmy-1950 cpstream=ISO8859-1 . 0000000000 Running the import for this set of files will create all indices. There is another benefit here: these tables will look pristine, so a subsequent import process will allow the tables to be populated, if desired. Of course, the ddl/schema_index_<name>_<dialect>.sql are available anyway and can be used at the same time with ddl/schema_table_<name>_<dialect>.sql
2. From Ovidiu:
A important issue: while doing some researches in this area, I noted that we have a small difference in index naming. For example, the import utility will attempt to delete and then recreate idx__meta_user_idx__userid while the static conversion will generate the same index with idx__meta_user__userid. Notice the extra _idx token. While working 'normally' (aka running an import without creating the indices) this is not a problem, there is a couple of problems if the indices are created before the import is performed: * the import cannot delete the existing indices and the performance is penalized at import (because the kept indices are updated); * after the table is imported, a duplicate index is created with unpredictable results (beside the extra used space).
Danut, please investigate the concern here. I tried to summarize the issue here, but maybe there are still questions left from your part. Please use these days to investigate and start implementing. We need a solution for this in a reasonable time.
Eric / Ovidiu, please feel free to complete with whatever you feel it is relevant here.
#2 Updated by Greg Shah 9 months ago
Please note that the ant task is a development convenience. When deploying to production environments, there will be no conversion project installed. Since there will be no ant task to run, we must have a clean/easy way to implement an empty database that can be run directly from FWD itself.
I'm fine with adding an ant task, but it should just run the simple process in FWD itself.
See #4722 for similar problems that we solved recently for non-empty database imports.
#3 Updated by Alexandru Lungu 9 months ago
Greg, thank you for the heads-up. This exact same thing I wanted to mention, but I missed it. Please note my "Two ways of enabling this", but there is only one way in fact :) The second one was meant to be done just like you state in #7924-2.
#5 Updated by Alexandru Lungu 8 months ago
- Priority changed from Normal to High
Danut, mind that I will move this in High Priority. Even if it is part of the Database thread, it is also a feature required by a customer in a quite reasonable time.
#6 Updated by Dănuț Filimon 8 months ago
I started working on this task, should the import.sh
added in trunk/rev.14773 be extended for this case or a new create_empty.sh
be created? There is a lot of useful functionality in import.sh
that can be used for creating an empty database.
#8 Updated by Dănuț Filimon 8 months ago
- % Done changed from 0 to 50
I extended import.sh
to support creation of an empty database. I've committed 7924a/rev.14790, the changes consist of adding a new -e
option to import.sh
that sets the new empty_import_only
variable. The new variable is used in the final import command and is a parameter for schema/import
. This parameter is used to initialize ImportWorker.empty
(default is false
) and this property is used to toggle record processing and table analysis. It is important to note that indexes and sequences are still generated, including p2j_id_generator_sequence
(since the value of the imported record doesn't influence if the table is added to the complete
set, the dmo is always added when importAsync
is called). There's still some investigation that needs to be done on the second issue mentioned by Ovidiu in #7924-1 and the _User records which should be imported (currently no table is imported).
The import "skipping" is done for each table so that only the record processing is skipped (this should allow some degree of flexibility if we want to allow certain tables to be imported e.g _User), the index creation before and after the processing will be done. The table analysis is skipped since there are no records in the table.
#9 Updated by Dănuț Filimon 8 months ago
Fixed a NPE at conversion time in 7924a/rev.14791.
#10 Updated by Dănuț Filimon 8 months ago
The additional idx__
is added when instantiating the index definition in schema/import
using data.formatWithEscapedParameter("idx__%s", text)
. When indexes are created in ImportWorker$Library.createIndexes
, they reach Dialect.getCreateIndexString
which calls Dialect.getSqlIndexName
. H2 and Postgres add the idx__
prefix to the index, but other dialects (MariaDB) don't do so, the name of the index is simply returned.
#11 Updated by Ovidiu Maxiniuc 8 months ago
Please replace data.formatWithEscapedParameter("idx__%s", text)
with the simple text
. H2 and PostgreSQL will still add another suffix, as in generated DDLs.
More details.
For example idx__meta_user_idx__userid
. There are two idx__
tokens here. The first will be kept (it is optional, and comes form the Dialect.getSqlIndexName()
), but the second must go (it comes from import.xml
). The final name should be idx__meta_user_userid
, the same as static conversion generates.
Than please do a conversion and an import test using hotel_gui with h2, psql and mariadb. Check whether the index names are matching (those from ddl/
against those reported in import's log for each dialect).
#12 Updated by Dănuț Filimon 8 months ago
Ovidiu Maxiniuc wrote:
Please replace
data.formatWithEscapedParameter("idx__%s", text)
with the simpletext
. H2 and PostgreSQL will still add another suffix, as in generated DDLs.
Than please do a conversion and an import test using hotel_gui with h2, psql and mariadb. Check whether the index names are matching (those fromddl/
against those reported in import's log for each dialect).
This change was already applied together with the _user
import, I've also tested conversion using H2 and PostgreSQL on Hotel_GUI yesterday and today I ran the import.sh
script on both types of databases.
I currently have a problem with MariaDB, it's related to the following ant job create.db.mariadb: [exec] mariadb: /lib/x86_64-linux-gnu/libncurses.so.6: no version information available (required by mariadb)
. I recently patched ncurses since it helped me solve a problem with a customer application. After a new setup, it's still not working and it seems that 6.2
version is available, while the wiki specifies version 6.1
.
Can I receive some help with this? I've modified the commands to use 6.1
but it wasn't helpful.
#13 Updated by Dănuț Filimon 8 months ago
Committed 7924a/rev.14792. Fixed the additional idx__
and modified the import to always process the records of the meta_user
table, even if an empty database is being created.
The only testing left that needs to be done is with MariaDB.
#14 Updated by Ovidiu Maxiniuc 8 months ago
I tested the import using both ant script and fwd's import.sh
.
The project's build_db.xml
needed some changes and I added them to hotel_gui in rev 94.
The import.sh
performed o for empty database (-e
) after replacing the mariadb-admin
with mysqladmin
. I guess this is debatable. However, the normal import failed with some errors like these:
_user.d: SQL: create index user_domain_name on meta_user (domain_name__null,domain_name, sql_only_user__null,sql_only_user, userid__null,userid, recid) Error: 1072-42000: Key column 'sql_only_user__null' doesn't exist in table checkout-rooms.d: SQL: create index idx_rc_room on checkout_rooms (room_number__null,room_number, recid) _user.d: Error processing import data; Error: 1072-42000: Key column 'room_number__null' doesn't exist in table 1 of 1 record(s) successfully processed; 1 record(s) uncommitted due to this error; 0 record(s) dropped. checkout-rooms.d: Error processing import data; 2 of 2 record(s) successfully processed; 2 record(s) uncommitted due to this error; 0 record(s) dropped.
This is because there seems to be a regression in generating the table DDL for the strict variant of the dialect.
OTOH, I was not able to test the lenient variant because it is not supported by the script. Here is how the script was executed from the project root:
p2j/import.sh -ddata/dump -hlocalhost -p3306 -fp2j/build/lib -abuild/lib/hotel.jar \ --db_type=mariadblenient --db_adminid=fwd_admin --db_adminpw=fwd_password \ --db_userid=fwd_user --db_userpw=fwd_pasword --db_names=hotel
#15 Updated by Dănuț Filimon 8 months ago
Ovidiu Maxiniuc wrote:
I tested the import using both ant script and fwd's
import.sh
.The
import.sh
performed o for empty database (-e
) after replacing themariadb-admin
withmysqladmin
. I guess this is debatable. However, the normal import failed with some errors like these:
[...]This is because there seems to be a regression in generating the table DDL for the strict variant of the dialect.
[...]
It seems that MariaDB has a problem that is not related to the changes made here. As you mentioned, ddls contain incorrect named indexes and I can confirm the same after converting the project with a newer version of trunk. For the index sql_only_user__null
, __null
should probably not exist, correct me if I am wrong.
I've managed to solve the problem with MariaDB after upgrading my OS version, but I am still having issues when trying to connect to the database.
#16 Updated by Dănuț Filimon 8 months ago
... but I am still having issues when trying to connect to the database.
I overlooked the fact that I need to change the port in build.properties
. There are no more problems and I am able to replicate the issues mentioned by Ovidiu when importing using MariaDB.
#17 Updated by Ovidiu Maxiniuc 8 months ago
Dănuț Filimon wrote:
It seems that MariaDB has a problem that is not related to the changes made here. As you mentioned, ddls contain incorrect named indexes and I can confirm the same after converting the project with a newer version of trunk. For the index
sql_only_user__null
,__null
should probably not exist, correct me if I am wrong.
In fact, the sql_only_user__null
is specific to strict variant of MariaDb. It is a computed column which is used to emulate the order of null
elements in indices and sorting because this dialect does not feature a nulls last
/ nulls first
option. So instead of having domain_name
as index component, we add domain_name__null,domain_name
pair instead. The semantic of this extra column is, as you expect, to test whether its pair the null and force the sorting order accordingly.
This is (one of) the reason we created the lenient variant, where the indices and queries are simplified by ignoring this this aspect. Note that this additional SQL column is created only for indexed columns of the non-lenient mariadb dialect.
However, it is a bit strange. I am looking at the DDLs generated for hotel_gui project and they are generated only for the meta_user
table definitions, although the fields in other tables are recognized as indexed column and the index definitions are correctly defined. As noted in my previous note, this is the cause of the failure: the indices attempts to use these computed columns but they are not defined.
#18 Updated by Dănuț Filimon 8 months ago
Ovidiu Maxiniuc wrote:
I've managed to look over the import issue and found the following case where an undefined computed column is used:As noted in my previous note, this is the cause of the failure: the indices attempts to use these computed columns but they are not defined.
- The table has the following field defined:
addr_id integer not null
, the computed column is not generated because it is not a character type. During ddl generation, it reachesDialect.getCreateIndexString
which calls theMariaDbDialect
specific methodorderByNulls
where an index is created on(addr_id_null, addr_id)
. The field should not exist since the property is defined asnot null
so the table definition is correct, but the index is created without knowing that the index field can't be null.
#19 Updated by Dănuț Filimon 8 months ago
- File 7924-20231114.patch
added
- % Done changed from 50 to 70
- Status changed from WIP to Review
I attached a patch that solves the issue. Ovidiu, can you retest MariaDB and tell me if there are any more problems when using it? You'll have to reconvert the application to generate the new ddls.
#20 Updated by Ovidiu Maxiniuc 5 months ago
Dănuț Filimon wrote:
Ovidiu Maxiniuc wrote:
I've managed to look over the import issue and found the following case where an undefined computed column is used:As noted in my previous note, this is the cause of the failure: the indices attempts to use these computed columns but they are not defined.
- The table has the following field defined:
addr_id integer not null
, the computed column is not generated because it is not a character type. During ddl generation, it reachesDialect.getCreateIndexString
which calls theMariaDbDialect
specific methodorderByNulls
where an index is created on(addr_id_null, addr_id)
. The field should not exist since the property is defined asnot null
so the table definition is correct, but the index is created without knowing that the index field can't be null.
Indeed, the addr_id integer not null
column should not create (addr_id_null, addr_id)
pair in the index definition. However, the reason is not because its type is character
but because it is mandatory
, Ie, not null
. (Since it is defined as mandatory
/ not null
by SQL constraints, we do not need to add this overhead to index definition). But we do need to have these generated columns in case of other types of nullable datatypes which are index components.
#21 Updated by Dănuț Filimon 5 months ago
- Status changed from Review to WIP
Ovidiu Maxiniuc wrote:
Indeed, the
addr_id integer not null
column should not create(addr_id_null, addr_id)
pair in the index definition. However, the reason is not because its type ischaracter
but because it ismandatory
, Ie,not null
. (Since it is defined asmandatory
/not null
by SQL constraints, we do not need to add this overhead to index definition). But we do need to have these generated columns in case of other types of nullable datatypes which are index components.
I've made an assumption based on the field types that were allowing this type of index definition. In this case, we should mark fields as mandatory so that we can create the correct index definition. I see that P2JField
has a mandatory
class member variable, maybe I can start from there.
#22 Updated by Dănuț Filimon 5 months ago
I've looked into DDLGeneratorWorker
and how we can make use of the mandatory
and while debugging the conversion I noted that the P2JField for the table are created after the index so we can't simply get the P2JField instance corresponding to the index component when it is created. And another concern is adding an additional field to the P2JIndex.addComponent()
requires multiple changes to methods that eventually use it and in some cases, the mandatory
value might not be available.
Is the order index -> index component -> fields
important? The rules specify this order, but I did not did upwards to actually look into what rules are used. Even if the order is not necessary, the second concern is still a problem.
#23 Updated by Greg Shah 5 months ago
The order in the ruleset does not matter too much (in this case), but the structure of the tree will drive the rules in that order regardless of changing the sequence/order of the rules in the ruleset. In other words, if the tree is structured such that an index contains index components which contain fields, then normal walk rules will execute at the index node before the contined index component nodes (and so forth).
#24 Updated by Ovidiu Maxiniuc 5 months ago
I think you can access the not-null
annotation when configuring the index components.
- in index components are added using
ddl.addIndexComponent()
(rules/schema/dmo_common.rules
:1024); - the added elements are
INDEX_COL
nodes. They have arefid
annotation; - the
refid
annotation is already used (line 1027) to obtain thePROPERTY
node which have thenot-null
annotation; - use this annotation as an additional parameter to
addIndexComponent()
and store it inP2JIndexComponent
as a new flag; - when the
Dialect
method is invoked pass the value of the flag;
Note you should also take care of dynamic case. When ready, compare the index
and table
schema sql files to see whether they match.
#25 Updated by Dănuț Filimon 5 months ago
Ovidiu Maxiniuc wrote:
I think you can access the
not-null
annotation when configuring the index components.
- in index components are added using
ddl.addIndexComponent()
(rules/schema/dmo_common.rules
:1024);- the added elements are
INDEX_COL
nodes. They have arefid
annotation;- the
refid
annotation is already used (line 1027) to obtain thePROPERTY
node which have thenot-null
annotation;- use this annotation as an additional parameter to
addIndexComponent()
and store it inP2JIndexComponent
as a new flag;- when the
Dialect
method is invoked pass the value of the flag;Note you should also take care of dynamic case. When ready, compare the
index
andtable
schema sql files to see whether they match.
This is perfect. I wanted to know initially if we can get the fields before the index since we could just look into the table fields when the index component is created in addIndexComponent
, but adding the parameter to addIndexComponent
is better.
I will go ahead with this implementation.
#26 Updated by Dănuț Filimon 5 months ago
The changes work well, I tested conversion with hotel_gui and compared the index ddls for mariadb (before and after). When comparing the indices, I confirmed that __null
is not added to fields that do not have not null
in the table definition. At the same time, I ended up making more changes to the rules and fwd methods since both DDLGeneratorWorker.addIndexComponent
and P2JIndex.addComponent
are used in the rules and in both cases the not-null
property is available for use. I'm planning to clear up the implementation and see if I can adjust anything. The default value for not-null
is false
, for cases where the mandatory
value can't be found (there is no Property
, P2JIndexComponent
available, e.g. IndexMetadataCollector.retrieveIndexData
).
#27 Updated by Dănuț Filimon 5 months ago
- % Done changed from 70 to 90
- Status changed from WIP to Review
Committed 7924a/rev.14793.
Based on the error from #7924-14 I noticed that the column does not exist because it is a boolean and the field for the table is not generated in DDLGeneratorWorker.addFieldImpl
because it was not a character. At the same time, __null columns are generated for not null
fields which are part of indices.
- Mandatory character properties that are part of indices do not generate a second index component with
__null
; - Added a MANDATORY flag for index components;
- Changes rules to fit the addition of the not-null value;
I am also looking into removing mandatory fields that are augmented with __null
from the table schema ddl.
Please review.
#28 Updated by Ovidiu Maxiniuc 5 months ago
Review of 7924a/rev.14793.
The code seems fine, except two issues:
1. In case of nullable fields, why injection of generated <field>_null
(MariaDbDialect
:191) is conditioned by the type of field being a character
?
For example, for the table:
ADD TABLE "pt1" AREA "Schema Area" DUMP-NAME "pt1" ADD FIELD "f1" OF "pt1" AS integer FORMAT "->,>>>,>>9" INITIAL "0" POSITION 2 MAX-WIDTH 4 ORDER 10 ADD FIELD "f3" OF "pt1" AS integer FORMAT "->,>>>,>>9" INITIAL "0" POSITION 4 MAX-WIDTH 4 ORDER 10 ADD INDEX "idx3" ON "pt3" AREA "Schema Area" UNIQUE PRIMARY INDEX-FIELD "f3" ASCENDINGthe expected DDLs are as follows:
create table pt1 ( recid bigint not null, f1 integer, f3 integer, f3__null boolean generated always as (f3 is null) virtual, primary key (recid) ); drop index if exists idx3 on pt1; create unique index idx3 on pt1 (f3__null,f3);
The
f3__null
is created/added as an index component regardless of f3
's type (in this case it is an integer
, not a character
).
2. In case of MariaDbDialect, am not sure the proper order by
clause is assembled in order to force/trick the planner to use the desired index.
I will do a smoke-test myself after this issues are fixed.
#29 Updated by Dănuț Filimon 5 months ago
Ovidiu, the problem does not start from the index, but from the table field. If you take a look at DDLGeneratorWorker.addFieldImpl
you will notice that the additional field is created when isChar && dialect.injectComputedColumns() && isIndexComponent(field, table)
condition is true
. There are no other types that generate it aside from character so the method MariaDbDialect.getComputedColumnString
is not called.
In fact, the f3__null
field will not be generated. This is also backed by the error you mentioned in #7924-14 where the sql_only_user
is a boolean field and sql_only_user__null
is not generated. Let me know if f3__null
should actually be generated since this is originally not done so. If this is the case, I can have this fixed immediately.
#30 Updated by Ovidiu Maxiniuc 5 months ago
- % Done changed from 90 to 70
Dănuț,
sql_only_user__null
must be generated.
This thing seemed familiar to me. I did a bit of research and found out why. I was working on a non-stock FWD, which includes the 8009a. This branch has been frozen last 2 months because the main issue in #8009 was identified at other level. Nonetheless, I think the code in that branch should go into trunk, and this might be the chance. I am talking about changes in DDLGeneratorWorker
. Notice there were two methods for adding fields, with slightly different implementations. The character type you mention above was only checked if the filed was the last component of the index. Of course this is not correct.
#31 Updated by Dănuț Filimon 5 months ago
- % Done changed from 70 to 100
Issue has been fixed in 7924a.rev.14794. All field types can be augmented with __null
for MariaDbDialect
when they are part of an index, but only when they are not mandatory. At the same time I expanded the import.sh
to handle MariaDb lenient version with --db_type=mariadblenient
, I noticed it when you tried to use it in #7924-14 and I've been trying to come back to it.
This should include all functionality for creating an empty database for each dialect which was the purpose of this issue.
#32 Updated by Constantin Asofiei 16 days ago
- Related to Bug #8883: ensure FWD data import always creates indexes even if no dump file exists for a table added
#33 Updated by Dănuț Filimon 12 days ago
Rebased 7924a to latest trunk/rev.15306, now the branch is at revision 15311.
#35 Updated by Ovidiu Maxiniuc 11 days ago
Review of 7924a, rev.15307-15311.
Good job!
The majority of the changes are required for carrying along the mandatory flag needed by the MariaDB non lenient dialect when the indexes are re-created.
I saw no problem in the new code except for:
MariaDbDialect.java
:- the value returned by
orderByNulls()
does not look correct.+1
should be added onlyif (!mandatory)
was processed;
- the value returned by
#36 Updated by Dănuț Filimon 8 days ago
- Status changed from Review to Internal Test
Ovidiu Maxiniuc wrote:
Review of 7924a, rev.15307-15311.
I saw no problem in the new code except for:
MariaDbDialect.java
:
- the value returned by
orderByNulls()
does not look correct.+1
should be added onlyif (!mandatory)
was processed;
Fixed in 7924a/rev.15312.
Moving to Internal Test.