Bug #7059

FWD shouldn't raise errors for non-numeric extent indexes.

Added by Stanislav Lomany over 1 year ago. Updated about 1 year ago.

Status: WIP
Priority: Normal
Target version: -
Start date:
Due date:
% Done: 0%
billable: No
vendor_id: GCD
case_num:
version_reported:
version_resolved:

History

#1 Updated by Stanislav Lomany over 1 year ago

The original issue was #7043-4. Copied from there:

def temp-table tt1 field f1 as int extent 5 field f2 as int.

create tt1.
tt1.f1[1] = 10.
tt1.f1[2] = 20.
tt1.f1[3] = 30.

def var hb as handle.
hb = buffer tt1:handle.

message hb::f1(1).
message hb::f1(2).
message hb::f1("2"). // array subscript 50 is out of range
message hb::f1("garbage"). // array subscript 24935 is out of range

// no error here
message hb::f2(1).
message hb::f2(2).
message hb::f2("2").
message hb::f2("garbage").

FWD expects the argument of a DEREFERENCE operator to be a numeric index, but OE accepts anything (even a date!). From the looks of it, OE may interpret the bytes of the argument as an integer, similar to how CALL does it.

#3 Updated by Alexandru Lungu about 1 year ago

  • Start date deleted (01/20/2023)
  • Status changed from New to WIP
  • Assignee set to Dănuț Filimon

#4 Updated by Dănuț Filimon about 1 year ago

I've investigated the issue and found the following:
  • for characters, the index is calculated as (second char << 8) + first char, meaning that the character values "ga" and "garbage" both yield the same index, 24935;
  • for dates, the index is calculated as (the number of days from 12/31/-4714 until the given date) % 32768;
  • for decimals, there are two ways to get the index. The following code is an example; note that it is an addition to the snippet provided in #7059-1:
    DEFINE TEMP-TABLE tmp3
        FIELD f1 AS DECIMAL.
    
    CREATE tmp3.
    tmp3.f1 = 2.1.
    
    message hb::f1(2.1). // 513
    message hb::f1(tmp3.f1). // 640
    
    • When using a literal, the index is calculated as (total digits except trailing zeros) * 256 + 1;
    • When using a field value from a table or a variable, the index is calculated as (total digits except trailing zeros) * 256 + 1 + 127;
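
The character rule above can be sketched in Java as follows. This is only an illustration of the observed formula, not the FWD implementation: the class and method names are hypothetical, a single-byte encoding is assumed, and a missing second character is assumed to count as 0 (which matches the subscript 50 reported for "2" in #7059-1). The empty string is deliberately not modeled.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of FWD: reproduces the observed
// (second char << 8) + first char rule for character arguments.
final class CharExtentIndex {
    static int fromCharacter(String value) {
        if (value.isEmpty()) {
            // The empty string is not covered by the observed formula.
            throw new IllegalArgumentException("empty value is not modeled");
        }
        // Assumption: one byte per character (ISO-8859-1).
        byte[] bytes = value.getBytes(StandardCharsets.ISO_8859_1);
        int first = bytes[0] & 0xFF;
        // Assumption: a missing second character contributes 0.
        int second = bytes.length > 1 ? bytes[1] & 0xFF : 0;
        return (second << 8) + first;
    }

    public static void main(String[] args) {
        System.out.println(fromCharacter("2"));       // 50
        System.out.println(fromCharacter("ga"));      // 24935
        System.out.println(fromCharacter("garbage")); // 24935
    }
}
```

Only the first two characters take part in the result, which is why "ga" and "garbage" collide.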

I will work on handling these cases. My only concern is related to decimal values, since the converted code looks like this:

      message(hb.unwrapDereferenceable().dereference(decimal.fromLiteral("2.1"), "f1")); //  513
      message(hb.unwrapDereferenceable().dereference(((decimal) new FieldReference(tmp3, "f1").getValue()), "f1")); // 640
Is it possible to distinguish between those two inside the dereference method?

EDIT1: Negative decimals are an exception: the index value is calculated as (total digits except trailing zeros) * 256 in both cases.
EDIT1: By total digits except trailing zeros I mean that 2.10 and 2.1 will use the same index value.
EDIT2: Additional decimal specification.

#5 Updated by Alexandru Lungu about 1 year ago

Danut, compared to other 4GL data types, decimal seems to have two values stored: value and scale. Your issue may be related to the fact that decimal literals have a "default/implicit" scale, while variables and field references have an explicit scale that forces the extra 127.

  • check if the decimals provided by tmp3.f1 and 2.1 are any different
  • check the scale in FWD
  • review the 4GL documentation to check how you can change the precision/scale and if you get different results than the one in #7059

Unless we find a clear way to replicate this for decimals, I think we can drop the effort, stick to a simple formula, and log a WARNING. We shouldn't burden the conversion to achieve such an obscure quirk. I really don't think that any customer application will expect this difference in decimal conversion when accessing an extent field.

#6 Updated by Alexandru Lungu about 1 year ago

Some related questions:
  • empty character results in extent 0?
  • negative decimal is any different? I guess that +1 comes from + exactly like with integer.

Danut, did you check the CALL statement? As Stanislav mentioned, there may be some related code there in regard to integer casting.

EDIT: AFAIK, intVar = dynamic-function("charFunction"). works in 4GL, and converts a char to an int. Can you check that dynamic-function is already converting characters to integers properly, similar to your dereference? If not, please do the same tests you did with CALL and DYNAMIC-FUNCTION and ensure we have a consistent data type "implicit casting" all around.

#7 Updated by Dănuț Filimon about 1 year ago

Here's a decimal overview of the formulas used in each situation:

Action | Decimal value type | Decimal value | Formula | Index value
hb::f1(1.0) | Literal | 1.0 | 1 | 257
hb::f1(var) | Variable | 1.0 | 2 | 384
hb::f1(tt1.f1) | Temp-table field | 1.0 | 2 | 384
hb::f1(1.1) | Literal | 1.1 | 1 | 513
hb::f1(var) | Variable | 1.1 | 2 | 640
hb::f1(tt1.f1) | Temp-table field | 1.1 | 2 | 640
hb::f1(-1.0) | Literal | -1.0 | 3 | 256
hb::f1(var) | Variable | -1.0 | 3 | 256
hb::f1(tt1.f1) | Temp-table field | -1.0 | 3 | 256
hb::f1(-1.1) | Literal | -1.1 | 3 | 512
hb::f1(var) | Variable | -1.1 | 3 | 512
hb::f1(tt1.f1) | Temp-table field | -1.1 | 3 | 512
hb::f1(f(1.0)) | Function call, Literal | 1.0 | 1 | 257
hb::f1(f(val)) | Function call, Variable | 1.0 | 2 | 384
hb::f1(f(tt1.f1)) | Function call, Temp-table field | 1.0 | 2 | 384

The table is incomplete because using functions results in index values similar to the ones obtained without functions. It seems the index is calculated from the final result and, if references are used, the scale.

The formulas referenced in the previous table:

Number | Formula
1 | (total number of digits) * 256 + 1
2 | (total number of digits) * 256 + 1 + 127
3 | (total number of digits) * 256
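
The three formulas can be written as a small Java sketch, checked against the values in the table above. It is not the committed FWD code: the class name, the method names, and the fromReference flag are hypothetical, and "total digits except trailing zeros" is approximated with BigDecimal.stripTrailingZeros().precision(). That approximation is an assumption that matches the tabulated values, but it has not been checked against 4GL for other inputs (for example, integers ending in zeros).

```java
import java.math.BigDecimal;

// Hypothetical helper, not part of FWD: encodes formulas 1-3 from the table.
final class DecimalExtentIndex {
    // Assumption: "total digits except trailing zeros" is the precision
    // of the value after stripping trailing zeros (2.10 and 2.1 both give 2).
    static int significantDigits(BigDecimal value) {
        return value.stripTrailingZeros().precision();
    }

    // fromReference is true for variables and temp-table fields,
    // false for literals.
    static int index(BigDecimal value, boolean fromReference) {
        int base = significantDigits(value) * 256;
        if (value.signum() < 0) {
            return base;                       // formula 3: negative values
        }
        return fromReference ? base + 1 + 127  // formula 2: variable or field
                             : base + 1;       // formula 1: literal
    }

    public static void main(String[] args) {
        System.out.println(index(new BigDecimal("1.1"), false)); // 513
        System.out.println(index(new BigDecimal("1.1"), true));  // 640
        System.out.println(index(new BigDecimal("-1.1"), true)); // 512
    }
}
```

The open question from #7059-4 remains visible in the signature: the caller has to supply fromReference, because the value alone does not say whether it came from a literal.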

Alexandru Lungu wrote:

  • check if the decimals provided by tmp3.f1 and 2.1 are any different

I don't see any difference between the two decimals.

  • check the scale in FWD
  • review the 4GL documentation to check how you can change the precision/scale and if you get different results than the one in #7059

The definition of decimal is "DECIMAL data consists of decimal numbers up to 50 digits in length including up to 10 digits to the right of the decimal point." and in an article it is specified that "In general the Progress/OpenEdge 4GL/ABL does not provide native methods to extend the limit of either the precision or the scale of its decimal variables.".

empty character results in extent 0?

In 4GL, ? is returned when using "".

negative decimal is any different? I guess that +1 comes from + exactly like with integer

I provided tables with examples.

Danut, did you check the CALL statement? As Stanislav mentioned, there may be some related code there in regard to integer casting.

I did and only noticed Call.NativeCallParameter.convertTo, but I don't think it's related.

#8 Updated by Dănuț Filimon about 1 year ago

Committed 7059a/rev.14633. I added dereference operators to handle character/longchar/logical/date types; I did not research datetime and datetimetz types in 4GL yet. At the same time, I modified how the index value is calculated for the decimal type. Since it is hard to determine the correct formula for the index value, I think that we should decide whether to use it or not.

#9 Updated by Constantin Asofiei about 1 year ago

Danut, I think 4GL just reads the first two bytes from the memory address where this value is stored (as max extent size is 28000 in OpenEdge).

If you are trying to find the binary representation of different data types, take a look at CallParameter.convertTo(BaseDataType val, Class type), that may help.

As a side note, this convertTo method has some assumptions for e.g. int64 when the OUTPUT returned value is character. I tried to duplicate this in a standalone test, but I can't find a way to convert e.g. a garbage output value to an int64.

#10 Updated by Constantin Asofiei about 1 year ago

Constantin Asofiei wrote:

As a side note, this convertTo method has some assumptions for e.g. int64 when the OUTPUT returned value is character. I tried to duplicate this in a standalone test, but I can't find a way to convert e.g. a garbage output value to an int64.

I found what I did wrong; the parameter variable is int64, but set-parameter actually sets character, not int64. So the binary representation of the returned character value is loaded into the int64 variable. See this example, if you want to experiment:

def var i as int64.

procedure proc0.
   def output param p1 as char.
   p1 = "garbage".
end.

def var hb as handle.
create call hb.
hb:num-parameters = 1.
hb:set-parameter(1, "char", "output", i).
hb:call-name = "proc0".
hb:invoke().
message i.

#11 Updated by Dănuț Filimon about 1 year ago

Constantin Asofiei:

If you are trying to find the binary representation of different data types, take a look at CallParameter.convertTo(BaseDataType val, Class type), that may help.

I was looking in the wrong place. CallParameter.convertTo(BaseDataType val, Class type) is exactly what I need, thank you. I will try to create my own test based on the one you provided and see what results I obtain for each type.

#12 Updated by Constantin Asofiei about 1 year ago

Dănuț Filimon wrote:

Constantin Asofiei:

If you are trying to find the binary representation of different data types, take a look at CallParameter.convertTo(BaseDataType val, Class type), that may help.

I was looking in the wrong place. CallParameter.convertTo(BaseDataType val, Class type) is exactly what I need, thank you. I will try to create my own test based on the one you provided and see what results I obtain for each type.

There are lots of combinations in testcases/uast/call, but it can take a while to understand them.

#13 Updated by Constantin Asofiei about 1 year ago

You can also try combinations like this, when using characters and octal representation:

def temp-table tt1 field f1 as int extent 5.
def var hb as handle.
hb = buffer tt1:handle.
create tt1.

def var i as int.
i = hb::f1("~001~001").

Keep in mind that ~000 is converted to a space (ASCII value 32), so ~000 will become ~040.
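
As a rough illustration of the "first two bytes" hypothesis from #7059-9 combined with the octal test above, here is a Java sketch. All names are hypothetical, a single-byte encoding and little-endian byte order are assumed, and the printed values are what the sketch yields under those assumptions, not verified 4GL output (apart from 24935 for "garbage", which matches #7059-1).

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of FWD: reads the first two bytes of a
// character value as an unsigned little-endian 16-bit integer.
final class TwoByteIndex {
    // Models the note above: a NUL byte (~000) is stored as a space (~040).
    static byte[] toStoredBytes(String value) {
        byte[] raw = value.getBytes(StandardCharsets.ISO_8859_1);
        for (int i = 0; i < raw.length; i++) {
            if (raw[i] == 0) {
                raw[i] = 32;
            }
        }
        return raw;
    }

    // Assumption: values shorter than two bytes are zero-padded.
    static int firstTwoBytes(byte[] stored) {
        ByteBuffer buf = ByteBuffer.allocate(2).order(ByteOrder.LITTLE_ENDIAN);
        buf.put(stored, 0, Math.min(2, stored.length));
        return buf.getShort(0) & 0xFFFF; // mask to keep the result unsigned
    }

    public static void main(String[] args) {
        System.out.println(firstTwoBytes(toStoredBytes("\001\001"))); // 257
        System.out.println(firstTwoBytes(toStoredBytes("\000\001"))); // 288
        System.out.println(firstTwoBytes(toStoredBytes("garbage")));  // 24935
    }
}
```

Java's "\001" octal escape plays the role of ~001 in the 4GL literal.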

#14 Updated by Dănuț Filimon about 1 year ago

I ran a few tests using CALL and found that there aren't a lot of similarities between the conversion to the array index used in 4GL and the result of the test in #7059-10. The specific case mentioned in #7059-10 converts a character to int64 the same way as the array index, except that it does so for all character values of size 8 or less (it returns 0 when greater), while the array index is represented by the first 2 chars when dereferencing. Another thing to note is that CALL can return negative values when converting, while the array index can't be negative.

I also tried to find out how datetime and datetime-tz are converted and found that both types always convert to the value 13488, which makes things pretty simple, but I don't have any idea why this specific value is used.

#15 Updated by Greg Shah about 1 year ago

I also tried to find out how datetime and datetime-tz are converted and found that both types always convert to the value 13488, which makes things pretty simple, but I don't have any idea why this specific value is used.

Weird. Our previous testing of date values suggests they are internally stored as a kind of julian day number, but with 12/31/-4714 as the 0 day instead of 01/01/-4713 which is the normal julian base day. That would make the current day number (in the 4GL) 2460123 (for June 27, 2023). My first thought would be that there is some kind of truncation and/or reading bytes out of a little endian value but looking at the binary values it is not obvious. I would like to know how that number is derived as I would guess it is only stable for certain date ranges.
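
For experimenting with the date case, here is a Java sketch that combines the day number quoted above with the modulo-32768 rule reported in #7059-4. The class and method names are hypothetical. It assumes that Java's JulianFields.JULIAN_DAY can stand in for the 4GL day number: the two agree for June 27, 2023 (2460123), but whether they agree across the whole date range has not been verified.

```java
import java.time.LocalDate;
import java.time.temporal.JulianFields;

// Hypothetical helper, not part of FWD: day number modulo 32768.
final class DateExtentIndex {
    // Assumption: the 4GL day number equals Java's whole-day Julian Day count.
    static long dayNumber(LocalDate date) {
        return date.getLong(JulianFields.JULIAN_DAY);
    }

    static int index(LocalDate date) {
        // floorMod keeps the result non-negative even for negative day numbers.
        return (int) Math.floorMod(dayNumber(date), 32768L);
    }

    public static void main(String[] args) {
        LocalDate date = LocalDate.of(2023, 6, 27);
        System.out.println(dayNumber(date)); // 2460123
        System.out.println(index(date));     // 2523
    }
}
```

This covers only the date rule; it does not explain the constant 13488 observed for datetime and datetime-tz.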
