Project

General

Profile

Bug #4433

Problem with text concat

Added by Vladimir Tsichevski over 4 years ago. Updated over 4 years ago.

Status:
New
Priority:
Normal
Assignee:
-
Target version:
-
Start date:
11/29/2019
Due date:
% Done:

0%

billable:
No
vendor_id:
GCD
case_num:
version_reported:
version_resolved:

concaterror.png (4.31 KB) Vladimir Tsichevski, 11/29/2019 12:21 PM

History

#1 Updated by Vladimir Tsichevski over 4 years ago

The following code snippet runs smoothly in Progress.

DEFINE TEMP-TABLE tt NO-UNDO
    FIELD df AS DECIMAL.
CREATE tt.
  ASSIGN df = 1234567890.

define variable buf as HANDLE no-undo.
buf = buffer tt:handle.

DEFINE VARIABLE hf as HANDLE no-undo.
hf = buf:buffer-field("df").

message "The value is" + hf:buffer-value.

In FWD, it raises the following error:

The error is in the TextOps.concat(Object ... list) method, wich uses the toString() to convert from the decimal type to string.

#2 Updated by Greg Shah over 4 years ago

From Vladimir:

I am not sure which is the right way to fix it. According to the Javadocs, the toString() method is intentionally used in the TextOps.concat() method. Probably, the BaseDataType.toStringMessage() should be used instead for objects which are BaseDataType?

#3 Updated by Greg Shah over 4 years ago

Consider that the following code is not valid (it does not compile in the 4GL):

message "text " + 1234567890.

You cannot normally pass a non-text value to the concatenation operator. The reason why the testcase in #4433-1 does cause the issue is related to the BUFFER-VALUE usage. The is one of the "polymorphic" attributes in the 4GL. It returns the specific type of the field, but such polymorphic features get special treatment in the 4GL. Generally, there is usually a kind of implicit conversion of data types in these cases such that you can insert an unexpected type and it will work in usage that would otherwise fail to compile in the 4GL.

Usually, the way we handle such cases is to convert them with a "wrapper" constructor that matches the correct data type. The constructor will compile with a single BaseDataType parameter. Each BDT subclass has a constructor that takes a since BDT value. This constructor is designed to handle exactly this implicit conversion behavior.

I think the answer here is that we must wrapper this case with character. In that case the behavior will work exactly as you have found.

#4 Updated by Constantin Asofiei over 4 years ago

Question: why not do this automatically at runtime? If the argument is not a Java String or a character, then automatically wrap via new character().

#5 Updated by Greg Shah over 4 years ago

I guess we can do that safely, as long as we document this clearly in the javadoc. The only downside is that it can be confusing unless the documentation is there. I think this should only be done for a BaseDataType that is not a Text. Since a non-Text BDT should never appear except for this polymorphic case, it will get the same result.

#6 Updated by Vladimir Tsichevski over 4 years ago

Greg Shah wrote:

I guess we can do that safely, as long as we document this clearly in the javadoc. The only downside is that it can be confusing unless the documentation is there. I think this should only be done for a BaseDataType that is not a Text. Since a non-Text BDT should never appear except for this polymorphic case, it will get the same result.

If I got it right, something like this should be added to the code:

         else if(txt instanceof BaseDataType)
         {
            sb.append(new character((BaseDataType)txt).safeValue());
         }

#7 Updated by Greg Shah over 4 years ago

Can a POLY value ever be passed as the first operand? If not then this is probably the correct approach.

#8 Updated by Vladimir Tsichevski over 4 years ago

Greg Shah wrote:

Can a POLY value ever be passed as the first operand? If not then this is probably the correct approach.

The value returned by the :buffer-value is converted to a BDT by conversion rules before it is passed to the concat(). So a POLY value cannot probably be passed as any operand.

#9 Updated by Greg Shah over 4 years ago

I'm asking about in the 4GL. Can the buffer-value sub-expression appear as the leftmost operand of the + concatenation operator?

#10 Updated by Vladimir Tsichevski over 4 years ago

Greg Shah wrote:

I'm asking about in the 4GL. Can the buffer-value sub-expression appear as the leftmost operand of the + concatenation operator?

Yes, it can. Progress compiles and executes it as expected:

message hf:buffer-value + " is the value".

#11 Updated by Greg Shah over 4 years ago

So then the algorithm needs to be slightly different because the leftmost operand is treated differently than the subsequent operands.

#12 Updated by Vladimir Tsichevski over 4 years ago

Greg Shah wrote:

So then the algorithm needs to be slightly different because the leftmost operand is treated differently than the subsequent operands.

What do you mean?

#13 Updated by Greg Shah over 4 years ago

The first character operand uses getValue() instead of safeValue() in TextOps.concat().

#14 Updated by Vladimir Tsichevski over 4 years ago

Greg Shah wrote:

The first character operand uses getValue() instead of safeValue() in TextOps.concat().

This requirement holds because the value is converted into a character before being passed to concat().

#15 Updated by Greg Shah over 4 years ago

I thought we were discussing the approach from #4433-6 in which the code change was not at conversion but only exists inside TextOps.concat().

#16 Updated by Vladimir Tsichevski over 4 years ago

Greg Shah wrote:

I thought we were discussing the approach from #4433-6 in which the code change was not at conversion but only exists inside TextOps.concat().

I suppose I meant the same thing. The code change a proposed in #4433-6 is for TextOps.concat(). No any conversion rules need to be changed.

#17 Updated by Greg Shah over 4 years ago

In that scenario, the value will be a BaseDataType and if it is the first operand, then the proposed change in the else if would not yield the same results.

#18 Updated by Vladimir Tsichevski over 4 years ago

Greg Shah wrote:

In that scenario, the value will be a BaseDataType and if it is the first operand, then the proposed change in the else if would not yield the same results.

If I get it right, the difference between the safeValue() and getValue() is in how they handle zero characters.

Question: is it possible for an object o, which is BaseDataType, but which is not a character, that the new character(o) expression contains zero characters?

#19 Updated by Greg Shah over 4 years ago

Yes, it would be possible by using get-string() on memptr or raw or with certain forms of string literals. See #2429.

#20 Updated by Vladimir Tsichevski over 4 years ago

Greg Shah wrote:

Yes, it would be possible by using get-string() on memptr or raw or with certain forms of string literals. See #2429.

Ok, this implementation will probably do:

   public static character concat(Object... list)
   {
      StringBuilder sb = new StringBuilder();

      boolean caseSens = false;
      boolean first = true;

      for (Object txt : list)
      {
         while (true)
         {
            if (txt instanceof String)
            {
               sb.append((String) txt);
               break;
            }

            if (txt instanceof Text)
            {
               Text op = (Text) txt;

               // any character or longchar operand that is unknown causes the entire
               // result to be unknown
               if (op.isUnknown())
               {
                  return new character();
               }

               // any character or longchar operand that is case-sensitive causes the
               // entire result to be case-sensitive
               caseSens = (caseSens || op.isCaseSensitive());

               if (txt instanceof character && !first)
               {
                  // the leftmost operand does not truncate at the embedded null, but all
                  // subsequent operands do (the safeValue() call truncates at the embedded null
                  // char)
                  txt = ((character) op).safeValue();
                  continue;
               }

               // first character op and any longchar op (no known null handling is needed)
               txt = op.getValue();
               continue;
            }

            if (txt instanceof BaseDataType)
            {
               txt = new character((BaseDataType) txt);
               continue;
            }

            txt = txt.toString();
         }

         first = false;
      }

      return new character(sb.toString(), caseSens);
   }

#21 Updated by Greg Shah over 4 years ago

Yes, I think this will work correctly.

Although a careful reading of the code is quite clear about the behavior, I think that a quick read of the code might be mis-interpreted. Or the reader will realize that there is some implicit processing going on and will look more closely. For this reason, I'd like to see some implicit processing made more explicit:

  • Add outer: label to the for and inner: label to the while.
  • Qualify each break and continue@ with inner.

Otherwise I think this is quite good.

#22 Updated by Vladimir Tsichevski over 4 years ago

Greg Shah wrote:

Yes, I think this will work correctly.

Although a careful reading of the code is quite clear about the behavior, I think that a quick read of the code might be mis-interpreted. Or the reader will realize that there is some implicit processing going on and will look more closely. For this reason, I'd like to see some implicit processing made more explicit:

  • Add outer: label to the for and inner: label to the while.

The outer: label will never be referenced and cause a compiler warning.

  • Qualify each break and continue@ with inner.

Otherwise I think this is quite good.

With the inner: label the code will look like:

   public static character concat(Object... list)
   {
      StringBuilder sb = new StringBuilder();

      boolean caseSens = false;
      boolean first = true;

      for (Object txt : list)
      {
      inner:
         while (true)
         {
            if (txt instanceof String)
            {
               sb.append((String) txt);
               break inner;
            }

            if (txt instanceof Text)
            {
               Text op = (Text) txt;

               // any character or longchar operand that is unknown causes the entire
               // result to be unknown
               if (op.isUnknown())
               {
                  return new character();
               }

               // any character or longchar operand that is case-sensitive causes the
               // entire result to be case-sensitive
               caseSens = (caseSens || op.isCaseSensitive());

               if (txt instanceof character && !first)
               {
                  // the leftmost operand does not truncate at the embedded null, but all
                  // subsequent operands do (the safeValue() call truncates at the embedded null
                  // char)
                  txt = ((character) op).safeValue();
                  continue inner;
               }

               // first character op and any longchar op (no known null handling is needed)
               txt = op.getValue();
               continue inner;
            }

            if (txt instanceof BaseDataType)
            {
               txt = new character((BaseDataType) txt);
               continue inner;
            }

            txt = txt.toString();
         }

         first = false;
      }

      return new character(sb.toString(), caseSens);
   }

#23 Updated by Greg Shah over 4 years ago

Good point, don't add the outer: label, just add the inner: label and references.

#24 Updated by Vladimir Tsichevski over 4 years ago

Greg Shah wrote:

Good point, don't add the outer: label, just add the inner: label and references.

Is it the proper time now to create a branch for this task and fix it?

#25 Updated by Greg Shah over 4 years ago

I think it is OK to include this in 3820a.

Also available in: Atom PDF