Project

General

Profile

Bug #2075

unknown literals in nested expressions

Added by Constantin Asofiei over 11 years ago. Updated over 11 years ago.

Status:
WIP
Priority:
Normal
Target version:
-
Start date:
03/03/2013
Due date:
% Done:

0%

billable:
No
vendor_id:
GCD
case_num:
version_reported:
version_resolved:

ca_upd20130303b.zip (6.56 KB) Constantin Asofiei, 03/03/2013 08:06 AM

History

#1 Updated by Constantin Asofiei over 11 years ago

P2J fails to convert this case:

def var h as handle.
if h:ADD-NEW-FIELD("", "", ?, ?, ?) then message "bla".

but is OK with:
h:ADD-NEW-FIELD("", "", ?, ?, ?).

The root cause is related to the fact that in the first case, we have nested expressions, and the expr rule adds an EXPRESSION parent to the literal only if is the first call. Once it goes in recursion, it will not wrap each child node with an EXPRESSION parent.
The AST in the first case looks like:
   assignment [ASSIGNMENT] @0:0
      : [COLON] @3:2
         h [VAR_HANDLE] @3:1
         ADD-NEW-FIELD [METH_LOGICAL] @3:3
            expression [EXPRESSION] @0:0
               "" [STRING] @3:17
            expression [EXPRESSION] @0:0
               "" [STRING] @3:21
            expression [EXPRESSION] @0:0
               ? [UNKNOWN_VAL] @3:25
            expression [EXPRESSION] @0:0
               ? [UNKNOWN_VAL] @3:28
            expression [EXPRESSION] @0:0
               ? [UNKNOWN_VAL] @3:31

while in the second case looks like:
   statement [STATEMENT] @0:0
      if [KW_IF] @3:1
         expression [EXPRESSION] @0:0
            : [COLON] @3:5
               h [VAR_HANDLE] @3:4
               ADD-NEW-FIELD [METH_LOGICAL] @3:6
                  "" [STRING] @3:20
                  "" [STRING] @3:24
                  ? [UNKNOWN_VAL] @3:28
                  ? [UNKNOWN_VAL] @3:31
                  ? [UNKNOWN_VAL] @3:34

As there is no EXPRESSION parent node for the UNKNOWN_VAL nodes, convert/literals fails to emit new unknown() for them, thus leaving the nodes without a peerid (this is why there is the NPE in methods_attributes).

I'm putting this through conversion regression testing.

#2 Updated by Constantin Asofiei over 11 years ago

The fix has passed conversion regression testing.

#3 Updated by Greg Shah over 11 years ago

Actually, I am worried that the problem is really in how the parser leaves the tree. A pretty important rule about the EXPRESSION node: it should always be the topmost node of an expression and should never be encountered inside an expression. The idea is that you can find the top of every expression using that node. In the ASSIGNMENT case, we have a peculiar approach to matching in the parser, which does not enter the normal top-level expression rules (expr) but instead it enters at lvalue (which does not create an EXPRESSION node, nor does it act like a top-level expression). So when the parameters are encountered, normally we recurse into expr and that recursion is detected and the EXPRESSION node is not inserted again. But in this case, we enter expr for the first time and thus each parameter gets an EXPRESSION node. That is not the way I intended it.

The question: how much code do we have that is dependent upon this unintended result? I prefer to cleanup the parser, but I worry that we have too much code that would break as a result.

Code Review Feedback:

Besides the dependence upon the unintended bad behavior described above, I worry that the approach here seems to always wrap literal parameters of methods. I don't think we want to do that. Note that we only wrap some function parameters (all user-defined funcs and only those built-ins which we have explcitly marked with a classname).

#4 Updated by Constantin Asofiei over 11 years ago

The question: how much code do we have that is dependent upon this unintended result? I prefer to cleanup the parser, but I worry that we have too much code that would break as a result.

OK, I had the hole idea backwards, I thought the parameter nodes needed to be enclosed with an expression, not that the expression node was incorrectly added. The problem with fixing this at the parser is that prog.expression is extensively in the annotations and core code conversion phases. I don't think it will be that easy to check all this prog.expression usage.

Besides the dependence upon the unintended bad behavior described above, I worry that the approach here seems to always wrap literal parameters of methods. I don't think we want to do that. Note that we only wrap some function parameters (all user-defined funcs and only those built-ins which we have explcitly marked with a classname).

The literals change I made are only related to unknown vals. Note that an ? literal set as parameter for a method will always convert to a new unknown() instance, unless we explicitly map the parameter on index idx for method mth with the correct classname annotation. Thus, my change solves only the problem of converting the unknown literal to the new unknown() instance (and allowing the conversion to pass). To fully solve it at compile time too, we will need to add the classname annotation explicitly, so that an instance of the correct wrapper type (based on the parameter's real type) will be emitted.

#5 Updated by Greg Shah over 11 years ago

Go ahead and commit the change and distribute it.

If I understand properly, we are about to have compile issues with the code the way it is emitted now, right? I have already been thinking that we would need to have knowledge of the signatures of all built-in functions for a while. Now, we can add methods to this list too.

One of the things I was interested in using the signature data for was during expression typing (we could know exactly the type that should be used for each parm in the ECW). But you're right that we will also need this for emitting unknown as the proper type.

#6 Updated by Constantin Asofiei over 11 years ago

Go ahead and commit the change and distribute it.

Committed to bzr revision 10237.

If I understand properly, we are about to have compile issues with the code the way it is emitted now, right?

Correct, new unknown() will be compile time error, when emitted for a parameter. That's why I've added those warnings too.

#7 Updated by Constantin Asofiei over 11 years ago

One of the things I was interested in using the signature data for was during expression typing (we could know exactly the type that should be used for each parm in the ECW). But you're right that we will also need this for emitting unknown as the proper type.

As a side note: in #2068, note 18, issue 3: if _POLY is passed to the first parameter for LENGTH function, then we can't just wrap this using character, because the _POLY might return a longchar. This is because there might be code-page related info which might get lost during conversion from longchar to char (and viceversa). Instead, if _POLY expression returns a Text (character or longchar), P2J should leave it as is; _POLY should be wrapped using character only if it returns a non-Text value (sure, this needs to be proved first). I think this is the only case where we can't just wrap _POLY expressions, we need to check their actual type too.

#8 Updated by Greg Shah over 11 years ago

See #2053 where we are creating a parameter type lookup facility for builtin funcs and methods.

Also available in: Atom PDF