Project

General

Profile

Bug #5294

parsing problems for assign-style statements with a OO chained lvalue

Added by Constantin Asofiei about 3 years ago. Updated over 2 years ago.

Status:
Closed
Priority:
Normal
Assignee:
-
Target version:
-
Start date:
Due date:
% Done:

100%

billable:
No
vendor_id:
GCD
case_num:
version_reported:
version_resolved:

History

#1 Updated by Constantin Asofiei about 3 years ago

In OE, is possible to use an OO property or variable as the lvalue for an assign-style-statement. FWD can't parse these cases:

overlay(o:c1, 1, 2) = "rde".
set-byte-order(o:m1) = 1.
set-pointer-value(o:m1) = 1.
set-size(o:m1) = 11.

where o:c1 is a character property and o:m1 is a memptr property.

#2 Updated by Constantin Asofiei about 3 years ago

Greg, the parser patch is this:

### Eclipse Workspace Patch 1.0
#P p2j
Index: src/com/goldencode/p2j/uast/progress.g
===================================================================
--- src/com/goldencode/p2j/uast/progress.g    (revision 2860)
+++ src/com/goldencode/p2j/uast/progress.g    (working copy)
@@ -6377,7 +6377,7 @@
              type == KW_PUT_UL   || type == KW_PUT_USHT ||
              type == KW_RAW      || type == KW_SET_B_OR ||
              type == KW_SET_PTR  || type == KW_SET_SZ   ||
-             type == KW_SUBSTR;
+             type == KW_SUBSTR   || type == KW_OVERLAY;
    }

    /**
@@ -15027,7 +15027,7 @@
  */
 set_size_stmt
    :
-      KW_SET_SZ^ lparens lvalue rparens
+      KW_SET_SZ^ lparens chained_object_members rparens
    ;

 /**
@@ -15038,7 +15038,7 @@
  */
 set_byte_order_stmt
    :
-      KW_SET_B_OR^ lparens lvalue rparens
+      KW_SET_B_OR^ lparens chained_object_members rparens
    ;

 /**
@@ -15049,7 +15049,7 @@
  */
 set_pointer_value_stmt
    :
-      KW_SET_PTR^ lparens lvalue rparens
+      KW_SET_PTR^ lparens chained_object_members rparens
    ;

 /**
@@ -19879,7 +19879,7 @@
  */   
 overlay_stmt   
    :   
-      KW_OVERLAY^ lparens lvalue comma expr (comma expr (comma expr)? )? rparens
+      KW_OVERLAY^ lparens chained_object_members comma expr (comma expr (comma expr)? )? rparens
    ;   

 /**   

My question is that this rule:

assign_type_syntax_stmt_list   
   :
      (
           current_value_stmt
         | dynamic_current_value_stmt
         | dynamic_new_stmt
         | entry_stmt
         | extent_stmt
         | fix_codepage_stmt
         | length_stmt
         | overlay_stmt
         | put_memptr_stmt
         | raw_stmt
         | set_byte_order_stmt
         | set_pointer_value_stmt
         | set_size_stmt
         | substring_stmt
      )
   ;

will only be called for tokens which are not also a function. For example, SUBSTRING (even for a case of a assign-style statement) will be parsed as a function call and fixed after. This assign_type_syntax_stmt_list rule will never be reached for SUBSTRING and the other cases which exist as a function, too. So the substring_stmt rules are never reached and can be concluded as 'dead code'. Should I remove them?

Otherwise, I'll add a comment to assign_type_syntax_stmt_list.

#3 Updated by Greg Shah about 3 years ago

Code Review Eclipse Diff in #5294-2

1. The lvalue to chained_object_members change makes sense. I suspect there are many other lvalue locations in the parser that will also have this issue (but we won't worry about them now).

2. The change in isLstmt() seems unnecessary. The type == KW_OVERLAY already exists there.

3. In regard to this:

assign_type_syntax_stmt_list will only be called for tokens which are not also a function.

I think you are right. I guess this is why you didn't fix the extent_stmt, length_stmt, put_memptr_stmt, raw_stmt or substring_stmt to use chained_object_members instead of lvalue.

So the substring_stmt rules are never reached and can be concluded as 'dead code'. Should I remove them?

Right now this is just a quirk of the implementation. And it is a kind of hidden quirk. I'm worried that future changes may expose this or break it. For example, I haven't checked if the possible signatures of the builtin functions always exactly match the possible lvalue cases for the matching language statement. If these have differences, then in a future revision where we implement checking for the builtin function signatures, this might break. For now, please comment out these rules (and rule references) and leave a comment in assign_type_syntax_stmt_list that explains why these are dead.

#5 Updated by Constantin Asofiei over 2 years ago

Greg, how do I comment the rule in antlr? I can't neither with /**/ or with //.

These rules need to be commented:

 * <ul>
 *    <li> {@link #entry_stmt}
 *    <li> {@link #extent_stmt}
 *    <li> {@link #length_stmt}
 *    <li> {@link #raw_stmt}
 *    <li> {@link #substring_stmt}
 * </ul>

#6 Updated by Greg Shah over 2 years ago

There is no way to comment out the rule definition itself. You can use /* */ or // comments inside a rule to remove things. You can remove the references to that rule, making it uncallable.

#7 Updated by Constantin Asofiei over 2 years ago

  • Status changed from New to WIP
  • % Done changed from 0 to 100

Thanks, the patch is in 3821c/13690.

#8 Updated by Constantin Asofiei over 2 years ago

  • Status changed from WIP to Review

#9 Updated by Greg Shah over 2 years ago

  • Status changed from Review to Closed

Code Review Task Branch 3821c Revision 13690

No objections.

Also available in: Atom PDF