desug.mli 5.0 KB

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156
  1. (** Desugaring: split variable initialisations, generate array dimension
  2. variables, and transform for-loops to while-loops. *)
  3. (** {2 Split variable initialisations}
  4. Variable initialisations are split into declarations and assignments, to
  5. generalize the AST format. This makes context analysis easier, since
  6. initialisations need not be considered. The assignments are placed in the
  7. function body, after local fuction declarations (which are not in the
  8. example below). Initialisations fo global variables are moved to a new
  9. function called "__init", which is a reserved function that is called by the
  10. VM before the main function is called.
  11. {v int glob = 1;
  12. void foo() \{
  13. int foo = 0;
  14. int bar = foo;
  15. \} v}
  16. becomes:
  17. {v export void __init() \{
  18. glob = 1;
  19. \}
  20. int glob;
  21. void foo() \{
  22. int foo;
  23. int bar;
  24. foo = 0;
  25. bar = foo;
  26. \} v}
  27. {3 Array initialisations}
  28. A more complex class of initialisations is that of array initialisations.
  29. Arrays can be initialised to a scalar value or to an array literal in
  30. bracket notation. A scalar value is rewritten to a nested for-loop over all
  31. array dmensions, with an assignment in the most nested loop. An array
  32. constant is rewritten to a series of separate assign statements to the
  33. corresponding array indices. The following example shows both
  34. transformations:
  35. {v void foo() \{
  36. int[3] a = 4;
  37. int[2, 2] b = [[3, 4], [5, 6]];
  38. \} v}
  39. This is transformed into:
  40. {v void foo() \{
  41. int[3] a;
  42. int[2, 2] b;
  43. a = __allocate(3);
  44. for (int i = 0, 3) \{
  45. a[i] = 4;
  46. \}
  47. b = __allocate(2, 2);
  48. b[0, 0] = 1;
  49. b[0, 1] = 2;
  50. b[1, 0] = 3;
  51. b[1, 1] = 4;
  52. \} v}
  53. Actually, the dimensions of [a] and [b] should have been replaced by new
  54. variables earlier for reasons described below, but this is not done in the
  55. example to maintain readability.
  56. Note that array constants in bracket expressions must have a nesting level
  57. that is equal to the number of array dimensions, else an error will occur.
  58. {2 Move array dimensions and scalars into new variables}
  59. In the following code:
  60. {v int twos = 0;
  61. int two() \{
  62. twos = twos + 1;
  63. return 2;
  64. \}
  65. void foo() \{
  66. int[2, two()] a = two();
  67. printInt(twos);
  68. printInt(a[1, 1]);
  69. \} v}
  70. [two()] must be evaluated exactly twice in order to preserve correct
  71. behaviour. However, the scalar inialisation will be transformed into a
  72. for-loop, evaluating [two()] in each loop iteration. Moreover, [a[1, 1]]
  73. would at a later stage (during array dimension reduction) be transformed
  74. into [a[(1 * two()) + 1]], thus incorrectly evaluating [two()] an additional
  75. time.
  76. The problem is solved by generating new variables for scalar initialisations
  77. and array dimensions, and replacing the original expression with the
  78. generated variables. Note that these variables are marked so-called
  79. "constant variables" since they are known to be assigned exactly once, and
  80. thus likely optimizable by {!Constprop}. This way, only the non-constant
  81. expressions are defined in new variables in the final code.
  82. In the example above, [int[2, two()] a = two();] is transformed as follows:
  83. {v ...
  84. int _a_0_ = 2; // 2 will be propagated back by constant propagation
  85. int _a_1_ = two();
  86. int _scalar_1_ = two();
  87. int[_a_0_, _a_1_] a = _scalar_1_;
  88. ... v}
  89. resulting in:
  90. {v ...
  91. int _a_0_;
  92. int _a_1_;
  93. int _scalar_1_;
  94. int[_a_0_, _a_1_] a;
  95. _a_0_ = 2;
  96. _a_1_ = two();
  97. _scalar_1_ = two();
  98. a := <allocate>(_a_0_, _a_1_);
  99. for (int _i_2 = 0, _a_0_) \{
  100. for (int _i_3 = 0, _a_1_) \{
  101. a[_i_2, _i_3] = _scalar_1_;
  102. \}
  103. \}
  104. ... v}
  105. The transformation described above is applied to all array definitions,
  106. including extern arrays. Although dimensions of extern arrays are not
  107. expressions (but identifiers), the transformation is necessary in order to
  108. generate consistent names to be imported/exported. E.g. in [extern int[n]
  109. a], [n] is just a name given locally to the first dimension of [a].
  110. Therefore it is transformed into:
  111. {v extern int _a_0_;
  112. extern int[_a_0_] a; v}
  113. Also, all occurrences of [n] in the rest of the module are replaced by
  114. [_a_0_]. For exported arrays, the generated dimension variables need to be
  115. exported as well.
  116. {2 Transforming for-loops to while-loops}
  117. {v for (int i = <start>, <stop>, <step>) \{
  118. <body>
  119. \} v}
  120. is transformed into:
  121. {v _i_1 = <start>;
  122. _step_2 = <step>;
  123. _stop_3 = <stop>;
  124. while ((_step_2 > 0) ? (_i_1 < _stop_3) : (_i_1 > _stop_3)) \{
  125. <body>
  126. _i_1 = _i_1 + _step_2;
  127. \} v}
  128. Here, [_i_1], [_step_2] and [_stop_3] are fresh variables. Definitions of
  129. these new variables are added to the scope of the current function. Every
  130. occurrence of [i] in [<body>] is replaced with the fresh variable [_i_1],
  131. this prevents problems with nested for-loops that use the same induction
  132. variable.
  133. *)
  134. (** Main phase function, called by {!Main}. *)
  135. val phase : Main.phase_func