@@ -35,7 +35,7 @@ the keywords in to an action.

\section{Design}

-There are two general types of of optimizations of the assembly code, global
+There are two general types of optimizations of the assembly code, global
optimizations and optimizations on a so-called basic block. These optimizations
will be discussed separately

@@ -99,6 +99,16 @@ Appendix \ref{opt}.
A more advanced optimization is common subexpression elimination. This means
that expensive operations such as a multiplication or addition are performed only
once and the result is then `copied' into variables where needed.
+\begin{verbatim}
+
+addu $2,$4,$3         addu $t1, $4, $3
+...                   move $2, $t1
+...               ->  ...
+...                   ...
+addu $5,$4,$3         move $5, $t1
+
+\end{verbatim}
+

A standard method for doing this is the creation of a DAG or Directed Acyclic
Graph. However, this requires a fairly advanced implementation. Our
@@ -112,27 +122,13 @@ We now add the instruction above the first use, and write the result in a new
variable. Then all occurrences of this expression can be replaced by a move
from the new variable into the original destination variable of the instruction.

-This is a less efficient method then the DAG, but because the basic blocks are
+This is a less efficient method than the DAG, but because the basic blocks are
in general not very large and the execution time of the optimizer is not a
primary concern, this is not a big problem.
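+To make this concrete, the following is a rough sketch of how such a scan over
+one basic block could look. Python is assumed as the implementation language
+here, and the \texttt{Statement} fields and helper names are illustrative
+rather than the real interface:
+\begin{verbatim}
+class Statement:
+    # minimal stand-in for the Statement object described in the
+    # Implementation section: a type, a name and an argument list
+    def __init__(self, stype, name, args=None):
+        self.type, self.name, self.args = stype, name, args or []
+
+def eliminate_common_subexpression(block, temp='$t8'):
+    # scan one basic block for instructions computing the same expression;
+    # 'temp' stands for a register that is assumed to be free here
+    for i, stmt in enumerate(block):
+        if stmt.name not in ('addu', 'mult'):
+            continue
+        sources, uses = stmt.args[1:], [stmt]
+        for later in block[i + 1:]:
+            if later.name == stmt.name and later.args[1:] == sources:
+                uses.append(later)
+            if later.args and later.args[0] in sources:
+                break              # a source operand is overwritten
+        if len(uses) < 2:
+            continue
+        # compute the expression once, above the first use ...
+        block.insert(i, Statement('command', stmt.name, [temp] + sources))
+        # ... and replace every occurrence by a move from the new variable
+        for use in uses:
+            use.name, use.args = 'move', [use.args[0], temp]
+        return True                # one rewrite per call keeps it simple
+    return False
+\end{verbatim}
+A real pass would of course have to pick a fresh temporary for every rewritten
+expression and check that it is actually free in the block.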

-\subsubsection*{Constant folding}
+\subsubsection*{Fold constants}

-Another optimization is to do constant folding. Constant folding is replacing
-a expensive step like addition with a more simple step like loading a constant.
-Of course, this is not always possible. It is possible in cases where you apply
-an operation on two constants, or a constant and a variable of which you know
-for sure that it always has a certain value at that point. For example:
-\begin{verbatim}
-li $regA, 1 li $regA, 1
-addu $regB, $regA, 2 -> li $regB, 3
-\end{verbatim}
-Of course, if \texttt{\$regA} is not used after this, it can be removed, which
-will be done by the dead code elimination.

-One problem we encountered with this is that the use of a \texttt{li} is that
-the program often also stores this in the memory, so we had to check whether
-this was necessary here as well.

\subsubsection*{Copy propagation}

@@ -159,11 +155,12 @@ of the move operation.

An example would be the following:
\begin{verbatim}
-move $regA, $regB move $regA, $regB
-... ...
-Code not writing $regA, $regB -> ...
-... ...
-addu $regC, $regA, ... addu $regC, $regB, ...
+move $regA, $regB           move $regA, $regB
+...                         ...
+Code not writing $regA,  -> ...
+$regB                       ...
+...                         ...
+addu $regC, $regA, ...      addu $regC, $regB, ...
\end{verbatim}
This code shows that \texttt{\$regA} is replaced with \texttt{\$regB}. This
way, the move instruction might have become useless, and it will then be
@@ -171,18 +168,7 @@ removed by the dead code elimination.
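+A sketch of how this replacement could be implemented for a single basic block
+follows (Python assumed; treating the first argument of a statement as the
+written register and the rest as read registers is a simplification):
+\begin{verbatim}
+def propagate_copies(block):
+    # for every 'move dst, src', rewrite later reads of dst into src until
+    # either dst or src is written again (the "code not writing" condition)
+    for i, stmt in enumerate(block):
+        if stmt.name != 'move' or len(stmt.args) != 2:
+            continue
+        dst, src = stmt.args
+        for later in block[i + 1:]:
+            later.args[1:] = [src if arg == dst else arg
+                              for arg in later.args[1:]]
+            if later.args and later.args[0] in (dst, src):
+                break
+\end{verbatim}
+After such a pass the original move may have become useless, in which case the
+dead code elimination mentioned above removes it.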

\subsubsection*{Algebraic transformations}

-Some expression can easily be replaced with more simple once if you look at
-what they are saying algebraically. An example is the statement $x = y + 0$, or
-in Assembly \texttt{addu \$1, \$2, 0}. This can easily be changed into $x = y$
-or \texttt{move \$1, \$2}.
-
-Another case is the multiplication with a power of two. This can be done way
-more efficiently by shifting left a number of times. An example:
-\texttt{mult \$regA, \$regB, 4 -> sll \$regA, \$regB, 2}. We perform this
-optimization for any multiplication with a power of two.

-There are a number of such cases, all of which are once again stated in
-appendix \ref{opt}.

\section{Implementation}

@@ -206,7 +192,7 @@ languages like we should do otherwise since Lex and Yacc are coupled with C.

The decision was made to not recognize exactly every possible instruction in
the parser, but only if something is for example a command, a comment or a gcc
-directive. We then transform per line to a object called a Statement. A
+directive. We then transform each line into an object called a Statement. A
statement has a type, a name and optionally a list of arguments. These
statements together form a statement list, which is placed in another object
called a Block. In the beginning there is one block for the entire program, but
@@ -219,7 +205,7 @@ The optimizations are done in two different steps. First the global
optimizations are performed, which are only the optimizations on branch-jump
constructions. This is done repeatedly until there are no more changes.
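+Expressed in Python (assumed here as the implementation language), this is a
+small fixed-point loop; the function name is ours and purely illustrative:
+\begin{verbatim}
+def run_until_stable(statements, optimization):
+    # apply one optimization, a function that returns True as long as it
+    # still changes something, until a pass makes no more changes
+    changed = True
+    while changed:
+        changed = optimization(statements)
+\end{verbatim}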

-After all possible global optimizations are done, the program is separated into
+After all possible global optimizations are done, the program is separated into
basic blocks. The algorithm to do this is described earlier, and means all
jump and branch instructions are called leaders, as are their targets. A basic
block then goes from leader to leader.
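+A sketch of this splitting step, again in (assumed) Python with illustrative
+\texttt{Statement} fields and an illustrative list of mnemonics:
+\begin{verbatim}
+BRANCH_JUMP = ('j', 'jal', 'jr', 'beq', 'bne', 'blez', 'bgtz', 'bltz', 'bgez')
+
+def find_leaders(statements):
+    # every jump/branch instruction is a leader, and so is every label
+    # that is the target of one
+    targets = set(s.args[-1] for s in statements
+                  if s.name in BRANCH_JUMP and s.args)
+    return set(i for i, s in enumerate(statements)
+               if s.name in BRANCH_JUMP
+               or (s.type == 'label' and s.name in targets))
+
+def split_into_blocks(statements):
+    # a basic block runs from one leader up to the next leader
+    leaders, blocks, current = find_leaders(statements), [], []
+    for i, s in enumerate(statements):
+        if i in leaders and current:
+            blocks.append(current)
+            current = []
+        current.append(s)
+    if current:
+        blocks.append(current)
+    return blocks
+\end{verbatim}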
@@ -231,7 +217,8 @@ steps can be done to optimize something.
\subsection{Writing}

Once all the optimizations have been done, the IR needs to be rewritten into
-Assembly code, so the xgcc cross compiler can make binary code out of it.
+Assembly code. After this step the xgcc cross compiler can make binary code from
+the generated Assembly code.
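+As an indication, writing one statement of the IR back to a line of Assembly
+text could look roughly as follows (Python assumed; the type names used here
+are illustrative):
+\begin{verbatim}
+def write_statement(statement):
+    # turn one Statement back into a line of assembly text
+    if statement.type == 'label':
+        return statement.name + ':'
+    if statement.type == 'comment':
+        return '# ' + statement.name
+    # an ordinary instruction or directive: the name plus its arguments
+    return '\t' + statement.name + '\t' + ', '.join(statement.args)
+\end{verbatim}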

The writer expects a list of statements, so first the blocks have to be
concatenated again into a list. After this is done, the list is passed on to
|