Taddeüs Kroes / peephole / Commits / b031248e

Commit b031248e, authored Dec 29, 2011 by Jayke Meijer

Merge branch 'master' of github.com:taddeus/peephole

Parents: 6623d7e7, 53d3416c

Showing 4 changed files with 108 additions and 48 deletions
benchmarks/build/test.s     +55 -0
benchmarks/test.c           +7 -0
report/report.tex           +23 -36
src/optimize/advanced.py    +23 -12
benchmarks/build/test.s (0 → 100644) @ b031248e
    .file   1 "test.c"

 # GNU C 2.7.2.3 [AL 1.1, MM 40, tma 0.1] SimpleScalar running sstrix compiled by GNU C

 # Cc1 defaults:
 # -mgas -mgpOPT

 # Cc1 arguments (-G value = 8, Cpu = default, ISA = 1):
 # -quiet -dumpbase -O0 -o

gcc2_compiled.:
__gnu_compiled_c:
    .text
    .align  2
    .globl  main

    .text

    .loc    1 3
    .ent    main
main:
    .frame  $fp,48,$31      # vars= 24, regs= 2/0, args= 16, extra= 0
    .mask   0xc0000000,-4
    .fmask  0x00000000,0
    subu    $sp,$sp,48
    sw      $31,44($sp)
    sw      $fp,40($sp)
    move    $fp,$sp
    jal     __main
    li      $2,0x00000002   # 2
    sw      $2,16($fp)
    li      $2,0x00000005   # 5
    sw      $2,20($fp)
    lw      $2,16($fp)
    lw      $3,20($fp)
    mult    $2,$3
    mflo    $2
    sw      $2,24($fp)
    lw      $2,16($fp)
    move    $4,$2
    sll     $3,$4,1
    addu    $3,$3,$2
    sll     $2,$3,1
    sw      $2,28($fp)
    li      $2,0x00000015   # 21
    sw      $2,32($fp)
    move    $2,$0
    j       $L1
$L1:
    move    $sp,$fp         # sp not trusted here
    lw      $31,44($sp)
    lw      $fp,40($sp)
    addu    $sp,$sp,48
    j       $31
    .end    main
benchmarks/test.c (0 → 100644) @ b031248e
#include <stdio.h>

int main(void)
{
    int a = 2, b = 5, c = a * b, d = a * 6, e = 3 * 7;
    return 0;
}
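Reading test.c next to the generated test.s: even at -O0 the compiler loads `e = 3 * 7` as the immediate 0x15 and computes `d = a * 6` with a shift, an add and a second shift, while only `c = a * b` goes through `mult`/`mflo`. The short Python sketch below replays those three instructions to check the arithmetic. The function name `times_six_shift_add` is hypothetical, and registers are modelled as plain integers with 32-bit wraparound, which is an assumption of this sketch.

```python
MASK32 = 0xFFFFFFFF  # assumed register width: 32 bits


def times_six_shift_add(a):
    """Replay `sll $3,$4,1; addu $3,$3,$2; sll $2,$3,1` from test.s."""
    r2 = a & MASK32
    r4 = r2                    # move $4,$2
    r3 = (r4 << 1) & MASK32    # sll  $3,$4,1   -> 2 * a
    r3 = (r3 + r2) & MASK32    # addu $3,$3,$2  -> 3 * a
    r2 = (r3 << 1) & MASK32    # sll  $2,$3,1   -> 6 * a
    return r2


# d = a * 6 with a = 2, as in test.c
assert times_six_shift_add(2) == 12
# e = 3 * 7 matches the immediate in `li $2,0x00000015`
assert 3 * 7 == 0x15
```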
report/report.tex @ b031248e
...
...
@@ -35,7 +35,7 @@ the keywords in to an action.
\section{Design}
There are two general types of of optimizations of the assembly code, global
There are two general types of optimizations of the assembly code, global
optimizations and optimizations on a so-called basic block. These optimizations
will be discussed separately
...
...
@@ -99,6 +99,16 @@ Appendix \ref{opt}.
A more advanced optimization is common subexpression elimination. This means
that expensive operations as a multiplication or addition are performed only
once and the result is then `copied' into variables where needed.
\begin{verbatim}
addu $2, $4, $3              addu = $t1, $4, $3
...                          mov = $2, $t1
...                   ->     ...
...                          ...
addu $5, $4, $3              mov = $4, $t1
\end{verbatim}
A standard method for doing this is the creation of a DAG or Directed Acyclic
Graph. However, this requires a fairly advanced implementation. Our
...
...
@@ -112,27 +122,13 @@ We now add the instruction above the first use, and write the result in a new
variable. Then all occurrences of this expression can be replaced by a move of
from new variable into the original destination variable of the instruction.
This is a less efficient method then the DAG, but because the basic blocks are
This is a less efficient method then the dag, but because the basic blocks are
in general not very large and the execution time of the optimizer is not a
primary concern, this is not a big problem.
\subsubsection*{Constant folding}
\subsubsection*{Fold constants}
Another optimization is to do constant folding. Constant folding is replacing
a expensive step like addition with a more simple step like loading a constant.
Of course, this is not always possible. It is possible in cases where you apply
an operation on two constants, or a constant and a variable of which you know
for sure that it always has a certain value at that point. For example:
\begin{verbatim}
li   $regA, 1                   li $regA, 1
addu $regB, $regA, 2     ->     li $regB, 3
\end{verbatim}
Of course, if \texttt{\$regA} is not used after this, it can be removed, which
will be done by the dead code elimination.
One problem we encountered with this is that the use of a \texttt{li} is that
the program often also stores this in the memory, so we had to check whether
this was necessary here as well.
\subsubsection*{Copy propagation}
...
...
@@ -159,11 +155,12 @@ of the move operation.
An example would be the following:
\begin{verbatim}
move $regA, $regB                    move $regA, $regB
...                                  ...
Code not writing $regA, $regB   ->   ...
...                                  ...
addu $regC, $regA, ...               addu $regC, $regB, ...
move $regA, $regB                    move $regA, $regB
...                                  ...
Code not writing $regA,         ->   ...
$regB                                ...
...                                  ...
addu $regC, $regA, ...               addu $regC, $regB, ...
\end{verbatim}
This code shows that \texttt{\$regA} is replaced with \texttt{\$regB}. This
way, the move instruction might have become useless, and it will then be
...
...
@@ -171,18 +168,7 @@ removed by the dead code elimination.
\subsubsection*{Algebraic transformations}
Some expression can easily be replaced with more simple once if you look at
what they are saying algebraically. An example is the statement $x = y + 0$, or
in Assembly \texttt{addu \$1, \$2, 0}. This can easily be changed into $x = y$
or \texttt{move \$1, \$2}.
Another case is the multiplication with a power of two. This can be done way
more efficiently by shifting left a number of times. An example:
\texttt{mult \$regA, \$regB, 4 -> sll \$regA, \$regB, 2}. We perform this
optimization for any multiplication with a power of two.
There are a number of such cases, all of which are once again stated in
appendix \ref{opt}.
\section{Implementation}
...
...
@@ -206,7 +192,7 @@ languages like we should do otherwise since Lex and Yacc are coupled with C.
The decision was made to not recognize exactly every possible instruction in
the parser, but only if something is for example a command, a comment or a gcc
directive. We then transform per line to a object called a Statement. A
directive. We then transform per line to an object called a Statement. A
statement has a type, a name and optionally a list of arguments. These
statements together form a statement list, which is placed in another object
called a Block. In the beginning there is one block for the entire program, but
...
...
@@ -219,7 +205,7 @@ The optimizations are done in two different steps. First the global
optimizations are performed, which are only the optimizations on branch-jump
constructions. This is done repeatedly until there are no more changes.
After all possible global optimizations are done, the program is separated into
After all possible global optimizations are done, the program is seperated into
basic blocks. The algorithm to do this is described earlier, and means all
jump and branch instructions are called leaders, as are their targets. A basic
block then goes from leader to leader.
...
...
@@ -231,7 +217,8 @@ steps can be done to optimize something.
\subsection{Writing}
Once all the optimizations have been done, the IR needs to be rewritten into
Assembly code, so the xgcc cross compiler can make binary code out of it.
Assembly code. After this step the xgcc crosscompiler can make binary code from
the generated Assembly code.
The writer expects a list of statements, so first the blocks have to be
concatenated again into a list. After this is done, the list is passed on to
...
...
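The power-of-two case from the algebraic transformations text in report.tex can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's implementation: `mult_to_shift` is a hypothetical helper, statements are modelled as plain tuples, and the three-operand `mult` with an immediate factor follows the report's own example notation.

```python
def mult_to_shift(dest, src, factor):
    """Return an `sll` tuple for dest = src * factor.

    Returns None when factor is not a positive power of two, in which
    case the multiplication is left as it is.
    """
    # A positive power of two has exactly one bit set.
    if factor <= 0 or factor & (factor - 1) != 0:
        return None
    shift = factor.bit_length() - 1  # e.g. 4 -> 2
    return ('sll', dest, src, shift)


# The report's example: mult $regA, $regB, 4 -> sll $regA, $regB, 2
assert mult_to_shift('$regA', '$regB', 4) == ('sll', '$regA', '$regB', 2)
# 6 is not a power of two, so nothing is rewritten
assert mult_to_shift('$regA', '$regB', 6) is None
```

A factor of 1 yields a shift by 0, which a later pass could treat as a plain move.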
src/optimize/advanced.py @ b031248e
...
...
@@ -147,10 +147,27 @@ def fold_constants(block):
        elif s.name == 'lw' and s[1] in constants:
            # Usage of variable with constant value
            register[s[0]] = constants[s[1]]
        elif s.name in ['addu', 'subu', 'mult', 'div']:
            # TODO: implement 'mult' optimization
            # Calculation with constants
            rd, rs, rt = s[0], s[1], s[2]
        elif s.name == 'mflo':
            # Move of `Lo' register to another register
            register[s[0]] = register['Lo']
        elif s.name == 'mfhi':
            # Move of `Hi' register to another register
            register[s[0]] = register['Hi']
        elif s.name in ['mult', 'div'] \
                and s[0] in register and s[1] in register:
            # Multiplication/division with constants
            rs, rt = s

            if s.name == 'mult':
                binary = bin(register[rs] * register[rt])[2:]
                binary = '0' * (64 - len(binary)) + binary
                register['Hi'] = int(binary[:32], base=2)
                register['Lo'] = int(binary[32:], base=2)
            elif s.name == 'div':
                register['Lo'], register['Hi'] = divmod(rs, rt)
        elif s.name in ['addu', 'subu']:
            # Addition/subtraction with constants
            rd, rs, rt = s
            rs_known = rs in register
            rt_known = rt in register
...
...
@@ -167,22 +184,16 @@ def fold_constants(block):
                if s.name == 'subu':
                    result = to_hex(rs_val - rt_val)
                if s.name == 'mult':
                    result = to_hex(rs_val * rt_val)
                if s.name == 'div':
                    result = to_hex(rs_val / rt_val)
                block.replace(1, [S('command', 'li', rd, result)])
                register[rd] = result
                changed = True
            elif rt_known:
                # c = 10 -> b = a + 10
                # a = 10 -> b = c + 10
                # b = c + a
                s[2] = register[rt]
                changed = True
            elif rs_known and s.name == 'addu':
                # a = 10 -> b = c + 10
                # c = 10 -> b = a + 10
                # b = c + a
                s[1] = rt
                s[2] = register[rs]
...
...
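The `mult` branch in the diff pads the product to a 64-character binary string and slices it to fill `Hi` and `Lo`. The same split can be sketched with a shift and a mask. This is an illustrative alternative, not the committed code: `fold_mult`, `fold_div` and `split_via_string` are hypothetical helpers, the operands are assumed to be non-negative Python integers that have already been looked up in the `register` dictionary, and `fold_div` assumes a non-zero divisor.

```python
MASK32 = 0xFFFFFFFF  # lower 32 bits


def fold_mult(rs_val, rt_val):
    """Return (hi, lo): the upper and lower 32 bits of the 64-bit product."""
    product = rs_val * rt_val
    return (product >> 32) & MASK32, product & MASK32


def fold_div(rs_val, rt_val):
    """Return (hi, lo) as MIPS `div` leaves them: remainder in Hi, quotient in Lo."""
    quotient, remainder = divmod(rs_val, rt_val)
    return remainder, quotient


def split_via_string(rs_val, rt_val):
    """String-based split, mirroring the approach used in the diff."""
    binary = bin(rs_val * rt_val)[2:]
    binary = '0' * (64 - len(binary)) + binary
    return int(binary[:32], base=2), int(binary[32:], base=2)


# Both approaches agree for the values from benchmarks/test.c (a = 2, b = 5).
assert fold_mult(2, 5) == split_via_string(2, 5) == (0, 10)
```

One observation on the diff itself: the `div` branch passes `rs` and `rt` directly to `divmod`, whereas the `mult` branch looks them up in `register` first. If `rs` and `rt` are register names at that point, the `div` branch would presumably need the same lookup.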