Contents
1. Introduction
1.1 Overview of decoupled architecture
2. A Compilation Model for Decoupled Architectures
2.1 The target architecture
2.2 A dynamic execution model for decoupled architectures
Loss of decoupling events
2.3 The experimental compiler
2.4 Causes of loss of decoupling
Indirect accesses
Computed array indices
I-cache disruption
Control transfers
While loops and premature loop exits
2.5 Placement of LOD events
3. Compiler Effectiveness
3.1 Cost model for decoupled execution
3.2 LOD frequencies in the Perfect Club
3.3 Effect of LOD placement
4. Idiomatic Transformations
4.1 Pre-queueing for indirect accesses
4.2 Loop distribution for computed indices
4.3 IF-conversion
4.4 Subroutine inlining
4.5 Speculative loop dispatch
5. Conclusions
References
A. Dominant LOD points in the Perfect Club
A.1 Examples of IF-conversion in the Perfect Club
A.2 Example of inline decoupling of intrinsics
A.3 Potential for improved decoupling
A.4 Robust hoisting
A.5 The Effect of Subroutine Calls
A.6 Difficult LODs
B. LOD Locality in the Perfect Club
C. Acknowledgements
npt@dcs.ed.ac.uk