Source Code Optimization Techniques for Data Flow Dominated Embedded Software
Falk, Heiko, Marwedel, Peter
2004, XX, 226 p.
This book focuses on source-to-source code transformations that remove addressing-related overhead present in most multimedia or signal processing application programs. This approach is complementary to existing compiler technology. What is particularly attractive about the transformation flow presented here is that its behavior is nearly independent of the target processor platform and the underlying compiler. Hence, the different source code transformations developed here lead to impressive performance improvements on most existing processor architecture styles, ranging from RISCs like ARM7 or MIPS through superscalars like Intel Pentium, PowerPC, DEC Alpha, Sun and HP, to VLIW DSPs like TI C6x and Philips TriMedia. The source code did not have to be modified between processors to obtain these results. Apart from the performance improvements, the estimated energy is also significantly reduced for a given application run. These results were not obtained for academic codes but for realistic and representative applications, all selected from the multimedia domain. That shows the industrial relevance and importance of this research. At the same time, the scientific novelty and quality of the contributions have led to several excellent papers that have been published in internationally renowned conferences such as DATE. This book is hence of interest for academic researchers, both because of the overall description of the methodology and related work context and for the detailed descriptions of the compilation techniques and algorithms.
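To give a rough feel for what a source-level transformation of this kind can look like, the following hand-written C sketch moves an index-dependent condition out of an inner loop, loosely in the spirit of the loop nest splitting named in the table of contents. It is not taken from the book: the function names, the array dimensions, and the `BORDER` constant are all hypothetical and chosen only for illustration.

```c
#define ROWS 6
#define COLS 8
#define BORDER 2 /* hypothetical threshold: rows below it are weighted double */

/* Original form: the test on r is re-evaluated in every inner iteration. */
long sum_original(int img[ROWS][COLS]) {
    long acc = 0;
    for (int r = 0; r < ROWS; r++) {
        for (int c = 0; c < COLS; c++) {
            if (r >= BORDER)
                acc += img[r][c];
            else
                acc += 2 * img[r][c];
        }
    }
    return acc;
}

/* Transformed form: the test runs once per row and selects one of two
 * condition-free copies of the inner loop. */
long sum_split(int img[ROWS][COLS]) {
    long acc = 0;
    for (int r = 0; r < ROWS; r++) {
        if (r >= BORDER) {
            for (int c = 0; c < COLS; c++)
                acc += img[r][c];
        } else {
            for (int c = 0; c < COLS; c++)
                acc += 2 * img[r][c];
        }
    }
    return acc;
}

/* Fills a small test image and reports whether both versions agree. */
int sums_agree(void) {
    int img[ROWS][COLS];
    for (int r = 0; r < ROWS; r++)
        for (int c = 0; c < COLS; c++)
            img[r][c] = r * COLS + c;
    return sum_original(img) == sum_split(img);
}
```

Both functions compute the same result. The second evaluates the condition `ROWS` times instead of `ROWS * COLS` times, at the cost of a duplicated inner loop and therefore larger code.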
List Of Figures. List Of Tables. Acknowledgments. Foreword.
1. Introduction. 1.1. Why Source Code Optimization? 1.1.1. Abstraction Levels Of Code Optimization. 1.1.2. Survey Of The Traditional Code Optimization Process. 1.1.3. Scopes For Code Optimization. 1.2. Target Application Domain. 1.3. Goals And Contributions. 1.4. Outline Of The Book.
2. Existing Code Optimization Techniques. 2.1. Description Optimization. 2.2. Algorithm Selection. 2.3. Memory Hierarchy Exploitation. 2.4. Processor Independent Source Code Optimizations. 2.5. Processor Specific Source Code Optimizations. 2.6. Compiler Optimizations. 2.6.1. Loop Optimizations For High Performance Computing. 2.6.2. Code Generation For Embedded Processors.
3. Fundamental Concepts For Optimization And Evaluation. 3.1. Polyhedral Modeling. 3.2. Optimization Using Genetic Algorithms. 3.3. Benchmarking Methodology. 3.3.1. Profiling Of Pipeline And Cache Performance. 3.3.2. Compilation For Runtime And Code Size Measurement. 3.3.3. Estimation Of Energy Dissipation. 3.4. Summary.
4. Intermediate Representations. 4.1. Low-Level Intermediate Representations. 4.1.1. GNU RTL. 4.1.2. Trimaran ELCOR IR. 4.2. Medium-Level Intermediate Representations. 4.2.1. Sun IR. 4.2.2. IR-C / LANCE. 4.3. High Level Intermediate Representations. 4.3.1. SUIF. 4.3.2. IMPACT. 4.4. Selection Of An IR For Source Code Optimization. 4.5. Summary.
5. Loop Nest Splitting. 5.1. Introduction. 5.1.1. Control Flow Overhead In Data Dominated Software. 5.1.2. Control Flow Overhead Caused By Data Partitioning. 5.1.3. Splitting Of Loop Nests For Control Flow Optimization. 5.2. Related Work. 5.3. Analysis And Optimization Techniques For Loop Nest Splitting. 5.3.1. Preliminaries. 5.3.2. Condition Satisfiability. 5.3.3. Condition Optimization. 5.3.3.1. Chromosomal Representation. 5.3.3.2. Fitness Function. 5.3.3.3. Polytope Generation. 5.3.4. Global Search Space Construction. 5.3.5. Global Search Space Exploration. 5.3.5.1. Chromosomal Representation. 5.3.5.2. Fitness Function. 5.3.6. Source Code Transformation. 5.3.6.1. Generation Of The Splitting If-Statement. 5.3.6.2. Loop Nest Duplication. 5.4. Extensions For Loops With Non-Constant Bounds. 5.5. Experimental Results. 5.5.1. Stand-Alone Loop Nest Splitting. 5.5.1.1. Pipeline And Cache Performance. 5.5.1.2. Execution Times And Code Sizes. 5.5.1.3. Energy Consumption. 5.5.2. Combined Data Partitioning And Loop Nest Splitting For Energy-Efficient Scratchpad Utilization. 5.5.2.1. Execution Times And Code Sizes. 5.5.2.2. Energy Consumption. 5.6. Summary.
6. Advanced Code Hoisting. 6.1. A Motivating Example. 6.2. Related Work. 6.3. Analysis Techniques For Advanced Code Hoisting. 6.3.1. Common Subexpression Identification. 6.3.1.1. Collection Of Equivalent Expressions. 6.3.1.2. Computation Of Live Ranges Of Expressions. 6.3.2. Determination Of The Outermost Loop For A CSE. 6.3.3. Computation Of Execution Frequencies Using Polytope Models. 6.4. Experimental Results. 6.4.1. Pipeline And Cache Performance. 6.4.2. Execution Times And Code Sizes. 6.4.3. Energy Consumption. 6.5. Summary.
7. Ring Buffer Replacement. 7.1. Motivation. 7.2. Optimization Steps. 7.2.1. Ring Buffer Scalarization. 7.2.2. Loop Unrolling For Ring Buffers. 7.3. Experimental Results. 7.3.1. Pipeline And Cache Performance. 7.3.2. Execution Times And Code Sizes. 7.3.3. Energy Consumption. 7.4. Summary.
8. Summary And Conclusions. 8.1. Summary And Contribution To Research. 8.2. Future Work.
Appendices: Experimental Comparison Of SUIF And IR-C / LANCE. Benchmarking Data For Loop Nest Splitting. B.1. Values Of Performance-Monitoring Counters. B.1.1. Intel Pentium III. B.1.2. Sun Ultrasparc III. B.1.3. MIPS R10000. B.2. Execution Times And Code Sizes. B.3. Energy Consumption Of An ARM7TDMI Core. B.4. Combined Data Partitioning And Loop Nest Split