MAQAO (Modular Assembly Quality Analyzer and Optimizer) is a performance analysis and optimization framework operating at binary level with a focus on core performance. Its main goal is to guide application developers through the optimization process by means of synthetic reports and hints.
MAQAO combines dynamic and static analyses, relying on its ability to reconstruct high-level structures such as functions and loops from an application binary. Since MAQAO operates at binary level, it is agnostic with regard to the language used in the source code and does not require recompiling the application to perform analyses. MAQAO has also been designed to support multiple architectures concurrently. Currently, the Intel64 and Xeon Phi architectures are implemented.
The main modules of MAQAO are LProf, a sampling-based lightweight profiler offering results at both function and loop levels, CQA, a static analyser assessing the quality of the code generated by the compiler, and ONE View, a supervising module responsible for invoking the others and aggregating their results. Other modules, currently in beta version, allow performing value profiling (VProf) and decremental analysis (DECAN).
Besides performance evaluation and optimization, MAQAO can be used in many different high performance computing projects. You might be interested in MAQAO if you are:
MAQAO is available for 64-bit Linux-based operating systems and can be downloaded here.
More versions of MAQAO are available on the Documentation page.
A collection of the different types of analysis reports that MAQAO can generate is available here.
A more complete repository of analysis reports generated by MAQAO is available here.
MAQAO reads binaries in ELF format and uses DWARF debug information, if present, to connect the binary code with source files. It provides the list of functions, binary loops, basic blocks and instruction sequences found by a linear sweep analysis of the binary code.
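As a rough illustration of what identifying an ELF binary involves, the Python sketch below decodes a few fields of the ELF header: the magic number, the 32/64-bit class, the byte order and the target machine. It is not MAQAO's code; the function name `describe_elf_header` and the returned dictionary layout are hypothetical and chosen only for this example. The field offsets and the `e_machine` value 62 for x86-64 come from the ELF specification.

```python
import struct

ELF_MAGIC = b"\x7fELF"
EM_X86_64 = 62  # e_machine value for x86-64 in the ELF specification


def describe_elf_header(data: bytes) -> dict:
    """Decode the few ELF header fields needed to identify a binary.

    Hypothetical helper for illustration; it only inspects the first 20 bytes.
    """
    if len(data) < 20 or data[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")

    ei_class = data[4]  # 1 = 32-bit, 2 = 64-bit
    ei_data = data[5]   # 1 = little-endian, 2 = big-endian
    if ei_class not in (1, 2) or ei_data not in (1, 2):
        raise ValueError("unsupported ELF class or byte order")

    endian = "<" if ei_data == 1 else ">"
    # e_type and e_machine are two 16-bit fields right after the 16-byte e_ident.
    e_type, e_machine = struct.unpack_from(endian + "HH", data, 16)

    return {
        "bits": 64 if ei_class == 2 else 32,
        "little_endian": ei_data == 1,
        "type": e_type,
        "machine": e_machine,
        "is_x86_64": ei_class == 2 and e_machine == EM_X86_64,
    }
```

To try it on a real file, read the first bytes of the binary, for example `describe_elf_header(open(path, "rb").read(64))`, where `path` points to any executable on a Linux system.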
The LProf module profiles applications on large parallel systems with very low overhead. It provides a summary of time spent in functions and loops (hotspots) and also a categorization view to quickly distinguish between user code and runtimes/system code.
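To make the idea of a sampling-based hotspot summary concrete, here is a minimal Python sketch that turns a list of samples into per-function percentages and a per-category breakdown. It does not reflect LProf's implementation or output format: the `summarize_samples` function, the `(function_name, category)` sample layout and the category labels are all assumptions made for this example.

```python
from collections import Counter


def summarize_samples(samples):
    """Aggregate profiling samples into hotspot and category percentages.

    `samples` is an iterable of (function_name, category) tuples, one per
    sample. Both the layout and the category labels are illustrative.
    """
    samples = list(samples)
    total = len(samples)
    if total == 0:
        return [], {}

    by_function = Counter(name for name, _ in samples)
    by_category = Counter(category for _, category in samples)

    # Hotspots sorted from most to least sampled, as percentages of all samples.
    hotspots = [(name, 100.0 * count / total)
                for name, count in by_function.most_common()]
    categories = {category: 100.0 * count / total
                  for category, count in by_category.items()}
    return hotspots, categories


if __name__ == "__main__":
    demo = ([("kernel", "user")] * 6
            + [("MPI_Wait", "runtime")] * 3
            + [("memcpy", "system")])
    hotspots, categories = summarize_samples(demo)
    for name, percent in hotspots:
        print(f"{name}: {percent:.1f}% of samples")
    print(categories)
```

Because each sample stands for roughly one sampling period of execution time, the share of samples attributed to a function approximates the share of time spent in it, which is why a plain count is enough for this kind of summary.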
The CQA module analyses the quality of the code generated by the compiler and provides users with human-readable workarounds and hints.