MAQAO (Modular Assembly Quality Analyzer and Optimizer) is a performance analysis and optimization framework operating at binary level with a focus on core performance. Its main goal is to guide application developers through the optimization process by means of synthetic reports and hints.
MAQAO combines dynamic and static analyses, relying on its ability to reconstruct high-level structures such as functions and loops from an application binary. Since MAQAO operates at binary level, it is agnostic with regard to the language used in the source code and does not require recompiling the application to perform analyses. MAQAO has also been designed to support multiple architectures concurrently. Currently, the Intel64 and Xeon Phi architectures are implemented.
The main modules of MAQAO are LProf, a sampling-based lightweight profiler offering results at both function and loop levels, CQA, a static analyser assessing the quality of the code generated by the compiler, and ONE View, a supervising module responsible for invoking the others and aggregating their results. Other modules, currently in beta version, allow performing value profiling (VProf) and decremental analysis (DECAN).
Besides performance evaluation and optimization, MAQAO can be used in many different high performance computing projects. You might be interested in MAQAO if you are:
MAQAO is available for 64-bit Linux-based operating systems and can be downloaded here.
More versions of MAQAO are available on the Documentation page.
MAQAO reads binaries in ELF format and uses DWARF debug information, if present, to make the connection with source files. It provides the list of functions, binary loops, basic blocks and instruction sequences found by a linear sweep analysis of the binary code.
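As an illustration of what reading a binary in ELF format involves at the very first step, the sketch below decodes the identification bytes of an ELF header and checks whether the file is a 64-bit x86 binary. It is not MAQAO code: the function name `parse_elf_ident` and the synthetic header are assumptions made for this example, and only field offsets and constants from the ELF specification are used.

```python
import struct

ELF_MAGIC = b"\x7fELF"
EM_X86_64 = 62  # e_machine value for x86-64 in the ELF specification


def parse_elf_ident(header: bytes) -> dict:
    """Decode the ELF header fields needed to identify a 64-bit x86 binary.

    Hypothetical helper for illustration; MAQAO's own loader is not shown here.
    """
    if len(header) < 20 or header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    ei_class = header[4]  # 1 = 32-bit, 2 = 64-bit
    ei_data = header[5]   # 1 = little endian, 2 = big endian
    byte_order = "<" if ei_data == 1 else ">"
    # e_type and e_machine are two 16-bit fields right after the 16-byte e_ident
    e_type, e_machine = struct.unpack_from(byte_order + "HH", header, 16)
    return {
        "bits": 64 if ei_class == 2 else 32,
        "little_endian": ei_data == 1,
        "type": e_type,
        "machine": e_machine,
        "is_x86_64": ei_class == 2 and e_machine == EM_X86_64,
    }


# Synthetic 20-byte header: 64-bit, little endian, executable (ET_EXEC = 2), x86-64
sample = ELF_MAGIC + bytes([2, 1, 1, 0]) + bytes(8) + struct.pack("<HH", 2, EM_X86_64)
info = parse_elf_ident(sample)
```

On a real file, the same function could be applied to the first 20 bytes read with `open(path, "rb").read(20)`. Recovering functions, loops and basic blocks requires disassembly and control-flow analysis well beyond this header check.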
The LProf module profiles applications on large parallel systems with very low overhead. It provides a summary of time spent in functions and loops (hotspots) and also a categorization view to quickly distinguish between user code and runtimes/system code.
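The idea behind a sampling-based hotspot summary can be sketched as follows: each sample records where the program was when it was interrupted, and the share of samples per function approximates the share of time spent there. This is a minimal sketch, not LProf's implementation; the `summarize` function, the sample tuples and the module names are hypothetical.

```python
from collections import Counter


def summarize(samples, user_modules):
    """Turn (function, module) samples into hotspot and category percentages.

    Hypothetical helper: a module listed in user_modules counts as user code,
    anything else is treated as runtime/system code.
    """
    total = len(samples)
    by_function = Counter(func for func, _ in samples)
    by_category = Counter(
        "user" if module in user_modules else "runtime/system"
        for _, module in samples
    )
    # Functions sorted from most to least sampled, with their share of samples
    hotspots = [(func, 100.0 * n / total) for func, n in by_function.most_common()]
    categories = {cat: 100.0 * n / total for cat, n in by_category.items()}
    return hotspots, categories


# Made-up samples: 6 in application code, 4 in libraries
samples = (
    [("compute_flux", "app")] * 6
    + [("memcpy", "libc.so")] * 3
    + [("MPI_Wait", "libmpi.so")] * 1
)
hotspots, categories = summarize(samples, user_modules={"app"})
```

With these made-up samples, `compute_flux` accounts for 60% of the samples and the runtime/system category for the remaining 40%. Because only periodic samples are recorded rather than every call, the overhead of this kind of profiling stays low.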
The CQA module analyses the quality of the code generated by the compiler and provides users with human-readable workarounds and hints.
Gabriel Staffelbach, Researcher
AVBP is a state-of-the-art code for compressible reactive flows developed by CERFACS. The development team has benefited from the MAQAO toolset, which encompasses the expertise and experience of the MAQAO team. Specifically, the CQA tool has allowed continuous performance improvement for each new Intel architecture. Acceleration factors from 1.3 to 1.8 have been obtained for real industrial applications, with a clear impact on the R&D process. MAQAO is now part of the gold standard of performance tools used at CERFACS for code modernization and performance engineering.
Vincent Moureau and Ghislain Lartigue, Researchers
The CORIA team, which develops the massively parallel CFD code YALES2, has benefited for several years from the expertise and support of the MAQAO team. With their help, we have identified many performance bottlenecks, which led to acceleration factors between 2 and 4 depending on the kernels considered. The MAQAO tools are now routinely used at CORIA during the development of YALES2.
Patrick Hede, Expert Engineer
MAQAO helped us pinpoint issues in code sections that were affecting the global performance of our applications. Thanks to the low overhead of MAQAO, we have been able to run our applications with real-life datasets without making any compromises. We started using MAQAO during the perfcloud project, during which we managed to obtain a 3x speedup on our PiRiA application.