After integrating Milepost into nspr, I found that it was not possible to measure the performance boost there. So Dan suggested that I try the same thing for libvorbis, the Ogg library used for HTML5 video. I started working on that, and now that I'm used to the Mozilla build system it didn't take too long to perform the same steps, with minor variations for libvorbis.
Unfortunately, while using Milepost for libvorbis, I am now facing a compilation error in the file vorbis_res0.c. The file compiles perfectly as long as Milepost is disabled, but the moment I enable it, the build fails.
I have informed the cTuning people about this, and a thread is running here. It might just be that it's something I am doing wrong, but frankly I can't think of anything. I hope they can offer a solution to the problem soon.
Lastly, I still cannot check the performance boost, because the Web Service is still not running properly. I have informed the cTuning people about this too, and they are working on it right now. I hope it gets resolved soon, because until it does I am stuck with nowhere to go.
I haven't posted for some time now, mainly because I didn't have anything concrete, and now I'm regretting that. There are so many tweaks here and there that I have made in the past 10 days or so to integrate Milepost GCC into the Mozilla build system that I find it difficult to remember them all. Still, here is a brief summary of my progress since last time.
Well, it started with a chat with Dan (dwitte). He suggested that we start with SpiderMonkey and try to get Milepost integrated there, since SpiderMonkey uses the same build system as Mozilla. He also suggested we try to adapt the code written for PGO, which also makes two passes over the system. So off I dived into PGO, got tangled up in makefiles, configures and whatnot, and finally had to approach Ted (ted) to make some sense of it all :). That's when I learned about config.mk, which sets the CFLAGS, and all the fog cleared :). So I basically figured out how PGO was working, but Ted told me it had been disabled on Linux. Still, it taught me a lot about how to call multiple makes, etc.
I then started trying to integrate Milepost into SpiderMonkey, when I came across a discussion on the cTuning group which said that C++ is not currently supported; since most of the files I was compiling were C++, I had run into a dead end. So off I went to Dave (humph) and Ted, who gave me the names of some C-only modules, and humph suggested nspr, a C-only module that provides a platform-independent API for system-level functions. So I had to start again with nspr, though it was quite similar to SpiderMonkey and much simpler to understand :).
After much banging of my head against the build system, I managed to understand the flow and knew what I had to do. I created a "milepostbuild" target, similar to PGO's "profiledbuild" target, which calls sub-makes, changing the environment variables required by Milepost's ICI plugins each time. It was a huge task just to find out how to export the variables to the current shell in the first place, but I finally managed. I currently export them in configure, and to call configure I use "source ../configure" in bash instead of the normal "./configure". This sets up my variables in the current shell, and I can then change them when calling make again from the makefile by adding VAR=val alongside.
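The mechanism can be sketched in plain shell. Everything here is illustrative: `stage` stands in for a sub-make invocation, and the variable names follow the ICI convention from the post, but the real milepostbuild rules live in the nspr makefiles.

```shell
# Illustrative sketch of the milepostbuild idea: a driver invokes the
# build stages repeatedly, overriding the ICI variables per stage the
# same way "make VAR=val" overrides an exported default.
stage() {
    # "$1" plays the role of make's command-line VAR=val override
    env "$1" sh -c 'echo "building with ICI_USE=$ICI_USE ICI_PLUGIN=$ICI_PLUGIN"'
}

# Exported once (as if by "source ../configure"), inherited by every stage:
export ICI_USE=1

stage ICI_PLUGIN=extract_program_static_features
stage ICI_PLUGIN=substitute_passes
```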
This done, I now had to get the ICI plugins to work properly with the build. For this I needed to add the name of the source file being compiled by gcc to the files that the plugins create, which contain the executed passes (.txt) and the static features (.ft). At first I couldn't find a way to do this and found myself wandering through the gcc code with no idea at all, but after some searching I found the "function_filename" feature, which returns the filename. It worked fine on my small test programs, but not when I used the same thing in the build. Finally, I realised this was because a path relative to the build directory was being returned instead of the bare filename; I mended that and voila, everything works now!
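In shell terms the path problem looks like this; the source path is a hypothetical nspr-style one, and the output-file naming is mine, shown only to illustrate the fix:

```shell
# What "function_filename" handed back during the build was a path
# relative to the build directory, e.g. (hypothetical path):
src="../../pr/src/io/prfile.c"

# Using that string directly produced mismatched per-file outputs;
# reducing it to the bare filename made the lookups line up:
name="$(basename "$src")"          # -> prfile.c
echo "passes go to ${name%.c}.txt, features to ${name%.c}.ft"
```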
So I am now able to extract the executed gcc passes for each nspr file, along with its static features. Next I have to integrate the web service to predict the flags from the extracted features, but right now there is some problem with the cTuning web services. There is also a slight problem regarding the passes, for which I need help from the Milepost people. Lastly, although I am currently making this work file by file, what we really need is something that works on the whole module in one go: processing each file individually, especially through a web service, takes a lot of time and, as dwitte mentioned, it won't scale. Already my build time for nspr has gone from seconds to minutes, and I haven't even used the web service yet. I have posted this concern to the Milepost authors and hope to receive a positive reply...
So, after banging my head for almost two days against the cTuning Web Service and the CCC Framework, I have finally got the combinations right and can now receive compiler optimization flags from the cTuning web services. There are a few glitches here and there; I think it requires that the platform, compiler and environment one is using already be present in the database. So I opened up the database, retrieved some records, and used the platform, environment and compiler IDs from there to retrieve the predicted compiler flags, and I am pleased to say it works now. Although I see that with the generated flags I am getting a worse runtime than with the normal -O2 and -O3 levels :)..
Also, I read on the cTuning website that the ML-based compiler currently predicts only compiler optimization flags, not optimization passes, though there are plans to incorporate the latter too. I will follow up with the Milepost authors to find out whether that part has been done or is in progress.
Here are some of the plugins that I have installed and a summary of what they do:
- save_executed_passes.legacy
This plugin saves the passes executed per function into external "ici_passes_function.txt" files.
- save_executed_passes_with_time.legacy
This plugin saves the executed passes per function along with their execution times (used for split compilation).
- substitute-passes.legacy
This plugin substitutes the original GCC pass order with one read from either external "ici_passes_function.txt" files, a single global file "ici_passes_all.txt", or the environment variable ICI_PASSES_ALL (passes separated by commas), thus allowing external manipulation of passes (adding, removing or reordering).
- extract_program_static_features
This plugin extracts program static features per function as vectors and saves them into "ici_features_function.txt"
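Based on the descriptions above, driving the substitute-passes plugin looks roughly like this. The plugin path and pass names are made up for illustration, and a Milepost-patched gcc is required for any of it to take effect:

```shell
# Illustrative only: stock gcc ignores these variables; a Milepost-patched
# GCC reads them through ICI. Plugin path and pass names are invented here.
export ICI_USE=1                                  # enable ICI (same as -fici)
export ICI_PLUGIN="$HOME/milepost/substitute-passes.legacy.so"
export ICI_PASSES_ALL="fre,copyprop,dce"          # comma-separated pass order
# ...then build as usual, e.g.: gcc -O2 -c foo.c
```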
The installation of these plugins was not very intuitive, but I managed by tweaking the code here and there, and they now work fine with GCC. The only remaining problem with Milepost is how to predict the optimization flags from the static program features extracted by the last plugin above. I believe (I may be wrong here) that we need to push our features into the cDatabase, which then returns a set of optimization flags to use when compiling with GCC. There are two ways to do this: using Web Services or using the CCC Framework (http://ctuning.org/wiki/index.php/CDatabase:Documentation:API). I have tried both, but there are issues with each.
I cannot find a php script with which to send a request to the "predict_opt" Web Service; without it, the Web Service cannot be used. The script for adding optimization cases is available, though.
With the CCC framework, I am unable to access the Web Service using sockets; this is what I am currently working on. I have posted a request on the community discussion board to make the php script available. Meanwhile, I'll try to get the CCC framework working and returning the predicted flags.
Continuing the work of the previous week, this week I started reading up on Milepost GCC. Some of the information can be found at this link: http://ctuning.org/wiki/index.php/CTools and in the following paper: http://gcc-ici.sourceforge.net/papers/fmtp2008.pdf.
I am putting up here some of the extracts that I thought were important and that highlight the key concepts of the machine-learning-based compiler:
- Considerable speed-ups can be already obtained after iterative compilation on all platforms. However, this is a time-consuming process and different speed-ups across different platforms motivate the use of machine learning to automatically build specialized compilers and predict the best optimization flags or sequences of passes for different architectures.
- Research has shown a great potential to improve program execution time or reduce code size by carefully selecting global compiler flags or transformation parameters using iterative compilation. The quality of generated code can also be improved by selecting different optimization orders. Milepost GCC's approach combines the selection of optimal optimization orders and tuning parameters of transformations at the same time.
- Here is a description of some of the related tools and frameworks:
- Continuous Collective Compilation Framework (CCC): A tool that generates the training examples for the Machine Learning tools. It does this by evaluating different compilation optimizations, storing execution time, code size and other metrics in a database.
- Interactive Compilation Interface (ICI): The ICI provides opportunities for external control and examination of the compiler. The new version of ICI expands on the capabilities of its predecessor permitting the pass order to be modified. This version of ICI is used in the Milepost GCC to automatically learn good sequences of optimization passes.
- PLUGINS: The program's features (its structure) are extracted from Milepost GCC via a plugin and are also stored in the database. The plugins are loaded as shared libraries.
- Working of Milepost GCC:
- The plugins are invoked by the new -fici GCC flag or by setting the ICI_USE environment variable to 1. When GCC detects either option, it loads a plugin (a dynamic library) whose name is specified by the ICI_PLUGIN environment variable.
- The machine learnt model predicts the best GCC optimization to apply to an input program based on its program structure or program features.
- To extract these static program features, an additional GCC pass, 'ml-feat', is implemented. It can be invoked via the extract_program_static_features plugin after any pass from FRE onward, once all the GCC data necessary to produce the features is ready.
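Condensing the invocation described above into shell, feature extraction would be driven like this; the plugin path is illustrative, and a Milepost-patched gcc is needed for it to do anything:

```shell
# Point ICI at the feature-extraction plugin (path is illustrative):
export ICI_PLUGIN="$HOME/milepost/extract_program_static_features.so"
# Two equivalent ways to switch the mechanism on (needs Milepost GCC):
#   gcc -fici -c foo.c        # via the new flag
#   ICI_USE=1 gcc -c foo.c    # via the environment
# The per-function feature vectors then land in "ici_features_function.txt".
```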
There is still a lot of material to read. I will give some details about the plugins I have installed in the next post. I believe Milepost GCC is an ingenious idea; however, as a user, I would have been happier with better documentation. Currently I have to read READMEs scattered around the installation directories, and they are not very clear or specific.
I have started working on a student project at Mozilla which involves using Milepost GCC to speed up Firefox. I started by doing a few builds of Firefox, both via Mercurial and by downloading the source archive directly. I faced a few problems during the build process: after running for a long time, the build would crash with errors like "vsnprintf not found", "fprintf not found", etc. After trying everything I could think of, I googled the errors and found a related bug already filed:
https://bugzilla.mozilla.org/show_bug.cgi?id=485019
But the bug is marked "Resolved Invalid" (I have no idea what that means). Anyway, once I found that it was breakpad causing the problems, I disabled the crashreporter in the mozconfig file, and that worked. So I managed to do some optimized and debug builds.
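The mozconfig change amounts to one line. This is only a sketch of a minimal mozconfig, not a copy of my actual file:

```shell
# .mozconfig (sketch; only the last line is the actual workaround)
mk_add_options MOZ_OBJDIR=@TOPSRCDIR@/objdir-ff
ac_add_options --enable-optimize
ac_add_options --disable-crashreporter   # skip breakpad, avoiding the build errors
```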
After that I moved on to Profile Guided Optimization (PGO) builds, which I believe Mozilla currently uses for its releases. A PGO build consists of two passes: a first pass builds instrumented binaries, then a second pass rebuilds optimized binaries using profile information gathered by running the first set. I managed to successfully build Firefox 3.6 with PGO.
For now I have moved on to studying Milepost GCC, but I'll come back to the builds to examine the makefiles and the code changes between normal and PGO builds, because I'll need to know how the builds actually work.