Tuesday, July 26, 2011

Callgraph

There remains, as of this blog post, one major regression from the original dehydra-based DXR, and that is the lack of support for callgraph. The main problem I have here is trying to pin down the scope of the feature. To give an idea of why this hard, I pose the following challenge: how many different ways are there in C++ of invoking a function, as would be viewed from the generated assembly code? Here is my list:

Global function
Invocation of non-virtual member
Nonvirtual invocation of virtual member
Constructor, copy constructor invocations
Implicit destructor invocation
Functors (objects that implement operator())
C++0x lambdas (which appears to be a functor)
Count all of these as "one," if you like, since they're all still more or less boil down to a call addr instruction, so it really boils down to how pedantic you get about the differences. Most of these look fairly distinct in source code level as well too. But, on the plus side, being a statically-known function call, all of these are easy to handle.
Virtual member function invoked virtually
This is clearly different at an assembly level, since it requires a vtable lookup with the added possibility of thunking. From a handling perspective, though, this is more or less statically known, since we know the possible targets of a virtual method call are limited to subclasses of the method, if we assume that people aren't sneaky.
Function pointers
Easy to dip into, impossible to think about off the top of your head (how many tries does it take you to write the type of a function pointer that returns a function pointer without using typedefs?), and it requires data-flow analysis to solve. The main question is if it's even worth trying to at least plan to support function pointers, or if they should just be left out entirely.
Pointer-to-member function pointers
Theoretically easier than function pointers (because we again have a more limited state space), but they still suffer from the same need to data flow analysis. Not to mention that they are much harder to use, and that their implementations are much different from function pointers.
Template dependent name resolution
Templates, as far as C++ is concerned, is essentially a macro system that is more powerful and more inane than the C preprocessor. So if the expression that is to be called is dependent on the type parameter, then the function call can be any (or all) of the above depending on what was passed in.
throw
The entire exception handling semantics in C++ essentially gets handled by the ABI as a combination of magic function calls and magic data sections; undoubtedly a throw itself is a function call which gets to do stack unwinding. Most of this stuff isn't exposed in the code, it just leaks out in semantics.

One goal I've had in mind is to make sure I can do some reasonable support for dynamic languages as well as static languages. To that end, I've looked up other callgraph implementations to see what they do in the realm of indirect function calls. The answer is a mixture of either "do runtime instrumentation to build the callgraph" or "ignore them altogether." The closest I've seen is one implementation that indicated when the address of a function was taken.

Friday, July 22, 2011

Viewsource up and running

This morning, I have made available a public version of the viewsource code. Viewsource was originally a web tool to output a pretty-printed version of the Dehydra objects for simple snippets of source code. Since I found it a useful idea for debugging, I also added dumps for the now-defunct Pork rewriting toolset, JSHydra, and now, finally Clang.

The odd tool out here is clang, as it has neither a debug-useful XML output nor a simple JS representation that can report the output, which requires me to make the tool a two-stage process: the first stage is a compiler plugin that figures out from clang::RecursiveASTVisitor which functions and classes it needs to worry about dumping out, and then writes out the code for the second stage, another compiler plugin that actually dumps the generated AST information to the console. In other words, I have a plugin to write a plugin that is used to dump information to write plugins.

Eventually, I hope to be more complete in the information I can dump out (particularly type locations and type information), but this is complete enough to be at least somewhat useful for finding interesting things, such as understanding what the location information actually refers to. As I work on this tool more and more, this should enable me to find and fix many bugs in DXR.

Thursday, July 21, 2011

DXR alpha 3 release

I am pleased to announce the third release of DXR. Compared to the previous release, the UI has been tweaked and several bugs have been noticed and fixed. In particular, I have improved the support of links, and fixed the bug that prevented the type hierarchy from being properly realized. I have also introduced support for indexing the IDL code in Mozilla's codebase, as well as indexing generated code in general. I have also added yet another tree, comm-central.

Compared to the original implementation of DXR by Dave Humphrey, there remains one unimplemented feature, namely support for the callgraph. My plan for the next release will focus on implementing this feature as well as any other bugs I uncover via real-world testing. The other major features for the next release will be a JSON query API for data and improved support for multiple trees.

In a more general viewpoint, the plugin architecture has been much improved. There is now support for viewing code-coverage data in the source tree (I do not have a visible demo of this yet), and the support for IDL is entirely a separate plugin. Implementing these plugins has caused me to realize that there is need for better hooks in the current architecture (especially with regards to the generated HTML), and these will also be forthcoming in the next release.

The final point to make is that I have moved the viewsource code into its own repository, as well as adding support for a limited subset of the clang AST. The website does not properly support clang at this time owing to problems in the server's current toolchain configuration; when these issues are resolved, I will be demonstating the tool better.

Wednesday, July 13, 2011

Random discovery

I've been playing around with the idea of replacing mork for leveldb, the two being at roughly the same "no feature database" level. The sanest way to do a decent head-to-head comparison involves picking the least powerful wrapping around this API (the message folder cache) and then reimplementing it twice. Actually, I originally intended to try implementing mdb with a leveldb backend for a drop-in replacement to mork. Five minutes successfully dissuaded me, as the mdb interfaces require you to implement about five interfaces to get to "open a database".

For realistic testing, I grabbed a real database and instrumented all of the accesses in real use cases (i.e., startup and shutdown). Since the real data is what I love to call my Profile from Hell™, there's enough data that I can get some reasonable statistical accuracy in timing. I'm also testing what is a path of note in Thunderbird startup, so I made sure to test cold startup, which, incidentally, is very annoying. A 1 second test takes several seconds to reload xpcshell.

The first performance results were very surprising to me. I expected that not touching the knobs would probably result in a mild performance regression (mork is fast by virtue of choosing the "space" option in the space/time tradeoff). What I wasn't expecting was something that was an order of magnitude slower. I eventually got off my butt and hooked up jprof to see what was going wrong. Unfortunately, I forgot that jprof's visualization of output results sucks, so I traipsed back to my very old post to pull down my script to turn the dumb HTML file into a more manageable dot file (which needed fixing, since the only commit to jprof in 3 years changed the output syntax on me).

Clearly, the output file implicates JS as being the hotpath. Since this is xpcshell and I had to hack the jprof initialization in, I decided to move the jprof initialization to the execution point to eliminate the effect of parsing my ~20K-line input file (I did say it loads a lot of data). When I last used jprof for serious performance tracking, I noted that the main regression appeared to be caused by XPCThrower::ThrowBadResult (about 60%, apparently). So here, the main regression appears to be... you guessed it, JS exceptions (the method names have changed in 3 years, though).

That is truly unexpected, since I'm not supposed to be actually changing any API. After looking back at the code and inspecting much more deeply, I found out that I actually was changing API. You see, the nsIMsgFolderCacheElement has two ways to access properties: as strings or as integers. Accessing integers clearly throws an error if the value isn't there; accessing strings also appears to throw an error. Assuming this to be the case, I explicitly threw an error in both cases in my new implementation. It turns out that the current implementation actually doesn't throw for strings (I'm not sure if it's intended to or not). Fixing that greatly improved the times. A brief summary of the real results is that mork is faster on startup, even on general property access, and slower on shutdown, only the last of which is not surprising.

In summary: throwing JS exceptions, at least via XPCOM, is very, very, very slow. If that code is a potential hotpath, just say no to exceptions.

Tuesday, July 12, 2011

Location information

One of the hardest parts about writing a compiler is producing correct location information. Well, it is very easy to get the precise location of a token (most of the time, in any case), but the hard part is assigning token locations to actual AST productions, since they are composed of several tokens. Take a simple function definition: void foo(int bar) {}. Nearly every token in that definition can be argued to be the right location of the function. Hence showing only line information is common for compilers: in most normal usage, you get the same information no matter which token you pick. However, there are times when the actual column number is crucial: if you need to correlate something to the original source code (e.g., for rewriting or for DXR), you want that column number to be spot on.

Now let's consider pathological cases. Suppose we define that function in a templated class. Do I want the location of the template or the location of the template instantiation? The former is often better, but there are times you want the latter. But you can also make the function in a macro: #define FUNCTION(name) void name(int bar) {} and FUNCTION(foo). Do I want the location within the macro definition or the location in the macro instantiation? Macros let you be really evil: #define FUNCTION(name) void start##name(int bar) {}. Now which location do I want? Oh yeah, and while we're on the topic, there is also the issue of autogenerated code (e.g., I want the location to be what it was in the original file). So which location do I use?

There are two orthogonal issues here: which token do I use in the "final" version of the text, and which version of the text do I grab it from. There are several versions of the text (think of the lexer not as a single unit but instead of a chain of steps that pass the information between them); we don't care about some of the intermediate representations (e.g., what the code looks like after replacing trigraphs). The lowest level of these is the position in the last form of text, just before being parsed for real. It's also important to track the position through repeated invocations of macros (most of the time). We also want the location in the text file as it is when fed to the compiler to begin with, as well as the position in the original file before it is fed to whatever output the C/C++ source file.

Clang gives us some of this information for any SourceLocation. The source of the actual token is the spelling location. On the other hand, the instantiation location is the location of the outermost macro invocation to generate the production. It is possible to get the tree of locations by repeatedly calling getImmediateSpellingLocation, which only looks down one level of invocation. To get the result after following #line directives, you need to use the presumed location, which looks at the instantiation location. It is possible to pass in the spelling location to getPresumedLoc to produce the correct results, though. Unfortunately, precise location information goes out the window if your macro uses # or ## (it goes into "scratch space" whose line numbers don't correlate well enough to the original source code for me to make sense of).

The other direction is one of the possible methods to get the source location in the class. A function declaration merely has 6 of these: the start and end of the declaration, the location of the name identifier, the location of the type specifier, the location after the template stuff, and the location of the right brace. I would go in more detail, but I haven't yet catalogued all of the ways to get this information to be sure that the method names actually mean what they claim to mean.

Wednesday, July 6, 2011

DXR alpha 2 release

I am pleased to announce a second alpha release of DXR. Several improvements have been made, including changes to the UI as well as better cooperation with macros. A complete list of changes can be found at the end. As this is the second release, I have expanded the live demo to include not one but two separate indexing trees: clang and mozilla-central, both of which are current as of last night.

The advantage of DXR is that it sees the source code by instrumenting the compiler at build time, so it is able to understand classes most of whose methods are defined by macros, like Clang's RecursiveASTVisitor. Like any source code browser, it is possible to click on an identifier and then go see information about the identifier, such as its points of declaration, or its definition. Unlike LXR and MXR, however, it knows what type of identifier it's working with, so you won't be offered the definition of BuiltinType::getKind if you look up the getKind method of an Expr object.

Another new feature is support for plugins, to allow people to use different compilers, add in support for more languages, or even augment the UI with new features like capturing line-coverage information. The current support is rudimentary, but there is already active work on a dehydra-based indexer.

A list of new features:

  • Improved search speed at runtime, and dropped glimpse as a prerequisite
  • The web UI works in newer versions of Firefox
  • Support for plugins has been added
  • Disk space requirements have been radically reduced
  • Setup configuration is easier
  • Link extents have been fixed in many cases
  • Declarations can now be seen along with definitions
  • Macros can now be browsed like source code