DGrok is all about parsing Delphi source code. It has three parts:
More information is available in the DGrok posts on my blog.
This project is currently inactive, as I no longer use Delphi. As of July 2014, DGrok is on GitHub and MIT-licensed. I'll be happy to take pull requests. If someone wants to fork and take over the project, let me know so I can link to you!
The current parser has full support for Delphi 2007 source code (to the best of my knowledge), but no support for Delphi 2009 features like generics.
There's also no symbol table support, so the tools can't do refactorings or Find References.
DGrok comes with a demo app, which you can use to parse one or more directory trees, and then analyze the code looking for patterns.
Here's the current list of patterns it can look for. It's no FxCop, but it's a start.
class var instead of unit globals.) This tool lists all the global variables in your code, so you can see how bad things are and start cleaning them up.
with statements make for confusing code. Avoid them. This tool lists all the with statements in your code.
asm blocks in your code.
You can also add code to look for patterns of your own. See the classes in the DGrok.Framework\Visitors directory for examples.
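To give a feel for the idea, here's a minimal sketch of a pattern-finding visitor. It is written in Python for brevity and is not DGrok's actual API (the real visitors are C# classes in DGrok.Framework\Visitors). The Node class, the node kind names, and FindWithStatements are all hypothetical, invented for this illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Toy syntax-tree node (hypothetical; not DGrok's node type)."""
    kind: str                      # e.g. "Unit", "WithStatement"
    line: int = 0
    children: List["Node"] = field(default_factory=list)


class Visitor:
    """Walks every node, calling visit_<kind> when such a method exists."""

    def visit(self, node: Node) -> None:
        handler = getattr(self, "visit_" + node.kind, None)
        if handler is not None:
            handler(node)
        for child in node.children:
            self.visit(child)


class FindWithStatements(Visitor):
    """Records the line number of every with statement it meets."""

    def __init__(self) -> None:
        self.hits: List[int] = []

    def visit_WithStatement(self, node: Node) -> None:
        self.hits.append(node.line)


# A hand-built tree standing in for parser output (made-up line numbers).
tree = Node("Unit", 1, [
    Node("Method", 3, [
        Node("WithStatement", 5, [Node("WithStatement", 6)]),
        Node("Assignment", 9),
    ]),
])

finder = FindWithStatements()
finder.visit(tree)
print(finder.hits)  # prints [5, 6]
```

A new pattern is then just another subclass with its own visit_<kind> method; the tree walk itself never changes.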
When I first started the project that was to become DGrok, it was just a fancy Find tool, and I wrote it in .NET because .NET had a regular-expression library. Later I tried using a parser generator, and there aren't any good ones that produce Delphi code, so I stuck with C#. And when I eventually switched to a hand-coded recursive descent parser, well, I already had all these unit tests written in C#.
Besides that, C# has a lot of language niceties like generics, anonymous methods, iterators, and garbage collection, none of which Delphi had back in 2004-2007 when I was writing DGrok.
I also used Ruby for text processing (building the grammar document) and codegen. Interpreted languages are great for codegen, because you can run them during your build process without needing to compile the code generator first.
There's no technical reason you couldn't port DGrok to Delphi. If you do, let me know so I can link to you.
ANTLR is a fine tool, but it has problems with ambiguous grammars. It wants to be able to read from left to right, one token at a time, and always know what type of construct it's dealing with based only on what it's seen so far. (There's support for lookahead, but it's extremely limited.)
That isn't good enough for the Delphi grammar. Delphi is full of ambiguity.
For example, take the humble semicolon. Most of the time, it's an unambiguous statement separator. That is, until you see a semicolon in the middle of a variable declaration:
var Foo: procedure; stdcall = nil;
So when you see the first semicolon, you don't know whether you're done with the variable declaration or not. ANTLR doesn't take well to that sort of thing.
Once you start digging into the grammar, it becomes obvious that the Delphi grammar grew organically over time, rather than being designed from the beginning to be easy to write tools for.
DGrok uses a hand-coded recursive-descent parser. It's hard to tell a tool how to deal with the grammar ambiguity if it wasn't designed for it, but it's easy to write code to deal with the ambiguity.
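As an illustration of how little code the semicolon case needs in a hand-written parser, here's a minimal sketch in Python (not DGrok's C# code). It handles only a single var declaration, and the DIRECTIVES set, tokenize, and VarDeclParser are assumptions made up for this example. The trick is to peek one token past the semicolon: if a calling-convention directive follows, the semicolon belongs to the procedural type rather than ending the declaration.

```python
import re
from typing import List, Optional

# Assumption: a small subset of calling-convention directives, enough for a demo.
DIRECTIVES = {"stdcall", "cdecl", "pascal", "register", "safecall"}


def tokenize(source: str) -> List[str]:
    """Split source into identifiers and single-character symbols."""
    return re.findall(r"[A-Za-z_]\w*|\S", source)


class VarDeclParser:
    def __init__(self, tokens: List[str]) -> None:
        self.tokens = tokens
        self.pos = 0

    def peek(self, offset: int = 0) -> Optional[str]:
        i = self.pos + offset
        return self.tokens[i] if i < len(self.tokens) else None

    def next(self) -> str:
        tok = self.peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        self.pos += 1
        return tok

    def expect(self, token: str) -> None:
        if self.peek() != token:
            raise SyntaxError(f"expected {token!r}, got {self.peek()!r}")
        self.pos += 1

    def parse_var(self) -> dict:
        self.expect("var")
        name = self.next()
        self.expect(":")
        type_name = self.next()
        directives: List[str] = []
        if type_name.lower() in ("procedure", "function"):
            # A semicolon ends the declaration unless a directive follows it,
            # so look one token past each semicolon before deciding.
            while self.peek() == ";" and (self.peek(1) or "").lower() in DIRECTIVES:
                self.pos += 1                  # this semicolon is part of the type
                directives.append(self.next())
        value = None
        if self.peek() == "=":
            self.pos += 1
            value = self.next()
        self.expect(";")                       # the semicolon that really ends it
        return {"name": name, "type": type_name,
                "directives": directives, "value": value}


result = VarDeclParser(tokenize("var Foo: procedure; stdcall = nil;")).parse_var()
print(result)
```

A real Delphi parser has to cope with far more than this (parameter lists, several directives in different positions, and so on), but each special case stays local to the method that needs it.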
I haven't played around with Delphi 2009, but I suspect that supporting generics would require adding support for symbol tables. Consider this code snippet:
Does that have a call to a generic method B<C, D>, or is it a call to method A passing two Boolean parameters?
I suspect the real Delphi compiler builds symbol tables as it goes, and uses them to decide which parsing rules to apply. (Please correct me if I'm wrong about the above code being ambiguous!)
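If that suspicion is right, the decision might look something like the following sketch. It is written in Python, is purely illustrative, and says nothing about how the real compiler works. The expression A(B<C, D>(E)) is an assumed example of the ambiguous shape, and generic_names is a hypothetical stand-in for a real symbol table: simply the set of identifiers already known to be generic methods.

```python
import re
from typing import List, Set


def tokenize(source: str) -> List[str]:
    """Split source into identifiers and single-character symbols."""
    return re.findall(r"[A-Za-z_]\w*|\S", source)


def classify_angle_bracket(source: str, generic_names: Set[str]) -> str:
    """Decide how the first 'Name <' in source should be parsed.

    Returns "generic-call" if Name is a known generic method (so '<' opens
    a type-argument list), "comparison" if it is not (so '<' is the
    less-than operator), or "none" if no identifier is followed by '<'.
    """
    tokens = tokenize(source)
    for i, tok in enumerate(tokens[:-1]):
        if tokens[i + 1] == "<" and re.fullmatch(r"[A-Za-z_]\w*", tok):
            return "generic-call" if tok in generic_names else "comparison"
    return "none"


expr = "A(B<C, D>(E))"  # assumed example of the ambiguous shape

# Same tokens, two different parses, depending only on what is known about B.
print(classify_angle_bracket(expr, {"B"}))   # B declared as a generic method
print(classify_angle_bracket(expr, set()))   # B is an ordinary variable
```

The point is that the token stream alone can't settle the question; the parser needs to know what B was declared as before it can pick a rule.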
DGrok was written by Joe White. If you have any comments, corrections, questions, suggestions, etc., please feel free to use my contact form to get in touch.