IainPeregrine's excellent Get Something Done challenge will be starting in a few days, so I thought I'd make a blog post about what I'm intending to work on for the period.
If you've been following my BYOND-related antics over the last few years, you'll know that I've got some interest in the idea of writing a third-party parser for DM. I actually wrote some code to that effect, and then discovered that the grammar I was using was fundamentally ambiguous with respect to DM programs, because I wasn't handling newline as a lexical token. By the time I got around to looking at the problem again, I'd completely forgotten what I was getting at, and the flex/bison files were... not easy reading. In short, my first attempt was a bit of a cockup.
For the Get Something Done challenge this year, I intend to have another go at writing a (simple) DM parser, using flex, bison, and C. I've set myself the target of having a program that I can run over multiple DM files to extract the object tree, assuming usual-ish files.
What do I mean by 'usual-ish'? I mean DM as it is usually written, minus a few of the advanced features. I won't necessarily handle braces, for example (although I probably will), and don't expect parent_type or the . and : path search operators to do what they should.
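To give a feel for the object-tree side of this, here's a minimal sketch in C of tracking indentation to rebuild full paths. It is not the flex/bison parser itself, just an illustration under some loud assumptions: one tab per indent level, no braces, no comments, and every non-blank line is treated as a path component (so it doesn't tell types apart from vars or procs). The names PathStack, indent_level and feed_line are made up for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_DEPTH 32
#define MAX_NAME  64

/* Hypothetical state: the path component last seen at each indent level. */
typedef struct {
    char names[MAX_DEPTH][MAX_NAME];
    int depth;  /* number of valid entries in names */
} PathStack;

/* Count leading tabs. Assumption: one tab per indent level. */
static int indent_level(const char *line)
{
    int n = 0;
    while (line[n] == '\t')
        n++;
    return n;
}

/*
 * Feed one source line. On success, writes the full path for that line
 * into out and returns 1. Returns 0 for blank lines, lines that jump more
 * than one indent level deeper, or anything too long for the buffers.
 */
static int feed_line(PathStack *st, const char *line, char *out, size_t out_size)
{
    assert(st != NULL && line != NULL && out != NULL && out_size > 0);

    int level = indent_level(line);
    const char *text = line + level;
    size_t len = strcspn(text, "\r\n");  /* ignore the line ending */

    if (len == 0 || len >= MAX_NAME)
        return 0;
    if (level >= MAX_DEPTH || level > st->depth)
        return 0;

    /* Replace the component at this level; deeper levels are discarded. */
    memcpy(st->names[level], text, len);
    st->names[level][len] = '\0';
    st->depth = level + 1;

    /* Join the components from the root down to this line. */
    size_t used = 0;
    out[0] = '\0';
    for (int i = 0; i < st->depth; i++) {
        const char *name = st->names[i];
        if (i == 0 && name[0] == '/')
            name++;  /* avoid a doubled slash for top-level "/mob" */
        int n = snprintf(out + used, out_size - used, "/%s", name);
        if (n < 0 || (size_t)n >= out_size - used)
            return 0;  /* output buffer too small */
        used += (size_t)n;
    }
    return 1;
}
```

Feeding it the lines "mob", then a tab-indented "player", then a tab-indented "enemy" would give /mob, /mob/player and /mob/enemy in turn. A real parser needs far more than this, of course, which is rather the point of the exercise.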
That's a goal set in the knowledge that I'll probably actually fulfil it, assuming I actually manage to get some work done over the month. The further goal is to get decent error handling, better handling of unusual DM files, preprocessing, and eventually the ability to parse the internals of procedures. Finally, I'd like to get said parser packaged up into a library people could actually use, rather than just an interesting research project.
What's the point of all of this? What could a third-party parser accomplish?
Well, I originally started thinking about the problem in terms of getting some sort of Javadoc or Doxygen equivalent for DM. Javadoc and Doxygen are programs that read in source files, and then generate HTML pages documenting the objects/procedures/variables defined within. They use special comments for the actual documentation. An equivalent BYOND tool could be very useful, and requires getting more information out of the source code than can be extracted from the output of the command-line DM compiler. Using a DM parser is one way to go about the problem. (Note: It is possible to do this sort of documentation generation without a parser, but then the object-tree/procedure/variable information needs to be embedded in the autodoc comments. That is, you can't document the existence of an object/proc/var without the special comments for it.)
Other useful utilities that could use a DM parser include a DM 'lint' tool. lint is a program that reads in C source files and generates warnings for constructs that look suspicious. A tool that read DM source code and, for example, flagged uses of 'usr' in a procedure (explicit or implicit), use of the : operator for variable/proc access, not returning a value from Move()/Enter()/Exit(), etc., would be quite useful (but would require a pretty sophisticated parser).
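For contrast, here's roughly how far you get without a parser: a crude, purely textual check for explicit 'usr' on a single line, sketched in C. The function names are made up for this example, and the limits are real: it skips double-quoted strings wholesale (including any expressions embedded in them), stops at a // comment, knows nothing about block comments or multi-line strings, and can't see implicit uses of usr at all. Those are exactly the cases that need a proper parser.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* True for characters that can appear inside an identifier. */
static int is_ident_char(char c)
{
    return isalnum((unsigned char)c) || c == '_';
}

/*
 * Return the 0-based column of the first explicit `usr` identifier on the
 * line, or -1 if there is none. Text inside double-quoted strings and
 * after a // comment marker is ignored.
 */
static int find_explicit_usr(const char *line)
{
    assert(line != NULL);

    int in_string = 0;
    for (size_t i = 0; line[i] != '\0'; i++) {
        char c = line[i];

        if (in_string) {
            if (c == '\\' && line[i + 1] != '\0')
                i++;              /* skip the escaped character */
            else if (c == '"')
                in_string = 0;    /* closing quote */
            continue;
        }
        if (c == '"') {
            in_string = 1;
            continue;
        }
        if (c == '/' && line[i + 1] == '/')
            break;                /* rest of the line is a comment */

        /* Match `usr` only as a whole word, not as part of a longer name. */
        if (strncmp(line + i, "usr", 3) == 0
            && (i == 0 || !is_ident_char(line[i - 1]))
            && !is_ident_char(line[i + 3]))
            return (int)i;
    }
    return -1;
}
```

A lint tool built on a real parser would instead walk the parsed body of each procedure, so it could report which proc the usr belongs to and decide whether that use is actually suspicious.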
A parser is also useful for code-completion and 'intellisense' features in IDEs. If, for example, DMIDE had a DM parser on hand to generate the object tree, it could provide a list of procedures and variables a given object has when you start writing an access.
It'll take a lot of time to get all that stuff going, of course, but those are some of the uses a parser can be put to. BYOND could really benefit from having some more development tools written up.