To build the internal SQL module for Matrex I decided to work in the following way:
- Write a parser that converts the SQL expression into an internal object structure
- Write the code that applies the parsed SQL to the matrix/vector arguments of the SQL function.
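The two steps above can be sketched in plain Java. This is only an illustration under my own assumptions, not the actual Matrex code: the `SelectStatement` class, the `apply` method and the SQL syntax in the comment are hypothetical names, and the parser of step 1 is replaced by an object built by hand.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class SqlSketch {

    /** Hypothetical internal object structure for a parsed SELECT statement. */
    static final class SelectStatement {
        final int[] columns;              // indices of the columns to keep
        final Predicate<double[]> where;  // row filter the parser would build

        SelectStatement(int[] columns, Predicate<double[]> where) {
            this.columns = columns;
            this.where = where;
        }
    }

    /** Step 2: apply the parsed statement to a matrix stored as rows of doubles. */
    static double[][] apply(SelectStatement stmt, double[][] matrix) {
        List<double[]> result = new ArrayList<>();
        for (double[] row : matrix) {
            if (!stmt.where.test(row)) {
                continue; // row rejected by the WHERE condition
            }
            double[] projected = new double[stmt.columns.length];
            for (int i = 0; i < stmt.columns.length; i++) {
                projected[i] = row[stmt.columns[i]];
            }
            result.add(projected);
        }
        return result.toArray(new double[0][]);
    }

    public static void main(String[] args) {
        // Stand-in for step 1: what a parser might produce for
        // "SELECT col0, col2 FROM m WHERE col1 > 10" (hypothetical syntax).
        SelectStatement stmt = new SelectStatement(new int[] {0, 2}, row -> row[1] > 10);
        double[][] m = { {1, 5, 100}, {2, 20, 200}, {3, 30, 300} };
        for (double[] row : apply(stmt, m)) {
            System.out.println(Arrays.toString(row)); // [2.0, 200.0] then [3.0, 300.0]
        }
    }
}
```

Keeping the parsed statement as a separate object means the code that works on the matrices never needs to know how the SQL text was parsed.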
Antlr has the following advantages compared to the other parsers:
- More people seem to use it
- The generated parser can be code in different languages (Java, C#, Python...), which can be useful if I want to port the same grammar to other projects
- Together with the library you can download a graphical application called AntlrWorks to interactively test and debug your grammar. AntlrWorks is a very good tool that lets you find errors in your grammar before you start to use it.
The reasons are probably:
- my inexperience with parsers and lexers
- some confusion and some holes in the free documentation
Simple, right? Wrong. Here are the problems for this approach:
- Antlr is now at version 3 (usually called v3). Most of the example grammars were written for version 2 (2.7.x). Although v2 and v3 grammars look very similar, converting a v2 grammar to a v3 one is not easy.
- You can buy a book written by Antlr's author. I did not want to buy it because I'm not planning to use Antlr in the future. But then I discovered that the online documentation is incomplete and often refers to the old 2.7.2 version.
- The generated Java classes can have a method to get the AST, but as far as I understood the tree cannot be used for an interpreter, only to check the result of parsing the statement.
- Building an interpreter is not the only purpose of a parser: Antlr is used for many other things, for example to compile, which means translating expressions from one grammar to another. Keep this in mind to avoid getting confused when reading the documentation.
- You can add Java code directly in the grammar. With this code you can build the interpreter structures directly in the generated Java code.
- Be very careful about the case of the initial letter of rule names. Upper case: lexer rule; lower case: parser rule. It looks simple, but if the initial letter of even one rule name is wrong, nothing works as it should.
- The lexer is used to parse single words (identifier, strings, numbers). The parser is used to parse phrases.
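- Spaces do not need to appear in the parser rules: a lexer rule for whitespace (conventionally named WS) that skips it or sends it to the hidden channel keeps it away from the parser.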
- In the Java code that you add to the grammar you can set the package of the generated classes.
- AntlrWorks generates two types of Java classes: the ones to use in your application and the ones it uses for debugging. They are saved in the same place with the same names. The debug classes don't work in your application, so remember to regenerate the application classes after a debug session.
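To make the lexer/parser split from the notes above more concrete, here is a small hand-written sketch in plain Java. It is not Antlr-generated code and does not use the Antlr library: the class and method names are my own, and the tiny `SELECT ... FROM ...` phrase is only an assumed example. In an Antlr grammar the same division would be expressed with upper-case lexer rules and lower-case parser rules.

```java
import java.util.ArrayList;
import java.util.List;

public class LexerParserSketch {

    /** Lexer step: turn characters into single words (tokens), dropping whitespace. */
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                i++; // whitespace never reaches the parser
            } else if (Character.isLetterOrDigit(c) || c == '_') {
                int start = i;
                while (i < input.length()
                        && (Character.isLetterOrDigit(input.charAt(i)) || input.charAt(i) == '_')) {
                    i++;
                }
                tokens.add(input.substring(start, i)); // identifier, keyword or number
            } else {
                tokens.add(String.valueOf(c)); // single-character token such as ','
                i++;
            }
        }
        return tokens;
    }

    private static String at(List<String> tokens, int pos) {
        if (pos >= tokens.size()) {
            throw new IllegalArgumentException("unexpected end of input");
        }
        return tokens.get(pos);
    }

    private static void expect(List<String> tokens, int pos, String keyword) {
        if (!at(tokens, pos).equalsIgnoreCase(keyword)) {
            throw new IllegalArgumentException("expected " + keyword + " at token " + pos);
        }
    }

    /**
     * Parser step: recognise the phrase SELECT name (',' name)* FROM name
     * and return the selected names. Minimal on purpose: it does not check
     * that each name is a valid identifier.
     */
    static List<String> parseSelect(List<String> tokens) {
        int pos = 0;
        expect(tokens, pos++, "SELECT");
        List<String> columns = new ArrayList<>();
        columns.add(at(tokens, pos++));
        while (pos < tokens.size() && at(tokens, pos).equals(",")) {
            pos++; // consume the comma
            columns.add(at(tokens, pos++));
        }
        expect(tokens, pos++, "FROM");
        at(tokens, pos); // the source name must be present
        return columns;
    }

    public static void main(String[] args) {
        List<String> tokens = tokenize("SELECT  a ,b FROM m");
        System.out.println(tokens);              // [SELECT, a, ,, b, FROM, m]
        System.out.println(parseSelect(tokens)); // [a, b]
    }
}
```

The parser only ever sees whole tokens, which is why a rule placed on the wrong side of the lexer/parser boundary changes what the other side receives.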
4 comments:
Thanks for sharing. I've run into the same problem as you. I started using Antlr just a few days ago, and the reason I chose it was its supposedly well-supported documentation. As far as I can tell, that is not the case for v3.
The book is great. I'm not sure why you wouldn't count that as superior documentation for v3 instead of v2, which it clearly is. It would be pretty silly to use ANTLR and not get the book.
Thanks a lot!
Having worked with Lex and Yacc in the past, it was somewhat difficult to understand how ANTLR works. And you just answered my doubts.
That's what I was looking for... Thanks a lot