I described in a previous post how I was trying to parse Matlab code. I’ve given up on this endeavour because it is far too much work. We had designated a Matlab compiler as a last resort for getting everything stand-alone; instead, we will require Matlab in our project to speed up development.
As promised, I will still provide my incomplete and broken grammar. I hope somebody can make use of it, since we have dropped the compiler approach completely.
On a side note, we figured out why nobody has a complete grammar for Matlab: it’s too darn difficult. Even if you somehow manage to describe the syntax, implementing it (type checking and the internal Matlab functions, for example) is a heck of a lot of work. But it can be done; that much is obvious by now.
// Simple grammar for interpreting Matlab files for the MDDP project.
// Written by Berend Dekens
// Note: this part of the project is abandoned and will not be completed. The grammar mostly works in ANTLR except
// for some dodgy errors. If you find this useful and/or manage to fix the parser errors, please let me know so I can
// fix the problems.
// Known limitations:
// - No function calls without parentheses
// Matlab allows function calls in the form of 'function_name arguments'. This is annoying and thus not allowed.
// - No function calls (period)
// Currently we do not allow function calls at all. Implementing this means supporting a large portion of basic
// Matlab functions and support for declaring new functions across files. This is beyond the scope of this project.
// - No characters or strings in variables
// Our application works with matrices and vectors (integers). Boolean logic is included for the sake of logic blocks and loops.
// Grammar rules below; the grammar starts with a list of statements.
: statement (lineSep+ statementList? )?
: ';' | '\n' | ','
: 'if' parExpression statementList ( lineSep 'elseif' parExpression statementList)* ('else' statementList)? lineSep 'end'
| 'for' Identifier '=' (Identifier | integerLiteral) ':' (Identifier | integerLiteral) (':' (Identifier | integerLiteral))? statementList 'end'
: '(' expression ')'
: conditionalOrExpression (assignmentOperator expression)?
| '(' conditionalOrExpression (assignmentOperator expression)? ')'
: conditionalAndExpression ( '||' conditionalAndExpression )*
: equalityExpression ( '&&' equalityExpression )*
: relationalExpression ( ('==' | '!=') relationalExpression )*
: additiveExpression ( relationalOp additiveExpression )*
: multiplicativeExpression ( ('+' | '-') multiplicativeExpression )*
: unaryExpressionNotPlusMinus ( ( '*' | '/' ) unaryExpressionNotPlusMinus )*
: '~' unaryExpressionNotPlusMinus
| '!' unaryExpressionNotPlusMinus
| primary ('++'|'--')?
| Identifier (identifierSuffix)?
: ('[' ']')+ '.' 'class'
| ('[' expression ']')+ // can also be matched by selector, but do here
: expression (',' expression)*
: '(' expressionList? ')'
: Letter (Letter | Digit)*
Letter : 'a'..'z' | 'A'..'Z';
Digit : '0'..'9';
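
The expression rules above stack one operator family per rule, from '||' at the top, through '&&', equality, relational, additive and multiplicative operators, down to the unary '~' and '!'. As a rough illustration of what that layering means for precedence, here is a small Python sketch of a recursive-descent evaluator that follows the same order. It is not part of the original grammar: the names `tokenize`, `ExprParser` and `evaluate` are made up for this example, the relational operators (`<`, `>`, `<=`, `>=`) are an assumption because the grammar does not show that rule, and the sketch only handles integer literals and parentheses, with integer division and 0/1 for boolean results.

```python
import operator
import re

# Integers, then two-character operators, then single-character tokens.
TOKEN_RE = re.compile(r"\s*(\d+|\|\||&&|==|!=|<=|>=|[-+*/<>()~!])")


def tokenize(text):
    """Split an expression string into a list of token strings."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected input at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class ExprParser:
    """One method per precedence level, lowest precedence first."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        token = self.peek()
        self.pos += 1
        return token

    def binary_level(self, operand, ops):
        """Parse `operand (op operand)*`, folding left to right."""
        value = operand()
        while self.peek() in ops:
            apply_op = ops[self.take()]
            value = int(apply_op(value, operand()))
        return value

    def or_expr(self):
        return self.binary_level(self.and_expr, {"||": lambda a, b: bool(a) or bool(b)})

    def and_expr(self):
        return self.binary_level(self.equality, {"&&": lambda a, b: bool(a) and bool(b)})

    def equality(self):
        return self.binary_level(self.relational, {"==": operator.eq, "!=": operator.ne})

    def relational(self):
        # Assumed operator set; the grammar's relational operator rule is not shown.
        ops = {"<": operator.lt, ">": operator.gt, "<=": operator.le, ">=": operator.ge}
        return self.binary_level(self.additive, ops)

    def additive(self):
        return self.binary_level(self.multiplicative, {"+": operator.add, "-": operator.sub})

    def multiplicative(self):
        # Integer division is an assumption made to keep the sketch integer-only.
        return self.binary_level(self.unary, {"*": operator.mul, "/": operator.floordiv})

    def unary(self):
        if self.peek() in ("~", "!"):
            self.take()
            return int(not self.unary())
        return self.primary()

    def primary(self):
        token = self.take()
        if token == "(":
            value = self.or_expr()
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return value
        if token is not None and token.isdigit():
            return int(token)
        raise SyntaxError(f"unexpected token {token!r}")


def evaluate(text):
    """Evaluate an integer/boolean expression string and return an int."""
    parser = ExprParser(tokenize(text))
    value = parser.or_expr()
    if parser.peek() is not None:
        raise SyntaxError(f"trailing token {parser.peek()!r}")
    return value
```

Calling `evaluate("1 + 2 * 3")` multiplies before adding because the additive level asks the multiplicative level for each of its operands, which is the same nesting the grammar rules express. Like the grammar, the sketch has no unary plus or minus.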