Edgar Lopez's Master Thesis: parser


Thursday, June 9, 2011

OMCCp Thesis status: Published

This work is done!

I have a lot of people to thank for this, but that needs a post of its own. The thesis has finally been submitted and published.

The final URL is http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-68863

This ends this blog and the story of how I wrote my thesis.

As in a fairy tale, my examiner asked me to work part time for him to finish some further work related to my thesis. So this story ends with me sitting in the same place where I produced most of my thesis, but now as an employee.

This, then, is my stone for the field of science called Computer Science:

“People think that computer science is the art of geniuses but the actual reality is the opposite, just many people doing things that build on each other, like a wall of mini stones.” Donald Knuth

BibTeX

@misc{Lopez-Rojas11,
  author = {Lopez-Rojas Edgar Alonso},
  institution = {Linköping University, PELAB - Programming Environment Laboratory},
  pages = {233},
  title = {OMCCp : A MetaModelica Based Parser Generator Applied to Modelica},
  year = {2011}
}

Monday, May 30, 2011

OMCC Presentation coming soon

Only one day is left before the presentation. LaTeX Beamer helped me, for the first time, to produce a PDF for the presentation. It is quite uncommon but very interesting to use LaTeX for a presentation. The result looks quite professional, and only a few people will be able to tell that it is LaTeX.

Today I will do one last rehearsal. The presentation will take place in the room called Donald Knuth, which is fitting given Knuth's contribution to LR parsing in 1965.

Over the last week I have been working only on the presentation, and I reread some of the literature to have all the compiler concepts fresh in case the opponent comes up with tricky questions.

After remarkable progress with the grammar, with about 92% of the test suite working, OMCC is getting ready to be incorporated into the OMC compiler for OpenModelica.

Tuesday, May 10, 2011

My parser can parse itself

Finally, after several months of hard work, I have achieved a working OMCC parser generator for the MetaModelica and Modelica grammar. It still covers only a subset, but it is working quite well.

The grammar for Modelica files is about 700 lines of code, and the generated files are even longer. In the last test it took around 6 seconds to parse the whole file. That is quite slow, but it was running in debug mode, so further optimizations should be made before the end of the thesis.

A new version of the draft will be released soon. I hope to post the date of the presentation here very soon, once I have talked with my examiner.

This was the result of the last test:



***************-ACCEPTED-***************



 SUCCEED - (AST)

args:ParserModelica.mo

Total time:6.113409437

Friday, May 6, 2011

draft 1 delivered

I just finished the first draft of the thesis, and it is now under review by my supervisors. However, the test set for a subset of the Modelica grammar won't be enough for the examiner, so I am currently working on extending the grammar for the parser while waiting for comments on draft 1.

Tuesday, April 12, 2011

OMCC with error handling

I have been working very hard on the thesis lately. So far I have accomplished this:

* Lexer and Parser
* Lexer and Parser Generator
* Error Handling

The last part is now working like this:

It uses error recovery so that it can detect more than one error, but the parse still always fails at the end. Right now it only recovers by applying the ERASE token action; if there are further erroneous tokens nearby, it stops parsing.

From the papers I read, the primary error handling techniques (single correction) are insert, replace, or erase. So far I have not implemented MERGE, because my parser only understands tokens and has no idea about their semantic value. I need to implement a table to look up the semantic value, which is also important for displaying errors with the right string instead of the token name, as happens now.

 ***ERRORS WERE FOUND AND COULD NOT RECOVER ***



Syntax Error near [TOKEN:T_DO 'do' from (line:3 col:18) until (line:3 col:20)], ERASE token
Syntax Error near [TOKEN:T_ADD '+' from (line:4 col:13) until (line:4 col:14)], INSERT token {T_INTCONST or T_IDENT}
Syntax Error near [TOKEN:T_ADD '+' from (line:4 col:13) until (line:4 col:14)], REPLACE token with {T_LPAREN}
Syntax Error near [TOKEN:T_ADD '+' from (line:4 col:13) until (line:4 col:14)], ERASE token
Syntax Error near [TOKEN:T_IDENT 'y' from (line:8 col:5) until (line:8 col:6)], INSERT token {T_THEN}



program file:



 1  /* this is while test of comment */
 2  read x y z w;
 3  while x <> 99 do do
 4   ans := (x++1) - (y/3);
 5    write ans;
 6    read  x, y;
 7   if x = 10
 8      y := 234;
 9    endif
10  end
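The single-correction idea described above (try ERASE, INSERT or REPLACE at the offending token and keep the corrections that let parsing continue) can be sketched in a few lines of Python. This is only an illustration under assumptions: it uses a toy recursive-descent checker for expressions instead of the generated OMCC tables, and the names `first_error`, `single_repairs`, `VOCAB` and the `lookahead` threshold are hypothetical, not taken from the thesis code.

```python
# Toy token vocabulary (hypothetical subset, named after the tokens in the log above).
VOCAB = ["T_IDENT", "T_INTCONST", "T_ADD", "T_SUB", "T_LPAREN", "T_RPAREN"]


def first_error(tokens):
    """Check tokens against:  expr := term (('+'|'-') term)* ;
    term := IDENT | INTCONST | '(' expr ')'.
    Returns None if accepted, otherwise the index where parsing got stuck
    (len(tokens) means the input ended too early)."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else "T_EOF"

    def term():
        nonlocal pos
        if peek() in ("T_IDENT", "T_INTCONST"):
            pos += 1
        elif peek() == "T_LPAREN":
            pos += 1
            expr()
            if peek() != "T_RPAREN":
                raise SyntaxError
            pos += 1
        else:
            raise SyntaxError

    def expr():
        nonlocal pos
        term()
        while peek() in ("T_ADD", "T_SUB"):
            pos += 1
            term()

    try:
        expr()
        if pos != len(tokens):
            raise SyntaxError
    except SyntaxError:
        return pos
    return None


def single_repairs(tokens, lookahead=2):
    """Try every single-token correction at the first error and keep those
    that let the checker accept the input or advance `lookahead` tokens."""
    pos = first_error(tokens)
    if pos is None:
        return {}

    def survives(candidate):
        err = first_error(candidate)
        return err is None or err - pos >= lookahead

    repairs = {"ERASE": False, "INSERT": [], "REPLACE": []}
    if pos < len(tokens):  # ERASE and REPLACE need an actual offending token
        repairs["ERASE"] = survives(tokens[:pos] + tokens[pos + 1:])
        for tok in VOCAB:
            if tok != tokens[pos] and survives(tokens[:pos] + [tok] + tokens[pos + 1:]):
                repairs["REPLACE"].append(tok)
    for tok in VOCAB:
        if survives(tokens[:pos] + [tok] + tokens[pos:]):
            repairs["INSERT"].append(tok)
    return repairs


# The expression "(x++1)" from line 4 of the program file.
line4 = ["T_LPAREN", "T_IDENT", "T_ADD", "T_ADD", "T_INTCONST", "T_RPAREN"]
print(single_repairs(line4))
# {'ERASE': True, 'INSERT': ['T_IDENT', 'T_INTCONST'], 'REPLACE': ['T_LPAREN']}
```

In this sketch the `lookahead` check is what filters out corrections that only move the error one token further, such as replacing the second '+' with another identifier.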

My thesis is getting close to the end, and I hope it can be completed by the end of this month.

Friday, March 25, 2011

New omc-parser code project

I recently opened a Google Code project to handle all the source files of this master's thesis.

The link is here:

http://code.google.com/p/omc-parser/

Command-line access:

Use this command to anonymously check out the latest project source code:

# Non-members may check out a read-only working copy anonymously over HTTP.

svn checkout http://omc-parser.googlecode.com/svn/trunk/ omc-parser-read-only

It is a good place to keep my code safe and also to report all the issues that I get during the development.

I have just set up a development environment on Linux, and I want to be able to switch between both operating systems while keeping all my source code up to date.

Soon I will write an article about software configuration management for students that are developing a thesis, and my experience with this website will definitely help.

Wednesday, March 23, 2011

Tables generator ready

Now my work consists of building part of the code automatically from the C code generated by Flex and Bison.

To build this, the generator has to read the lexer.c and parser.c files and generate LexTable.mo and ParseTable.mo, which contain the arrays used by the automata, and LexCode.mo and ParseCode.mo, which contain the actions: for the lexer, the code run after a token is found, and for the parser, the code run when a shift-reduce rule fires to build the AST. The file Token.mo also needs to be rebuilt from the parser.c code.

To customize this machine for any language, the generated files have to be linked together with the main files Lexer.mo and Parser.mo, so those files are also generated, with a suffix that identifies the machine.

For my thesis, I am using exercise 10 and exercise 4 as the examples for testing the machine, so the machines will contain all these files with the suffixes 10 and 4 respectively.
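The table extraction step described above can be sketched roughly as follows. This is not the OMCC generator itself, just a minimal Python illustration under assumptions: the sample C text only imitates the shape of the integer tables that Flex and Bison typically emit, the Modelica output layout is a guess (the post does not show the contents of LexTable.mo or ParseTable.mo), and the names `extract_tables`, `to_modelica_array` and `table_package` are hypothetical.

```python
import re

# Matches integer tables such as
#   static const yytype_uint8 yytranslate[] = { ... };
#   static yyconst flex_int16_t yy_accept[4] = { ... } ;
ARRAY_RE = re.compile(
    r"static\s+(?:yy)?const\s+\w+\s+(\w+)\s*\[\s*\d*\s*\]\s*=\s*\{(.*?)\}\s*;",
    re.DOTALL,
)


def extract_tables(c_source):
    """Return {table_name: [int, ...]} for every integer table found."""
    tables = {}
    for name, body in ARRAY_RE.findall(c_source):
        tables[name] = [int(n) for n in re.findall(r"-?\d+", body)]
    return tables


def to_modelica_array(name, values):
    """Render one table as a Modelica constant (assumed output format)."""
    items = ", ".join(str(v) for v in values)
    return f"constant Integer {name}[{len(values)}] = {{{items}}};"


def table_package(kind, suffix, tables):
    """Wrap the tables in a package such as ParseTable10 (assumed naming)."""
    lines = [f"package {kind}Table{suffix}"]
    for name, values in tables.items():
        lines.append("  " + to_modelica_array(name, values))
    lines.append(f"end {kind}Table{suffix};")
    return "\n".join(lines)


# Made-up input imitating generated C code; real tables are much larger.
SAMPLE_C = """
static const yytype_uint8 yytranslate[] =
{
       0,     2,     1
};
static yyconst flex_int16_t yy_accept[4] =
    {   0,
        0,    3,   -1
    } ;
"""

tables = extract_tables(SAMPLE_C)
print(table_package("Parse", "10", {"yytranslate": tables["yytranslate"]}))
```

Passing the suffix as a parameter mirrors the idea of one set of generated files per machine, for example one set for exercise 10 and another for exercise 4.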

Some code cleanup is still needed, but it does not affect the current work; it will be done later, after the error handling in the parser is improved.