Edgar Lopez's Master Thesis: 2011

Thursday, June 9, 2011

OMCCp Thesis status: Published

This work is done!

I have to thank a lot of people for this, but I need another post for that. Now the thesis has finally been submitted and published.

The final URL is http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-68863

This ends this Blog and the story of how I made my thesis.

As so often happens in fairy tales, my examiner wanted me to work part-time for him in order to finish some further work related to my thesis. So this story ends with me sitting in the same place where I produced most of my thesis, but now as an employee.

This is then my stone to this field of science called Computer Science:

“People think that computer science is the art of geniuses but the actual reality is the opposite, just many people doing things that build on each other, like a wall of mini stones.” Donald Knuth

BibTeX

@misc{Lopez-Rojas11,

author = {Lopez-Rojas, Edgar Alonso},

institution = {Linköping University, PELAB - Programming Environment Laboratory},

pages = {233},

title = {OMCCp : A MetaModelica Based Parser Generator Applied to Modelica},

year = {2011}

}

Wednesday, June 8, 2011

OMCCP new name suggested

OMCCP is the new name of this tool. OpenModelica Compiler-Compiler Parser is now the name of this thesis project.

After a careful review by the examiner and a short discussion about how suitable it is to call a tool that only generates the parser a Compiler-Compiler, we decided to call it OMCCP, which stands for OpenModelica Compiler-Compiler Parser, as stated above.

The examiner has now graded the thesis as ACCEPTED. The final version, with the corrections included, has already been issued, and it is just a matter of time before the PDF is ready and officially accepted for online publication.

The last post of this blog will contain the URL where the latest version of the report is finally located.

Wednesday, June 1, 2011

Final version submitted for approval

Today I have submitted the final version of the report of the master's thesis to my examiner. He will take a look and approve or suggest changes for the final report.

I hope everything is alright and only minor changes from him will need to be applied.

Next week is my opposition, and after that I will be done with all the requirements that Linköping University sets for the final project.

Tuesday, May 31, 2011

Thesis presented!!

After a successful presentation today, all that remains is to correct a few mistakes in the report and wait for the examiner's approval of the final version.

Next week I will have a meeting with my examiner to define the final modifications for submitting the final version of this report.

The project is almost done!

Monday, May 30, 2011

OMCC Presentation coming soon

Only one day left until the presentation. The LaTeX Beamer package helped me, for the first time, to produce a PDF for the presentation. It is quite uncommon but very interesting to use LaTeX for a presentation. The result looks quite professional and only a few will be able to tell it is LaTeX.

Today I will do one last rehearsal to prepare for the presentation. It will take place in the room called Donald Knuth, which is nice given Knuth's contribution to LR parsing in 1965.

For the last week I have been working only on the presentation, and I reread some of the literature in order to have all the compiler concepts fresh in case the opponent comes up with tricky questions.

After remarkable progress with the grammar, with about 92% of the test suite working, OMCC is getting ready to be incorporated into the OMC compiler for OpenModelica.

Thursday, May 19, 2011

Final optimizations

I have increased the performance of my parser by simplifying the lexer and the parser. After I took out the recursion, lexing and parsing one token at a time was easier.

My examiner wants me to implement an external C function to improve the speed of the lexer, but the lexer is not my concern now. The time that the parser takes is considerably longer than the time the lexer takes.

More tests will be performed over this weekend while I keep growing the grammar, which now fails in only 98 of the 568 test cases.

Monday, May 16, 2011

Presentation for the thesis scheduled

The time has come to present my thesis.

I have 2 weeks to prepare a good presentation and make good use of the software I produced.

However, some improvements to the lexer are still required to improve its performance.

The presentation will take place on 31 May 2011 at 15:15.

Now it is time to prepare an awesome presentation!!

Thursday, May 12, 2011

Draft 1 released and grammar for Modelica+MetaModelica

The implementation of the grammar for Modelica+MetaModelica is almost done. Only a large subset will be implemented. However, it is large enough to parse my own OMCC, as shown in the last post.

Draft 2 incorporates the corrections to draft 1, along with some fixes and additions regarding the development of the grammar.

Thesis status: Implementation 99.9% (as usual, software projects never end).
Report: draft 2 sent.

Hopefully the presentation and defense will be scheduled soon.

Tuesday, May 10, 2011

My parser can parse itself

Finally, after several months of hard work, I have got the OMCC parser generator working with the MetaModelica and Modelica grammar. It is still a subset, but it is working pretty well.

The grammar file for Modelica is about 700 lines of code, and the generated files are even longer. In the last test it took around 6 seconds to parse the whole file. This is quite long, but it was running in debug mode. It means that some further optimizations should be made before the end of the thesis.

A new version of the draft will be released soon. I hope to post the date of the presentation here very soon, once I have talked with my examiner.

This was the result of the last test:



***************-ACCEPTED-***************



 SUCCEED - (AST)

args:ParserModelica.mo

Total time:6.113409437

Friday, May 6, 2011

Draft 1 delivered

I just finished the first draft of the thesis; it is now under my supervisor's review. However, the test set for a subset of the Modelica grammar won't be enough for the examiner, so I am currently working on extending the grammar for the parser while waiting for comments on draft 1.

Thursday, April 21, 2011

Extra help to finish OMCC and Opponents selected

Extra help has arrived to help me finish my report. Yes, the Amazon Kindle is finally with me, and here is the first glimpse of my thesis on a Kindle.

I am currently working through the various sections of the report, trying to finish everything by next week, when the first draft will be public.

That is why I have selected my opponents and also the thesis for which I will be the opponent, as part of the graduation requirements. Everything looks fine to start May with the revision of the draft and to plan the presentation for the middle of May.

Sunday, April 17, 2011

Roadmap to the end of the thesis

Finally, I am reaching the light at the end of the tunnel.

Soon I will be done with this project. This is the output of the last meeting with my supervisor: the road map until the end of the project.

Now my activities are 100% focused on finishing the first draft of the thesis. After that I will start to create the grammar for a subset of Modelica and use OMCC to build the compiler :D

It is very probable that some adjustments will be necessary, but I will have plenty of time until the presentation to fix them.

The initial goal of finishing the thesis by the end of April is looking more and more like a reality.

Soon ... the first draft :D

Tuesday, April 12, 2011

OMCC with error handling

I have been working on the thesis very hard lately. So far I have accomplished this:

* Lexer and Parser
* Lexer and Parser Generator
* Error Handling

The last part is now working like this:

It uses error recovery to detect more errors, but it always fails at the end. Right now it only recovers through the ERASE token action; however, if there are other tokens with errors nearby, it will stop parsing.

From the papers I read, I have these primary error handling techniques (single-token correction): insert, replace or erase. So far I have not implemented MERGE because my parser only understands tokens and has no idea about the semantic value of the tokens. I need to implement a table to look up the semantic value, which is also important for displaying the errors with the right string rather than with the token name, as happens now.

 ***ERRORS WERE FOUND AND COULD NOT RECOVER ***



Syntax Error near [TOKEN:T_DO 'do' from (line:3 col:18) until (line:3 col:20)], ERASE token

Syntax Error near [TOKEN:T_ADD '+' from (line:4 col:13) until (line:4 col:14)], INSERT token {T_INTCONST or T_IDENT}

Syntax Error near [TOKEN:T_ADD '+' from (line:4 col:13) until (line:4 col:14)], REPLACE token with {T_LPAREN}

Syntax Error near [TOKEN:T_ADD '+' from (line:4 col:13) until (line:4 col:14)], ERASE token

Syntax Error near [TOKEN:T_IDENT 'y' from (line:8 col:5) until (line:8 col:6)], INSERT token {T_THEN}



program file:



1/* this is while test of comment */

2read x y z w;

3while x <> 99 do do

4 ans := (x++1) - (y/3);

5  write ans;

6  read  x, y;

7 if x = 10

8    y := 234;

9  endif

10end
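
The single-token corrections reported above (ERASE, INSERT, REPLACE) can be pictured with a small Python sketch. It is only an illustration under stated assumptions, not the MetaModelica code of the thesis: the toy grammar (n, optionally followed by repeated + n), the function names is_valid and suggest, and the idea of re-checking the whole input after each candidate fix are all made up for this example. A real parser would normally only check that parsing can continue for a few tokens after the correction.

```python
def is_valid(tokens):
    """Toy grammar check (assumption): n (+ n)*"""
    if len(tokens) % 2 == 0:
        return False
    return all(t == ("n" if i % 2 == 0 else "+") for i, t in enumerate(tokens))


def suggest(tokens, pos, vocabulary):
    """Try every single-token correction at position pos and keep the ones that work."""
    fixes = []
    # ERASE: drop the offending token.
    if is_valid(tokens[:pos] + tokens[pos + 1:]):
        fixes.append(("ERASE", tokens[pos]))
    for cand in vocabulary:
        # INSERT: put a candidate token in front of the offending token.
        if is_valid(tokens[:pos] + [cand] + tokens[pos:]):
            fixes.append(("INSERT", cand))
        # REPLACE: swap the offending token for a different candidate.
        if cand != tokens[pos] and is_valid(tokens[:pos] + [cand] + tokens[pos + 1:]):
            fixes.append(("REPLACE", cand))
    return fixes


# Similar in spirit to "x++1": the second "+" is the offending token.
print(suggest(["n", "+", "+", "n"], 2, ["n", "+"]))
# [('ERASE', '+'), ('INSERT', 'n')]
```

As in the log above, more than one correction can be valid for the same token, which is why several messages may point at the same position.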

My thesis is getting close to the end; I hope it can be completed by the end of this month.

Wednesday, April 6, 2011

OMCC - (OpenModelica Compiler-Compiler)

Compiler-Compiler is a term that refers to a program that can generate a parser for a compiler, so the new name, after deliberation with my examiner, is OMCC. Bearing in mind that OMC is the command-line instruction for the current compiler, I think that OMCC will be easy to remember for new developers who start to work on this compiler and base their work on my thesis.

$omcc MetaModelica

  ..... Generating files

  DONE!

Saturday, April 2, 2011

OMC-LPG OpenModelica Compiler - Lexer and Parser Generator

OMC-LPG (OpenModelica Compiler - Lexer and Parser Generator) is the name I came up with for my thesis project. I realized while writing my report that it is difficult to write the long name all the time, so I needed a short, easy-to-remember and, of course, unique name for my project. This is how OMC-LPG was born.

The focus of my work this week has been mainly on the report, after setting up the SVN environment and fixing some bugs in OMC with my supervisor. It is time to rush with the report.

So far I have written about 3,600 words for the report in the last month. I don't count pages because the report is pretty bulky, but including the source code (which will change later) I got around 111 pages. By the way, this is a pretty easy way to count words in a PDF under Linux:

/thesis$ pdftotext MasterReport.pdf MasterReport.txt

/thesis$ wc -w MasterReport.txt

3591 MasterReport.txt



/thesis/code$ wc -w *.*

   171 lexer10.l

   934 Lexer10.mo

   595 LexerCode10.mo

   152 LexerCode.tmo

   992 LexerGenerator.mo

   930 Lexer.mo

   848 LexTable10.mo

   499 Main.mo

  1294 ParseCode10.mo

   137 ParseCode.tmo

   626 parser10.y

  2272 ParserGenerator.mo

  1151 Parser.mo

  1157 ParseTable10.mo

    40 SCRIPT.mos

   138 Token10.mo

 11936 total

LaTeX is a formidable tool for professional writers. I am glad I started early with this report, because I got the opportunity to play around a bit with it without time pressure.

I have set the last day of April as the deadline to finish both the project and the report. Let's see how this plan goes.

Friday, March 25, 2011

New omc-parser code project

I recently opened a Google Code project to handle all the source files of this master's thesis.

The link is here:

http://code.google.com/p/omc-parser/

Command-line access:

Use this command to anonymously check out the latest project source code:

# Non-members may check out a read-only working copy anonymously over HTTP.

svn checkout http://omc-parser.googlecode.com/svn/trunk/ omc-parser-read-only

It is a good place to keep my code safe and also to report all the issues that I get during the development.

I have just set up a development environment in Linux, and I want to be able to switch between both operating systems and keep all my source code up to date on each.

Soon I will write an article about software configuration management for students that are developing a thesis, and my experience with this website will definitely help.

Wednesday, March 23, 2011

Tables generator ready

Now my work consists of building part of the code automatically from the C code generated by Flex and Bison.

To build this, it is necessary to read the lexer.c and parser.c files and generate LexTable.mo and ParseTable.mo, which contain the arrays used by the automaton, together with LexCode.mo and ParseCode.mo, which hold the actions run after a token is found (for the lexer) and after a shift or reduce rule is applied to build the AST (for the parser). The file Token.mo also needs to be rebuilt from the parser.c code.
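
The table extraction step can be sketched in a few lines of Python. This is a rough illustration, not the generator written in MetaModelica for the thesis: the function name extract_int_arrays, the regular expression and the shortened SAMPLE text are assumptions, and the sketch only handles flat integer arrays written in the style that Flex and Bison normally emit (for example yy_accept or yypact).

```python
import re

# Matches "name[size] = { ... }" for flat initializer lists (assumed layout).
ARRAY_RE = re.compile(r"(\w+)\s*\[\s*\d*\s*\]\s*=\s*\{([^}]*)\}", re.S)


def extract_int_arrays(c_source):
    """Return {array_name: [ints]} for every flat integer array in the C text."""
    tables = {}
    for name, body in ARRAY_RE.findall(c_source):
        tables[name] = [int(n) for n in re.findall(r"-?\d+", body)]
    return tables


# Shortened, made-up excerpt in the style of generated lexer.c / parser.c files.
SAMPLE = """
static yyconst flex_int16_t yy_accept[6] =
    {   0,
        0,    0,    4,    2,    1
    } ;
static const yytype_int8 yypact[] =
{
      -3,     5,    -1
};
"""

print(extract_int_arrays(SAMPLE))
# {'yy_accept': [0, 0, 0, 4, 2, 1], 'yypact': [-3, 5, -1]}
```

Once the arrays are available as plain lists of integers, writing them out again as array constants in the target language is a simple formatting step.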

To customize this machine for any language, it is necessary to link the generated files together with the main files Lexer.mo and Parser.mo, so those files will also be generated with a suffix that identifies the machine.

For my thesis, exercise 10 and exercise 4 are the examples I am using for testing the machine, so the machines will contain all these files with the suffixes 10 and 4 respectively.

Some code cleaning is still necessary, but this does not affect the current work; it will be done later, after the error handling in the parser is improved.

Friday, March 11, 2011

Parser Prototype done

This week has been very productive. After a good explanation of how to build the AST, I managed to build the typed AST for MetaModelica. This means that the goals for the prototype have been reached and I can now move to the next phase, which is to build the Lexer Generator.

There are still things to do, like cleaning the code from unused variables and better performance of the loops, besides the error handling which is another major milestone in this project.

Thursday, March 10, 2011

The Parser is now Parsing

After dealing with the C code generated by BISON, with more than 15 arrays and crazy GOTO statements, I have managed to create the first parser in MetaModelica that parses the exercise 10 of the MetaModelica examples (for those who have read the MetaModelica Document).

For the rest of the mortals, what I have achieved is to make the code of a program be recognized by another program (a compiler's parser). However, there is more to do before this task is complete, including the generation of the AST, error handling and cleaning of the code.
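
Behind the arrays and GOTO statements of the generated C code there is a fairly small loop, which can be pictured with the Python sketch below. It is only an illustration and not the MetaModelica parser of the thesis: the grammar (rule 1: E -> E + n, rule 2: E -> n), the ACTION and GOTO dictionaries and the function name parse are assumptions chosen for the example, and real Bison output packs the same information into compressed integer arrays instead of dictionaries.

```python
# LR tables for the toy grammar (assumption):
#   rule 1: E -> E + n
#   rule 2: E -> n
ACTION = {
    (0, "n"): ("shift", 1),
    (1, "+"): ("reduce", 2), (1, "$"): ("reduce", 2),
    (2, "+"): ("shift", 3), (2, "$"): ("accept", None),
    (3, "n"): ("shift", 4),
    (4, "+"): ("reduce", 1), (4, "$"): ("reduce", 1),
}
GOTO = {(0, "E"): 2}
RULES = {1: ("E", 3), 2: ("E", 1)}  # rule number -> (left-hand side, length of right-hand side)


def parse(tokens):
    """Run the shift/reduce loop and return the rule numbers in reduction order."""
    tokens = list(tokens) + ["$"]  # "$" marks the end of the input
    stack = [0]                    # stack of automaton states
    pos = 0
    reductions = []
    while True:
        action = ACTION.get((stack[-1], tokens[pos]))
        if action is None:
            raise SyntaxError(f"unexpected token {tokens[pos]!r} at position {pos}")
        kind, arg = action
        if kind == "shift":
            stack.append(arg)
            pos += 1
        elif kind == "reduce":
            lhs, length = RULES[arg]
            del stack[-length:]                      # pop one state per right-hand-side symbol
            stack.append(GOTO[(stack[-1], lhs)])     # follow the goto for the left-hand side
            reductions.append(arg)
        else:  # accept
            return reductions


print(parse(["n", "+", "n"]))
# [2, 1]
```

Building the AST then amounts to running a semantic action at each reduce step, which is the part listed above as still to be done.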

It is now week 10 and hopefully 10 weeks more to go and finish this project. Next activities will be to focus on the MetaModelica code generation based on the Flex and Bison files.

Friday, February 25, 2011

Lexer prototype working

The first challenge of my thesis project is to understand how to program in the MetaModelica language. To get some understanding, I practiced by making the first Lexer Prototype using some sample automata from the compilers book.

This approach only gave me confidence in how to write code in MetaModelica. The next part was to use the Flex lexer generator to generate the C code for one of the exercises of the MetaModelica book. This code helped me to create the first Lexer that is able to understand the Flex tables and run a machine that generates the tokens required by the parser.
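
The idea of a lexer that just runs a machine over tables can be shown with a minimal Python sketch. It is an illustration only, under stated assumptions: the TRANSITIONS and ACCEPT tables, the character classes and the token names are invented for the example, and the real Flex tables are compressed integer arrays rather than a dictionary. What it shares with a Flex-style lexer is the longest-match loop driven entirely by table lookups.

```python
def char_class(ch):
    """Map a character to a class, so the table stays small (assumed classes)."""
    if ch.isalpha():
        return "letter"
    if ch.isdigit():
        return "digit"
    if ch.isspace():
        return "space"
    return "other"


# State 0 is the start state; missing entries mean "no transition".
TRANSITIONS = {
    (0, "letter"): 1, (1, "letter"): 1, (1, "digit"): 1,
    (0, "digit"): 2, (2, "digit"): 2,
    (0, "space"): 3, (3, "space"): 3,
}
# Accepting states and the token they produce; None means "skip this text".
ACCEPT = {1: "IDENT", 2: "INTCONST", 3: None}


def tokenize(text):
    """Return (token_name, lexeme) pairs using longest-match table lookups."""
    tokens = []
    pos = 0
    while pos < len(text):
        state, i, last_accept = 0, pos, None
        while i < len(text):
            nxt = TRANSITIONS.get((state, char_class(text[i])))
            if nxt is None:
                break
            state = nxt
            i += 1
            if state in ACCEPT:
                last_accept = (state, i)  # remember the longest match so far
        if last_accept is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        accept_state, end = last_accept
        name = ACCEPT[accept_state]
        if name is not None:
            tokens.append((name, text[pos:end]))
        pos = end
    return tokens


print(tokenize("x1 42 y"))
# [('IDENT', 'x1'), ('INTCONST', '42'), ('IDENT', 'y')]
```

Changing the language then means changing only the tables, while the loop stays the same, which is what makes it possible to generate the tables from another tool.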

Now the next challenge is to mimic the same behavior with BISON, which is a parser generator, but much more complex than FLEX.

Thursday, February 10, 2011

Modelica Conference

It was a good conference for getting some understanding of how big and stable this project called OpenModelica is (http://www.modprod.liu.se/MODPROD2011?l=en)

I got to listen to several speakers, including my thesis examiner Peter and my supervisor Martin.

A nice workshop given by Adrian Pop, one of the creators of the Modelica compiler, was also an interesting way to get some hands-on experience with it.

It is a good starting point for getting this project going with the coding.

Monday, January 10, 2011

First day at the "office"

From today, I will start coming to the office at LiU PELAB on weekdays. The task for today is to finish writing the Planning Report of my thesis for the next semester.

Some important points to add are the goal of the thesis and the full schedule for the rest of the semester. This document is very important because it will be like the first requirement document of my thesis and will be an agreement of time with my examiner.

My idea is to finish with the presentation of the thesis at the beginning of May. For this, the parser for OpenModelica must be completed by the end of March and the written thesis report should be in its first draft by the beginning of April.

This is why I will come to the office more often, since there is a lot of work to do and just a few months ahead.