BASIC Interpreter, Part II

This is a continuation of Part I. In Part II, we will add functionality to store and run BASIC programs, although still limited to the PRINT statement. We will also be adding a LIST statement to print out the currently stored program.

We will be adding a number of new C++ classes now. The overall architecture is to have a singleton class named Basic which maintains all the lines of code, and can update and execute them as needed. The program lines will be kept in a std::map<int, const Program*>, where Program is another new class we will be creating that is the base class for all BASIC program lines. Here is the header file basic.h:

#ifndef _BASIC_H_
#define _BASIC_H_

#include <map>

#include "program.h"

/*
This is a singleton class which contains all the program lines.
*/
class Basic {
public:
	void add(int index, const Program *program);		// add a new line to the program
	void list();										// list out all the existing lines
	void execute();										// run the program
	
	static Basic *instance();							// access the singleton instance
	
private:
	std::map<int, const Program*> lines;				// store the lines in a map
	
	static Basic *b;									// singleton instance
};

#endif

Here is the corresponding implementation, basic.cpp:

#include <iostream>

#include "basic.h"

using std::map;

Basic *Basic::b;

// add a program line at index, overwriting if it exists
void Basic::add(int index, const Program *program){
	// see if index already exists, if so delete it
	map<int, const Program*>::iterator it = lines.find(index);
	if( it != lines.end() ){
		const Program *old = it->second;
		delete old;
		lines.erase(index);
	}
	
	lines.insert(std::pair<int, const Program *>(index, program));
}

// print out the program lines
void Basic::list(){
	for( map<int, const Program *>::iterator it = lines.begin(); it!= lines.end(); ++it ){
		std::cout << it->first << " ";
		it->second->list(std::cout);
		std::cout << std::endl;
	}
}

// run the program
void Basic::execute(){
	for( map<int, const Program*>::iterator it = lines.begin(); it != lines.end(); ++it ){
		it->second->execute();
	}
}

// access the singleton instance, creating it if necessary
Basic *Basic::instance(){
	if( b == NULL )
		b = new Basic();
	return b;
}

The add function will insert an executable line into the map lines. If that line number already exists, the Program at the line will be deleted. We are using pointers to the programs because they will are derived classes, and we want to call the virtual functions belonging to the instances, not to the base class. The list function will print out all the source lines, in order, to the standard output stream. The function execute steps through each program line and runs it. The static function instance returns the singleton instance of the Basic class, creating it for the first time if necessary.

Here is the header file program.h, which declares the Program base class. The base class has two functions, which all sub-classes must implement:

#ifndef _PROGRAM_H_
#define _PROGRAM_H_

#include <iostream>

/*
This is the base class for executable program lines
*/
class Program{
public:
	virtual void execute() const;				// run this line of the program
	virtual void list(std::ostream& os) const;	// list this line
};

#endif

The implementation file is pretty self-explanatory, program.cpp:

#include "program.h"

// nothing to do in the base class
void Program::execute() const{
}

// if you ever see this, something is wrong
void Program::list(std::ostream& os) const{
	os << "GENERIC PROGRAM (SHOULDN'T SEE THIS)";
}

Since we decided to make the PRINT statement our first implemented BASIC call, we create a Print subclass of Program. Our PRINT statement will take in one or more Expressions to print, and provides an implementation of the execute and list functions it inherits from Program. Here is the header file print.h:

#ifndef _PRINT_H_
#define _PRINT_H_

#include <vector>

#include "program.h"
#include "expression.h"

using std::ostream;

/*
This is the implementation for the PRINT statement
*/
class Print : public Program {
public:
	Print(const std::vector<Expression>& exprList);	// create with a vector of expressions to print

	virtual void execute() const;					// print the expression
	virtual void list(ostream& os) const;			// list this statement
	
private:
	std::vector<Expression> exprList;				// store the expressions here
};

#endif

Here is the source file, print.cpp:

#include <iostream>

#include "print.h"

using std::endl;
using std::cout;

// constructor for Print class
Print::Print(const std::vector<Expression>& exprList){
	this->exprList = exprList;
}

// prints out each expression to std::cout
void Print::execute() const{
	for( int i = 0; i < exprList.size()-1; i++ ){
		cout << exprList[i].value() << ' ';
	}
	cout << exprList[exprList.size()-1].value() << endl;
}

// lists the expressions, as they were originally given
void Print::list(ostream& os) const{
	os << "PRINT ";
	for( int i = 0; i < exprList.size()-1; i++ ){
		os << exprList[i].print() << ", ";
	}
	os << exprList[exprList.size()-1].print();
}

The constructor takes a reference to a std::vector<Expression>, since we never modify anything once the statement has been created. The execute function prints out the value of each Expression in order, separated by commas. The list function prints out the source definition of each Expression.

The last piece of the puzzle is the Expression class. An Expression at this stage will only be a text string, but will later include all data items, such as numbers, variables, functions, and operators. Here is the header file, expression.h:

#ifndef _EXPRESSION_H_
#define _EXPRESSION_H_

#include <string>

using std::string;

/*
Base class used for storing and evaluating data items
*/
class Expression {
public:
	Expression(const char *text);		// our first implementation will take a string as input
	
	const string value() const;			// return the stored value
	const string print() const;			// printable version
	
private:
	string text;						// data storage
};

#endif

Note that once the object is created, the only accessor functions are declared const, since they will will either evaluate, or print the value, but never change it. Here is the implementation, expression.cpp:

#include "expression.h"

#include <iostream>

// create a new Expression, storing its text
Expression::Expression(const char *text){
	this->text = std::string(text);
}

// return the text value
const string Expression::value() const{
	return text;
}

// return a string for printing
const string Expression::print() const{
	return '"' + text + '"';
}

The constructor takes a c-style char * and stores it in the C++ string member variable text. The function value is designed to evaluate the expression, which for a string means simply returning the stored text. The printable version, returned by print, adds double quotes, to provide the original definition.

Now that we have our implementation created, we need to update the scanning and parsing to take advantage of our new features. Let’s start with the updates we need to make to the Bison file. In the C Declarations portion at the top, we need to include a couple more STL header files:

#include <vector>
#include <string>

At the bottom of that section, we will include our new local header files:

#include "basic.h"
#include "expression.h"
#include "print.h"
#include "program.h"

We need to add one more token type to our available options in the Bison Declarations section:

// token type definition
%union {
	int iVal;
	char *sVal;
	Program *progVal;
}

Next, add a new terminal token for the LIST operation:

%token LIST

At this point we will begin needing token types defined for some of our non-terminal symbols. Add this to the end of your Bison declarations to declare that the program non-terminal symbol will be of the type progVal, defined above in the Bison union:

// non-terminal symbols
%type <progVal> program

The grammar rules start off the same, keeping the input and line definitions. However, we will be modifying the stmt definition as follows:

stmt:
	LINE program		{ Basic::instance()->add($1, $2); }
	| RUN			{ Basic::instance()->execute(); }
	| LIST			{ Basic::instance()->list(); }
;

This is where we begin making use of our new C++ classes. If the Bison parser matches a line number followed by a program rule, it will add it to the Basic singleton instance. The $1 represents the LINE token, which we defined to be of type intVal, i.e., an int. The $2 represents the program non-terminal token, which we defined to be of type progVal, i.e., a Program *. This matches the signature for the Basic::add member function: void add(int index, const Program *program), so the C++ call will succeed. Likewise, matching the terminal symbols RUN or LIST will result in calls to the Basic singleton instance to run or list the stored program.

Next, update the program non-terminal symbol to match a PRINT statement:

program:
	PRINT STRING		{
					Expression e($2);
					vector<Expression> v(1, e);
					free($2);	// malloced in basic.l
					$$ = new Print(v);
				}
;

When Bison reads a PRINT token followed by a STRING token, it will create a new instance of the Expression class, passing the string to its constructor. It then puts the Expression instance into a vector; frees the string (since we allocated space for it manually in flex); and creates a pointer to a new Print instance, which is returned as the value of the entire program non-terminal token, to be consumed above by the stmt rule. Although our Print class supports taking in a vector of multiple expression to print, for now in our parser we only support a single string.

The remainder of the Bison input file, consisting of the main and yyerror functions, stays the same. Let’s take a look at the changes we made to the flex input file:

%option noyywrap

%{
#include 

#define YY_DECL extern "C" int yylex()

 class Program;

#include "basic.tab.h"

%}

%%

[ \t]+	;	// skip white space
[0-9]+		{ yylval.iVal = atoi(yytext); return LINE; }
\"[^\"\n]*\"	{	// remove the quote marks from either end
			yylval.sVal = (char *)malloc(sizeof(char)*strlen(yytext)-1);
			strncpy(yylval.sVal, yytext+1, strlen(yytext)-2);
			yylval.sVal[strlen(yytext)-2] = '\0';	// terminate the string!
			return STRING;
		}
PRINT		{ return PRINT; }
RUN		{ return RUN; }
LIST		{ return LIST; }
\n		return ENDL;

%%

The first new line, %option noyywrap, tells flex that it will not need to parse multiple input files. This removes the dependence on the flex library to be linked in, so we will remove that from the make file. Next, in the C declarations section, we put class Program;. This is because the Bison type union now has a Program * defined, which gets picked up in the file basic.tab.h that we include. Since we don’t actually use that class inside flex, we just declare it instead of importing its header file. The next change is our string matcher. The regular expression is the same, but we want to remove the quotation marks from either end of what we read. So instead of duplicating the string, we copy out the inside, and then make sure to terminate the new string with a ‘\0’. The last new part is that we added a rule to match the LIST statement.

Of course, we also need to update the make file to include all the new header and source files we have created, and remove the flex library link:

.PHONY: all clean

all:	basic.tab.c lex.yy.c \
		basic.h basic.cpp \
		program.h program.cpp \
		print.h print.cpp \
		expression.h expression.cpp
	g++ basic.tab.c lex.yy.c program.cpp basic.cpp print.cpp expression.cpp -o basic

basic.tab.c: basic.y
	bison -d basic.y
	
basic.tab.h: basic.y
	bison -d basic.y

lex.yy.c: basic.tab.h basic.l
	flex basic.l

clean:
	rm basic.tab.c basic.tab.h lex.yy.c basic

You can now make and run the new basic application. Here is an example session:

10 PRINT "hello"
20 PRINT "Hi"
RUN
hello
Hi
LIST
10 PRINT "hello"
20 PRINT "Hi"

The complete source files from this part are available here: https://github.com/VisceralLogic/basic/tree/part-2

Continue to Part III.

2 thoughts on “BASIC Interpreter, Part II

  1. Pingback: BASIC Interpreter, Part I | Visceral Logic Programming

  2. Pingback: BASIC Interpeter, Part III | Visceral Logic Programming

Leave a Reply

Your email address will not be published. Required fields are marked *