The world between runtime programming and scripting

Here is a problem for all the theoretical computer scientists out there.  Say you want to write a program that parses a certain file structure and extracts useful data.  A single file is set up as a bunch of independent records, each built from a handful of fields.  There are almost 1000 different kinds of fields and close to 200 different kinds of records.  The output requirements are standard duplicate checking and removal, plus programmable output, meaning you can change how the output looks or is handled without changing the program.

If you can think of a good solution, feel free to post your ideas.  Today I will be writing about my solution: the things that worked, the things that did not, and the things that still do not.

The situation I am referring to here is parsing a call data file known as a BAF file.  The file structure and definitions for these files are described in detail in a document known as the GR-1100, which is published by Telcordia.

To parse this file, your program must recognize about 1000 different field types and 200 different record types.  Fields are organized in records; you could say that records are defined by their fields.  Also, there are about 100-200 defined modules that can theoretically be appended to any record.  Each module is again a set of fields.  So if you were to summarize this file, it would be laid out like this:

Record Start
Fields
Record End
Any number of modules
New Record

I have already talked about pluggable factories, which is the main design element I chose when first designing bafprp.  I decided to create a maker factory for fields and records, and define each field and record as a class.  Now this was BEFORE I found out exactly how many different types of fields and records there were in total.  All I had to build off of was an open source project called bafview which contained a very small subset of the actual fields and records that are defined in the GR-1100.  I only recently learned about the inadequacies of this subset when I was notified about missing records and fields that were not in bafview, and thus, not in bafprp.

When I released version 1.0 of bafprp it contained only about 100 different fields and records, a number small enough that the maker factory pattern made sense.  Obviously, however, this needed to change if bafprp was to become a complete GR-1100 compliant program.  So now we must free ourselves from the basic design and go back to the drawing board, something that many designers, myself included, are very bad at doing, and so I did not.  Instead I thought about another solution, which extended the concept of pluggable factories into what I will very loosely call run time programming.
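
For readers who have not seen the earlier post, here is a minimal sketch of the pluggable (maker) factory idea. It is an illustration only: the `Field` interface, the `newField` function, and the registry layout are assumptions made for this example and are not copied from bafprp.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical base class; bafprp's real field interface may differ.
class Field {
public:
    virtual ~Field() = default;
    virtual std::string typeName() const = 0;
};

// Factory base: every concrete maker registers itself under a type name.
class FieldMaker {
public:
    virtual ~FieldMaker() = default;

    // Find the maker registered for `type` and ask it for a new field.
    // Returns an empty pointer when no maker is registered under that name.
    static std::unique_ptr<Field> newField(const std::string& type) {
        auto it = registry().find(type);
        if (it == registry().end()) return nullptr;
        return it->second->make();
    }

protected:
    explicit FieldMaker(const std::string& type) { registry()[type] = this; }
    virtual std::unique_ptr<Field> make() const = 0;

private:
    // A function-local static sidesteps static initialization order problems.
    static std::map<std::string, FieldMaker*>& registry() {
        static std::map<std::string, FieldMaker*> makers;
        return makers;
    }
};

class DateField : public Field {
public:
    std::string typeName() const override { return "date"; }
};

class DateFieldMaker : public FieldMaker {
public:
    DateFieldMaker() : FieldMaker("date") {}

protected:
    std::unique_ptr<Field> make() const override {
        return std::unique_ptr<Field>(new DateField());
    }
};

// Constructing this global object is what plugs the maker into the registry.
static const DateFieldMaker dateFieldMakerInstance;
```

With one class and one maker per field this scales poorly to a thousand field types, which is the problem the rest of this post works around. Calling `FieldMaker::newField("date")` returns a `DateField`, and an unregistered name returns an empty pointer.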

Now do not jump to any conclusions about a self-correcting, self-building, automated executable.  This is NOT run time programming in the strict sense of the term.  What I did was allow certain classes to be extended at run time, meaning that a basic class is interpreted hundreds of different ways depending on run time data.  Using this method I created a handful of basic classes that can be extended by strings passed in by a user to define the specifics of a field.  The data for the default fields is hard coded into the program, but the user has complete control over any field through two specific command line options.

There are some similarities to basic scripting, if that is what you were thinking.  There is one major difference, however: bafprp remains in complete control of the program.  With scripting languages the program surrenders part of itself so the user can change default behavior.  In bafprp the basic classes remain in complete control of the operation, while the user has complete control over the data.  That is something I have not yet seen done in any other program.

So by now, if you are still reading, you are probably itching for some source code.  The design is pretty simple: we have eight or so basic types using the pluggable factory design.  These basic types operate on string data passed to them.  How they operate and what string syntax they accept is completely up to them, but for these elements I had them all take their operating data as pairs of strings: a property name and a property value.
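
As a rough illustration of the name/value idea, the sketch below splits a "name:value" string at the first colon and stores the pair. The `addProperty` helper is hypothetical, and the choice of a multimap is an assumption, made because one name (such as "switch") may need to appear more than once.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Assumed container: a multimap, so that one property name can repeat.
typedef std::multimap<std::string, std::string> property_map;

// Split "name:value" at the first colon and store the pair.
// Returns false when the string has no colon or the name is empty.
bool addProperty(property_map& props, const std::string& prop) {
    std::string::size_type colon = prop.find(':');
    if (colon == std::string::npos || colon == 0) return false;
    props.insert(std::make_pair(prop.substr(0, colon), prop.substr(colon + 1)));
    return true;
}
```

Splitting at the first colon only means a value may itself contain colons, so "desc:a:b" is stored as the name "desc" with the value "a:b".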

Here is a simple example from my date field type:

std::string DateField::getString() const
	{
		LOG_TRACE( "DateField::getString" );

		std::string ret;
		if( !_converted )
		{
			LOG_WARN( "Tried to get string before field was converted" );
			ret = "";
		}
		else
		{
			char year[5] = "";
			time_t ltime;
			struct tm* mytm;
			ltime = time( NULL );
			mytm = localtime( &ltime );
			strftime( year, sizeof( year ), "%Y", mytm );
			year[3] = _return[0];

			property_map::const_iterator itr = _properties.find( "format" );
			if( itr != _properties.end() && itr->second.find( "Y" ) != std::string::npos && itr->second.find( "M" ) != std::string::npos && itr->second.find( "D" ) != std::string::npos )
			{
				ret = itr->second;
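				// Overwrites strlen( year ) characters (four) starting at the first "Y",
				// so a format that spells the year as "YYYY" is replaced cleanly, while a
				// lone "Y" also consumes the three characters that follow it.
				ret.replace( ret.find("Y"), strlen( year ), year );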
				ret.replace( ret.find("M"), 1, _return.substr(1,2) );
				ret.replace( ret.find("D"), 1, _return.substr(3,2) );
			}
			else
			{
				std::ostringstream os;
				os << year << "-" << _return[1] << _return[2] << "-" << _return[3] << _return[4];
				ret = os.str();
			}
		}

		LOG_TRACE( "/DateField::getString" );
		return ret;
	}
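
The format substitution can also be tried on its own, outside the class. The sketch below is a simplified variant, not the bafprp code: `formatDate` is a hypothetical free function, the leading three digits of the year are passed in as `decade` instead of being read from the system clock, and every Y, M, or D in the format is substituted rather than only the first.

```cpp
#include <cassert>
#include <string>

// raw is the five-character field laid out as "YMMDD". decade supplies the
// leading three digits of the year (for example "200"); the original code
// derives them from the current system time instead.
std::string formatDate(const std::string& raw, const std::string& decade,
                       const std::string& format) {
    if (raw.size() != 5) return "";
    const std::string year = decade + raw[0];
    const std::string month = raw.substr(1, 2);
    const std::string day = raw.substr(3, 2);

    // Fall back to year-month-day output when any placeholder is missing.
    if (format.find('Y') == std::string::npos ||
        format.find('M') == std::string::npos ||
        format.find('D') == std::string::npos)
        return year + "-" + month + "-" + day;

    // Build the result in a single pass so substituted digits are never
    // scanned again for placeholders.
    std::string out;
    for (std::string::size_type i = 0; i < format.size(); ++i) {
        switch (format[i]) {
            case 'Y': out += year; break;
            case 'M': out += month; break;
            case 'D': out += day; break;
            default: out += format[i];
        }
    }
    return out;
}
```

For example, `formatDate("90115", "200", "M/D/Y")` yields "01/15/2009", and an empty format falls back to "2009-01-15".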

The date field takes a format property and changes the output of its converted data accordingly.  This is an example of a basic property that each field can have.  For a more complex operation, here is the same function for the switch field:

std::string SwitchField::getString() const
	{
		LOG_TRACE( "SwitchField::getString" );

		std::string ret = "";
		if( !_converted )
		{
			LOG_WARN( "Tried to get string before field was converted" );
			ret = "";
		}
		else
		{
			props_pair switches = _properties.equal_range( "switch" );
			std::string sw;
			if( switches.first == switches.second )
			{
				// No "switch" property so assume we switch on character 0
				sw = "0" + _return;
				sw.resize( 2 );
				property_map::const_iterator string = _properties.find( sw );
				if( string != _properties.end() )
					ret = string->second + " ";
				else
					ret = "Unknown: " + sw.substr(1);
			}
			else
			{
				for( property_map::const_iterator pos = switches.first; pos != switches.second; pos++ )
				{
					sw = pos->second + (char*)&_return[ atoi( pos->second.c_str() ) ];
					sw.resize(2);
					property_map::const_iterator string = _properties.find( sw );
					property_map::const_iterator desc = _properties.find( pos->second );
					if( string != _properties.end() )
					{
						if( desc != _properties.end() )
							ret += desc->second + " = " + string->second + " : ";
						else
							ret += string->second + " : ";
					}
					else
						ret += "Unknown: " + sw.substr(1) + " : ";
				}
				ret.resize( ret.length() - 3 ); // remove last ':'
			}
		}
		LOG_TRACE( "/SwitchField::getString" );
		return ret;
	}
}

Now here is my general disclaimer: if you see problems with this code, feel free to let me know.  I know it is not 100% efficient or perfected; I have never done this before, so I was making up the design as I wrote the code.  Eventually I will do yet another rewrite and fix the code with a more complete understanding of the design.

But the switch function needs some explaining.  I think it would be best to look at an example switch field:

		FieldMaker::setFieldProperty( "calledpartyanswerindicator", "datatype:switch" );
		FieldMaker::setFieldProperty( "calledpartyanswerindicator", "size:1" );
		FieldMaker::setFieldProperty( "calledpartyanswerindicator", "desc:Called Party Answer Indicator" );
		FieldMaker::setFieldProperty( "calledpartyanswerindicator", "switch:0" );
		FieldMaker::setFieldProperty( "calledpartyanswerindicator", "00:Called Party Answer Detected" );
		FieldMaker::setFieldProperty( "calledpartyanswerindicator", "01:Called Party Answer not Detected" );
		FieldMaker::setFieldProperty( "calledpartyanswerindicator", "02:Answered Attempt" );
		FieldMaker::setFieldProperty( "calledpartyanswerindicator", "03:Simulated Called Party Off-Hook Indicator" );
		FieldMaker::setFieldProperty( "calledpartyanswerindicator", "04:NCD, CAS, Blocked After Answer" );
		FieldMaker::setFieldProperty( "calledpartyanswerindicator", "05:NCD, CAS, Blocked Before Answer" );
		FieldMaker::setFieldProperty( "calledpartyanswerindicator", "07:Service Features Not Provided, Call Answered" );
		FieldMaker::setFieldProperty( "calledpartyanswerindicator", "08:Service Features Not Provided, Call Unanswered" );

OK, so first we set the basic data that defines a field: datatype, size, and desc. This is all that is required to make a field. We then set up a switch on character zero by setting the switch property to zero. The properties after that list the values for the switch: if the switch character is zero, print “Called Party Answer Detected,” and so on. The syntax for each of these is the property name before the value, separated by a colon. This of course is completely arbitrary and can be anything, depending on how you want to assign properties. The property name syntax is the switch number first, then the switch value. Only two characters are required because I only allow one character switches.
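
To make the key construction concrete, here is a small sketch of a single switch lookup. It is a simplification of the function above, not a replacement for it: `decodeSwitch` is a hypothetical helper, and the multimap typedef is an assumption about how the properties are stored.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Assumed container type for the name/value properties.
typedef std::multimap<std::string, std::string> property_map;

// Decode one switch position. The lookup key is the position digit followed
// by the character found at that position in the raw field value, so
// position 0 with raw value "2" looks up the property named "02".
std::string decodeSwitch(const property_map& props, const std::string& raw,
                         std::string::size_type position) {
    // Only single-digit positions inside the raw value can form a key.
    if (position > 9 || position >= raw.size()) return "Unknown";
    std::string key;
    key += static_cast<char>('0' + position);
    key += raw[position];
    property_map::const_iterator it = props.find(key);
    if (it == props.end()) return "Unknown: " + key.substr(1);
    return it->second;
}
```

With the properties from the example field loaded, a raw value of "2" at position 0 builds the key "02" and returns "Answered Attempt", while a raw value of "6" has no matching property and returns "Unknown: 6".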

The trick here is how the switch class uses its properties.  For the date object, all the property did was change the format of the date, whereas the switch depends completely on its properties being set correctly in order to function.  The properties of a switch class define the switch, while the properties of the date class hardly do anything.  It is this range of effective designs that makes the application successful and indicates proper orthogonality, which is the most important design philosophy.

I am finding that increasingly complex posts are getting difficult to write and even more difficult to read, so from now on I am going to try to keep posts small and focus on a specific element instead of the whole design. I left a lot out of this post simply because this is a blog, not a research paper, so I will try to get some more data posted soon.
In the meantime, you can always check out the source at bafprp’s web site.

  1. XenoMuta

    You are a genius. This makes the program infinitely extensible.
