20071111

Restructuring query languages.

SQL is the de facto language for the manipulation of relational data. Designed in the early 70s and standardized in the mid-80s, SQL has dozens of commercial and open-source servers written for it and enjoys being a language with almost no major competitors in its field.

It seems that all major relational databases are basically interpreters for various flavors of SQL. I've looked around from time to time for a decent alternative to SQL.

I haven't found anything interesting that was particularly useful out there.

I'm going to present for comments some of the ideas I have come up with to fix this. As you read below, know there is not yet an actual interpreter for my language. There will be, but I am not yet finished with it. I am writing an interpreter based on sqlite3 using lemon to create the parser and targeting the sqlite3 virtual machine which I will extend very minimally to handle some functionality I want that doesn't quite fit into their model.

My language is very much based on the entity-relationship model. SQL is often used to model this approach. It can be clumsy to fit into the tables of data paradigm, but it can.

The easiest way to explain my language of course will be to show a brief interpreter session. So here it goes.


Interpreter for AsYetUnnamedProgrammingLanguage by MichaelSpeer v0.-1

>

Let's begin by creating an entity to manipulate. We'll use products, something nicely generic that are probably manipulated by everyone reading this at some point.

> declare
. Products
. :name PHRASE
. :price MONEY'USD
. ;

There you are. We have a database containing products, each of which has a member specifying the name ( PHRASE : a string with no newlines ) and a price ( MONEY : a numeric datatype insured to two decimal places without fear of truncation errors, rounding errors, poor approximations or anything else silly like that. The tick after the MONEY datatype is something for which I do not yet have a permanent terminology, though I like it. I'll call it a shadow type. It is a small comment that must be present on anything that interacts with the particular datatype. So 3'USD + :price is good, whereas 3 + :price is an error. This is just an enforced helper to prevent 3'YEN + 3'USD from not being an error and giving a six.

The syntax for creating some instances of the Products entity is as follows :

> create Products [ :name , :price ] =
. [ "DVD Player" , 20'USD ] ,
. [ "VCR" , 15.5'USD ] ,
. [ "DVD" , 7'USD ] ,
. [ "VCR Tape" , 4'USD ]
. ;

So now we have Products in the system.

> Products ;
[ 1 , 2 , 3 , 4 ]

> Products:id ;
[ 1 , 2 , 3 , 4 ]

These are selectors. Selectors select what information from where will be available for call-back to the application.

As you can see, if you do not give an expressions_list or end the selector with a valuating member, then it will default to the id of whatever the current object is.

> Products[ :id , :name ] ;
[ 1 , "DVD Player" ]
[ 2 , "VCR" ]
[ 3 , "DVD" ]
[ 4 , "VCR Tape" ]

By giving the expressions_list, the values in the selected attributes are returned to the caller.

So this doesn't allow anything interesting yet. The path to usefulness starts with filters.

> Products( :price < 5'USD )[ :name ];
[ "VCR Tape" ]

I mentioned earlier the difference between putting an expressions_list at the end of the selector vs placing an object_member there.

> Products( :price > 10'USD ):name ;
[ "DVD Player" , "VCR" ]

> Products( :price > 10'USD )[ :name ] ;
[ "DVD Player" ]
[ "VCR" ]

Of course all of this would be a huge pain if there was no way to sort them. Sorting expressions can be interpolated directly into the filters.

> Products( asc :name ):name ;
[ "DVD" , "DVD Player" , "VCR" , "VCR Tape" ]

So far we've looked at how to manipulate a single object type. Now let's add a new one to put these products into some categories.

> declare
. Categories
. :name PHRASE
. :sub-products -> Products:category single
. ;

The declaration of the Categories object-type and the :name member looks the same as last time. :sub-products is the new item. This line is declaring a relationship. The `Products:category single` section indicates that the relationship will interconnect Categories ( the current item ) and Products. The relationship can be referred to from a set of Categories using the :sub-products member and from a set of Products using the :category member. Furthermore, the caveat `single` means that any instance of a Products object may only relate to a single Categories set object. The caveat can appear on either or both sides.

Setting relationships is accomplished by placing a selector in the respective slot of the expressions_list being used to set the members. id est

> create Categories [ :name , :sub-products ] =
. [ "Video Players" , Products( :name in [ "DVD Player" , "VCR" ] ) ]
. ;

Here is part of what I like about this syntax. The `:name` term above is a relative selector. When a filter has a selector that isn't grounded it is assumed to be relative to the current object in the filters. So the `:name` above is relative to Products. A filter of that nature can be read as objects from Products where Products:name is "DVD Player" or Products:name is "VCR". There isn't a limit to how many filters are applied either.

If you specify a relationship member as part of the filters then at that point a unique set of items available at that point are selected.

> Products:category:name ;
[ "Video Players" ]

> Products[ :category:name ] ;
[]
[ "Video Players" ]
[]
[ "Video Players" ]

Using the expression_list causes a separate return for each item whereas selecting the attribute directly takes the unique set of values from that attribute.

The reason that this happens is because in the first item the :category filter is selecting the set of all unique categories referred to by the set of current Products. The second selector is selecting each of the current set of Products ( all of them since no filter is applied ) and then for each of them returning a list of the :name attributes on the Categories in the set returned by the :category entry.

One of the things that will be immediately noticed is that I intend for my language to be able to represent and send out complex structures instead of just flat tables. The instruction to process a sub-list is the alteration to the virtual machine that will need to be made. I hope to link the language to python first planning to take advantage of the bindings that already exist in the builtin sqlite3 module.

Here are some more useful examples.

> Products( :category ):name ; -- returns products that have a category
[ "DVD Player" , "VCR" ]

> Products( not :category ):name ; -- returns products lacking a category
[ "DVD" , "VCR Tape" ]

> Products( :id != 3 && :id != 4 )[ :name ] ;
[ "DVD Player" ]
[ "VCR" ]

I will add the other two products to a category to demonstrate some more complex queries.

> create Categories [ :name , :sub-products ] =
. [ "Blank Media" , Products( not :category ) ]
. ;

> Products( :category( :sub-products( :name == "DVD" ) ) ):name ;
[ "DVD" , "VCR Tape" ]

That second query is returning the names of Products that have Categories that have products in them named "DVD". The same query is entirely possible in SQL. Just far more complex.

This language doesn't do a whole lot that cannot be accomplished with SQL or other information engines. It does it easier. This, to me, is better.

Before I conclude we'll raise the price of media by a dollar per item and add another entry to Categories.

> alter Products( :category( :name == "Media" ) )[ :price ] = [ :price += 1'USD ] ;

> declare Categories:recommended -> :sub-products:recommended-by ;

This declaration creates an attribute that will be allowed to point at any number of items that are found in the same items :sub-products list.

> alter Categories( :name == "Media" )[ :recommended ] = [ :sub-products( :name == "DVD" ) ] ;

There will be a caveat called `automatic` that will allow adding items not already in :sub-products to :recommended wherein the added item would also be added to :sub-products and removing the recommended item from :sub-products would remove it from :recommended. Otherwise trying these things will result in an error.

The + sign will be a union operator against sets and the - sign will give all of the first set except those in the second set. As well union( ... ) and except( ... ) functions will likely exist.

This should be enough information to gather thoughts from others on. So, please leave any thoughts you have so I can roll them into the work I am doing.

The current Categories entity type could have been declared initially as :

> declare
. Categories
. :name PHRASE
. :sub-products -> Products:category single
. :recommended -> :sub-products:recommend-by
. ;


That should be it for now.

1 comment:

Anonymous said...

I’ve think your new SQL closely fits a functional language. I also think that SQL must be more functional but why you haven’t used an existing language as a vehicle? Declarations, lists, filters fit nicely though relative selector does not.

I’ve also thought about LINQ.fcbn

About Me

(2) the subculture of the compulsive programmer, whose ethics prescribe that one silly idea and a month of frantic coding should suffice to make him a life-long millionaire. --ewd1036