Important Links


Babel To English Translator
Examples

Project Page
Mailing List

Download

Introducing Babel

What is Babel?

Babel is a language for talking with computers. It is a knowledge representation language that aims to be as expressive as most natural language. It is a formal grammar for representing thoughts and the meaning of human languages.

Babel is modeled after traditional computer programming languages that provide for terse, human readable and easy to write code.

Parsing natural language is a hard problem, but it can be avoided by writing Babel. By using this format natural language processing (NLP) can be separated from the real artificial intelligence work. Babel provides a way to divide the two challenges.

Babel is not a representation of parsed natural language. It is designed to represent specific meaning and does not include many of the subtleties of natural language. Babel does not try to model natural language completely. The aim is simply to be able to represent the meaning of most natural language.

Uses of Babel

The primary use of Babel is a format for talking to computers. Babel could allow for an easier Turing Test to be attempted and may help move beyond the current crude chat bot designs.

Babel could be used as a generalized goal definition system to define objectives for an intelligent agent. It could describe problems, potential solutions and actions to be performed by an agent.

Babel could be of use in translating between natural languages or in other areas of linguistics. Learning and using Babel might improve your grammar since it makes you think about the structure of what you are saying.

Current State of Babel

This is the first release of Babel. The language is capable of representing simple sentences. More complex sentences may have to be broken into multiple sentences. There are still many areas where the language can be expanded to make it more expressive and easier to use.

The Babel grammar is written in a BNF variant that is used by the Gold Parsing System (see http://www.devincook.com/goldparser/). The Gold Parser compiles the grammar into tables which can then be used be loaded by an parsing engine, which have been created for a variety of different development platforms. This will help enable the language to be used on a variety of platforms. The reference implementation is written in C# and requires the .Net Framework 2.0 or later to run.

Babel to English translation is not a hard problem. Included in the project is a translator that gives a reasonably understandable equivalent in English (see http://translator.babelproject.com/). Getting the translation to a point where it could pass for human for Turing Test scenarios may be much more difficult. The English translator is very handy for finding mistakes in Babel code and acts as a check to make sure you said what you meant to.

A parser and the translator is built into a program imaginatively called Babel Editor. The Babel Editor environment provides a view of the parsed result that shows how Babel is represented as objects in the reference implementation. Great care has been taken to keep the parsed object structures simple.

Babel Editor Screenshot

Babel Reference Implementation Class Diagram

Babel was created by me, Bryan Livingston. I also own and operate http://cooltext.com. I'm more of a computer programmer than a linguist. I searched long and hard for a language such as this but found none, so I created Babel over several years as a side project.

Comments and suggestions are welcome on the mailing list at http://groups.google.com/group/babelproject. The web site for the project is at http://babelproject.com.

Learn Babel by Example

Subject Verb Object Pattern

// Joe saw Jane. 
Joe.see-(Jane); 

Babel is composed of sentences. Each sentence ends with a semi-colon. The "//" mark indicates a line comment, which means that the text following it will be ignored by the parser. Whitespace is ignored by the parser which means that you could put spaces or line breaks anywhere and it will still parse the same.

Sentences follow a Subject-Verb-Object pattern. The subject of the sentence is usually first. This is followed by a dot to separate the verb. Then the object of the sentence is enclosed in parenthesis. This follows a C++, C# or Java pattern of object.method(argument).

In natural language we use morphology to change the sound and spelling of words to indicate things like plurals and tensing . In Babel this information is pulled out of the words and added as special characters. The past tensing of the verb "saw" in the example is done with a minus sign attached in a suffix like manner. A plus sign is used for future tense words and if no suffix is present then present tense is assumed for the verb.

Nouns are always capitalized. Currently there is no way to indicate that a noun a proper noun. In spoken English this is specified by the absence of an article (e.g. "the", "a" or "an".)

Implicit Subjects and Sentence as an Object

// See Jane run.
You.see(Jane.run()); 
see(Jane.run()); 

Whenever the initial subject is left off, an implicit noun "You" is used as the subject. The above sentences are logical equivalents. In English we often drop and make implicit the subject of a sentence when it is the person we are speaking to.

The object of the sentence is another sentence. The object of a sentence can be empty, a noun, a set of nouns, another sentence, or just an adjective. Unlike computer programming languages there is max of only one object of a sentence. Information that you would think would be used for multiple arguments would probably be attached directly to the verb as a prepositional phrase.

Adjectives and Adverbs

// The tall man walked slowly.
Man tall.walk- slow();

In this sentence the adjective "tall" is attached to the noun "Man" and adverb "slow" is attached to the verb "Walk".

The suffix "-ly" is dropped from “slowly”. When English is spoken the suffix is used to indicate that it's an adverb, but in Babel that is made explicit by the formatting.

There is no object of this particular sentence so the parentheses are left empty.

Prepositional Phrases and Restrictive Clauses

// The quick brown fox jumped over the lazy dog.
Fox quick brown.jump- over[Dog lazy]();

// The horse that I rode died.
Horse that[I.ride-()].die-();

The prepositional phrase "over the lazy dog" is attached to the verb "jump". Prepositions are attached as if they were adverbs followed by the object of the proposition contained in square brackets.

Restrictive clauses are handled much like a prepositional phrase, but they are instead attached to a noun.

In the reference implementation when these prepositional phrases and restrictive clauses are parsed they are created as adverb and adjective objects but have the contents of the brackets attached.

Declarations

// Joe is tall.
Joe.is(tall);
Joe.tall;

// Joe was tall.
Joe.is-(tall);

// Joe is a human.
Joe.Human;

The top two statements are logically identical. When the verb is left off the verb is assumed to be “is”.

Unfortunately there is no shorthand for was, so in order to express the past tense you have to explicitly use the “is” verb with a past tense suffix.

When the object in a declaration is a noun then the sentence is saying that the subject is a subclass of the object.

Noun Sets

// Tashonda sent e-mail, cards, and letters.
Tashonda.send-(Email & Card* & Letter*);

// Tashonda sent e-mail, cards, or letters.
Tashonda.send-(Email | Card* | Letter*);

// Mrs. Doubtfire gave Tabitha and Samantha quizzes.
MrsDoubtfire.give- to[Tabitha & Samantha](Quiz*);

// Juanita and Celso worked hard.
Juanita & Celso.work- hard();

Anywhere that a noun can be used a set of nouns can be used instead. Simply chain them together with ampersands for an and set and using pipes for or set. You can not mix and and or in a single set.

Also note that the asterisk is used as a suffix to specify a plural.

Gerunds

// I love running.
I.love(Run=);

// Charles is working in the garden. 
Charles.Work= in[Garden];

A gerund is a noun that is built from a verb by attaching an “-ing” suffix. Note that the gerund words are capitalized since they are nouns.

The “-ing” suffix is also attached to present tense participles, which are verbs that are used as adjectives, but there is no support in Babel yet for participles.

Questions

// Where are you going?
where(You.are(Go=));

// What were you reading this morning?
what(You.read- this[Morning]());

// May I postpone this assignment?
may(I.postpone(Assignment));

// Is Joe tall?
is tall(Joe);

// Where is Joe?
where(Joe);

// What is the color of horse that I am riding?
what color(Horse that[I.is-(Ride=)]);

Questions are written with the interrogative word as a verb.

Questions usually use an implicit subject, so a subject of “You” is attached to the question when parsed. Asking a question is technically giving an order since an answer is being demanded.

Currently the parser is hard coded to recognize specific interrogatives. This is less than ideal since it makes the parser English specific and in the future a symbol may be introduced to indicate that a statement is a question.

Babel Limitations

Babel is very much a work in progress. Below are some of the areas that may be improved upon.

Conjunctions

Most conjunctions are not currently available in the language. Complex sentences can not easily be expressed without them.

Articles

The language does not support articles (a, an, and the) right now. In natural language these are used to indicate that a noun is proper. It's questionable whether this information is important enough be included in the language.

If support for articles was added it might take the form of adding an underscore before or after the noun. Or the articles could be added as adjectives, something which could be done right now.

Questions

Currently the parser is hard coded to recognize specific interrogatives. This is less than ideal since it makes the parser English specific and in the future a symbol may be introduced to indicate that a statement is a question.

Modifiers

Words that modify an adjective or adverb (e.g. very, slightly, rather, quite) can not be expressed properly. For instance, it is not really possible to say "the very black dog" since the modifier "very" needs to be attached to the adjective "black" and not the noun "dog".

Negatives (e.g. The dog that is not running.) would also need to be attached as a modifier.

Babel Emitter

If computers are going to be working with Babel they will need to be able to generate it in object form and then save it as written Babel code. This is not a seriously difficult task and will be part of the reference implementation someday.


Cool Text: Logo Generator
Bryan Livingston
Global Combat Risk Like Game
SparkArts Digital Arts Festival in Utah