Further improvements in AIML
 
 

Hi, I am a computer engineer and I'm developing a set of improvements to the AIML language. With them I have been able to substantially lower the number of categories. For example, the whole category set of Sara Bot has been reduced from more than 2300 to just over 200 categories.

They are part of a series of improvements, including an expert system and a number of amendments that address some deficiencies in the AIML language.

- Changes to wildcards. A wildcard can now match a chosen number of words, from 0 to n, so there is no need to write 4 categories for the same pattern. You can write patterns like "*0 hello *0", and if you want a specific range you can write "*0-2 hello *3-5".

- Sets of words. To give the same answer to similar patterns, we currently have to write a category for each variation we want the bot to handle. If we group all these words into a set, we can reduce them to a single category. For example "*0 (hello; hi; good morning; how are you) *0". With this, 4 possible categories are reduced to a single one (16 categories without the wildcard extension).
Synonymous verbs can be handled the same way: "*0 {watch; read; browse; review} *0 {book; dictionary; notebook; newspaper} *0".
A variation is to group those words in an XML file, which can define hierarchies of words to be used in AIML files. For example, if we have a set of greetings with the words cited above, we could create a pattern like "*0 #greetings *0".

- Canonicalization of words. This makes more sense in a language like Spanish, where verb conjugation is more varied than in English, but it can still be used for singular and plural or masculine and feminine forms.
An example would be "*0 @dog @be barking *0", which covers the two inputs "dogs are barking" and "dog is barking".

- Optional words. A pattern can contain a word or group of words that is optional. For example "*0 I [really; absolutely; actually; assuredly; categorically; certainly] like *0".

- Exclusion of words. We can give a set of words that a wildcard must not match. For example "*0 I *0~(never; don’t; not) love *0".
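
As a rough illustration of how the wildcard, set and optional-word syntax above could be matched, here is a minimal Python sketch that translates such a pattern into a regular expression. It is not the AIMLMultibot implementation: the names `compile_pattern` and `matches` are hypothetical, a bare "*N" is assumed to mean "N or more words", a plain "*" keeps its standard AIML meaning of one or more words, and exclusion and canonicalization are left out.

```python
import re

# One token of the extended pattern syntax: wildcard, word set, optional set, or word.
TOKEN = re.compile(
    r"(?P<wild>\*(?:(?P<lo>\d+)(?:-(?P<hi>\d+))?)?)"  # *  *0  *0-2
    r"|[({](?P<set>[^)}]*)[)}]"                        # (a; b) or {a; b}
    r"|\[(?P<opt>[^\]]*)\]"                            # [a; b]
    r"|(?P<word>\S+)"                                  # literal word
)


def alternatives(body):
    """Turn 'hello; good morning' into a regex group of space-terminated phrases."""
    phrases = [" ".join(p.split()) for p in body.split(";") if p.strip()]
    return "(?:" + "|".join(re.escape(p) + " " for p in phrases) + ")"


def compile_pattern(pattern):
    """Compile an extended pattern; every matched word is followed by one space."""
    parts = []
    for m in TOKEN.finditer(pattern):
        if m.group("wild"):
            # Assumption: '*' alone = 1 or more words, '*N' = N or more, '*N-M' = N to M.
            lo = int(m.group("lo")) if m.group("lo") else 1
            hi = m.group("hi") or ""
            parts.append(r"(?:\S+ ){%d,%s}" % (lo, hi))
        elif m.group("set") is not None:
            parts.append(alternatives(m.group("set")))
        elif m.group("opt") is not None:
            parts.append(alternatives(m.group("opt")) + "?")
        else:
            parts.append(re.escape(m.group("word")) + " ")
    return re.compile("".join(parts), re.IGNORECASE)


def matches(pattern, text):
    """True if the whole input (split on whitespace) matches the pattern."""
    normalized = "".join(word + " " for word in text.split())
    return compile_pattern(pattern).fullmatch(normalized) is not None


print(matches("*0 hello *0", "hello"))                      # True
print(matches("* hello", "hello"))                          # False: '*' needs a word
print(matches("*0-2 hello *3-5", "oh hello a b c"))         # True
print(matches("*0 I [really; absolutely] like *0", "I like tea"))  # True
```

The sketch does no punctuation stripping or other input normalization, which a real interpreter would need before matching.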

All these improvements, and some more such as new ways of handling variables, are described on a website that is currently under construction.

http://aimlmultibot.comze.com/

XD Thanks and good programming.

 

 
  [ # 1 ]

After reading both this thread and the one at another site, I have to say that I, too, like these ideas. The only problem is that the vast majority of these “improvements” would break nearly every AIML interpreter currently in use. In order to make these changes to the AIML specification, there would have to be changes made to every AIML interpreter, past, present and future, to handle them. While I think this would be a good thing, not everyone may agree. Just thought you would like to know. :)

 

 
  [ # 2 ]

It wasn’t as though the designers of AIML were ignorant of the possibility of adding disjunctive patterns and making the pattern language more complicated.  AIML was intentionally designed to be a minimalistic language.  The usual analogy is to think of AIML as a kind of assembly language for AI.  We have always imagined that there should be tools (and there are some) to convert higher-level patterns into simpler AIML ones.  Take keyword search for example.  Identifying a keyword anywhere in an input sentence typically requires four AIML categories: the cases of the word being at the beginning, in the middle and at the end of the sentence, and the case of the word by itself.  Oddcast’s AIMC system provides a tool that allows you to enter the keyword once, and generate the four categories automatically.
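
A minimal sketch of such a keyword tool, written in Python, might look like the following. It only illustrates the four-category expansion and is not Oddcast's tool; the function name `keyword_categories` and the sample template are made up for the example.

```python
from xml.sax.saxutils import escape


def keyword_categories(keyword, template):
    """Return the four AIML categories that catch a keyword anywhere in the input."""
    word = keyword.upper()
    patterns = [
        word,           # the word by itself
        f"{word} *",    # at the beginning of the sentence
        f"* {word}",    # at the end of the sentence
        f"* {word} *",  # in the middle of the sentence
    ]
    return [
        f"<category><pattern>{escape(p)}</pattern>"
        f"<template>{escape(template)}</template></category>"
        for p in patterns
    ]


for category in keyword_categories("hello", "Hi there!"):
    print(category)
```

In practice three of the four templates would often just redirect to the fourth with srai, so the response text is written only once.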

What is the main reason for keeping the pattern language simple?  Scalability.  Many other chat bot languages have been designed to allow disjunctive patterns.  The trouble arises when the bot gets up to thousands or tens of thousands of rules or categories.  The bot might have a pattern like “A” or “B” in one rule and then 5000 rules later, the botmaster creates a new one with “B” or “C”.  Testing the bot with the input “B”, the first rule might activate and give the wrong response.  The problem becomes much more complicated when you add word lists, and two or more rules contain overlapping lists.
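
To make the overlap problem concrete, here is a toy Python sketch. It assumes a hypothetical rule engine in which each rule is a set of alternative words and the first rule that matches wins; it does not model any particular chat bot language. The `overlaps` helper is the kind of debugging tool such a language would need.

```python
from itertools import combinations

# Each rule: (set of alternative words, response). Order matters: first match wins.
rules = [
    ({"A", "B"}, "response for A-or-B"),
    # ... imagine 5000 unrelated rules here ...
    ({"B", "C"}, "response for B-or-C"),
]


def first_match(rules, word):
    """Return the response of the first rule whose word set contains the input."""
    for words, response in rules:
        if word in words:
            return response
    return None


def overlaps(rules):
    """Yield (i, j, shared_words) for every pair of rules with words in common."""
    for (i, (a, _)), (j, (b, _)) in combinations(enumerate(rules), 2):
        shared = a & b
        if shared:
            yield i, j, shared


print(first_match(rules, "B"))  # response for A-or-B (the newer rule never fires)
print(list(overlaps(rules)))    # [(0, 1, {'B'})]
```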

The AIML restriction of one pattern per category obviates the need for tools to debug the bot with disjunctive patterns.  Given an input and a set of AIML patterns, it is always clear which pattern will match first.

Having said all that, if you can compile your improvements into standard AIML categories then you could potentially overcome the problem that Dave mentioned about making your AIML compatible with other existing AIML interpreters.

 

 
  [ # 3 ]

Actually, if you’re willing to “roll your own” AIML interpreter, there’s nothing stopping you from implementing the suggested improvements right now. Having your own AIML interpreter script will allow patterns with “zero-length” wildcards, and patterns like this:

<pattern>I LIKE TO EAT *(MEAT, FISH, PASTA)</pattern>

and have the interpreter be perfectly able to handle matching the input in the proper manner. But from my personal experience with working with Program O, creating an AIML interpreter with those sorts of capabilities will be a rather arduous task. And I must point out that your “improved” AIML set would only work with your interpreter, so that may not be a good thing.

 

 
  [ # 4 ]

Richard,
Is this:

Artificial Intelligence Markup Language (AIML)
Version 1.0.1
A.L.I.C.E. AI Foundation Working Draft
25 October 2001 (rev 006)

http://www.alicebot.org/TR/2001/WD-aiml/

the final spec for AIML?

This line:
“NB: Contents of this document are subject to change! This document should not be used as reference material or cited as a normative reference from another document.”
seems to indicate that it is not. Could you provide a link to the final frozen specification?

Has any work been done recently to create a version 2 of the language? Or is the language intended to be static?

 

 
  [ # 5 ]
Richard Wallace - Apr 15, 2011:

It wasn’t as though the designers of AIML were ignorant of the possibility of adding disjunctive patterns and making the pattern language more complicated.  AIML was intentionally designed to be a minimalistic language.  The usual analogy is to think of AIML as a kind of assembly language for AI.  We have always imagined that there should be tools (and there are some) to convert higher-level patterns into simpler AIML ones.  Take keyword search for example.  Identifying a keyword anywhere in an input sentence typically requires four AIML categories: the cases of the word being at the beginning, in the middle and at the end of the sentence, and the case of the word by itself.  Oddcast’s AIMC system provides a tool that allows you to enter the keyword once, and generate the four categories automatically.

What is the main reason for keeping the pattern language simple?  Scalability.  Many other chat bot languages have been designed to allow disjunctive patterns.  The trouble arises when the bot gets up to thousands or tens of thousands of rules or categories.  The bot might have a pattern like “A” or “B” in one rule and then 5000 rules later, the botmaster creates a new one with “B” or “C”.  Testing the bot with the input “B”, the first rule might activate and give the wrong response.  The problem becomes much more complicated when you add word lists, and two or more rules contain overlapping lists.

Doesn’t recursion introduce the same “wrong response” problem in AIML interpreters?
http://www.chatbots.org/ai_zone/viewreply/2495/

Other bot languages find that eliminating recursion and reducing the number of categories makes scaling simpler. Leo’s comment “For example, the whole category set of Sara Bot has been reduced from more than 2300 to just over 200 categories.” would indicate Sara might be easier to maintain.

Mitsuku (the top ranked AIML bot - 2nd place in CBC 2011) has 90,000+ categories, edited and managed in notepad. Contrast that to Skynet-AI (independently developed - 3rd place in CBC 2011) which has only 3000 categories.

Richard Wallace - Apr 15, 2011:

The AIML restriction of one pattern per category obviates the need for tools to debug the bot with disjunctive patterns.  Given an input and a set of AIML patterns, it is always clear which pattern will match first.

Having said all that, if you can compile your improvements into standard AIML categories then you could potentially overcome the problem that Dave mentioned about making your AIML compatible with other existing AIML interpreters.

The trade-off is in either developing tools that compile into AIML or developing higher-level functions that simplify the process. Leo’s or ChatScript’s synonyms for example. Or Dave’s:
<pattern>I LIKE TO EAT *(MEAT, FISH, PASTA)</pattern>
How many patterns would Dave’s example need to be compiled to in order to handle that concept anywhere in an input sentence?
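
A rough count can be sketched in Python under one reading of Dave’s syntax, namely that "*(MEAT, FISH, PASTA)" stands for exactly one of the listed words. That reading, and the helper names `expand` and `anywhere`, are assumptions made for the example. Each alternative then needs the usual four positional variants, giving 3 x 4 = 12 standard patterns.

```python
import re
from itertools import product

# A word set such as *(MEAT, FISH, PASTA); the leading * is treated as part of the syntax.
SET = re.compile(r"\*?\(([^)]*)\)")


def expand(pattern):
    """Expand every word set into plain patterns, one per combination of words."""
    pieces = SET.split(pattern)  # literal text at even indexes, set bodies at odd ones
    choices = [
        [w.strip() for w in piece.split(",")] if i % 2 else [piece]
        for i, piece in enumerate(pieces)
    ]
    return [" ".join("".join(combo).split()) for combo in product(*choices)]


def anywhere(pattern):
    """The four standard AIML patterns that find a phrase anywhere in the input."""
    return [pattern, f"{pattern} *", f"* {pattern}", f"* {pattern} *"]


cores = expand("I LIKE TO EAT *(MEAT, FISH, PASTA)")
compiled = [p for core in cores for p in anywhere(core)]
print(len(compiled))  # 12
print(compiled[0])    # I LIKE TO EAT MEAT
```

Every additional word set in the same pattern multiplies the count again, which is where the pressure for either a compiler or a richer pattern language comes from.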

 

 
  [ # 6 ]

It is clear that these improvements would be a noticeable change to the AIML language, and that existing implementations would not work properly with files based on a version 1.X or 2.0. I have tried to define them so that backward compatibility is complete, i.e., an official AIML 1.0.1 file works with implementations that incorporate the new specifications. (It would have been easier simply to make the * wildcard match from 0 to n elements.)

Has the AIML language never evolved? Have new tags never been added while others became obsolete? Have implementations never had to adapt to a new version of the AIML language? Nor would it be the first time that an engine is said to support a language specification only up to version X.Y (ask Microsoft XD).
What I find strange is that AIML has been stuck at version 1.0.1 and has not changed since 2001, while other applications have emerged trying to cover the shortcomings I have described.

As for AIML having been born as an assembly language for AI, I think that is great. But just as we no longer program in assembly language, and third- and even fourth-generation languages have appeared to make programming easier, I believe the AIML language should also adapt, so that it does not end up dying for not wanting to change its specification.

If you want, you can call my proposals AIML version X or fourth generation; it does not matter. The important thing is that programming a bot becomes easier and friendlier, and that you do not end up going crazy writing 300K categories, wondering whether something will go wrong or whether you have forgotten some.
Indeed, the changes I have proposed have been tested in a library that I developed, AIMLMultibot, and although it is still being extended, it works well with the changes I have described. There are many other changes that I have not mentioned so as not to overwhelm, and I will mention them in future posts. My intention was to share these improvements with the community and, if people like them, perhaps even to incorporate them into a new version of the AIML language.

Thanks.

 

 
  [ # 7 ]
Merlin - Apr 15, 2011:

The trade-off is in either developing tools that compile into AIML or developing higher-level functions that simplify the process. Leo’s or ChatScript’s synonyms for example. Or Dave’s:
<pattern>I LIKE TO EAT *(MEAT, FISH, PASTA)</pattern>
How many patterns would Dave’s example need to be compiled to in order to handle that concept anywhere in an input sentence?

Expanding on Merlin’s reply on the theme of grouping synonyms or related words together, I will describe how I implemented this in the library (although it could be expanded and improved) and what benefits come from doing it this way.

There are DIC (dictionary) files in which XML (chosen to follow the same approach as the AIML language) is used to create a hierarchy of words related to each other.

An example file might look like this:

<?xml version="1.0" encoding="UTF-8"?>
<synonyms>
  <set name="#animals">
    <set name="#poultry">
      <concept name="chicken"/>
      <concept name="turkey"/>
    </set>
    <concept name="veal"/>
    <concept name="lamb"/>
  </set>

  <set name="#meat">
    <set name="#dead_meat">
      <concept name="meat"/>
      <concept name="steak"/>
      <set name="#animals"/>
    </set>
  </set>
</synonyms>

The AIML categories would go in a separate file. In this scenario there would be only one category, with the following pattern: "*0 LOVE *0 #MEAT *0"
This single category would cover all possible inputs related to meat.
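
As an illustration of how such a dictionary could be resolved into flat word lists, here is a minimal Python sketch. It is not the AIMLMultibot code: the function name `load_sets` is hypothetical, and it assumes that an empty set element such as the "#animals" entry inside "#dead_meat" is a reference to the set of the same name defined elsewhere.

```python
import xml.etree.ElementTree as ET

DIC = """<synonyms>
  <set name="#animals">
    <set name="#poultry">
      <concept name="chicken"/>
      <concept name="turkey"/>
    </set>
    <concept name="veal"/>
    <concept name="lamb"/>
  </set>
  <set name="#meat">
    <set name="#dead_meat">
      <concept name="meat"/>
      <concept name="steak"/>
      <set name="#animals"/>
    </set>
  </set>
</synonyms>"""


def load_sets(xml_text):
    """Map every set name to the flat list of words it contains."""
    root = ET.fromstring(xml_text)
    # A <set> with children is a definition; an empty <set/> is treated as a reference.
    defs = {s.get("name"): s for s in root.iter("set") if len(s)}

    def words(node, seen):
        out = []
        for child in node:
            if child.tag == "concept":
                out.append(child.get("name"))
            elif child.tag == "set":
                name = child.get("name")
                target = child if len(child) else defs.get(name)
                # 'seen' guards against sets that refer back to themselves.
                if target is not None and name not in seen:
                    out.extend(words(target, seen | {name}))
        return out

    return {name: words(node, {name}) for name, node in defs.items()}


sets = load_sets(DIC)
print(sets["#meat"])     # ['meat', 'steak', 'chicken', 'turkey', 'veal', 'lamb']
print(sets["#poultry"])  # ['chicken', 'turkey']
```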

What advantages does this have? Besides the aforementioned reduction in the number of categories, and therefore better maintainability, there are other advantages such as the separation of code and data. If we want to add a new word (pig), we just put it in the dictionary, keeping the code intact.

If we now write a new category about whether you like farm animals, the dictionary is reusable and we only have to select the appropriate set from it: "*0 love *0 #poultry *0"

 

 

 