Standard AIML set
 
 

Dear forum:

I am looking for a ‘standard’ set of AIML that is free (or uses the GPL or similar) to load into Uberbot. So far I have found these:

(1) Free Alice AIML set http://code.google.com/p/aiml-en-us-foundation-alice/

(2) Square Bear AIML set https://code.google.com/p/aiml-en-uk-squarebear-utils/

(3) Alice AIML set from 2005 http://www.alicebot.org/aiml/aaa/ These files are color coded to show their usefulness, but unfortunately they seem to be 8 years out of date.

These AIML sets work fine, in the sense that I loaded most of (1) and (2) into Uberbot and the bot responds sensibly to most questions. Deficiencies are generally caused by errors in Uberbot, not in the AIML.
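As a rough way to compare the size of sets like these before loading them, the sketch below counts the categories in an AIML document using Python's standard xml.etree.ElementTree module. The sample AIML string and the names local_name and count_categories are made up for illustration. It assumes the input is well-formed XML and ignores any XML namespace on the tags; it says nothing about how Uberbot itself loads files.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-category sample, not taken from any of the sets above.
SAMPLE_AIML = """<?xml version="1.0" encoding="UTF-8"?>
<aiml version="1.0.1">
  <category>
    <pattern>HELLO</pattern>
    <template>Hi there!</template>
  </category>
  <category>
    <pattern>WHAT IS YOUR NAME</pattern>
    <template>My name is Bot.</template>
  </category>
</aiml>"""


def local_name(tag):
    """Strip an optional '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def count_categories(aiml_text):
    """Return the number of <category> elements in an AIML document."""
    # Encode to bytes so the parser honours the XML encoding declaration.
    root = ET.fromstring(aiml_text.encode("utf-8"))
    return sum(1 for elem in root.iter() if local_name(elem.tag) == "category")


if __name__ == "__main__":
    print(count_categories(SAMPLE_AIML))
```

To use it on a real set, read each .aiml file as text and add up the counts; files that are not well-formed XML will raise a parse error, which is itself a useful check.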

My question is: Are these the best of the freely available AIML sets? Are there other sources people know of? Are there projects under way to build large ‘generic’ AIML sets?

 

 
  [ # 1 ]

For the most part, any other large AIML sets are going to be “offshoot versions” based on the original ALICE AIML sets. When I was working on full Unicode support for Program O, I took the 2005 ALICE set, converted all of the files to UTF-8, and corrected several typos within the actual AIML tags themselves, but the content of that set is essentially the same. To the best of my knowledge, there aren’t any original works that are publicly available and on the same scale as the ALICE set.

 

 
  [ # 2 ]

This is the site I try to maintain with the most current links to free AIML sets. 

https://code.google.com/p/aiml-en-us-foundation-alice/

 

 
  [ # 3 ]

I don’t update my AIML files as often as I should but the latest ones are here:
http://www.square-bear.co.uk/aiml/

 

 
  [ # 4 ]

Thanks for the links everyone - it looks as if I had found the latest.

Steve: Are the files on your website more up to date than https://code.google.com/p/aiml-en-uk-squarebear-utils/ ?

Dr Wallace: The earlier AIML set from 2005 at alicebot had color coded files to show how ‘general purpose’ they are. This is incredibly useful - are there any plans to restore this feature for the Foundation Alice AIML set? It would greatly help a newbie like me.

Question for Anyone: As the AIML files are released under the GPL I believe any changes have to be made available to the original source. Is this correct (it does seem fair)?

Also: It seems as if 3 of the 4 bots in the Loebner finals are AIML based. Are they mostly based on the Foundation Alice AIML sets described by Dr Wallace? Or do people create large AIML sets they keep secret in the hope of winning? :) It seems to me that if you could code 10 categories an hour, it could take 10 years or so to produce a large AIML set! So having the Foundation Alice AIML set available with some 80K categories is incredibly useful.

 

 
  [ # 5 ]

Hey Will, I’m sorry but restoring the color codes is not on my To-Do list at the moment.
I’m working on a totally new version of ALICE called ALICE 2.0, which has not yet been released as source code, but you can chat with it through the CallMom BASIC app (https://play.google.com/store/apps/details?id=com.pandorabots.callmom.basic&hl=en).

The bots competing in the Loebner contest all have significant content that differs from the original ALICE bot.  Some of them may have started with ALICE but they are now all quite different, as you can see from the transcripts.

My understanding of the GPL is that yes, if you release your code and it includes GPL code, your code must also be GPL.  If you want it to remain proprietary, the trick is not to “release” it.  So for example if you develop a bot based on ALICE, you can publish the bot so that people can chat with it, but not necessarily publish or release the AIML files.

Increasing botmaster productivity is always a concern.  I think a skilled AIML coder can write up to 1 category per minute.  Some recent work I’ve done with the Pattern Suggestor feature of Program AB has demonstrated productivity up to 6 categories per minute.

 

 

 
  [ # 6 ]

Thanks for the quick reply!

So licence-wise then, I guess people start from the original AIML set, they add lots of ‘secret sauce’ to the categories, then run a bot using their enhanced AIML. And because the AIML is the data behind the bot, they don’t have to release their enhanced categories.

But even if you could write one category a minute it would take a year or so to write the 80,000+ categories in the AIML set you had on google. So I’m grateful to you and the others who undertook this task!
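For anyone who wants to check the back-of-envelope figures in this thread, here is a minimal sketch of the arithmetic. The 80,000 category count and the three authoring rates (10 per hour, 1 per minute, 6 per minute) come from the posts above; the 8-hour working day and 250 working days per year are assumptions added purely for illustration, and changing them shifts the results accordingly.

```python
CATEGORIES = 80_000          # approximate size of the set, as quoted in the thread
HOURS_PER_DAY = 8            # assumption: one full working day of authoring
WORK_DAYS_PER_YEAR = 250     # assumption: roughly 50 five-day weeks

# Authoring rates mentioned in the thread, converted to categories per hour.
RATES_PER_HOUR = {
    "10 per hour": 10,
    "1 per minute": 60,
    "6 per minute": 360,
}


def authoring_time(categories, rate_per_hour):
    """Return (hours, working days, working years) needed at a given rate."""
    hours = categories / rate_per_hour
    days = hours / HOURS_PER_DAY
    years = days / WORK_DAYS_PER_YEAR
    return hours, days, years


if __name__ == "__main__":
    for label, rate in RATES_PER_HOUR.items():
        hours, days, years = authoring_time(CATEGORIES, rate)
        print(f"{label}: {hours:,.0f} hours, {days:,.0f} working days, {years:.2f} years")
```

Someone writing categories only in spare time would need to lower HOURS_PER_DAY, which stretches the calendar time considerably.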

Is your ALICE v2 going to be based on the AIML improvements I saw someplace else on the forum? Are there many v2 AIML files out there?

 

 
  [ # 7 ]

Many of those 80K are yes/no questions that were converted from the MindPixel data.

One of our most prolific botmasters is Peter Lafferty, author of Chomsky and The Professor.  Working on his bot almost daily for nearly a decade, Peter wrote 600,000 AIML categories.

 

 
  [ # 8 ]

Wow! Did Peter enter it in the Loebner competition? Surely it would be almost guaranteed to succeed.

 

 
  [ # 9 ]

Yes, he did, and unfortunately we had a slight technical problem with his AIML that made some responses come out blank.  I’ve apologized to Peter and we will try to enter The Professor again next year.

However, it’s worth mentioning that the number of categories alone is not a guarantee of success.  It’s a question of quality as well as quantity.

 

 
  [ # 10 ]
Will Rayer - Jul 9, 2013:

Steve: Are the files on your website more up to date than https://code.google.com/p/aiml-en-uk-squarebear-utils/ ?

Yes, I don’t really update that google site any more. All my free AIML files will be updated on my own website now.

 

 