AIML Pattern Matching - sentence splitting using the "_"
 
 

In the research paper “A Comparison Between Alice and Elizabeth Chatbot Systems”, Bayan Abu Shawar and Eric Atwell describe the pattern-matching process as follows: if the user input is “halo what is 2 and 2?”, it is normalized to “HALO WHAT IS 2 AND 2”.
It is then first matched against the “_ WHAT IS 2 AND 2” pattern. The knowledge base patterns given are as follows:

(1) <category>
<pattern>_ WHAT IS 2 AND 2</pattern>
<template>
<sr/> <srai>WHAT IS 2 AND 2</srai>
</template>
</category>

(2) <category>
<pattern>WHAT IS 2 *</pattern>
<template>
<random>
<li>Two.</li>
<li>Four.</li>
<li>Six.</li>
<li>12.</li>
</random>
</template>
</category>

(3) <category>
<pattern>HALO</pattern>
<template>
<srai>HELLO</srai>
</template>
</category>

(4) <category>
<pattern>HELLO</pattern>
<template>
<random>
<li>Well hello there!</li>
<li>Hi there!</li>
<li>Hi there. I was just wanting to talk</li>
<li>Hello there!</li>
</random>
</template>
</category>

But the paper then says that the “_ WHAT IS 2 AND 2” pattern will partition the input into “HALO” and “WHAT IS 2 AND 2”. How does this partitioning happen? Is it because of the <sr/> tag in the first category’s template?
And why are both the <sr/> and <srai> tags used in the template? How do they work in the pattern-matching process?

I understand the match against “_ WHAT IS 2 AND 2” and the redirection to the pattern “WHAT IS 2 AND 2”, but I really do not understand the splitting of the sentence. How does it actually happen? What is the mechanism behind it?

The research paper can be found at the following link; the graphical tree representation of the above example is given on page 9.
http://www.scribd.com/doc/32243318/A-Comparison-Between-ALICE-and-Elizabeth-Chatbot-Systems

I am really confused by this and am hoping for answers from you all. Thank you very much.

 

 
  [ # 1 ]

Good morning, Umesha.

Yes, the <sr> tag, in that particular instance, is the method used to “partition” the input.

As I said in my answering post in another forum, the <sr/> tag is functionally equivalent to using <srai><star/></srai>, which basically causes the interpreter to process everything between the <srai> tags separately. Thus, using the categories you’ve listed, the interpreter processes the input “halo what is 2 and 2” in the following manner:
(please note that I don’t like to “shout”, so I’m normalizing here to lower case, instead of upper case)

1.) “halo what is 2 and 2” is matched to “_ what is 2 and 2”
2.) the template becomes “<srai>halo</srai> <srai>what is 2 and 2</srai>”
3.) <srai>halo</srai> is matched with the pattern “halo”
4.) becomes <srai>hello</srai>, and is matched to “hello”
5.) response is selected and saved for “hello” (“well hello there!”)
6.) the second <srai> (“what is 2 and 2”) is matched to “what is 2 *”
7.) response is selected and saved for “what is 2 *” (“12”)
8.) all saved responses are combined in order and the finished response becomes “Well hello there! 12”

I hope that this clarifies things. :)
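
For readers who find code easier to follow than prose, here is a minimal Python sketch of the walk-through above. It is not how any particular AIML interpreter is implemented: the CATEGORIES list, the tuple-based template format and the helper names are all made up for illustration, wildcard matching is approximated with a regular expression, and each <random> list is replaced by one fixed reply so that the result is repeatable.

```python
import re

# Hypothetical in-memory copy of categories (1)-(4).
# A template is a list of parts: ("text", s), ("sr",) or ("srai", s).
# Assumption: each <random> list is reduced to a single fixed reply.
CATEGORIES = [
    ("_ WHAT IS 2 AND 2", [("sr",), ("text", " "), ("srai", "WHAT IS 2 AND 2")]),
    ("WHAT IS 2 *", [("text", "Four.")]),
    ("HALO", [("srai", "HELLO")]),
    ("HELLO", [("text", "Well hello there!")]),
]


def normalize(text):
    """Uppercase the input and replace punctuation with spaces."""
    return " ".join(re.sub(r"[^A-Z0-9 ]", " ", text.upper()).split())


def to_regex(pattern):
    """Turn an AIML-style pattern into a regex; each wildcard needs >= 1 word."""
    parts = [r"(.+?)" if tok in ("_", "*") else re.escape(tok)
             for tok in pattern.split()]
    return " ".join(parts)


def respond(text):
    """Find the first matching category and evaluate its template."""
    text = normalize(text)
    for pattern, template in CATEGORIES:
        match = re.fullmatch(to_regex(pattern), text)
        if match is None:
            continue
        pieces = []
        for part in template:
            if part[0] == "text":
                pieces.append(part[1])
            elif part[0] == "sr":
                # <sr/> is shorthand for <srai><star/></srai>:
                # feed the first wildcard capture back into the matcher.
                pieces.append(respond(match.group(1)))
            elif part[0] == "srai":
                # <srai> feeds its own content back into the matcher.
                pieces.append(respond(part[1]))
        return "".join(pieces)
    return ""  # no category matched


reply = respond("halo what is 2 and 2?")
print(reply)
```

With this toy knowledge base the reply is "Well hello there! Four."; the "HALO" part comes from <sr/> and the rest from the explicit <srai>. A real interpreter would pick each <random> item at random instead.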

 

 
  [ # 2 ]

Hi Dave,
Thanks a lot for the answer. Yes, it clarified a lot of things that I had been trying to find out for some time. I am a bit relieved now… :) :)
So basically what I understood is:

Say the user input is “Hi, my name is Rangi”. Then, if we have the following pattern:

<category>
<pattern>_NAME IS *</pattern>
<template>
<sr/> <srai>NAME IS</srai>
</template>
</category>

Because of the <sr/> tag, the sentence is split into “HI MY” and “NAME IS”, and the two parts will be processed separately. Is that correct?

Therefore the process would be either:
(1) “The system matches the _NAME IS * pattern, and then it detects the <sr/> and <srai> tags. The system then treats whatever is inside the <srai></srai> tags as something to be processed separately, and processes the remainder separately.”

OR

(2) “The system matches the _NAME IS * pattern, and then it detects the <sr/> and <srai> tags. The system then replaces the <sr/> with <srai></srai> tags, as <srai>HI MY</srai>, and processes “HI MY” first, then the remainder separately.”

Which is the correct description of the process?

And what does the <star> tag represent? Is it the same as the <srai> tag?

And can we have more than one pair of <srai></srai> tags inside one template?
As follows:

<category>
<pattern>HI HOW ARE YOU</pattern>
<template>
<srai>HI<srai><srai>HOW</srai><srai>ARE YOU</srai>
</template>
</category>

Thank you very much. :)

 

 
  [ # 3 ]

A lot of what you’re asking is extremely dependent upon the AIML interpreter that you’re using, but ideally, your second scenario is the case. HOWEVER:

Your category is incomplete. Whenever there are one or more wildcards (‘_’ or ‘*’) within a pattern, every single one should be accounted for. For example, in your category:

<category>
<pattern>_NAME IS *</pattern>
<template>
<sr/> <srai>NAME IS</srai>
</template>
</category>

There are two problems that I see. First off, there ALWAYS needs to be a space between your wildcard and the next or previous word. Where your pattern shows “_NAME IS *”, it needs to be “_ NAME IS *”. The reason is that your AIML interpreter will certainly have problems seeing a wildcard if it’s attached to another word. Thus the need for a space in between.
The second problem is that the last wildcard isn’t processed at all, and in almost every circumstance that discards what could be important information about the context of the conversation (in this case, the person’s name). A better way to construct that particular category is as follows:

<category>
<pattern>_ NAME IS *</pattern>
<template>
<sr/> <srai>NAME IS</srai> <star index="2"/>
</template>
</category>

And even that is prone to cause some problems. Take for example, the following input:

“His name is synonymous with evil” (referring to Bill Gates, of course. :) )

That sentence is certain to be misinterpreted by the script, and will likely cause problems. This is a good example of why very careful thought should go into whether or not you should use the underscore wildcard in a pattern. The underscore is very powerful, but can be misused with frightening ease.

Now, notice that the second wildcard was referred to in the template as <star index="2"/>? This is to ensure that one wildcard value doesn’t overwrite another. Consider this:

<category>
<pattern>* is * and * is *</pattern>
<template>
Do you mean to say that <star index="3"/> is <star index="4"/> while <star/> is <star index="2"/>?
</template>
</category>

What we see here is that you can have as many wildcards as we have words to separate them. You can’t have two wildcards in a row, without having at least a word or phrase between them (at least, as far as I know), because the interpreter won’t know where the cutoff is between one wildcard and the next. This category also shows an interesting way to make the bot appear to be listening, and taking an interest in the conversation, by rephrasing the given statement, and asking the user if this was what they meant.
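
To make the indexing concrete, here is a rough Python sketch of how several wildcard values can be captured side by side and then addressed by position, the way <star index="N"/> does. The match_stars helper is hypothetical and uses a lazy regular expression as a stand-in for a real AIML matcher, so treat it as an approximation only.

```python
import re


def match_stars(pattern, user_input):
    """Return the wildcard captures in order, or None if nothing matches.

    Hypothetical helper: a lazy regex stands in for a real AIML matcher.
    """
    parts = [r"(.+?)" if tok in ("*", "_") else re.escape(tok)
             for tok in pattern.upper().split()]
    normalized = " ".join(re.sub(r"[^A-Z0-9 ]", " ", user_input.upper()).split())
    match = re.fullmatch(" ".join(parts), normalized)
    return list(match.groups()) if match else None


# stars[0] plays the role of <star/>, stars[1] of <star index="2"/>, and so on.
stars = match_stars("* is * and * is *", "The sky is blue and grass is green.")
reply = (f"Do you mean to say that {stars[2]} is {stars[3]} "
         f"while {stars[0]} is {stars[1]}?")
print(reply)

# The greedy-looking "_ NAME IS *" pattern also accepts the problem sentence:
print(match_stars("_ NAME IS *", "His name is synonymous with evil"))
```

The first call yields the four values THE SKY, BLUE, GRASS and GREEN, each kept in its own slot. The second call shows why the underscore needs care: the pattern happily matches “His name is synonymous with evil”, capturing HIS and SYNONYMOUS WITH EVIL.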

I hope this helps. :)

[edit]

By the way, in your last category example, you left out a slash or two that would indicate a closing tag. It should look like this:

<category>
<pattern>HI HOW ARE YOU</pattern>
<template>
<srai>HI</srai><srai>HOW</srai><srai>ARE YOU</srai>
</template>
</category>

And by the way, while you CAN, indeed, have categories like this one, it’s not really recommended. When creating your own categories, it’s a good idea to come up with a basic pattern, then think of how many other ways you can say the exact same thing in different words, and then try to reduce all of those ways into the fewest patterns possible by using one or two wildcards. Then, taking the same original statement, think of how many different ways you would respond to it if someone else were to say it to you, and reduce THEM down to as few responses as you can by using <srai> and <star> tags. Also remember that the <sr/> tag will only substitute for the FIRST wildcard in a pattern. Any other wildcards to be processed by an <srai> tag will have to be coded out in full (e.g. <srai><star index="2"/></srai>, etc.).

[/edit]

 

 
  [ # 4 ]

@Umesha: offtopic, if you press ‘advanced reply’ the next time, you can add images to your thread for the graphical representation for example.

 

 
  [ # 5 ]

You can use as many <srai>s inside your category as you like. I use this to add points on to game AIML files.

<category>
<pattern>ADD 10 POINTS</pattern>
<template>
<srai>ADD 1 POINT</srai>
<srai>ADD 1 POINT</srai>
<srai>ADD 1 POINT</srai>
<srai>ADD 1 POINT</srai>
<srai>ADD 1 POINT</srai>
<srai>ADD 1 POINT</srai>
<srai>ADD 1 POINT</srai>
<srai>ADD 1 POINT</srai>
<srai>ADD 1 POINT</srai>
<srai>ADD 1 POINT</srai>
</template>
</category>

However, if you <srai> to other categories that also contain <srai> tags, you will find the dreaded “Too much recursion in AIML” error if they srai to about 6 or 7 levels.


Your category:

<category>
<pattern>_ NAME IS *</pattern>
<template>
<sr/> <srai>NAME IS</srai> <star index="2"/>
</template>
</category>

Would be a rather strange one and I would be tempted to rewrite it like:

<category>
<pattern>_ MY NAME IS *</pattern>
<template>
<sr/> <srai>MY NAME IS <star index="2"/></srai>
</template>
</category>

This will then react to the first part of the sentence and then call “My name is ....” eg:

Human: Good evening chatbot, my name is Steve
Bot: Good evening to you too. Pleased to meet you Steve.

 

 
  [ # 6 ]

I want to set the user’s name to a custom tag. Say the user input is “My name is Rangi”; I then want to set the user name to “Rangi” and use it later in other categories where necessary. I have tried the following category using the double set tag:

<category>
<pattern>_ NAME IS *</pattern>
<that>WHAT IS YOUR NAME</that>
<template>Nice to meet you<set_name><star index="2"></set_name></template>
</category>

 

I used the Program# interpreter to check my knowledge base files, and it did not give me any error for the above category. But I could not display the name later using the following category:

<category>
<pattern>GOODBYE</pattern>
<template>See you later<get name="it"/></template>
</category>

It did not give me the name. I guess there is something wrong in the way I am setting it or trying to display it, but I could not figure out what. Can someone help me with this?
Thank you very much. :) :)

 

 
  [ # 7 ]

I’m not sure what interpreter you are using, but you haven’t given a name to your variable and have assumed it is called “it”. I would advise trying categories like this:

<category>
<pattern>_ NAME IS *</pattern>
<that>WHAT IS YOUR NAME</that>
<template>Nice to meet you <set name="name"><star index="2"/></set></template>
</category>

<category>
<pattern>GOODBYE</pattern>
<template>
See you later <get name="name"/>
</template>
</category>
 

 
  [ # 8 ]

Hi Steve,

Thanks a lot. It is working :) :)

I have another problem with the <that> tag. My code is like this:

<category>
<pattern>HELLO</pattern>
<template>
<random>
<li>Hello there, welcome to counselling service</li>
<li>Hello there what is your name?</li>
</random></template>
</category>

<category>
<pattern>_ NAME IS *</pattern>
<that>HELLO THERE WHAT IS YOUR NAME</that>
<template>Nice to meet you<set name="name"><star index="2"/></set></template>
</category>

But it does not work that way when I use the input “My name is Rangi” after the bot’s output “Hello there what is your name?”. I checked the syntax against the AIML reference manual at:

http://www.alicebot.org/documentation/aiml-reference.html#that

and against other source files which use the <that> tag, but it does not work. The interpreter I am using to check my AIML files is Program# at:

http://ntoll.org/article/project-an-aiml-chatterbot-in-c

But if I write the code as:

<category>
<pattern>HELLO</pattern>
<template>
<random>
<li>Hello there, welcome to counselling service</li>
<li>HELLO THERE WHAT IS</li>
</random></template>
</category>

<category>
<pattern>_ NAME IS *</pattern>
<that>HELLO THERE WHAT IS</that>
<template>Nice to meet you<set name="name"><star index="2"/></set></template>
</category>

it works. But the text inside both the <that> tag and the <li> tag has to be in block capitals, and this works only for four words.
I need the bot outputs to be in normal text, though. I checked the research paper (page 6) at:
http://www.scribd.com/doc/32243318/A-Comparison-Between-ALICE-and-Elizabeth-Chatbot-Systems
It also says that the previous bot output is normalized in the same way as the user input, and therefore matches the text in the <that> tag.

Why does it behave this way?
Is there anything wrong in the code?
Does the text in the <that> tag work for only a few words?
How can I use this in the correct way?

I am hoping for a solution. Thank you very much.

 

 
  [ # 9 ]

Take out the question marks from your <li>. You shouldn’t have any punctuation in there at all.

 

 
  [ # 10 ]

Hi Umesha,

I’ve written an AIML interpreter before. A few misunderstandings may have been suggested here. There is a standard for AIML, as you have found already. Only a few interpreters follow it strictly (Program D is an example of one that follows the rules). Since many hobbyists have taken up the challenge of building their own, you need to be careful of the deviations and of the ones that are not up to standard.

For example, the standard says you can have two wildcards adjacent.  Wildcards are supposed to consume the least amount of words, so a pattern like “* *” will put only one word in the first wildcard and the rest in the second.  It won’t match one word inputs since there are two wildcards.  It is perfectly normal for a wildcard to go unused, so you don’t have to worry about the second wildcard.
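
As a rough illustration of that “least amount of words” behaviour, here is a small Python sketch. It is only an approximation under my own assumptions: the split_wildcards helper is hypothetical, and a lazy regular expression stands in for the pattern-tree walk that a real interpreter performs.

```python
import re


def split_wildcards(pattern, normalized_input):
    """Return what each wildcard captures, or None when there is no match.

    Hypothetical helper: every wildcard must take at least one word and,
    being lazy, takes as few words as the rest of the pattern allows.
    """
    parts = [r"(.+?)" if tok in ("*", "_") else re.escape(tok)
             for tok in pattern.split()]
    match = re.fullmatch(" ".join(parts), normalized_input)
    return list(match.groups()) if match else None


three_words = split_wildcards("* *", "GOOD EVENING CHATBOT")
one_word = split_wildcards("* *", "HELLO")
print(three_words, one_word)
```

Here the first wildcard ends up with GOOD and the second with EVENING CHATBOT, while the one-word input HELLO gives no match at all because each of the two wildcards needs a word of its own.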

Templates are processed sequentially from beginning to end.  When they encounter a <srai> or <sr> tag, they start another interpreter to respond to the contents of the tag (or the <star> as in the case of the shortcut <sr> tag.)  This recursive calling of the interpreter is often much more than 6 or 7 levels deep unless the AIML engine is very basic and primitive.  The standard test cases nest it twenty levels deep.

A non-standard, but often used, enhancement is to assign <set>’s to the predicate “it”. Often a newer AIML engine will take a <set name="him">Gary</set> and produce the “him” as output. The same goes for all the pronouns.

When you used <set_name>, you were reverting to the old style of set tags, which very few interpreters use anymore. The AIML for that version (I think it is around version .70) would look like <set_name><star2></set_name> instead of the mixed version your example shows.

Punctuation and other user-friendly notation should absolutely be used in templates for output. It is somewhat undefined and maybe likely that the handling of <that>’s does do normalization correctly. The standard is not very clear here. For example, if the template creates a reply containing several sentences, which sentence is used for the <that>? Or must all the outputted text become the <that>? Normalization for pattern matching should convert to uppercase, but that also is not strictly followed.

One thing my AIML engine does is take in the various older versions of AIML and upgrade them to the 1.01 standard version.

I too have extended the standards, and my interpreter is considered experimental by the AIML community. I’ve documented the differences in its on-line help, so you can write AIML in it that should work in any standard AIML engine. Or you can experiment with new tags like <context>.

 

 
  [ # 11 ]
Steve Worswick - Aug 26, 2010:

Take out the question marks from your <li>. You shouldn’t have any punctuation in there at all.

I have to disagree here. The <li> tags encapsulate a portion of the response, so decidedly need punctuation. Your interpreter, however, needs to make sure to strip whatever punctuation is within the text that is to be stored as the THAT “variable”. This is to be done within the script, though, not the AIML, of course.

@Gary

Hi, Gary, and thanks for hopping on in here. I’m curious as to the programming/scripting language you’ve used to create your interpreter. Not for any particular reason, mind you. Just curious. :)

 

 
  [ # 12 ]

Hi,

Removing the punctuation marks did not work; it only works when both the <li> and the <that> tag have the same text in block capitals (and only for a few words). I really cannot figure out why this is happening. The <that> tag is important for my knowledge base, as I am writing my own AIML files from scratch, and I am really stuck :(

Is something wrong with the interpreter I am using, or with the syntax in my code? I assumed this interpreter adheres to the standards, as I selected it from http://www.alicebot.org/downloads/programs.html

Thank you. :)

 

 
  [ # 13 ]

My apologies, of course you can have punctuation in your <li> tags. It’s the <that> that should have no punctuation. I can only assume it’s the interpreter you are using.

 

 
  [ # 14 ]

You know who would be helpful right now? Chuck! Oh, Chuck!!! We need your expertise!

Seriously, though. I’ve a quick question. Are you using the 2.0 version of Program#, or the original version? I have plenty of room to download both versions, but I don’t have the time to muddle through the code for both.

The way you describe the problem leads me to believe that either Program# isn’t normalizing the input before it checks against the THAT, or possibly isn’t normalizing the output before storing it as THAT.
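
To show what I mean by normalizing the output before storing it as THAT, here is a minimal Python sketch. It is not Program#’s code (which I haven’t read); the function names are made up, and keeping only the last sentence of the bot’s reply is just one possible choice, since the handling of multi-sentence replies is not settled in this thread.

```python
import re


def normalize_that(bot_output):
    """Uppercase the bot's last sentence and strip its punctuation.

    Assumption: only the final sentence of the reply is kept as THAT.
    """
    sentences = [s for s in re.split(r"[.!?]+", bot_output) if s.strip()]
    last = sentences[-1] if sentences else ""
    return " ".join(re.sub(r"[^A-Z0-9 ]", " ", last.upper()).split())


def that_matches(that_pattern, bot_output):
    """Compare a wildcard-free <that> pattern with the normalized bot output."""
    return normalize_that(bot_output) == that_pattern.upper()


stored = normalize_that("Hello there what is your name?")
print(stored)
print(that_matches("HELLO THERE WHAT IS YOUR NAME", "Hello there what is your name?"))
```

With normalization in place, the friendly reply “Hello there what is your name?” is stored as HELLO THERE WHAT IS YOUR NAME, so the <li> text can stay in normal case with its question mark while the <that> pattern stays in capitals without punctuation. If an interpreter skips this step, only an <li> that is already written in capitals will match, which is consistent with the behaviour described above.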

 

 
  [ # 15 ]

Oh, and by the way, just because the programs listed on that page are there, there’s no guarantee at all that they follow the AIML standards. I know for a fact that at least 2 programs there have serious issues regarding adherence to standards.

 
