AI Zone Admin Forum Add your forum

NEWS: Chatbots.org survey on 3000 US and UK consumers shows it is time for chatbot integration in customer service!read more..

AIML Matching priority (that tag)
 
 

According to http://www.alicebot.org/TR/2001/WD-aiml/ the tag and topic patterns should be appended to the match tree. That means, more specific <pattern> patterns have higher priority than more specific <that> patterns.

Using the code (and categories 1-5) below as an example.
This would give us the following matches:
in: THAT ASD
out: THAT ASD (cat 1)
in: THAT ASD
out: that THAT ASD (cat 4)
in: PATTERN ASD
out: pattern/that ASD (cat 5, not cat 2 because of ‘that’)
so far so good. reset interpreter.
in: THAT ASD (just to setup the ‘that’ again)
out: THAT ASD (cat 1)
in: PATTERN PATTERN ASD
out: pattern pattern ASD (here cat 3 has priority over cat 5)

So my question is: Is that the expected behavior or is my interpreter or my understanding of the AIML specs wrong? And if that is the correct behavior, how can I achieve similar results, eg. matching shorter patterns if a more specific ‘that’ is given.

<?xml version="1.0" encoding="utf-8"?>
<aiml versi>
<
category>
  <
pattern>*</pattern>
  <
that>*</that>
  <
template><star/></template>
</
category>
<
category>
  <
pattern>PATTERN *</pattern>
  <
that>*</that>
  <
template>pattern <star/></template>
</
category>
<
category>
  <
pattern>PATTERN PATTERN *</pattern>
  <
that>*</that>
  <
template>pattern pattern <star/></template>
</
category>
<
category>
  <
pattern>*</pattern>
  <
that>THAT *</that>
  <
template>that <star/></template>
</
category>
<
category>
  <
pattern>PATTERN *</pattern>
  <
that>THAT *</that>
  <
template>pattern/that <star/></template>
</
category>
</
aiml

Thanks for your help!

 

 
  [ # 1 ]

Hi, Jannis, and welcome to chatbots.org! smile

First off, by your use of the term “my interpreter”, I assume we’re working with an AIML interpreter of your own design, is this correct?

So far as the way pattern/that/topic matching works in AIML, it’s a matter of specificity, with the <pattern> tag being most general, followed by the <that> tag, which is more specific, and finally by the <topic> tag, which is the most specific. There is also an order of specificity within each of those tags as well, though there is a bit of a “complication” that involves the underscore (_) tag, since underscore wildcards take precedence over even direct matches (I could fill a few posts about the subject of underscores, so I won’t get into it here unless you require clarification about it).

For example, let’s start with AIML categories that only contain a <pattern>, with both <that> and <topic> being the default value of *:

<category>
    <
pattern>MY NAME IS *</pattern>
    <
template>Hello, <star/>!</template>
</
category>

<
category>
    <
pattern>MY NAME IS BOB</pattern>
    <
template>HelloBob!</template>
</
category

(again, not going to get into underscore wildcards unless clarification or explanation is necessary)

Of these two categories, the first is more general in nature, so will match just about any input that begins with “MY NAME IS “. The second category only matches when the input is specifically “MY NAME IS BOB”, so in this particular instance, this is the category that most specifically matches the input, and thus is “chosen”.

Now let’s bring <that> into the mix, and see how that works. Take these two categories, for example:

<category>
    <
pattern>*</pattern>
    <
that>WHAT IS YOUR *</that>
    <
template>Your <thatstar/> is <person2/>? Interesting.</template>
</
category>

<
category>
    <
pattern>*</pattern>
    <
that>WHAT IS YOUR PROFESSION</that>
    <
template>So you work as <person2/>, huhFascinating.</template>
</
category

Once again, the first category in this pair is less specific in nature than the second one, so it will match a wider range of inputs, while the second category, being more specific, allows the chatbot to respond in a more “personalized” way, ut there’s a trade-off between being too specific, which requires MUCH more in the way of AIML categories to cover sufficient contingencies, or being too non-specific, and having a chatbot that has too many overly vague responses. But I digress here, sorry. smile

<topic> works in exactly the same way as <that>, but instead of relying on the bot’s previous replies, it uses a given subject to focus on. For “professional” bots, such as customer service or expert system bots this can prove very useful in isolating the conversation to a specific domain, so that the bot isn’t responding about that trophy bass instead of the customer’s most recent suspicious email.

With that said, the specificity of a given set of <pattern> tags don’t make any difference if the corresponding <that> tags don’t match and vice versa.

 

 
  [ # 2 ]

Thank you for your quick reply. But it does not really answer my question.

First off, I am using a custom pyAIML fork. But the behavior is the same with pyaiml (py -2.7 -m pip install aiml). the pip source should be https://pypi.python.org/pypi/aiml/0.8.6

You say the <that> tag is more specific than the < pattern> tag.
But then
in: THAT ASD
out: THAT ASD (to set up the that)
in: PATTERN PATTERN ASD
if <that> really is more specific, and higher specificity get’s matched first, then

<category>
  <
pattern>PATTERN *</pattern>
  <
that>THAT *</that>
  <
template>pattern/that <star/></template>
</
category

should match. But it matches

<category>
  <
pattern>PATTERN PATTERN *</pattern>
  <
that>*</that>
  <
template>pattern pattern <star/></template>
</
category

where < pattern> is more specific, but <that> is less specific.

Is that pyaiml behavior incorrect?

EDIT: by the way, without the space in < pattern>, it is not shown… Seems to be a bug in the forum software.

 

 
  [ # 3 ]

For a detailed explanation of how AIML pattern matching is supposed to work, see the AIML 2.0 Draft Spec at

https://docs.google.com/document/d/1wNT25hJRyupcG51aO89UcQEiG-HkXRXusukADpFnDs4/pub

In particular Section 7: AIML Pattern Matching.

For even more detail, explore the source code of Program AB.  The file
source-archive.zip\program-ab\src\org\alicebot\ab\Graphmaster.java
has a complete implementation of the matching algorithm for AIML 2.0.

You can download the code from the Google code archive at
https://storage.googleapis.com/google-code-archive-source/v2/code.google.com/program-ab/source-archive.zip

One day soon I should migrate the source tree to GitHub or Bitbucket.  Curses Google for shutting down your Google Code project! 

 

 

 

 

 

 
  [ # 4 ]
Jannis Weigend - Sep 28, 2017:

You say the <that> tag is more specific than the < pattern> tag.
But then
in: THAT ASD
out: THAT ASD (to set up the that)
in: PATTERN PATTERN ASD
if <that> really is more specific, and higher specificity get’s matched first, then

...

EDIT: by the way, without the space in < pattern>, it is not shown… Seems to be a bug in the forum software.

I’m sorry if I didn’t answer the question to your satisfaction. That’s what I get for posting a detailed response at 2 in the morning. cheese

Let’s address the forum software issue first, shall we? smile

There are some pretty strange quirks in the forum software that strongly affect the writing of XML (and thus AIML) tags, but it’s fairly easy to get around with a little effort. When trying to type <pattern> outside of a code block, the forum software “hides” the tag, presumably for “security” reasons (don’t ask, I didn’t write the software). In order to get it to display correctly, you need to use the HTML entity for the left bracket (&lt;) instead of the bracket itself. There are a couple of other tags affected this way, but the rest of the list is playing hide and seek in my mind.

Now, for the specificity issue, let’s use the GraphMaster to see what’s going on. When written in GM format, the inputs and previous bot responses for the above “conversation” should look like this:

THAT ASD <that> * <topic> *

PATTERN PATTERN ASD <thatTHAT ASD <topic> * 

Now if we look at the two AIML categories in question:

<category>
  <
pattern>PATTERN *</pattern>
  <
that>THAT *</that>
  <
template>pattern/that <star/></template>
</
category

<
category>
  <
pattern>PATTERN PATTERN *</pattern>
  <
that>*</that>
  <
template>pattern pattern <star/></template>
</
category

and make GM patterns for each, we should get this:

PATTERN * <thatTHAT * <topic> *

PATTERN PATTERN * <that> * <topic> * 

Now according to the AIML specification (either 1.0 or 2.0), The <that> tag, with it’s higher specificity, will (or should) determine which of those categories get matched, and since the category containing <that>THAT *</that> is the more specific of the two, it should be the one matched. If PyAIML is not matching that category, then it’s got a bug that needs to be addressed.

 

 
  [ # 5 ]

Thanks a lot, both of you. It really seems to be a bug in pyAIML. I am always amazed widely used libraries contain bugs like that. I will try to fix it.

 

 
  [ # 6 ]

Sorry for double posting, but I cannot edit my original post anymore.

I ran the same inputs using the program-ab java program above and the output is the same.

HumanTHAT ASD
STATE
=THAT ASD:THAT=unknown:TOPIC=unknown
input
THAT ASDthatunknowntopicunknownchatSessionorg.alicebot.ab.Chat@30f39991srCnt0
Matched
: * <THAT> * <TOPIC> * compat.aiml
writeCertainIFCaegories learnf
.aiml size0
Robot
THAT ASD

Human
PATTERN PATTERN ASD
STATE
=PATTERN PATTERN ASD:THAT=THAT ASD:TOPIC=unknown
input
PATTERN PATTERN ASDthatTHAT ASDtopicunknownchatSessionorg.alicebot.ab.Chat@30f39991srCnt0
Matched
PATTERN PATTERN * <THAT> * <TOPIC> * compat.aiml
writeCertainIFCaegories learnf
.aiml size0
Robot
pattern pattern ASD 

I read the specification again and I think I understood it correctly in the first place. THAT and TOPIC are just appended to the sentence. And if it is matched from left to right, of course PATTERN PATTERN * will match first, because a WORD has a higher priority than the wildcard *. The matcher doesn’t even have to disambiguate the two THATs, because it finds the best PATTERN and the rest just “fits”.

But this would make THAT and TOPIC kind of useless…
I am a little bit lost.

 

 
  [ # 7 ]

That’s interesting, that Program AB should produce the same results. Let me test this with Program O and see if I get the same results. It may take me a little bit of time to do so, however, as my development box is sick, so is not currently available. I hope to have it running later today, barring any difficulties.

 

 
  [ # 8 ]

playground.pandorabots.com also yields the same results.

What is the reason that THAT and TOPIC is appended to the pattern? If they are prepended, wouldn’t that solve the problem? Or would that introduce other problems?

 

 
  [ # 9 ]

I honestly have no answer for that, Jannis. Perhaps this is a question best answered by Dr. Wallace, who was one of the founding individuals who created AIML. I’ll see if I can’t get his attention and invite him to join the conversation once more. smile

 

 
  [ # 10 ]

OK, I’m going to propose a way to think about this.

Every AIML category has an implicit value of * for <that> and <topic>, when they are not otherwise specified.

The AIML category

<category>
<
pattern>PATTERN</pattern>
<
template>Response</template>
</
category

is equivalent to

<topic name="*">
<
category>
<
pattern>PATTERN</pattern>
<
that>*</that>
<
template>Response</template>
</
category>
</
topic

Now admittedly the arrangement of <pattern>, <that> and <topic> tags is slightly confusing compared to the definition of the “pattern path” as

<patternPATTERN <thatTHAT <topicTOPIC 

Let’s write that off as a design error in early AIML.

Anyway it should be clear that the category above has a pattern path of

<patternPATTERN <that> * <topic> * 

So far so good?  We also have a notation called AIMLIF (AIML Intermediate Format) used in Program AB, where we represent the category in CSV format as:

PATTERN,*,*,Response 

Let’s use this simplified format to look at the OP’s question.
Jannis is asking about three categories:

A.

*,*,*,<star/> 

and

B.

PATTERN *,THAT *,*,pattern/that <star/> 

and

C.

PATTERN PATTERN *,*,*,pattern pattern <star/> 

with the input sequence:
1. THAT ASD
2. PATTERN PATTERN ASD

The first input matches category A and the bot replies with “THAT ASD”, and as a side effect the value of <that>, for the bot, is set to “THAT ASD”.

Jannis correctly surmises that this value of <that> matches the <that> pattern in category C, but in fact category B matches.  Why?  The matching algorithm explores the graph of pattern paths and matches the first “PATTERN” in the input with the first “PATTERN” in both B and C.  Then it asks, my next word is “PATTERN” (again) and which is more specific, the 2nd PATTERN from C or the * from B?  Obviously, C is more specific.  From here it is smooth sailing through the rest of C’s pattern path, and C wins.  (You can try to visualize all the pattern paths with common prefixes merged together from the root of the graph.  The branches in the graph occur when when the next word from one differs from the next word of another). 

I think in this situation if you really want the response “pattern/that PATTERN PATTERN ASD” you are going to have to add another category:

D.

PATTERN PATTERN *,THAT *,*,pattern/that <star/> 

or to put back in regular AIML:

D.

<category>
<
pattern>PATTERN PATTERN *</pattern>
<
that>THAT *</that>
<
template>pattern/that <star/></template>
</
category

With this category the more specific THAT pattern will match and D will win.

This all could seem a bit confusing out of context.  It would be helpful to know what the OP really wanted to do with real words in the patterns, rather than just “PATTERN PATTERN ASD”.

I hope that helps!

 

 

 

 

 
  [ # 11 ]

Thank you for your detailed answer.

I just want to use a few general patterns to catch follow-up question. But in most cases, the general pattern is not matched because a more specific pattern with a less general that is caught.
eg.
User: I am looking for a restaurant.
Bot: What about restaurant ABC?
User: What’s the address?

For the last input I use a less specific pattern like

ADDRESS *,WHAT ABOUT RESTAURANT *,*,template 

But there is usually already a more specific pattern which matches also, like

WHAT IS THE ADDRESS,*,*,Which place are asking about

So I gather there is no easy way to accomplish this in AIML (without having not write a lot more patterns).
I will try to reverse the matching order (<topic>, <that>, <pattern>) and see if this introduces other problems.

Thanks again for your replies.

 

 
  [ # 12 ]

Why not consider using the underscore high-priority matching wildcard?

_ ADDRESS *,WHAT ABOUT RESTAURANT *,*,template 

or even using AIML 2.0 zero+ wildcards:

# ADDRESS #,WHAT ABOUT RESTAURANT *,*,template 

 

 

 
  [ # 13 ]

Yes, this solves some of my cases. But only as long as I don’t have to “override” the priority of multiple <that>s and/or <topic>s.

 

 
  login or register to react
‹‹ why need *.aiml.csv file      AIML with MYSQL ››