Order of words in concepts
 
 

If I have a concept like:

concept: ~tags["domestic abuse" "aadvark abuse" abuse ]

and a rule like

u: (_~tags)
    You matched _0

I thought that concepts matched in the given order, so a user input of “domestic abuse” would be matched and give “You matched domestic abuse”, but in practice I’m finding that abuse is matched instead, giving “You matched abuse”.

Is the concept matched on a purely alphabetical basis, in which case would “aadvark abuse” match before “abuse”?

I normally want the fullest match for something, so if matching is purely alphabetical, is there any other way to get the “longest” match within a concept?

 

 
  [ # 1 ]

OK, a bit more testing. Originally I actually had

concept: ~tags["Domestic Abuse" "Aadvark Abuse" Abuse ]

which behaved as described, and indeed Abuse still trumped Aadvark Abuse, whether the user typed “aadvark abuse” or “Aardvark Abuse”.

But with all tags in lower case:

concept: ~tags["domestic abuse" "aadvark abuse" abuse ]

then the behaviour is as expected, with “Domestic Abuse” and “domestic abuse” both matching “domestic abuse”.

It just seems odd that Abuse on its own matched both upper- and lowercase input, but quoted (or _ separated) phrases only matched if the cases matched or the concept member was lowercase.

I’ve been caught out by upper/lower case in ChatScript before, just one more time I suppose!

 

 
  [ # 2 ]

Words in concepts and patterns should be in their “natural” case; that is, unless the word is a proper noun, it should be specified in lowercase. Consider “will” and “Will”, or “rob” and “Rob”.

CS has two dictionaries, one for the lowercase words and one for the uppercase versions. The engine will spell-correct a word typed by the user from lowercase to uppercase if the word is only known in an uppercase form, or vice versa. But if a word is present in both dictionaries, then CS cannot convert it.

Because words in concepts and patterns are trusted to be valid forms, any unknown forms can be added to the dictionaries, which then prevents spell correction from happening for those words.
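
To make the rule above concrete, here is a rough Python sketch of the case-correction decision as described in this post. It is only a toy model, not ChatScript's actual code: the function name `correct_case` and the two sets standing in for the dictionaries are made up for illustration.

```python
def correct_case(word, lower_dict, upper_dict):
    """Toy model of the rule described above (not the real engine).

    lower_dict: set of words known in lowercase form, e.g. {"abuse"}
    upper_dict: set of words known in uppercase form, e.g. {"London"}
    """
    key = word.lower()
    # Map the lowercased spelling to the stored uppercase form.
    upper_forms = {u.lower(): u for u in upper_dict}
    known_lower = key in lower_dict
    known_upper = key in upper_forms

    if known_lower and known_upper:
        # Present in both dictionaries: no conversion is possible,
        # so the word stays exactly as the user typed it.
        return word
    if known_upper:
        return upper_forms[key]   # only the uppercase form is known
    if known_lower:
        return key                # only the lowercase form is known
    return word                   # unknown word: left alone in this sketch


lower = {"abuse", "will"}
upper = {"Will", "London"}
print(correct_case("london", lower, upper))  # only known as "London"
print(correct_case("Abuse", lower, upper))   # only known as "abuse"
print(correct_case("will", lower, upper))    # known in both: unchanged
```

In this model, writing a common word in uppercase inside a concept would put it in both sets, which is the situation where no conversion can happen.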

 

 
  [ # 3 ]

Thanks, very useful and may explain some issues.

But if I have a set of correctly cased words, then in what order does the concept list match when there are multiple possibilities? E.g. for an input of Aadvark Abuse, would it match Aadvark Abuse in both:

concept: ~tags ["Domestic Abuse" "Aadvark Abuse" Abuse ]
and
concept: ~tags [Abuse "Domestic Abuse" "Aadvark Abuse"]

in

u: (_~tags) I see _0

i.e. is the match greedy or does it just take the first it finds?

 

 

 

 
  [ # 4 ]

The concept is matched in succession against the words of the sentence, so it will take the first match it finds, with no regard to order within the concept. But if the concept has “against me” and “against” as members, and the sentence is “I found it against me”, it will take the longer match, so “against me” will be matched. That is, it gets to the word “against” as the first matching area, but can then extend it to cover the next word(s) if they, as a composite, are in the concept.
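
The behaviour described above (earliest position in the sentence wins, then the match is extended as far as possible) can be sketched in Python roughly as follows. This is a simplified model rather than the engine's implementation: `match_concept` is a made-up name, and everything is lowercased, so the case issues discussed earlier in the thread are deliberately ignored.

```python
def match_concept(sentence, members):
    """Return the matched text, or None if nothing in the sentence matches."""
    words = sentence.lower().split()
    # Store each member as a tuple of words; a set has no order,
    # mirroring "no regard to order within the concept".
    phrases = {tuple(m.lower().split()) for m in members}
    longest = max((len(p) for p in phrases), default=0)

    # Walk the sentence left to right: the earliest position wins.
    for start in range(len(words)):
        # At this position, prefer the longest member that fits.
        for length in range(min(longest, len(words) - start), 0, -1):
            candidate = tuple(words[start:start + length])
            if candidate in phrases:
                return " ".join(candidate)
    return None


print(match_concept("I found it against me", ["against", "against me"]))
print(match_concept("I found it against me", ["against me", "against"]))
```

Both calls give the same result, because the order of the members plays no part in this model.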

 

 
  [ # 5 ]

A similar issue would arise in [] lists in a pattern of a rule, but there the behavior is that each element of the [] is matched in turn, so order within that set IS important. You may match much later in a sentence than if you had them in a concept instead. And the compiler will complain if you write
u: ([against "against me"])  because it knows that "against me" cannot match, given that against comes before it in the list.
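
For contrast with concept matching, here is a similar Python sketch of the [] behaviour described in this post: each alternative is tried in turn against the whole sentence, so the first alternative that matches anywhere wins. Again this is only a model under simplifying assumptions (lowercased input, made-up names `find_phrase` and `match_bracket_list`), not how the engine is actually written.

```python
def find_phrase(words, phrase):
    """Return the index where phrase starts in words, or None."""
    n = len(phrase)
    for start in range(len(words) - n + 1):
        if words[start:start + n] == phrase:
            return start
    return None


def match_bracket_list(sentence, alternatives):
    """Try each alternative in list order; return (alternative, position)."""
    words = sentence.lower().split()
    for alt in alternatives:
        start = find_phrase(words, alt.lower().split())
        if start is not None:
            # The first alternative that matches wins, even if a later
            # alternative would match earlier in the sentence or be longer.
            return alt, start
    return None


# "against" shadows "against me", which is what the compiler warns about.
print(match_bracket_list("I found it against me", ["against", "against me"]))
# Matches at word 3 even though "help" occurs at word 0.
print(match_bracket_list("help me stand against it", ["against", "help"]))
```

Putting the longer phrase first in the list lets it match in this model, which is why order matters here but not in a concept.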

 

 
  [ # 6 ]

u: ( [...] ) is always slower than u: (~set), though usually the speed doesn’t matter. Matching time in a pattern is roughly linear in the number of tokens.  ~set is 1 token.  [ "against me" against] is 4 tokens.

 

 
  [ # 7 ]

That’s great, many thanks Bruce.

David

 

 