
Determining if user input is a question or statement using AIML
 
 

I am creating a chatbot on Pandorabots.  What are good strategies for determining if a given input is a question or a statement?  People do not always use question marks, even when they are asking a question.

I can grab the first word of the input, and I am using the following category to return “Yes” if it matches a question word.  I then set “ISQUESTION” to “Yes”.  With each new input I set “WASQUESTION” to the value of “ISQUESTION” so I know whether the prior user input was a question.  Is there any built-in functionality in AIML for this already?

<category>
<pattern>XISQUESTIONWORD *</pattern>
<template>
<think><set name="star"><star/></set></think>
<condition name="star">
    <li value="WHAT">Yes</li>
    <li value="WHO">Yes</li>
    <li value="WHERE">Yes</li>
    <li value="WHEN">Yes</li>
    <li value="WHY">Yes</li>
    <li value="HOW">Yes</li>
    <li value="IS">Yes</li>
    <li value="ARE">Yes</li>
    <li value="WAS">Yes</li>
    <li value="WERE">Yes</li>
    <li value="DO">Yes</li>
    <li value="DID">Yes</li>
    <li value="WILL">Yes</li>
    <li value="WOULD">Yes</li>
    <li value="CAN">Yes</li>
    <li value="COULD">Yes</li>
    <li value="SHALL">Yes</li>
    <li value="SHOULD">Yes</li>
    <li>No</li>
</condition>
</template>
</category>

 

 
  [ # 1 ]

I have improved my chatbot’s ability to recognize questions.  I have added logic to check the last word of the input.  If it is a question mark, the input is automatically flagged as a question.  If the second-to-last “word” was a comma, it checks whether the final word was “yes”, “no”, “right”, “correct”, or “verdad”.

This works for sentences with or without the question mark such as:
Tomorrow is Saturday, correct?
You need to study, right?
The store is around the corner, no?

<category>
<pattern>XISFINALQUESTIONWORD *</pattern>
<template>
<think><set name="star"><star/></set></think>
<condition name="star">
  <li value="XQUESTIONMARK">Yes</li>
  <li>
  <condition name="PREVIOUSWORD">
    <li value="XCOMMA">
      <condition name="star">
      <li value="YES">Yes</li>
      <li value="NO">Yes</li>
      <li value="RIGHT">Yes</li>
      <li value="CORRECT">Yes</li>
      <li value="VERDAD">Yes</li>
      <!-- default so a comma followed by any other word still returns a value -->
      <li>No</li>
      </condition>
    </li>
    <li>No</li>
  </condition>
  </li>
</condition>
</template>
</category>

(For this to work, the HTML page must be modified to substitute words for punctuation marks when the form is submitted, because Pandorabots strips them out automatically.)

Are there any other rules that anyone can think of to determine if a sentence is intended as a question using AIML?

 

 
  [ # 2 ]

There is no built-in functionality for this in AIML.

Another question word you don’t have is “DOES”. Be careful of using this method, as statements such as “Will Smith is a great actor” will be flagged up as a question.

I wouldn’t advise checking the last word like that, as I feel this will flag up more errors than genuine questions. “You are correct”. “At the end of the road, turn right” and so on.

I tend to use a mixture of the two approaches, checking for things like “YOU ARE * ARE NOT YOU”, which I srai to “ARE YOU <star/>”, or “I * DO NOT I”, which I srai to “DO I <star/>”.
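
Written out as categories, the two reductions described above might look like the sketch below. It assumes the bot's normalization has already expanded “aren’t” to “are not” and “don’t” to “do not” and stripped the question mark; the categories are illustrative, not copied from anyone's bot.

```xml
<!-- "You are happy, aren't you?" is re-submitted as ARE YOU HAPPY -->
<category>
<pattern>YOU ARE * ARE NOT YOU</pattern>
<template><srai>ARE YOU <star/></srai></template>
</category>

<!-- "I need a ticket, don't I?" is re-submitted as DO I NEED A TICKET -->
<category>
<pattern>I * DO NOT I</pattern>
<template><srai>DO I <star/></srai></template>
</category>
```

Because the srai target starts with a question word, a first-word check would then see the rewritten input as a question.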

 

 
  [ # 3 ]

What about something like…

http://elizabot.com/determine

 

 
  [ # 4 ]

Or perhaps like this ...

http://iSyFy.com

in social media.

 

 
  [ # 5 ]

Thanks, Steve, for the feedback.  The test for a word at the end of a sentence requires a comma just before the last word, so statements such as “You are correct” will not be flagged as a question.  The custom web page encodes commas as “XCOMMA” so they are not stripped out.  I can choose to strip them out after determining whether the user’s input is a question.  The case of “Will” being a first name is challenging.  I might let regular AIML pattern matching handle this case and any specific names, and then reevaluate whether the input is a question, using a mixed approach as you suggest.  Another approach might be to test the word following “Will” and determine whether it is in a list of last names, but that seems less efficient.

I found this site listing question words:
http://www.hopstudios.com/nep/unvarnished/item/list_of_english_question_words
The author seems to distinguish between question words that can only be used as question words and words that can also be used as verbs, which they do not consider to be question words.  I am not sure, but I do not think the distinction matters when testing the first word of a sentence.

I have added “does” to the list.  I have also added “which”, “whose”, “am”, “whom”, “has”, “have”, and the more archaic “wherefore”, “wherewith”, “whither” and “whence” just in case.

I think there are also commands that might be used to demand a response, such as “Riddle me this ...” and “Tell me if..”, and I think these are best handled with regular AIML pattern matching.

Originally, I was after the functionality of ChatScript, where some patterns are only checked if the input is determined to be a question and others only if the input is determined to be a statement.  After further consideration and research, it might be best if I created 100+ question patterns in AIML and then flagged the user input as a question whenever a question pattern is matched.  Would it be more efficient if the pattern matching of questions could be split out from pattern matching for statements?
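
A minimal sketch of the "flag it when a question pattern matches" idea follows. The pattern, the replies, and the reuse of the “ISQUESTION” predicate are assumptions for illustration; a real bot would need one such category per question form, and the catch-all would replace any existing default `*` category.

```xml
<!-- Hypothetical question pattern: matching it is itself the evidence
     that the input was a question, so the flag is set in the template. -->
<category>
<pattern>WILL YOU *</pattern>
<template>
<think><set name="ISQUESTION">Yes</set></think>
I cannot promise that.
</template>
</category>

<!-- Catch-all: input that no more specific category matched is treated as a statement. -->
<category>
<pattern>*</pattern>
<template>
<think><set name="ISQUESTION">No</set></think>
I see.
</template>
</category>
```

This moves the question test into the pattern matcher itself, at the cost of having to set the flag in every question category.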

(∞Pla•Net, thanks for the links.  Not really sure what to make of them.  The site just repeats my questions with the word “asked” appended.  Is it used to build up data on what people consider to be questions or statements?)

How do people handle questions like “Tomorrow, will you be online?” versus “Will you be online tomorrow?”  Would you have to handle every different word order case in AIML as separate patterns?  The method I am using now would not work if there was an adverb or prepositional phrase before the actual question.  Would I have to test for “will you”, “will he”, “will she”, “will they” separately in the middle of a user’s input?  What about “Will the president…”, “Will Bob…”, etc.?

 

 
  [ # 6 ]

In terms of grammar, a question is distinguishable by a verb or modal verb placed before the subject. All I have to add is that you may want to take linking words into account (“So/but do you…?”), and that “do” can also be used in commands such as “Do tell.” or “Do me a favour.” (though this is uncommon). I can’t tell you how to do that, though, as I don’t use AIML.
“Will” is a tough one indeed! It can also have the meanings “a strong will”, “last will and testament”, and “I will it to happen.” I’ll have to think about that one.
Can’t you use your same first-word method to handle preceding clauses with

XCOMMA will 
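
One possible reading of this suggestion, sketched in AIML: when a clause precedes the comma token, move it to the end so that the question word comes first and the existing first-word check can run unchanged. The category is an assumption rather than something posted in this thread; it relies on the page sending commas as XCOMMA and covers only WILL, so every other question word would need its own category.

```xml
<!-- "Tomorrow XCOMMA will you be online" is re-submitted as
     WILL YOU BE ONLINE TOMORROW -->
<category>
<pattern>* XCOMMA WILL *</pattern>
<template><srai>WILL <star index="2"/> <star/></srai></template>
</category>
```

Here `<star/>` is the clause before the comma and `<star index="2"/>` is everything after “will”.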
 

 
  [ # 7 ]

If the users are not going to put question marks at the end, they certainly won’t put commas in the question.

 

 
  [ # 8 ]

I just read a post in another topic that shows otherwise. Let’s just say we can’t count on it.

I think 8pla meant to show you an example of how you could make the question/statement distinction through the user interface (by providing two buttons, presumably passing along an addition to the input like ISQUESTION or ISSTATEMENT), so you wouldn’t have to analyse the text for it at all.
Along those lines, in your HTML you could block user input when the sentence isn’t properly ended with either a . or a ?, and so force the user to use proper punctuation, much as a PHP form refuses to submit when you leave a field empty.
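
On the AIML side, the two-button idea could be received roughly as sketched below. This assumes the page prepends ISQUESTION or ISSTATEMENT to the text before submitting it; the marker words and the predicate name are illustrative choices, not an existing convention.

```xml
<!-- The marker word records the user's choice, then the rest of the
     input is re-submitted for normal pattern matching. -->
<category>
<pattern>ISQUESTION *</pattern>
<template>
<think><set name="ISQUESTION">Yes</set></think>
<srai><star/></srai>
</template>
</category>

<category>
<pattern>ISSTATEMENT *</pattern>
<template>
<think><set name="ISQUESTION">No</set></think>
<srai><star/></srai>
</template>
</category>
```

Any category reached through the srai can then read the predicate with `<get name="ISQUESTION"/>` without analysing the text itself.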

 

 
  [ # 9 ]

Alaric said, “Not really sure what to make of them.”

Thanks for the compliment, Alaric, by not noticing this as an alternate solution.  I prefer that the user not know what to make of it.  Rather, I hope they may discover it naturally.  (At least that was my original plan). 

If the user wishes to ask something, they click the ASK button. If the user wishes to say something, they click the SAY button.  So, the grammar and punctuation are both forced by choosing a button to click.  The responses simply indicate whether a question or a statement was submitted to the robot.  This method may help train a robot to improve its detection of questions and statements, including those which may be poorly formatted.

With that suggested as an alternative, please know I do find your technique very interesting and I do wish you good luck making new discoveries with it.

 

 