Ignore matches in later rules
 
 

Hi,

I have a problem with two rules: the first pattern should capture the first word in the input, and the second should capture the second or a later word.

The problem arises when the first and second words carry the same marks, so the only way to tell them apart is that the first word has already been captured (caught / extracted, in my terminology) by the first rule.

Example:

u: EXTRACT_MAIN_SUBJECT (_~mainsubject)
    $main_subject = _0

# the main subject is processed and should be ignored from now on
u: EXTRACT_OBJECT (_~noun)  # catch any noun after the main subject
    $object = _0

For input like: “John loves Mary”
It creates the variables:
$object = John
$main_subject = John
This happens because the second rule also matches the first word.

I’m not sure whether my approach is the best way to solve the main issue, or whether the rules should be reorganized in some other way.

Secondly, I have some questions about POS tags and other concept marks:

After a rule has been applied and has matched a word in the sentence, what is the best way to get the marks relevant to my needs into variables?

For example:

u: EXTRACT_MAIN_SUBJECT (_~mainsubject)
    $main_subject = _0
    $main_subject_is_noun_proper_singular = ???
    $main_subject_is_geographical_areas = ???

When I run :prepare on a sentence, I see that all the relevant information is already there, and I would rather not ^query for concept membership to get the data I need.

Lastly, out of curiosity: I see that the POS tagger is aware of context.
For example:
“I’ll email you the details” (email=verb)
“Send me an email” (email=noun)
Is the parser rule-based or statistics-based, and how much of the context does it take into consideration?

Thanks for any response,

Sam

 

 

 
  [ # 1 ]

The issue with EXTRACT_OBJECT is that it looks for any noun from the start of the sentence. Had you said
u: (_~mainobject)
that would have worked. Or
u: (_~mainsubject * _{~noun})
would have worked.
Or you could have unmarked the subject via
u: (_~mainsubject) ^unmark(~noun _0)
before running the object extraction.
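
Putting the ^unmark suggestion together with the original two rules, a minimal sketch might look like the following. It assumes both rules sit in the same topic and that the first rule produces no output, so processing continues to the second rule. It reuses the rule labels and variable names from the question and has not been run against a live bot.

```
# Rule 1: memorize the main subject, then remove its ~noun mark
# so that later rules no longer see that word as a noun.
u: EXTRACT_MAIN_SUBJECT (_~mainsubject)
    $main_subject = _0
    ^unmark(~noun _0)

# Rule 2: matches the first word that is still marked ~noun.
u: EXTRACT_OBJECT (_~noun)
    $object = _0
```

For “John loves Mary”, the intent is that $main_subject is John and $object is Mary, since John should no longer carry the ~noun mark when the second rule runs. Only the ~noun mark is targeted here; more specific marks on the word (for example ~noun_proper_singular) should be left in place.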

Your second question requires concept matching in some manner, e.g.
u: (_~mainsubject) refine()
  a: (_0?~noun_proper_singular)
  a: (_0?~noun_plural)
and so on
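
Expanding the “and so on”, here is a hedged sketch of how the rejoinders could record the result in a variable. The variable $main_subject_pos and the values stored in it are hypothetical names chosen for illustration. Note that ^refine() runs only the first rejoinder whose pattern matches, so this shape suits mutually exclusive categories such as POS tags.

```
u: EXTRACT_MAIN_SUBJECT (_~mainsubject)
    $main_subject = _0
    ^refine()
    # each rejoinder tests whether the word at _0 is a member of a concept
    a: (_0?~noun_proper_singular) $main_subject_pos = noun_proper_singular
    a: (_0?~noun_plural) $main_subject_pos = noun_plural
```

A concept that is independent of the POS tag, such as ~geographical_areas from the question, would need its own separate test rather than another rejoinder in the same list.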

The context of the pos-tagger is the sentence.  It is primarily rule based, but has some statistics for breaking ties.

 

 
  [ # 2 ]

and there is always the exotic:
u: (_~mainsubject) ...
u: (@_0+ * _~noun)  ...

This sets the match position to where _0 was and then scans forward for your noun.
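
As a sketch of how this fits the original two rules (again reusing the labels and variable names from the question, and assuming nothing between the two rules overwrites _0):

```
u: EXTRACT_MAIN_SUBJECT (_~mainsubject)
    $main_subject = _0

# @_0+ moves the match position to where the previous _0 matched,
# scanning forward; the memorized noun then becomes the new _0.
u: EXTRACT_OBJECT (@_0+ * _~noun)
    $object = _0
```

Here the first rule's _0 only serves as a position anchor. Inside the second rule, _0 is reassigned by _~noun, so $object should receive the later noun rather than the subject.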

 

 
  [ # 3 ]

Thank you very much, Bruce.
The second suggestion seems to suit my script better.

About the POS tagger: as someone who used to play with the Stanford CoreNLP tools, I used to think that a learned model is better than a rule-based one, but then I read your paper (ChatScript pos-parser manual) and now I better understand the advantages of the ChatScript POS parser.

Thanks

 

 