

How to implement a ChatScript bot that talks Portuguese
  [ # 16 ]

Thank you Bruce and Eduardo Bedoya.

As I understand it, POS tagging for a foreign language is delegated to an external tool such as TreeTagger, right?

I will read about this new information in the ChatScript manuals and run some tests.


  [ # 17 ]

The CS engine now enables integration with external POS taggers to support languages other than English, which would never have been integrated into the engine itself.


  [ # 18 ]

Hi Bruce,
I searched the 6.7 manuals for external POS taggers with no luck. Could you please tell me which manual covers them? Could you also suggest an external POS tagger? Thanks in advance.


  [ # 19 ]

The ChatScript posparser manual in esoterica, at the end of the manual.


  [ # 20 ]

Hi Bruce, I read the new page in the posparser manual,
so it says that, in order to enable it you need to disable first

I have some questions about it…

1. Will the Spanish spell checker ($cs_language = spanish) no longer be necessary? Could the foreign spell checker completely replace it?

2. I remember that I needed #DO_SUBSTITUTE_SYSTEM enabled in order to match “I am Jorge, from Canada”; otherwise it would only match “I am Jorge from Canada”. Can TreeTagger do this as well?

Thanks in advance.

PS: Could you please post a small example of how to communicate with it locally, i.e. how to use ^popen? Thanks in advance.


  [ # 21 ]

1. A foreign spell checker “can” replace the existing spell checker. I don’t have one to test against, and a particular one might not output data in a friendly fashion; that is not known. The built-in spell checker has an awareness of Spanish, but also assumes you have supplied a revised dictionary of words. This is true for no other language, so for other languages it needs to be disabled or it will damage the input.

2. TreeTagger cannot make substitutions. Almost all substitutions in the LIVEDATA files are language dependent. I have not looked up your specific substitution.

3. The GERMAN bot has a script that demonstrates local communication with an external POS tagger.
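
For readers who want the general shape of local communication with an external tagger, here is a minimal Python sketch. It is not the GERMAN bot's script and it does not show ^popen itself. It assumes the tagger is a command-line program that reads text on stdin and prints one token per line as tab-separated word, tag and lemma (the layout TreeTagger prints with its -token and -lemma options). The function names and the stand-in tagger command are hypothetical.

```python
import subprocess
import sys


def parse_tagger_output(text):
    """Turn 'word<TAB>tag<TAB>lemma' lines into (word, tag, lemma) tuples."""
    rows = []
    for line in text.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        rows.append(tuple(parts))
    return rows


def run_tagger(command, sentence):
    """Send one sentence to an external tagger process and parse its reply."""
    result = subprocess.run(
        command, input=sentence, capture_output=True, text=True, check=True
    )
    return parse_tagger_output(result.stdout)


# Hypothetical stand-in for a real tagger: tags every word as NOUN and
# lowercases it as the "lemma". Replace with the real tagger command line.
FAKE_TAGGER = [
    sys.executable,
    "-c",
    "import sys\n"
    "for w in sys.stdin.read().split():\n"
    "    print(w + '\\tNOUN\\t' + w.lower())",
]

if __name__ == "__main__":
    for row in run_tagger(FAKE_TAGGER, "Eu sou Oberdan"):
        print(row)
```

Keeping the parsing separate from the process call makes it easy to test the parsing against saved tagger output before wiring it to the bot.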


  [ # 22 ]

Thanks, Bruce.

I use LIVEDATA/interjections.txt and substitutes.txt,
but neither of those seems to handle commas.
I know for sure that if I disable #DO_SUBSTITUTE_SYSTEM, CS will not handle commas. I hope TreeTagger can; I’ll test it.

Thanks again.


  [ # 23 ]

I’d like to give you some feedback about my bot.

TreeTagger does not have data for Brazilian Portuguese, only for European Portuguese. So, after some research, I found NLTK, which has a trained model for Brazilian Portuguese.

Within NLTK I’m using NLPNET, an NLP package, but it does not return the canonical word along with the POS tag. So I’m also using the RSLP Stemmer for that task. I made some changes to NLPNET to combine it with the RSLP Stemmer, so that a single function returns both the POS tag and the canonical word, as TreeTagger does.
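
A rough Python sketch of that combination, for illustration only. It does not call the real NLPNET or RSLP Stemmer APIs: toy_tag and toy_stem are hypothetical stand-ins, the lexicon is made-up data, and the noun default for unknown words is an assumption of this sketch. Note also that a stem is only an approximation of a canonical form.

```python
# Toy lexicon standing in for a trained tagger (hypothetical data).
TOY_LEXICON = {"eu": "PRO", "gosto": "V", "de": "PREP", "gatos": "N"}


def toy_tag(tokens):
    """Stand-in tagger: look each token up; unknown words default to noun."""
    return [(tok, TOY_LEXICON.get(tok.lower(), "N")) for tok in tokens]


def toy_stem(word):
    """Stand-in stemmer: strips a plural 's' only; real stemmers do far more."""
    w = word.lower()
    return w[:-1] if len(w) > 3 and w.endswith("s") else w


def tag_with_canonical(sentence):
    """Return (word, tag, canonical) triples from a single call."""
    tokens = sentence.split()
    return [(tok, tag, toy_stem(tok)) for tok, tag in toy_tag(tokens)]


def to_tagger_lines(triples):
    """Format triples as tab-separated lines, one token per line."""
    return "\n".join("\t".join(triple) for triple in triples)


if __name__ == "__main__":
    print(to_tagger_lines(tag_with_canonical("Eu gosto de gatos")))
```

Emitting the same three-column layout as the tagger being replaced means whatever consumes the output does not need to change.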

I ran some tests with ChatScript and everything seems OK.

Thanks, Bruce and Eduardo Bedoya, for your help.



  [ # 24 ]

Very interesting,
congrats Oberdan.
I’m sticking with the old CS Spanish approach for now; I guess I will try the foreign POS tagger in the future.
I will check how words containing foreign characters work in complex pattern rules.
Oberdan, could you tell me if that foreign tagger can handle commas? For example,
suppose we have a concept ~malnames [Oberdan Jorge] (malename is already taken by CS by default):
I am Oberdan from Brazil
I am Oberdan, from Brazil
Can it handle “Oberdan” in both sentences and detect it as part of ~malnames?

Bruce, how much time should one person need to develop a 1500-line bot?

Thanks in advance.


  [ # 25 ]

Impossible to answer that accurately.  We spent around 5 person months to do a 16000 rule chatbot.


  [ # 26 ]

Thanks, Eduardo!

I don’t know if I understood exactly what you meant, but I did the following test.

I created the concept ~malnames:
concept: ~malnames(Oberdan Jorge)

Then I wrote the following rule:
t:() Who are you?
  a:(_~malnames) Nice to meet you '_0.

And ran these tests:

JUNIOR:  Who are you?
john: > I am Oberdan from Brazil.
JUNIOR:  Nice to meet you Oberdan.

JUNIOR:  Who are you?
john: > I am Oberdan, from Brazil.
JUNIOR:  Nice to meet you Oberdan.

For me it is OK. If I did something wrong, please correct me and I will run other tests for you.
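
One plausible reason both inputs match is that the comma ends up as its own token, so “Oberdan” is compared against the concept without trailing punctuation. The Python sketch below illustrates only that idea; it is not ChatScript's tokenizer, and the function names are hypothetical.

```python
import re

# Mirrors the members of the concept ~malnames from the test above.
MALNAMES = {"oberdan", "jorge"}


def tokenize(sentence):
    """Split into word tokens and single punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", sentence)


def first_concept_member(sentence, concept):
    """Return the first token that belongs to the concept, or None."""
    for token in tokenize(sentence):
        if token.lower() in concept:
            return token
    return None


if __name__ == "__main__":
    for text in ("I am Oberdan from Brazil.", "I am Oberdan, from Brazil."):
        print(text, "->", first_concept_member(text, MALNAMES))
```

With a plain whitespace split, the second sentence would yield the token “Oberdan,” (comma attached), which is not a member of the concept.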


  [ # 27 ]

Hello guys!

I really didn’t understand what the functions ^mark and ^unmark do (System functions manual). Could someone explain them better and give some examples?

Oberdan Alves


  [ # 28 ]

Hi Oberdan,
sorry for the long delay.
Yes, the test is fine. You tested it in Portuguese using the external tagger, right?

I haven’t used ^mark in my Spanish chatbot,

but it has been posted about before:

“A concept cannot be triggered by a pair of words directly.
But you CAN create a topic that will do that. It can be run early in your control script or be run as $cs_prepass which happens before $cs_controlmain
u: ( _~verb_infinitive _0?~verb_noobject ) ^mark(~dualconcept1 _0)
and other such rules”

So ^mark sets some kind of bit for a POS such as ~verb, ~noun, etc.
Of course Bruce has the last word; he will correct me if I got something wrong.
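
To make the idea concrete, here is a toy Python model of marking. It is not how ChatScript implements ^mark and ^unmark; it only assumes that each word position carries a set of labels (concepts or POS names) and that a pattern looking for a concept checks those labels. The class and method names are hypothetical, and positions are 0-based here.

```python
class MarkedSentence:
    """Toy model: every word position holds a set of concept labels."""

    def __init__(self, words):
        self.words = list(words)
        self.marks = [set() for _ in self.words]

    def mark(self, concept, index):
        """Attach a concept label to the word at index."""
        self.marks[index].add(concept)

    def unmark(self, concept, index):
        """Remove a concept label from the word at index, if present."""
        self.marks[index].discard(concept)

    def find(self, concept):
        """Return the first index carrying the concept, or None."""
        for i, labels in enumerate(self.marks):
            if concept in labels:
                return i
        return None


if __name__ == "__main__":
    s = MarkedSentence(["I", "want", "to", "sleep"])
    s.mark("~dualconcept1", 3)       # later patterns can now "see" the label
    print(s.find("~dualconcept1"))   # index of the marked word
    s.unmark("~dualconcept1", 3)     # hide it again
    print(s.find("~dualconcept1"))
```

In this model, marking makes a later pattern match something the word would not match on its own, and unmarking hides a match that would otherwise fire.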

good luck


  [ # 29 ]

Hi Eduardo!

You are right. I tested with the external tagger in Portuguese. Although the tests I ran were in English, that is not a problem, because every word the tagger does not recognize is classified as a noun (the default).

I saw that post about the ^mark function, but all I understood from it is that this function can mark a word with a bit flag or a concept, and I thought I was wrong.

The function manual says this about ^mark: “Marking and unmarking words and concepts is fundamental to the pattern matching mechanism”. So I thought I had to use it.

Thank you, Eduardo, for trying to clarify my doubt.


  [ # 30 ]

I have to step in and let you folks know that I think you’re doing an excellent job here. As I don’t know either Portuguese or Spanish (well, maybe enough Spanish to get myself beat up!), I can’t contribute much to the overall conversation, but I wanted to toss in some encouragement and a small amount of praise for your efforts.

