Running CS with other languages
 
 

I was asked this question by email because the author is still awaiting membership confirmation:
“How easy (or how hard) is it to port a new language to ChatScript, especially a right-to-left language (Arabic, Hebrew, Persian, Urdu, ...)?
Do I need WordNet, deep NLP programming, or something else?
Can you suggest any resources that illustrate adding new languages to ChatScript?”


There are various aspects to language support in ChatScript. The system supports UTF-8, so you can input and match patterns in any language.

The system ships with a set of predefined concepts. If you want to use any of them in another language, you would presumably write your own equivalents for the ones you need.

Most users don’t need the built-in POS-tagging and parsing abilities. If you want them for another language, the easiest way is to query an external server that tags or parses that language and read the answers back into CS.

The dictionary has a variety of uses. Spell checking is one. It is also sometimes used to enumerate a list of words when building a concept, but a concept can be built manually instead.

The biggest issue for other languages is getting the canonical form of a word (by stemming or some other means), since one usually wants to match the canonical form rather than the original form. There is a file (canonical.txt) that can explicitly list words and their canonical forms, so you could simply list every word you want to use, with its canonical form, in there. Otherwise you would need to write code corresponding to the system’s stemmers.
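To make the idea of an explicit word-to-canonical list concrete, here is a minimal Python sketch of such a lookup. It is not ChatScript code and does not reproduce the actual layout of canonical.txt (check the file in your ChatScript distribution for that). The German entries and the names `CANONICAL`, `canonical_form`, and `canonicalize` are made up for illustration.

```python
# Hypothetical word -> canonical-form table; the entries are illustrative only.
CANONICAL = {
    "ging": "gehen",
    "gingen": "gehen",
    "gegangen": "gehen",
    "geht": "gehen",
}


def canonical_form(word):
    """Return the listed canonical form, or the lowercased word if it is unlisted."""
    lowered = word.lower()
    return CANONICAL.get(lowered, lowered)


def canonicalize(sentence):
    """Split on whitespace and map every token to its canonical form."""
    return [canonical_form(token) for token in sentence.split()]


if __name__ == "__main__":
    print(canonicalize("Wir gingen nach Hause"))
```

A table like this only covers the words you list, which is the trade-off described above: it avoids writing a stemmer, but every inflected form you care about has to be entered by hand.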

As for right-to-left languages, I’m not an expert in that.
Output from CS will be whatever the output is; CS isn’t aware of right-to-left direction.
Patterns normally match left to right, but it is possible to make them match backwards. Then again, presumably if the input is right to left, you can just match it left to right as usual, matching the end of the pattern before the start. I don’t know.
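One point that may help here: Unicode text, including UTF-8, is stored in logical (reading) order, and right-to-left layout is applied only when the text is displayed. The short Python sketch below illustrates that with an Arabic question; it says nothing about ChatScript internals, and the name `logical_tokens` is just for illustration.

```python
# "\u0645\u0627 \u0627\u0633\u0645\u0643\u061f" is Arabic for "What is your name?"
# It is written with escapes so that the stored order is unambiguous on screen.
QUESTION = "\u0645\u0627 \u0627\u0633\u0645\u0643\u061f"


def logical_tokens(text):
    """Split on whitespace; tokens come back in reading order, not screen order."""
    return text.split()


if __name__ == "__main__":
    tokens = logical_tokens(QUESTION)
    # The first token is the word a reader reads first ("what"),
    # even though a bidi-aware display draws it at the right edge.
    print(tokens[0] == "\u0645\u0627")
    # The Arabic question mark is the last character in memory.
    print(QUESTION[-1] == "\u061f")
```

If ChatScript sees the input in this stored order, an ordinary pattern would test the first-read word first, so reverse matching should not be needed just because the script is drawn right to left. That is an inference from how Unicode stores text, not something tested against ChatScript here.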

 

 
  [ # 1 ]

Follow-up questions asked by the poster:
Can ChatScript switch between languages instantly, or does the bot need to shut down and restart?
Arabic, for example, is not only a right-to-left language; it also builds the sentence differently: the main verb comes first, then the noun that performs the action, then the noun that completes the meaning. Will that deeply affect ChatScript’s pattern-matching mechanism?
The punctuation marks also differ; for example, ؟ and ، are the question mark and comma. Will that deeply affect using ChatScript?
In Arabic it is almost impossible for a word to have only one meaning, since context is very important for clarifying the meaning of each word. Will that deeply affect using ChatScript?

It depends on what is being done. Patterns in UTF-8 can be in any language, so one rule can test French and another Arabic. A dictionary COULD hold multiple languages, though there may be issues when the same spelling of a word yields different parts of speech depending on the language.
The order of words in a sentence doesn’t concern ChatScript; it is the job of your patterns to test for the correct order.
Punctuation will affect ChatScript in that the internal determination of statement vs. question is done using English punctuation. However, you can override that from script, so you can work around it.
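A different workaround from the script-level override mentioned above is to normalize punctuation before the text ever reaches ChatScript. The Python sketch below assumes you control the client that sends user input to the bot; the names `ARABIC_TO_ASCII` and `normalize_punctuation` are hypothetical and nothing here is part of ChatScript itself.

```python
# Map Arabic punctuation to the ASCII marks an English-oriented engine expects.
ARABIC_TO_ASCII = str.maketrans({
    "\u061f": "?",  # ARABIC QUESTION MARK
    "\u060c": ",",  # ARABIC COMMA
    "\u061b": ";",  # ARABIC SEMICOLON
})


def normalize_punctuation(text):
    """Replace Arabic punctuation with ASCII equivalents, leaving words untouched."""
    return text.translate(ARABIC_TO_ASCII)


if __name__ == "__main__":
    # "\u0645\u0627 \u0627\u0633\u0645\u0643\u061f" is Arabic for "What is your name?"
    print(normalize_punctuation("\u0645\u0627 \u0627\u0633\u0645\u0643\u061f"))
```

With this in place, a sentence ending in ؟ reaches the engine ending in ?, so the built-in statement-vs-question test would see the punctuation it expects.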

In ALL languages, a word rarely has only a single meaning, as context is important. ChatScript is unaffected because it is neither responsible for nor capable of understanding “meaning”. That is the job of the scripts you write, which become responsible for handling context.

 

 
  [ # 2 ]

At last I can post in this forum :D
Thank you, Mr. Bruce, for your cooperation.
One other thing whose effect on ChatScript I would like to know about:
In Arabic, pronouns and question particles may be attached to the stemmed word to produce another word that carries a complete meaning. For example, this Arabic word is actually a complete question:
أنلزمكموها
and is literally translated as:
Shall we compel you to accept it

Is there any detailed, illustrated previous experiment in porting a non-English language to ChatScript? German, for example?

One more thing before I forget:
Can I override the ChatScript keywords with their equivalents in another language? For example, changing
topic:
to
موضوع:
And if it’s possible, how is it done?

 

 
  [ # 3 ]

There are no illustrations of other languages being used BEYOND merely using UTF-8 words in patterns.

You cannot override ChatScript keywords. You are stuck with topic: instead of an Arabic equivalent.

 

 