Chinese opening and closing double quotes are revised to ASCII
 
 

I am running into a problem when my top file contains a Chinese opening double quote.

My top file is UTF-8 encoded so that it can handle Chinese. If the file contains any Chinese double quote, :build mybot raises a warning:

line 11 of mytop.top: UTF8 closing double quote revised to Ascii

If I save the same file as UTF-8 without a BOM, the above warning goes away, but I get a different file-level warning:
File mytop.top has no utf8 BOM but has character>127
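To see which of the two warnings a given file is likely to trigger, the file can be inspected outside CS. The following is only a sketch in Python, not how CS itself performs the checks; the function name `inspect_top_file` and its return shape are made up for illustration.

```python
import codecs

# The two Chinese (curly) double quotes mentioned in this thread.
CURLY_QUOTES = {"\u201c": "opening", "\u201d": "closing"}


def inspect_top_file(data: bytes):
    """Inspect raw UTF-8 file bytes (hypothetical helper, not part of CS).

    Returns (has_bom, has_non_ascii, hits), where hits is a list of
    (line_number, "opening" | "closing") for each curly double quote.
    """
    has_bom = data.startswith(codecs.BOM_UTF8)
    body = data[len(codecs.BOM_UTF8):] if has_bom else data
    # Any byte above 127 means the file is not plain ASCII.
    has_non_ascii = any(b > 127 for b in body)
    hits = []
    for lineno, line in enumerate(body.decode("utf-8").splitlines(), start=1):
        for ch in line:
            if ch in CURLY_QUOTES:
                hits.append((lineno, CURLY_QUOTES[ch]))
    return has_bom, has_non_ascii, hits


if __name__ == "__main__":
    sample = codecs.BOM_UTF8 + "u: (hi) 他说“你好”\n".encode("utf-8")
    print(inspect_top_file(sample))
```

For a real file, the bytes can be read with `open("mytop.top", "rb").read()` and passed to the function.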

I also tried escaping the double quotes and using ^original(), but neither helped. Is there any way to solve this problem?

For testing purposes:
Chinese opening double quote is “
Chinese closing double quote is ”

Thank you so much for your help.

 

 
  [ # 1 ]

CS only tokenizes strings and processes them correctly using simple ASCII double quotes, which is why it translates other UTF-8 quotes to simple ASCII ones. I understand your files have UTF-8 quotes. Why is it important that they NOT be changed internally? When you output such a string to the user, do Chinese readers want to see only the curly quotes?

 

 
  [ # 2 ]

Yes, curly (directional) quotes are expected in Chinese, since vertical quotes are not officially part of Chinese punctuation. Is it possible not to tokenize Chinese quotes at all and to treat them as regular UTF-8 characters? Or is it possible to convert the output back to the original quote format after tokenization?

 

 
  [ # 3 ]

So, is it sufficient if I provide an output control that has CS automatically convert normal quotes on output to the user into curly quotes?
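As an illustration of that kind of conversion, here is a minimal Python sketch of a post-processing step applied to the bot's output outside CS. It is not the CS output control itself. The function name `curl_quotes` is hypothetical, and it assumes quotes in the output are balanced and not nested, so that ASCII double quotes simply alternate between opening and closing.

```python
def curl_quotes(text, opening="\u201c", closing="\u201d"):
    """Replace ASCII double quotes with alternating curly quotes.

    Hypothetical helper: assumes quotes are balanced and not nested,
    so the 1st, 3rd, ... quote opens and the 2nd, 4th, ... closes.
    """
    out = []
    open_next = True  # the next ASCII quote seen is treated as an opening quote
    for ch in text:
        if ch == '"':
            out.append(opening if open_next else closing)
            open_next = not open_next
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(curl_quotes('他说"你好"'))
```

Keeping the opening and closing characters as separate parameters lets the left and right quotes be configured independently.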

 

 
  [ # 4 ]

Yes, absolutely. It would be better to have two output controls, one for the left curly quote and one for the right.

Thank you so much.

 

 
  [ # 5 ]

Done in the next release.

 

 