AI Zone: chatbots.org

Chatbots and Spell Checking

Posted: Dec 8, 2011

Dave Morton

Administrator

Total posts: 3111

Joined: Jun 14, 2010

E-mail Dave

One of the real challenges to any sort of chatbot lies in “understanding” the spirit of the intended input, even if it’s not spelled correctly. One of the simplest approaches (though certainly not the “best”) is to process each word against a database of commonly misspelled words and their properly spelled replacements, making substitutions as needed. Now this is quite obviously far from perfect, and it would remain so even with an algorithm that took context into account (e.g. there/their/they’re, pair/pear/pare, etc.).

However, even though it’s far from perfect, having at least some minimal form of spelling correction is essential to having any chance of success in any type of Turing test. What we humans do without thinking when we interpret the following statements:

“Fred went for a wauk in the citty, stoped at a flower shop, and baught a boukay of daisies.”

“Paul and Becky went to the park to celebrate there birthday they’re. They had a pare of pairs each, and now their going home.”

can be an impossible task for the average chat agent. While a spell checker may be able to correct the first statement without much difficulty, the second would prove impossible, since there isn’t a “spelling problem”, but one of semantics, instead. Either way, in order to pass a Turing test at all, I think that both of these types of challenges need to be overcome in some form.

Ok, slipped a little bit off topic, but Andy’s post brought this to mind, and I felt it was at least a little relevant.

Posted: Dec 8, 2011

[ # 1 ]

Andres Hohendahl

Senior member

Total posts: 141

Joined: Apr 24, 2011

E-mail Andres

@Dave
I am with you completely! Good comment, thank you!
I can add that my efforts towards a good spellchecker were rewarded: I presented it (and it was accepted) at a very good NLPCS congress in Madeira, Portugal, in June 2010. (The paper is free and written in plain, no-pain English on my blog/website: http://web.fi.uba.ar/~ahohenda)
The engine we built is, of course, language dependent (but trainable in any language), and it is comparable to or even better than most of the best world-class word-level spell checkers, including GNU Aspell, Ispell and derived works (used in OpenOffice). The main problem in a bot is that you cannot build an “optional” list of candidate words, each with some sort of “coincidence factor”, because you would have to test every candidate against all patterns, and the task becomes too big!
Even so, the library is capable of spell-checking and correcting unknown words, based on evidence from parasynthetic composition (prefixes and suffixes), using valid inflections as well.

Posted: Dec 8, 2011

[ # 2 ]

Dave Morton

Administrator

Total posts: 3111

Joined: Jun 14, 2010

E-mail Dave

Ok, now I’m jealous. The spellchecking that Morti has is a simple search/replace function that checks a table in his database of over 7,000 misspelled words. It works well enough for an AIML chatbot, but it would be nice to have something just a bit more advanced.
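
For readers who have not seen this kind of search/replace correction, here is a minimal sketch in Python. It is not Morti's actual code: the table contents, the `CORRECTIONS` name and the `fix_spelling` function are made up for illustration, and a real table would live in a database and hold thousands of rows.

```python
import re

# Hypothetical sample of a misspelling table (misspelled word -> replacement).
CORRECTIONS = {
    "wauk": "walk",
    "citty": "city",
    "stoped": "stopped",
    "baught": "bought",
    "boukay": "bouquet",
}

def fix_spelling(text, corrections=CORRECTIONS):
    """Replace every word found in the table, leaving all other text untouched."""
    def replace(match):
        word = match.group(0)
        fixed = corrections.get(word.lower())
        if fixed is None:
            return word  # not a known misspelling
        # Preserve a leading capital letter, e.g. at the start of a sentence.
        return fixed.capitalize() if word[0].isupper() else fixed
    return re.sub(r"[A-Za-z']+", replace, text)

print(fix_spelling("Fred went for a wauk in the citty, stoped at a flower shop, "
                   "and baught a boukay of daisies."))
```

A lookup like this repairs the “Fred” sentence but leaves the “Paul and Becky” sentence unchanged, since every word in it is spelled correctly on its own.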

Posted: Dec 8, 2011

[ # 3 ]

Andres Hohendahl

Senior member

Total posts: 141

Joined: Apr 24, 2011

E-mail Andres

@Dave Don’t feel sad!
I can teach you how to make a much better spellchecker for AIML than ever!
Just write a function to extract all the unique words from your patterns (you won’t need anything else to match).
Then build a simple phonetic index with the Double Metaphone algorithm, get the index codes for all the words, and sort them along with your words. After that, build a simple edit-distance algorithm biased with a letter-similarity coefficient (anything reasonable you can imagine will work fine). Push all the words into a Blum filter and into a TST trie.

When an unknown word arrives, check the Blum Filter. If the word is not there, you may have to spell-check: go to the TST trie and look for words within 1 character change, then evaluate them against your phonetic distance. If the result is under some threshold, you’re done! If not, check for 2 to 3 character changes (the limit depends on some heuristic based on the length of your word, for example the square root of the length rounded to an integer).

This will do the best job with your AIML, fast and painless, guaranteed!

enjoy!
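
A rough Python sketch of the lookup step described above, under several loud assumptions: a plain set and a linear scan stand in for the filter and the TST trie, the phonetic check is left out entirely, and the `SIMILAR` letter groups and the 0.5 substitution cost are invented for illustration. Only the biased edit distance and the square-root length budget come from the description.

```python
import math

# Hypothetical groups of letters treated as "similar" (cheaper to substitute).
SIMILAR = [set("aeiouy"), set("ckqg"), set("sz"), set("dt"), set("bp"), set("fv"), set("mn")]

def sub_cost(a, b):
    """Substitution cost: 0 for equal letters, 0.5 for similar ones, 1 otherwise."""
    if a == b:
        return 0.0
    return 0.5 if any(a in g and b in g for g in SIMILAR) else 1.0

def weighted_distance(s, t):
    """Edit distance (insert/delete cost 1) biased by letter similarity."""
    prev = [float(j) for j in range(len(t) + 1)]
    for i, a in enumerate(s, 1):
        cur = [float(i)]
        for j, b in enumerate(t, 1):
            cur.append(min(prev[j] + 1.0,                 # delete a
                           cur[j - 1] + 1.0,              # insert b
                           prev[j - 1] + sub_cost(a, b))) # substitute
        prev = cur
    return prev[-1]

def correct(word, vocabulary):
    """Return the nearest pattern word within the length-based budget, else the word itself."""
    if word in vocabulary:  # a real implementation would ask the filter first
        return word
    budget = max(1, round(math.sqrt(len(word))))
    # With a linear scan, widening the radius step by step is the same as
    # taking the nearest word and checking it against the budget.
    best = min(((weighted_distance(word, v), v) for v in vocabulary), default=None)
    if best is not None and best[0] <= budget:
        return best[1]
    return word

vocabulary = {"walk", "city", "stopped", "bought", "bouquet", "flower"}
for w in ["wauk", "citty", "stoped", "baught", "boukay"]:
    print(w, "->", correct(w, vocabulary))
```

With this toy vocabulary the first four words are repaired, while “boukay” stays as it is because it is further than the budget from “bouquet”. That is the kind of case where a phonetic comparison would be needed.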

Posted: Dec 8, 2011

[ # 4 ]

C R Hunt

Senior member

Total posts: 623

Joined: Aug 24, 2010

E-mail C R

Dave Morton - Dec 8, 2011:
One of the real challenges to any sort of chatbot lies in “understanding” the spirit of the intended input, even if it’s not spelled correctly. [...]

“Fred went for a wauk in the citty, stoped at a flower shop, and baught a boukay of daisies.”

“Paul and Becky went to the park to celebrate there birthday they’re. They had a pare of pairs each, and now their going home.”

can be an impossible task for the average chat agent. While a spell checker may be able to correct the first statement without much difficulty, the second would prove impossible, since there isn’t a “spelling problem”, but one of semantics, instead.

There are ways of hacking your way around sentences like these without relying on hand-picked lists of words. I plan to employ a variant of this idea. Though I wonder about the computation time required to work through possible variants and suss out a sentence that makes contextual sense.

Dave Morton - Dec 8, 2011:
Ok, slipped a little bit off topic, but Andy’s post brought this to mind, and I felt it was at least a little relevant.

Whoops, guess I’m not helping.

Posted: Dec 8, 2011

[ # 5 ]

C R Hunt

Senior member

Total posts: 623

Joined: Aug 24, 2010

E-mail C R

Andres Hohendahl - Dec 8, 2011:
Then build a simple phonetic index double-metaphone algorithm [...]

Ha, Andy beat me to it apparently.

Posted: Dec 8, 2011

[ # 6 ]

Dave Morton

Administrator

Total posts: 3111

Joined: Jun 14, 2010

E-mail Dave

I’ve gone ahead and split this into its own thread, since we sort of strayed significantly off topic.

Andy and CR, this discussion has led me to do a little research, and I’ve found that PHP already has Metaphone support built in. There’s also a module available to handle Double Metaphone, but it’s quite likely that most servers won’t have that particular module installed.

@Andy, you lost me at “Blum Filter”, but that’s what research is for.

Posted: Dec 8, 2011

[ # 7 ]

Andrew Smith

Senior member

Total posts: 473

Joined: Aug 28, 2010

E-mail Andrew

Dave Morton - Dec 8, 2011:
@Andy, you lost me at “Blum Filter”, but that’s what research is for.

I’m pretty sure he meant “Bloom Filter”

http://en.wikipedia.org/wiki/Bloom_filter

Posted: Dec 8, 2011

[ # 8 ]

Dave Morton

Administrator

Total posts: 3111

Joined: Jun 14, 2010

E-mail Dave

Thanks, Andrew. That would, of course, make more sense.

Posted: Dec 8, 2011

[ # 9 ]

Andres Hohendahl

Senior member

Total posts: 141

Joined: Apr 24, 2011

E-mail Andres

Dave Morton - Dec 8, 2011:
Thanks, Andrew. That would, of course, make more sense.

Wow! what a hell of an idea-exchange! (brain hurricane!)
Thanks Andrew! Of course I meant Bloom filtering. (It is a kind of unified hash and very efficient: recall is 100% and precision is quite high, so it is good as a very fast first check. Only N hashes are computed per lookup, where N depends on the number of items stored, the memory used and the precision needed.)
My error: I speak English, German and Spanish, so my memory is basically phonetic (as all of ours might be, as humans). When you pronounce Bloom, the double “oo” sounds like “u”, and in German and Spanish this sound is also written as “u”. Blume is the German word for flower, and surely the English name derives from the word for “flower” in Old English or Old German through a phonetic-orthographic transcription. Blossom also means something related in English, so here are the roots! This is also where part of my concern with phonetics comes from, on which I mostly based my spell-repair algorithm. (The method I mentioned earlier was something I tried; it worked fine for English… but not for Spanish.) :(
hope it helps!

@Andrew Congratulations on your doctorate (PhD)!
I am headed that way too, maybe some day. I am in no hurry!
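
For anyone else who got lost at the filter, here is a minimal Bloom filter sketch in Python. The class name, the default sizes and the use of SHA-256 with a seed prefix are illustrative choices, not anything taken from the library discussed in this thread.

```python
import hashlib

class BloomFilter:
    """Set-like structure: never misses an added item, may rarely report a false hit."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by hashing the item with different seeds.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # Every bit must be set; one clear bit proves the item was never added.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))

known = BloomFilter()
for word in ["walk", "city", "stopped"]:
    known.add(word)
print("city" in known)
```

A word that was added is always reported as present (the 100% recall mentioned above). A word that was never added is usually reported as absent, but it can occasionally collide, which is why the filter only serves as a fast first check before the slower lookup.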

Posted: Dec 8, 2011

[ # 10 ]

Andres Hohendahl

Senior member

Total posts: 141

Joined: Apr 24, 2011

E-mail Andres

C R Hunt - Dec 8, 2011:
Dave Morton - Dec 8, 2011:
One of the real challenges to any sort of chatbot lies in “understanding” the spirit of the intended input, even if it’s not spelled correctly. [...]

“Fred went for a wauk in the citty, stoped at a flower shop, and baught a boukay of daisies.”

“Paul and Becky went to the park to celebrate there birthday they’re. They had a pare of pairs each, and now their going home.”

can be an impossible task for the average chat agent. While a spell checker may be able to correct the first statement without much difficulty, the second would prove impossible, since there isn’t a “spelling problem”, but one of semantics, instead.

There are ways of hacking your way around sentences like these without relying on hand-picked lists of words. I plan to employ a variant of this idea. Though I wonder about the computation time required to work through possible variants and suss out a sentence that makes contextual sense.

Dave Morton - Dec 8, 2011:
Ok, slipped a little bit off topic, but Andy’s post brought this to mind, and I felt it was at least a little relevant.

Whoops, guess I’m not helping.

The problem you mean easily becomes NP-hard if you plan to check all combinations, so doing it that way is not an option. Best try my method! Also, Metaphone is not the best phonetic code (it’s an index code, not a measure): when you get the same code out of 2 words, you can guess they sound somehow similar, but you won’t know how similar “somehow” is!
This is the pity of the whole Soundex and Metaphone family!
Best!
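
To illustrate the “index code, not a measure” point, here is a small Python sketch of classic Soundex, used as a simpler stand-in for Metaphone (which is not in the Python standard library). The `soundex` function is written for this example and follows the usual American Soundex rules as I understand them.

```python
# Consonant groups that share a Soundex digit.
CODES = {}
for digit, letters in enumerate(["bfpv", "cgjkqsxz", "dt", "l", "mn", "r"], start=1):
    for ch in letters:
        CODES[ch] = str(digit)

def soundex(word):
    """Return the 4-character Soundex code: first letter plus up to three digits."""
    word = "".join(ch for ch in word.lower() if ch.isalpha())
    if not word:
        return ""
    result = word[0].upper()
    prev = CODES.get(word[0], "")
    for ch in word[1:]:
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
        if ch not in "hw":  # h and w do not separate letters with the same code
            prev = code
    return (result + "000")[:4]

for a, b in [("Robert", "Rupert"), ("citty", "city"), ("wauk", "walk")]:
    print(a, soundex(a), b, soundex(b))
```

“Robert” and “Rupert” both come out as R163, and “citty” matches “city”, but the codes only say “same bucket or not”. “wauk” and “walk” land in different buckets (W200 and W420) even though they sound almost alike, and nothing in the code tells you how close they are.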

Posted: Dec 8, 2011

[ # 11 ]

Laura Patterson

Senior member

Total posts: 250

Joined: Oct 29, 2011

E-mail Laura

It was my first priority when designing my chatbot. My parser first checks to see if it’s a valid word or phrase. The word(s) are checked against a 58,000 word and a 164,000 phrase database. During the parsing process, if an invalid word is found, it checks to see if the word has at least an 80% match to a valid word. If so, it marks that word to be compared in context with the rest of the input. As the parser continues, it checks for associative words and assigns grammar tags. The “taggedStr” is then processed by the interpreter, where the taggedStr is further analyzed to form a reply.

It’s a complicated process that is yielding excellent results in testing thus far.
I agree that both spelling and grammar correction are cornerstones of an effective AI conversation with a bot.
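
A small Python sketch of the “at least an 80% match” step, for comparison. This is not the parser described above (which is written in Ruby and uses its own matching). It assumes that the similarity ratio from Python's standard `difflib` module is an acceptable stand-in for the match percentage, and the `VALID_WORDS` list and `closest_valid` function are hypothetical.

```python
import difflib

# Hypothetical stand-in for the valid-word database.
VALID_WORDS = ["walk", "city", "stopped", "bought", "bouquet", "flower"]

def closest_valid(word, vocabulary=VALID_WORDS, cutoff=0.8):
    """Return the word itself if valid, else the best match at or above the cutoff, else None."""
    if word in vocabulary:
        return word
    matches = difflib.get_close_matches(word, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None

for w in ["stoped", "citty", "wauk"]:
    print(w, "->", closest_valid(w))
```

With this measure “stoped” and “citty” clear the 80% bar, while “wauk” scores only 75% against “walk” and returns None, so short words may need a lower threshold or a different measure.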

Posted: Dec 9, 2011

[ # 12 ]

Robert Mitchell

Senior member

Total posts: 147

Joined: Oct 30, 2010

E-mail Robert

Here’s the statistical NLP (and therefore, Google’s) approach: http://norvig.com/spell-correct.html
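
A condensed Python sketch in the spirit of that approach, for those who do not follow the link: generate every string one edit away from the input and pick the known word that is most frequent. The tiny `TOY_TEXT` corpus is made up for this example, so the counts mean nothing. A real corrector would count words from a large body of text.

```python
import re
from collections import Counter

# Made-up toy corpus; it only exists to give the sketch some word counts.
TOY_TEXT = ("fred went for a walk in the city and stopped at a flower shop "
            "the walk was long")
WORDS = Counter(re.findall(r"[a-z]+", TOY_TEXT))

def edits1(word):
    """All strings one delete, transpose, replace or insert away from word."""
    letters = "abcdefghijklmnopqrstuvwxyz"
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [left + right[1:] for left, right in splits if right]
    transposes = [left + right[1] + right[0] + right[2:]
                  for left, right in splits if len(right) > 1]
    replaces = [left + c + right[1:] for left, right in splits if right for c in letters]
    inserts = [left + c + right for left, right in splits for c in letters]
    return set(deletes + transposes + replaces + inserts)

def known(words):
    """Keep only the candidates that appear in the corpus."""
    return {w for w in words if w in WORDS}

def correction(word):
    """Prefer the word itself if known, then known words one edit away."""
    candidates = known([word]) or known(edits1(word)) or {word}
    return max(candidates, key=lambda w: WORDS[w])

for w in ["wauk", "citty", "stoped"]:
    print(w, "->", correction(w))
```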

Posted: Dec 9, 2011

[ # 13 ]

Laura Patterson

Senior member

Total posts: 250

Joined: Oct 29, 2011

E-mail Laura

That is very similar to the approach that I have taken with the exception that mine is written in Ruby.

Posted: Dec 9, 2011

[ # 14 ]

Merlin

Guru

Total posts: 1081

Joined: Dec 17, 2010

E-mail Merlin

If you haven’t done it, you should set the spellcheck=“true” attribute on your input box.

Posted: Dec 9, 2011

[ # 15 ]

Laura Patterson

Senior member

Total posts: 250

Joined: Oct 29, 2011

E-mail Laura

Posting on an iPhone sucks, don’t you know?
