Article to share: Chat or What: Approaching Text Normalization in Chats and Social Networks

Thought this article might be interesting:


User-generated content in all of its various forms (chat speak, text speak, phone speak, Twitter speak, and I’m-an-illiterate-teenager speak) is among my pet peeves when it comes to chatbots.

I suppose it has its place where brevity is valued, but the problem is that, like colloquial slang, it’s ever-growing. Add to that the fact that there are an infinite number of ways to misspell a word, plus the human propensity to test the limits of any system to its breaking point, and it’s easy to conclude that it’s simpler to require correct spelling and to exclude the various creative and impressive ways people attempt to communicate.

The largest and most up-to-date computers, the ones scanning our emails and snooping on our text messages, might find it desirable to turn all user-generated content into something a human can read, but at this level, it’s just a headache.


