Badwords list

How can I detect bad words using AIML?
E.g. a user may say the word SEX in different positions: in the middle of the sentence, at the end of the sentence, at the beginning, or as just one word: “sex”.
I created AIML to do so:

  <pattern>sex *</pattern>
  <pattern>_ sex</pattern>
  <pattern>_ sex *</pattern>

But it is so frustrating to write this for every word.
Is there a shorter way of doing so?


  [ # 1 ]

I have an AIML file that I use for this purpose, called profanity.aiml - It handles not only bad words, but also hate speech. It warns the user twice about bad language, then “kicks” the user from my bot’s site, sending them to Google in such a way that the “back” button won’t even work, and then banning their user id and IP address for 24 hours. It uses custom tags, though, so it’s quite likely that it won’t work with most chatbot programs. Let me work up a modified version that will only ignore the user, and I’ll either post a link, or upload it here.


  [ # 2 ]

My question is how do you block a specific word….
The current design is that if I receive a “BAD WORD” response from the bot, I take action.

How do you get the “BAD WORD” response for every sentence that contains the word SEX?
The above AIML works fine, but I would like to reduce it to ONE category only.
Is it possible with AIML?



  [ # 3 ]

Unfortunately, it’s not possible to do that with only one category. The reason for this is that with AIML, the wildcards (_ and *) require at least one word for them to be valid. In other words, the pattern “* sex” needs at least one word before the word sex in order to be a valid match. Thus, at least four categories are needed, though they can be <srai>‘d into a single category. These four categories would have the following patterns:

sex
* sex
sex *
* sex *

If you want to also match the underscore wildcard, it will cancel out three of the above patterns, since those will never match. Therefore, if you want to use underscores, the four patterns are:

sex
_ sex
sex _
_ sex _

Now, obviously, there are a large variety of options for additional categories here. Also, if you intend to filter out more than the word “sex”, you have to add at least four more categories for each word you want filtered, but there’s really no other way to handle this with AIML.
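
To see why the bare keyword needs its own category, here is a minimal Python sketch of the matching rule described above. It is a simplified stand-in, not a real AIML interpreter: it assumes the input has already had its punctuation stripped, it treats "*" and "_" identically, and it ignores the priority ordering between them. The function names are made up for illustration.

```python
import re

def pattern_to_regex(pattern):
    """Translate a simplified AIML 1.x pattern into a regular expression.

    Each wildcard (* or _) stands for one or more whole words, never zero.
    """
    parts = []
    for token in pattern.upper().split():
        if token in ("*", "_"):
            parts.append(r"\S+(?: \S+)*")  # one or more words
        else:
            parts.append(re.escape(token))
    return re.compile("^" + " ".join(parts) + "$")

def matches(pattern, sentence):
    """Return True if the whole sentence matches the pattern."""
    normalized = " ".join(sentence.upper().split())
    return pattern_to_regex(pattern).match(normalized) is not None

# The four keyword patterns from the post above.
PATTERNS = ["SEX", "_ SEX", "SEX _", "_ SEX _"]

def matching_patterns(sentence):
    """List every keyword pattern that matches the sentence."""
    return [p for p in PATTERNS if matches(p, sentence)]
```

With this sketch, the single word on its own is matched only by the bare pattern, and each of the other three positions is matched by exactly one of the wildcard patterns, which is why none of the four can be dropped.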


  [ # 4 ]

OK…got it.
I am not sure, though: what is the difference between “_” and “*”?


  [ # 5 ]

There are really two answers to this question.

One is that you can think of AIML as a kind of “assembly language” for AI.  The kinds of operations that you can do in AIML are primitive, so you need more instructions to do something you might do with fewer in a higher-level language.  I had always envisioned, and it has come true to some extent, that there would be tools to create more complex operations in fewer keystrokes, which would compile down to the more primitive AIML.  If I recall correctly, the Sitepal AIMC system for AIML includes exactly the feature you want: a method to specify a keyword and its associated response, and it generates the four keyword categories automatically.
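
As a rough illustration of that “compile down” idea, here is a small Python sketch that emits the four keyword categories for one word. It is not how the Sitepal tool works (I don't know its internals); the function name `keyword_categories` and its parameters are invented for this example, and the three wildcard categories simply `<srai>` to the bare keyword category.

```python
from xml.sax.saxutils import escape

def keyword_categories(keyword, response, wildcard="_"):
    """Return AIML text with the four categories that catch one keyword.

    The bare keyword category carries the response; the three wildcard
    categories redirect to it with <srai>.
    """
    word = keyword.upper()
    target = escape(word)
    blocks = [
        "<category>\n"
        f"  <pattern>{target}</pattern>\n"
        f"  <template>{escape(response)}</template>\n"
        "</category>"
    ]
    wildcard_patterns = (
        f"{wildcard} {word}",             # keyword at the end
        f"{word} {wildcard}",             # keyword at the beginning
        f"{wildcard} {word} {wildcard}",  # keyword in the middle
    )
    for pattern in wildcard_patterns:
        blocks.append(
            "<category>\n"
            f"  <pattern>{escape(pattern)}</pattern>\n"
            f"  <template><srai>{target}</srai></template>\n"
            "</category>"
        )
    return "\n".join(blocks)

if __name__ == "__main__":
    # Generate the categories for a whole word list in one go.
    for bad in ["dog", "cat"]:  # placeholder words
        print(keyword_categories(bad, "BAD WORD"))
```

The output still has four categories per word; the script only saves the typing.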

The second answer is that some AIML interpreters include a preprocessor configuration file (which is in fact part of the AIML spec; the file might be called “startup” or “config”).  The AIML preprocessor includes a step called “normalization” where it removes punctuation, expands contractions, corrects some spelling mistakes, and in general performs arbitrary substitutions on the input as specified by the botmaster.  It is possible to use the preprocessor to replace all badwords with a single keyword, such as BADWORD.  Then the AIML itself has to look for only the one keyword BADWORD.

In Pandorabots this file is called and is globally configured for the entire server.  The only way to access it at present is to subscribe to Pandorabots dedicated hosting service. 
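
If your interpreter does not expose that configuration, the same substitution can be done in your own code before the input reaches the bot. The following Python sketch only imitates the idea; it is not the Pandorabots file format or any interpreter's real normalizer, and the word list and the `normalize` function are placeholders made up for the example.

```python
import re

# Placeholder list; a real deployment would load its own words.
BADWORDS = {"darn", "heck"}

def normalize(text, badwords=BADWORDS, keyword="BADWORD"):
    """Strip punctuation, upper-case, and replace listed words with one keyword.

    Only whole words are replaced, so a listed word inside a longer
    word is left alone.
    """
    cleaned = re.sub(r"[^\w\s']", " ", text)  # punctuation becomes spaces
    words = cleaned.upper().split()
    banned = {w.upper() for w in badwords}
    return " ".join(keyword if w in banned else w for w in words)
```

After this step the AIML needs the four categories only once, for the keyword BADWORD, no matter how many words are on the list.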


  [ # 6 ]

This might also help in some way or another:


  [ # 7 ]

Thank you, Richard and Hans Peter, for the additional information. It’s always best to get one’s knowledge from multiple sources. :)


  [ # 8 ]
Hans Peter Willems - Mar 20, 2011:

This might also help in some way or another:

Interesting tool!

Anyway, if there is no tag such as <contains>, then there really is a problem.
Because let’s say I need to detect when a user says the word “DOG”,

so I need to create 4 AIML categories for it….


If the AIML won’t detect the word “DOGS”, I need to create another 4 categories for that….:\
And what about DOGGY…which has the same root and meaning….so another 4 categories…
3 words with the same meaning and 12 categories….

Is there any way to detect these words with 4 categories only?
Or do I have to create top-level code above the AIML, as suggested here?

Thank you!


  [ # 9 ]

You have to create 4 categories for each word I’m afraid.
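
That holds within AIML itself. If a layer of code above the AIML is an option, as raised in the question, one possible sketch in Python is to map each variant to its root word before the text is sent to the bot, so that only the root needs its four categories. The mapping and the `canonicalize` function are made up for illustration, and the sketch assumes punctuation has already been removed.

```python
# Hand-written variant table; every entry here is an example, not a standard list.
VARIANTS = {"DOGS": "DOG", "DOGGY": "DOG", "DOGGIE": "DOG"}

def canonicalize(text, variants=VARIANTS):
    """Replace each known variant with its root word, leaving other words alone."""
    words = text.upper().split()
    return " ".join(variants.get(w, w) for w in words)
```

The table has to be maintained by hand, but each new variant costs one entry instead of four categories.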


  [ # 10 ]
Steve Worswick - Mar 30, 2011:

You have to create 4 categories for each word I’m afraid.

Thank you all!

