

bot compiler

I’m making a game set in WWII.

Of course the NPCs shouldn’t know about Richard Nixon or “The West Wing” series, or even what a TV is for that matter.

So I’m going through the installed dictionaries, quibbles, etc., and weeding out anything after 1942. This is pretty slow going, since I’ve no idea when half these movies were made.

Additionally, I’m sure someone who was an adult in France in the 1940s knows a lot more about French movies of the 1930s than I do.

All in all, this smells of ‘human shouldn’t be doing this’.

So, I’m thinking about writing a semantic web application that takes a config file about the character (current date, age, nationality, education, etc.) and attempts to make files that are correct for them by browsing sources like IMDB.

That sounds like a large enough project to do some thinking before doing it. So I’m posting a note here to see if I can get feedback on things people would want included, suggestions, problems, etc.
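As a starting point for discussion, here is a minimal sketch of what such a character config and the simplest possible period filter might look like. Everything in it is an assumption for illustration: the `CharacterProfile` fields, the `is_period_appropriate` helper, the made-up character, and the hand-typed list standing in for data that would really come from a source like IMDB.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CharacterProfile:
    """Hypothetical character config; the field names are illustrative only."""
    name: str
    current_date: date  # "now" in the game world
    age: int
    nationality: str
    education: str


def is_period_appropriate(profile: CharacterProfile, release_year: int) -> bool:
    """A work can only be known if it existed by the character's 'now'."""
    return release_year <= profile.current_date.year


# Hand-typed stand-in for scraped data: (title, release year).
WORKS = [
    ("La Grande Illusion", 1937),
    ("Gone with the Wind", 1939),
    ("The Shining", 1980),
    ("The West Wing", 1999),
]

# A made-up NPC for the WWII game.
npc = CharacterProfile(
    name="Marie",
    current_date=date(1942, 6, 1),
    age=35,
    nationality="French",
    education="secondary",
)

known_titles = [title for title, year in WORKS if is_period_appropriate(npc, year)]
print(known_titles)
```

The date cutoff is only the first and crudest filter. Age, nationality and education are carried in the profile but not used yet; they would feed whatever decides which of the period-appropriate items this particular character actually knows.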



  [ # 1 ]

> Meet VeeMe: The Virtual Agent Programmed to Think Like Me (29 Nov 12)

Yes please, I want one!  There has been a long history of systems trying to work off the digital footprint, ultimately leading to mind uploading, but with a spotty success rate at best. is an example of one such system, and is another.  Barrett Brown has even written of shadowy government Internet sockpuppet programs developing persona management software.  The subject could be a living person, a historical person, or a fictional character.  I would like to extract characters from novels, and then be able to flesh out their backgrounds with just such a tool, or module, as you suggest.


  [ # 2 ]

Well, I think you’re imagining a more complex, perhaps more puffy blue sky project than I am.

I’m imagining a reasonable scale tool that knows how to select subsets of ‘world knowledge’ that are appropriate for a character, so yes, if you make an Abraham Lincoln bot it doesn’t know about video games but has a surprisingly encyclopedic knowledge of the Bible, while the Will Wright bot has most of the IMDB videogame database in its production set.

Now a counterexample:

I’d expect Angela from “Tom Loves Angela” to “know about Paris”. Well, does that mean she’d know the Mona Lisa is in the Louvre? Certainly. And that the Louvre was built by Philip II as a fortress? Perhaps. And that it’s at 48.860339N 2.337599E? Not a chance.

Angela’s life seems wrapped up in using sex and gender games to get what she wants. As part of that, she walks the path of gender conformity pretty closely. That includes not knowing something as nerdy as the coordinates of the Louvre.

Do I intend to get anywhere CLOSE to being able to reason that out? No way. That’d take a psychological model that’s beyond the state of the art, or a huge catalog of ‘things’ classified through HITs by gender stereotype, nerdiness level, and a bunch of other data, as well as a LOT of work to put it together.



  [ # 3 ]

You might consider making a Wikipedia scraper that recursively goes through links on a topic (filter by the date of the link reference to get historically relevant data).  I have seen these before (linked somewhere on Reddit) and they work pretty well.

I found this (Scraper-Reference) which is the same idea.

Extracting the “knowledge” would be more challenging (believe you me!), but at least you would have a “period appropriate” machine generated corpus to begin with.


  [ # 4 ]

I don’t have to scrape Wikipedia. Others far more capable have done it for me.

DBpedia extracts information from Wikipedia ‘info boxes’, the boxes in the upper right of many articles, and presents them as RDF triples, essentially the same thing as ChatScript facts.
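To make the triple-to-fact comparison concrete, here is a rough sketch that turns an RDF-style (subject, predicate, object) triple into a ChatScript-style fact string. The `example.org` URIs, the `local_name` and `to_fact` helpers, and the exact `( subject verb object )` output layout are assumptions for illustration, so check the ChatScript documentation before relying on the format.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF-style statement; all three parts are URIs in this sketch."""
    subject: str
    predicate: str
    obj: str


def local_name(uri: str) -> str:
    """Return the part of a URI after the last '/' or '#'."""
    tail = uri.rstrip("/").rsplit("/", 1)[-1]
    return tail.rsplit("#", 1)[-1]


def to_fact(triple: Triple) -> str:
    """Format a triple as an assumed '( subject verb object )' fact string."""
    parts = (local_name(part) for part in triple)
    return "( " + " ".join(parts) + " )"


# Placeholder URIs, not real DBpedia identifiers.
example = Triple(
    "http://example.org/resource/The_Shining",
    "http://example.org/ontology/director",
    "http://example.org/resource/Stanley_Kubrick",
)

fact = to_fact(example)
print(fact)
```

Dropping the namespace like this loses information (two ontologies can both define `director`), so a real converter would probably keep a prefix.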

Wikidata gathers large amounts of human entered data.

These are only two of a large number of data sources available on the semantic web.  One of the most complete listings of such data is at

A lot of knowledge of popular culture is encoded by IMDB’s API, and by the open source movie database.

Knowledge of geography can be extracted from GeoNames.

All of this data is described by ‘ontologies’, definitions of the relationships among items. Thus, somewhere out there is a notion of ‘director of’ for a film.

RDF data can be queried in such a way that it gets the period appropriate data. I do have to define what ‘period appropriate movie’ means for each encoding, query them, and apply some stochastic model to decide which ones are actually known by the bot.
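Here is a minimal sketch of the "period filter plus stochastic model" step, assuming the query results have already been flattened into (title, release year, country) tuples. The `familiarity` formula (a ten-year half-life, halved again for foreign films) is invented purely for illustration, as are the function names; a real model would need tuning against something.

```python
import random


def familiarity(release_year: int, now_year: int, same_country: bool,
                half_life: float = 10.0) -> float:
    """Hypothetical chance that the character knows a film.

    Films released after 'now' get 0.0; otherwise the chance halves for
    every `half_life` years of age, and halves again for foreign films.
    """
    age = now_year - release_year
    if age < 0:
        return 0.0
    chance = 0.5 ** (age / half_life)
    return chance if same_country else chance * 0.5


def select_known(films, now_year: int, country: str, rng: random.Random):
    """Keep each film with probability given by familiarity()."""
    known = []
    for title, year, film_country in films:
        chance = familiarity(year, now_year, film_country == country)
        if rng.random() < chance:
            known.append(title)
    return known


# Stand-in for query results: (title, release year, country code).
FILMS = [
    ("La Grande Illusion", 1937, "FR"),
    ("Gone with the Wind", 1939, "US"),
    ("The Shining", 1980, "US"),
]

rng = random.Random(1942)  # fixed seed so a given bot build is repeatable
print(select_known(FILMS, now_year=1942, country="FR", rng=rng))
```

Seeding the generator per character keeps the selection stable between builds, so a bot does not "forget" a film every time its files are regenerated.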

I’ve been talking today with some semantic web researchers I work with. They’re working on a sort of Google for the semantic web that lets one search the entire space, but it’s not quite finished yet.  Their scheme is just what I need, so I’m going to be waiting anxiously for them to finish it.

So I feel like it’s a reasonable, do-able project to build a toolkit that helps build bots.


  [ # 5 ]

Here’s an example of data for The Shining.

Understand this isn’t the raw data; it is a user friendly web interface display of it.

If you follow the link to get to the director’s page, you see links for ‘movie:director_of’ for a bunch of movies (names of things are URIs; you have to follow rdfs:label to get the human name).
You also see links for ‘foaf:made’ - foaf is an ontology for social relationships. So Wendy Carlos and Stanley Kubrick have a relationship because they worked together on A Clockwork Orange.

And, since ‘made’ and ‘director of’ are pretty much the same thing, unsurprisingly the list is pretty much the same.
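A small sketch of what "follow rdfs:label to get the human name" means in practice, using a hand-built list of triples rather than a live endpoint. The `rdfs:label` URI is the real one; the `example.org` URIs, the `director_of` predicate URI, and the helper names are placeholders I made up for the example.

```python
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
DIRECTOR_OF = "http://example.org/movie/director_of"  # placeholder predicate

KUBRICK = "http://example.org/director/1"  # placeholder URIs
SHINING = "http://example.org/film/1"
CLOCKWORK = "http://example.org/film/2"

# Hand-built stand-in for the raw data behind the web display.
TRIPLES = [
    (KUBRICK, RDFS_LABEL, "Stanley Kubrick"),
    (SHINING, RDFS_LABEL, "The Shining"),
    (CLOCKWORK, RDFS_LABEL, "A Clockwork Orange"),
    (KUBRICK, DIRECTOR_OF, SHINING),
    (KUBRICK, DIRECTOR_OF, CLOCKWORK),
]


def objects_of(subject, predicate, triples):
    """All objects o such that (subject, predicate, o) is in the data."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def label_of(uri, triples):
    """Human-readable name for a URI, falling back to the URI itself."""
    labels = objects_of(uri, RDFS_LABEL, triples)
    return labels[0] if labels else uri


directed = [label_of(film, TRIPLES) for film in objects_of(KUBRICK, DIRECTOR_OF, TRIPLES)]
print(label_of(KUBRICK, TRIPLES), "directed", directed)
```

Swapping `DIRECTOR_OF` for a `foaf:made` URI would walk the other relationship in exactly the same way, which is why the two lists come out nearly identical.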



  [ # 6 ]

Update - I’ve got a paid gig (working with ChatScript, yay!) so I’m ignoring this until it’s over. Hopefully by that time the semantic web folks will have their nifty tool in action.


  [ # 7 ]

Definitely should have some other levels, like snideness, honesty, humor, and persistence in getting the bot back on track, as well as the bot’s ‘job’ (e.g. it’s a chatbot that just wants to keep ’em talking, or an infobot that wants to steer people onto productive tracks, or an NPC that needs to stay in role, etc.).



  [ # 8 ]

is an example of one such system, and is another. (@eter9) is all over the news at the moment. (I’ve been unable to register so far.)

> Eter9 social network learns your personality so it can post as you when you’re dead

New site Eter9 promises digital immortality using a kind of artificial intelligence to scan your posts


  [ # 9 ]

- (@LifeNaut)

- (@eterni__me)

- (@eter9)

=> (@humaitech)

According to a flurry of news reports, Humai is a new player in this space… making even more fantastical claims than its predecessors: Company Aims To Bring Back The Dead Within 30 Years

