

Simple Knowledge Base Interface
 
 

Someone mentioned tapping into Wikipedia as a knowledge base here a while back and I lost track of the thread, so I’m going to post this here. It’s general enough to warrant a new thread, I think. RICH taps into many different online KBs, as well as having the ability to read webpages and use Internet searches, and of all these interfaces, the simplest one that I have found is Duck Duck (DuckDuckGo). The XML return is simple and the JSON is even better. There is no special formatting for queries and the API is open (at least for now).
The following query

http://api.duckduckgo.com?q=keith+richards&format=json

demonstrates the interface. Change format=json to format=xml to see the XML output.
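For anyone who wants to try the query from code, here is a minimal Python sketch. It assumes the JSON reply carries the fields Heading, AbstractText, AbstractSource and AbstractURL, which is how I understand the output; check them against a live reply before relying on them. The function names are made up for illustration, and a trailing slash is added to the endpoint so the request path is well formed.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_ROOT = "http://api.duckduckgo.com/"  # endpoint from the post above


def build_query_url(question, fmt="json"):
    """Build the request URL; urlencode turns spaces into '+'."""
    return API_ROOT + "?" + urlencode({"q": question, "format": fmt})


def extract_abstract(payload):
    """Pull the short summary fields out of a decoded JSON reply.

    The key names are assumptions about the reply format; missing keys
    fall back to an empty string instead of raising.
    """
    return {
        "heading": payload.get("Heading", ""),
        "abstract": payload.get("AbstractText", ""),
        "source": payload.get("AbstractSource", ""),
        "url": payload.get("AbstractURL", ""),
    }


def ask(question, timeout=10):
    """Fetch one answer and return its summary fields (needs network access)."""
    with urlopen(build_query_url(question), timeout=timeout) as reply:
        return extract_abstract(json.loads(reply.read().decode("utf-8")))
```

Calling ask("keith richards") should hand back a small dict with the abstract and where it came from, which is usually all a bot needs for a "who is" question.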

If I ask an AI “who is Keith Richards”, and it returns a 3 paragraph dissertation including the history of British rock, I can be fairly certain that this is a machine. Logic in RICH creates an abstract in order to make the answer more compatible with taught responses.

Duck Duck does this for you.  Basically they do what a lot of the other code in RICH does (Who knew LOL)
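To show what "creating an abstract" can mean at its very simplest, here is a rough Python sketch that keeps only the leading sentence(s) of a long answer. This is not RICH's actual logic, just a naive stand-in; the function name, the sentence-splitting rule and the length cap are all assumptions for illustration.

```python
import re


def make_abstract(text, max_sentences=1, max_chars=200):
    """Keep the leading sentence(s) of a long answer, capped at max_chars."""
    # Naive split: a sentence ends at . ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    abstract = " ".join(sentences[:max_sentences])
    if len(abstract) > max_chars:
        # Cut at the last word boundary that fits, then mark the cut.
        abstract = abstract[:max_chars].rsplit(" ", 1)[0] + "..."
    return abstract


# Illustrative input only.
LONG_ANSWER = (
    "Keith Richards is an English musician. "
    "He co-founded the Rolling Stones. "
    "He was born in 1943."
)
```

A real abstracter has to cope with abbreviations, lists and quoted speech, which this split will get wrong, but it is enough to turn a three-paragraph answer into a one-liner.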

The interface requirements are so basic that even novice coders should be able to tap in, but if anyone needs code written, let me know and I’ll throw a simple class together and put it up.

 

 
  [ # 1 ]

> http://www.meta-guide.com/home/application-programming-interfaces

Web APIs, I love web APIs.  I use Yahoo! Pipes for all web APIs.  When I have a decent dialog system web API available, I don’t have any trouble integrating it with Pipes.  Your most basic knowledgebase format is a text file, which can be saved as CSV.  Your next most basic format is any spreadsheet.  Pipes is great because it has a CSV module to integrate virtually any spreadsheet, hosted on Google Drive for instance.  Unfortunately, Yahoo! Pipes chokes on large and complex integrations; so, lately I’ve been looking into iPaaS (integration Platform as a Service), though have yet to find just what I’m looking for….

> http://www.meta-guide.com/home/ipaas-integration-paas

See also AI Middleware:

> http://www.meta-guide.com/home/artificial-intelligence-middleware

 

 
  [ # 2 ]

Vince, I altered the URL that you posted because the forum software here doesn’t like spaces, and as a result broke your link. It’s all better now, though. :)

 

 
  [ # 3 ]
Dave Morton - Oct 17, 2012:

Vince, I altered the URL that you posted because the forum software here doesn’t like spaces, and as a result broke your link. It’s all better now, though. :)

Thanks Dave

 

 
  [ # 4 ]

I am currently building a data miner, and have considered tying into some of the exhaustive resources that already exist, and I certainly will at some point. Prior to doing so, however, I need a grammar parser that can deal with what it finds. Different resources are good for different things, right? I mean, a knowledge base like Wikipedia is formatted paragraph style, and difficult to granularize, whereas something like Duck Duck sounds more componentially categorical.

I feel that what is needed is a grammar parsing engine powerful enough to break down a Wikipedia-style resource, and at the same time, to assemble elements of a Duck Duck-style resource. For that matter, it seems that this same syntactic limitation is what primarily holds chat bots back from more gracefully handling prompt/response. Whether dealing with archived data or streaming conversant data, an effective program has to break down complex syntax into simple elements, and then formulate an original remix with some variation and some room to contribute to conversant direction through decisional options.

How about nailing down such a grammar master and then tapping into different APIs on a tiered basis? Going in, you could hit something categorical, like Duck Duck, as a primary, and then refine the results through another pass into something more abstract, like Wikipedia. Coming out, you could do the abstract first, and reverse the process, or something like that.
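A tiered arrangement like this can be sketched in a few lines of Python. The version below is only the simplest form, where the first tier that returns anything wins; the refining second pass is not shown. The two source functions are stand-ins with canned data so the sketch runs without network access, and every name here is hypothetical.

```python
from typing import Callable, List, Optional, Tuple

# A source is a (name, lookup function) pair; lookup returns None on a miss.
Source = Tuple[str, Callable[[str], Optional[str]]]


def tiered_lookup(question: str, tiers: List[Source]) -> Optional[Tuple[str, str]]:
    """Ask each source in order; return (source_name, answer) from the first hit."""
    for name, lookup in tiers:
        answer = lookup(question)
        if answer:  # None or "" means this tier had nothing
            return name, answer
    return None


def categorical_source(question: str) -> Optional[str]:
    """Stand-in for a short, categorical source such as Duck Duck."""
    facts = {"keith richards": "English musician, guitarist of the Rolling Stones."}
    return facts.get(question.lower())


def abstract_source(question: str) -> Optional[str]:
    """Stand-in for a paragraph-style source such as Wikipedia."""
    return "A longer, paragraph-style article about " + question + "."


TIERS = [("categorical", categorical_source), ("abstract", abstract_source)]
```

Reversing the order for the "coming out" direction is just a matter of passing the list of tiers the other way round.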

 

 
  [ # 5 ]

Jeff,

That’s exactly what RICH attempts to do. It is not always completely successful; it still spits out some “zingers” of the “I’m a daisy” variety. If the AI encounters an interrogatory that it does not already have an answer to, it tries to find and compile an abstract from many sources. After encountering Duck Duck, it is absolutely going to the top of the tier!

Vince

 

 
  [ # 6 ]

That is interesting. I am trying to build a grammar parser using a SQL DB for the core. I’m trying to work out the normalization process now, where my tables have no break in Codd’s principles. With grammatical syntax, this is very difficult, but it is a healthy, though grueling, process for me to go through.

I haven’t played with any RICH applications, but I’ll try to find something to dink with. I was looking at the Reed-Kellogg sentence diagrammer though, and wish I could see under that hood.

http://1aiway.com/nlp4net/services/enparser/default.aspx

Even this breaks down. I hope my parser can do better. The problem is that I am not satisfied with the creation alone. I want my logic agent to be able to figure out the order through a series of question and answer interludes, and build up its own set of tables and relations. I foresee normalization being a huge problem as I attempt to get it to do this.

Check out this sentence diagrammer. It is impressive.

 

 
  [ # 7 ]

I am trying to build a grammar parser using a SQL DB for the core. I’m trying to work out the normalization process now, where my tables have no break in Codd’s principles. With grammatical syntax, this is very difficult, but it is a healthy, though grueling, process for me to go through.

Have you considered operation speed?

 

 
  [ # 8 ]

Yes, Jan. I have been considering speed, and I hope that AJAX calls can help there, bringing data sets in from MySQL, converting them to JSON format, and modifying the JSON asynchronously as required.

What do you think of that approach? You sound dubious.

I’m a total green boy at all this, but am having some fun, and learning a lot.

 

 
  [ # 9 ]

It all depends on how you set things up and how you use it, but from my experience, relational databases aren’t that good at working with unstructured data.

Take WordNet, for instance: there are several implementations available, and the SQL version is slower than the flat-file versions.

 

 
  [ # 10 ]
Jan Bogaerts - Oct 21, 2012:

It all depends on how you set things up and how you use it, but from my experience, relational databases aren’t that good at working with unstructured data.

Take WordNet, for instance: there are several implementations available, and the SQL version is slower than the flat-file versions.

Understood. However, as an English teacher and a grammar guy, I don’t view any English expression as unstructured, although there are too many variational potentialities to allow for ONLY a strictly structured data environment. The rules of syntax however, are amazingly consistent.

To approach this problem effectually I have to think first about representing the redundant structure of English grammar NON-redundantly in an RDB (BIG NORMALIZATION CONUNDRUM), and secondly, about how to integrate that with a flat file format to re-introduce, or re-allow, the redundancy, and for speed issues as well. The RDB will have to talk to a flat file to first map it and then populate it with data in an initialization phase on startup. Then, there will have to be cross-format asynchronous calls between the JSON data and RDB data matrices in order to provide the ordering and updating integrity from the RDB on one hand, and the speed and redundancy needs from the JSON on the other.

In this way, I don’t think that there really is a SIMPLE knowledge base interface. There are simple DATABASE interfaces, but to access data is a diminutive thing compared to having knowledge. The facts are still the facts, but I still strongly believe that the difference between data and knowledge is what an information processor can do with the facts, and one way to measure that is syntactically.

In response to Vincent’s initial post, I don’t think any grammar parser/database choreography as yet allows us to call what we produce “knowledge” from a strictly AI perspective. A digital logic system simply MUST be able to handle the syntax in both input and output scenarios, not simply fake it by drawing on massive banks of preformatted pattern responses. There has to be a deeper, rule-based granularity behind both input analysis and output formulation in order for us to really introduce terms like knowledge and intelligence into the conversation.

 

 

 
  [ # 11 ]
Jan Bogaerts - Oct 21, 2012:

I am trying to build a grammar parser using a SQL DB for the core. I’m trying to work out the normalization process now, where my tables have no break in Codd’s principles. With grammatical syntax, this is very difficult, but it is a healthy, though grueling, process for me to go through.

Have you considered operation speed?

Hey Jan

I got the conversion code put together to change the RDB into JSON. Check it out under “Teach Harri” and click on the last button in the RDB controller list. Every time you click it, it re-writes the SQL tables to JSON and saves the file in the directory.
http://www.projectenglishtv.com/schl/hari/
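For readers who want to try the same idea, here is a small Python sketch of dumping every table in a relational database to one JSON file. It is not the code behind the button above: it swaps in SQLite (from the Python standard library) for MySQL so it runs anywhere, and the table and column names in the demo are invented.

```python
import json
import sqlite3


def tables_to_dict(conn):
    """Return {table_name: [row_dict, ...]} for every user table."""
    conn.row_factory = sqlite3.Row  # rows become addressable by column name
    names = [r["name"] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table' "
        "AND name NOT LIKE 'sqlite_%' ORDER BY name")]
    # Table names come from the catalog, not from user input.
    return {n: [dict(row) for row in conn.execute('SELECT * FROM "%s"' % n)]
            for n in names}


def dump_to_json(conn, path):
    """Re-write the whole database to a JSON file at path."""
    with open(path, "w", encoding="utf-8") as out:
        json.dump(tables_to_dict(conn), out, indent=2)


def demo_db():
    """Build a tiny in-memory database (hypothetical schema) to test with."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE word (id INTEGER PRIMARY KEY, text TEXT, pos TEXT)")
    conn.executemany("INSERT INTO word (text, pos) VALUES (?, ?)",
                     [("up", "adverb"), ("car", "noun")])
    conn.commit()
    return conn
```

Calling dump_to_json(demo_db(), "tables.json") writes one JSON object keyed by table name, which is a convenient shape to load on the browser side.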

I want to do some write functionality, and some formatting so that the json can be changed and saved on the fly. I haven’t gotten to that yet, but this is a start.

Everything works cross-browser, but the CSS is a little funky in IE, just at the bottom of the teaching div. Everything else looks OK, but Firefox gives a better display.

 

 
  [ # 12 ]

I keep posting on this thread because it is so relevant to what I am doing right now, and reading and writing helps me think ahead. Marcus was making some comments about APIs like Pipes and iPaaS that support a wide variety of data formats, but that may have differing foibles here and there, one from the other. I don’t know how to compare them myself, because I am pretty new to all this and haven’t gotten any farther than you can see on Harri’s interface. I suspect that in many ways I am re-inventing the wheel. I trust, however, that a better wheel can be made through a combination of my unique theoretics and what I know to be my almost ridiculous ignorance regarding existing data interface options.

Once my application gets a little farther along, I will begin to expose myself to other people’s solutions. I know from experience that a rush to production has compromised a lot of what is available on any market, and I don’t want to curb my own zeal for what might be, or lower my expectations, through too-early exposure to the productions of other minds. But in this state of temporarily self-imposed ignorance, I find your comments and advice stimulating and insightful, and wish there was more feedback from the community.

If you look at the application so far though, you can see that it is beginning to integrate a SQL database with a JSON-formatted flat structure. I would like to be able to convert the same data fluently between different formats and also have assemblative dialogues to restructure each independently, and then to re-format any of the others based on the perspectives gained through working on the current form. Fluent convertibility and cross-mutability between CSV, XML, SQL, and JSON would, I think, be a very helpful data interface, especially with a query-generating interface that did the heavy lifting and allowed your cranial capacities to focus on WHAT to do with the data instead of HOW to do it, which I think speaks to Vincent’s initial reasons for starting this thread.

I expect to be able to upload a SQL version of something like WordNet, drill into it through a window into the relations, convert selected aspects into JSON, fool with different query formats, back-adjust the SQL relationships based on what is learned, re-integrate JSON changes into the “bettered” SQL, and then do that same process through fiddling with the same data in other formats as well.
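As a small illustration of what converting the same data between formats looks like, here is a Python sketch that takes rows from CSV to JSON and back again using a plain list of dicts as the common in-memory form. It covers only two of the four formats mentioned, the function names are made up, and the sample data is invented.

```python
import csv
import io
import json


def csv_to_rows(text):
    """Parse CSV text (first line is the header) into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def rows_to_json(rows):
    """Serialize the rows as a JSON array of objects."""
    return json.dumps(rows, indent=2)


def json_to_rows(text):
    """Parse a JSON array of objects back into a list of dicts."""
    return json.loads(text)


def rows_to_csv(rows):
    """Write the rows back out as CSV, using the first row's keys as header."""
    if not rows:
        return ""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()),
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


SAMPLE_CSV = "word,pos\nup,adverb\ncar,noun\n"  # invented sample data
```

Because every format passes through the same list-of-dicts form, adding XML or SQL later means writing one reader and one writer per format rather than a converter for every pair.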

Creating flexible query interfaces for table data blending is where this will get tricky, and powerful. I am a firm believer in the possibility of a system becoming more than the sum of its parts, and that with structure and order, it is inevitable to so become. But there has to be a richness in the design where it cannot be easily seen from end to end, or even from end to middle. Flexibility, variability, interactivity and standardization will lead, I am confident, to an interesting place.

Anyway, take a look and tell me your thoughts so far.

 

 
  [ # 13 ]
Jeff Rogers - Oct 26, 2012:

Anyway, take a look and tell me your thoughts so far.

Unfortunately I got there on a day when HARI was having “brain surgery”. I’m sure the “patient” will be OK LOL
A couple of thoughts on your overall posting(s): when parsing grammar, remember that for the most part the only entities that speak in a grammatically correct fashion are robots. Some of the funniest moments come from seeing a chatbot handle this type of conversation.

User: Whats up?
Bot: Up is a direction, being the opposite of down.

Or, even if the bot recognizes that “whats up” is slang, depending on how it is used contextually, there is no switch in the context of the response:

User: Whats up?
Bot: How are you?


Certainly grammatically and logically correct, but not really a “human” response.
Looking forward to seeing more!

Vince

 

 
  [ # 14 ]
Marcus Endicott - Oct 17, 2012:

> http://www.meta-guide.com/home/application-programming-interfaces

Web APIs, I love web APIs.  I use Yahoo! Pipes for all web APIs.  When I have a decent dialog system web API available, I don’t have any trouble integrating it with Pipes.  Your most basic knowledgebase format is a text file, which can be saved as CSV.  Your next most basic format is any spreadsheet.  Pipes is great because it has a CSV module to integrate virtually any spreadsheet, hosted on Google Drive for instance.  Unfortunately, Yahoo! Pipes chokes on large and complex integrations; so, lately I’ve been looking into iPaaS (integration Platform as a Service), though have yet to find just what I’m looking for….

> http://www.meta-guide.com/home/ipaas-integration-paas

See also AI Middleware:

> http://www.meta-guide.com/home/artificial-intelligence-middleware


Thanks for pointing these out, Marcus. I agree that web APIs are great!

 

 
  [ # 15 ]

Jeff, does the bot consider that, in this case, ‘up’ could potentially be an adverb modifying ‘is’? Normally people ask “what is a (string)?” (treating ‘up’ as simply a generic string) rather than simply “what is (string)?”, although non-native English speakers usually drop the required modifiers. Example: “what is car” sounds a bit silly compared to “what is a car”, where you’re asking about the class of objects known as ‘car’. By oversimplifying (leaving out the ‘a’) we don’t know which is meant. So does Harri perform this depth of analysis? You could add probability into the mix. For example, if 99.9% of the time “whats up” was successfully interpreted as semantically equivalent to “what’s new with you?”, and given that we’re not specifying that we are interested in a class of objects (the indefinite article ‘a’ is left out), we could correctly make that assumption. However, as Vincent points out, context should override (if present), for example if the user said “I am going to ask you to define a set of words, OK, here we go… whats up”. It will of course have to assume that “whats” is poor grammar for “what is” or “what’s”, but that’s beside the point.

So the full set of permutations for “whats up” is:

1. Assuming the user was lazy and that “whats” == “what’s” (or “what is”), they are asking what thing is currently in the state of up.

2. They are stating that, in fact, there is a thing (called ‘whats’) that they are telling you is in a state called ‘up’. That would be without the ‘?’ mark, of course, but even with the ‘?’ it could still be a statement that they are asking to be verified: ‘whats up?’, as in, is it true that “what” is “up”? Silly, but how does the AI know that is not the case? :)

3. They are asking you “anything new happen lately?” (This is the ‘hard-coded’ mapping that we all take for granted. You could tell your bot directly, OR have it realize, using probability, that in 99% of previous cases it was successful when this mapping was assumed, so go with it… UNLESS, of course, the current context overrides.)
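The three readings above, plus the rule that history decides unless context overrides, can be sketched in Python like this. The counts are invented to match the 99% figure, the key names are hypothetical, and how a bot would actually detect the context hint is left out entirely.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Reading:
    label: str
    tried: int = 0       # times this reading was assumed
    succeeded: int = 0   # times the assumption turned out right

    @property
    def success_rate(self) -> float:
        return self.succeeded / self.tried if self.tried else 0.0


def choose_reading(readings: Dict[str, Reading],
                   context_hint: Optional[str] = None) -> Reading:
    """Context wins if it names a known reading; otherwise use the best track record."""
    if context_hint in readings:
        return readings[context_hint]
    return max(readings.values(), key=lambda r: r.success_rate)


# Invented history for the three readings of "whats up".
WHATS_UP = {
    "question_about_up": Reading("what thing is in the state of up?", 1000, 5),
    "statement_to_verify": Reading("is it true that 'what' is 'up'?", 1000, 5),
    "greeting": Reading("anything new happen lately?", 1000, 990),
}
```

With no hint, choose_reading(WHATS_UP) picks the greeting; if the user has just said they want words defined, passing context_hint="question_about_up" forces the literal reading instead.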

As for using a relational DB, I’m with Jan: probably not the best approach for NLP (for deep-parsing NLP for sure, anyway).

I see he is having serious brain surgery… looking forward to chatting with him later. Hope he makes it!

 
