
Can CS handle concepts as Big Data?
 
 

Hello again. I am hoping to experiment with Big Data and CS and want to know if it’s possible. My goal is to use the Wikipedia categories as a massive, prebuilt hierarchy of concepts. Wikipedia categories would become CS concepts, and pages would become non-concept members. To take the brief, if exotic, example (they go together!) of Phoenicianism, the Wikipedia hierarchy

https://en.wikipedia.org/wiki/Category:Phoenicianism

would convert to the CS concept

concept: ~phoenicianism (~Kataeb_Party ~Lebanese_Front ~Phoenicianists Phoenicianism Al_Tanzim_Al_Amal Guardians_of_the_Cedars Kataeb_Party Kataeb_Regulatory_Forces Lebanese_Christian_Nationalism\:_The_Rise_and_Fall_of_an_Ethnic_Resistance Lebanese_Renewal_Party)

Then if someone mentioned the Kataeb Party, your bot could ask how you feel about Phoenicianism or, going the opposite direction down the tree, ask what you think of Kataeb Party politicians. With a sufficiently rich dataset, simply alternating references to the category above and the category below some key part of each volley could make for powerfully relevant conversation.
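
As a rough illustration of the conversion described above, here is a minimal Python sketch that turns one category, with its subcategories and pages, into a single `concept:` line. The function names (`to_cs_token`, `build_concept`) and the input lists are hypothetical; the escaping of `:` simply mirrors the example line in this post, and nothing here is taken from CS itself or from a real Wikipedia API.

```python
import re

def to_cs_token(title: str) -> str:
    """Turn a Wikipedia title into a single whitespace-free token."""
    # Drop invisible directional marks that often ride along in copied titles.
    cleaned = title.replace("\u200e", "").replace("\u200f", "").strip()
    # Escape colons, mirroring the escaped colon in the example concept line.
    cleaned = cleaned.replace(":", "\\:")
    # Collapse each run of whitespace into a single underscore.
    return re.sub(r"\s+", "_", cleaned)

def build_concept(category, subcategories, pages):
    """Build one 'concept:' line: subcategories become ~members, pages plain members."""
    name = "~" + to_cs_token(category).lower()
    members = ["~" + to_cs_token(s) for s in subcategories]
    members += [to_cs_token(p) for p in pages]
    # Remove duplicates while keeping first-seen order.
    unique = list(dict.fromkeys(members))
    return f"concept: {name} ({' '.join(unique)})"

line = build_concept(
    "Phoenicianism",
    ["Kataeb Party", "Lebanese Front\u200e"],
    ["Phoenicianism", "Kataeb Party", "Guardians of the Cedars", "Kataeb Party"],
)
print(line)
```

The subcategory `~Kataeb_Party` and the page `Kataeb_Party` are deliberately kept as separate members, as in the example line; only exact repeats are dropped.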

The Wikipedia category tree and its associated pages run into the tens of millions, all in a logical flow. I assume this would be far too much for local CS storage, so I would need a database to host the concepts. Then one could create topics, rules and gambits that traverse the concept tree.

I can foresee several tricky bits, but assuming I can normalise the text and eliminate redundancies, how would CS interact with a database of, say, 10 million concepts?

 

 
  [ # 1 ]

I should add the CS constraint I can see: I understand two ways of doing this. One is to make a database call every volley, which surely would not scale. The other is to compile the full list of concepts from the database at the start, but presumably 10 million concepts would fry CS’s brain. Are these the only two options?

 

 
  [ # 2 ]

Database calls every volley scale just fine. Kore does lots of API calls every volley, including calling Mongo at the start and end of every volley, since it saves the user topic file there. 10M concepts would fry it unless I revise the engine to use 64-bit fact pointers (but I don't have an incentive to do that yet).
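
To make the "database call every volley" option concrete, here is a minimal sketch of looking up the concepts above and below a term on demand. It uses an in-memory SQLite database purely as a stand-in (the reply above mentions Mongo); the `membership` table, its columns, and the helper names are assumptions for illustration, and how CS itself would issue the call is not shown.

```python
import sqlite3

# In-memory stand-in for the external concept store (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE membership (member TEXT, concept TEXT)")
# Index both directions so going up or down the tree stays a cheap lookup.
conn.execute("CREATE INDEX idx_member ON membership (member)")
conn.execute("CREATE INDEX idx_concept ON membership (concept)")
conn.executemany(
    "INSERT INTO membership VALUES (?, ?)",
    [
        ("Kataeb_Party", "~phoenicianism"),
        ("Guardians_of_the_Cedars", "~phoenicianism"),
        ("~Kataeb_Party", "~phoenicianism"),
    ],
)

def parents_of(member):
    """Concepts that directly contain the given member (one step up the tree)."""
    rows = conn.execute(
        "SELECT concept FROM membership WHERE member = ? ORDER BY concept",
        (member,),
    )
    return [r[0] for r in rows]

def members_of(concept):
    """Direct members of the given concept (one step down the tree)."""
    rows = conn.execute(
        "SELECT member FROM membership WHERE concept = ? ORDER BY member",
        (concept,),
    )
    return [r[0] for r in rows]

print(parents_of("Kataeb_Party"))
print(members_of("~phoenicianism"))
```

Only the rows touched by the current volley are ever loaded, so the full 10M-concept tree never has to sit in the engine's memory.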

 

 
  [ # 3 ]

OK, thanks a lot. When I read something like that I am once more seriously impressed by CS. A good project ahead.

While I’m at it, a quick if basic question (I suspect the answer is 101, but I am still getting to grips with the ins and outs of the documentation). Is there a way to output the CS concepts directly to the user? I can see how I could copy the list and create rules so that the concept triggers an output of its copy, but is there a way to reference it directly as output? If a user writes “I want to buy a diamond”, can I reference the concept.top list to respond: “Would you like to buy other trade goods and jewellery items?”

 

 
  [ # 4 ]

Well, you can certainly get the list of concepts indexed through a position in the sentence. But you would want to create a list of concepts you don't care about to exclude, because how do you know you want ONLY those 2 concepts?

 

 
  [ # 5 ]

Any concept name dumped directly into output automatically has the ~ removed and each _ converted to a space.
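
A tiny Python model of the rule just described, for anyone who wants to preview the displayed label outside the engine. It only mimics the behaviour stated in this post; `display_name` is a hypothetical helper, not part of CS.

```python
def display_name(concept_name: str) -> str:
    """Mimic the described output rule: drop the leading ~ and turn _ into spaces."""
    return concept_name.lstrip("~").replace("_", " ")

print(display_name("~trade_goods"))
```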

 

 
  [ # 6 ]

Thanks. What would be the syntax for outputting a concept label? For example:

u:(~animals ~family ~felines ~cars ~pets)
You are writing about [syntax for inserting matching concepts here]

So if the volley was: “I want a cat”

The response would be (assuming cat exists in all three concepts): “You are writing about animals, felines, pets.”

 

 
  [ # 7 ]

u: (_[~animals ~family ~felines ~cars ~pets])
@_0 = ^conceptList(_0)
loop()
{
# you can get all the concepts running through, but you need to discard the ones you don't care about
_0 = ^burst()
}

OR

u: (_[~animals ~family ~felines ~cars ~pets])
if (_0 ? ~animals) { $_tmp = ^join($_tmp , ~animals) }
if (_0 ? ~family) { $_tmp = ^join($_tmp , ~family) }
...
Then, at the end, remove the leading comma.
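
The logic of this second approach (test each concept in turn, append a separator plus the name, then strip the leading separator) can be modelled outside CS as below. This is a Python sketch of the idea only, not ChatScript; the `CONCEPTS` membership data and the function name are made up for illustration.

```python
# Hypothetical membership data standing in for the CS concept sets.
CONCEPTS = {
    "~animals": {"cat", "dog"},
    "~family": {"mother"},
    "~felines": {"cat"},
    "~cars": {"sedan"},
    "~pets": {"cat", "dog"},
}

def matching_concepts(word: str) -> str:
    """Join the names of every concept containing the word, comma-separated."""
    tmp = ""
    for name, members in CONCEPTS.items():
        if word in members:
            # Always prepend the separator, as in the chained ifs above.
            tmp += ", " + name
    # Remove the leading separator left over from the first append.
    if tmp.startswith(", "):
        tmp = tmp[2:]
    return tmp

print(matching_concepts("cat"))
```

Appending the separator unconditionally keeps each test identical, at the cost of one clean-up step at the end.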

 

 
  [ # 8 ]

That is super helpful and will save me tons of time. You rock.

 

 