

Where does semantics fail?
 
 

I keep circling around this idea, like some sort of buzzard, waiting to see if the idea is dead and ripe, or if it’s still mobile enough to escape me.

Basically, why hasn’t semantics been enough to produce an intelligent program? The failure of Cyc to produce a readily identifiable intelligent system is often pointed to as proof the idea is unsound, but the fact that there’s too much information out there doesn’t seem to be a good enough excuse.

I think it’s a matter of engineering, a way of intersecting active processes to produce emergent effects. I keep returning to neural networks as a necessary component, but I have some doubts.

What neural networks offer is a powerful, general-purpose function approximator. It's awesome in cases where you have an identifiable set of inputs and a predictable set of outputs within a well-defined problem domain. Semantic reasoners, on the other hand, don't approximate; they just process using whichever inference method is put in place. Any locally consistent ontology can be queried to produce correct results.
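
To be clear about what I mean by "function approximator", here's a throwaway sketch (plain numpy, my own toy, nothing to do with the bot project) of a one-hidden-layer network fitting noisy sin(x) by gradient descent:

```python
import numpy as np

# Toy illustration (my own, not part of the bot project): a one-hidden-layer
# network fit to noisy sin(x) by plain full-batch gradient descent.
rng = np.random.default_rng(0)
X = rng.uniform(-np.pi, np.pi, size=(200, 1))          # inputs
y = np.sin(X) + 0.05 * rng.standard_normal(X.shape)    # noisy targets

H = 16                                                  # hidden units
W1 = 0.5 * rng.standard_normal((1, H)); b1 = np.zeros(H)
W2 = 0.5 * rng.standard_normal((H, 1)); b2 = np.zeros(1)
lr = 0.05

def forward(X):
    h = np.tanh(X @ W1 + b1)
    return h, h @ W2 + b2

print("MSE before:", float(np.mean((forward(X)[1] - y) ** 2)))

for _ in range(5000):
    h, pred = forward(X)
    d_pred = 2 * (pred - y) / len(X)        # gradient of the mean squared error
    dW2 = h.T @ d_pred;  db2 = d_pred.sum(axis=0)
    d_h = (d_pred @ W2.T) * (1 - h ** 2)    # backprop through tanh
    dW1 = X.T @ d_h;     db1 = d_h.sum(axis=0)
    W1 -= lr * dW1; b1 -= lr * db1; W2 -= lr * dW2; b2 -= lr * db2

print("MSE after: ", float(np.mean((forward(X)[1] - y) ** 2)))   # should be far smaller
```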

My idea is that if you incorporate the processing of a bot within a consistent ontology, defining behaviors and feeding self knowledge into the graph, then you could end up with an emergent algorithm.

To test this, I’m using an old chatbot console I’ve gutted, and I’m going to use StrixDB and OWL to create a simple chat system. I plan on very limited interactions at first so that I can strictly define behaviors, but at the same time, identify opportunities for development.

This means that I want the bot to be able to learn, so it has to develop new entries within the ontology, and apply some sort of online learning algorithm. This may mean incorporating a parser, as well - I prefer LinkGrammar over most other parsers, because you can get some really high level results without a lot of work, and it’s extremely comprehensive.

The closest thing I think I've seen to this was the OpenCyc / AIML project Cyn, which basically nailed AIML onto the front of some CycL queries. The output looked very promising to me, but beyond the initial experiment, the project didn't seem to get off the ground.

So I have a project with a semantic reasoner engine (Datalog) and a frontend - can anyone think of some good experiments to run?

I was thinking of a "Hello, Bot" experiment, in which I would attempt to model every relevant aspect of an interaction between a single user and a bot - the bot's identity, the bot's awareness of the console as an interface between something fundamentally "itself" and the outside world, the expectation of a greeting response when someone says hello, and so forth. Simple behaviors and mental processes would be designed, and all interaction with the bot would go through executing those processes and behaviors. The graph would contain every aspect of the bot's existence, including the queries used to produce, alter, and add to the graph itself.
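
To make that a bit more concrete, here is a rough Python sketch of the kind of thing I mean, with plain triples standing in for the OWL graph. The predicate names (is_a, expects, instance_of, and so on) are placeholders of my own, not StrixDB or OpenCyc vocabulary:

```python
# Rough sketch only: plain triples stand in for the OWL graph, and the
# predicate names are placeholders I made up, not real StrixDB/OWL terms.
GRAPH = {
    ("Bot",      "is_a",         "Agent"),
    ("User",     "is_a",         "Agent"),
    ("Console",  "interface_of", "Bot"),        # the console as the bot's boundary
    ("greeting", "expects",      "greeting"),   # a greeting expects a greeting back
    ("hello",    "instance_of",  "greeting"),
    ("hi",       "instance_of",  "greeting"),
}

def classify(utterance):
    """Find what kind of act an utterance is, according to the graph."""
    for s, p, o in GRAPH:
        if s == utterance.lower() and p == "instance_of":
            return o
    return None

def behavior(utterance):
    """The 'mutual greeting' behavior: if the input is a greeting and the
    graph says greetings expect greetings, answer with one."""
    act = classify(utterance)
    if act and (act, "expects", act) in GRAPH:
        return "Hello! (responding to your " + act + ")"
    return "..."

print(behavior("Hello"))   # -> Hello! (responding to your greeting)
print(behavior("banana"))  # -> ...
```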

Once I can get past a few experiments of this nature, my hope is that ideas for active learning will start percolating. The obvious benefit of this type of system is that OpenCyc is available in OWL format, so you can get a huge, well-curated KB on the cheap.

 

 
  [ # 1 ]

Well..

“Semantics is the study of meaning.”
http://en.wikipedia.org/wiki/Semantics

“Meaning” is the same issue as “understanding”, and “understanding” is undefined.
http://en.wikipedia.org/wiki/Understanding

So far, all approaches to computer semantics/understanding of which I am aware have relied on logic or math, which are similar types of representation:

There are many approaches to formal semantics; these belong to three major classes:

  Denotational semantics, whereby each phrase in the language is interpreted as a denotation, i.e. a conceptual meaning that can be thought of abstractly. Such denotations are often mathematical objects inhabiting a mathematical space, but it is not a requirement that they should be so. As a practical necessity, denotations are described using some form of mathematical notation, which can in turn be formalized as a denotational metalanguage. For example, denotational semantics of functional languages often translate the language into domain theory. Denotational semantic descriptions can also serve as compositional translations from a programming language into the denotational metalanguage and used as a basis for designing compilers.

  Operational semantics, whereby the execution of the language is described directly (rather than by translation). Operational semantics loosely corresponds to interpretation, although again the “implementation language” of the interpreter is generally a mathematical formalism. Operational semantics may define an abstract machine (such as the SECD machine), and give meaning to phrases by describing the transitions they induce on states of the machine. Alternatively, as with the pure lambda calculus, operational semantics can be defined via syntactic transformations on phrases of the language itself;

  Axiomatic semantics, whereby one gives meaning to phrases by describing the logical axioms that apply to them. Axiomatic semantics makes no distinction between a phrase’s meaning and the logical formulas that describe it; its meaning is exactly what can be proven about it in some logic. The canonical example of axiomatic semantics is Hoare logic.

http://en.wikipedia.org/wiki/Semantics_(computer_science)
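
To make the first of those classes concrete, here's a toy denotational semantics for a tiny arithmetic language, written in Python rather than a mathematical metalanguage (my own illustration, not from the article):

```python
# Toy illustration (mine, not from the article): a denotational semantics for
# a tiny two-operator arithmetic language. Each phrase is mapped to its
# denotation -- here just an ordinary Python number -- by `denote`.

# Phrases are nested tuples: ("+", e1, e2), ("*", e1, e2), or a literal number.
def denote(phrase):
    if isinstance(phrase, (int, float)):     # literals denote themselves
        return phrase
    op, left, right = phrase
    # The meaning of a compound phrase is built only from the meanings of its
    # parts -- that is the "compositional" part.
    if op == "+":
        return denote(left) + denote(right)
    if op == "*":
        return denote(left) * denote(right)
    raise ValueError("unknown operator: " + op)

# (2 + 3) * 4 denotes 20
print(denote(("*", ("+", 2, 3), 4)))
```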

I suggest that the essence of computer understanding boils down to the problem of finding a suitable knowledge representation method for computers, and/or possibly a type of math that involves something other than numerical or Boolean values for describing the very complex, uncertain, and changing world that living organisms inhabit and perceive. I know of no published system of representation that will accomplish that.

JRowe - Jan 3, 2013:

My idea is that if you incorporate the processing of a bot within a consistent ontology, defining behaviors and feeding self knowledge into the graph, then you could end up with an emergent algorithm.

I eschew “emergence”. I don’t like to have to wait for things to decide if they want to emerge. By gosh, I intend to make them emerge! That’s why the popular new AI direction of creating swarms with hopefully emergent intelligence doesn’t interest me at all.

I don’t know if that helps any, but at least now you can say that somebody responded to your post. grin

6.2 Limitations of Logic

In developing formal logic, Aristotle took Greek mathematics as his model. Like his predecessors Socrates and Plato, Aristotle was impressed with the rigor and precision of geometrical proofs. His goal was to formalize and generalize those proof procedures and apply them to philosophy, science, and all other branches of knowledge. Yet not all subjects are equally amenable to formalization. Greek mathematics achieved its greatest successes in astronomy, where Ptolemy’s calculations remained the standard of precision for centuries. But other subjects, such as medicine and law, depend more on deep experience than on brilliant mathematical calculations. Significantly, two of the most penetrating criticisms of logic were written by the physician Sextus Empiricus in the second century A.D. and by the legal scholar Ibn Taymiyya in the fourteenth century.
  Sextus Empiricus, as his nickname suggests, was an empiricist.
(“Knowledge Representation”, John F. Sowa, 2000, page 356)

 

 

 
  [ # 2 ]

Just because some of us don’t respond, that doesn’t necessarily mean that we aren’t “listening”. I’m finding a great deal of interesting information in these threads, and I’m learning a great deal. I don’t respond (most of the time) because “Wow! I didn’t know that! Thanks!” doesn’t really move the discussion forward. Just know that I DO read these posts, and further, I appreciate them. smile

 

 
  [ # 3 ]

Emergence is a required feature. Waiting is not - I’m with you on that one. However, the system has to feed back into itself. The word emergence simply describes the quality of the whole being greater than the sum of its parts, algorithmically speaking. I use the word fractal in the same way, to describe a system’s momentum of complexity.

I suggest that the essence of computer understanding boils down to the problem of finding a suitable knowledge representation method for computers, and/or possibly a type of math that involves something other than numerical or Boolean values for describing the very complex, uncertain, and changing world that living organisms inhabit and perceive. I know of no published system of representation that will accomplish that.

The system of knowledge representation you’re looking for is called predicate calculus, which boils down to a programming language that is Turing complete. It’s a problem that’s largely solved - what is not solved, and the reason current efforts to implement first order logic engines have not resulted in AGI, is the computability of such systems. Where the human brain can easily process immense graphs of disparate knowledge that result in highly efficient logical operations and intelligent results, an inference engine is stuck without the ability to dynamically select optimal rules for graph traversal.

The whole of knowledge representation can be reduced to the traversal of a graph. Automating this in AI led to experiments such as Markov chain bots, and underscored one of the huge gotchas in AI - combinatorial explosion.

The big unknown and hard thing about knowledge representation comes down to P vs NP, which we currently don't have a solution to. Calculations are hard, and you can't even prove that a particular query will produce an answer, or how long a query will take to resolve. There are lots of hacks, but no hard and fast formulas for performance.

This means stepping outside of the engine and imposing arbitrary limits on resources allocated to a given problem, with no way of knowing if you’re excluding an entire magnitude of solvable problems just outside the artificial constraint.
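
Here's a toy illustration of that last point, in Python: a breadth-first pass over some made-up implication rules with a hard budget on node expansions. The facts, rules, and budget are all invented for the example.

```python
# Toy illustration of the "arbitrary resource limit" problem: breadth-first
# search over implication rules with a hard budget. All rules are made up.
from collections import deque

RULES = {              # a -> [things a implies]
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["e", "f"],
    "f": ["goal"],
}

def provable(start, target, budget):
    """Return True if `target` is reachable from `start` within `budget`
    node expansions; None if the budget ran out first (undecided)."""
    seen, queue, spent = {start}, deque([start]), 0
    while queue:
        if spent >= budget:
            return None                      # gave up: answer unknown
        node = queue.popleft()
        spent += 1
        if node == target:
            return True
        for nxt in RULES.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

print(provable("a", "goal", budget=3))   # None  -> the cutoff hides a real proof
print(provable("a", "goal", budget=50))  # True  -> same query, bigger budget
```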

This is why I think incorporating neural networks into the mix is important. An ANN has the ability to generalize, is globally optimal, and can be locally tuned as well. Knowledge representation is complete by default - every possible significant relation of everything it encounters given the entire input sequence is trained into the system. It can’t not learn something significant. In the same way, it implicitly condenses learned features and prevents redundancy of representation. This trick is accomplished biologically through networks that incorporate sparse distributed memory and about 10000 connections per neuron. Given the generally known attributes of the way the system works, this means that each neuron has around 2^13 different possible inflections, completely dependent on the context provided by other neurons.

Completeness of knowledge representation also needs to be combined with compression, which gives a relative measure of a system's intelligence: the more a system can reduce the Kolmogorov complexity of its output, the more intelligent it is.
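
Kolmogorov complexity itself is uncomputable, but you can get the flavor of the idea with an off-the-shelf compressor as a very rough stand-in (a well-known trick, nothing original here):

```python
# Crude stand-in: Kolmogorov complexity is uncomputable, but the length of a
# zlib-compressed string is a common (and very rough) upper-bound proxy for it.
import zlib

def approx_complexity(text):
    return len(zlib.compress(text.encode("utf-8"), 9))

redundant = "hello hello hello hello hello hello hello hello"
varied    = "the quick brown fox jumps over the lazy dog tonight"

print(approx_complexity(redundant))  # small: the repetition compresses away
print(approx_complexity(varied))     # larger: less structure to exploit
```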

I'm hoping my experiment will shed light on targets of opportunity. For example, a learning algorithm that doesn't fixate on repeating individual actions, but groups action sequences into behaviors, and clusters behaviors together intelligently… think about children first learning to spell out words, and then spelling the letters of a word out loud to themselves before repeating the entire word. Then mouthing out a word in a sentence, reading it multiple times. And then finally, reading to themselves without saying anything at all, not even subvocalizing… Learning to read books has remarkably little to do, directly, with learning the alphabet. Once the words are recognized as independent concepts, the constituent structures no longer matter to the semantic processing. The letters only serve to facilitate the communication of a concept, and nothing more. 3v3n 1nc0rr3c7 or 4l73rn473 spelling suffices to get the point across, even if we have to do mental substitution. The substitution itself is not all that relevant to the end result - even less relevant than the fact that the substitution occurred at all.

From what I know, knowledge representation is a globally solved problem. What is not solved is the relationship to intelligence, or the required “local” subset of a Turing complete system that gives you the shortcuts needed to perform efficient graph traversal in a general way.

The solution, I believe, will be a matter of engineering. Compartmentalizing the processes of acquiring knowledge, processing input, temporal and spatial awareness, proprioception, and above all, incorporating self awareness into the system implicitly, will result in a program that behaves intelligently.

That’s why I’m starting by implementing the simplest “behavior” I can recognize - that of mutual greeting. I believe it is the simplest expression of an AI-complete problem (capable of requiring an intelligence to solve.)

 

 
  [ # 4 ]
JRowe - Jan 5, 2013:

The system of knowledge representation you’re looking for is called predicate calculus, which boils down to a programming language that is Turing complete. It’s a problem that’s largely solved - what is not solved, and the reason current efforts to implement first order logic engines have not resulted in AGI, is the computability of such systems. Where the human brain can easily process immense graphs of disparate knowledge that result in highly efficient logical operations and intelligent results, an inference engine is stuck without the ability to dynamically select optimal rules for graph traversal.

No, I disagree with every statement above. There also exist analog computers that can in theory solve problems that no digital computer can ever solve, although admittedly in the real world there are resolution limits (typically at the size of atoms) that in practice might render that advantage moot, since digital computers can simulate any system with arbitrary accuracy, given enough time. The main problem with digital simulation is that in practice the computation time needed on a digital computer for many simulations is often so prohibitively large that the computation will never finish in time for the answer to be useful.

JRowe - Jan 5, 2013:

The whole of knowledge representation can be reduced to the traversal of a graph.

If by “graph” you mean a directed acyclic graph (DAG) such as used in graph theory, this is not true. Photographs, poles, sticks of spaghetti, ropes, balance scales, measuring tape, pieces of paper, containers of water, atomic orbitals, and numerous other analog systems can and routinely do represent knowledge, not even considering all the analog versions of instruments like clocks, compasses, weight scales, thermometers, pressure gauges, accelerometers, slide rules, etc., and none of those analog systems can be exactly represented by a digital computer because digital computers always have finite accuracy whereas analog computers have infinite accuracy—again, in theory. The same is true about such analog systems being nonrepresentable by discrete KR structures like DAGs: completely discrete structures cannot represent continuous phenomena without incurring great inefficiency.

For example, check out the “spaghetti sort” for an analog way to perform a sort in O(n) time…
http://en.wikipedia.org/wiki/Spaghetti_sort

...which is faster than even the most efficient comparison-based sorting algorithms for digital computers, which are typically O(n log n)...

http://warp.povusers.org/SortComparison/
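
You can imitate the trick digitally with a counting sort, which also makes no comparisons, but only by assuming the "rod lengths" are small non-negative integers; the analog version needs no such assumption. A quick sketch of my own:

```python
# A digital cousin of the spaghetti sort: counting sort. Like the analog trick
# it makes no comparisons, but only because we assume the "rod lengths" are
# small non-negative integers.
def counting_sort(lengths, max_len):
    bins = [0] * (max_len + 1)
    for x in lengths:                        # tally how many rods have each length: O(n)
        bins[x] += 1
    out = []
    for length, count in enumerate(bins):    # read the lengths back out in order: O(max_len)
        out.extend([length] * count)
    return out

print(counting_sort([5, 3, 8, 3, 1], max_len=10))   # [1, 3, 3, 5, 8]
```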

What Is an Analog Computer?
...
Discrete numbers are those which are either whole numbers; those where the decimal fractions are either limited such as one-eighth being 0.125; or those which have sequences which repeat, such as one-sixth being 0.1666 recurring. The infinite nature of irrational numbers means they cannot be reduced to the binary figure needed for a digital computer. This means only analog computers can act as so-called “real computers” and solve some of the most complicated problems in mathematics.

http://www.wisegeek.com/what-is-an-analog-computer.htm

I know that analog computers / analog computation is no longer taught in school, but that doesn’t mean it doesn’t exist or isn’t important. Back in the early days of computers, analog computers dominated digital computers for years, in fact.

Darpa Has Seen the Future of Computing … And It’s Analog

It seems like a new idea — probabilistic computing chips are still years away from commercial use — but it’s not entirely. Analog computers were used in the 1950s, but they were overshadowed by the transistor and the amazing computing capabilities that digital processors pumped out over the past half-century, according to Ben Vigoda, the general manager of the Analog Devices Lyric Labs group.

“The people who are just retiring from university right now can remember programming analog computers in college,” says Vigoda. “It’s been a long time since we really questioned the paradigm that we’re using.”

http://www.wired.com/wiredenterprise/2012/08/upside/

Analog computer trumps Turing model
...
Recent developments in computing theory challenge longstanding assumption about digital and analog computing, and suggest that analog computations are more powerful than digital ones.

The latest thesis, advanced by Hava Siegelmann at the Technion Institute of Technology, claims that some computational problems can only be solved by analog neural networks. Since neural networks are essentially analog computers, the work suggests, on a theoretical level, that analog operations are inherently more powerful than digital.

Based on her work, Siegelmann believes that neural networks represent a “super-Turing” computer more comprehensive than even a hypothetical digital computer with unlimited resources. Long-term, Siegelmann said she hopes her model will open up completely new ways of thinking about computing.

http://www.eetimes.com/electronics-news/4036696/Analog-computer-trumps-Turing-model

If you think about it, there is a very important qualitative change that happens when transitioning from discrete to continuous. Some examples:

(1) If you traverse a unit square (a square where every side has length 1) between its two most distant corners in only discrete steps, each step of which is parallel to one of the sides, the total distance traveled will always be exactly 2, even if you approach the direct diagonal path with a huge number of tiny steps. But as soon as you go analog and travel straight from corner to corner, the distance drops down to about 1.4 (= the square root of 2), which is significantly smaller than 2.

(2) The 0/1 knapsack problem is discrete and NP-hard, but the continuous version of the same problem can be solved with a very fast, trivial greedy algorithm (see the sketch after these examples).

http://encyclopedia2.thefreedictionary.com/Continuous+knapsack+problem
http://xlinux.nist.gov/dads/HTML/fractionalKnapsack.html

(3) In geometry, the formulas for length, distance, or area of most curves or polygons composed of n discrete segments of straight lines are complicated, but usually those formulas suddenly reduce to a simple form when the limit of n is taken to infinity—equivalent to moving from discrete to continuous.
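
Here is the greedy algorithm promised in example (2), sketched in Python (my own code, with made-up item values and weights):

```python
# My own sketch of example (2): the continuous ("fractional") knapsack.
# The 0/1 version is NP-hard, but once items can be split, a greedy pass
# over value-per-weight solves it exactly after an O(n log n) sort.
def fractional_knapsack(items, capacity):
    """items: list of (value, weight) pairs; returns the best total value."""
    items = sorted(items, key=lambda vw: vw[0] / vw[1], reverse=True)
    total = 0.0
    for value, weight in items:
        if capacity <= 0:
            break
        take = min(weight, capacity)        # take all of it, or the fraction that fits
        total += value * (take / weight)
        capacity -= take
    return total

# 50 units of capacity: takes all of the first two items, 2/3 of the third.
print(fractional_knapsack([(60, 10), (100, 20), (120, 30)], 50))   # 240.0
```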

It is largely for reasons like this that I agree with DARPA that the future of computing, especially for AI, lies in analog computing and in analog representations.

As the astute reader has noted, there is a problem with this line of reasoning: Digital computers are, at bottom, similarly bimodal. Zeroes and ones are the popular alphabet for the bit patterns in the hardware of computers. Does this mean, as Dreyfus (1979) used to argue, that AI must move to analog computing machines before there is any hope of success? Most people don’t think so, but a consideration of the various ways up and out of this particular hole will take us too far afield; it is thus a take-home problem.
(“A New Guide to Artificial Intelligence”, Derek Partridge, 1991, page 472)

 

 

 
  [ # 5 ]
Dave Morton - Jan 5, 2013:

I don’t respond (most of the time) because “Wow! I didn’t know that! Thanks!” doesn’t really move the discussion forward. Just know that I DO read these posts, and further, I appreciate them. smile

This forum could always be the first in the world to develop a new icon that means “Wow! I didn’t know that! Thanks!”, analogous to the thumbs-up or thumbs-down icons used on sites like YouTube, whereupon readers here could merely click on that icon to transmit that sentiment. Just another idea from my overactive mind.

Anyway, I agree with that sentiment, Dave. For example, although I didn't know what to say to Andrew Smith's recent post about discoveries of microtubules in neurons possibly being the basis of memories, I appreciated that news and had never heard of it before.

 

 

 
  [ # 6 ]

The main problem with digital simulation is that in practice the computation time needed on a digital computer for many simulations is often so prohibitively large that the computation will never finish in time for the answer to be useful.

what is not solved, and the reason current efforts to implement first order logic engines have not resulted in AGI, is the computability of such systems. ...an inference engine is stuck without the ability to dynamically select optimal rules for graph traversal.

Same idea. Different words. The quality of computability is the emplacement of limits on resources allocated to an operation - one of the reasons Turing is so important to AI is that he gave us the math proofs that describe the boundaries of computability. We can only say that certain problems are even computable, and for most we can't determine whether they're computable in polynomial time.

What you posted came off a little aggressive, which I can appreciate, and I usually welcome challenges to my ideas. This one confused me, though, since we're both saying the same thing. It may be due to my sometimes excessive wordiness, for which I apologize. I type quickly, so sometimes I sacrifice brevity out of laziness just so I can hammer out an idea. I'll be much less lazy from here on out. smile

Let me sum up my previous posts without the particulars: Current KR/AI systems lack the ability to generalize well. Work needed. Fundamental principles are sound. Implementations, or their application, are lacking.

If by “graph” you mean a directed acyclic graph (DAG) such as used in graph theory, this is not true

When I use the term "graph" I'm referring to a hypergraph... either an implicitly modeled hypergraph, or a DAG modified for recursion, which is functionally equivalent. Interpreted recursion suffices for well-formed hypergraphs notated as DAGs. It just reduces brevity of expression, which, for large systems, is trivial.
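
To illustrate the "functionally equivalent" part: an n-ary hyperedge can be rebuilt from ordinary binary edges by giving the edge itself a node, which is plain old reification. A toy sketch in Python (all names are placeholders):

```python
# Placeholder names throughout: this just shows how one 3-way hyperedge
# ("the bot greets the user via the console") can be reified into ordinary
# binary edges by giving the edge its own node.
hyperedge = ("greets", {"agent": "Bot", "patient": "User", "via": "Console"})

def reify(edge_id, hyperedge):
    """Turn one n-ary hyperedge into binary (subject, predicate, object) triples."""
    relation, roles = hyperedge
    triples = {(edge_id, "relation", relation)}
    for role, node in roles.items():
        triples.add((edge_id, role, node))
    return triples

for t in sorted(reify("e1", hyperedge)):
    print(t)
# ('e1', 'agent', 'Bot')
# ('e1', 'patient', 'User')
# ('e1', 'relation', 'greets')
# ('e1', 'via', 'Console')
```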

The experiment I’m working on is an attempt to generate a self-organizing interpreted hypergraph with various qualities such as self similarity, sensor/articulator/mind feedback loops, and so on. I want a system in which I can program/design behaviors, instead of actions. I’m implementing sensors and articulators as models that operate concurrently with a realtime interface.

Behaviors are loosely defined at this point as a sequence of events causing or caused by interactions with external and internal constructs that are comprehensively and consistently modeled.

I’m choosing to use the “DAG modified for recursion” approach, because I don’t have to reinvent the wheel. I can use OWL ontologies alongside Datalog for recursion and rule expression. I just model the system, hack in the pieces, and experiment!
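
And for the "Datalog for recursion" part, the kind of rule I have in mind is just transitive reachability run to a fixpoint. Here it is faked in a few lines of Python rather than real Datalog or StrixDB, with made-up edge names:

```python
# Faked in Python rather than real Datalog/StrixDB: the recursive rule
#   reachable(X, Z) :- edge(X, Z).
#   reachable(X, Z) :- edge(X, Y), reachable(Y, Z).
# evaluated naively to a fixpoint. Edge names are made up for the example.
edges = {("greet", "respond"), ("respond", "log"), ("log", "idle")}

def transitive_closure(edges):
    reachable = set(edges)
    while True:
        new = {(x, z) for (x, y) in reachable for (y2, z) in edges if y == y2}
        if new <= reachable:          # nothing new derived: fixpoint reached
            return reachable
        reachable |= new

for fact in sorted(transitive_closure(edges)):
    print(fact)
# includes ('greet', 'idle') even though no single edge says so
```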

 

 
  [ # 7 ]

@Mark

I read all your links, and paid close attention to the discussion, and here I am going to introduce some 'new' concepts:

Analogy, real numbers, etc. are creations of our mind! They don't exist in the wild in nature, nor inside computers!
Continuous things are pure imagination; there is nothing continuous, not even time (according to physicists).
Formulas, math, and all that stuff are things created inside human folklore (a.k.a. science).

If we want to re-create (and first understand) some human-brain logic, we need to stop following hoaxes and concentrate on the inner mechanisms, which are (in my opinion) far more complex than a simple mathematical simplification like linearity or even scalability.

Semantics and Meaning are concepts, but no one knows how a concept is best represented!

There are many models, some of them strange: analog, logical, or statistical, all made up from our 'brain' inventions!

The goal, and a good road to AGI (as I see it), could be to imitate the behavior of the thinking process, see whether this imitation can arise spontaneously from a model based on evidence, and then see whether it has the capability to infer new solutions.

I saw a good explanation/demo of deep resonant neural networks, which resemble our dreaming mechanism; the 'ideas' are only huge and diffuse data chunks, interpreted in a complex way. Here is the video:
http://www.youtube.com/watch?feature=player_embedded&v=Nu-nlQqFCKg

 

 

 

 
  [ # 8 ]
Andres Hohendahl - Jan 7, 2013:

Analogy, real numbers, etc. are creations of our mind! They don't exist in the wild in nature, nor inside computers!
Continuous things are pure imagination; there is nothing continuous, not even time (according to physicists).
Formulas, math, and all that stuff are things created inside human folklore (a.k.a. science).

That topic is getting close to pure philosophy, but I agree with Penrose that all of mathematics already exists, and all humans can do is explore it and change the representations they use to express it. Math is really amazing in that it transcends the real world; even if there existed an entire universe separate from ours, with different physical laws, its inhabitants would still be using the same mathematics, though presumably with different notation!

As for continuity etc., remember my thread about AI and "first principles": each science has its own level of resolution, so it doesn't help to know that ultimately the universe is discrete when, at the scale of living organisms, those limits are orders of magnitude removed from experience and therefore of no practical consequence.

Is mathematics invention or discovery? When mathematicians come upon their results are they just producing elaborate mental constructions which have no actual reality, but whose power and elegance is sufficient simply to fool even their inventors into believing that these mere mental constructions are ‘real’? Or are mathematicians really uncovering truths which are, in fact, already ‘there’—truths whose existence is quite independent of the mathematicians’ activities? I think that, by now, it must be quite clear to the reader that I am an adherent of the second, rather than the first, view, at least with regard to such structures as complex numbers and the Mandelbrot set.
(“The Emperor’s New Mind: Concerning Computers, Mind, and The Laws of Physics”, Roger Penrose, 1989, page 96)

Andres Hohendahl - Jan 7, 2013:

Semantics and Meaning are concepts, but no one knows how a concept is best represented!

The brain knows. Of course “best” always depends on your definition or your values, but assuming that the goal is efficient, adaptive processing of real-world data, the brain’s representation is best, as far as we know. But we’re unable to directly look at our own brains, unlike our fingers or other anatomy.

Andres Hohendahl - Jan 7, 2013:

I saw a good explanation/demo of deep resonant neural networks, which resemble our dreaming mechanism; the 'ideas' are only huge and diffuse data chunks, interpreted in a complex way. Here is the video:

I watched the video, but I wasn't very impressed. There were neural networks that "dreamed" (i.e., needed down time to organize their stored memories) even in the late '80s, and 70-80% accuracy on speech recognition is not very good, in my opinion. The mention of Markov processes is interesting, though: I've seen them mentioned a fair number of times on this forum (I think from you, too, correct?), and I just came across them in conjunction with speech while looking through a textbook in the library today.

 

 

 

 

 
  [ # 9 ]

If you are interested in neural nets, the Coursera course provides a good overview.
https://www.coursera.org/course/neuralnets

Lecture 1: Introduction
Lecture 2: The Perceptron learning procedure
Lecture 3: The backpropagation learning procedure
Lecture 4: Learning feature vectors for words
Lecture 5: Object recognition with neural nets
Lecture 6: Optimization: How to make the learning go faster
Lecture 7: Recurrent neural networks
Lecture 8: More recurrent neural networks
Lecture 9: Ways to make neural networks generalize better
Lecture 10: Combining multiple neural networks to improve generalization
Lecture 11: Hopfield nets and Boltzmann machines
Lecture 12: Restricted Boltzmann machines (RBMs)
Lecture 13: Stacking RBMs to make Deep Belief Nets
Lecture 14: Deep neural nets with generative pre-training
Lecture 15: Modeling hierarchical structure with neural nets
Lecture 16: Recent applications of deep neural nets
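
If you want a taste of Lecture 2 before enrolling, the perceptron learning procedure fits in a few lines. Here is a minimal sketch of my own (not course material), trained on the AND function:

```python
# My own sketch, not course material: the classic perceptron learning rule,
# trained here on the (linearly separable) AND function.
def perceptron_train(samples, epochs=20, lr=1.0):
    w = [0.0, 0.0]          # weights
    b = 0.0                 # bias
    for _ in range(epochs):
        for (x1, x2), target in samples:
            pred = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            err = target - pred                 # +1, 0, or -1
            w[0] += lr * err * x1               # nudge weights toward the target
            w[1] += lr * err * x2
            b += lr * err
    return w, b

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = perceptron_train(AND)
for (x1, x2), target in AND:
    print((x1, x2), 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0)   # matches the targets
```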

 

 