
AI has no first principles
 
 

Years ago, in one of my first AI classes, one middle-aged student (who was also an experienced manager at an aerospace company) told a group of us during break that the problem with AI was that it had no “first principles”. He said that even geology had a first principle, namely that the oldest rocks were on the bottom. He said all fields of science had first principles, but not AI.

It has only been in the last few months that I started thinking more seriously about his statement. To my surprise, I’m finding out that this is a difficult topic because information is very hard to find online in the form of lists of “first principles” in any specific science.  For such a general topic of discussion, you would think that even amateurs in each scientific field would at least list the first principles of their field. But mostly what you find online are advertisements for textbooks with the term “First Principles” in their title, and no actual lists. It would be a time-consuming task to look up a textbook mentioning first principles in several representative fields.

However, while in the library recently, I did look up the AI book with the promising title “Principles of Artificial Intelligence” for exactly such a list, thinking that if any AI reference source would demonstrate that student’s statement as true or false, that would be it. Was I in for a disappointment! In the intro of the textbook, this is all that was written about the “first principles” from which that book got its name:

AI has also embraced the larger scientific goal of constructing an information-processing theory of intelligence. If such a science of intelligence could be developed, it could guide the design of intelligent machines as well as explicate intelligent behavior as it occurs in humans and other animals. Since the development of such a general theory is still very much a goal, rather than an accomplishment of AI, we limit our attention here to those principles that are relevant to the engineering goal of building intelligent machines. Even with this more limited outlook, our discussion of AI ideas might well be of interest to cognitive psychologists and others attempting to understand natural intelligence.
  As we have already mentioned, AI methods and techniques have been applied in several different problem areas. To help motivate our subsequent discussions, we next describe some of these applications.
(“Principles of Artificial Intelligence”, Nils J. Nilsson, 1980, page 2)

Let me paraphrase the above excerpt: “AI has no first principles, so we’re going to show you some applications instead.” Wow, was that disappointing. Here’s all I found online that was applicable to first principles of AI:

This course introduces the following principles of artificial intelligence.
  Represent and solve problems in an abstract language
  Manage uncertainty by tracking possibilities
  Plan behavior by maximizing expected utility
http://conway.rutgers.edu/~ccshan/wiki/cs530/

Let me paraphrase the above excerpt: “We’ll demonstrate principles by showing you computer programs that list the possibilities we encounter, and then use utility functions (that are admittedly ad hoc).” My gosh, can the field of AI really do no better than this?
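To be fair, here is roughly what such a program amounts to. This is a minimal sketch of “tracking possibilities” and “maximizing expected utility”; the umbrella scenario, the probabilities, and the utility numbers are my own invented illustration, not anything taken from the course:

    # Minimal sketch of "manage uncertainty by tracking possibilities" and
    # "plan behavior by maximizing expected utility". All numbers below are
    # invented for illustration only (the "admittedly ad hoc" utility function).

    beliefs = {"rain": 0.3, "no_rain": 0.7}            # possible states and our degree of belief

    utilities = {                                      # utility of each action in each state
        "take_umbrella":  {"rain": 5,   "no_rain": -1},
        "leave_umbrella": {"rain": -10, "no_rain": 2},
    }

    def expected_utility(action):
        """Sum the utility over possible states, weighted by belief."""
        return sum(p * utilities[action][state] for state, p in beliefs.items())

    best = max(utilities, key=expected_utility)        # pick the action with highest expected utility
    print(best)                                        # take_umbrella (expected utility 0.8 vs. -1.6)

That really is all there is to it, which is part of why it feels thin as a “principle”.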

So let’s look at other fields for a comparison. What exactly is a “first principle”? Per Wikipedia, the definition depends somewhat on the specific science:

In mathematics, first principles are referred to as axioms or postulates.
...
In philosophy, a first principle is a basic, foundational proposition or assumption that cannot be deduced from any other proposition or assumption.
...
In physics, a calculation is said to be from first principles, or ab initio, if it starts directly at the level of established laws of physics and does not make assumptions such as empirical model and fitting parameters.
http://en.wikipedia.org/wiki/First_principle

The best description I found came from one Ph.D.'s blog:

In any field of endeavor, there are core assumptions that serve as a foundation for more complex and ornate ideas: The laws of thermodynamics, the hierarchy of needs and the intractable rules of etiquette like never wear white after Labor Day and other fashion felonies. We build our beliefs and preferences based on these first principles. For example, the Constitution of the United States spells out a series of operating principles based on the fundamental belief in the inalienable rights of Man.
http://www.psychologytoday.com/blog/innovation-you/201201/thoughts-first-principles

Apparently not even Computer Science dwells on first principles, though:

Philosophers are less concerned with establishing fixed, controlled vocabularies than are researchers in computer science, while computer scientists are less involved in discussions of first principles, such as debating whether there are such things as fixed essences or whether entities must be ontologically more primary than processes.
http://en.wikipedia.org/wiki/Ontology_(information_science)

And here is a discussion of first principles of biology, compared to first principles in physics:

It is now known, from extensive experience in these physical sciences, that the behaviour at the atomistic scale can be accurately predicted using parameter free quantum mechanical calculations based on density functional theory. These calculations are often referred to as first principles or ab initio. For instance, such first principles calculations have provided an understanding of the molten core of the earth [1], of chemical reactions in numerous different environments [for examples see 2,3], of the growth of oxide films on aluminium [4] and silicon [5] surfaces and there many thousands of other applications. These applications provide a glimpse of the potential impact of quantum mechanical modelling in biology. For instance, our simulations of the reaction of methanol in zeolites [3] showed that, despite decades of research, none of the existing models of the behaviour of this system was correct. This begs the question If we are unable to correctly predict the behaviour of a few tens of atoms in a very well defined configuration what is the chance that we can genuinely predict the atomic scale behaviour of biological systems?
http://talks.cam.ac.uk/talk/index/5083

Wow, it seems that scientists in any field aren’t very interested in first principles anymore, if they ever were. It sounds like they often can’t even figure out what their first principles are, for each field. On the other hand, if you read about the history of astronomy, you’ll find out how erroneous assumptions of first principles prevented progress in that field for a long time, so first principles must be an important topic:

Science and its methods of investigation did not exist in ancient Greece, so when Plato and Aristotle turned their minds to the structure of the universe, they made use of a process common to their times—reasoning from first principles. A first principle is something that is held to be obviously true. Once a principle is recognized as true, whatever can be logically derived from it must also be true.
...
For 2000 years, the minds of astronomers were shackled by a pair of ideas.
http://books.google.com/books?id=DajpkyXS-NUC&pg=PT60

So is the fact that reasoning from first principles is an ancient process an excuse for ignoring first principles in modern times? But if that’s the case, why do authors keep putting the words “first principles” in their textbook titles? And why aren’t those first principles listed online? Are all textbook descriptions of first principles as woeful as the description in Nilsson’s textbook of AI?

Maybe a more promising approach to the fundamentals of any given scientific field is to define the basic phenomena of that field. Biology is significant here: even though its primary phenomenon of interest is “life”, biologists generally don’t even try to define “life”...

Since there is no unequivocal definition of life, the current understanding is descriptive.
http://en.wikipedia.org/wiki/Life

I chose biology as an example because the same kind of definitional gap exists for “intelligence” in psychology and AI…

While intelligence is one of the most talked about subjects within psychology, there is no standard definition of what exactly constitutes ‘intelligence.’
http://psychology.about.com/od/cognitivepsychology/p/intelligence.htm

So why doesn’t the field of AI have the same stature as the field of biology? Other than the obvious fact that it hasn’t been around as long, I believe the key answer is simple: in biology there is reasonable accord on what existing organisms would be considered “living” and there are plenty of examples to choose from, but in AI there doesn’t exist even a single example of “machine intelligence” that experts would unequivocally consider intelligent, therefore AI lacks even the ability to study examples empirically. How sad. Yet some day, presumably, somebody will produce either an example of machine intelligence, or a convincing theory of intelligence, then suddenly the field will go from basically groundless to generally accepted, and can then legitimately call itself a science. How strange. But such is the nature of the science in which I find myself. Boy, I sure know how to pick ‘em. So it seems that student’s claim was correct, after all, though that claim has even wider applicability in science than he realized. :-(

 

 

 

 
  [ # 1 ]
Mark Atkins - Dec 13, 2012:

... Since the development of such a general theory is still very much a goal, rather than an accomplishment of AI, we limit our attention here to those principles that are relevant to the engineering goal of building intelligent machines…

Much of the time, scientists in all fields are left to design and eventually build the tools needed to test their approximations of “first principles”. In already well developed areas, though, the “first principles” are often taken for granted or “generally accepted as true” within the sphere or the system of interest (be it a chip, test tube, cell, organ, individual, or a population). There are experts working on both ends, and at all spots in between.

Now, to your question: why no first principles for AI? Let us examine another area where nature is being copied: artificial materials (AM).

First principles for materials are based on things like tensile strength, elasticity, and so on, and those principles relate to how well the materials are suited for various applications. First principles also exist at the molecular level: how different elements behave, why, and how that behavior relates to the practical-scale properties just mentioned.

Maybe the nature of “artificial” defines the scope of AI to developing a reflection of what already exists (“intelligence”).

First principles of AI are simply the ability to make something that can simulate:

in·tel·li·gence
1. capacity for learning, reasoning, understanding, and similar forms of mental activity; aptitude in grasping truths, relationships, facts, meanings, etc.
2. manifestation of a high mental capacity: He writes with intelligence and wit.
3. the faculty of understanding.
4. knowledge of an event, circumstance, etc., received or imparted; news; information.
5. the gathering or distribution of information, especially secret information.
http://dictionary.reference.com/browse/intelligence

Thus, we are essentially left to “limit our attention here to those principles that are relevant to the engineering goal of building intelligent machines”, since intelligence itself is already pretty well defined in a working sense, if not fully understood (though we now know many first principles about intelligence like genetic influence, energy utilization, and pathology).

 

 

 
  [ # 2 ]

AI has lots of First Principles:

AI must ultimately replicate the function of neurons in the human brain.

AI must ultimately become massively parallel (“maspar”).

AI must think in at least one natural, human language.

AI must mediate between sensory input and motor output.

AI should aim for consciousness, as in humans.

The First Principles guide and inform all Mentifex AI Minds:

http://www.scn.org/~mentifex/AiMind.html in English;

http://www.scn.org/~mentifex/Dushka.html in Russian;

http://www.scn.org/~mentifex/mindforth.txt in English;

http://www.scn.org/~mentifex/DeKi.txt in German.

 

 
  [ # 3 ]

Arthur,

I don’t think I agree with any of these items as “principles”. I almost agree with mediation between sensory input and motor output, except that a machine that can come up with good answers to difficult questions but yet lacks arms and legs would still be something I would consider intelligent. The other items seem to be trying to copy human brains and their neurons without having a fundamental reason for doing so, other than lack of understanding of how or why those components are doing what they do.

The following is the kind of list I would like to see as a list of first principles. This lengthy excerpt is from cognitive neuroscience, emphasizing the relationship between brain structures and the functions of those structures, so this could be considered a subset of AI. The list doesn’t say “first principles”, but it does repeatedly say “principles”, and the list is quite general and reasonably insightful for that subfield, although for my taste it’s a little too general to be useful.

A theory of cognitive neuroscience would tell us how the brain works. It would be integrated in two senses. It would give an account at all the levels described in our general framework, from a specification of the cognitive systems to an understanding of the cellular mechanisms that support them. It would also describe how these cognitive systems achieve the subjective conscious experience that we call “mind.” Such a theory would be a solution to the philosophical separation between mind and brain that was expressed in the theory of Rene Descartes.
  We do not think that such an integrated theory is at hand, at least not in our hands. Yet, there is a sense in which the images of mind that we have described have rendered the mind-brain separation obsolete, at least at the everyday working level at which science is done. In our laboratories we describe the shift of attention, the visual word form, or the system responsible for target detection at the cognitive level—that is, we describe a sequence of mental operations. We design and execute experiments to see where in the brain these operations occur. We sometimes make correct predictions and sometimes find out new things, but each experimental design goes from the cognitive to the anatomical or the reverse. Experimenters in research centers throughout the world now move effortlessly between the description of mind and the anatomy of brain as though there had not been the centuries of philosophical disputation about whether it is even possible.
  While a comprehensive theory appears remote, we are able to present a few principles that cut across the levels of the framework presented on page 37. Below we outline some of these general principles that we believe apply to all mental operations and cognitive systems. These principles are embodied in our ability to observe the physical images of mental events. We believe that any future integrated theory will need to account for these principles.

  1. Elementary operations are localized in discrete neural areas. In this book we have shown in Chapter 7 that the component operations of selective attention are localized and in Chapter 5 that the component operations of auditory and visual word processing are localized. Scientists are finding additional support for this principle in their research on motor control, object and face recognition, and memory.

  2. Cognitive tasks are performed by networks of widely distributed neural systems. Some forms of visual mental imagery have been shown to activate areas of visual cortex. This is a striking result, but it should not be taken to mean that imagery is “in” the visual cortex. Many other areas are involved as well. The scanning of visual images involves the parietal lobe, and the formation of an image requires the executive attention network. The lateral areas of the left posterior hemisphere are involved in registering the parts of an image, and the pulvinar appears to be a way station used by attention to gain access to extrastriate areas. We have seen in several chapters that cognitive processes involve networks of many cortical and subcortical areas. Most important, the PET and cognitive studies taken together have placed specific computations in some of these areas. It is these specific computations that appear to be localized.

  3. Computations in a network interact by means of “reentrant” processes. Vernon Mountcastle has written about the basic organization of cortical anatomy as follows:

  It is well known from classical neuroanatomy that many of the large entities of the brain are interconnected by extrinsic pathways into complex systems, including massive reentrant circuits.

  Computer simulations based on the coordination of widespread neural systems also rely upon this principle, as described by Gerald Edelman:

  Signaling between neuronal groups occurs via excitatory connections that link cortical areas, usually in a reciprocal fashion. According to the theory of neuronal group selection, selective dynamical links are formed between distant neural groups via reciprocal connections in a process called reentry. Reentrant signaling establishes correlations between cortical maps, within or between different levels of the nervous system.

  Cognitive experiments give good evidence that computations must be performed in a specific order. The ordering of computations is not strictly serial, however. Instead, computations appear to pass information back and forth. It has been clear for many years that the front and back of the brain are linked by anatomical connections leading in both directions. However, we are just beginning to understand the functions of the connections that feed back information from frontal areas to posterior ones. For example, Chapter 6 reviewed evidence that, when a visual search is required, a particular posterior area sensitive to visual features is active shortly after visual input and then becomes active again later as the mind searches for the visual feature. While our methods do not provide evidence on exactly how the visual feature area is amplified during a voluntary search, since for conjunctions the search is contingent upon semantic information from other areas, the search can only take place if that information is fed back to reenter the critical area.

  4. Network operation is under hierarchical control. With the discovery of a separate network of anatomical areas devoted to attention, we can now explore how executive control is established over widely distributed networks. We know that such control systems exist because two cognitive tasks performed simultaneously interfere with each other irrespective of the nature of their constituent computations. That is, attention to one operation reduces the probability that any other operation can receive attention. The form of control in these experiments appears to be largely inhibitory: an executive control system inhibits the one task so that the other is given priority. These findings support the idea that executive control exists and is carried out by attention systems. This form of control is consistent with older findings in neurobiology that cortical systems exercise inhibitory control of spinal reflexes.

  5. Once a computation is activated, the threshold for its reactivation is temporarily reduced. This is the principle that underlies the cognitive phenomenon of priming discussed in Chapter 6. Whenever a code has been active, it becomes easier for a stimulus to reactivate it. For example, the presentation of a word will prime words of similar word form, phonology, or meaning. As another example, motor movements that repeat the timing, force, or direction of the prior movement are executed more efficiently.

  6. Less effort and attention are required to repeat a computation. This principle may seem to be the inverse of the last, but in fact it is a corollary. The efficiency of a computation improves with repetition, and as a result, the overall activity that accompanies the computation is reduced. Blood flow is less, electrical activity is reduced, and there is less interference between the repeated computation and any other activity.

  7. A computation activated from the bottom up by sensory input involves many of the same neural systems as the same computation activated from the top down by attention systems. We showed in Chapters 3 and 4 that directing attention to motion, color, or form activated many of the same extrastriate areas that were active when passively receiving information of the same type. There was some evidence that the size and number of extrastriate areas active during the attention condition were greater than during the comparable passive perception condition. The same principle was found to apply to the formation of visual imagery, discussed in Chapter 4, as well as to the electrical activity recorded during word processing, discussed in Chapter 6. We believe that attention can be intentionally used in experiments to amplify the activity underlying a particular computation. This principle would allow subjects to participate more actively in uncovering the anatomy and circuitry underlying their own mental processing, as happened in the conjunction tasks described in Chapter 6.

  8. Practice in the performance of any task will decrease the number of neural networks necessary to perform it. The idea that the repetition of events leads eventually to their becoming automatic—that is, performed without attention—is well established in psychology. We can now see that this principle also applies at the neural systems level. As discussed in Chapter 5, when a person is asked to generate verbs in response to the same list of nouns over many trials, activity drops away in the anterior cingulate, lateral frontal cortex, posterior cortex, and cerebellum. The generation task, when automated, uses the same pathway as used in reading the noun aloud. This finding suggests not only that the use of attention networks is reduced with practice but also that other pathways are reorganized as the computation becomes habitual. We suggest that this principle, like the others we have outlined, will apply to cognitive tasks in general.

  9. The mind becomes capable of performing behaviors through the development of specific pathways connecting local computations. The studies cited in Chapter 8 have indicated that as neural systems mature in infants, new behaviors are observed, such as inhibition of return and the voluntary disengaging of attention. Neural systems continue to be modified into adulthood. Chapter 5 has shown that the learning of a new skill can lead to marked shifts in neural pathways. Taken together these findings indicate that scientists can study maturation and learning by looking for specific neural changes in the psychological pathways that underlie the behavior.

  10. The symptoms of mental disorders may result from damage to localized computations, to the pathways connecting these computations, or to the attentional networks and neurochemical systems that modulate the computations. In this volume we have examined many neurological and psychiatric illnesses that produce a wide range of surprising symptoms, including problems in forming visual imagery (Chapter 4), in reading (Chapter 5), in orienting (Chapter 7), in maintaining the alert state (Chapter 9), and in executive attentional control (Chapter 9). The seemingly bizarre symptoms of these disorders can often be understood if we view them as created by the interruption of specific computations and pathways between these computations. In this way our knowledge of how mental processes are organized in normal subjects can be used to explain and even predict pathologies.
(“Images of Mind”, Michael J. Posner & Marcus E. Raichle, 1994, pages 242-245)

 

 
  [ # 4 ]

Interesting points, Mark!

To my surprise, I’m finding out that this is a difficult topic because information is very hard to find online in the form of lists of “first principles” in any specific science.  For such a general topic of discussion, you would think that even amateurs in each scientific field would at least list the first principles of their field.

It’s hard to condense complex theory into a few line items and maintain clarity. Heck, one could distill basically every field of physics down to Noether’s 1st theorem, but if I told you,

To every differentiable symmetry generated by local actions, there corresponds a conserved current. [wiki]

would it mean much to you? There’s a reason it takes a book to explain a “first principle”. And like you quoted, in physics a calculation is considered to be based on first principles if it begins with established laws of physics. But depending on the complexity of the system at hand, what is considered a “first principle” in one field may be a derived consequence in another.
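(For the curious, in the simplest classical-mechanics case, where the Lagrangian is strictly invariant under a continuous transformation, the theorem at least has a compact statement. This is the standard textbook form, nothing specific to this thread:

    \[
    \delta q_i = \epsilon\, K_i(q), \quad \delta L = 0
    \;\Longrightarrow\;
    Q = \sum_i \frac{\partial L}{\partial \dot q_i}\, K_i(q)
    \ \text{is conserved}, \quad \frac{dQ}{dt} = 0 .
    \]

Time-translation symmetry gives conservation of energy, spatial translation gives momentum, and rotation gives angular momentum, which is why so much of physics hangs off this one result.)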

For example, all of thermodynamics is often treated as a “first principle” although it requires quantum statistical mechanics to explain.

One consequence of so many derived “first principles” is that physicists like to hang out in those regions of research where those principles break down and can no longer be taken for granted. In condensed matter, we do all we can to steer clear of Fermi liquids, which are “The First Principle” for the behavior of metals.

The reason I’m belaboring the point of “derived” first principles is that any supposed first principle that isn’t 1) the standard model of particle physics or 2) general relativity, is in fact derived at some level.

What these derived principles actually tell us is some simple consequence of a complex system. And the thing about complexity is that you can sometimes find yourself in an environment that can be well-understood without being able to figure out how the heck it came about. (*See example from my field in next post.)

In such an intellectual environment, taking too much for granted is dangerous and can lead to mis-interpreted findings or cause you to miss out on interesting phenomena. This is especially relevant in biology, where one cannot directly simulate chemical reactions on the scale of hundreds of atoms or more. There’s no way to approach the problem analytically from physics first principles, and numerically you’re dealing with so many interactions that all the small errors that accumulate will surely lead you astray.

Your only hope is to postulate models for how processes happen and then try to derive a clever experiment or theoretical model that will put your hypothesis to the test. Unfortunately, this is sometimes impossible because there are too many factors at play to isolate those interactions you guess to be relevant.

So what do you do? Relying on truisms that hold in other circumstances is dangerous when small changes in a system can lead to huge changes in its behavior (see my example post below).

The same is probably true for the nature of intelligence, both natural and artificial. It is the consequence of many processes happening in parallel and interacting in a way that can not be understood readily as a single fundamental, physical theory. And even if such a theory were to be developed, it likely won’t be terribly useful in practice (see my example post below).

 

 
  [ # 5 ]

For example, I study a phenomenon called superconductivity. It was first discovered in 1911 in mercury. At very low temperatures (just slightly warmer than space), mercury displayed the strange habit of suddenly developing zero resistance. It wasn’t simply a perfect conductor. It would also expel any magnetic field that would have otherwise penetrated it.

It’s not just mercury. If you get cold enough, aluminum, lead, niobium, and other metals behave like this.

It took 50 years to figure that one out. In the meanwhile, great minds came up with beautiful phenomenological models that are so useful, they’re still used more than the actual microscopic theory. Mostly what the theory did was provide more fundamental justification for the models that already existed.

The microscopic theory does put a limit on how high the temperature of a superconductor can be (30 K). But the rules for how to find a good superconductor couldn’t be guessed from the theory. In practice, a few proven rules of thumb were:

- Avoid oxygen
- Avoid magnetism
- Avoid heavy atoms

Imagine the surprise of the community when a compound with a superconducting transition at 92 K was found in 1986. A compound, moreover, containing oxygen, heavy atoms, and high magnetic moments. :) The question of what is behind this phenomenon is as old as I am and still unsolved.

We do know pretty much all there is to know about how these guys behave. But even with our best guesses about the underlying mechanism, we still can’t predict what new compounds will superconduct. And whenever they don’t, it turns out to be for complex reasons involving other competing energy scales that may or may not even have anything to do with superconductivity.

I hope this example illustrates my point. (And I just can’t resist the opportunity to wax on about the day job. ;) )

 

 
  [ # 6 ]

So why doesn’t the field of AI have the same stature as the field of biology? Other than the obvious fact that it hasn’t been around as long, I believe the key answer is simple: in biology there is reasonable accord on what existing organisms would be considered “living” and there are plenty of examples to choose from, but in AI there doesn’t exist even a single example of “machine intelligence” that experts would unequivocally consider intelligent, therefore AI lacks even the ability to study examples empirically. How sad.

Very good point.

For many biological systems, including the human brain, you have a system with collective behavior not readily derivable from its constituent parts. I would suggest that even neurology has only become a “serious” science relatively recently, when it shed its psychology roots and turned to experimental studies on the brains of humans and animals, at the macroscopic level as well as the level of single neurons and the chemicals that interact with them.

What AI lacks, as you said, is experimental cases of generally agreed upon intelligence that can then be picked apart to develop an underlying theory or set of “first principles”.

Yet some day, presumably, somebody will produce either an example of machine intelligence, or a convincing theory of intelligence, then suddenly the field will go from basically groundless to generally accepted, and can then legitimately call itself a science.

I predict that the “engineering” work of developing limited domain intelligence—already broadly accepted and quite successful—will continue to develop and generalize until we sort of stumble into a regime where the combinations of many specialized systems produces something curiously like general intelligence. It may even happen multiple times. At that point, we can begin to compare the underlying mechanisms that comprise these systems and start the process of teasing apart just how that combination achieves intelligence.

As an aside, I’ve never been a believer in any sort of “singularity” scenario. I’m rather convinced that many (though not all) of the limitations of our intelligence are also necessary for achieving intelligence in the first place. But perhaps this is the subject of another thread. I only bring it up because I don’t think the creation of a “generally intelligent” computer system will be an explosive event that leads to machines light years beyond us in ability.

Instead I imagine such a system to be rather like a mildly autistic person: potentially quite intelligent, perhaps more or less than yourself, but strangely different in the way they respond (or don’t) to social cues.

 

 
  [ # 7 ]
C R Hunt - Dec 13, 2012:

It’s hard to condense complex theory into a few line items and maintain clarity. Heck, one could distill basically every field of physics down to Noether’s 1st theorem,

So true. (Though I hadn’t heard of Noether’s 1st theorem.) Just take a look at Maxwell’s equations or Schrödinger’s equation and think of all the phenomena they describe, yet their representation is not at all intuitive. It’s also true that the more general the statement, the less clear its applicability to a specific situation.

Still, you can find plenty of sites that have those mentioned equations, but I haven’t found any online list showing the first principles for physics in natural language. I would think that somebody would want to list those principles for the same reason they’d like to list Maxwell’s equations: even if the reader doesn’t understand the details, it’s just too cool to display the bare essence behind the wonderful functioning of our world. Even if there were only Noether’s 1st theorem for all the first principles of physics, I would think that somebody would be eager to draw a diagram of how special cases of that theorem give rise to more specific statements/principles.
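For what it’s worth, the “bare essence” of classical electromagnetism really does fit in four lines. Here they are in the standard differential form (SI units), just to make the point concrete:

    \[
    \nabla \cdot \mathbf{E} = \frac{\rho}{\varepsilon_0}, \qquad
    \nabla \cdot \mathbf{B} = 0, \qquad
    \nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t}, \qquad
    \nabla \times \mathbf{B} = \mu_0 \mathbf{J} + \mu_0 \varepsilon_0 \frac{\partial \mathbf{E}}{\partial t}.
    \]

Four short statements, and everything from radio to optics follows from them, which is exactly why a natural-language list of “first principles” is so much harder to pin down than the equations themselves.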

C R Hunt - Dec 13, 2012:

But depending on the complexity of the system at hand, what is considered a “first principle” in one field may be a derived consequence in another.

Also very true. Chemistry is based on atoms, biology is based on chemistry, music is based on sine waves, environmental science is based on living organisms, and so on. Each field operates at a different scale of size and may or may not be grounded in more fundamental components, yet it is ultimately those fundamental components that determine the behavior of everything, including whether there will be exceptions to observed patterns. Then, as you pointed out, it becomes risky to rely on principles that are only empirical observations, since there can always be important exceptions.

Still, one consequence of all these insights is that I’m going to be motivated to avoid the term “first principles” altogether from now on since it is so misleading. So maybe my rant is ultimately only that textbooks (and aerospace managers!) keep using that misleading term when it is increasingly clear that science no longer falls into the tidy organization that the ancient Greeks believed in.

 

 

 
  [ # 8 ]
C R Hunt - Dec 13, 2012:

As an aside, I’ve never been a believer in any sort of “singularity” scenario. I’m rather convinced that many (though not all) of the limitations of our intelligence are also necessary for achieving intelligence in the first place.

This is most clearly true in the field of biological vision, where biological visual systems achieve rapid performance at the cost of generality. I’m pretty sure that once we know the essence of intelligence and its limitations, we’ll try to put together a really general intelligence, if only out of curiosity: one that won’t be fooled by optical illusions, for example.

C R Hunt - Dec 13, 2012:

But perhaps this is the subject of another thread. I only bring it up because I don’t think the creation of a “generally intelligent” computer system will be an explosive event that leads to machines light years beyond us in ability.

Instead I imagine such a system to be rather like a mildly autistic person: potentially quite intelligent, perhaps more or less than yourself, but strangely different in the way they respond (or don’t) to social cues.

Very good insight. It has been only recently that I’ve begun to get a feel for what I believe a super intelligent machine would be like, and that’s a good description, I think. I disagree in the details, however: I definitely believe in an upcoming Singularity, and I believe that even an autistic, nerdy super intelligence would be quite sufficient to develop new technologies, weapons, theorems, forms of mathematics, and scientific breakthroughs that would drive all progress in the human experience forward at a frenetic pace. I also believe there exists a single, primary underlying mechanism of intelligence that we’ve been missing, and like clues to light quanta or chaos theory, it has been staring us in the face for at least decades, but as you suggested, that’s the topic of (at least) another thread.

The Origin of Quantum Mechanics (feat. Neil Turok)
http://www.youtube.com/watch?v=i1TVZIBj7UA

 

 

 
  [ # 9 ]

CR:

I thought you might like this passage I recently came across, since it confirms your statements (with which I agreed, anyway).

  INTENTIONAL GRANULARITY. Every branch of science and engineering uses models that enhance certain features and ignore others. For calculating the orbit of a satellite, relativity and quantum mechanics are ignored; the influences of the earth, moon, and sun are significant, but the other planets and asteroids may be ignored. In thermodynamics, individual molecules are ignored, and temperature, pressure, entropy, and other quantities depend on averages of sextillions of molecules. In fluid dynamics, the equations are so difficult to evaluate that every application requires major simplifications. For most subsonic flows, even air is treated as an incompressible fluid.
  Science is a loose collection of subfields, each focused on a narrow range of phenomena. But the high degree of specialization causes details outside the primary focus of attention to be ignored, simplified, or approximated. In every subfield, problem solving requires abstractions that select relevant knowledge and approximate irregular features by simpler structures. Reasoning only becomes precise, formal, and deductive after the intractable details have been thrown away: a boulder tumbling down a mountainside may be modeled as a sphere rolling down a cone. But sometimes the seemingly irrelevant details may suddenly become significant. In the early days of aerodynamics, one mathematician “proved” that it was impossible for anything to travel faster than the speed of sound. Unfortunately, the proof depended on someone else’s simplified equations for velocities that were much less than the speed of sound.
(“Knowledge Representation”, John F. Sowa, 2000, page 356)

 

 

 
  [ # 10 ]

But perhaps this is the subject of another thread. I only bring it up because I don’t think the creation of a “generally intelligent” computer system will be an explosive event that leads to machines light years beyond us in ability.

Instead I imagine such a system to be rather like a mildly autistic person: potentially quite intelligent, perhaps more or less than yourself, but strangely different in the way they respond (or don’t) to social cues.

I’ll have to disagree here. The thing about automation of intelligence is that once it’s done on a single computer, it can be done on many computers, and software can be designed to network. You cannot instantly replicate a human mind thousands of times, yet you could instantly replicate millions of artificial minds (or hundreds, or trillions, depending on the eventual required resources).

Moore’s law means that once we pass the intelligence tipping point, it just becomes a matter of time (10 years as a loose rule of thumb) before it exceeds human intelligence by an order of magnitude. Simply by dint of available computing resources.

It won’t be human, and it won’t have the same reference points for social cues, but considering that its intelligence will exceed ours by quite a lot, it would most likely be trivial, and thus trivially replicable, for such an intelligence to model “appropriate” social interaction (all such software will be able to learn, download to its brain, or simulate the “social behavior module”).

The uncanny valley only happens when things are broken, when look, feel, or behavior isn’t quite right. I don’t think an AGI will have that problem, or if it does, it won’t for long.

 

 
  [ # 11 ]

Nonsingularitarians might have this good point: if the intelligence is truly disembodied, it will be highly limited in its ability to accomplish scientific things. That’s the difference between the virtual world and the real world. To make real scientific progress ultimately requires physical actions in the real world, whether a lab technician setting controls on an apparatus, a lab technician noticing an interesting phenomenon that occurred by accident (that’s how penicillin and LEDs were discovered, for example), operation of machines that fabricate computer chips, operation of a supercollider, installation of monitors or probes, or whatever. On the other hand:

(1) humans will be driven by necessity, especially because of competition between themselves, to integrate robots into the scientific cycle that can do such physical actions themselves;
(2) intelligent machines will be able to simulate the real world accurately enough to make some such important discoveries;
(3) even theoretical/virtual progress like new forms of mathematics, solution of very difficult equations, simulation of molecular interactions, exploration of subcases of complicated equations from physics, new encryption algorithms, etc. would be so important that it would have rapid, major impact on science and the world in general;
(4) machines will be able to design themselves, including improved versions;
(5) sheer monitoring, data collection, and analysis (data mining) alone will uncover important patterns and facts.

All those considerations together make me believe that a high enough machine IQ alone will be enough to create a singularity scenario, even without any mobility or end effectors. For near human level IQ, I think CR is correct, but when you start talking several orders of magnitude of speed difference in processing, it’s hard to believe that you’d still be in the same familiar realm of progress. Several orders of magnitude is like the difference between studying molecules versus studying fluid flow: totally different dynamics, principles, etc.

JRowe: Care to create a new thread for this topic digression?

Top 10 Accidental Inventions
http://science.discovery.com/brink/top-ten/accidental-inventions/inventions.html

 

 
  [ # 12 ]
JRowe - Jan 1, 2013:

Moore’s law means that once we pass the intelligence tipping point, it just becomes a matter of time (10 years as a loose rule of thumb) before it exceeds human intelligence by an order of magnitude. Simply by dint of available computing resources.

The only problem there is that Moore’s Law seems to be tapering off, and will continue to do so, even to the point of “stopping”, perhaps, until and unless we (Humans) come up with the means to continue to make smaller, faster computing components. Perhaps the research being done with graphene may do just that, but practical applications of this wondrous material are still several years off, possibly even decades. And, sadly, Moore’s Law only applies to hardware, not software. :)

 

 
  [ # 13 ]
Dave Morton - Jan 1, 2013:

The only problem there is that Moore’s Law seems to be tapering off, and will continue to do so, even to the point of “stopping”, perhaps, until and unless we (Humans) come up with the means to continue to make smaller, faster computing components.

I hadn’t heard of graphene. Thanks for the info, Dave. I think you’re right about Moore’s Law.

Per the following articles, it sounds as though the two main hurdles for continuing to push Moore’s Law are: (1) fabrication technology pushing into the nanotechnology size realm; and (2) heat. Consistent with one of the main themes of this thread, Moore’s Law, like any principle focused on one range of one technology, will eventually reach its limits.

The End of Moore’s Law

Given the state of today’s technology, chips can only get so small. When Intel churned out a 90-nanometer chip called “Prescott” last year, it went from pushing the boundaries of miniaturization to the realm of nanotechnology. Unfortunately for the chipmakers, this level of shrinkage has side effects. Not only was Prescott slower than its predecessor, it generated more heat—the mortal enemy of laptop motherboards. The smaller the chip, the hotter they run. The heat created by so many transistors stuffed onto a tiny sliver of silicon has pushed the thermal conductivity of the copper interconnects to their limit. When they overheat, the interconnects can fail.
http://www.slate.com/articles/technology/technology/2005/12/the_end_of_moores_law.html

Moore’s Law can’t stand the heat

In a recent article for CIO.com, Brill argued that Moore’s Law can no longer be seen as a “good predictor of IT productivity because rising facility costs have fundamentally changed the economics of running a data centre”.

Brill gives an example where one of TUI’s members spent US$22 million buying new blade servers but then had to fork out US$54 million to boost power and cooling capacity. This increased the return on investment figure from US$22 million to US$76 million.
http://www.zdnet.com/moores-law-cant-stand-the-heat-1339275034/

The Collapse of Moore’s Law: Physicist Says It’s Already Happening

Despite Intel’s recent advances with tri-gate processors, Kaku argues the company has merely prolonged the inevitable: the law’s collapse due to heat and leakage issues.

“So there is an ultimate limit set by the laws of thermal dynamics and set by the laws of quantum mechanics as to how much computing power you can do with silicon,” says Kaku, noting “That’s the reason why the age of silicon will eventually come to a close,” and arguing that Moore’s Law could “flatten out completely” by 2022.

Where do we go once Gordon Moore’s axiom runs out of steam? Kaku hypothesizes several options: protein computers, DNA computers, optical computers, quantum computers and molecular computers.

http://techland.time.com/2012/05/01/the-collapse-of-moores-law-physicist-says-its-already-happening/#ixzz2GlZjZ1qY

Intel’s tri-gates mentioned above are taking the 3-D approach as a solution…

Intel Reinvents Transistors Using New 3-D Structure

Scientists have long recognized the benefits of a 3-D structure for sustaining the pace of Moore’s Law as device dimensions become so small that physical laws become barriers to advancement. The key to today’s breakthrough is Intel’s ability to deploy its novel 3-D Tri-Gate transistor design into high-volume manufacturing, ushering in the next era of Moore’s Law and opening the door to a new generation of innovations across a broad spectrum of devices.
http://newsroom.intel.com/community/intel_newsroom/blog/2011/05/04/intel-reinvents-transistors-using-new-3-d-structure

Another solution not mentioned above is reversible computation, considered to be a part of “green computing”, because it reuses electrons instead of turning them into wasted heat. I was surprised to learn about this on my own recently, since my teachers never taught us about this in school, despite its apparent importance. Also, the basic idea is so obvious that even a child can understand it: in an illustration of an AND gate or an OR gate, more lines go in than come out, so seemingly something has to go somewhere. That “something” is electrons, and that “somewhere” is the surrounding silicon, in the form of heat. This gets into the topic of Toffoli gates, Fredkin gates, and Deutsch gates, which interestingly form an intermediate step on the way toward quantum computers, since quantum computers also have reversible gates.
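To make the reversibility idea concrete, here is a tiny sketch of a Toffoli (controlled-controlled-NOT) gate, my own illustration rather than anything from the sources below: three bits in, three bits out, the gate is its own inverse, and with the third input fixed to 0 it computes AND without throwing any information away.

    def toffoli(a, b, c):
        """Toffoli / CCNOT gate: flip target bit c iff both control bits a and b are 1.
        Three bits in, three bits out, so no information (and in principle no heat) need be discarded."""
        return a, b, c ^ (a & b)

    # Reversible: applying the gate twice restores the original inputs.
    for a in (0, 1):
        for b in (0, 1):
            for c in (0, 1):
                assert toffoli(*toffoli(a, b, c)) == (a, b, c)

    # AND as a special case: preset the target to 0 and the third output is a AND b,
    # while the inputs a and b are carried through unchanged.
    print([toffoli(a, b, 0)[2] for a, b in ((0, 0), (0, 1), (1, 0), (1, 1))])   # [0, 0, 0, 1]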

Reversible computing
http://en.wikipedia.org/wiki/Reversible_computing

Reversible computing is ‘the only way’ to survive Intel’s heat
http://www.theregister.co.uk/2003/11/14/reversible_computing_is_the_only/

1.7 Reversible Gates
  When we calculate kT ln 2 for, say, 100° C, we obtain 3.6 x 10^-21 J(oules). This amount may seem negligible, but even with the absolute minimum of one bit of logical information supported by only one bit of physical information (electron state) in a gate, such a bit passes through thousands of gates during each of billions of clock cycles each second. When we consider that the number of transistors in a CPU (central processing unit) may soon reach 1 billion, and that within a few years CPU clock speed may exceed 10 GHz (see p. 135), we can estimate that the maximal number of processed bits per second will exceed 10^19 bits per second per CPU. Then we can easily calculate that unless we overcome kT ln 2 per bit, no cooling could prevent such CPUs from melting. As we have already said, the hardware will have to be changed to replace today’s clock pulses with resonating “swings.” For example, electrons could ballistically oscillate in silicon crystals or carbon nanotubes until they hit a programmed lattice defect, which change change [sic] the paths of some of them in order to perform logical operations. Such oscillations would need only a tiny amount of energy dissipation to keep them swinging with a constant frequency.
  Reversible hardware will also require a new kind of software to implement and use a reversible logic algebra. It seems that development of such a software is feasible. For example, it has already turned out that a comparatively small number of oscillation delays would be required. In today’s computers, a series of delays of relevant clock pulses is always implemented. In Figs. 1.1 and 1.2 we can see that a pulse (square wave) cannot arrive at both a gate and a source at the same instant. We first have to switch the gate on, and only after a delay can we let current through. Each transistor has such a delay built into it. Groups of gates in a computer are incorporated into sophisticated timed circuits. The gates then “calculate” within the time window determined by successive clock ticks. Any voltage-level change that occurs in response to a clock tick must charge or discharge parasitic capacitances associated with a transistor and its surrounding circuitry. The energy cost of (dis)charging a capacitor is C V^2 / 2. In a conventional circuit, most of this energy is dissipated resistively into the environment. In a reversible gate, on the other hand, the energy stored in parasitic capacitances is not dissipated but is returned back to the circuit.
...
  We mentioned above that reversible computer and quantum computer theories have many characteristics in common. The main characteristic is the reversibility of the gates.
(“Quantum Computation and Quantum Communication: Theory and Experiments”, Mladen Pavicic, 2006, pages 14-15, 17)
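The kT ln 2 figure quoted above is easy to check with standard constants; my own quick back-of-the-envelope run reproduces the book’s numbers:

    import math

    k = 1.380649e-23      # Boltzmann constant, J/K
    T = 373.15            # 100 degrees C expressed in kelvin

    # Landauer limit: minimum energy dissipated per irreversibly erased bit.
    e_bit = k * T * math.log(2)
    print(e_bit)          # ~3.6e-21 J, matching the excerpt

    # At the excerpt's estimate of 10^19 processed bits per second per CPU,
    # that lower bound alone is already tens of watts of unavoidable heat.
    print(e_bit * 1e19)   # ~36 W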

So help is on the way, but the transition from the current methods of chip manufacturing to reversible gates or graphene or 3-D or other approaches sounds like it will be difficult.

 

 

 
  [ # 14 ]

Given a working AGI algorithm, expansion becomes a matter of networking, rather than adapting anything fundamental. So when I correlate the expansion of AGI to Moore’s law, it’s not in any expectation that the software itself grows more powerful, just that the software can be replicated many more times as the computing capacity becomes more spacious. If the RAM and CPU capacity means you can run 50 AGI minds, Moore’s law says that capacity will roughly double every 2 years. That means inside of 10 years, instead of running 50 AGI minds in tandem on said computer, you can run 1600. And it’s not so ludicrous to imagine networking a bunch of computers together to achieve some sort of augmentation of the AGI. Therefore, it’s reasonable to assume that if a human intelligence can cope with social interaction, and if one of these AGIs is smarter than a human, then all the AGI need do is dedicate one of its 1600 minds to guiding social interactions.
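(The arithmetic behind that 50-to-1600 jump is just repeated doubling; a quick sketch, using the 2-year doubling period and the 50-mind starting point assumed above:)

    minds = 50                             # assumed starting capacity (from the post)
    doubling_period = 2                    # years per doubling under Moore's law
    years = 10

    doublings = years // doubling_period   # 5 doublings in 10 years
    print(minds * 2 ** doublings)          # 50 * 32 = 1600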

The point, however, is that a real intelligence is not going to have social problems, once you give it any sort of indication that those problems exist, because its capacity to solve them will *far* outstrip our ability to identify problems with its solutions (and maybe the robopocalypse comes about because the AI was told it was rude, and determined the best solution was to kill anything that thought it was rude). The idea that an AGI will be socially awkward is itself a form of anthropomorphism: it doesn’t follow from any of the bedrock assumptions we can make about the form an AGI will take, but represents a sort of built-in human assumption that anything “other” will be inferior in some way.

As far as the original topic: http://en.wikipedia.org/wiki/Universal_approximation_theorem , http://en.wikipedia.org/wiki/Gödel’s_incompleteness_theorems , http://en.wikipedia.org/wiki/Turing_degree , http://en.wikipedia.org/wiki/Turing_completeness

First Principles act as boundaries defining a problem space within the domain defined by mathematical or logical rules. A first principle within the domain of Artificial Intelligence needs to identify what intelligence is.

One such first principle could be this: There’s no such thing as magic. Therefore, the fundamental processing unit of the human brain, the neuron, represents a computably replicable machine. Therefore, if you were to replace each neuron with a functionally identical analogue, until all neurons are replaced, the final construct is identically intelligent.

By functionally identical, I mean a machine that behaves exactly the same as the original neuron. Since the machine could be simulated, and the interaction with other neurons done through external connections, then you can make a second postulate: The entire structure can be simulated.

If the entire structure can be simulated, then the problem becomes one of computation, and we end up sitting across the pub table from Alan Turing, listening intently to his crackpot idea that his theoretical Q-Brained computer can be exactly as smart as any P-Brained human.

My first principle for AI: “There is no magic.”
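As a toy illustration of “the neuron is a computably replicable machine”, here is a minimal leaky integrate-and-fire neuron. It is nowhere near a functionally identical analogue of a real neuron, just the simplest standard model showing that the integrate-and-fire behavior is straightforwardly computable; all parameter values are arbitrary.

    def lif_neuron(inputs, threshold=1.0, leak=0.9, reset=0.0):
        """Leaky integrate-and-fire toy neuron: accumulate input, leak a little
        each time step, and emit a spike when the threshold is crossed."""
        v = reset
        spikes = []
        for current in inputs:
            v = leak * v + current      # integrate input with leak
            if v >= threshold:          # threshold crossing -> spike
                spikes.append(1)
                v = reset               # reset membrane potential after firing
            else:
                spikes.append(0)
        return spikes

    # Sustained input drives the toy neuron to fire; weak input does not.
    print(lif_neuron([0.5, 0.5, 0.5, 0.0, 0.1, 0.1]))   # [0, 0, 1, 0, 0, 0]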

 

 
  [ # 15 ]

By the way, I’m well aware of the fact that we have not perfectly modeled neurons yet, and that there are things going on in the human brain that aren’t as of yet perfectly replicable. That doesn’t matter in terms of a theory of mind, however: the idea is to prove as many small bits and pieces as you can until you gain traction on the larger problem.

 
