PART 1: TENSORS

(Part 1 of about 10 parts)

(This is necessarily going to be a long thread, spanning many consecutive posts over at least several days. Feel free to post in between my parts, since I may not manage to finish the entire intended thread anyway.)

My recent thread about the U.S. government searching for “Maxwell’s equations of thought” really got me thinking about where one would find a suitable mathematics to represent thought, intelligence, or AI, whether such a math even exists, and why or why not. I’ve spent the past month learning branches of math I’d never understood before, and I’ve been brainstorming extensively on all of this, so I thought I’d post my findings and thoughts.

In 1989 Patricia Churchland suggested three areas of AI research that she considered particularly promising. One such area was tensors, an area of mathematics related to arrays:

(p. 411)

My approach here will be to present three quite different theoretical examples with a view to showing what virtues they have and why they are interesting. Each in its way is highly incomplete; of course each makes simplifications and waves its hands in many important places. Nevertheless, by looking at these approaches sympathetically, while remaining sensitive to their limitations, we may be able to see whether the central motivating ideas are powerful and useful and, most importantly, whether they are experimentally provocative.

Two of the examples originate from within an essentially neurobiological framework. The first focuses on the fundamental problem of sensorimotor control and offers a general framework for

(p. 412)

understanding the computational architecture of nervous systems. The authors of this approach are Andras Pellionisz and Rodolfo Llinas, and owing to the very broad scope and the general systematicity their theory seeks to encompass, I shall discuss it at considerable length.

(p. 417)

The basic mathematical insight was that if the input is construed as a vector in one coordinate system, and if the output is construed as a vector in a different coordinate system, then a tensor is what effects the mapping or transformation from one vector to the other. Which tensor matrix governs the transformation for a given pair of input-output ensembles is an empirical matter determined by the requirements of the reference frames in question. And that matrix is implemented in the connectivity relations obtaining between input arrays and output arrays.

(p. 418)

A tensor is a generalized mathematical function for transforming vectors into other vectors, irrespective of the differences in metric and dimension of the coordinate systems. If the basic functional problem of sensorimotor control is getting from one very different coordinate system to another, then tensorial transformations are just what the nervous system should be doing. Accordingly, the hypothesis is that the connectivity relations between a given input ensemble and its output ensemble are the physical embodiment of a tensor. (“Neurophilosophy: Toward a Unified Science of the Mind-Brain”, Patricia Smith Churchland, 1989)
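To make the quoted idea concrete, here is a minimal NumPy sketch. The frames, dimensions, and weight values are invented purely for illustration (they are not taken from Pellionisz and Llinas): a single matrix of "connection weights" carries a vector in a 2-D sensory frame into a vector in a 3-D motor frame, despite the mismatch in dimension.

```python
import numpy as np

# Hypothetical 2-D "sensory" frame and 3-D "motor" frame: the tensor
# (here simply a 3x2 matrix of connection weights) maps sensory vectors
# to motor vectors despite the difference in dimension.
sensory_input = np.array([0.5, -1.0])        # a vector in sensory coordinates

connection_weights = np.array([[ 1.0, 0.0],  # made-up "connectivity relations";
                               [ 0.0, 2.0],  # which matrix is correct would be
                               [-1.0, 1.0]]) # an empirical matter, per Churchland

motor_output = connection_weights @ sensory_input   # -> [0.5, -2.0, -1.5]
```

In this toy picture, the physical wiring between an input array of cells and an output array of cells plays the role of `connection_weights`.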

Coordinate transformations are certainly critically important, as Marr notes in his book “Vision”. Marr even suggests that our brains automatically assign a visual coordinate system to each one of our limbs (even when viewing somebody else’s limbs), from the arm to the hand to each finger joint, each of which rotates on its own unique axis:

(p. 307)

There are two kinds of object-centered coordinate systems that the 3-D model representation might use. In one, all the component axes of a description, from torso to eyelash, are specified in a common frame based on the axis of the whole shape. The other uses a distributed coordinate system, in which each 3-D model has its own coordinate system. The latter is preferable for two main reasons. First, the spatial relations specified in a 3-D model description are always local to one of its models and should be given in a frame of reference determined by that model for the same reasons that we prefer an object-centered system over a viewer-centered one. To do otherwise would cause information about the relative dispositions of a model’s components to depend on the orientation of the model axis relative to the whole shape. For example, the description of the shape of a horse’s leg would depend on the angle that the leg makes with the torso. Second, in addition to this stability and uniqueness consideration, the representation’s accessibility and modularity is improved if each 3-D model maintains its own coordinate system, because it can then be dealt with as a completely self-contained unit of shape description. (“Vision: A Computational Investigation into the Human Representation and Processing of Visual Information”, David Marr, 1982)

That is a lot of coordinate systems to consider, and at first the suggestion seemed unrealistic to me, but after much thought I decided Marr was probably correct. At the very least our brains certainly convert viewer-centered (= self-centered) coordinate systems (where our own body sits at the origin of a 3D graph) to object-centered coordinate systems (where the center of the viewed object sits at the origin), so our brains clearly already perform such computations without our even being consciously aware of the underlying mathematics. Object-centered coordinate systems are almost required, since the brain needs invariants: without them, our brains would have to update every model of every object we see whenever we changed location, or the objects changed location, or shadows moved in the slightest way. Since such coordinate transformations already occur in our brains and are performed with ease, it’s not difficult to believe that extrapolations of this mechanism exist and are commonly used by the brain.
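The viewer-to-object conversion just described can be sketched in a few lines of NumPy. The numbers here are made up for illustration: the object is placed 5 units ahead of the viewer and rotated 90 degrees about the vertical axis, and a single 4x4 homogeneous matrix encodes that pose. Inverting the pose re-expresses a viewed point in the object's own frame, where its description stays invariant as the viewer moves.

```python
import numpy as np

# Made-up pose: the object sits at (0, 5, 0) in viewer coordinates,
# rotated 90 degrees about the z (vertical) axis. Homogeneous 4x4 form
# lets one matrix carry both the rotation and the translation.
theta = np.pi / 2
object_pose = np.array([[np.cos(theta), -np.sin(theta), 0.0, 0.0],
                        [np.sin(theta),  np.cos(theta), 0.0, 5.0],
                        [0.0,            0.0,           1.0, 0.0],
                        [0.0,            0.0,           0.0, 1.0]])

# A point on the object's surface, as seen in viewer-centered coordinates:
p_viewer = np.array([1.0, 5.0, 0.0, 1.0])

# Inverting the pose converts the point into object-centered coordinates;
# that description no longer changes when the viewer changes location.
p_object = np.linalg.inv(object_pose) @ p_viewer   # -> [0, -1, 0, 1]
```

If the viewer walks around the object, `object_pose` changes but `p_object` does not, which is exactly the invariance the brain seems to need.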

Unfortunately, speaking for myself, I just don’t understand the gist of tensors. Twice in my life I’ve made a serious effort to learn them, and both times I plowed through many pages of linear algebra, most of which I already knew, then finally gave up in boredom because I didn’t see where it was all headed. Are tensors just linear algebra with more flexible notation? If so, why doesn’t somebody just say so, and then give a tutorial on the differences and why those differences matter, instead of a repeat course in linear algebra? Maybe I’ve just been unlucky enough to get poor or unsuitable introductory material. Admittedly there exist a lot of poor tutorials out there, especially on topics in mathematics.
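For what it’s worth, here is my own partial, non-authoritative answer to that question, sketched in NumPy (the metric matrix is invented for illustration): one thing index notation buys you beyond plain matrix algebra is uniform contraction over any number of axes, for example computing a squared length under a non-Euclidean metric as the double contraction g_ij v^i v^j. The same notation extends unchanged to objects with three or more indices, where matrix notation runs out.

```python
import numpy as np

# A made-up metric tensor for a non-orthonormal 2-D frame: it redefines
# what "length" means in that frame.
g = np.array([[2.0, 1.0],
              [1.0, 2.0]])
v = np.array([1.0, 1.0])

# Squared length of v under this metric: the contraction g_ij v^i v^j.
# einsum's subscript string is essentially classical index notation,
# and works identically however many indices are involved.
length_sq = np.einsum('ij,i,j->', g, v, v)   # -> 6.0
```

Under the ordinary Euclidean metric (g = identity) the same contraction would give 2.0; the tensor machinery is what lets the formula stay the same while the coordinate system changes.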

But at least Patricia Churchland admitted that she also had great trouble understanding tensor theory, at least as far as it related to AI. She admitted she finally had to devise a cartoon of a crab trying to grab an apple in order to get a grip on a specific example of how tensors work…

(p. 419)

The phenomenological scenario here seems to be confusion and incomprehension in the first phase, followed, as understanding flowers, by a gathering sense of obviousness adhering to the general principles. The detailed hypotheses are, evidently, a further matter. My own understanding here began to find its feet as Paul M. Churchland and I constructed a cartoon story of a highly simplified creature who faces a sensorimotor control problem of the utmost simplicity. In what follows I shall use the cartoon story in trying to outline the Pellionisz-Llinas picture of the brain’s geometrical problems and its geometrical solutions. With that in hand, we shall return to the nervous systems and to the cerebellum in particular. (“Neurophilosophy: Toward a Unified Science of the Mind-Brain”, Patricia Smith Churchland, 1989)

I still haven’t dug into the details enough to understand her cartoon example, but nevertheless I thought she did an excellent job of presenting her simplified example, especially with the accompanying 3D coordinate systems and bent lines showing the transformations involved.

My opinion of tensors relative to AI is mixed. While tensors may be ideal for matrix operations, especially since the cerebellum’s layout already looks almost like a 3D matrix, I have some doubts about whether a matrix can provide pattern recognition flexible enough to match what humans have, especially for vision. Images, after all, aren’t inherently mathematical or inherently suited to display in a rectangular grid: images are continuous objects that move in a continuous manner. Images in general are very difficult to describe accurately in *any* manner, especially via matrix representation. I have a hard time believing that the brain contains pixelized screens with their pixels discretely winking on or off as the edge of an image passes by.

(p. 412)

The place to start, then, is where the theory started: the cerebellum. With only some exaggeration it can be said that almost everything one would want to know about the micro-organization of the cerebellum is known. For neuroanatomists the cerebellum has been something of a dream of experimental approachability, because it has a limited number of neural types (five, plus two incoming fibers), each one morphologically distinctive and each one positioned and connected in a characteristic and highly regimented manner (figure 2.4). The output of the cerebellar cortex is the exclusive job of just one type of cell, the Purkinje cell (of which more anon), and the input is supplied by just two, very different cell systems, the mossy fibers and the climbing fibers. This investigable organization has made it possible to determine the electrophysiological properties of each distinct class of neuron and to study in detail the nature of the Purkinje output relative to the mossy fiber-climbing fiber input. The neuronal population

(p. 413)

in the cerebellum is huge—something on the order of 10^10 neurons—and there is at least another order of magnitude in synaptic connections. Nonetheless, basic structural knowledge of the cerebellum has made it possible to construct a schematic wiring diagram that illustrates the pathways and connectivity patterns of the participating cells (figure 10.1). The first point, then, is that a great deal is understood at the level of micro-organization. (“Neurophilosophy: Toward a Unified Science of the Mind-Brain”, Patricia Smith Churchland, 1989)

I’m still keeping an open mind about arrays and tensors possibly being a universal representation method in the brain, but mostly I don’t believe that’s a promising direction. Engineers, mathematicians, and computer scientists, especially people modeling artificial neural networks (ANNs), have been using arrays for representation for years, and nothing particularly promising or novel seems to have arisen from such representations or from ANNs. Engineering complaints such as wasted storage space for sparse matrices seem to be common, and Simon Haykin mentioned the need for nonlinear analog components in ANNs. Most books on AI don’t even mention arrays in conjunction with knowledge representation (KR). And whereas arrays are good at describing how coordinates transform into one another, an underlying problem is that such an operation must be performed on *every point* of a visual object, and visual objects contain a *lot* of points: in theory, not just infinitely many, but *uncountably* infinitely many. Math applicable to computers doesn’t handle uncountable infinity very well.
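The sparse-storage complaint mentioned above is easy to illustrate in miniature (the sizes below are made up, and this dictionary-of-keys scheme is only a sketch, not how real neural-network libraries store weights): a dense 1000x1000 connectivity matrix reserves a million cells even if only a hundred connections actually exist.

```python
import random

random.seed(0)
n = 1000
dense_entries = n * n          # a dense array stores all 1,000,000 values

# Store only the nonzero connections, as {(row, col): weight}.
sparse = {(random.randrange(n), random.randrange(n)): 1.0
          for _ in range(100)}

# The dense form spends roughly 99.99% of its storage on zeros;
# the sparse form stores only the connections that exist.
waste_ratio = dense_entries / len(sparse)
```

Real sparse-matrix libraries use more sophisticated layouts (compressed rows, for instance), but the underlying trade-off is the same one engineers complain about.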