In the search for artificial general intelligence (AGI), today’s crop of AI applications shows very little aptitude for many things that any three-year-old can grasp – cause and effect, gravity and spatial relationships, and the passage of time.
This may be largely because, although we have an excellent example of general intelligence in the human brain, AI’s neural networks work in ways that are completely different from the way the brain works. Sure, both have things called neurons that are interconnected by weighted synapses, and in both the state of a neuron affects the states of the neurons to which it is connected.
Artificial general intelligence vs. human general intelligence
From there, though, the similarity stops. Biological neurons don’t appear in orderly layers with orderly connections between one layer and the next in the same way that AI’s neural networks do. Instead, the brain has a tangle of interconnections that we have yet to unravel. The biological neuron accumulates charge from incoming synapses and emits a spike when a threshold is reached. Moreover, the values of biological neurons are binary: either there is a spike, or there isn’t.
AI experts typically counter that difference by setting the activation of AI’s neural networks to a step function so that the output will be a 1 or 0. The problem with that approach, however, is that it ignores the accumulated charge from one cycle to the next. To make up the difference, we would also need to add internal memory, which is not reflected in most neural networks.
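To make the point about accumulated charge concrete, here is a minimal Python sketch. It is not taken from any particular AI framework; the function names, the weight of 0.4, and the threshold of 1.0 are illustrative assumptions. It compares a stateless step-function neuron with a simple integrate-and-fire neuron that carries its charge from one cycle to the next.

```python
def step_neuron(inputs, weight, threshold=1.0):
    """Stateless neuron: each cycle is judged on its own input only."""
    return [1 if weight * x >= threshold else 0 for x in inputs]


def integrate_and_fire(inputs, weight, threshold=1.0):
    """Neuron with internal memory: charge accumulates across cycles."""
    charge = 0.0
    outputs = []
    for x in inputs:
        charge += weight * x          # add this cycle's input to stored charge
        if charge >= threshold:
            outputs.append(1)         # emit a spike
            charge = 0.0              # assumed reset: excess charge is discarded
        else:
            outputs.append(0)
    return outputs


spikes_in = [1] * 6                   # one input spike per cycle
print(step_neuron(spikes_in, 0.4))
print(integrate_and_fire(spikes_in, 0.4))
```

With an input spike on every cycle and a weight of 0.4, the stateless neuron never fires, while the integrate-and-fire neuron fires on every third cycle. The only difference between the two is the `charge` variable, which is the internal memory the step function leaves out.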
Many AI experts also suggest that rather than individual spikes, the output of artificial neurons represents the spiking rate of the neuron, and that rate could vary continuously. In practice, though, the spiking rate cannot vary continuously because biological neurons have a maximum spiking rate of about 250 Hz, and the neural signals cannot be useful below about 20 Hz. In between, high noise levels in the brain limit the number of different rates that can be represented reliably. Further, detecting many unique spiking rates turns out to be well beyond the capabilities of individual neurons, and detecting them in a straightforward way is inordinately slow.
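A rough back-of-the-envelope sketch shows why detecting rates is slow. It assumes, purely for illustration, that rates are told apart by counting spikes in a fixed time window and that two rates are distinguishable only if they produce different whole-number counts. Noise is ignored, so real limits would be worse. The 20 Hz and 250 Hz bounds come from the text above; the function names are hypothetical.

```python
MIN_RATE_HZ = 20     # below this, the text says signals are not useful
MAX_RATE_HZ = 250    # approximate maximum spiking rate from the text


def distinguishable_rates(window_ms):
    """Number of distinct whole spike counts observable in the window."""
    max_count = MAX_RATE_HZ * window_ms // 1000          # floor
    min_count = -(-MIN_RATE_HZ * window_ms // 1000)      # integer ceiling
    return max(0, max_count - min_count + 1)


def window_ms_needed(levels):
    """Smallest window in milliseconds that separates `levels` rates."""
    window_ms = 1
    while distinguishable_rates(window_ms) < levels:
        window_ms += 1
    return window_ms


for window in (10, 100, 1000):
    print(window, "ms ->", distinguishable_rates(window), "rates")
print("window for 256 rates:", window_ms_needed(256), "ms")
```

Under these assumptions a 100-millisecond window separates only 24 rates, and resolving 256 rates (the equivalent of one 8-bit value) takes more than a second of counting.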
Timing is a further problem. Two input signals spiking at the same rate can produce different output spiking rates depending on their relative phase or timing. Both phase and timing are missing in AI, which gives the biological neuron a whole universe of potential functionality that AI neural networks lack.
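The effect of phase can be sketched with a leaky integrate-and-fire neuron. Everything numeric here is an assumption chosen for illustration: two inputs with a weight of 0.6 each, a threshold of 1.0, a charge that halves on every time step, and one spike per input every four steps. The function name is hypothetical.

```python
def count_output_spikes(offset, steps=40, period=4,
                        weight=0.6, threshold=1.0, leak=0.5):
    """Count output spikes when input B lags input A by `offset` steps."""
    charge = 0.0
    fired = 0
    for t in range(steps):
        charge *= leak                                # stored charge leaks away
        a = 1 if t % period == 0 else 0               # input A spike
        b = 1 if (t - offset) % period == 0 else 0    # input B, same rate, shifted
        charge += weight * (a + b)
        if charge >= threshold:
            fired += 1
            charge = 0.0                              # assumed reset after a spike
    return fired


print(count_output_spikes(offset=0))   # inputs in phase
print(count_output_spikes(offset=2))   # inputs half a period apart
```

Both runs deliver exactly the same input rates. In phase, the two spikes arrive together and push the neuron over threshold every period; half a period apart, the charge from one spike leaks away before the other arrives and the neuron never fires. A network that sees only rates cannot tell these two cases apart.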
All of which brings us to synapses. The fundamental problem here is that in a biological context, there is no way for the brain to access the weight of a synapse precisely. Suppose a neuron firing at a fixed rate is connected to another neuron by a synapse of unknown weight, and the output neuron fires at half the rate of the input. That doesn’t mean the synapse has a weight of 0.5. Rather, it means the synapse weight is somewhere in the range of 0.5 up to 1.0, because (taking the firing threshold as 1.0) any weight in that range needs two input spikes to push the output neuron over its threshold.
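The ambiguity is easy to reproduce in a short simulation. This sketch assumes a non-leaky integrate-and-fire neuron with a threshold of 1.0 that resets to zero after each spike; the function name and the sample weights are illustrative.

```python
def output_ratio(weight, n_input_spikes=100, threshold=1.0):
    """Output firing rate as a fraction of the input firing rate."""
    charge = 0.0
    fired = 0
    for _ in range(n_input_spikes):
        charge += weight              # each input spike adds the synapse weight
        if charge >= threshold:
            fired += 1
            charge = 0.0              # assumed reset: excess charge is discarded
    return fired / n_input_spikes


for w in (0.49, 0.5, 0.75, 0.99, 1.0):
    print(w, output_ratio(w))
```

Weights of 0.5, 0.75, and 0.99 all produce an output at exactly half the input rate, so observing the firing rates alone cannot distinguish between them. Only at 1.0 does the output jump to the full input rate, and just below 0.5 it drops to a third.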
Within an artificial brain simulator, it’s possible to simply click on the synapse and read out its weight or click on the neuron and see how much the synapse contributes to the membrane potential. In a biological brain, though, we don’t know how to measure the weight of a synapse and the only way to measure a neuron’s membrane potential is with needle electrodes – not a particularly pleasant or efficient prospect.
So does that mean synapse weights are useless? Far from it. They’re just useless for storing precise values that you might want to read back. A better way to think about it is that a synapse represents a single bit of information, while the weight value represents the confidence that the bit is true. The closer the weight is to 0.0 or 1.0, the higher the confidence, because it will take more spikes to change the weight to the other state.
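One way to picture the bit-plus-confidence idea is the sketch below. It assumes, hypothetically, that the stored bit is read as “weight at or above 0.5” and that every spike working against the current state moves the weight by a fixed step of 0.125. Neither the step size nor the function names come from the text.

```python
STEP = 0.125   # hypothetical weight change caused by one spike


def stored_bit(weight):
    """Read the synapse as a single bit: True if the weight is 0.5 or more."""
    return weight >= 0.5


def spikes_to_flip(weight, step=STEP):
    """Count the opposing spikes needed before the stored bit changes."""
    original = stored_bit(weight)
    direction = -step if original else step   # push toward the other state
    count = 0
    while stored_bit(weight) == original:
        weight += direction
        count += 1
    return count


for w in (0.0, 0.375, 0.625, 1.0):
    print(w, stored_bit(w), spikes_to_flip(w))
```

Weights near 0.0 or 1.0 survive several opposing spikes before the bit flips, while weights near 0.5 flip after one or two. In that sense the distance from the midpoint acts as a confidence level rather than as a precise stored value.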
Finally, there is backpropagation, which cannot possibly be representative of how neurons learn. There is certainly an issue with the huge amount of training data needed to make a deep learning network function, which shows that the neural network is learning by a different mechanism than a child. But there are two even more fundamental reasons. First, a quick look at the backpropagation formula shows that it relies on knowing what the current synapse weights are and on being able to directly modify the weight of any synapse in the network with great precision. This is simply not possible in a biologically plausible world. Second, the method by which weight changes are calculated in backpropagation will not work if the gradient field is not continuously smooth, which it won’t be because of the discrete nature of the neurons and synapse weights in the human brain.
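Both objections can be seen in a toy, single-weight example. This is a simplified sketch rather than a full backpropagation implementation: it estimates the gradient of a squared-error loss numerically for one neuron, once with a smooth sigmoid activation and once with a binary step activation. The function names and all numeric values are assumptions for illustration.

```python
import math


def step(x):
    """Binary activation: spike (1.0) or no spike (0.0)."""
    return 1.0 if x >= 0.0 else 0.0


def sigmoid(x):
    """Smooth activation commonly used in artificial networks."""
    return 1.0 / (1.0 + math.exp(-x))


def loss_gradient(activation, weight, x=1.0, target=1.0,
                  threshold=1.0, eps=1e-4):
    """Central-difference estimate of d(loss)/d(weight) for one neuron."""
    def loss(w):
        return (activation(w * x - threshold) - target) ** 2
    return (loss(weight + eps) - loss(weight - eps)) / (2 * eps)


def gradient_step(weight, gradient, learning_rate=0.5):
    """Gradient descent update: reads the exact weight, writes an exact new one."""
    return weight - learning_rate * gradient


w = 0.3   # the neuron's output is wrong: the target is 1 but it stays silent
print(loss_gradient(sigmoid, w), gradient_step(w, loss_gradient(sigmoid, w)))
print(loss_gradient(step, w), gradient_step(w, loss_gradient(step, w)))
```

With the sigmoid, the gradient is nonzero and the update nudges the weight upward. With the step activation, the gradient is exactly zero even though the output is wrong, so the update leaves the weight unchanged. And in both cases `gradient_step` presumes that the current weight can be read and rewritten with full numeric precision.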
All of this isn’t to say that today’s AI approaches are wrong or don’t work. On the contrary, many AI systems work very well. But the algorithms of today’s AI are different from the way in which the human brain accomplishes similar tasks. This is because the algorithms in artificial neural networks are impossible to implement in neurons. After four decades of experimentation in AI with no emergence of general intelligence, it is time to recognize that new approaches are needed if AGI is ever to emerge.
Bigger = better?
Many AI experts assume that although today’s AI systems clearly fall short, general intelligence will emerge if we can just build them big enough and put in enough training data. Even if this were possible, the intelligence produced would be different from human general intelligence because of the fundamentally different approaches described above. This is very similar to the thinking early on in expert systems development: if we could just program enough use cases, a system would seem to “understand.” But human intelligence flows in the other direction: the complexity of human behavior, with its multiple “use cases,” is the result of the understanding.
We would be better served by looking at deep learning as a powerful and sophisticated statistical tool. With it, we can find relationships within datasets that no human mind could detect, but it cannot spontaneously create understanding. Instead, we need to focus on the capabilities of a three-year-old. While these don’t offer much in the way of immediate commercial applications, the key point is that any three-year-old has the ability to grow into a four-year-old, and so on up to adult-level thinking abilities. Looking at the ways a child learns about the environment and the relationship between themselves and the real world is the key to true understanding, which is fundamental to any future AGI development.