Ghosts of the Chinese Room: Semantic Understanding in Neural Networks
Michael Belinsky
Dartmouth College, Fall 2004
Michael Belinsky is a freshman philosophy and economics
double major at Dartmouth College.
His interests include political theory and economic theory. He currently works as a staff columnist
for The Dartmouth daily newspaper and serves as the Secretary General for the
Dartmouth Model United Nations conference.
I. Introduction
The advent of neural networks reinvigorated the
debate over the ability of artificial intelligence (AI) to duplicate human
thought in functionally defined systems.
Although connectionist theorists still debate whether neural networks should be
classified as von Neumann machines, opponents of Strong AI functionalism
have extended their arguments to encompass connectionism. This paper will examine one such
argument, John Searle's Chinese Room, and its applicability to neural
networks. The path from computer
science to artificial intelligence to the Chinese Room will be traced out to
provide some groundwork for the discussion.
As a formal science, computation began addressing
the mind-brain problem with the British mathematician Alan
Turing's 1936 paper on the theory of computation. Turing stated that any mechanical method of calculating a
mathematical function can be duplicated by a Turing machine, which was defined
as any system that has a memory bank for discrete symbols and a way of
accessing and altering those symbols according to a program.[1] Turing's functional definition laid the
groundwork for computation: anything computable by an informal mechanical
process can be formally computed in a Turing machine.[2] An exciting avenue opened up with the
possibility of artificially simulating intelligence in a Turing machine. Thus, AI grew out of computer science,
holding at its core the belief that intelligence is computable. The reasoning behind the belief is that
(a) thinking is information processing, (b) information processing is symbol
manipulation, and (c) symbol manipulation, by computers or any other medium, if
sophisticated enough, can result in intelligent thought. [3] Computation is a functional concept. Functionalism claims that the world can
be explained in terms of its functional relations. Thus, the functional definition of consciousness would
include everything about consciousness that is independent of the being in
which that consciousness occurs. Computation,
therefore, refers only to the input-output processes and not the medium in
which it occurs. In other words,
as long as it achieves the desired inputs and outputs, computation can be
carried out by shuffling sticks of different lengths, with pen and paper, or by
manipulating any sort of formally defined symbols. This implies that since
computation is independent of the medium in which it is implemented, human
thought is independent of the implementing device. Specifically, the mind can be duplicated independently of
the brain.
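The functional definition above can be made concrete with a short sketch. It is only an illustration: the function name run_turing_machine, the rule table INVERT_BITS, and the bit-flipping task are assumptions introduced here and do not come from Turing's paper. The machine consists of nothing more than a memory of discrete symbols (the tape) and a program that reads and alters those symbols one at a time.

```python
def run_turing_machine(program, tape_input, start_state="scan",
                       halt_state="halt", blank="_", max_steps=1000):
    """Run a one-tape Turing machine and return the final tape contents.

    program maps (state, symbol) -> (symbol_to_write, move, next_state),
    where move is "R" (right) or "L" (left).
    """
    tape = dict(enumerate(tape_input))  # the memory bank of discrete symbols
    state, head = start_state, 0
    for _ in range(max_steps):
        if state == halt_state:
            break
        symbol = tape.get(head, blank)          # access a symbol
        write, move, state = program[(state, symbol)]
        tape[head] = write                      # alter it according to the program
        head += 1 if move == "R" else -1
    cells = [tape[i] for i in sorted(tape)]
    return "".join(cells).strip(blank)


# Hypothetical program: flip every bit, then halt at the first blank cell.
INVERT_BITS = {
    ("scan", "0"): ("1", "R", "scan"),
    ("scan", "1"): ("0", "R", "scan"),
    ("scan", "_"): ("_", "R", "halt"),
}

print(run_turing_machine(INVERT_BITS, "1011"))
```

Nothing in the rule table refers to the medium that carries it out. The same table could be followed with pen and paper or with sticks of different lengths, which is the sense in which computation is said to be independent of its implementation.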
From here, AI split into two camps. The first, Strong AI, holds that
artificial intelligence can duplicate human consciousness.[4] The second, Weak AI, holds that while
artificial intelligence machines can appear to think, they can only simulate
conscious thought.[5] In this discussion, the difference
between a duplicate and a simulation merits an explanation. A simulation of a phenomenon does not fully reproduce that
phenomenon.[6] A hurricane simulation will not destroy
your house and a rain simulation will not pour down on it. Similarly, a simulation of thought does
not result in thinking. In this
sense, Weak AI claims that all attempts at artificial intelligence will merely
simulate thought. The job of
Strong AI is not to simulate thought, but rather to duplicate it in formally
defined machines. A duplication of
a phenomenon is a fully functional reproduction of that phenomenon. This paper focuses on Strong AI as the
more controversial of the two theses.
II. Turing Test
Strong AI functionalism proposes an imitation game
to determine a machine's intelligence.
In this imitation game, famously known as the Turing Test, a human
interrogator is connected to a human and a machine via a terminal.[7] In a general case, any arrangement that
facilitates interaction while preserving anonymity is sufficient. The
interrogator poses questions to both, trying to discern the computer from the
human being. A machine that can
fool the interrogator is considered conscious. In other words, this test assumes that if a machine appears intelligent, then it is intelligent.
Various problems with this test exist. An ideal machine with all possible
conversations stored in its memory would be able to pass the Turing Test by
pulling the relevant conversation threads from its database. Such sophisticated automatons arose
throughout the history of computation. IBM created Deep Blue, the chess
number-cruncher that beat chess grandmaster Garry Kasparov. That was hailed by some as a
breakthrough in artificial intelligence.[8] Deep Blue's victory, however, was
nothing more than the victory of a mindless calculator analyzing millions of
positions per second according to predetermined algorithms. The Test is also interesting
because it determines the presence of intelligent thought much like people do
in everyday life. If creatures
with whom we interact behave in an intelligent manner, then we consider them
intelligent. This means that the
criticism 'we can only be aware of our own consciousness, but can only assume
the consciousness in other similarly looking and behaving creatures' also applies
to the Turing Test.
III. Chinese Room Argument
In 1980, philosopher John Searle took issue with
the Turing Test. He proposed a
thought experiment, the Chinese Room, which claimed to show that a machine
could pass the Turing Test without being conscious. He took his cue from Leibniz who, in 1714, wrote of
enlarging the human mind to the size of a mill and entering it only to find
"parts pushing one another, and never anything by which to explain a perception."[9]
Much like Leibniz, Searle entered
the Turing Machine. The Chinese
Room was set up similarly to the Turing Test, except that the questions fed to the
room were in Chinese and Searle put himself, a monolingual English speaker,
inside the machine with a ledger of instructions on how to output responses
based on the input of Chinese symbols.[10] Searle argued that, while his answers
made coherent sense to an external interrogator, he himself did not understand
a word of Chinese and thus had no understanding of the ongoing conversation.[11] Thus, Searle's ability to pass the
Turing Test without understanding the questions posed to him showed that
cognition cannot be measured merely by a system's input and output.
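The procedure Searle follows inside the room can be sketched as a pure lookup. The sketch is a deliberately crude assumption: the ledger entries, the fallback reply, and the function name chinese_room are invented here for illustration, and a ledger able to pass the Turing Test would be vastly larger. What the sketch preserves is the feature the argument relies on, namely that the operator matches the shapes of symbols and never consults their meaning.

```python
# Hypothetical ledger: input symbol strings paired with output symbol strings.
# The operator only needs to recognize shapes, not understand them.
LEDGER = {
    "你好吗": "我很好",            # "How are you?" -> "I am fine"
    "你叫什么名字": "我叫小明",    # "What is your name?" -> "My name is Xiaoming"
}

FALLBACK = "请再说一遍"  # "Please say that again"


def chinese_room(symbols: str) -> str:
    """Return the response the ledger prescribes for the incoming symbols.

    The lookup is purely syntactic: it compares character sequences and
    copies out the matching entry.
    """
    return LEDGER.get(symbols, FALLBACK)


print(chinese_room("你好吗"))
```

To an interrogator who reads Chinese the replies are coherent, yet the function would behave identically if every string were replaced by arbitrary tokens, which is why input and output alone do not settle whether understanding is present.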
Replacing Searle with an automaton that processes
inputs to produce outputs will result in a system that manipulates complex
symbols and yet can lay no claim to cognition. Two Chinese Rooms could in fact
exist, with Searle in one and a bona fide
Chinese speaker in the other. Both
would appear conscious to an external observer, but only one would understand
the conversation by being able to attribute meaning to the Chinese symbols.
This struck at the heart of the Strong AI thesis. From this thought experiment, Searle argued that no system
can gain meaning through symbol manipulation and that, furthermore, (a) syntax
is not sufficient for semantics and (b) functionalism cannot fully account for
intelligence, natural or artificial.
Meaning exists outside the system and Searle, inside the room, only has
access to syntax in the form of Chinese symbols. Regardless of how long Searle
manipulates the symbols based on his ledger of instructions he will not gain an
understanding of Chinese.
The validity of the first argument, that syntax is
not sufficient for semantics, is not crucial to Searle's thesis. Even if syntax led to semantics,
Searle's argument would still hold, since a system could exist that does not
understand Chinese and yet can produce coherent responses. Imagine two systems in the Chinese
Room. One understands Chinese and
another gains meaning through syntax manipulation. In other words, the second system would be in a 'learning
mode' of processing syntax to achieve meaning. Both could operate on the same ledgers of rules in the
Chinese Room and thus produce equivalent responses. Maybe, after many input-output iterations, both systems
would achieve the same level of understanding. At the initial iteration, however, one would not comprehend
the given sentence and yet would produce the same responses as one that fully
understood. Thus, unless a system
is a Chinese speaker and knows Chinese or a system that learns Chinese, it
would fall to the Chinese Room Argument.
Semantic understanding can in fact arise within the
system simply through syntax manipulation. But what do the terms 'semantic understanding' and 'syntax
manipulation' refer to?
Syntax manipulation refers to a computer's processing of words. Words to a computer are just collections
of formally defined symbols (letters) manipulated according to a program. When Searle manipulates the Chinese
symbols in his Chinese Room, according to some rules, he is performing syntax
manipulation. This manipulation
can obviously occur with or without Searle's understanding. Symbols that do hold semantic meaning
are understood in terms of other symbols.[12] This "correlation between two domains"
is called semantic understanding.[13] Thus, one domain of meaning is defined,
recursively, by yet another domain.
A computer, defined here as a Turing machine, can attribute meaning to
base case data (that is, data undefined by another domain) through that data's
relationship to itself. [14] In other words, an appropriately
programmed computer can make inferences based on syntax patterns. If an unlimited number of conversations were fed to a
computer, that computer could make inferences about sentence structure and
grammar and, by observing the conversations, could learn to formulate its own
responses. Granted, Searle's
semantics refers to a complete understanding of Chinese, which is an unlikely
result of syntax inferences. But
the above explanation warrants a rephrasing of Searle's statement. In relation to the Chinese Room
Argument, syntax that holds semantic meaning can give rise to some semantic
understanding, but syntax by itself cannot provide complete semantic
understanding since words and symbols cannot be defined in terms of themselves.
III. Replies to the Chinese Room
Twenty-four years have passed since Searle first
wrote of the Chinese Room and numerous arguments have been raised for and
against its applicability to the Strong AI thesis. Before delving into a discussion of neural networks, this
paper will examine the Systems Reply, the Connectionist Reply, and Searle's
rejoinders to both.
The Systems Reply (Berkeley) states that although
understanding cannot be found in any individual part of the Chinese Room, the
entirety of the room can understand Chinese.[15] Just as neurons are not themselves
conscious and yet a brain consisting of neurons is cognizant, so the whole
of the Chinese Room understands Chinese while no understanding can be ascribed
to any of its parts. Thus, while
an English-speaking automaton inside the Chinese Room quite obviously does not
understand Chinese, it is part of an arrangement that is implementing a
super-system cognizant of the conversation.
Searle's rejoinder was to internalize the Chinese
Room. He memorized the ledger
(program), did all the operations in his head, and even proposed sitting
outside so as to further remove himself from the 'room' construction.[16] As a result, Searle would be able to
produce answers in Chinese, but would still not understand Chinese. The two subsystems would co-exist in
Searle, but one would be Searle speaking only English and the other would be
the Chinese Room speaking only Chinese.
Searle does not understand Chinese and no part of Searle understands
Chinese. Since the system is now
part of Searle, the system does not understand Chinese.
Another objection, the Connectionist Reply (Churchlands), arose from the realm of parallel processing. This reply contends that the Chinese Room only applies to serial processing computers. Modern computers rely on, and human brains are believed to utilize, parallel processing, in which many commands are processed simultaneously. Aside from faster outputs, parallel processing provides a computationally distinct method of processing inputs. The Reply envisions the Chinese Room as a sweatshop of many monolingual automatons, each processing a tiny fraction of the total input. Obviously, no single individual inside the Room can be ascribed consciousness, and only the whole of the Room can be considered conscious. Notice here this Reply's kinship to the Systems Reply. An important difference is that the Systems Reply relies on serial processing while the Connectionist Reply relies on parallel processing. A serial processing computer carries out operations sequentially, while a parallel processing computer can carry out multiple operations at the same time, providing a completely different framework for computation. Also important is that the Connectionist Reply is irreducible. Searle cannot internalize the system in his rejoinder, as he did with the Systems Reply. He cannot attempt to conduct the work of the monolingual automatons inside his head because he would carry out the calculations sequentially, thus going against the spirit of the Connectionist Reply. He also cannot somehow imprint the Connectionist system into his head, because then he could not make the claim that no part of him understands Chinese.
His rejoinder to the Connectionist Reply, in fact,
completely transformed the Chinese Room. Searle created a Chinese Gym in which many
English-speaking men work in parallel to produce outputs.[17] These outputs would make sense to a
Chinese observer and yet "No one in the gym speaks Chinese, and there is no way
for the system as a whole to learn the meanings of any Chinese words."[18]
Therefore, meaningless symbol manipulation within this Chinese Gym could not
give rise to meaningful thought.
Although this rejoinder was used against all connectionist systems, it
applies more to parallel processing computers than to neural networks, for
reasons discussed below.
IV. Neural Networks
After serial and parallel processing, the next step
in computer evolution was neural networks. A neural network, often referred to as a neural net,
consists of a web of decision-nodes with inputs and outputs on either side and
through which information can travel.
Input symbols are run through the net and a learning algorithm assigns
higher values to the nodes on decision-paths which result in a correct
output. The next time those
symbols are run, the information travels through the net and tends to prefer
the higher valued decision paths.
In effect, these neural nets are trained to produce satisfactory
output. They differ from serial or
parallel algorithms by virtue of not being endowed with formal rules at the
implementation-level, meaning the level at which the system outputs responses. Their formal input-output algorithms
exist at the low-level of single nodes, but no formal rules govern the
input-output of the entire system. The low-level formal rules are important to
their definition but uninteresting in this discussion since the claim of
cognition at the node-level will not be made in this paper. Their implementation-level lack of
formal rules means that algorithmic information cannot be extracted from the
system in any useful way, because it cannot be reduced to a specific node
within the net. Information about
the net's decision-making is spread out over all the nodes. This makes the system - its
decision-making and information - irreducible to a single node and
irreproducible in the Chinese Room.
Specifically, we could reproduce its inputs and outputs with a Chinese
room, but we could not reproduce its fine-grained functional structure.
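The contrast between node-level rules and system-level behavior can be illustrated with a very small network. The sketch rests on several assumptions: the weights are picked by hand rather than learned, the task (exclusive-or on two binary inputs) is a stand-in chosen for brevity, and the names step, node, and network are hypothetical. Every node applies the same trivial threshold rule; the exclusive-or behavior is not stored in any one of them.

```python
def step(x):
    """The only formal rule a node follows: fire (1) if its input is positive."""
    return 1 if x > 0 else 0


def node(inputs, weights, bias):
    """A single node: weighted sum of its inputs, then the threshold rule."""
    return step(sum(i * w for i, w in zip(inputs, weights)) + bias)


# Hand-picked (weights, bias) pairs; an assumption made for illustration.
HIDDEN = [([1, 1], -0.5), ([1, 1], -1.5)]
OUTPUT = ([1, -1], -0.5)


def network(x1, x2, hidden=HIDDEN, output=OUTPUT):
    """Two hidden nodes feed one output node."""
    h = [node([x1, x2], w, b) for w, b in hidden]
    w, b = output
    return node(h, w, b)


# Exclusive-or emerges from the arrangement of the three nodes.
table = [network(a, b) for a in (0, 1) for b in (0, 1)]

# Silencing one hidden node changes what the whole net computes,
# even though every remaining node still follows the same rule.
lesioned = [HIDDEN[0], ([0, 0], -1.0)]
lesioned_table = [network(a, b, hidden=lesioned) for a in (0, 1) for b in (0, 1)]

print(table, lesioned_table)
```

Inspecting any single node reveals only a threshold comparison. The input-output behavior of the whole depends on how the nodes are weighted and connected, which is the sense in which the net's information is spread over all of its nodes.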
This conclusion provides two important claims. First, the Systems Reply can be applied
to the neural networks, much like it was applied to the Connectionist Reply. It
would say that the neural net is conscious as a whole even though none of its
parts is individually conscious.
Second, neither Searle's rejoinder to the Systems Reply nor his
rejoinder to the Connectionist Reply applies to neural nets. Searle cannot internalize the neural
net like he internalized the Chinese Room in a rejoinder to the Systems
Reply. Information processing in the
neural net, much like the parallel processing system, is irreducible to single
nodes. Searle cannot use the
Systems rejoinder here for the same reason that he could not use it on the
parallel processing system of the Connectionist Reply. If Searle were to simulate or even
reproduce the neural net inside his head, then he could lay no claim to not
understanding Chinese. A part of Searle would claim to understand Chinese
because he would be implementing the exact
arrangement that claimed to understand Chinese independent of Searle. In the Systems rejoinder, Searle could
memorize the ledger, do all the calculations in his head, and work outside. In the case of parallel
processing, however, he is forced to envision an entire gym of people engaged
in the mindless task of processing inputs according to formal rules, which was
his Connectionist rejoinder.
Using the Connectionist rejoinder Searle can still
claim, correctly, that no part of the neural net system understands
Chinese. Consequently, but
uninterestingly, the low-level units of the neural net, namely the nodes, do
not understand Chinese. They are
performing formal symbol manipulation over formally defined domains. But this is analogous to neurons in
our brains and transistors in computers performing similar operations. And the claim of cognition is not that
neurons are conscious, but that the mind is conscious. Generally, the low-level formal symbol
manipulation gives rise to higher-level understanding in a neural network,
contrary to Searle's claim that syntax is not sufficient for semantics.
Searle's second claim against Connectionism is that
there is no way for the system to learn Chinese. This claim only works for parallel processing computers,
where formal rules exist at the implementation-level. It does not work for neural nets. As will be shown below, a neural net that does not
understand Chinese and has no meaning assigned to Chinese symbols will produce
erroneous answers to Chinese questions.
Only a neural net that understands Chinese will be able to pass the
Turing Test.
Let us examine the evolution of a neural net over
time. A neural net is a dynamic
system whose internal composition, namely the values of its interconnected nodes,
changes over time in response to a learning algorithm that operates over the
entire system.[19] The learning algorithm updates the
values of its nodes after each successive iteration of the input-output
sequence.[20] At inception (t = 0) the system does not understand Chinese. Consequently, it would provide
incorrect responses to questions posed to it in Chinese. Over time (t > 0) the system evolves its low-level
processes by evaluating the feedback on whether its outputs were correct. After a certain time (t → t_learned)
the system begins to produce increasingly correct outputs and eventually (t = t_learned)
this system's responses will be as commonsensical as those of a bona fide Chinese speaker.
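The evolution described above can be sketched with the simplest learning system available, a single perceptron. This is a toy stand-in and not a model of language learning: the task (the logical OR of two binary inputs), the perceptron update rule, and the names train and count_errors are all assumptions introduced here. The sketch shows only the shape of the process: errors at t = 0, feedback-driven adjustment of the node values, and error-free output once training has converged.

```python
# Toy "questions" and their correct "answers": the logical OR of two inputs.
EXAMPLES = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]


def predict(weights, bias, x):
    """Fire (1) if the weighted sum of the inputs is positive."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0


def count_errors(weights, bias):
    """Number of examples the system currently answers incorrectly."""
    return sum(predict(weights, bias, x) != y for x, y in EXAMPLES)


def train(max_epochs=20):
    """Perceptron learning: adjust the weights only when an answer is wrong.

    Returns the error count recorded at t = 0 and after each pass
    through the examples.
    """
    weights, bias = [0, 0], 0
    history = [count_errors(weights, bias)]       # t = 0: nothing learned yet
    for _ in range(max_epochs):
        for x, y in EXAMPLES:
            error = y - predict(weights, bias, x)  # feedback on the output
            weights = [w + error * xi for w, xi in zip(weights, x)]
            bias += error
        history.append(count_errors(weights, bias))
        if history[-1] == 0:                       # t = t_learned
            break
    return history


history = train()
print(history)
```

The error count starts above zero and reaches zero once the weights have settled. Nothing in the learning rule mentions the task itself; the correct behavior is acquired entirely through feedback on earlier outputs.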
This description seems to distance neural nets from
their functional definition.
Strong AI functionalism requires the system to be defined simply in
terms of its inputs and outputs.
Yet the neural net necessitates a specific internal makeup. At the node-level, however, a neural
net is defined in purely functional terms. Nowhere in the net's definition or its learning mode is a
specific node makeup prerequisite to its functional outputs. Thus, at the node level, neural
networks fall under the aegis of Strong
AI functionalism.
These nodes, however, require a specific
arrangement – namely, that of the neural net – in order to gain
semantic understanding through syntax manipulation. So the best description of
a neural network would be a specific arrangement of functionally defined
systems. So in the end the neural
net argument fractures Strong AI functionalism. A neural net is defined both in terms of its low-level
functionalism and its inability to be simulated in the Chinese Room at the
implementation-level.
A neural net, much like the Connectionist Reply,
cannot be duplicated in the Chinese Room.
Its best representation would be many Chinese Rooms, intricately
interconnected and governed by a learning algorithm. Searle would say that such an arrangement can be simulated by
his Chinese Room, arguing from the thesis that any computational operation can
be computed by a Turing Machine. A
simulation, however, could lay no claim to consciousness. A simulation, as discussed previously,
would not duplicate the neural net arrangement in the Chinese Room.
An interesting conclusion is that an arrangement can be constructed out of functionally defined elements that avoids the Chinese Room by virtue of having massive parallelism and a learning algorithm that governs those elements. A simulation of this system in the Chinese Room would not be able to duplicate its massive parallelism. A Chinese gym would miss out on its learning algorithm. A Systems Reply rejoinder would not work since Searle cannot internalize the neural net. Thus, the claim of implementation-level cognition in a neural network could be made. But this does not prove the Strong AI thesis, since neural networks do not completely fall under its definition. Rather, a specifically arranged network of functionally defined systems can avoid the Chinese Room, but no functionally defined system can by itself avoid the Chinese Room Argument.
Now, the currently existing neural networks have not reached the appropriate level of sophistication to make such a claim. This paper refers to an idealized neural network, much like Turing referred to an idealized, "Universal," Turing Machine and Searle referred to a perfect Chinese Room. Working neural networks, however, are already making some breakthroughs. Scientists here at Dartmouth, for example, have trained neural nets to distinguish between authentic and digitally altered artwork.[21] It may be discovered that neural networks cannot reach a level of sophistication that would be sufficient for cognitive processes, maybe due to some technological constraints, but the arguments in this paper would still stand, since they refer to possible, not probable, systems.
References
Copeland, Jack B. "The Curious Case of the Chinese Gym." Synthese. Vol. 95. 1993. 173-86.
Farid, H., S. Lyu, and D. Rockmore. "A Digital Technique for Art Authentication." Proceedings of the National Academy of Sciences. 101(49):17006-17010. Hanover, NH: Dartmouth College, 2004.
Harrison, David J. "The Searle Workout: Connectionism hits the Chinese Gym." Connexions. Issue 1. University of Sheffield: 1997.
Hofstadter, Douglas. Gödel, Escher, Bach: An Eternal Golden Braid. New York: Random House, 1979. 15-17.
Krauthammer, Charles. "BE AFRAID. The Meaning of Deep Blue's Victory." The Weekly Standard. 26 May 1997. <http://wright.chebucto.net/AI.html>.
Leibnitz, Gottfried. Monadology (1714). Los Angeles: University of Southern California, 1930.
Preston, John. "Introduction." In Views into the Chinese Room. Ed. John Preston, Mark Bishop. New York: Oxford UP, 2002. 1-47.
Rapaport, William J. "Understanding Understanding: Syntactic Semantics and Computational Cognition." Buffalo, NY: SUNY Buffalo Department of Computer Science. 1995. 1-4.
Saygin, Ayse Pinar. "The Turing Test." 3 December 2004. <http://cogsci.ucsd.edu/~asaygin/tt/ttest.html#intro>.
Searle, John R. "Minds, Brains, and Programs." The Behavioral and Brain Sciences. Vol. 3. Cambridge University Press: 1980.
Searle, John R. "Is the brain's mind a computer program?" Scientific American. Issue 262(1). 1990.
Smith, Leslie. "An Introduction to Neural Networks." University of Stirling: 1996. <http://www.cs.stir.ac.uk/~lss/NNIntro/InvSlides.html>.
[1] Hofstadter, p. 15
[2] Ibid.
[3] Ibid.
[4] Preston, p. 5
[5] Ibid.
[6] Searle 1990
[7] Saygin, p. 1
[8] Krauthammer, p. 1
[9] Leibnitz 1714
[10] Searle 1980
[11] Ibid.
[12] Rapaport, p. 1
[13] Ibid.
[14] Ibid.
[15] Searle 1980
[16] Ibid.
[17] Searle 1990
[18] Searle 1990, p. 22
[19] Smith 1996
[20] Ibid.
[21] Farid, 2004