Ghosts of the Chinese Room: Semantic Understanding in Neural Networks

Michael Belinsky

Dartmouth College, Fall 2004

Michael Belinsky is a freshman philosophy and economics double major at Dartmouth College.  His interests include political theory and economic theory.  He currently works as a staff columnist for The Dartmouth daily newspaper and serves as the Secretary General for the Dartmouth Model United Nations conference.

I.  Introduction

The advent of neural networks reinvigorated the debate over the ability of artificial intelligence (AI) to duplicate human thought in functionally defined systems.  Although connectionist theorists still dispute whether a neural network should be classified as a von Neumann machine, opponents of Strong AI functionalism have extended their arguments to encompass connectionism.  This paper examines one such argument, John Searle's Chinese Room, and its applicability to neural networks.  To lay some groundwork for the discussion, it first traces the path from computer science to artificial intelligence to the Chinese Room.

As a formal science, computation began to bear on the mind-brain problem with the British mathematician Alan Turing's 1936 paper on the theory of computation.  Turing stated that any mechanical method of calculating a mathematical function can be duplicated by a Turing machine, which was defined as any system that has a memory bank for discrete symbols and a way of accessing and altering those symbols according to a program.[1]  Turing's functional definition laid the groundwork for computation: anything computable by an informal mechanical process can be formally computed by a Turing machine.[2]  An exciting avenue opened up with the possibility of artificially simulating intelligence in a Turing machine.  Thus, AI grew out of computer science, holding at its core the belief that intelligence is computable.  The reasoning behind the belief is that (a) thinking is information processing, (b) information processing is symbol manipulation, and (c) symbol manipulation, by computers or any other medium, if sophisticated enough, can result in intelligent thought.[3]  Computation is a functional concept.  Functionalism claims that the world can be explained in terms of its functional relations.  Thus, the functional definition of consciousness would include everything about consciousness that is independent of the being in which that consciousness occurs.  Computation, therefore, refers only to the input-output processes and not to the medium in which they occur.  In other words, as long as it produces the desired outputs from the given inputs, computation can be carried out by shuffling sticks of different lengths, with pen and paper, or by manipulating any sort of formally defined symbols.  If thinking is computation, this implies that human thought is independent of the implementing device, since computation is independent of the medium in which it is implemented.  Specifically, the mind can be duplicated independently of the brain.
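
As a minimal sketch of the definition above, the following Python fragment models a Turing machine as a memory of discrete symbols (the tape) plus a program that reads and alters them.  The function name run_turing_machine, the rule format, and the bit-flipping program FLIP_BITS are illustrative assumptions, not anything given by Turing or Hofstadter.

```python
def run_turing_machine(program, tape, state="start", blank="_", max_steps=10_000):
    """Run a one-tape Turing machine and return the final tape contents.

    program maps (state, symbol) -> (new_symbol, move, new_state),
    where move is -1 (left), +1 (right) or 0 (stay put).
    The state name "halt" stops the machine (an assumed convention).
    """
    cells = dict(enumerate(tape))  # the "memory bank" of discrete symbols
    head = 0
    steps = 0
    while state != "halt":
        if steps >= max_steps:
            raise RuntimeError("machine did not halt within max_steps")
        symbol = cells.get(head, blank)                      # access a symbol
        new_symbol, move, state = program[(state, symbol)]   # consult the program
        cells[head] = new_symbol                             # alter the symbol
        head += move
        steps += 1
    if not cells:
        return ""
    lo, hi = min(cells), max(cells)
    return "".join(cells.get(i, blank) for i in range(lo, hi + 1)).strip(blank)


# Hypothetical example program: invert every bit, then halt at the first blank.
FLIP_BITS = {
    ("start", "0"): ("1", +1, "start"),
    ("start", "1"): ("0", +1, "start"),
    ("start", "_"): ("_", 0, "halt"),
}

if __name__ == "__main__":
    print(run_turing_machine(FLIP_BITS, "1011"))  # prints 0100
```

The same table of rules could just as well be carried out by hand with pen and paper, which is the sense in which the computation is independent of its medium.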

From here, AI split into two camps.  The first, Strong AI, holds that artificial intelligence can duplicate human consciousness.[4]  The second, Weak AI, holds that while artificially intelligent machines can appear to think, they can only simulate consciousness.[5]  In this discussion, the difference between a duplication and a simulation merits an explanation.  A simulation of a phenomenon does not fully reproduce that phenomenon.[6]  A hurricane simulation will not destroy your house and a rain simulation will not pour down on it.  Similarly, a simulation of thought does not result in thinking.  In this sense, Weak AI claims that all attempts at artificial intelligence will merely simulate thought.  A duplication of a phenomenon, by contrast, is a fully functional reproduction of that phenomenon.  The aim of Strong AI is not to simulate thought, but rather to duplicate it in formally defined machines.  This paper focuses on Strong AI as the more controversial of the two theses.

II.  Turing Test

Strong AI functionalism proposes an imitation game to determine a machine's intelligence.  In this imitation game, famously known as the Turing Test, a human interrogator is connected to a human and a machine via a terminal.[7]  More generally, any arrangement that facilitates interaction while preserving anonymity is sufficient.  The interrogator poses questions to both, trying to tell the machine from the human being.  A machine that can fool the interrogator is considered conscious.  In other words, this test assumes that if a machine appears intelligent, then it is intelligent.

Various problems with this test exist.  An ideal machine with all possible conversations stored in its memory would be able to pass the Turing Test by pulling the relevant conversation threads from its database.  Sophisticated automatons of this kind have arisen throughout the history of computation.  IBM created Deep Blue, the chess number-cruncher that beat chess grandmaster Garry Kasparov, and its victory was hailed by some as a breakthrough in artificial intelligence.[8]  Deep Blue's victory, however, was nothing more than the victory of a mindless calculator analyzing millions of positions per second according to predetermined algorithms.  The Test is also interesting because it determines the presence of intelligent thought much as people do in everyday life.  If creatures with whom we interact behave in an intelligent manner, then we consider them intelligent.  This means that the criticism 'we can be aware only of our own consciousness, and can only assume consciousness in other creatures that look and behave similarly' also applies to the Turing Test.

III.  Chinese Room Argument

In 1980, philosopher John Searle took issue with the Turing Test.  He proposed a thought experiment, the Chinese Room, which claimed to show that a machine could pass the Turing Test without being conscious.  He took his cue from Leibniz who, in 1714, wrote of enlarging the human mind to the size of a mill and entering it only to find "parts pushing one another, and never anything by which to explain a perception."[9]  Much like Leibniz, Searle entered the Turing machine.  The Chinese Room was set up similarly to the Turing Test, except that the questions fed to the room were in Chinese and Searle put himself, a monolingual English speaker, inside the machine with a ledger of instructions on how to output responses based on the input of Chinese symbols.[10]  Searle argued that, while his answers made coherent sense to an external interrogator, he himself did not understand a word of Chinese and thus had no understanding of the ongoing conversation.[11]  Thus, Searle's ability to pass the Turing Test without understanding the questions asked of him showed that cognition cannot be measured merely by a system's input and output.

Replacing Searle with an automaton that processes inputs to produce outputs results in a system that manipulates complex symbols and yet can lay no claim to cognition.  Two Chinese Rooms could in fact exist, with Searle in one and a bona fide Chinese speaker in the other.  Both would appear conscious to an external observer, but only one would understand the conversation by being able to attribute meaning to the Chinese symbols.  This struck at the heart of the Strong AI thesis.  From this thought experiment, Searle argued that no system can gain meaning through symbol manipulation and that, furthermore, (a) syntax is not sufficient for semantics and (b) functionalism cannot fully account for intelligence, natural or artificial.  Meaning exists outside the system, and Searle, inside the room, has access only to syntax in the form of Chinese symbols.  Regardless of how long Searle manipulates the symbols based on his ledger of instructions, he will not gain an understanding of Chinese.
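
Under heavy simplification, the operation of the room can be sketched as a lookup in a rule ledger.  The ledger entries, the fallback reply, and the name chinese_room below are hypothetical, since Searle specifies no particular rules.  The point of the sketch is only that the function matches and returns strings of symbols without ever consulting what they mean.

```python
# Hypothetical ledger: each rule pairs an incoming string of symbols with an
# outgoing one. The English glosses in the comments are for the reader only;
# the program never uses them.
LEDGER = {
    "你好吗？": "我很好，谢谢。",      # "How are you?" -> "I am fine, thanks."
    "你叫什么名字？": "我叫小明。",    # "What is your name?" -> "My name is Xiaoming."
}

# Assumed catch-all rule for input the ledger does not cover.
FALLBACK = "请再说一遍。"              # "Please say that again."


def chinese_room(symbols: str) -> str:
    """Return the output the ledger prescribes for the given input symbols."""
    return LEDGER.get(symbols, FALLBACK)


if __name__ == "__main__":
    print(chinese_room("你好吗？"))
```

To an outside interrogator the reply reads as a sensible answer, yet nothing inside the function distinguishes Chinese characters from any other opaque tokens.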

The validity of the first argument, that syntax is not sufficient for semantics, is not crucial to Searle's thesis.  Even if syntax led to semantics, Searle's argument would still hold, because a system could exist that does not understand Chinese and yet produces coherent responses.  Imagine two systems in the Chinese Room.  One understands Chinese and the other gains meaning through syntax manipulation.  In other words, the second system would be in a 'learning mode' of processing syntax to achieve meaning.  Both could operate on the same ledgers of rules in the Chinese Room and thus produce equivalent responses.  Maybe, after many input-output iterations, both systems would achieve the same level of understanding.  At the initial iteration, however, one would not comprehend the given sentence and yet would produce the same responses as the one that fully understood.  Thus, unless a system either already knows Chinese or is one that learns Chinese, it would fall to the Chinese Room Argument.

Semantic understanding can in fact arise within the system simply through syntax manipulation.  But what do the terms 'semantic understanding' and 'syntax manipulation' refer to?  Syntax manipulation refers to a computer's processing of words.  Words to a computer are just collections of formally defined symbols (letters) manipulated according to a program.  When Searle manipulates the Chinese symbols in his Chinese Room according to some rules, he is performing syntax manipulation.  This manipulation can obviously occur with or without Searle's understanding.  Symbols that do hold semantic meaning are understood in terms of other symbols.[12]  This "correlation between two domains" is called semantic understanding.[13]  Thus, one domain of meaning is defined, recursively, by yet another domain.  A computer, defined here as a Turing machine, can attribute meaning to base case data (that is, data undefined by another domain) through that data's relationship to itself.[14]  In other words, an appropriately programmed computer can make inferences based on syntax patterns.  If an unlimited number of conversations were fed to a computer, it could make inferences about sentence structure and grammar, and by observing the conversations it could learn to formulate its own responses.  Granted, Searle's semantics refers to a complete understanding of Chinese, which is an unlikely result of syntax inferences.  But the above explanation warrants a rephrasing of Searle's statement.  In relation to the Chinese Room Argument, syntax that holds semantic meaning can give rise to some semantic understanding, but syntax by itself cannot provide complete semantic understanding, since words and symbols cannot be defined in terms of themselves.
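
A toy illustration of inference from syntax patterns alone: the sketch below counts which word follows which in a handful of sentences and uses the counts to guess a continuation.  The corpus and the function names are assumptions made for illustration, and the technique (bigram counting) is far simpler than the syntactic semantics Rapaport describes.  It shows only that regularities can be extracted from symbols related to other symbols, with no external domain of meaning.

```python
from collections import Counter, defaultdict


def learn_bigrams(sentences):
    """Count, for every word, which words were observed directly after it."""
    follows = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for current, following in zip(words, words[1:]):
            follows[current][following] += 1
    return follows


def most_likely_next(follows, word):
    """Return the most frequently observed successor of word, or None."""
    if word not in follows:
        return None  # the word was never seen with anything after it
    return follows[word].most_common(1)[0][0]


# Hypothetical miniature "conversation log".
CORPUS = ["the cat sat", "the cat ran", "the dog sat"]

if __name__ == "__main__":
    model = learn_bigrams(CORPUS)
    print(most_likely_next(model, "the"))  # prints cat
```

The model has been told nothing about cats or dogs; whatever it "knows" is a relation among the symbols themselves, which is also why such inference falls short of the complete understanding Searle has in mind.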

IV.  Replies to the Chinese Room

Twenty-four years have passed since Searle first wrote of the Chinese Room, and numerous arguments have been raised for and against its applicability to the Strong AI thesis.  Before delving into a discussion of neural networks, this paper will examine the Systems Reply, the Connectionist Reply, and Searle's rejoinders to both.

The Systems Reply (Berkeley) states that although understanding cannot be found in any individual component of the Chinese Room, the room as a whole can understand Chinese.[15]  Just as neurons are not themselves conscious and yet a brain consisting of neurons is cognizant, so the whole of the Chinese Room understands Chinese while no understanding can be ascribed to any of its parts.  Thus, while an English-speaking automaton inside the Chinese Room quite obviously does not understand Chinese, it is part of an arrangement that implements a super-system cognizant of the conversation.

Searle's rejoinder was to internalize the Chinese Room.  He imagined memorizing the ledger (program), doing all the operations in his head, and even working outdoors so as to further remove himself from the 'room' construction.[16]  As a result, Searle would be able to produce answers in Chinese, but would still not understand Chinese.  Two subsystems would co-exist in Searle: one would be Searle speaking only English and the other would be the Chinese Room speaking only Chinese.  Searle does not understand Chinese and no part of Searle understands Chinese.  Since the system is now part of Searle, the system does not understand Chinese.

Another objection, the Connectionist Reply (Churchlands), arose from the realm of parallel processing.  This reply contends that the Chinese Room only applies to serial processing computers.  Modern computers rely on, and human brains are believed to utilize, parallel processing, in which many commands are processed simultaneously.  Aside from faster outputs, parallel processing provides a computationally distinct method of processing inputs.  The Reply envisions the Chinese Room as a sweatshop of many monolingual automatons, each processing a tiny fraction of the total input.  Obviously, no single individual inside the Room can be ascribed consciousness, and only the whole of the Room can be considered conscious.  Notice here this Reply's kinship to the Systems Reply.  An important difference is that the Systems Reply relies on serial processing while the Connectionist Reply relies on parallel processing.  A serial processing computer carries out operations sequentially, while a parallel processing computer can carry out multiple operations at the same time, providing a completely different framework for computation.  Also important is that the Connectionist Reply is irreducible.  Searle cannot internalize the system, as he did in his rejoinder to the Systems Reply.  He cannot attempt to conduct the work of the monolingual automatons inside his head because he would carry out the calculations sequentially, thus going against the spirit of the Connectionist Reply.  He also cannot somehow imprint the Connectionist system into his head, because then he could not make the claim that no part of him understands Chinese.

His rejoinder to the Connectionist Reply, in fact, completely transformed the Chinese Room.  Searle created a Chinese Gym in which many English-speaking men work in parallel to produce outputs.[17]  These outputs would make sense to a Chinese observer and yet "No one in the gym speaks Chinese, and there is no way for the system as a whole to learn the meanings of any Chinese words."[18]  Therefore, meaningless symbol manipulation within this Chinese Gym could not give rise to meaningful thought.  Although this rejoinder was used against all connectionist systems, it applies more to parallel processing computers than to neural networks, for reasons discussed below.

V.  Neural Networks

After serial and parallel processing, the next step in computer evolution was neural networks.  A neural network, often referred to as a neural net, consists of a web of decision-nodes through which information can travel, with inputs on one side and outputs on the other.  Input symbols are run through the net and a learning algorithm assigns higher values to the nodes on decision-paths that result in a correct output.  The next time those symbols are run, the information travels through the net and tends to prefer the higher-valued decision-paths.  In effect, these neural nets are trained to produce satisfactory output.  They differ from serial or parallel algorithms by virtue of not being endowed with formal rules at the implementation-level, meaning the level at which the system outputs responses.  Their formal input-output algorithms exist at the low level of single nodes, but no formal rules govern the input-output of the entire system.  The low-level formal rules are important to their definition but uninteresting in this discussion, since the claim of cognition at the node-level will not be made in this paper.  Their implementation-level lack of formal rules means that algorithmic information cannot be extracted from the system in any useful way, because it cannot be reduced to a specific node within the net.  Information about the net's decision-making is spread out over all the nodes.  This makes the system - its decision-making and information - irreducible to a single node and irreproducible in the Chinese Room.  Specifically, we could reproduce its inputs and outputs with a Chinese Room, but we could not reproduce its fine-grained functional structure.

This conclusion supports two important claims.  First, the Systems Reply can be applied to neural networks, much as it was applied to the parallel processing system of the Connectionist Reply.  It would say that the neural net is conscious as a whole even though none of its parts is individually conscious.  Second, neither Searle's rejoinder to the Systems Reply nor his rejoinder to the Connectionist Reply applies to neural nets.  Searle cannot internalize the neural net as he internalized the Chinese Room in his rejoinder to the Systems Reply.  Information processing in the neural net, much like that in the parallel processing system, is irreducible to single nodes.  Searle cannot use the Systems rejoinder here for the same reason that he could not use it on the parallel processing system of the Connectionist Reply.  If Searle were to simulate or even reproduce the neural net inside his head, then he could lay no claim to not understanding Chinese.  A part of Searle would claim to understand Chinese because he would be implementing the exact arrangement that claimed to understand Chinese independently of Searle.  In the Systems rejoinder, Searle could memorize the ledger, do all the calculations in his head, and work outside.  In the case of parallel processing, however, he is forced to envision an entire gym of people engaged in the mindless task of processing inputs according to formal rules, which was his Connectionist rejoinder.

Using the Connectionist rejoinder, Searle can still claim, correctly, that no part of the neural net system understands Chinese.  Consequently, but uninterestingly, the low-level units of the neural net, namely the nodes, do not understand Chinese.  They are performing formal symbol manipulation over formally defined domains.  But this is analogous to neurons in our brains and transistors in computers performing similar operations.  And the claim of cognition is not that neurons are conscious, but that the mind is conscious.  Generally, the low-level formal symbol manipulation gives rise to higher-level understanding in a neural network, contrary to Searle's claim that syntax is not sufficient for semantics.

Searle's second claim against Connectionism is that there is no way for the system to learn Chinese.  This claim only works for parallel processing computers, where formal rules exist at the implementation-level.  It does not work for neural nets.  As will be shown below, a neural net that does not understand Chinese and has no meaning assigned to Chinese symbols will produce erroneous answers to Chinese questions.  Only a neural net that understands Chinese will be able to pass the Turing Test.

Let us examine the evolution of a neural net over time.  A neural net is a dynamic system whose internal composition, namely the values of its interconnected nodes, changes over time in response to a learning algorithm that operates over the entire system.[19]  The learning algorithm updates the values of its nodes after each successive iteration of the input-output sequence.[20]  At inception (t = 0) the system does not understand Chinese.  Consequently, it would provide incorrect responses to questions posed to it in Chinese.  Over time (t > 0) the system evolves its low-level processes by evaluating the feedback on whether its outputs were correct.  As time approaches a certain point (t → t_learned) the system begins to produce increasingly correct outputs, and eventually (t = t_learned) this system's responses will be as commonsensical as those of a bona fide Chinese speaker.
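
The time evolution described above can be sketched with the simplest trainable unit, a single perceptron.  This is an assumption-laden toy: the task (the logical AND of two bits) stands in for answering Chinese questions, and the names EXAMPLES, predict, and train_perceptron are illustrative.  It is neither the idealized network this paper has in mind nor the specific learning algorithm of any source cited here.

```python
# Toy stand-in for "questions" and their correct "answers": logical AND.
EXAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]


def predict(weights, bias, inputs):
    """Node-level formal rule: fire (1) if the weighted sum is positive."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0


def train_perceptron(examples, max_epochs=100):
    """Perceptron learning rule; returns weights, bias and per-pass error counts."""
    weights = [0] * len(examples[0][0])  # t = 0: nothing has been learned yet
    bias = 0
    history = []                         # wrong answers in each pass over the examples
    for _ in range(max_epochs):
        errors = 0
        for inputs, target in examples:
            delta = target - predict(weights, bias, inputs)  # feedback on the output
            if delta != 0:
                errors += 1
                # Adjust the stored values in the direction that reduces the error.
                weights = [w + delta * x for w, x in zip(weights, inputs)]
                bias += delta
        history.append(errors)
        if errors == 0:                  # t = t_learned: every answer is correct
            break
    return weights, bias, history


if __name__ == "__main__":
    weights, bias, history = train_perceptron(EXAMPLES)
    print(history)  # error count per pass, ending in 0
```

At no point is a rule such as "answer 1 only when both inputs are 1" written into the program; after training, that behavior exists only as a set of numeric values, which is the sense in which the learned information cannot be read off as a ledger.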

This description seems to distance neural nets from their functional definition.  Strong AI functionalism requires the system to be defined simply in terms of its inputs and outputs.  Yet the neural net necessitates a specific internal makeup.  At the node-level, however, a neural net is defined in purely functional terms.  Nowhere in the net's definition or its learning mode is a specific node makeup prerequisite to its functional outputs.  Thus, at the node-level, neural networks fall under the aegis of Strong AI functionalism.

These nodes, however, require a specific arrangement, namely that of the neural net, in order to gain semantic understanding through syntax manipulation.  The best description of a neural network would therefore be a specific arrangement of functionally defined systems.  In the end, the neural net argument fractures Strong AI functionalism.  A neural net is defined both in terms of its low-level functionalism and its inability to be duplicated in the Chinese Room at the implementation-level.

A neural net, much like the parallel system of the Connectionist Reply, cannot be duplicated in the Chinese Room.  Its best representation would be many Chinese Rooms, intricately interconnected and governed by a learning algorithm.  Searle would say that such an arrangement can be simulated by his Chinese Room, arguing from the thesis that any computational operation can be computed by a Turing machine.  A simulation, however, could lay no claim to consciousness.  As discussed previously, a simulation would not duplicate the neural net arrangement in the Chinese Room.

An interesting conclusion is that an arrangement can be constructed out of functionally defined elements that avoids the Chinese Room by virtue of having massive parallelism and a learning algorithm that governs those elements.  A simulation of this system in the Chinese Room would not be able to duplicate its massive parallelism.  A Chinese Gym would miss out on its learning algorithm.  The rejoinder to the Systems Reply would not work, since Searle cannot internalize the neural net.  Thus, the claim of implementation-level cognition in a neural network could be made.  But this does not prove the Strong AI thesis, since neural networks do not completely fall under its definition.  Rather, a specifically arranged network of functionally defined systems can avoid the Chinese Room, but no functionally defined system can by itself avoid the Chinese Room Argument.

Now, currently existing neural networks have not reached the level of sophistication needed to make such a claim.  This paper refers to an idealized neural network, much as Turing referred to an idealized, "Universal," Turing Machine and Searle referred to a perfect Chinese Room.  Working neural networks, however, are already making some breakthroughs.  Scientists here at Dartmouth, for example, have trained neural nets to discern between authentic and digitally altered artwork.[21]  It may be discovered that neural networks cannot reach a level of sophistication sufficient for cognitive processes, perhaps due to technological constraints, but the arguments in this paper would still stand, since they refer to possible, not probable, systems.


References

Copeland, Jack B. "The Curious Case of the Chinese Gym."  Synthese. Vol. 95.  1993.  173-86.

Farid, H., S. Lyu, and D. Rockmore.  "A Digital Technique for Art Authentication."  Proceedings of the National Academy of Sciences. 101(49):17006-17010.  Hanover, NH: Dartmouth College, 2004.

Harrison, David J. "The Searle Workout: Connectionism hits the Chinese Gym." Connexions. Issue 1. University of Sheffield:  1997.

Hofstadter, Douglas.  Gödel, Escher, Bach: An Eternal Golden Braid.  New York: Random House, 1979.  15-17.

Leibnitz, Gottfried. Monadology (1714). Los Angeles: University of Southern California, 1930.

Krauthammer, Charles.  "BE AFRAID. The Meaning of Deep Blue's Victory." The Weekly Standard.  26 May 1997.  <http://wright.chebucto.net/AI.html>.

Preston, John. "Introduction." In Views into the Chinese Room. Ed. John Preston, Mark Bishop. New York: Oxford UP, 2002. 1-47.

Rapaport, William J. "Understanding Understanding: Syntactic Semantics and Computational Cognition." Buffalo, NY: SUNY Buffalo Department of Computer Science. 1995. 1-4.

Searle, John R. "Is the brain's mind a computer program?" Scientific American. Issue 262(1).  1990.

Searle, John R.  "Minds, Brains, and Programs," The Behavioral and Brain Sciences. Vol. 3. Cambridge University Press: 1980.

Saygin, Ayse Pinar. "The Turing Test."  3 December 2004.  <http://cogsci.ucsd.edu/~asaygin/tt/ttest.html#intro>.

Smith, Leslie.  "An Introduction to Neural Networks."  University of Stirling: 1996.  <http://www.cs.stir.ac.uk/~lss/NNIntro/InvSlides.html>.

 

 



[1] Hofstadter, p. 15

[2] Ibid.

[3] Ibid.

[4] Preston, p. 5

[5] Ibid.

[6] Searle 1990

[7] Saygin, p. 1

[8] Krauthammer, p. 1

[9] Leibnitz 1714

[10] Searle 1980

[11] Ibid.

[12] Rapaport, p. 1

[13] Ibid.

[14] Ibid.

[15] Searle 1980

[16] Ibid.

[17] Searle 1990

[18] Searle 1990, p. 22

[19] Smith 1996

[20] Ibid.

[21] Farid, 2004