Richard E. Reed
think? This provocative article says they can.
(Article appeared in Peek65 magazine, March 1985.)
In the misty winter dawn shining through the tawdry ghetto windows of his lab the scientist (obviously mad) stoops over the tangle of wires and boards on the operating table. His fingers can be seen hovering above the petrified caterpillar shapes, clicking toggle switches and closing momentary contacts. In the early morning stillness his feverish voice echoes, pontificating like God, as the customary lights (albeit tiny) flash off and on in sync to his actions.
"Clear the experience banks!" (initialize memory).
"Start his character records!" (reset the counters).
"Breathe in the life force!!" (turn him on, dummy).
There is a flurry of activity among the lights. The haggard and unkempt form arises in triumph. "He works! He's working! Ha ha ha ha ha! At last he's working! SAM is born!" (To nobody) "Look at him. He works!!!"
Nine days before, in Fresno, California, I had locked myself in the run-down room on motel row, armed with an Ohio Scientific bare board, a couple of breadboards from Radio Shack, some LEDs, switches, a bag of ICs, and the intention to create SAM. I added some junk food and drinks, and for a week and a half I lost track of time while I attempted to put my ideas in silicon and electric fields.
I confess, I was an OSI addict from the word go. I bought one of their first bare boards and ICs to populate it. I had already acquired my 6502 from MOS Technology as soon as they announced it for the then unbelievably low $25. I learned to program its machine language while waiting for the chip to arrive, and cut my teeth designing the first untested code for SAM.
I am an unmitigated AI (artificial intelligence) freak. We are not talking II (imitation idiocy) where some program is made to "appear" intelligent as in Weizenbaum's ELIZA, or even UI (useful imitation) where a program emulates a useful mental function to execute useful tasks as in image recognition. We're talking Frankenstein: trying to create a viable being which could be set loose in a compatible environment and learn how to manage for itself. And now that my own computer (a CPU, glue chips, and 4K of RAM) was here, the work began in earnest.
SAM was (and is) an ambitious project. As true AI he must:
1. Begin with no knowledge or application skill.
And for pragmatic reasons he must:
2. Have only "house-keeping" software.
3. Behave randomly at first.
4. Learn how to survive in his environment.
5. Develop habits:
a. Virtues (with good survival value)
b. Vices (with moderately poor value)
6. Behave unpredictably (except that he will predictably
7. Develop differently each time he is run.
8. Be able to forget.
9. Be expandable.
10. Install without modification in a robot.
11. Fit in a page of memory (256 bytes)
12. Run on my limited 4K machine.
When he finally worked (it's amazing how code written without the advantage of testing retains bugs in spite of every effort to weed them out) SAM fulfilled every specification. For reasons of economy his sensory input consisted of a free-running Schottky TTL based counter set up as a random number generator. Its output was byte wide and latchable. The same random numbers were used in several decision processes, so the costs were minimal. As far as SAM was concerned the sensing could have come from anything. With this configuration SAM could only perceive 256 conditions. The program restrained him to 8 reactions. The outcome of the condition/reaction combinations affected "good" and "bad" counters by which progress could be tracked.
How SAM works
SAM is environmentally oriented and works on a behavioral model of intelligence. Nothing happens unless there is an environment to relate to. This environment can be internal or external. With SAM it was the latter. The first step in any process began with sensing the environment. This meant getting a number from the random generator. Eight 256 byte pages of the RAM were allocated to his memory, so that one byte could be used to represent each environment/reaction combination. In this byte we stored the usefulness value of that combination and the number of times it had been accessed (up to 7).
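The memory layout just described can be sketched in a few lines of Python. This is only an illustration, not the original 6502 code. The article says each byte holds a usefulness value and a use count of up to 7, and elsewhere that the value fits in the lower 5 bits, so the sketch assumes the count occupies the upper 3 bits. The names `pack`, `unpack`, and `memory` are hypothetical.

```python
# One byte per environment/reaction pair.
# Assumption: usefulness value in the low 5 bits (0-31),
# use count in the high 3 bits (0-7).
NUM_REACTIONS = 8       # eight pages, one per reaction
NUM_ENVIRONMENTS = 256  # one byte per perceivable condition

def pack(value, uses):
    """Combine a usefulness value and a use count into one byte."""
    if not 0 <= value <= 31:
        raise ValueError("value must fit in 5 bits")
    if not 0 <= uses <= 7:
        raise ValueError("use count must fit in 3 bits")
    return (uses << 5) | value

def unpack(byte):
    """Split a memory byte back into (value, uses)."""
    return byte & 0x1F, byte >> 5

# Eight 256-byte pages: memory[reaction][environment]
memory = [bytearray(NUM_ENVIRONMENTS) for _ in range(NUM_REACTIONS)]
memory[3][200] = pack(24, 2)  # reaction 3 in environment 200: value 24, used twice
```

Under this assumed layout a single byte read gives SAM both how good a reaction was and how often he has tried it.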
After reading an "environment" he examined the 8 associated bytes and selected the best reaction value. If that value was zero he read the random number generator again and masked for the 3 low order bits to determine his action. If a value was present he read the random number generator again and masked for the 5 low order bits, compared the result with the best reaction value, and if this result was higher he got his action from the random number generator. If this result was lower he repeated the previous best action. This loop automatically favored repeating the best behaviors and trying something new on the worst.
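The selection rule above might look roughly like this in Python. It is a sketch under stated assumptions, not SAM's actual routine. The function name `choose_reaction` is hypothetical, the random source stands in for the hardware counter, a fresh random byte is read for the new action, and a tie between the coin flip and the stored value is treated as "repeat" (the article does not say what happens on a tie).

```python
import random

def choose_reaction(values, rng=random):
    """Pick a reaction (0-7) given the 8 stored values (0-31) for one environment."""
    # Find the reaction with the highest stored usefulness value.
    best = max(range(len(values)), key=lambda r: values[r])
    if values[best] == 0:
        # Nothing learned here yet: act at random (3 low-order bits).
        return rng.getrandbits(8) & 0x07
    # Flip a 32-sided coin (5 low-order bits) against the best value.
    flip = rng.getrandbits(8) & 0x1F
    if flip > values[best]:
        # Flip beat the stored value: try something new.
        return rng.getrandbits(8) & 0x07
    # Otherwise repeat the best known reaction (tie assumed to repeat).
    return best
```

A reaction valued at 31 is always repeated under this sketch, while one valued at 1 is nearly always replaced by a random try.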
The exception to this rule was if the reaction had been done 7 times. In that case it was repeated automatically. It had become fixed habit. This did not happen as easily as it might seem because of the "sieve". This subroutine sequentially examined a single memory location each time SAM performed a reaction. If that location had not been used 7 times it was cleared and SAM forgot he had ever tried it. A variation on the procedure used in later SAMs decremented fixed memory by 1 so that occasionally even established habits could be changed. If the sieve happened to get back around to that location before it was tried again even a fixed habit could then be totally forgotten.
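One step of the original sieve can be sketched as follows. This is an illustration only: the names `sieve_step` and `FIXED_HABIT` are hypothetical, and values and use counts are kept in two separate lists for readability rather than packed into one byte as in the real SAM.

```python
FIXED_HABIT = 7  # a reaction used this many times is a fixed habit

def sieve_step(values, uses, pointer):
    """Examine one memory location and return the next pointer position.

    Anything that has not yet become a fixed habit is forgotten.
    """
    if uses[pointer] < FIXED_HABIT:
        values[pointer] = 0  # forget how useful it was
        uses[pointer] = 0    # forget that it was ever tried
    # Move on to the next location, wrapping around at the end of memory.
    return (pointer + 1) % len(values)

# Example: three locations, the middle one is a fixed habit.
values = [10, 20, 30]
uses = [3, 7, 0]
pointer = 0
for _ in range(3):
    pointer = sieve_step(values, uses, pointer)
# Only the fixed habit survives a full pass of the sieve.
```

Because the sieve visits one location per reaction, a combination has to be reinforced 7 times before the sieve comes back around, which is why habits form more slowly than one might expect.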
SAM's reactions in the original version were (again just because of economics) purely mathematical. He performed a function on the environment and a value stored for each of the 8 reactions. The mathematical result of this function was an 8-bit number which we arbitrarily divided into 4-bit nybbles, giving each a value from 0 to 15. We let one represent the "good" and the other the "bad" effect of that reaction on the particular environment.
SAM had two latched adders fitted with 7-segment LED readouts. The values of the good and bad nybbles were added to their respective adders and the results displayed. The idea of "success" for SAM was to have increasingly greater divergence between the good and bad count (the good count being highest). SAM could have worked just as well having two goals both of which accelerated at the fastest possible rate, but under the original implementation it would be impossible to measure his progress. The good/bad arrangement allowed us to gauge progress at a glance. The overall reaction value was obtained by subtracting the evil from 16 and adding the good. Using 16 skewed the evil result by one, but it let us have a total between 0 and 31 to fit in the lower 5 bits.
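The scoring arithmetic can be sketched as follows. The article does not say which nybble was "good" and which was "bad", so the sketch assumes the high nybble is good. The function names are hypothetical.

```python
def split_nybbles(result):
    """Split an 8-bit result into (good, bad), each 0-15.

    Assumption: high nybble is "good", low nybble is "bad".
    """
    good = (result >> 4) & 0x0F
    bad = result & 0x0F
    return good, bad

def reaction_value(good, bad):
    """Overall value stored in memory: 16 minus the bad, plus the good."""
    return 16 - bad + good

good, bad = split_nybbles(0xA3)    # good = 10, bad = 3
value = reaction_value(good, bad)  # 16 - 3 + 10 = 23
```

With nybbles of 0 to 15 this formula yields 1 (all bad) through 31 (all good), so the result always fits in 5 bits.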
Many variations are possible on the basic SAM. He can block reuse of any reaction value lower than, say, 8. He could be forced to repeat any reaction with a value greater than, say, 24. Such changes make him a more aggressive learner, but they often prevent him from learning the best reactions, and his "character" is less interesting psychologically in terms of the good and bad habits he develops. As originally conceived SAM will achieve a good score of 60,000 after 5,000 moves and be performing at a 5 to 1 ratio at that time.
SAM's design has proven to be very flexible. A new environment can be added with 2K more of RAM and a little overhead programming to switch between the environments. A fully loaded 64K 6502 can have 30 different environments which can all be maintained in real time by SAM. Alternatively the complexity of SAM's world can be changed by allocating larger blocks of RAM to each environment and increasing the size of environment/reaction potential correspondingly.
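The memory arithmetic behind these figures is simple enough to check. The sketch below assumes each environment is a contiguous 2K block of eight 256-byte pages, one page per reaction. That layout and the name `memory_offset` are assumptions for illustration; the article gives only the sizes.

```python
PAGE_SIZE = 256          # one byte per perceivable condition
PAGES_PER_ENV = 8        # one page per reaction
ENV_SIZE = PAGE_SIZE * PAGES_PER_ENV  # 2048 bytes, i.e. 2K per environment

def memory_offset(environment, reaction, condition):
    """Offset of one memory byte, assuming contiguous 2K environment blocks."""
    return environment * ENV_SIZE + reaction * PAGE_SIZE + condition

total_for_30 = 30 * ENV_SIZE          # 61440 bytes, i.e. 60K
left_over = 64 * 1024 - total_for_30  # 4096 bytes remain in a 64K address space
```

Thirty environments therefore occupy 60K of a 64K address space, leaving 4K for everything else.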
Installation within a robot simplifies SAM because there is no arbitrary number processing. The reaction value is derived from the environment which is read by sensors. Reactions involve actuation of drive motors. Methods must be provided to report results to SAM from the environment since his senses are rudimentary in economical implementations. In an experiment described in detail in the next article a plant-watering SAM received "credit" for watering plants and if his watering can got low he received credit for going to a filling station.
Your SAM version
While SAM was originally written in machine language, and his current implementation is proprietary information of Reed Research, we are providing you with a BASIC simulation which works almost identically to the original. We offer this format because it is more readily understood by users, it affords greater ease of change, and gives more extensive reporting of exactly what has occurred.
Lines 10000 to 10060 define variables and set up the output device, SAM's eight reaction values, and the total value he is expected to achieve before the program terminates.
Lines 100 to 110 are the random number generator. N% is the current event counter and every 1000 moves it selects a new random seed. x% contains the maximum value to be returned and z% contains the random number returned.
Line 200 gets the reaction value and use counter in j% and L% respectively. H% is a flag to show prior usage.
The subroutine at 400 to 430 operates the memory sieve. Line 400 has been modified to clear only the use counter rather than the whole byte so that reports generated at the end can show more information. This fact has not been used to modify SAM's operation. Line 410 cycles the low byte of the use counter, and when necessary the high byte. 420 returns it all to the first byte after the end has been reached. Line 415 has been added to speed up the sieve a little.
Lines 500 and 510 increment the use counter on the current reaction if it is less than the maximum.
The main program begins in 5000 which picks up an environment from the random number generator. Subroutine calls, if your BASIC permits, are named variables because this speeds execution. Subroutine D (100 described above) gets the random number. F% is the environment variable. 5010 to 5030 look at all possible reactions to see if any were used and return the one with the highest value. In 5032 we execute a 32-sided coin flip (ignored if the reaction is a fixed habit). 5040 sets some optional skip tests. J% * 1.452 - 8 expands the acceptance/rejection criteria by always avoiding the very bad reactions and always using the very good. The test using N2 / 50 selects a random move every so often no matter what so that SAM will always learn a few new things from time to time. 5050 and 5060 get new reactions when context warrants it.
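The effect of the J% * 1.452 - 8 scaling can be illustrated in Python. This is a sketch, not a translation of line 5040: the listing is not reproduced here, so the direction of the comparison is assumed from the earlier description (a coin flip higher than the stored value means "try something new"). The function names are hypothetical.

```python
def scaled_threshold(j):
    """Stretch a reaction value 0-31 to roughly -8 through 37."""
    return j * 1.452 - 8

def repeats_best(j, flip):
    """True if the best reaction is repeated for coin flip 0-31.

    Assumption: the flip is compared against the scaled value,
    and a flip above it means a new random reaction is chosen.
    """
    return flip <= scaled_threshold(j)

# Values that are never repeated, whatever the flip:
always_avoided = [j for j in range(32)
                  if not any(repeats_best(j, f) for f in range(32))]
# Values that are always repeated, whatever the flip:
always_used = [j for j in range(32)
               if all(repeats_best(j, f) for f in range(32))]
```

Under these assumptions reaction values of 5 or less are never repeated and values of 27 or more always are, with the 32-sided coin deciding everything in between.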
Lines 5100 to 5160 perform the mathematical calculations involved in a reaction, update the memory, print the results, and get another environment. Line 5025 is a conditional return that lets this sub-routine be used during the printing of reports after the run is finished.
Study this SAM program until you thoroughly understand what is going on. Run it and let it report to the screen at first or a printer later if you want to. The possibilities of what can be done are limitless. Parts 2 and 3 of this series of articles explore a few of the things we did early on in SAM's development. They will discuss the implications of dreaming (your SAM operates in a dream mode) in a real SAM and describe how to design useful applications of dream mode. You will learn how a 2K RAM can be used to store over 250 million SAM environments (some of which will be useful). A complete robotic installation will be described and you will see how an extremely low-budget project designed an environment in which the SAM robot could perform.
Enjoy SAM. More importantly, let these concepts be the kernel of your own expansion in true AI. Its innovative and judicious use could propel a few ambitious and imaginative engineers to the forefront of robotic technology. You could be one of those few.
[Editor's note: Program listings are on page 132. Mr. Reed is the systems manager for Oak Creek Energy Systems in Tehachapi CA. He may be reached at (805) 822-6853, and is willing to pursue "true" AI with any interested parties.]
This data was true when the article was originally published, but Mr. Reed is no longer at that location.