http://www.cs.auckland.ac.nz/CDMTCS/chaitin/lisp.c
All in some way encoding a similar implementation and raising the same tives issues and questions (implementation as establishing of primitives in the host language (say C of assembly) and using these as building blocks for the to-some-extent bootstrapping interpreter to abstarct out lower level detail (eg. storage, reading and writing of single characters). Such abstraction - everything is a list.
Questions thus:
With reference to 1] - make-procedure function:
(define (make-procedure parameters body env) (list 'procedure parameters body env))
In the Lisp reader.
From Chaitin's code:
All strings are as characters within the tree of cells. obj_list points to a list (in this same cellular memory) of pointers to all of these (for example primitives) which are initialised first off as built-in atoms. cons is the base operation to construct/allocate such lists. Key functions: lookup_word and mk_atom. All is list operations on storage.
Evaluation handles define first (question of binding) before eval (ev function) proper.
SICP p.377. Binding and frames. Operations on the environment:
The evaluator needs operations for manipulating environments. As explained in section 3.2, an environment is a sequence of frames, where each frame is a table of bindings that associate variables with their corresponding values. We use the following operations for manipulating environments:
(lookup-variable-value
)
returns the value that is bound to the symbol in the
environment
(extend-environment
returns a new environment, consisting of a new frame in which
the symbols in the list
(define-variable!
adds to the first frame in the environment
(set-variable-value!
changes the binding of the variable in the environment
additionally:
http://www.cs.indiana.edu/hmg/schemachine-wp.htm
again, chapter five of SICP:
from: http://www.ida.liu.se/~tobnu/scheme2llvm/
This is a small (about 1K lines) self applicable scheme toy compiler which compiles to C-code, and a version that compiles to LLVM with the types fixnum, symbols, strings, functions and vectors (cons cells are seen as vectors of size 2).
and:
The code is quite similar to the code in SICP (Structure and Interpretation of Computer Programs), chapter five, with the difference that it implements the extra functionality that SICP assumes that the explicit control evaluator (virtual machine) already have. Much functionality of the compiler is implemented in a subset of scheme, c-defines (llvm-defines), which are compiled to C-functions (llvm functions).
code for SICP at:
http://mitpress.mit.edu/sicp/code/index.html
Chapter 5: Computing with Register Machines
walking through a Scheme interpreter and compiler for hypothetical (simulated in this instance) stack-based register machine (minus (as simulated in Scheme) various primitive operations such as car, cons))
for such primitives (and a similar implementation) also see:
http://www.frank-buss.de/lispcpu/lispcpu.lisp.txt
(defconstant primitives '#(+ - < > <= >= /= = * set quote setq defun progn get-time set-time set-led get-led while cons car cdr if))
Storage of car, cdr structures is outlined in section 5.3 p533. also garbage collection
towards lispcpu, also perhaps ATMega interpreter/compiler
[further notes on Steele and Sussman's LispCPU]
1] Microcode:
quote [p35]:
Each of the two parts, EVAL and GC, is itself divided into two parts: registers and controller. The registers provide storage for type/pointer words, and are connected by a common bus in each part. Each controller is a finite-state machine implemented as a PLA, plus some random logic. Each PLA is organised as a micro-code ROM, addressed by a "Micro-PC" and yielding a set of control signals, including register controls and a new micro_PC indicating the next state.
signals enumerated on p36
p39. [SIMPLE Microcode] There are five kinds of operation EVAL can request from the GC (which give EVAL the step-eval signal to advance):
NOP, CAR/CDR, CONS/NCONS, RPLACD and load/store Q (one of GC's registers).
p40:
[quote]
The CONS operation accepts a car pointer from the E bus, takes the contents of Q to be the cdr pointer and then allocates a new two-word cell containing the car and cdr. A pointer to the result, with type 0 (list) is left in Q.
The Microcoded PLA is produced by software from the assembly listing (p48+) produced by the micro-assembler software from the listing (p43+)
2] The register cell
[p63]
One-bit register cell. The signal LD during O/1 loads the cell from the data bus. The signal RD during O/1 drives the bus from the cell. The cell is refreshed during O/2.
after: Design of LISP-Based processors ... or, LAMBDA The Ultimate Opcode
Guy Steele and Gerald Sussman
(http://repository.readscheme.org/ftp/papers/ai-lab-pubs/AIM-514.pdf ) 1979, MIT
1] Architecture reflectes language structure
2] The overlap of high level and machine language is also dictated by necessity to manipulate program as data
3] Programs are represented as nested Lisp data structures -> as a tree of linked records subject to a recursive tree walk
4] Lists are represented by records each of which contains 2 pointers to other records - car and cdr
5] The type field associated with each pointer is used to encode opcodes and return addresses
6] p17 implementation:
7] Two consecutive words of memory hold a list cell - each word can hold a type field and an address field (an address referenced within linear memory)
8] The evaluator and the storage manager as individual processors each with a state machine controller and a set of registers
contents of any register is address field and type field
9] p32 evaluation and procedure call (type 6):
We have a 3-bit type field which provides 8 opcodes (recall 7 as quotation) - address part of a word has different function depending on type (6 is procedure call).
quote:
The evaluation of type 4 (procedure) results in a pointer to the newly allocated word pair. This pointer has type 3 (closure). The car of the pair contains the cdr of the procedure: this is the code body of the procedure. The cdr of the pair contains the current environment.
and:
A procedure call (type 6) is the most complicated... It is a list of indefinite length, chained together by cdr pointers. Each cdr pointer except the last MUST have type 0 (list). The last cdr pointer should have a zero address and a non-ZERO type. This last type specifies the operation to be performed. In CDRing down the list, SIMPLE evaluates each of the expressions in the car, saving the resulting values. These values are available as arguments to the operation to be performed.
operations in tyoe here such as car, cdr, cons, atom, progn, list and funcall
10] further question of recursion in hardware by way of closures:
A closure combines the code of a function with a special lexical environment bound to that function (scope).
Design of LISP-Based processors ... or, LAMBDA The Ultimate Opcode
Guy Steele and Gerald Sussman (http://repository.readscheme.org/ftp/papers/ai-lab-pubs/AIM-514.pdf ) 1979, MIT
see also diagrams
To quote from the opening abstract:
We present a design for a class of computers whose 'instruction sets' are based on LISP. LISP, like traditional stored-program machine languages and unlike most high-level languages, conceptually stores programs and data in the same way and explicitly allows programs to be manipulated as data. LISP is therefore a suitable language around which to design a stored-program computer architecture. LISP differs from traditional machine languages in that the program/data storage is conceptually an unordered set of linked record structures of various sizes, rather than an ordered, indexable vector of integers or bit fields of fixed size. The record structures can be organised into trees or graphs. An instruction set can be designed for programs expressed as such trees. A processor can interpret these trees in a recursive fashion, and provide automatic storage management for the record structures.
further - an architecture and an instruction set are specified, fabrication of a VLSI prototype microprocessor is described.
1) LISP as a high-level machine language. Given that "LISP reflects the structure of program expressions in the structure of the data which represents the program." and also given that data and programs are equivalent and can equally be manipulated (within the "incestuous" realm of compilers and interpreters).
2) Tree structure rather than linear vector of instructions which can be indexed using counters and the like. Evaluation by recursive tree-walk.
3) Lisp atoms and lists are described in fine detail with examples prior to the exposition of a meta-circular LISP interpreter (ie. it is written in LISP and can interpret itself).
4) APPLY within this simple interpreter... in the case of primitive procedures:
"... primitive symbols are not to be confused with the atomic symbols used as their names. The actual procedure involved in the combination (car x) is not the atomic symbol CAR, but rather some bizarre object (the value of the atomic symbol CAR) which is meaningful only to PRIMOP-APPLY."
(thus further into the appendix the magic of execution is microcoded)
5) State machine implementation:
An interpreter in the form of a state machine controller. (rendering explicit as a control mechanism that which the recursive LISP interpreter hides - state information that must be saved on each recursive invocation).
Registers and a list memory system.
The evaluator in Lisp has five global registers which simulate the registers of a machine
EXP - hold expression or parts of under evaluation ENV - holds pointer to environment structure (context) VAL - value developed in evaluation of expressions ARGS - list of evaluated arguments CLINK - pointer to top of the list structure which is the control stack
Further LISP code such as EVAL-DISPATCH implements the simulation
6) Representing LISP data.
"Lists are normally represented by records each of which contains two pointers to other records. One pointer is the car and the other is the cdr."
(A (B C) D) becomes _ _ _ _ _ _ |.|.+ -> |.|.+ -> |.|.+--> NIL - - - - - - | | | v | v A v D _ _ _ _ |.|.+ -> |.|.+--> NIL - - - - | | v v B C
Pointer representation is unimportant. We give pointer to memory system and it returns the context of the record pointed to. A type field is associated with each pointer.
7) The state machine implementation is combined with typed pointer dispatch to form an interpreter which can be implemented in hardware (p. 15)
8) Storage management:
The system is divided into:
a) a storage system which "provides an operator for he creation of new data objects and also other operators (such as pointer traversal) on those objects."
b) EVAL (program interpreter) which "executes programs expressed as data structures within the storage system"
in classic Von Neumann style.
The storage manager here makes a "finite vector memory appear to the evaluation mechanism to be an infinite linked-record memory". Thus a garbage collector is implemented.
9) Physical layout of the prototype processor
"The evaluator and the storage manager are each implemented in the same way as an individual processor. Each processor has a state-machine controller and a set of registers. On each clock cycle the state-machine outputs control signals for the registers and also makes transitions to a new state."
...
"The contents any register is a pointer (8 bits in the prototype) and a type field (3 bits in the prototype). The registers of a processor are connected by a common bus (E bus in the evaluator, G bus in the storage manager)
....
"Each state-machine controller consists of a read-only memory (implemented as a PLA), two half registers (?) (clocked inverters, one at each input and one at each output), and some random logic (eg. for computing the next state)... two phase non-overlapping clock signals..."
1- registers are clocked. next state is computed
2- next-state signals appear and are latched
all signals from the controllers can be multiplexed onto twelve probe lines
10) Discussion
There is no ALU.
Possible addition of complex processors/devices on the external memory bus with LISP processor serving as controller.
At the same time talks of a layered approach wherein a line can be drawn at arbitrary points within a tower of abstraction = a boundary between evaluator and storage manager
"... a hierarchy of interpreters running in [a] virtual machine. Each layer implements a virtual machine within which the next processor up operates."
such a boundary also exhibits an arbitrary distinction between hardware and software. also the overlap:
"Each of the layers in this architecture has much the same organisation: it is divided into a controller ("state machine") and a data base ("registers"). There is a reason for this. Each layer implements a memory system and so has state; this state is contained in the data base (which may be simply a small set of references into the next memory system down (own note: no operation but only a mapping)). Each layer also accepts commands from the layer above it, and transforms them into commands for the layer below it; this is the task of the controller."
also in talking of analogies between common CPU and CPU here:
"We may loosely say that there are two addressing modes in this architecture, one being immediate data (as in a variable reference), and the other being a recursive evaluation. In the latter case, merely referring to an operands automatically calls for the execution of an entire routine to compute it!"
11) History of VLSI implementation
Typed pointers treated as instructions, with the types as "opcodes" to be dispatched on by a state machine...
Rough sketch of building blocks:
PLA library array cells, simple replicated register cells assembled using (LISP-based) software
12) Conclusion
A CPU "... organised purely around linked records, especially in that the instruction set is embedded in the organisation."
Finally concludes that just as the LISP tree data representation informs this particular instruction set and thus the CPU architecture (for it is not just a question of representation but also changing the means of manipulation), so other representations (for example graphs) or storage organisations could be examined.
13) Appendix - Prototype Lisp Processor Technical Specifications
For later examination.
so far... The Instruction Set:
The 3 byte type field supplies 8 "opcodes":
from 0=constant list to 7=quoted constant
Address part of the word has different purposes dependent on type.
Procedure call (type 6) is the most complicated of all:
"It is a list of indefinite length, chained together by CDR pointers
xx_____
Also of note here is the use of transistors and resistors (in this case depletion-mode transistors) which can be used to construct logic gates.
see also GC probe mux and multiplexor p61,62 - a grid of wires with transistor across for probes
register cell p 63