Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
989 views
in Technique[技术] by (71.8m points)

assembly - Homoiconic and "unrestricted" self modifying code + Is lisp really self modifying?

I will be forward in admiting that my knowledge of Lisp is extremely minimal. However I am extremely interested in the language and plan to begin seriously learning it in the near future. My understanding of these issues is no doubt flawed, so if I say anything which is blatently wrong, please comment and correct me rather than downvoting.

Truly Homoiconic and Self-modifiable languages

I'm looking for examples of programming languages which support both Homoiconicity (Code has the same representation as data) and unrestricted self modification (Unrestricted meaning that you can change every aspect of your running code, not merely emit new code or change function pointers/delegates.)

There are but three examples I have found so far which fit this criteria:

  1. Machine code. Homoiconic in that everything is a number. Unrestrictedly modifiable in that it includes pointers, which can be used to manipulate any memory address regardless of whether that address holds code or data.
  2. Malbolge. Same reasoning as Machine code. Every instruction modifies itself after being executed
  3. DNA. Not a programming language, but still interesting. It isn't self modifying in the same sense as machine code is; Where the actual instructions + data are modified in place. However it is self replicating and can mutate/evolve according to it's previous state (With side effects such as radiation screwing it up every now and then). This is just an indirect way of self modification anyway. In short, DNA can self modify, but it does so by reproducing itself in it's entirity along with the relevant mutations. A physical string of DNA is "immutable".

Why Lisp is not on this list

Lisp is not on that list because it appears to me that Lisp is only almost Homoiconic, and only supports restricted self modification. You can do something like

(+ 1 2 3)

which will do the same thing as

(eval '(+ 1 2 3))

In the first version (+ 1 2 3) is raw code, whereas in the second version it is data. By assuming the truth of this statement it can be argued that Lisp isn't even homiconic. The code has the same representation as data in the sense that they are both lists/trees/S-expressions. But the fact that you have to explicitly mark which of these lists/trees/S-expressions are code and which are data to me seems to say that Lisp is not homiconic after all. The representations are extremely similar, but they differ in the tiny detail that you have to actually say whether you are dealing with code or data. This is not in any way a bad thing (In fact anything else would be madness), but it highlights a difference between Lisp and machine code. In machine code you don't have to explicitly mark which numbers are instructions, which are pointers, and which are data. Everything is simply a number until an interpretation is actually required, at which point it could be any of those things.

It's an even stronger case against unrestricted self modification. Sure, you can take a list that represents some code and manipulate it. For example changing

'(+ 1 2 3)

to

'(+ 1 4 3)

And then you run it through eval. But when you do this you're just compiling some code and running it. You're not modifying existing code, you're just emitting and running new code. C# can do exactly the same thing using expression trees, even if in a less convenient format (which arises due to C# code having a different representation to its AST, as opposed to Lisp, which is its own AST). Can you actually take an entire source file and start modifying that entire source file as it is running, with the changes made to the source file having realtime effects on the program behaviour?

Unless there is some way to do this, Lisp is neither homiconic nor self modifying. (To put off an argument over definitions, Lisp is not homoiconic or self modifying to the same degree that machine code is.)

Ways to make Lisp Homoiconic/Unrestrictedly self-modifiable

I can see 3 potential ways to make Lisp as homoiconic/self-modifiable as machine code.

  1. Non-Von-Neumann architecture. If someone could invent some amazing hypothetical machine where the lowest level representation of programs is an AST which can be executed directly (No further compilation is necessary). On such a machine an AST would represent both executable instructions, as well as data. Unfortunately the problem won't have been solved, because the AST still has to be either code or data. The prescence of an eval function doesn't change this. In machine code you can flip back and forth between code and data as much as you want. Whereas with eval and Lisp once you've "evaled" some list from data to code and executed it, there's no way to get that list back as data again. In fact, that list is gone forever and has been replaced with it's value. We'd be missing something crucial, which just so happens to be pointers.
  2. List Labels. If it was a requirement that every list also have a unique label, it would be possible to do indirect self modification by running functions against a list with a given label. Combined with continuations this would finally allow for self modifying code in the same sense that machine code has it. The labels are equivilant to machine code memory addresses. As an example, consider a Lisp program where the top node of the AST has the label "main". Inside main you could then execute a function that takes a label, an Integer, an Atom, and copies the atom to the List with a label that matches the one supplied to the function, at the index specified by the Integer. Then just call with current continuation on main. There you go, self modifying code.
  3. Lisp Macros. I haven't taken the time to understand Lisp macros, and they may in fact do exactly what I'm thinking about.

Point 1. combined with 2. would produce a totally self-modifying Lisp. Provided that the magical Lisp machine described could be produced. 2. alone could produce a self-modifying Lisp, however the implementation on a Von Neumann architecture could be extremely innefficient.

The Questions

  1. Are the any languages other than machine code, dna and malbolge which can do total self modification and are homoiconic?
  2. (DO NOT bother answering if you did a tl;dr at the above text). Is lisp really homoiconic + self modifying? If you say so, can you quote exactly where in my argument I went astray?

Appendix

Languages with unrestricted self modification but no homiconicity

  1. Assembly. The code uses words as opposed to numbers, so loses homiconicity, but it still has pointers, which retains the total control over memory and allows for unrestricted self modification.
  2. Any language that uses raw pointers. For example C/C++/Objective C. Same argument as Assembly
  3. JIT languages that include virtual pointers. For example C#/.net running in an unsafe context. Same argument as Assembly.

Other concepts and languages that may be somehow relevant/interesting: Lisp, Ruby, Snobol, Forth and it's compile time metaprogramming, Smalltalk and it's reflection, the untyped lambda calculus with it's property that everything is a function (Which kind of implies that assuming we could invent a machine which executes lambda calculus directly, lambda calculus would be homoiconic and Von Neumann machine code would not when run on said machine. [And Godels theorem would be executable. Haha, scary thought :P])

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

In the first version (+ 1 2 3) is raw code, whereas in the second version it is data. By assuming the truth of this statement it can be argued that Lisp isn't even homiconic. The code has the same representation as data in the sense that they are both lists/trees/S-expressions. But the fact that you have to explicitly mark which of these lists/trees/S-expressions are code and which are data to me seems to say that Lisp is not homiconic after all.

This is not true. In the first version, the list (+ 1 2 3), which is data, is being fed to the interpreter to be executed, i.e. to be interpreted as code. The fact that you have to mark s-expressions as being code or data in a specific context does not make Lisp non-homoiconic.

The point of homoiconicity is that all programs are data, not that all data are programs, so there is still a difference between the two. In Lisp, (1 2 3) is a valid list but not a valid program since an integer is not a function.

[If we look at the other great homoiconic programming language, Prolog, then we see the same phenomenon: we can build a data structure foo(X, 1, bar), but without a definition of foo, we can't execute it. Also, variables cannot be the names of predicates or facts, so X. is never a valid program.]

Lisp is self-modifying to a great degree. E.g., here's how to change the definition of a function:

[1]> (defun foo (x) (+ x 1))
FOO
[2]> (defun bar (x) (+ x 2))
BAR
[3]> (setf (symbol-function 'foo) #'bar)
#<FUNCTION BAR (X) (DECLARE (SYSTEM::IN-DEFUN BAR)) (BLOCK BAR (+ X 2))>
[4]> (foo 3)
5

Explanation: at [1], we defined the function foo to be the add-1 function. At [2], we defined bar to be the add-2 function. At [3], we reset foo to the add-2 function. At [4], we see that we've successfully modified foo.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...