First let's start with a brief, probably not accurate but still valid, definition of what stack overflow is.
As you probably know right now there are two different kind of memory which are implemented in too different data structures: Heap and Stack.
In terms of size, the Heap is bigger than the stack, and to keep it simple let's say that every time a function call is made a new environment(local variables, parameters, etc.) is created on the stack. So given that and the fact that stack's size is limited, if you make too many function calls you will run out of space hence you will have a stack overflow.
The problem with recursion is that, since you are creating at least one environment on the stack per iteration, then you would be occupying a lot of space in the limited stack very quickly, so stack overflow are commonly associated with recursion calls.
So there is this thing called Tail recursion call optimization that will reuse the same environment every time a recursion call is made and so the space occupied in the stack is constant, preventing the stack overflow issue.
Now, there are some rules in order to perform a tail call optimization. First, each call most be complete and by that I mean that the function should be able to give a result at any moment if you interrupts the execution, in SICP
this is called an iterative process even when the function is recursive.
If you analyze your first example, you will see that each iteration is defined by two recursive calls, which means that if you stop the execution at any time you won't be able to give a partial result because you the result depends of those calls to be finished, in this scenario you can't reuse the stack environment because the total information is split between all those recursive calls.
However, the second example doesn't have that problem, A is constant and the state of p and r can be locally determined, so since all the information to keep going is there then TCO can be applied.
与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…