Let's Do Structure and Interpretation of Computer Programs (SICP)
Table of Contents
SICP (and SDF)
This exists because if you ask around how to learn programming, someone will likely shill this book to you, it being a seminal CS text that's been used for decades. SICP is still relevant in the Gemini/ChatGPT/Claude world of development tools because we will learn all about procedural abstraction, something AI is still very poor at doing. In 2026 the current coding agents are basically replicating punch cards, shitting out imperative code that is notoriously difficult to get right, ignoring over 50 years of progress. Here we will learn how to properly abstract all that stuff if we want to rapidly prototype using one of these tools.
Even the workflow of old school lisp is similar to what people do now with Claude, where tests are written as you write the code and do not persist in the codebase.
SICP is lecture notes that became a book; the original 6.001 course had many additional assignments beyond the book exercises. Some examples of these are here and here. Instead of doing those old 6.001 assignments I'm going to do the 2025 assignments from Sussman's 6.945 course and new book Software Design for Flexibility (SDF), many of which point to chapters in SICP.
SDF teaches the exact opposite of current-day software engineering, showing how complex systems in nature are designed for flexibility so new abilities can be added without rewriting the existing system. This is exactly what you would want if using some AI to generate code.
SICP in 2026
This is still the best book for learning the abstract art of programming. MIT taught some form of this book in their 6.001 course from 1980 until 2007 and I think still offers a condensed version in 6.037 (now 6.9550). UC Berkeley teaches a Python derivative of the book, and there is an authorized 2022 JavaScript version taught at the National University of Singapore, though the authors had to make significant changes to chapter 4 because of the limitations of JavaScript.
Optional lectures
There are lectures from 1986 here, and if you search YouTube there are many more recent lectures from Berkeley; countless hackathons and compsci reading groups also have videos on working through the book.
SETUP
Book versions
I'm using the official online version and one of the unofficial PDFs floating around in order to see the figures, which can sometimes be impossible to read in the online HTML version. There's also an info version you can run in the terminal or in another emacs window. Remember those are unofficial, so there could be extra errata introduced.
Install mit-scheme
To do the book, read the setup instructions from Sussman's grad course and only use mit-scheme, or you may have problems in later chapters when we need the picture language or mutable lists.
If you have an M-series Mac then the UTM app will effortlessly run ARM64 Ubuntu/Debian/Arch VMs that work with mit-scheme, and Homebrew can be used to install the free version. These VMs don't even need a GUI installed; you can do everything from the edwin editor in the terminal. You could even rent a low-end box for $20/year, ssh directly into it, and run edwin.
IDE/Editor
You can type code into any text editor, save as filename.scm, run mit-scheme, and load your file in the REPL (load "filename.scm"). Any modern IDE can be scripted with a bunch of hooks to do this.
For the purposes of SICP and SDF we only need edwin, so that's what I'll use here. Edwin is a clone of an old emacs version (Emacs 18, see the ref card below) and is written entirely in mit-scheme, so as you complete this book you can hack your own editor to do anything you want.
You start edwin with mit-scheme --edit and are good to go without needing to install anything else.
- Editor setup from 6.945 and setup from 6.946 for emacs or edwin
- Sussman's .edwin config file here for better fonts (save in home directory as .edwin (with the period))
- Edwin specific docs here (it has a stepper too)
If you use emacs I wouldn't recommend installing Paredit mode or Geiser or anything else until you start writing larger programs in the later chapters, because in the beginning you learn scope by seeing how all the parens close blocks of expressions, and having something automatically shifting parens around before you have experience is going to be rage inducing. When your programs start becoming one big function in chapter 4, then install tools.
Commands:
- C-x C-c (exit emacs/edwin) is hold down Ctrl and type x, keep holding Ctrl and type c.
- C-h t is hold down Ctrl and type h, release Ctrl and type t (starts the tutorial).
- PgUp/PgDn, End/Home, arrow keys and mouse all work too; you don't have to use the movement commands in the tutorial, though eventually you will see how fast you become if you do decide to learn them.
- M-x: the M is the 'meta' key, which on a modern keyboard is usually Alt (sometimes the Windows key), but you can redefine both C and M to be something else.
- C-g or Ctrl-g aborts the current command if you accidentally screw up mid-command and want to try again.
BEGIN
Assuming you downloaded Sussman's edwin config and saved it as ".edwin" in your home directory (or whatever directory you're running mit-scheme from), type the following:
- mit-scheme --edit
You should get a 1980s green text screen that presents you with a single window (the scheme REPL). You can already program in this buffer: try entering (+ 1 1) and C-x C-e to get ;Value: 2. Read the edwin documentation for mit-scheme.
Next:
- C-x C-f and type test.scm
A new blank editor buffer will open. Now let's get a split screen (if you want your new window to the right instead of below, type C-x 3 instead of C-x 2):
- C-x 2
- C-x o (lower case o)
- or just click on the bottom window
- C-x b and press enter to open default scheme buffer
- C-x o to return to editor window
- C-x C-s saves your editing window buffer and C-x C-c will leave edwin/scheme.
Now you are ready to write SICP exercises.
Workflow
You write all the program logic for some abstraction in a single procedure (function) that contains multiple smaller procedures, to control the namespace. At the end of this pile of code you run C-x C-e to evaluate that code by itself in the REPL and hand test everything. Good testing will be the standard edge cases, all the cond branches, and property-based testing, making sure the behavior of your procedure conforms to whatever properties you've identified; we'll try some of that here.
By doing this you 'rapidly prototype'; I've used scheme/common lisp in the past to simulate all kinds of things, like the basic environment of embedded devices or other deployment targets, all because of this book.
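A minimal sketch of that shape; the procedure and hand tests here are my own illustration, not from the book:

(define (hypotenuse a b)
  ; the helper lives inside the procedure so it doesn't pollute the global namespace
  (define (square x) (* x x))
  (sqrt (+ (square a) (square b))))

; hand tests, each evaluated with C-x C-e then thrown away
(hypotenuse 3 4)  ; expect 5
(hypotenuse 5 12) ; expect 13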
Errors
If you get something similar:
;Unbound variable: *2
;To continue, call RESTART with an option number:
; (RESTART 3) => Specify a value to use instead of *2.
; (RESTART 2) => Define *2 to a given value.
; (RESTART 1) => Return to read-eval-print level 1.
Then typing (restart 1) will return to the REPL and you can check your code again to fix the problem, or you can try one of the options to fix it. The debugger is great in Lisp; type C-h m to get a list of commands. Read the don't panic guide on how to debug.
The bulk of errors will be paren problems in the beginning, but that's how you learn; after a few of them you're much more careful about code and what scope it's in.
Edwin demo in Lecture 1
First lecture from the 1986 series; if you want to follow along with edwin like @35:40 in the video:
- Start edwin with 'mit-scheme --edit'
- C-x 2 (split screen like he has)
- C-x C-f and type in lecture.scm (it will create this file)
- Any expressions eval with C-x C-e
- Press tab to indent when he demos 'pretty printing'
Run the tutorial!
Hold Ctrl-h then type t to start the tutorial so you understand what 'point' means (the location of the current cursor) and all the cut + paste commands.
SICP 1.x
I would not read these early chapters very fast, as the book builds up the various abstractions in incremental levels that will be needed for later chapters.
A computational process I would interpret as a running program, or the act of computation itself. A procedure is sometimes called a function in modern programming parlance, but the book makes it clear later in this chapter that these are different things, pointing out how a math function definition is nothing like a procedure.
Write yourself some notes. These chapters tell us everything is an expression and every expression has a value. Procedures themselves are a value. A sequence of expressions within parens is a combination where the operator is applied to the values of the expressions. The operator can be a simple procedure or a compound very complex procedure. These complex procedures can be named so we can refer to their results. The environment (scope) is the range of significance that a name has where some names are globally accessible and others only locally within procedures.
This is what they mean by the evaluation rule being recursive:
- (+ 1 (+ 2 (* 3 4))) ; to eval this, first eval the innermost combination (* 3 4)
- (+ 1 (+ 2 12)) ; now eval (+ 2 12)
- (+ 1 14)
- 15
In order to evaluate an expression it has to evaluate the most nested subexpression; the rule points to itself like a loop that keeps evaluating until recursion stops and a value is produced. Special forms are not procedures because their subexpressions aren't always evaluated: if one expression in a chain of (or ...) evaluates to true, then none of the remaining expressions need to be evaluated.
Ex 1.2 is an example of an impossible to read figure. Use the pdf.
(/ (+ 5 4 (- 2 (- 3 (+ 6 (/ 4 5))))) (* 3 (- 6 2) (- 2 7)))
Ex 1.3 is a simple (cond (and ...)); you can write your own test cases/examples too:
(= (three-square-sum 1 2 3) (+ (* 2 2) (* 3 3)))
(= (three-square-sum 3 1 2) 13)
(= (three-square-sum 3 2 1) 13)
(= (three-square-sum 1 1 1) 2)
(= (three-square-sum 0 1 0) 1)
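One possible three-square-sum those tests would run against (a sketch; the square helper and the cond layout are my own choices):

(define (three-square-sum a b c)
  (define (square x) (* x x))
  (cond ((and (<= a b) (<= a c)) (+ (square b) (square c))) ; a is smallest
        ((and (<= b a) (<= b c)) (+ (square a) (square c))) ; b is smallest
        (else (+ (square a) (square b)))))                  ; c is smallest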
Ex 1.4: many languages won't let you do this; it's a good example of symbolic programming. It evals to (+ a b) if b > 0 or (- a b) if not.
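For reference, the procedure from the exercise; the operator applied to a and b is itself the value of the if expression:

(define (a-plus-abs-b a b)
  ((if (> b 0) + -) a b))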
Ex 1.5: (define (p) (p)) is a non-terminating loop where (p) keeps calling itself. Since normal-order doesn't eval arguments into values until they are needed, (p) is never run because it's never needed: (test 0 (p)) will always evaluate to 0. Under applicative-order, however, all the arguments are evaluated first, so it will evaluate (p) and go into an infinite loop.
Ex 1.6: new-if is a regular procedure, not a special form, so its inputs (all the subexpressions) are evaluated first with applicative-order eval. Of the three inputs to new-if, (good-enough? guess x) returns a boolean and guess is already a value, but (sqrt-iter (improve guess x) x) will keep improving the guess until the end of days, since the cond predicate good-enough? is never reached to stop the recursion. The special form if does not automatically evaluate the else clause; it performs the predicate test first.
Ex 1.7 try examples:
1 ]=> (sqrt 0.0009)
;Value: .04030062264654547
1 ]=> (sqrt 1000000000000000)
;takes forever, infinite loop?
Subtracting the square of the guess from the radicand isn't precise enough for small numbers, and for very large numbers they clearly never meet that precision, or it takes too long to do so. If we take the ratio of the improved guess to the existing guess, the closer the guesses are, the closer the ratio approaches 1, so we can adjust our program by subtracting that ratio from 1 to see if the two guesses are within epsilon (0.001) of each other.
(define (good-enough? guess x)
  (< (abs (- 1 (/ (improve guess x) guess))) 0.001))
Now gigantic square roots eval immediately and tiny square roots are close enough to wolfram alpha online calculator precision.
Ex. 1.8: the cube root of 8 is 2, since 2 × 2 × 2 is 8. Plugging that into the formula: (8/4 + 4)/3 or 6/3 = 2. All I did was change the improve procedure and delete the average procedure since it wasn't needed anymore:
(define (improve guess x)
  (/ (+ (/ x (* guess guess)) (* 2 guess)) 3))
1.1.8 Black-Box Abstractions: if you look at the mit-scheme documentation and scroll to the index at the bottom you'll find definitions for exp and log, which are \(e^x\) and \(\ln(x)\). We can see this ourselves using desmos. Graph both ln(x) and exp(x), then go along the x-axis to 2 and up until it meets the blue ln(x) line; it lands at y = 0.7 or so. Double 0.7 to get 1.4, travel along the x-axis to 1.4, go up to the orange exp(x) line, and you'll see it intersect at y = 4: exp(log(2) + log(2)) is 4. Both e and ln ('the natural log') are preferred for math modeling because of properties like the derivative of \(e^x\) being itself, this square property we just learned, and that you can estimate \(e^x\) and ln(x) simply on a piece of paper when working with small displacements. For example \(e^x\) for small x like 0.05 is approximately 1 + x, and ln(1 + x) is approximately x.
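A quick REPL check of that square property using the built-ins just mentioned:

(exp (+ (log 2) (log 2))) ; exp(log 2 + log 2) evaluates to 4 (modulo floating-point rounding)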
SICP 1.2.1
I rewrote fact-iter to instead be a procedure nested inside another procedure, with a 'trampoline': execution falls to the bottom of the outer procedure, hits (f 1 n), and bounces back calling f. A factorial is the product of every number up to n, so 3! is 1 × 2 × 3.
(define (factorial n)
(define (f total counter)
(if (< counter 2)
total
(f (* total counter) (- counter 1))))
(f 1 n))
; repl tests
(factorial 6)
(factorial 1)
(factorial 0)
The difference in the book between the delayed chain of operations and the iterated version is re-entry: the delayed chain has to re-enter the same function before returning a value, so it must keep all that accounting of state somewhere, which we will eventually learn is a stack frame.
Ex 1.9: addition is defined in terms of inc and dec, with a acting as a counter that is decremented while b is incremented, until a = 0. The first procedure, with the shape (inc (recursive call)), builds a chain of deferred increments until a = 0, and only then are all the delayed increment operations done.
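The two definitions from the exercise, renamed here so both can live in one buffer, with inc and dec filled in since they aren't standard mit-scheme; the hand expansions in the comments show the deferred-versus-iterative difference:

(define (inc n) (+ n 1))
(define (dec n) (- n 1))
(define (plus-rec a b) ; recursive process: a chain of deferred incs
  (if (= a 0) b (inc (plus-rec (dec a) b))))
(define (plus-iter a b) ; iterative process: all state is in a and b
  (if (= a 0) b (plus-iter (dec a) (inc b))))
; (plus-rec 4 5)  -> (inc (inc (inc (inc 5)))) -> 9
; (plus-iter 4 5) -> (plus-iter 3 6) -> (plus-iter 2 7) -> (plus-iter 1 8) -> (plus-iter 0 9) -> 9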
Ex 1.10
I assume 'concise math definitions' means turning (* 5 n n) into \(5n^2\), and if you plug in some values to f you get \(f(n) = 2n\). If you are using edwin you can run the stepper (see the manual for mit-scheme). Otherwise hand step:
(define (A x y)
(cond ((= y 0) 0)
((= x 0) (* 2 y))
((= y 1) 2)
(else (A (- x 1)
(A x (- y 1))))))
(define (g n) (A 1 n))
(define (h n) (A 2 n))
(g 2)
- (A 1 2)
- (A 0 (A 1 1))
- (A 0 2)
- (* 2 2)
- 4
(h 3)
- (A 2 3)
- (A 1 (A 2 2))
- (A 1 (A 1 (A 2 1)))
- (A 1 (A 1 2))
- (A 1 (A 0 (A 1 1)))
- (A 1 (A 0 2))
- (A 1 (* 2 2))
- (A 1 4)
- (A 0 (A 1 3))
- (A 0 (A 0 (A 1 2)))
- (A 0 (A 0 (A 0 (A 1 1))))
- (A 0 (A 0 (A 0 2)))
- (A 0 (A 0 (* 2 2)))
- (A 0 (A 0 4))
- (A 0 (* 2 4))
- (A 0 8)
- (* 2 8)
- 16
g seems to produce \(2^n\) while h generates the sequence 2, 4, 16, 65536… or \(2^{2^{2^{\cdots}}}\), a tower of n 2s.
SICP 1.2.2
Phi or \(\varphi\) is a constant like e, and there are many articles and videos on Phi. In case it wasn't clear in the book, the iterative algorithm comes from the observation that in 0, 1, 1, 2, 3, 5.. the next number in the sequence is the sum of the previous two.
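For reference, the book's iterative procedure built from that observation, nested here in the single-procedure style used above:

(define (fib n)
  (define (fib-iter a b count)
    (if (= count 0)
        b
        (fib-iter (+ a b) a (- count 1))))
  (fib-iter 1 0 n))
; (fib 6) should be 8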
Hand step the count-change procedure with only 2 kinds of coins and amount 1:
- (cc 1 2)
- (+ (cc 1 1) (cc (- 1 5) 2))
- (+ (cc 1 1) 0)
- (+ (+ (cc 1 0) (cc (- 1 1) 1)) 0)
- (+ (+ 0 (cc 0 1)) 0)
- (+ (+ 0 1) 0)
- 1
The computations are delayed because (+) needs its arguments first, so the recursion is unrolled fully, and evaluation can begin once each branch stops with a 0 or 1.
Ex 1.11
Write out examples 1, 2, 4, 11, 25, 59.., enter them into OEIS, and there's actually a generating function solution a physicist figured out; however, the assignment wants us to create an iterative procedure. You can use the slower recursive procedure we learned to test your iterative one. I took eqv? from the mit-scheme reference manual.
(define (f n)
(define (f-iter n1 n2 n3 counter)
(cond ((< n 3) n)
((= counter n) n1)
(else
(f-iter (+ n1 (* 2 n2) (* 3 n3)) n1 n2 (+ counter 1)))))
(f-iter 2 1 0 2))
(define (g n)
(cond ((< n 3) n)
(else (+
(g (- n 1))
(* 2 (g (- n 2)))
(* 3 (g (- n 3)))))))
(and
(eqv? (f 1) (g 1))
(eqv? (f 2) (g 2))
(eqv? (f 3) (g 3))
(eqv? (f 4) (g 4))
(eqv? (f 0) (g 0)))
Ex 1.12
Pascal's triangle or array is explained here as the number of paths you can take to get to that position in the array, or the coefficients of \((a + b)^n\)
(define (pascal row col)
(cond ((= row 0) 1)
((> col row) 0)
((or (= col 0) (= col row)) 1)
((or (= col 1) (= col (- row 1))) row)
(else (+ (pascal (- row 1) (- col 1))
(pascal (- row 1) col)))))
For example (pascal 4 2) is 6. The first and last columns are always 1, and the second and second-to-last columns are the same as the row number if you start counting from 0. The value of (pascal 4 2) is the value of (+ (pascal 3 1) (pascal 3 2)), which is (+ 3 (+ (pascal 2 1) (pascal 2 2))) or (+ 3 2 1). Trying to get a middle column element from row 30 takes a long time, as the recursive process is generating a tree branch on almost every remaining input.
You can property test your program by noticing the elements in row n of Pascal's array sum to \(2^n\) (using the pascal definition above):
(define (ptest n)
(define (row-sum count)
(cond ((= count 0) 1)
(else (+ (pascal n count) (row-sum (- count 1))))))
(row-sum n))
(and
(eqv? (ptest 2) (* 2 2))
(eqv? (ptest 3) (* 2 2 2))
(eqv? (ptest 10) (* 2 2 2 2 2 2 2 2 2 2)))
Ex 1.13
Closest integer means an integer like -1, 0, 1, 2, 3.., not a decimal/rational. Try some examples in scheme:
(define (fib n)
(cond ((= n 0) 0)
((= n 1) 1)
(else (+ (fib (- n 1))
(fib (- n 2))))))
(define (phi x)
  ; x is an operator, + or -, so (phi +) is (1 + sqrt 5)/2 and (phi -) is (1 - sqrt 5)/2
  (/ (x 1 (sqrt 5)) 2))
(define (f n)
;; expt and sqrt in mit-scheme reference doc
(/ (- (expt (phi +) n) (expt (phi -) n)) (sqrt 5)))
(fib 11)
(f 11)
(fib 12)
(f 12)
Notation
- \(\varphi\) is phi or (1 + sqrt(5))/2
- \(\psi\) is psi or -1/phi or (1 - sqrt(5))/2
The book says to use the definition fib(n) = fib(n - 1) + fib(n - 2) and since fib(n) = \(\frac{\varphi^n - \psi^n}{\sqrt 5}\) we can substitute it wherever we see fib(n):
\[\frac{\varphi^n - \psi^n}{\sqrt 5} = \frac{\varphi^{n-1} - \psi^{n-1}}{\sqrt 5} + \frac{\varphi^{n-2} - \psi^{n-2}}{\sqrt 5}\]
Looking at this solution:
- Clear the fractions by multiplying everything by \(\frac{\sqrt 5}{1}\) though \(\sqrt 5\) is itself \(2\varphi - 1\)
\[\varphi^n - \psi^n = \varphi^{n-1} - \psi^{n-1} + \varphi^{n-2} - \psi^{n-2}\]
- Rewrite the exponents in terms of n. If n = 2 then \(2^2\) is 2·2 and (2·2)/2 is \(2^1\), i.e. dividing by the base once gives \(2^{n-1}\)
\[\varphi^n-\psi^n=\frac{\varphi^n}\varphi-\frac{\psi^n}\psi+\frac{\varphi^n}{\varphi^2}-\frac{\psi^n}{\psi^2}\]
- Group similar together and distribute out phi and psi numerators:
\[\varphi^n-\psi^n=\varphi^n\left(\frac1\varphi+\frac1{\varphi^2}\right)-\psi^n\left(\frac1\psi+\frac1{\psi^2}\right)\]
You could keep following their solution or take advantage of the facts that \(\varphi^2 = \varphi + 1\), \(1/\varphi = \varphi - 1\), and \(1/\psi = -\varphi\) (since \(\psi = -1/\varphi\)) to end up with \(\varphi^n - \psi^n = \varphi^n - \psi^n\), then use the same inequality reasoning they did to get to \(-1/2 < \psi^n/\sqrt 5 < 1/2\)
SICP 1.2.3
Watch the lecture @17:09 where he explains what time and space mean, though the whole lecture is worth watching. Time is the number of steps that are evaluated and space is the amount of data needing to be stored. The numbers 16 and 8 differ by a constant factor of 2, and as he says, in this analysis we don't care what the constant is, just that it is a constant and not changing with the input. Try the iterative procedure in the lecture that uses '1+' and '-1+', both built-in increment/decrement functions.
He uses big-O in the lecture (and the book's first edition) and big-theta in the second edition we're reading. O(n) is an upper bound and \(\theta(n)\) is both a lower and an upper bound.
In the book the definition for \(\theta(f(n))\) is \(k_{1}f(n) \le R(n) \le k_{2}f(n)\)
- the two k's represent constants we choose that can be the same or different
- R(n) is a function we define that represents the resources for a process (space or steps) when given an input of size n
- f(n) is the same function on both sides, chosen from a list of standard math functions:
- f(n) = n (linear)
- f(n) = \(n^2\) (quadratic)
- f(n) = log n (logarithmic)
- f(n) = 1 (constant)
- there actually is a specific list with more
If R(n) = 100n, meaning 100 steps are processed for every unit of input, then R(n) is within a constant factor of f(n) = n and fits the big-theta definition in the book, \(k_1 f(n) \le 100n \le k_2 f(n)\), because both constants can be chosen so that the inequality holds. R(n) is in \(\theta(n)\) and belongs to the set of linear functions: those that perform at least linear and at most linear work.
Big-O is an upper bound only. If R(n) is in O(n) then it is in the set of all functions bounded by the set of all linear functions. So far we are only looking at the computational process meaning the exact amount of steps generated and the storage needed to store the steps in memory.
The \(\in\) symbol means 'is in'
- Fib(n) steps take R(n) = \(\frac{\varphi^n}{\sqrt 5}\) and we can choose the constants arbitrarily as the definition allows this so long as the inequality holds
\[\frac{1}{\sqrt 5}\cdot\varphi^n \le \frac{\varphi^n}{\sqrt 5} \le 1\cdot\varphi^n \in \theta(\varphi^{n})\]
- The definition of big-O is \(R(n) \le kf(n)\) so any positive constant will work for fib(n) :
\[\frac{\varphi^n}{\sqrt 5} \le 1\cdot\varphi^n \in O(\varphi^{n})\]
- Phi is a constant 1.6180.. and \((1.618)^n < 2^n\), so this is another upper bound, not as tight fitting, but still fine according to the definition of big-O:
\[\frac{\varphi^n}{\sqrt 5} \le 1\cdot 2^n \in O(2^{n})\]
Watch this to learn more, like O-tilde, poly(), and what exactly the standard-form functions 'chosen from a list' in O(f(n)) are. That same lecture also explains why 'n' is used in O(f(n)) instead of O(f(x)): in this analysis inputs are supposed to grow to infinity, so n is used as a convention to represent near-infinite inputs.
The formal definition of big-O and big-theta says there exists some specific input \(n_0\) such that for every input n larger than \(n_0\) the constants hold, so you don't have to worry about tiny edge cases where inputs are small, as long as the constant(s) hold eventually for all larger inputs.
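Written out, this is the big-theta version; for big-O keep only the right-hand inequality:

\[\exists\, k_1, k_2 > 0,\ \exists\, n_0 \;:\; \forall n > n_0,\quad k_1 f(n) \le R(n) \le k_2 f(n)\]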
Ex 1.14
There is a nice graphic here showing the full (cc 11 5) tree. He used (display) to generate the data to plug into graphviz, and we can do the same while adjusting the procedure to count the number of calls, i.e. nodes, in the process tree. Zero out the cond base cases, then in the else add 2 for the 2 calls to cc:
(define (count-nodes amount)
(cc amount 5))
(define (cc amount kinds-of-coins)
(display "(cc ") (display amount) (display " ") (display kinds-of-coins) (display ")") (newline)
(cond ((= amount 0) 0)
((or (< amount 0) (= kinds-of-coins 0)) 0)
(else (+ 2 (+ (cc amount (- kinds-of-coins 1))
(cc (- amount (first-denomination kinds-of-coins)) kinds-of-coins))))))
(define (first-denomination kinds-of-coins)
(cond ((= kinds-of-coins 1) 1)
((= kinds-of-coins 2) 5)
((= kinds-of-coins 3) 10)
((= kinds-of-coins 4) 25)
((= kinds-of-coins 5) 50)))
1 ]=> (count-nodes 1)
(cc 1 5)
(cc -49 5)
(cc 1 4)
(cc -24 4)
(cc 1 3)
(cc -9 3)
(cc 1 2)
(cc -4 2)
(cc 1 1)
(cc 0 1)
(cc 1 0)
;Value: 10
We can do the same for the fib procedure and compare the result to the tree in the book (14 nodes, not counting the root):
(define (fib-nodes n)
(cond ((= n 0) 0)
((= n 1) 0)
(else (+ 2 (fib-nodes (- n 1))
(fib-nodes (- n 2))))))
1 ]=> (fib-nodes 5)
;Value: 14
We can see count-change process steps are much less than fib:
1 ]=> (fib-nodes 20)
;Value: 21890
1 ]=> (fib-nodes 30)
;Value: 2692536
1 ]=> (count-nodes 20)
;Value: 150
1 ]=> (count-nodes 30)
;Value: 390
Space is defined as the max depth of the longest tree branch, because we only have to keep track of the nodes above us. Count-change's longest branch is whatever the input is: since there's a coin of size 1, it takes at minimum n nodes of space for an n-sized input, so space's lower bound is n. We can always find some constant to multiply n by to represent the rest of the space used in the tree for the other coin denominations, meaning space is \(\theta(n)\); it won't exceed n times a constant.
To estimate the amount of steps try gathering data and plot it:
1 ]=> (count-nodes 100) ;Value: 15498
1 ]=> (count-nodes 200) ;Value: 229588
1 ]=> (count-nodes 300) ;Value: 1292590
1 ]=> (count-nodes 400) ;Value: 4642024
1 ]=> (count-nodes 500) ;Value: 12822610
1 ]=> (count-nodes 600) ;Value: 29806268
1 ]=> (count-nodes 700) ;Value: 61312118
1 ]=> (count-nodes 800) ;Value: 115126480
1 ]=> (count-nodes 900) ;Value: 201422874
Plug it into a desmos like I did here.
You can see for inputs up to 900 it's still bounded by \(x^3\), but if you keep going the points shoot up quickly. As you add 100 to the input, the node count appears to roughly double as an upper bound, so we can keep approximating, doubling the last node count, and see that eventually the \(x^3\) curve no longer bounds it. It will eventually pass \(x^4\) too, but it will always be bounded by \(x^5\), and that is the critical insight to big-O and big-theta: the inputs are supposed to be approaching infinity, so no matter how big the input, you can guarantee the bound.
Ex 1.15
Build another step counter
(define (count c angle)
(display c) (display " ") (display angle) (newline)
(if (not (> (abs angle) 0.1))
c
(count (+ c 1) (/ angle 3.0))))
(count 0 12.5)
1 ]=>
1 4.166666666666667
2 1.388888888888889
3 .462962962962963
4 .154320987654321
5 5.1440329218107005e-2
;Value: 5
(count 0 100)
1 ]=>
;Value: 7
(count 0 1000)
1 ]=>
;Value: 9
The sine procedure builds (p (p (p (p (p 0.05144))))); once the recursion stops and the base angle is returned, the p's can start to reduce. Try \(\log_3(12.5/0.1) \approx 4.4\), \(\log_3(100/0.1) \approx 6.3\), and \(\log_3(1000/0.1) \approx 8.4\) and see (sine a) is logarithmic in steps: we multiplied the angle by 10 and only added 2 more steps. Space is the bookkeeping we have to save, the depth of the nested p's, so it is also \(\theta(\log a)\), as the assignment wants it expressed as a function of a.
SICP 1.2.4
Ex 1.16
The book states the invariant here is \(ab^n\), meaning its value should not change each time we iterate/loop (change state). The hint/observation is that \(a(b^2)^{n/2}\) is always going to be \(ab^n\), since \((b^2)^{n/2}\) is \(b^n\), and that is our even clause in the cond. The else (odd) clause uses \(b \cdot b^{n-1}\), and \(a b \cdot b^{n-1}\) is of course \(ab^n\) too. I wrote some tests here, but this is Lisp: at any time you can manually enter tests into the REPL.
(define (even? n)
(= (remainder n 2) 0))
(define (square n)
(* n n ))
(define (expt b n)
(expt-iter 1 b n))
(define (expt-iter a b n)
(cond ((= n 0) a)
; a(b^2)^1/2
((even? n) (expt-iter a (square b) (/ n 2)))
; abb^n-1
(else (expt-iter (* a b) b (- n 1)))))
(define (test-expt)
(and
(eqv? (expt 0 1) 0)
(eqv? (expt 1 0) 1)
(eqv? (expt 1 2) (* 1 1))
(eqv? (expt 2 3) (* 2 2 2))
(eqv? (expt 3 7) (* 3 3 3 3 3 3 3))))
Ex 1.17
Try examples: \(a \times b\) is a added to itself b times, so 3 × 4 is 4 groups of 3 added, (+ 3 3 3 3), or 4 addition steps. How would double/halve help us here? Applying the even test to b: (+ (double 3) (double 3)) is (+ 6 6), 2 steps, and lg(4) is 2. An odd example is 3 × 5 or (+ 3 (double 3) (double 3)), and that's our procedure: whenever b is even we halve b and double a; if we can't, we add whatever a is and reduce b by 1.
(define (even? n)
(= (remainder n 2) 0))
(define (double n)
(+ n n ))
(define (halve n)
(/ n 2))
(define (* a b)
  (cond ((= b 0) 0)
        ((even? b) (* (double a) (halve b)))
        (else (+ a (* a (- b 1))))))
We can count the number of arithmetic steps by altering our procedure, then check it against log (a built-in natural-log procedure) to make sure our number of steps is logarithmic. The book so far is only concerned with the actual arithmetic steps; it isn't concerned with the cost of calling halve and double, so I assume those are constant-step operations and thus ignored:
(define (* a b)
  (cond ((= b 0) 0)
        ((even? b) (* (double a) (halve b)))
        (else (+ 1 (* a (- b 1))))))
(display "Actual number of arithmetic steps:")
(* 2 100)
(* 2 1000)
(* 2 10000)
(display "log_e of 100, 1000, and 10000:")
(log 100)
(log 1000)
(log 10000)
Ex 1.19
Apply the first state change \(a \leftarrow bq + aq + ap\) and \(b \leftarrow bp + aq\)
- \(T_{pq}(a, b)\)
- \(= (bq + aq + ap,\; bp + aq)\)
Then apply the state change again (there are now 2 b terms and 3 a terms)
- \(T_{pq}(bq + aq + ap,\; bp + aq)\)
- \(= ((bp + aq)q + (bq + aq + ap)q + (bq + aq + ap)p,\; (bp + aq)p + (bq + aq + ap)q)\)
Distribute everything and group like terms
- \((bpq + bq^2 + bpq) + (aq^2 + aq^2 + apq + apq + ap^2),\; (bp^2 + bq^2) + (apq + aq^2 + apq)\)
Factor out b and a
- \(b(2pq + q^2) + a(q^2 + q^2 + pq + pq + p^2),\; b(q^2 + p^2) + a(2pq + q^2)\)
- \(b(2pq + q^2) + a(2pq + q^2) + a(q^2 + p^2),\; b(q^2 + p^2) + a(2pq + q^2)\)
Matching this against one application of \(T_{p'q'}\), which must produce \(bq' + aq' + ap'\) and \(bp' + aq'\), gives \(q' = 2pq + q^2\) and \(p' = q^2 + p^2\)
I only know this from the TAOCP books, which have similar state changes; otherwise, iirc, I didn't understand this exercise the first time I saw it years ago.
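Plugging those into the fib-iter template from the exercise gives fib in a logarithmic number of steps; the skeleton is the book's, and the filled-in lines come from the derivation above:

(define (fib n)
  (fib-iter 1 0 0 1 n))
(define (fib-iter a b p q count)
  (cond ((= count 0) b)
        ((even? count)
         (fib-iter a
                   b
                   (+ (* p p) (* q q))   ; p' = p^2 + q^2
                   (+ (* 2 p q) (* q q)) ; q' = 2pq + q^2
                   (/ count 2)))
        (else (fib-iter (+ (* b q) (* a q) (* a p)) ; a <- bq + aq + ap
                        (+ (* b p) (* a q))         ; b <- bp + aq
                        p
                        q
                        (- count 1)))))
; hand tests: (fib 10) should be 55, (fib 20) should be 6765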
SICP 1.2.5
This is 'integer division', i.e. mod: in GCD(206, 40), 40 goes into 206 five times (200) with 6 remaining, so 206 mod 40 is 6.
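For reference, the book's procedure, with the steps for 206 and 40 as a comment:

(define (gcd a b)
  (if (= b 0)
      a
      (gcd b (remainder a b))))
; (gcd 206 40) -> (gcd 40 6) -> (gcd 6 4) -> (gcd 4 2) -> (gcd 2 0) -> 2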
The number of steps of GCD(206, 40) is bounded by the log of the smaller of the two inputs, log(40) (base phi), and since n = 40 is the input this is \(\theta(\log n)\). It is derived by taking the log of both sides of that inequality in the book.
You don't need to know this, but if you were curious about Lamé's theorem: first notice that each step of Euclid's algorithm has to find the gcd of two inputs smaller than the previous ones. When you unroll the steps of the algorithm like a recurrence (which we haven't yet learned) and compare n, the smaller second input, at each step, the first step needs n ≥ 1, the second needs n at least the previous step's value, the third at least the sum of the last 2 steps… you see where this is going: you eventually get the fib sequence, \(n \ge \frac{\varphi^k}{\sqrt 5}\). Take the log of both sides: \(\log(n) \ge k\log(\varphi) - \log(\sqrt 5)\), or \(\log(n) \ge kc_1 - c_2\) where the c's are constant factors we don't care about. Now it's clear that the number of steps k is logarithmic in the size of the input n. Another way to tell this was logarithmic: the input is cut by half or more each loop, so in a log(input) number of steps the input is exhausted.
This is something else I only know from reading TAOCP vol 2 section 4 which has a very long analysis of Euclid's algorithm.
Ex. 1.20: chapter 4 extensively covers this. In applicative-order, all parameters are evaluated before being applied; normal-order is a form of 'lazy' eval where nothing is evaluated until needed. This results in a ton of normal-order remainder operations, which I gave up counting after 8, whereas applicative-order is only 4 remainder ops. In a modern lazy-eval language the problem of normal-order copying everything around is fixed by storing a parameter in a single location in memory, so when it is applied as an argument to a procedure only the value is shared and the parameter needs to be evaluated only once.
SICP 1.2.6
(define (smallest-divisor n)
(find-divisor n 2))
(define (find-divisor n test-divisor)
(cond ((> (square test-divisor) n) n)
((divides? test-divisor n) test-divisor)
(else (find-divisor n (+ test-divisor 1)))))
(define (divides? a b)
(= (remainder b a) 0))
(define (prime? n)
(= n (smallest-divisor n)))
(define (expmod base exp m)
(cond ((= exp 0) 1)
((even? exp)
(remainder (square (expmod base (/ exp 2) m))
m))
(else
(remainder (* base (expmod base (- exp 1) m))
m))))
(define (fermat-test n)
(define (try-it a)
(= (expmod a n n) a))
(try-it (+ 1 (random (- n 1)))))
; test divisor prime test w/Carmichael numbers
(prime? 561)
; test mod prime test
(fermat-test 561)
1 ]=>
;Value: #f
1 ]=>
;Value: #t
Ex 1.22
For search-for-primes: a modern x86 or ARM64 desktop/laptop can execute on the order of 100 million operations per second, and while the interpreter will be slower it's still going to manage at least tens of millions per second. There is no point timing the tests unless you start at enormous inputs.
1 ]=> Here are the first 3 primes from 1000000000000 to 10000000000000
Prime 1000000000039 took this long to find: 1.2100000000000009
Prime 1000000000061 took this long to find: 1.1999999999999993
Prime 1000000000063 took this long to find: 1.2100000000000009
End of Primes
Here are the first 3 primes from 10000000000000 to 100000000000000
Prime 10000000000037 took this long to find: 3.7799999999999994
Prime 10000000000051 took this long to find: 3.789999999999999
Prime 10000000000099 took this long to find: 3.75
End of Primes
Adding a factor of 10 increases runtime by roughly 3x, which is about the square root of 10 (\(\sqrt{10} \approx 3.16\)).
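A sketch of one way to write a search-for-primes producing output like the above, assuming the book's prime? from 1.2.6 is loaded; find-p, the message strings, and the per-prime use of (runtime) are my own choices (skipping even candidates, which the next exercise explores, is left out for clarity):

(define (search-for-primes start how-many)
  (define (find-p n found start-time)
    (cond ((= found how-many) (display "End of Primes") (newline))
          ((prime? n)
           (display "Prime ") (display n)
           (display " took this long to find: ")
           (display (- (runtime) start-time)) (newline)
           (find-p (+ n 1) (+ found 1) (runtime)))
          (else (find-p (+ n 1) found start-time))))
  (find-p start 0 (runtime)))
; (search-for-primes 1000000000000 3)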
Ex 1.23
After you discover that the repeated and needless testing of even numbers with an if predicate slows the process down try the following:
(define (bytwo? n)
  (divides? 2 n)) ; note: divides? checks that its first argument divides its second
; change defaults to 3
(define (smallest-divisor n)
  (find-divisor n 3))
; add this one-time check to your search-for-primes function
; assuming you are using a 'trampoline' and have a nested internal find function
(if (bytwo? start)
    (find-p (+ start 1) 0)
    (find-p start 0))
Testing the same trillion-sized inputs I get a runtime ratio on average of 1.21/.609, or ~2, the ratio we were looking for, as it was the repeated if predicate slowing the runtime down.
Ex 1.24
Impossible to test on modern machines: CPUs do so much magic optimizing constant ops since the 1980s that no matter how large the input you won't see an O(log n) process. In the book they claim a random sample of just 2 choices gives better than a 3-out-of-4 chance that n is prime, since \(1 - (1/2)^2\) is 0.75, and in Sussman's MIT course Assignment 0 Diffie-Hellman is the same problem, where he recommends a minimum of 20 random values be tested, so \(1 - (1/2)^{20}\) or 0.9999.., but even with 50 tests the runtime is still constant on my machine, or O(1).
Misc exercises
Ex 1.25: in the 'A lisp hacker' solution, the call to fast-expt will keep on squaring gigantic numbers until it returns, whereas expmod calls remainder with m on every pass over the delayed computations, so by the time the delayed squaring ops run, their arguments are small. If you hand evaluate expmod for \(3^7\) modulo 7, I think the biggest square in all the returns was 36.
Ex 1.26: a 'loose reasoner' (the hint is the name Eva Lu Ator). Scheme uses applicative-order evaluation, so in (square (expmod ...)) the operand is evaluated to a value once and then square is applied, whereas in (* (expmod ...) (expmod ...)) the expression is evaluated twice before multiplication is applied to the values, successively doubling the work every time that cond branch is used.
The last exercise, 1.28, is explained here, where 'nontrivial' means something that isn't 1: \(1^2\) has a trivial square root of 1, but any odd number squared mod 8 has a nontrivial square root of 1. For example \(3^2\) mod 8, i.e. 9 mod 8, is 1, so 3 is a nontrivial square root of 1 mod 8. The exercise is actually trivial even if the square root is nontrivial: all you have to do is drop a check into the squaring branch of expmod (a number that is not 1 or n-1 whose square is congruent to 1 mod n) to reveal a nontrivial square root, and if one is found return 0, wiping out all the delayed multiplications and remainders of expmod, so when the result gets back to fermat-test it compares a to 0 and fails.
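A sketch of that check bolted onto the expmod from 1.2.6 above; the helper name check-nontrivial-sqrt is my own:

(define (check-nontrivial-sqrt x m)
  ; if x is a nontrivial square root of 1 mod m, signal it with 0,
  ; otherwise carry on squaring as expmod normally would
  (if (and (not (= x 1))
           (not (= x (- m 1)))
           (= (remainder (* x x) m) 1))
      0
      (remainder (* x x) m)))
(define (expmod base exp m)
  (cond ((= exp 0) 1)
        ((even? exp)
         (check-nontrivial-sqrt (expmod base (/ exp 2) m) m))
        (else
         (remainder (* base (expmod base (- exp 1) m))
                    m))))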
If you enjoy the strange world of number theory, we'll be doing more of it when we cover Sussman's new book, and Richard Borcherds, a Fields medalist, has an introductory playlist on his YouTube channel if interested.
SICP 1.3.1
The integrals in this chapter aren't used, only the numerical approximations, which we are given full formulas for, so if you've never taken calculus it won't matter.
Ex 1.29
Simpson's rule is explained here (click on show answer).
We're asked to define a procedure that takes arguments f, a, b, n, where f is a function to integrate, a and b are the range of integration, and n controls the precision.
(define (sum term a next b)
  (if (> a b)
      0
      (+ (term a)
         (sum term (next a) next b))))
(define (srule f a b n)
(define (parity x) (+ x 2))
(define delta-x (/ (- b a) n))
(define (y k) (f (+ a (* k delta-x))))
(* (/ delta-x 3)
(+ (* (sum y 1 parity (- n 1)) 4)
(* (sum y 2 parity (- n 1)) 2)
(y 0)
(y n))))
; test functions y = 1/x+1 and y = x^2
(define (fun x) (/ 1.0 (+ x 1)))
(define (pow x) (* x x))
; this should equal ~0.2876831
(srule fun 2 3 4)
; this should equal 215/3
(srule pow 1 6 4)
Exercises 1.30 to 1.33 are standard programming, introducing the concepts of fold/reduce (accumulate) and filter, which will be used later on lists and streams.
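For reference, the accumulate signature Ex 1.32 asks for, with sum recovered as a special case (the test expression at the end is my own):

(define (accumulate combiner null-value term a next b)
  (if (> a b)
      null-value
      (combiner (term a)
                (accumulate combiner null-value term (next a) next b))))
(define (sum term a next b)
  (accumulate + 0 term a next b))
; (sum (lambda (x) x) 1 (lambda (x) (+ x 1)) 10) should be 55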
Ex. 1.34
(define (f g) (g 2))
If f is applied to itself, f becomes the g in (g 2), so (f f) becomes (f 2); now 2 is passed as g, and the body tries to eval (2 2), an error since 2 is not a procedure.
TODO
SDF 0
Now that we've finished Chapter 1 we can do the first pset from Sussman's 2024 course Adventures in Advanced Symbolic Programming and latest book Software Design for Flexibility (SDF). Let's try Assignment 0, Diffie-Hellman; its only prereq is SICP Chapter 1. TODO
.. cont