Let's Do Structure and Interpretation of Computer Programs (SICP)
SICP (and SDF) Workshop
This exists because if you ask around about how to learn programming, someone will likely shill this book to you since it's a seminal CS text that's been used for decades. It's easier to do after you have at least some programming experience, because then you won't miss as many details: SICP gives you almost everything at once. Most readers like myself get hung up on the first few chapters because of the math examples and give up immediately, but they are only examples and not the core of the text, so don't give up so easily. In interviews the authors explained this course was taught at MIT after 2 or 3 semesters of calculus, so for that target audience it made sense to use math examples.
SICP is lecture notes that became a book; the original 6.001 course had many assignments in addition to the book exercises. Some examples of these are here and here. Instead of doing those old 6.001 assignments I'm going to do the 2024 assignments from Sussman's 6.945 course and his new book Software Design for Flexibility (SDF), where many of those assignments point to chapters in SICP. SDF is a wild book teaching the exact opposite of current-day software engineering, showing how complex systems in nature are designed for flexibility so that new abilities (features) can be added effortlessly without any kind of rewriting of the existing system.
SICP in 2024
MIT taught some form of this book in their 6.001 course from 1980 until 2007, and I think still offers a condensed version in 6.037 (now 6.9550). UC Berkeley teaches a Python derivative of the book, and there is an authorized JavaScript version from 2022 taught at the National University of Singapore, though the authors had to make significant changes to chapter 4 because of the limitations of JavaScript.
I add solutions to the sicp wiki as I go, but only when another solution would be helpful.
Optional lectures
There are lectures from 1986 here, and if you search YouTube you'll find many more recent lectures from Berkeley, plus countless hackathons and compsci reading groups with videos on working through the book.
BEGIN
Install mit-scheme
To do the book, read the setup instructions from Sussman's grad course and only use mit-scheme, or you may have problems in later chapters when we need the picture language, parallel.scm, or mutable lists. If you have to install a VM, try the QEMU image for Guix System because its entire configuration is done using scheme. Guile scheme may work for the entire book too, but I have no idea; it has a picture language at least.
I'm aware of #lang sicp in DrRacket and many other third-party implementations. You can start with those for the first few chapters, but if you want to do Sussman's more recent books like SDF you're going to have to use the latest mit-scheme anyway, so we may as well begin with the language the book actually wants us to use.
IDE/Editor
You can type scheme into any text editor, save as filename.scm, run mit-scheme, and load your file into the REPL to manipulate it there; you don't have to use an IDE. Loading is trivial with (load "filename.scm").
The only IDEs I use are emacs and edwin, the editor that comes with mit-scheme. Edwin is a clone of an old emacs version (Emacs 18 I believe, see the ref card below) written entirely in scheme, and is what 6.001 and the graduate courses at MIT have used for decades. Sussman still uses it as per his 2022 course setup notes. You start edwin with mit-scheme --edit and are good to go without needing to install anything else: a scheme shell spawns immediately upon opening and debugging is all built-in. However, regular emacs will probably be more familiar to you with 'modern' default behavior: you'll have a GUI menu, the ability to highlight and cut/paste with a mouse, the delete key won't default to a backspace, etc.
- Editor setup from 6.945 and setup from 6.946 for emacs or edwin
- Sussman's .edwin config file here for better fonts (save in home directory as .edwin (with the period))
- Edwin specific docs here (it has a stepper too)
If you choose emacs, install vanilla emacs, not 'spacemacs' or any of those preconfigured versions that remove the GUI drop-down menus. Click Options->Customize Emacs->Custom Themes and change the colors around if you want. You won't need a config file; everything like MELPA package install is already included (and scheme mode), so you will automatically enter scheme mode when opening any file ending in .scm
I wouldn't recommend Paredit mode or Geiser or anything else until you start writing larger programs in the later chapters, because in the beginning you learn scope by seeing how all the parens close blocks of expressions, and having something automatically shift parens around before you have experience is going to be rage inducing. When your programs start becoming one big function in chapter 4, then install tools.
Commands:
- C-x C-c (exit emacs/edwin) is hold down Ctrl and type x, keep holding Ctrl and type c.
- C-h t is hold down Ctrl and type h, release Ctrl and type t (starts the tutorial).
- PgUp/PgDn, End/Home, arrow keys and mouse all work too; you don't have to use the movement commands in the tutorial
- M-x the M is the 'meta' key which on a modern keyboard is usually ALT (sometimes the Windows key) but you can redefine both C and M to be something else.
- C-g or Ctrl-g is the escape key if you accidentally screw up mid command and want to try again.
Open multiple screens
In either edwin or emacs: use the command C-x C-f and type test.scm to open or create a new file. Use C-x 3 and a new window should open to the right, or if you want the new window below use C-x 2. Click on the new window, type C-x b, and press enter to switch to the scheme REPL, which was the buffer you saw when you first opened the editor. Now you have a screen to enter code with the output beside it. In emacs there is no default scheme buffer (unless you configure one); you have to select 'run inferior scheme' from the dropdown menu or M-x mit-scheme.
Eval code
Click on the test.scm buffer and enter some random code:
(define x 3)
(+ x 2)
(+ x 2 2)
"this is a string" ;this is a comment
Run it in the REPL with M-o or C-M-z (edwin), or follow the emacs GUI menus to 'evaluate region', all described in the docs and tutorial. Type C-h m to get a list of available commands in the current buffer. To switch out of the help buffer, C-x b, and the default will be test.scm or wherever you were last. In emacs you can highlight snippets of code with the mouse and eval only that selected code in the REPL to test it by hand.
Errors
If you get something similar:
;Unbound variable: *2
;To continue, call RESTART with an option number:
; (RESTART 3) => Specify a value to use instead of *2.
; (RESTART 2) => Define *2 to a given value.
; (RESTART 1) => Return to read-eval-print level 1.
Then typing (restart 1) will return to the REPL, or you can try option 3 and fix the typo, or break/kill it in emacs. The bulk of errors will be paren problems in the beginning, but that's how you learn, and after a few of them you're much more careful about code and what scope it's in. If you run the debugger, type C-h m to get a list of commands (you can run that in the REPL too).
Edwin demo in Lecture 1
First lecture from the 1986 series if you want to follow him with edwin like @35:40 in the video:
- Start edwin with 'mit-scheme --edit'
- C-x 2 (split screen like he has)
- C-x C-f and type in lecture.scm (it will create this file)
- Any expressions eval with C-x C-e
- Press tab to indent when he demos 'pretty printing'
In either the scheme buffer or editor window type C-h m to see all the scheme mode commands like indenting multiple lines of code.
Book versions
I'm using the official online version and one of the unofficial PDFs floating around to see the figures, which can be impossible to read in the online html version. There's also an info version you can run in the terminal or in another emacs window. Remember those are unofficial, so there could be extra errata introduced.
SICP 1.x
A computational process I would interpret as a running program or the act of computation itself. A procedure is sometimes called a function in modern programming parlance, but the book makes it clear later in this chapter that these are different things, pointing out how a math function definition (what something is) is nothing like a procedure (how to compute it).
Ex 1.2 is an example of an impossible to read figure. Use the pdf.
(/ (+ 5 4 (- 2 (- 3 (+ 6 (/ 4 5)))))
   (* 3 (- 6 2) (- 2 7)))
Ex 1.3 is a simple (cond (and ...)) exercise; you can write your own test cases/examples too:
(= (three-square-sum 1 2 3) (+ (* 2 2) (* 3 3)))
(= (three-square-sum 3 1 2) 13)
(= (three-square-sum 3 2 1) 13)
(= (three-square-sum 1 1 1) 2)
(= (three-square-sum 0 1 0) 1)
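A minimal sketch of one possible implementation, picking the two larger arguments by testing which one is the smallest (three-square-sum is just my name for it):

(define (square x) (* x x))

(define (three-square-sum a b c)
  (cond ((and (<= a b) (<= a c)) (+ (square b) (square c)))
        ((and (<= b a) (<= b c)) (+ (square a) (square c)))
        (else (+ (square a) (square b)))))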
Ex 1.4 many languages won't let you do this, and it's a good example of symbolic programming: the operator position is itself evaluated, so the combination evals to (+ a b) if b > 0 or (- a b) if not.
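The book's procedure for reference, where the operator applied is the value of the if expression:

(define (a-plus-abs-b a b)
  ((if (> b 0) + -) a b))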
Ex 1.5 (define (p) (p)) is a non-terminating loop where (p) keeps calling itself. Since normal-order doesn't eval arguments into values until they are needed, (p) is never run because it's never needed: (test 0 (p)) will always evaluate to 0. Under applicative-order, however, all the arguments are evaluated first before test is applied, so it will evaluate (p) and go into an infinite loop.
Ex 1.6 new-if is a regular procedure, so its inputs (all the subexpressions) are evaluated first with applicative-order eval. The three inputs to new-if: (good-enough? guess x) returns a boolean, guess is already a value, but (sqrt-iter (improve guess x) x) will keep improving the guess until the end of days since the cond predicate good-enough? is never reached to stop the recursion. The special form if does not automatically evaluate the else clause; it performs the predicate test first.
Ex 1.7 try examples:
1 ]=> (sqrt 0.0009)
;Value: .04030062264654547

1 ]=> (sqrt 1000000000000000)
;takes forever, infinite loop?
(sqrt 0.0009) should be .03, and the gigantic number never terminates. This happens because with our current testing strategy of subtracting the square of the guess from the radicand, the epsilon (0.001) isn't precise enough for small numbers, and for very large numbers the guesses clearly never meet that precision, or it takes too long to do so. If we instead take the ratio of the improved guess to the existing guess, the closer the two are together the closer that ratio approaches 1, so we can adjust our program to subtract that ratio from 1 and check whether the two guesses are within epsilon of each other.
(define (good-enough? guess x)
  (< (abs (- 1 (/ (improve guess x) guess))) 0.001))
Now gigantic square roots eval immediately and tiny square roots are close enough to the precision of the wolfram alpha online calculator.
Ex. 1.8 The cube root of 8 is 2, or 2 * 2 * 2. Plugging that into the formula: (8/4 + 4)/3 or 6/3 = 2. All I did was change the improve procedure and delete the average procedure since it wasn't needed anymore:
(define (improve guess x)
  (/ (+ (/ x (* guess guess))
        (* 2 guess))
     3))
1.1.8 Black-Box Abstractions: if you look at the mit-scheme documentation and scroll to the index at the bottom you'll find definitions for exp and log, which are \(e^x\) and log(x) base e. The square x function defined as (exp (double (log x))): if x is 8 then (exp ~4.16), which is 64. A log is a linearized exponent; it tells us what the x in \(e^x = 8\) is, and it's 2.079. If we double that and feed it to e then \(e^{4.16} \approx 64\). We can see this ourselves using desmos: go along the x-axis to 2, go up until it meets the blue log line, and it lands at y = 0.7 or so. Double that to ~1.4, travel the x-axis to 1.4, go up to the exp(x) line, and you'll see you intersect at y = 4, the square of 2. Both e and its log (usually written ln or 'the natural log') are preferred for math modeling because of properties like the derivative of \(e^x\) being itself, this square property we just learned, and that you can estimate \(e^x\) and ln(x) on a piece of paper when working with small displacements. For example \(e^x\) for small x like 0.05 is approximately 1 + x, and ln(1 + x) is approximately x for 'small' x.
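In scheme that exp/log square looks like this (double is the obvious helper; the result carries some floating point error):

(define (double x) (+ x x))

(define (square x) (exp (double (log x))))

(square 8) ; => ~64, give or take floating point error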
SICP 1.2.1
I rewrote fact-iter to instead be a function nested inside another function, with a 'trampoline', meaning execution falls to the bottom of the function hitting (f 1 n) and bounces back calling f. It's the same as what is in the book except simplified. A factorial is the product of every number from 1 up to n, so 3! is 1x2x3.
(define (factorial n)
  (define (f total counter)
    (if (< counter 2)
        total
        (f (* total counter) (- counter 1))))
  (f 1 n))

(factorial 6)
(factorial 1)
(factorial 0)
The difference in the book between the delayed chain of operations and the iterated version is called re-entry: the delayed chain has to re-enter the same function before returning a value, so it must keep all that accounting of state somewhere, which we will eventually learn is the stack.
Ex 1.9 addition is defined as a function that uses a as a counter: it increments b every time there is still an a left, until a = 0. The first procedure, (inc (function call)), is going to build a chain of delayed increments until a = 0, then all the delayed increment operations are performed; the second carries its state entirely in the arguments.
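The exercise's two procedures for reference, using the book's inc/dec helpers:

;; recursive process: builds a chain of delayed incs
(define (+ a b)
  (if (= a 0) b (inc (+ (dec a) b))))

;; iterative process: all the state lives in the arguments a and b
(define (+ a b)
  (if (= a 0) b (+ (dec a) (inc b))))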
Ex 1.10 'concise math definitions' means turn (* 5 n n) into \(5n^2\), and if you plug some values into f then f(n) = 2n. If you are using edwin you can run the stepper, see the manual for mit-scheme. Otherwise hand step:
(define (A x y)
  (cond ((= y 0) 0)
        ((= x 0) (* 2 y))
        ((= y 1) 2)
        (else (A (- x 1) (A x (- y 1))))))

(define (g n) (A 1 n))
(define (h n) (A 2 n))

(g 2)
- (A 1 2)
- (A 0 (A 1 1))
- (A 0 2)
- (* 2 2)
- 4

(h 3)
- (A 2 3)
- (A 1 (A 2 2))
- (A 1 (A 1 (A 2 1)))
- (A 1 (A 1 2))
- (A 1 (A 0 (A 1 1)))
- (A 1 (A 0 2))
- (A 1 (* 2 2))
- (A 1 4)
- (A 0 (A 1 3))
- (A 0 (A 0 (A 1 2)))
- (A 0 (A 0 (A 0 (A 1 1))))
- (A 0 (A 0 (A 0 2)))
- (A 0 (A 0 (* 2 2)))
- (A 0 (A 0 4))
- (A 0 (* 2 4))
- (A 0 8)
- (* 2 8)
- 16
g seems to produce \(2^n\) while h generates the sequence 2, 4, 16, 65536… or \(2^{2^{2^{\cdots}}}\).
SICP 1.2.2
Phi or \(\phi\) is a constant like e that is solely here to illustrate (fib n) process is exponential.
The 'golden ratio' is anything satisfying \(\frac{a + b}{a} = \frac{a}{b}\); it's found in the diagonals of pentagons and other geometry you can look up on YouTube, and phi is this ratio. In case it wasn't clear in the book, the iterative algorithm comes from the observation that in 0, 1, 1, 2, 3, 5.. the next in the sequence is the sum of the previous 2 numbers.
Hand step the count-change procedure with only 2 coins and amount 1:
- (cc 1 2)
- (+ (cc 1 1) (cc (- 1 5) 2))
- (+ (cc 1 1) 0)
- (+ (+ (cc 1 0) (cc (- 1 1) 1)) 0)
- (+ (+ 0 (cc 0 1)) 0)
- (+ (+ 0 1) 0)
- 1
The computations are delayed since (+) needs both of its arguments evaluated, so the recursion unrolls fully and evaluation can only begin once each branch bottoms out at 0 or 1.
Ex 1.11
Write out examples 1, 2, 4, 11, 25, 59.., enter them into OEIS, and there's actually a generating function solution a physicist figured out; however, the assignment wants us to create an iterative procedure. You can use the recursive procedure we learned to test your iterative one. I took eqv? from the mit-scheme reference manual.
(define (f n)
  (define (f-iter n1 n2 n3 counter)
    (cond ((< n 3) n)
          ((= counter n) n1)
          (else (f-iter (+ n1 (* 2 n2) (* 3 n3)) n1 n2 (+ counter 1)))))
  (f-iter 2 1 0 2))

(define (g n)
  (cond ((< n 3) n)
        (else (+ (g (- n 1))
                 (* 2 (g (- n 2)))
                 (* 3 (g (- n 3)))))))

(and (eqv? (f 1) (g 1))
     (eqv? (f 2) (g 2))
     (eqv? (f 3) (g 3))
     (eqv? (f 4) (g 4))
     (eqv? (f 0) (g 0)))
Ex 1.12
Pascal's triangle or array is explained here as the number of paths you can take to get to that number in the array, or the coefficients in the expansion of \((a + b)^n\)
(define (pascal row col)
  (cond ((= row 0) 1)
        ((> col row) 0)
        ((or (= col 0) (= col row)) 1)
        ((or (= col 1) (= col (- row 1))) row)
        (else (+ (pascal (- row 1) (- col 1))
                 (pascal (- row 1) col)))))
For example (pascal 4 2) is 6. The first and last columns are always 1, and columns 1 and row - 1 are the same as the row number if you start counting from 0. The value of (pascal 4 2) is the value of (+ (pascal 3 1) (pascal 3 2)), which is (+ 3 (+ (pascal 2 1) (pascal 2 2))) or (+ 3 2 1). Trying to get a middle column element from row 30 takes a long time, as the recursive process generates a tree branch on almost every remaining input.
You can property test your program by noticing that the elements in row n of Pascal's array sum to \(2^{n}\) (reusing the pascal procedure above):
(define (ptest n)
  (define (row-sum count)
    (cond ((= count 0) 1)
          (else (+ (pascal n count)
                   (row-sum (- count 1))))))
  (row-sum n))

(and (eqv? (ptest 2) (* 2 2))
     (eqv? (ptest 3) (* 2 2 2))
     (eqv? (ptest 10) (* 2 2 2 2 2 2 2 2 2 2)))
Ex 1.13
Try some examples in scheme:
(define (fib n)
  (cond ((= n 0) 0)
        ((= n 1) 1)
        (else (+ (fib (- n 1)) (fib (- n 2))))))

(define (phi x) (/ (x 1 (sqrt 5)) 2))

(define (f n) ;; expt and sqrt are in the mit-scheme reference doc
  (/ (- (expt (phi +) n) (expt (phi -) n))
     (sqrt 5)))
Notice phi can accept '-' or '+' as a parameter, one of the benefits of symbolic programming. If you run this and compare (f n) and (fib n) you get exactly the same outputs. The book says to use the definition fib(n) = fib(n - 1) + fib(n - 2), and since fib(n) = \(\frac{\phi^n - \varphi^n}{\sqrt 5}\) we can substitute it wherever we see fib(n):
\[\frac{\phi^n - \varphi^n}{\sqrt 5} = \frac{\phi^{n-1} - \varphi^{n-1}}{\sqrt 5} + \frac{\phi^{n-2} - \varphi^{n-2}}{\sqrt 5}\]
Looking at this solution:
- Clear the fractions by multiplying everything by \(\frac{\sqrt 5}{1}\)
\[\phi^n - \varphi^n = \phi^{n-1} - \varphi^{n-1} + \phi^{n-2} - \varphi^{n-2}\]
- Rewrite the exponents in terms of n. If n = 3 then \(2^3\) is 2*2*2 and (2*2*2)/2 is 2*2 or \(2^{n-1}\), so \(\phi^{n-1} = \phi^n/\phi\) and \(\phi^{n-2} = \phi^n/\phi^2\):
\[\phi^n-\varphi^n=\frac{\phi^n}{\phi}-\frac{\varphi^n}{\varphi}+\frac{\phi^n}{\phi^2}-\frac{\varphi^n}{\varphi^2}\]
- Factor out \(\phi^n\) and \(\varphi^n\):
\[\phi^n-\varphi^n=\phi^n\left(\frac{1}{\phi}+\frac{1}{\phi^2}\right)-\varphi^n\left(\frac{1}{\varphi}+\frac{1}{\varphi^2}\right)\]
Here we can take advantage of the fact that \(\phi^2 = \phi + 1\) and \(1/\phi = \phi - 1\), or keep following the solution above to see how they plug the definitions for phi and varphi given in the exercise back in, showing that \(\phi^n/\sqrt 5\) is never further away than +/- half of the value of fib(n), thus fib(n) is the closest integer.
SICP 1.2.3
Watch the lecture for this chapter where he explains what time and space mean; the whole lecture is worth watching, but the specific time/space explanation starts around 17:09. Time is the number of steps that are evaluated and space is the amount of data needing to be stored. The numbers 16 and 8 differ by a constant factor of 2, but as he says, in this analysis we don't care what the constant is, just that it is a constant and not changing with the input. Try the iterative procedure in the lecture that uses '1+' and '-1+', both built-in increment/decrement functions. He redefines the '+' procedure, which you can do in lisp: you can rename or redefine anything you want. That iterative procedure uses an accumulator; the y parameter keeps incrementing until x is 0 and y is returned.
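A sketch of that procedure, renamed add here so we don't clobber + in the REPL (1+ and -1+ are mit-scheme built-ins):

(define (add x y)
  (if (= x 0)
      y
      (add (-1+ x) (1+ y)))) ; y accumulates while x counts down

(add 3 4) ; => 7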
He uses big-O in the lecture (and the first edition) and big-theta in the second edition we're reading. O(n) is an upper bound and \(\Theta(n)\) is a tighter bound of both a lower and an upper bound.
In the book the definition for \(\Theta(f(n))\) is \(k_{1}f(n) \le R(n) \le k_{2}f(n)\), where the two k's represent constants that can be the same or different. R(n) is a function we define that represents the resources for a process (space or steps), and f(n) is the function inside \(\Theta(f(n))\), chosen from a list of standard math functions. A small detail: in the definition of big-O or big-theta, the resource function R(n) is already in O(f(n)), meaning it is that function f(n), or belongs to the family or set of all functions represented by f(n); it doesn't grow to be that function, it already is by definition. The computational process, however, can grow as described in the book, but the function R(n) in asymptotics is derived from that process and isn't itself growing, though you will see this phrasing all over YouTube tutorials.
You may have seen other books or tutorials where they analyze the code to come up with R(n), or create recurrence relations and solve them, but here so far we are only looking at the computational process, meaning the exact number of steps generated for each run of the program and the storage needed to hold those steps in memory. A good example is going back to the chapter on linear recursion and looking at the recursive factorial process. It acts like a line being scaled out then scaled back, in other words linear. No matter what input we give, it will do this linear process, so it is at least linear and at most linear. You can check this by seeing that given an input of 6 it expands out to 5 multiplication operations, and it needs to hold space for those 5 delayed operations as well. Given 100 as an input it expands out to 99 multiplication ops.
- The factorial resource function is R(n) = n - 1, and the constants here can be \(k_1\) = 0.5 and \(k_2\) = 1; it only matters that this inequality holds up to some constant factor(s). The \(\in\) symbol means 'is in':
\[k_{1}\cdot n \le n - 1 \le k_{2}\cdot n \quad\Rightarrow\quad n - 1 \in \Theta(n)\]
- Fib(n) is R(n) = \(\frac{\phi^n}{\sqrt 5}\)
\[\frac{1}{\sqrt 5}\cdot\phi^n \le \frac{\phi^n}{\sqrt 5} \le \frac{1}{\sqrt 5}\cdot\phi^n \quad\Rightarrow\quad \frac{\phi^n}{\sqrt 5} \in \Theta(\phi^{n})\]
- The definition of big-O is \(R(n) \le kf(n)\) so fib(n) is:
\[\frac{\phi^n}{\sqrt 5} \le \frac{1}{\sqrt 5}\cdot\phi^n \quad\Rightarrow\quad \frac{\phi^n}{\sqrt 5} \in O(\phi^{n})\]
- Phi is a constant 1.6180.. and \((1.62)^n < 2^n\), so this is another upper bound, though not as tight fitting and thus less descriptive:
\[\frac{\phi^n}{\sqrt 5} \le \frac{1}{\sqrt 5}\cdot\phi^n \le \frac{1}{\sqrt 5}\cdot 2^n \quad\Rightarrow\quad \frac{\phi^n}{\sqrt 5} \in O(2^{n})\]
Watch this to learn more, like O-tilde, poly(), and what exactly the standard form functions 'chosen from a list' in O(f(n)) are. That same lecture also explains why 'n' is used in O(f(n)) instead of O(f(x)): in this analysis inputs are supposed to grow to infinity, so n is used by convention to represent near-infinite inputs.
Ex 1.14
There is a nice graphic here showing the full (cc 11 5) tree. He used (display) to generate the data to plug into graphviz, and we can do the same while adjusting the procedure to count the number of calls or nodes in the process tree. Zero out the cond values, then in the else branch add 2 for the 2 calls to cc:
(define (count-nodes amount)
  (cc amount 5))

(define (cc amount kinds-of-coins)
  (display "(cc ") (display amount) (display " ")
  (display kinds-of-coins) (display ")") (newline)
  (cond ((= amount 0) 0)
        ((or (< amount 0) (= kinds-of-coins 0)) 0)
        (else (+ 2
                 (cc amount (- kinds-of-coins 1))
                 (cc (- amount (first-denomination kinds-of-coins))
                     kinds-of-coins)))))

(define (first-denomination kinds-of-coins)
  (cond ((= kinds-of-coins 1) 1)
        ((= kinds-of-coins 2) 5)
        ((= kinds-of-coins 3) 10)
        ((= kinds-of-coins 4) 25)
        ((= kinds-of-coins 5) 50)))

1 ]=> (count-nodes 1)
(cc 1 5)
(cc -49 5)
(cc 1 4)
(cc -24 4)
(cc 1 3)
(cc -9 3)
(cc 1 2)
(cc -4 2)
(cc 1 1)
(cc 0 1)
(cc 1 0)
;Value: 10
We can do the same for the fib function and compare the result to the tree in the book (14 nodes, not including the root):
(define (fib-nodes n)
  (cond ((= n 0) 0)
        ((= n 1) 0)
        (else (+ 2 (fib-nodes (- n 1)) (fib-nodes (- n 2))))))

1 ]=> (fib-nodes 5)
;Value: 14
We can see the count-change process takes far fewer steps than fib:
1 ]=> (fib-nodes 20)
;Value: 21890

1 ]=> (fib-nodes 30)
;Value: 2692536

1 ]=> (count-nodes 20)
;Value: 150

1 ]=> (count-nodes 30)
;Value: 390
Space is defined as the max depth of the longest tree branch, because we only have to keep track of the nodes above us. Count-change's longest branch is whatever the input is: since there's a coin of size 1, it will take at minimum n nodes of space for an n-sized input, so the lower bound on space is n. We can always find some constant to multiply n by in order to account for the extra depth contributed by the other coin denominations, meaning space is \(\Theta(n)\): it won't exceed n (times a constant) and it won't fall below n (times a constant).
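To check the depth claim empirically, here's a sketch of a max-depth counter in the style of count-nodes, reusing first-denomination from above (cc-depth is my own name):

(define (cc-depth amount kinds-of-coins)
  (cond ((= amount 0) 0)
        ((or (< amount 0) (= kinds-of-coins 0)) 0)
        (else (+ 1 (max (cc-depth amount (- kinds-of-coins 1))
                        (cc-depth (- amount (first-denomination kinds-of-coins))
                                  kinds-of-coins))))))

(cc-depth 100 5) ; grows linearly with the amount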
To estimate the amount of steps try gathering data and plot it:
1 ]=> (count-nodes 100) ;Value: 15498
1 ]=> (count-nodes 200) ;Value: 229588
1 ]=> (count-nodes 300) ;Value: 1292590
1 ]=> (count-nodes 400) ;Value: 4642024
1 ]=> (count-nodes 500) ;Value: 12822610
1 ]=> (count-nodes 600) ;Value: 29806268
1 ]=> (count-nodes 700) ;Value: 61312118
1 ]=> (count-nodes 800) ;Value: 115126480
1 ]=> (count-nodes 900) ;Value: 201422874
Plug it into a desmos like I did here, or write a procedure to compare \(n^5\) to count-change(n).
You can see for inputs up to 900 it's bounded by \(x^3\), but if you keep going the points approach the y-axis quickly. As you add 100 to the input, the nodes appear to double in size from input 600 to 900, so we can keep approximating, doubling the last node count, and see that by the time you hit x = 1100 the \(x^3\) curve is no longer bounding. If you keep going it eventually passes \(x^4\) too, but it will always be bounded by \(x^5\), and that is the critical insight to big-O or big-theta: the inputs are supposed to be approaching infinity, so no matter how big the input you can guarantee the bound.
Ex 1.15
Build another step counter
(define (count c angle)
  (display c) (display " ") (display angle) (newline)
  (if (not (> (abs angle) 0.1))
      c
      (count (+ c 1) (/ angle 3.0))))

(count 0 12.5)
1 ]=> 1 4.166666666666667
2 1.388888888888889
3 .462962962962963
4 .154320987654321
5 5.1440329218107005e-2
;Value: 5

(count 0 100)
1 ]=> ;Value: 7

(count 0 1000)
1 ]=> ;Value: 9
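For reference, the book's sine procedure that this counter mirrors:

(define (cube x) (* x x x))

(define (p x) (- (* 3 x) (* 4 (cube x))))

(define (sine angle)
  (if (not (> (abs angle) 0.1))
      angle
      (p (sine (/ angle 3.0)))))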
The sine function builds up (p (p (p (p (p 0.05144))))), and once the recursion stops and the angle is returned, p can start to reduce. Each call divides the angle by 3, so the number of steps grows as \(\log_3(a)\) (any other log base only differs by a constant factor): we multiplied the input by 10 and only added 2 more steps. Space is the bookkeeping we have to save and is the depth of the p chain, so also \(\Theta(\log(a))\), as the assignment wants us to express it as a function of a.
SICP 1.2.4
Look up logarithms if you've forgotten them; log notation by convention:
- log (base 10)
- lg (base 2)
- ln (base e)
ln is almost always used, or base 2 assumed, in compsci, as e and its log have many nice properties for approximating on the fly, like the derivative of \(e^x\) being itself. A log (base 10) is roughly the number of zeros in a number, so if your program is logarithmic in steps and inputs to your web app spike to a million, you only pay an additional cost of about 6 steps.
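You can check this in the REPL; mit-scheme's log is the natural log, so divide to change the base:

(log 1000000)              ; ~13.8
(/ (log 1000000) (log 10)) ; ~6, the base-10 log via change of base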
Ex 1.16
The book states the invariant here is \(ab^n\), meaning its value should not change each time we iterate/loop (change state). The hint/observation is that \(a(b^2)^{n/2}\) is always going to be \(ab^n\), since \((b^2)^{n/2}\) is \(b^n\), and that is our even clause in the cond. The else clause, when n is not even, uses \(b \cdot b^{n-1}\), and \(ab \cdot b^{n-1}\) is of course \(ab^n\) too. I wrote some tests here, but this is Lisp: at any time you can manually enter tests into the REPL.
(define (even? n) (= (remainder n 2) 0))
(define (square n) (* n n))

(define (expt b n) (expt-iter 1 b n))

(define (expt-iter a b n)
  (cond ((= n 0) a)
        ((even? n) (expt-iter a (square b) (/ n 2)))   ; a(b^2)^(n/2)
        (else (expt-iter (* a b) b (- n 1)))))         ; ab*b^(n-1)

(define (test-expt)
  (and (eqv? (expt 0 1) 0)
       (eqv? (expt 1 0) 1)
       (eqv? (expt 1 2) (* 1 1))
       (eqv? (expt 2 3) (* 2 2 2))
       (eqv? (expt 3 7) (* 3 3 3 3 3 3 3))))
Ex 1.17
Try examples: \(a \cdot b\) is a added to itself b times, so 3 x 4 is 4 groups of 3 added, or (+ 3 3 3 3), 4 addition steps. How would double/halve help us here? Applying the even test to b: (+ (double 3) (double 3)) is (+ 6 6), 2 steps, and lg(4) is 2. An odd example is 3 x 5 or (+ 3 (double 3) (double 3)), and that's our procedure. Whenever we can halve b we double a; if we can't, we add whatever a is and reduce b by 1.
(define (even? n) (= (remainder n 2) 0))
(define (double n) (+ n n))
(define (halve n) (/ n 2))

(define (* a b)
  (cond ((= b 0) 0)
        ((even? b) (* (double a) (halve b)))
        (else (+ a (* a (- b 1))))))
We can count the number of addition steps by altering our procedure, then check it against log (base e), a built-in procedure, to make sure our number of steps is logarithmic. The book so far is only concerned with the actual arithmetic steps; it isn't concerned with the cost of calling halve and double, so I assume those are constant step operations and thus ignored:
(define (* a b)
  (cond ((= b 0) 0)
        ((even? b) (* (double a) (halve b)))
        (else (+ 1 (* a (- b 1))))))

(display "Actual number of arithmetic steps:")
(* 2 100)
(* 2 1000)
(* 2 10000)

(display "log_e of 100, 1000, and 10000:")
(log 100)
(log 1000)
(log 10000)
Ex 1.19
Apply the first state change \(a \leftarrow bq + aq + ap\) and \(b \leftarrow bp + aq\):

\[T_{pq}(a, b) = (bq + aq + ap,\ bp + aq)\]

Then apply another state change (there are now 2 b terms and 3 a terms):

\[T_{pq}(bq + aq + ap,\ bp + aq) = ((bp + aq)q + (bq + aq + ap)q + (bq + aq + ap)p,\ (bp + aq)p + (bq + aq + ap)q)\]

Distribute everything and group like terms:

\[(bpq + bq^2 + bpq) + (aq^2 + aq^2 + apq + apq + ap^2),\ (bp^2 + bq^2) + (apq + aq^2 + apq)\]

Factor out b and a:

\[b(2pq + q^2) + a(q^2 + q^2 + pq + pq + p^2),\ b(q^2 + p^2) + a(2pq + q^2)\]
\[b(2pq + q^2) + a(2pq + q^2) + a(q^2 + p^2),\ b(q^2 + p^2) + a(2pq + q^2)\]

So \(q' = 2pq + q^2\) and \(p' = q^2 + p^2\).
I only know this from the TAOCP books, which have similar state changes; otherwise, IIRC, I didn't understand this exercise the first time I saw it years ago.
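Plugging p' and q' into the fib-iter skeleton the book gives for this exercise, a sketch of the finished procedure:

(define (fib n)
  (fib-iter 1 0 0 1 n))

(define (fib-iter a b p q count)
  (cond ((= count 0) b)
        ((even? count)
         (fib-iter a
                   b
                   (+ (* p p) (* q q))   ; p' = p^2 + q^2
                   (+ (* 2 p q) (* q q)) ; q' = 2pq + q^2
                   (/ count 2)))
        (else (fib-iter (+ (* b q) (* a q) (* a p))
                        (+ (* b p) (* a q))
                        p
                        q
                        (- count 1)))))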
SICP 1.2.5
This is 'integer division' or mod: the remainder in GCD(206, 40) is 6 because 206 = 5 * 40 with 6 remaining, or 206 mod 40 = 6.
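The book's procedure, and the remainder steps for GCD(206, 40):

(define (gcd a b)
  (if (= b 0)
      a
      (gcd b (remainder a b))))

(gcd 206 40) ; -> (gcd 40 6) -> (gcd 6 4) -> (gcd 4 2) -> (gcd 2 0) => 2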
The number of steps of GCD(206, 40) is bounded by the log of the smaller of the two inputs, log(40) (base phi), and since n = 40 is the input this is \(\Theta(\log n)\). It is derived by taking the log of both sides of that inequality in the book.
You don't need to know this, but if you were curious about Lame's theorem: first notice that each step of Euclid's algorithm has to find the gcd of two inputs that are smaller than the previous inputs. When you unroll the steps of the algorithm like a recurrence (which we haven't yet learned) and compare n, the second smaller input, at each step, then the first step gives n >= 1, the next bounds n by the previous step's value, the one after by the sum of the last 2 steps… you see where this is going, you eventually get the fib sequence: n \(\ge \frac{\phi^k}{\sqrt 5}\). Take the log of both sides: log(n) \(\ge\) k * log(\(\phi\)) - log(\(\sqrt 5\)), or log(n) \(\ge\) kc + c, where the c's are constant factors we don't care about. Now it's clear the number of steps k is logarithmic in the size of the input n. Another way to tell this was logarithmic is to notice the input is cut by half or more every couple of loops, and in a log(input) number of steps the input is exhausted.
This is something else I only know from reading TAOCP vol 2 section 4 which has a very long analysis of Euclid's algorithm.
Ex. 1.20 chapter 4 covers this extensively. With applicative-order, all arguments are evaluated before being applied; normal-order is known as 'lazy' eval, where nothing is evaluated until needed. This results in a ton of normal-order remainder operations, which I gave up counting after 8, whereas applicative-order is only 4 remainder ops.
SICP 1.2.6
Let's write search-for-primes. There are 2 problems with this exercise. First, if you are using a language that isn't mit-scheme you will have to find an equivalent runtime procedure in that language's libraries. Second, a modern desktop or laptop can execute on the order of 100 million operations per second, so there is no point timing the tests unless you start at enormous inputs of 1 trillion or more. Since that is a ridiculously gigantic number, I modified my version to give me the first 3 primes with hardcoded numbers.
I changed timed-prime-test to produce false if not prime so I could test it on Carmichael numbers, and changed some displays around, but the range parameter is probably pointless as we have to use huge numbers.
(define (timed-prime-test n)
  (start-prime-test n (runtime)))

(define (start-prime-test n start-time)
  (if (prime? n)
      (report-prime n (- (runtime) start-time))
      false))

(define (report-prime n elapsed-time)
  (newline)
  (display "Prime ")
  (display n)
  (display " took this long to find: ")
  (display elapsed-time))

(define (first-three-primes start max)
  (newline)
  (display "Here are the first 3 primes from ")
  (display start)
  (display " to ")
  (display max)
  (define (find-p n counter)
    (cond ((> n max) (display "\n Exhausted Range"))
          ((> counter 2) (display "\n End of Primes"))
          ((timed-prime-test n) (find-p (+ n 2) (+ 1 counter)))
          (else (find-p (+ n 2) counter))))
  ;; For Carmichael numbers ... omitted tests here in the trampoline for even? then +1 n
  (find-p start 0))

(define (giga-prime-finder)
  ;; 12 zeros
  (first-three-primes 1000000000000 10000000000000)
  ;; 13 zeros
  (first-three-primes 10000000000000 100000000000000))

(giga-prime-finder)

1 ]=>
Here are the first 3 primes from 1000000000000 to 10000000000000
Prime 1000000000039 took this long to find: 1.2100000000000009
Prime 1000000000061 took this long to find: 1.1999999999999993
Prime 1000000000063 took this long to find: 1.2100000000000009
 End of Primes
Here are the first 3 primes from 10000000000000 to 100000000000000
Prime 10000000000037 took this long to find: 3.7799999999999994
Prime 10000000000051 took this long to find: 3.789999999999999
Prime 10000000000099 took this long to find: 3.75
 End of Primes
Adding a factor of 10 to the input increases runtime by a factor of about 3, which is roughly the square root of 10 (3.16), as \(\Theta(\sqrt n)\) predicts.
Ex 1.23 after you discover that the repeated calling of the next function and the needless if predicate test every loop produce a speedup ratio of only 1.6 instead of the expected 2, try the following:
(define (bytwo? n) (divides? 2 n)) ; note the argument order of the book's divides?

;; change the default to 3
(define (smallest-divisor n) (find-divisor n 3))

;; add this one-time check to your search-for-primes function,
;; assuming you are using a 'trampoline' and have a nested internal find function
(if (bytwo? start)
    (find-p (+ 1 start) 0)
    (find-p start 0))
Testing the same trillion-sized input, I get a runtime ratio on average of 1.21/.609 or ~2, the ratio we were looking for, so it was the repeated if predicate slowing the runtime down.
Ex 1.24 is impossible to test on modern machines; CPUs have done so much magic optimizing constant ops since the 1980s that no matter how large the input you won't see the O(log(n)) process. In the book they claim a random sample of just 2 choices gives better than a 3 out of 4 chance that n is prime, so \(1 - (1/2)^2\) is 0.75, and Assignment 0 (Diffie-Hellman) in Sussman's MIT course uses the same idea, where he recommends a minimum of 20 random values be tested, so \(1 - (1/2)^{20}\) or 0.9999.., but even with 50 tests the runtime is still constant on my machine, or O(1).
Ex 1.25 The 'A lisp hacker' solution: the call to fast-expt will keep on squaring gigantic numbers until it returns, whereas expmod calls remainder with m on the delayed computations every loop, so when it gets to the delayed squaring ops the numbers are small. If you hand evaluate expmod for \(3^7\) modulo 7, I think the biggest square in all the returns was 36.
Ex 1.26 A 'loose reasoner': the hint is the name Eva Lu Ator. Scheme uses applicative-order evaluation, so in (square (params)) the params are evaluated to a value only once before square is applied, whereas (* (params) (params)) evaluates (params) twice before multiplication is applied to the values, so there is a successive doubling of work every time that cond branch is used.
The last exercise is explained here, where 'nontrivial' means something that isn't 1 or n - 1, since \(1^2\) has the trivial square root of 1, but any odd number squared mod 8 is 1, for example \(3^2\) mod 8 or 9 mod 8 is 1, so 3 is a nontrivial square root of 1 mod 8. The exercise is actually trivial even if the square root is nontrivial: all you have to do is drop a check into the squaring branch of expmod, "a is not 1 or n - 1 and \(a^2\) mod n = 1", to reveal a nontrivial square root, and if found return 0, wiping out all the delayed multiplications and remainders of expmod, so when the result gets back to the test it compares against 0.
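A sketch of that squaring-branch check (check-square is my own name; 1 and m - 1 are the trivial roots that pass through):

(define (check-square a m)
  (let ((s (remainder (* a a) m)))
    (if (and (not (= a 1))
             (not (= a (- m 1)))
             (= s 1))
        0   ; nontrivial square root of 1 mod m found, wipe the result
        s)))

(define (expmod base exp m)
  (cond ((= exp 0) 1)
        ((even? exp)
         (check-square (expmod base (/ exp 2) m) m))
        (else
         (remainder (* base (expmod base (- exp 1) m)) m))))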
If you enjoy the strange world of number theory we'll be doing more of it when we cover Sussman's new book, and Richard Borcherds, a Fields medalist, has an introductory playlist on his YouTube channel if interested.
SICP 1.3.1
I go through basic calculus here if interested, but we don't need to know it to finish this chapter, only how to translate notation into a procedure.
Ex 1.29
Simpson's rule is explained here (click on show answer).
We're asked to define a procedure that takes arguments f, a, b, n, and we're given a formula for h that reuses the inputs to the procedure, so we may as well place it inside where it has access to the outer parameter bindings:
(define (srule f a b n)
  (define (h) (/ (- b a) n))
  ...)
We're given the inputs to f, where every even and odd k is scaled except for the first and last term:
(define (input k) (* (scale k) (f (+ a (* k (h))))))

(define (scale k)
  (cond ((or (= k 0) (= k n)) 1) ; endpoints first, since 0 is also even
        ((even? k) 2)
        (else 4)))
The book chapter already gave us sum and an incrementer for next; all that is left to do is write out that formula at the bottom of the srule procedure exactly how it appears in the book:
  (* (/ (h) 3)
     (sum input 0 incr n)))
I rewrote this to take advantage of the parameters to sum and added a scalar to sum, since if we can control the incrementer why not have it skip evens/odds:
(define (sum term a next b scale)
  (if (> a b)
      0
      (+ (* scale (term a))
         (sum term (next a) next b scale))))

(define (srule f a b n)
  (define (parity x) (+ x 2))
  (define delta-x (/ (- b a) n)) ; variable definition syntax we haven't learned yet
  (define (y k) (f (+ a (* k delta-x))))
  (* (/ delta-x 3)
     (+ (sum y 1 parity (- n 1) 4)
        (sum y 2 parity (- n 1) 2)
        (y 0)
        (y n))))
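Trying it on the book's example, the integral of cube between 0 and 1:

(define (cube x) (* x x x))

(srule cube 0 1 100) ; => 1/4 exactly, Simpson's rule is exact for cubics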
Ex 1.30
… cont TODO
SDF 0
Now that we've finished Chapter 1 we can do the first pset from Sussman's 2024 course Adventures in Advanced Symbolic Programming and latest book Software Design for Flexibility (SDF). Let's try Assignment 0 Diffie-Hellman; its only prereq is SICP Chapter 1. TODO
.. cont