Programming musings

Tuesday, September 21, 2010

Google DevQuiz, the shiritori problem

So, I mentioned I'd have a look at the dev quiz problems and I'm going to have a quick look over one of them.

The problem in question is playing "Shiritori" against their server using a limited dictionary. First one to be unable to respond loses.
As I'm sure most non-Japanese readers are not familiar with the game, it is a simple word chain game. The next word you respond with must begin with the same letter the previous word ended with, so one might go:

Apple->elephant->tower->rat->...

No word may be used twice, and if you can't respond, you lose.
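The two rules above can be sketched as a single check. This is only a minimal illustration, not part of the quiz code: isLegalReply and its parameters are names made up here, and words are assumed to be non-empty lowercase strings.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical helper (not from the original code): returns true when `next`
// is a legal reply to `prev`, given the set of words already played.
bool isLegalReply(const std::string& prev, const std::string& next,
                  const std::set<std::string>& used) {
    assert(!prev.empty());                  // assumption: prev is a real word
    if (next.empty()) return false;         // nothing was played
    if (used.count(next) > 0) return false; // a word may not be used twice
    return next.front() == prev.back();     // must start with prev's last letter
}
```

With "apple" already played, "elephant" would be accepted as a reply and "tower" rejected.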

The game is usually limited to a subject such as sports or famous people or whatever... In the case of Google, the limiting set is a small dictionary that you and the server share. Here is the set from the easy (trivial) exercise.

psouvqk  <- start word
khbpmr
kozljf
krgauzfzlgm
kkfria
rdqbrwyvbf
rzsllm
fcdpksoc
feeesamqti
fmxfwtqa
fdqyteakd
cudyi
cfaxwprqywu
czeltzvu
iapjqat
isttjds
isexgjyllo
tejubs
tkjkxecsw
sfdezjjxj
shdlkdsjpb
slpbtiox
jvhgdyvb
jlumkmgl
jcsfqql
jsvknhxwbjh
bpddtv
bhtkvcuph
vzlocep
mdnjqve
elfduspou
utvptuuukqz
zzlrtwetw
wsoguy
wwgcgqwxo
yyfofqglzvj
ybiyodwxysl
llottukq
qsgmdelg
gucmqmkiarn
anmxhykbad
djyicuzu
ooxkbyzdxh
hiodhnmqox
xdisedp
pauopwwan

This one could actually be solved with a piece of paper as the branching is very limited. Most words are directly connected to the next one. Of course, code written to solve it can be reused for the harder questions.

The next one is not nice if you try to trace the possible paths on paper.

cjunqesksk
kabeoppnt
gihcboqk
cjorrhkg
ntzptexc
dgmizn
mwsgd
trpiqowsm
kkqarpan
gkyzgbfd
cvhrlkmwpm
njprgqt
dxeheowsk
msmtvrg
tdrygvhxc
kbscviwmdpd
gzqwm
cddemqxfvgt
nrfavlpwock
djtswyfksg
mpescdobc
tocztzehxn

Fewer words! But much more branching... There was actually a single "end-game" that, once successfully entered, you could perhaps work out on paper, but the whole thing was begging for a read-ahead program.

I'm going to spare the detail, but by implementing a dictionary to store the remaining words, and then running a recursive algorithm to check whether each path is "alive", one can simply pick the live paths all the way until the opponent loses. With this word set it is still doable without any pattern analysis or serious optimisation. I still used C++ as I imagined the next stage could be more demanding and require expanding on my current work(it did).

Anyway, here are the core classes pulled from the header files:

class Game {
private:
 Dictionary legalWords;  /* list of legal words that haven't been used */
 vector<string> usedWords; /* list of words that have been used */

 int turn;     /* turn counter used to determine player or ai turn */

 void useWord(string word);

 void undoLast();
 int isComputerTurn();
 int isChoiceAlive(string choice, int steps);
public:
 Game();
 void startGame();
 void displayTurn();
 int selectNext(vector<string> choiceList);
};

class Dictionary {
private:
 list<string> words[ALPH_LIST_NO];
 int names;

 string firstWord;
public:
 /*
  Construct the dictionary from a file
 */
 Dictionary(char * filename);

 Dictionary() {
  names = 0;
 }

 /*
  Loads all names from a file, returning the number read,
  or 0 for a failure
 */
 int loadFromFile(char * filename);

 /*
  Display the contents of the dictionary
 */
 void displayContents();

 int addWord(string word);
 int removeWord(string word);

 /*
  Return the first word for empty strings
 */
 vector<string> getPossibleNext(string cur);
};
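For a rough idea of how such a dictionary can answer getPossibleNext quickly, here is a minimal sketch that buckets words by their first letter. It is a guess at the approach suggested by the words[ALPH_LIST_NO] member, not the original implementation: BucketDictionary, kBuckets and bucketOf are hypothetical names, and words are assumed to be lowercase a-z.

```cpp
#include <cassert>
#include <list>
#include <string>
#include <vector>

// Assumption: only lowercase a-z appear, so 26 buckets are enough.
const int kBuckets = 26;

class BucketDictionary {
    std::list<std::string> words[kBuckets]; // words grouped by first letter
    std::string firstWord;                  // first word added = start word

    static int bucketOf(char c) {
        assert(c >= 'a' && c <= 'z');
        return c - 'a';
    }

public:
    void addWord(const std::string& w) {
        assert(!w.empty());
        if (firstWord.empty()) firstWord = w;
        words[bucketOf(w.front())].push_back(w);
    }

    // Returns true if the word was present and has been removed.
    bool removeWord(const std::string& w) {
        if (w.empty()) return false;
        std::list<std::string>& bucket = words[bucketOf(w.front())];
        for (std::list<std::string>::iterator it = bucket.begin();
             it != bucket.end(); ++it) {
            if (*it == w) {
                bucket.erase(it);
                return true;
            }
        }
        return false;
    }

    // All remaining words that begin with the last letter of `cur`.
    std::vector<std::string> getPossibleNext(const std::string& cur) const {
        if (cur.empty()) {
            // Mirrors the header comment: an empty string yields the start word.
            return firstWord.empty() ? std::vector<std::string>()
                                     : std::vector<std::string>(1, firstWord);
        }
        const std::list<std::string>& bucket = words[bucketOf(cur.back())];
        return std::vector<std::string>(bucket.begin(), bucket.end());
    }
};
```

Looking up the replies to a word then only touches the one bucket for its last letter, rather than scanning the whole word list.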

The important part of the code is really the bit about checking whether a choice is alive. The rest is just trivial implementation.

/**
 Recursively check whether a specific choice can be used safely.
 Any case where there is a set of choices by the ai that can lead to
 game over means the choice is dead.
 The function simulates actual play, trying each path.
 For player turns any subpath being alive means the path is alive.
 For ai turns any subpath being dead means the path is dead.
*/
int Game::isChoiceAlive(string choice, int steps) {
 if(steps <= 0) {
  return 1;
 }

 useWord(choice);
 steps--;


 vector<string> nextWords = legalWords.getPossibleNext(choice);
  
 int result = 1;
 if(isComputerTurn()) {
  if(nextWords.size() == 0) {
//   DEBUG("ran out of words at: " << choice);

   result = 1;
  } else if(choice[0] == 'x' || choice[0] == 'k' || choice[0] == 'b') {
   // h leads to a dead loop
   
   result = 1;
  } else {

   vector<string>::iterator it;
   // check all words. on the computers turn if any choice is dead
   // we assume the computer will pick it and thus lead to a dead end
   for(it = nextWords.begin(); it != nextWords.end(); it++) {
    if(!isChoiceAlive(*it, steps)) {
     result = 0;
          
     break;
    }
   }
  }
 } else {
  if(nextWords.size() == 0) {
   // no player choices so game over
//   DEBUG("ran out of words at: " << choice << "\n");
   result = 0;
   DEBUG("Player ran out of words\n");
  } else {

   vector<string>::iterator it;
   // default to failure
   result = 0;

   for(it = nextWords.begin(); it != nextWords.end(); it++) {
    if(isChoiceAlive(*it, steps)) {
     result = 1;
     break;
    }
   }
  }
 }

 undoLast();
 return result;
}

One will note that this works by recursively calling itself until a path is determined to be either alive (1) or dead (0). This depends on whose turn it is. For the player's choice, so long as one sub-path is alive, the whole path is alive. On the server's turn, on the other hand, one path that results in a dead player makes the whole path dead (since we assume the server plays a perfect game, which it probably does). To restore the state of the dictionary, any words taken out of it (with useWord) are returned to it using undoLast().

This results in an elegant solution to the problem. It could certainly use some optimizations, but it returns a response to each server move very quickly, so they are not necessary. I would be curious to benchmark the Visual Studio STL implementation for this purpose.
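The alive/dead recursion can also be written as one function that always answers from the point of view of whoever has to move next. The sketch below is that reformulation, written for illustration rather than taken from the code above: moverWins is a made-up name, the word list is a plain vector of non-empty words, and there is no depth limit, so it is only practical for small sets.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns true if the side that must now reply to a word ending in
// `lastLetter` can force a win. `used[i]` marks words already played.
bool moverWins(const std::vector<std::string>& words,
               std::vector<bool>& used, char lastLetter) {
    assert(words.size() == used.size());
    for (std::size_t i = 0; i < words.size(); ++i) {
        if (used[i] || words[i].front() != lastLetter) continue;
        used[i] = true;                       // play the word
        bool opponentWins = moverWins(words, used, words[i].back());
        used[i] = false;                      // undo, in the spirit of undoLast()
        if (!opponentWins) return true;       // one winning reply is enough
    }
    // Either no legal reply exists, or every reply lets the opponent win.
    return false;
}
```

A word is then "alive" for the player exactly when moverWins returns false for the server after that word has been marked as used.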

The final question significantly extends the number of words:

zjfurojiwt
tpenjyluge
txxkygwaqgv
ttloiso
exsosv
edmpo
ejvzpaf
vgaxvtbzo
vzgcsrihdof
vfqkkfefsz
orxbf
ocsbkecrutz
oiuvsvdzny
fmualz
fttay
fjlin
zxqry
zwlhuaavn
zbfbu
yzkhgqjwn
ylapbgu
yztpihmont
nmmeycwxu
nvchbixbft
nhube
uvokit
ulrsdme
ullkbwv
tqgbupfbx
xbxaj
xqkfbyg
xlsjah
euxgwijb
bygjlosw
buoqdwoeda
bltxp
vtvbiphxlk
kbucm
kvpbutfxnci
kpiidygnc
oqcfhx
fcnylfqb
zhptjfbk
ywhozx
nuceayubb
uornsk
jwkcw
jtebgzhijgm
jiyeziidgeg
jhdgofa
jkqvxudtkpi
jvdfqjh
jpsbykp
jymncyc
jeuawoyqfeq
jxwdyctpbr
jtkius
wyehnlfcewj
wvwnxm
wkwpoxug
wauwusmlca
wbvheei
wrvuppgxeh
wqpdckip
wuearboc
wpixxq
wgysr
waegpptsls
myrjybpaj
miksbvyrpw
mgbidg
mrzowglfa
mncqxmi
mqkxjtcnrhh
mymuttlap
muzynhc
mmgkyupq
mhvcustr
mgyxalfjoks
grfnkukj
grvpecw
glafmam
gkktjoquda
gxcumli
gnojpqlfwh
gcykxbcp
gfukbmc
grrnxq
gmadnr
gkeeyspumss
awwgeyfvj
axcvtmpsw
arcjpzm
azpqqvg
aakhi
acxurh
agiaariapp
ajzlurevxyc
akvsq
acnzqsglor
alvilkvzvs
ipsyrchhmwj
iqpgfhzzw
icypxogrm
ibcytxveblg
iatrdra
izgujgoph
iknbjgp
icewc
inbfvdmq
intkzlyeer
ibqnfshys
hkpuj
hspyw
hvkom
hsrewtg
hxxfa
huozi
htihddyp
hkumyc
hfaadvq
hkzmvdcxr
hvtfs
pesyj
pzqdepzw
ptybxgm
pahfbcdtg
ppnhzyspa
phfqjdxybi
psmkslfah
pagszdvdc
petdq
ptgevrrebr
pcdbqtls
cfzfmjhj
czczzhaxw
cgbpozgm
cwctwg
cyfamuchyaa
cxxqydweki
cwnzkiah
ccactp
cbylbq
chogthyr
cswakbs
qpslvrrgnj
qltanybow
qtqpmximzwm
qhkbwlig
qgmyza
qkvozmngii
qjdqvh
qfdntswgjfp
qujuqkczgrc
qjbrsxwcjhr
qkajthhs
rqgvj
rmbcwvenyw
rwiuwlulm
rplfg
rikfpsra
rrjfki
rkinlvh
raditmwbp
rabaxvlzoc
ryzzijueq
rxtwzcs
svosgyukelj
suvjdw
szfizfvlem
scwyaooag
syrfklha
scnyeai
slwhbymh
shsguslpxpp
sskxiec
srzgnsoooq
svmgiir
qasspaquyt
qzptilhpgyo
qauxwqly
rvwse
rjrnhyvdf
rjuvgruzpn
sptgdchfahv
sssuuafpiz
skhsmu

Trying to simply run through the massive number of choices is going to take a long, long time. I initially tried to compromise, figuring that the server might only be looking a number of steps ahead, so I optimised my own code and got it to look around 8-10 moves ahead (I can't remember exactly at this time). This wasn't sufficient, but I noticed that losing patterns would lead into a back-and-forth where the opponent keeps sending the same letter back to you.
On examination you can find that, for certain letters, there is one more word ending with the letter than beginning with it. Thus if one can force the computer to select one of these words, a path can be considered "alive". This cuts out a huge amount of branching, and is in fact enough to allow a full simulation, and the final 6 points for the question. The edit to isChoiceAlive is quite simple; note the "if(choice[0] == 'k' || ... )" code.

int Game::isChoiceAlive(string choice, int steps) {
 if(steps <= 0) {
  return 1;
 }

 useWord(choice);
 steps--;


 vector<string> nextWords = legalWords.getPossibleNext(choice);
  
 int result = 1;
 if(isComputerTurn()) {
  if(nextWords.size() == 0) {
//   DEBUG("ran out of words at: " << choice);

   result = 1;
  } else if(choice[0] == 'x' || choice[0] == 'k' || choice[0] == 'b') {
   // h leads to a dead loop
   
   result = 1;
  } else {

   vector<string>::iterator it;
   // check all words. on the computers turn if any choice is dead
   // we assume the computer will pick it and thus lead to a dead end
   for(it = nextWords.begin(); it != nextWords.end(); it++) {
    if(!isChoiceAlive(*it, steps)) {
     result = 0;
          
     break;
    }
   }
  }
 } else {
  if(nextWords.size() == 0) {
   // no player choices so game over
//   DEBUG("ran out of words at: " << choice << "\n");
   result = 0;
   DEBUG("Player ran out of words\n");
  } else if(choice[0] == 'x' || choice[0] == 'k' || choice[0] == 'b') {
   // h leads to a dead loop
   
   result = 0;
  } else {

   vector<string>::iterator it;
   // default to failure
   result = 0;

   for(it = nextWords.begin(); it != nextWords.end(); it++) {
    if(isChoiceAlive(*it, steps)) {
     result = 1;
     break;
    }
   }
  }
 }

 undoLast();
 return result;
}
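The letter imbalance described above can be found mechanically rather than by eye. The following is a minimal sketch of that count, assuming lowercase a-z words; letterSurplus is a hypothetical helper and was not part of the submitted solution.

```cpp
#include <array>
#include <cassert>
#include <string>
#include <vector>

// For each letter: (number of words ending with it) minus
// (number of words beginning with it). A positive entry marks a letter
// that can be handed over more often than it can be answered.
std::array<int, 26> letterSurplus(const std::vector<std::string>& words) {
    std::array<int, 26> surplus{};           // value-initialised to zeros
    for (const std::string& w : words) {
        assert(!w.empty());
        assert(w.front() >= 'a' && w.front() <= 'z');
        assert(w.back() >= 'a' && w.back() <= 'z');
        surplus[w.back() - 'a'] += 1;
        surplus[w.front() - 'a'] -= 1;
    }
    return surplus;
}
```

Running it over the remaining words and looking for positive entries gives candidate letters for a shortcut like the one hard-coded in isChoiceAlive.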

Sunday, September 5, 2010

Concrete mathematics

Got the results in from the DevQuiz. All appears to be correct and I've got a place, woo! I'm in two minds about the difficulty of the questions. On one hand, they were challenging enough to make one think, but I don't think that they were overall "hard" even for the final questions, as I could still get away with fairly generic programs. On the other hand it did take a fair amount of time to do all the questions... I hope developer day is awesome and worth the time spent on the questions.

That aside, I picked up Concrete Mathematics a couple of days ago, along with The Art of Computer Programming. I don't imagine I'll have the time to read the latter for a while, but I've started poking through Concrete Mathematics (yeah, I'll get back to sicp eventually. Too many books, too little time). I really like how they've added in the student "graffiti". Some of it is very helpful in grasping a concept and other parts keep the text interesting even if they aren't directly useful. It's been a while since I've had to do some serious math (I guess a few of the questions in part 1 of sicp count), so it's taking my brain for a ride, but it's interesting stuff, and I still love to see a formula come together in an elegant manner. I've not gotten far in yet, but definitely recommended for those interested.

Thursday, August 26, 2010

Recent activity: Google DevQuiz and why you probably shouldn't use Visual C++

Google Developer Day Tokyo is on at the end of September.

After previous years where signup was first come first served, this year they're filtering people by using a "developer quiz". Seems a reasonable enough move to make. I've noticed a few people who are interested in going just for the prospect of a chance at a "free phone", since there is some speculation they'll be giving away an Android handset again this year (if I'm not mistaken they already gave them out at this year's I/O).

Despite starting early, and getting the first half done well before the deadline, I got distracted by other things for a while and ended up doing the remaining half in the last day. Got all my solutions in, and at least according to the automated grader I cleared all the problems successfully. I can only hope that I don't lose points for bad code or something(because my "test of concept" prototype ended up getting hacked into my final solution...). I'll probably throw up some of the code here later with light commentary.

Unfortunately I lost a fair bit of time to poor problem comprehension as the problems were stated only in Japanese. I'm quite comfortable with it, but considering there are a reasonable number of English speakers participating, it would have been nice to have an English version of the questions.

Since I was doing my work on a PC I don't normally use for development, I thought I'd give Visual C++ a go for old times' sake (I was doing a lot of DirectX/Win32 programming at university)... See how it was now.

Unfortunately I was rather disappointed. While IDEs like Eclipse have come on in leaps and bounds, Visual C++ (or at least the Express edition) is slow and clumsy. Often for apparently no good reason it would just slow down, and syntax checking and highlighting is terribly buggy. While it is the Express edition with a cut-down feature set, I don't think that "adding bugs" is the same as feature cuts. I used to have a lot of appreciation for Visual Studio as one of the better Microsoft products, but right now it does little to please and much to annoy.

On a final note, I recently picked up "Lions' Commentary on UNIX 6th Edition". I've only just gotten started on it, but it's a real trip to the past. The old school (pre-ANSI) C is a real shocker, and having to learn PDP-11 assembly is a bit of an irritation, but what I've read so far has been interesting and educational.

Thursday, August 12, 2010

The Practice of Programming

Yep. I'm far from done with SICP, but the other books I feel compelled to order are also very interesting.

Recently I read a quote somewhere about programming style that was taken out of Kernighan and Pike's "The Practice of Programming". It was neat enough that I went to check it out on Amazon, and as one would, ended up buying it.

Having read through the first few chapters so far, it's been a good refresher on some basic algorithms, and some well thought out strategies. The CSV formatting example in chapter 4 is far from amazing, but it takes a good measured approach to a common problem that has far too many annoying little edge cases. More than that however, the Markov Chain work in chapter 3 was great. It was interesting to see the 150 line C program taken down to 20 lines in awk. Maybe one of these days I'll take a look at that (after Haskell, Clojure, Ada and all the other languages with nifty features I've been wanting to look at). What was surprising though was how poor some of the STL implementations appeared to be.

A very compact book well worth the read for anyone serious about programming. If I have any criticism of it, it would likely lie in some of the rather cryptic examples that could've been a lot more clear with proper variable naming. They are always explained afterwards, and very clever, but many of them don't feel entirely necessary. Despite that minor niggle, definitely recommended and I look forward to the other half.

Tuesday, July 27, 2010

Huffman encoding tree. SICP 2.68 - 2.72

I can't help but notice that the further I get into sicp, the more of the blogs chronicling other people's progress on it seem to just drop off, which is unfortunate. Honestly, if I was to post detailed solutions to everything, I doubt I would keep at it myself either.

Anyway, the huffman encoding trees stuff was pretty cool, so I'll post my solutions here.

First, some general implementation details for the trees(as always looking in the text for the explanations):

This is all straight out of sicp.

(define (make-leaf symbol weight)
  (list 'leaf symbol weight))
        
(define (leaf? object)
  (eq? (car object) 'leaf))

(define (symbol-leaf x) (cadr x))

(define (weight-leaf x) (caddr x))

(define (make-code-tree left right)
  (list left
        right
        (append (symbols left) (symbols right))
        (+ (weight left) (weight right))))

(define (left-branch tree)
  (car tree))

(define (right-branch tree)
  (cadr tree))

(define (symbols tree)
  (if (leaf? tree)
      (list (symbol-leaf tree))
      (caddr tree)))

(define (weight tree)
  (if (leaf? tree)
      (weight-leaf tree)
      (cadddr tree)))

(define (decode bits tree)
  (define (decode-1 bits current-branch)
    (if (null? bits)
        '()
        (let ((next-branch
               (choose-branch (car bits) current-branch)))
          (if (leaf? next-branch)
              (cons (symbol-leaf next-branch)
                    (decode-1 (cdr bits) tree))
              (decode-1 (cdr bits) next-branch)))))          
  (decode-1 bits tree))

(define (choose-branch bit branch)
  (cond ((= bit 0) (left-branch branch))
        ((= bit 1) (right-branch branch))
        (else (error "bad bit -- CHOOSE-BRANCH" bit))))

Also, as a "setup" sicp supplies a couple of functions for working with sets, to which I add one (element-of-set?, which checks whether a symbol is present in a list of symbols):

(define (adjoin-set x set)
  (cond ((null? set) (list x))
        ((< (weight x) (weight (car set))) (cons x set))
        (else (cons (car set)
                    (adjoin-set x (cdr set))))))

; x is a symbol and set is a list of symbols, so compare them directly
(define (element-of-set? x set)
  (cond ((null? set) #f)
        ((eq? x (car set)) #t)
        (else (element-of-set? x (cdr set)))))

(define (make-leaf-set pairs)
  (if (null? pairs)
      '()
      (let ((pair (car pairs)))
        (adjoin-set (make-leaf (car pair)
                               (cadr pair))
                    (make-leaf-set (cdr pairs))))))

SICP 2.68 encoding a message:

Idea here is to encode one symbol at a time from the tree. SICP supplies us with encode:

(define (encode message tree)
  (if (null? message)
      '()
      (append (encode-symbol (car message) tree)
              (encode (cdr message) tree))))

And here's my implementation of encode-symbol:

(define (encode-symbol sym tree)
  (define (get-encoding branch)
    (cond ((leaf? branch) '())
          ((element-of-set? sym
                            (symbols (left-branch branch)))
           (cons 0 (get-encoding (left-branch branch))))
          ((element-of-set? sym
                            (symbols (right-branch branch)))
           (cons 1 (get-encoding (right-branch branch))))))
  (if (element-of-set? sym (symbols tree))
      (get-encoding tree)
      (error "Symbol not part of encoding set")))

I start out with a simple check that the symbol is part of the set. From there we check which branch the element is in and work down the tree, consing on the relevant bits until we hit a leaf, which returns the empty list and completes the encoded symbol.

SICP 2.69 generating a huffman tree:

Here's the real meat, and a surprisingly simple case once you get to it. First as part of the problem definition we're given generate-huffman-tree and told to implement successive-merge.

(define (generate-huffman-tree pairs)
  (successive-merge (make-leaf-set pairs)))

Here's my take on successive-merge:

(define (successive-merge pairs)
  (if (= (length pairs) 1)
      (car pairs)
      (successive-merge
       (adjoin-set (make-code-tree (cadr pairs)
                                   (car pairs))
                   (cddr pairs)))))

Note that as the pairs are sorted, we need only grab the two smallest (the first two), join them together into a new sub-tree, and then place it in the correct position in the list according to its weight. This is achieved with the modified adjoin-set detailed earlier.

SICP 2.70 a huffman song:

Not much to this question. Apply the routines to some sample data.

(define song-tree
  (generate-huffman-tree
   '((A 2) (BOOM 1) (GET 2) (JOB 2) (NA 16) (SHA 3) (YIP 9) (WAH 1))))

(length (encode '(get a job
                      sha na na na na na na na na
                      get a job 
                      sha na na na na na na na na
                      wah yip yip yip yip yip yip yip yip yip
                      sha boom) song-tree))
;Value: 84

As we can see it comes out to 84 bits. As a fixed-length code each word would be 3 bits, and with 36 words, we would get a grand total of 108 bits, which of course is longer.
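As a cross-check outside Scheme, the same totals can be reproduced from the symbol counts alone, since the length of a Huffman-encoded message equals the sum of the weights of every merged node. This is a minimal sketch with made-up helper names (huffmanBits, fixedWidth); it assumes the message uses each symbol exactly as often as its weight in the tree, which is the case for the song.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <vector>

// Total bits needed to Huffman-encode a message with the given symbol counts.
// Repeatedly merges the two lightest nodes; each merge adds one bit to every
// symbol underneath it, i.e. adds the merged weight to the total.
long huffmanBits(const std::vector<long>& counts) {
    std::priority_queue<long, std::vector<long>, std::greater<long> >
        heap(counts.begin(), counts.end());
    long total = 0;
    while (heap.size() > 1) {
        long a = heap.top(); heap.pop();
        long b = heap.top(); heap.pop();
        total += a + b;
        heap.push(a + b);
    }
    return total;
}

// Bits per symbol for a fixed-length code over `symbols` distinct symbols.
int fixedWidth(int symbols) {
    assert(symbols > 0);
    int bits = 0;
    while ((1 << bits) < symbols) ++bits;
    return bits;
}
```

For the counts 2, 1, 2, 2, 16, 3, 9, 1 this gives 84 bits, against 36 * fixedWidth(8) = 108 bits for the fixed-length code.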

SICP 2.71 and 2.72 some theory

It should be quite apparent that if the weights follow the pattern 1, 2, 4, 8, 16..., each sub-tree formed is merged one at a time with the next item (everything merged so far always weighs one less than the next item), giving a tree of n-1 levels. The most frequent symbol would be a single access away, while the least would be n-1 levels away (since the final level has 2 leaves, not 1). Each remaining symbol sits on its own level in between, one level deeper for each step down in frequency.
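The shape of that tree can be written down directly. This is a short worked derivation rather than anything from the exercise text, using the assumption that symbol k has weight 2^k for k = 0, ..., n-1, with l_k as the length of its code.

```latex
% The sub-tree built from the k lightest symbols is always lighter than
% the next leaf, so every merge absorbs exactly one new leaf:
\[
\sum_{i=0}^{k-1} 2^{i} = 2^{k} - 1 < 2^{k}
\]
% Resulting code lengths (the two lightest symbols share the deepest level):
\[
\ell_{k} =
\begin{cases}
n - 1 & k = 0 \\
n - k & 1 \le k \le n - 1
\end{cases}
\]
% Example, n = 5 (weights 1, 2, 4, 8, 16): code lengths 4, 4, 3, 2, 1.
```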

The interesting thing though is, that as we're dealing with a sorted list, to get the first(most frequent) item, it is right at the end of the list. We thus end up going through every entry of (symbols tree) to find it, but only once, giving a complexity of n. The final entry on the other hand is at the very head of the list, and thus immediately found each time. However it has to go through n levels of the tree, also giving an order of growth n, though we can assume there would be more overhead from cons and whatnot.

Sunday, July 18, 2010

In-fix derivation (sicp exercise 2.58 b)

In part 2.3.2 Symbolic Differentiation, we develop a system to compute derivatives of simple expressions in prefix form (such as + 2 5).

In 2.58 the task is to change this to in-fix form (going with the last example 2 + 5). In addition, it should be able to handle several different operations in a single list(for example 2 * x * x + 5 * x + 3), and correctly prioritize the operations so as to give a correct derivative, without modifying the original deriv procedure.

After going through a lot of dead-end solutions(many of which involved playing around with make-sum and make-product to read ahead and see what else is in the list but were cumbersome and lacking in "elegance") I finally figured out an elegant solution that makes sense.

Here's the initial code that the new data must be matched to:

(define (deriv exp var)
  (cond ((number? exp) 0)
        ((variable? exp)
         (if (same-variable? exp var) 1 0))
        ((sum? exp)
         (make-sum (deriv (addend exp) var)
                   (deriv (augend exp) var)))
        ((product? exp)
         (make-sum
          (make-product (multiplier exp)
                        (deriv (multiplicand exp) var))
          (make-product (deriv (multiplier exp) var)
                        (multiplicand exp))))
        (else
         (error "unknown expression type -- DERIV" exp))))

To achieve it, the "detector functions" (in this case sum? and product?) must be changed to figure if there is a corresponding operation anywhere in the top level.

(define (has-operation? x target)
  (cond ((and (pair? x) (pair? (cdr x)) (eq? (cadr x) target)) #t)
        ; keep looking for the same operator further along the list
        ((and (pair? x) (pair? (cdr x))) (has-operation? (cddr x) target))
        (else #f)))

(define (sum? x)
  (has-operation? x '+))

(define (product? x)
  (has-operation? x '*))

With this
x * 5 + x * 4

Would be detected as a sum (it would also be detected as a product, but it is deriv's job to make sure the operations are handled in the right order).

We want it to be correctly handled as
(x * 5) + (x * 4)

To this end, we redefine addend and augend as:

addend => everything before the +
augend => everything after the +

(define (addend s)
  (before-op s '+))
       
(define (augend s)
  (after-op s '+))

(define (before-op s target)
  (define (list-before s target)
    (if (or (null? s) (eq? (car s) target))
        '()
        (cons (car s) (list-before (cdr s) target))))
  (clean (list-before s target)))

(define (after-op s target)
  (define (list-after s target)
    (cond ((null? (cdddr s)) (caddr s))
          ((eq? target (cadr s)) (cddr s))
          (else (list-after (cddr s) target))))
  (clean (list-after s target)))
    
(define (clean target)
  (cond ((not (pair? target)) target)
        ((null? (cdr target)) (car target))
        (else target)))

In the same way we handle multiplier and multiplicand (reusing the overlapping code of course):

(define (multiplier p) (before-op p '*))

(define (multiplicand p) 
  (after-op p '*))

The make-sum and make-product procedures are barely changed. The order of the output list is about all that changes:

(define (make-sum a1 a2)
  (cond ((and (number? a1) (number? a2)) (+ a1 a2))
        ((=number? a1 0) a2)
        ((=number? a2 0) a1)
        (else (list a1 '+ a2))))

(define (make-product m1 m2) 
  (cond ((or (=number? m1 0) (=number? m2 0)) 0)
        ((=number? m1 1) m2)
        ((=number? m2 1) m1)
        ((and (number? m1) (number? m2)) (* m1 m2))
        (else (list m1 '* m2))))

The other stuff provided by the problem needed for the program to work:

(define (variable? x) (symbol? x))

(define (=number? exp num)
  (and (number? exp) (= exp num)))

(define (same-variable? v1 v2)
  (and (variable? v1) (variable? v2) (eq? v1 v2)))

As a test of all this, here are some derivatives:

; x^2 + 2x + 5
(deriv '(x * x + (x + x + 5)) 'x)
;Value 120: ((x + x) + 2)
; 2x + 2

;x^3 + 5x^2 - 3x + 8
(deriv '(x * x * x + x * x * 5 + x * -3 + 8) 'x)
;Value 121: (((x * (x + x)) + (x * x)) + (((x * 5) + (x * 5)) + -3))
;3x^2 + 10x - 3

As to further improvements... Well there is still a lot of work that could be done on simplifying the outputs.

Overall though I am very happy with this solution and would be comfortable adding more operations to deriv(simply add the most "top-level" ops at the top, and they will then properly calculate their subs before operating on them)

Tuesday, July 13, 2010

Regarding the difficulty of sicp

Sicp (Structure and Interpretation of Computer Programs), if anything, has made me aware of my weaknesses and strengths.

Getting my head around lambda notation has been one challenge. It is gradually coming more naturally, but after programming for years mainly in C++, getting used to functions as first-class values has put my brain in a twist. Several times.

Similarly, I have been really poor about data hiding for a long time. Nowadays I am certainly better about it, but I still have difficulty considering ideas such as flatmap as a black box. I try to think about what is going on inside it rather than "what does flatmap do?", making a relatively simple problem(the queens problem from my last post) a lot more mentally challenging.

On the other hand, working with the picture language described in part 2.2.4 came very naturally. I can only presume this was from working in computer graphics which I studied as part of my computer games development course at university.

Of course the natural thing to do would be to describe all the transformations in matrices...

That being said, despite being an "introductory" textbook, sicp really challenges a lot of preconceived notions and habits, and opens up new ways of thinking. It is certainly challenging, but it is definitely worth taking the time to read(at least so far).

I can't wait to get to the interpreter implementation. If only I could balance more time into it from all the other reading I'm doing now(Programming Pearls just arrived but it may be a while till I get around to that).