Tuesday, August 16, 2005

Notes on Brawler

If you are wondering what Brawler is, it's a crawler that I've been writing recently to test out Common Lisp, and I am having awesome fun doing so. It has also made me admit something that I've never openly accepted in all these years: I am addicted to coding, and to computers in general. This should be evident from the fact that my GRE is days away and yet I know zilch of the massive wordlists that I need to mug up :(

One of the things that any crawler needs is a nice URL handling library, and strangely, even after so much talk about how useful Lisp is for web programming, there seem to be no good ones yet (at least none that are open source with a BSD/MIT/LGPL style licence). One thing I like about Python is its batteries-included approach. I wish Lisp would grow beyond Steele's CLtL2 and develop into more complete and mature distributions. Anyhow, returning to the topic of non-existent URL handling libraries, I've decided to write my own, oriented specifically to the HTTP protocol. Doing something more general would be overkill right now. However, I am considering an OO style architecture so that I can later extend it to, say, an FTP version. I am also planning to model this URL library along the lines of the one in the JDK, with something like an 'open-url-stream' that defaults to an input stream readable with the standard Lisp I/O functions.
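To make the idea a bit more concrete, here is a rough sketch of how such a library might start out in CLOS. None of this exists yet: the class names, the slot names and the function parse-http-url are all hypothetical, and the parser only understands the plain http://host[:port][/path] shape (whatever follows the first slash, query string included, is kept as the path).

```lisp
;; Sketch only: URL, HTTP-URL and PARSE-HTTP-URL are made-up names for
;; illustration, not part of any existing library.

(defclass url ()
  ((scheme :initarg :scheme :reader url-scheme)
   (host   :initarg :host   :reader url-host)
   (port   :initarg :port   :reader url-port)
   (path   :initarg :path   :reader url-path)))

;; Protocol-specific subclass; an FTP-URL could be added the same way later.
(defclass http-url (url) ())

;; Each protocol would supply its own method; none is defined in this sketch.
(defgeneric open-url-stream (url)
  (:documentation "Return an input stream for reading the resource at URL."))

(defun parse-http-url (string)
  "Parse a string such as \"http://host:8080/path\" into an HTTP-URL.
No userinfo, query or fragment handling."
  (let* ((prefix "http://")
         (plen (length prefix)))
    (unless (and (>= (length string) plen)
                 (string-equal prefix string :end2 plen))
      (error "Not an http URL: ~a" string))
    (let* ((slash (position #\/ string :start plen)) ; start of the path, if any
           (authority (subseq string plen slash))    ; "host" or "host:port"
           (colon (position #\: authority)))
      (make-instance 'http-url
                     :scheme "http"
                     :host (subseq authority 0 colon)
                     :port (if colon
                               (parse-integer authority :start (1+ colon))
                               80)                   ; default HTTP port
                     :path (if slash (subseq string slash) "/")))))
```

With this, (url-host (parse-http-url "http://example.com/index.html")) gives back "example.com", and an open-url-stream method specialised on http-url would be the place to open the socket.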

So to get started on that, I needed to experiment with HTTP. I first tried telnet, but got bored with switching windows and ended up writing this small piece of Lisp code.

;; Function http-interact: allows interaction with a remote HTTP server.
;; Uses CLOCC net.lisp for the socket interface, from the PORT collection.

(defun http-interact (website)
  (do ((line (read-line t nil) (read-line t nil))
       (mysock (socket-connect 80 website) (socket-connect 80 website)))
      ((string= line "bye") (close mysock))

    ;; Read user input and send it to the server (request).
    (format mysock "~a" line)
    (terpri mysock)
    (force-output mysock)

    ;; Read whatever the server throws back at us (response), until EOF.
    (do ((response (read-line mysock nil) (read-line mysock nil)))
        ((null response) (values))
      (format t "~a~%" response))

    ;; The response has been read to EOF, so release this socket before
    ;; the step form opens a fresh connection for the next request.
    (close mysock)))

That code is about 10 lines, was written in a matter of 2 minutes, and was debugged at the toplevel in about 5 minutes. I can't even imagine doing the same thing in C anymore: it takes 40+ lines of code, and something as trivial as this would definitely have taken more than 5 minutes. (Java is about as good as Lisp in the kind of code economy I am looking at, but doesn't do better than Lisp unless you really cut corners, sometimes even newline characters. But what the heck, it already has the nice URL facility that I was looking for in the first place.)
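
One practical note for anyone playing with this kind of thing: an HTTP/1.0 server expects a request line, any headers, and then an empty line, each terminated by CR LF, before it answers. A minimal sketch of a helper that writes such a request to any output stream could look like the following. The name write-http-request is hypothetical, and it relies on the #\Return and #\Linefeed character names, which are semi-standard but available in the implementations I know of.

```lisp
;; Sketch only: WRITE-HTTP-REQUEST is a made-up helper, not part of any library.
(defun write-http-request (stream path host)
  "Write a minimal HTTP/1.0 GET request for PATH on HOST to STREAM."
  (let ((crlf (coerce (list #\Return #\Linefeed) 'string)))
    (format stream "GET ~a HTTP/1.0~a" path crlf) ; request line
    (format stream "Host: ~a~a" host crlf)        ; a single header
    (write-string crlf stream)                    ; empty line ends the headers
    (force-output stream)))
```

Calling it on a socket stream, or on a string stream via with-output-to-string to inspect the bytes, is all it takes to try it out.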

Well, that's it, I am hooked on Lisp. (Yeah, it will probably never replace C for most of the things I do, but it might after all replace Ruby for most of my throwaway coding.)

Signing Off,
Vishnu Vyas
