Parsing is a trial-and-error process, yet we do not want to fetch the HTML from the web every time the parser changes. I wrote a small library that intelligently caches web pages so that I can work on the parser without worrying about making too many requests over the web.
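To illustrate the idea, here is a minimal sketch of a disk-backed page cache. It is not the library itself: the names `cached_fetch` and `download`, the `page_cache` directory, and the choice to key cache files by a SHA-256 hash of the URL are all assumptions made for this example. It also leaves out anything a smarter cache would need, such as expiry or revalidation.

```python
import hashlib
import urllib.request
from pathlib import Path


def download(url):
    """Fetch the page over the network and return its body as text."""
    with urllib.request.urlopen(url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def cached_fetch(url, cache_dir="page_cache", fetch=download):
    """Return the HTML for `url`, using the network only on a cache miss.

    `fetch` is a hypothetical hook: any callable that takes a URL and
    returns the page text, which makes it easy to swap in a stub offline.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)

    # Hash the URL so it maps to a safe, fixed-length file name.
    key = hashlib.sha256(url.encode("utf-8")).hexdigest()
    cache_file = cache_dir / f"{key}.html"

    # Cache hit: reuse the stored copy and make no request.
    if cache_file.exists():
        return cache_file.read_text(encoding="utf-8")

    # Cache miss: fetch once, store the result, then return it.
    html = fetch(url)
    cache_file.write_text(html, encoding="utf-8")
    return html
```

With something like this in place, the parser can be rerun as often as needed: only the first call for a given URL makes a request, and every later call reads the saved file. Deleting a file from the cache directory forces a fresh download of that page.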