Saurabh Bikram

Efficient web page parsing

Wed Aug 19 2020 Data Engineering 3 mins

Parsing is a trial-and-error process, yet we do not want to fetch the HTML from the web every time the parser changes. I wrote a small library that caches web pages intelligently, so that I can iterate on the parser without worrying about making too many requests over the web.
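To make the idea concrete, here is a minimal sketch of disk-backed page caching. It is not the library's actual API: the names `cached_get` and `fetch_over_network`, the SHA-256 file naming, and the one-file-per-URL layout are all assumptions chosen for illustration. The first request for a URL goes over the network and is saved to disk, and every later request for the same URL is served from the saved file.

```python
import hashlib
import urllib.request
from pathlib import Path


def fetch_over_network(url: str) -> str:
    """Download a page and decode it to text (hypothetical helper)."""
    with urllib.request.urlopen(url) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def cached_get(url: str, cache_dir, fetch=fetch_over_network) -> str:
    """Return the HTML for `url`, hitting the network only on a cache miss.

    `fetch` is injectable so the cache can be exercised without real requests.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)

    # Hash the URL to get a filesystem-safe, fixed-length file name.
    key = hashlib.sha256(url.encode("utf-8")).hexdigest()
    path = cache_dir / f"{key}.html"

    if path.exists():
        # Cache hit: no request is made.
        return path.read_text(encoding="utf-8")

    # Cache miss: fetch once, then store the page for later parser runs.
    html = fetch(url)
    path.write_text(html, encoding="utf-8")
    return html
```

With something like this in place, the parser can be rerun as often as needed against `cached_get(url, "page_cache")`, and only the first run for each URL costs a request. This sketch never expires entries, so deleting the cache directory is the way to force a fresh download.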