2016-07-24: Improve research code with static type checking
The Pain of Late Bug Detection [The web] is big. Really big. You just won't believe how vastly, hugely, mindbogglingly big it is... [1] When it comes to quick implementation, Python is an efficient language used by many web archiving projects. Indeed, a quick search of github for WARC and Python yields a list of 80 projects and forks . Python is also the language used for my research into the temporal coherence of existing web archive holdings. The sheer size of the Web means lots of variation and lots low-frequency edge cases. These variations and edge cases are naturally reflected in web archive holdings. Code used to research the Web and web archives naturally contains many, many code branches. Python struggles under these conditions. It struggles because minor changes can easily introduce bugs that go undetected until much later. And later for Python means at run time. Indeed the sheer number of edge cases introduces code branches that are exercised so infrequent...