Posts

Showing posts with the label web scraping

2019-05-06: Twitter broke my scrapers

Fig. 1: The old tweet DIV showing four attributes with meaningful names (data-tweet-id, data-conversation-id, data-screen-name, and tweet-text). These attributes are absent in the new tweet DIV (Fig. 2).

On April 23, 2019, my Twitter desktop layout changed. I initially thought a glitch had caused me to see the mobile layout on my desktop instead of the standard desktop layout, but I soon learned this was no accident. I was part of a subset of Twitter users who did not have the option to opt in to the new layout. While others might have focused on the cosmetic or functional changes, my immediate concern was to understand the extent of the structural changes to the Twitter DOM. So I opened Google Chrome Developer Tools to inspect the Twitter DOM, and I was displeased to learn that the layout changes went beyond the new cosmetic look of the icons and into the DOM itself. This meant that I would have to rewrite all my...

2017-12-03: Introducing Docker - Application Containerization & Service Orchestration

For the last few years, Docker, the application containerization technology, has been gaining a lot of traction in the DevOps community, and lately it has made its way into the academic and research community as well. I have been following it since its inception in 2013. Over the last couple of years, it has become a daily driver for me. At the same time, I have been encouraging my colleagues to use Docker in their research projects. As a result, we are gradually moving away from one virtual machine (VM) per project to a swarm of nodes running containers of various projects and services. If you have accessed MemGator, CarbonDate, Memento Damage, Story Graph, or some other WS-DL services lately, you have been served from our Docker deployment. We even have an on-demand PHP/MySQL application deployment system using Docker for the CS418 - Web Programming course.

I (@ibnesayeed) have been selected as the @Docker Campus Ambassador for Old Dominion University! /cc @ODU @od...