Posts

2021-02-09: The 1st AAAI Workshop on Scientific Document Understanding (SDU 2021)

The AAAI-21 Workshop on Scientific Document Understanding (SDU) was co-located with the 35th AAAI Conference on Artificial Intelligence. It is another workshop focusing on scientific documents, following the Workshop on Scholarly Document Processing (SDP), which was last held with EMNLP 2020. The SDU workshop accepted 23 papers (14 long and 9 short) on the following topics: information extraction (7), information veracity or significance (5), biomedical text processing (4), scientific image processing (3), document classification (2), summarization (1), and reading comprehension (1).

I co-authored two papers accepted by the AAAI-21 SDU workshop: "Understanding and Predicting Retractions of Published Work" (by Sai Ajay Modukuri [PSU]) and "Recognizing Figure Labels in Patents" (by Ming Gong [Dayton]).

Retraction Prediction Paper

The last two decades have seen growing concern in the scientific community about the integrity of published works, represented by an incre

2021-02-13: Summary of "Latent Feature Vulnerability Ranking of CVSS Vectors", Part II

A critique of the summary of "Latent Feature Vulnerability Rankings of CVSS Vectors" (cc @correnmccoy https://t.co/Hmjph1CfNv
Sciuridae Hero (@attritionorg), January 20, 2021

When an academic researcher must condense months or even years of work into a few pages for peer-reviewed publication, some selectivity is required about what to include. A paper summary, like the one I presented in my blog post, Summary of "Latent Feature Vulnerability Ranking of CVSS Vectors", condenses the original content even further and may raise additional questions. Sciuridae Hero (@attritionorg), also known as Brian Martin (an industry expert on security topics), took note of my paper summary and offered a thoughtful critique via Twitter and his own detailed blog entries. In this post, I would like to take a deeper look at each of Mr. Martin's bulleted comments and observations to 1) make sure I adequately represented the authors' original intent,

2021-01-26: Summary of "CVExplorer: Multidimensional Visualization for Common Vulnerabilities and Exposures"

Figure 1: Overview of 110,766 CVEs reported from 1998 to 2018, obtained from the NVD. Layers show the severity classification: low, medium, high, and critical. (Source: Pham et al.)

Computing network facilities and data storage in national, industry, and academic research labs and offices are all possible targets of cyber attacks. A network vulnerability analysis, remediation, and alerting tool that helps defend against cyber attacks caused by human error can potentially reduce network vulnerabilities. Even though human error is the most significant cybersecurity vulnerability (e.g., falling for phishing, unrestrained web browsing, and weak passwords), most commercial vulnerability scanners are not designed to detect vulnerabilities introduced by humans interacting with the system. In their paper, CVExplorer: Multidimensional Visualization for Common Vulnerabilities and Exposures, Pham et al. introduce a novel interactive system for visualizing cybersecurity threats

2021-01-22: Twitter rewrites your URLs, but assumes you’ll never rewrite theirs: more problems replaying archived Twitter

Figure 1: The tweet replayed in the Internet Archive’s Wayback Machine has the t.co URI-M (“/web/20210106213519/https://t.co/Pm2PKV0Fp3”) displayed in the memento.

URLs shared on Twitter are automatically shortened to t.co links. Twitter does this to track engagement and to protect its users from sites with malicious content. Twitter replaces these t.co URLs with HTML that suggests the original URL, so the end-user does not see the t.co URLs while browsing. When these t.co URLs are replayed through web archives, they are rewritten to an archived URL (URI-M) and should be rendered in the web archive as on the live web, without displaying the t.co URI-Ms to the end-user. However, as shown in Figure 1, the tweet replayed in the Internet Archive’s Wayback Machine has the t.co URI-M (or at least the relative URL, “/web/20210106213519/https://t.co/Pm2PKV0Fp3”) displayed in the tweet itself. We first noticed the t.co URL displayed in the memento while exploring the archived Twitte
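
To make the structure of the relative URI-M quoted above concrete, here is a minimal Python sketch that splits a Wayback-style path into its 14-digit capture timestamp and the original URL. The helper `parse_uri_m` and the `Memento` type are hypothetical names introduced for illustration, and the pattern assumes the `/web/<timestamp>/<original URL>` layout seen in the example, with an optional two-letter modifier such as `id_` after the timestamp. It is not the rewriting logic used by Twitter or by the Wayback Machine.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional


class Memento(NamedTuple):
    """Hypothetical container for the two parts of a URI-M path."""
    captured_at: datetime
    original_url: str


# Assumed layout: /web/<14-digit timestamp>[<2-letter modifier>_]/<original URL>
_URI_M = re.compile(r"/web/(\d{14})(?:[a-z]{2}_)?/(.+)$")


def parse_uri_m(uri_m: str) -> Optional[Memento]:
    """Return the capture time and original URL, or None if the path does not match."""
    match = _URI_M.search(uri_m)
    if match is None:
        return None
    timestamp, original = match.groups()
    return Memento(datetime.strptime(timestamp, "%Y%m%d%H%M%S"), original)


memento = parse_uri_m("/web/20210106213519/https://t.co/Pm2PKV0Fp3")
print(memento.captured_at.isoformat())  # 2021-01-06T21:35:19
print(memento.original_url)             # https://t.co/Pm2PKV0Fp3
```

The recovered original URL is still the t.co short link, which illustrates why the replayed tweet can end up showing a t.co address: rewriting to a URI-M only prefixes the link, it does not expand it.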