Posts

Showing posts with the label hash

2017-12-11: Difficulties in timestamping archived web pages

Figure 1: A web page from nasa.gov is archived by Michael's Evil Wayback in July 2017. Figure 2: When visiting the same archived page in October 2017, we found that the content of the page had been tampered with.

The 2016 Survey of Web Archiving in the United States shows an increasing trend of using public and private web archives in addition to the Internet Archive (IA). Because of this trend, we should consider the validity of archived web pages delivered by these archives. Let us look at an example where the important web page https://climate.nasa.gov/vital-signs/carbon-dioxide/, which keeps a record of the carbon dioxide (CO2) level in the Earth’s atmosphere, is captured by a private web archive, “Michael’s Evil Wayback”, on July 17, 2017 at 18:51 GMT. At this time, as Figure 1 shows, the CO2 level was 406.31 ppm. When revisiting the same archived page in October 2017, we should be presented with the same content. Surpris...

2017-01-15: Summary of "Trusty URIs: Verifiable, Immutable, and Permanent Digital Artifacts for Linked Data"

Example: original URI vs. trusty URI

Based on the paper: Kuhn, T., Dumontier, M.: Trusty URIs: Verifiable, immutable, and permanent digital artifacts for linked data. Proceedings of the European Semantic Web Conference (ESWC), pp. 395–410 (2014).

A trusty URI is a URI that contains a cryptographic hash value of the content it identifies. The authors introduced this technique of using trusty URIs to make digital artifacts, especially those related to scholarly publications, immutable, verifiable, and permanent. With the assumption that a trusty URI, once created, is linked from other resources or stored by a third party, it becomes possible to detect if the content that the trusty URI identifies has been tampered with or manipulated on the way (e.g., trusty URIs can help prevent man-in-the-middle attacks). In addition, trusty URIs can verify the content even if it is no longer found at the original URI but can still be retrieved from other locations, such as Google's cache, ...
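
The core idea, a URI that carries a hash of the content it identifies, can be illustrated with a short Python sketch. This is a simplified illustration only and not the format defined in the paper: the choice of SHA-256, the URL-safe Base64 encoding, the "." separator, and the function names `make_trusty_uri` and `verify_trusty_uri` are all assumptions made for this example.

```python
import base64
import hashlib
import hmac


def make_trusty_uri(base_uri: str, content: bytes) -> str:
    """Append a content hash to base_uri (hypothetical, simplified format)."""
    digest = hashlib.sha256(content).digest()
    # URL-safe Base64 without padding, so the suffix can sit inside a URI.
    suffix = base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")
    return f"{base_uri}.{suffix}"


def verify_trusty_uri(uri: str, content: bytes) -> bool:
    """Return True if content matches the hash embedded in uri."""
    # The Base64 alphabet used here contains no ".", so the last "."
    # separates the base URI from the hash suffix.
    base_uri, _, _ = uri.rpartition(".")
    expected = make_trusty_uri(base_uri, content)
    return hmac.compare_digest(expected, uri)


if __name__ == "__main__":
    original = b"CO2 level: 406.31 ppm"
    uri = make_trusty_uri("http://example.org/resource", original)
    print(uri)
    print(verify_trusty_uri(uri, original))           # unchanged content
    print(verify_trusty_uri(uri, b"CO2 level: 0 ppm"))  # modified content
```

Because the hash travels with the URI, anyone who holds the URI can run the check against a copy retrieved from any location, without trusting the server that delivered it.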