Posts

Showing posts with the label Memento

2020-06-10: Hypercane Part 2: Synthesizing Output For Other Tools

This image by NOAA is licensed under NOAA's Image Licensing & Usage Info.

In Part 1 of this series of blog posts, I introduced Hypercane, a tool for automatically sampling mementos from web archive collections. If a human wishes to create a sample of documents from a web archive collection, they are confronted with thousands of documents from which to choose. Most collections contain insufficient metadata for making decisions. Hypercane's focus is to supply us with a list of memento URI-Ms derived from the input we provide. One of the uses for this sampling is summarization. The previous blog post in this series focused on its high-level sample and report actions and how they can be used for storytelling. This post focuses on how to generate output for other tools via Hypercane's synthesize action. The goal of the DSA project: to summarize a web archive collection by selecting a small number of exemplars and then visualize them with social media

2020-06-03: Hypercane Part 1: Intelligent Sampling of Web Archive Collections

This image by NASA is licensed under NASA's Media Usage Guidelines.

Yasmin AlNoamany experimented with summarizing a web collection by choosing a small number of exemplars and then visualizing them with social media storytelling. This is in contrast to approaches that try to account for all members of the collection. When I took over the Dark and Stormy Archives project from her in 2017, the goal was to improve upon her excellent work. Her existing code relied heavily upon the Storify platform to render its stories. Storify was discontinued in May 2018. We discovered that other platforms rendered mementos poorly, so we developed MementoEmbed to render individual surrogates and later Raintale to render whole stories. We discovered that cards are probably the best surrogate for stories. We now publish stories to the DSA-Puddles web site on a regular basis. Up to this point, we have relied upon sources such as Nwala's StoryGraph or human selection

2020-05-21: Visualizing Webpage Changes Over Time With TMVis

Home page of tmvis.cs.odu.edu

This work has been supported by an NEH/IMLS Digital Humanities Advancement Grant (HAA-256368-17).

The web is dynamic, meaning webpages that exist today may not exist tomorrow. Even if a webpage continues to exist, it could display completely different content than it used to. Web archives, such as the Internet Archive (IA), Archive-It (AIT), and many others, preserve past versions of webpages for use by scholars, researchers, and the general public. Using Memento terminology, an archived version of a webpage at a particular time is called a memento, or URI-M, and the list of all mementos for a particular webpage is called a TimeMap. Different webpages have differently sized TimeMaps. For example, the TimeMap for odu.edu contains over 2000 mementos, while the TimeMap for cnn.com contains around 300,000. Analyzing such large TimeMaps is nearly impossible to do manually. Based on previous work (Alsum and Nelson, ECIR 2014), TimeMap Visu