Posts

2020-06-19: Data Visualization Fall 2019 Projects

(Previous semester Information Visualization highlights posts: Fall 2017, Spring 2017, Spring 2016, Spring 2015, Spring/Fall 2013, Fall 2012, Fall 2011)
In Fall 2019, I introduced CS 625: Data Visualization, a new graduate-level visualization course. (This course was taught in a flipped+hybrid manner, as I described in an earlier blog post.) We used the same textbook, Tamara Munzner's Visualization Analysis and Design, as in my previous CS 725/825 Information Visualization courses, but this course was designed to be a gentler introduction to visualization and data analysis. We focused on basic visualization design principles and on how to ask good questions rather than D3 programming. Students were allowed to use whatever tool they wished, but I emphasized clear design no matter what tool was used. Over the course of two assignments (HW7, HW8), students developed questions about real-world data, developed a draft visualization, and then refined the visualization based on feedback. …

2020-06-17: Hypercane Part 3: Building Your Own Algorithms

In Part 1, we introduced Hypercane, a tool for automatically sampling mementos from web archive collections. Web archive collections consist of thousands of documents, and humans need tools to intelligently select mementos for a given purpose. Hypercane's goal is to supply us with a list of memento URI-Ms derived from the input we provide. In Part 2, I highlighted how Hypercane's synthesize action converts its input into other formats like JSON for Raintale stories, WARCs for Archives Unleashed Toolkit, or boilerplate-free files for Gensim. This post focuses on the primitive advanced actions that make up Hypercane's sampling algorithms. We can mix and match different primitives to arrive at the sample that best meets our needs.



Our roadmap of Hypercane posts is as follows:
Hypercane Part 1: Intelligent Sampling of Web Archive Collections - an introduction to Hypercane
Hypercane Part 2: Synthesizing Output For Other Tools - how Hypercane can generate output for Archives U…

2020-06-10: Hypercane Part 2: Synthesizing Output For Other Tools

In Part 1 of this series of blog posts, I introduced Hypercane, a tool for automatically sampling mementos from web archive collections. If a human wishes to create a sample of documents from a web archive collection, they are confronted with thousands of documents from which to choose. Most collections contain insufficient metadata for making decisions. Hypercane's focus is to supply us with a list of memento URI-Ms derived from the input we provide. One of the uses for this sampling is summarization. The previous blog post in this series focused on its high-level sample and report actions and how they can be used for storytelling. This post focuses on how to generate output for other tools via Hypercane's synthesize action.



Our roadmap of Hypercane posts is as follows:
Hypercane Part 1: Intelligent Sampling of Web Archive Collections - an introduction to Hypercane
Hypercane Part 2: Synthesizing Output For Other Tools - this post
Hypercane Part 3: Building Your Own Algorithm…