Monday, June 21, 2010

2010-06-23: Hypertext 2010; We laughed, we cried, we danced on air.

Hypertext 2010 (13-16 June 2010) has come and gone, but the memories linger.

Martin Klein and I presented our respective papers. He will be detailing his experience and his paper Is This a Good Title? at Hypertext 2010 when he returns from JCDL 2010. My paper Analysis of Graphs for Digital Preservation Suitability and its associated PowerPoint presentation are available. The paper and the presentation were given at Hypertext 2010 in Toronto, Ontario, Canada. A complete Hypertext program is available here.
Day zero, 12 June

Mary (my wife) and I got to Toronto late Saturday. We were four and a half hours late out of Norfolk because of weather problems in Chicago. Fortunately, Mary made alternative reservations out of Dulles to Toronto as soon as we thought we were going to miss our connection. It pays to pay attention and to have alternative plans.

Martin (the sly dog) chose to travel 13 June on a direct flight from Richmond, VA to Toronto.

Day one, 13 June

I attended the Modelling Social Media workshop hosted by Alvin Chin (Nokia Research Center, Beijing). The ways that social networks can be modelled and then networked together engendered all sorts of "Big Brother" feelings.

Day two, 14 June

Andrew Dillon gave an interesting talk as the opening keynote speaker, stressing how difficult it is for professionals whose interests are not confined to a single, well-defined discipline to find a venue to promote their ideas. He promoted things (institutions) like iUniversities that foster ideas that cross boundaries. Andy challenged the Hypertext organizing committee to find a way to encourage "cross cultural ideas." He also said that if you have a passion for a topic or idea, regardless of how it is characterized, you should follow that passion and somehow things will work out. All in all, interesting ideas, but I'm not sure how they could be implemented. His presentation was the topic of lots of conversations.

Martin (the lucky dog that he is) was the first presenter after Andy. I never asked, but I bet he was glad to get his presentation out of the way so that he could sit back and enjoy the show. Martin is off to JCDL right now and I'm sure will have lots to "talk" about when he gets finished down under.

Ryen White's talk about how people use parallel browsing (having multiple tabs open simultaneously) was interesting to me mostly because of the way the data was collected. When users upgrade their browser, they often permit their browsing activities to be sent back to the "mother ship" so that the developer can improve the product. If you think about it, do you want them to know where and how you browse?? Kind of an interesting question. Not sure that I do.

The presentation about using tags as a source for thematic data to create a narrative resulted in an interesting collection of photographs that at first glance may not have seemed related. I think that the Shakespearean quote used in the paper ("Freeze, freeze thou bitter sky" from As You Like It) and the imagery resulting from the tags were a bit of a mismatch (to me the quote speaks to being forgotten vice the weather), which points out how relying on just the tags can lead you astray.

Ricardo Kawase's "The Impact of Bookmarks and Annotations on Refinding Information" looks at the question of how to refind information that you once found. It is an interesting question because not only does each of the techniques presented (tagging and spreadcrumbs) require that you "know" what you are looking for, but also the user is not the same person (in the sense that time has passed, new experiences have been acquired, contexts have changed, etc.) who annotated the data in the first place. Interesting question: how do you refind something when you aren't the person who found it in the first place??

Sihem Amer-Yahia and her "Automatic Construction of Travel Itineraries using Social Breadcrumbs" looked at constructing a travel itinerary based on tags that others had put on pictures in a particular area. That way a travel itinerary could be constructed that took you to the same places that most other people had been when they were there before you. In a sense you could repeat what they had done, and therefore you must have been there because you saw the same things as everyone else. Interesting concept: in many ways there are a certain number of "must see" things you have to see, or otherwise you haven't really been there. It seems that the hand editing of some of the data would make the technique hard to scale.

Kaipeng Liu's "Speak the Same Language with Your Friends: Augmenting Tag Recommenders with Social Relations" presentation showed that combining the tag sets from various socially related taggers can help to "normalize" the tag set across the users. This normalized tag set can then recommend additional tags based on the tags that the tagger has recently used. These ideas could also be used to help someone trying to search for additional or related works. Kind'a neat.

Huizhi Liu's "Connecting Users and Items with Weighted Tags for Personalized Item Recommendations" builds on the idea of having a "normalized" set of tags and applies a mathematical bent to it. All this is to counter the problem that tags are user dependent and that the same tag can be used by two different taggers in at least two different ways. This "tagging noise" can make it very difficult to find the correct match based on tags.

"Topic-based Personalized Recommendation for Collaborative Tagging System" looks at reducing the noise in freeform tagging systems by applying a modified Latent Dirichlet Allocation (LDA) approach. LDA is used to identify unknown groups in data sets. By applying LDA techniques to tags and taggers, better tags can be recommended.
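The core LDA idea can be sketched in a few lines. This is not the paper's modified model; it is a minimal stand-in using scikit-learn's stock LDA, and the tag "documents" below are invented for illustration:

```python
# Minimal sketch of LDA over tagging data (illustrative only; the
# paper uses a modified LDA and real tagging logs, neither shown here).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Each "document" is the bag of tags applied to one resource.
tag_docs = [
    "python code programming software",
    "programming software developer code",
    "travel photo vacation beach",
    "beach photo travel sunset",
]

counts = CountVectorizer().fit_transform(tag_docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(counts)

# Each resource now has a distribution over latent topics; tags that
# rank highly in a resource's dominant topic are candidate recommendations.
topic_mix = lda.transform(counts)
print(topic_mix.shape)
```

The latent topics play the role of the "unknown groups": two resources that share a dominant topic can trade tag recommendations even if their raw tag sets barely overlap.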

Iñaki Paz's "Providing Resilient XPaths for External Adaptation Engines" has an interesting application of simulated annealing (SA) to derive an XPath specification for extracting data from a page that changes over time. The premise is that it is easy to tailor an XPath for someone else's static page, but given that the page will probably evolve, how do you derive an XPath that can be expected to remain reasonably robust as the page changes and evolves? Nice to see SA used to extract data from a web page.
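The SA machinery itself is generic: propose a mutated candidate, score it, and accept worse candidates with a probability that shrinks as the "temperature" cools. A minimal sketch of that loop, with a toy numeric objective standing in for the paper's XPath-robustness score:

```python
import math
import random

def simulated_annealing(initial, neighbor, score, steps=2000, t0=1.0, cooling=0.995):
    """Generic SA loop. For the paper, 'neighbor' would mutate an XPath
    candidate and 'score' would rate its robustness; here both are toys."""
    rng = random.Random(0)
    current, current_score = initial, score(initial)
    best, best_score = current, current_score
    t = t0
    for _ in range(steps):
        cand = neighbor(current, rng)
        cand_score = score(cand)
        # Always accept improvements; sometimes accept worse candidates
        # so the search can escape local optima while the system is hot.
        if cand_score >= current_score or rng.random() < math.exp((cand_score - current_score) / t):
            current, current_score = cand, cand_score
            if current_score > best_score:
                best, best_score = current, current_score
        t *= cooling
    return best, best_score

# Toy stand-in objective: maximize -(x - 3)^2, whose optimum is x = 3.
best, best_score = simulated_annealing(
    initial=0.0,
    neighbor=lambda x, rng: x + rng.uniform(-0.5, 0.5),
    score=lambda x: -(x - 3.0) ** 2,
)
print(best, best_score)
```

Swapping the toy objective for "how many of the page's historical snapshots does this XPath still extract correctly?" gives the shape of the paper's approach.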

Vinicius F. C. Ramos's "The Influence of Adaptation on Hypertext Structures and Navigation" was about how students use an adaptive hypertext (AH) e-learning application. A fair amount of discussion ensued after the presentation as to why the students went off track and followed links that didn't "go anywhere." Was it because they didn't understand that they shouldn't have gone there, or was it a reflection of their attempting to be thorough, or was it something else??

Alexandra I Cristea's "The Next Generation Authoring Adaptive Hypermedia: Using and Evaluating the MOT3.0 and PEAL Tools" dealt with the problem of authoring and evaluating AH programs and lesson plans.

Evgeny Knutov's "Provenance Meets Adaptive Hypermedia" explored and offered a model of how to address the question of provenance in an AH environment by bringing together several existing models. Their overarching model was then used to answer the provenance questions of: Where?, When?, Who?, Why?, Which? and How? In the digital preservation arena, these types of provenance questions and data have to be preserved as well.

Day three, 15 June

Haowei Hsieh's "Assisting Two-Way Mapping Generation in Hypermedia Workspace" looked at presenting hypermedia in an X-Y spatial manner to reduce the complexity that became apparent from feedback by previous AH creators. Interesting ideas; I'm not sure how to apply them to what I am doing, but it does open my mind to the fact that while I may think that my way is best, others may have a different opinion.

My personal favorite paper and presentation was Analysis of Graphs for Digital Preservation Suitability. The mechanics of the presentation went very well. Prior to the presentation, I watched as others struggled with the microphone on the podium. It had a pick-up range of about 4 feet; beyond that, the speaker had to really project for people in the back of the auditorium to hear. I have a hard time being constrained to a 4-foot-radius circle, so the first thing that I did was to walk away from the microphone, use my "outdoor voice," and get feedback from the back of the room as to whether I was understandable.

Then I found a way to engage the audience by pointing out that Calum's (the conference photographer) work would only be available for a few years and that I was proposing something different. Rather than relying on institutions to preserve digital data, have the data preserve itself. By using a laser pointer, moving around in front of the screen (if you look at the HT-2010 home page, you can see the bottom portion of the screen; the podium is just off image to the right), focusing some attention onto Calum, and then having a video that showed how these digital preservation graphs were going to be attacked and then repaired, everyone's attention seemed to be on the performance and not so much on what was happening on their respective screens.

The first video was a lot of work to put together (figuring out which image format to create to feed the video-creation software to feed Youtube to be usable on a large screen, etc.), but after that it was almost mechanical. I haven't found a command line video creation tool yet, but the GUI-based one isn't too bad. There was a lot of power in the video: while it was playing, everyone was staring at the screen. The presentation lasted right at 20 minutes, and then it was question and answer time.
Mark Bernstein asked several questions about how the Web Objects from the paper would communicate and whether I had thought about how they could live in their entirety inside a URI. Alvin Chin asked how they could propagate across different types of social networks. Nathan (whose last name I have misplaced), a student from the University of Cambridge, asked many questions about the energy costs of sending a message across the diameter of the Unsupervised Small World graph. I presented a version of this paper at JCDL 2009; initially I knew that there were two people there who understood what I was talking about. At the end there were three. At this conference, I started with one (me), and based on the conversations that I had with Mark, Alvin, Nathan, and Jamie Blustein, I ended up with at least five. And so at the end, we laughed.

Hypertext Final - Analysis of Graphs for Digital Preservation Suitability

BTW: One of the things that I've noticed and haven't figured a way around is that SlideShare seems not to support PPT animations (for instance on slides 5 and 8 of the presentation) and some clicks to external pages (for instance on slide 20). Inside PowerPoint, clicking on the graph on slide 20 will take you to the Youtube video shown here:

The movie starts off with a baseline graph that alternately gets attacked and repairs itself. Nodes that are isolated from the graph are shown in red, while those that are still connected to at least one other node are shown in cyan. This game goes on for 10 turns just to show how a graph can be evaluated to quantify how long it will last given that some nodes become isolated and then try to regain membership into the larger graph. These are some of the ideas behind the paper.
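The attack-and-repair game in the video can be sketched in plain Python. This is an illustrative toy, not the paper's actual model: the baseline graph, the attack (a random node loses all its edges), and the repair policy (isolated nodes link to a random connected node) are all simplified stand-ins:

```python
import random

rng = random.Random(42)
n = 20
# Baseline graph: a ring plus a few random chords, stored as adjacency sets.
adj = {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}
for _ in range(10):
    a, b = rng.sample(range(n), 2)
    adj[a].add(b)
    adj[b].add(a)

def isolated(adj):
    """Nodes with no edges (shown in red in the video)."""
    return [v for v, nbrs in adj.items() if not nbrs]

for turn in range(10):
    # Attack: a random victim node loses all of its edges.
    victim = rng.randrange(n)
    for nbr in adj[victim]:
        adj[nbr].discard(victim)
    adj[victim] = set()
    hit = len(isolated(adj))
    # Repair: each isolated node tries to regain membership by linking
    # to a randomly chosen still-connected node.
    connected = [v for v, nbrs in adj.items() if nbrs]
    for v in isolated(adj):
        if connected:
            peer = rng.choice(connected)
            adj[v].add(peer)
            adj[peer].add(v)
    print(f"turn {turn}: {hit} isolated after attack, "
          f"{len(isolated(adj))} after repair")
```

Counting how many turns the graph survives before repair can no longer keep up is the kind of quantitative suitability measure the paper is after.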

Heiko Haller's presentation of "iMapping - A Zooming User Interface Approach for Personal and Semantic Knowledge Management" was very interesting because of the tool that he used. I have had very little experience with Personal Knowledge Management tools, so I can't comment on the efficacy of iMapping, but the things that iMapping can do in terms of a presentation are very different and innovative when compared to Powerpoint and EndNote. iMapping still has some rough edges, but the ideas, the interface, and the way that data is presented are really, really neat. Heiko's paper won the Ted Nelson Newcomer Award for best newcomer paper.

Tsuyoshi Murata's presentation "Modularity for Heterogeneous Networks" tackles the problem of detecting communities of users, tags, and URIs/URLs using graph theoretical approaches. He started with the "simple" bipartite problem, expanded it to tripartite, and implied that the approach could be extended to n-partite. My eyes started to cross after the third nested summation of various logs.

Dan Corlette presented data as part of his paper "Link Prediction Applied to an Open Large-Scale Online Social Network" that looked at the links that LiveJournal users created between and amongst themselves. Dan and company then made predictions as to how many links a user would form as a function of the length of time they were members of LiveJournal. Their predictions were "reasonable" for new users, but less accurate for old users. I wonder if old users (those who have established themselves, vice chronologically old) had focused on what they wanted and that was enough links for them. Kind of an interesting human issue question.

Said Kashoob's presentation on "Community-Based Ranking of the Social Web" compared different "community finding" techniques for identifying communities in social networks. Frankly, I got lost after the third integral. His group claims to have a technique that works better than the "normal" ones, but I'm not qualified to speak to it.

Danielle H. Lee's presentation "Social Networks and Interest Similarity: The Case of CiteULike" looked at how social networks among CiteULike members tend to use the same tags and to reinforce the tags that members of their group use. As the connections between different groups become more and more distant, the similarity lessens. Is this a case of "birds of a feather flock together?"

Christian Körner tackled the problem of what motivates people to tag in "Of Categorizers and Describers: An Evaluation of Quantitative Measures for Tagging Motivation." The presentation generated a lot of questions and introspection among the attendees, who tried to categorize themselves as either someone who describes resources or someone who categorizes resources. Christian provided an interesting insight into what motivates people to tag and why.

Jacek Gwizdka's "Of Kings, Traffic Signs and Flowers: Exploring Navigation of Tagged Documents" took the problem of how to represent the "goodness" of a set of tags by the use of a "heat map" of the tags. That was after looking at hypertext links with Kings, traffic signs and flowers. His paper and the presentation have interesting diagrams.

Jeff Huang's "Conversational Tagging in Twitter" took the first look at the use of tags in Twitter: how long some last (only a couple of days in some cases), how many people use them (from a few to viral numbers), and how they have meaning to only a few. Jeff claimed that this was the first time anyone had looked at Twitter tags and that the analysis showed some interesting and unexpected things. Jeff's group worked on a data set of Twitter tags prior to the LoC getting a copy of all public tweets. He said that he is interested in applying the same analysis to the much larger LoC data set.

Marek Lipczak's "The impact of resource title on tags in collaborative tagging systems" looked at tagging by members of a collaborative group. His team's results point to members of the group tagging resources more in line with their personal interests than those of the group. (Definitely a darker side of group participation.)

Day four, June 16

Daniel Gonçalves' "A Narrative-Based Alternative to Tagging" looked at placing tags into a narrative about a series of images and then measuring how the tags were reused, how long they conveyed information, and how well an outsider could see the connections between the images. As a demonstration of the effectiveness of the approach, Daniel's presentation took 14 minutes with 90 slides and very few words on the screen. It is an interesting idea and approach; I'm not sure how I'll use it, but it is something to remember that there are alternatives.

F. Allan Hansen's "UrbanWeb: a Platform for Mobile Context-aware Social Computing" reported on an on-going experiment at Aarhus University and in the city of Aarhus where tags were placed "in the wild" of the real world and people went in search of them. People used their smart mobile phones as a way to access a database based on their location, snaps of barcodes and other tagging information that was present. The combination of these data helps to create a richer and more interesting experience for the human, and to encourage humans in the same location with similar interests to connect. Neat application.

James Goulding presented the Douglas Engelbart Best Paper Award paper called "Hyperorders and Transclusion: Understanding Dimensional Hypertext." He addressed some of the limitations of the RDF representation of data and relationships between data, and then went on to move the discussion into an arena where the number of different values that a database tuple can have can be viewed as its dimensions. Based on the intersection of these dimensioned data, it is possible to have data that is hyperconnected. Taking the idea that these data can be viewed as hyperlinks, then the dimensioned data can be hyperconnected. The transclusions come in when the data can be used in different contexts. After a while, my head started to hurt from trying to follow the different types of connections.

Mark Bernstein's paper, presentation and panel discussion called "Criticism" was a real delight to hear, see and follow. Mark's premise is that we (as a community) have done lots and lots of work with hypertext, but is it good work and is hypertext a good tool?? His insights about how we do things and how our tools and approaches taint what we see and how we see it rang true again and again. During his presentation he used the phrase "web of scholarship" that, I think, speaks to the heart of the matter. Too often we get bound up in the way that we (and only we) do things and fail to see how there are influences outside of our immediate sphere that also influence us. I think that this is absolutely true and that we have to raise our heads up from time to time and see what the rest of the world is doing and drink in the bigger picture.

The panel discussion "Panel Session: Past Visions of Hypertext and Their Influence on Us Today" by all the authors was an interesting reflection on some of the major ideas that got us to where we are today. It put Vannevar Bush's seminal paper "As We May Think" into a cultural context (where he got his ideas, why he was published, where he was published, etc.) and then took those ideas (not only his, but also ideas from people like him) and looked at how they have and have not come to be. It was a real pleasure to hear a panel of experts in their and our collective field talk about their views on someone who affected us all. A real pleasure.

Irene Greif's closing keynote address "The Social Life of Hypertext" brought things back to the real world. For most of the presentations, things had been very ethereal, existing in the abstract, with only limited connectivity to things of this world. Irene was able to bring things back: back to how having lots of real data can give new insights, how Twitter backscatter can reveal things that may have passed unnoticed, and in general how all the things that we had talked about for the entire conference have a place in the real world.

And so, HT2010 closed.
Martin, Mary and I had a celebratory dinner at the 360 restaurant in the CN tower. At over 1150 feet, it provides a full view of the Toronto area. Including the area where the riverboat dinner cruise took us the night before.

After two circuits around the tower, Martin and I went down to the observation deck.

They have all sorts of pictures and statistics about the tower (height, weight, fastest climb from the bottom to the observation level, etc.). A section of the floor is glass, so you can see all the way down the side of the tower to the ground far, far below. Martin and I danced on the air that evening.

The hotel lost electrical power for most of the evening.

Day five, June 17

The hotel power came on early in the morning. Early enough that the water had a chance to heat for showers and for us to pack. As we went to the elevator, it went out again. Fortunately we weren't ready 90 seconds earlier, otherwise we would have been in the elevator when the power went out.

Down the stairs we went, six floors, 12 flights, clunking suitcases all the way. Our carriage awaited us at the bottom, and whisked us away to the airport. Getting through Customs seemed to take forever. As we snaked our way to the front of the line, those who had signed up for NEXUS sailed by us all. Something to keep in mind: if you plan on going back and forth across the US-Canada border, NEXUS may be a way to save a lot of time and effort.

We cried (almost) trying to get to Toronto. We laughed when our presentations were done. We danced on air (with a few inches of hardened glass beneath our feet) at the end.

Now that Hypertext 2010 has come and gone, it is time to get back on the paper and conference treadmill again.

More as events warrant,

-- Chuck Cartledge