Friday, October 28, 2016

2016-10-27: UrduTech - The GeoCities of Urdu Blogosphere


On December 12, 2008, an Urdu blogger, Muhammad Waris, reported on Urdu Mehfil that his blog, hosted on UrduTech.net, was lost. He was not alone: many other Urdu bloggers of that time were anxious about their lost blogs after a sudden outage of the blogging service UrduTech. The downtime lasted for several weeks and changed the shape of the Urdu blogosphere.

Before diving into the UrduTech story, let's have a brief look at the Urdu language and the role of the Urdu Mehfil forum in promoting Urdu on the Web. Urdu is spoken by more than 100 million people worldwide (about 1.5% of the global population), primarily in India and Pakistan. It has a rich literature and has been one of the premier languages of poetry in South Asia for centuries. However, the digital footprint of Urdu has been smaller than that of some other languages such as Arabic or Hindi. In the early days of the Web, computers were not easily available to most of the Urdu-speaking community. Urdu input support was often not built in, or required additional software installation and configuration. The right-to-left (RTL) direction of Urdu script was another obstacle to writing and reading it on devices that were optimized for left-to-right languages. Few fonts supported the Urdu character set completely and properly. The most commonly used Nastaleeq typeface was initially available only in a proprietary page-making software called InPage, which did not support Unicode and locked in the content of books and newspapers. Early online Urdu news sites used to export their content as images and publish those on the Web.

The Urdu community initially wrote Urdu text in Roman script on the Web, while efforts to promote Unicode Urdu were happening on a small scale; one such early effort was the Urdu Computing Yahoo Group by Eijaz Ubaid. In 2005, some people from the Urdu community, including Nabeel, Zack, and many others, took the initiative to build a platform to promote Unicode Urdu on the Web. They created UrduWeb and, under it, a discussion board named Urdu Mehfil. It quickly became the hub for Urdu-related discussions, development, and exchange of ideas. The community created tools to ease the process of reading and writing Urdu on computers and on the Web. They created many beautiful Urdu fonts and keyboard layouts, translated various software and CMSes and customized themes to make them RTL-friendly, created dictionaries and an encyclopedia, developed plugins to enable Urdu in various software, developed Urdu variants of Linux, provided technical help and support, digitized printed books, created an Urdu blog aggregator (Saiyarah) to promote blogging and increase the visibility of new bloggers, and provided a platform to share literary work. These are just a few of the many contributions of UrduWeb. These efforts played a significant role in shaping the presence of Urdu on the Web.

I, Sawood Alam, have been associated with UrduWeb since early 2008, driven by my continuing interest in getting the language and culture online. I have been administering UrduWeb for the last seven years. In this period I have mentored various projects, developed many tools, and taken various initiatives. I recently collaborated with Fateh, another UrduWeb member, to publish a paper entitled "Improving Accessibility of Archived Raster Dictionaries of Complex Script Languages" (PDF), in an effort to enable easy and fast lookup in many classical and culturally significant Urdu dictionaries that are available in scanned form in the Internet Archive.

To give a sense of the increased activity and presence of Urdu on the Web, we can take a couple of examples. In 2007, when UrduTech was introduced as a blogging platform, Urdu Wikipedia was in the third group of languages on Wikipedia based on the number of articles, with only 1,000+ articles. Fast forward nine years: in 2016 it has jumped to the second group of languages, with 100,000+ articles, and is actively growing.


In May 2015, the Google Translate Community hosted a translation challenge in which Urdu surfaced among the top ten most contributing languages. Google Translate highlighted this as, "Notably Bengali and Urdu are in the lead along with some larger languages."


Now, back to the Urdu blogging story. In 2007, the WordPress CMS was the most popular blogging software for those who could afford to host their own site and make it work. For those who were not technically inclined or did not want to pay for hosting, the hosted services of WordPress and Blogger were among the most popular free blogging platforms. However, when it came to Urdu, both platforms had limitations. WordPress allowed flexible plugins, translations, and theming, but only if one ran the CMS on one's own server. The free hosted service, in contrast, offered a limited number of themes, none of which were RTL-friendly, and did not allow custom plugins either. This meant that changing the CSS to better suit mixed bidirectional content was not possible, so lines containing bidirectional text (which is not uncommon in Urdu) were rendered in an unnatural and unreadable order. The lack of custom plugin support also meant that JavaScript-based Urdu input support in the reply form was not an option; as a result, articles would receive more comments in Roman script than in Urdu. Blogger, on the other hand, allowed theme customization, but its comment form was rendered inside an iframe, which left no way to inject external JavaScript for Urdu input support. As a result, Urdu bloggers who chose one of these free hosted services had to make some compromises.
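
To illustrate why mixed bidirectional lines are tricky, here is a minimal Python sketch that splits a line into runs of strong text direction using the Unicode bidirectional classes exposed by the standard `unicodedata` module. It is only an illustration, not the full Unicode Bidirectional Algorithm that browsers implement; the helper names and the sample sentence are made up for this example.

```python
import unicodedata


def strong_direction(ch):
    """Return 'LTR' or 'RTL' for strongly directional characters, else None."""
    bidi_class = unicodedata.bidirectional(ch)
    if bidi_class == "L":
        return "LTR"
    if bidi_class in ("R", "AL"):  # Hebrew-like and Arabic-script letters
        return "RTL"
    return None  # spaces, digits, punctuation, and other weak/neutral classes


def direction_runs(text):
    """Split text into (direction, substring) runs.

    Simplification (an assumption of this sketch): neutral characters are
    attached to the run before them, or to the first run if they lead the
    text. Text with no strong characters yields no runs.
    """
    runs = []
    leading = ""
    for ch in text:
        direction = strong_direction(ch)
        if direction is None:
            if runs:
                runs[-1][1] += ch
            else:
                leading += ch
        elif runs and runs[-1][0] == direction:
            runs[-1][1] += ch
        else:
            runs.append([direction, leading + ch])
            leading = ""
    return [(direction, chunk) for direction, chunk in runs]


def is_mixed(text):
    """True when the text contains both LTR and RTL runs."""
    return len({direction for direction, _ in direction_runs(text)}) > 1


# Hypothetical sample: an Urdu sentence with an embedded English word.
SAMPLE = "میں WordPress استعمال کرتا ہوں"

if __name__ == "__main__":
    for direction, chunk in direction_runs(SAMPLE):
        print(direction, repr(chunk))
```

A line like the sample has three runs (Urdu, English, Urdu). When the page's base direction is left-to-right instead of right-to-left, those runs are laid out in the wrong visual order, which is the kind of unreadable rendering described above.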

The technical friction of getting things to work for Urdu was a big reason for the slow adoption of Urdu blogging. To make it easier, Imran Hameed, a member of UrduWeb, introduced the UrduTech blogging service. People from UrduWeb, including Mohib, Ammar, Mawra, and some others, encouraged many people to start Urdu blogging. UrduTech used WordPress MU to allow multi-user blogging on a single installation. It was hosted on a shared hosting service. Creating a new blog was as simple as filling in an online form with three fields and hitting the "Next" button. From there, one could choose from a handful of beautiful RTL-friendly themes and enable pre-installed add-ons for Urdu input support, both in the dashboard for post writing and on the public-facing site for comments. By removing all the friction that WordPress and Blogger had, UrduTech gave a big boost to the Urdu community, and many people started creating their blogs.


It turned out that creating a new blog on UrduTech was easy not just for legitimate people, but for spammers as well. This is evident from the earliest capture of UrduTech.net in the Internet Archive. Unfortunately, the stylesheets, images, and other resources were not well archived, so please bear with the ugly-looking (damaged Memento) screenshots.


Later captures in the web archive show that as the Urdu blogger community grew on UrduTech, so did the attacks from spam bots. This increased the moderation burden of actively and regularly cleaning up spam registrations.


The service ran for a little over a year with occasional minor downtimes. The Urdu blogosphere started evolving slowly and the diversity of the content increased. During this period, some people slowly started migrating to other blogging platforms such as their personal free or paid hosting, other Urdu blogging offerings, or the free hosted services of WordPress and Blogger. This is evident from the blogrolls of various bloggers in their archived copies.

Increasing activity on UrduTech from both humans and bots led to the point where the shared hosting provider decided to shut the service down without any warning. People were anxious about the sudden loss of their content and demanded backups. Who makes backups? (Hint: Web archives!) Imran, the founder of the service, was busy with other priorities, so it took him more than a month to bring the service back online. In the interim, people either decided never to blog again or swiftly moved on to other, more robust options to start over from scratch (as Waris did), having learned the hard way to back up their content regularly.


"Did Waris really lose all his hard work and the hundreds of valuable articles he wrote about Urdu and Persian literature and poetry?" I asked myself. The answer was perhaps to be found somewhere in the 20,000 hard drives of the Internet Archive. I didn't know his lost blog's URL, but the Internet Archive was there to help. I first looked through a few captures of UrduTech in the archive, and from there I was able to find the link to his blog. I was happy to discover that his blog's home page was archived a few times; however, the permalinks of individual blog posts were not. The pages of the blog home with older posts were not archived either. This means that, from the last capture, only the 25 latest posts can be retrieved (without comments). When earlier captures of the home page are combined, a few more posts can be recovered, but perhaps not all of them. Although the stylesheet and various template resources are missing, the images in the posts are archived, which is great.

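The idea of combining several captures of the home page can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the capture timestamps, permalinks, and titles are made up (hypothetical `blog.example.org` URLs), and extracting posts from the archived HTML is left out. Only the `https://web.archive.org/web/<timestamp>/<url>` address pattern is the Wayback Machine's own.

```python
def wayback_url(timestamp, original_url):
    """Build a Wayback Machine address for a capture of original_url."""
    return f"https://web.archive.org/web/{timestamp}/{original_url}"


def merge_captures(captures):
    """Merge posts seen on several home page captures, keyed by permalink.

    captures maps a 14-digit capture timestamp (YYYYMMDDhhmmss) to the list
    of (permalink, title) pairs visible on the home page in that capture.
    """
    recovered = {}
    # 14-digit timestamps sort chronologically when sorted as strings.
    for timestamp in sorted(captures):
        for permalink, title in captures[timestamp]:
            entry = recovered.setdefault(permalink, {"first_seen": timestamp})
            entry["title"] = title          # keep the most recent title
            entry["last_seen"] = timestamp  # latest capture showing the post
    return recovered


# Hypothetical data: two captures of a blog home page that overlap by one post.
CAPTURES = {
    "20081101000000": [
        ("http://blog.example.org/?p=3", "Post 3"),
        ("http://blog.example.org/?p=2", "Post 2"),
    ],
    "20080901000000": [
        ("http://blog.example.org/?p=2", "Post 2"),
        ("http://blog.example.org/?p=1", "Post 1"),
    ],
}

if __name__ == "__main__":
    for permalink, info in merge_captures(CAPTURES).items():
        print(info["title"], wayback_url(info["last_seen"], "http://blog.example.org/"))
```

Each recovered post points back to the latest home page capture that still shows it, since the post permalinks themselves were not archived.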

What happened to the UrduTech service? When it came back online after the long outage, many people had already lost their interest and trust in the service. In less than three months the service went down again, and this time it never came back; it stayed dead until the domain name registration expired.

Due to its popularity and search engine ranking, the domain was a good target for drop catching. Mementos (captures) between November 27, 2011 and December 18, 2014 show a blank page when viewed in the Wayback Machine. A closer inspection of the page source reveals what is happening there. Using JavaScript, the page loads itself in the top frame (if it is not there already), and it uses frames to load more content. Unfortunately, the resources in the frames are not archived, so it is difficult to say how the page might have looked during that period. However, there is some plain text in the "noframes" fallback, which reveals that the domain drop catchers were trying to exploit the "tech" keyword present in the UrduTech name, though they had nothing to do with Urdu.
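
The kind of inspection described above can be sketched in Python: given the source of a frameset page, pull out the plain-text fallback from the HTML `<noframes>` element. The sample markup below is hypothetical (it only imitates the structure described, with a frame-busting script and a frameset), and the regular expressions are a quick heuristic for a one-off look at a capture, not a general HTML parser.

```python
import re
from html import unescape

NOFRAMES_RE = re.compile(
    r"<noframes\b[^>]*>(.*?)</noframes\s*>", re.IGNORECASE | re.DOTALL
)
TAG_RE = re.compile(r"<[^>]+>")


def noframes_text(page_source):
    """Return the visible text of every <noframes> fallback in the source."""
    fallbacks = []
    for match in NOFRAMES_RE.finditer(page_source):
        text = TAG_RE.sub(" ", match.group(1))  # drop nested markup
        text = unescape(text)                   # decode entities such as &amp;
        fallbacks.append(" ".join(text.split()))  # collapse whitespace
    return fallbacks


# Hypothetical page source, not the actual archived capture.
SAMPLE_SOURCE = """<html><head>
<script>if (top != self) { top.location = self.location; }</script>
</head>
<frameset rows="100%,*">
  <frame src="http://parking.example/landing">
</frameset>
<noframes><body><p>Tech news &amp; gadget reviews</p></body></noframes>
</html>"""

if __name__ == "__main__":
    print(noframes_text(SAMPLE_SOURCE))
```

Because the framed resources are missing from the archive, this fallback text is the only readable content left in such captures.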


Sometime before March 25, 2015, the domain name presumably went through another drop catch. Alternatively, it is possible that the same domain name owner decided to host a different type of content on the domain. Whatever the case, since then the domain has been serving a health-related "legitimate-looking fake" site that is still live and adds new content every now and then. However, the content of the site has nothing to do with either "Urdu" or "tech".


UrduTech simplified a task that was challenging at the time, made it accessible to people with little technical skill, and grew the community. The service died, but the community moved on (though the hard way) and transformed into a more mature and stable blogosphere. UrduTech played the same role for Urdu blogging that GeoCities played for personal home page hosting, only on a smaller scale and for a specific community. Over time, Web technology matured, support for Urdu on computers and smartphones improved, awareness of the tools and technologies grew in the community in general, and new communication media such as social media sites helped spread the word and connect people. Now, the Urdu blogosphere has grown significantly, and people in the community organize regular meetups and Urdu blogger conferences. Manzarnamah, another initiative from UrduWeb members, introduces new bloggers to the community, publishes interviews with regular bloggers, and distributes annual awards to bloggers. Bilal, another member of UrduWeb, is independently creating tools and guides to help new bloggers and the Urdu community in general. UrduTech was certainly not the only driving force for Urdu blogging, but it did play a significant role.


On the occasion of the 20th birthday celebration of the Internet Archive (#IA20), on behalf of the WS-DL Research Group and the Urdu community, I extend my gratitude for preserving the Web for 20 long years. Happy Birthday, Internet Archive; keep preserving the Web for many, many more years to come. I could only wish that the preservation were more complete and less damaged, but having something is better than nothing and, as DSHR puts it, "You get what you get and you don't get upset". Without these archived copies I would not be able to augment my own memories and tell the story of the evolution of a community that is very dear to me and to many others. I can only imagine how many more such stories are buried in the spinning discs of the Internet Archive.

--
Sawood Alam

2 comments:

  1. Wow, you reminded me of those good old days... When Facebook and other social media platforms were not as popular as they are today, the founding of UrduTech encouraged the Urdu community to write blogs. The damage of losing the service was severe, and several people left blogging. But that was not the very first time. Back in 2006, there was a service, "Urdu home", that used to offer Urdu blogs on WordPress MU. And one day, it just vanished. I was one of the victims.
    A very brief note about it can be read on my blog http://www.ibnezia.com/2014/10/wayback-machine.html

  2. Nice and great information. Keep it up sir.
    I also write Urdu stories in roman fonts at http://urdustory.pk Please review and give your feedback
    Regards
