Tuesday, April 18, 2017

2017-04-18: Local Memory Project - going global

Screenshots of world local newspapers from the Local Memory Project's local news repository. Top: newspapers from Iraq, Nigeria, and France. Bottom: Chile, US (Alaska), and Australia.
Soon after the introduction of the Local Memory Project (LMP) and the local news repository of:
  • 5,992 US Newspapers
  • 1,061 US TV stations, and
  • 2,539 US Radio stations
I considered extending the local news collection beyond US local media to include newspapers from around the world.
Finding and generating the world local newspaper dataset
After a sustained search, I narrowed my list of potential sources of world local news media to the following in order of my perceived usefulness:
From this list, I chose Paperboy as my world local news source because it was fairly structured (makes web scraping easier), and contained the cities in which the various newspaper organizations are located. Following scraping and data cleanup, I extracted local newspaper information for:
  • 6,638 Newspapers from 
  • 3,151 Cities in 
  • 183 Countries
The dataset is publicly available.
Integrating the world local newspaper dataset into LMP
For a seamless transition from a US-centric to a world-centric Local Memory Project, the world local media had to be represented with exactly the same data schema as the US local media. This keeps the architecture of LMP unchanged. For example, the following response excerpt represents a single US college newspaper (Harvard Crimson).
{
  "city": "Cambridge", 
  "city-latitude": 42.379146, 
  "city-longitude": -71.12803, 
  "collection": [
    {
      "city-county-lat": 42.377,
      "city-county-long": -71.1167,
      "city-county-name": "Harvard",
      "country": "USA",
      "facebook": "http://www.facebook.com/TheHarvardCrimson",
      "media-class": "newspaper",
      "media-subclass": "college",
      "miles": 0.6,
      "name": "Harvard Crimson",
      "open-search": [],
      "rss": [],
      "state": "MA",
      "twitter": "http://www.twitter.com/thecrimson",
      "video": "https://www.youtube.com/user/TheHarvardCrimson/videos",
      "website": "http://www.thecrimson.com/"
    }
  ], 
  "country": "USA", 
  "self": "http://www.localmemory.org/api/countries/USA/02138/10/?off=tv%20radio%20", 
  "state": "MA", 
  "timestamp": "2017-04-17T18:56:10Z"
}
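As a rough illustration of how a client might consume a response in this schema, here is a minimal Python sketch. It uses a trimmed copy of the excerpt above and only field names that appear in it. The helper name `summarize_sources` is made up for this example and is not part of LMP.

```python
import json

# Trimmed copy of the response excerpt above; every key used here
# ("collection", "name", "media-class", "miles", "website") comes from it.
RESPONSE = """
{
  "city": "Cambridge",
  "state": "MA",
  "country": "USA",
  "collection": [
    {
      "name": "Harvard Crimson",
      "media-class": "newspaper",
      "media-subclass": "college",
      "miles": 0.6,
      "website": "http://www.thecrimson.com/",
      "rss": []
    }
  ]
}
"""


def summarize_sources(response_text):
    """Return (name, media-class, miles, website) for each source in the response.

    Hypothetical helper for illustration only.
    """
    payload = json.loads(response_text)
    return [
        (src["name"], src["media-class"], src["miles"], src["website"])
        for src in payload["collection"]
    ]


for name, media_class, miles, website in summarize_sources(RESPONSE):
    print(f"{name} ({media_class}, {miles} miles): {website}")
```

Because non-US media follow the same schema, the same loop should work unchanged on a response for a non-US city.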
World local media use this same schema, so they integrate into the existing LMP framework without architectural changes. However, different countries have different administrative subdivisions. From an implementation standpoint, it would have been ideal if all countries had the US-style administrative subdivision of Country - State - City, but this is not the case. In addition, LMP's Geo and LMP's Local Stories Collection Generator are currently accessed using a zip code, so adding world local news media would have meant finding, for each country, a database that maps zip codes to geographical locations.
To overcome both obstacles (multiple administrative subdivisions and the difficulty of finding comprehensive zip code databases) while maintaining the pre-existing LMP data schema, I created a new access method for non-US local media. Specifically, US local news media are accessed with a zip code (which maps to a City in a State), while non-US local news media are accessed with the name of the City. For example, here is a list of 100 local newspapers that serve Toronto, Canada: http://www.localmemory.org/geo/#Canada/Toronto/100/
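The two access methods can be sketched in Python as follows. This is only an illustration: the `Country/Place/Count` fragment pattern is inferred from the Toronto link above, the US form is an assumption by analogy with the "self" link in the response excerpt, and the function name `geo_url` and its validation rules are hypothetical rather than LMP's actual code.

```python
from urllib.parse import quote

# Base of the Geo page, taken from the Toronto example link.
GEO_BASE = "http://www.localmemory.org/geo/#"


def geo_url(country, place, count):
    """Build a Geo link: zip code for the US, city name everywhere else.

    Hypothetical helper; the URL pattern is assumed from the Toronto example.
    """
    if country == "USA":
        # US media are looked up by zip code (assumed here to be 5 digits).
        if not (place.isdigit() and len(place) == 5):
            raise ValueError("US lookups expect a 5-digit zip code")
    elif place.isdigit():
        # Non-US media are looked up by city name, not by a postal code.
        raise ValueError("non-US lookups expect a city name")
    # Percent-encode in case a country or city name contains spaces.
    return f"{GEO_BASE}{quote(country)}/{quote(place)}/{count}/"


print(geo_url("Canada", "Toronto", 100))
print(geo_url("USA", "02138", 10))
```

The first call reproduces the Toronto link given above. The second shows what the zip code form would look like under the same assumed pattern.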

The addition of 6,638 Non-US newspapers from 183 countries makes it possible not only to see local news media from different countries, but also to build collections of stories about events from the perspectives of local media around the world.

--Nwala
