Showing posts from May, 2020

2020-05-28: Richard Pates (Computer Science PhD Student)

Welcome to my profile on Blogger! My name is Richard Pates, and I joined the Web Sciences and Digital Libraries (WS-DL) research group in the Department of Computer Science (CS) at Old Dominion University (ODU) during the Summer of 2020 as a PhD student in CS advised by Dr. Jian Wu. I am a member of the research team in the Lab for Applied Machine Learning and Natural Language Processing Systems (LAMP-SYS) Group, working on the Mining Electronic Theses and Dissertations (METD) Project. After I earned the Masters of Science in Computer Science (MSCS) from ODU in the Fall of 2018, I was approved to join the PhD program in CS in the Spring of 2019, jointly advised by Dr. Ravi Mukkamala and Dr. Cong Wong, with an interest in Artificial Intelligence (AI), Cybersecurity, and Systems.

This year, my main goal in the PhD program is to advance to PhD Candidate during the Fall of 2020 (Current Academic Calendar), having made the Doctoral Dissert

2020-05-22: YouTube's recommended videos get longer as more of them are watched; Most are conspiracy videos.

The video "The NZ Mosque Attack Doesn't Add Up" was recommended from 51 channels

In this post, I examine the results of YouTube's recommendation algorithm through an example of a series of videos recommended by YouTube. From this example, I found that: (1) the recommended videos are generated to maximize watch time; (2) there is a significant correlation between videos' metadata and their recommendation order; and (3) YouTube's recommended videos promote conspiracy theories (in this example).

Maximizing watch time is YouTube's ultimate goal

YouTube's recommendation algorithm, among other discovery features, focuses on watch time to keep viewers glued to the site. In theory, maximizing engagement benefits YouTube, content creators, and advertisers. It encourages YouTubers to create content that people actually want to watch, because that content earns them more money by displaying more ads. On the other hand, YouTube makes money from advertisers because they find thei

2020-05-21: Visualizing Webpage Changes Over Time With TMVis

This work has been supported by a NEH/IMLS Digital Humanities Advancement Grant (HAA-256368-17). The web is dynamic, meaning webpages that exist today may not exist tomorrow. Even if a webpage continues to exist, it could display completely different content than it used to. Web archives, such as the Internet Archive (IA), Archive-It (AIT), and many others, preserve past versions of webpages for use by scholars, researchers, and the general public. Using Memento terminology, an archived version of a webpage at a particular time is called a memento, or URI-M, and the list of all mementos for a particular webpage is called a TimeMap. Different webpages have differently sized TimeMaps. For example, the TimeMap for contains over 2000 mementos, while the TimeMap for contains around 300,000. Analyzing such large TimeMaps is nearly impossible to do manually. Based on previous work (Alsum and Nelson, ECIR 2014), TimeMap Visu

2020-05-19: OCR Tools Experiment on Scanned Electronic Theses and Dissertations (ETDs)

A thesis or dissertation is a scholarly work showing that a student pursuing higher education has successfully met part of the requirements for a degree. An electronic thesis or dissertation (ETD) can be found in either a university's ETD digital library or ProQuest (a third-party ETD repository). ETDs carry rich metadata that can be used to search for them in a repository. However, not all ETD metadata are available, so the missing metadata must be extracted from the ETDs themselves. Extraction can be challenging, especially when the ETDs are scanned documents. Although many open-source tools exhibit satisfying performance on certain types of documents, experiments indicate that they tend to produce unacceptable errors or fail on scanned ETDs. In this blog post, I introduce one of the widely used optical character recognition (OCR) tools called tesseract-OCR and show how tesseract-OCR performs on scann

2020-05-06: PTSD Assessments in COVID-19 Health Care Workers

Figure 1: Both military and medical personnel are at risk for psychological trauma

Health care workers have been working in unfamiliar territory in recent times. Hospitals in major cities are overwhelmed by the number of patients they are handling as a result of the coronavirus disease 2019 (COVID-19) pandemic. There are accounts of people dying in hospital hallways before help can arrive because there is not enough space, equipment, or staff to handle the influx of patients. Hospital morgues are overflowing. To make matters worse, doctors and nurses have to worry about exposure to COVID-19 and about possibly exposing their families, largely due to a lack of personal protective equipment (PPE). The current environment is putting health care workers at greater risk of developing Post-Traumatic Stress Disorder (PTSD). As a matter of fact, hospital personnel have started to report symptoms consistent with those suffering with PTSD from sleep disturbances to const

2020-05-06: Teaching a Flipped Hybrid (In-Class/Online) Course

I’ve been meaning to write this for a couple of years. Now seems an especially appropriate time for it. In particular, a hybrid course may be an option if staggered in-class attendance is something that will be implemented in the Fall. My first hybrid class began as an in-class "flipped" model. So first, I'll talk about how I implemented the flipped model, and then I'll discuss how I handled the hybrid (in-class and online) aspects the following year. My definition of a "flipped" class is one in which students actually do the reading before the class meeting, and the class meeting time is spent discussing the material with students (not lecturing) and doing in-class activities. There can be several benefits to this, including that class time is changed from content delivery to ac