Posts

Showing posts from December, 2021

2021-12-31: Installing Several Open Source and Commercial Optical Character Recognition (OCR) Tools on a PC

Optical Character Recognition (OCR) tools are used for extracting text from images. There are many off-the-shelf OCR tools we can choose from. In a previous blogpost, I compared the performance of several open-source and commercial OCR tools. I'd like to go further and summarize the installation of these tools. In this blogpost, I will talk about the installation of Tesseract, ABBYY, Amazon Textract, and Google Cloud Vision. Tesseract: Tesseract is a free software package that accepts a wide range of image formats, such as JPEG, PNG, TIFF, and BMP. The installation on a Windows 10 system is as follows: Step 1: Download the Tesseract executable file tesseract.exe from this website. Double-click the file and it will guide you through the installation. Step 2: Download the language package into the installation directory of the Tesseract executable file. It must be compatible with the version of tesseract.exe. This is the language package for Tesseract version 4.0. You can download tessdata
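As a rough illustration of the two Tesseract steps in this excerpt, here is a minimal Python sketch that checks whether the executable is on the PATH, checks whether a language file is present, and builds a command line for the `tesseract` CLI. The helper names (`find_tesseract`, `has_language_data`, `build_ocr_command`) and all file paths are hypothetical and do not come from the original post; the sketch only assumes that language files are named `<lang>.traineddata` and that the CLI accepts `-l` and `--tessdata-dir`.

```python
import shutil
from pathlib import Path


def find_tesseract():
    """Return the full path of the tesseract executable, or None if it is not on PATH."""
    return shutil.which("tesseract")


def has_language_data(tessdata_dir, lang="eng"):
    """Check that the language file for `lang` exists in the given tessdata directory."""
    return (Path(tessdata_dir) / f"{lang}.traineddata").is_file()


def build_ocr_command(image_path, output_base, lang="eng", tessdata_dir=None):
    """Build (but do not run) a tesseract command line as a list of arguments.

    Tesseract writes the recognized text to `<output_base>.txt`.
    """
    cmd = ["tesseract", str(image_path), str(output_base), "-l", lang]
    if tessdata_dir is not None:
        # Point tesseract at an explicit language-data directory (hypothetical path).
        cmd += ["--tessdata-dir", str(tessdata_dir)]
    return cmd


if __name__ == "__main__":
    print("tesseract found at:", find_tesseract())
    print(build_ocr_command("scan.png", "scan_text"))
```

The resulting list can be passed to `subprocess.run` once Tesseract is installed. Building the command separately from running it keeps the check usable on a machine where the installation has not been completed yet.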

2021-12-20: Machine Learning on Mobile and Embedded Devices

What is TensorFlow and TensorFlow Lite? TensorFlow is an open-source machine learning library widely adopted among deep learning practitioners and researchers. The library provides functionality for developing, training, and testing machine learning models on various devices through different Application Programming Interfaces (APIs). Among them are the Python API and the JavaScript API. Although these APIs can meet on-device, cloud, and web machine learning requirements, they do not support edge computing devices such as mobile and embedded devices. To address the issue, TensorFlow introduced TensorFlow Lite (TFLite), a lightweight solution for mobile and embedded systems. This blog post will briefly discuss how TFLite integrates into a development workflow and selected vital concepts in TFLite. Then we will apply some of the ideas to a simple model to observe their impact. Finally, we will discuss how we (NIRDSLab) use TFLite and techniques with key takeaw

2021-12-16: From Researcher to ISTI Postdoctoral Fellow

This summer, I was excited to be chosen as a new postdoctoral researcher at Los Alamos National Laboratory's Information Sciences Division (CCS-3). Being accepted for the position helped me focus my attention on quickly finishing and defending my dissertation. I appreciate the support of my committee (Michael Nelson, Michele Weigle, Sampath Jayarathna, Jian Wu, Jose Padilla, and Martin Klein), without whom this would not have been possible. By finishing my dissertation this summer, I could accept the postdoc position this fall and begin working with my new mentor, Dr. Diane Oyen. Over the past few months, I have been working on new computer vision and image information retrieval research. Dr. Oyen strongly suggested submitting an application to the Information Science & Technology Institute (ISTI) Postdoctoral Fellow program. Each year, out of the hundreds of postdocs at LANL, only two are awarded this prestigious position. Last week, I discovered that the commi

2021-12-13: Summary of “What Makes Videos Accessible to Blind and Visually Impaired People?”

Figure 1: Video (A) contains no information about accessibility. Video accessibility metrics (B); BVI people using them (C) to filter or quickly identify accessible videos from search results. (Source: ACM)

Text-based documents, articles, reviews, and reports were the main information sources on the internet. In late 2000, the Internet became mainstream. Among the attractions of this mainstream, YouTube was a popular search platform for watching videos, reaching 81% of internet users. These videos convey so much information, in forms such as tutorials, lectures, unboxing videos, reviews, and more, that people increasingly use videos instead of text to communicate information. In fact, based on a report provided by YouTube, 1 billion hours of video are played each day on YouTube alone. These videos provide information, both auditory and visual, but the visual content in videos is not accessible to blind and visually impaired (BVI) audience members. Traditionally, to make videos acc