Showing posts from December, 2022

2022-12-31: Paper Summary: "Beyond Classifiers: Remote Sensing Change Detection with Metric Learning" Zhang et al.

Semantic mapping of changes between images using Triplet Loss Metric Learning (Fig. 8 from Zhang et al.).

I talked about two kinds of trust in my previous two posts, "Evaluating Trust in User-Data Networks: What Can We Learn from Waze?" and "Trust Management in Multi-Agent Systems via Deep Reinforcement Learning". In the former, we looked at trust as a measure of the accuracy of the data provided by a user; in the latter, we looked at evaluating the behavior of the user to measure our trust in that particular user. The difference is subtle but apparent: a user with trustworthy historical behavior consistently provides accurate data. Conversely, one who provides data with inconsistent accuracy can be considered a qualitatively inferior data provider with measurably lower trust. This implies an exploitable attack vector, however. If we award implicit trust based on historical behavior, what happens when a historically trustworthy user suddenly provides the system with either false o

2022-12-29: A Summary of "CascadeTabNet: An approach for end to end table detection and structure recognition from image-based documents"

Figure 1: Traditional and Deep Learning Approaches to Table Recognition (Hashmi et al.)

Table recognition refers to the process of using optical character recognition (OCR) and machine learning (ML) models to identify the rows, columns, and individual text cells of tables in digital documents, whether born-digital or scanned PDFs. The task has been under investigation for more than two decades as a way to automatically extract textual information from a variety of tables [Kieninger et al., Wei et al.]. Automatic table recognition can be very challenging because tables have different structures, data types, and misaligned data entries (Figure 2). For instance, some tables have text spanning multiple rows or columns. Also, some tables have clearly defined borders, while others are borderless or only partially bordered. These complexities make it difficult for template-ba

2022-12-26: Getting started with Mixed Reality experience

Figure 1: HoloLens 2 headset device.

One of the most popular technological trends today is Mixed Reality (MR). MR uses advanced sensing and imaging technology to let you interact with and manipulate both real-world and virtual objects and surroundings. Without ever taking off your headset, you can see and stay immersed in the world around you while interacting with a virtual environment using your hands. MR lets you keep one foot (or hand) in the actual world and the other in an imaginary setting, bridging the gap between the real and the virtual and providing an experience that might change how you work and play. Microsoft's version of MR is called HoloLens. HoloLens devices can be used to show information, merge with the real world, or even recreate a virtual world thanks to various sensors, cutting-edge optics, and holographic computing that adapts to its surroundings. The information can be presented in multiple ways, such as visually, aurally, hapt

2022-12-23: ECCV 2022 and DIRA 2022 Trip Report

I had a paper accepted to the Drawings and abstract Imagery: Representation and Analysis (DIRA) workshop, allowing me to attend the 17th European Conference on Computer Vision 2022 (ECCV 2022) in Tel Aviv, Israel, from October 23 to 27. ECCV 2022 is a large conference with attendees from more than 76 countries. More than 3,200 people attended ECCV 2022 in person, and 1,800 more attended virtually. ECCV 2022 was my first computer vision conference and perhaps the largest academic conference I have attended to date. ECCV is a premier conference for computer vision. The conference covers work from many corners of the field, from detecting and processing text in images to generating full images based on text prompts. ECCV 2022's organizers came from a wide variety of universities and industry, including Harvard, Meta, Kyoto University, IBM Research, and USC.