Research Projects

Wearable Text Reader for Visually Impaired (Jan 2022 - Present)

In this project we aim to enable mobility for visually impaired individuals, particularly in the Indian context, by creating a wearable device capable of detecting stray animals, recognizing faces, and reading text. I am working on improving Scene Text Recognition under extreme conditions such as low illumination, motion blur, perspective distortion, and jitter, for both Hindi and English.

Digitization of Mother-Child Protection Cards (November 2021 - Present)

In India, children’s vaccination records are typically recorded manually on paper-based Mother and Child Protection (MCP) cards. In this project we are working on deep neural network models that can read the handwritten data on these cards. We aim to combine traditional and recent OCR techniques to improve OCR performance in the wild, under conditions such as curved pages, extreme illumination, and perspective distortion, and to build a robust and reliable OCR system.

Smartforms (June 2021 - October 2021)

Data collection through digital devices may not always be feasible, for reasons such as the unaffordability of smartphones and tablets for field-based cadre, or shortfalls in their training and capacity building. Paper-based data collection has been argued to be more appropriate in several contexts. In this work we designed a homography-based Optical Character Recognition pipeline for the digitization of paper-based forms. We used deep metric learning to address intra-class variance and inter-class similarity. Gram Vaani used the OCR system to digitise phone numbers collected on paper forms and to send health awareness voice messages to women affiliated with self-help groups in Bihar.
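
As a rough illustration of the homography step, the sketch below maps a point from a form template into a photographed page using a 3x3 homography matrix. The matrix `H`, the function name `apply_homography`, and the direction of the mapping (template to photo) are assumptions made for illustration; the actual pipeline may estimate and apply the homography differently.

```python
def apply_homography(H, point):
    """Map a 2D point through a 3x3 homography given as nested lists."""
    x, y = point
    # Multiply H by the homogeneous coordinate (x, y, 1).
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    # Divide by w to return to ordinary image coordinates.
    return (xh / w, yh / w)


# Hypothetical homography: scale by 2, then shift by (10, 5).
H_example = [[2, 0, 10],
             [0, 2, 5],
             [0, 0, 1]]
field_corner_in_photo = apply_homography(H_example, (3, 4))
```

Mapping the four corners of each form field in this way would give the region of the photo to crop before recognition.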

Paper- link
Story by Microsoft- AI for Humanitarian Action projects
Code- link



Course Projects

Person Re-identification using Vision Transformers

Person re-identification (ReID) is an active research topic in surveillance applications that aims to identify the same person across multiple camera views. Multiple deep metric learning approaches have been proposed over the years and have achieved state-of-the-art results on person ReID datasets. Most of these methods use a CNN as the feature extractor and triplet loss as the metric loss function. More recently, with the advent of transformers in computer vision, several works based on Vision Transformers have been proposed that achieve results competitive with CNN-based methods on ReID datasets and even surpass the CNN benchmarks on some datasets. In this work we implement a Transformer-based person re-identification method on the given person ReID dataset and improve the results further using Global Hard Identity Searching.
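
The triplet loss mentioned above can be sketched in a few lines. This is a minimal, framework-free version using Euclidean distance on plain lists; the margin value of 0.3 and the function names are assumptions for illustration, not the settings used in the project.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.3):
    """Hinge loss that pushes the negative at least `margin` farther
    from the anchor than the positive is."""
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)
```

The loss is zero once the image of a different person is already farther from the anchor than the image of the same person by at least the margin, so only violating triplets contribute to training.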

Content Aware Seam Carving

Effective image resizing should consider not only geometric constraints but also the image content. In this project, we implement the content-aware seam carving method proposed by Shamir et al. [1] for image reduction, expansion, and object removal.
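
The core of seam carving for width reduction is a dynamic program that finds the connected top-to-bottom path of lowest total energy. The sketch below assumes a grayscale image stored as nested lists and a simple gradient-magnitude energy; the function names and the energy function are illustrative choices, not necessarily those of the project code.

```python
def energy_map(img):
    """Gradient-magnitude energy |dI/dx| + |dI/dy| with clamped borders."""
    h, w = len(img), len(img[0])
    energy = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
            dy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
            energy[y][x] = abs(dx) + abs(dy)
    return energy


def find_vertical_seam(energy):
    """Return one column index per row for the minimum-energy vertical seam."""
    h, w = len(energy), len(energy[0])
    cost = [row[:] for row in energy]
    # Forward pass: cheapest cumulative cost to reach each pixel from the top.
    for y in range(1, h):
        for x in range(w):
            lo, hi = max(x - 1, 0), min(x + 1, w - 1)
            cost[y][x] += min(cost[y - 1][lo:hi + 1])
    # Backward pass: start at the cheapest bottom pixel and walk upwards.
    x = min(range(w), key=lambda i: cost[h - 1][i])
    seam = [x]
    for y in range(h - 1, 0, -1):
        lo, hi = max(x - 1, 0), min(x + 1, w - 1)
        x = min(range(lo, hi + 1), key=lambda i: cost[y - 1][i])
        seam.append(x)
    seam.reverse()
    return seam


def remove_vertical_seam(img, seam):
    """Delete one pixel per row, shrinking the image width by one."""
    return [row[:x] + row[x + 1:] for row, x in zip(img, seam)]
```

Repeating find-and-remove shrinks the image one column at a time; object removal can be approached by assigning very low energy to the pixels of the object so that seams pass through it.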

Project Code- Github

Computational Social Choice for Moral Decision Making

Autonomous decision-making systems frequently confront a choice between two or more alternatives. The criteria for comparing the alternatives are often unclear or unethical. In this project, we discussed two such problems. The first is the Kidney Exchange Problem, where a central market maker allocates living kidney donors to patients in need of an organ. The allocation chosen from the many possible ones may determine a person’s fate. The second is the Trolley Problem in Autonomous Vehicles, where a brake failure may force a decision about whom to kill and whom to save. This work describes a general principle for ethical decision making in autonomous systems using voting.
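
To make the voting idea concrete, the sketch below aggregates ranked preferences over alternatives with a Borda count. The choice of Borda, the function name, and the example rankings are assumptions for illustration only; the report may rely on a different voting rule.

```python
from collections import defaultdict


def borda_winner(rankings):
    """Aggregate rankings (best first) with Borda scores.

    Each voter gives n-1 points to their top choice, n-2 to the next,
    and so on. Ties are broken alphabetically for determinism.
    """
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, alternative in enumerate(ranking):
            scores[alternative] += n - 1 - position
    winner = min(scores, key=lambda a: (-scores[a], a))
    return winner, dict(scores)


# Hypothetical preferences of three voters over alternatives A, B, C.
example_rankings = [["A", "B", "C"], ["B", "A", "C"], ["B", "C", "A"]]
example_winner, example_scores = borda_winner(example_rankings)
```

In a setting like this, each alternative would be a possible outcome of the dilemma, and the system would act on the alternative selected by the aggregated preferences.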

Project Report- link

Surgical instrument segmentation

Surgical instrument segmentation plays a crucial role in robot-assisted surgery. The information it provides is useful for tasks such as tool pose estimation, real-time tracking of surgical instruments, and surgical phase estimation. In our approach, we apply transfer learning to state-of-the-art pre-trained instance segmentation models on a surgical image dataset. The tasks are instrument instance segmentation and video instance segmentation. The evaluation metrics are COCO Average Precision: AP, AP50, and AP75.
The dataset generation and models can be accessed here: link
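
AP50 and AP75 count a predicted mask as correct when its intersection over union (IoU) with a ground-truth mask reaches 0.5 or 0.75 respectively, while COCO AP averages over IoU thresholds from 0.50 to 0.95 in steps of 0.05. The sketch below shows only the mask IoU and threshold check on small binary masks; the function names are illustrative, and a full evaluation would also rank predictions by confidence and integrate the precision-recall curve.

```python
def mask_iou(pred, gt):
    """Intersection over union of two binary masks of equal shape."""
    inter = union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            inter += 1 if (p and g) else 0
            union += 1 if (p or g) else 0
    return inter / union if union else 0.0


def is_true_positive(pred, gt, iou_threshold):
    """A prediction matches when its IoU reaches the threshold."""
    return mask_iou(pred, gt) >= iou_threshold


# Toy masks: the prediction covers the instrument plus one extra pixel.
pred_mask = [[1, 1, 1, 0, 0]]
gt_mask = [[1, 1, 0, 0, 0]]
```

Here the IoU is 2/3, so the prediction would count as a match for AP50 but not for AP75.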