In the realm of Artificial Intelligence, particularly Deep Learning, data is the lifeblood that powers predictive solutions. But what happens when data is scarce or virtually non-existent? Sumedh Datar, a renowned Computer Vision expert, has been at the forefront of this problem, and his solutions for challenging environments in the retail and healthcare industries have already impacted people around the world. Recognized globally for his contributions through awards and fellowships, Datar has also been a staunch advocate of sharing data science knowledge. Last week he was selected as a steering committee member for OpenDS4All, an initiative by the Linux Foundation to spread data science knowledge to people around the world. We sat down with him to delve deeper into the intricacies of working in data-scarce environments.
Q: Sumedh, could you explain to our readers what exactly 'Data Scarce Environments' are?
Datar: Certainly. Data Scarce Environments are situations or domains where there is a significant lack of data. This can happen for various reasons, such as the novelty of a problem, logistical challenges in data collection, or simply because no prior solution existed to gather the data.
Q: How were you first introduced to these environments?
Datar: My journey began when I was working on a project in India aimed at identifying specific medical conditions using computer vision. The challenge was that very limited data was available for training models. This pushed me to explore innovative ways to leverage the data we had and augment it effectively.
Q: Why is there a need for Deep Learning in healthcare?
Datar: In my experience within healthcare, and more specifically in cancer research, I recognized a crucial need. Currently, medical professionals and experts take days or even weeks to confirm abnormalities because they must meticulously scrutinize several images multiple times, a laborious process that is prone to errors.
Deep Learning algorithms excel at processing images swiftly. With ample computational power, these algorithms can cut this task to mere hours, enabling doctors to identify patterns more rapidly and thus diagnose faster.
Q: Can you give more insight into how you developed a solution for cancer detection?
Datar: In the biomedical imaging domain, scarce data is a common challenge, especially when dealing with rare diseases. The technique I employed, a variant of 'transfer learning,' played a pivotal role in crafting solutions. The key was to start developing a solution and begin using it as a platform for data collection, rather than waiting for the perfect predictive model. This approach enabled me to increase the quantity of training images, yielding exceptional results and cutting diagnosis time from several days or weeks to mere hours.
Q: You currently work in the retail space. How did that experience translate?
Datar: While the domains differ, the underlying principle remains the same: images, in essence, are numerical data. Whether in healthcare or retail, we grapple with images, but the limitations on data availability can be quite distinct. In the retail space, I devised frictionless solutions to address staffing challenges and store optimization. To gather the necessary data, I had to employ innovative strategies, such as physical augmentation techniques and simulation methods. These approaches proved highly effective in building robust image recognition systems, now used by thousands.
Q: Tell us more about promoting and sharing data science knowledge.
Datar: With nearly a decade of hands-on experience in crafting these solutions, I recognized the imperative of sharing knowledge. I had the opportunity to distill the theory behind applied deep learning for computer vision into comprehensive coursework. This coursework has been adopted by several universities across the globe.
I've also been an invited speaker at various international AI conferences and on podcasts, where I've shared insights into the successful development of Computer Vision solutions.
Q: What is your opinion of GenAI?
Datar: Generative models are truly transformative, as they have the power to create content ranging from images to textual narratives. In the past, Computer Vision and Natural Language Processing (NLP) operated largely as separate fields, each with its own limitations. With GenAI, the synergy of Computer Vision and NLP unlocks cutting-edge solutions that far surpass their individual capabilities, ushering in a new era of AI innovation.