In an exclusive interview, Prof. Dinesh Babu J, who was recently awarded the National Teachers Award 2023 for higher education by the President of India, discussed how artificial intelligence is revolutionising the way we interact with machines. This interview delves deep into the captivating world of Multimodal Perception, Social Computing and the fusion of AI with human interactions. Edited excerpts from the interview.
What inspired you to pursue a career in academia and research, particularly in the areas of Machine Learning and Social Computing?
Well, you know, my journey into academia and research started way back when I was in high school. I used to attend these Math Olympiad classes and there was this scientist who used to teach us. He became my role model and all I knew about him was that he had a PhD. That motivated me to pursue a PhD myself. Later, during my master's at IASC, I also did some volunteer teaching at a government school, which was incredibly satisfying. Teaching, for me, has always been about satisfaction. So, with a background in signal processing and machine learning, I found a fascinating PhD opportunity that combined these with social psychology. I had a great mentor in Switzerland, and that's where I learned the ropes of research and collaboration. From then on, it felt logical to pursue a career in teaching and research in the areas I've been passionate about.
Your research has received several awards and recognitions, including the Outstanding Paper Award at ICMI 2012. What do you believe sets your work apart in these competitive fields?
I think one of the things that sets our work apart is our problem-first approach. Instead of starting with mathematics or modelling, we begin with a clear research agenda. For example, our lab focuses on automatic interviewing, which is about creating virtual avatars that can interview people. We also delve into related issues like assessing communication skills and providing feedback. We're all about solving real-world problems. We follow current literature, build solutions and strive to take them beyond the lab into practical use. It's a creative process that combines the problem set with the solution.
How has your experience at IIIT Bangalore shaped your approach to teaching and research in the field of Multimodal Perception?
IIIT Bangalore places equal emphasis on teaching and research, and that's something I've carried with me throughout my career. Teaching is a top priority because it allows us to directly impact a large number of students. We teach both basic and advanced courses, covering cutting-edge methods that help students in their research endeavours. Moreover, IIIT Bangalore encourages research that has a social impact. Technology should be meaningful and beneficial to society. This approach has influenced our lab's work, where we focus on applied research that can solve real-world problems. IIIT Bangalore's modern infrastructure and supportive environment have greatly contributed to our research journey.
Can you provide an overview of your research in Audio-Visual Signal Processing and its real-world applications?
Sure! In the realm of audio-visual signal processing, one of our key areas of research is communication skill assessment and feedback. We're interested in assessing people's soft skills and providing actionable feedback, which can be done anytime, anywhere through platforms. This has applications in various fields, such as interview training. We also explore making avatars behave like humans, particularly in standardised interviews. These avatars can interact with users, ask questions and create realistic interactions. Our work has real-world applications in assessment and feedback.
Can you discuss the role of audio-visual signal processing in emerging technologies like AI, AR and VR, and how it can influence the future of human-computer interaction?
Certainly! Consider a scenario where a virtual agent interviews a candidate. This involves both virtual reality (VR) and artificial intelligence (AI). VR allows for a more immersive interaction with avatars, while AI helps understand speech, emotions and nonverbal cues. This natural form of human-computer interaction is the future. Think beyond interviews; it can extend to education, language learning and more. AI and VR play pivotal roles in shaping this conversational future.
How do you see the landscape of Multimodal Perception and Audio-Visual Signal Processing evolving in the next decade, and what are the key challenges researchers in this field will face?
The field of Multimodal Perception and Audio-Visual Signal Processing is dynamic. A key challenge is obtaining high-quality data while respecting user privacy, given the human-centred nature of the research. Trends indicate a convergence of computer vision, graphics and language technologies. Machines are becoming more imaginative, as seen in systems like ChatGPT. The future may involve creating entire movies from scripts, bridging the gap between scripts and actual movies. Researchers will navigate this exciting yet data-sensitive landscape.
How do you balance your roles as a researcher, educator and mentor, and what strategies do you employ to ensure the success of your students and lab members?
Balancing these roles isn't easy, but teaching takes top priority. It allows us to impact a large number of students directly. We also teach advanced courses, covering state-of-the-art methods to prepare students for research. We empower students gradually, from hands-on guidance to independent operation. Rigorous reviewing processes ensure quality. It's like a "build, operate, transfer" approach, training students from start to finish. Spending time in industry and gaining real-world experience is crucial. It's about nurturing talent and letting them flourish.
Could you share advice for aspiring researchers and students looking to make an impact in the domains of Machine Learning and Social Computing?
Machine Learning and Social Computing are incredibly exciting fields. If you're strong in mathematics, focus on making AI responsible and safe. For applied researchers, choose application areas like healthcare or education. Gain industry experience to build practical systems. Consider spending time abroad to gain exposure and knowledge. Ultimately, it's about contributing meaningfully to these transformative fields and bringing that knowledge back to benefit India and beyond.