Stockholm University

Venktesh Viswanathan

Associate senior lecturer

About me

I am an Associate Senior Lecturer (Biträdande lektor) in the Data Science Research group at DSV, Stockholm University.

 

My research aims to develop robust and efficient pipelines for complex, knowledge-intensive tasks that support a wide range of real-world applications such as healthcare, education, research and tackling disinformation. This entails building robust and efficient Retrieval Augmented Generation (RAG) pipelines powered by Large Language Models (LLMs) through theoretically grounded advances in Machine Learning (ML) that bridge the retrieval and reasoning gaps. Towards realizing this vision, I have developed theoretically grounded, sample-efficient algorithms for compute-optimal test-time scaling at the retrieval and reasoning stages. My work also involves developing robust LLM-feedback approaches based on uncertainty quantification to improve retrieval and reasoning at inference time. The resulting work has been published at WWW, WSDM, ECIR, CIKM, EMNLP, ICML, NAACL and SIGIR. I regularly serve as a reviewer for NAACL, ACL, EMNLP, SIGKDD, SIGIR, AAAI, CIKM, WSDM, WWW and ECIR, and for journals such as ACM TOIS. I am currently serving as Program Chair for the Reproducibility Track at ECIR 2026.

 

Apart from publications, my work has been deployed at scale in impactful real-world applications such as live fact-checking of US presidential and EU debates by Factiverse AI, Norway. My work on efficient retrieval for live fact-checking, with collaborators from UIS and Factiverse, won the "Best Paper Award" at ECIR 2025.

Teaching

I currently teach the following courses:

  • Research Topics in Data Science (RTDS)
  • Foundations of Data Science (FODS)

Research

My research broadly focuses on helping people find, verify and organize information for their Complex Information Needs. For instance, doctors in multidisciplinary teams could ask such a system a question like "What is the protocol for handling the virus that causes acute respiratory failure?" and save considerable time otherwise spent manually searching multiple systems during differential diagnosis to assemble the protocol checklist. To answer such a question, however, the system needs to perform multi-turn information retrieval and reasoning: identify the viruses that could cause this symptom, find the protocols developed over the years to handle them, and rank them internally to synthesize the best checklist. Hence my research lies at the intersection of NLP and IR, and is also concerned with developing efficient machine learning mechanisms for optimizing these pipelines. Departing from the current view that more data and larger parametric models are necessary for this, I propose sample-efficient and compute-efficient pipelines by advancing research in:

  • Efficient neural retrieval for queries over large corpora (millions of documents).
  • Efficient reasoning (efficient search over possible hypotheses for solutions).
  • Theoretically grounded bandit frameworks for sample-efficient exploration.
  • Promoting model reuse by evolving new skills through model merging.

 
