Submissions | VizChitra 2026

Sarkaar Data Nahi Deta: Constructing Visualizable Datasets from Indian Government Data

Karnav

Associate, Boston Consulting Group (BCG)

Under Review · Dialogues · Visualizations for Community

Description

This session is about the variety of data and information with varying degrees of usability that the Indian government publishes across different platforms, and how the community can turn it into useful datasets, narratives and insights.

For two years as an undergrad at Ashoka, I worked with their Centre for Digitalization, AI and Society on a broad project to extract meaningful data from various Indian government platforms for social science research. This gave me deep context on a series of problems and open questions that are fertile ground for the community to discuss and solve together:

  1. Indian open data is fragmented, its path to publication is opaque and arbitrary, and it is often available on the internet without being accessible in any practical sense.

  2. The key constraint to making this data accessible is often UI/UX and visualization: by definition these are massive datasets, often underprocessed, and difficult to visualize. How can this data be visualized and presented in a way that the common Indian can consume?

  3. Over the past few years, platforms like Data for India and Ashoka's Center for Economic Data and Analysis have made massive progress in visualizing key Indian datapoints and crafting narratives. Often, their analysis has been limited by the amount of work it takes to make obscure datasets usable and visualizable. How can their approach be expanded to more complex datasets that require a lot of work and compute? Is an open-source approach possible?

My particular focus area is on data with a geospatial aspect: land use, environmental clearances, real estate and urban governance and expenditure, as that's the data that I have worked with extensively.

In particular, I'd like to anchor the discussion on two rich sources of open data published by the government, which we were able to compile and generate useful insights from: the BhuNaksha platform (https://app4bhunakshaodisha.nic.in:8443/bhunaksha/), which holds plot-by-plot land ownership and characteristic data, and the Parivesh platform (https://environmentclearance.nic.in/proposal_status_new1.aspx), which holds records of all environmental clearances applied for and granted to industrial projects in India. I have been able to scrape, clean, match, process and visualize these datasets with some success, and can discuss the learnings from these projects and the next steps for more data like this.
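As a rough illustration of the "clean and match" step, here is a minimal Python sketch that normalizes entity names and fuzzy-matches them across two record sets. It is not the pipeline used in the projects above: the function names, the 0.85 similarity cutoff, and every name in the sample data are made up for illustration, and the matching relies only on the standard-library `difflib`.

```python
import difflib
import re


def normalize(name: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    name = re.sub(r"[^a-z0-9 ]+", " ", name.lower())
    return " ".join(name.split())


def match_records(left, right, cutoff=0.85):
    """Pair each name in `left` with its closest name in `right`, or None.

    `cutoff` is an assumed similarity threshold; a real project would
    tune it against a hand-checked sample.
    """
    norm_right = {normalize(r): r for r in right}
    matches = {}
    for name in left:
        close = difflib.get_close_matches(
            normalize(name), list(norm_right), n=1, cutoff=cutoff
        )
        matches[name] = norm_right[close[0]] if close else None
    return matches


# Hypothetical names, not taken from either platform.
clearance_names = ["M/s. Example Steel Pvt. Ltd.", "Sample Cement Works"]
land_record_names = ["EXAMPLE STEEL PVT LTD", "Unrelated Farms"]

result = match_records(clearance_names, land_record_names)
for source_name, matched_name in result.items():
    print(source_name, "->", matched_name)
```

Names that fall below the cutoff are mapped to `None` rather than forced onto a weak match, so they can be set aside for manual review.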

These topics are relevant to data viz because the government is willingly providing vast collections of data to the common Indian, and the primary barrier to their consumption is our inability to convert large tables into easy-to-consume graphs, exploratory websites and factoids. Data viz has the power to solve this problem, to the benefit of both the community and the nation.

I think the ideal way to structure the session is an introduction and short talk, followed by a guided discussion and an interactive activity, yielding to free-flowing discussion (either open floor or in groups):

  1. 5-10 mins: Introduction, context-setting and scope of discussion

  2. 5-10 mins: Elaboration on the problem and case studies (BhuNaksha, Parivesh, Data for India and 1-2 examples outside India)

  3. 10-15 mins: Guided discussion on people's experiences working in the domain, problems and solutions, and opinions on the path forward

  4. 10-20 mins: Interactive activity (stickies and cards handed out to participants to mark their opinions on the most interesting Stakeholders, Problems and Domains, Approaches, and Tools/Platforms), followed by discussion

The intended audience is primarily data journalists, as well as professionals and hobbyists in policy, open-source data, the social sciences and the impact ecosystem who are interested in Indian government-scale data and evaluation.

Related Links

Materials Required

A browser and GitHub access would probably be needed; Python, QGIS and GEE are useful if participants are interested in replicating or exploring the case studies.

Room Setup

A projector would be needed; a whiteboard would help with interactivity.

Copyright © 2026 VizChitra. All rights reserved.