Building Machine-Learnable Representations of Dataviz
Jaidev
Sr Software Engineer II • Aftershoot
Description
The mathematician Gregory Chaitin wrote:

> ... comprehension is compression. You compress things into computer programs. The simpler the theory, the better you understand something.
While Chaitin was talking about the communication of scientific facts and axioms, the parallels to visual analytics are not hard to see. For example, anyone who practices data analytics eventually concludes that a dashboard, to be effective, must compress data. A chart must be simpler than the data it explains, and the insight drawn from the chart must in turn be simpler than the chart itself. This leads to a chain of lossy compressions: from data to charts to insights (or captions of the charts).
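To make the chain concrete, here is a toy sketch in Python. The data, the spec and the caption are all invented for illustration; the point is only that each representation is smaller than, and discards detail from, the one before it.

```python
# Toy illustration of the compression chain: the same information,
# represented at three levels, shrinks at each step.
import json
import random

# 1. Raw data: ~10,000 (month, sales) observations.
random.seed(42)
data = [{"month": m, "sales": random.gauss(100 + 5 * m, 10)}
        for m in range(1, 13) for _ in range(800)]

# 2. A chart: a declarative Vega-Lite spec that aggregates the data.
spec = {
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
    "mark": "line",
    "encoding": {
        "x": {"field": "month", "type": "ordinal"},
        "y": {"field": "sales", "aggregate": "mean", "type": "quantitative"},
    },
}

# 3. An insight: a one-line caption for the chart.
caption = "Average monthly sales rise steadily through the year."

for name, obj in [("data", data), ("chart spec", spec), ("caption", caption)]:
    print(f"{name:>10}: {len(json.dumps(obj)):>7} bytes")
# Each step produces a far smaller object, and each step throws
# information away -- a chain of lossy compressions.
```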
But can we automate this compression chain? Given a dataset, could we train a machine to render the right visual and derive from it precisely the right insight?
This talk stems from a series of experiments aimed at answering these very questions. At one level, we know the answer is a resounding "yes", because of LLMs. No less than S Anand himself has enthusiastically (and effectively) advocated automating visual analytics with LLMs. GenAI can clearly build the aforementioned lossy compression pipeline from data to insights. But as Anand himself admits (in a proposal for a dialogue in this very edition of VizChitra), the real problem is not one of creation but of curation. So the question is no longer whether we can automate the compression of data into charts and of charts into insights, but whether we can do so while avoiding slop.
My core hypothesis is that the three interlinked elements of a chart,
- the visual, or the raw pixels of the chart,
- the declarative specification of the chart (in a language like, say, Vega), and
- the set of insight(s) drawn from it (or the captions one could write for it),
are all one and the same thing. I call them the chart triplet. It is reasonable to expect that a skilled analyst, given only one of the three, could come up with decent approximations of the other two.
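If this holds, a chart is best thought of as a single record with three views. Here is a minimal sketch of such a record as a Python data structure; the field names are my own invention, not a published schema.

```python
from dataclasses import dataclass

@dataclass
class ChartTriplet:
    pixels: bytes        # the rendered chart, e.g. PNG bytes
    spec: dict           # the declarative specification, e.g. Vega-Lite JSON
    captions: list[str]  # the insight(s) a reader could draw from the chart

# The hypothesis, restated: given any one field, a skilled analyst (or a
# trained model) should be able to approximate the other two.
example = ChartTriplet(
    pixels=b"\x89PNG...",  # placeholder bytes standing in for a real render
    spec={
        "mark": "bar",
        "encoding": {
            "x": {"field": "month", "type": "ordinal"},
            "y": {"field": "sales", "type": "quantitative"},
        },
    },
    captions=["Sales peak in December."],
)
```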
In the talk, I will
- Test this hypothesis: see whether it is possible to train a machine to be a skilled analyst who can navigate from any one of these three aspects of a chart to the other two.
- Show that building a dense, unified representation (or embedding) of the chart triplet is possible and meaningful (a sketch of one plausible architecture follows this list).
- Find out whether the representation can generate charts and insights that are useful enough to be consumed (in other words, whether it solves the aforementioned curation problem).
- Show how vision + language transformer models can be trained to judge whether captions and insights are big, useful and surprising.
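One plausible way to build the unified embedding mentioned above is a CLIP-style setup: one encoder per element of the triplet, all projecting into a shared space, trained with a contrastive loss that pulls the three views of the same chart together. The sketch below is an assumption about the architecture, not the actual training code; the toy MLP encoders and random features merely stand in for real vision and text backbones.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

DIM = 256  # shared embedding dimension (arbitrary choice)

def encoder(in_dim: int) -> nn.Module:
    # Stand-in for a real vision / text / spec encoder.
    return nn.Sequential(nn.Linear(in_dim, 512), nn.ReLU(), nn.Linear(512, DIM))

enc_pixels = encoder(2048)   # e.g. pooled CNN features of the rendered chart
enc_spec = encoder(768)      # e.g. a text encoder run over the Vega JSON
enc_caption = encoder(768)   # e.g. the same text encoder over the caption

def contrastive_loss(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    # InfoNCE: matching pairs in the batch are positives, the rest negatives.
    a, b = F.normalize(a, dim=-1), F.normalize(b, dim=-1)
    logits = a @ b.t() / 0.07  # temperature 0.07, as in CLIP
    targets = torch.arange(len(a))
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.t(), targets)) / 2

# One toy training step on random features standing in for a batch of 32.
z_px = enc_pixels(torch.randn(32, 2048))
z_sp = enc_spec(torch.randn(32, 768))
z_cp = enc_caption(torch.randn(32, 768))

# Tie all three views of the triplet together pairwise.
loss = (contrastive_loss(z_px, z_sp) +
        contrastive_loss(z_px, z_cp) +
        contrastive_loss(z_sp, z_cp)) / 3
loss.backward()
```

Averaging the three pairwise losses is the simplest way to extend the two-modality CLIP objective to a triplet; other schemes (anchoring everything to the spec, say) are equally defensible.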
Before the talk, I will be releasing an open dataset containing over 1,50,000 individual charts, each with its Vega spec in JSON, accompanied by a million captions. All code, trained models and documentation will also be released. The audience is welcome to use these assets to benchmark their own storytelling processes.
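To give a feel for how the dataset might be consumed, here is a hypothetical record; the field names and layout are guesses on my part, since the dataset is not yet released.

```python
import json

# Hypothetical record: one chart's Vega spec plus its captions.
record = json.loads("""
{
  "spec": {"mark": "line",
           "encoding": {"x": {"field": "year", "type": "temporal"},
                        "y": {"field": "gdp", "type": "quantitative"}}},
  "captions": ["GDP grows roughly linearly after 2000."]
}
""")

# Benchmark idea: render record["spec"] yourself, caption it with your
# own pipeline, and compare your captions against record["captions"].
print(record["captions"][0])
```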