Show HN: Data Formulator – AI-powered data visualization from Microsoft Research

126 points • 16 hours ago • github.com

Creating data visualizations with AI nowadays often means chat, chat, and more chat. Writing long prompts can be annoying, and prompts are also not the most effective way to describe your visualization designs.

Data Formulator blends UI interaction with natural language so that you can create visualizations with AI much more effectively!

You can:

* create rich visualizations beyond initial datasets, where AI helps transform and visualize data along the way

* iterate on your designs and dive deeper using data threads, a new way to manage your conversation with AI

Here is a demo video: https://github.com/microsoft/data-formulator/releases/tag/0....

Give it a shot and let us know how it looks!

zurfer • 2 hours ago

Anthropic recently released something that looks more polished but follows the chat paradigm. [1]

As a builder of something like that [2], I believe the future is a mix, where you have chat (because it's easy to go deep and refine) AND generated UIs that are still configurable manually. It's interesting to see that you also use plotly for rendering charts. So far, I have found it non-trivial to make these highly configurable via a UI.

Thank you for open sourcing so we can all learn from it.

[1] https://news.ycombinator.com/item?id=41885231

[2] https://getdot.ai

zurfer • 2 hours ago

Here is the link to one of the prompts. It seems like all the LLM tasks are in the agents directory: https://github.com/microsoft/data-formulator/blob/main/py-sr...

Some of these "agents" are used for surprising things like sorting: https://github.com/microsoft/data-formulator/blob/main/py-sr... [this seems a bit lazy, but I guess it works :D]

DeathArrow • 59 minutes ago

If you look at the video from the OP, you can see that chat is still used at some point.

goose- • 5 hours ago

Since Data Formulator performs data transformation on your behalf to get the desired visualization, how can we verify those transformations are not contaminated by LLM hallucinations, and ultimately, the validity of the visualization?

larodi • 4 hours ago

We can’t. Without a driver, this car runs on probability, and that's all. A capable operator is still needed in the loop.

DeathArrow • 59 minutes ago

You can see the generated code.
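Reading the generated code is one check. Another is to recompute a result independently and compare. Below is a minimal sketch of that idea, assuming the transformation is a simple group-and-sum. The function names (`sum_by_group`, `check_aggregation`) and the sample rows are made up for illustration and are not part of Data Formulator.

```python
from collections import defaultdict
from math import isclose


def sum_by_group(rows, key, value):
    """Independently recompute per-group totals from the raw rows."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row[value]
    return dict(totals)


def check_aggregation(raw_rows, transformed_rows, key, value):
    """Return a sorted list of problems found in an aggregated table.

    `transformed_rows` stands in for the output of AI-generated code
    (hypothetical); it should hold one row per group with the summed value.
    """
    expected = sum_by_group(raw_rows, key, value)
    actual = {row[key]: row[value] for row in transformed_rows}
    problems = []
    for group in expected.keys() - actual.keys():
        problems.append(f"missing group: {group}")
    for group in actual.keys() - expected.keys():
        problems.append(f"unexpected group: {group}")
    for group in expected.keys() & actual.keys():
        if not isclose(expected[group], actual[group]):
            problems.append(
                f"wrong total for {group}: "
                f"expected {expected[group]}, got {actual[group]}"
            )
    return sorted(problems)


# Made-up sample data for illustration.
raw = [
    {"region": "east", "sales": 10.0},
    {"region": "east", "sales": 5.0},
    {"region": "west", "sales": 7.0},
]
good = [{"region": "east", "sales": 15.0}, {"region": "west", "sales": 7.0}]
bad = [{"region": "east", "sales": 10.0}]  # drops a row and a whole group

print(check_aggregation(raw, good, "region", "sales"))
print(check_aggregation(raw, bad, "region", "sales"))
```

A check like this only catches mistakes in the arithmetic. It cannot tell you whether a sum was the right transformation to ask for, so a human still has to judge the methodology.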

marktl • 7 hours ago

Definitely looks like something that could save me, and others, a lot of time. Thanks for sharing!

matt3D • 4 hours ago

After giving it a whirl I'm a little underwhelmed, but maybe I'm using it wrong. I'm getting less consistent results than when I prompt GPT-4o for a Vega graph after providing it with the documentation.

data_ders • 9 hours ago

Way cool! I hope to take it for a spin tomorrow!

Q: Does your team see potential value in a DSL for succinctly describing visualizations to an LLM, as Hex did with their DSL for Vega-Lite specs [1]?

[1]: https://hex.tech/blog/making-ai-charts-go-brrrr/
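To make the question concrete, here is a rough sketch of what a compact chart DSL could look like: a one-line string that expands into a Vega-Lite spec. The syntax (`bar x=region:N y=sum(sales):Q`) and the helper names are invented for this example. It is not Hex's DSL and not anything Data Formulator uses. Only the output keys (`mark`, `encoding`, `field`, `type`, `aggregate`) are standard Vega-Lite.

```python
import re

# Hypothetical one-letter codes mapped to Vega-Lite's type names.
TYPE_NAMES = {"N": "nominal", "O": "ordinal", "Q": "quantitative", "T": "temporal"}

# Matches "field:T" or "agg(field):T", where T is one of the codes above.
CHANNEL_RE = re.compile(r"^(?:(\w+)\((\w+)\)|(\w+)):([NOQT])$")


def parse_channel(text):
    """Turn 'sum(sales):Q' into a Vega-Lite channel definition."""
    match = CHANNEL_RE.match(text)
    if match is None:
        raise ValueError(f"cannot parse channel: {text!r}")
    aggregate, agg_field, plain_field, type_code = match.groups()
    channel = {"field": agg_field or plain_field, "type": TYPE_NAMES[type_code]}
    if aggregate:
        channel["aggregate"] = aggregate
    return channel


def expand(dsl):
    """Expand a made-up one-line DSL string into a Vega-Lite spec dict."""
    mark, *assignments = dsl.split()
    encoding = {}
    for assignment in assignments:
        name, _, value = assignment.partition("=")
        encoding[name] = parse_channel(value)
    return {
        "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
        "mark": mark,
        "encoding": encoding,
    }


spec = expand("bar x=region:N y=sum(sales):Q")
print(spec)
```

The appeal of a format like this is that the LLM only has to emit a short string, and the deterministic expansion fills in the verbose JSON.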

chenglong-hn • 8 hours ago

Wow, that's pretty cool! I think there is potential: current LLMs are not that good at Vega-Lite when I ask them to edit the script :)

donq1xote1 • 5 hours ago

Thanks for sharing and providing an open source version! This is great!

hggigg • 3 hours ago

I rather like this idea. Apologies in advance for my cynicism, however. I suspect it'll die due to human concerns. I've recently seen many reports in dashboards, built both by vendors and internally, that are just plain and utterly wrong. The veracity of the results depends mostly on the human driving the tool and validating the methodology, and the competent ones are apparently rather rare. This serves to put it in the hands of humans who are even worse at the job than the current ones.