Building a better data agent benchmark
It's the multitrillion-dollar question that's haunting everyone, from senior software developers and junior lawyers to Hollywood writers and, of course, data analysts and engineers: Can the robots do our jobs?
For us data people, in all our various forms, the signs are mixed. According to recent research from Anthropic, data scientists are already using AI to augment or automate 46 percent of their work. In two years, LLMs got good at writing code; they are now also very good at writing SQL.[^1]
But, as every data person knows, writing queries is not our job; not really. For better or for worse (and perhaps better, if the great code-writing disruption is already here), our job is, as Caitlin Moorman put it over six years ago, glue work. We map this team's data to that team's goal; we connect questions over here to trends over there. We deal with ambiguity; we say "it depends." We tell people no.
