
The Context Revolution in Analytics: Lessons from Paradime's Approach
When ChatGPT burst onto the scene in late 2022, a horde of data vendors rushed to make analysts obsolete with "text-to-SQL". My friend Kaustav Mitra, however, remained skeptical.
The challenge, as many data folks quickly realized, wasn't just generating code - it was understanding the complex context around data structures and business meanings.
"Natural language to SQL has significant limitations without proper context," Kaustav explains. "But the repetitive aspects of analytics engineering and day-to-day data work? That's where AI can provide tremendous value."
This insight highlights something that's reshaping how forward-thinking teams approach AI in data: context is everything. The journey from concept to execution has revealed practical approaches that dramatically improve how AI can be applied in analytics workflows.
P.S. Kaustav is the founder of Paradime.io - a data development platform built for analytics engineers.
P.P.S. This isn't an ad for Paradime or DinoAI, but we're definitely talking through their solution to draw out the lessons. :-)
Context-First Approach to Analytics
In my chats with Kaustav, it's clear that early experiments with AI in data workflows taught some valuable lessons. Kaustav's team recently launched DinoAI - what they call "Cursor for data" - which makes this the perfect time to talk about the role of context in all this.
Their initial implementations in early 2024 followed the common pattern of letting users ask questions and get answers directly from AI.
"Our first attempts were quite basic," Kaustav admits. "We quickly discovered that without rich contextual information, the results weren't specific or accurate enough to provide real value to analytics teams."
This frank assessment reflects a broader realization: generalized AI approaches often fall short in specialized domains like data analytics. The most valuable implementations are those that deeply integrate with existing data structures, metadata, and organizational knowledge.
The Limitations of Text-to-SQL
Many AI tools in the data space focus on translating natural language to SQL queries, a capability that has fundamental limitations without proper context.
"Generic text-to-SQL has significant challenges," Kaustav observes. "It really only works when the system has deep context about data structures and their meaning."
This highlights an important distinction. While platforms like Databricks or Snowflake might have advantages because they own the data infrastructure, the most practical implementations focus on leveraging available resources like code and metadata.
"What we've found most effective is bringing code and data warehouse metadata into the AI workflow," explains Kaustav. "When you enhance your AI experience with this structural context, results improve dramatically in relevance and accuracy."
Structural Context: The Foundation of Effective AI
A key realization in modern analytics engineering is that code is deeply dependent on data structure, which continually evolves as columns are added, deleted, or modified.
Kaustav illustrates this with a practical example:
"Consider a scenario where you have complex SQL and need to create tests. Without structural context, an AI system can't meaningfully interpret a statement like 'Select * from table X' because it has no visibility into what columns actually exist."
Everything changes when structural context is incorporated:
"With proper context from the data warehouse, AI systems can expand that asterisk into actual columns that need testing, complete with data types, primary key relationships, and metadata descriptions. This structural awareness elevates generic queries into powerful, specific tools."
Real-World Applications of Contextual AI
Teams implementing context-aware AI are seeing major improvements in several key areas:
1. Documentation and Change Management
One of the most time-consuming aspects of data work is maintaining documentation when schemas change. Context-aware AI systems can dramatically reduce this burden.
"The most valuable application we've seen is in schema changes," notes Kaustav. "When new columns appear in tables, AI can automatically document them, generate descriptions, create appropriate tests, and even update code references across repositories."
The impact is huge: tasks that previously required hours of manual updates can be completed in minutes, freeing analysts to focus on verification and higher-value tasks rather than tedious maintenance.
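To ground that in something runnable: in dbt, a "singular" test is simply a SQL file that returns the rows violating an expectation, so a drafted test for a newly added column might look like this (the discount_pct column and its valid range are hypothetical):

```sql
-- tests/assert_discount_pct_in_range.sql (hypothetical)
-- A dbt singular test: it passes when this query returns zero rows.
-- Drafted automatically for a newly added column; an analyst still reviews it.
select
    order_id,
    discount_pct
from {{ ref('orders') }}
where discount_pct is not null
  and (discount_pct < 0 or discount_pct > 100)
```

The point isn't that the AI writes perfect tests; it's that the human moves from authoring boilerplate to reviewing it.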
2. Pattern-Based Model Creation
When starting new data projects, teams are finding that contextual AI can leverage existing patterns and best practices effectively.
"For new model development, modern approaches let teams simply describe what they need functionally," Kaustav explains. "The system analyzes existing patterns in the codebase and builds solutions that maintain consistency with established practices."
This prevents duplicate effort and addresses "model bloat", a common challenge where data pipelines become unnecessarily complex because developers create redundant models instead of building on existing work.
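As a sketch of what "consistency with established practices" can mean in a dbt codebase: if existing staging models follow the common source-then-rename pattern, a generated model should land in the same shape. All names here are hypothetical.

```sql
-- models/staging/stg_payments.sql (hypothetical)
-- Mirrors the conventions already in the codebase: a source() reference,
-- renamed columns, unit conversion, and no business logic at this layer.
with source as (
    select * from {{ source('app', 'raw_payments') }}
),

renamed as (
    select
        id as payment_id,
        order_id,
        amount_cents / 100.0 as amount,
        created_at as paid_at
    from source
)

select * from renamed
```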
Multiple Context Sources: Beyond Database Metadata
A fascinating aspect of modern contextual approaches is how systems can derive meaning without direct access to underlying data. Kaustav observes that metadata itself often provides substantial information:
"Column names and metadata are surprisingly informative about the data's purpose and relationships," he notes. "This structural information creates a foundation for understanding."
But effective implementations go further by incorporating multiple types of contextual information:
"The most powerful approaches allow teams to incorporate diverse context sourcesâPDF documentation, CSV files, web search results, and even specialized knowledge bases," explains Kaustav. "We've recently integrated Perplexity search into Dino to expand contextual understanding."
This flexible approach addresses an important consideration in enterprise environments: data privacy.
"Organizations need solutions that respect data privacy while still enabling rich contextual understanding," Kaustav emphasizes. "The best systems provide context without requiring direct access to sensitive data."
Organizational Standards as Context
Beyond technical metadata, effective AI implementations must incorporate organizational standards and best practices. In DinoAI, this is handled through configuration files that capture company-specific conventions.
"Every organization has its own standards and preferences," Kaustav points out. "Some teams require commas at the beginning of SQL lines rather than the end. Others specify particular date formats or naming conventions."
These organizational conventions serve as critical guardrails, ensuring that AI-generated outputs align with established standards and can be seamlessly integrated into existing codebases.
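The comma rule Kaustav mentions is a good litmus test: same query, two house styles, and a generated diff in the wrong style creates needless review friction. A sketch:

```sql
-- Trailing commas: one team's standard.
select
    order_id,
    customer_id,
    ordered_at
from orders;

-- Leading commas: another team's standard, enforceable via a rules file.
select
    order_id
  , customer_id
  , ordered_at
from orders;
```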
Bridging Human and Machine Understanding
One of the most transformative aspects of DinoAI's approach is how it bridges the gap between human-readable and machine-readable contexts.
"We've always had context in our organizations," Kaustav observes, "but there was a fundamental disconnect between human-readable documentation and machine-readable specifications. You can't expect business users to write YAML or JSON configurations."
Kaustav sees this barrier dissolving in modern AI systems:
"What's exciting about the AI-first approach is that human-readable context and machine-readable context are becoming essentially the same thing. Natural language can now serve both humans and systems simultaneously."
Democratizing Data Work Through Context
One of Kaustav's most meaningful examples involves a QA team that was initially anxious about having to learn dbt (a data transformation tool) to participate in the analytics process.
"This team had all their requirements in spreadsheets and mapping documents," Kaustav explains. "With DinoAI, they could simply upload these familiar documents, have the system generate appropriate code, and then focus on their core expertise: testing and validation."
The team's feedback was powerful: "One QA specialist told me, 'Initially I was defensive about this new process. Now I feel empowered. I feel like I can do so much more with my career.'"
This reminds me of what we do at Reconfigured where we design directly in code and hand over that code as a blueprint. There's no ambiguity about what it should look like or what it should do - the translation layer is gone.
When you eliminate that translation layer between business needs and technical implementation, you remove countless opportunities for miscommunication. The game of telephone (business → analyst → developer → result) becomes far more reliable when context-aware AI acts as the intermediary.
The result? People previously excluded from technical processes can now actively participate without becoming coding experts. They gain a renewed sense of agency in their work, no longer dependent on technical gatekeepers to translate their needs.
Revolutionizing Analytics Handovers
One particularly valuable application Kaustav highlights is how DinoAI streamlines handovers between analysts and data platform teams:
"The improved process is remarkably efficient," Kaustav explains. "A product analyst completes their exploratory analysis and documents test cases in a PDF. The data platform team then inputs this PDF along with dataset information into DinoAI, which generates appropriate code. The team simply reviews it, runs the validation tests, and commits the changes."
This turns what was traditionally a process filled with miscommunication and delays into a smooth workflow that leverages contextual understanding to maintain fidelity between business requirements and technical implementation.
The Iterative Loop of Data Work
"By the time you produce something, the business has moved on," Kaustav points out. "If people take too long to produce code, either I end up writing it myself over the weekend or the process has changed. The company has a new revenue model now. We've moved on."
This painful reality of data work is one of its most persistent challenges. The business doesn't stand still while analytics teams build. Markets shift, products evolve, and priorities change, often in the middle of a data project.
With faster code generation and documentation, teams can keep pace with rapidly evolving business needs. This is especially critical when stakeholders can't provide feedback until they see something tangible:
"In data work, stakeholders can't comment on validity until they see something. You produce something, show them, they give feedback: 'This is wrong. That field doesn't make sense. We don't use it anymore.' Then you pass that feedback to your AI."
This drastically compressed feedback loop is a game-changer. In traditional data workflows, each iteration might take days or weeks, time that many businesses simply don't have. By the third round of feedback, the original question might not even be relevant anymore.
Context-aware AI changes this dynamic fundamentally. When feedback can be integrated in minutes rather than days, the entire relationship between data teams and stakeholders evolves. Data projects become more collaborative, more responsive, and ultimately more valuable to the business.
Think about what this means in practice: a data analyst can sit with a product manager, generate a first-pass solution, get immediate feedback, refine on the spot, and potentially deliver a working solution in a single session. What was once a weeks-long process with multiple handoffs becomes a real-time collaboration with immediate results.
This is the difference between data work that drives business decisions and data work that documents what's already happened. And in competitive environments, that difference can be decisive.
The Meaning Behind the Data
At the end of the day, data is just a representation of something, usually read from a table with columns and rows. Very hollow on its own.
"Meaning of data is super important, especially for business-specific dimensions," Kaustav emphasizes. "Your GA4 data from a mobile app might have custom fields that mean one thing, while the same fields in web application data might mean something completely different."
This is where context truly becomes critical. Data without meaning is just numbers and strings; it's the business context that converts them into actionable insights.
Think about a simple field like "status_code: 0101" in a financial transaction. Without context, it's meaningless. With context (knowing that 0101 means "debit" in a particular company's system), it becomes crucial information. As Kaustav explains, "A financial services company might have a rule where transaction field '0101' means debit."
Or consider product codes in manufacturing: "Manufacturing businesses use SKUs for transactions. Each SKU might have a detailed description or data sheet, but that information sits in PDFs. In the database, you only have the SKU codes."
These examples highlight a fundamental truth in data work: the most valuable information often lives outside the database itself. It's in documentation, spreadsheets, PDFs, or even institutional knowledge that hasn't been formally recorded anywhere.
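Once that outside-the-database meaning is captured as context, it can flow straight into generated code. A minimal sketch using the status-code example above (the "0101 means debit" mapping comes from the quote; the companion codes are hypothetical):

```sql
-- A sketch: decoding an opaque status field using business context that
-- lives outside the database. '0101' = debit is from the example above;
-- the other codes are hypothetical placeholders.
select
    transaction_id,
    amount,
    case status_code
        when '0101' then 'debit'
        when '0102' then 'credit'
        else 'unknown'
    end as transaction_type
from transactions;
```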
Edge Cases: Where the Value Truly Lies
Kaustav emphatically stresses a point that resonates with any experienced data professional: while standard patterns are important, it's the edge cases that often provide the most value:
"Edge cases are the most valuable ones. Most of the standard cases are fine, but the edge cases are where the real value lies."
This cuts to the heart of what makes data work both challenging and valuable. The 80% of cases that follow standard patterns can often be handled through existing processes or even automation. But it's the 20% of unusual cases that drive innovation, reveal hidden problems, and create opportunities for competitive advantage.
Think about a retail analytics team discovering an unusual purchasing pattern that leads to a new targeted marketing campaign. Or a finance team spotting an anomalous transaction type that exposes a risk in their reporting. These edge cases aren't just interesting anomalies; they're often where the business impact is highest.
The challenge has always been that these edge cases are difficult to document, hard to remember, and challenging to incorporate into standardized processes. They're the kinds of insights that typically live in an analyst's head or personal notes, making knowledge transfer nearly impossible.
With context-aware AI, these valuable edge cases can be properly captured, connected to relevant data structures, and made accessible to the entire team. By the sounds of it, Kaustav and friends have figured out how. :-)
Conclusion: Why Context Matters in Modern Analytics
The most important takeaway from all this? Context is revolutionizing how analytics teams work.
As Kaustav eloquently puts it: "We've always had both machine-readable context in our databases and human-readable context in our documentation, but there was a persistent disconnect between them. In today's AI-first approach, that distinction is disappearing: human-readable context and machine-readable context can finally be unified."
This convergence is making data work more accessible, efficient, and impactful across the industry. It's empowering diverse team members to focus on value creation rather than repetitive tasks. Teams applying these principles are iterating faster, improving accuracy, and delivering solutions that truly meet business needs.
DinoAI's customers are seeing real benefits from this approach, but the principles apply regardless of which tools you use. Any solution that prioritizes context in analytics will help bridge the gap between technical implementation and business understanding.
But since we're here talking with/about Paradime - do check them out. Maybe they're exactly what you need to spice up your data life. :-)