
Introduction
In the era of big data, efficient data access and analysis are critical for business success. Uber, a global leader in logistics and transportation, processes approximately 1.2 million interactive queries monthly, with the Operations team alone contributing 36% of these queries. Writing a SQL query generally requires technical expertise and familiarity with the database schema, and takes about 10 minutes per query. To address this bottleneck, Uber developed QueryGPT, an AI-powered tool that transforms natural language into SQL queries, reducing query authoring time to about 3 minutes and saving an estimated 140,000 hours per month (1.2 million queries × 7 minutes saved = 140,000 hours).
This article delves into the internal workings of QueryGPT, exploring its architecture, key components, and the challenges Uber overcame to create this transformative tool.
The Origins of QueryGPT
QueryGPT was born during Uber’s Generative AI Hackdays in May 2023. Initially, it was a simple Retrieval-Augmented Generation (RAG) system that vectorized user prompts and performed similarity searches on SQL samples and schemas. However, this early version struggled with scalability and accuracy, particularly when handling large schemas and complex queries.
Over 20 iterations, Uber refined QueryGPT into a sophisticated multi-agent system capable of handling diverse business domains and complex user intents. Today, QueryGPT is a production-ready tool that democratizes data access across Uber’s organization.
How QueryGPT Works Internally
Users write their requests in natural language, and QueryGPT converts them into SQL queries.
[Figure: an example natural language prompt and the SQL query QueryGPT generates for it.]
Note that the table names and column names in the example are illustrative and are not real database object names.
1. Workspaces: Domain-Specific Data Curation
QueryGPT organizes data into Workspaces, which are curated collections of SQL samples and table schemas aligned with specific business domains such as Mobility, Ads, and Core Services. This domain-specific approach narrows the search space, improving the accuracy of query generation. For example, a query about trip data is routed to the Mobility workspace, ensuring the system references only relevant tables and schemas.
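A Workspace can be pictured as a small data structure that bundles schemas and SQL samples for one domain. The sketch below is only an illustration of that idea: the `Workspace` class, its fields, and every table and column name are hypothetical and do not come from Uber's systems.

```python
from dataclasses import dataclass, field


@dataclass
class Workspace:
    """A curated bundle of table schemas and SQL samples for one business domain."""
    name: str
    # table name -> column names (hypothetical schema representation)
    table_schemas: dict = field(default_factory=dict)
    # (natural language question, SQL) pairs that can later serve as few-shot examples
    sql_samples: list = field(default_factory=list)


# Hypothetical workspace contents, invented for illustration.
mobility = Workspace(
    name="Mobility",
    table_schemas={
        "trips": ["trip_id", "driver_id", "city", "fare", "completed_at"],
        "drivers": ["driver_id", "signup_date", "home_city"],
    },
    sql_samples=[
        ("How many trips were completed in each city?",
         "SELECT city, COUNT(*) FROM trips GROUP BY city"),
    ],
)
ads = Workspace(name="Ads", table_schemas={"ad_impressions": ["ad_id", "shown_at"]})

workspaces = {ws.name: ws for ws in (mobility, ads)}


def tables_in_scope(workspace_name: str) -> list:
    """Return only the tables that the chosen workspace exposes."""
    return sorted(workspaces[workspace_name].table_schemas)


print(tables_in_scope("Mobility"))
```

Restricting later steps to `tables_in_scope(...)` is what narrows the search space: a trip question routed to the Mobility workspace never sees the Ads tables.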
2. Intent Agent: Mapping User Queries to Business Domains
The Intent Agent is the first step in the query generation pipeline. It analyzes the user’s natural language prompt, identifies keywords and context, and maps the query to the appropriate Workspace(s). This step ensures that the system focuses on the correct subset of tables and schemas, significantly improving efficiency and relevance.
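To make the routing step concrete, here is a minimal sketch of intent classification. It assumes a simple keyword-overlap score as a stand-in for the LLM-based analysis described above; the keyword lists and the `classify_intent` function are hypothetical.

```python
import re

# Hypothetical keyword lists per workspace. A production system would rely on
# an LLM rather than fixed keywords; this is only a stand-in.
WORKSPACE_KEYWORDS = {
    "Mobility": {"trip", "trips", "driver", "drivers", "rider", "fare"},
    "Ads": {"ad", "ads", "impression", "impressions", "campaign"},
    "Core Services": {"account", "accounts", "login", "payment"},
}


def classify_intent(prompt: str, min_hits: int = 1) -> list:
    """Map a prompt to every workspace whose keywords it mentions, best match first."""
    words = set(re.findall(r"[a-z]+", prompt.lower()))
    scores = {name: len(words & kws) for name, kws in WORKSPACE_KEYWORDS.items()}
    matched = [name for name, hits in scores.items() if hits >= min_hits]
    return sorted(matched, key=lambda name: -scores[name])


print(classify_intent("How many trips did each driver complete in Seattle?"))
```

Returning a list rather than a single name mirrors the "Workspace(s)" wording above: a prompt that spans domains can be mapped to more than one workspace.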
3. Table Agent: Selecting Relevant Tables
Once the Intent Agent identifies the relevant Workspace, the Table Agent proposes a list of tables that are most likely to contain the required data. Users can review and modify these selections, ensuring alignment with their expertise and needs. This human-in-the-loop approach enhances accuracy and trust in the system.
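The propose-then-review flow could look roughly like the sketch below. The ranking heuristic (counting prompt words that appear in table or column names) is an assumption standing in for the agent's actual selection logic, and `propose_tables`, `apply_user_review`, and the sample schema are hypothetical.

```python
import re


def propose_tables(prompt: str, schemas: dict, limit: int = 3) -> list:
    """Rank tables by how many prompt words appear in their table or column names."""
    words = set(re.findall(r"[a-z]+", prompt.lower()))

    def score(table: str) -> int:
        tokens = set()
        for name in (table, *schemas[table]):
            tokens.update(name.lower().split("_"))
        return len(words & tokens)

    ranked = sorted(schemas, key=lambda t: (-score(t), t))
    return [t for t in ranked if score(t) > 0][:limit]


def apply_user_review(proposed: list, add=(), remove=()) -> list:
    """Human-in-the-loop step: the user may drop proposed tables or add missing ones."""
    kept = [t for t in proposed if t not in remove]
    return kept + [t for t in add if t not in kept]


# Hypothetical schema, invented for illustration.
schemas = {
    "trips": ["trip_id", "driver_id", "city", "fare"],
    "drivers": ["driver_id", "signup_date"],
    "ad_impressions": ["ad_id", "shown_at"],
}
proposed = propose_tables("Total fare per city for trips last week", schemas)
final_tables = apply_user_review(proposed, add=["drivers"])
print(proposed, final_tables)
```

Keeping the review as a separate function makes the user's correction explicit, so the later steps always work from the confirmed list rather than the raw proposal.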
4. Column Prune Agent: Optimizing Schema Input
Large enterprise schemas often contain hundreds of columns, which can overwhelm language models and exceed token limits. The Column Prune Agent filters out irrelevant columns, reducing the input size and improving query generation efficiency. This step also lowers the cost of LLM calls and reduces latency. It also helps keep the resulting SQL queries efficient and fast, which reduces load on the databases.
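As an illustration of pruning, the sketch below keeps only columns whose names share a word with the prompt, plus an optional set of columns to always keep. This keyword rule is an assumption standing in for the agent's real relevance judgment; `prune_columns`, `schema_size`, and the schema are hypothetical, and character count is used as a crude proxy for token count.

```python
import re


def prune_columns(prompt: str, schema: dict, always_keep=frozenset()) -> dict:
    """Drop columns that look unrelated to the prompt; drop tables left with no columns."""
    words = set(re.findall(r"[a-z]+", prompt.lower()))
    pruned = {}
    for table, columns in schema.items():
        kept = [
            col for col in columns
            if col in always_keep or words & set(col.lower().split("_"))
        ]
        if kept:
            pruned[table] = kept
    return pruned


def schema_size(schema: dict) -> int:
    """Crude proxy for prompt size: characters needed to list the schema."""
    return sum(len(t) + sum(len(c) + 2 for c in cols) for t, cols in schema.items())


# Hypothetical wide table, invented for illustration.
schema = {
    "trips": ["trip_id", "driver_id", "city", "fare",
              "surge_multiplier", "app_version", "completed_at"],
}
pruned = prune_columns("average fare by city", schema, always_keep={"trip_id"})
print(pruned, schema_size(schema), schema_size(pruned))
```

The smaller the pruned schema, the less text is sent to the model on every call, which is where the cost and latency savings come from.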
5. QueryGPT Agent: Generating the Final SQL Query
The final step involves the QueryGPT Agent, which uses few-shot prompting to generate the SQL query. By referencing a small set of SQL examples and the pruned schema, the agent produces a precise and executable query. This approach reduces query generation time from 10 minutes to just 3 minutes.
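Few-shot prompting means placing a handful of worked question-and-SQL pairs ahead of the new question so the model can imitate them. The sketch below shows one way such a prompt might be assembled; the prompt wording, the `build_few_shot_prompt` function, and the sample data are all hypothetical, and the call to an actual LLM is left out.

```python
def build_few_shot_prompt(question: str, schema: dict, examples: list) -> str:
    """Assemble instructions, the pruned schema, worked examples, and the new question."""
    lines = [
        "Translate the question into SQL. Use only the tables and columns listed.",
        "",
        "Schema:",
    ]
    for table, columns in schema.items():
        lines.append(f"- {table}({', '.join(columns)})")
    for example_question, example_sql in examples:
        lines += ["", f"Question: {example_question}", f"SQL: {example_sql}"]
    # The prompt ends with an open "SQL:" slot for the model to complete.
    lines += ["", f"Question: {question}", "SQL:"]
    return "\n".join(lines)


# Hypothetical pruned schema and SQL sample, invented for illustration.
pruned_schema = {"trips": ["city", "fare"]}
examples = [
    ("How many trips were completed in each city?",
     "SELECT city, COUNT(*) FROM trips GROUP BY city"),
]
prompt = build_few_shot_prompt("What is the average fare by city?", pruned_schema, examples)
print(prompt)
```

The resulting string would be sent to the language model, whose completion after the final `SQL:` is the candidate query.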
Key Challenges and Solutions
1. Understanding User Intent
One of the biggest challenges was accurately interpreting user intent, especially with vague or incomplete prompts. Uber addressed this by introducing a Prompt Enhancer that refines user inputs before processing.
2. Handling Large Schemas
With some tables containing over 200 columns, token limits became a significant issue. The Column Prune Agent was introduced to filter out irrelevant columns, ensuring the system stays within token limits and operates efficiently.
3. Reducing Hallucinations
LLMs occasionally generate queries with non-existent tables or columns. To mitigate this, Uber implemented iterative refinement techniques and is exploring the use of a Validation Agent to verify query accuracy.
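One possible check for hallucinated names is to compile the generated query against an empty copy of the known schema. The sketch below does this with Python's built-in `sqlite3` module as a stand-in; it is not Uber's Validation Agent, the `find_schema_errors` function and schema are hypothetical, and SQLite's dialect may reject syntax that a production warehouse accepts.

```python
import sqlite3
from typing import Optional


def find_schema_errors(sql: str, schema: dict) -> Optional[str]:
    """Return an error message if the query names unknown tables or columns, else None."""
    conn = sqlite3.connect(":memory:")
    try:
        # Recreate the known tables, empty and untyped, in a throwaway database.
        for table, columns in schema.items():
            column_defs = ", ".join(f'"{col}"' for col in columns)
            conn.execute(f'CREATE TABLE "{table}" ({column_defs})')
        # EXPLAIN compiles the statement, which resolves every name, without running it.
        conn.execute(f"EXPLAIN {sql}")
        return None
    except sqlite3.OperationalError as exc:
        return str(exc)
    finally:
        conn.close()


# Hypothetical schema, invented for illustration.
schema = {"trips": ["trip_id", "city", "fare"]}
print(find_schema_errors("SELECT city, SUM(fare) FROM trips GROUP BY city", schema))
print(find_schema_errors("SELECT tip FROM trips", schema))
print(find_schema_errors("SELECT * FROM rides", schema))
```

An error message returned here could be fed back to the model as part of an iterative refinement loop, asking it to regenerate the query using only names that exist.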
Evaluation and Continuous Improvement
Uber employs a robust evaluation framework to ensure QueryGPT’s continuous improvement. Key metrics include:
- Intent Accuracy: How well the system maps user queries to the correct business domain.
- Table Overlap: The relevance of selected tables to the user’s query.
- Successful Runs: Whether the generated SQL executes without errors.
- Query Similarity: A comparison between generated SQL and manually validated “golden” queries.
This data-driven approach allows Uber to identify patterns, address shortcomings, and measure incremental improvements over time.
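The metrics above can be sketched as simple functions. The exact formulas are not given here, so the definitions below are assumptions: table overlap is taken as the fraction of the golden query's tables that were selected, and query similarity uses a character-level ratio from Python's `difflib` as a rough stand-in.

```python
from difflib import SequenceMatcher


def intent_accuracy(predicted: list, expected: list) -> float:
    """Share of prompts routed to the expected business domain."""
    hits = sum(p == e for p, e in zip(predicted, expected))
    return hits / len(expected)


def table_overlap(selected: set, golden: set) -> float:
    """Fraction of the golden query's tables that were selected (0.0 to 1.0)."""
    return len(selected & golden) / len(golden) if golden else 1.0


def _normalize(sql: str) -> str:
    """Lowercase and collapse whitespace so formatting differences are ignored."""
    return " ".join(sql.lower().split())


def query_similarity(generated: str, golden: str) -> float:
    """Rough textual similarity between generated and golden SQL (0.0 to 1.0)."""
    return SequenceMatcher(None, _normalize(generated), _normalize(golden)).ratio()


# Hypothetical evaluation data, invented for illustration.
print(intent_accuracy(["Mobility", "Ads"], ["Mobility", "Core Services"]))
print(table_overlap({"trips", "drivers"}, {"trips", "cities"}))
print(query_similarity("SELECT  city FROM trips", "select city from trips"))
```

Tracking these numbers per release makes it possible to see whether a change to one agent, such as the intent classifier, improved or degraded the pipeline as a whole.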
Business Impact and Future Directions
QueryGPT has transformed data access at Uber, delivering significant business outcomes:
- 70% reduction in query time: From 10 minutes to 3 minutes per query.
- 140,000 hours saved monthly (1.2 million monthly queries × 7 minutes each): Enabling teams to focus on higher-value tasks.
- Faster feedback loops: Accelerating decision-making and innovation.
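As a sanity check on these figures, the arithmetic can be worked through directly. The sketch assumes, as a simplification, that every one of the roughly 1.2 million monthly queries sees the full saving from 10 minutes down to 3; under that assumption the total comes to 140,000 hours per month.

```python
MONTHLY_QUERIES = 1_200_000   # interactive queries per month
MINUTES_BEFORE = 10           # authoring time per query without QueryGPT
MINUTES_AFTER = 3             # authoring time per query with QueryGPT

minutes_saved_per_query = MINUTES_BEFORE - MINUTES_AFTER
reduction = minutes_saved_per_query / MINUTES_BEFORE
hours_saved_per_month = MONTHLY_QUERIES * minutes_saved_per_query / 60

print(f"Reduction in query time: {reduction:.0%}")            # 70%
print(f"Hours saved per month: {hours_saved_per_month:,.0f}")  # 140,000
```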
Looking ahead, Uber plans to:
- Expand QueryGPT’s capabilities to handle more complex queries.
- Integrate user feedback to refine performance.
- Reduce hallucinations through improved prompts and validation mechanisms.
Conclusion
Uber’s QueryGPT represents a significant advance in AI-driven SQL query generation. By combining domain-specific Workspaces, a multi-agent architecture, and continuous evaluation, Uber has created a tool that not only boosts productivity but also democratizes data access across the organization.
As AI continues to evolve, tools like QueryGPT will likely become the standard for efficient data interaction, empowering businesses to harness the full potential of their data. Uber’s journey from a hackathon idea to a production-ready solution serves as an inspiring example of how innovation and iterative development can lead to transformative outcomes in the tech industry.
References
QueryGPT – Natural Language to SQL Using Generative AI