---
title: "Approximate Answers, Exact Decisions: New Sketch Functions for Analytics"
source_name: "Databricks"
original_url: "https://www.databricks.com/blog/approximate-answers-exact-decisions-new-sketch-functions-analytics"
canonical_url: "https://www.traeai.com/articles/f6a3f04d-0961-48a2-90e5-5a6b9ef68e40"
content_type: "article"
language: "English"
score: 6
tags: ["Data Analytics","Databricks"]
published_at: "2026-04-29T20:01:22+00:00"
created_at: "2026-04-30T02:42:40.137872+00:00"
---

# Approximate Answers, Exact Decisions: New Sketch Functions for Analytics

Canonical URL: https://www.traeai.com/articles/f6a3f04d-0961-48a2-90e5-5a6b9ef68e40
Original source: https://www.databricks.com/blog/approximate-answers-exact-decisions-new-sketch-functions-analytics

## Summary

Databricks has released new sketch functions that return approximate answers in data analysis to support fast decision-making.

## Key Takeaways

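- The new sketch functions return approximate answers that speed up data analysis.
- The functions suit fast processing and preliminary analysis of large-scale datasets.
- The Databricks platform supports these new functions to improve data processing efficiency.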

## Content

Table of contents

*   [Percentile calculations in milliseconds, not minutes](http://www.databricks.com/blog/approximate-answers-exact-decisions-new-sketch-functions-analytics#section-1)
*   [Audience overlap analysis without the compute bill](http://www.databricks.com/blog/approximate-answers-exact-decisions-new-sketch-functions-analytics#section-2)
*   [Real-time leaderboards without reprocessing raw data](http://www.databricks.com/blog/approximate-answers-exact-decisions-new-sketch-functions-analytics#section-3)
*   [Cardinality and revenue attribution in one pass](http://www.databricks.com/blog/approximate-answers-exact-decisions-new-sketch-functions-analytics#section-4)
*   [Getting started with the right sketch](http://www.databricks.com/blog/approximate-answers-exact-decisions-new-sketch-functions-analytics#section-5)

[Product](http://www.databricks.com/blog/category/platform/product) April 29, 2026

# Approximate Answers, Exact Decisions: New Sketch Functions for Analytics

Four new sketch functions in Databricks speed up percentiles, distinct counts, and top-K queries by orders of magnitude

by [Daniel Tenedorio](http://www.databricks.com/blog/author/daniel-tenedorio), [Kent Marten](http://www.databricks.com/blog/author/kent-martin), [Gengliang Wang](http://www.databricks.com/blog/author/gengliang-wang) and [Chenhao Li](http://www.databricks.com/blog/author/chenhao-li)

Summary

*   Percentiles in milliseconds, not minutes: KLL quantile sketches compute P50, P90, P99 over massive datasets in constant memory. Store sketches and merge them for instant incremental updates.
*   Audience overlap at a fraction of the cost: Theta and Tuple sketches perform unions, intersections, and set differences on distinct value sets. Tuple sketches also associate metrics (sums, mins, maxes) with each key for combined counting and aggregation.
*   Real-time trending without reprocessing: Approximate top-K functions identify the most frequent items in bounded memory, mergeable across time windows.

![Image 4: New Sketch Functions for Analytics](https://www.databricks.com/sites/default/files/inline-images/New-Sketch-Functions-for-Analytics.png?v=1777493382)

Large-scale datasets compress into compact, mergeable sketches, enabling fast percentile queries and aggregations without scanning raw data.

Many analytical questions are decision-support, not audit. If knowing "~4.7M unique users ±1%" leads to the same decision as "4,712,389 unique users," the approximate answer at a fraction of the cost is strictly better.

Every warehouse has a handful of queries that burn the most compute: percentiles that force global sorts, distinct counts that track every unique value, top-K rankings that reshuffle entire datasets. Databricks now supports four new sketch function families, built on [Apache DataSketches](https://datasketches.apache.org/), that replace these exact computations with bounded-memory approximations. The tradeoff: 1-2% configurable relative error. The payoff: orders-of-magnitude less compute, plus sketches you can store, merge, and requery without touching raw data.

## **Percentile calculations in milliseconds, not minutes**

When you call PERCENTILE(response_time_ms, 0.99) on a billion-row table, the engine must sort every value globally. A full cluster shuffle could take minutes and consume gigabytes of memory. For a dashboard that refreshes every 5 minutes, you're paying that cost over and over.

KLL sketches are compact, mergeable summaries built to answer quantile questions. They replace the global sort and use the same bounded memory whether you process a thousand values or a trillion. Typical relative error is 1-2% and is configurable, which is well within the actionable range for latency monitoring, capacity planning, and anomaly detection.

```sql
-- Build a sketch and store it so it can be queried later
CREATE TABLE sketches AS
SELECT kll_sketch_agg_bigint(response_time_ms, 200) AS sketch FROM web_logs;
-- Query the stored sketch
SELECT kll_get_quantile_bigint(sketch, 0.99) AS p99_latency FROM sketches;
```

The real advantage is the workflow sketches enable. Build them once during your daily ETL. Store them as columns in Delta tables. When a dashboard needs P50/P90/P99 for any time range, merge the precomputed sketches in milliseconds instead of rescanning raw data. Extract multiple quantiles from a single sketch in one pass with kll_get_quantile_bigint(sketch, ARRAY(0.5, 0.9, 0.99)).
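
To make the build-once, merge-later workflow concrete, here is a minimal Python sketch of a mergeable quantile summary in the spirit of KLL. It is a simplified illustration, not the Apache DataSketches implementation or the code behind the Databricks functions. The class name `ToyQuantileSketch`, the fixed per-level capacity `k`, and the `seed` parameter are assumptions made for this example, and unlike real KLL this toy's memory grows logarithmically with the input instead of staying bounded.

```python
import random


class ToyQuantileSketch:
    """Toy KLL-style summary: items stored at level h each stand for 2**h inputs."""

    def __init__(self, k=200, seed=0):
        self.k = k                      # per-level buffer capacity (assumed fixed here)
        self.levels = [[]]              # levels[h] holds items of weight 2**h
        self.rng = random.Random(seed)

    def update(self, value):
        self.levels[0].append(value)
        self._compact()

    def merge(self, other):
        # Merging is level-wise concatenation followed by the usual compaction.
        for h, buf in enumerate(other.levels):
            while len(self.levels) <= h:
                self.levels.append([])
            self.levels[h].extend(buf)
        self._compact()

    def _compact(self):
        h = 0
        while h < len(self.levels):
            buf = self.levels[h]
            if len(buf) >= self.k:
                buf.sort()
                # Keep one item behind when the count is odd so total weight is preserved.
                leftover = [buf.pop()] if len(buf) % 2 else []
                # Promote every other item (random parity) with doubled weight.
                promoted = buf[self.rng.randint(0, 1)::2]
                self.levels[h] = leftover
                if h + 1 == len(self.levels):
                    self.levels.append([])
                self.levels[h + 1].extend(promoted)
            h += 1

    def total_weight(self):
        return sum(len(buf) << h for h, buf in enumerate(self.levels))

    def quantile(self, q):
        # Walk the retained items in sorted order until q of the total weight is covered.
        weighted = sorted((v, 1 << h) for h, buf in enumerate(self.levels) for v in buf)
        target = q * self.total_weight()
        running = 0
        for value, weight in weighted:
            running += weight
            if running >= target:
                return value
        return weighted[-1][0]


# Usage: summarize two shards independently, then merge the summaries.
values = list(range(1, 100_001))
random.Random(42).shuffle(values)
shard_a, shard_b = ToyQuantileSketch(seed=1), ToyQuantileSketch(seed=2)
for v in values[:50_000]:
    shard_a.update(v)
for v in values[50_000:]:
    shard_b.update(v)
shard_a.merge(shard_b)
print(shard_a.quantile(0.5), shard_a.quantile(0.99))
```

The merged summary answers quantile queries for the combined data without revisiting either shard, which is the property that makes precomputed daily sketches useful for arbitrary time ranges.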

## **Audience overlap analysis without the compute bill**

How many users saw your Super Bowl ad but not your Instagram campaign? Audience overlap analysis is core to marketing measurement. You need to know total reach (users who saw any campaign), overlap (users who saw multiple campaigns), and exclusive reach (users who saw only one campaign). But exact computation requires collecting every user ID into memory and performing set operations across potentially billions of identifiers. At scale, this becomes impractical or impossible.

Theta sketches summarize a set of distinct values in bounded memory and support full set algebra: unions, intersections, and differences. Build a sketch per campaign, then combine them mathematically:

```sql
-- Build sketches for each campaign
WITH campaign_sketches AS (
  SELECT 'A' AS campaign, theta_sketch_agg(user_id) AS sketch FROM campaign_a
  UNION ALL
  SELECT 'B' AS campaign, theta_sketch_agg(user_id) AS sketch FROM campaign_b
)
SELECT
  -- Total unique users across both campaigns
  theta_sketch_estimate(
    theta_union(
      MAX(CASE WHEN campaign = 'A' THEN sketch END),
      MAX(CASE WHEN campaign = 'B' THEN sketch END)
    )
  ) AS total_reach,
  
  -- Users who saw BOTH campaigns
  theta_sketch_estimate(
    theta_intersection(
      MAX(CASE WHEN campaign = 'A' THEN sketch END),
      MAX(CASE WHEN campaign = 'B' THEN sketch END)
    )
  ) AS overlap
FROM campaign_sketches;
```

The exact approach would require a UNION to deduplicate, then a JOIN to find overlap, possibly shuffling raw user IDs twice across your cluster. With Theta sketches, you generate compact binary objects measured in kilobytes, and the **set operations happen locally in microseconds**. This makes daily reach curves, incrementality measurement, and cross-channel deduplication practical.
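
The idea behind this set algebra can be sketched in a few lines of Python. The following is a toy illustration under stated assumptions, not the DataSketches or Databricks implementation: it keeps the `k` smallest hash values of each set, and the helper names (`unit_hash`, `build`, `estimate`, `union`, `intersection`, `difference`), the SHA-256 hash, and `k=4096` are all choices made for this example.

```python
import hashlib


def unit_hash(key):
    """Map a key to a pseudo-uniform float in [0, 1) (SHA-256 is an arbitrary choice)."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return (int.from_bytes(digest[:8], "big") >> 11) / 2**53


def build(keys, k=4096):
    """Keep the k smallest hashes; theta is the cutoff below which every hash is retained."""
    hashes = sorted({unit_hash(key) for key in keys})
    theta = hashes[k] if len(hashes) > k else 1.0
    return theta, frozenset(x for x in hashes if x < theta)


def estimate(sketch):
    # A fraction theta of the hash space holds len(retained) distinct keys.
    theta, retained = sketch
    return len(retained) / theta


def union(a, b):
    theta = min(a[0], b[0])
    return theta, frozenset(x for x in a[1] | b[1] if x < theta)


def intersection(a, b):
    theta = min(a[0], b[0])
    return theta, frozenset(x for x in a[1] & b[1] if x < theta)


def difference(a, b):
    theta = min(a[0], b[0])
    return theta, frozenset(x for x in a[1] - b[1] if x < theta)


# Usage: two campaigns with 60,000 users each, 20,000 of them shared.
campaign_a = build(range(0, 60_000))
campaign_b = build(range(40_000, 100_000))
print("total reach:", round(estimate(union(campaign_a, campaign_b))))
print("overlap:", round(estimate(intersection(campaign_a, campaign_b))))
print("only A:", round(estimate(difference(campaign_a, campaign_b))))
```

Each sketch holds at most `k` hash values regardless of audience size, and every set operation touches only those retained values, never the raw user IDs.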

## **Real-time leaderboards without reprocessing raw data**

What's trending right now? It's a simple question with an expensive exact answer: count every distinct value, store all those counts, shuffle them across your cluster, sort globally. For high-cardinality event streams like search logs or clickstreams, this is a batch job, not a live query.

Approximate top-K sketches track your most frequently occurring items in bounded memory and let you merge across partitions and time windows to extract results instantly. Rare items might be dropped, which is fine, because that’s not what you’re looking for.

```sql
-- Build a sketch of the last hour's searches
WITH last_hour AS (
  SELECT approx_top_k_accumulate(search_term, 10000) AS sketch
  FROM search_logs
  WHERE search_time > current_timestamp() - INTERVAL 1 HOUR
)
-- Extract the top 10 trending terms
SELECT approx_top_k_estimate(sketch, 10) AS trending_terms
FROM last_hour;
```

With approx_top_k_combine, your "trending this week" dashboard becomes a merge of 168 pre-computed sketches rather than a scan of billions of raw events. For streaming workloads, merge each micro-batch's sketch into a running total and display results in real time. What was once a batch job becomes a live leaderboard.
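
As a rough illustration of how bounded-memory frequency summaries can be accumulated and then combined, here is a Python sketch using the classic Misra-Gries algorithm. This is a swapped-in textbook technique, not necessarily what the `approx_top_k` functions do internally. The function names `accumulate`, `combine`, and `estimate`, the helper `hour_of_searches`, and the sample search terms are hypothetical.

```python
import random
from collections import Counter


def accumulate(stream, k):
    """Misra-Gries: track at most k counters; counts may be underestimated, never over."""
    summary = {}
    for item in stream:
        if item in summary or len(summary) < k:
            summary[item] = summary.get(item, 0) + 1
        else:
            # Table full: decrement every counter, dropping those that reach zero.
            summary = {key: c - 1 for key, c in summary.items() if c > 1}
    return summary


def combine(a, b, k):
    """Merge two summaries, then shrink back to at most k counters."""
    merged = Counter(a)
    merged.update(b)
    if len(merged) > k:
        cutoff = sorted(merged.values(), reverse=True)[k]  # (k+1)-th largest count
        merged = {key: c - cutoff for key, c in merged.items() if c > cutoff}
    return dict(merged)


def estimate(summary, n):
    """Return the n items with the largest tracked counts."""
    return sorted(summary.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


def hour_of_searches(popular, hour, seed):
    # Hypothetical data: a few popular terms plus 1,000 one-off rare terms.
    stream = [term for term, count in popular.items() for _ in range(count)]
    stream += [f"rare-{hour}-{i}" for i in range(1000)]
    random.Random(seed).shuffle(stream)
    return stream


# Usage: one summary per hour, merged on read.
hour_1 = accumulate(hour_of_searches({"spark": 500, "delta": 300}, 1, seed=1), k=50)
hour_2 = accumulate(hour_of_searches({"spark": 400, "sql": 600}, 2, seed=2), k=50)
trending = estimate(combine(hour_1, hour_2, k=50), 3)
print(trending)
```

The rare one-off terms are evicted along the way, while the heavy hitters survive both accumulation and merging with counts that are off by at most a small, bounded amount.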

## **Cardinality and revenue attribution in one pass**

Counting distinct customers is one query. Summing their revenue is another. Doing both correctly, without double-counting customers who appear in multiple periods, is the challenge.

Consider a common analytics question: “How many unique customers made a purchase this month, and what was their total revenue by region?” Typically, you would start with a large GROUP BY, deduplicating customer IDs while summing purchases across billions of transactions. And you can't simply add prior results together: customers appearing in both periods get double-counted and their revenue overstated.

Tuple sketches solve this by combining distinct counting and metric aggregation in a single, mergeable structure.

```sql
-- Build daily sketches: distinct customers with their total spend
CREATE TABLE daily_revenue_sketches AS
SELECT 
  date,
  region,
  tuple_sketch_agg_double(customer_id, purchase_amount, 12, 'sum') AS sketch
FROM transactions
GROUP BY date, region;

-- Query any date range instantly by merging sketches
SELECT 
  region,
  tuple_sketch_get_estimate_double(
    tuple_sketch_union_double(sketch, 14, 'sum')
  ) AS unique_customers,
  tuple_sketch_get_sum_double(
    tuple_sketch_union_double(sketch, 14, 'sum')
  ) AS total_revenue
FROM daily_revenue_sketches
WHERE date BETWEEN '2024-01-01' AND '2024-01-31'
GROUP BY region;
```

Each sketch maps a distinct customer to its aggregated spend. When you merge across days, customer counts deduplicate automatically and revenue sums accumulate. Exact incremental computation would have you reprocessing from raw data every time the date range changed.
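
The following Python sketch illustrates, under stated assumptions, why merging deduplicates customers while still accumulating revenue. It is a toy model, not the DataSketches or Databricks implementation: each retained hash carries a running sum, and the names `unit_hash`, `build`, `union`, `distinct_estimate`, and `sum_estimate`, the SHA-256 hash, and `k=4096` are choices made for this example.

```python
import hashlib


def unit_hash(key):
    """Map a key to a pseudo-uniform float in [0, 1) (SHA-256 is an arbitrary choice)."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return (int.from_bytes(digest[:8], "big") >> 11) / 2**53


def build(rows, k=4096):
    """rows yields (customer_id, amount); keep summed amounts for the k smallest hashes."""
    totals = {}
    for key, amount in rows:
        hv = unit_hash(key)
        totals[hv] = totals.get(hv, 0.0) + amount
    ordered = sorted(totals)
    theta = ordered[k] if len(ordered) > k else 1.0
    return theta, {hv: totals[hv] for hv in ordered if hv < theta}


def union(a, b):
    """Merge in 'sum' mode: a customer present in both sketches keeps one entry."""
    theta = min(a[0], b[0])
    merged = {}
    for _, entries in (a, b):
        for hv, total in entries.items():
            if hv < theta:
                merged[hv] = merged.get(hv, 0.0) + total
    return theta, merged


def distinct_estimate(sketch):
    return len(sketch[1]) / sketch[0]


def sum_estimate(sketch):
    return sum(sketch[1].values()) / sketch[0]


# Usage: 30,000 customers per day, 10,000 of them buying on both days.
day_1 = build((customer, 10.0) for customer in range(0, 30_000))
day_2 = build((customer, 5.0) for customer in range(20_000, 50_000))
both_days = union(day_1, day_2)
print("unique customers:", round(distinct_estimate(both_days)))
print("total revenue:", round(sum_estimate(both_days)))
print("naive sum of daily counts:", round(distinct_estimate(day_1) + distinct_estimate(day_2)))
```

Adding the two daily distinct counts overstates the customer base because repeat buyers are counted twice, whereas the merged sketch counts each hashed customer once and still sums both days of spend.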

## **Getting started with the right sketch**

| Function Family | Use Cases |
| --- | --- |
| KLL Quantile Sketches | Percentiles (P50, P90, P99) |
| Theta Sketches | Set operations on distinct values |
| Approximate Top-K | Most frequent items |
| Tuple Sketches | Distinct counts and metric aggregations |

**When to use sketches**: Dashboards, trend analysis, monitoring, marketing attribution -- any query where approximate answers are acceptable. The larger your dataset, the better. If you’re not sure which sketch to use, ask [Genie Code](https://docs.databricks.com/aws/en/genie-code/) to help you choose.

**When to stay exact**: Financial auditing, compliance reporting, or any use case where regulatory or business requirements demand precise values.

These four function families turn long-running queries into the cheapest in your warehouse. Build sketches once during ETL, store them in Delta, merge them on read. The raw data is still there when the auditors ask. For everything else, a 1% error margin and a 1000x speedup is a welcome trade-off.

All functions work in SQL, DataFrame, and Structured Streaming pipelines. Sketches created in Spark are interoperable with other systems in the [Apache DataSketches](https://datasketches.apache.org/) ecosystem. See documentation ([1](https://docs.databricks.com/aws/en/sql/language-manual/functions/kll_sketch_agg_double), [2](https://docs.databricks.com/aws/en/sql/language-manual/functions/theta_sketch_agg), [3](https://docs.databricks.com/aws/en/sql/language-manual/functions/approx_top_k_accumulate), [4](https://docs.databricks.com/aws/en/sql/language-manual/functions/tuple_sketch_agg_integer)) for function signatures and examples and get started with sketches today.

_Special mention to Christopher Boumalhab (cboumalh on GitHub) for implementing and contributing the Theta sketch and Tuple sketch function families in Apache Spark._

### Get the latest posts in your inbox

Subscribe to our blog and get the latest posts delivered to your inbox.

## Sign up

*

Work Email 

*

Country Country*

By clicking “Subscribe” I understand that I will receive Databricks communications, and I agree to Databricks processing my personal data in accordance with its [Privacy Policy](https://www.databricks.com/legal/privacynotice).

Subscribe

[View all blogs](http://www.databricks.com/blog/category/all)

[![Image 5: databricks logo](https://www.databricks.com/sites/default/files/2023-08/databricks-default.png?v=1712162038)](https://www.databricks.com/)

Why Databricks

Discover
*   [For App Developers](http://www.databricks.com/developers)
*   [For Executives](http://www.databricks.com/why-databricks/executives)
*   [For Startups](http://www.databricks.com/product/startups)
*   [Lakehouse Architecture](http://www.databricks.com/product/data-lakehouse)
*   [Databricks AI Research](http://www.databricks.com/research/databricks-ai-research)

Customers
*   [Customer Stories](https://www.databricks.com/customers)

Partners
*   [Partner Overview](http://www.databricks.com/partners)
*   [Partner Program](http://www.databricks.com/partners/partner-program)
*   [Find a Partner](http://www.databricks.com/partners/partner-directory)
*   [Partner Spotlight](http://www.databricks.com/partners/partner-spotlight)
*   [Cloud Providers](http://www.databricks.com/partners/cloud-partners)
*   [Partner Solutions](http://www.databricks.com/partners/consulting-and-si/partner-solutions)

Why Databricks

Discover

*   [For App Developers](http://www.databricks.com/developers)
*   [For Executives](http://www.databricks.com/why-databricks/executives)
*   [For Startups](http://www.databricks.com/product/startups)
*   [Lakehouse Architecture](http://www.databricks.com/product/data-lakehouse)
*   [Databricks AI Research](http://www.databricks.com/research/databricks-ai-research)

Customers

*   [Customer Stories](https://www.databricks.com/customers)

Partners

*   [Partner Overview](http://www.databricks.com/partners)
*   [Partner Program](http://www.databricks.com/partners/partner-program)
*   [Find a Partner](http://www.databricks.com/partners/partner-directory)
*   [Partner Spotlight](http://www.databricks.com/partners/partner-spotlight)
*   [Cloud Providers](http://www.databricks.com/partners/cloud-partners)
*   [Partner Solutions](http://www.databricks.com/partners/consulting-and-si/partner-solutions)

Product

Databricks Platform
*   [Platform Overview](http://www.databricks.com/product/data-intelligence-platform)
*   [Sharing](http://www.databricks.com/product/delta-sharing)
*   [Governance](http://www.databricks.com/product/unity-catalog)
*   [Artificial Intelligence](http://www.databricks.com/product/artificial-intelligence)
*   [Business Intelligence](http://www.databricks.com/product/business-intelligence)
*   [Database](http://www.databricks.com/product/lakebase)
*   [Data Management](http://www.databricks.com/product/delta-lake-on-databricks)
*   [Data Warehousing](http://www.databricks.com/product/databricks-sql)
*   [Data Engineering](http://www.databricks.com/product/data-engineering)
*   [Data Science](http://www.databricks.com/product/data-science)
*   [Application Development](http://www.databricks.com/product/databricks-apps)
*   [Security](http://www.databricks.com/product/lakewatch)

Pricing
*   [Pricing Overview](http://www.databricks.com/product/pricing)
*   [Pricing Calculator](http://www.databricks.com/product/pricing/product-pricing/instance-types)

[Open Source](http://www.databricks.com/product/open-source)

Integrations and Data
*   [Marketplace](http://www.databricks.com/product/marketplace)
*   [IDE Integrations](http://www.databricks.com/product/data-science/ide-integrations)
*   [Partner Connect](http://www.databricks.com/partnerconnect)

Product

Databricks Platform

*   [Platform Overview](http://www.databricks.com/product/data-intelligence-platform)
*   [Sharing](http://www.databricks.com/product/delta-sharing)
*   [Governance](http://www.databricks.com/product/unity-catalog)
*   [Artificial Intelligence](http://www.databricks.com/product/artificial-intelligence)
*   [Business Intelligence](http://www.databricks.com/product/business-intelligence)
*   [Database](http://www.databricks.com/product/lakebase)
*   [Data Management](http://www.databricks.com/product/delta-lake-on-databricks)
*   [Data Warehousing](http://www.databricks.com/product/databricks-sql)
*   [Data Engineering](http://www.databricks.com/product/data-engineering)
*   [Data Science](http://www.databricks.com/product/data-science)
*   [Application Development](http://www.databricks.com/product/databricks-apps)
*   [Security](http://www.databricks.com/product/lakewatch)

Pricing

*   [Pricing Overview](http://www.databricks.com/product/pricing)
*   [Pricing Calculator](http://www.databricks.com/product/pricing/product-pricing/instance-types)

Open Source

Integrations and Data

*   [Marketplace](http://www.databricks.com/product/marketplace)
*   [IDE Integrations](http://www.databricks.com/product/data-science/ide-integrations)
*   [Partner Connect](http://www.databricks.com/partnerconnect)

Solutions

Databricks For Industries
*   [Communications](http://www.databricks.com/solutions/industries/communications)
*   [Financial Services](http://www.databricks.com/solutions/industries/financial-services)
*   [Healthcare and Life Sciences](http://www.databricks.com/solutions/industries/healthcare-and-life-sciences)
*   [Manufacturing](http://www.databricks.com/solutions/industries/manufacturing-industry-solutions)
*   [Media and Entertainment](http://www.databricks.com/solutions/industries/media-and-entertainment)
*   [Public Sector](http://www.databricks.com/solutions/industries/public-sector)
*   [Retail](http://www.databricks.com/solutions/industries/retail-industry-solutions)
*   [View All](http://www.databricks.com/solutions)

Cross Industry Solutions
*   [Cybersecurity](http://www.databricks.com/solutions/industries/cybersecurity)
*   [Marketing](http://www.databricks.com/solutions/industries/marketing)

[Data Migration](http://www.databricks.com/solutions/migration)

[Professional Services](http://www.databricks.com/professional-services)

[Solution Accelerators](http://www.databricks.com/solutions/accelerators)

Solutions

Databricks For Industries

*   [Communications](http://www.databricks.com/solutions/industries/communications)
*   [Financial Services](http://www.databricks.com/solutions/industries/financial-services)
*   [Healthcare and Life Sciences](http://www.databricks.com/solutions/industries/healthcare-and-life-sciences)
*   [Manufacturing](http://www.databricks.com/solutions/industries/manufacturing-industry-solutions)
*   [Media and Entertainment](http://www.databricks.com/solutions/industries/media-and-entertainment)
*   [Public Sector](http://www.databricks.com/solutions/industries/public-sector)
*   [Retail](http://www.databricks.com/solutions/industries/retail-industry-solutions)
*   [View All](http://www.databricks.com/solutions)

Cross Industry Solutions

*   [Cybersecurity](http://www.databricks.com/solutions/industries/cybersecurity)
*   [Marketing](http://www.databricks.com/solutions/industries/marketing)

Data Migration

Professional Services

Solution Accelerators

Resources

[Documentation](https://www.databricks.com/databricks-documentation)

[Customer Support](https://www.databricks.com/support)

[Community](https://community.databricks.com/)

Learning
*   [Training](http://www.databricks.com/learn/training/home)
*   [Certification](https://www.databricks.com/learn/training/certification)
*   [Free Edition](http://www.databricks.com/learn/free-edition)
*   [University Alliance](http://www.databricks.com/university)
*   [Databricks Academy Login](https://www.databricks.com/learn/training/login)

Events
*   [Data + AI Summit](http://www.databricks.com/dataaisummit)
*   [Data + AI World Tour](http://www.databricks.com/dataaisummit/worldtour)
*   [AI Days](https://www.databricks.com/ai-days)
*   [Event Calendar](http://www.databricks.com/events)

Blog and Podcasts
*   [Databricks Blog](http://www.databricks.com/blog)
*   [AI Blog](http://www.databricks.com/blog/category/ai)
*   [Data Brew Podcast](http://www.databricks.com/discover/data-brew)
*   [Champions of Data & AI Podcast](http://www.databricks.com/discover/champions-of-data-and-ai)

[![Image 7: databricks logo](https://www.databricks.com/sites/default/files/2023-08/databricks-default.png?v=1712162038)](https://www.databricks.com/)

Databricks Inc.

 160 Spear Street, 15th Floor

 San Francisco, CA 94105

 1-866-330-0121

© Databricks 2026. All rights reserved. Apache, Apache Spark, Spark, the Spark Logo, Apache Iceberg, Iceberg, and the Apache Iceberg logo are trademarks of the [Apache Software Foundation](https://www.apache.org/).

*   [Privacy Notice](https://www.databricks.com/legal/privacynotice)
*   [Terms of Use](https://www.databricks.com/legal/terms-of-use)
*   [Modern Slavery Statement](https://www.databricks.com/legal/modern-slavery-policy-statement)
*   [California Privacy](https://www.databricks.com/legal/supplemental-privacy-notice-california-residents)
*   [Your Privacy Choices](http://www.databricks.com/blog/approximate-answers-exact-decisions-new-sketch-functions-analytics#yourprivacychoices)
