Describe the result.
Datapia builds the workflow.

Datapia turns plain-language instructions into reliable data pipelines. Load a file, describe the result you need, and generate validated transformations, reports, and charts. Run the same workflow on every future dataset and get consistent results — always under your control.

no credit card · your data stays yours
orders_q2.csv
48,210 rows × 14 cols
Drop duplicates & nulls · Done
48,210 → 46,884 · 1.2 s
Enrich with FX rates · Running
Artifacts
report · chart · clean data

How it works

A pipeline is a single line from raw file to finished artifact. Every step is a station you can inspect.

01 Load

Drop in any file: CSV, Excel, or JSON. Datapia profiles your data locally and shows exactly what it detected — columns, types, and patterns.

02 Describe

Tell Datapia what you want in plain language. No formulas or coding. The AI translates your intent into a validated transformation workflow.

03 Generate

Datapia builds a repeatable pipeline and executes it locally. You get clean datasets, reports, and charts, ready to use or export.

04 Reuse

Next month, new file. Same pipeline, one-click rerun. Same logic and consistent results every time, without re-explaining anything.
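As a rough illustration of the reuse idea, a pipeline can be thought of as an ordered list of steps that is applied unchanged to each new dataset. The sketch below is an assumption for illustration only, not Datapia's internal format: `Step`, `run_pipeline`, the two step functions, and the sample rows are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]
Step = Tuple[str, Callable[[List[Row]], List[Row]]]  # (label, function)


def drop_duplicates(rows: List[Row]) -> List[Row]:
    """Keep the first row seen for each order_id."""
    seen = set()
    out = []
    for row in rows:
        if row["order_id"] not in seen:
            seen.add(row["order_id"])
            out.append(row)
    return out


def drop_missing_amount(rows: List[Row]) -> List[Row]:
    """Remove rows whose amount_usd is missing."""
    return [r for r in rows if r.get("amount_usd") is not None]


def run_pipeline(steps: List[Step], rows: List[Row]) -> Tuple[List[Row], List[str]]:
    """Apply each step in order and record row counts before and after."""
    log = []
    for label, func in steps:
        before = len(rows)
        rows = func(rows)
        log.append(f"{label}: {before} -> {len(rows)}")
    return rows, log


# The same step list is reused for every new file (hypothetical example steps).
PIPELINE: List[Step] = [
    ("Drop duplicates", drop_duplicates),
    ("Drop missing amounts", drop_missing_amount),
]

may_rows = [
    {"order_id": 1, "amount_usd": 10.0},
    {"order_id": 1, "amount_usd": 10.0},
    {"order_id": 2, "amount_usd": None},
]
june_rows = [
    {"order_id": 7, "amount_usd": 5.0},
    {"order_id": 8, "amount_usd": 3.5},
]

may_clean, may_log = run_pipeline(PIPELINE, may_rows)
june_clean, june_log = run_pipeline(PIPELINE, june_rows)
```

Because the step list is defined once and only the input changes, each month's run applies identical logic, and the per-step log gives an inspectable record of what each step did.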

Private by design

Your file never leaves your workspace. To write each step’s code, the AI sees only a lightweight profile of your data — never the full dataset.

Datapia is a desktop application built for private and sensitive data workflows. It runs locally, and your files never leave your machine.
To generate transformations, Datapia processes only a lightweight data profile (structure, types, statistical patterns). Never your raw data.
Pipeline execution is fully local and isolated from the internet. For full control, you can run Datapia with your own local LLM or trusted provider of choice.
Your files power one thing only — your own pipeline runs. They’re never used to train AI models, and Datapia keeps no copies beyond your local machine.

Say it plainly — tune it if you want

Describe the transformation in plain language and get consistent, validated results every time. Pro users can inspect and refine the generated code directly.

Drop duplicate orders by order_id and remove rows with a missing customer or amount_usd.

Also drop rows where amount_usd is negative.
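# generated from your description
def transform(df):
    # Keep the first row for each order_id
    df = df.drop_duplicates(subset=["order_id"])
    # Drop rows missing a customer or an amount
    df = df.dropna(subset=["customer", "amount_usd"])
    # Keep only rows with a non-negative amount
    return df[df["amount_usd"] >= 0]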
Schema review on every step · Validated before saving · Full lineage, top to bottom · Every edit is a restorable version

Start your first pipeline in minutes

Free plan includes three pipelines and the full describe-to-artifact workflow.

Examples

Real pipelines built on real data — from a raw file to finished artifacts. Watch the demo, then browse the examples below.

Demo · YouTube
Datapia in 90 seconds · 1:30
All · Cleaning · Analytics · Finance · Marketing · IoT
Orders cleanup

Deduplicate orders, fix column types, normalize amounts to USD. Build a chart to visualize the sales per region.

Cleaning · 4 steps · 2 artifacts
Bank statements

Load a bank report, extract the statement data (detecting header and footer rows), convert amounts written in words to numbers, and generate a summary report by category.

Analytics · 5 steps · 3 artifacts
Marketing attribution

Parse UTM tags, build a last-touch model, and render a channel performance report.

Marketing · 6 steps · 2 artifacts
Sensor readings ETL

Resample raw telemetry, remove outliers, and publish a daily heatmap per device group.

IoT · 4 steps · 1 artifact
Quarterly finance report

Aggregate ledgers by region and currency, reconcile totals, export a rendered HTML report.

Finance · 5 steps · 3 artifacts
Log triage

Parse server logs, sessionize requests, and summarize errors by route and status code.

Analytics · 3 steps · 1 artifact
Public beta · v0.7.0

Datapia on your desktop.

A native app with a local engine — your files are read and transformed right on your machine. Install once, then describe your pipelines and ship artifacts offline.

All platforms
Signed installer ~150 MB

Choose your platform

Same app everywhere. Pick a build below, or grab the one we detected for you.

Windows

.exe installer · 64-bit
Recommended for you

macOS

.dmg · Universal (Intel & Apple silicon)
Coming soon...

Linux

.AppImage · x86_64
Coming soon...
System requirements
  • OS: Windows 10+, macOS 12+, or a modern 64-bit Linux
  • Memory: 4 GB RAM minimum, 8 GB recommended
  • Disk: ~400 MB after install (engine included)
  • Network: Only for AI generation; your data stays local
What’s in the box
  • The full describe-to-artifact workspace
  • Bundled local engine — no extra setup required
  • Sandboxed runs with an allowlist of safe libraries
  • Automatic updates, opt-out anytime
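One way to picture the library allowlist is a static check on a generated step's imports before it runs. The sketch below is a minimal illustration under stated assumptions, not a description of how Datapia enforces its sandbox: `ALLOWED_MODULES` is a hypothetical set (the actual permitted libraries are not listed here), and `imported_modules` and `disallowed_imports` are names invented for the example.

```python
import ast
from typing import Set

# Hypothetical allowlist; the real set of permitted libraries is not stated.
ALLOWED_MODULES: Set[str] = {"math", "statistics", "datetime", "pandas", "numpy"}


def imported_modules(source: str) -> Set[str]:
    """Return the top-level module names imported anywhere in the source."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                found.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            if node.level == 0 and node.module:
                found.add(node.module.split(".")[0])
            else:
                # Treat relative imports as disallowed in this sketch.
                found.add("<relative import>")
    return found


def disallowed_imports(source: str) -> Set[str]:
    """Return the imports that fall outside the allowlist."""
    return imported_modules(source) - ALLOWED_MODULES


safe_step = "import pandas as pd\ndef transform(df):\n    return df.dropna()\n"
unsafe_step = "import os\nfrom urllib import request\n"

safe_result = disallowed_imports(safe_step)
unsafe_result = disallowed_imports(unsafe_step)
```

A static import check like this only catches plain `import` statements. On its own it is not a sandbox, since dynamic imports would slip past it, so it would sit alongside process isolation rather than replace it.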

Pricing

Start free. Upgrade as your workflows grow.

Free

$0
Try it out
50 credits / mo
  • 3 pipelines
  • Swift & Balanced models
  • Unlimited runs
  • All templates
  • Artifacts viewer

Pro

$29 / month
Full power, full control
3,000 credits / mo
  • Unlimited pipelines
  • All models incl. Expert
  • Manual code editing
  • Pipeline export

Enterprise

Custom
Teams & compliance
custom credits
  • Everything in Pro
  • SSO / SAML
  • Local LLMs
  • Audit log
  • SLAs & dedicated support
What are credits?
One credit ≈ one AI-generated step in Balanced mode. Credits reset on the first of each month. Unused credits don’t roll over.
What happens to my data?
Your file is never uploaded to the AI model. To generate a step’s code, Datapia sends only a lightweight profile of your data: table shape, column types, value ranges, and a handful of sample values (which can be fully obfuscated on the Pro and Enterprise plans). It never sends the full dataset. Files stay in your workspace and are read only by your own pipeline runs.
Can I cancel anytime?
Yes. Downgrading keeps all pipelines; those beyond the plan limit become read-only until you reactivate or delete them. For lifetime licenses, please contact support.
What counts as a pipeline?
One source file plus its chain of steps and artifacts. Clones count as separate pipelines.
We’d love to hear from you
  • Questions about plans, the engine, or your data
  • Enterprise, SSO, and on-prem deployments
  • Bug reports and feature requests

Get in touch

Tell us what you need — we usually reply within one business day.

We’ll only use your email to reply. No newsletters, no sharing.

Privacy Policy

Effective date: June 1, 2026  ·  Last updated: June 1, 2026

Short version: Your files never leave your machine. Datapia sends only a lightweight data profile to the AI — never your raw data. We collect minimal telemetry and you can opt out.

1. What Datapia is

Datapia is a desktop application that runs entirely on your local machine. Pipeline execution, file reading, and artifact generation all happen on your device. We do not host your data.

2. Data that stays on your device

The following never leaves your machine:

  • Files you load into Datapia (CSV, Excel, JSON, and any other format)
  • Intermediate data generated during processing
  • Generated artifacts (cleaned datasets, reports, charts)

3. What we send to AI providers

To generate transformation code, Datapia sends a lightweight data profile to an AI model. A profile contains:

  • Table shape (row count, column names and inferred types)
  • Statistical summaries (min, max, nulls, cardinality)
  • A small number of representative sample values (never full rows)
  • Your natural-language description of the step

Raw file contents are never included. Pro and Enterprise plans offer additional obfuscation of sample values.
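To make the profile contents above more concrete, here is a minimal sketch of how such a profile could be computed from rows held in memory. It is an illustration under assumptions, not Datapia's actual profiler: the function names, the coarse type labels, and the sample rows are hypothetical, values are assumed to be simple scalars, and the natural-language step description is left out.

```python
import random
from typing import Any, Dict, List

Row = Dict[str, Any]


def infer_type(values: List[Any]) -> str:
    """Guess a coarse column type from the non-null values."""
    if not values:
        return "unknown"
    if all(isinstance(v, bool) for v in values):
        return "bool"
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values):
        return "number"
    return "text"


def profile_column(values: List[Any], n_samples: int, rng: random.Random) -> Dict[str, Any]:
    """Summarize one column: type, nulls, cardinality, range, and a few samples."""
    present = [v for v in values if v is not None]
    distinct = sorted(set(present), key=repr)
    col_type = infer_type(present)
    # Only compute a range when all values can be compared with each other.
    comparable = col_type == "number" or all(isinstance(v, str) for v in present)
    has_range = bool(present) and comparable
    return {
        "type": col_type,
        "nulls": len(values) - len(present),
        "cardinality": len(distinct),
        "min": min(present) if has_range else None,
        "max": max(present) if has_range else None,
        # Samples are drawn per column independently, not as aligned rows.
        "samples": rng.sample(distinct, min(n_samples, len(distinct))),
    }


def build_profile(rows: List[Row], n_samples: int = 3, seed: int = 0) -> Dict[str, Any]:
    """Build a profile holding table shape and per-column summaries only."""
    rng = random.Random(seed)
    columns = list(rows[0].keys()) if rows else []
    return {
        "row_count": len(rows),
        "columns": {
            name: profile_column([r.get(name) for r in rows], n_samples, rng)
            for name in columns
        },
    }


# Hypothetical sample data for illustration.
rows = [
    {"order_id": 1, "customer": "acme", "amount_usd": 120.0},
    {"order_id": 2, "customer": None, "amount_usd": 75.5},
    {"order_id": 3, "customer": "globex", "amount_usd": -10.0},
]
profile = build_profile(rows)
```

The resulting dictionary carries the row count, column names, inferred types, null counts, cardinality, ranges, and a capped number of sample values per column, which is the kind of summary a code generator needs without seeing the file itself.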

4. Account and billing data

When you create an account or subscribe, we collect your email address and process payments through Stripe. Stripe handles and stores all payment card details — Datapia never sees your card number.

5. Telemetry

The app collects anonymous usage events (e.g., step types used, pipeline run counts, crash reports) to help us improve reliability. No file contents, column names, or sample values are included in telemetry. You can opt out in Settings → Privacy.

6. Data sharing

We do not sell, rent, or share your data with third parties for advertising. AI generation requests are sent to your chosen AI provider (Datapia-hosted or BYO key) subject to that provider's own privacy policy.

7. Data retention

Account data is retained until you delete your account. Telemetry events are aggregated and anonymized within 90 days. To delete your account, contact us via the form on the site.

8. Children

Datapia is not directed at children under 13. We do not knowingly collect personal data from children.

Terms of Service

Effective date: June 1, 2026  ·  Last updated: June 1, 2026

1. Acceptance

By downloading, installing, or using Datapia you agree to these Terms. If you do not agree, do not use the software.

2. License

We grant you a non-exclusive, non-transferable, revocable license to install and use Datapia on devices you own or control, solely for your own data processing workflows. You may not sublicense, resell, or redistribute the software.

3. Accounts and subscriptions

Some features require a specific type of Datapia account. You are responsible for keeping your login credentials secure. Paid subscriptions are billed monthly or annually in advance. Downgrades and cancellations take effect at the end of the current billing period.

4. Acceptable use

You agree not to:

  • Use Datapia to process data you do not have the right to use
  • Attempt to reverse-engineer, decompile, or extract the source code
  • Use the software for unlawful purposes or to harm others
  • Circumvent plan limits or access controls

5. Your content

You retain all rights to the files you load and the pipelines you create. Datapia does not claim ownership of your data or outputs.

6. AI-generated code

Transformation code produced by Datapia is generated by AI and may contain errors. You are responsible for reviewing, testing, and validating any generated code before relying on its output.

7. Availability and updates

We may update, modify, or discontinue features at any time. For paid plans, we will give reasonable advance notice of material changes. Automatic updates can be disabled in Settings.

8. Disclaimer of warranties

Datapia is provided "as is" without warranty of any kind, express or implied, including fitness for a particular purpose or uninterrupted availability.

9. Limitation of liability

To the maximum extent permitted by law, Datapia's total liability to you for any claims arising from use of the software is limited to the amount you paid us in the 12 months preceding the claim.

10. Governing law

These Terms are governed by the laws of the jurisdiction in which Datapia is incorporated, without regard to conflict-of-law rules.

11. Contact

Questions about these Terms? Contact us via the form on the site.

Security

Last updated: June 1, 2026

Found a vulnerability? Please report it responsibly via the form on the site. We aim to acknowledge reports within 2 business days.

Local-first execution

Datapia's pipeline engine runs entirely on your device in an isolated process. File reads, data transformations, and artifact writes all happen locally. The engine has no persistent network access during a run.

Data profile, not raw data

The only data sent over the network is a lightweight profile: table shape, column types, statistics, and a small number of representative sample values. Neither your raw files nor your processing results ever leave your machine.

Transport security

All network communication between the app and Datapia's servers (license checks, credit sync, telemetry) uses TLS 1.2 or higher. AI generation requests to third-party providers also use TLS.
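For readers who want to apply the same floor in their own tooling, the sketch below shows how a client can refuse anything older than TLS 1.2 using Python's standard `ssl` module. It is a generic illustration, not Datapia's networking code, and `make_client_context` is a name chosen for the example.

```python
import ssl


def make_client_context() -> ssl.SSLContext:
    """Create a client TLS context that refuses anything older than TLS 1.2."""
    # The default context verifies certificates and checks hostnames.
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = make_client_context()
```

The context can then be passed to standard-library clients, for example `urllib.request.urlopen(url, context=context)` or `http.client.HTTPSConnection(host, context=context)`.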

Local mode

When you point Datapia at your own local LLM server, generation requests go directly from your machine to that server and never pass through Datapia's infrastructure.

Automatic updates

Updates are delivered over HTTPS and verified with a code signature before installation. You can opt out of automatic updates in Settings, though we recommend staying current for security patches.

Account security

Passwords are hashed with bcrypt. We support email-based two-factor authentication. Sessions expire after 30 days of inactivity.

Vulnerability disclosure

We follow a coordinated disclosure model. If you discover a security issue, please contact us via the form on the site with a description and reproduction steps. Please allow us 90 days to investigate and patch before public disclosure. We do not currently offer a bug bounty program, but we recognize responsible reporters in our release notes.

Supported versions

Security patches are applied to the current release only. We encourage all users to keep Datapia up to date.