How It Works
From signup to data in under 3 minutes
1. Sign Up & Get Your Key
Create a free account. Your API key is generated instantly — no credit card required.
2. Generate Data
Call the API with your desired table type, row count, and sector. Get results in seconds.
3. Build & Ship
Use realistic synthetic data for testing, training ML models, or compliance workflows.
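Step 2 can be sketched in Python with only the standard library. The endpoint, headers, and payload fields mirror the curl example on this page; the VYNFI_API_KEY environment variable, the helper names, and the assumption that the response body is JSON are illustrative, not documented behavior.

```python
import json
import os
import urllib.request

# Endpoint taken from the curl example on this page.
API_URL = "https://api.vynfi.com/v1/generate/quick"


def build_quick_request(api_key, preset, table, row_count, fmt="json"):
    """Build (but do not send) a POST request for one table."""
    payload = {
        "preset": preset,
        "tables": [table],
        "rows": {table: row_count},
        "format": fmt,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=60):
    """Send the request. Assumes (not confirmed here) a JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# VYNFI_API_KEY is a hypothetical variable name; the fallback is a placeholder.
request = build_quick_request(
    os.environ.get("VYNFI_API_KEY", "vf_live_..."),
    preset="retail_small",
    table="journal_entries",
    row_count=1000,
)
# result = send(request)  # uncomment once a real key is set
```

Building the request separately from sending it makes the payload easy to inspect or log before any credits are spent.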
Integrate in Minutes
First-class SDKs for your stack. Or just use curl.
curl https://api.vynfi.com/v1/generate/quick \
  -H "Authorization: Bearer vf_live_7mN4kP2x..." \
  -H "Content-Type: application/json" \
  -d '{
    "preset": "retail_small",
    "tables": ["journal_entries"],
    "rows": { "journal_entries": 1000 },
    "format": "json"
  }'
Built for Every Use Case
Audit Testing
Generate realistic journal entries with known anomalies for audit analytics testing. Calibrated to real-world distributions.
Fintech Development
Build and test financial applications with production-quality synthetic data. No real customer data exposure.
Academic Research
Create large-scale datasets for fraud detection, process mining, and financial ML research.
Compliance Validation
Test SOX, Basel III, and IFRS workflows with realistic synthetic data. Full COSO control mappings and evaluation reports.
ESG & Sustainability
Test CSRD/TCFD reporting pipelines with Scope 1/2 emissions from production data, workforce diversity from HR, and pay equity analysis.
Enterprise Audit Capabilities
Big 4 audit methodologies, group audit simulation, and complete audit data generation.
Big 4 Audit Methodologies
Four integrated methodologies with 728–757 steps per blueprint: KPMG Clara, PwC Aura, Deloitte Omnia, and EY GAM.
Group Audit (ISA 600)
Component auditor simulation with Significant, Non-Significant, and Not-in-Scope classification and consolidated reporting.
14 Audit Data Types
From journal entries to board minutes — complete audit data generation including IT reports, management packs, and regulatory filings.
Process Mining Exports
Disco, Celonis IBC, XES 2.0, and OCEL 2.0 format support for process mining research and tooling.
Banking & AML Data, Done Right
14 fully-implemented money laundering typologies, multi-party criminal networks, cross-layer fraud propagation from payments to bank transactions, and 10 evaluators to prove it.
14 AML Typologies
Including structuring, smurfing, mule chains, synthetic identity, trade-based money laundering, crypto integration, sanctions evasion, romance scam, and casino & real-estate integration, with ground-truth labels on every suspicious transaction.
Multi-Party Networks
Barabási-Albert preferential-attachment topology produces power-law degree distributions — one coordinator + 5-25 smurfs, mule chains with recruiter/middleman/cash-out roles, shell company pyramids.
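To see why preferential attachment yields a few heavily connected hubs, here is a minimal textbook Barabási-Albert sketch in Python. It is not the DataSynth implementation; the function names and parameters are illustrative assumptions.

```python
import random
from collections import Counter


def barabasi_albert_edges(n, m, seed=None):
    """Grow a graph of n nodes where each new node attaches to m existing
    nodes, chosen with probability proportional to their current degree."""
    if not 1 <= m < n:
        raise ValueError("need 1 <= m < n")
    rng = random.Random(seed)
    edges = []
    targets = list(range(m))  # the first new node links to nodes 0..m-1
    endpoint_pool = []        # each node appears once per edge endpoint
    for new_node in range(m, n):
        edges.extend((new_node, target) for target in targets)
        endpoint_pool.extend(targets)
        endpoint_pool.extend([new_node] * m)
        # Sampling uniformly from the endpoint pool is sampling by degree.
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(endpoint_pool))
        targets = list(chosen)
    return edges


def degree_counts(edges):
    """Count how many edges touch each node."""
    degrees = Counter()
    for a, b in edges:
        degrees[a] += 1
        degrees[b] += 1
    return degrees


edges = barabasi_albert_edges(n=2000, m=2, seed=7)
degrees = degree_counts(edges)
mean_degree = sum(degrees.values()) / len(degrees)
print("mean degree:", round(mean_degree, 2), "max degree:", max(degrees.values()))
```

Early nodes accumulate links fastest, which is the shape a coordinator or recruiter account would take in such a network.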
Velocity & Device Features
Per-transaction rolling-window features (1h/24h/7d/30d counts, unique counterparties, amount z-scores) and realistic power-law device fingerprint distributions pre-computed for your ML pipeline.
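As an illustration of what a rolling-window count means, the sketch below computes, for each transaction of one account, how many of that account's transactions fall in the trailing 1h/24h/7d/30d window. The function name and output keys are hypothetical; the actual feature schema is not shown on this page.

```python
from datetime import datetime, timedelta

# Window widths named in the feature description above.
WINDOWS = {
    "1h": timedelta(hours=1),
    "24h": timedelta(hours=24),
    "7d": timedelta(days=7),
    "30d": timedelta(days=30),
}


def velocity_counts(timestamps):
    """For each timestamp (sorted ascending, one account), count transactions
    in the half-open window (ts - width, ts], including the current one."""
    starts = {name: 0 for name in WINDOWS}  # left edge index per window
    features = []
    for i, ts in enumerate(timestamps):
        row = {}
        for name, width in WINDOWS.items():
            # Advance the left edge past transactions that fell out of range.
            while timestamps[starts[name]] <= ts - width:
                starts[name] += 1
            row[f"count_{name}"] = i - starts[name] + 1
        features.append(row)
    return features


base = datetime(2024, 1, 1, 9, 0)
stamps = [
    base,
    base + timedelta(minutes=30),
    base + timedelta(hours=2),
    base + timedelta(days=3),
]
rows = velocity_counts(stamps)
```

Each window keeps its own left-edge pointer, so the whole pass is linear in the number of transactions rather than quadratic.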
Cross-Layer Coherence
A fraudulent vendor payment now shows up in document flow, journal entries, AND on both sides of a mirrored bank transaction pair — ≥95% fraud-label propagation guaranteed.
TB-Scale Without the Disk Hell
Generated data streams straight to Azure Blob with short-lived SAS downloads — or bring your own storage and keep zero bytes on VynFi infrastructure. For live pipelines, rate-controlled NDJSON streaming delivers events at up to 10,000/sec.
Managed Azure Blob
All tiers get managed blob output with lifecycle retention (7d Free → 365d Scale). Per-file SAS URLs give clients direct blob access — no API proxy, no 2 GB cap, no OOM kills.
BYO Storage (Team+)
Supply a container SAS URL at job-submit and the worker uploads directly into your own data lake. Zero bytes transit our storage. Enterprise customers pair this with Private Link for airgapped workflows.
NDJSON Live Streaming (Scale+)
GET /v1/jobs/{id}/stream/ndjson emits self-describing envelopes with token-bucket rate-limiting and periodic progress events. Point Kafka, Spark, or ClickHouse at it and ingest live.
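For readers unfamiliar with token-bucket rate limiting, here is a generic sketch of the algorithm in Python. It shows the general technique only; the class name, parameters, and injectable clock are assumptions and say nothing about how the VynFi server is built.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity` events, refilled at `rate` tokens/sec."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock          # injectable so behavior can be tested
        self.tokens = float(capacity)
        self.last = clock()

    def try_acquire(self, n=1):
        """Return True and spend n tokens if available, else return False."""
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= n:
            self.tokens -= n
            return True
        return False


# Deterministic demo with a fake clock: 10 events/sec, burst of 5.
fake_now = [0.0]
bucket = TokenBucket(rate=10, capacity=5, clock=lambda: fake_now[0])
burst = [bucket.try_acquire() for _ in range(6)]   # sixth call is refused
fake_now[0] = 0.5                                  # half a second passes
after_wait = [bucket.try_acquire() for _ in range(6)]
```

The bucket permits short bursts while holding the long-run average at the configured rate, which suits a stream that a consumer may read unevenly.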
Financial Coherence Engine
Every number connects. From raw journal entries to audited financial statements, VynFi generates data that passes your reconciliation and audit tests.
Full Financial Statements
Complete balance sheet, income statement, cash flow, and equity rollforward — generated from actual journal entry data, not templates.
Manufacturing Cost Flow
Multi-stage WIP → Finished Goods → COGS pipeline with standard cost variance accounting and IAS 37 warranty provisions.
Treasury & Hedge Accounting
Debt interest accrual, cash flow and fair value hedge mark-to-market, cash pool sweeps, and covenant compliance evaluation.
Tax from Real GL
Tax provision computed from actual pre-tax income. VAT posting from source documents. Deferred tax with temporary difference tracking.
XBRL 2.1 Export
Instance documents mapped to US GAAP and IFRS taxonomies. Test your regulatory filing pipeline with realistic synthetic data.
32+ Coherence Validators
FG rollforward, WIP rollforward, trial balance proof, cash flow reconciliation, equity rollforward, segment-to-consolidated, IC elimination.
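The simplest of these checks, a trial balance proof, can be sketched as follows: every journal entry must net to zero, so total debits equal total credits. The line layout (entry_id, debit, credit as decimal strings) is a hypothetical shape for illustration, not the actual export schema.

```python
from collections import defaultdict
from decimal import Decimal


def unbalanced_entries(lines):
    """Return the ids of journal entries whose debits and credits differ."""
    net = defaultdict(Decimal)  # Decimal() is 0, so each entry starts at zero
    for line in lines:
        net[line["entry_id"]] += Decimal(line["debit"]) - Decimal(line["credit"])
    return sorted(entry_id for entry_id, value in net.items() if value != 0)


def trial_balance_proves(lines):
    """True when total debits equal total credits across all lines."""
    total_debit = sum(Decimal(line["debit"]) for line in lines)
    total_credit = sum(Decimal(line["credit"]) for line in lines)
    return total_debit == total_credit


# Hypothetical sample: JE-1 balances, JE-2 is off by 0.01.
sample = [
    {"entry_id": "JE-1", "account": "1200", "debit": "250.00", "credit": "0"},
    {"entry_id": "JE-1", "account": "4000", "debit": "0", "credit": "250.00"},
    {"entry_id": "JE-2", "account": "5000", "debit": "99.99", "credit": "0"},
    {"entry_id": "JE-2", "account": "2100", "debit": "0", "credit": "100.00"},
]
```

Using Decimal instead of float avoids rounding noise that would otherwise flag balanced entries as broken.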
Quality by Design
Statistical validation built into the generation engine
Benford MAD
Mean Absolute Deviation (MAD) between observed first-digit frequencies and the Benford distribution. Rated 'close conformity', the top band in Nigrini's criteria.
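The metric itself is straightforward to reproduce on any amount column. A minimal sketch, assuming a plain list of non-zero numeric amounts; the function names are illustrative:

```python
import math
from collections import Counter
from decimal import Decimal

# Benford's expected first-digit frequencies: log10(1 + 1/d).
BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(value):
    """Leading non-zero digit of a non-zero number."""
    digits = Decimal(str(abs(value))).as_tuple().digits
    for digit in digits:
        if digit != 0:
            return digit
    raise ValueError("zero has no leading digit")


def benford_mad(values):
    """Mean absolute deviation between observed and expected frequencies."""
    counts = Counter(first_digit(v) for v in values)
    total = sum(counts.values())
    return sum(abs(counts[d] / total - BENFORD[d]) for d in range(1, 10)) / 9


# Amounts with uniformly spread leading digits deviate strongly from Benford.
uniform_digits = [d * 100 for d in range(1, 10)] * 50
print(round(benford_mad(uniform_digits), 4))
```

Lower values mean closer agreement with Benford's law; a column of evenly spread leading digits, as in the usage line, scores far worse than naturally occurring amounts.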
F1 Score Delta
Target: ML fraud detectors trained on synthetic data score within 3% F1 of real-data baselines.
Copula Families
Gaussian, Clayton, Gumbel, Frank, and Student-t copulas model complex inter-variable dependencies.
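As an illustration of the idea, the sketch below uses the simplest of these families, a Gaussian copula, to couple two variables while leaving each free to take its own marginal distribution. It is a generic example, not the engine's code; the exponential marginals and all names are assumptions.

```python
import math
import random
from statistics import NormalDist


def gaussian_copula_pairs(n, rho, seed=None):
    """Draw n pairs (u1, u2) of uniforms coupled by a Gaussian copula."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        # z2 is standard normal with correlation rho to z1.
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        pairs.append((std_normal.cdf(z1), std_normal.cdf(z2)))
    return pairs


def exponential_quantile(u, mean):
    """Inverse CDF of an exponential; clamps u to keep the log finite."""
    return -mean * math.log1p(-min(u, 1.0 - 1e-12))


def pearson(xs, ys):
    """Plain Pearson correlation, to inspect the induced dependence."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


pairs = gaussian_copula_pairs(5000, rho=0.8, seed=1)
# Map the coupled uniforms onto two hypothetical skewed marginals.
amounts = [exponential_quantile(u1, mean=500.0) for u1, _ in pairs]
delays = [exponential_quantile(u2, mean=30.0) for _, u2 in pairs]
print(round(pearson(amounts, delays), 2))
```

Separating the dependence structure from the marginals is what lets a generator keep realistic skewed amounts while still controlling how strongly columns move together.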
Anomaly Types
Spanning 5 categories: timing, amount, relationship, pattern, and structural anomalies with ground-truth labels.
Built on Real-World Research
The DataSynth engine was calibrated against 155 real-world datasets, encompassing 364 million journal entries and 2.4 billion line items across industries and geographies.
Real-World Datasets
Analyzed for distribution calibration and statistical benchmarking across 10 industry sectors.
Journal Entries
In the calibration corpus used to derive realistic financial patterns and temporal dynamics.
Line Items
Processed to build inter-table correlation models and cross-entity relationship graphs.
41 Country Packs
Localized tax, banking, and accounting standards for realistic regional data
Each pack includes locale configuration, multi-cultural naming, regional holidays, tax frameworks, banking standards, and accounting frameworks.
Simple, Transparent Pricing
Start free. Scale as you grow.
Powered by the DataSynth engine — a purpose-built Rust engine with 16 crates and counting.
Ready to generate your first dataset?
10,000 credits free every month. No credit card required.
You scrolled all the way down. We respect that.