ISO 21378: the audit-data classification standard you didn't know your data already speaks
VynFi datasets carry the ISO 21378 L1/L2/L3 audit-data classification on every GL account + journal entry. Here's why that matters for SAF-T, FEC, GoBD, and Big-4 audit-software imports.
Every general-ledger record an auditor receives speaks at least three different dialects at once. The chart-of-accounts numbering follows the firm's local convention. The export format follows whatever the regulator demanded that year (SAF-T in Portugal, FEC in France, GoBD in Germany, IRS-820 in the US, no specific format in dozens of jurisdictions). The audit-tool import wants its own classification overlay so it can route 'cash' rows to the cash-procedure pipeline and 'revenue' rows to the revenue-recognition pipeline. None of those three layers knows about the others.
**TL;DR** — ISO 21378 is the global standard that classifies every audit-data row into a stable Level-1 / Level-2 / Level-3 hierarchy, so audit tools can route data by category instead of fighting through 200 firm-specific COA conventions. As of DataSynth 5.6.0, every VynFi-generated `GLAccount` and every row of `journal_entries.csv` carries the ISO 21378 codes. Imports into Caseware Cloud, IDEA, ACL, MindBridge, and the Big-4-internal data-analytic platforms work without manual mapping.
The audit-data Tower of Babel
Ask three audit firms how to model 'trade receivables' and you'll get four answers. One firm carries it as a single account in the COA (1100). Another splits it into trade-receivables-domestic (1110) and trade-receivables-foreign (1120). A third runs ten sub-accounts indexed by customer credit risk, with the actual classification buried in the description field. The data is the same — what every auditor cares about is the same — but the schema is different every time. And every audit firm has its own internal taxonomy on top.
Regulators add another layer. Germany's GoBD format asks for a specific export structure. France's FEC has its own column order. Portugal's SAF-T is XML with a tightly prescribed nesting. The US IRS in §6001 asks for 'records sufficient to establish the amount of gross income, deductions, and credits' and leaves the structure to the taxpayer. None of these regulator formats has a stable cross-jurisdiction taxonomy. They're shapes, not classifications.
Audit-tool vendors have spent the last two decades building proprietary classification overlays so that, regardless of how the firm's COA is structured, their tool can route 'cash and equivalents' rows into the cash-recon pipeline. Caseware Cloud has a classification engine. IDEA has one. ACL/Galvanize has one. MindBridge has one. Each Big 4 firm has a proprietary one (EY Helix, KPMG Clara Analytics, PwC Halo, Deloitte Argus). They overlap heavily but they don't cleanly map to each other. When a Big 4 firm acquires a mid-tier firm, the two classification systems have to be reconciled — usually by hand.
ISO 21378 in 60 seconds
ISO 21378:2019 (Audit data collection) provides a stable global classification for general-ledger and journal-entry data. The classification has three levels:
- **Level 1 — AdcType.** 5 top-level categories: `Asset`, `Liability`, `Equity`, `Revenue`, `Expense`. (Some implementations split Expense into `CostOfGoodsSold` and `OperatingExpense`; the standard permits this as an internal refinement.)
- **Level 2 — AdcClass.** 28 mid-level classes that refine each L1. For Asset: `CashAndCashEquivalents`, `TradeReceivables`, `OtherReceivables`, `Inventory`, `PrepaidExpenses`, `PropertyPlantAndEquipment`, `IntangibleAssets`, `InvestmentsLongTerm`, `OtherAssets`. For Liability: `TradePayables`, `AccruedExpenses`, `ShortTermDebt`, `LongTermDebt`, `IncomeTaxPayable`, `OtherLiabilities`. For Equity: `ShareCapital`, `RetainedEarnings`, `OtherEquity`. For Revenue: `OperatingRevenue`, `OtherIncome`. For Expense: `CostOfSales`, `Wages`, `Rent`, `Depreciation`, `Interest`, `IncomeTax`, `OtherExpense`. (28 distinct classes total across the 5 L1 categories.)
- **Level 3 — AdcSubClass.** 45 sub-classes for further refinement. `CashAndCashEquivalents` decomposes into `Cash`, `BankCurrentAccount`, `BankSavingsAccount`, `MoneyMarketFund`, `RestrictedCash`. `TradeReceivables` decomposes into `TradeReceivablesDomestic`, `TradeReceivablesForeign`, `TradeReceivablesIntercompany`, `AllowanceForDoubtfulAccounts`. And so on across all 28 L2 classes.
The L1/L2/L3 codes are stable across jurisdictions, GAAPs, firm conventions, and audit-tool vendors. A row classified `(Asset, CashAndCashEquivalents, BankCurrentAccount)` in a Portuguese SAF-T file means the same thing as a row classified `(Asset, CashAndCashEquivalents, BankCurrentAccount)` in a German GoBD file or a French FEC file. The audit tool ingests by L2 class, not by COA number.
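The hierarchy can be sketched as a pair of lookup tables. This is a minimal illustration, not the standard's own schema or VynFi's implementation: it uses only the class names listed above, fills in L3 sub-classes for just the two L2 classes this section spells out, and the names `ADC_CLASSES`, `ADC_SUB_CLASSES`, `L1_BY_CLASS`, and `classify` are hypothetical.

```python
from __future__ import annotations

# L1 type -> L2 classes, as listed in this article.
# The dictionary layout is an assumption made for this sketch.
ADC_CLASSES = {
    "Asset": [
        "CashAndCashEquivalents", "TradeReceivables", "OtherReceivables",
        "Inventory", "PrepaidExpenses", "PropertyPlantAndEquipment",
        "IntangibleAssets", "InvestmentsLongTerm", "OtherAssets",
    ],
    "Liability": [
        "TradePayables", "AccruedExpenses", "ShortTermDebt",
        "LongTermDebt", "IncomeTaxPayable", "OtherLiabilities",
    ],
    "Equity": ["ShareCapital", "RetainedEarnings", "OtherEquity"],
    "Revenue": ["OperatingRevenue", "OtherIncome"],
    "Expense": [
        "CostOfSales", "Wages", "Rent", "Depreciation",
        "Interest", "IncomeTax", "OtherExpense",
    ],
}

# L2 class -> L3 sub-classes, only for the two classes spelled out above.
ADC_SUB_CLASSES = {
    "CashAndCashEquivalents": [
        "Cash", "BankCurrentAccount", "BankSavingsAccount",
        "MoneyMarketFund", "RestrictedCash",
    ],
    "TradeReceivables": [
        "TradeReceivablesDomestic", "TradeReceivablesForeign",
        "TradeReceivablesIntercompany", "AllowanceForDoubtfulAccounts",
    ],
}

# Reverse index: L2 class -> L1 type.
L1_BY_CLASS = {l2: l1 for l1, classes in ADC_CLASSES.items() for l2 in classes}


def classify(l2_class: str, l3_sub_class: str) -> tuple[str, str, str]:
    """Return the (L1, L2, L3) triple, rejecting inconsistent pairs."""
    if l2_class not in L1_BY_CLASS:
        raise ValueError(f"unknown L2 class: {l2_class}")
    known = ADC_SUB_CLASSES.get(l2_class)
    # L3 is only checked where this sketch knows the sub-class list.
    if known is not None and l3_sub_class not in known:
        raise ValueError(f"{l3_sub_class} is not a sub-class of {l2_class}")
    return (L1_BY_CLASS[l2_class], l2_class, l3_sub_class)


triple = classify("CashAndCashEquivalents", "BankCurrentAccount")
```

Because the L1 type is derivable from the L2 class, a consumer only needs the L2 and L3 codes on each row to recover the full triple.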
How VynFi maps it: every account, every JE row
DataSynth 5.6.0 (the engine VynFi is built on) made ISO 21378 codes a first-class property of the chart-of-accounts model. Every `GLAccount` record now carries four ISO fields:
```jsonc
{
  "account_id": "1100",
  "account_name": "Trade Receivables — Domestic",
  "account_class": "TradeReceivables",              // ISO 21378 L2
  "account_class_name": "TradeReceivables",         // Display name (= L2)
  "account_sub_class": "TradeReceivablesDomestic",  // ISO 21378 L3
  "account_sub_class_name": "TradeReceivablesDomestic"
}
```

The L2 code (`account_class`) is the load-bearing field for audit-tool routing. Every audit tool that supports ISO 21378 ingests by L2. The L3 code (`account_sub_class`) refines for procedures that need finer granularity (e.g., the receivables-aging procedure cares about domestic vs foreign vs intercompany, so it routes by L3).
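On the consuming side, a record like this could be modelled as a small typed structure. The sketch below assumes the six keys shown in the record are the complete set; the `GLAccount` dataclass here and its `routing_key` method are hypothetical names for illustration, not a VynFi client API.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class GLAccount:
    account_id: str
    account_name: str
    account_class: str           # ISO 21378 L2
    account_class_name: str      # display name for L2
    account_sub_class: str       # ISO 21378 L3
    account_sub_class_name: str  # display name for L3

    def routing_key(self, fine_grained: bool = False) -> str:
        """L2 by default; L3 for procedures that need finer granularity."""
        return self.account_sub_class if fine_grained else self.account_class


# Same fields as the record above, as plain JSON (no comments).
raw = """{
  "account_id": "1100",
  "account_name": "Trade Receivables - Domestic",
  "account_class": "TradeReceivables",
  "account_class_name": "TradeReceivables",
  "account_sub_class": "TradeReceivablesDomestic",
  "account_sub_class_name": "TradeReceivablesDomestic"
}"""

account = GLAccount(**json.loads(raw))
coarse = account.routing_key()                  # route by L2
fine = account.routing_key(fine_grained=True)   # route by L3
```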
Every row of `journal_entries.csv` then carries the ISO codes denormalised onto the line, so that audit tools can route by ISO class without joining to the COA file:
- `account_class` — Level-2 ADC class (string, one of 28).
- `account_class_name` — display name for L2.
- `account_sub_class` — Level-3 ADC sub-class (string, one of 45).
- `account_sub_class_name` — display name for L3.
The denormalisation is deliberate. Audit tools running on millions of journal-entry rows shouldn't have to join to the chart-of-accounts file to figure out which classification pipeline to route a row through. The four extra columns add a few percent to the file size; the avoided join saves an order of magnitude in import time.
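A single-pass router over the denormalised file might look like the following. It is a sketch under stated assumptions: the four ISO columns are the ones described above, while `je_id`, `account_id`, and `amount` are hypothetical stand-ins for whatever other columns the real `journal_entries.csv` carries, and `route_by_class` is an illustrative name.

```python
from __future__ import annotations

import csv
import io
from collections import defaultdict

# Tiny stand-in for journal_entries.csv. Only the four ISO columns come
# from the article; je_id, account_id and amount are made up for the demo.
SAMPLE = """\
je_id,account_id,amount,account_class,account_class_name,account_sub_class,account_sub_class_name
JE-1,1020,1500.00,CashAndCashEquivalents,CashAndCashEquivalents,BankCurrentAccount,BankCurrentAccount
JE-1,1100,-1500.00,TradeReceivables,TradeReceivables,TradeReceivablesDomestic,TradeReceivablesDomestic
JE-2,1050,200.00,CashAndCashEquivalents,CashAndCashEquivalents,RestrictedCash,RestrictedCash
"""


def route_by_class(csv_file) -> dict[str, list[dict]]:
    """Bucket journal-entry rows by L2 class in one pass, with no COA join."""
    pipelines = defaultdict(list)
    for row in csv.DictReader(csv_file):
        pipelines[row["account_class"]].append(row)
    return dict(pipelines)


pipelines = route_by_class(io.StringIO(SAMPLE))
```

Each row carries its own routing key, so the loop never has to look anything up in the chart-of-accounts file.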
**Breaking change in DS 5.6.0:** the `account_class` field used to be a single first-digit category (Asset = `1`, Liability = `2`, etc.). It is now the ISO 21378 Level-2 string code (`TradeReceivables`, `CashAndCashEquivalents`, ...). Downstream consumers reading the field as a single character will need to update. The `account_class_name` field is the same value, retained for explicit display use.
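A downstream consumer could guard against mixing the two formats with a check like this one. It is a minimal sketch: the helper names are hypothetical, and the only assumption about the legacy format is the one stated above, namely that the old value was a single digit.

```python
def is_legacy_account_class(value: str) -> bool:
    """True for the pre-5.6.0 single-digit category, e.g. "1" or "2"."""
    return len(value) == 1 and value.isdigit()


def read_account_class(value: str) -> str:
    """Return the L2 string code, failing loudly on legacy data."""
    if is_legacy_account_class(value):
        raise ValueError(
            f"legacy single-character account_class {value!r}; "
            "expected an ISO 21378 Level-2 string code"
        )
    return value
```

Failing loudly is a design choice for this sketch; silently accepting a legacy digit would send rows to the wrong pipeline.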
Worked example: a US retail chain's chart of accounts
Concrete walkthrough. A US retail chain has a 47-account COA. Six of those accounts, five under traditional 'cash and equivalents' plus the customer receivables account, get classified by VynFi like this:
```json
[
  { "account_id": "1010", "account_name": "Petty Cash",
    "account_class": "CashAndCashEquivalents", "account_sub_class": "Cash" },
  { "account_id": "1020", "account_name": "Operating Cash — JPMorgan Chase",
    "account_class": "CashAndCashEquivalents", "account_sub_class": "BankCurrentAccount" },
  { "account_id": "1030", "account_name": "Operating Cash — Wells Fargo",
    "account_class": "CashAndCashEquivalents", "account_sub_class": "BankCurrentAccount" },
  { "account_id": "1040", "account_name": "Money Market Reserve",
    "account_class": "CashAndCashEquivalents", "account_sub_class": "MoneyMarketFund" },
  { "account_id": "1050", "account_name": "Restricted Cash — Lease Reserve",
    "account_class": "CashAndCashEquivalents", "account_sub_class": "RestrictedCash" },
  { "account_id": "1100", "account_name": "Accounts Receivable — Customers",
    "account_class": "TradeReceivables", "account_sub_class": "TradeReceivablesDomestic" }
]
```

An audit tool routing by L2 sees five rows under `CashAndCashEquivalents` and one row under `TradeReceivables`. The cash-confirmation procedure runs against the five cash rows, even though the operating accounts sit at different banks and one balance is restricted, without needing to know the firm's COA conventions. The cash-disclosure procedure further routes by L3 to separate restricted cash (IAS 7 ¶48) from unrestricted cash. The same data file ingested into IDEA, ACL, or MindBridge gives the same procedure routing.
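The two routing steps can be reproduced in a few lines. This sketch uses the six accounts from the example, reduced to `(account_id, L2, L3)` tuples; the variable names are illustrative and nothing here reflects how any particular audit tool implements its routing.

```python
from collections import defaultdict

# (account_id, L2 class, L3 sub-class) for the six accounts in the example.
ACCOUNTS = [
    ("1010", "CashAndCashEquivalents", "Cash"),
    ("1020", "CashAndCashEquivalents", "BankCurrentAccount"),
    ("1030", "CashAndCashEquivalents", "BankCurrentAccount"),
    ("1040", "CashAndCashEquivalents", "MoneyMarketFund"),
    ("1050", "CashAndCashEquivalents", "RestrictedCash"),
    ("1100", "TradeReceivables", "TradeReceivablesDomestic"),
]

# Step 1: route by L2, as the cash-confirmation procedure would.
by_class = defaultdict(list)
for account_id, l2, _l3 in ACCOUNTS:
    by_class[l2].append(account_id)

# Step 2: within cash, route by L3 to split restricted from unrestricted.
cash_by_sub_class = defaultdict(list)
for account_id, l2, l3 in ACCOUNTS:
    if l2 == "CashAndCashEquivalents":
        cash_by_sub_class[l3].append(account_id)

restricted = cash_by_sub_class["RestrictedCash"]
unrestricted = [
    account_id
    for l3, ids in cash_by_sub_class.items()
    if l3 != "RestrictedCash"
    for account_id in ids
]
```

Neither step reads `account_id` to decide where a row goes; the account numbers are carried along only as payload.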
What this unlocks
- **Caseware Cloud SAF-T import** — Caseware Cloud's SAF-T ingester reads the ISO 21378 fields directly. A VynFi-generated dataset imports without manual COA mapping. Engagement teams can run a synthetic data + procedure walkthrough end-to-end in their existing Caseware environment.
- **Big 4 audit-tool data ingestion** — EY Helix, KPMG Clara Analytics, PwC Halo, and Deloitte Argus all support ISO 21378 ingestion to varying degrees. Engagement teams testing new procedures or training new auditors can use VynFi datasets without writing per-COA mapping logic.
- **Regulatory submission pipelines** — German GoBD, French FEC, Portuguese SAF-T, and IRS-820 all derive from chart-of-accounts data. The ISO 21378 fields make it straightforward to build the regulator-specific export from the same source data, since each export format has documented mappings from ISO 21378 codes to the format's own field structure.
- **Cross-firm data exchange** — when a Big 4 firm acquires a mid-tier firm (or a CFO migrates from one firm's audit relationship to another), the data exchange becomes ISO-classified rather than firm-specific. The acquiring firm's tools can ingest the acquired firm's historical data without manual reclassification.
- **ML training data with stable labels** — synthetic-data ML training pipelines can use the L2 / L3 codes as stable target labels. A model trained on `(account_class = TradeReceivables)` learns a feature that generalises across COAs; one trained on `(account_id = 1100)` learns the firm's COA convention.
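The stable-label point from the last bullet can be illustrated with a tiny label encoder. This is a sketch only: the two firms and their account numbers are hypothetical, and `build_label_index` is an illustrative helper rather than part of any VynFi tooling.

```python
def build_label_index(classes):
    """Map each distinct L2 code to an integer; sorted so the mapping is reproducible."""
    return {code: i for i, code in enumerate(sorted(set(classes)))}


# Two hypothetical firms that number the same accounts differently.
firm_a = [
    {"account_id": "1100", "account_class": "TradeReceivables"},
    {"account_id": "1000", "account_class": "CashAndCashEquivalents"},
]
firm_b = [
    {"account_id": "1110", "account_class": "TradeReceivables"},
    {"account_id": "1010", "account_class": "CashAndCashEquivalents"},
]

label_index = build_label_index(r["account_class"] for r in firm_a + firm_b)

# Labels derived from account_class line up across firms,
# even though no account_id is shared between them.
labels_a = [label_index[r["account_class"]] for r in firm_a]
labels_b = [label_index[r["account_class"]] for r in firm_b]
```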
What it doesn't replace
ISO 21378 is the data-exchange classification. It is not the audit-procedure layer. The audit work — risk assessment, materiality determination, control testing, substantive procedures, audit-trail capture, opinion formulation — is unchanged by the classification standard. ISA 600 group-audit coordination, IFRS 3 / IFRS 10 consolidation mechanics, the Bayesian RMM scoring, the L4 audit graph: those are jurisdictional / standards / methodology layers. ISO 21378 sits below all of them, making the underlying data interoperable.
Concretely: ISO 21378 says nothing about whether a particular receivable should be impaired, whether a particular control is effective, or whether the going-concern assumption holds. It says only what category each balance / transaction falls into. The audit conclusions are the auditor's; the classification is the data's. We were careful to keep that boundary clean — the ISO codes are derived deterministically from the COA structure, never inferred from auditor judgement.
How to use it
If you're already a VynFi customer: the ISO 21378 fields are emitted by default in every dataset generated after May 2026. No config change required. Existing analytics that read `account_class` as a single character will need to be updated to read it as a string code (the breaking change noted above). Datasets generated before May 2026 are unaffected — they ship the old single-character format.
If you're new to VynFi: every public reference dataset on Hugging Face (`vynfi-journal-entries-1m`, `vynfi-aml-100k`, `vynfi-audit-p2p`, `vynfi-group-audit-enterprise-2000`, etc.) ships with the ISO 21378 fields populated. Load any dataset, check the `account_class` and `account_sub_class` columns, and you've got a working ISO-classified ledger to test your audit-tool ingestion against — no API key required.
Background reading: the Audit Methodology Library, an open Apache-2.0 reference covering Big4 spines, jurisdictional overlays, ISA 600, CSRD, and KYC blueprints. The Group Audit landing page covers the IFRS 3 / 10 / 28 / 21 / 29 / 36 + ISA 600 surface that consumes the ISO-classified data. The full ISO 21378:2019 standard is available from ISO directly (iso.org); the Apache-2.0 reference catalog at /audit-methodology cross-references the relevant ISO clauses where they affect VynFi blueprints.