Introduction

The VeriSynth CLI provides a simple, reproducible way to generate privacy-safe synthetic datasets from the command line. It’s ideal for:
  • Data scientists who want to quickly synthesize CSV datasets
  • Privacy engineers who need deterministic, verifiable runs
  • Developers integrating VeriSynth into pipelines, notebooks, or CI/CD

Why a CLI?

VeriSynth is designed to run locally, with no external dependencies.
The CLI lets you:
  • Generate synthetic datasets in seconds
  • Produce cryptographic proof receipts automatically
  • Reproduce any previous run using deterministic seeding
  • Integrate into any pipeline (Bash, Airflow, CI/CD, notebooks, etc.)
You own the data: VeriSynth never sends it to the cloud.
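
For pipeline integration, one possible approach is to wrap the command in a small Python helper. The sketch below is illustrative and not part of VeriSynth: the function names `build_verisynth_command` and `run_verisynth` are hypothetical, it assumes the `verisynth` executable is on the PATH, and it uses only the flags that appear in the Quick Example on this page (`-o`, `--rows`, `--seed`).

```python
import subprocess
from typing import List


def build_verisynth_command(input_csv: str, out_dir: str, rows: int, seed: int) -> List[str]:
    """Build the argument list for one VeriSynth run (hypothetical helper).

    Only the flags shown in the Quick Example are used here.
    """
    return [
        "verisynth", input_csv,
        "-o", out_dir,
        "--rows", str(rows),
        "--seed", str(seed),
    ]


def run_verisynth(input_csv: str, out_dir: str, rows: int, seed: int) -> subprocess.CompletedProcess:
    """Run the CLI and return the completed process (hypothetical helper).

    Assumes `verisynth` is installed and on the PATH.
    """
    cmd = build_verisynth_command(input_csv, out_dir, rows, seed)
    # check=True raises CalledProcessError on a non-zero exit code,
    # so a failing run stops the surrounding pipeline step.
    return subprocess.run(cmd, check=True, capture_output=True, text=True)


if __name__ == "__main__":
    result = run_verisynth("data/patients.csv", "out/", rows=1_000_000, seed=42)
    print(result.stdout)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with paths that contain spaces. Keeping the seed as an explicit parameter makes it easy to record it alongside the run.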

Quick Example

verisynth data/patients.csv -o out/ --rows 1000000 --seed 42
Output:
📁 out/synthetic.csv   # Synthetic dataset
🧾 out/proof.json      # Cryptographic proof receipt
Result summary:
VeriSynth — Synthetic Data Report
========================================
Input: data/patients.csv
Output: out/synthetic.csv
Engine: GaussianCopula | Seed: 42
Fidelity: corr Δ=0.23 | Privacy risk=0.0
Proof: out/proof.json (Merkle verified)
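
To check that two runs with the same seed produced the same dataset, one option is to compare file checksums. The sketch below is a generic SHA-256 comparison, not VeriSynth's own verification mechanism: the helper names `sha256_of_file` and `same_output` are hypothetical, and it assumes that a reproduced run writes a byte-identical `synthetic.csv`. It does not inspect `proof.json`, whose format is not described here.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large datasets do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_output(first_path: str, second_path: str) -> bool:
    """Return True if both files have identical contents (by SHA-256)."""
    return sha256_of_file(first_path) == sha256_of_file(second_path)
```

For example, after running the Quick Example command twice with `--seed 42` into two different output directories (say `out_a/` and `out_b/`, names chosen for illustration), `same_output("out_a/synthetic.csv", "out_b/synthetic.csv")` would be expected to return `True` under the assumption above.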