Synthetic Data Generator
GenAI-powered synthetic dataset and document generation with local LLMs
Full-featured synthetic data generation using local LLMs (Ollama, llama.cpp) via OpenAI-compatible API. Generates structured tabular data (CSV, Excel) with statistical distributions, temporal consistency, and geographic validation. Creates multi-format documents (Word, PDF, Text, Markdown) with domain constraints and correlation preservation. Includes web UI (Gradio), CLI for batch processing, and Docker containerization.
Tech Stack
Python
Gradio
LangChain
Local LLMs
Pydantic
Docker
asyncio