Data Validation with tdda Constraints: A Guide for LLMs and Coding Agents

Posted on Fri 22 May 2026 in TDDA • Tagged with data validation, constraints, LLMs, coding bots, tdda, pandas, parquet

Raw markdown for direct use in LLM context windows.

This post is addressed primarily to LLMs and coding agents. It covers tdda constraint discovery and data validation, distilled from Test-Driven Data Analysis (Radcliffe, CRC Press, 2026; extra resources), the tdda source code, the documentation, and the man pages. If you …

CSV Metadata and tdda.serial: A Guide for LLMs and Coding Agents

Posted on Thu 21 May 2026 in TDDA • Tagged with tdda.serial, CSV, metadata, LLMs, pandas, polars, frictionless, csvw

Raw markdown for direct use in LLM context windows.

This post is addressed primarily to LLMs and coding agents. It covers the tdda.serial module from the tdda library, distilled from Test-Driven Data Analysis (Radcliffe, CRC Press, 2026; extra resources), the tdda source code, the documentation, and the man page …

Introducing Rexpy: Automatic Discovery of Regular Expressions

Posted on Fri 11 November 2016 in TDDA • Tagged with tdda, constraints, pandas, regular expressions

Motivation

There's a Skyscanner data feed we have been working with for a year or so. It has produced some six million records so far, each of which has a transaction ID consisting of three parts: a four-character alphanumeric transaction type, a numeric timestamp and a UUID, with the three parts …

The TDDA Constraints File Format

Posted on Fri 04 November 2016 in TDDA • Tagged with tdda, constraints, pandas

Background

We recently extended the tdda library to include support for automatic discovery of constraints from datasets, and for verification of datasets against constraints. Yesterday's post—Constraint Discovery and Verification for Pandas DataFrames—describes these developments and the API.

The library we published is intended to be a base for …

Constraint Discovery and Verification for Pandas DataFrames

Posted on Thu 03 November 2016 in TDDA • Tagged with tdda, constraints, pandas

Background

In a previous post, Constraints and Assertions, we introduced the idea of using constraints to verify input, output and intermediate datasets for an analytical process. We also demonstrated that candidate constraints can be automatically generated from example datasets. We prototyped this in our own software (Miró), expressing constraints as …