Jupyter Notebooks Considered Harmful: The Parables of Anne and Beth

Posted on Thu 14 November 2024 in TDDA • Tagged with TDDA, reproducibility, process

I have long considered writing a post about the various problems I see with computational notebooks such as Jupyter Notebooks. As part of a book I am writing on TDDA, I created four parables about good and bad development practices for analytical workflows. They were not intended to form this …


Learning the Hard Way: Regression to the Mean

Posted on Thu 20 June 2024 in TDDA • Tagged with TDDA, reproducibility, errors, interpretation

I was at the tenth PyData London Conference last weekend, which was excellent, as always. One of the keynote speakers was Rebecca Bilbro, who gave a rather brilliant (and cleverly titled) talk called Mistakes Were Made: Data Science 10 Years In.

The title is, of course, a reference to the …


TOMLParams: TOML-based parameter files made better

Posted on Sun 16 July 2023 in TDDA • Tagged with TDDA, reproducibility

TOMLParams is a new open-source library that helps Python developers to externalize parameters in TOML files. This post will explain why storing parameters in non-code files is beneficial (including for reproducibility), why TOML was chosen, and some of the useful features of the library, which include structured sets of parameters …
