tdda.serial: Metadata for Flat Files (CSV Files)

Posted on Mon 23 June 2025 in misc

Almost all data scientists and data engineers have to work with flat files (CSV files) from time to time. Despite their many problems, CSVs are too ubiquitous, too universal, and (whisper it) have too many strengths for them to be likely to disappear. Even if they did, they would quickly …

Continue reading

Best Practices for Notebook Users

Posted on Tue 17 December 2024 in misc

In a previous post, I discussed some of the challenges, dangers and weaknesses of Jupyter Notebooks, JupyterLab and their ilk. I used The Parables of Anne and Beth as a device to illustrate what I think of as good and bad practices for data science. A reasonable criticism …

Continue reading

Log Graphs and Grokkability

Posted on Thu 12 December 2024 in misc

In his novel Stranger in a Strange Land, Robert Heinlein[1] introduced the word grok. It is used all the time in the computing sphere, but rarely, as far as I know, outside it. The definition that seems to me to match its usage most closely is:

grok (transitive verb …

Continue reading

An Adware Malware Story Featuring Safari, Notification Centre, and Box Plots

Posted on Sun 22 September 2024 in misc

This is not, primarily, an article about TDDA, but I thought it was worth publishing here anyway. Itʼs a story about a kind of adware/malware incident I had this morning—with digressions about box plots.

Digression

I was doing some research for a book (on TDDA), looking up information …

Continue reading