In Defence of XML: Exporting and Analysing Apple Health Data

Posted on Fri 15 April 2016 in TDDA • Tagged with xml, apple, health

I'm going to present a series of posts based around the sort of health and fitness data that can now be collected by some phones and dedicated fitness trackers. Not all of these will be centrally on topic for test-driven data analysis, but I think they'll provide an interesting set …

Lessons Learned: Bad Data and other SNAFUs

Posted on Mon 15 February 2016 in TDDA • Tagged with tdda, bad data

My first paid programming job was working for my local education authority during the summer. The Advisory Unit for Computer-Based Education (AUCBE), run by a fantastic visionary and literal "greybeard" called Bill Tagg, produced software for schools in Hertfordshire and environs, and one of their products was a simple database …

How far in advance are flights cheapest? An error of interpretation

Posted on Wed 06 January 2016 in TDDA • Tagged with tdda, errors, interpretation

Guest Post by Patrick Surry, Chief Data Scientist, Hopper

Every year, Expedia and ARC collaborate to publish some annual statistics about domestic airfare, including their treatment of the perennial question "How far in advance should you book your flight?" Here's what they presented in their report last year:

Figure: Average Ticket Price vs. Advance Purchase Days for Domestic Flights (Source: Expedia/ARC)

Although there …

Tools and Tooling

Posted on Wed 16 December 2015 in TDDA • Tagged with tdda, tools

Good tools for testing matter because the temptation to skimp on testing is real even for true believers: anything that reduces the friction and pain associated with actually adding tests therefore has a disproportionate effect on adoption and implementation rates.

I think there are several reasons the temptation to forgo …

Generalized Overfitting: Errors of Applicability

Posted on Mon 14 December 2015 in TDDA • Tagged with tdda, errors, applicability

Everyone building predictive models or performing statistical fitting knows about overfitting. This arises when the function represented by the model includes components or aspects that are overly specific to the particularities of the sample data used for training the model, and that are not general features of datasets to which …
