WritableTestCase: Example Use

Posted on Sun 18 September 2016 in TDDA • Tagged with tdda

In my PyCon UK talk yesterday I promised to update and document the copy of writabletestcase.WritableTestCase on GitHub.

The version I've put up is not quite as powerful as the example I showed in the talk—that will follow—but has the basic functionality.

I've now added examples to the repository and, below, show how these work.

The library is available with

git clone https://github.com/tdda/tdda.git

WritableTestCase extends unittest.TestCase, from Python's standard library, in three main ways:

  • First, it provides methods for testing strings produced in memory, or files written to disk, against reference results in files. When a test fails, rather than just showing a hard-to-read difference, it writes the actual result to file (if necessary) and then shows the diff command needed to compare it, something like this:

    Compare with "diff /path/to/actual-output /path/to/expected-output"
    

    Obviously, the diff command can be replaced with a graphical diff tool, an open command or whatever.

    Although this shouldn't be necessary (see below), you also have the option, after verification, of replacing diff with cp to copy the actual output over the reference output.

  • Secondly, the code supports excluding lines of the output that contain nominated strings. This is often handy for excluding things like date stamps, version numbers and copyright notices, which often appear in output, and vary, without affecting the semantics.

    (The version of the library I showed at PyCon had more powerful variants of this, which I'll add later.)

  • Thirdly, if you verify that the new output is correct, the library supports re-running with the -w flag to overwrite the expected ("reference") results with the ones generated by the code.

    Obviously, if this feature is abused, the value of the tests will be lost, but provided you check the output carefully before re-writing, this is a significant convenience.
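The first two behaviours can be sketched roughly as follows (a simplified illustration only, not the library's actual implementation; the function name and signature here are hypothetical):

```python
def check_lines(actual, expected, ignore_patterns=()):
    """Compare two blocks of text line by line, skipping any line that
    contains one of the nominated substrings. Return the number of
    differing, non-excluded lines."""
    a_lines = actual.splitlines()
    e_lines = expected.splitlines()
    if len(a_lines) != len(e_lines):
        # Different numbers of lines always counts as a failure.
        return abs(len(a_lines) - len(e_lines))
    failures = 0
    for a, e in zip(a_lines, e_lines):
        if any(p in a or p in e for p in ignore_patterns):
            continue  # excluded line: differences here don't count
        if a != e:
            failures += 1
    return failures
```

On a failure, the library then reports the diff command to run, rather than dumping the difference inline.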

The example code is in the examples subdirectory, called test_using_writabletestcase.py. It has two test functions, one of which generates HTML output as a string, and the other of which produces some slightly different HTML output as a file. In each case, the output produced by the function is not identical to the expected "reference" output (in examples/reference), but differs only on lines containing "Copyright" and "Version". Since these are passed into the test as exclusions, the tests should pass.

Here is the example code:

# -*- coding: utf-8 -*-
"""
test_using_writabletestcase.py: A simple example of how to use
tdda.writabletestcase.WritableTestCase.

Source repository: https://github.com/tdda/tdda

License: MIT

Copyright (c) Stochastic Solutions Limited 2016
"""
from __future__ import division
from __future__ import print_function
from __future__ import unicode_literals

import os
import tempfile

from tdda import writabletestcase
from tdda.examples.generators import generate_string, generate_file


class TestExample(writabletestcase.WritableTestCase):
    def testExampleStringGeneration(self):
        """
        This test uses generate_string() from tdda.examples.generators
        to generate some HTML as a string.

        It is similar to the reference HTML in
        tdda/examples/reference/string_result.html except that the
        Copyright and version lines are slightly different.

        As shipped, the test should pass, because the ignore_patterns
        tell it to ignore those lines.

        Make a change to the generation code in the generate_string
        function in generators.py to change the HTML output.

        The test should then fail and suggest a diff command to run
        to see the difference.

        Rerun with

            python test_using_writabletestcase.py -w

        and it should re-write the reference output to match your
        modified results.
        """
        actual = generate_string()
        this_dir = os.path.abspath(os.path.dirname(__file__))
        expected_file = os.path.join(this_dir, 'reference',
                                     'string_result.html')
        self.check_string_against_file(actual, expected_file,
                                       ignore_patterns=['Copyright',
                                                        'Version'])


    def testExampleFileGeneration(self):
        """
        This test uses generate_file() from tdda.examples.generators
        to generate some HTML as a file.

        It is similar to the reference HTML in
        tdda/examples/reference/file_result.html except that the
        Copyright and version lines are slightly different.

        As shipped, the test should pass, because the ignore_patterns
        tell it to ignore those lines.

        Make a change to the generation code in the generate_file function
        in generators.py to change the HTML output.

        The test should then fail and suggest a diff command to run
        to see the difference.

        Rerun with

            python test_using_writabletestcase.py -w

        and it should re-write the reference output to match your
        modified results.
        """
        outdir = tempfile.gettempdir()
        outpath = os.path.join(outdir, 'file_result.html')
        generate_file(outpath)
        this_dir = os.path.abspath(os.path.dirname(__file__))
        expected_file = os.path.join(this_dir, 'reference',
                                     'file_result.html')
        self.check_file(outpath, expected_file,
                        ignore_patterns=['Copyright', 'Version'])


if __name__ == '__main__':
    writabletestcase.main(argv=writabletestcase.set_write_from_argv())

If you download it and try running it, you should see output similar to the following:

$ python test_using_writabletestcase.py
..
----------------------------------------------------------------------
Ran 2 tests in 0.004s

OK

The reference output files it compares against are:

  • examples/reference/string_result.html
  • examples/reference/file_result.html

To see what happens when there's a difference, try editing one or both of the functions that generate the HTML in generators.py. They mostly just use explicit strings, so the simplest thing is to change a word or two in the output.

If I change It's to It is in the generate_string() function and rerun, I get this output:

$ python test_using_writabletestcase.py
.
File check failed.
Compare with "diff /var/folders/w7/lhtph66x7h33t9pns0616qk00000gn/T/string_result.html /Users/njr/python/tdda/examples/reference/string_result.html".

Note exclusions:
Copyright
Version
F
======================================================================
FAIL: testExampleStringGeneration (__main__.TestExample)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test_using_writabletestcase.py", line 55, in testExampleStringGeneration
    'Version'])
  File "/Users/njr/python/tdda/writabletestcase.py", line 294, in check_string_against_file
    self.assertEqual(failures, 0)
AssertionError: 1 != 0

----------------------------------------------------------------------
Ran 2 tests in 0.005s

FAILED (failures=1)

If I then run the diff command it suggests, the output is:

$ diff /var/folders/w7/lhtph66x7h33t9pns0616qk00000gn/T/string_result.html /Users/njr/python/tdda/examples/reference/string_result.html
5,6c5,6
<     Copyright (c) Stochastic Solutions, 2016
<     Version 1.0.0
---
>     Copyright (c) Stochastic Solutions Limited, 2016
>     Version 0.0.0
29c29
<     It is not terribly exciting.
---
>     It's not terribly exciting.

Here you can see the differences that are excluded, and the change I actually made.

(The version I showed at PyCon has an option to show only the non-excluded differences, but this version doesn't; that will come!)

If I now run again using -w, to re-write the reference output, it shows:

$ python test_using_writabletestcase.py -w
.Expected file /Users/njr/python/tdda/examples/reference/string_result.html written.
.
----------------------------------------------------------------------
Ran 2 tests in 0.003s

OK

And, of course, if I run a third time, without -w, the test now passes:

$ python test_using_writabletestcase.py
..
----------------------------------------------------------------------
Ran 2 tests in 0.003s

OK

So that's a quick overview of how it works.


Slides and Rough Transcript of TDDA talk from PyCon UK 2016

Posted on Sat 17 September 2016 in TDDA • Tagged with tdda

Python UK 2016, Cardiff.

I gave a talk on test-driven data analysis at PyCon UK 2016, Cardiff, today.

The slides (which are kind-of useless without the words) are available here.

More usefully, a rough transcript, with thumbnail slides, is available here.


Extracting More Apple Health Data

Posted on Wed 20 April 2016 in TDDA • Tagged with xml, apple, health

The first version of the Python code for extracting data from the XML export from the Apple Health app on iOS neglected to extract Activity Summaries and Workout data. We will now fix that.

As usual, I'll remind you how to get the code, if you want, then discuss the changes to the code, the reference test and the unit tests. Then in the next post, we'll actually start looking at the data.

The Updated Code

As before, you can get the code from GitHub with

$ git clone https://github.com/tdda/applehealthdata.git

or if you have pulled it before, with

$ git pull --tags

This version of the code is tagged with v1.3, so if it has been updated by the time you read this, get that version with

$ git checkout v1.3

I'm not going to list all the code here, but will pull out a few key changes as we discuss them.

Changes

Change 1: Change FIELDS to handle three different field structures.

The first version of the extraction code wrote only Records, which contain the granular activity data (which is the vast bulk of it, by volume).

Now I want to extend the code to handle the other two main kinds of data it writes—ActivitySummary and Workout elements in the XML.

The three different element types contain different XML attributes, which correspond to different fields in the CSV, and although they overlap, I think the best approach is to have separate record structures for each, and then to create a dictionary mapping the element kind to its field information.

Accordingly, the code that sets FIELDS changes to become:

RECORD_FIELDS = OrderedDict((
    ('sourceName', 's'),
    ('sourceVersion', 's'),
    ('device', 's'),
    ('type', 's'),
    ('unit', 's'),
    ('creationDate', 'd'),
    ('startDate', 'd'),
    ('endDate', 'd'),
    ('value', 'n'),
))

ACTIVITY_SUMMARY_FIELDS = OrderedDict((
    ('dateComponents', 'd'),
    ('activeEnergyBurned', 'n'),
    ('activeEnergyBurnedGoal', 'n'),
    ('activeEnergyBurnedUnit', 's'),
    ('appleExerciseTime', 's'),
    ('appleExerciseTimeGoal', 's'),
    ('appleStandHours', 'n'),
    ('appleStandHoursGoal', 'n'),
))

WORKOUT_FIELDS = OrderedDict((
    ('sourceName', 's'),
    ('sourceVersion', 's'),
    ('device', 's'),
    ('creationDate', 'd'),
    ('startDate', 'd'),
    ('endDate', 'd'),
    ('workoutActivityType', 's'),
    ('duration', 'n'),
    ('durationUnit', 's'),
    ('totalDistance', 'n'),
    ('totalDistanceUnit', 's'),
    ('totalEnergyBurned', 'n'),
    ('totalEnergyBurnedUnit', 's'),
))

FIELDS = {
    'Record': RECORD_FIELDS,
    'ActivitySummary': ACTIVITY_SUMMARY_FIELDS,
    'Workout': WORKOUT_FIELDS,
}

and we have to change references (in both the main code and the test code) to refer to RECORD_FIELDS where previously there were references to FIELDS.

Change 2: Add a Workout to the test data

There was a single workout in the data I exported from the phone (a token one I performed primarily to generate a record of this type). I simply used grep to extract that line from export.xml and poked it into the test data file testdata/export6s3sample.xml.

Change 3: Update the tag and field counters

The code for counting record types previously considered only nodes of type Record. Now we also want to handle Workout and ActivitySummary elements. Workouts do come in different types (they have a workoutActivityType field), so we may eventually want to write different workout types to different CSV files; but since I have so far seen only a single workout, I don't really want to do this yet. Instead, we'll write all Workout elements to a single Workout.csv file, and all ActivitySummary elements to an ActivitySummary.csv file.

Accordingly, the count_record_types method now uses an extra Counter attribute, other_types to count the number of each of these elements, keyed on their tag (i.e. Workout or ActivitySummary).
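A minimal sketch of this style of counting with collections.Counter (an illustration only; the real count_record_types method differs in detail):

```python
from collections import Counter
from xml.etree import ElementTree


def count_elements(xml_string):
    """Count Record elements by their type attribute, and Workout and
    ActivitySummary elements by tag, as described above."""
    root = ElementTree.fromstring(xml_string)
    record_types = Counter()
    other_types = Counter()
    for node in root:
        if node.tag == 'Record':
            record_types[node.attrib['type']] += 1
        elif node.tag in ('Workout', 'ActivitySummary'):
            other_types[node.tag] += 1
    return record_types, other_types
```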

Change 4: Update the test results to reflect the new behaviour

Two of the unit tests introduced last time need to be updated to reflect Change 3. First, the field counts change, and secondly we need reference values for the other_types counts. Hence the new section in test_extracted_reference_stats:

    expectedOtherCounts = [
       ('ActivitySummary', 2),
       ('Workout', 1),
    ]
    self.assertEqual(sorted(data.other_types.items()),
                     expectedOtherCounts)

Change 5: Open (and close) files for Workouts and ActivitySummaries

We need to open new files for Workout.csv and ActivitySummary.csv if we have any such records. This is handled in the open_for_writing method.

Change 6: Write records for Workouts and ActivitySummaries

There are minor changes to the write_records method to allow it to handle writing Workout and ActivitySummary records. The only real difference is that the different CSV files have different fields, so we need to look up the right values, in the order specified by the header for each kind. The new code does that:

def write_records(self):
    kinds = FIELDS.keys()
    for node in self.nodes:
        if node.tag in kinds:
            attributes = node.attrib
            kind = attributes['type'] if node.tag == 'Record' else node.tag
            values = [format_value(attributes.get(field), datatype)
                      for (field, datatype) in FIELDS[node.tag].items()]
            line = encode(','.join(values) + '\n')
            self.handles[kind].write(line)

Change 7: Update the reference test

Finally, the reference test itself now generates two more files, so I've added reference copies of those to the testdata subdirectory and changed the test to loop over all four files:

def test_tiny_reference_extraction(self):
    path = copy_test_data()
    data = HealthDataExtractor(path, verbose=VERBOSE)
    data.extract()
    for kind in ('StepCount', 'DistanceWalkingRunning',
                 'Workout', 'ActivitySummary'):
        self.check_file('%s.csv' % kind)

Mission Accomplished

We've now extracted essentially all the data from the export.xml file from the Apple Health app, and created various tests for that extraction process. We'll start to look at the data in future posts. There is one more component in my extract—another XML file called export_cda.xml. This contains a ClinicalDocument, apparently conforming to a standard from (or possibly administered by) Health Level Seven International. It contains heart-rate data from my Apple Watch. I probably will extract it and publish the code for doing so, but later.


Unit Tests

Posted on Tue 19 April 2016 in TDDA • Tagged with xml, apple, health

In the last post, we presented some code for implementing a "reference" test for the code for extracting CSV files from the XML dump that the Apple Health app on iOS can produce.

We will now expand that test with a few other, smaller and more conventional unit tests. Each unit test focuses on a smaller block of functionality.

The Test Code

As before, you can get the code from GitHub with

$ git clone https://github.com/tdda/applehealthdata.git

or if you have pulled it previously, with

$ git pull

This version of the code is tagged with v1.2, so if it has been updated by the time you read this, get that version with

$ git checkout v1.2

Here is the updated test code.

# -*- coding: utf-8 -*-
"""
testapplehealthdata.py: tests for the applehealthdata.py

Copyright (c) 2016 Nicholas J. Radcliffe
Licence: MIT
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from __future__ import unicode_literals

import os
import re
import shutil
import sys
import unittest

from collections import Counter


from applehealthdata import (HealthDataExtractor,
                             format_freqs, format_value,
                             abbreviate, encode)

CLEAN_UP = True
VERBOSE = False


def get_base_dir():
    """
    Return the directory containing this test file,
    which will (normally) be the applehealthdata directory
    also containing the testdata dir.
    """
    return os.path.split(os.path.abspath(__file__))[0]


def get_testdata_dir():
    """Return the full path to the testdata directory"""
    return os.path.join(get_base_dir(), 'testdata')


def get_tmp_dir():
    """Return the full path to the tmp directory"""
    return os.path.join(get_base_dir(), 'tmp')


def remove_any_tmp_dir():
    """
    Remove the temporary directory if it exists.
    Returns its location either way.
    """
    tmp_dir = get_tmp_dir()
    if os.path.exists(tmp_dir):
        shutil.rmtree(tmp_dir)
    return tmp_dir


def make_tmp_dir():
    """
    Remove any existing tmp directory.
    Create empty tmp directory.
    Return the location of the tmp dir.
    """
    tmp_dir = remove_any_tmp_dir()
    os.mkdir(tmp_dir)
    return tmp_dir


def copy_test_data():
    """
    Copy the test data export6s3sample.xml from testdata directory
    to tmp directory.
    """
    tmp_dir = make_tmp_dir()
    name = 'export6s3sample.xml'
    in_xml_file = os.path.join(get_testdata_dir(), name)
    out_xml_file = os.path.join(get_tmp_dir(), name)
    shutil.copyfile(in_xml_file, out_xml_file)
    return out_xml_file


class TestAppleHealthDataExtractor(unittest.TestCase):
    @classmethod
    def tearDownClass(cls):
        """Clean up by removing the tmp directory, if it exists."""
        if CLEAN_UP:
            remove_any_tmp_dir()

    def check_file(self, filename):
        expected_output = os.path.join(get_testdata_dir(), filename)
        actual_output = os.path.join(get_tmp_dir(), filename)
        with open(expected_output) as f:
            expected = f.read()
        with open(actual_output) as f:
            actual = f.read()
        self.assertEqual(expected, actual)

    def test_tiny_reference_extraction(self):
        path = copy_test_data()
        data = HealthDataExtractor(path, verbose=VERBOSE)
        data.extract()
        self.check_file('StepCount.csv')
        self.check_file('DistanceWalkingRunning.csv')

    def test_format_freqs(self):
        counts = Counter()
        self.assertEqual(format_freqs(counts), '')
        counts['one'] += 1
        self.assertEqual(format_freqs(counts), 'one: 1')
        counts['one'] += 1
        self.assertEqual(format_freqs(counts), 'one: 2')
        counts['two'] += 1
        counts['three'] += 1
        self.assertEqual(format_freqs(counts),
                         '''one: 2
three: 1
two: 1''')

    def test_format_null_values(self):
        for dt in ('s', 'n', 'd', 'z'):
            # Note: even an illegal type, z, produces correct output for
            # null values.
            # Questionable, but we'll leave as a feature
            self.assertEqual(format_value(None, dt), '')

    def test_format_numeric_values(self):
        cases = {
            '0': '0',
            '3': '3',
            '-1': '-1',
            '2.5': '2.5',
        }
        for (k, v) in cases.items():
            self.assertEqual((k, format_value(k, 'n')), (k, v))

    def test_format_date_values(self):
        hearts = 'any string not need escaping or quoting; even this: ♥♥'
        cases = {
            '01/02/2000 12:34:56': '01/02/2000 12:34:56',
            hearts: hearts,
        }
        for (k, v) in cases.items():
            self.assertEqual((k, format_value(k, 'd')), (k, v))

    def test_format_string_values(self):
        cases = {
            'a': '"a"',
            '': '""',
            'one "2" three': r'"one \"2\" three"',
            r'1\2\3': r'"1\\2\\3"',
        }
        for (k, v) in cases.items():
            self.assertEqual((k, format_value(k, 's')), (k, v))

    def test_abbreviate(self):
        changed = {
            'HKQuantityTypeIdentifierHeight': 'Height',
            'HKQuantityTypeIdentifierStepCount': 'StepCount',
            'HK*TypeIdentifierStepCount': 'StepCount',
            'HKCharacteristicTypeIdentifierDateOfBirth': 'DateOfBirth',
            'HKCharacteristicTypeIdentifierBiologicalSex': 'BiologicalSex',
            'HKCharacteristicTypeIdentifierBloodType': 'BloodType',
            'HKCharacteristicTypeIdentifierFitzpatrickSkinType':
                                                    'FitzpatrickSkinType',
        }
        unchanged = [
            '',
            'a',
            'aHKQuantityTypeIdentifierHeight',
            'HKQuantityTypeIdentityHeight',
        ]
        for (k, v) in changed.items():
            self.assertEqual((k, abbreviate(k)), (k, v))
            self.assertEqual((k, abbreviate(k, False)), (k, k))
        for k in unchanged:
            self.assertEqual((k, abbreviate(k)), (k, k))

    def test_encode(self):
        # This test looks strange, but because of the import statements
        #     from __future__ import unicode_literals
        # in Python 2, type('a') is unicode, and the point of the encode
        # function is to ensure that it has been converted to a UTF-8 string
        # before writing to file.
        self.assertEqual(type(encode('a')), str)

    def test_extracted_reference_stats(self):
        path = copy_test_data()
        data = HealthDataExtractor(path, verbose=VERBOSE)

        self.assertEqual(data.n_nodes, 19)
        expectedRecordCounts = [
           ('DistanceWalkingRunning', 5),
           ('StepCount', 10),
        ]
        self.assertEqual(sorted(data.record_types.items()),
                         expectedRecordCounts)

        expectedTagCounts = [
           ('ActivitySummary', 2),
           ('ExportDate', 1),
           ('Me', 1),
           ('Record', 15),
        ]
        self.assertEqual(sorted(data.tags.items()),
                         expectedTagCounts)
        expectedFieldCounts = [
            ('HKCharacteristicTypeIdentifierBiologicalSex', 1),
            ('HKCharacteristicTypeIdentifierBloodType', 1),
            ('HKCharacteristicTypeIdentifierDateOfBirth', 1),
            ('HKCharacteristicTypeIdentifierFitzpatrickSkinType', 1),
            ('activeEnergyBurned', 2),
            ('activeEnergyBurnedGoal', 2),
            ('activeEnergyBurnedUnit', 2),
            ('appleExerciseTime', 2),
            ('appleExerciseTimeGoal', 2),
            ('appleStandHours', 2),
            ('appleStandHoursGoal', 2),
            ('creationDate', 15),
            ('dateComponents', 2),
            ('endDate', 15),
            ('sourceName', 15),
            ('startDate', 15),
            ('type', 15),
            ('unit', 15),
            ('value', 16),
        ]
        self.assertEqual(sorted(data.fields.items()),
                         expectedFieldCounts)


if __name__ == '__main__':
    unittest.main()

Notes

We're not going to discuss every part of the code, but will point out a few salient features.

  • I've added a coding line at the top of both the test script and the main applehealthdata.py script. This tells Python (and my editor, Emacs) the encoding of the file on disk (UTF-8). This is now relevant because one of the new tests (test_format_date_values) features a non-ASCII character in a string literal.

  • The previous test method test_tiny_fixed_extraction has been renamed test_tiny_reference_extraction, but is otherwise unchanged.

  • Several of the tests loop over dictionaries or lists of input-output pairs, with an assertion of some kind in the main body. Some people don't like this, and prefer one assertion per test. I don't really agree with that, but do think it's important to be able to see easily which assertion fails. An idiom I often use to assist this is to include the input on both sides of the test. For example, in test_abbreviate, when checking the abbreviation of items that should change, the code reads:

    for (k, v) in changed.items():
        self.assertEqual((k, abbreviate(k)), (k, v))
    

    rather than

    for (k, v) in changed.items():
        self.assertEqual(abbreviate(k), v)
    

    This makes it easy to tell which input fails, if one does, even in cases in which the main values being compared (abbreviate(k) and v, in this case) are long, complex or repeated across different inputs. It doesn't actually make much difference in these examples, but in general I find it helpful.

  • The test test_extracted_reference_stats checks that three counters used by the code work as expected. Some people would advocate splitting this into three tests but, even though it's quick, it seems more natural to me to test them together. This also means we don't have to process the XML file three times. There are other ways of achieving the same end, and this approach has the potential disadvantage that the later cases won't be run if an earlier one fails.

    The other point to note here is that Counter objects are unordered, so I've sorted the expected results by key, and used Python's sorted function, which returns a new list containing the values of an iterable in sorted order. We could avoid the sort by constructing sets or dictionaries from the Counter objects and checking those instead, but the sort here is not expensive, and this approach is probably simpler.

  • I haven't bothered to write a separate test for the extraction phase (checking that it writes the right CSV files) because that seems to me to add almost nothing over the existing reference test (test_tiny_reference_extraction).
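The sorting point above can be illustrated in isolation (this is standard Python behaviour):

```python
from collections import Counter

# A Counter has no guaranteed iteration order, so sort its items
# to get a deterministic list of (key, count) pairs for comparison.
tags = Counter(['Record', 'Record', 'ActivitySummary', 'ExportDate'])
pairs = sorted(tags.items())
print(pairs)  # [('ActivitySummary', 1), ('ExportDate', 1), ('Record', 2)]
```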

Closing

That's it for this post. The unit tests are not terribly exciting, but they will prove useful as we extend the extraction code, which we'll start to do in the next post.

In a few posts' time, we will start analysing the data extracted from the app; it will be interesting to see whether, at that stage, we discover any more serious problems with the extraction code. Experience teaches that we probably will.


First Test

Posted on Mon 18 April 2016 in TDDA • Tagged with xml, apple, health

In the last post, I presented some code for extracting (some of) the data from the XML file exported by the Apple Health app on iOS, but—almost comically, given this blog's theme—omitted to include any tests. This post and the next couple (in quick succession) will aim to fix that.

This post begins to remedy that by writing a single "reference" test. To recap: a reference test is a test that tests a whole analytical process, checking that the known inputs produce the expected outputs. So far, our analytical process is quite small, consisting only of data extraction, but this will still prove very worthwhile.

Dogma

While the mainstream TDD dogma states that tests should be written before the code, it is far from uncommon to write them afterwards, and in the context of test-driven data analysis I maintain that this is usually preferable. Regardless, when you find yourself in a situation in which you have written some code and possess any reasonable level of belief that it might be right,1 an excellent starting point is simply to capture the input(s) that you have already used, together with the output that it generates, and write a test that checks that the input you provided produces the expected output. That's exactly the procedure I advocated for TDDA, and that's how we shall start here.

Test Data

The only flies in the ointment in this case are

  1. the input data I used initially was quite large (5.5MB compressed; 109MB uncompressed), leading to quite a slow test;

  2. the data is somewhat personal.

For both these reasons, I have decided to reduce it so that it will be more manageable, run more quickly, and be more suitable for public sharing.

So I cut down the data to contain only the DTD header, the Me record, ten StepCount records, and five DistanceWalkingRunning records. That results in a small, valid XML file (under 7K) containing exactly 100 lines. It's in the testdata subdirectory of the repository, and if I run it (which you probably don't want to do, at least in situ, as that will trample over the reference output), the following output is produced:

$ python applehealthdata/applehealthdata.py testdata/export6s3sample.xml
Reading data from testdata/export6s3sample.xml . . . done

Tags:
ActivitySummary: 2
ExportDate: 1
Me: 1
Record: 15

Fields:
HKCharacteristicTypeIdentifierBiologicalSex: 1
HKCharacteristicTypeIdentifierBloodType: 1
HKCharacteristicTypeIdentifierDateOfBirth: 1
HKCharacteristicTypeIdentifierFitzpatrickSkinType: 1
activeEnergyBurned: 2
activeEnergyBurnedGoal: 2
activeEnergyBurnedUnit: 2
appleExerciseTime: 2
appleExerciseTimeGoal: 2
appleStandHours: 2
appleStandHoursGoal: 2
creationDate: 15
dateComponents: 2
endDate: 15
sourceName: 15
startDate: 15
type: 15
unit: 15
value: 16

Record types:
DistanceWalkingRunning: 5
StepCount: 10

Opening /Users/njr/qs/testdata/StepCount.csv for writing
Opening /Users/njr/qs/testdata/DistanceWalkingRunning.csv for writing
Written StepCount data.
Written DistanceWalkingRunning data.

The two CSV files it writes, which are also in the testdata subdirectory in the repository, are as follows:

$ cat testdata/StepCount.csv
sourceName,sourceVersion,device,type,unit,creationDate,startDate,endDate,value
"Health",,,"StepCount","count",2014-09-21 07:08:47 +0100,2014-09-13 10:27:54 +0100,2014-09-13 10:27:59 +0100,329
"Health",,,"StepCount","count",2014-09-21 07:08:47 +0100,2014-09-13 10:34:09 +0100,2014-09-13 10:34:14 +0100,283
"Health",,,"StepCount","count",2014-09-21 07:08:47 +0100,2014-09-13 10:39:29 +0100,2014-09-13 10:39:34 +0100,426
"Health",,,"StepCount","count",2014-09-21 07:08:48 +0100,2014-09-13 10:45:36 +0100,2014-09-13 10:45:41 +0100,61
"Health",,,"StepCount","count",2014-09-21 07:08:48 +0100,2014-09-13 10:51:16 +0100,2014-09-13 10:51:21 +0100,10
"Health",,,"StepCount","count",2014-09-21 07:08:48 +0100,2014-09-13 10:57:40 +0100,2014-09-13 10:57:45 +0100,200
"Health",,,"StepCount","count",2014-09-21 07:08:48 +0100,2014-09-13 11:03:00 +0100,2014-09-13 11:03:05 +0100,390
"Health",,,"StepCount","count",2014-09-21 07:08:48 +0100,2014-09-13 11:08:10 +0100,2014-09-13 11:08:15 +0100,320
"Health",,,"StepCount","count",2014-09-21 07:08:48 +0100,2014-09-13 11:27:22 +0100,2014-09-13 11:27:27 +0100,216
"Health",,,"StepCount","count",2014-09-21 07:08:48 +0100,2014-09-13 11:33:24 +0100,2014-09-13 11:33:29 +0100,282

and

$ cat testdata/DistanceWalkingRunning.csv
sourceName,sourceVersion,device,type,unit,creationDate,startDate,endDate,value
"Health",,,"DistanceWalkingRunning","km",2014-09-21 07:08:49 +0100,2014-09-20 10:41:28 +0100,2014-09-20 10:41:30 +0100,0.00288
"Health",,,"DistanceWalkingRunning","km",2014-09-21 07:08:49 +0100,2014-09-20 10:41:30 +0100,2014-09-20 10:41:33 +0100,0.00284
"Health",,,"DistanceWalkingRunning","km",2014-09-21 07:08:49 +0100,2014-09-20 10:41:33 +0100,2014-09-20 10:41:36 +0100,0.00142
"Health",,,"DistanceWalkingRunning","km",2014-09-21 07:08:49 +0100,2014-09-20 10:43:54 +0100,2014-09-20 10:43:56 +0100,0.00639
"Health",,,"DistanceWalkingRunning","km",2014-09-21 07:08:49 +0100,2014-09-20 10:43:59 +0100,2014-09-20 10:44:01 +0100,0.0059

Reference Test

The code for a single reference test is below. It's slightly verbose, because it tries to use sensible locations for everything, but not complex.

As before, you can get the code from GitHub with

$ git clone https://github.com/tdda/applehealthdata.git

or if you have pulled it previously, you can update it with

$ git pull

This version of the code is tagged with v1.1, so if it has been updated by the time you read this, get that version with

$ git checkout v1.1

Here is the code:

"""
testapplehealthdata.py: tests for applehealthdata.py

Copyright (c) 2016 Nicholas J. Radcliffe
Licence: MIT
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from __future__ import unicode_literals

import os
import re
import shutil
import sys
import unittest

from applehealthdata import HealthDataExtractor

CLEAN_UP = True
VERBOSE = False


def get_base_dir():
    """
    Return the directory containing this test file,
    which will (normally) be the applehealthdata directory
    also containing the testdata dir.
    """
    return os.path.split(os.path.abspath(__file__))[0]


def get_testdata_dir():
    """Return the full path to the testdata directory"""
    return os.path.join(get_base_dir(), 'testdata')


def get_tmp_dir():
    """Return the full path to the tmp directory"""
    return os.path.join(get_base_dir(), 'tmp')


def remove_any_tmp_dir():
    """
    Remove the temporary directory if it exists.
    Returns its location either way.
    """
    tmp_dir = get_tmp_dir()
    if os.path.exists(tmp_dir):
        shutil.rmtree(tmp_dir)
    return tmp_dir


def make_tmp_dir():
    """
    Remove any existing tmp directory.
    Create an empty tmp directory.
    Return the location of the tmp dir.
    """
    tmp_dir = remove_any_tmp_dir()
    os.mkdir(tmp_dir)
    return tmp_dir


def copy_test_data():
    """
    Copy the test data export6s3sample.xml from testdata directory
    to tmp directory.
    """
    tmp_dir = make_tmp_dir()
    name = 'export6s3sample.xml'
    in_xml_file = os.path.join(get_testdata_dir(), name)
    out_xml_file = os.path.join(get_tmp_dir(), name)
    shutil.copyfile(in_xml_file, out_xml_file)
    return out_xml_file


class TestAppleHealthDataExtractor(unittest.TestCase):
    @classmethod
    def tearDownClass(cls):
        """Clean up by removing the tmp directory, if it exists."""
        if CLEAN_UP:
            remove_any_tmp_dir()

    def check_file(self, filename):
        expected_output = os.path.join(get_testdata_dir(), filename)
        actual_output = os.path.join(get_tmp_dir(), filename)
        with open(expected_output) as f:
            expected = f.read()
        with open(actual_output) as f:
            actual = f.read()
        self.assertEqual(expected, actual)

    def test_tiny_fixed_extraction(self):
        path = copy_test_data()
        data = HealthDataExtractor(path, verbose=VERBOSE)
        data.extract()
        self.check_file('StepCount.csv')
        self.check_file('DistanceWalkingRunning.csv')


if __name__ == '__main__':
    unittest.main()

Running the Test

This is what I get if I run it:

$ python testapplehealthdata.py
.
----------------------------------------------------------------------
Ran 1 test in 0.007s

OK
$

That's encouraging, but not particularly informative. If we change the value of VERBOSE at the top of the test file to True, we see slightly more reassuring output:

$ python testapplehealthdata.py
Reading data from /Users/njr/qs/applehealthdata/tmp/export6s3sample.xml . . . done
Opening /Users/njr/qs/applehealthdata/tmp/StepCount.csv for writing
Opening /Users/njr/qs/applehealthdata/tmp/DistanceWalkingRunning.csv for writing
Written StepCount data.
Written DistanceWalkingRunning data.
.
----------------------------------------------------------------------
Ran 1 test in 0.006s

NOTE: The tearDownClass method is a special Python class method that the unit testing framework runs after executing all the tests in the class, regardless of whether they pass, fail or produce errors. I use it to remove the tmp directory containing any test output, which is normally good practice. In a later post, we'll either modify this to leave the output around if any tests fail, or make some other change to make it easier to diagnose what's gone wrong. In the meantime, if you change the value of CLEAN_UP, towards the top of the code, to False, it will leave the tmp directory around, allowing you to examine the files it has produced.
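To see the order in which unittest runs these class-level hooks, here is a minimal standalone demonstration (my own example, not from the post): setUpClass runs once before any test in the class, and tearDownClass once after all of them.

```python
import unittest

calls = []

class OrderDemo(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # Runs once, before any test method in the class.
        calls.append('setUpClass')

    @classmethod
    def tearDownClass(cls):
        # Runs once, after all test methods, pass or fail.
        calls.append('tearDownClass')

    def test_a(self):
        calls.append('test_a')

    def test_b(self):
        calls.append('test_b')

suite = unittest.defaultTestLoader.loadTestsFromTestCase(OrderDemo)
unittest.TextTestRunner(verbosity=0).run(suite)
print(calls)
# → ['setUpClass', 'test_a', 'test_b', 'tearDownClass']
```

(Test methods within a class run in alphabetical order by default, hence test_a before test_b.)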

Overview

The test itself is in the 5-line method test_tiny_fixed_extraction. Here's what the five lines do:

  1. Copy the input XML file from the testdata directory to the tmp directory. The GitHub repository contains the 100-line input XML file together with the expected output in the testdata subdirectory. Because the data extractor writes the CSV files next to the input data, the cleanest thing for us to do is to take a copy of the input data, write it into a new directory (applehealthdata/tmp) and also to use that directory as the location for the output CSV files. The copy_test_data function removes any existing tmp directory it finds, creates a fresh one, copies the input test data into it and returns the path to the test data file. The only "magic" here is that the get_base_dir function figures out where to locate everything by using __file__, which is the location of the source file being executed by Python.

  2. Create a HealthDataExtractor object, using the location of the copy of the input data returned by copy_test_data(). Note that it sets verbose to False, making the test silent, and allowing the line of dots from a successful test run (in this case, a single dot) to be presented without interruption.

  3. Extract the data. This writes two output files to the applehealthdata/tmp directory.

  4. Check that the contents of tmp/StepCount.csv match the reference output in testdata/StepCount.csv.

  5. Check that the contents of tmp/DistanceWalkingRunning.csv match the reference output in testdata/DistanceWalkingRunning.csv.
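The only "magic" in step 1, as noted, is the __file__ idiom in get_base_dir. The two os.path calls it uses behave as follows (a standalone illustration using a made-up relative path):

```python
import os

# os.path.abspath resolves a (possibly relative) path to an absolute one;
# os.path.split then separates the directory part from the final component.
# The path below is invented purely for illustration.
path = os.path.abspath(os.path.join('applehealthdata', 'testapplehealthdata.py'))
base_dir, filename = os.path.split(path)

print(filename)
# → testapplehealthdata.py
```

In get_base_dir the argument is __file__, the module's own location, so the resulting directory tracks where the source file lives, regardless of the current working directory the tests are run from.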

Write-Test-Break-Run-Repair-Rerun

In cases in which the tests are written after the code, it's important to check that they really are running correctly. My usual approach is to write the test and, if it appears to pass first time,2 to break it deliberately to verify that it fails when it should, before repairing it. In this case, the simplest way to break the test is to change the reference data temporarily. This will also reveal a weakness in the current check_file function.

We'll try three variants of this:

Variant 1: Break the StepCount.csv reference data.

First, I add a Z to the end of testdata/StepCount.csv and re-run the tests:

$ python testapplehealthdata.py
F
======================================================================
FAIL: test_tiny_fixed_extraction (__main__.TestAppleHealthDataExtractor)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "testapplehealthdata.py", line 98, in test_tiny_fixed_extraction
    self.check_file('StepCount.csv')
  File "testapplehealthdata.py", line 92, in check_file
    self.assertEqual(expected, actual)
AssertionError: 'sourceName,sourceVersion,device,type,unit,creationDate,startDate,endDate,value\n"Health",,,"StepCount","count",2014-09-21 07:08:47 +0100,2014-09-13 10:27:54 +0100,2014-09-13 10:27:59 +0100,329\n"Health",,,"StepCount","count",2014-09-21 07:08:47 +0100,2014-09-13 10:34:09 +0100,2014-09-13 10:34:14 +0100,283\n"Health",,,"StepCount","count",2014-09-21 07:08:47 +0100,2014-09-13 10:39:29 +0100,2014-09-13 10:39:34 +0100,426\n"Health",,,"StepCount","count",2014-09-21 07:08:48 +0100,2014-09-13 10:45:36 +0100,2014-09-13 10:45:41 +0100,61\n"Health",,,"StepCount","count",2014-09-21 07:08:48 +0100,2014-09-13 10:51:16 +0100,2014-09-13 10:51:21 +0100,10\n"Health",,,"StepCount","count",2014-09-21 07:08:48 +0100,2014-09-13 10:57:40 +0100,2014-09-13 10:57:45 +0100,200\n"Health",,,"StepCount","count",2014-09-21 07:08:48 +0100,2014-09-13 11:03:00 +0100,2014-09-13 11:03:05 +0100,390\n"Health",,,"StepCount","count",2014-09-21 07:08:48 +0100,2014-09-13 11:08:10 +0100,2014-09-13 11:08:15 +0100,320\n"Health",,,"StepCount","count",2014-09-21 07:08:48 +0100,2014-09-13 11:27:22 +0100,2014-09-13 11:27:27 +0100,216\n"Health",,,"StepCount","count",2014-09-21 07:08:48 +0100,2014-09-13 11:33:24 +0100,2014-09-13 11:33:29 +0100,282\nZ' != 'sourceName,sourceVersion,device,type,unit,creationDate,startDate,endDate,value\n"Health",,,"StepCount","count",2014-09-21 07:08:47 +0100,2014-09-13 10:27:54 +0100,2014-09-13 10:27:59 +0100,329\n"Health",,,"StepCount","count",2014-09-21 07:08:47 +0100,2014-09-13 10:34:09 +0100,2014-09-13 10:34:14 +0100,283\n"Health",,,"StepCount","count",2014-09-21 07:08:47 +0100,2014-09-13 10:39:29 +0100,2014-09-13 10:39:34 +0100,426\n"Health",,,"StepCount","count",2014-09-21 07:08:48 +0100,2014-09-13 10:45:36 +0100,2014-09-13 10:45:41 +0100,61\n"Health",,,"StepCount","count",2014-09-21 07:08:48 +0100,2014-09-13 10:51:16 +0100,2014-09-13 10:51:21 +0100,10\n"Health",,,"StepCount","count",2014-09-21 07:08:48 +0100,2014-09-13 10:57:40 +0100,2014-09-13 10:57:45 
+0100,200\n"Health",,,"StepCount","count",2014-09-21 07:08:48 +0100,2014-09-13 11:03:00 +0100,2014-09-13 11:03:05 +0100,390\n"Health",,,"StepCount","count",2014-09-21 07:08:48 +0100,2014-09-13 11:08:10 +0100,2014-09-13 11:08:15 +0100,320\n"Health",,,"StepCount","count",2014-09-21 07:08:48 +0100,2014-09-13 11:27:22 +0100,2014-09-13 11:27:27 +0100,216\n"Health",,,"StepCount","count",2014-09-21 07:08:48 +0100,2014-09-13 11:33:24 +0100,2014-09-13 11:33:29 +0100,282\n'

----------------------------------------------------------------------
Ran 1 test in 0.005s

FAILED (failures=1)
$

That causes the expected failure. However, because we've compared the entire contents of the two CSV files, it's hard to see what has actually gone wrong. We'll address this by improving the check_file method in a later post.

Variant 2: Break the DistanceWalkingRunning.csv reference data.

After restoring StepCount.csv, I modify the reference data in testdata/DistanceWalkingRunning.csv. This time, I'll change Health to Wealth throughout.

$ python testapplehealthdata.py
F
======================================================================
FAIL: test_tiny_fixed_extraction (__main__.TestAppleHealthDataExtractor)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "testapplehealthdata.py", line 99, in test_tiny_fixed_extraction
    self.check_file('DistanceWalkingRunning.csv')
  File "testapplehealthdata.py", line 92, in check_file
    self.assertEqual(expected, actual)
AssertionError: 'sourceName,sourceVersion,device,type,unit,creationDate,startDate,endDate,value\n"Wealth",,,"DistanceWalkingRunning","km",2014-09-21 07:08:49 +0100,2014-09-20 10:41:28 +0100,2014-09-20 10:41:30 +0100,0.00288\n"Wealth",,,"DistanceWalkingRunning","km",2014-09-21 07:08:49 +0100,2014-09-20 10:41:30 +0100,2014-09-20 10:41:33 +0100,0.00284\n"Wealth",,,"DistanceWalkingRunning","km",2014-09-21 07:08:49 +0100,2014-09-20 10:41:33 +0100,2014-09-20 10:41:36 +0100,0.00142\n"Wealth",,,"DistanceWalkingRunning","km",2014-09-21 07:08:49 +0100,2014-09-20 10:43:54 +0100,2014-09-20 10:43:56 +0100,0.00639\n"Wealth",,,"DistanceWalkingRunning","km",2014-09-21 07:08:49 +0100,2014-09-20 10:43:59 +0100,2014-09-20 10:44:01 +0100,0.0059\n' != 'sourceName,sourceVersion,device,type,unit,creationDate,startDate,endDate,value\n"Health",,,"DistanceWalkingRunning","km",2014-09-21 07:08:49 +0100,2014-09-20 10:41:28 +0100,2014-09-20 10:41:30 +0100,0.00288\n"Health",,,"DistanceWalkingRunning","km",2014-09-21 07:08:49 +0100,2014-09-20 10:41:30 +0100,2014-09-20 10:41:33 +0100,0.00284\n"Health",,,"DistanceWalkingRunning","km",2014-09-21 07:08:49 +0100,2014-09-20 10:41:33 +0100,2014-09-20 10:41:36 +0100,0.00142\n"Health",,,"DistanceWalkingRunning","km",2014-09-21 07:08:49 +0100,2014-09-20 10:43:54 +0100,2014-09-20 10:43:56 +0100,0.00639\n"Health",,,"DistanceWalkingRunning","km",2014-09-21 07:08:49 +0100,2014-09-20 10:43:59 +0100,2014-09-20 10:44:01 +0100,0.0059\n'

----------------------------------------------------------------------
Ran 1 test in 0.005s

FAILED (failures=1)
$

The story is very much the same: the test has failed, which is good, but again the source of the difference is hard to discern.

Variant 3: Break the input XML Data.

After restoring DistanceWalkingRunning.csv, I modify the input XML file. In this case, I'll just change the first step count to be 330 instead of 329:

$ python testapplehealthdata.py
F
======================================================================
FAIL: test_tiny_fixed_extraction (__main__.TestAppleHealthDataExtractor)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "testapplehealthdata.py", line 98, in test_tiny_fixed_extraction
    self.check_file('StepCount.csv')
  File "testapplehealthdata.py", line 92, in check_file
    self.assertEqual(expected, actual)
AssertionError: 'sourceName,sourceVersion,device,type,unit,creationDate,startDate,endDate,value\n"Health",,,"StepCount","count",2014-09-21 07:08:47 +0100,2014-09-13 10:27:54 +0100,2014-09-13 10:27:59 +0100,329\n"Health",,,"StepCount","count",2014-09-21 07:08:47 +0100,2014-09-13 10:34:09 +0100,2014-09-13 10:34:14 +0100,283\n"Health",,,"StepCount","count",2014-09-21 07:08:47 +0100,2014-09-13 10:39:29 +0100,2014-09-13 10:39:34 +0100,426\n"Health",,,"StepCount","count",2014-09-21 07:08:48 +0100,2014-09-13 10:45:36 +0100,2014-09-13 10:45:41 +0100,61\n"Health",,,"StepCount","count",2014-09-21 07:08:48 +0100,2014-09-13 10:51:16 +0100,2014-09-13 10:51:21 +0100,10\n"Health",,,"StepCount","count",2014-09-21 07:08:48 +0100,2014-09-13 10:57:40 +0100,2014-09-13 10:57:45 +0100,200\n"Health",,,"StepCount","count",2014-09-21 07:08:48 +0100,2014-09-13 11:03:00 +0100,2014-09-13 11:03:05 +0100,390\n"Health",,,"StepCount","count",2014-09-21 07:08:48 +0100,2014-09-13 11:08:10 +0100,2014-09-13 11:08:15 +0100,320\n"Health",,,"StepCount","count",2014-09-21 07:08:48 +0100,2014-09-13 11:27:22 +0100,2014-09-13 11:27:27 +0100,216\n"Health",,,"StepCount","count",2014-09-21 07:08:48 +0100,2014-09-13 11:33:24 +0100,2014-09-13 11:33:29 +0100,282\n' != 'sourceName,sourceVersion,device,type,unit,creationDate,startDate,endDate,value\n"Health",,,"StepCount","count",2014-09-21 07:08:47 +0100,2014-09-13 10:27:54 +0100,2014-09-13 10:27:59 +0100,330\n"Health",,,"StepCount","count",2014-09-21 07:08:47 +0100,2014-09-13 10:34:09 +0100,2014-09-13 10:34:14 +0100,283\n"Health",,,"StepCount","count",2014-09-21 07:08:47 +0100,2014-09-13 10:39:29 +0100,2014-09-13 10:39:34 +0100,426\n"Health",,,"StepCount","count",2014-09-21 07:08:48 +0100,2014-09-13 10:45:36 +0100,2014-09-13 10:45:41 +0100,61\n"Health",,,"StepCount","count",2014-09-21 07:08:48 +0100,2014-09-13 10:51:16 +0100,2014-09-13 10:51:21 +0100,10\n"Health",,,"StepCount","count",2014-09-21 07:08:48 +0100,2014-09-13 10:57:40 +0100,2014-09-13 10:57:45 
+0100,200\n"Health",,,"StepCount","count",2014-09-21 07:08:48 +0100,2014-09-13 11:03:00 +0100,2014-09-13 11:03:05 +0100,390\n"Health",,,"StepCount","count",2014-09-21 07:08:48 +0100,2014-09-13 11:08:10 +0100,2014-09-13 11:08:15 +0100,320\n"Health",,,"StepCount","count",2014-09-21 07:08:48 +0100,2014-09-13 11:27:22 +0100,2014-09-13 11:27:27 +0100,216\n"Health",,,"StepCount","count",2014-09-21 07:08:48 +0100,2014-09-13 11:33:24 +0100,2014-09-13 11:33:29 +0100,282\n'

----------------------------------------------------------------------
Ran 1 test in 0.005s

FAILED (failures=1)
$

Again, we get the expected failure, and again it's hard to see what the difference is. (We really will need to improve check_file.)
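The failure reports in all three variants dump both files wholesale. One way a more readable report could be produced is with the standard library's difflib; this is only a sketch (the show_diff helper and its name are mine, not necessarily the approach check_file will eventually take):

```python
import difflib

def show_diff(expected, actual):
    """
    Return a unified diff of two multi-line strings, showing only
    the lines that differ (plus a little context) rather than
    dumping both strings in full.
    """
    return ''.join(difflib.unified_diff(
        expected.splitlines(keepends=True),
        actual.splitlines(keepends=True),
        fromfile='expected', tofile='actual'))

# A difference on one line produces a short, pinpointed report
# with '-' marking the expected line and '+' the actual one.
print(show_diff('a\nb\nc\n', 'a\nB\nc\n'))
```

Hooking something like this into check_file would turn the wall-of-text AssertionError shown above into a few lines that pinpoint the difference.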

Enough

That's enough for this post. We've successfully added a single "reference" test to the code, which should at least ensure that we notice if we break it during further enhancements. It will also check that the code works correctly on other platforms (e.g., yours).

We haven't done anything to check that the CSV files produced are genuinely right, beyond the initial eye-balling I did when first extracting the data. But if we see problems when we start doing proper analysis, it will be easy to correct the expected output to keep the test running. And in the meantime, we'll notice if we make changes to the code that produce different output when they weren't meant to. This is one part of the pragmatic essence of basic TDDA.

We also haven't written any unit tests at all for the extraction code; we'll do that in a later post.


  1. For example, you might have already blogged about it and pushed it to a public repository on GitHub.

  2. Which is not always the case.