GDPR, Consent and Microformats: A Half-Baked Idea

Posted on Fri 08 September 2017 in TDDA

Last night I went to The Protectors of Data Scotland Meetup on the subject of Marketing and GDPR. If you're not familiar with Europe's fast-approaching General Data Protection Regulation, and you keep or process any personal data about humans,1, you probably ought to learn about it. A good place to start is episode 202 of Horace Dediu's The Critical Path podcast, in which he interviews Tim Walters.

During the meeting, I had an idea, and though it is rather less than half-baked, right now it seems just about interesting enough that I thought I'd record it.

One of the key provisions in GDPR is that data processing generally requires consent of the data subject, and that consent is defined as

any freely given, specific, informed and unambiguous indication of his or her wishes by which the data subject, either by a statement or by a clear affirmative action, signifies agreement to personal data relating to them being processed

This is further clarified as follows:

This could include ticking a box when visiting an Internet website, choosing technical settings for information society services or by any other statement or conduct which clearly indicates in this context the data subject’s acceptance of the proposed processing of their personal data.

Silence, pre-ticked boxes or inactivity should therefore not constitute consent.

The Idea in a Nutshell

Briefly, the idea is this:

  • Websites (and potentially apps) requesting consents should include a digital specification of that consent in a standardized format to be defined (probably either an HTML microformat or a standardized JSON bundle).

  • This would allow software to understand the consents being requested unambiguously and present them in a standardized, uniform, easy-to-understand format. It would also encourages businesses and other organizations to standardize the forms of consent they request. I imagine that if this happened, initially browser extensions and special apps such as password managers would learn to read the format and present the information clearly, but if were successful, eventually web browsers themselves would do this.

  • Software could also allow people to create one or more templates or default responses, allowing, for example, someone who never wants to receive marketing to make this their default response, and someone who wants as many special offers as possible to have settings that reflect that. Obviously, you might want several different formats for organizations towards which you have different feelings.

  • A very small extension to the idea would extend the format to record the choices made, allowing password managers, browsers, apps etc. to record for the user exactly what consents were given.2

Benefits

I believe such a standard has potential benefits for all parties—businesses and other organizations requesting consent, individuals giving consent, regulators and courts:

  • Businesses and other data processing/capturing organizations would benefit from a clear set of consent kinds, each of which could have a detailed description (perhaps on an EU or W3C document) that could be referenced by a specific label (e.g. marketing_contact_email_organization). Best practice would hopefully quickly move to using these standard categories.

  • Software could present information in a standardized, clear way to users, highlighting non-standard provisions (preferably with standard symbols, a bit like the Creative Commons Symbols

  • By using template responses, users could more easily complete consent forms with less effort and less fear of ticking the wrong boxes.

Digital Specification? Microformat? JSON?

What are we actually talking about here?

The main (textual) content of a web page consists of the actual human-readable text together with annotation ("markup") to specify formatting and layout, as well as special features like the sort of checkboxes used to request consent. In older versions of the web, a web page was literally a text file in the a special format originally defined by Tim Berners-Lee (HTML).

For example, in HTML, this one-sentence paragraph with the word very in bold might be written as follows:

<p>For example, in HTML, this one-sentence paragraph with the
word <b>very</b> in bold might be written as follows:</p>

Since the advent of "Web 2.0", many web pages are generated dynamically, with much of the data being sent in a format called JSON. A simple example of some JSON for describing (say) an element in the periodic might be

{
    "name": "Lithium",
    "atomicnumber": 3,
    "metal": true,
    "period": 2,
    "group": 1,
    "etymology": "Greek lithos"
}

It would be straightforward3 to develop a format for allowing all the common types of marketing consent (and indeed, many other kinds of processing consent) to be expressed either in JSON or an HTML microformat (which might not be rendered directly by the webpage). As a sketch, the kind of thing a marketing consent request might look like in JSON could be:

{
    "format": "GDPR-Marketing-Consent-Bundle",
    "format_version": "1.0",
    "requesting_organization": "Stochastic Solutions Limited",
    "requesting_organization_data_protection_policy_page:
        "https://stochasticsolutions.com/privacy.html",
    "requesting_organization_partners": [],
    "requested_consents": [
        "marketing_contact_email_organization",
        "marketing_contact_mobile_phone_organization",
        "marketing_contact_physical_mail_organization"
    ],
    "request_date": "2017-09-08",
    "request_url": "https://StochasticSolutios.com/give-us-your-data"
}

Key features I am trying to illustrate here are:

  • The format would include details about the organization making the request
  • The format would have the capacity to list partner organizations in cases in which consent for partner marketing or processing was also requested
  • The format would be granular with a taxonomy of known kinds of consents. These might be parameterized, rather than being simple strings. In this case, I've included a few different contact mechanisms and the suffix "organization" to indicate this is consent for the organization itself, rather than any partners or other randoms.

Undoubtedly, a real implementation would end up a bit bigger than this, and perhaps more hierarchical, but hopefully not too much bigger.

The format could be extended very simply to include the response, which could then be sent back to the site and also made available on the page to the browser/password manager/apps etc. Here is an augmentation of the request format that also captures the responses:

{
    "format": "GDPR-Marketing-Consent-Bundle",
    "format_version": "1.0",
    "requesting_organization": "Stochastic Solutions Limited",
    "requesting_organization_data_protection_policy_page:
        "https://stochasticsolutions.com/privacy.html",
    "requesting_organization_partners": [],
    "requested_consents": {
        "marketing_contact_email_organization": false,
        "marketing_contact_mobile_phone_organization": false
        "marketing_contact_physical_mail_organization: true
    },
    "request_date": "2017-09-08",
    "request_url": "https://StochasticSolutions.com/give-us-your-data"
}

This example indicates consent to marketing contact by paper mail from the organization, but not by phone or email.

Exactly the same could be achieved with an HTML Microformat, perhaps with something like this:

<div id="GDPR-Marketing-Consent-Bundle">
    <span id="format_version">1.0</span>
    <span id="requesting_organization">
        Stochastic Solutions Limited
    </span>
    <span id="requesting_organization_data_protection_policy_page">
        "https://stochasticsolutions.com/privacy.html"
    </span>
    <ol id="requesting_organization_partners">
    </old>
    <ol id="requested_consents">
        <li>marketing_contact_email_organization</li>
        <li>marketing_contact_mobile_phone_organization<li>
        <li>marketing_contact_physical_mail_organization</li>
    </ol>
    <span id="request_date">2017-09-08</span>
    <span id="request_url">
        https://StochasticSolutions.com/give-us-your-data
    </span>
</div>

(Again, I've no idea whether this is actually what HTML-based microformats typically look like; this is purely illustrative.)

Useful?

I don't know whether this idea is useful or feasible, nor whether it is merely a half-baked version of something that an phalanx of people in Brussels has already specified, though I did perform many seconds of arduous due dilligence in the form of a web searches for terms like "marketing consent microformat" without turning up anything obviously relevant.

It seems to me that if something like this were created and adopted, it might help make GDPR and web/app-based consent avoid the ignominious fate of the cookie pop-ups that were so well intentioned but such a waste of time in practice. Ideally, some kind of collaboration between the relevant part of the EU and either W3C would produce (or at least endorse) any format.

Do get in touch through any of the channels (@tdda0, mail to info@ this domain etc.) if you have thoughts.


  1. So-called personally identifiable information (PII)

  2. Possibly even on a blockchain, if you want to be terribly au courrant and have the possibility of cryptographic verification. 

  3. technically straightforward; obviously this would require much work and hammering out of special cases and mechanisms for non-standard requirements.