Tim Nash "stuff" Blog

Should online Behavioural Profiling respect privacy?


On my walk to the office in the morning, I passed through 3 RFID-enabled doors (1 in my flat complex and 2 at my office) and around 35 CCTV cameras. That doesn’t sound like a lot… except my office is less than a mile from my home and takes me just 10 minutes to walk. That is 3.5 cameras a minute. Basically, every inch of my route, I’m being monitored.
Privacy a myth?
The web is no different. It’s just as overt, and people, I think quite rightly, complain. They should also have a way to opt out. Now, I’m also someone who is big on behavioural modelling. The problem is I want data, and the more data I can get about a user, the better. So, on the one hand, I believe everyone should have a way to opt out. On the other, I really don’t want them to; they will ruin my stats.

I’m currently midway through a major behavioural modelling project. It’s complete with large scale re-targeting, both within the site and via advertising networks, as well as using techniques such as the CSS history hack to collect data about whether a visitor has visited our competitors.

Indeed. Looking at the sort of stats we are collecting:

Everyone coming into one of the 3 sites is tagged using a browser fingerprint, at a level of detail similar to what the Panopticlick project uses in its efforts to educate users about staying safe. We also use persistent storage, in the form of persist.js, to “cookie” the visitor. Lastly, any third party software that can accept custom values has the user hash added, making it as easy as possible to track a user across the board.
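
To make the “user hash” idea concrete, here is a minimal sketch of how a handful of browser attributes could be reduced to a single identifier. It is an illustration only, not the code behind this project: the function name `visitor_hash`, the attribute names and the choice of SHA-256 are all assumptions.

```python
import hashlib


def visitor_hash(attributes: dict[str, str]) -> str:
    """Reduce a set of browser attributes to one stable identifier."""
    # Sort the keys so the same attributes always give the same hash,
    # whatever order they happened to be collected in.
    canonical = "\n".join(f"{key}={attributes[key]}" for key in sorted(attributes))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical attribute values, purely for illustration.
example = {
    "user_agent": "Mozilla/5.0 (X11; Linux x86_64)",
    "screen": "1920x1080x24",
    "timezone_offset": "0",
    "plugins": "Flash 10.1;Java 1.6",
}
print(visitor_hash(example))
```

The more attributes that go in, the more likely the hash is to be unique to one visitor, which is exactly why it works for tracking and why it worries people.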

To put this in perspective, with 1 click we can retrieve:

  • If the user has purchased
  • How they arrived on the site
  • A rough idea of age
  • A rough idea of gender
  • Where they are coming from (not just country but are they at home or work)
  • If they have visited our competitors
  • What pages they have visited
  • What offers enticed them
  • If reinforcement marketing is working
  • Any lead mechanisms (email/twitter etc) we may have them subscribed to
  • If they are part of a focus or test group
  • What ad group they arrived in
  • Which split tests they have been set up with
  • Where they clicked on a page and when

Basically, everything they have done on the site to the tiniest detail can be looked at, analysed and dissected. What’s more, the average user will not have a clue. That list would terrify many. I mean, if little peeps like us are doing it, then imagine what people like Google are doing. Best get your foil hats now!

Privacy at heart of Behavioural driven campaigns

One of the things that has been important for us from the start of the campaign is for our visitors to be in control, well, a bit anyway. If possible, we want them to be able to opt out of our Orwellian vision. The problem is how?

Removing data

Let’s assume we have a user who does want to opt out. The first stage is to remove their data. Since our system has a database, this is fairly simple. Just delete their row in the visitors table and any associated data in the meta table. Small snag: this doesn’t remove the data in third party applications and causes data corruption in the master table. Really, there is not much we can do about the 3rd party applications. Where possible, you can try to automate them, but normally the only option you are left with is giving a user a link to the application’s opt out procedures, if indeed they have one at all!

With our own applications, we have gone down the route of what we term “anonymous annihilation”. All our users are split into testing groups, and the user’s information is overwritten by an average of all those in the test group. The only data we keep exact is country. The IP is overwritten with 999.999.999.999, which is not a valid address and so gives us an easy way to exclude the data in reporting, and their user agent fingerprint is reduced by removing all the plugin data. Suddenly, we can’t tell them from Adam, except for that persistent “cookie”, which actually is quite a pain to remove. But hey ho! That was the point. The issue is how do we not track them in the future?
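
A rough sketch of the averaging approach described above, assuming visitor records are plain dictionaries. The field names (`ip`, `plugins`, `country`, `pages_viewed`) and the function name `anonymise` are hypothetical; a real schema would differ.

```python
from statistics import mean

# Deliberately not a valid IPv4 address, so reports can filter on it.
SENTINEL_IP = "999.999.999.999"


def anonymise(visitor: dict, group: list[dict], numeric_fields: list[str]) -> dict:
    """Return a copy of `visitor` with identifying details averaged away."""
    cleaned = dict(visitor)
    # Replace each numeric measurement with the test group's average.
    for field in numeric_fields:
        cleaned[field] = mean(member[field] for member in group)
    cleaned["ip"] = SENTINEL_IP
    # Drop plugin data so the browser fingerprint becomes far less distinctive.
    cleaned["plugins"] = []
    # "country" is left untouched: it is the one value kept exact.
    return cleaned


group = [
    {"ip": "192.0.2.10", "country": "GB", "plugins": ["Flash"], "pages_viewed": 4},
    {"ip": "192.0.2.20", "country": "GB", "plugins": ["Java"], "pages_viewed": 8},
]
print(anonymise(group[0], group, ["pages_viewed"]))
```

Returning a copy rather than editing the record in place keeps the sketch easy to test; in a database this would be an update to the visitor’s row instead.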

Cooking the excluded

The only real way to exclude someone in an opt-out system is to know they have opted out! But to know they have opted out, we need to either maintain some information or tag them in some way. Neither of these options is very palatable to the end user, but at the moment it is the only real solution. When opting someone out, I suggest using a traditional cookie rather than persistent storage, naming it clearly, and making it clear to the user that this is what you have done. The downside: if they clear their cookies and come back, you generate a new profile and the cycle starts again. But hey, you tried!
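
As a minimal sketch of the clearly named opt-out cookie, using Python’s standard `http.cookies` module. The cookie name `tracking_opt_out` and the one-year lifetime are assumptions chosen for illustration.

```python
from http.cookies import SimpleCookie

# Hypothetical cookie name; the point is that it says what it does.
OPT_OUT_COOKIE = "tracking_opt_out"


def opt_out_header(max_age_days: int = 365) -> str:
    """Build the value of a Set-Cookie header that records the opt-out."""
    cookie = SimpleCookie()
    cookie[OPT_OUT_COOKIE] = "1"
    cookie[OPT_OUT_COOKIE]["path"] = "/"
    cookie[OPT_OUT_COOKIE]["max-age"] = max_age_days * 24 * 60 * 60
    return cookie[OPT_OUT_COOKIE].OutputString()


def has_opted_out(cookie_header: str) -> bool:
    """Check an incoming Cookie header for the opt-out marker."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    return OPT_OUT_COOKIE in cookie and cookie[OPT_OUT_COOKIE].value == "1"


print(opt_out_header())
print(has_opted_out("tracking_opt_out=1; theme=dark"))
```

The check would need to run before any profile is created or updated, otherwise the visitor is tracked on the very request that carries their opt-out.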

Looking to the future

Right now, there is a lot of talk about “Do not track” methods, especially amongst browser manufacturers. Google is releasing a new extension to allow you to prevent tracking (the irony surely will not be lost on them) and there is a more public discussion from the Mozilla team. Both seem to be heading down the route of the browser making the decision to prevent storage, which is great in principle, but has 2 major obstacles to overcome:

Persistent storage is all about hiding things in the most obscure places such as flash storage, where browsers do not have control. Therefore, simply assuming the browser is in control of all storage would be a mistake.

Blanket banning of stored data would be frustrating and would effectively break a lot of the modern web. Cookies and storage are used in every aspect of web development, from ad tracking through to analytics, to storing shopping carts, to changing the colour of a site. Users are not going to want to be prompted every time, so they are likely to adopt an on or off approach.

One of the more interesting and hopeful projects is the idea of using headers. Proposed by Mozilla, the idea is that the client browser sends an HTTP header to the server, telling the server the user does not want to be tracked.
X-Tracking-Choice: do-not-track
It’s then up to the server to determine how to handle this. I think this is a great step forward with one major addition.
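
As a rough illustration of the server side, here is a sketch of a check for the header shown above. It assumes request headers arrive as a plain dictionary; the function names and the `record_visit` callback are hypothetical stand-ins for whatever tracking code a site runs.

```python
def wants_no_tracking(headers: dict[str, str]) -> bool:
    """Return True if the request carries the do-not-track header."""
    # HTTP header names are case-insensitive, so normalise before comparing.
    normalised = {name.lower(): value.strip().lower() for name, value in headers.items()}
    return normalised.get("x-tracking-choice") == "do-not-track"


def handle_request(headers: dict[str, str], record_visit) -> str:
    """Only call the tracking code when the visitor has not objected."""
    if wants_no_tracking(headers):
        return "untracked"
    record_visit(headers)
    return "tracked"


seen = []
print(handle_request({"X-Tracking-Choice": "do-not-track"}, seen.append))
print(handle_request({"User-Agent": "ExampleBrowser/1.0"}, seen.append))
```

Nothing forces a server to run a check like this, which is the weakness of the header approach: it only states a preference.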

Telling a browser to send the header

I would like to see a method that allows sites to instruct a browser to send the do not track header. In effect, when someone clicks opt out, the site tells the browser the user wishes to opt out. Now, obviously, you don’t want a site to be able to opt people in, so the mechanism should be one way, and not mandatory for the browser (i.e. it shouldn’t override an existing user preference).

The mechanism I propose has one major issue. At the start I explained this was a multi-site campaign, but the mechanism works for only one site, and I can’t see a safe way around that.

What do you think? Should we adopt the Do not track header? What about the ability for a site to ask a browser to enforce it? Would other advertisers use it?



2 comments

  • gneesham

    I think the Do not track header is a great idea. When collecting data, a user should always have the option to opt out. The problem is, as a stat collector you want to be able to collect all the data you can, and so I wonder if other advertisers would be as ethical as you have been here.

  • Tim Nash

    Probably not, but it’s actually really not easy to do. Even if all browsers followed Mozilla’s idea and implemented my own, you would still have arguments about what constitutes tracking. It’s very much aimed at advertisers providing the technical opt out. I can see larger companies doing that, but smaller ones, and Joe Bloggs with analytics on his site, are unlikely to have the technical know-how to do so, let alone the will.

    Still, something will come of it, I hope.
