How much do you reveal about yourself to the sites you visit?
Without you filling in a form, there is a strong possibility that a site can get hold of:
- Your Name
- Age
- Shipping Address
- Some basic Credit details.
They can almost certainly make educated and pretty accurate guesses as to:
- Whether you are at work
- Your income bracket
- Your gender
In a few users' cases they might even be able to tell you what you had for breakfast, all in a few seconds. Scared?
We recently worked with a client who was looking to improve conversions on their alternative payment methods, in this case primarily Google Checkout. They were disappointed by the take-up of their Google Checkout option (with such low commission for the first year I suspect they were expecting significant savings), so we were asked to develop a method to target existing Google Checkout users and present them with the one-click checkout option in favour of the traditional credit card approach.
Solution
We ran a very short questionnaire on sales made, asking one simple question:
Do you use any of the following Google Services (tick as many as you use):
- Gmail
- Google Calendar
- Google Checkout
- Google Reader
- Google Personalised Homepage
The results were pretty staggering: 55% used at least one of the above services, with Google Personalised Homepage making up the majority, followed by Google Checkout!
Making it worthwhile
OK, so we now know there is a feasible number of users accessing Google services, and a reasonable share (12.4%) using Google Checkout, to make targeting worthwhile. That 12.4% is of survey participants, and we would normally allow an error margin of plus or minus 2%, which means theoretically up to 14.4% could be using Checkout. That's not a small minority. If we could up the number of completed checkouts by even a few percent it would make a difference.
The idea was fairly simple: instead of presenting a Pay with Credit Card button with a Google Checkout button as one of the smaller alternate payment options, we would show the Google Checkout button as the primary option and Pay with Credit Card as the alternate.
This idea could backfire: users may have made a single one-off payment with Google Checkout and absolutely hated it! We have no way of knowing, so we may be suggesting something they loathe, but let's assume for now that paying via Google Checkout is the preferred option for all those who have previously used Checkout. But if it's a new user, how do we know?
Iframes, JavaScript and the DOM
If I were to get you to click this link https://www.google.com/accounts/ManageAccount?service=profiles&hl=en chances are that it would open your Google profile. Why is that? Well, mainly because the people reading this are likely to use one or more of Google's services, with a fair few using Gmail. Now, what's the first thing you do when you get online?
OK, so this page is going to be accessible via JavaScript for a lot of users; to access it we simply open it in an iframe. We can then use JavaScript to pull data out by calling the document in the iframe, using something similar to:
window.frames['myframe'].document
where myframe is the name (or, in some browsers, the id) of the frame. Once we have access to the document we can use a range of techniques for extracting data, from getElementById through to implementations of XPath (since Google can't manage nicely designed web pages, the latter is needed).
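A minimal sketch of that access pattern, assuming a frame called myframe as above. The helper name getFrameDocument and the idea of passing the window in as a parameter are mine, purely for illustration. Where a browser refuses to let the page read a frame from another domain, touching .document throws, so the helper catches that and returns null.

```javascript
// Hypothetical helper: return the document inside a named frame,
// or null if the frame is missing, not loaded yet, or access is refused.
function getFrameDocument(win, frameName) {
  try {
    var frame = win.frames[frameName];
    if (!frame || !frame.document) {
      return null;
    }
    return frame.document;
  } catch (e) {
    // Browsers that block the read throw here instead of returning a document.
    return null;
  }
}
```

In a page this would be called as getFrameDocument(window, 'myframe'), and only once the frame has finished loading.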

Note: it's worth remembering that this sort of implementation is subject to the whims of the site you're grabbing the data from! If they change the page layout you will need to match it. You are also reliant on a) the user being logged in and b) their browser playing fair. Each browser handles cross-domain requests differently; for example, IE is a lot stricter than, say, Google Chrome, and the above code will not work on most browsers without some modification.
If you take a look at the above you can see the section marked default payment. You will forgive me for having removed some personal details, but it shows the default card type and last 4 numbers if any card is associated with the Google account. So all we need to do is check whether there is a 4-digit code there. If there is, we can assume the user is a Checkout user; if there is not, or if we couldn't log in, we assume they are not.
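That check can be sketched roughly as follows. The function names, the return values and the sample text "VISA xxx-1234" are all assumptions for illustration; the real wording on the account page may differ, and a stray four-digit number elsewhere in the text (a year, say) would also match.

```javascript
// Hypothetical check: does the text of the "default payment" section
// contain a standalone group of exactly four digits?
function looksLikeCheckoutUser(paymentText) {
  if (typeof paymentText !== 'string') {
    return false; // nothing readable, e.g. the user is not logged in
  }
  return /\b\d{4}\b/.test(paymentText);
}

// Pick which payment button leads, following the idea described earlier.
function primaryPaymentOption(paymentText) {
  return looksLikeCheckoutUser(paymentText) ? 'google-checkout' : 'credit-card';
}
```

For example, primaryPaymentOption('VISA xxx-1234') picks the Google Checkout button, while primaryPaymentOption(null) falls back to the credit card button.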
Once they have made their first purchase we can ask them to create an account and let them set a preference for how they wish to pay.
We have only been running the new system for a few days, but the initial data is positive. Sadly we can't do a similar thing for PayPal and would need to rely on the CSS history trick (see below).
All your data is mine
Let's go and look at that Google account page again. I'm pretty frugal with my details, yet on that page there is my full name, email address and a shipping address (probably even my home). If you go down to the services, you will quickly realise that I use this account for work, as some of my services include Google Website Optimizer, AdWords and Analytics.
So what about other sites and other information? For example...
MSN / Yahoo
Similar to Google, both have a central account location with lots of information to grab should you choose to.
Facebook
http://www.facebook.com/home.php?ref=home
Gives basic login info, including the user's full name and their Facebook ID via the profile link, which allows us to access:
http://www.facebook.com/profile.php?id=FBID&ref=profile#/profile.php?id=FBID&v=info&viewas=FBID
Which gets you the user's complete profile: date of birth, mobile phone number, etc. Oh, and all their friends' information too (well, what they shared with your original victim, sorry, customer).
Twitter
I did say we could find out what they had for lunch. While Twitter information can be obtained through the iframe method, my friend TheHodge pointed out a much simpler alternate method utilising a bit of JavaScript and the Twitter API. The downside is that, like the rest of Twitter, you are reliant on the thing being up! Still, I did promise we could find out what some people had for lunch, and what better place? Once you've got the username, just pull in their feed and mine it.
Profiling and Data guessing
Once we have all this data we can start making some guesses and profiling users. We could go the whole hog and do credit checks, after all we have a name and current address, but in the UK that would probably be abusing your credit licence, plus it is time consuming. We could also pull details from the Land Registry on their property and then estimate an income bracket, or we could look up their job description. I can't find the study, but the BBC ran a report a while back claiming people were less likely to lie about their job title and description on Facebook than on other sites such as LinkedIn. Always worth knowing.
Identifying if someone is at work
This is pretty easy. Verify they have a job. Check the current time in their location: is it between 9 and 5? Do a reverse DNS lookup on their IP address and check the hostname against a list of common terms such as residential. If their ISP shows no sign of being primarily for non-commercial use, it's within normal working hours and they do have a job, it's a reasonable assumption that they are at work! Of course they might not be, but chances are they are.
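Put together, those three checks might look something like the sketch below. The function name, the shape of the visitor object and every term in the list other than "residential" are my own assumptions, and the reverse DNS lookup itself is assumed to have been done elsewhere (on the server, for instance).

```javascript
// Terms suggesting a home connection. 'residential' comes from the text above;
// the others are illustrative guesses, not a vetted list.
var RESIDENTIAL_TERMS = ['residential', 'dsl', 'cable', 'dyn'];

// Hypothetical heuristic combining the three checks.
// visitor: { hasJob: boolean, localHour: 0-23, hostname: reverse DNS name or '' }
function isLikelyAtWork(visitor) {
  if (!visitor.hasJob) {
    return false;
  }
  // "Between 9 and 5" in the visitor's local time.
  var inOfficeHours = visitor.localHour >= 9 && visitor.localHour < 17;
  if (!inOfficeHours) {
    return false;
  }
  var host = (visitor.hostname || '').toLowerCase();
  for (var i = 0; i < RESIDENTIAL_TERMS.length; i++) {
    if (host.indexOf(RESIDENTIAL_TERMS[i]) !== -1) {
      return false; // hostname looks like a home ISP
    }
  }
  return true;
}
```

It is only ever a guess: weekends, shift workers and people working from home will all be misjudged.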
What if they use a secure session?
Some sites like PayPal won't let you stay logged in, so you can't grab information from the browser window unless they happen to be using PayPal within a few minutes of visiting your site. If we want to know whether they have visited PayPal we will have to go down an alternate and less reliable route.
CSS Browser History
Your browser can identify visited links, so when a page renders it knows which of the linked pages you have been to before. Web designers can even style these visited links with a different colour, which means it's equally easy for us to identify where a user has been. The first step is to generate a list of possible locations and output them as links, then create a very specific visited colour. When the user visits the page, the links they have already visited will be shown in your specific colour. We then run a piece of script to identify elements with that specific colour on the page, and we now have a history of where they have been.
Note: this is flaky, and I mean really flaky. I have used the term URL but I should really have said URI: it is the exact location you have been to that matters. For example, http://www.example.com and http://www.example.com/#hello are not the same, and in some browsers (such as Chrome) they would not both show in the CSS browser history unless you had visited both.
For more information, and for the original script, see the Ha.ckers.org CSS History Hack.
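The colour check can be sketched as below. This is not the Ha.ckers.org script; the function name, its parameters and the marker colour are assumptions for illustration. Browsers have since tightened what scripts can read about visited links, so treat it as a picture of the idea rather than something that works everywhere.

```javascript
// Hypothetical sketch: collect the href of every link whose computed colour
// matches the marker colour given to a:visited in the stylesheet.
// doc: a document-like object
// getStyle: a function returning an element's computed style
// markerColour: the colour as the browser reports it, e.g. 'rgb(255, 0, 254)'
function findVisitedLinks(doc, getStyle, markerColour) {
  var links = doc.getElementsByTagName('a');
  var visited = [];
  for (var i = 0; i < links.length; i++) {
    if (getStyle(links[i]).color === markerColour) {
      visited.push(links[i].href);
    }
  }
  return visited;
}
```

In a page it might be called as findVisitedLinks(document, function (el) { return window.getComputedStyle(el); }, 'rgb(255, 0, 254)'), with a stylesheet rule setting a:visited to that same colour.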
Final thoughts
If this has worried you a little then remember the solution is simple: log in and out of your sites, and if you must stay logged in, use a different browser (or, in Chrome and the new Firefox, use their privacy mode).
For marketers, however, the ability to know as much as possible about a user is gold, but only if you can properly utilise it. Keep in mind, too, that data protection and retention regulations mean a lot of the data you find can't be kept, so it can only be used for one-off judgments (such as assigning a profile group or changing checkout info around).
So now I will leave you with a simple question: how much are you worth? Your data, I mean. How much do you think all the data you're exposing is worth?
Are you interested in profiling and grouping your users? Then why not think about hiring a Stuff Consultant! See my consulting services for more information, or why not get in touch!

CCleaner rules!
That only helps on session exit, not while you're currently surfing,
though it will help with the CSS history aspect.
Sounds like you have been busy on another great research project. After seeing Google display my Adsense id and email accounts on search pages on another browser tab, I quickly realized that I was being tracked.
I had little doubt that Google tracks everything we do, they’ve not exactly made a secret of it. But, in my limited knowledge of cookies and tracking it seemed quite probable that others could gain information to my accounts if I surfed while logged in. Your findings prove my suspicions to be correct.
Good for marketing, I suppose; but horrific for personal data security. The last 4 digits and type of credit card??? Guess I won’t be using Google Checkout if I can help it.
It’s worth remembering that the above does not gain access in the traditional sense: your browser (i.e. you) is doing the access, and the information is then just being extracted and shunted back to the site. It’s also worth remembering that it takes time, so for a really deep profile the site would have to keep you stationary for a little while, say with a really long blog post…
or a Flash game. It’s then reliant on a bit of social engineering. Humans are creatures of habit: the first thing I do is open my emails, the second is look at messages, the funny picture of my mate’s Friday night out, etc. Of course, in theory this should affect users surfing the web from work less, because depending on company policy they wouldn’t have opened a Gmail account or Facebook, but how many people really obey company policies?
I didn’t think you could access the attributes of an iframe’s contents via javascript when the main page and the contents of the iframe were on different domains?
Wouldn’t you just hit permission denied on the window.frames['myframe'].document.getElementById?
I must be missing something?
It does tend to depend on the browser as to exactly what you can do. For example, if your user is using IE then you have to jump through a couple of additional hoops, but most allow limited access to read and extract content, by getElementById for example. Where cross-domain issues come in is if I wanted to, say, change the wording in the other frame when it was from a different domain. Each browser then has its own rules about what it can or can’t do. I also tended to notice lots of related issues with running long functions, but I suspect that’s a timeout issue. As mentioned in the article, a lot of people get access denied problems not due to security restrictions but because the frame has yet to load.
A lovely gentleman I was talking with the other day, who wanted to remain nameless, suggested that JavaScript executed through Flash was an alternate method, as it bypasses many of the safeguards put in place by the browser but in turn meets Flash’s own limited ones. I haven’t tested this, but he did show several impressive examples using what appeared to be this technique.
It really needs to be emphasised that the information you can grab is very, very limited. There is an exception to this, and that is if the JavaScript is run on the local machine, i.e. if I could convince a user to execute a script residing on their local machine, then I would have full read/write access.
Just to add that if I managed to social engineer a person into launching a script locally, I think we could come up with something a bit more nefarious than reading website domains.
Ultimately this post is about social engineering and profiling, grabbing data from any source you can, be it Twitter, Facebook or blog comments.
Tim, thanks again for a great case study. We have been using online profiling for our ecommerce site. Your case study has shown us another path to explore.
Since we have recently implemented Google Checkout, I will be trying your method to increase our gc conversions.
Thanks
Matt
Responses to this post: