Posts Tagged ‘measurement’
In a recent post, Haowen Chan and Robin Morris warn “the last thing you want to do is implement a [big data] system that develops and propagates data, only to learn it’s hopelessly biased.” All research and analysis has bias built in by the very nature of human involvement. However, Chan and Morris provide four useful bias-quelling tactics that can be used to improve the big data science process:
- Employ domain experts. Rely on them to help select relevant data and explore which features, inputs and outputs produce the best results. If heuristics are used to gain insights into smaller data sets, the data scientist will work with the domain expert to test the heuristics and ensure they actually produce better results. Like a pitcher and catcher in a baseball game, they are on the same team, with the same goal, but each brings different skill sets to complementary roles.
- Look for white spaces. Data scientists who work with one data set for periods of time risk complacency, making it easier to introduce bias that reinforces preconceived notions. Don’t settle for what you have; instead, look for the “white spaces” in your data sets and search for alternate sources to supplement “sparse data.”
- Open a feedback loop. This will help data scientists react to changing business requirements with modified models that can be accurately applied to the new business conditions. Applying Lean Startup-like continuous delivery methodologies to your big data approach will help you keep your model fresh.
- Encourage your data scientists to explore. If you can afford your own team of data scientists, be sure they have the space and autonomy to explore freely. Some equate big data to the solar system, so get out there and explore this uncharted universe!
We can also consider what bias we are encouraging when we develop systems – from social media plugins to smart objects – which collect ‘big data,’ or data which could be aggregated into big data analysis. Might we be painting an unfair picture of our data subjects, either by misrepresentation or by omission? Collection, processing and analysis are all crucial to consider in the quest for useful and accurate big data outcomes.
Business Insider has shared a fascinating look at what helped the Obama campaign raise so much money during his recent successful presidential bid. The key was a highly successful combination of science and creativity – with what has been described as “strange, incessant, and weirdly over familiar” email subject lines and content.
A/B testing is a technique popular with web designers. It involves showing two different versions of a page to users – and measuring which gets the better response (this could be in terms of time spent on page, or the completion of a desired goal, e.g. a purchase or successful registration). The Obama campaign triumphed by being brave and cheeky, and by incessantly optimising subject lines, content and formatting (often with as many as 18 variations) to find out what achieved the best results for its fundraising emails. In the end, the ‘winning’ email subject line was ‘I will be outspent’ – a rather passive-aggressive line that evidently played on Obama supporters’ worst fear: that his opponent would spend more, and win the election on that basis.
This is a strong reminder of how valuable access to data is in running successful communications activity. Even if you are working agency-side and are somewhat removed from your client’s analytics, it is imperative to know what is working by getting access to as much data like this as possible from across their channels.
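For anyone who does get hold of that data, the A/B comparison described above can be sketched in a few lines. This is only an illustration, not the campaign’s actual method: it assumes a two-proportion z-test (one common way to compare two conversion rates), and the function name and the email counts are made up for the example.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare the conversion rates of variants A and B.

    conv_* = number of recipients who completed the goal (e.g. donated),
    n_*    = number of recipients who were sent that variant.
    Returns (rate_a, rate_b, z, p_value) for a two-sided test.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the assumption that both variants perform the same.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        raise ValueError("no variation in outcomes; the test is undefined")
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_a, rate_b, z, p_value


# Hypothetical figures: two subject lines, each sent to 10,000 people.
rate_a, rate_b, z, p_value = two_proportion_z_test(120, 10_000, 165, 10_000)
print(f"A: {rate_a:.2%}  B: {rate_b:.2%}  z = {z:.2f}  p = {p_value:.4f}")
```

A small p-value suggests the difference between the two subject lines is unlikely to be chance. With many variations in play (the campaign reportedly tried as many as 18), some apparent ‘winners’ will crop up by luck alone, so the threshold for calling a winner should be stricter than for a single A/B pair.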
Posted on September 2, 2012
“Movenbank just released its financial scoring system that allows users to monitor and understand their financial data in a whole new way. This innovative real-time financial credibility score combines data from shopping patterns, daily spending and social influence into a personalized ‘CREDscore.’ … As part of their services Movenbank will provide instant real-time feedback on spending, with personalized insights that affect behavior.”
Today I’ve been checking out Movenbank – in the context of social media data beginning to affect our financial statuses. The site / service is presented very much from the perspective that banks have been letting us down with the way they offer products and services, and make decisions about our credit scores, and loans, unfairly… taking this ‘out of our hands.’
Movenbank will, in contrast, develop a view of individuals based on personality, and behaviour, some of it determined by a ‘fun personality questionnaire that identifies your financial profile:’
It’s too early to judge Movenbank, or what this financial services innovation will do for us overall. But I think it’s worth asking a few questions about the implications of tying up self-identified personality traits, social media data and shopping behaviour to our financial ‘CRED.’
How will it work? Will we get a better score from consistently buying sensible organic vegetables and pulses, rather than last minute flights to Biarritz or a gorgeous pair of Alexander McQueens? And why should we?
As I understand it, credit scoring is determined by how much you have been earning, borrowing, and paying back. What do we really gain from adding social media and precise shopping activity into the mix? Will some with poor traditional credit scores be able to borrow more? Will some with good traditional credit scores be marked down in CRED for a personality or behaviour that this new score deems dodgy in some way?
Seems to me many of our financial problems have been caused by too much credit, not too little… if this is another way of opening the gates for people deemed a ‘risk’ by other measures, to borrow… is it a step forward? And will it really be fairer for us to be financially scored on our Facebook likes, tweets, and late night impulse Amazon purchases? For our financial status to be based on what we buy, and who we are, or have constructed ourselves to be, as well as, or instead of, how much we spend, borrow and repay?
Vegetables image from Bread, Water, Salt, Oil