Some questions for those jumping on the "Big Data" bandwagon


If "Big Data" is the answer, what was the question?

Back in 1984 I organised an event on the application of data mining techniques to very large financial services transaction databases to aid the detection of insider trading and fraud. That was a closed round table for the C3 Club of Investment Analysts and Fund Managers. Philip Treleaven did a tour de force, describing exercises where he and his students at UCL had already helped detect serious misconduct. Over the past thirty years the techniques have matured. Brought alongside the Internet and the World Wide Web, they now lie at the heart of a wide variety of on-line models: from enabling Internet Services to be funded by targeted advertising to the analysis of on-line gossip to detect political and social trends. They also lie at the heart of most of the world's monitoring and surveillance systems.

Today, thanks (in part) to the success of films like The Bourne Trilogy and television series like Person of Interest, they are being seen as either a threat to society or the "answer" to a whole range of problems facing central and local government. In consequence we have a rash of policy round tables and conferences planned over the next few months as lobbyists see a bandwagon on which clients anxious to fill their server farms and data centres or sell mass storage and analytical software can jump.

This has led to confusion on the part of Civil Liberties Lobbies as to whether their "real" enemy is the State or those who collect, collate and analyse data sets and sell the results to whoever pays. Meanwhile regulators huff and puff and exempt those with the legal fire-power to paralyse their enforcement activities.

In November 2008 EURIM organised a Director's Round Table on Information Governance. This was followed in February 2010 by a high profile round table with the title "Uncovering the Truth". Recordings of the discussion are available on the EURIM website. Some of the allegations as to the scale and nature of random and systemic error in public sector databases were truly frightening. Most speakers were focussed, however, on the potential for using better management of data to target and greatly improve public services, and on the issues that needed to be addressed in order to deliver that potential. Most of the latter were to do with the wetware, the people processes: from basic training through to high level governance. EURIM then organised a series of studies and reports on these, including topics such as trust, identity, value, quality and security by design under the broad umbrella topic of Information Governance.

Most of the events being organised this autumn still, however, obsess over hardware, software and the organisation of shared outsourcing. There appears to be a belief that hyping these will help kick start a new round of customer (particularly public sector customer) spend. I believe this is a waste of marketing effort, given the shortcomings of public sector files, the scale of leakage and compromise from centralised databases (public and private) and the financial pressures on customers to demonstrate rapid return on new ICT spend. There is good business around, including from using the PSN framework to help leverage shared services, including for data cleansing, security and fraud detection. But this is not the most cost effective way to access the more profitable opportunities.

I have therefore asked my successor to include another round table (building on the formula that was successful for the previous EURIM Round Tables) in the launch programme for the new Digital Policy Alliance. Given that the exercise will need to be funded by suppliers looking for new business, the aim will be to focus on those activities which help them switch their efforts from "pissing in the winds of change" into finding new and more profitable ways of helping their customers deliver more (and more securely and reliably) for less. I have done a brief for my successor with my ideas as to what these might be and who should be invited. If you want to know what is in that brief you will need to join the DPA and volunteer to join the planning group.

I will, however, say that, as Chairman of the Conservative Technology Forum, I am asking whether the time has come to make use of the laws of libel, slander and tort to void contractual small print and take action against those who (by design or negligence) pour streams of unvalidated data into leaky vats of toxic sludge which they then sell or publish. The House of Lords Select Committee on Personal Internet Safety made a similar point - which has been studiously ignored by all and sundry since then.

My aim is unashamedly to tilt markets in favour of those who are serious about working together to help educate and train their customers in the proper use of the people processes and support tools available to cleanse, collate and analyse the growing morass of data becoming available, including from smart phones, meters, grids, buildings and "things".

But what is "proper"?

That is a debate that should take place on an all-party basis - hence my desire to see an open and well-publicised DPA activity, as well as the rollicking session that I expect the CTF to organise under the Chatham House Rule.

P.S. Just been sent a link to a Register article on the integrated surveillance system sold by Microsoft to the NYPD. What really surprised me was the royalty deal. It was "Soooo 20th Century". I remember customers being persuaded to sign up to similar deals in the 1970s. The managers who signed the deals used to appear on the conference circuit, all expenses paid by the suppliers concerned, trying to drum up business. I do not recollect any of them generating a serious revenue stream for either customer or supplier. In consequence the model fell out of favour in the UK by the early 1980s. That is not, of course, to say that it is not valid - after all SAP was created using a similar business model to cover the development cost of application modules using standard business approaches within an inter-operability framework which it could knit together and resell.