Well, it’s a big question, but Chris Yiu of Policy Exchange certainly thinks the answer is “Yes”: Big Data can save the UK Government £33bn per year. I’ve just attended an event that EMC Greenplum co-hosted with Francis Maude at Policy Exchange, where the Cabinet Minister, Chris and James Petter, the EMC MD, all discussed how this could be achieved. One important recommendation was that the Government should adopt a Code for Responsible Analytics. Something Continue Reading

Your thoughts matter

Whoop Whoop!!!

We’re only a couple of days away from what promises to be an exciting 24 hours of data science bonanza. (Official Website)

EMC Greenplum, EMI Music and Data Science London have been teaming up to put together an online competition that will put the largest consumer dataset collected for the music industry centre stage in what we’re calling a Hackathon.

Data Scientists from the UK and across the globe will be entering a 24-hour challenge to help predict how music fans might like songs released by artists. (see press release)

Over 3 years, EMI have been collecting information from customers around the globe through different sets of surveys. Reaching nearly 1 million entries, this is the largest dataset of its kind in the music industry.

David Boyle, SVP of insight at EMI, has been heavily involved in making this contest a reality, working closely with EMC Greenplum in the UK and the USA. More important are the data scientists who will compete to win the £6500 cash Continue Reading


So what do the Large Hadron Collider (LHC) experiments at CERN and the announcement that scientists may have found the Higgs boson  particle have to do with the challenges faced by the UK Government and their drive to achieve savings?  The answer is ‘Big Data’ and the drive to develop new data analytic methods based on the manipulation of huge amounts of data.

The availability of affordable compute power and new platforms such as Hadoop and Greenplum HD designed specifically for big data analytics across unstructured datasets has changed the landscape of scientific research and will ultimately have a similar effect on public sector capabilities.  But the question is – how long will it take before Government leaders recognise the opportunity and act?

It is also worth noting that I believe that many people in Government find it hard to differentiate between the fundamental concept of ‘Big Data’ and the largely politically driven objective of ‘Open Data’.  But in the context of this blog post it may be worth holding this thought for a later article.

So how large is the big data challenge faced by CERN? Well, apparently this year the LHC experiments will generate 22 Petabytes of data, and that’s after 99% of the data from the experiments has been tossed. A single petabyte is a 1 followed by 15 noughts in bytes (a million gigabytes), so 22 of them is a big number and a big data challenge.
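To put those figures in context, here is a rough back-of-the-envelope sketch. It assumes decimal (SI) units, where a petabyte is 10^15 bytes, and it takes the "99% tossed" figure literally, which is my own simplification rather than anything CERN has published in this form.

```python
# Decimal (SI) storage units, expressed in bytes (an assumption; binary
# units such as pebibytes would give slightly different numbers).
GIGABYTE = 10**9
PETABYTE = 10**15

# Figure quoted in the post: data kept from one year of LHC experiments.
lhc_petabytes = 22

# Convert the retained data to gigabytes.
lhc_gigabytes = lhc_petabytes * PETABYTE // GIGABYTE

# If the retained data is the 1% left after 99% is discarded, the raw
# output is about 100 times larger (taking the 99% figure literally).
raw_petabytes = lhc_petabytes * 100

print(f"Retained: {lhc_gigabytes:,} GB")
print(f"Implied raw output: {raw_petabytes:,} PB")
```

So 22 Petabytes is 22 million gigabytes, and on that reading the experiments would be producing something in the region of 2,200 Petabytes before filtering.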

However, although it seems a big number, it is also comparable in size to data held by a number of UK public sector organisations.

But in many instances, the data held by Government is not considered a raw resource in the same way as CERN treats the data from the LHC. CERN has focused on exploiting this data to complete humanity’s understanding of the standard model of physics and to further our knowledge of the fabric of the universe. The UK public sector probably doesn’t have such a profound rationale for exploiting big data, but it still has a choice: it can either develop a strategy to exploit the value of the data it currently holds and make a real difference to public administration, services and outcomes, or continue to cocoon that data in costly retention practices and miss the opportunity to achieve big savings.

So how big are the potential savings? In its report ‘The Big Data Opportunity’, Policy Exchange estimates that ‘achieving cutting-edge performance could in time save the public sector up to £16 billion to £33 billion a year – equivalent to £250 to £500 per head of the population’.
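As a quick sanity check on how the total and per-head figures fit together, here is a minimal sketch that works backwards from the report's numbers to the population they imply. The variable names are my own, and the only inputs are the four figures quoted above.

```python
# Figures quoted from the Policy Exchange report (pounds per year).
savings_low = 16_000_000_000
savings_high = 33_000_000_000

# Per-head equivalents quoted alongside them (pounds per person per year).
per_head_low = 250
per_head_high = 500

# Dividing total savings by savings per head gives the population the
# report's figures imply at each end of the range.
implied_population_low = savings_low // per_head_low
implied_population_high = savings_high // per_head_high

print(f"Low end implies a population of {implied_population_low:,}")
print(f"High end implies a population of {implied_population_high:,}")
```

Both ends of the range come out in the mid-60-millions, so the per-head figures look like the totals divided by the UK population and then rounded.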

So pretty substantial and an opportunity not to be missed!


Is there anything to be learned from the music industry? The Derby? And the Jubilee Celebrations?

I never sit down and write a blog just because it’s been six days since my last post. Usually I’m working through a business concept or challenge and something in another world will bring clarity to my thoughts. And like London buses, sometimes several ideas will arrive at once!

My last blog, “What is the collective name for a group of Data Scientists” got such a positive response that I wanted to continue on the theme of community. Also I’ve just returned from the 2nd Data Science summit, where Greenplum announced our continuing commitment to community with the launch of the Greenplum Analytics Workbench. This will enable the Apache Hadoop open source community to validate code to scale on a regular, ongoing basis. With contributions certified at scale, enterprises can run them with confidence. (download technical paper here)

I’d also attended a meeting of the Data Science London community with EMI to plan our next “Crowd Sourcing” data science Continue Reading


The traditional Western view of China as the ‘workshop of the world’ is rapidly melting into the mist. The consumer classes are buying more and – supported by their government – making less. Labour costs are rising at 20% a year:

“In this decade, China will be driven by consumers, not manufacturers”

– Anna Stupnytska, executive director of Goldman Sachs’ Investment Management division.

But this is not a one-way street, and China is not a blank cheque – we should see this more as the start of new ways of innovating, bringing products to market and the creation of new business relationships. At the recent ‘Retail Futures 2012’ event at the Future Laboratory, the developing economic, consumption and production picture was painted as much more nuanced, complex and multi-tonal than a set of crass ‘x’ and ‘y’ axes.

The buzzwords INDOVATION (pertaining to India) and SYNDOVATION (pertaining to China) refer to innovation and products Continue Reading


120 days into my tenure leading the Greenplum business in the UK and Ireland and we’ve just completed our first sponsorship of a global “Hackathon”.  What a great experience. Working closely with the Data Science London community, as part of their Big Data Week, together we organized 200 plus data scientists in 10 cities to compete, over a 24 hour period, to predict the air quality in Cook County in the US.  Seeing all the data scientists at Big Data Week got me thinking. What is the collective noun for a group of data scientists?

Continue Reading


I’m here at Newbury races and in the horse racing community the end of March is a time of both reflection and anticipation.

The jumping community reflects on how their horses performed at the Cheltenham festival in March: years of work culminating in one week of championship racing in the Cotswold countryside. Others in the community are anticipating the Aintree festival, where the most famous of races, The Grand National, is to be run in a few weeks.

And anticipation is the watchword in the flat racing community. With the first of the five classic races only six weeks away, the new season is just under way, and trainers and connections wait to see Continue Reading


So there I am sitting on a horse waiting to compete in a show jumping competition when one of the other competitors, having just completed their round, appears back in the warm up ring. She says to her coach

“I don’t know what went wrong, I corrected his (the horse’s) position before each fence, but we still knocked several of them”

The coach had clearly been here before:

“Yes you did and that’s the problem. You’re correcting the horse after the fact. You’re waiting for something to happen before you react. The winning riders are reading all the signals and predicting what might happen and taking corrective action first”

“But there is so much going on” says a clearly frustrated pupil, “The crowd, the speed and rhythm of the horse, the type of fence, the line of the fences, the next fence, the time left, my position, the horse’s lead leg…”

Having attended the Greenplum Data Science Event on Feb 8th, it’s clear that whether you are Marcus Du Sautoy, Peter Hinssen or Professor Nigel Shadbolt, predicting outcomes based on vast quantities of data, just like our rider, is a key skill that business needs to develop.

Peter Hinssen discussed how the

“power of participation is changing the world” and that “markets are becoming networks of intelligence and the need to have deep technology but also a drastically different consumption model.”

Whilst Marcus Du Sautoy cautioned about identifying patterns too early Continue Reading

February 10 2012

Big data – is it much ado about nothing or is it, as it’s hyped to be, the saviour of many an organisation? People are still struggling to come to terms with big data and exactly what it is. Either way, it has certainly established itself as the key technological trend for 2012.

For a variety of reasons, organisations are now gathering and storing more data than ever before. In its raw form, however, data is useless. It is information which is of benefit to a company, which raises the question: how do you get information from data? Simple: data analysis.

Big data is the process of taking large, complex data sets, which may come from a range of sources and in a variety of formats, and analysing them to extract relevant, tangible information. Whether it is garnering greater insight into the behavioural patterns Continue Reading


I’ve just returned from my “sheep dip” (sorry, new hire training) on the US west coast. What movie did I watch on the plane on the way over? You’ve guessed it:

MONEYBALL

How kul is that?

(You can tell I’ve been on the west coast as I’ve started using words like kul, which I haven’t done since the 70s.)

It’s the story of Oakland A’s general manager Billy Beane’s successful attempt to put together a winning baseball club on a budget by employing computer-generated analysis to draft his players.

A film with data science and predictive analytics at the heart of it – I’m now trendy, I can tell people what I do at dinner parties and they’ll understand.

“Yes, I employ Peter Brand characters who are great with numbers”

One of the points I took away from the film was that the Billy Beane character had to stand firm with his commitment to a radically new approach despite a torrent of resistance Continue Reading
