Whoop Whoop!!!

We’re only a couple of days away from what promises to be an exciting 24 hours of data science bonanza. Official Website

EMC Greenplum, EMI Music and Data Science London have been teaming up to put together an online competition that will see the largest consumer dataset collected for the music industry take centre stage in what we’re calling a Hackathon.

Data Scientists from the UK and across the globe will be entering a 24-hour challenge to help predict how music fans might like songs released by artists (see press release).

Over 3 years, EMI have been collecting information from customers around the globe through different sets of surveys. Reaching nearly 1 million entries, this is the largest dataset of its kind in the music industry.

David Boyle, SVP of insight at EMI, has been heavily involved in making this contest a reality, working closely with EMC Greenplum in the UK and the USA. More important are the data scientists who will compete to win the £6500 cash Continue Reading

Your thoughts matter

Is there anything to be learned from the music industry? The Derby? And the Jubilee Celebrations?

I never sit down and write a blog just because it’s been six days since my last post. Usually I’m working through a business concept or challenge and something in another world will bring clarity to my thoughts. And like London buses, sometimes several ideas will arrive at once!

My last blog, “What is the collective name for a group of Data Scientists”, got such a positive response that I wanted to continue on the theme of community. Also, I’ve just returned from the 2nd Data Science summit, where Greenplum announced our continuing commitment to community with the launch of the Greenplum Analytics Workbench. This will enable the Apache Hadoop open source community to validate code to scale on a regular, ongoing basis. With contributions certified at scale, enterprises can run them with confidence. (download technical paper here)

I’d also attended a meeting of the Data Science London community with EMI to plan our next “Crowd Sourcing” data science Continue Reading

120 days into my tenure leading the Greenplum business in the UK and Ireland, we’ve just completed our first sponsorship of a global “Hackathon”. What a great experience. Working closely with the Data Science London community as part of their Big Data Week, we organized 200-plus data scientists in 10 cities to compete, over a 24-hour period, to predict the air quality in Cook County in the US. Seeing all the data scientists at Big Data Week got me thinking. What is the collective noun for a group of data scientists?

Continue Reading

So there I am, sitting on a horse waiting to compete in a show jumping competition, when one of the other competitors, having just completed their round, appears back in the warm-up ring. She says to her coach:

“I don’t know what went wrong. I corrected his (the horse’s) position before each fence, but we still knocked several of them down.”

The coach had clearly been here before:

“Yes, you did, and that’s the problem. You’re correcting the horse after the fact. You’re waiting for something to happen before you react. The winning riders are reading all the signals, predicting what might happen and taking corrective action first.”

“But there is so much going on,” says a clearly frustrated pupil. “The crowd, the speed and rhythm of the horse, the type of fence, the line of the fences, the next fence, the time left, my position, the horse’s lead leg…”

Having attended the Greenplum Data Science Event on Feb 8th, it’s clear that whether you are Marcus du Sautoy, Peter Hinssen or Professor Nigel Shadbolt, predicting outcomes based on vast quantities of data, just like our rider, is a key skill that business needs to develop.

Peter Hinssen discussed how the “power of participation is changing the world” and that “markets are becoming networks of intelligence and the need to have deep technology but also a drastically different consumption model.”

Whilst Marcus du Sautoy cautioned about identifying patterns too early Continue Reading

I’ve just returned from my “sheep dip” (sorry, new hire training) on the US west coast. What movie did I watch on the plane on the way over? You’ve guessed it:

MONEYBALL

How kul is that?

(You can tell I’ve been on the west coast as I’ve started using words like kul, which I haven’t done since the 70s.)

It’s the story of Oakland A’s general manager Billy Beane’s successful attempt to put together a winning baseball club on a budget by employing computer-generated analysis to draft his players.

A film with data science and predictive analytics at the heart of it – I’m now trendy, I can tell people what I do at dinner parties and they’ll understand.

“Yes, I employ Peter Brand characters who are great with numbers.”

One of the points I took away from the film was that the Billy Beane character had to stand firm with his commitment to a radically new approach despite a torrent of resistance Continue Reading

Day 6.  I’ve just taken over the running of EMC’s Analytics business (Greenplum) in the UK&I. Drinking from a fire hose!

Big Data this, Big Data that, Massively Parallel Processing, Scatter Gather Technology, Sentiment Analysis, Bayesian Statistics, and I’ve met twenty new people already. Chaos? Typical first week.

Then some bright spark from marketing tells me I’m to meet Marcus du Sautoy at a Data Science event on Feb 8th. “Look forward to it,” I say as I type his name into Google.

His big thesis is that although the world looks messy and chaotic, if you translate it into the world of numbers and shapes, patterns emerge and you start to understand why things are the way they are.

I need to be there. Marcus may be able to help. Thinking about it, I can’t be the only person trying to make sense of a deluge of data from many different sources and then trying to make some money from it. If you find yourself in the same boat, join me at the Data Science Series and hear not only Professor du Sautoy but also Peter Hinssen, Sean Gourley and others who have some great ideas and learning to share.

http://www.datascienceseries.com/index.php

If you’re interested in learning about building an analytics business for the future and learning about analytics in general then join me on my journey on my blog throughout 2012.
