How to Evolve in Data Science: Memoirs from Strata and Hadoop

Posted 2017-03-23 by Tom

By Maya Dillon, Pivigo Community Manager

What and why Strata+Hadoop?

In early May I attended the Strata + Hadoop World Summit in London: three intensive days dedicated to living and breathing data science and analytics. The webpage described it as the place “to learn, connect, and explore the complex issues and exciting opportunities brought to business by big data, data science, and pervasive computing.” It didn’t disappoint. Our very own Dr Kim Nilsson was there last year, and this year it was my privilege. Aside from going to some very interesting talks and networking, my main objective was to find out what was really going on in the world of data science. So I took the opportunity to interview some rather talented people spearheading their organisations’ progress in the world of data science. If you want to find out in more detail what we chatted about, check out my Long-Form posts on LinkedIn and my conversations with:

Andy Isaacs, Head of Marketing Services at the BBC,

Aaron Kimball, CTO at Zymergen,

Ronan Moynihan, Data Science Developer at Ryanair,

Ben Lever, CTO at Ambiata,

Phil Harvey, CTO at DataShaka.

The Maturity Model for data science is real and tangible

Of my many takeaways, one of the most striking was the pervasiveness of the concept of maturity in data science. I only heard of the “Maturity Model” myself about a year ago.

(See image from Booz Allen Hamilton. Other Maturity Models are available!) From what I gather, companies are now moving into what I would term a state of ‘self-awareness’: a necessary mind-set that must manifest itself for a company to even begin to evolve along the maturity timeline. I’ve noticed businesses going from saying, “What is data science?” to, “Yeah, we use data in what we do, but we’re not doing any real data science yet.” Or, “Actually we’re in the early stages.” Progress indeed!

“We are getting to the point where manually directing efforts is becoming inefficient, or are rate limited in how well they can keep things moving forward. We’re going to have to start using more predictive models from here.” – Aaron Kimball: CTO at Zymergen

Companies are more aware of their data science needs

Every company has unique data science demands and needs. However, it seems that many problems are rooted in one issue: a company needs to be specific and scientific in what they require from their data. Many of the new data driven companies – formed in the last few years – are born with an innate understanding of this concept, but it’s not impossible for companies of greater heritage to adopt this approach. The difficulty lies in making changes in the organisational mind-set and environment as well as best practices – which are not always best! I could wax lyrical about this, but that’s for another blog post.

A lot of what we do at Pivigo is educate companies on what data science entails, given their domain-specific demands. There is a reason why the word ‘science’ is in data science! At the same time we educate data scientists on how they can equip themselves with the skillsets they need to achieve their required objectives. It’s apparent that many companies who offer analytics or data science as a service are also having to do the same.

“When we initially talk to an organisation, that end-state is usually a long way down the road. So we find taking an organisation on a journey is really important. The difficult part is educating organisations so that they feel comfortable and get on-board with a scientific approach. That can take a long time, but the outcomes are worth it.” – Ben Lever: CTO at Ambiata

Respect your data while you find your people

“Very simply, treat data as a first class citizen in your organisation. It is as important as your human resources, to have the right data resources available to you. It is the key route to learning and to knowledge within your organisation. If you treat it as a first class citizen it will give back to you in spades.” – Phil Harvey: CTO at DataShaka

I couldn’t have put it better myself. The way in which data is handled is also the source of many issues for companies. It seems that problems arise from our misunderstanding and mistreatment of data. The consequences can be flawed methods of extraction, as well as obstructive formatting and storage. This in turn can have a negative impact on any future analytics performed on the datasets, and undermine the insights and value we hope to gain from the data, which is why we collect it in the first place, right?

So, once you have figured out why you want to collect your data and developed a framework for managing it, you’ll have won only half the battle. Why? Because it seems that finding people to work on your data is still tough. Data science means different things to different companies, and so a data scientist’s responsibilities vary. However, a successful data scientist will always need a strong statistics background, programming skills, and a scientific mind-set combined with excellent communication skills. They also like to, and in the majority of cases need to, work in groups. Above all, consider what it is you want your data scientists to achieve in the context of the culture of your company.

“We are a very creative organisation, and while certain parts are very data driven, others are incredibly creatively driven. To thrive at the BBC you need to be able to talk human! Communication skills are very important, so someone who has a lot of data skills and likes crunching numbers must understand how our business works and be willing and able to explain their analysis in an accessible manner.” – Andy Isaacs: Head of Marketing Sciences at the BBC

In conclusion:

It is interesting to note that no matter what your company specialises in delivering or selling, or how old your company is, data will be the source of future growth. Whether or not you believe in the concept of "Big Data", this is about moving with the times and adapting to a world that is changing at an exponential rate. It’s actually Darwinian. So, when it comes to developing your data science strategy and becoming more “Data Mature”, keep the following in mind:

  • Develop a robust framework for monitoring, extracting, storing and analysing your data.
  • Be consistent.
  • Be specific and scientific in what you want from your data. Think SMART and KISS!
  • And finally, treat your data as you would treat your valuable data scientists. With much respect.

“Thank you” to Aaron, Ben, Phil and Andy, for generously giving your time and energy and for sharing your knowledge and expertise. Also many thanks to Strata+Hadoop World for the opportunity to attend.
