Humans are creating vast amounts of data. It is estimated that by 2020 we will have produced 44 zettabytes, or 44 trillion gigabytes, of data between us. Organisations have put complex processes in place to capture and capitalise on it, yet many businesses still bemoan a lack of actionable insight from their data. You would imagine that data quality is the main block here, but even with “clean” data, a lack of time (to deliver insight to the business) and a lack of focus (the setting of specific commercial objectives) serve as the major roadblocks that stand in the way of all that Big Data promises.
We Need to Talk About Big Data
Simplicity. It’s Underrated.
Simple models, such as logistic regression, are more than sufficient for the analysis of Big Data. Rather than focussing on complex models, there should instead be an aim to drive down the time between the acquisition of data and the development of predictive models.
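To give a sense of how little machinery a simple model needs, here is a minimal sketch of logistic regression fitted by plain gradient descent, using only the Python standard library. The data is synthetic and the names (`fit_logistic`, `predict`, the single scaled feature) are illustrative assumptions, not taken from any project described in this article.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict(weights, bias, x):
    # Predicted probability that the outcome is 1 for feature vector x.
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def fit_logistic(xs, ys, lr=0.5, epochs=300):
    # Batch gradient descent on the average log loss.
    n, d = len(xs), len(xs[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(xs, ys):
            err = predict(weights, bias, x) - y
            for j in range(d):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Synthetic, hypothetical data: one feature scaled to [-1, 1],
# with a noisy outcome that is more likely when the feature is high.
rng = random.Random(0)
xs = [[rng.uniform(-1.0, 1.0)] for _ in range(200)]
ys = [1 if x[0] + rng.gauss(0.0, 0.3) > 0 else 0 for x in xs]

weights, bias = fit_logistic(xs, ys)
high = predict(weights, bias, [0.8])
low = predict(weights, bias, [-0.8])
print(f"P(y=1 | x=0.8) = {high:.2f}, P(y=1 | x=-0.8) = {low:.2f}")
```

In practice a library implementation would normally be used. The point of the sketch is that the whole model is one weighted sum and one squashing function, which is why it is quick to build and to explain.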
Problems. You Need to Explore More of Them.
Data scientists should be able to respond quickly to multiple prediction problems. Rather than taking a single commercial problem at a time, a streamlined, advanced machine learning model should be able to deal with a dozen at any one time.
Your Data. Take a Representative Sample.
Right now, data scientists are focussing on distributed computing in order to manage and analyse Big Data; the resources currently invested in this would be better spent on techniques that allow comparable insights to be drawn from a subsample of the data. This avoids the reliance on huge processing power, and frees up capacity to explore a higher number of hypotheses.
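As a rough illustration of the idea, the sketch below draws a stratified 5% sample from a set of synthetic transactions and compares the average transaction value in the sample with the average over the full set. The branch names, sizes and values are invented for the example, and stratifying by branch is just one assumed way of keeping a sample representative.

```python
import random
from collections import defaultdict
from statistics import mean


def stratified_sample(rows, key, fraction, seed=0):
    # Sample the same fraction from every group so that each group
    # keeps its share of the data in the sample.
    rng = random.Random(seed)
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    sample = []
    for members in groups.values():
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


# Synthetic, hypothetical transactions for three branches.
rng = random.Random(42)
branch_sizes = {"north": 5000, "south": 3000, "west": 2000}
branch_means = {"north": 40.0, "south": 55.0, "west": 70.0}
transactions = [
    {"branch": branch, "value": rng.gauss(branch_means[branch], 10.0)}
    for branch, size in branch_sizes.items()
    for _ in range(size)
]

sample = stratified_sample(transactions, "branch", fraction=0.05)
full_mean = mean(row["value"] for row in transactions)
sample_mean = mean(row["value"] for row in sample)
print(f"rows: {len(transactions)} full, {len(sample)} sampled")
print(f"mean value: {full_mean:.2f} full, {sample_mean:.2f} sampled")
```

Here 500 rows stand in for 10,000, and the two averages should land close together. Whether a sample is good enough for a given question still has to be checked case by case.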
Automation. It’s Going to Streamline Your Data Efforts.
Reducing the time that Big Data demands can be split into two areas: reducing the time to the first model, and increasing the capability to explore more hypotheses more quickly. What the data science realm should be working toward is algorithms that streamline data transformation: automated data processing techniques that can turn raw data into aggregates or otherwise ensure the data is ready for predictive modelling.
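A small sketch of what turning raw data into aggregates can look like: the reusable helper below rolls individual transactions up into one summary row per group. The function name `aggregate_by` and the field names are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def aggregate_by(rows, key, value):
    # Roll raw rows up into count, total and mean per group.
    running = defaultdict(lambda: {"count": 0, "total": 0.0})
    for row in rows:
        group = running[row[key]]
        group["count"] += 1
        group["total"] += row[value]
    return {
        name: {
            "count": g["count"],
            "total": g["total"],
            "mean": g["total"] / g["count"],
        }
        for name, g in running.items()
    }


# Hypothetical raw transactions, one row per sale.
transactions = [
    {"branch": "A", "amount": 20.0},
    {"branch": "A", "amount": 30.0},
    {"branch": "B", "amount": 10.0},
]

features = aggregate_by(transactions, "branch", "amount")
print(features["A"])  # {'count': 2, 'total': 50.0, 'mean': 25.0}
```

Because the grouping key and value field are parameters, the same helper can be pointed at a different question (per product, per week, per customer) without new code, which is where the time saving comes from.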
These four core concepts can deliver time efficiencies, and decrease the timescales that data scientists require to understand, formulate, and process data for a machine learning problem. This, alongside a relevant and precise focus, can be the difference between your data scientists keeping up with the problems presented by your business experts, or becoming woefully overwhelmed.
A Masterclass in Unlocking Big Data
The Client
The Parts Alliance Group is an industry leader in the distribution of parts for the automotive aftermarket. Annually, they supply 30,000 unique products to 155 UK garages and workshops. Safe to say that this is a business home to Big Data.
The Project
Over five weeks, PhD data scientists analysed 20 million transactions. The aim was to discover why some branches were more successful than others, and to gain insight into sales data.
The Result
A £6 million increase in revenue in five weeks — perfectly demonstrating the power of Big Data when unlocked.
5 Tips to Unlocking the Value of Your Big Data
1. Your Data — Make it squeaky clean and meticulously organised
The success of your data project relies on the quality of your data: it must be clean, lean and meticulously organised. Too much data, and even the most talented of data scientists will be overwhelmed. Machine learning professionals expect data that has been arranged into helpful variables, such as the number of visitors to a website, rather than every click, action and outcome of every user.
2. Your Data Scientists — Help Them Hone In On a Crystal-Clear Business Objective
Data scientists have a finely focused mind-set: they work on the task at hand and the specific problem before them. They create models to increase efficiency, bolster profit, or drive down costs. They don’t see the bigger picture, nor should they, if they’re to deliver the laser-precise answers you need. Your final business objective should be understood by your business executives, and used to provide context to your data scientists.
3. Your Data Scientists — Provide Them with Freedom to Flourish
Create an atmosphere where free-thinking is encouraged, where experimentation can be undertaken free from constraints. It is in this kind of environment that talented data scientists can use their creativity in order to arrive at the most insightful of solutions.
4. Your Talent — Make Sure You Hire the Right Type
There’s a big wide world out there in terms of data science talent, and when it comes to commercial hiring this spectrum of professionals tends to be split between two extremes: those who thrive only on faultless datasets, and those who hold PhDs in rocket science. Neither will best benefit the business.
The ‘right’ data scientist is the one who has both commercial acumen and academic ability, with experience of business and maths combined: go for those with statistics, financial engineering and actuarial science qualifications.
5. Your Focus — It Should be on Commercial Return
Focussing on commercial return is the task of your executives, who should know what value the model generates and how that value can be measured. This question should be asked early on, while the problem to be addressed is being formulated, avoiding the pitfall of reverse engineering an answer in the latter stages of the project.