Robots In a Car Factory

Industrializing Data Science and Analytics

Gartner's “Hype Cycle for Advanced Analytics and Data Science 2015” has just been published. The trends it shows point to a rising maturity of this young organizational discipline. It is interesting to see that the buzzword “Big Data” has finally disappeared from the hype cycle, while machine learning (a discipline that has existed for decades, at least in academia) has reached the peak of inflated expectations. This underpins a tendency to move from big data (the bigger the better) to smart data (the smarter the better). Simply put: “No matter whether it is big or small data, it is still data and we aim to get more value out of it.”

A trend that becomes visible at second glance is the emerging industrialization of data science, which is underpinned by a number of developments. Vendors increasingly support the management of analytical models built by data scientists over their entire life cycle, as they are scaled from prototype to company-wide adoption. So far, the management of analytical models has been rather disorganized in most companies. Data scientists would create new models on a use-case-by-use-case basis. Some of the models actually did what they promised and were deployed in operations.

End-to-end management of the models and reuse of solution patterns for analytical models across the enterprise have not been actively enforced or governed. In a new project, data scientists would often start nearly from scratch, although a similar model might already have been developed in a different business unit. From an organizational point of view, it makes sense to have a centralized data science unit that can support data scientists in decentralized business units. A central data science unit can ensure that lessons learned are fed back into the organization and that analytical models are consistently governed even after they are handed over to IT.

Closely related to this is the concept of the model factory. The idea is to bring automation and scalability to the process of building and deploying predictive models. To find the best models, a huge number of models are built and tested using software tools that provide a high degree of automation during development. At the end of the process, only the best few models are deployed.
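The model factory idea can be illustrated with a minimal sketch in Python. It is not a description of any vendor's tool: the names `make_knn`, `mse` and `model_factory`, the synthetic data and the choice of k-nearest-neighbour regressors as candidates are all assumptions made purely for illustration. The point is only the pattern: build many candidate models automatically, score each one on held-out data, and keep the best few.

```python
import random
from statistics import mean


def make_knn(k):
    """Return a fit function for a k-nearest-neighbour regressor (hypothetical candidate type)."""
    def fit(train):
        def predict(x):
            # Average the targets of the k training points closest to x.
            nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
            return mean(y for _, y in nearest)
        return predict
    return fit


def mse(predict, data):
    """Mean squared error of a fitted model on (x, y) pairs."""
    return mean((predict(x) - y) ** 2 for x, y in data)


def model_factory(candidates, train, valid, keep=3):
    """Fit every candidate, score it on validation data, return the best few."""
    scored = []
    for name, fit in candidates.items():
        model = fit(train)
        scored.append((mse(model, valid), name, model))
    scored.sort(key=lambda item: item[0])  # lowest validation error first
    return scored[:keep]


# Synthetic data (an assumption for this sketch): y = 2x + 1 plus Gaussian noise.
rng = random.Random(0)
data = [(x, 2 * x + 1 + rng.gauss(0, 1))
        for x in (rng.uniform(0, 10) for _ in range(200))]
train, valid = data[:150], data[150:]

# Thirty candidates, differing only in the neighbourhood size k.
candidates = {f"knn_k={k}": make_knn(k) for k in range(1, 31)}

best = model_factory(candidates, train, valid, keep=3)
for score, name, _ in best:
    print(f"{name}: validation MSE = {score:.3f}")
```

In a real setting the candidates would span different algorithms, feature sets and hyperparameters, and the selection step would typically use cross-validation rather than a single split. The structure stays the same: generation and evaluation are automated, and only the top-ranked models move on to deployment.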

Finally, a thrilling concept comes from Gartner's Alexander Linden: the analytics marketplace. Some companies, such as Microsoft, RapidMiner and FICO, have created marketplaces where third parties offer data science services and additional functionality that users of the analytics platforms can purchase. This can become a true game changer. Similar to the third-party apps and services provided at, analytics marketplaces could become a source of millions of very domain-specific analytics micro-applications that drive innovation.

Today, we stand only at the beginning. I am convinced that in a few years' time, data science and advanced analytics will be as industrialized as traditional IT. What has changed with the rise of data science is the speed with which new applications are developed and deployed, the increased willingness to experiment, and the direct way data innovates business models and business operations. Now, we only need to scale it to the rest of the enterprise to reap the full benefits.