
Data Science for Grown Ups – How to Get Machine Learning out of the Lab to Scale It Across the Enterprise

The Lab Trap

Have you ever wondered why so many Internet companies use algorithms across their core business processes to drive automation and constantly optimise their business, while your algorithms never really left the lab? You are not alone. Many large organisations have invested heavily in data science and big data over the past few years and often struggle to scale their successful machine learning projects beyond a small pilot scope. Something went wrong when we tried to copy the digital players: algorithms stay in the lab and are not put into the heart of the enterprise. This article sheds some light on how traditional big corporates can take a leap towards the digital players, and it builds on my previous article "10 Rules for Data Transformation in Inherently Traditional Industries".

Dr. Alexander Borek, Global Head of Data & Analytics, Volkswagen Financial Services; Alexander.borek@vwfs.io
Alexander will be speaking at the Enterprise Data & Business Intelligence and Analytics Conference Europe, 19-22 November 2018 in London, on the subject "Data Science for Grown Ups: How to Get Machine Learning out of the Lab to Scale It Across the Enterprise".

How we all got there

A few years ago, the business world realised that machine learning, big data and AI can generate new value out of the large amounts of data generated through the digitalisation of the business. New tools, new types of databases and the rise of cloud computing made it possible to combine and process high volumes and diverse formats of data with great flexibility. New ways of working between business and IT, aimed at delivering rapid business value, were introduced in the tech startup world and copied by more established businesses, bringing new agility. Machine learning and AI methods entered business life, with the effect that processes can be increasingly automated.

We hired plenty of data scientists and let them do magic. Somehow, inside the lab, use cases worked: we could solve complex business problems very quickly. But when we transported them to the real world, they suddenly broke down and collapsed. Outside the lab we often find a hostile environment that creates a number of challenges and threats for our precious little algorithms:

• Different toolsets and architectures
• Cloud cannot be used
• Legacy IT Systems
• Slow and complex purchasing and approval processes
• Risk, security, data protection, regulations and compliance issues
• Deployment procedures unfit
• Traditional IT operations unfit
• Data quality issues
• Inconsistent data models
• Strong cultural resistance
• Data scientists are inexperienced and don't know how the company actually runs

This is because we never fixed these problems; we just tried to create a new free space where innovation can grow and flourish within a safe environment, the Innovation Lab, where everything was sort of allowed. What works inside the capsule of an innovation lab does not necessarily work in outer space, i.e. the rest of the organisation. Many organisations simply ignored the fact that anything that comes out of the lab will be dead within seconds, as it leaves the "free as a bird" environment and enters the "caged bird" environment that we are used to in large corporates. Many IT organisations feel threatened by the labs and have little motivation to help them bring successful prototypes into production. Regulations such as GDPR are seen as helpful allies to reject the work of the labs as unrealistic and non-compliant.

The Data Factory can build the bridge between the two worlds

In one way or another, we need something to bridge the two environments so we can successfully deliver data analytics & AI products, and in my opinion this bridge is the Data Factory. The Data Factory needs to replace or extend the Data Lab to ensure that new innovative projects can be executed and then scaled into the rest of the organisation. There are at least five key components of such a Data Factory.

(1) The Data Platform ensures that technologies are available outside the capsule. It provides a common set of state-of-the-art data analytics & AI tools for the sandbox, development and production stages, which means that an idea for an algorithm can be explored and later operated on the same technological environment. Standardised programming languages (e.g. Python), tools and packages are used throughout all phases of the project. Security, data protection and compliance are ensured on the platform, so the algorithm developer does not need to worry so much when the algorithm leaves the lab. The platform also provides central storage for productive data as part of a joint data lake and data catalog, as well as standardised access and interfaces to legacy IT systems.
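To make the idea of stage parity concrete, here is a minimal Python sketch: the same pipeline code runs unchanged in sandbox, development and production, and only a stage setting switches the data source. All names, paths and settings are invented for illustration.

```python
import os

# Hypothetical stage-specific settings; names and paths are illustrative only.
STAGES = {
    "sandbox":    {"data_path": "data/sample.csv",      "write_results": False},
    "dev":        {"data_path": "data/snapshot.csv",    "write_results": False},
    "production": {"data_path": "/datalake/orders.csv", "write_results": True},
}

def run_pipeline() -> None:
    # The stage is the only thing that changes between environments;
    # the algorithm code itself stays identical.
    stage = os.environ.get("PIPELINE_STAGE", "sandbox")
    settings = STAGES[stage]
    print(f"Running on stage '{stage}' with data from {settings['data_path']}")
    # ... load data, score the model, and (in production) persist the results.

if __name__ == "__main__":
    run_pipeline()
```

The point of the design is that promotion from lab to production becomes a configuration change, not a rewrite.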

(2) Processes outside the capsule need to be updated to ensure they can handle data analytics & AI projects. They include at least the PLEASE processes: Purchasing, Legal, Evaluation, Auditing, Security and Ethics.

(3) Data Engineers are usually more important than data scientists after prototyping, but many companies have hired a lot of data scientists and not enough data engineers. They ignored the fact that once the analytical model is designed, it is mostly about software engineering! Data engineers focus on software engineering rather than modelling. Software and architecture skills are key for developing ETL processes, integrating and cleansing data, writing APIs and database connections, creating CI/CD pipelines and DevOps, testing and deploying data products, and adding new components to the platform.
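As an illustration of the kind of engineering work this involves, here is a minimal ETL sketch in Python using pandas. The file names and column names are hypothetical; a real pipeline would read from legacy systems and write to the data lake.

```python
import pandas as pd

def extract(path: str) -> pd.DataFrame:
    # Extract: read raw data from a source system (here simply a CSV export).
    return pd.read_csv(path)

def transform(df: pd.DataFrame) -> pd.DataFrame:
    # Transform: basic cleansing of the kind that dominates real projects.
    df = df.drop_duplicates()
    df = df.dropna(subset=["customer_id"])     # drop rows without a key
    df["amount"] = df["amount"].clip(lower=0)  # repair impossible values
    return df

def load(df: pd.DataFrame, path: str) -> None:
    # Load: write the cleaned data where downstream models can consume it.
    df.to_csv(path, index=False)

if __name__ == "__main__":
    load(transform(extract("raw_orders.csv")), "clean_orders.csv")
```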

(4) Data Ops is needed to support the operation and maintenance of finished data analytics & AI products. Data Ops covers tasks that data scientists tend to perceive as unattractive (a small sketch of this operational plumbing follows the list), e.g.:

• Rules for deployment
• Helpdesk and Ticketing
• ITIL processes
• Managing SLAs
• Logging
• Archiving
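To give a flavour of this work, here is a minimal, purely illustrative Python sketch of two of these tasks, logging and SLA monitoring, wrapped around a scoring call. The SLA threshold and the dummy model are invented for the example.

```python
import logging
import time

logging.basicConfig(level=logging.INFO,
                    format="%(asctime)s %(levelname)s %(message)s")
log = logging.getLogger("model-service")

SLA_SECONDS = 0.5  # hypothetical response-time SLA agreed with the business

def score_with_sla(model_fn, payload):
    # Wrap every scoring call with logging and a response-time check,
    # so operations can spot SLA breaches without reading model code.
    start = time.perf_counter()
    result = model_fn(payload)
    elapsed = time.perf_counter() - start
    if elapsed > SLA_SECONDS:
        log.warning("SLA breach: scoring took %.3fs (limit %.2fs)",
                    elapsed, SLA_SECONDS)
    else:
        log.info("Scored request in %.3fs", elapsed)
    return result

if __name__ == "__main__":
    score_with_sla(lambda x: sum(x), [1, 2, 3])  # dummy "model"
```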

Furthermore, IT Operations and Data Scientists often speak different languages. Data Scientists feel misunderstood because everything is complicated and slow. IT Operations is irritated by the data scientists' perceived ignorance of corporate processes.

(5) And here the Data Product Manager comes into the game! The Data Product Manager is 100% involved from start to end of the data analytics & AI product. The Data Product Manager is a true multi-talent in data science and management. He or she holds the end-to-end responsibility for the data analytics & AI product, which includes:

• Ideation and data product definition
• Product owner in scrum approach
• Managing all stakeholder relations
• Accountable for deployment
• Ensuring SLAs during operation
• First contact for change requests
• Change management

The Data Product Manager understands the Data Scientists and Data Engineers, but also speaks the language of the more traditional business functions. He or she is the key person to bring change to your organisation and to build the cultural bridge between the two universes within your corporation.

Obviously, there are further important elements of every successful Data Factory that I did not mention here. Nevertheless, the Data Factory concept presented in this article should make you rethink how you organise Data Science, AI and Business Intelligence across your enterprise. Maybe a key learning is that Data Science, AI and Business Intelligence are closer than you think and should come as close together as possible!

Succeeding in Data & AI: Insights from the Data Leaders Summit Europe 2018

Inspired by the Data Leaders Summit in Barcelona (and by Paul Laughlin's brilliant jokes), I wrote down a few key insights on my flight back home to Berlin, both from my own keynote and from this year's conference in general. It was a pleasure to meet so many like-minded data people again!

Data leaders are a growing population in Europe

You can really see that the community of data leaders in Europe is growing and maturing across industries. Finally, even in Continental Europe, most companies have hired heads of data and analytics. In most companies, data scientists and engineers can be found spread out across the business, fully embedded in a business function, in addition to the central data & analytics teams (the hub model). Data leaders and their organisations are becoming well intertwined with and accepted by the old established business functions. The really good news is that most of us have left the digital ivory tower by now and are doing good things for the core business.

AI is clearly the new buzzword – the data fashion show continues

Last year, the buzzword Big Data finally disappeared, but everyone was talking about data science. This year, everyone talked about AI and machine learning. I guess it's part of belonging to the data profession to adopt the newest buzzword each year! My keynote was renamed by the conference editing director from "Data Science for Grown Ups" into "Succeeding in AI". In my own company, it was difficult to explain to people that our data scientists can also work on AI (basically, I ended up writing AI all over most presentations). I decided from now on to call data scientists "data & AI scientists" to signal that these are the same people. From a more critical point of view, one could argue that real AI will still take some time. The kind of AI that most people talk about is not a single algorithm, but a system of a multitude of interconnected algorithms, embedded in a well-managed algorithm architecture, that can make machines sense, think, learn and act (see also my past article on Smart Machines). Nobody has achieved this so far to my knowledge. My colleague Dat Tran, Head of Data Science at Idealo, summarised it nicely in a LinkedIn post today: "We're so far from what people call real AI. All we do is some mathematical optimisation on some data. And if the data is bad, the AI also is bad." One thing I pointed out in my keynote was that we have failed to build knowledge taxonomies for our machines; they all need to rely on raw data when we run machine learning algorithms. But how can machines learn about society if they don't read books?

Underdeveloped Data & AI Engineering and Ops major barrier

One of the main themes that came out of the conference was the realisation that most of us had hired too many data scientists while neglecting to hire enough data engineers and Data Ops people. Developing algorithms is important, but it is only a small proportion of the success of a data & AI project. We ignored the fact that once the analytical model is designed, it is mostly about software engineering! We need a lot of engineers to create data pipelines and connect them to our legacy systems. Data engineers focus on software engineering rather than modelling. Software and architecture skills are key for developing ETL processes, integrating and cleansing data, writing APIs and database connections, creating CI/CD pipelines and DevOps, testing and deploying data products, and adding new components to the platform. Data Ops is needed to support the operation and maintenance of finished data analytics & AI products. Data Ops covers tasks that are perceived as unattractive by data scientists, such as managing the code repositories, the rules for deployment and the productive environment, running the helpdesk, managing SLAs, archiving data and ensuring load balancing. There is a clear need for more enthusiastic Data Ops people; I find it very difficult to find the right people for the job at the moment, and they certainly need more appreciation from the data community. A highlight of the conference was certainly Harvinder Atwal's talk discussing how moneysupermarket.com has approached and solved Data Engineering and Data Ops in a neat fashion.

Challenge of running multiple clouds and data gravity

Setting up increasingly complex cloud environments and creating the channels to move data from on-premise into the cloud and back is another big engineering challenge, even for digital unicorns, and the growing number of clouds and technologies that you need to manage does not make it easier. This was pointed out very nicely by Kshitij Kumar, VP Data Infrastructure at Zalando. Each cloud has its strengths and weaknesses, and we might need to live with the fact that we will need more than one cloud environment in the future. Data gravity was discussed in a number of sessions; it is increasingly the deciding factor for where you run your algorithms. Still, many companies struggle to convince their management and IT security officers that moving to the cloud is a safe and reasonable option. This will turn out to be a big barrier to innovation and growth for traditional companies.

Innovations from the Travel Industry

One of the best things I learned came from Booking.com's Director of Data Science, Ting Wang. Apparently, Booking.com has established a central feature store in one or more of their business areas, which is well governed and versioned. This massively reduces the amount of work data scientists have to do, as they can reuse the most important features across their models. A great advantage is that features are finally standardised and come from a kind of single version of truth! Another great inspiration came from another peer from the travel industry: Charlie Ballard, Global Director of Strategic Insights at TripAdvisor, gave a fabulous presentation on how you can use your data to build data products that you can actually sell to your customers. Knowing who travels when and where, and who visits which hotel and accommodation sites, also allows a hotel to analyse who its true competitors are. Apparently, the guys at Ritz-Carlton got it all wrong and ignored that Hyatt is one of their big competitors, as they thought they play in a different segment. Data speaks louder than words. It turned out that customers saw the two chains as roughly interchangeable and would choose the cheaper of the two options – which typically made Hyatt the winner!
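Booking.com's actual implementation is of course not public, but the core idea of a governed, versioned feature store can be sketched in a few lines of Python. Everything here, the registry design and the example feature, is a simplified illustration, not their system.

```python
from typing import Callable, Dict, Tuple

# A deliberately tiny feature store: features are registered once under a
# name and version, then reused across models instead of being
# re-implemented per project.
_registry: Dict[Tuple[str, int], Callable] = {}

def register(name: str, version: int):
    def wrap(fn: Callable) -> Callable:
        _registry[(name, version)] = fn
        return fn
    return wrap

def get_feature(name: str, version: int) -> Callable:
    return _registry[(name, version)]

@register("days_since_last_booking", version=1)
def days_since_last_booking(customer: dict) -> int:
    return customer["today"] - customer["last_booking_day"]

if __name__ == "__main__":
    fn = get_feature("days_since_last_booking", 1)
    print(fn({"today": 120, "last_booking_day": 95}))  # -> 25
```

Because every model fetches the same named, versioned definition, the "single version of truth" falls out of the design.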

Complexity avoidance and the White Swan Theory

If you can build a great data product without using machine learning, that is the best way to go, as you reduce unnecessary complexity that makes putting it into robust production and scaling it much more difficult. This was the hypothesis raised by at least two of the speakers, Ryan den Rooijen, Head of Data at Dyson, and Stefan Meinzer, Head of Advanced Analytics EMEA at BMW. A challenge, however, is to keep your top data science talent happy, who strive to solve complex problems and to apply the most sophisticated and intellectually beautiful solutions. And here Ryan's White Swan Theory comes into play. To executives, you show a beautiful lake from above, with even more beautiful white swans swimming on it. Under the surface of the calm water, however, you hide the complexity and chaos of machine learning, the bad data formats and the abundant data quality problems that are unavoidable whenever you want to implement something. It allows your top talent to run very complex models while you simplify how you communicate about them to the business.

Something I mentioned in my own keynote is that we need to contain our data scientists more when it comes to the choice of tools, programming languages and packages. Every new bit adds a new layer of complexity when you want to put models into production. To me, a major priority is to standardise the Data & AI environment so that you develop and deploy the models on the same tech stack. On top of that, redesigning the control and compliance processes around Purchasing, Legal, Evaluation (KPIs), Auditing, Security and Ethics (aka the "PLEASE" framework) to make them fit for Data & AI needs to become a major goal for every Head of Data. This is again a major cultural shift and requires discipline in the field of data science, which from a lot of viewpoints still needs to grow up and mature.

In summary, a great conference in 2018 with a fabulous bunch of data leaders. I learned a lot from them, as every year!

Cheers,

Alex

 

10 Rules for Data Transformation in Inherently Traditional Industries

I gave a keynote at the Data Insight Leader Conference in Barcelona this week on the ten rules of data transformation in inherently traditional industries, based on the experience I have gained so far. Interestingly, there was agreement among the Data Executives in the room that we need to keep doing showcases while transforming the company, and that it all needs to be wrapped in a powerful narrative.

Here are the ten rules that I presented.

Rule 1: Accept that digital transformation success is a myth

As discussed in my last blog post, there is an inherent dilemma in digital transformation. Put simply, you either appear successful at digital transformation by focusing on digital showcases (the happy honeymoon) or you attempt the real transformation of how the company operates (the endless road). When you are doing digital transformation, never just say you are doing this and that because of this and that. Always tell a story of where we are today, what the steps in between are, and what the end game looks like.

Rule 2: Demonstrate how basic beliefs in your industry are turned upside down due to digital disruption

A major mistake that many Digital Transformation Executives make at the beginning is to assume that others understand and share their basic beliefs about the success factors and changes in their industry. But they don't. Others in your company, including the top management, have been running the company based on the same unchanged beliefs for decades. Such beliefs consist of things that have been true for a while: examples are the standardization of processes to reduce cost, hardware product centricity to ensure product quality and attractiveness, the number of sold products as the key KPI, and the number of physical stores as a reflection of market power. Digital disruption suddenly turns this world upside down. It might be that software engineering becomes equally important to hardware engineering and manufacturing, that Data & Analytics become a central part of product quality and customer experience, and that digital touchpoints matter more than physical stores. At the very beginning, the CDO needs to set the scene, explain the implications, and then ask directly for the changes required to match those implications. It is important to tell your senior management explicitly how basic beliefs are turned upside down by the forces of digital disruption, and which implications this has for the corporate strategy, organizational structures, KPIs and incentive plans that need to be introduced or adapted. Complementary implementation projects that demonstrate value and roadblocks can help, as they can be discussed as tangible examples.

Rule 3: Communicate the simple equation: Digital = Data + X

This is the most important formula for Data Executives. Communicate it at all times. Anything digital is the result of data (collecting, using, combining, analyzing data) plus something else on top. It means that there is no digital transformation without a data transformation. Any Chief Digital Officer who says "we will deal with Data & Analytics later, since we have other, more important priorities for digital transformation at the moment" misses the point that anything else digital he wants to do requires Data & Analytics. Unfortunately, in very product-driven companies this happens very often. Communicating the magic formula constantly, and explaining it with tangible examples, reminds everyone around us that data is a key ingredient of any digitalization effort and any digital product. The fact that we always need something more than data & analytics puts any data executive at a strategic disadvantage. Simply put: if others don't do their job, you are screwed. So you had better choose projects where you can rely on the X! The best algorithm for determining the optimal pricing of goods does not add much benefit if its results are not used inside an e-commerce portal to improve pricing, and that requires changes in the e-commerce portal itself and the processes around it.

Rule 4: Train your existing workforce in data analytics – everyone can learn it

It is naive to think that you can hire all your data scientists from other companies. You can hire a few experts; the rest you need to train. And it is not only the data scientists you need to train. You also need to train the data and digital project managers and program managers, the people who steer them, and their top management. They all need a better understanding of what it takes to build a great data product. A lot of culture change comes through this type of education. It is therefore absolutely essential and not a side activity.

Rule 5: Ask the board to delegate decision making power to cross functional data analytics roles and bodies

The board cannot decide on all data aspects. They need to decide on the governance framework and strategic decisions; the rest needs to be delegated. Here the problems start. These decisions are typically delegated to individual departments such as sales and production: customer data decisions go to the customer departments, production data decisions to the production departments. This is wrong. Decisions on data should ultimately be taken by cross-functional committees and cross-functional data roles, not by individual departments. A data-producing department might not see a need to provide data in a readable format to a data-consuming department. Creating the cross-functional committees and roles for data and analytics is one of the first things you should do as a Data Executive.

Rule 6: Free your data

Everyone in the media talks about the large volumes of data that are available to be analyzed for all sorts of purposes. The brutal reality in traditional companies is that we often find no lakes at all, but instead a vast data desert. Data is locked away in hundreds of different legacy systems and cannot be used due to their instability and the risk of impacting running operational applications. This is something top management does not perceive, since they get all the analyses they ask for (within a day). Even worse, company politics prevent data from being shared. And GDPR and other compliance challenges make this even more difficult. Every project suffers from it. Every single one. Show your board, with concrete examples, the need to create an electronic workflow that regulates data sharing and access and involves all control functions, such as legal, security and data protection. Document how long it takes to get the data and what the problems are with GDPR and other approvals. Create a virtual or physical data lake with an integrated access layer to provide the data to the data scientists and data users. The workflow has pre-checked criteria and categories: instead of the control functions taking individual decisions every single time, they pre-approve certain data usages for certain purposes.
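As a rough illustration of such pre-approval, here is a tiny Python sketch: the control functions maintain a matrix of approved category/purpose combinations, and only requests outside the matrix are escalated. The categories and purposes are invented examples.

```python
# Hypothetical pre-approval matrix: legal, security and data protection
# approve combinations of data category and usage purpose once, and the
# workflow only escalates requests that fall outside the matrix.
PREAPPROVED = {
    ("customer_master", "churn_analysis"),
    ("transactions_anonymised", "fraud_detection"),
    ("web_logs", "product_analytics"),
}

def request_access(category: str, purpose: str) -> str:
    if (category, purpose) in PREAPPROVED:
        return "granted automatically"
    return "escalated to legal, security and data protection for review"

if __name__ == "__main__":
    print(request_access("customer_master", "churn_analysis"))
    print(request_access("customer_master", "marketing_campaign"))
```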

Rule 7: Share knowledge on data

Our romanticized view of what data scientists do all day is that they create complex statistical models and algorithms and apply deep learning and other sophisticated machine learning methods. Wrong. Data scientists in traditional corporations search for the right data all day long. Once they find it, they need to figure out what the data means, trying to set up meetings with business and IT departments that do not see helping the data scientist as part of their regular job; it is more like a hobby or a welfare activity for them. Simply let everyone who finds something out document it in the business glossary. It is a give-and-take model: you document, and next time you can find something that others have entered. People in the business departments also have an incentive to feed their data knowledge into the system, since it reduces the work of explaining the data to each new data project. In the end, all we need is some way of tracking these activities as part of each project, plus some good peer pressure. This usually reduces the workload of data scientists and data projects by around 30-40%. The world can be so simple.

Rule 8: Automate data preparation

Another 20-40% of workload reduction for data scientists and data projects can be realized if we automate one of their most tiring tasks: data preparation. Let's say you run 50 data projects a year in your company, and let's assume that 20% of the data processing work could be automated once you have solved it for the first time. If automating these data preparation tasks costs only half of the time and resources they consume manually, you have a pretty good business case. After a while, you will realize the full 20% savings.
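Here is that back-of-the-envelope business case written out in Python, with the illustrative numbers from the text; the assumption that building the automation costs half of one year's manual effort is mine, for the example.

```python
# Back-of-the-envelope business case for automating data preparation.
# One "unit" is the data-processing effort of a single project.
projects_per_year = 50
automatable_share = 0.20   # share of data-processing work that can be automated
build_cost_factor = 0.5    # automation costs half of one year's manual effort

yearly_manual_effort = projects_per_year * automatable_share   # 10 units/year
build_cost = yearly_manual_effort * build_cost_factor          # 5 units, one-off

print(f"Year 1 net saving: {yearly_manual_effort - build_cost:.0f} units")
print(f"Every following year: {yearly_manual_effort:.0f} units")
```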

Rule 9: Embrace an open source, cloud and AI first strategy

In the data and analytics space, many of the great tools are open source. The good thing about open source is that you can use it across the entire organization, and it is often more cloud-ready than many commercial software tools. Students coming out of university typically know the tools, and there is plenty of training available. Open source tools interoperate much better with other tools, and you can replace them much faster once they are out of date. All of this speaks for an open-source-first strategy. Especially in data science, data scientists need fast access to tools and data with flexible computing power; hence, adopting a cloud-first approach is the best way forward for most companies, ideally as a hybrid cloud approach (private and public). AI assistants like Siri, Alexa and Cortana are currently reshaping the way we interact with machines using natural language, automating business processes and decisions in the background. Building new applications should therefore follow the AI-first paradigm, no matter whether it is about internal process optimization (e.g. the IT helpdesk) or customer-facing applications (e.g. customer support).

Rule 10: Balance data analytics innovation and transformation

Last but not least, combine elements of both "Data Innovation" and "Data Transformation" in every project you do, right from the very beginning. This will keep people happy while buying you time to do the long-term stuff. It might sound simple, but it is effective. The trick is to find the right mix: e.g., when you run an analytics pilot, also work on data quality or data collection in parallel.

The fascinating aspect to me is that digital transformation is one of the most difficult tasks anyone can take on. Nobody likes change. You receive a lot of resistance, and people try to politically kill you. Still, most people I know working in this field are passionate and love their job. They have the feeling that they are doing something with a higher meaning. Their jobs can have a significant, visible impact in the most positive sense after a few years. You don't necessarily get the recognition for it. But you can see the result after a while and be proud.

Digital Transformation Success is a Myth

I have been working for a while now as part of digital and data transformation teams in large traditional companies, both as a consultant and as a responsible manager. There are big upsides to such a job. You are working on topics that are considered "hot", and there are usually innovation budgets you can tap into to finance your new projects. It is easier to find new jobs and negotiate good salaries, as headhunting firms are looking for digital skills and people with experience in digital topics.

The job also has a big downside that most mainstream media outlets involved in generating the hype seem to underestimate. It is a dilemma that nobody I have met so far can truly escape.

Put simply, you either appear successful at digital transformation by focusing on digital showcases, or you try to drive real change (which, in the best case, takes an awful amount of time to show results, or does not show much effect even after years, given the difficulty of changing company culture). In a way, it is a lose-lose situation, which increasingly leads me to believe that digital transformation success is a pure myth, at least from the perspective of the person tasked with doing the digital transformation. The simple truth is that you cannot really succeed in digital transformation. Let me elaborate on that in a bit more detail and also tell you how I adjusted my tactics based on that simple realization.

Case 1: Happy Honeymoon

In the first case (focusing on digital showcases rather than transformation), you most likely get some recognition in the beginning, but it will quickly evaporate as you fail to deliver on expectations after a while. This approach brings short-term benefits that look good to your upper management. I have seen many digital labs and departments tasked with digital transformation produce one digital prototype after another while neglecting the need to make the rest of the organization ready to absorb these innovations.

Hence, most of the innovative ideas never make it into production or roll-out, where the real business benefits happen. And even if they get there, they are simply rejected by the business departments, which are not ready to use them. Inevitably, after a shorter or longer while, the honeymoon period ends. Due to pressure from your executives to work on productive roll-outs, and since that requires you to fix some basic underlying problems, you naturally shift your goals towards driving digital transformation in the business departments. This makes you focus on the underlying pain points that prevent the success of the great digital innovations you produce. Which leads us to case number 2.

Case 2: The Endless Road

In the second case (trying the hard stuff: transforming the company at its core), it is even worse! You and your team run 12 hours and more a day to transform the company. If you do a poor job, nothing happens; other departments will start to point at you and ask you to justify your existence. If you do a good job, most people hate you, since you are changing the company that they know and love. There is usually a reason why people work for a company: they feel attracted to its products and culture. And you are here to change probably both at the same time. (Side remark: this in itself is the very reason why digital transformation is so difficult to execute, as you need to push the company along these two axes simultaneously.) As a result, competing departments will try to convince your board that you are a waste of money, time and resources, in order to slow down your progress, as they fear losing power and influence in their kingdoms.

To make things even worse: Why the disruptive nature of change does not help at all

Well, we are not done yet. There is an additional difficulty you need to deal with as somebody working in digital transformation. Digital disruption follows an exponential curve, which means that it comes slowly, without being noticed, but then suddenly and brutally turns your entire industry upside down (as in the cases of Nokia and Kodak). Given that your company is a traditional company, its digital maturity is probably extremely low, and it takes years to see the first results. At the same time, the revenues and profits of your company are still very high, since your products are in the "cash cow" phase of their product life cycle. So your executives do not really feel the pain of the digital disruption that is entering your industry. Perhaps they believe you that it is coming, but that is very different from real pain. And even then, it takes years of digital catch-up until the first results are seen. There are not many executives willing to wait that long, as most of them will have moved into new jobs or retired by then. Why should they risk their big bonuses today for something that does not impact them immediately?

In summary, when you choose to go down the endless road, even if you are the smartest and most effective person in the world, digital transformation will probably take too long until the fruits can be reaped, and there is currently not enough "real pain" to make what you do attractive to your current top leadership. For them, it is enough that you do "digital showcases" to prove to the investors that your company is making progress and innovating. Then we are back at case 1, which means that at some point people will start saying that your showcases do not bring real benefits and will stop supporting you. Well, doesn't that sound like the perfect vicious cycle?

Here are a few tactics that I learned
It is worth thinking about tactics to at least partially resolve the dilemma, especially when you believe that digital transformation is the right way to go for your organization. As part of my work, I started to apply three simple tactics that might help you as well. They won't make the dilemma go away, but they can help to soften it.

1. The Mixing Cases Strategy: Combine elements of both "Happy Honeymoon" and "The Endless Road" in every project you do, right from the very beginning. This will keep people happy while buying you time to do the long-term stuff. It might sound simple, but it is effective. The trick is to find the right mix: e.g., when you run an analytics pilot, also work on data quality or data collection in parallel.

2. The Expectation Story Strategy: When you are doing digital transformation, never just say you are doing this and that because of this and that. Always tell a story of where we are today, what the steps in between are, and what the end game looks like. Make sure that from the very beginning you set the right expectations of what your top leadership can expect at which point in time. This way, they see even small steps as the right steps towards a bigger goal, which your top executives can then communicate to shareholders and stakeholders (whatever makes them look good will make your life easier).

3. The Evidence Collection Strategy: Once you have applied the Expectation Story Strategy, you should take base measurements and then further measurements at every step, to provide regular evidence to your leadership that the company is on the right track relative to the expectations you raised as part of your storytelling. Evidence that matches the expectation level of your initial story is a proof point that your story is right (at least so far). It helps to increase your upper management's trust in the digital transformation activities during the next phases as you move towards the defined target state. Even if the business benefits are not high during the current phase, people will feel more assured that the high business benefits to come later are in the process of being achieved.

Happily Forever After

In a way, we can draw a comparison to our private lives. At which point would you consider a marriage to be successful? Is it after a glamorous wedding and a honeymoon with lots of beautiful pictures taken on the beach and posted on Facebook? Certainly not. Just because you mastered the honeymoon does not mean your marriage will last happily ever after. So, is the point of success after 5 years of happy marriage? Or, perhaps, after 20, 30 or more years? We cannot properly define what success means in the case of a marriage as long as the couple is still alive. Nevertheless, we can perceive a marriage to be successful when we meet a couple that has been through many ups and downs in their many years of marriage and still appears reasonably happy. We never know for sure; some myth will always remain! Maybe it is the same with digital transformation.

Smart Machine Marketing and the Algorithmic Economy

The reason why Smart Machines are so much more powerful than conventional computer programs is the advanced AI algorithms and the data that they can absorb. Smart Machines can sense their own state and their environment, can communicate with other Smart Machines, are self-learning and can solve very complex problems, and they can act, sometimes autonomously. There are many technologies behind the capabilities of Smart Machines. The most important enabler is the massive amount of computing power and storage that is available today at a relatively cheap price, which finally makes it possible to apply computationally heavy artificial intelligence algorithms that would not have been feasible some years ago.

Many characteristics distinguish traditional software applications from Smart Machines. Computers have always been pretty good at repetitive, clearly described tasks and at applying strict logic and complex mathematics. An abundance of tasks today are solved by computers much faster, cheaper and more reliably than by humans. Yet, in many ways, computers often appear annoyingly stupid. Have you ever tried to have a meaningful and interesting conversation with a computer? It can be a difficult and typically very frustrating endeavour. What computers are missing is the ability to understand the meaning of what we have to say. This is because language is very ambiguous. The very same sentence can mean the complete opposite if said in another situation or by a different person. "I love this computer" could mean that I really like my computer a lot, but it could also mean that I really hate my computer because it doesn't do what I want it to do. It is very unlikely that "love" refers to romantic love in this context. The idea that computers can think like a human sounds far-fetched; it is, however, closer than you might think.

This is changing with the rise of Smart Machines. They are able to handle situations with ambiguity, sparse information and uncertainty, and thus to solve human kinds of problems. Instead of calculating the optimal solution using a predefined algorithm, Smart Machines evaluate different options and choose the best one out of the possibilities. Problems do not need to be provided in a specified machine-readable format; they can simply be formulated in natural language or even normal speech. Looking at the context of the problem makes it possible to interpret the question correctly: when I ask a Smart Machine "what is the best restaurant?", it should understand that I am probably looking for a good restaurant that is not too far away from my current location. Based on the outcomes of an action, Smart Machines can learn and improve their problem solving. Instead of being programmed, they can read PDF documentation to understand a business process, and observe how humans perform the process, in order to build their own knowledge base and eventually handle the business process on their own.
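The generate-and-evaluate pattern behind this can be illustrated with a toy Python sketch for the restaurant question: instead of computing one predefined answer, every candidate is scored against the user's context and the best option wins. The restaurants, ratings, coordinates and weights are all invented for the example.

```python
import math

# Toy generate-and-evaluate loop: score every candidate option against
# the user's context (here: location) and return the best one.
restaurants = [
    {"name": "Trattoria Roma", "rating": 4.6, "location": (2.0, 1.0)},
    {"name": "Sushi Corner",   "rating": 4.8, "location": (8.0, 7.5)},
    {"name": "Curry House",    "rating": 4.2, "location": (1.5, 0.5)},
]

def score(option: dict, user_location) -> float:
    distance = math.dist(option["location"], user_location)
    # Trade quality off against distance; the weight 0.3 is arbitrary.
    return option["rating"] - 0.3 * distance

def best_restaurant(user_location):
    return max(restaurants, key=lambda r: score(r, user_location))

if __name__ == "__main__":
    print(best_restaurant((1.0, 1.0))["name"])
```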

The key components of a Smart Machine are depicted in the figure below and will be explained in detail in the following. An incentive and rule system needs to be set for a Smart Machine, providing a purpose for it to exist (e.g. as a self-driving car) and the rules it needs to obey (e.g. ethics, law, company procedures, business goals).

[Figure: Smart Machine Marketing artwork – the key components of a Smart Machine]

In order for machines to see, feel, hear, smell, and taste like human beings, all aspects of the physical world need to be translated into “digestible” data for machines to process, reason, and act. The rise of low-cost sensor technologies and the Internet of Things with its connected devices enables the collection of data from the physical world without human interaction. All senses are needed to cover an entire customer journey from inspiration to usage. The augmented senses of machines allow a broader, deeper, and more personalized customer experience. Sensed information is fed, interpreted, filtered, interlinked and used to initiate further activities.

The most important ability of Smart Machines is to process the sensed information in a way similar to how we humans process information (i.e. empirical learning). Smart Machines are able to think and solve problems by understanding and clarifying objectives (sometimes coming up with their own objectives), by generating and evaluating hypotheses, and by providing answers and solutions like a human would (unlike a search engine, which gives a list of results). Smart Machines are self-learning: they can adapt their own algorithms through observation, discovery and doing.

Finally, Smart Machines can act, by visualizing and providing the responses to a human decision maker, by informing or even commanding a human to execute certain activities, or in the extreme case, by completely autonomously executing a business process or any other actions. Based on the results of the actions, Smart Machines are able to re-calibrate their goal setting.

The impact of Smart Machines will be observable in three domains for marketing professionals. First, customers will get a more contextualized and personalized experience. Second, marketing departments will be able to do more with fewer people, building on the automation and scale of intelligent algorithms that take over some of the human labor. Third, disruptive advances in the customer journey will become possible.

The marketing profession will be impacted quickly and significantly by Smart Machines and the Algorithmic Economy. Personalizing and contextualizing the customer experience is everyone's aim. But creating meaningful, continuous 1:1 interactions on a large scale, with thousands or millions of customers, is only feasible if Smart Machines take over a lot of the work. This means that Smart Machines take over work reserved for humans in the past, for example generating new content or supervising staff in retail stores to ensure high customer engagement. It also means that companies that still struggle with data-driven marketing will be in deep trouble. Those who embrace Smart Machines will be able to drive productivity beyond the imaginable for marketing and sales within the next decades.

Like all things in life, Smart Machines are a matter of perspective. For marketing divisions in traditional companies, they might be seen as the biggest threat in history. The way most marketing departments work today relies heavily on human labor and decision making. Shifting the work to Smart Machines will make many of the abilities of traditional marketing personnel redundant and will require new capabilities that the workforce does not necessarily have. For others, such as Silicon Valley startups, Smart Machines present a once-in-a-lifetime opportunity: they enable them to scale their limited resources and thus challenge even the largest established players in their own strongholds, irrespective of whether it is retail, consumer goods, banking, insurance, manufacturing, entertainment or any other type of industry open to Smart Machine Marketing.

Is it Time for a 2-Speed Business?

Shortly before my summer break – a lovely holiday in Northern France – I gave a keynote at a data science event that highlighted the importance of a bimodal IT for digital innovation.

The key idea behind bimodal IT is that IT needs to offer a second mode in addition to traditional IT that is more risk taking, agile and customer-centric in order to drive digital & analytics innovation more effectively.

Mode 1 is characterized by Gartner as the traditional mode of IT, which has a focus on reliability, is plan and approval-driven, uses large enterprise IT suppliers and typically follows a waterfall approach for implementations.

Mode 2 emphasizes agility and, hence, uses agile implementation approaches; it often utilizes small, new, innovative vendors and works closely with the business to drive fast and frequent customer-centric business innovations.

Many organizations have started to establish a second, more agile mode of IT (e.g. in the form of a data science lab, a digital factory, or an agile development and DevOps department), and they usually run into two major challenges that impede them from reaping the expected benefits:

(1) The two modes of IT are not synchronized well enough

(2) Business is not able to engage effectively with agile IT

I will explain these issues in more depth in the following, along with some lessons learned on how to resolve them.

(1) The two modes of IT are not synchronized well enough

What many organizations get wrong is that they focus too much on creating the new agile Mode 2 of IT. However, this is only one component of implementing a bimodal IT. The real challenge is how to synchronize both modes so they can play as a team. Keeping them in silos will not only create conflicts, but will also limit the success of any project that needs both Mode 1 and Mode 2 resources to succeed, which is usually the case. So what organizations need to establish is a bridge between the two modes.

Practically speaking, it all starts with mutual understanding and respect between the two modes. If Mode 1 resources have the feeling that they are a second class of IT, they will stop supporting Mode 2 and hinder them wherever possible. Leadership needs to communicate that neither mode is better than the other, and that both modes of IT are equally needed for success. Mode 2 resources need to understand that Mode 1 is crucial for renovating the core of IT, which enables innovative digital apps to be built efficiently and securely on top of a healthy infrastructure.

Moreover, there are touchpoints between Mode 1 and Mode 2 that require bimodal synchronization through explicit governance:

~ When a new application is planned to be developed, selection criteria have to be defined that outline which implementation should be done in which mode of IT.

~ When a new Mode 2 implementation project is starting, it has to be examined if interfaces to Mode 1 applications are needed and/or if other Mode 1 resources are required.

~ In particular, when the Mode 2 product is supposed to be released into a Mode 1 production environment, traditional release management needs to be involved from the very beginning of the agile project.

~ Finally, when a Mode 2 product is released, there might be a decision to further manage it in Mode 1 in the future.

(2) Business is not able to engage effectively with agile IT

Today's businesses are not yet ready to engage with Mode 2 IT in a productive manner. This has two main reasons.

First, the second mode of IT is all about experimentation: trying out new features, new approaches to analyzing data and new ways to interact with customers, while accepting that many of the experiments will not turn into viable products after all. Today, most traditional organizations have not yet developed a mindset for experimentation.

Second, using agile IT methods requires much more intense participation of the business during IT projects. Business is used to "throwing business requirements over the fence"; IT would take them, spend a few months or even years implementing them, and eventually come back for testing. In the meantime, business does not need to spend much time on the IT project. This is not the case for agile projects. In each sprint, the business needs to work closely with the developers and define the business requirements on the fly during the project.

These two points highlight some of the obstacles that come up when there is a two-speed organization on the IT side, but only a one-speed organization on the business side. The solution is simple, but substantial: many large organizations I work with have recognized the need to also establish a second mode of business, one that is more experimental and fast-paced and enables real digital innovation.

The consequences are visible: more and more business labs and business innovation centers of large enterprises are popping up around the world, in addition to data labs, with the role of working with agile IT to come up with and test new innovative ideas quickly. They aim to imitate a startup environment where creativity, experimentation and disruptive innovation are the focus. The results so far are impressive: Mode 2 IT can be utilized much better, and the collaboration between business and a bimodal IT becomes significantly better once a two-speed business has been established.

This is only the beginning, but one new imperative clearly emerges: it is time for a two-speed business in any organization. The pace of change will become faster, and volatility will increase in the future. So let's get business ready for it.

 

Dr. Alexander Borek advises Forbes 500 companies in multiple industries with regards to their digital transformation, data governance and Big Data Analytics innovation strategy.

All opinions in this blog are written in private capacity and do not express or reflect the opinions of his employer.

Industrializing Data Science and Analytics

Gartner's "Hype Cycle for Advanced Analytics and Data Science 2015" has just been published. The trends indicated in the hype cycle show the rising maturity of this young organizational discipline. It is interesting to see that the buzzword "Big Data" has finally disappeared from the hype cycle, while machine learning (a discipline that has been around for decades, at least in academia) has reached the peak of inflated expectations. This underpins a tendency to move from big data (the bigger the better) to smart data (the smarter the better). Simply said: "No matter if it is big or small data, it is still data, and we aim to get more value out of it."

A trend that is also visible at a second glance is the emerging industrialization of data science, which is underpinned by a number of developments. Vendors increasingly support the management of analytical models built by data scientists over their entire life cycle, as they are scaled from prototype to company-wide adoption. So far, the management of analytical models has been rather disorganized in most companies. Data scientists would create new models on a use-case-by-use-case basis. Some of the models actually did what they promised and were deployed in operations.

End-to-end management of the models and reuse of solution patterns for analytical models across the enterprise have not been actively enforced or governed. In a new project, teams would often start nearly from scratch, although a similar model might already have been developed in a different business unit. From an organizational point of view, it makes sense to have a centralized data science unit that can support data scientists in decentralized business units. A central data science unit can ensure that learnings are incorporated and fed back to the organization, and that analytical models are consistently governed even after they are handed over to IT.

Closely connected to this is the concept of the model factory. The idea is to bring automation and scalability to the process of building and deploying predictive models. To find the best models, a huge number of models are built and tested using software tools that provide a high degree of automation during development. At the end of the process, only the best few models are deployed.
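A toy version of such a model factory can be sketched with scikit-learn: build several candidate models automatically, evaluate each with cross-validation, and keep only the best. Real model factories add far more automation around feature handling, deployment and monitoring; this only shows the core loop.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Toy data standing in for a real business dataset.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# The "factory" part: many candidate models are built and evaluated
# automatically, and only the best performer is kept for deployment.
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest_small": RandomForestClassifier(n_estimators=50, random_state=0),
    "random_forest_large": RandomForestClassifier(n_estimators=200, random_state=0),
}

scores = {name: cross_val_score(model, X, y, cv=5).mean()
          for name, model in candidates.items()}
best = max(scores, key=scores.get)
print(f"Deploy {best} (CV accuracy {scores[best]:.3f})")
```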

Finally, a thrilling concept comes from Gartner's Alexander Linden: the analytics marketplace. Some companies, such as Microsoft, RapidMiner and FICO, have created marketplaces where data science services and additional functionality are provided by third parties and can be purchased by users of the analytics platforms. This can become a true game changer. Similar to the third-party apps and services provided at Salesforce.com, analytics marketplaces could become a source of millions of very domain-specific analytics micro-applications that drive innovation.

Today, we stand only at the beginning. I am convinced that in a few years' time, data science and advanced analytics will be as industrialized as traditional IT. What has changed with the rise of data science is the speed with which new applications are developed and deployed, the increased willingness to experiment, and the direct way data innovates business models and business operations. Now we only need to scale it to the rest of the enterprise to reap the full benefits.

 

 

Data Quality Expert Panel at DGIQ in San Diego

I really enjoyed the Data Governance and Information Quality conference that took place in mid-June in San Diego. There were many great talks; a highlight was Anthony Algmin, who talked about his first 100 days as the new Chief Data Officer at the Chicago Transit Authority. A great keynote was given by Scott Hallworth about the data quality journey at Capital One. Nancy Fessatidis of SAP gave a keynote on an emerging topic that gets a lot of attention these days: the ethics and morality of big data. The panel on controversial issues in data governance was a great ending to the conference.

During the conference, I gave a half-day tutorial on "setting up a data quality risk management program at your organization", where I enjoyed a very active and interested audience. Going to a lot of data conferences, I have observed a rising level of interest over the years in applying the risk paradigm to data quality, especially in regulated industries like banking and insurance.

I also participated in a very interesting panel discussion on data quality best practices with Michael Scofield, Peter Aiken, David Loshin and John Talburt, in which I highlighted the role of business-outcome-focused data quality metrics. You can watch the video of the panel discussion below.

Frontend Versus Backend for Digital Innovation

Very simply speaking, business processes can be divided into two categories: front office processes, which are all customer-facing business processes, and back office processes, which are all business processes without customer touchpoints. The back office is usually what the customer does not see.

Let me use a simplified exemplary scenario to explain how all these things interplay:

A customer finds a red wardrobe in a catalog and would like to know if his nearest furniture store has this particular product available, to make sure he does not drive 35 miles to the store for no reason. The front office business process in this scenario is that the customer asks whether the product is in stock and gets the answer to his question. To answer it, we need to know which products are available at any given time. Hence, a back office process is also required: keeping track of which products are in stock and which ones are out of stock at any moment.

If neither the front office process nor the back office process is digitized at all, the customer has to give the store a phone call and hope that some staff member will pick up the phone, go to the shelf where the product is stored and check visually whether there is still a red wardrobe available for sale.

| Examples | Front office process | Back office process |
| --- | --- | --- |
| Not digitized process | The customer gives the store a phone call; a staff member picks up the phone and checks if the product is available. | A staff member goes to the shelf where the product is stored and checks visually whether a red wardrobe is still available for sale. |
| Digitized process | The customer types "red wardrobe" and his address into the furniture store's website, which shows that the product is available in the nearest store. | All products carry an RFID chip that tracks them on the shelf; the IT system provides real-time availability information to staff and customers. |
| Automation of business process | The website makes the call obsolete; staff do not need to take an additional phone call. | RFID tracking of stock replaces the visual check of product availability. |
| Digital data generation | The customer's address and product of interest are captured. | The stock level and availability of each product are captured. |
| Digital data usage | Data about the availability of the red wardrobe is used to answer the request. | Data showing which RFID tag is linked to which product is used to track the stock level. |

In contrast, if we want to digitize the front office process, we could create a website through which the customer can check if the wardrobe is in stock. The website automates the business process in the front office. By typing the product name of interest into the website, the customer provides this information in a digital format. The result comes back on the screen, using existing digital information about product availability in the store. The data about product availability could still be entered into an IT system and maintained manually by an employee in the back office; the customer would not notice whether the back office process is digitized or not, as long as the information is up to date.

Finally, if we want to digitize the back office process in this scenario, we could, for instance, automate the tracking of products on the shelf by putting RFID tags on each product (RFID = radio-frequency identification). An RFID reader can then wirelessly detect how many products are on the shelf at any given time and store this information in an IT system, which can provide it to the website so it is visible to the customer. The front office process does not necessarily have to be digitized, though: even when the customer calls in, the staff member still saves time, as he can look up the product availability in the IT system instead of making a visual inspection.
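Putting the two halves of the scenario together, here is a toy Python sketch: the back office keeps stock counts current from RFID reads, and the front office website only queries that data. Product IDs and counts are invented for the example.

```python
# Toy end-to-end sketch of the furniture-store scenario.
stock = {}  # product_id -> units detected on the shelf

def rfid_shelf_scan(product_id: str, units_detected: int) -> None:
    # Back office: an RFID reader reports how many tagged items it sees.
    stock[product_id] = units_detected

def is_available(product_id: str) -> bool:
    # Front office: the website answers the customer's question
    # from the same real-time stock data.
    return stock.get(product_id, 0) > 0

if __name__ == "__main__":
    rfid_shelf_scan("red-wardrobe", 3)
    print(is_available("red-wardrobe"))   # True
    rfid_shelf_scan("red-wardrobe", 0)
    print(is_available("red-wardrobe"))   # False
```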

Even if your customer does not see what is going on in the backyard, the digital transformation of your back office is very important to your business success. A great customer experience is often not possible without efficient and effective back office processes. In our small furniture store scenario, capturing the stock level and availability of each product digitally ensures that the customer always has accurate information in real time. Secondly, making your back office run more efficiently through digital transformation can save you a lot of costs and make your operations smoother and leaner; without it, in our small example, you would need a lot of additional service staff to answer service requests. And thirdly, digital transformation makes your back office more effective, which can help you, for instance, to optimize your supply chain management, prevent fraud, manage business performance better, optimize your physical assets, create the highest value with your human resources and better manage your finances.

In essence, executives should avoid focusing all their digital innovation efforts only on what is shiny and visible to the customers. The inner core of your business can be an even stronger competitive differentiator, even if it is not directly seen from the outside. And not everything that shines is gold.