Why Every Employee in the Enterprise Doesn't Need to be a Data Expert
C-suite leaders today are under heavy pressure to make their organizations data driven and data smart. The hype around big data, machine learning and artificial intelligence adds to the expectation that leaders will steer their organizations toward informed, data-driven decision making.
In response, the latest slew of data and analytics products promises and sells the vision of making every user a data expert. This is a dangerous goal to target, and an almost unachievable one. Not all employees in an enterprise need to be data experts, that is, experts at identifying the right data, connecting data sets and generating analytics. However, almost all employees need to be empirical in their drive to make the best decisions using the best analytics at their disposal.
Not Every Employee Should be a Data Expert
Not all employees can be data experts, for several reasons. First, not all employees have the background or the technical expertise to understand the specifics and intricacies of data curation, transformation and enrichment. Second, not all employees have the aptitude for, or an interest in, performing data analytics and generating insights. Third, not all employees have the time or the organizational freedom to spend on generating analytics and insights.
The market is full of vendors whose data products promise to transform all employees into data experts. This is a dangerous sell. It can lead the enterprise to hold unrealistic expectations of its employees and of their ability to process and derive insights from enterprise data, which is often disparate, disjointed, dirty and incomplete, and cannot be readily processed.
Even if most employees were able to process and analyze data, a lack of business context, combined with a weak grasp of standard data handling and processing techniques, can produce analytics that are incorrect, unverifiable and misleading. The harm from incorrect analytics, or from the proliferation of inconsistent analytics across the enterprise, is often greater than the harm from having no analytics to drive decisions.
Another danger of analytics produced by untrained employees is the lack of verifiability and lineage in the data-driven decisions those employees make. Without lineage, reasoning and a description of the assumptions made before and during the analysis, the trustworthiness of the analytics is severely limited. When such lineage is unavailable, an agile, iterative approach to improvement becomes impossible.
The Strategy for Transformative Culture Shift
Enterprises are better off with a four-part strategy to catalyze the shift to a data-driven organization. The four parts of the strategy are: a curated data set store, a curated analytics store, a data domain expert group and a data governance group.
Curated Data Set Store
Enterprises should focus on establishing data sets that represent the most critical processes, workflows or user actions and behaviors enabled by the enterprise's products and services. These curated data sets should be clean, qualified, validated, enriched and transformed to provide a fully descriptive, highest-fidelity representation of the enterprise's critical operations. Any consumer of data from the curated data set store should be able to assume that the data is ready for processing and is the most accurate representation of the entity or process it describes. The store should also behave like a library and keep a record of all consumers that have utilized data from the store.
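The library-style record keeping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `CuratedDataSetStore`, its methods and the in-memory ledger are hypothetical, and a real store would sit on top of a database or data catalog rather than a dictionary.

```python
from datetime import datetime, timezone


class CuratedDataSetStore:
    """Hypothetical in-memory store that logs every read, like a library checkout ledger."""

    def __init__(self):
        self._datasets = {}  # data set name -> list of curated rows
        self._ledger = []    # (data set name, consumer, time of access)

    def publish(self, name, rows):
        """Register a curated, validated data set under a name."""
        self._datasets[name] = list(rows)

    def checkout(self, name, consumer):
        """Return a copy of the data set and record who consumed it."""
        if name not in self._datasets:
            raise KeyError(f"unknown data set: {name}")
        self._ledger.append((name, consumer, datetime.now(timezone.utc)))
        return list(self._datasets[name])

    def consumers_of(self, name):
        """List the distinct consumers that have read a data set, sorted by name."""
        return sorted({consumer for ds, consumer, _ in self._ledger if ds == name})


store = CuratedDataSetStore()
store.publish("orders", [{"order_id": 1, "amount": 120.0}])
store.checkout("orders", consumer="finance-dashboard")
store.checkout("orders", consumer="churn-model")
print(store.consumers_of("orders"))  # ['churn-model', 'finance-dashboard']
```

Because every read goes through `checkout`, the store always knows which teams depend on a data set before that data set is changed or retired.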
Curated Analytics Store
Enterprises should focus on establishing analytics (reports, dashboards, visualizations or insights) that are deemed to be standardized, accurate and consumable by decision makers and actors across the enterprise. The consumers of analytics, in any form, should be able to depend on their soundness, verifiability and accuracy. That includes being able to understand every assumption made in generating the analytics, since those assumptions determine whether the analytics are appropriate for the decision the consumer is trying to make. Like the Curated Data Set Store, the Curated Analytics Store should behave like a library and keep a record of all consumers that have utilized its analytics, along with the type of consumption and the type of decisions the analytics drove.
Data Domain Expert Group
Enterprises should also focus on establishing and defining user groups that represent the domain experts. These experts understand not just the data set itself, including its attributes, lineage, provenance and structure, but also the business context in which the data was generated and the intricacies of the process the data set represents. The Data Domain Experts enable the usage and consumption of both the data and the analytics derived from a given data set, and they ensure that the rest of the enterprise is utilizing the data and analytics in the most efficient, accurate and verifiable manner.
Data Governance Group
Enterprises should also ensure that there is a central team that is able to govern the usage of data and analytics to drive key decisions and actions in the enterprise. The data governance group should perform the following activities.
The data governance group should establish both quantitative and qualitative feedback loops around data and analytics consumption. The quality of these feedback loops depends directly on how standardized the mechanisms for data and analytics access, usage and delivery are.
For example, if there is a single API (or a small, fixed number of APIs or access mechanisms) for the data, the governance team can monitor data access patterns and generate a view at an aggregate level (what the various products are leveraging from the data pool) and at an individual product level (what each product is doing with the data). A standardized interface provides this visibility, which can be used to highlight unexpected, new or incorrect uses of the data.
This visibility then supports a process in which the governance team periodically monitors usage and makes a data-driven decision to do one of the following:
1. Continue with a certain access pattern
2. Reach out to product teams for clarifications on anomalous or unclear access patterns
3. Immediately disable risky or incorrect data access patterns (enabled by the standardized interface)
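The monitoring and triage process above can be illustrated with a short Python sketch. Everything here is an assumption made for illustration: the log format of `(product, dataset, operation)` tuples, the function names `summarize` and `triage`, and the `approved` and `blocked` sets are hypothetical, not part of any particular governance tool.

```python
from collections import Counter, defaultdict


def summarize(access_log):
    """Build the aggregate view and the per-product view from access log entries."""
    aggregate = Counter()               # data set -> total number of accesses
    per_product = defaultdict(Counter)  # product -> (data set, operation) -> count
    for product, dataset, operation in access_log:
        aggregate[dataset] += 1
        per_product[product][(dataset, operation)] += 1
    return aggregate, per_product


def triage(per_product, approved, blocked):
    """Map each observed access pattern to one of the three governance actions."""
    decisions = {}
    for product, patterns in per_product.items():
        for dataset, operation in patterns:
            key = (product, dataset, operation)
            if (dataset, operation) in blocked:
                decisions[key] = "disable"   # risky or incorrect pattern
            elif key in approved:
                decisions[key] = "continue"  # known, accepted pattern
            else:
                decisions[key] = "clarify"   # new or unclear: ask the product team
    return decisions


# Hypothetical log captured at the single, standardized access interface.
log = [
    ("billing", "orders", "read"),
    ("billing", "orders", "read"),
    ("marketing", "orders", "export"),
    ("search", "customers", "read"),
]
aggregate, per_product = summarize(log)
decisions = triage(
    per_product,
    approved={("billing", "orders", "read")},
    blocked={("orders", "export")},
)
for pattern, action in sorted(decisions.items()):
    print(pattern, "->", action)
```

The sketch only works because all access flows through one interface that produces a uniform log; with many ad hoc access paths, neither view could be assembled reliably.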
For any type of content, the number of producers can be much smaller than the number of consumers. In the same way, the number of employees generating analytics should be much smaller than the number of users relying on those analytics to drive decisions and actions. Without this pattern, the enterprise is bound to see an increase in analytics noise, leading either to bad decisions and bad actions or to delays as decision makers and actors search for the right insights, which results in missed opportunities and lost business value.