Audit Your Data Strategy
Before anything else happens, conduct an audit of your current data strategy. The more thorough the audit, the easier it is to spot wasted effort and missed opportunities. Use these guided questions to make sure you don't miss anything vital.

What data do you have, and how do you access it?

Identify every method you currently have of collecting information. This includes marketing data, website metrics, and feedback collection and analysis procedures: essentially, anything that measures how your company is performing and how consumers are interacting with your brand. The more detail you use here, the better.

Don't forget to include data from embedded analytics. Even if you haven't begun adding analytics on your own, most websites built within the last decade will have at least some tools that track site activity. A growing number of social media and marketing apps also offer embedded analytics. Take a look at exactly what is available. (A short code sketch after this checklist shows how a couple of these questions play out in practice.)

- Are you using analytics tools to track your users' activity? Which ones?
- Can you track an individual’s path through your website, or does your software only collect aggregate data from all users?
- Do you have defined Key Performance Indicators (KPIs)?
- How much say do you have in what metrics are tracked? Are you able to customize collection streams to highlight KPIs?
- Do you have a central dashboard for viewing and manipulating your data? If not, how many programs does a user have to log into when creating reports?
- How much data do you save? Do you save everything or only data related to KPIs?
- Are your storage limits imposed by internal decisions or available space?
- What are your backup procedures? How can you recover your data in the event of a loss?
- Do you have an employee or contractor assigned to oversee your data storage?
- How will you know if your data storage fails or becomes corrupted?
- Who can see your data? What are the security procedures used to keep unauthorized users from using/altering data?
- What is your data telling you about clients, market conditions, and workflow efficiency?
- Are you using AI techniques such as machine learning?
- Does your data generate actionable insights?
- Do you act on the insights generated by your data?
- How are you using data to optimize marketing, increase customer satisfaction, and improve internal processes?
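As promised above, here is a minimal sketch of how two of these checklist questions (defined KPIs, and individual paths versus aggregate data) might look in practice. It assumes Python with pandas; the event log and its column names ("user_id", "event") are hypothetical placeholders for whatever your analytics tools actually export.

```python
import pandas as pd

# Hypothetical raw event log; real column names depend on your analytics vendor.
events = pd.DataFrame({
    "user_id": [1, 1, 2, 3, 3],
    "event": ["visit", "purchase", "visit", "visit", "purchase"],
})

# KPI question: a defined metric, here conversion rate = purchasers / visitors.
visitors = events.loc[events["event"] == "visit", "user_id"].nunique()
buyers = events.loc[events["event"] == "purchase", "user_id"].nunique()
print(f"Conversion rate: {buyers / visitors:.0%}")

# Individual-vs-aggregate question: can you reconstruct one user's path?
print("User 1 path:", events.loc[events["user_id"] == 1, "event"].tolist())
```

If answering even a toy question like this requires exporting from three different programs, that's a finding worth recording in the audit.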
Conduct a Needs Assessment
The needs assessment consists of two stages: planning solutions for the problems found during the strategy audit, and determining which emerging technologies can be most effectively integrated.

Identifying weaknesses

During the assessment, some problems with the current data strategy should become apparent. It will be obvious if there is no backup, for example, or if huge amounts of data are going unused. Other issues take longer to surface. These tend to be workflow problems: workarounds that staff have created in order to function in a dysfunctional data environment. The two most common are data silos and Shadow IT.

Data silos

A data silo is a collection or storage system that is only accessible to one group within an organization. Drawing data from these systems for use elsewhere adds several extra layers of work for employees.

Data silos can form accidentally when senior leadership is unaware of a resource one department has and how it could be applied to the company at large. For this reason, CDOs and CIOs need to cultivate reciprocal relationships with their subordinate managers. There should be a climate where those managers feel empowered to share their ideas and processes without being accused of "getting bogged down in details". This gives C-level execs the chance to spot opportunities for applying those processes in other departments. Every meeting doesn't need to be a class on how a department runs; an in-depth update once or twice a month is generally sufficient to stay on top of developing procedures.

Sometimes data silos are intentionally created by IT departments in the interest of security. Security versus flexibility is one of the greatest conflicts in data science. It's critical to keep information (especially protected customer information and strategy-sensitive data) away from unauthorized use, but too many silos make completing even basic tasks complicated.

Shadow IT

Difficulty in accessing the data needed to operate leads to the second most common "data dysfunction": Shadow IT. This consists of any systems, applications, and procedures adopted by non-IT staff without IT consultation. Despite its ominous name, Shadow IT isn't caused by a desire to hurt the company. Employees become frustrated with inefficient workflows or limited capabilities and act to "fix" those problems themselves. They install enterprise software (or sometimes write their own) that automates as many of their "housekeeping" tasks as possible in order to free up time for their primary jobs.

Allowing key decision makers to champion new technology has the benefit of increasing flexibility and offloading some of the IT workload. It isn't without risk, however. Unapproved software can expose the organization's data to exactly the security risks the IT manager created the data silo to avoid. And without central coordination, resources are wasted on redundant or conflicting software. For more information about these hazards, read last month's blog post: The Dangers of Shadow IT in Mobile App Development.

Evaluating new data science applications

How well is your current data strategy delivering results? If you aren't using Artificial Intelligence-based analytics software, there's room for improvement. AI allows for much faster analysis of unstructured data, which makes up an estimated 80% of the world's data. Including AI in your data strategy is the first step to introducing it into your business strategy.
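To give a taste of what AI-based analysis of unstructured data looks like in practice, here is a minimal sketch that scores free-text customer feedback, assuming the open-source Hugging Face transformers library and its default pretrained sentiment model. The feedback strings are invented for illustration.

```python
from transformers import pipeline

# Loads a default pretrained sentiment model (downloaded on first run).
classifier = pipeline("sentiment-analysis")

# Hypothetical unstructured customer feedback.
feedback = [
    "The checkout process was painless and fast.",
    "I waited three weeks and my order never arrived.",
]

for text, result in zip(feedback, classifier(feedback)):
    print(f"{result['label']} ({result['score']:.2f}): {text}")
```

A few lines like these can triage thousands of free-text complaints far faster than manual review, which is the core appeal of AI analytics for unstructured data.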
For inspiration, here are some of the technologies being used by top-level enterprises today.

Deep learning

Just as machine learning is an application of Artificial Intelligence, deep learning is a sub-discipline of machine learning. Some industry publications describe it as an evolution, but this is misleading, as machine learning is still a vibrant and growing field. Both machine and deep learning teach software to make increasingly accurate choices about data based on past experience, with little to no human input. Deep learning focuses on the construction of deep neural networks: layered models whose many stacked processing layers progressively refine the program's definitions of categories.

Imagine a company wants to design a program to screen customer images posted to a social media site for inappropriate content. A series of initial algorithms is written to define what "inappropriate" means. The program uses these algorithms to approve or flag incoming content. In the beginning, though, the software will make a lot of mistakes while it attempts to understand the provided instructions. Patterns of color and unusual body positions could cause false positives. Deep learning shortens the training period by feeding an enormous amount of prepared data through the algorithm during the preparation phase. Because the program can check its results against a deep pool of pre-screened exemplars, it doesn't have the same learning curve as machine learning processes that must construct their own models. Results returned by deep learning algorithms reach usable levels of accuracy faster. (A toy version of this screening example is sketched in code after the data mining list below.)

On an enterprise level, deep learning is most beneficial where a company already has a store of sorted data to apply. Current applications of the technology include predicting the outcome of legal cases, navigating self-driving cars, powering guidance systems for the visually impaired, automatically generating reports in response to unstructured triggers (such as the text of a complaint email), and providing more challenging virtual opponents for computer and video games.

Data mining

Data mining and predictive analytics are often used interchangeably, though in reality data mining is a process that powers predictive analytics: it's one of the techniques that creates the framework predictive analytics uses to generate its predictions. Modern applications of data mining use machine learning to refine their output. There are different categories of data mining, depending on the desired end state:

- Association is used to find connections between events (i.e., customers look at these two websites during the same visit).
- Path analysis is the logical continuation of that process, in which the typical order of events is defined (customers look at these FAQ pages before choosing the "Contact" page).
- Data clustering groups data by proximity without assuming causation (for unknown reasons, most customers come from these six cities).
- Classification sorts data into classes based on differentiating factors (customers who have made a purchase in the past versus customers who only browse the site).
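First, the promised toy version of the image-screening example, assuming TensorFlow/Keras. A production content-moderation model would be far deeper and trained on a large pre-screened image set; this sketch only shows the shape of the idea.

```python
import tensorflow as tf

# Tiny binary image classifier: outputs P(image is inappropriate).
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(64, 64, 3)),          # small RGB images
    tf.keras.layers.Conv2D(16, 3, activation="relu"),  # low-level patterns
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),  # higher-level features
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
# model.fit(prescreened_images, labels, epochs=5)  # hypothetical labeled data
```

And a minimal sketch of two of the data mining categories, clustering and classification, assuming scikit-learn. The customer features are invented for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

# Hypothetical customer features: [site visits, dollars spent].
customers = np.array([[2, 0], [3, 10], [40, 500], [35, 450], [5, 20]])

# Clustering: group similar customers without assuming why they group.
segments = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(customers)

# Classification: separate past purchasers (1) from browsers (0).
purchased = np.array([0, 1, 1, 1, 1])
clf = LogisticRegression().fit(customers, purchased)

print("Segments:", segments)
print("Will a [30 visits, $400] customer buy?", clf.predict([[30, 400]])[0])
```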
Structuring Good Data Strategy
With all this information at hand, it's time to create the data strategy itself. This can be the most contentious part of the process. C-level executives with different areas of responsibility have different ideas about what the plan should look like and who should be responsible for its adoption and upkeep. In fact, 43% of enterprise leaders feel that getting everyone to agree to the same data strategy is too hard.

In truth, good data strategy prevents more problems than it causes. It removes the uncertainty around data management by outlining expectations for all involved parties. After adoption of the data strategy, different departments of the organization will find it much easier to rely on data coming from other branches, since they have more trust in the collection and management procedures.

There is no standard template for enterprise data strategy. Your business is unique, even among your competitors, and what works for another company might not complement your existing operations. Still, there are certain elements every data strategy should address in order to be considered complete.

Data Storage

Before any data is collected, there needs to be a storage system in place. The main decision here is whether to build local storage or contract for cloud storage.

Local storage often has faster connection speeds, and you have complete control over functions such as backups and access control. You can also manually disconnect local storage from the internet in case of a network attack. Setting up local storage comes with a large up-front investment, though, and you will have to arrange for maintenance and security personnel to protect that investment.

Cloud storage side-steps the costs of building and managing local servers. The provider handles maintenance and improvements as part of the cost, which is typically structured as a subscription. It's possible to purchase more storage as your business grows without being delayed by construction, and keeping data in the cloud protects it against on-site accidents. These advantages come at the cost of less control over the details of data storage and a slightly slower connection speed. The vast majority of businesses won't notice an appreciable difference in connectivity between local and cloud storage, so for most companies cloud storage is the best solution. (A minimal cloud backup sketch appears at the end of this section.)

Collection and Exploitation

Although collection and exploitation are different domains, the rise of embedded analytics has tied them together. An increasing number of products that used to simply collect data now process it as well, and few companies are willing to invest in new software that doesn't include some form of analytics.

Planning for collection includes deciding what information you need to track. Be specific, but don't feel the need to ration your KPIs. Data science needs data to work: the more relevant information you have, the more ROI you can realize from data science programs. Don't forget to address gaps in your data infrastructure revealed during the audit stage.

Exploitation covers everything from which embedded analytics programs will be utilized to the new data science applications you plan to adopt. What do you want your data to do? What goals should your CDO be working towards? While these will by nature be loosely defined, try to narrow them down beyond "growth". A better exploitation goal would be "increase growth in X market" or "improve the customer acquisition funnel".
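As promised above, here is a minimal sketch of an automated off-site backup for the cloud-storage option, assuming AWS S3 via the boto3 library. The bucket and file names are hypothetical.

```python
from datetime import date
import boto3

s3 = boto3.client("s3")  # credentials come from the environment or an IAM role

def backup(local_path: str, bucket: str = "example-corp-backups") -> None:
    """Upload one file to cloud storage under a dated backups/ prefix."""
    key = f"backups/{date.today().isoformat()}/{local_path}"
    s3.upload_file(local_path, bucket, key)

backup("sales_2024.db")  # hypothetical database export
```

Note how little infrastructure this requires compared to maintaining local servers, which is the trade-off discussed above.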
Data Integration

Describe your executive expectations for the adoption of data science enterprise-wide. This section should include a detailed plan for how data should be disseminated throughout the company. Incorporating data science into existing workflows is most efficiently done on a rolling, phased basis: identify the first few steps to improving your data usage, then periodically reassess and add new steps as the old ones are completed. Provide metrics to help managers assess their data science integration.

A data strategy should be dynamic as well as specific. Make sure there are guidelines for adjusting plans to fit new information, but don't change requirements on a weekly basis. Alterations to your strategy should always be data-driven and push towards well-defined goals.

Governance and security

Determine who owns each data asset within the company. Who is responsible for overseeing it? Who can make changes? Who can retrieve data? How will your data be protected from external malicious actors and internal negligence? What measures are in place to comply with relevant privacy laws and regulations such as HIPAA?

There's an executive trend towards democratizing data so that it's accessible by every department. That provides an incredible amount of flexibility and encourages innovation on an individual level, but it raises some security concerns. Decide what level of access each category of employee will have based on what you deem an acceptable balance of risk.

Resist the urge to centralize your entire data governance under the CDO. Assign key data governors at each level of authority, all the way from the CDO down to individual departments. In fact, this is a good way to balance freedom of data against security: each governor can assess their section's need for specific data more easily than the CDO, and having everything flow through that local governor provides a measure of accountability. (A minimal sketch of tiered access follows below.)

[Image source: aiim.org]
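Here is the promised minimal sketch of the tiered-governor idea, assuming a simple in-house role table rather than any particular governance product. The roles and dataset names are hypothetical.

```python
# Each tier of governor is granted access only to its own section's data.
ACCESS = {
    "cdo": {"finance", "marketing", "operations"},
    "marketing_governor": {"marketing"},
    "marketing_analyst": {"marketing"},
}

def can_read(role: str, dataset: str) -> bool:
    """Return True only if this role's tier has been granted the dataset."""
    return dataset in ACCESS.get(role, set())

assert can_read("cdo", "finance")
assert can_read("marketing_analyst", "marketing")
assert not can_read("marketing_analyst", "finance")  # the boundary holds
```

Even a table this simple gives each local governor a single, auditable place to grant or revoke access for their section.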
Conclusion

Sound data strategy and the resulting increase in data utilization translate into profit: Fortune 1000 companies that increase their data utilization by a mere 10% can add up to $65 million to their net income. By assessing and improving your company's data strategy, you can position yourself to take advantage of new AI technologies and win a share of that increased profit.

Concepta can help assess your data strategy. Contact us today for a consultation!