Cloud-Based Analytics Requires Hybrid Data Access and Integration


As I discussed in the state of data and analytics in the cloud recently, usability is a top evaluation criterion for organizations in selecting cloud-based analytics software. Access to data in both cloud and on-premises systems is an essential antecedent of usability: it helps business people perform analytic tasks themselves without having to rely on IT. Some tools allow data integration by business users on an ad hoc basis, but to provide an enterprise integration process and a governed information platform, IT involvement is often necessary. Once that is done, though, using cloud-based data for analytics can help by empowering business users and improving communication and processes.

To be able to make the best decisions, organizations need access to multiple integrated data sources. The research finds that the most common data sources are predictable: business applications (51%), business intelligence applications (51%), data warehouses or operational data stores (50%), relational databases (41%) and flat files (33%). Increasingly, though, organizations also are including less structured sources such as semistructured documents (33%), social media (27%) and nonrelational database systems (19%). In addition there are important external data sources, including business applications (for 61%), social media data (48%), Internet information (42%), government sources (33%) and market data (29%). Whether stored in the cloud or locally, data must be normalized and combined into a single data set so that analytics can be performed.
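To make the idea of normalizing and combining sources concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the two sources, their field names and the shared schema (customer_id, revenue, region, orders) are all hypothetical and do not come from the research.

```python
# Hypothetical records from two sources with different field names and formats.
crm_rows = [  # stands in for a cloud business application
    {"CustomerID": "C-001", "Revenue": "1,200.50", "Region": "emea"},
    {"CustomerID": "C-002", "Revenue": "830.00", "Region": "AMER"},
]
warehouse_rows = [  # stands in for an on-premises data warehouse
    {"cust_id": "c-001", "orders": 4},
    {"cust_id": "c-003", "orders": 1},
]


def normalize_crm(row):
    """Map a CRM record onto the shared (assumed) schema."""
    return {
        "customer_id": row["CustomerID"].upper(),
        "revenue": float(row["Revenue"].replace(",", "")),
        "region": row["Region"].upper(),
    }


def normalize_warehouse(row):
    """Map a warehouse record onto the shared (assumed) schema."""
    return {"customer_id": row["cust_id"].upper(), "orders": int(row["orders"])}


def combine(*sources):
    """Merge normalized records by customer_id into one data set."""
    combined = {}
    for source in sources:
        for record in source:
            combined.setdefault(record["customer_id"], {}).update(record)
    return [combined[key] for key in sorted(combined)]


dataset = combine(
    [normalize_crm(r) for r in crm_rows],
    [normalize_warehouse(r) for r in warehouse_rows],
)
for record in dataset:
    print(record)
```

Customers that appear in only one source keep just the fields that source provides, which is the kind of gap an analyst would then need to review.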

Given the distributed nature of data sources as well as the diversity of data types, information platforms and integration approaches are changing. While more than three in five companies (61%) still do integration primarily between on-premises systems, significant percentages are now doing integration from the cloud to on-premises (47%) and from on-premises to the cloud (39%). In the future, this trend will become more pronounced. According to our research, 85 percent of companies eventually will integrate cloud data with on-premises sources, and 84 percent will do the reverse. We expect that hybrid architectures, a mix of on-premises and cloud data infrastructures, will prevail in enterprise information architectures for years to come while slowly evolving to equality of bidirectional data transfer between the two types.

Further analysis shows that a focus on integrating data for cloud analytics can give organizations competitive advantage. Those who said it is very important to integrate data for cloud-based analytics (42% of participants) also said they are very confident in their ability to use the cloud for analytics (35%); that’s three times more often than those who said integrating data is important (10%) or somewhat important (9%). Those saying that integration is very important also said more often that cloud-based analytics helps their customers, partners and employees in an array of ways, including improved presentation of data and analytics (62% vs. 43% of those who said integration is important or somewhat important), gaining access to many different data sources (57% vs. 49%) and improved data quality and data management (59% vs. 53%). These numbers indicate that organizations that neglect the integration aspects of cloud analytics are likely to be at a disadvantage compared to their peers that make it a priority.

Integration for cloud analytics is typically a manual task. In particular, almost half (49%) of organizations in the research use spreadsheets to manage the integration and preparation of cloud-based data. Yet doing so poses serious challenges: 58 percent of those using spreadsheets said it hampers their ability to manage processes efficiently. While traditional methods may suffice for integrating relatively small and well-defined data sets in an on-premises environment, they have limits when dealing with the scale and complexity of cloud-based data. The research also finds that organizations utilizing newer integration tools are satisfied with them more often than those using older tools. More than three-fourths (78%) of those using tools provided by a cloud applications provider said they are satisfied or somewhat satisfied with them, as are even more (86%) of those using data integration tools designed for cloud computing; by comparison, fewer of those using spreadsheets (56%) or traditional enterprise data integration tools (71%) are satisfied.

This is not surprising. Modern cloud connectors are designed to connect via loosely coupled interfaces that allow cloud systems to share data in a flexible manner. The research thus suggests that for organizations needing to integrate data from cloud-based data sources, switching to modern integration tools can streamline the process.

Overall three-quarters of companies in our research said that it is important or very important to access data from cloud-based sources for analysis. Cloud-based analytics isn’t useful unless the right data can be fed into the analytic process. But without capable tools this is not easy to do. A substantial impediment is that analysts spend the majority of their time in accessing and preparing the data rather than in actual analysis. Complicating the task, each data source can represent a different, possibly complex, data model. Furthermore, the data sets may have varying data formats and interface requirements, which are not easily addressed with legacy integration tools.

Such complexity is the new reality, and new tools and approaches have come to market to address these complexities. For organizations looking to integrate their data for cloud-based analytics, we recommend exploring these new integration processes and technologies.

Regards,

Ventana Research

The Establishment of Data Preparation


Data is an essential ingredient for every aspect of business, and those that use it well are likely to gain advantages over competitors that do not. Our benchmark research on information optimization reveals a variety of drivers for deploying information, most commonly analytics, information access, decision-making, process improvements and customer experience and satisfaction. To accomplish any of these purposes requires that data be prepared through a sequence of steps: accessing, searching, aggregating, enriching, transforming and cleaning data from different sources to create a single uniform data set. To prepare data properly, businesses need flexible tools that enable them to enrich the context of data drawn from multiple sources, collaborate on its preparation to serve business needs and govern the process of preparation to ensure security and consistency. Users of these tools range from analysts to operations professionals in the lines of business.

Data preparation efforts often encounter challenges created by the use of tools not designed for these tasks. Many of today’s analytics and business intelligence products do not provide enough flexibility, and data management tools for data integration are too complicated for analysts who need to interact ad hoc with data. Depending on IT staff to fill ad hoc requests takes far too long for the rapid pace of today’s business. Even worse, many organizations use spreadsheets because they are familiar and easy to work with. However, when it comes to data preparation, spreadsheets are awkward and time-consuming and require expertise to code them to perform these tasks. They also incur risks of errors in data and inconsistencies among disparate versions stored on individual desktops.

In effect inadequate tools waste analysts’ time, which is a scarce resource in many organizations, and can squander market opportunities through delays in preparation and unreliable data quality. Our information optimization research shows that most analysts spend the majority of their time not in actual analysis but in readying the data for analysis. More than 45 percent of their time goes to preparing data for analysis or reviewing the quality and consistency of data.

Businesses need technology tools capable of handling data preparation tasks quickly and dependably so users can be sure of data quality and concentrate on the value-adding aspects of their jobs. More than a dozen such tools designed for these tasks are on the market. The best among them are easy for analysts to use, which our research shows is critical: More than half (58%) of participants said that usability is a very important evaluation criterion, more than any other, in software for optimizing information. These tools also deal with the large numbers and types of sources organizations have accumulated: 92 percent of those in our research have 16 to 20 data sources, and 80 percent have more than 20 sources. Complicating the issue further, these sources are not all inside the enterprise; they also are found on the Internet and in cloud-based environments where data may be in applications or in big data stores.

Organizations can’t make business use of their data until it is ready, so simplifying and enhancing the data preparation process can make it possible for analysts to begin analysis sooner and thus be more productive. Our analysis of time related to data preparation finds that when this is done right, significant amounts of time could be shifted to tasks that contribute to achieving business goals. We conclude that, assuming analysts spend 20 hours a week working on analytics, most are spending six hours on preparing data, another six hours on reviewing data for quality and consistency issues, three more hours on assembling information, another two hours waiting for data from IT and one hour presenting information for review; this leaves only two hours for performing the analysis itself.

Dedicated data preparation tools support key tasks that, according to our research and experience, about one-third of organizations perform manually. These data tasks include search, aggregation, reduction, lineage tracking, metrics definition and collaboration. If an organization is able to reduce the 14 hours of data-related tasks previously mentioned (preparing data, reviewing data and waiting for data from IT) by one-third, it will have an extra four hours a week for analysis, which is 10 percent of a 40-hour work week. Multiply this time by the number of individual analysts and it becomes significant. Using the proper tools can enable such a reallocation of time to use the professional expertise of these employees.
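The arithmetic behind these figures can be checked with a short Python sketch. The hourly figures come from the breakdown above; the team size of 25 analysts is a hypothetical value added only to illustrate the "multiply by the number of analysts" step.

```python
# Weekly hours per task, from the 20-hour analytics breakdown in the text.
hours = {
    "preparing data": 6,
    "reviewing data for quality and consistency": 6,
    "assembling information": 3,
    "waiting for data from IT": 2,
    "presenting information for review": 1,
}
analytics_hours_per_week = 20
analysis_hours = analytics_hours_per_week - sum(hours.values())

# The three data-related tasks that dedicated tools are meant to reduce.
data_related_hours = (
    hours["preparing data"]
    + hours["reviewing data for quality and consistency"]
    + hours["waiting for data from IT"]
)

saved_hours = data_related_hours / 3        # one-third reduction, about 4.67
saved_whole_hours = data_related_hours // 3  # rounded down to whole hours, as in the text
share_of_work_week = saved_whole_hours / 40

team_size = 25  # hypothetical number of analysts, not a research figure
team_hours_per_week = saved_whole_hours * team_size

print(f"Hours left for analysis today: {analysis_hours}")
print(f"Data-related hours per week: {data_related_hours}")
print(f"Hours freed by a one-third reduction: {saved_hours:.2f}")
print(f"Share of a 40-hour week: {share_of_work_week:.0%}")
print(f"Hours freed across {team_size} analysts: {team_hours_per_week}")
```

Rounding down to four whole hours matches the conservative figure used in the text; the unrounded saving is slightly larger.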

These savings can apply in any line of business. For example, our research into next-generation finance analytics shows that more than two-thirds (68%) of finance organizations spend most of their analytics time on data-related tasks. Further analysis shows that only 36 percent of finance organizations that spend the most time on data-related tasks can produce metrics within a week, compared to more than half (56%) of those that spend more time on analytic tasks. This difference is important to finance organizations seeking to take a more active role in corporate decision-making.

Another example is found in big data. The flood of business data has created even more challenges as the types of sources have expanded beyond just the RDBMS and data appliances; Hadoop, in-memory and NoSQL big data sources exist in at least 25 percent of organizations, according to our big data integration research. Our projections of growth, based on what companies are planning, indicate that Hadoop, in-memory and NoSQL sources will increase significantly. Each of these types must draw from systems from various providers, each of which has its own interfaces for accessing data, let alone loading it. Our research in big data finds similar results regarding data preparation: The tasks that consume the most time are reviewing data for quality and consistency (52%) and preparing data (46%). Without automating data preparation for accessing and streamlining the loading of data, big data can be an insurmountable task for companies seeking efficiency in their deployments.

A third example is in the critical area of customer analytics. Customer data is used across many departments but especially marketing, sales and customer service. Our research again finds similar issues regarding time lost to data preparation tasks. In our next-generation customer analytics benchmark research, preparing data is the most time-consuming task (in 47% of organizations), followed closely by reviewing data (43%). The research also finds that data not being readily available is the most common point of dissatisfaction with customer analytics (in 63% of organizations). Our research finds other examples, too, in human resources, sales, manufacturing and the supply chain.

The good news is that these business-focused data preparation tools offer usability in the form of spreadsheet-like interfaces and include analytic workflows that simplify and enhance data preparation. In searching for and profiling data and examining fields based on analytics, use of color can help highlight patterns in the data. Capabilities for addressing duplicate and incorrect data about, for example, companies, addresses, products and locations are built in for simplicity of access and use. In addition data preparation is entering a new stage in which machine learning and pattern recognition, along with predictive analytics techniques, can help guide individuals to issues and focus their efforts on looking forward. Tools also are advancing in collaboration, helping teams of analysts work together to save time and take advantage of colleagues’ expertise and knowledge of the data, along with interfacing to IT and data management professionals. In our information optimization research collaboration is a critical technology innovation, according to more than half (51%) of organizations. They desire several collaborative capabilities ranging from discussion forums to knowledge sharing to requests on activity streams.
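As a rough illustration of what addressing duplicate company data can involve, the following Python sketch groups company names that differ only in case, punctuation or legal suffix. It is a simple rule-based approach under stated assumptions, not the machine learning techniques mentioned above and not a description of how any particular product works; the suffix list and sample names are hypothetical.

```python
import re
from collections import defaultdict

# Assumed list of legal suffixes to ignore when comparing company names.
LEGAL_SUFFIXES = {"inc", "incorporated", "corp", "corporation", "ltd", "llc", "co"}


def match_key(name):
    """Lowercase the name, drop punctuation and strip legal suffixes."""
    words = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(word for word in words if word not in LEGAL_SUFFIXES)


def find_duplicates(names):
    """Return groups of names that share the same match key."""
    groups = defaultdict(list)
    for name in names:
        groups[match_key(name)].append(name)
    return [group for group in groups.values() if len(group) > 1]


# Hypothetical sample data.
companies = ["Acme Corp.", "ACME Corporation", "Globex, Inc.", "Initech LLC", "globex inc"]
duplicate_groups = find_duplicates(companies)
for group in duplicate_groups:
    print(group)
```

Exact matching on a cleaned key catches only the simplest duplicates; misspellings and abbreviations would need fuzzier matching, which is where pattern recognition techniques come in.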

This data preparation technology provides support for ad hoc and other agile approaches to working with data that map to how businesses actually operate. Taking a dedicated approach can help simplify and speed data preparation and add value by enabling users to perform analysis sooner and allocate more time to it. If you have not taken a look at how data preparation can improve analytics and operational processes, I recommend that you start now. Organizations are saving time and becoming more effective by focusing more on business value-adding tasks.

Regards,

Mark Smith

CEO and Chief Research Officer