SAP has launched its Enterprise Information Management (EIM) 4.0 release as part of its “Run Better Tour.” It includes a broad range of information management components spanning data integration, data quality, data profiling, metadata management and more. The launch was made in conjunction with SAP Business Intelligence (BI) 4.0, which got much bigger billing at the event, to the point where one might call this a stealth marketing campaign. However, the event did identify three themes intended to highlight EIM capabilities: event insight, trusted data and text processing. The goal here was to communicate the integration SAP has achieved within and between its BI and EIM products. IBM announced a similar advance with its InfoSphere products, and Informatica has also invested heavily in integrating its information management products. Our Information Management benchmark research validates this approach, finding that incompatible tools create a significant obstacle to organizations’ quest for consistent sets of data.
With this release SAP also announced three new products: SAP BusinessObjects Data Services, SAP BusinessObjects Information Steward and SAP BusinessObjects Event Insight.
Data Services brings together ETL (extract, transform and load) capabilities, data quality and text processing into a single set of services. With this release, SAP supports address cleansing for more than 230 countries. In addition, the data quality components are available as an embeddable set of self-contained libraries requiring no application server. This architecture makes them easier to embed within third-party applications. Data Services is also the engine for moving, transforming and loading data and metadata into SAP’s in-memory, high-performance analytic application, HANA. This suggests high throughput rates for Data Services, since HANA is about performance and real-time access to data. Text data processing, formerly based on Inxight, which was a separate acquired technology, has been consolidated with Data Services in 4.0.
A demo is available of the text data processing capabilities being used to perform sentiment analysis of a tweet stream, in what is becoming a required showpiece for all BI software vendors. While sentiment analysis of social media data makes for an interesting demo, it remains a very difficult problem to solve. If you view the tweets with a strong positive sentiment in the demo, you’ll see that the first page includes 10 tweets. While those tweets contain positive sentiment with words like “excellent,” “thanks” or “positive,” it turns out that seven of the 10 are actually expressions of positive sentiment about reviews of the event, not the event itself. I’ve seen the same flaw in other vendors’ tweet stream analyses as well. While most of these products, including SAP’s, allow you to extend the libraries used for determining sentiment and context to produce more accurate results, I would use caution before relying too heavily on these automated analyses.
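To illustrate why word-level scoring misattributes sentiment, here is a minimal sketch of a lexicon-based scorer. It is not SAP’s text data processing engine; the word lists, the `lexicon_score` and `mentions_review` functions, and the sample tweets are all hypothetical and exist only to show the failure mode described above.

```python
import re

# Hypothetical word lists; a real sentiment library would be far larger.
POSITIVE = {"excellent", "thanks", "positive", "great"}
NEGATIVE = {"poor", "bad", "disappointing"}


def lexicon_score(text: str) -> int:
    """Count positive words minus negative words, ignoring what they refer to."""
    words = re.findall(r"[a-z']+", text.lower())
    positives = sum(w in POSITIVE for w in words)
    negatives = sum(w in NEGATIVE for w in words)
    return positives - negatives


def mentions_review(text: str) -> bool:
    """Crude context check: is the tweet about a review rather than the event?"""
    return bool(re.search(r"\b(review|recap|summary)\b", text.lower()))


# Invented sample tweets, not taken from the demo.
tweets = [
    "Excellent keynote at the launch event",
    "Thanks for the excellent review of the launch event",
]

for tweet in tweets:
    print(lexicon_score(tweet), mentions_review(tweet), tweet)
```

The second tweet scores higher than the first even though its praise is aimed at a review, which is the kind of context a plain word list cannot see. Extending the libraries, as the vendors allow, amounts to adding checks like `mentions_review`, and each such rule covers only the cases someone thought to write down.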
The second of the three new products, Information Steward, provides data governance capabilities designed to help increase, in SAP’s words, business users’ trust in the data. Trust is an important issue. Our Data Governance research found that only 9 percent of organizations completely trust their data for decision-making. Information Steward provides, among other things, profiling and data quality dashboards, including the ability, accessible from the front-end BI tools, to drill down into the details behind quality scores and key performance indicators. Users can even see which records have failed in recent integration processes. Another capability, Cleansing Package Builder, can profile a set of data to derive rules about that data. These automatically derived rules can then be reviewed and modified or augmented as necessary.
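The idea of profiling data to derive rules can be sketched in a few lines. This is an assumption-laden illustration, not how Cleansing Package Builder works internally: the `shape`, `derive_rule` and `failures` functions, the `min_share` threshold and the sample phone values are all hypothetical.

```python
import re
from collections import Counter


def shape(value: str) -> str:
    """Reduce a value to its character pattern: digits become 9, letters become A."""
    return re.sub(r"[A-Za-z]", "A", re.sub(r"\d", "9", value))


def derive_rule(values, min_share=0.5):
    """Return the dominant pattern if it covers at least min_share of the values."""
    counts = Counter(shape(v) for v in values)
    pattern, count = counts.most_common(1)[0]
    return pattern if count / len(values) >= min_share else None


def failures(values, pattern):
    """List the values that do not match the derived pattern."""
    return [v for v in values if shape(v) != pattern]


# Invented sample column for illustration.
phones = ["555-0101", "555-0102", "5550103", "555-0104", "n/a"]

rule = derive_rule(phones)
print("derived rule:", rule)
print("failing records:", failures(phones, rule))
```

A derived rule like this is only a proposal. As the paragraph above notes, it still needs a person to review it and to modify or augment it where the dominant pattern is not the correct one.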
The third new piece of the SAP EIM portfolio, Event Insight, manages event-based data to enable operational intelligence using real-time data. The new product is based on the Aleri technology acquired as part of the Sybase portfolio of products. Event Insight has three basic capabilities: it can process streams of data such as network traffic, combine them with historical data, and deliver the resulting information to users in real time. To help increase scale and reduce network congestion, SAP has architected the technology to “push down” some of the event filtering to decentralized servers. The historical data is still centralized, so the rule execution must still happen at the central server. This technology is part of what is known as complex event processing; our research into operational intelligence found a need for improved efficiency of business processes utilizing events.
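The split between pushed-down filtering and centralized rule execution can be sketched as follows. This is a conceptual illustration only, not Event Insight’s API: the `Event` class, the `edge_filter` and `central_rule` functions, the thresholds and the historical averages are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Event:
    host: str
    latency_ms: float


def edge_filter(events, threshold_ms=100.0):
    """Runs on a decentralized server: forward only events above a fixed threshold."""
    return [e for e in events if e.latency_ms > threshold_ms]


# Hypothetical historical baselines, held only at the central server.
HISTORICAL_AVG_MS = {"web-1": 80.0, "web-2": 150.0}


def central_rule(events, factor=2.0):
    """Runs centrally: flag events exceeding factor times the host's historical average."""
    return [
        e for e in events
        if e.latency_ms > factor * HISTORICAL_AVG_MS.get(e.host, float("inf"))
    ]


# Invented raw stream seen by one decentralized server.
raw = [
    Event("web-1", 40.0),
    Event("web-1", 190.0),
    Event("web-2", 160.0),
    Event("web-2", 90.0),
]

forwarded = edge_filter(raw)      # only these cross the network
alerts = central_rule(forwarded)  # needs the centralized history
print(len(raw), len(forwarded), alerts)
```

In this sketch half of the raw events never leave the decentralized server, which is where the reduction in network traffic comes from. The rule that compares an event with its history still has to run where that history lives.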
SAP placed significantly less emphasis on security than have other information management vendors I’ve spoken with. For instance, there was no mention of data masking, which competitive products provide. Another area that was noticeably lacking was integration with Hadoop. Many information management vendors are incorporating Hadoop into their product strategies, and I expect we’ll hear more from SAP on this in the future.
Overall, the integration delivered in this release should benefit users; SAP now has the unusual opportunity to provide that integration all the way from ERP applications through information management processes to business intelligence. My colleague has pointed out the importance of business and IT engagement and noted where SAP is integrating its technologies further to meet that need. In addition, HANA provides evidence that SAP wants to exploit the larger information and analytics opportunity by accelerating the processing of data into information for a range of business needs. Now SAP needs to decide whether to market this set of technologies more actively to capitalize on the demand or simply keep it as part of its business intelligence efforts.