The Data Virtualization Gold Standard

Robert Eve




BigData: Article

Data Virtualization - RDBMS vs Middleware

Data virtualization platforms

As discussed in my previous article, data virtualization is the new enterprise data integration pattern for the petabyte enterprise, which depends on the cloud for dynamic allocation of resources to satisfy information needs. We have also seen several attributes of data virtualization that fit the needs of the new enterprise.

As an illustration of data virtualization at work, we demonstrated the use of SQL Server as a data virtualization engine, with the relational database engine acting as the data virtualization server that delivers the results.

However, using an existing RDBMS is not the only option; specialized engines outside the RDBMS can also serve as data virtualization engines. This concept is not new.

ROLAP vs MDOLAP
In decision-support systems, OLAP (Online Analytical Processing) refers to a multi-dimensional view of aggregate data that provides quick access to strategic information for further analysis.

MDOLAP or MOLAP (multi-dimensional OLAP) servers employ dedicated OLAP engines optimized to manage sparse matrices of data. MOLAP storage management maintains the physical storage of OLAP cubes. MOLAP has the advantage of high query performance and fast response times, but it is quite proprietary and does not scale well to large data sets.

ROLAP technology accesses data stored in a relational database to provide OLAP analysis without the requirement to store and calculate data in a multi-dimensional cube. Relational databases serve as the database layer for data storage, access and retrieval processes. ROLAP has the advantage of handling large data sets. However, its response time is not as fast as MOLAP's.
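
To make the contrast concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the sales rows, the table name and the helper functions (rolap_total, build_cube) are hypothetical, SQLite stands in for the relational database, and a plain dictionary stands in for a dedicated MOLAP engine.

```python
import sqlite3
from collections import defaultdict

# Hypothetical fact rows: (region, product, amount).
SALES = [
    ("East", "Widget", 100.0),
    ("East", "Gadget", 50.0),
    ("West", "Widget", 70.0),
    ("West", "Widget", 30.0),
]

def rolap_total(conn, region):
    """ROLAP style: aggregate at query time, straight from relational rows."""
    row = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM sales WHERE region = ?",
        (region,),
    ).fetchone()
    return row[0]

def build_cube(rows):
    """MOLAP style: precompute every (region, product) cell plus roll-ups."""
    cube = defaultdict(float)
    for region, product, amount in rows:
        cube[(region, product)] += amount
        cube[(region, "*")] += amount   # roll-up over all products
        cube[("*", product)] += amount  # roll-up over all regions
        cube[("*", "*")] += amount      # grand total
    return dict(cube)

# SQLite stands in for the relational database (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", SALES)

cube = build_cube(SALES)
west_rolap = rolap_total(conn, "West")  # computed when asked
west_molap = cube[("West", "*")]        # looked up from the precomputed cube
```

Both paths return the same total. The ROLAP path pays the aggregation cost on every query but stores nothing extra, while the MOLAP path answers with a lookup at the price of building and storing every cell up front, which is the trade-off described above.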

The idea here is not to compare ROLAP and MOLAP but to show that certain data integration and access patterns can be achieved either with a pure RDBMS implementation or with specialized middle-tier servers built specifically for them.

In that context, having analyzed the traditional RDBMS as a data virtualization engine in the last article, we also want to analyze specialized data virtualization engines that are built specifically for the purpose.

Data Virtualization Using a Composite Platform
The Composite Data Virtualization Platform provides a middle-tier platform outside of the relational databases that helps to integrate data from multiple disparate sources in a unified, logically virtualized manner for access by various front-end technologies.

This solution provides a virtual data abstraction layer on top of the disparate data sources.

At the heart of this virtualization platform, the Composite Information Server acts as a virtual database layer and supports the following core tenets of data virtualization:

  • Federates and queries data across disparate data sources. This provides integration across multiple data sources like Big Data, mainframes, RDBMS, web services, messages, Microsoft Office, etc.
  • Performance optimization for queries that converge data from multiple data sources
  • QoS factors like caching and security
  • Abstraction layer to deliver data to consuming applications
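
The tenets above can be sketched in a few lines of Python. This is a simplified illustration and not the Composite Information Server API: the two sources, the sample data, the _cache dictionary and the customer_order_totals function are all hypothetical names chosen for this example.

```python
import csv
import io
import sqlite3

# Source 1 (hypothetical): a relational database holding customers.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
db.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Acme"), (2, "Globex")])

# Source 2 (hypothetical): a flat-file export holding orders.
ORDERS_CSV = "customer_id,total\n1,250.0\n2,75.5\n1,40.0\n"

_cache = {}  # simple in-memory result cache keyed by view name

def customer_order_totals(use_cache=True):
    """Virtual view: join both sources at request time, without a permanent copy."""
    if use_cache and "customer_order_totals" in _cache:
        return _cache["customer_order_totals"]

    # Sum order totals per customer from the flat-file source.
    totals = {}
    for row in csv.DictReader(io.StringIO(ORDERS_CSV)):
        cid = int(row["customer_id"])
        totals[cid] = totals.get(cid, 0.0) + float(row["total"])

    # Join with the relational source and present one unified shape.
    result = [
        {"customer": name, "order_total": totals.get(cid, 0.0)}
        for cid, name in db.execute("SELECT id, name FROM customers ORDER BY id")
    ]
    _cache["customer_order_totals"] = result
    return result

view = customer_order_totals()
```

The consuming application sees only the unified rows and never learns that one half came from a database and the other from a file. A real platform would add query push-down, security and cache invalidation, none of which this sketch attempts.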

Another useful feature of this data virtualization platform, one that is not available out of the box in the traditional RDBMS implementation of data virtualization, is the 'Performance Plus Adapters.' This feature converges data from enterprise applications like SAP, Siebel and Oracle E-Business Suite, traditional OLAP platforms such as Oracle Essbase and SAP BW, newer analytical databases like HP Vertica and Netezza, and even Big Data implementations like Hadoop. It enables the creation of entire enterprise application integration patterns using the data virtualization layer, replacing or augmenting the Enterprise Service Bus (ESB).

While most data virtualization solutions look good conceptually, their performance is often subject to issues due to the latency involved in connecting to disparate data sources. To address this, the platform also provides a 'Composite Active Cluster' that plays a role similar to that of MSCS (Microsoft Cluster Service) in a Microsoft SQL Server implementation, with features like:

  • Active/Active Clustering
  • Shared Cluster Cache
  • Replicated Metadata Repository
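
As a rough illustration of why active/active clustering with a shared cache helps with source latency, consider the following Python sketch. It does not reflect how Composite Active Cluster is actually implemented: the Node class, the slow_source function and the round-robin dispatcher are hypothetical, and a plain dictionary stands in for what would be an external shared cache in a real cluster.

```python
import itertools

class Node:
    """One hypothetical cluster member; all members share a single cache."""

    def __init__(self, name, shared_cache, fetch):
        self.name = name
        self.cache = shared_cache
        self.fetch = fetch

    def query(self, key):
        # Go to the slow source only if no node has cached this result yet.
        if key not in self.cache:
            self.cache[key] = self.fetch(key)
        return self.cache[key]

source_calls = []  # records every trip to the underlying data source

def slow_source(key):
    """Stand-in for a high-latency call to a remote data source."""
    source_calls.append(key)
    return f"result for {key}"

shared_cache = {}
nodes = [
    Node("node-a", shared_cache, slow_source),
    Node("node-b", shared_cache, slow_source),
]
dispatcher = itertools.cycle(nodes)  # active/active: every node takes traffic

answers = [next(dispatcher).query("sales_by_region") for _ in range(4)]
```

Four requests are spread across both nodes, yet the source is contacted only once, because the second node reuses the result the first node cached. With per-node caches each node would have had to make its own trip to the source.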

Summary
As seen in several articles, data virtualization presents a useful value proposition for enterprise data integration. The availability of multiple options helps enterprises evaluate them against their needs and budgets and choose accordingly. Apart from the traditional relational databases, specialized engines like the one from Composite Software provide multiple features for implementing data virtualization in the enterprise.

More Stories By Srinivasan Sundara Rajan

Highly passionate about utilizing Digital Technologies to enable next generation enterprise. Believes in enterprise transformation through the Natives (Cloud Native & Mobile Native).