IBM DB2 Replication and OGSA-DAI

ABOUT


We are interested in integrating OGSA-DAI technology with DB2 data replication. Over the years, IBM's data replication products have helped businesses reengineer their business processes, improve decision making, and increase system availability. They provide robust, versatile replication support across heterogeneous data models and multi-vendor environments. OGSA-DAI aims to assist with the access and integration of distributed data resources via the grid. It provides interfaces for operating on, transforming, and delivering data held in many popular relational and XML databases and in file systems, and it gives users a means to expose their stand-alone or web applications on grids. By combining OGSA-DAI with DB2 data replication, we are able to deploy DB2 replication components within a grid environment. Grid-enabling DB2 data replication can greatly enhance the software's capabilities by supporting scalable, secure, reliable, and efficient data access across a large, distributed virtual organization.

There is potentially great demand for grid replication. For example, a data grid is a grid computing system that deals with data and requires intensive computation and analysis of large-scale shared databases, on the order of millions of gigabytes, across widely distributed scientific communities. Data replication is one of the most useful strategies for achieving high availability and fault tolerance, as well as minimal access time, in a data grid. A digital library is a collection of documents, services, and infrastructure in organized electronic form, available on a distributed system. Replication in a digital library can take several forms: replication of (collections of) digital documents, that is, the creation of identical copies of digital documents at a separate server site; replication of index data, that is, the recreation of index data structures at a separate site; and replication of user interface data, that is, the synchronized presentation of user interface data and events at several user interfaces.

Replication has been studied extensively in relational databases. Sybase's replication product, among the earliest, has been available since 1993, and today most relational databases, such as Oracle, DB2, MySQL, and Microsoft SQL Server, provide their own replication solutions. Replication designs in relational databases mainly address high availability and disaster recovery, data consolidation for central auditing and analysis, data distribution for balancing access load, offline querying, and security. Drawing on years of experience and the study of real commercial requirements, their replication functionality is flexible and advanced, offering solutions for copying complete or partial data objects, synchronous and asynchronous copying, update-everywhere replication, conflict handling, mass deployment, and more. However, with the new demands arising in grids, the replication capabilities of these products seem limited in addressing scalability, dynamic change, and fault tolerance.

Various grid replication projects have been undertaken in recent years, among them the EDG Replica Manager, the Globus Data Replication Service, and SRB. These systems are mainly designed to move large blocks of scientific data closer to the applications in order to improve access efficiency. They implement a similar replication mechanism: on a request, they first search for an appropriate resource or existing replica, normally by querying a centralized metadata index, and then use an efficient transfer tool, e.g. GridFTP, to move the data. Most existing grid replication tools deal only with (read-only) files. SRB can manage data movement among heterogeneous databases, yet it does not support transaction-based replication, and its metadata search seems rigid and not well suited to commercial use.
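The lookup-then-transfer mechanism described above can be illustrated with a minimal sketch. It does not reflect the actual interfaces of EDG, Globus, or SRB: the names `Replica`, `MetadataIndex`, `choose_replica`, `replicate`, and `fake_transfer` are all hypothetical, the `cost` field is an assumed stand-in for whatever makes a replica "appropriate", and the transfer step is a stub where a real system would call a tool such as GridFTP.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Replica:
    site: str   # hypothetical site name
    url: str    # hypothetical location of the copy
    cost: int   # assumed access cost from the requester (lower is better)


@dataclass
class MetadataIndex:
    """Centralized index mapping a logical name to its known replicas."""
    entries: Dict[str, List[Replica]] = field(default_factory=dict)

    def register(self, logical_name: str, replica: Replica) -> None:
        self.entries.setdefault(logical_name, []).append(replica)

    def lookup(self, logical_name: str) -> List[Replica]:
        return list(self.entries.get(logical_name, []))


def choose_replica(replicas: List[Replica]) -> Optional[Replica]:
    """Pick the cheapest replica, or None if the list is empty."""
    return min(replicas, key=lambda r: r.cost, default=None)


def replicate(
    index: MetadataIndex,
    logical_name: str,
    target_site: str,
    transfer: Callable[[str, str], str],
) -> Replica:
    """Step 1: query the index. Step 2: move the data. Step 3: record the copy."""
    source = choose_replica(index.lookup(logical_name))
    if source is None:
        raise LookupError(f"no replica registered for {logical_name!r}")
    new_url = transfer(source.url, target_site)
    new_replica = Replica(site=target_site, url=new_url, cost=0)
    index.register(logical_name, new_replica)
    return new_replica


def fake_transfer(source_url: str, target_site: str) -> str:
    """Stub for a real transfer tool; no data is actually moved."""
    filename = source_url.rsplit("/", 1)[-1]
    return f"{target_site}:/replicas/{filename}"


if __name__ == "__main__":
    index = MetadataIndex()
    index.register("run42", Replica("site-a", "site-a:/data/run42.dat", 50))
    index.register("run42", Replica("site-b", "site-b:/data/run42.dat", 10))
    print(replicate(index, "run42", "site-c", fake_transfer))
```

Because the index is the single place where replicas are recorded, every request depends on it. The sketch also treats data as read-only files, with no notion of transactions or updates, which is the limitation noted above.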

Unlike existing replication technologies, we focus on the demands of industry and business for grid computing capabilities, which allow users to address more business challenges and maximize their commercial benefits. We start by restructuring the conventional database replication mechanism. This may help the large number of commercial users understand grid concepts easily and move onto grids smoothly. For example, many business users would not want to abandon software they already use, in which they have invested heavily and which they know well and trust. This strategy will help them grid-enable their database access at minimal cost. In addition, software reuse can greatly reduce development and testing costs, preserving the features of the existing approach while delivering a new, high-quality product more quickly.