At its core, data integration is the process of combining data from different sources so that users and applications see it as coming from a single source. There are many ways to do this, and just as many applications, ranging from research to marketing to corporate mergers.
Ideally, data integration should not be undertaken purely as an IT initiative to lower costs or improve the system. The main reason should be to improve business processes or to solve a specific problem. Systems built on integrated information collected from different sources are already in common use in many industries.
For example, a company that has separate databases for its marketing/sales and service departments may want to integrate them into a centralized repository. Such a repository, commonly known as a CRM system, allows the marketing and sales people to target existing customers based on information collected by the other departments. Two companies entering into a merger or acquisition will likewise need to integrate their respective systems.
It is possible to accomplish this process at either the application layer or the middleware layer. It may also be done by physical warehousing in a completely new system, or through virtual integration, where no new repository is actually created. Let's consider these choices one by one, so that it becomes clear how each one works and where it can be used.
If the application has built-in logic to extract and combine information stored in different sources, there is no need to create a new centralized database. The same applies to a solution at the middleware layer. In this case, the logic in the middleware provides every application with whatever information it needs from any and all sources at the back end.
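The middleware idea can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `IntegrationMiddleware` class, the `marketing` and `service` sources, and their fields are hypothetical, and plain dictionaries stand in for real back-end systems.

```python
from typing import Callable, Dict, Optional

# A source is modeled as a lookup function: customer_id -> record, or None.
# (Hypothetical interface; real sources would be databases or services.)
Source = Callable[[int], Optional[dict]]


class IntegrationMiddleware:
    """Holds the integration logic so that applications do not need it."""

    def __init__(self) -> None:
        self._sources: Dict[str, Source] = {}

    def register(self, name: str, source: Source) -> None:
        # Make a back-end source available to every application.
        self._sources[name] = source

    def fetch(self, customer_id: int) -> Dict[str, dict]:
        # Ask each registered source for the record and combine the answers.
        combined: Dict[str, dict] = {}
        for name, source in self._sources.items():
            record = source(customer_id)
            if record is not None:
                combined[name] = record
        return combined


# Two stand-in back-end sources (illustrative data only).
marketing = {1: {"email": "a@example.com"}}
service = {1: {"open_tickets": 2}, 2: {"open_tickets": 0}}

mw = IntegrationMiddleware()
mw.register("marketing", marketing.get)
mw.register("service", service.get)

print(mw.fetch(1))
```

An application calls `mw.fetch(...)` and never needs to know how many sources exist behind it, which is the point of placing the logic in the middleware.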
Virtual integration is the simplest way to create an integrating tool, and it does not require the creation of a new storage system. Under this method, a set of pre-defined queries accesses the required information from the separate sources. For instance, consider a case where a customer profile needs to be seen. The query extracts records from all the sources based on the main index field, which is usually a customer ID. The extracted information is then presented to the user in a single, unified view.
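A rough sketch of the customer-profile case might look like the following. It assumes two independent SQLite databases standing in for the sales and service sources; the table names, columns, and the `customer_profile` function are hypothetical and chosen only for illustration.

```python
import sqlite3


def make_source(schema_sql, insert_sql, rows):
    """Create an independent in-memory database acting as one source."""
    conn = sqlite3.connect(":memory:")
    conn.execute(schema_sql)
    conn.executemany(insert_sql, rows)
    conn.commit()
    return conn


# Two separate sources; neither knows about the other.
sales_db = make_source(
    "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT, segment TEXT)",
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "Acme Ltd", "enterprise"), (2, "Bolt Inc", "smb")],
)
service_db = make_source(
    "CREATE TABLE tickets (ticket_id INTEGER PRIMARY KEY, customer_id INTEGER, status TEXT)",
    "INSERT INTO tickets VALUES (?, ?, ?)",
    [(10, 1, "open"), (11, 1, "closed"), (12, 2, "open")],
)


def customer_profile(customer_id):
    """Run one pre-defined query per source, keyed on customer_id,
    and merge the results into a single view. Nothing is copied
    into a new store."""
    row = sales_db.execute(
        "SELECT name, segment FROM customers WHERE customer_id = ?",
        (customer_id,),
    ).fetchone()
    if row is None:
        return None
    tickets = service_db.execute(
        "SELECT ticket_id, status FROM tickets WHERE customer_id = ? ORDER BY ticket_id",
        (customer_id,),
    ).fetchall()
    return {
        "customer_id": customer_id,
        "name": row[0],
        "segment": row[1],
        "tickets": [{"ticket_id": t, "status": s} for t, s in tickets],
    }


print(customer_profile(1))
```

The data stays in its original databases; only the merged result exists, and only for as long as the caller holds it.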
Warehousing means building a completely new system that pulls in and stores information from any number of sources. This is mostly done at the enterprise level, where vast amounts of data coming in from all of a company's departments and locations are collected, stored and managed in large data centers. Applications and users can then rely on this centralized system for enterprise-wide access, reporting and analysis.
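By contrast with the virtual approach, warehousing copies the data into a new store. The sketch below is a deliberately tiny illustration under stated assumptions: the `customer_facts` table, the `build_warehouse` function, and the sample rows are all hypothetical, and an in-memory SQLite database stands in for an enterprise warehouse.

```python
import sqlite3

# Rows as they might be extracted from two departmental sources
# (illustrative data only).
sales_rows = [(1, "Acme Ltd", 1200.0), (2, "Bolt Inc", 300.0)]  # id, name, revenue
service_rows = [(1, 2), (2, 1), (3, 4)]                         # id, ticket count


def build_warehouse(sales, service):
    """Load both sources into one new, centralized table."""
    wh = sqlite3.connect(":memory:")
    wh.execute(
        "CREATE TABLE customer_facts ("
        "customer_id INTEGER PRIMARY KEY, name TEXT, revenue REAL, tickets INTEGER)"
    )
    # Load the sales data first, with a ticket count of zero.
    wh.executemany(
        "INSERT INTO customer_facts (customer_id, name, revenue, tickets) "
        "VALUES (?, ?, ?, 0)",
        sales,
    )
    # Merge in the service data: update known customers, insert unknown ones.
    for customer_id, tickets in service:
        cur = wh.execute(
            "UPDATE customer_facts SET tickets = ? WHERE customer_id = ?",
            (tickets, customer_id),
        )
        if cur.rowcount == 0:
            wh.execute(
                "INSERT INTO customer_facts (customer_id, name, revenue, tickets) "
                "VALUES (?, NULL, 0.0, ?)",
                (customer_id, tickets),
            )
    wh.commit()
    return wh


warehouse = build_warehouse(sales_rows, service_rows)

# Company-wide reporting now needs only a single query against one store.
total_revenue, total_tickets = warehouse.execute(
    "SELECT SUM(revenue), SUM(tickets) FROM customer_facts"
).fetchone()
print(total_revenue, total_tickets)
```

The trade-off is visible even at this scale: reporting becomes a single query, but the warehouse holds a copy that has to be reloaded or synchronized whenever the sources change.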
The choice of data integration method and the scope of the project are critical decisions. The basic deciding factors are the number and type of sources, along with the business benefits that are expected. The project cost is important, and so is the impact on security and backup systems. Related projects that are likely to have some impact include ongoing migrations and synchronization, along with master data management (MDM).
About the Author:
Peggie K. Lambert loves working in and studying the world of data integration. If you are looking to learn more about data integration solutions, she recommends you check out www.liaison.com.