Data integration is the process of transferring data between different storage types and locations. This typically includes extraction, cleaning, loading into the target data repository, and verification.
Data integration can be classified into different types of activities, depending on the objective to accomplish. The objective of a data integration project might be:
1. Data Migration
Typically: data on old servers that will soon be refurbished needs to be transferred to a new system.
2. Data Consolidation
Very often, after two companies merge, their data is distributed amongst many different systems. The process of consolidation moves this remote data into one central consolidated repository.
3. Data Federation (ETL for Business Intelligence and Data Warehousing)
The process of data federation moves data from many different sources into one central data repository so that different kinds of analysis can be performed (creating OLAP reports, building predictive or segmentation models, or any other statistical activity). Data Federation is mostly used together with Business Intelligence tools (such as predictive analytics tools, data warehousing tools and OLAP tools).
4. Data Synchronization
A process that ensures that two different data repositories contain the same up-to-date data.
5. Master Data Management
Processes and tools to define and manage non-transactional data. Master Data Management provides for collecting, aggregating, matching, consolidating, quality-assuring, persisting and distributing data throughout an organization to ensure consistency and control.
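To make the idea of data synchronization (objective 4 above) more concrete, the following is a minimal sketch of a check that reports where two repositories disagree. It is only an illustration under stated assumptions: two in-memory SQLite databases stand in for the repositories, and the names `table_snapshot`, `sync_diff`, `make_repo` and the `customers` table are hypothetical, not part of any product described here.

```python
import sqlite3

def table_snapshot(conn, table, key):
    """Return {key value: full row} for every row of `table`."""
    cur = conn.execute(f'SELECT * FROM "{table}"')
    cols = [d[0] for d in cur.description]
    k = cols.index(key)
    return {row[k]: row for row in cur}

def sync_diff(source, target, table, key):
    """List the keys that are missing, extra or different in `target`."""
    src = table_snapshot(source, table, key)
    dst = table_snapshot(target, table, key)
    return {
        "missing_in_target": sorted(src.keys() - dst.keys()),
        "extra_in_target": sorted(dst.keys() - src.keys()),
        "changed": sorted(k for k in src.keys() & dst.keys() if src[k] != dst[k]),
    }

def make_repo(rows):
    """Build a small in-memory repository (hypothetical demo data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO customers VALUES (?, ?)", rows)
    return conn

source_repo = make_repo([(1, "Ann"), (2, "Bob"), (3, "Cid")])
target_repo = make_repo([(1, "Ann"), (2, "Bobby"), (4, "Dee")])
report = sync_diff(source_repo, target_repo, "customers", "id")
```

Two repositories can be considered synchronized for a table when all three lists in `report` are empty. A real synchronization process would also have to decide which side wins for each difference, which this sketch deliberately leaves out.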
For example, here is a screenshot of a simple data-load script made with Anatella that imports a text file into a database:
This Anatella script:
- Loads the text file, which can be compressed (RAR, ZIP, GZ or LZO). The decompression occurs in RAM, "on the fly".
- Checks the names of the fields:
  - Field names like "key" or "age" are usually forbidden inside a relational database.
  - Field names with special characters (like the quote or the minus sign) are forbidden.
- Creates the target table inside the database (using a "CREATE TABLE" SQL statement). The field types are automatically detected based on the content of the text file.
- Uploads the content of the text file into the database ("INSERT" type of operation).
This is the easiest solution if you need to quickly upload a large text file into a database.
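For readers who want to see the same four steps outside of a graphical tool, here is a rough Python sketch. It is not how Anatella implements the script; it is a simplified illustration with several assumptions: SQLite stands in for the target database, only GZ compression is handled, the file is small enough to hold in memory, every row has the same number of fields, and the `RESERVED` set is a tiny illustrative subset of reserved words. The names `safe_name`, `infer_type` and `load_text_file` are hypothetical.

```python
import csv
import gzip
import re
import sqlite3

# Illustrative subset only; real databases reserve many more words.
RESERVED = {"key", "order", "group", "select", "table"}

def safe_name(name):
    """Replace special characters and suffix names that clash with RESERVED."""
    cleaned = re.sub(r"\W", "_", name.strip())
    if not cleaned or cleaned[0].isdigit():
        cleaned = "f_" + cleaned
    if cleaned.lower() in RESERVED:
        cleaned += "_"
    return cleaned

def infer_type(values):
    """Pick INTEGER, REAL or TEXT from the non-empty values of one column."""
    for sql_type, cast in (("INTEGER", int), ("REAL", float)):
        try:
            for v in values:
                if v != "":
                    cast(v)
            return sql_type
        except ValueError:
            continue
    return "TEXT"

def load_text_file(conn, path, table, delimiter=";"):
    """Load a gzip-compressed delimited text file into a new table."""
    # gzip.open decompresses while reading; no temporary file is written.
    with gzip.open(path, "rt", encoding="utf-8", newline="") as fh:
        reader = csv.reader(fh, delimiter=delimiter)
        header = [safe_name(h) for h in next(reader)]
        rows = list(reader)
    # Detect one SQL type per column from the file content.
    types = [infer_type([r[i] for r in rows]) for i in range(len(header))]
    columns = ", ".join(f'"{n}" {t}' for n, t in zip(header, types))
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    placeholders = ", ".join("?" * len(header))
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
    conn.commit()
    return dict(zip(header, types))
```

Calling `load_text_file(sqlite3.connect("target.db"), "people.txt.gz", "people")` (both file names are placeholders) would create the `people` table and return the detected column types. The sketch does not handle two field names that become identical after cleaning, and a production loader would stream rows instead of reading them all at once.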