IT articles, information and publications on Data Integration

Latest featured publications on Dataprix about this ICT topic

Warehousing Your Data in the Cloud with ETL


The process of taking data from different systems and loading it into a data warehouse for business analysis can be a complicated affair. In this article, we look at what is involved and how the cloud has made matters potentially trickier.

A data warehouse is a relational database that is designed for query and analysis rather than for transaction processing. It usually contains historical data derived from transaction data, but it can include data from other sources.
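The distinction above can be illustrated with a minimal sketch: historical transaction data loaded into a fact table and then queried analytically. The schema and data below are hypothetical, chosen only to show a typical warehouse-style aggregation.

```python
import sqlite3

# Hypothetical fact table: historical sales transactions kept for analysis,
# not for day-to-day transaction processing.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE fact_sales (sale_date TEXT, product TEXT, amount REAL)")
cur.executemany(
    "INSERT INTO fact_sales VALUES (?, ?, ?)",
    [("2010-01-15", "widget", 120.0),
     ("2010-01-20", "widget", 80.0),
     ("2010-02-03", "gadget", 200.0)],
)

# Analytical query typical of warehouse workloads: monthly revenue per product.
rows = cur.execute(
    "SELECT substr(sale_date, 1, 7) AS month, product, SUM(amount) "
    "FROM fact_sales GROUP BY month, product ORDER BY month, product"
).fetchall()
print(rows)
```

In a real warehouse the fact table would be fed by ETL processes from the transactional systems, often alongside dimension tables for products, dates and customers.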

Do we give Data Quality processes the importance they deserve?


Among the Data Management activities that organizations perform, the processes that monitor and ensure data quality are becoming critical. The volume of information in organizations is constantly growing. Reliable data storage is essential for the correct analysis and exploitation of these data, avoiding inconsistencies and misleading results, and facilitating the development of future systems based on master data that is consistent, cleansed, enriched and reliable.
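A first step in monitoring data quality is computing basic per-column statistics. The following is a minimal sketch (the helper name and the statistics chosen are illustrative, not any specific tool's API): null ratio, distinct count and most frequent values.

```python
from collections import Counter

def profile_column(values):
    """Basic column-level quality statistics (illustrative helper):
    null ratio, distinct count, and the most frequent values."""
    total = len(values)
    # Treat both None and the empty string as missing values.
    nulls = sum(1 for v in values if v is None or v == "")
    non_null = [v for v in values if v is not None and v != ""]
    return {
        "null_ratio": nulls / total if total else 0.0,
        "distinct": len(set(non_null)),
        "top_values": Counter(non_null).most_common(3),
    }

# Example: a country-code column with two missing entries.
stats = profile_column(["ES", "ES", "FR", None, "", "ES"])
print(stats)
```

Tracking these figures over time is what turns a one-off analysis into a data quality process.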

Aspects to evaluate when selecting an ETL tool


When addressing a business intelligence project, it is important to properly assess the ETL tool we will use: the tool with which we will implement the processes that feed the Datamart, Data Warehouse or other storage structure on which we will later exploit the data. It is a cornerstone for the design, construction and subsequent evolution of our BI system. We will analyze technical issues only, without entering into economic or other aspects (licenses, agreements, technical support, tool changes, etc.). Note that ETL processes are closely linked to data profiling and data quality processes, which we will not consider here.

Oracle BI Publisher


We show a summary of the capabilities of this tool, which is integrated in the Oracle BI suite, including the features included in version 11g.

Real Time Data Integration - CDC


In Business Intelligence environments there is an ever greater need to have information available in the shortest possible time: data generation cycles are getting shorter and data must be updated in near real time. There is talk of 'Operational Business Intelligence (OBI)' and 'Real Time Decision Support'.

It is critical that operational data reach analytic environments in the shortest possible time. There is a need for 'Real Time Data Integration'.

When optimizing these data integration processes, we must consider both the usual data sources (ERPs, CRMs, operational systems, databases, flat files, Excel, XML, etc.) and others of a more immediate nature, such as messaging queues and online information accessed via web services or RSS.
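One common pattern behind real-time data integration is pull-based change data capture: instead of reloading whole tables, the integration process polls a change log for records past the last sequence number it already loaded. The sketch below is illustrative only; the record layout and function names are assumptions, not any real CDC product's API.

```python
def poll_changes(change_log, last_seq):
    """Return the change records newer than last_seq and the new
    high-water mark (illustrative pull-based CDC step)."""
    new = [rec for rec in change_log if rec["seq"] > last_seq]
    high = max((rec["seq"] for rec in new), default=last_seq)
    return new, high

# Hypothetical change log as it might be exposed by a source system.
log = [
    {"seq": 1, "op": "INSERT", "key": 10},
    {"seq": 2, "op": "UPDATE", "key": 10},
    {"seq": 3, "op": "DELETE", "key": 11},
]

# Suppose sequence 1 was already loaded in the previous cycle.
changes, mark = poll_changes(log, last_seq=1)
print(changes, mark)
```

Log-based CDC tools apply the same idea against the database transaction log, which avoids polling queries on the source tables.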

Does your BI platform support the data sources you want to analyze?


Although the latest versions of most BI platforms support a wide range of data sources, this is a common question that involves the version of your BI tool, the version of the database, file format or ERP that acts as the data source, and the operating system.


In the case of SAS, you can resolve these questions by referring to the SAS/ACCESS Validation Matrix: select the version of SAS, the database and the operating system, and you have the answer.

Introduction to Open Data, Linked Data, data and resource catalogs


Open Data is a movement that aims to make the public data that governments collect easily accessible to citizens and businesses.

Open Data is supported by the W3C and other international bodies, and is gradually gaining initiatives in different countries. The first step is an initiative by a public body, which makes certain information publicly available in one or more standard, easily 'treatable' formats. From there, other agencies or businesses can add value to that data by crossing or enriching it with other data sources, or by developing applications that let users view the data in a friendly environment.


Informatica 9, a complete data integration platform


Informatica is a leading vendor in the Data Integration market. This company is the first independent provider of data integration software. Its best-known tool, and the heart of its platform, is Informatica PowerCenter, which has gone through many versions and is a reference in the world of integration.

But apart from PowerCenter, Informatica also has other tools that focus on more specific purposes while remaining integrated into the platform, always in the context of Data Integration.

Data profiles of SQL Server Integration Services stored in tables


The Data Profiling task of SQL Server Integration Services stores the profiling results in an XML document that can be examined with the Data Profile Viewer. The article Data profiling with SQL Server 2008 explains how to use this new task in SSIS.

Although this method is very simple, sometimes it may not be sufficient. Addressing a data quality project may involve, for example, storing a history of profiles to assess how the quality of the processed data has been improving.

The best way to work with historical data is to use a database and store the data in tables, where you can run queries, reports and comparisons. To achieve that, all you need to do is move the metadata that the profiling task has stored in the XML file into database tables.
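The core of that move is flattening the profile XML into rows. The sketch below uses a deliberately simplified, hypothetical XML structure; the real Data Profiling task output uses a more complex, namespaced schema, so this only illustrates the transformation idea.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified profile output (NOT the real SSIS schema):
# one element per null-ratio result.
xml_doc = """
<profiles>
  <nullRatio table="Customer" column="Email" ratio="0.12"/>
  <nullRatio table="Customer" column="Phone" ratio="0.40"/>
</profiles>
"""

# Flatten each result element into a (table, column, ratio) row,
# ready to be inserted into a history table.
rows = [
    (p.get("table"), p.get("column"), float(p.get("ratio")))
    for p in ET.fromstring(xml_doc).iter("nullRatio")
]
print(rows)
```

Once the results are rows, adding a load date column gives you the profile history the paragraph above describes.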

Well, someone has already prepared an easy way to do it. Thomas Frisendal, from the website Information Quality Solutions, explains how to create an XSLT file for each type of profiling, which is used to transform the XML generated by the SSIS Data Profiling task into one or more XML files with a format that can be directly imported into tables.

Data profiling with SQL Server 2008


One of the many improvements that SQL Server 2008 brings to ETL with Integration Services is the ability to perform data profiling with its new Data Profiling task.

Data profiling is one of the first tasks typically addressed in Data Quality processes. It involves an initial analysis of the source data, usually on tables, with the goal of beginning to know their structure, format and level of quality. The analysis is made at the level of tables, columns, relationships between columns, and even relationships between tables.

The SSIS Data Profiling task works by selecting a table in a SQL Server 2000 or later database (it cannot be used with other databases), the profiling options you want to apply to the data in the table, and an XML file in which to save the result. It's really simple.

You can select up to 8 types of profiling: 5 column-level analyses and 3 multi-column analyses.
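To give an idea of what a column-level analysis computes, here is a sketch in the spirit of a column length distribution profile (a simplified illustration, not the task's actual implementation): count how many values have each string length, which quickly surfaces outliers in fields like postal codes.

```python
from collections import Counter

# Hypothetical postal-code column; one value has an unexpected format.
values = ["28001", "08012", "4100", "28001", "E-46002"]

# Length distribution: how many values have each length.
length_dist = Counter(len(v) for v in values)
print(length_dist)
```

In this sample most values have length 5, so the lengths 4 and 7 point at rows worth inspecting.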

Column level profile
