Free software, Open Source Software

Confluent Platform

Confluent Platform is a data integration solution geared towards managing and transforming real-time data streams in enterprise environments. It brings together essential components (such as Apache Kafka, Schema Registry and Kafka Connect) that facilitate the continuous ingestion, processing and distribution of information, enabling historical and live data to be consolidated for strategic decision making...

Pentaho Data Integration

Pentaho Data Integration is a platform for integration and orchestration of ETL processes. The tool combines a visual interface with advanced analysis and transformation functionalities, allowing the creation of complex data flows without the need for programming from scratch.
It also offers deployment options in local, cloud or hybrid environments, making it easier to manage and consolidate information in different organisational contexts.

Apache NiFi

Apache NiFi is a data integration platform designed to automate the flow of information between systems. Its visual approach allows users to design, manage and monitor data flows intuitively, without the need for advanced programming. Thanks to its processor-based architecture, NiFi facilitates real-time data transformation, routing and processing.

H2O.ai

H2O.ai is machine learning software used to build and deploy predictive analytics models. It provides an easy-to-use interface that allows users to build and train machine learning models without writing code, either with built-in algorithms or by importing custom algorithms from R and Python...

KNIME Analytics Platform

KNIME Analytics Platform is a software application for creating and analysing data-driven workflows, or pipelines. It was originally developed by researchers at the University of Konstanz in Germany and is now available under an open source licence.

The software offers a number of platform-specific features:

R

R is an open source programming language and suite of utilities for data manipulation, statistical calculation, analytics and graphical visualisation.

The environment is easily extendable with new packages (statistics, graphics, analytics, etc.) contributed by the community of R users and developers...

Apache Spark

Spark is an open source framework from the Apache Software Foundation for the distributed processing of large amounts of data on clusters of computers. It was designed for use in Big Data environments and created to enhance the capabilities of its predecessor, MapReduce.

Spark inherits the scalability and fault tolerance capabilities of MapReduce, but far surpasses it in terms of processing speed, ease of use and analytical capabilities...
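As a rough illustration of the MapReduce model mentioned above, the following sketch counts words in three phases (map, shuffle, reduce) in plain Python. It does not use Spark or Hadoop: the function names and the sample input are hypothetical, and everything runs in a single process, whereas a real framework distributes each phase across the nodes of a cluster.

```python
from collections import defaultdict
from functools import reduce


def map_phase(lines):
    """Emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Group the emitted values by key, as a framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Combine the values of each key into a single count."""
    return {key: reduce(lambda a, b: a + b, values) for key, values in groups.items()}


def word_count(lines):
    """Run the three phases in sequence (single process, illustration only)."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


if __name__ == "__main__":
    # Hypothetical sample input.
    sample = ["big data on clusters", "data on Spark"]
    print(word_count(sample))
```

Because each phase only depends on the output of the previous one, the map and reduce steps can be split across many machines, which is the property that distributed engines build on.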

Apache Hive

Hive is software that runs on Hadoop clusters, adding a layer that lets developers abstract away the management of HDFS files and MapReduce jobs by querying data with SQL-based operations written in the HiveQL language...
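To give a feel for the kind of SQL-based query such a layer exposes, here is a minimal sketch that runs an aggregate query from Python. It uses the standard library's sqlite3 module purely as a local stand-in, not Hive itself; the `sales` table, its columns and the `total_by_product` helper are hypothetical. A simple statement like this one is written the same way in HiveQL, where the engine would translate it into jobs over files stored in the cluster.

```python
import sqlite3


def total_by_product(rows):
    """Load (product, amount) rows into an in-memory table and aggregate them with SQL."""
    conn = sqlite3.connect(":memory:")  # stand-in engine, not a Hive connection
    conn.execute("CREATE TABLE sales (product TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)

    # The query only describes the result wanted; how the data is stored
    # and scanned is left to the engine.
    query = """
        SELECT product, SUM(amount) AS total
        FROM sales
        GROUP BY product
        ORDER BY total DESC
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    # Hypothetical sample data.
    print(total_by_product([("a", 10.0), ("b", 5.0), ("a", 2.5)]))
```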