A low-code machine learning personalized ranking service for articles, listings, search results, and recommendations that boosts user engagement. A friendly learn-to-rank engine.
Updated Aug 17, 2024 - Scala
Feathr – A scalable, unified data and AI engineering platform for enterprise
Collection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive
A simple Spark-powered ETL framework that just works 🍺
Flowman is an ETL framework powered by Apache Spark. With its declarative approach, Flowman simplifies the development of complex data pipelines.
A schema-aware Scala library for data transformation
Waimak is an open-source framework that makes it easier to create complex data flows in Apache Spark.
Declarative text-based tool for data analysts and engineers to extract, load, transform, and orchestrate their data pipelines.
A re-implementation of Hadoop DistCp in Apache Spark
Data manipulation and reporting for Scala.
PDF DataSource for Apache Spark
OpenSnowcat Collector, an open source fork of Snowplow (Apache 2.0 License)
Data Brewery is an ETL (Extract-Transform-Load) program that connects to many data sources (cloud services, databases, ...) and manages data warehouse workflows.
OpenSnowcat Enricher (Apache 2.0 License)
Akka HTTP service for serving Spark machine learning models
Flink Example
Data Generators -> Kafka -> Spark Streaming -> PostgreSQL -> Grafana
Huemul BigDataGovernance is a framework that runs on Spark, Hive, and HDFS. It supports implementing a corporate single-source-of-data strategy based on data governance best practices. Using the library, you can implement tables with Primary Key and Foreign Key control when inserting and updating data, null validation, …
Optimal distributed data deduplication and supervised learning pipeline using Apache Spark