Stars
😎 A curated list of awesome DataOps tools
⚡ Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io
A light-weight, flexible, and expressive statistical data testing library
Semantic Functions for Semantic Link
Qubole Sparklens tool for performance tuning Apache Spark
Samples on how to use Azure SQL database with Azure OpenAI
A Python framework for defining and querying BI models in your data warehouse
Display paginated content in the browser and generate print books using web technology
Exposes the Windows Process creation Win32 functions in PowerShell
Invokes a command as System/Interactive/GMSA/User on a local or remote machine and returns PSObjects.
No-code in the front, Python in the back. An open-source framework for creating data apps.
Yet another googlesearch - A Python library for executing intelligent, realistic-looking, and tunable Google searches.
🔥 Blazing fast bulk data transfers between any cloud 🔥
Free and open source schema versioning and database migration made natively with .NET/6. NEW THIS MAY 2022! v1.3.15 released!
All image quality metrics you need in one package.
Augraphy: Creating Realistic Document Image Datasets with Data Augmentation
Fully managed Apache Parquet implementation
2D/3D renderer - makes it simple to draw stuff across platforms (including web)
A high-performance SVG renderer and toolkit, powered by Rust-based resvg and napi-rs.
ShabbyPages is a state-of-the-art corpus of born-digital document images with both ground truth and distorted versions appropriate for use in training models to reverse distortions and recover to o…
Tool for probabilistically linking the records of individual entities (e.g. people) within and across datasets
Introducing the most comprehensive and up-to-date open source dataset on US car models on GitHub. With over 15,000 entries covering car models manufactured between 1992 and 2023, this repository of…
The state-of-the-art image restoration model without nonlinear activation functions.