qprofiler is a Python package that provides an intelligent way to create a data quality profile for your development(train) dataset(s) and save it as a reference to use in creating quality check tests and automatic handling cases for production(test) datasets.
The source code is currently hosted on GitHub at: dprofiler-github
Binary installers for the latest released version are available at the PyPi
# PyPi
pip install qprofiler
- Polars(>=0.19.0 <0.20.0)
- PyYAML(>=6.0.1 <7.0.0)
- Pathlib(>=1.0 <2.0)
- rumamel.yaml(>=0.17.32 <0.18.0)
check the notebook that contains everything about how to use DataProfiler module in profiling datasets, and how to use QTest module to create quality check tests.
check the notebook that contains everything about how to use QPipeline
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
Please make sure to update tests as appropriate.
- Create Quality Pipeline(v0.2.2).
- enhance documentation.
- add utility method in DataProfiler to .dprofiler structure tree.
- Modify file structure of .dprofiler.
- add Metadata Module of .dprofiler.
- modify test cases.