Machine Learning Version Control made Simple

Project Description

Estimators

Machine Learning Versioning made Simple

Intro

Estimators helps version and track machine learning models and datasets.

It can also be used to deploy models. It is extensible and works with almost any Python object (scikit-learn estimators, numpy arrays, and custom modules).

This repo utilizes sqlalchemy as an ORM. If you’re using django, try django-estimators instead.

Installation

Install Estimators with pip:

pip install estimators

Environment Setup

Before we can begin, we need to initialize our database and environments. This is a one-time step. In future releases, we anticipate this step will be simplified.

from estimators import Estimator, DataSet, DataBase
db = DataBase()
db.initialize_database()
Estimator.initialize_root_dir()
DataSet.initialize_root_dir()

Basic Usage

We can see the power of Estimators in 2 steps. Let’s say we are developing a classifier. We’ll load up the data, split it for validation, and then create and train a model.

from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier

digits = load_digits() # 1797 by 64
X = digits.data
y = digits.target

# simple splitting for validation testing
X_train, X_test = X[:1200], X[1200:]
y_train, y_test = y[:1200], y[1200:]

rfc = RandomForestClassifier()
rfc.fit(X_train, y_train)

1. First, import the Evaluator class and instantiate it to create an evaluation plan. Assign the estimator, X_test, and y_test to that evaluator object.

from estimators import Evaluator

plan = Evaluator()
plan.estimator = rfc
plan.X_test = X_test
plan.y_test = y_test

# persist all objects upon prediction
result = plan.evaluate()

# including our predictions
result.y_predicted
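
Once predictions are available, they can be scored against the held-out labels. The following is a minimal sketch with a hypothetical `accuracy` helper (not part of Estimators) and small stand-in lists. With the real objects you would pass `y_test` and `result.y_predicted` instead, assuming both behave as plain sequences of labels.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    y_true, y_pred = list(y_true), list(y_pred)
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot score an empty set of labels")
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)


# stand-in labels; hypothetical values for illustration only
y_true_demo = [0, 1, 2, 2, 1]
y_pred_demo = [0, 1, 2, 1, 1]
score = accuracy(y_true_demo, y_pred_demo)
print(score)
```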

2. At a later date, we can retrieve the results, along with the original estimator, X_test dataset, and y_test dataset, using the SQLAlchemy ORM.

from estimators import DataBase, EvaluationResult
db = DataBase()

result = db.Session.query(EvaluationResult).first()

# which has all our attributes
result.id
result.create_date
result.estimator
result.X_test
result.y_test
result.y_predicted
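
The source does not describe how Estimators stores objects internally. To illustrate the general idea of persisting a result now and reading it back later, here is a sketch that uses only the standard library (`sqlite3` and `pickle`). The table name, column names, and both helper functions are hypothetical and are not the Estimators schema or API.

```python
import pickle
import sqlite3

# in-memory database so the sketch leaves no files behind
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE evaluation_result ("
    "id INTEGER PRIMARY KEY, create_date TEXT, y_predicted BLOB)"
)


def save_result(conn, y_predicted):
    """Pickle the predictions, store them, and return the new row id."""
    cur = conn.execute(
        "INSERT INTO evaluation_result (create_date, y_predicted) "
        "VALUES (datetime('now'), ?)",
        (pickle.dumps(y_predicted),),
    )
    conn.commit()
    return cur.lastrowid


def load_first_result(conn):
    """Fetch the earliest stored row and unpickle its predictions."""
    row = conn.execute(
        "SELECT id, create_date, y_predicted FROM evaluation_result "
        "ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    return {
        "id": row[0],
        "create_date": row[1],
        "y_predicted": pickle.loads(row[2]),
    }


new_id = save_result(conn, [3, 8, 8, 1])  # stand-in predictions
restored = load_first_result(conn)
```

Storing pickled blobs next to queryable metadata columns is one common design for this kind of tracking, since the metadata stays searchable while the payload can be any Python object.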

Advanced Usage

Continuing with the above example, we can pull specific estimators or datasets from our database.

from estimators import Estimator, DataSet

# to return an estimator proxy object
es = db.Session.query(Estimator).first()

# return our fitted RandomForestClassifier
es.estimator

# to return all datasets as proxy objects

ds = db.Session.query(DataSet).all()
ds[0].data
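
The text calls these "proxy objects" without describing their internals. One common way such a proxy can work is to defer loading the wrapped object until an attribute like `.data` is first read. The sketch below shows that pattern with a hypothetical `LazyProxy` class and `load_rows` loader; it is an assumption about the general technique, not the Estimators implementation.

```python
class LazyProxy:
    """Holds a loader and only calls it the first time `.data` is read."""

    def __init__(self, loader):
        self._loader = loader
        self._loaded = False
        self._value = None

    @property
    def data(self):
        # load on first access, then reuse the cached value
        if not self._loaded:
            self._value = self._loader()
            self._loaded = True
        return self._value


calls = []  # records how many times the loader actually runs


def load_rows():
    """Hypothetical loader standing in for reading a stored dataset."""
    calls.append("load")
    return [[1, 2], [3, 4]]


proxy = LazyProxy(load_rows)
first = proxy.data   # triggers the loader
second = proxy.data  # served from the cache
```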

We can also use any of SQLAlchemy's query expressions, for example to filter datasets by hash:

X_test_one = db.Session.query(DataSet).filter(DataSet.hash=='a381b220d0cd271d608a27eb52dfb654').first()
y_test_one = db.Session.query(DataSet).filter(DataSet.hash=='fe773b5c53aec02fd98ffc65feb4714d').first()
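
The hash values above are 32 hexadecimal characters long, which is the length of an MD5 digest. The source does not say how Estimators computes them, so the following is only a sketch of content-based hashing, under the assumption that a digest is taken over a serialized form of the object. `content_hash` is a hypothetical helper, and JSON is used here only because it gives a stable serialization for simple lists.

```python
import hashlib
import json


def content_hash(obj):
    """Return a hex MD5 digest of the JSON-serialized object (assumed scheme)."""
    payload = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.md5(payload).hexdigest()


# stand-in datasets; equal content should give equal hashes
rows_a = [[0.0, 1.0], [2.0, 3.0]]
rows_b = [[0.0, 1.0], [2.0, 3.0]]
rows_c = [[0.0, 1.0], [2.0, 4.0]]

hash_a = content_hash(rows_a)
hash_b = content_hash(rows_b)
hash_c = content_hash(rows_c)
```

With a scheme like this, identical data maps to the same key, so a dataset can be looked up by its content rather than by an arbitrary id.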

Furthermore, we can run more evaluations using our new proxy objects. The Evaluator object handles the proxy Estimator and DataSet objects just like regular data.

plan = Evaluator()
plan.estimator = es
plan.X_test = X_test_one
plan.y_test = y_test_one

result_two = plan.evaluate()

Additionally, if we want to use a different database connection, we can pass the sqlalchemy session object to the evaluator.

from estimators import DataBase
db = DataBase(url='sqlite://')

plan = Evaluator()
plan.session = db.Session
# and continue as expected otherwise

Development Installation

To install the latest version of estimators, clone the repo, change directory to the repo, and pip install it into your current virtual environment:

$ git clone git@github.com:fridiculous/estimators.git
$ cd estimators
$ <activate your project’s virtual environment>
(virtualenv) $ pip install -e .  # the dot specifies for this current repo

Release History

0.1.0.dev3 (this version)
0.1.0.dev2
0.1.0.dev1
0.1.0.dev0

Download Files

File name: estimators-0.1.0.dev3-py2.py3-none-any.whl (17.6 kB)
Python version: py2.py3
File type: Wheel
Upload date: Oct 28, 2016