Useful math stats functions for collections of datetime objects

Project Description

Useful math stats functions for collections of datetime objects.

Ever had a list or set of datetimes and needed to perform some basic statistical functions to find the median, mean, minimum, or maximum? datetimestats provides some basic utilities to do this without the overhead of converting to a numeric data type (e.g., unix timestamps at millisecond-level) or pandas data frames (in order to employ pandas time series functionality).

datetimestats takes any iterable of datetime objects (including pandas Series). It supports naive and non-naive datetimes, including iterables of objects with different Olson timezones. However, it does not support iterables that contain both naive and non-naive datetime objects.
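
The restriction on mixing naive and non-naive datetimes mirrors Python itself, which raises a TypeError when asked to order or subtract the two kinds. The following is a minimal sketch of how such a mix could be detected up front. It is written for Python 3 using only the standard library, and the names is_aware and check_not_mixed are hypothetical; they are not part of datetimestats.

```python
import datetime as dt


def is_aware(d):
    """True when d has a tzinfo that yields a UTC offset (a non-naive datetime)."""
    return d.tzinfo is not None and d.utcoffset() is not None


def check_not_mixed(datetimes):
    """Raise ValueError when naive and non-naive datetimes appear together.

    Hypothetical helper for illustration; not the library's own check.
    """
    kinds = {is_aware(d) for d in datetimes}
    if len(kinds) > 1:
        raise ValueError("cannot mix naive and non-naive datetime objects")


naive = dt.datetime(2015, 9, 10, 12, 0, 0)
aware = dt.datetime(2015, 9, 10, 12, 0, 0, tzinfo=dt.timezone.utc)

check_not_mixed([naive, naive])  # all naive: passes silently
check_not_mixed([aware, aware])  # all non-naive: passes silently

try:
    check_not_mixed([naive, aware])
    mixed_rejected = False
except ValueError:
    mixed_rejected = True
```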

Currently, datetimestats is supported for use with Python 2.7.

Installation

Install using pip with:

pip install datetimestats

Or download a wheel or source archive from PyPI.

Usage

datetimestats currently supports obtaining the mean, median, min, or max of an iterable of datetime objects. The functions are designed to minimize time complexity where possible.

Calculating Means

Mean is calculated as the arithmetic mean across datetime objects with microsecond precision.

When given a list of naive datetimes, mean returns a naive datetime object:

>>> import datetime as dt
>>> naive_1 = dt.datetime(2015, 9, 10, 12, 30, 0)
>>> naive_2 = dt.datetime(2015, 9, 10, 12, 0, 0)
>>> from datetimestats import mean
>>> mean([naive_1, naive_2])
datetime.datetime(2015, 9, 10, 12, 15)

When given a list of non-naive datetimes, it returns the mean as a datetime object in UTC time:

>>> import datetime as dt
>>> import pytz
>>> nyc_noon = dt.datetime(2014, 1, 1, 12, 0, 0, tzinfo=pytz.timezone('America/New_York'))
>>> print nyc_noon # Curiosity of pytz: a zone passed via tzinfo= uses its local mean time (LMT) offset, not a whole-hour offset
2014-01-01 12:00:00-04:56
>>> london_noon = dt.datetime(2014, 1, 1, 12, 0, 0, tzinfo=pytz.timezone('Europe/London'))
>>> singapore_noon = dt.datetime(2014, 1, 1, 12, 0, 0, tzinfo=pytz.timezone('Asia/Singapore'))
>>> from datetimestats import mean
>>> mean([nyc_noon, london_noon, singapore_noon])
datetime.datetime(2014, 1, 1, 11, 20, 40, tzinfo=<UTC>)
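
To see where these results come from, here is a minimal sketch of an arithmetic mean over datetimes. It is one possible approach under stated assumptions, not necessarily how datetimestats implements mean. It is written for Python 3 with only the standard library; mean_sketch and fixed are hypothetical names, and the fixed offsets (-04:56, -00:01, +06:55) stand in for the LMT offsets that the pytz zones carry in the example above.

```python
import datetime as dt


def mean_sketch(datetimes):
    """Arithmetic mean of naive datetimes, or of non-naive ones reported in UTC."""
    items = list(datetimes)
    if not items:
        raise ValueError("mean of an empty collection is undefined")
    if items[0].tzinfo is not None:
        # Normalize non-naive values to UTC so the result is reported in UTC.
        items = [d.astimezone(dt.timezone.utc) for d in items]
    ref = items[0]
    # Average the offsets from a reference point; timedelta arithmetic
    # keeps microsecond precision.
    total = sum((d - ref for d in items), dt.timedelta(0))
    return ref + total / len(items)


def fixed(hours, minutes):
    """Hypothetical helper: a fixed-offset zone standing in for a pytz zone."""
    return dt.timezone(dt.timedelta(hours=hours, minutes=minutes))


naive_mean = mean_sketch([
    dt.datetime(2015, 9, 10, 12, 30, 0),
    dt.datetime(2015, 9, 10, 12, 0, 0),
])

aware_mean = mean_sketch([
    dt.datetime(2014, 1, 1, 12, 0, 0, tzinfo=fixed(-4, -56)),  # New York LMT
    dt.datetime(2014, 1, 1, 12, 0, 0, tzinfo=fixed(0, -1)),    # London LMT
    dt.datetime(2014, 1, 1, 12, 0, 0, tzinfo=fixed(6, 55)),    # Singapore LMT
])

print(naive_mean)  # 2015-09-10 12:15:00
print(aware_mean)  # 2014-01-01 11:20:40+00:00
```

In UTC the three noons fall at 16:56, 12:01, and 05:05, which average to 11:20:40.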

Calculating Medians

If an odd number of objects is provided, the median is the middle value of the sorted objects. If an even number is provided, the median is the arithmetic mean of the two middle values of the sorted objects.

Odd number of values:

>>> import datetime as dt
>>> import pytz
>>> london_noon = dt.datetime(2014, 1, 1, 12, 0, 0, tzinfo=pytz.timezone('Europe/London'))
>>> nyc_noon = dt.datetime(2014, 1, 1, 12, 0, 0, tzinfo=pytz.timezone('America/New_York'))
>>> singapore_noon = dt.datetime(2014, 1, 1, 12, 0, 0, tzinfo=pytz.timezone('Asia/Singapore'))
>>> from datetimestats import median
>>> median([nyc_noon, london_noon, singapore_noon])
datetime.datetime(2014, 1, 1, 12, 0, tzinfo=<DstTzInfo 'Europe/London' LMT-1 day, 23:59:00 STD>)
>>> median([nyc_noon, london_noon, singapore_noon]) == london_noon # Noon in London does fall between Singapore and NYC
True

Even number of values:

>>> import datetime as dt
>>> naive_1 = dt.datetime(2015, 9, 10, 12, 0, 0)
>>> naive_2 = dt.datetime(2015, 9, 10, 14, 0, 0)
>>> naive_3 = dt.datetime(2015, 9, 10, 13, 0, 0)
>>> naive_4 = dt.datetime(2015, 9, 10, 15, 0, 0)
>>> from datetimestats import median
>>> median([naive_1, naive_2, naive_3, naive_4])
datetime.datetime(2015, 9, 10, 13, 30)
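
The odd and even rules can be sketched with a plain sort. This is an illustration of the definition only, written for Python 3 with the standard library; median_sketch is a hypothetical name, and a sort-based approach is an assumption here, since datetimestats may use a different selection strategy internally.

```python
import datetime as dt


def median_sketch(datetimes):
    """Median of an iterable of datetimes using a full sort (O(n log n))."""
    items = sorted(datetimes)
    if not items:
        raise ValueError("median of an empty collection is undefined")
    mid = len(items) // 2
    if len(items) % 2 == 1:
        # Odd count: the single middle value.
        return items[mid]
    # Even count: the midpoint between the two middle values.
    lo, hi = items[mid - 1], items[mid]
    return lo + (hi - lo) / 2


t09 = dt.datetime(2016, 1, 1, 9, 0, 0)
t10 = dt.datetime(2016, 1, 1, 10, 0, 0)
t11 = dt.datetime(2016, 1, 1, 11, 0, 0)
t12 = dt.datetime(2016, 1, 1, 12, 0, 0)

odd_median = median_sketch([t09, t11, t10])        # middle of 09:00, 10:00, 11:00
even_median = median_sketch([t09, t11, t10, t12])  # midpoint of 10:00 and 11:00

print(odd_median)   # 2016-01-01 10:00:00
print(even_median)  # 2016-01-01 10:30:00
```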

Calculating Min and Max datetime

The min datetime is the earliest datetime. Conversely, the max is the latest. This is most interesting when calculating across multiple timezones:

>>> import datetime as dt
>>> import pytz
>>> london_noon = dt.datetime(2014, 1, 1, 12, 0, 0, tzinfo=pytz.timezone('Europe/London'))
>>> nyc_noon = dt.datetime(2014, 1, 1, 12, 0, 0, tzinfo=pytz.timezone('America/New_York'))
>>> singapore_noon = dt.datetime(2014, 1, 1, 12, 0, 0, tzinfo=pytz.timezone('Asia/Singapore'))
>>> from datetimestats import min, max
>>> min([nyc_noon, london_noon, singapore_noon])
datetime.datetime(2014, 1, 1, 12, 0, tzinfo=<DstTzInfo 'Asia/Singapore' LMT+6:55:00 STD>)
>>> min([nyc_noon, london_noon, singapore_noon]) == singapore_noon # Noon arrives in Singapore earliest
True
>>> max([nyc_noon, london_noon, singapore_noon])
datetime.datetime(2014, 1, 1, 12, 0, tzinfo=<DstTzInfo 'America/New_York' LMT-1 day, 19:04:00 STD>)
>>> max([nyc_noon, london_noon, singapore_noon]) == nyc_noon # Noon arrives in NYC latest
True
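
The ordering across time zones follows from how Python compares non-naive datetimes: by the instant they represent, not by their wall-clock fields. The sketch below demonstrates this with Python's built-in min and max rather than the datetimestats functions. It assumes Python 3 and the standard library only; fixed is a hypothetical helper, and its offsets stand in for the LMT offsets of the pytz zones shown above.

```python
import datetime as dt


def fixed(hours, minutes):
    """Hypothetical helper: a fixed-offset zone standing in for a pytz zone."""
    return dt.timezone(dt.timedelta(hours=hours, minutes=minutes))


nyc_noon = dt.datetime(2014, 1, 1, 12, 0, 0, tzinfo=fixed(-4, -56))
london_noon = dt.datetime(2014, 1, 1, 12, 0, 0, tzinfo=fixed(0, -1))
singapore_noon = dt.datetime(2014, 1, 1, 12, 0, 0, tzinfo=fixed(6, 55))
noons = [nyc_noon, london_noon, singapore_noon]

# Built-in min/max compare non-naive datetimes by absolute instant.
earliest = min(noons)
latest = max(noons)

# Converting to UTC makes the ordering visible in the wall-clock fields.
as_utc = [d.astimezone(dt.timezone.utc) for d in sorted(noons)]
for d in as_utc:
    print(d)
# 2014-01-01 05:05:00+00:00
# 2014-01-01 12:01:00+00:00
# 2014-01-01 16:56:00+00:00
```

Note that `from datetimestats import min, max` shadows the built-ins of the same name for the rest of the session; `import datetimestats` followed by `datetimestats.min(...)` avoids that.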

Others

Sets, tuples, NumPy arrays, and pandas Series are also supported:

>>> import datetime as dt
>>> naive_1 = dt.datetime(2015, 9, 10, 12, 0, 0)
>>> naive_2 = dt.datetime(2015, 9, 10, 14, 0, 0)
>>> from datetimestats import median
>>> median([naive_1, naive_2])
datetime.datetime(2015, 9, 10, 13, 0)
>>> median((naive_1, naive_2))
datetime.datetime(2015, 9, 10, 13, 0)
>>> import numpy as np
>>> median(np.asarray([naive_1, naive_2]))
datetime.datetime(2015, 9, 10, 13, 0)
>>> import pandas as pd
>>> median(pd.Series([naive_1, naive_2]))
datetime.datetime(2015, 9, 10, 13, 0)
Release History

1.0.0

Download Files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

File Name                            File Type   Upload Date
datetimestats-1.0.0.tar.gz (4.1 kB)  Source      Aug 13, 2016