Laboratory

A library for carefully refactoring critical paths, with support for Python 2.7 & 3.3+

Laboratory is all about sure-footed refactoring achieved through experimentation. By conducting experiments and verifying their results, not only can we see if our refactored code is misbehaving, we have established a feedback loop to help us correct its behaviour.

Note

These docs are a work in progress. Additional documentation can be found in the project’s README

Installation

Installing from PyPI is recommended.

If you’re unfamiliar with Python packaging tools (such as pip and virtualenv) see what The Hitchhiker’s Guide to Python has to say about them.

$ pip install laboratory

You can also install a tagged version from GitHub

$ pip install https://github.com/joealcorn/laboratory/archive/v1.0.tar.gz

Or the latest development version

$ pip install git+https://github.com/joealcorn/laboratory.git

Now move on to the Quickstart

Publishing results

We saw in the Quickstart how to create and run an experiment. Now let’s see how we can take the data gathered in that experiment and publish it to make it useful to us.

Laboratory makes no assumptions about how to do this — it’s entirely for you to implement to suit your needs. For example, timing data can be sent to graphite, and mismatches could be written to disk for debugging at a later date.

Publishing

To publish, you must implement the publish() method on an Experiment.

The publish method is passed a Result instance, with control and candidate observations available under result.control and result.candidates respectively.

Experiment.publish(result)[source]

Publish the results of an experiment. This is called after each experiment run. Exceptions that occur during publishing will be caught and logged.

By default this is a no-op. See Publishing results.

Parameters: result (Result) – The result of an experiment run
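For example, here is a minimal sketch of a publish() that writes mismatches to disk for later debugging (the file name and JSON layout are our own choices, not part of the library):

import json

import laboratory

class DiskLoggingExperiment(laboratory.Experiment):
    def publish(self, result):
        # only record runs where a candidate disagreed with the control
        if result.match:
            return

        record = {
            'control': repr(result.control.value),
            'candidates': {obs.name: repr(obs.value) for obs in result.candidates},
        }
        with open('experiment_mismatches.log', 'a') as f:
            f.write(json.dumps(record) + '\n')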

StatsD implementation

Here’s an example implementation for statsd:

import laboratory

# 'statsd' is assumed to be an already-configured StatsD client object
# exposing incr() and timing() (for example, a client from the statsd package)

class StatsdExperiment(laboratory.Experiment):
    def publish(self, result):
        # count matches and mismatches
        if result.match:
            statsd.incr('experiment.match')
        else:
            statsd.incr('experiment.mismatch')

        # record how long the control and each candidate took
        statsd.timing('experiment.control', result.control.duration)
        for obs in result.candidates:
            statsd.timing('experiment.%s' % obs.name, obs.duration)
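Using it looks like any other experiment (the function names here are illustrative):

experiment = StatsdExperiment(name='my-experiment')
experiment.control(authorise_control, args=(user,))
experiment.candidate(authorise_candidate, args=(user,))
authorised = experiment.conduct()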

API Reference

Experiment

class laboratory.experiment.Experiment(name='Experiment', context=None, raise_on_mismatch=False)[source]

Experiment base class. Handles running your control and candidate functions. Should be subclassed to add publishing functionality.

Variables:
  • name (string) – Experiment name
  • raise_on_mismatch (bool) – Raise MismatchException when experiment results do not match

classmethod decorator(candidate, *exp_args, **exp_kwargs)[source]

Decorate a control function in order to conduct an experiment when called.

Parameters:
  • candidate (callable) – your candidate function
  • exp_args (iterable) – positional arguments passed to Experiment
  • exp_kwargs (dict) – keyword arguments passed to Experiment

Usage:

candidate_func = lambda: True

@Experiment.decorator(candidate_func)
def control_func():
    return True

control(control_func, args=None, kwargs=None, name='Control', context=None)[source]

Set the experiment’s control function. Must be set before conduct() is called.

Parameters:
  • control_func (callable) – your control function
  • args (iterable) – positional arguments to pass to your function
  • kwargs (dict) – keyword arguments to pass to your function
  • name (string) – a name for your observation
  • context (dict) – observation-specific context
Raises:

LaboratoryException – If attempting to set a second control case

candidate(cand_func, args=None, kwargs=None, name='Candidate', context=None)[source]

Adds a candidate function to an experiment. Can be used multiple times for multiple candidates.

Parameters:
  • cand_func (callable) – your candidate function
  • args (iterable) – positional arguments to pass to your function
  • kwargs (dict) – keyword arguments to pass to your function
  • name (string) – a name for your observation
  • context (dict) – observation-specific context

conduct(randomize=True)[source]

Run control & candidate functions and return the control’s return value. control() must be called first.

Parameters: randomize (bool) – controls whether we shuffle the order of execution between control and candidate
Raises: LaboratoryException – when no control case has been set
Returns: Control function’s return value

enabled()[source]

Should the experiment be enabled? If this returns False, candidates will not be executed.

Return type: bool
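Usage (a minimal sketch that enables the experiment for roughly 10% of calls; the sampling threshold is an illustrative choice, not part of the library):

import random

import laboratory

class SampledExperiment(laboratory.Experiment):
    def enabled(self):
        # run the candidates for roughly 10% of calls
        return random.random() < 0.1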

compare(control, candidate)[source]

Compares the control Observation against a candidate Observation and determines whether they match.

Parameters:
  • control (Observation) – The control block’s Observation
  • candidate (Observation) – A candidate block’s Observation
Raises: MismatchException – if the observations do not match and Experiment.raise_on_mismatch is True
Returns: True if the observations match, False otherwise
Return type: bool

publish(result)[source]

Publish the results of an experiment. This is called after each experiment run. Exceptions that occur during publishing will be caught and logged.

By default this is a no-op. See Publishing results.

Parameters: result (Result) – The result of an experiment run

get_context()[source]

Return dict: Experiment-wide context
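For example, a minimal sketch of passing context and reading it back when publishing (the keys and values are purely illustrative):

import laboratory

class ContextExperiment(laboratory.Experiment):
    def publish(self, result):
        # experiment-wide context, set when the experiment was created
        experiment_context = self.get_context()
        # observation-specific context, set via control()/candidate()
        control_context = result.control.get_context()
        print(experiment_context, control_context)

experiment = ContextExperiment(context={'service': 'auth'})
experiment.control(lambda: True, context={'branch': 'control'})
experiment.candidate(lambda: True, context={'branch': 'candidate'})
experiment.conduct()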

Observation

class laboratory.observation.Observation(name, context=None)[source]

Result of running a single code block.

Variables:
  • name (string) – observation name
  • failure (bool) – did the function raise an exception
  • exception (Exception) – exception raised, if any
  • exc_info – result of sys.exc_info(), if exception raised
  • value – function return value

duration

How long the function took to execute

Return type: timedelta
get_context()[source]

Return observation-specific context

Result

class laboratory.result.Result(experiment, control, candidates)[source]

Variables:
  • experiment (Experiment) – The experiment instance that recorded this Result
  • control (Observation) – The control observation
  • candidates ([Observation]) – A list of candidate observations
  • match (bool) – Whether all candidates match the control case

Exceptions

exception laboratory.exceptions.LaboratoryException(message, *a, **kw)[source]

Base class for all laboratory exceptions

exception laboratory.exceptions.MismatchException(message, *a, **kw)[source]

Raised when an experiment’s results do not match and Experiment.raise_on_mismatch is set

Quickstart

See: Installation or pip install laboratory

With Laboratory you conduct an experiment with your known-good code as the control block and a new code branch as a candidate.

Let’s do an experiment together:

import laboratory

# create an experiment
experiment = laboratory.Experiment()

# set your control and candidate functions
experiment.control(authorise_control, args=(user,))
experiment.candidate(authorise_candidate, args=(user,))

# conduct the experiment and return the control value
authorised = experiment.conduct()

Laboratory just:

  • Executed the unproven (candidates) and the existing (control) code
  • Compared the return values
  • Recorded timing information about all code
  • Caught (and logged) exceptions in the unproven code
  • Published all of this information (see Publishing results)

For the most part that’s all there is to it. You’ll need to do some work to publish your results in order to act on the experiment, but if you’ve got a metrics solution ready to go it should be straightforward.

If you need to control how results are compared, you can do that too by overriding Experiment.compare(), as in the sketch below.
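For instance, a minimal sketch of a custom comparison (assuming, purely for illustration, that both code paths return dict-like values and only the 'id' field matters):

import laboratory

class UserExperiment(laboratory.Experiment):
    def compare(self, control, candidate):
        # compare only the field we care about
        return control.value['id'] == candidate.value['id']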

Tip

Your control and candidate functions execute in a random order to help catch ordering issues
