NAME
    Benchmark::Lab - Tools for structured benchmarking and profiling

VERSION
    version 0.001

EXPERIMENTAL
    This module is still in an early, experimental stage. Breaking API
    changes could occur in any release before 1.000.

    Use and feedback are welcome if you are willing to accept that risk.

SYNOPSIS
        # Load as early as possible in case you want profiling
        use Benchmark::Lab -profile => $ENV{DO_PROFILING};

        # Define a task to benchmark as functions in a namespace
        package My::Task;

        # do once before any iterations (not timed)
        sub setup {
            my $context = shift;
            ...
        }

        # do before every iteration (not timed)
        sub before_task {
            my $context = shift;
            ...
        }

        # task being iterated and timed
        sub do_task {
            my $context = shift;
            ...
        }

        # do after every iteration (not timed)
        sub after_task {
            my $context = shift;
            ...
        }

        # do once after all iterations (not timed)
        sub teardown {
            my $context = shift;
            ...
        }

        # Run benchmarks on a namespace
        package main;

        my $context = {}; # any data needed

        my $result = Benchmark::Lab->new()->start( 'My::Task', $context );

        # ... do stuff with results ...

DESCRIPTION
    This module provides a harness to benchmark and profile structured
    tasks.

    Structured tasks include a task to be benchmarked, as well as work to
    prepare for or clean up after benchmarking that should not be timed.

    This module also allows the same structured task to be profiled with
    Devel::NYTProf, again with only the task under investigation being
    profiled. During prep/cleanup work, the profiler is paused.

    On systems that support "Time::HiRes::clock_gettime" and
    "CLOCK_MONOTONIC", those will be used for timing. On other systems, the
    less precise and non-monotonic "Time::HiRes::time" function is used
    instead.
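
    As an illustration (not necessarily how "Benchmark::Lab" detects this
    internally), you can check whether a high-resolution monotonic clock is
    available on your system like this:

        use Time::HiRes ();

        # clock_gettime() and the CLOCK_MONOTONIC constant both die if the
        # platform does not provide them, so probe inside an eval
        my $has_monotonic = eval {
            Time::HiRes::clock_gettime( Time::HiRes::CLOCK_MONOTONIC() );
            1;
        };

        print $has_monotonic
            ? "high-resolution monotonic clock available\n"
            : "falling back to Time::HiRes::time\n";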

    Future versions will add features for analyzing and comparing benchmark
    timing data.

USAGE
  Loading and initializing
    If you want to use the profiling feature, you MUST load this module as
    early as possible so that Devel::NYTProf can instrument all
    subsequently compiled code.

    To correctly initialize "Benchmark::Lab" (and possibly Devel::NYTProf),
    you MUST ensure its "import" method is called. (Loading it with "use" is
    sufficient.)

    Here is an example that toggles profiling based on an environment
    variable:

        use Benchmark::Lab -profile => $ENV{DO_PROFILING};

        # loading other modules is now OK
        use File::Spec;
        use HTTP::Tiny;
        ...

  Creating a structured task
    A structured task is a Perl namespace that implements some of the
    following *task phases* by providing a subroutine with the corresponding
    name:

    *   "setup" – run once before any iteration begins (not timed)

    *   "before_task" – run before *each* "do_task" function (not timed)

    *   "do_task" – specific task being benchmarked (timed)

    *   "after_task" – run after *each* "do_task" function (not timed)

    *   "teardown" – run after all iterations are finished (not timed)

    Each task phase will be called with a *context object*, which can be
    used to pass data across phases.

        package Foo;

        use Path::Tiny qw(path);

        sub setup {
            my $context = shift;
            $context->{filename} = "foobar.txt";

            # _test_data() is assumed to be defined elsewhere in this package
            path($context->{filename})->spew_utf8( _test_data() );
        }

        sub do_task {
            my $context = shift;
            my $file = $context->{filename};
            # ... do stuff with $file
        }

    Because structured tasks are Perl namespaces, you can put them into .pm
    files and load them like modules. Or, you can define them on the fly.

    Also, since "Benchmark::Lab" finds task phase functions with the "can"
    method, you can use regular Perl inheritance with @ISA to reuse
    setup/teardown/etc. task phases for related "do_task" functions.

        package Foo::Base;

        sub setup { ... }
        sub teardown { ... }

        package Foo::Case1;

        use parent 'Foo::Base';
        sub do_task { ... }

        package Foo::Case2;

        use parent 'Foo::Base';
        sub do_task { ... }

  Running benchmarks
    A "Benchmark::Lab" object defines the conditions of the test – currently
    just the constraints on the number of iterations or duration of the
    benchmarking run.

    Running a benchmark is just a matter of specifying the namespace that
    contains the task phase functions and, if desired, a context object.

        use Benchmark::Lab -profile => $ENV{DO_PROFILING};

        sub fact { my $n = int(shift); return $n <= 1 ? 1 : $n * fact( $n - 1 ) }

        *Fact::do_task = sub {
            my $context = shift;
            fact( $context->{n} );
        };

        my $bl      = Benchmark::Lab->new;
        my $context = { n => 25 };
        my $res     = $bl->start( "Fact", $context );

        printf( "Median rate: %d/sec\n", $res->{median_rate} );

  Analyzing results
    TBD. Analysis will be added in a future release.

METHODS
  new
    Returns a new Benchmark::Lab object.

    Valid attributes include:

    *   "min_secs" – minimum elapsed time in seconds; default 0

    *   "max_secs" – maximum elapsed time in seconds; default 300

    *   "min_reps" - minimum number of task repetitions; default 1; minimum
        1

    *   "max_reps" - maximum number of task repetitions; default 100

    *   "verbose" – when true, progress will be logged to STDERR; default
        false

    The logic for benchmark duration is as follows:

    *   benchmarking always runs until both "min_secs" and "min_reps" are
        satisfied

    *   when profiling, benchmarking stops after minimums are satisfied

    *   when not profiling, benchmarking stops once either "max_secs" or
        "max_reps" is exceeded

    Note that "elapsed time" for the "min_secs" and "max_secs" is wall-clock
    time, not the cumulative recorded time of the task itself.
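
    For example, assuming attributes are passed to "new" as key/value pairs
    (the values here are arbitrary), a lab that runs for at least one
    second and at most 1000 repetitions, logging progress to STDERR, could
    be constructed like this:

        my $bl = Benchmark::Lab->new(
            min_secs => 1,
            max_reps => 1000,
            verbose  => 1,
        );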

  start
        my $result = $bm->start( $package, $context, $label );

    This method executes the structured benchmark from the given $package.
    The $context parameter is passed to all task phases. The $label is used
    for diagnostic output to describe the benchmark being run.

    If parameters are omitted, $package defaults to "main", an empty hash
    reference is used for the $context, and the $label defaults to the
    $package.

    It returns a hash reference with the following keys:

    *   "elapsed" – total wall clock time to execute the benchmark
        (including non-timed portions).

    *   "total_time" – sum of recorded task iterations times.

    *   "iterations" – total number of "do_task" functions called.

    *   "percentiles" – hash reference with 1, 5, 10, 25, 50, 75, 90, 95 and
        99th percentile iteration times. There may be duplicates if there
        were fewer than 100 iterations.

    *   "median_rate" – the inverse of the 50th percentile time.

    *   "timing" – array reference with individual iteration times as
        (floating point) seconds.
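
    For example, here is a rough sketch of examining a result, using only
    the keys documented above:

        my $res = $bl->start( 'My::Task', $context );

        printf "%d iterations in %.3fs wall clock (%.3fs timed)\n",
            $res->{iterations}, $res->{elapsed}, $res->{total_time};

        printf "median rate: %d/sec\n", $res->{median_rate};

        # the full timing distribution is available for custom analysis
        my @times = sort { $a <=> $b } @{ $res->{timing} };
        printf "fastest: %.6fs  slowest: %.6fs\n", $times[0], $times[-1];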

CAVEATS
    If the "do_task" executes in less time than the timer granularity, an
    error will be thrown. For benchmarks that do not have before/after
    functions, just repeating the function under test in "do_task" will be
    sufficient.
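
    For example, a minimal sketch, where "very_fast_function" and the
    repeat count of 1000 are placeholders for your own code:

        package My::FastTask;

        sub do_task {
            my $context = shift;

            # Repeat the call so that a single do_task iteration comfortably
            # exceeds the timer granularity; scale the reported times by the
            # repeat count when interpreting results.
            very_fast_function( $context->{arg} ) for 1 .. 1000;
        }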

RATIONALE
    I believe most approaches to benchmarking are flawed, primarily because
    they focus on finding a *single* measurement. Single metrics are easy to
    grok and easy to compare ("foo was 13% faster than bar!"), but they
    obscure the full distribution of timing data and (as a result) are often
    unstable.

    Most of the time, people hand-wave this issue and claim that the
    Central Limit Theorem (CLT) solves the problem for a large enough
    sample size. Unfortunately, the CLT holds only if means and variances
    are finite, and some real-world distributions are not (e.g. hard drive
    error frequencies are best fit by a Pareto distribution).

    Further, we often care more about the shape of the distribution than
    just a single point. For example, I would rather have a process with
    mean µ that stays within 0.9µ - 1.1µ than one that varies from 0.5µ -
    1.5µ.

    And a process that is 0.1µ 90% of the time and 9.1µ 10% of the time
    (still with mean µ!) might be great or terrible, depending on the
    application.

    This module grew out of a desire for detailed benchmark timing data,
    plus some additional features, which I couldn't find in existing
    benchmarking modules:

    *   Raw timing data – I wanted to be able to get raw timing data, to
        allow more flexible statistical analysis of timing distributions.

    *   Monotonic clock – I wanted times from a high-resolution monotonic
        clock (if available).

    *   Setup/before/after/teardown – I wanted to be able to
        initialize/reset state not just once at the start, but before each
        iteration and without it being timed.

    *   Devel::NYTProf integration – I wanted to be able to run the exact
        same code I benchmarked through Devel::NYTProf, also limiting the
        profiler to the benchmark task alone, not the setup/teardown/etc.
        code.

    Eventually, I hope to add more robust graphical visualization and
    statistical analyses of timing distributions. This might include
    single-point estimates (like other benchmarking modules) as well as
    more sophisticated metrics, like non-parametric measures for comparing
    samples with unequal variances.

SEE ALSO
    There are many benchmarking modules on CPAN with a mix of features that
    may be sufficient for your needs. To my knowledge, none give timing
    distributions or integrate with Devel::NYTProf.

    Here is a brief rundown of some that I am familiar with:

    *   Benchmark – ships with Perl, but makes it hard to get timing
        distributions in a usable form.

    *   Benchmark::Timer – times only parts of code, targets a degree of
        statistical confidence (assuming data is normally distributed).

    *   Dumbbench – attempts to improve on Benchmark with a more robust
        statistical estimation of runtimes; no before/after capabilities.

    *   Surveyor::App – also runs benchmarks from a package, but doesn't
        have before/after task capabilities and relies on Benchmark for
        timing.

SUPPORT
  Bugs / Feature Requests
    Please report any bugs or feature requests through the issue tracker at
    <https://github.com/dagolden/Benchmark-Lab/issues>. You will be notified
    automatically of any progress on your issue.

  Source Code
    This is open source software. The code repository is available for
    public review and contribution under the terms of the license.

    <https://github.com/dagolden/Benchmark-Lab>

      git clone https://github.com/dagolden/Benchmark-Lab.git

AUTHOR
    David Golden <dagolden@cpan.org>

COPYRIGHT AND LICENSE
    This software is Copyright (c) 2016 by David Golden.

    This is free software, licensed under:

      The Apache License, Version 2.0, January 2004