Pipeline Operators

_images/Pipeline.png — Fig. 8 **LEAP operator pipeline.** This figure depicts a typical LEAP operator pipeline. First is a parent population from which the next operator selects individuals, which are then cloned by the next operator to be followed by operators for mutating and evaluating the individual. (For brevity, a crossover operator was not included, but could also have been freely inserted into this pipeline.) The pool operator is a sink for offspring, and drives the demand for the upstream operators to repeatedly select, clone, mutate, and evaluate individuals repeatedly until the pool has the desired number of offspring. Lastly, another selection operator returns the final set of individuals based on the offspring pool and optionally the parents.

Overview

leap_ec.individual.Individual, leap_ec.problem.Problem, and leap_ec.decoder.Decoder are passive classes that need an external framework to make them function. In LEAP Concepts the notion of a pipeline of evolutionary algorithm (EA) operators that use these classes was introduced. That is, Individual, Decoder, and Problem are the “nouns” and the pipeline operators a the verbs that operate on those nouns. The operator pipeline objective is to create a new set of evaluated individuals from an existing set of prospective parents that can be in a new set of prospective parents.

Fig. 8 is shown again here to depict a typical set of LEAP pipeline operators. The pipeline generally starts with a “source”, or a parent population, from which the next operator typically selects for creating offspring. This is followed by a clone operator that ensure the subsequent pertubation operators do not modify the selected parents. (And so it is critically important that users always have a clone operator as a part of the offspring creation pipeline before any mutation, crossover, or other genome altering operators.) The pertubation operators can be mutation or also include a crossover operator. At this point in the pipeline we have a completed offspring with no fitness, so the next operator evaluates the offspring to assign that fitness. Then the evaluated offspring is collected into a pool of offspring that acts as a “sink” for new individuals, and is the principal driving for the pipeline; i.e., it is the need to fill the sink that “pulls” individuals down the pipeline. Once the offspring pool reaches a desired size it returns all the offspring to another selection operator to cull the offspring, and optionally the parents, to return the next set of prospective parents.

Or, more explicitly:

Start with a collection of Individuals that are prospective parents as the pipeline “source”
A selection operator for selecting one or more parents to begin the creation of a new offspring
A clone operator that makes a copy of the selected parents to ensure the following operators don’t overwrite those parents
A set of mutation, crossover, or other operators that perturb the cloned individual’s genome, thus (hopefully) giving the new offspring unique values
An operator to evaluate the new offspring
A pool that serves as a “sink” for evaluated offspring; this pool is sent to the next operator, or is returned from the function, once the pool reaches a specified size
Another selection operator to cull the offspring (and optionally parents) to return a population of new prospective parents

This is, the general sequence for most LEAP pipelines, but there will be the occasional variation on this theme. For example, many of the provided “canned” algorithms take just snippets of an offspring creation pipeline. E.g., leap_ec.distributed.asynchronous.steady_state() has an offspring_pipeline parameter that doesn’t have parents explicitly as part of the pipeline; instead, for steady_state() it’s implied that the parents will be provided during the run internally.

Implementation Details

The LEAP pipeline is implemented using the toolz.functoolz.pipe() function, which has arguments comprised of a collection of data followed by an arbitrary number of functions. When invoked the data is passed as an argument to the first function, and the output of that function is fed as an argument to the next function — this repeats for the rest of the functions. The output of the last function is returned as the overall pipeline output. (See: https://toolz.readthedocs.io/en/latest/api.html#toolz.functoolz.pipe )

Loose-coupling via generator functions

The first “data” argument is a collection of Individuals representing prospective parents, which can be a sequence, such as a list or tuple. The design philosophy for the operator functions that follow was to ensure they were as loosely coupled as possible. This was achieved by implementing some operators as generator functions that accept iterators as arguments. That way, new operators can be spliced into the pipeline and they’d automatically “hook up” to their neighbors.

For example, consider the following snippet:

gen = 0
while gen < max_generation:
    offspring = toolz.pipe(parents,
                           ops.tournament_selection,
                           ops.clone,
                               mutate_bitflip,
                           ops.evaluate,
                           ops.pool(size=len(parents)))

    parents = offspring
    gen += 1

The above code snippet is an example of a very basic genetic algorithm implementation that uses a toolz.pipe() function to link together a series of operators to do the following:

binary tournament_selection selection on a set of parents
clone those that were selected
perform mutation bit-bitflip on the clones
evaluate the offspring
accumulate as many offspring as there are parents

Since we only have mutation in the pipeline, only one parent at a time is selected to be cloned to create an offspring. However, let’s make one change to that pipeline by adding crossover:

gen = 0
while gen < max_generation:
    offspring = toolz.pipe(parents,
                           ops.tournament_selection,
                           ops.clone,
                               mutate_bitflip,
                           ops.uniform_crossover, # NEW OPERATOR
                           ops.evaluate,
                           ops.pool(size=len(parents)))

    parents = offspring
    gen += 1

This does the following:

binary tournament_selection selection on a set of parents
clone those that were selected
perform mutation bitflip on the clones
perform uniform crossover between the two offspring
evaluate the offspring
accumulate as many offspring as there are parents

Adding crossover means that now two parents are selected instead of one. However, note that the tournament_selection selection operator wasn’t changed. It automatically selects two parents instead of one, as necessary.

Let’s take a closer look at uniform_crossover() (this is a simplified version; the actual code has more type checking and docstrings).

def uniform_crossover(next_individual: Iterator,
                      p_swap: float = 0.5) -> Iterator:
    def _uniform_crossover(ind1, ind2, p_swap):
        for i in range(len(ind1.genome)):
            if random.random() < p_swap:
                ind1.genome[i], ind2.genome[i] = ind2.genome[i], ind1.genome[i]

        return ind1, ind2

    while True:
        parent1 = next(next_individual)
        parent2 = next(next_individual)

        child1, child2 = _uniform_crossover(parent1, parent2, p_swap)

        yield child1
        yield child2

Note that the argument next_individual is an Iterator that “hooks up” to a previously yielded Individual from the previous pipeline operator. The uniform_crossover operator doesn’t care how the previous Individual is made, it just has a contract that when next() is invoked that it will get another Individual. And, since this is a generator function, it yields the crossed-over Individuals. It also has two yield statements that ensures both crossed-over Individuals are returned, thus eliminating a potential source of genetic drift by arbitrarily only yielding one and discarding the other.

Operators for collections of Individuals

There is another class of operators that work on collections of Individuals such as selection and pooling operators. Generally:

selection pipeline operators: accept a collection of Individuals and yield a selected Individual (and thus are generator functions)
pooling operators: accept an Iterator from which to get the next() Individual, and returns a collection of Individuals

Below shows an example of a selection operator, which is a simplified version of the tournament_selection() operator:

def tournament_selection(population: List, k: int = 2) -> Iterator:
    while True:
        choices = random.choices(population, k=k)
        best = max(choices)

        yield best

(Again, the actual leap_ec.ops.tournament_selection() has checks and docstrings.)

This depicts how a typical selection pipeline operator works. It accepts a population parameter (plus some optional parameters), and yields the selected individual.

Below is example of a pooling operator:

def pool(next_individual: Iterator, size: int) -> List:
    return [next(next_individual) for _ in range(size)]

This accepts an Iterator from which it gets the next individual, and it uses that iterator to accumulate a specified number of Individuals via a list comprehension. Once the desired number of Individuals is accumulated, the list of those Individuals is returned.

Currying Function Decorators

Some pipeline operators have user-specified parameters. E.g., leap_ec.ops.pool() has the mandatory size parameter. However, given that toolz.pipe() takes functions as parameters, how do we ensure that we pass in functions that have set parameters?

Normally we would use the Standard Python Library’s functools.partial to set the function parameters and then pass in the function returned from that call. However, toolz has a convenient function wrapper that does the same thing, toolz.functools.curry. (See: https://toolz.readthedocs.io/en/latest/api.html#toolz.functoolz.curry ) Pipeline operators that take on user-settable parameters are all wrapped with curry to allow functions with parameters set to be passed into toolz.pipe().

Operator Class

Most of the pipeline operators are implemented as functions. However, from time to time an operator will need to persist state between invocations. For generator functions, that comes with using yield in that the next time that function is invoked the next individual is returned. However, there are some operators that use closures, such as :py:func:leap_ec.ops.migrate.

In any case, sometimes if one wants persistent state in a pipeline operator a closure or using yield isn’t enough. In which case, having a class that can have objects that persist state might be useful.

To that end, leap_ec.ops.Operator is an abstract base-class (ABC) that provides a template of sorts for those kinds of classes. That is, you would write an Operator sub-class that provides a __call__() member function that would allow objects of that class to be inserted into a LEAP pipeline just like any other operator. Presumably during execution the internal object state would be continually be updated with book-keeping information as Individuals flow through it in the pipeline.

leap_ec.ops.CooperativeEvaluate is an example of using this class.

Table of Pipeline Operators

Representation Specificity		Input -> Output	Operator
Representation Agnostic		Iterator → Iterator	clone()
			evaluate()
			uniform_crossover()
			n_ary_crossover()
			CooperativeEvaluate
		Iterator → population	pool()
		population → population	truncation_selection()
			const_evaluate()
			insertion_selection()
			migrate()
		population → Iterator	tournament_selection()
			naive_cyclic_selection()
			cyclic_selection()
			random_selection()
Representation Dependent Dependent	binary_rep	Iterator → Iterator	mutate_bitflip()
	real_rep	Iterator → Iterator	mutate_gaussian()
	int_rep	Iterator → Iterator	mutate_randint()
	segmented_rep	Iterator → Iterator	apply_mutation()
			add_segment()
			remove_segment()
			copy_segment()

Admittedly it can be confusing when considering the full suite of LEAP pipeline operators, especially in remembering what kind of operators “connect” to what. With that in mind, the above table breaks down pipeline operators into different categories. First, there are two broad categories of pipeline operators — operators that don’t care about the internal representation of Individuals, or “Representation Agnostic” operators; and those operators that do depend on the internal representation, or “Representation Dependent” operators. Most of the operators are “Representation Agnostic” in that it doesn’t matter if a given Individual has a genome of bits, real-values, or some other representation. Only two operators are dependent on representation, and those will be discussed later.

The next category is broken down by what kind of input and output a given operator takes. That is, generally, an operator takes a population (collection of Individuals) or an Iterator from which a next Individual can be found. Likewise, a given operator can return a population or yield an Iterator to a next Individual. So, operators that return an Iterator can be connected to operators that expect an Iterator for input. Similarly, an operator that expects a population can be connected directly to a collection of Individuals (e.g., be the second argument to toolz.pipe()) or to an operator that returns a collection of Individuals.

If you are familiar with evolutionary algorithms, most of these connections are just common sense. For example, selection operators would select from a population.

With regards to “Representation Dependent” operators there currently are only two: leap_ec.binary_rep.mutate_bitflip() and leap_ec.real_rep.mutate_gaussian(). The former relies on a genome of all bits, and the latter of real-values. In the future, LEAP will support other representations that will similarly have their own operators.

Warning

Are all operators really representation agnostic? In reality, most of the operators assume that Individual.genome is a numpy array, which may not always be the case. For example, the user may come up with a representation that employs, say, a sparse matrix. In that case, the crossover operators will fail.

In the future we intend on adding support for other popular representations that will show up as LEAP sub-packages. (I.e., just as binary_rep and real_rep provide support for binary and real-value representations.)

So, in a sense, for where it matters, LEAP currently assumes some sort of sequence for genomes though, again, plans are afoot to add more representation types. In the interim, you will have to add your own operators to support new non-sequence genomic representations.

Type-checking Decorator Functions

However, to help minimize the chances that pipeline operators would be mis-used the operators have function decorates that due parameter type-checking to ensure the correct parameters are being passed in. These are:

iteriter_op: This checks for signatures of type Iterator -> Iterator
listlist_op: Checks for population -> population type operators
listiter_op: Checks for population -> population type operators
iterlist_op: Checks for population -> Iterator type operators

These can be found in leap_ec.ops.

API Documentation

Base operator classes and representation agnostic functions

Fundamental evolutionary operators.

This module provides many of the most important functions that we string together to create EAs out of operator pipelines. You’ll find many traditional selection and reproduction strategies here, as well as components for classic algorithms like island models and cooperative coevolution.

Representation-specific operators tend to reside within their own subpackages, rather than here. See for example leap_ec.real_rep.ops and leap_ec.binary_rep.ops.

class leap_ec.ops.CooperativeEvaluate(num_trials: int, collaborator_selector, log_stream=None, combine=<function concat_combine>, context={'leap': {'distrib': {'non_viable': 0}}})

Bases: Operator

A simple, non-parallel implementation of cooperative coevolutionary fitness evaluation.

Parameters

num_trials (int) – the number of combined solutions & fitness estimates to collect when computing a partial solution’s fitness.
collaborator_selector – a selection operator that we use to choose individuals from the other subpopulations to create a combined solution.
context – the algorithm’s state context. Used to access subpopulation information.
log_stream – optional file object to collect statistics about combined individuals to.
combine – the function used to combine partial solutions into combined solutions.

class leap_ec.ops.Crossover(persist_children, p_xover)

Bases: Operator

abstract recombine(parent_a, parent_b): Perform recombination between two parents to produce two new individuals.

class leap_ec.ops.NAryCrossover(num_points=2, p_xover=1.0, persist_children=False)

Bases: Crossover

Do crossover between individuals between N crossover points.

1 < n < genome length - 1

We also assume that the passed in individuals are clones of parents.

>>> from leap_ec.individual import Individual
>>> from leap_ec.ops import NAryCrossover
>>> import numpy as np

>>> genome1 = np.array([0, 0])
>>> genome2 = np.array([1, 1])
>>> first = Individual(genome1)
>>> second = Individual(genome2)
>>> pop = [first, second]
>>> select = naive_cyclic_selection(pop)

>>> op = NAryCrossover()
>>> result = op(select)

>>> new_first = next(result)
>>> new_second = next(result)

If persist_children is True and there is a child that was made by crossover but isn’t used in the first call, it will be yielded in a future call.

>>> op = NAryCrossover(p_xover=0.0, persist_children=True)
>>>
>>> next(op(select)) is first  # Create an iterator loop with op(select) and consume 1 individual
True
>>> next(op(select)) is second # Create a different iterator loop with op(select)
True

With persist_children set to False, the second child will not be yielded if the iterator is consumed an odd number of times. Instead, on the next call the loop is started anew.

>>> op = NAryCrossover(p_xover=0.0, persist_children=False)
>>>
>>> next(op(select)) is first  # Create an iterator loop with op(select) and consume 1 individual
True
>>> next(op(select)) is second # Create a different iterator loop with op(select)
False

Parameters

num_points – how many crossing points do we use? Defaults to 2, since 2-point crossover has been shown to be the least disruptive choice for this value.
p – the probability that crossover is performed.
persist_children (bool) – whether unyielded children should persist between calls. This is useful for leap_ec.distrib.asynchronous.steady_state, where the pipeline may only produce one individual at a time.

Returns

a pipeline operator that returns two recombined individuals (with probability p), or two unmodified individuals (with probability 1 - p)

recombine(parent_a, parent_b): Perform recombination between two parents to produce two new individuals.

class leap_ec.ops.Operator

Bases: ABC

Abstract base class that documents the interface for operators in a LEAP pipeline.

LEAP treats operators as functions of two arguments: the population, and a “context” dict that may be used in some algorithms to maintain some global state or parameters independent of the population.

TODO The above description is outdated. –Siggy TODO Also this is for a population based operator. We also have operators for individuals

You can inherit from this class to define operators as classes. Classes support operators that take extra arguments at construction time (such as a mutation rate) and maintain some internal private state, and they allow certain special patterns (such as multi-function operators).

But inheriting from this class is optional. LEAP can treat any callable object that takes two parameters as an operator. You may define your custom operators as closures (which also allow for construction-time arguments and internal state), as simple functions ( when no additional arguments are necessary), or as curried functions ( i.e. with the help of toolz.curry(…).

class leap_ec.ops.UniformCrossover(p_swap: float = 0.2, p_xover: float = 1.0, persist_children=False)

Bases: Crossover

Parameterized uniform crossover iterates through two parents’ genomes and swaps each of their genes with the given probability.

In a classic paper, De Jong and Spears showed that this operator works particularly well when the swap probability p_swap is set to about 0.2. LEAP thus uses this value as its default.

De Jong, Kenneth A., and W. Spears. “On the virtues of parameterized uniform crossover.” Proceedings of the 4th international conference on genetic algorithms. Morgan Kaufmann Publishers, 1991.

>>> from leap_ec.individual import Individual
>>> from leap_ec.ops import UniformCrossover, naive_cyclic_selection
>>> import numpy as np

>>> genome1 = np.array([0, 0])
>>> genome2 = np.array([1, 1])
>>> first = Individual(genome1)
>>> second = Individual(genome2)
>>> pop = [first, second]
>>> select = naive_cyclic_selection(pop)
>>> op = UniformCrossover()
>>> result = op(select)
>>> new_first = next(result)
>>> new_second = next(result)

The probability can be tuned via the p_swap parameter: >>> op = UniformCrossover(p_swap=0.1) >>> result = op(select)

If persist_children is True and there is a child that was made by crossover but isn’t used in the first call, it will be yielded in a future call.

>>> op = UniformCrossover(p_xover=0.0, persist_children=True)
>>>
>>> next(op(select)) is first  # Create an iterator loop with op(select) and consume 1 individual
True
>>> next(op(select)) is second # Create a different iterator loop with op(select)
True

With persist_children set to False, the second child will not be yielded if the iterator is consumed an odd number of times. Instead, on the next call the loop is started anew.

>>> op = UniformCrossover(p_xover=0.0, persist_children=False)
>>>
>>> next(op(select)) is first  # Create an iterator loop with op(select) and consume 1 individual
True
>>> next(op(select)) is second # Create a different iterator loop with op(select)
False

Parameters

p_swap – how likely are we to swap each pair of genes when crossover is performed
p_xover (float) – the probability that crossover is performed in the first place
persist_children (bool) – whether unyielded children should persist between calls. This is useful for leap_ec.distrib.asynchronous.steady_state, where the pipeline may only produce one individual at a time.

Returns

a pipeline operator that returns two recombined individuals (with probability p_xover), or two unmodified individuals (with probability 1 - p_xover)

recombine(parent_a, parent_b): Perform recombination between two parents to produce two new individuals.

leap_ec.ops.clone(next_individual: Iterator = '__no__default__') → Iterator

clones and returns the next individual in the pipeline

The clone’s fitness is set to None, its parents are set to the individual from which it was cloned (i.e., the parent), and it is assigned its own UUID.

>>> from leap_ec.individual import Individual
>>> import numpy as np

Create a common decoder and problem for individuals.

>>> genome = np.array([1, 1])
>>> original = Individual(genome)

>>> cloned_generator = clone(iter([original]))

Parameters: next_individual – iterator for next individual to be cloned
Returns: copy of next_individual

leap_ec.ops.compute_expected_probability(expected_num_mutations: float, individual_genome: List) → float

Computed the probability of mutation based on the desired average expected mutation and genome length.

The equation here is \(p = 1/L * \texttt{expected_num_mutations}\). To see why this is correct, note that the number of mutations performed is characterized by a binomial distribution with \(n=L\) trials (one weighted “coin flip” per gene), and that the mean (expected_num_mutations) of a binomial distribution is given by \(n*p = expected_num_mutations\).

Parameters

expected_num_mutations – times individual is to be mutated on average
individual_genome – genome for which to compute the probability

Returns

the corresponding probability of mutation

leap_ec.ops.compute_population_values(population: ~typing.List, offset=0, exponent: int = 1, key=<function <lambda>>) → ndarray

Returns a list of values where the zero-point of the population is shifted and the values are scaled by exponentiation.

Parameters

population – the population to compute values from.
offset – the offset from zero. Specifying offset=’pop-min’ will use the population’s minimum value as the new zero-point. Defaults to 0.
exponent (int) – the power to which values are raised to. Defaults to 1.
key – a function that computes a metric based on an Individual.

Returns

a numpy array of values that have been shifted by offset and scaled by exponent corresponding to each individual in the population.

leap_ec.ops.concat_combine(collaborators)

Combine a list of individuals by concatenating their genomes.

You can choose whether this or some other function is used for combining collaborators by passing it into the CooperativeEvaluate constructor.

leap_ec.ops.const_evaluate(population: List = '__no__default__', value='__no__default__') → List

An evaluator that assigns a constant fitness to every individual.

This ignores the Problem associated with each individual for the purpose of assigning a constant fitness.

This is useful for algorithms that need to assign an arbitrary initial fitness value before using their normal evaluation method. Some forms of cooperative coevolution are an example.

leap_ec.ops.cyclic_selection(population: List = '__no__default__') → Iterator

Deterministically returns individuals in order, then shuffles the test_sequence, returns the individuals in that new order, and repeats this process.

>>> from leap_ec.individual import Individual
>>> from leap_ec.ops import cyclic_selection
>>> import numpy as np

>>> pop = [Individual(np.array([0, 0])),
...        Individual(np.array([0, 1]))]

>>> cyclic_selector = cyclic_selection(pop)

Parameters: population – from which to select
Returns: the next selected individual

leap_ec.ops.elitist_survival(offspring: List = '__no__default__', parents: List = '__no__default__', k: int = 1, key=None) → List

This allows k best parents to compete with the offspring.

>>> from leap_ec.individual import Individual
>>> from leap_ec.binary_rep.problems import MaxOnes
>>> import numpy as np
First, let’s make a “pretend” population of parents using the MaxOnes problem.
>>> pretend_parents = [Individual(np.array([0, 0, 0]), problem=MaxOnes()),
...                    Individual(np.array([1, 1, 1]), problem=MaxOnes())]
Then a “pretend” population of offspring. (Pretend in that we’re pretending that the offspring came from the parents.)
>>> pretend_offspring = [Individual(np.array([0, 0, 0]), problem=MaxOnes()),
...                      Individual(np.array([1, 1, 0]), problem=MaxOnes()),
...                      Individual(np.array([1, 0, 1]), problem=MaxOnes()),
...                      Individual(np.array([0, 1, 1]), problem=MaxOnes()),
...                      Individual(np.array([0, 0, 1]), problem=MaxOnes())]
We need to evaluate them to get their fitness to sort them for elitist_survival.
>>> pretend_parents = Individual.evaluate_population(pretend_parents)
>>> pretend_offspring = Individual.evaluate_population(pretend_offspring)
This will take the best parent, which has [1,1,1], and replace the worst offspring, which has [0,0,0] (because this is the MaxOnes problem) >>> survivors = elitist_survival(pretend_offspring, pretend_parents)
>>> assert pretend_parents[1] in survivors # yep, best parent is there
>>> assert pretend_offspring[0] not in survivors # worst guy isn't
We orginally ordered 5 offspring, so that’s what we better have. >>> assert len(survivors) == 5

Please note that the literature has a number of variations of elitism and other forms of overlapping generations. For example, this may be a good starting point:

De Jong, Kenneth A., and Jayshree Sarma. “Generation gaps revisited.” In Foundations of genetic algorithms, vol. 2, pp. 19-28. Elsevier, 1993.

Parameters

offspring – list of created offpring, probably from pool()
parents – list of parents, usually the ones that offspring came from
k – how many elites from parents to keep?
key – optional key criteria for selecting; e.g., can be used to impose parsimony pressure

Returns

surviving population, which will be offspring with offspring replaced by any superior parent elites

leap_ec.ops.evaluate(next_individual: Iterator = '__no__default__') → Iterator

Evaluate and returns the next individual in the pipeline

>>> from leap_ec.individual import Individual
>>> from leap_ec.decoder import IdentityDecoder
>>> from leap_ec.binary_rep.problems import MaxOnes
>>> import numpy as np

We need to specify the decoder and problem so that evaluation is possible.

>>> genome = np.array([1, 1])
>>> ind = Individual(genome, decoder=IdentityDecoder(), problem=MaxOnes())

>>> evaluated_ind = next(evaluate(iter([ind])))

Parameters

next_individual – iterator pointing to next individual to be evaluated
kwargs – contains optional context state to pass down the pipeline in context dictionaries

Returns

the evaluated individual

leap_ec.ops.grouped_evaluate(population: list = '__no__default__', max_individuals_per_chunk: int = None) → list

Evaluate the population by sending groups of multiple individuals to a fitness function so they can be evaluated simultaneously.

This is useful, for example, as a way to evaluate individuals in parallel on a GPU.

leap_ec.ops.insertion_selection(offspring: List = '__no__default__', parents: List = '__no__default__', key=None) → List

do exclusive selection between offspring and parents

This is typically used for Ken De Jong’s EV algorithm for survival selection. Each offspring is deterministically selected and a random parent is selected; if the offspring wins, then it replaces the parent.

Note that we make a _copy_ of the parents and have the offspring compete with the parent copies so that users can optionally preserve the original parents. You may wish to do that, for example, if you want to analyze the composition of the original parents and the modified copy.

Parameters

offspring – population to select from
parents – parents that are copied and which the copies are potentially updated with better offspring
key – optional key for determining max() by other criteria such as for parsimony pressure

Returns

the updated parent population

leap_ec.ops.iteriter_op(f)

This decorator wraps a function with runtime type checking to ensure that it always receives an iterator as its first argument, and that it returns an iterator.

We use this to make debugging operator pipelines easier in EAs: if you accidentally hook up, say an operator that outputs a list to an operator that expects an iterator, we’ll throw an exception that pinpoints the issue.

Parameters: function (f) – the function to wrap

leap_ec.ops.iterlist_op(f)

This decorator wraps a function with runtime type checking to ensure that it always receives an iterator as its first argument, and that it returns a list.

We use this to make debugging operator pipelines easier in EAs: if you accidentally hook up, say an operator that outputs a list to an operator that expects an iterator, we’ll throw an exception that pinpoints the issue.

Parameters: function (f) – the function to wrap

leap_ec.ops.listiter_op(f)

This decorator wraps a function with runtime type checking to ensure that it always receives a list as its first argument, and that it returns an iterator.

We use this to make debugging operator pipelines easier in EAs: if you accidentally hook up, say an operator that outputs an iterator to an operator that expects a list, we’ll throw an exception that pinpoints the issue.

Parameters: function (f) – the function to wrap

leap_ec.ops.listlist_op(f)

This decorator wraps a function with runtime type checking to ensure that it always receives a list as its first argument, and that it returns a list.

We use this to make debugging operator pipelines easier in EAs: if you accidentally hook up, say an operator that outputs an iterator to an operator that expects a list, we’ll throw an exception that pinpoints the issue.

Parameters: function (f) – the function to wrap

leap_ec.ops.migrate(topology, emigrant_selector, replacement_selector, migration_gap, customs_stamp=<function <lambda>>, metric=None, context={'leap': {'distrib': {'non_viable': 0}}})

A migration operator for use in island models.

This operator works with multi-population algorithms, and is thus meant to used with leap_ec.algorithm.multi_population_ea.

Specifically, it assumes that

the population argument passed into the returned function is a particular sub-population that we want to process “emigration” out of and “immigration” into,

the context state object contains an integer field context[‘leap’][‘generation’] indicating the current generation count of the algorithm, and

the context also contains a integer field context[‘leap’][‘current_subpopulation’] indicating the index of the subpopulation that is currently being processed in the overall collection of subpopulations (i.e. the one that population belongs to).

These assumptions are essentially what leap_ec.algorithm.multi_population_ea implements.

>>> import networkx as nx
>>> from leap_ec import ops, context
>>> from leap_ec.data import test_population
>>> pop0 = test_population[:]  # Shallow copy
>>> pop1 = test_population[:]

>>> op = migrate(topology=nx.complete_graph(2),
...              emigrant_selector=ops.tournament_selection,
...              replacement_selector=ops.random_selection,
...              migration_gap=50)
>>> context['leap']['generation'] = 0
>>> context['leap']['current_subpopulation'] = 0
>>> op(pop0)
[Individual<...>(...), Individual<...>(...), Individual<...>(...), Individual<...>(...)]

>>> context['leap']['current_subpopulation'] = 1
>>> op(pop1)
[Individual<...>(...), Individual<...>(...), Individual<...>(...), Individual<...>(...)]

This operator is a stateful closure: it maintains an internal list of all the out-going “emigrations” that occurred in the previous time step, so that it can process them as “immigrations” in the current time step.

Parameters

topology – a networkx topology defining the connectivity among islands
emigrant_selector – a selection operator for choosing individuals to leave an island
replacement_selector – a selection operator choosing contestants that will be replaced by an incoming immigrant if the immigrant has higher fitness
migration_gap (int) – migration will occur regularly after every migration_gap evolutionary steps
customs_stamp – an optional function to transfrom an individual upon its arrival to a new island. This can be used, for example, to change the individual’s decoder or problem in a heterogeneous island model.
metric – an optional function of the form f(generation, immigrant_individual, contestant_indidivudal, success) for recording information about migration events.
context – the context object to check for EA state, such as the current generation number, and the ID of the subpopulation that is currently being processed.

leap_ec.ops.migration_metric(stream, header: bool = True, notes: Optional[dict] = None)

Returns a function that can be used to record migration events.

The purpose of a migration metric is to record information about migrations that occur inside a migration operator. Because these events take place inside the operator (rather than across operators), they cannot be recorded by a LEAP pipeline probe.

In general, the interface for a migration metric function takes four parameters:

generation: the current generation

immigrant_ind: the individual that is attempting to migrate

contestant_ind: the individual that has been chosen to be replaced

success: True if the migration is successful, False otherwise

The metric included here records the fitness of both individuals and writes them (along with the generation and success values) to a CSV. You can write your own metric if you need to record other information (such as, say, genomes).

>>> import sys
>>> from leap_ec import Individual
>>> from leap_ec.binary_rep.problems import MaxOnes
>>> m = migration_metric(sys.stdout,
...                      header=True,
...                      notes={'run': 0, 'description': 'Test output'}
... )
run,description,generation,migrant_fitness,contestant_fitness,success

>>> ind1 = Individual(np.array([1, 1, 1]), problem=MaxOnes())
>>> f = ind1.evaluate()
>>> contestant = Individual(np.array([0, 1, 1]), problem=MaxOnes())
>>> f = contestant.evaluate()
>>> m(0, ind1, contestant, True)
0,Test output,0,3,2,True

Parameters

stream – file object to write the CSV data to
header (bool) – a CSV header will be written if True
notes (dict) – a dict specifying additional constant-value columns to include in the CSV output

leap_ec.ops.naive_cyclic_selection(population: List = '__no__default__', indices: List = None) → Iterator

Deterministically returns individuals, and repeats the same test_sequence when exhausted.

This is “naive” because it doesn’t shuffle the population between complete tours to minimize bias.

>>> from leap_ec.individual import Individual
>>> from leap_ec.ops import naive_cyclic_selection
>>> import numpy as np

>>> pop = [Individual(np.array([0, 0])),
...        Individual(np.array([0, 1]))]

>>> cyclic_selector = naive_cyclic_selection(pop)

Parameters: population – from which to select
Returns: the next selected individual

leap_ec.ops.pool(next_individual: Iterator = '__no__default__', size: int = '__no__default__') → List

‘Sink’ for creating size individuals from preceding pipeline source.

Allows for “pooling” individuals to be processed by next pipeline operator. Typically used to collect offspring from preceding set of selection and birth operators, but could also be used to, say, “pool” individuals to be passed to an EDA as a training set.

>>> from leap_ec.individual import Individual
>>> from leap_ec.ops import naive_cyclic_selection
>>> import numpy as np

>>> pop = [Individual(np.array([0, 0])),
...        Individual(np.array([0, 1]))]

>>> cyclic_selector = naive_cyclic_selection(pop)

>>> pool = pool(cyclic_selector, 3)

print(pool) [Individual([0, 0], None, None), Individual([0, 1], None, None), Individual([0, 0], None, None)]

Parameters

next_individual – generator for getting the next offspring
size – how many kids we want

Returns

population of size offspring

leap_ec.ops.proportional_selection(population: ~typing.List = '__no__default__', offset=0, exponent: int = 1, key=<function <lambda>>) → Iterator

Returns an individual from a population in direct proportion to their fitness or another given metric.

To deal with negative fitness values use offset=’pop-min’ or set a custom offset. A ValueError is thrown if the result of adding offset to a fitness value results in a negative number. The value of an individual is calculated as follows

value = (fitness + offset)^exponent

Parameters

population – the population to select from. Should be a list, not an iterator.
offset – the offset from zero. If negative fitness values are possible and the minimum is unknown use offest=’pop-min’ for an adaptive offset. Defaults to 0.
exponent (int) – the power to which fitness values are raised to. This can be tuned to increase or decrease selection pressure by creating larger or smaller differences between fitness values in the population. Defaults to 1.
key – a function that computes the metric used to compare individuals. Defaults to fitness.

Returns

a random individual based on the proportion of the given metric in the population.

>>> from leap_ec import Individual
>>> from leap_ec.binary_rep.problems import MaxOnes
>>> from leap_ec.ops import proportional_selection
>>> import numpy as np

>>> genome1 = np.array([0, 0, 0])
>>> genome2 = np.array([0, 0, 1])
>>> pop = [Individual(genome1, problem=MaxOnes()),
...        Individual(genome2, problem=MaxOnes())]
>>> pop = Individual.evaluate_population(pop)
>>> selected = proportional_selection(pop)

leap_ec.ops.random_bernoulli_vector(shape: Union[int, Tuple], p: float = 0.5) → ndarray

Generates a random vector of Boolean balues from a Bernoulli process—that is, from a sequence of weighted coin flips.

We use this function throughout LEAP because its implementation was found to be much faster than, say, just calling np.random.choice([0, 1]).

>>> from leap_ec.ops import random_bernoulli_vector
>>> random_bernoulli_vector(5, p=0.4)
array([..., ..., ..., ..., ...])

Parameters

shape – shape of the random vector—can be an integer or a tuple.
p – success probability of the bernoulli trials.

Returns

boolean numpy array

leap_ec.ops.random_selection(population: List = '__no__default__', indices=None) → Iterator

return a uniformly randomly selected individual from the population

Parameters: population – from which to select
Returns: a uniformly selected individual

leap_ec.ops.sus_selection(population: ~typing.List = '__no__default__', n=None, shuffle: bool = True, offset=0, exponent: int = 1, key=<function <lambda>>) → Iterator

Returns an individual from a population in proportion to their fitness or another given metric using the stochastic universal sampling algorithm.

To deal with negative fitness values use offset=’pop-min’ or set a custom offset. A ValueError is thrown if the result of adding offset to a fitness value results in a negative number. The value of an individual is calculated as follows

value = (fitness + offset)^exponent

Parameters

population – the population to select from. Should be a list, not an iterator.
n – the number of evenly spaced points to use in the algorithm. Default is None which uses len(population).
shuffle (bool) – if True, n points are resampled after one full pass over them. If False, selection repeats over the same n points. Defaults to True.
offset – the offset from zero. If negative fitness values are possible and the minimum is unknown use offset=’pop-min’ for an adaptive offset. Defaults to 0.
exponent (int) – the power to which fitness values are raised to. This can be tuned to increase or decrease selection pressure by creating larger or smaller differences between fitness values in the population. Defaults to 1.
key – a function that computes the metric used to compare individuals. Defaults to fitness.

Returns

a random individual based on the proportion of the given metric in the population.

>>> from leap_ec import Individual
>>> from leap_ec.binary_rep.problems import MaxOnes
>>> from leap_ec.ops import sus_selection
>>> import numpy as np

>>> genome1 = np.array([0, 0, 0])
>>> genome2 = np.array([1, 1, 1])
>>> pop = [Individual(genome1, problem=MaxOnes()),
...        Individual(genome2, problem=MaxOnes())]
>>> pop = Individual.evaluate_population(pop)
>>> selected = sus_selection(pop)

leap_ec.ops.tournament_selection(population: list = '__no__default__', k: int = 2, key=None, select_worst: bool = False, indices=None) → Iterator

Returns an operator that selects the best individual from k individuals randomly selected from the given population.

Like other selection operators, this assumes that if one individual is “greater than” another, then it is “better than” the other. Whether this indicates maximization or minimization isn’t handled here: the Individual class determines the semantics of its “greater than” operator.

Parameters

population – the population to select from. Should be a list, not an iterator.
k (int) – number of contestants in the tournament. k=2 does binary tournament selection, which approximates linear ranking selection in the expectation. Higher values of k yield greedier selection strategies—k=3, for instance, is equal to quadratic ranking selection in the expectation.
key – an optional function that computes keys to sort over. Defaults to None, in which case Individuals are compared directly.
select_worst (bool) – if True, select the worst individual from the tournament instead of the best.
indices (list) – an optional list that will be populated with the index of the selected individual.

Returns

the best of k individuals drawn from population

>>> from leap_ec import Individual
>>> from leap_ec.binary_rep.problems import MaxOnes
>>> from leap_ec.ops import tournament_selection
>>> import numpy as np

>>> pop = [Individual(np.array([0, 0, 0]), problem=MaxOnes()),
...        Individual(np.array([0, 0, 1]), problem=MaxOnes())]
>>> pop = Individual.evaluate_population(pop)
>>> best = tournament_selection(pop)

leap_ec.ops.truncation_selection(offspring: List = '__no__default__', size: int = '__no__default__', parents: List = None, key=None) → List

return the size best individuals from the given population

This defaults to (mu, lambda) if parents is not given.

>>> from leap_ec.individual import Individual
>>> from leap_ec.binary_rep.problems import MaxOnes
>>> from leap_ec.ops import truncation_selection
>>> import numpy as np

>>> pop = [Individual(np.array([0, 0, 0]), problem=MaxOnes()),
...        Individual(np.array([0, 0, 1]), problem=MaxOnes()),
...        Individual(np.array([1, 1, 0]), problem=MaxOnes()),
...        Individual(np.array([1, 1, 1]), problem=MaxOnes())]

We need to evaluate them to get their fitness to sort them for truncation.

>>> pop = Individual.evaluate_population(pop)

>>> truncated = truncation_selection(pop, 2)

TODO Do we want an optional context to over-ride the ‘parents’ parameter?

Parameters

offspring – offspring to truncate down to a smaller population
size – is what to resize population to
second_population – is optional parent population to include with population for downsizing

Returns

truncated population

Pipeline operators for binary representations

Binary representation specific pipeline operators.

leap_ec.binary_rep.ops.genome_mutate_bitflip(genome: ndarray = '__no__default__', expected_num_mutations: float = None, probability: float = None) → ndarray

Perform bitflip mutation on a particular genome.

This function can be used by more complex operators to mutate a full population (as in mutate_bitflip), to work with genome segments (as in leap_ec.segmented.ops.apply_mutation), etc. This way we don’t have to copy-and-paste the same code for related operators.

Parameters

genome – of binary digits that we will be mutating
expected_num_mutations – on average how many mutations are we expecting?

Returns

mutated genome

leap_ec.binary_rep.ops.mutate_bitflip(next_individual: Iterator = '__no__default__', expected_num_mutations: float = None, probability: float = None) → Iterator

Perform bit-flip mutation on each individual in an iterator (population).

This assumes that the genomes have a binary representation.

>>> from leap_ec.individual import Individual
>>> from leap_ec.binary_rep.ops import mutate_bitflip
>>> import numpy as np

>>> original = Individual(np.array([1, 1]))
>>> op = mutate_bitflip(expected_num_mutations=1)
>>> pop = iter([original])
>>> mutated = next(op(pop))

Parameters

next_individual – to be mutated
expected_num_mutations – on average how many mutations done (specificy either this or probability, but not both)
probability – the probability of mutating any given gene (specificy either this or expected_num_mutations, but not both)

Returns

mutated individual

leap_ec.binary_rep.ops.random() → x in the interval [0, 1).

Pipeline operators for real-valued representations

Pipeline operators for real-valued representations

leap_ec.real_rep.ops.apply_hard_bounds(genome, hard_bounds)

A helper that ensures that every gene is contained within the given bounds.

Parameters

genome – list of values to apply bounds to.
hard_bounds – if a (low, high) tuple, the same bounds will be used for every gene. If a list of tuples is given, then the ith bounds will be applied to the ith gene.

Both sides of the range are inclusive:

>>> genome = np.array([0, 10, 20, 30, 40, 50])
>>> apply_hard_bounds(genome, hard_bounds=(20, 40))
array([20, 20, 20, 30, 40, 40])

Different bounds can be used for each locus by passing in a list of tuples:

>>> bounds= [ (0, 1), (0, 1), (50, 100), (50, 100), (0, 100), (0, 10) ]
>>> apply_hard_bounds(genome, hard_bounds=bounds)
array([ 0,  1, 50, 50, 40, 10])

leap_ec.real_rep.ops.genome_mutate_gaussian(genome='__no__default__', std: float = '__no__default__', expected_num_mutations='__no__default__', bounds: Tuple[float, float] = (-inf, inf), transform_slope: float = 1.0, transform_intercept: float = 0.0)

Perform Gaussian mutation directly on real-valued genes (rather than on an Individual).

This used to be inside mutate_gaussian, but was moved outside it so that leap_ec.segmented.ops.apply_mutation could directly use this function, thus saving us from doing a copy-n-paste of the same code to the segmented sub-package.

Parameters

genome – of real-valued numbers that will potentially be mutated
std – the mutation width—either a single float that will be used for all genes, or a list of floats specifying the mutation width for each gene individually.
expected_num_mutations – on average how many mutations are expected

Returns

mutated genome

leap_ec.real_rep.ops.mutate_gaussian(next_individual: Iterator = '__no__default__', std='__no__default__', expected_num_mutations: Union[int, str] = None, bounds=(-inf, inf), transform_slope: float = 1.0, transform_intercept: float = 0.0) → Iterator

Mutate and return an Individual with a real-valued representation.

This operators on an iterator of Individuals:

>>> from leap_ec.individual import Individual
>>> from leap_ec.real_rep.ops import mutate_gaussian
>>> import numpy as np
>>> pop = iter([Individual(np.array([1.0, 0.0]))])

Mutation can either use the same parameters for all genes:

>>> op = mutate_gaussian(std=1.0, expected_num_mutations='isotropic', bounds=(-5, 5))
>>> mutated = next(op(pop))

Or we can specify the std and bounds independently for each gene:

>>> pop = iter([Individual(np.array([1.0, 0.0]))])
>>> op = mutate_gaussian(std=[0.5, 1.0],
...                      expected_num_mutations='isotropic',
...                      bounds=[(-1, 1), (-10, 10)]
... )
>>> mutated = next(op(pop))

Parameters

next_individual – to be mutated
std – standard deviation to be equally applied to all individuals; this can be a scalar value or a “shadow vector” of standard deviations
expected_num_mutations – if an int, the expected number of mutations per individual, on average. If ‘isotropic’, all genes will be mutated.
bounds – to clip for mutations; defaults to (- ∞, ∞)

Returns

a generator of mutated individuals.

Pipeline operators for segmented representations

Segmented representation specific pipeline operators.

leap_ec.segmented_rep.ops.add_segment(next_individual: Iterator = '__no__default__', seq_initializer: Callable = '__no__default__', probability: float = '__no__default__', append: bool = False) → Iterator

Possibly add a segment to the given individual

New segments can be always appended, or randomly inserted within the individual’s genome.

TODO add a parameter for accepting a function that will yield a distribution for the number of segments to be randomly inserted.

>>> from leap_ec.individual import Individual
>>> from leap_ec.binary_rep.initializers import create_binary_sequence
>>> import numpy as np
>>> original = Individual([np.array([0, 0]), np.array([1, 1])])
>>> mutated = next(add_segment(iter([original]),
...                seq_initializer=create_binary_sequence(2),
...                probability=1.0))

Parameters

next_individual – to possibly add a segment
seq_initializer – callable for initializing any new segments
probability – likelihood of adding a segment
append – if True, always append any new segments

Returns

yielded individual with a possible new segment

leap_ec.segmented_rep.ops.apply_mutation(next_individual: Iterator = '__no__default__', mutator: Callable[[list, float], list] = '__no__default__', expected_num_mutations: float = 1.0) → Iterator

This expects next_individual to have a segmented representation; i.e., a test_sequence of sequences. mutator will be applied to each sub-test_sequence with the expected probability. The expected probability applies to all the sequences, and defaults to a single mutation among all components, on average.

>>> from leap_ec.binary_rep.ops import genome_mutate_bitflip
>>> mutation_op = apply_mutation(mutator=genome_mutate_bitflip)
>>> import numpy as np

>>> from leap_ec.individual import Individual
>>> original = Individual(np.array([[0, 0], [1, 1]]))
>>> mutated = next(mutation_op(iter([original])))

Parameters

next_individual – to possibly mutate
mutator – function to be applied to each segment in the individual’s genome; first argument is a segment, the second the expected probability of mutating each segment element.
expected – expected mutations on average in [0.0,1.0]

Returns

yielded mutated individual

leap_ec.segmented_rep.ops.copy_segment(next_individual: Iterator = '__no__default__', probability: float = '__no__default__', append: bool = False) → Iterator

with a given probability, randomly select and copy a segment

>>> from leap_ec.individual import Individual
>>> import numpy as np
>>> original = Individual([np.array([0, 0])])
>>> mutated = next(copy_segment(iter([original]), probability=1.0))
>>> assert np.all(mutated.genome[0] == [0, 0])            and np.all(mutated.genome[1] == [0, 0])

param next_individual

to have a segment possibly removed

param probability

likelihood of doing this

param append

if True, always append any new segments

returns

the next individual

leap_ec.segmented_rep.ops.remove_segment(next_individual: Iterator = '__no__default__', probability: float = '__no__default__') → Iterator

for some chance, remove a segment

Nothing happens if the individual has a single segment; i.e., there is no chance for an empty individual to be returned.

>>> from leap_ec.individual import Individual
>>> import numpy as np
>>> original = Individual([np.array([0, 0]), np.array([1, 1])])
>>> mutated = next(remove_segment(iter([original]), probability=1.0))
>>> assert np.all(mutated.genome[0] == [0, 0])             or np.all(mutated.genome[0] == [1, 1])

param next_individual

to have a segment possibly removed

param probability

likelihood of removing a segment

returns

the next individual

leap_ec.segmented_rep.ops.segmented_mutate(next_individual: Iterator = '__no__default__', mutator_functions: list = '__no__default__'): A mutation operator that applies a different mutation operator to each segment of a segmented genome.