Parsimony Pressure
One common problem with variable length representations is “bloat” whereby genome lengths will gradually increase over time. This may be due to over-fitting or the accumulation of “junk DNA” over time.
LEAP currently provides two approaches to mitigating bloat. First is a very simple “genome tax,” or penalty by genome length, popularized by Koza [Koz92]. The second is lexicographical parsimony, or “tie breaking parsimony,” where the individual with the shortest genome is returned if their respective fitnesses happen to be equivalent Luke and Panait [LP02].
API
Parsimony pressure functions.
These are intended to be used as key parameters for selection operators.
Provided are Koza-style parsimony pressure and lexicographic parsimony key functions.
- leap_ec.parsimony.koza_parsimony(ind='__no__default__', *, penalty='__no__default__')
Penalize fitness by genome length times a constant, in the style of Koza [Koz92].
>>> import toolz >>> from leap_ec.individual import Individual >>> from leap_ec.decoder import IdentityDecoder >>> from leap_ec.binary_rep.problems import MaxOnes >>> import leap_ec.ops as ops >>> import numpy as np >>> problem = MaxOnes() >>> pop = [Individual(np.array([0, 0, 0, 1, 1, 1]), problem=problem), ... Individual(np.array([0, 0]), problem=problem), ... Individual(np.array([1, 1]), problem=problem), ... Individual(np.array([1, 1, 1]), problem=problem)] >>> pop = Individual.evaluate_population(pop) >>> best, = ops.truncation_selection(pop, size=1) >>> print(best.genome, best.fitness) [0 0 0 1 1 1] 3
>>> best, = ops.truncation_selection(pop, size=1, key=koza_parsimony(penalty=.5)) >>> print(best.genome, best.fitness) [1 1 1] 3
\[f_p(x) = f(x) - cl(x)\]Where f(x) is original fitness, c is a penalty constant, and l(x) is the genome length.
- Parameters
ind – to be compared
penalty – for denoting penalty strength
- Returns
altered comparison criteria
- leap_ec.parsimony.lexical_parsimony(ind)
If two fitnesses are the same, break the tie with the smallest genome
This implements Lexicographical Parsimony Pressure [LP02], which is essentially where if the fitnesses of two individuals are close, then break the tie with the smallest genome.
>>> import toolz >>> from leap_ec.individual import Individual >>> from leap_ec.binary_rep.problems import MaxOnes >>> import leap_ec.ops as ops >>> import numpy as np >>> problem = MaxOnes() >>> pop = [Individual(np.array([0, 0, 0, 1, 1, 1]), problem=problem), ... Individual(np.array([0, 0]), problem=problem), ... Individual(np.array([1, 1]), problem=problem), ... Individual(np.array([1, 1, 1]), problem=problem)] >>> pop = Individual.evaluate_population(pop) >>> best, = ops.truncation_selection(pop, size=1) >>> print(best.genome, best.fitness) [0 0 0 1 1 1] 3
>>> best, = ops.truncation_selection(pop, size=1, key=lexical_parsimony) >>> print(best.genome, best.fitness) [1 1 1] 3
- Parameters
ind – to be compared
- Returns
altered comparison criteria