Gett’n Jiggy with Genetic Algorithms

While at the airport reading “The Singularity is Near: 2045,” Ray Kurzweil persuaded me to code up a simple genetic algorithm (GA).  The concept of GAs are pretty simply: start with a set of solutions and allow the best ones to survive and have offspring.

A simple application is in airline routing problems.  Airlines have finite resources and must choose which flights to take in order to maximize their profits (airlines are an extreme example because of how low of margins they operate under).

In this post, we’ll walk through how to program a genetic algorithm in python to solve a linear maximization problem with linear constraints.

Solutions

We let a solution take the form \textbf{x} = (x_0,x_1,...,x_n)^T, and a population or set of k solutions takes the form \mathbb{P}=\{\textbf{x}_1, ... ,\textbf{x}_k \}. The analogy to genetics is that a solution is like a strand of DNA, and just like each gene can take a state, each element of our solution vector can take a state. In this example, we restrict our states to be binary so x_i \in \{0,1\}.

 #generates a 1D array (solution) of zeros and ones with probability p
#
#input: random 1d array with values in [0,1)
#output: converts values to 1 if element < p
def init_1D_p(x,p=0.5):
    for i in range(len(x)):
        if x[i] < p:
            x[i] = 1
        else:
            x[i]=0
    return x

#initializes k 1D solutions of length n
#
#output: row i is solution i
def init_solns_1D(k,n,p=0.5):
    solns = np.zeros((k,n))
    for i in range(k):
        solns[i] = init_1D_p(np.random.rand(n),p)
    return solns

Objective

In general, we will wish to either maximize or minimize some objective function that takes a solution as an input. We will assume that we wish to maximize (the assumption to maximize isn’t important since we can just maximze the negative of an objective function if we wanted the minimization)  a linear function

\max \textbf{o}^T \textbf{x}

where \textbf{o} \in \mathbb{R}^n.

#x: solutions with each row as a solution
#o: coefficients of objective function
def Linear_Objective(x):
    o = np.array([1,2,3])
    return np.dot(x,o)

Constraints

The solution will have to be feasible, meaning that it must satisfy some constraints.

feasibleregion.png
Solutions must exist in a feasible region (image source)

We will deal with just one linear constraint

\textbf{c}^T \textbf{x} \leq b

In our airline example, the coefficient of c_i could be the amount of resource flight i is using, b is the total amount of resources, and o_i is the amount of profit for flight i

#x: solutions with each row as a solution
#c: coefficients so c_i is constraint coefficient for x_i
#ub: the upperbound
#
#return: boolean value; true if contraint satisfied
def Constraint1_UpperBound(x,c,ub):
    return np.dot(x,c)<ub 

Mating

An offspring, \textbf{d} is created by mating two solutions, \textbf{y},\textbf{z} \in P, so that the kth element of the offspring is

d_k =  \begin{cases} y_k & p\leq 0.5 \\ x_k & p > 0.5 \end{cases}

where p \sim U(0,1).

mate
The offspring solution is a mixture of both parent solutions (image source)
#mates x1 and x2 and randomizes elements from child < p
def mate_1D(x1,x2):
    rand_mate=np.random.rand(len(x1))
    result= np.zeros(len(x1))
    for i in range(len(x1)):
        if rand_mate[i] < 0.5:
            result[i] = x1[i]
        else:
            result[i] = x2[i]
    return result

Mutations

After all evaluations have been made and offspring have replaced failed solutions, we introduce random mutations to each element of each member in the population.

mutation.png
DNA example of a mutation in the genome (image source)

We let each element of a solution mutate, or switch states, with probability p_{\text{mutate}}.

#applies random mutations to elements of x with probability p
def random_mutation_1D(x,p=.01):
    import random
    rand_mate=np.random.rand(len(x))
    for i in range(len(x)):
        if rand_mate[i]<p:
            x[i] = random.sample([0,1],1)[0]
    return x

Process and Simulations

We can view implementing our genetic algorithm as a process, where the state of each element of our process is the \mathbb{P}. To get the next generation of solutions (or next state in our process), we take the current population, evaluate all the members via the objective function and constraints, apply mating, and finally apply mutations; we keep doing this for as long as many iterations as we’d like our simulation to last. The remainder of this section goes through the details of coding up the simulation.

We will simulate a generation by evaluating each member of the population via the objective function and the constraints.

# return: booleans that are true if the index contains a working solution
def getWorkingSolutionIndices(constraint_results):
    #results = np.array(sum(constraint_results))
    results = []
    for i in range(len(constraint_results)):
        if constraint_results[i] == True:
            results.append(i)
    return np.array(results)

If the member does not satisfy all constraints, we will replace it by offspring of two members of the current generation that do satisfy the constraints; additionally, we will pick these members so that solutions that have a better objective measure have a greater probability of being chosen for mating. To do this, we create a probability mass function (pmf) where the probability of selecting a solution is proportional to how well it did on the objective function.

#returns a probability distribution with weighted values
#corresponding to objective solutions that satisfy constraints
def getPDF(solns,working_soln_ind,obj_results):
    total = sum(obj_results[working_soln_ind])
    prob = np.zeros(len(solns))
    for i in working_soln_ind:
        prob[i]=obj_results[i]/total
    return prob

Now, to sample from this pmf we get the cumulative distribution function (cmf) and generate a uniform random variable.

#input:a probability mass function
#output: corresponding cdf
def getCDF(pmf):
    cdf = np.zeros(len(pmf))
    running_total = 0
    for i,p in enumerate(pmf):
        running_total += p
        cdf[i] = running_total
    return cdf

#samples pmf once and returns the index of the random sample
def sample_p_dist(pmf):
    cdf = getCDF(pmf)
    rand_num = random.random()
    return np.argmax(cdf==cdf[np.argmax(cdf>rand_num)])

Since the big picture idea of applying a genetic algorithm is to find a good solution to our objective functions, we should store each feasible solution and its objective value.

def updateSolnToResult(solns,dictionary,objective_function,obj_coef):
    for soln in solns:
        dictionary[str(soln)]=objective_function(soln,obj_coef)
    return dictionary

Now for simulating our algorithm we set some hyper-parameters and make sure to store the good results.

#reinitialize the solns
solnToResult={}
solns = init_solns_1D(num_solns,3)

#implementation that cares about satisfying constraint and maximizing our function
for _ in range(num_steps):
    new_solns = np.zeros((num_solns,3))
    obj_results = Linear_Objective(solns) #get objective value for each solution
    c1_results = Constraint1_UpperBound(solns) #check constraints

    #get working solution indices
    working_soln_ind = getWorkingSolutionIndices(c1_results)

    #get pmf of distribution to sample from
    prob = getPDF(solns,working_soln_ind,obj_results)

    #replace failed solutions with randmized ones
    for i in range(num_solns):
        if c1_results[i] == True:
            updateSolnToResult(solns[i],solnToResult,Linear_Objective)
            if(len(working_soln_ind)>=2): #check if mating is possible
                #pick random solution to mate with according to the pmf from weighted obj results
                mate_index=sample_p_dist(prob)
                offspring = mate_1D(solns[i],solns[mate_index])
            else:
                offspring = solns[i]
        else: #replace it with a some working solutions
            if(len(working_soln_ind)<=2): #check if mating is possible
                mate_index1,mate_index2 = sample_p_dist(prob),sample_p_dist(prob)
                offspring = mate_1D(solns[mate_index1],solns[mate_index2])
            else: #replace it with a some random solutions
                offspring = init_1D_p(np.random.rand(3))
        new_solns[i] = offspring

    solns = mutate(new_solns) #reset the solution set and mutate

print solns

Conclusion

That’s all there is to it! This was a simple example, but if you need to program a genetic algorithm for more complicated problems, you should be able to follow this basic outline. To view a notebook with all the code, check out my github.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s