# Search-and-Scoring

Search-and-scoring approaches to learning Bayesian belief network (BBN) structures search over a space of candidate structures and score each one. The highest-scoring structure is typically returned as the output.
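
To make the idea concrete, here is a minimal sketch in plain Python that scores three hand-written candidate structures on a toy dataset and keeps the best one. The BIC score, the toy rows and the names `bic` and `candidates` are assumptions chosen for illustration; they are not taken from pysparkbbn, whose scoring function is not described here.

```python
import math
from collections import Counter

# Toy binary data (assumption): b always copies a, c is independent of both.
rows = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)] * 5
names = ("a", "b", "c")


def bic(structure):
    """BIC score of a structure given as {child: tuple_of_parent_names}."""
    n = len(rows)
    score = 0.0
    for child, parents in structure.items():
        ci = names.index(child)
        pis = [names.index(p) for p in parents]
        # Counts of (parent configuration, child value) and of parent configuration
        joint = Counter((tuple(r[i] for i in pis), r[ci]) for r in rows)
        marg = Counter(tuple(r[i] for i in pis) for r in rows)
        # Maximized log-likelihood of the child given its parents
        score += sum(cnt * math.log(cnt / marg[pa]) for (pa, _), cnt in joint.items())
        # Complexity penalty: a binary child with k binary parents has 2**k free parameters
        score -= 0.5 * math.log(n) * (2 ** len(parents))
    return score


# The "search" here is just a fixed list of candidates
candidates = {
    "empty": {"a": (), "b": (), "c": ()},
    "a->b": {"a": (), "b": ("a",), "c": ()},
    "a->b, a->c": {"a": (), "b": ("a",), "c": ("a",)},
}

scores = {name: bic(structure) for name, structure in candidates.items()}
best = max(scores, key=scores.get)
print(best)  # prints a->b
```

The structure with the single arc a->b wins because it explains the dependence between a and b, while the extra arc a->c adds parameters without improving the likelihood. A real search would generate candidates automatically instead of enumerating them by hand.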

Let’s assume our data has been read into a Spark DataFrame, sdf, and wrap it in a DiscreteData object.

[1]:

from pysparkbbn.discrete.data import DiscreteData

# sdf is the Spark DataFrame holding the data; it must already be loaded
data = DiscreteData(sdf)


## Genetic algorithm

We use a genetic algorithm (GA) as a search-and-scoring approach to learning BBN structures. In general, a GA has the following major steps.

• Initialization: an initial population of BBN structures is generated

• Fitness: the population is scored according to a fitness function and filtered

• Crossover: two parents from the population undergo a crossover operation to produce two new offspring

• Mutation: each offspring undergoes a mutation operation

The fitness, crossover and mutation steps are repeated until a maximum number of iterations is reached or the search converges (a higher-scoring BBN structure cannot be discovered).
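
The loop described above can be sketched in plain Python as follows. This is a toy illustration under stated assumptions, not the implementation behind `Ga`: each structure is encoded as a tuple of 0/1 flags over the six possible arcs of a four-node graph with a fixed node ordering (which keeps every candidate acyclic), and the fitness function is a stand-in that counts agreement with a hypothetical `TARGET` structure rather than scoring against data. The names `run_ga`, `patience` and `TARGET` are all hypothetical.

```python
import random

rng = random.Random(0)        # fixed seed so the sketch is repeatable
N_EDGES = 6                   # possible arcs among 4 nodes under a fixed ordering
TARGET = (1, 0, 1, 0, 0, 1)   # hypothetical "true" structure, used only by the toy fitness


def fitness(ind):
    # Stand-in for a data-driven score: number of arc slots that agree with TARGET
    return sum(a == b for a, b in zip(ind, TARGET))


def crossover(p1, p2):
    # Single-point crossover: two parents produce two offspring
    cut = rng.randrange(1, N_EDGES)
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]


def mutate(ind, rate=0.1):
    # Flip each arc flag independently with probability `rate`
    return tuple(1 - g if rng.random() < rate else g for g in ind)


def run_ga(pop_size=10, max_iters=20, patience=5):
    # Initialization: a random population of structures
    pop = [tuple(rng.randint(0, 1) for _ in range(N_EDGES)) for _ in range(pop_size)]
    best, stale = max(pop, key=fitness), 0
    for _ in range(max_iters):
        # Fitness: score the population and keep the top half
        survivors = sorted(pop, key=fitness, reverse=True)[: pop_size // 2]
        # Crossover and mutation: refill the population with mutated offspring
        children = []
        while len(survivors) + len(children) < pop_size:
            p1, p2 = rng.sample(survivors, 2)
            children.extend(mutate(child) for child in crossover(p1, p2))
        pop = (survivors + children)[:pop_size]
        # Convergence: stop when no better structure has appeared for `patience` rounds
        gen_best = max(pop, key=fitness)
        if fitness(gen_best) > fitness(best):
            best, stale = gen_best, 0
        else:
            stale += 1
            if stale >= patience:
                break
    return best


best = run_ga()
print(best, fitness(best))
```

Because the survivors are carried over unchanged, the best structure found so far is never lost between iterations. In a real structure learner the fitness function would score each candidate against the data.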

[2]:

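from pysparkbbn.discrete.ssslearn import Ga

# sc is presumably the active SparkContext; max_iters caps the number of GA iterations
ga = Ga(data, sc, max_iters=3)
# g is the learned network structure
g = ga.get_network()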

[4]:

%matplotlib inline
import matplotlib.pyplot as plt
import networkx as nx

fig, ax = plt.subplots(figsize=(5, 5))

nx.draw(g,
        with_labels=True,
        node_size=500,
        alpha=0.8,
        font_weight='bold',
        font_family='monospace',
        node_color='r',
        arrowsize=15,
        ax=ax)