# Search-and-Scoring

Search-and-scoring approaches to learning Bayesian belief network (BBN) structures search over a space of candidate structures and score each one. The highest-scoring structure is typically the output.
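To make "scoring a candidate" concrete, here is a minimal sketch of one common score, the maximum-likelihood log-likelihood of a discrete dataset under a given structure. The function `log_likelihood`, the toy `rows`, and the two candidate structures are hypothetical illustrations; the source does not state which scoring function `pysparkbbn` uses.

```python
import math
from collections import Counter

def log_likelihood(rows, parents):
    """Log-likelihood of `rows` under a structure, using maximum-likelihood
    estimates of each P(child | parents).

    `parents` maps each variable to a tuple of its parent variables.
    """
    total = 0.0
    for child, pa in parents.items():
        # counts of (parent values, child value) and of parent values alone
        joint = Counter((tuple(r[p] for p in pa), r[child]) for r in rows)
        marginal = Counter(tuple(r[p] for p in pa) for r in rows)
        for (pa_vals, _), n in joint.items():
            total += n * math.log(n / marginal[pa_vals])
    return total

# toy data in which b always equals a
rows = [{"a": 0, "b": 0}] * 2 + [{"a": 1, "b": 1}] * 2

with_edge = log_likelihood(rows, {"a": (), "b": ("a",)})  # a -> b
no_edge = log_likelihood(rows, {"a": (), "b": ()})        # a and b independent

print(with_edge > no_edge)
```

The structure with the edge `a -> b` scores higher because it explains the dependence in the data. Raw likelihood never decreases when edges are added, so practical scores usually include a complexity penalty (as in BIC) to discourage overly dense structures.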

## Load data

Let's read our data into a Spark DataFrame (`SDF`).

```
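from pysparkbbn.discrete.data import DiscreteData

# `spark` is assumed to be an existing SparkSession (e.g. provided by the notebook)
sdf = spark.read.csv('hdfs://localhost/data-1479668986461.csv', header=True)
# wrap the Spark DataFrame for use by the structure learners
data = DiscreteData(sdf)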
```

## Genetic algorithm

We use a genetic algorithm (`GA`) as a search-and-scoring approach to learning BBN structures. In general, a GA has the following major steps.

- Initialization: a population of BBN structures is created
- Fitness: the population is scored according to a fitness function and filtered
- Crossover: two parents from the population undergo a crossover operation to produce two new offspring
- Mutation: each offspring undergoes a mutation operation

The fitness, crossover, and mutation steps are repeated until a maximum number of iterations is reached or the search converges (a higher-scoring BBN structure cannot be found).
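The loop described above can be sketched in plain Python. This is a toy illustration, not the `pysparkbbn` implementation. It assumes a fixed node ordering, so each bit switches one forward edge on or off and every candidate is acyclic. The function `toy_ga`, its parameters, and the stand-in fitness function `score` are all hypothetical.

```python
import random

def toy_ga(n_edges, fitness, pop_size=20, max_iters=50, patience=10,
           p_mutate=0.1, seed=0):
    """Toy GA over bit lists; bit i says whether candidate edge i is present."""
    rng = random.Random(seed)
    # Initialization: a random population of candidate structures
    population = [[rng.randint(0, 1) for _ in range(n_edges)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    stale = 0  # generations since the best score last improved
    for _ in range(max_iters):
        # Fitness: score the population and keep the top half as parents
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        offspring = []
        while len(offspring) < pop_size - len(parents):
            # Crossover: two parents swap tails at a random cut point,
            # producing two new offspring
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_edges)
            for child in (a[:cut] + b[cut:], b[:cut] + a[cut:]):
                # Mutation: flip each bit with probability p_mutate
                offspring.append([1 - bit if rng.random() < p_mutate else bit
                                  for bit in child])
        population = parents + offspring[: pop_size - len(parents)]
        champion = max(population, key=fitness)
        if fitness(champion) > fitness(best):
            best, stale = champion, 0
        else:
            stale += 1
            if stale >= patience:  # convergence: no recent improvement
                break
    return best

# candidate edges under the fixed ordering A < B < C
edges = [("A", "B"), ("A", "C"), ("B", "C")]
target = [1, 0, 1]  # hypothetical "true" structure A -> B -> C

def score(bits):
    # stand-in fitness: negative Hamming distance to the target structure
    return -sum(x != y for x, y in zip(bits, target))

best = toy_ga(len(edges), score)
print([edge for edge, bit in zip(edges, best) if bit])
```

In a real structure learner the fitness function would score each candidate against the data rather than against a known target, and the encoding would have to handle cycles if the node ordering were not fixed.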

```
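from pysparkbbn.discrete.ssslearn import Ga

# `sc` is assumed to be the active SparkContext; max_iters presumably caps the GA iterations
ga = Ga(data, sc, max_iters=3)
# retrieve the learned network structure
g = ga.get_network()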
```

```
%matplotlib inline
import matplotlib.pyplot as plt
import networkx as nx

fig, ax = plt.subplots(figsize=(5, 5))
nx.draw(g,
        with_labels=True,
        node_size=500,
        alpha=0.8,
        font_weight='bold',
        font_family='monospace',
        node_color='r',
        arrowsize=15,
        ax=ax)
```