Full Example

We have covered how to learn the structure and the parameters of a Bayesian Belief Network (BBN). Now let’s combine the learned structure and parameters into a BBN and use that BBN for exact inference.

Load data

Let’s read our data into a Spark DataFrame (SDF) and wrap it in a DiscreteData object.

[1]:
from pysparkbbn.discrete.data import DiscreteData

sdf = spark.read.csv('hdfs://localhost/data-1479668986461.csv', header=True)
data = DiscreteData(sdf)

Structure learning

Let’s use the naive Bayes algorithm to learn the structure, with n3 as the class variable.

[2]:
from pysparkbbn.discrete.scblearn import Naive

naive = Naive(data, 'n3')
g = naive.get_network()
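To make the shape of a naive Bayes structure concrete, here is a minimal sketch in plain Python. It is not how pysparkbbn builds the graph; the helper `naive_bayes_edges` is hypothetical, and the variable names n1 to n5 are taken from the outputs shown later on this page. The idea is that the class variable becomes the single parent of every other variable.

```python
def naive_bayes_edges(class_var, variables):
    """Return directed edges (parent, child) of a naive Bayes structure.

    Hypothetical helper for illustration only: the class variable
    points to every other variable, and there are no other edges.
    """
    return [(class_var, v) for v in variables if v != class_var]


# Variable names as they appear in the outputs on this page.
variables = ['n1', 'n2', 'n3', 'n4', 'n5']
edges = naive_bayes_edges('n3', variables)

for parent, child in edges:
    print(f'{parent} -> {child}')
```

With n3 as the class variable, the sketch yields four edges, one from n3 to each of the remaining variables.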

Parameter learning

Once we have a structure, we can learn the parameters, which are the (conditional) probabilities of each node given its parents.

[3]:
from pysparkbbn.discrete.plearn import ParamLearner
import json

param_learner = ParamLearner(data, g)
p = param_learner.get_params()

print(json.dumps(p, indent=2))
{
  "n3": [
    0.47345,
    0.52655
  ],
  "n1": [
    0.8588024078572183,
    0.1411975921427817,
    0.6534042351153737,
    0.34659576488462635
  ],
  "n2": [
    0.8773893758580632,
    0.12261062414193685,
    0.44725097331687397,
    0.552749026683126
  ],
  "n4": [
    0.667546731439434,
    0.33245326856056606,
    0.1642768967809325,
    0.8357231032190675
  ],
  "n5": [
    0.29675784137712535,
    0.4307741049741261,
    0.27246805364874854,
    0.29503370999905043,
    0.17909030481435761,
    0.525875985186592
  ]
}
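The parameters come back as one flat list per node. A minimal sketch of how such a list might be read, assuming each consecutive block of values is one conditional distribution over the node's values for one parent state. This layout is an assumption inferred from the numbers (each block sums to 1), not something stated by the library, and `to_rows` is a hypothetical helper.

```python
def to_rows(flat, n_values):
    """Split a flat parameter list into rows of length n_values.

    Hypothetical helper: assumes each consecutive block of n_values
    entries is one conditional distribution (one parent state).
    """
    return [flat[i:i + n_values] for i in range(0, len(flat), n_values)]


# Values copied from the "n5" entry printed above.
n5 = [
    0.29675784137712535, 0.4307741049741261, 0.27246805364874854,
    0.29503370999905043, 0.17909030481435761, 0.525875985186592,
]

# n5 appears to take three values, so each row should have three entries.
rows = to_rows(n5, 3)
for i, row in enumerate(rows):
    print(f'parent state {i}: {row} (sum={sum(row):.6f})')
```

Under this reading, n5 has two rows, one per state of its parent n3, and each row sums to 1 up to floating-point error.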

BBN

Now that we have the structure and parameters, we can build a BBN. The get_bbn utility function brings together the structure, the parameters, and the data profile.

[4]:
from pysparkbbn.discrete.bbn import get_bbn

bbn = get_bbn(g, p, data.get_profile())

Inference

With the BBN defined, we can use py-bbn to perform exact inference.

[5]:
from pybbn.pptc.inferencecontroller import InferenceController

join_tree = InferenceController.apply(bbn)

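# print the marginal posterior of every node; use a new name so the
# learned parameters in `p` are not overwritten
for node, posteriors in join_tree.get_posteriors().items():
    s = ', '.join([f'{val}={prob:.5f}' for val, prob in posteriors.items()])
    print(f'{node} : {s}')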
n3 : f=0.47345, t=0.52655
n4 : f=0.40255, t=0.59745
n2 : f=0.65090, t=0.34910
n1 : f=0.75065, t=0.24935
n5 : maybe=0.29585, no=0.29825, yes=0.40590
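Because the network is a naive Bayes structure, these marginals can be checked by hand. The sketch below uses plain Python rather than py-bbn, with the n3 and n1 parameters copied from the parameter learning output. It assumes the four n1 values are ordered as P(n1=f|n3=f), P(n1=t|n3=f), P(n1=f|n3=t), P(n1=t|n3=t); that ordering is an assumption, and the dictionary names are illustrative.

```python
# Parameters copied from the parameter learning output above.
p_n3 = {'f': 0.47345, 't': 0.52655}

# Assumed layout: p_n1_given_n3[n3_value][n1_value].
p_n1_given_n3 = {
    'f': {'f': 0.8588024078572183, 't': 0.1411975921427817},
    't': {'f': 0.6534042351153737, 't': 0.34659576488462635},
}

# Law of total probability: P(n1) = sum over n3 of P(n3) * P(n1 | n3).
marginal_n1 = {
    v: sum(p_n3[c] * p_n1_given_n3[c][v] for c in p_n3)
    for v in ('f', 't')
}
print({v: round(prob, 5) for v, prob in marginal_n1.items()})

# Bayes' rule: P(n3 | n1 = t) is proportional to P(n3) * P(n1 = t | n3).
joint = {c: p_n3[c] * p_n1_given_n3[c]['t'] for c in p_n3}
total = sum(joint.values())
posterior_n3 = {c: joint[c] / total for c in joint}
print({c: round(prob, 5) for c, prob in posterior_n3.items()})
```

The first print should reproduce the n1 line above (f=0.75065, t=0.24935), which supports the assumed ordering. The second print shows how observing n1=t would shift the belief about n3, the kind of update the join tree performs when evidence is set.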