Computing the log-likelihood of data for a Bayesian network

Computing the log-likelihood of some data for a given Bayesian network is a fundamental task in many inference tasks, whether Bayesian (predictive/posterior log-likelihood) or not (log-likelihood loss in cross-validation). bnlearn provides a logik() method for bn.fit objects. In its most basic form, it takes just a fitted Bayesian network and the data frame containing the data and returns the log-likelihood of the whole sample:

> library(bnlearn)
> training.set = learning.test[1:4000, ]
> test.set = learning.test[4001:5000, ]
> dag = model2network("[A][C][F][B|A][D|A:C][E|B:F]")
> bn = bn.fit(dag, training.set)
> logLik(bn, test.set)
[1] -4780.369

However, logLik() can also compute the log-likelihood of the data for a subset of the nodes in the network specified by the nodes argument.

> logLik(bn, test.set, nodes = c("A", "C", "F"), debug = TRUE)
> computing the log-likelihood of a discrete network.
* processing node A.
  > 1000 locally-complete observations out of 1000.
  > log-likelihood is -1098.737553.
* processing node C.
  > 1000 locally-complete observations out of 1000.
  > log-likelihood is -704.801355.
* processing node F.
  > 1000 locally-complete observations out of 1000.
  > log-likelihood is -693.493696.
[1] -2497.033

And it can return the log-likelihood for the individual observations in the data instead of summing it up over the whole sample (for all the nodes in the network or the nodes specified in the nodes argument).

> summary(logLik(bn, test.set, nodes = c("A", "C", "F"), by.sample = TRUE, debug = TRUE))
> computing the log-likelihood of a discrete network.
* processing node A.
* processing node C.
* processing node F.
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 -4.784  -3.356  -2.102  -2.497  -2.080  -2.068

logLik() returns -Inf if the data have probability or density equal to zero, which typically happens if the model is singular. For discrete nodes, if there are probabilities equal to zero in their conditional probability tables for the values observed in the data:

> dag = model2network("[A][B|A]")
> dists = list(
+   A = matrix(c(0.4, 0.6), ncol = 2, dimnames = list(NULL, c("LOW", "HIGH"))),
+   B = matrix(c(0.8, 0.2, 0, 1), ncol = 2,
+              dimnames = list(c("GOOD", "BAD"), c("LOW", "HIGH")))
+ )
> bnD = custom.fit(dag, dists)
> dataD = data.frame(
+   A = factor(c("LOW", "LOW", "HIGH", "HIGH"), levels = c("LOW", "HIGH")),
+   B = factor(c("GOOD", "BAD", "GOOD", "BAD"), c("GOOD", "BAD"))
+ )
> cbind(dataD, loglik = logLik(bnD, dataD, by.sample = TRUE))
     A    B     loglik
1  LOW GOOD -1.1394343
2  LOW  BAD -2.5257286
3 HIGH GOOD       -Inf
4 HIGH  BAD -0.5108256

For Gaussian and conditional Gaussian nodes, if their distribution is singular and it has a standard error equal to zero:

> dag = model2network("[A][B|A]")
> dists = list(
+   A = list(coef = c("(Intercept)" = 2), sd = 1),
+   B = list(coef = c("(Intercept)" = 2, A = 3), sd = 0)
+ )
> bnG = custom.fit(dag, dists)
> 
> dataG = data.frame(A = rnorm(4), B = rnorm(4))
> cbind(dataG, loglik = logLik(bnG, dataG, by.sample = TRUE))
            A          B loglik
1 -0.54521158 -0.6348741   -Inf
2 -1.94930138 -1.0533841   -Inf
3  0.04507692 -1.4910099   -Inf
4 -0.85922770  1.4223929   -Inf

However, logLik() returns +Inf for those obsevations that coincide with the expected value of the singular distribution.

> spike = data.frame(A = 1, B = 2 + 3 * 1)
> cbind(dataG, loglik = logLik(bnG, spike, by.sample = TRUE))
            A          B loglik
1 -0.54521158 -0.6348741    Inf
2 -1.94930138 -1.0533841    Inf
3  0.04507692 -1.4910099    Inf
4 -0.85922770  1.4223929    Inf

logLik() returns NA if any of the parameters of the Bayesian network that are involved in the computation of the log-likelihood are equal to NA, which may happen when they are estimated by maximum likelihood.

> cl = class(bnD)
> class(bnD) = "list"
> bnD$B$prob[, "HIGH"] = NA
> class(bnD) = cl
> 
> cbind(dataD, loglik = logLik(bnD, dataD, by.sample = TRUE))
     A    B    loglik
1  LOW GOOD -1.139434
2  LOW  BAD -2.525729
3 HIGH GOOD        NA
4 HIGH  BAD        NA
> cl = class(bnG)
> class(bnG) = "list"
> bnG$B$coefficients["A"] = NA
> class(bnG) = cl
> 
> cbind(dataG, loglik = logLik(bnG, dataG, by.sample = TRUE))
            A          B loglik
1 -0.54521158 -0.6348741     NA
2 -1.94930138 -1.0533841     NA
3  0.04507692 -1.4910099     NA
4 -0.85922770  1.4223929     NA

The behaviour of logLik() for data with missing values is described here.

Last updated on Sat Feb 17 23:43:15 2024 with bnlearn 5.0-20240208 and R version 4.3.2 (2023-10-31).