## References

[1] C. K. Chow and C. N. Liu. Approximating Discrete Probability Distributions With Dependence
Trees. IEEE Transactions on Information Theory, IT-14:462-467, 1968.

[2] T. W. Schoener. The Anolis Lizards of Bimini: Resource Partitioning in a Complex Fauna.
Ecology, 49(4):704-726, 1968.

[3] K. V. Mardia, J. T. Kent, and J. M. Bibby. Multivariate Analysis. Academic Press, 1979.

[4] S. E. Fienberg. The Analysis of Cross-Classified Categorical Data. Springer, 2nd
edition, 1980.

[5] Z. Reinis, J. Pokorny, V. Basika, J. Tiserova, K. Gorican, D. Horakova, E. Stuchlikova, T.
Havranek, and F. Hrabovsky. Prognostic Significance of the Risk Profile in the Prevention of Coronary Heart Disease.
Bratisl Lek Listy, 76:137-150, 1981.

[6] S. L. Lauritzen and D. J. Spiegelhalter. Local Computation with Probabilities on Graphical
Structures and their Application to Expert Systems (with discussion). Journal of the Royal Statistical Society:
Series B (Statistical Methodology), 50(2):157-224, 1988.

[7] J. Pearl. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible
Inference. Morgan Kaufmann, 1988.

[8] I. A. Beinlich, H. J. Suermondt, R. M. Chavez, and G. F. Cooper. The ALARM Monitoring System: A
Case Study with Two Probabilistic Inference Techniques for Belief Networks. In Proceedings of the 2nd European
Conference on Artificial Intelligence in Medicine, pages 247-256. Springer-Verlag, 1989.

[9] J. Whittaker. Graphical Models in Applied Multivariate Statistics. Wiley, 1990.

[10] D. Dor and M. Tarsi. A Simple Algorithm to Construct a Consistent Extension of a Partially Oriented Graph. Technical report, UCLA, Cognitive Systems Laboratory, 1992. Available as Technical Report R-185.

[11] D. Geiger and D. Heckerman. Learning Gaussian Networks. In Proceedings of the 10th
Conference on Uncertainty in Artificial Intelligence, pages 235-243, 1994. Available as Technical Report
MSR-TR-94-10.

[12] G. F. Cooper and C. Yoo. Causal Discovery from a Mixture of Experimental and Observational Data.
In UAI '99: Proceedings of the Fifteenth Annual Conference on Uncertainty in Artificial Intelligence, pages
116-125. Morgan Kaufmann, 1999.

[13] D. M. Chickering. A Transformational Characterization of Equivalent Bayesian Network Structures.
In UAI '95: Proceedings of the Eleventh Annual Conference on Uncertainty in Artificial Intelligence, pages
87-98. Morgan Kaufmann, 1995.

[14] D. Heckerman, D. Geiger, and D. M. Chickering. Learning Bayesian Networks: The Combination of
Knowledge and Statistical Data. Machine Learning, 20(3):197-243, September 1995. Available as Technical Report
MSR-TR-94-09.

[15] B. Abramson, J. Brown, W. Edwards, A. Murphy, and R. L. Winkler. Hailfinder: A Bayesian System
for Forecasting Severe Weather. International Journal of Forecasting, 12(1):57-71, 1996.

[16] N. Friedman, D. Geiger, and M. Goldszmidt. Bayesian Network Classifiers. Machine
Learning, 29(2-3):131-163, 1997.

[17] J. Binder, D. Koller, S. Russell, and K. Kanazawa. Adaptive Probabilistic Networks with Hidden
Variables. Machine Learning, 29(2-3):213-244, 1997.

[18] N. Friedman, M. Goldszmidt, and A. Wyner. Data Analysis with Bayesian Networks: A Bootstrap
Approach. In UAI '99: Proceedings of the Fifteenth Annual Conference on Uncertainty in Artificial Intelligence,
pages 196-205. Morgan Kaufmann, 1999.

[19] R. Castelo and A. Siebes. Priors on Network Structures. Biasing the Search for Bayesian
Networks. International Journal of Approximate Reasoning, 24(1):39-57, 2000.

[20] D. I. Edwards. Introduction to Graphical Modelling. Springer, 2nd edition, 2000.

[21] P. Legendre. Comparison of Permutation Methods for the Partial Correlation and Partial Mantel
Tests. Journal of Statistical Computation and Simulation, 67:37-73, 2000.

[22] A. J. Hartemink. Principled Computational Methods for the Validation and Discovery of
Genetic Regulatory Networks. PhD thesis, School of Electrical Engineering and Computer Science, Massachusetts
Institute of Technology, 2001.

[23] G. Melançon, I. Dutour, and M. Bousquet-Mélou. Random Generation of Directed
Acyclic Graphs. Electronic Notes in Discrete Mathematics, 10:202-207, 2001.

[24] G. Elidan. Bayesian Network Repository, 2001. https://www.cse.huji.ac.il/~galel/Repository/.

[25] F. Pesarin. Multivariate Permutation Tests With Applications in Biostatistics. Wiley,
2001.

[26] S. Imoto, S. Y. Kim, H. Shimodaira, S. Aburatani, K. Tashiro, S. Kuhara, and S. Miyano.
Bootstrap Analysis of Gene Networks Based on Bayesian Networks and Nonparametric Regression. Genome
Informatics, 13:369-370, 2002.

[27] J. S. Ide and F. G. Cozman. Random Generation of Bayesian Networks. In SBIA '02: Proceedings
of the 16th Brazilian Symposium on Artificial Intelligence, pages 366-375. Springer-Verlag, 2002.

[28] A. Agresti. Categorical Data Analysis. Wiley Series in Probability and Statistics.
Wiley-Interscience, 2nd edition, 2002.

[29] O. Ledoit and M. Wolf. Improved Estimation of the Covariance Matrix of Stock Returns with an
Application to Portfolio Selection. Journal of Empirical Finance, 10:603-621, 2003.

[30] R. E. Neapolitan. Learning Bayesian Networks. Prentice Hall, 2003.

[31] I. Tsamardinos, C. F. Aliferis, and A. Statnikov. Time and Sample Efficient Discovery of Markov
Blankets and Direct Causal Relations. In KDD '03: Proceedings of the Ninth ACM SIGKDD International Conference on
Knowledge Discovery and Data Mining, pages 673-678. ACM, 2003.

[32] I. Tsamardinos, C. F. Aliferis, and A. Statnikov. Algorithms for Large Scale Markov Blanket
Discovery. In Proceedings of the Sixteenth International Florida Artificial Intelligence Research Society
Conference, pages 376-381. AAAI Press, 2003.

[33] D. Margaritis. Learning Bayesian Network Model Structure from Data. PhD thesis, School
of Computer Science, Carnegie Mellon University, Pittsburgh, PA, May 2003. Available as Technical Report CMU-CS-03-153.

[34] G. Melançon and F. Philippe. Generating Connected Acyclic Digraphs Uniformly at Random.
Information Processing Letters, 90(4):209-213, 2004.

[35] S. Yaramakala and D. Margaritis. Speculative Markov Blanket Discovery for Optimal Feature
Selection. In ICDM '05: Proceedings of the Fifth IEEE International Conference on Data Mining, pages 809-812.
IEEE Computer Society, 2005.

[36] A. A. Margolin, I. Nemenman, K. Basso, C. Wiggins, G. Stolovitzky, R. Dalla Favera, and A. Califano.
ARACNE: An Algorithm for the Reconstruction of Gene Regulatory Networks in a Mammalian Cellular Context. BMC
Bioinformatics, 7(Suppl 1):S7, 2006.

[37] I. Tsamardinos, L. E. Brown, and C. F. Aliferis. The Max-Min Hill-Climbing Bayesian Network
Structure Learning Algorithm. Machine Learning, 65(1):31-78, 2006.

[38] R. Daly and Q. Shen. Methods to Accelerate the Learning of Bayesian Network Structures. In
Proceedings of the 2007 UK Workshop on Computational Intelligence. Imperial College, London, 2007.

[39] C. Borgelt, R. Kruse, and M. Steinbrecher. Graphical Models: Representations for Learning,
Reasoning and Data Mining. Wiley, 2nd edition, 2009.

[40] D. Koller and N. Friedman. Probabilistic Graphical Models: Principles and Techniques.
MIT Press, 2009.

[41] S. J. Russell and P. Norvig. Artificial Intelligence: A Modern Approach. Prentice Hall,
3rd edition, 2009.

[42] J. Hausser and K. Strimmer. Entropy Inference and the James-Stein Estimator, with Application to
Nonlinear Gene Association Networks. Journal of Machine Learning Research, 10:1469-1484,
2009.

[43] J. Bang-Jensen and G. Gutin. Digraphs: Theory, Algorithms and Applications. Springer,
2nd edition, 2009.

[44] M. Scutari. Structure Variability in Bayesian Networks. ArXiv Statistics - Methodology
e-prints, 2009.

[45] J. Pearl. Causality: Models, Reasoning and Inference. Cambridge University Press, 2nd
edition, 2009.

[46] I. Tsamardinos and G. Borboudakis. Permutation Testing Improves Bayesian Network Learning. In J.
Balcázar, F. Bonchi, A. Gionis, and M. Sebag, editors, Machine Learning and Knowledge Discovery in
Databases, pages 322-337. Springer, 2010.

[47] C. F. Aliferis, A. Statnikov, I. Tsamardinos, S. Mani, and X. D. Koutsoukos. Local Causal and
Markov Blanket Induction for Causal Discovery and Feature Selection for Classification Part I: Algorithms and Empirical
Evaluation. Journal of Machine Learning Research, 11:171-234, 2010.

[48] M. Scutari. Learning Bayesian Networks with the bnlearn R Package. Journal of Statistical
Software, 35(3):1-22, 2010.

[49] K. Korb and A. Nicholson. Bayesian Artificial Intelligence. Chapman and Hall, 2nd
edition, 2010.

[50] M. Scutari and R. Nagarajan. On Identifying Significant Edges in Graphical Models. In
Proceedings of the Workshop 'Probabilistic Problem Solving in Biomedicine' of the 13th Artificial Intelligence in
Medicine (AIME) Conference, pages 15-27, 2011.

[51] A. Cano, M. Gómez-Olmedo, A. R. Masegosa, and S. Moral. Locally Averaged Bayesian
Dirichlet Metrics for Learning the Structure and the Parameters of Bayesian Networks. International Journal of
Approximate Reasoning, 54:526-540, 2013.

[52] R. Nagarajan, M. Scutari, and S. Lèbre. Bayesian Networks in R with Applications in
Systems Biology. Use R! series. Springer, 2013.

[53] J. Suzuki. A Theoretical Analysis of the BDeu Scores in Bayesian Network Structure Learning.
Behaviormetrika, 44:97-116, 2016.