Prado, Estevão B. (2022) Extensions to Bayesian tree-based machine learning algorithms. PhD thesis, National University of Ireland Maynooth.
Abstract
Bayesian additive regression trees (BART) is a Bayesian tree-based algorithm
which can provide high predictive accuracy in both classification and regression
problems. Unlike other machine learning algorithms based on an ensemble of trees,
such as random forests and gradient boosting, BART does not grow its trees by
greedy recursive partitioning. Rather, it is a fully Bayesian model built upon a
likelihood function and carefully specified prior distributions.
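
For orientation, the sum-of-trees form in which BART is usually written in the literature is sketched below. The notation (T trees, tree structures \mathcal{T}_t, terminal-node parameters \mathcal{M}_t, Gaussian errors) is an assumption made here for illustration and is not taken from the thesis itself.

```latex
% Standard sum-of-trees model for a continuous response (illustrative notation)
y_i = \sum_{t=1}^{T} g(\mathbf{x}_i; \mathcal{T}_t, \mathcal{M}_t) + \varepsilon_i,
\qquad \varepsilon_i \sim \mathrm{N}(0, \sigma^2),
% with prior distributions placed on each tree structure \mathcal{T}_t,
% on the terminal-node parameters \mathcal{M}_t, and on \sigma^2.
```
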
In this thesis, we propose methodological extensions to BART to address two
main limitations of tree-based methods: the limited ability to fit smooth functions,
which is inherent in how tree-based methods are built, and the lack of adequate
mechanisms for quantifying, in an interpretable fashion, the impact of certain
inputs of primary interest on the output.
Firstly, we present an extension that handles linear effects at the terminal-node
level. By using piecewise linear functions instead of piecewise constants,
local linearities are captured more efficiently and fewer trees are required to
achieve performance equal to or better than that of BART. Secondly, motivated by
an agricultural application, we develop a semi-parametric BART model in which
marginal genotype and environment effects are estimated along with their interactions.
Lastly, motivated by data collected in 2019 under the seventh cycle of the quadrennial
Trends in International Mathematics and Science Study (TIMSS), we extend
semi-parametric models based on BART, which generally assume that the sets of
covariates in the linear predictor and in the BART component are mutually exclusive,
to account for shared covariates. In particular, we change the tree-generation moves
in BART so that bias and confounding between the parametric and non-parametric
components are handled even when they have covariates in common.
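
The motivation for the first extension, linear rather than constant predictions in the terminal nodes, can be illustrated with a small sketch. This is not the thesis's method: it uses a single tree with one fixed split and ordinary least squares in each leaf rather than a Bayesian ensemble, and the helper `fit_leaves`, the split point and the simulated data are all hypothetical choices made for illustration.

```python
import numpy as np


def fit_leaves(x, y, split, linear):
    """Fit one model per leaf of a single-split tree (hypothetical helper).

    With linear=False each leaf predicts the mean of its responses
    (piecewise constant); with linear=True each leaf predicts an
    intercept plus a slope in x (piecewise linear), via least squares.
    """
    preds = np.empty_like(y)
    for mask in (x <= split, x > split):
        if linear:
            design = np.column_stack([np.ones(mask.sum()), x[mask]])
            coef, *_ = np.linalg.lstsq(design, y[mask], rcond=None)
            preds[mask] = design @ coef
        else:
            preds[mask] = y[mask].mean()
    return preds


# Simulated data: a locally linear signal with a kink at x = 0.5 (assumed).
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=200)
signal = np.where(x <= 0.5, 2.0 * x, 1.0 - 3.0 * (x - 0.5))
y = signal + rng.normal(0.0, 0.1, size=200)

# In-sample mean squared error for the two kinds of leaves.
mse_constant = np.mean((y - fit_leaves(x, y, 0.5, linear=False)) ** 2)
mse_linear = np.mean((y - fit_leaves(x, y, 0.5, linear=True)) ** 2)
print(f"constant leaves: {mse_constant:.4f}, linear leaves: {mse_linear:.4f}")
```

With the same single split, the linear leaves track the local slopes that constant leaves can only approximate by adding further splits or trees, which is the intuition behind needing fewer trees.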
Item Type: | Thesis (PhD) |
---|---|
Keywords: | Extensions; Bayesian; tree-based; machine learning; algorithms |
Academic Unit: | Faculty of Science and Engineering > Research Institutes > Hamilton Institute |
Item ID: | 17285 |
Depositing User: | IR eTheses |
Date Deposited: | 06 Jun 2023 15:02 |
URI: | https://mu.eprints-hosting.org/id/eprint/17285 |
Use Licence: | This item is available under a Creative Commons Attribution Non Commercial Share Alike Licence (CC BY-NC-SA). |