Please use this identifier to cite or link to this item: http://bura.brunel.ac.uk/handle/2438/28954
Title: New Bayesian regression models for massive data and extreme longitudinal data
Authors: Chu, Yuanqi
Advisors: Yu, K
Roman, D
Keywords: Beta-negative binomial INGARCH model;Bayesian quantile regression for big data;Bayesian LASSO quantile regression;Bayesian two-part latent class model;Kumaraswamy quantile mixed regression
Issue Date: 2024
Publisher: Brunel University London
Abstract: Heavy-tailedness and asymmetry are ubiquitous in a variety of practical applications. Heavy-tailedness implies that the underlying distribution can produce anomalous observations that deviate far from the main body of the data. Down-weighting such extreme observations for an asymmetric distribution can sacrifice inherent information and introduce considerable bias in parameter estimation. Over the past decades, two main approaches have emerged to tackle the distributional deviation caused by heavy-tailedness. The first adopts mixture models to accommodate heterogeneity in the distribution of the data. The second selects distributions that can describe both the bulk and the heavy tail of the data. This thesis aims to make novel contributions to the following three issues related to massive data and extreme longitudinal data exhibiting heavy-tailed characteristics. First, the abundance of existing literature on continuous heavy-tailed distributions contrasts sharply with the scarcity of work on integer-valued distributions. This especially applies to integer-valued time series modelling. Second, heavy tails can considerably obscure the nature of the dependence between the response and the covariates of interest, calling the normality assumption and conventional linear models into question. The quantile regression (QR) approach, which is robust to outlier contamination associated with heavy-tailed errors, serves as a remedy for these hurdles. Bayesian quantile regression (BQR) has received increasing attention from both theoretical and empirical viewpoints, with wide applications and variants, but little attention has been paid to BQR for big data analysis.
Third, heavy-tailedness often arises in semi-continuous data, which are commonly characterized by a mixture of zero values and continuously distributed positive values. This conceptual framework leads to the formulation of a two-part model. The literature on two-part models, especially in Bayesian paradigms, for investigating quantiles of semi-continuous longitudinal data with bounded support, such as the standard unit interval (0, 1), is relatively limited. This thesis encompasses three themes to address the above-mentioned challenges: Bayesian integer-valued time series modelling with heavy-tailed characteristics, Bayesian quantile regression for big data analysis, and Bayesian quantile parametric mixed regression for semi-continuous longitudinal data with bounded support. The main contributions are as follows:
• Chapter 2 develops Bayesian inference for log-linear Beta–negative binomial integer-valued generalized autoregressive conditional heteroscedastic (BNB-INGARCH) models and conducts parameter estimation within adaptive Markov chain Monte Carlo frameworks. Conditions under which the posterior distribution of the full model parameter is proper, given some general priors, are presented.
• Chapter 3 contributes a new approach to Bayesian quantile regression for big data. This chapter introduces the structural link between Bayesian scale mixtures of normals linear regression and BQR via a normal-inverse-gamma (NIG) type of likelihood function, prior distribution and posterior distribution. Big-data algorithms for BQR and Bayesian LASSO quantile regression are provided and demonstrated via simulations and a real-world data analysis.
• Chapter 4 introduces a two-part latent class Kumaraswamy quantile mixed regression with Bayesian inference for bounded longitudinal data that exhibit a large spike at zero.
Correlated random effects with class-specific covariance structures are formulated for the binary and the bounded positive components to account for both zero inflation and unobserved heterogeneity. The developed method portrays the trajectory of distinct latent class evolutions in the underlying outcome process, which provides valuable insights into the latent cluster structure at various quantiles encompassing the tails and caters to the exploration of skewed longitudinal data with bounded support.
Description: This thesis was submitted for the award of Doctor of Philosophy and was awarded by Brunel University London
URI: https://bura.brunel.ac.uk/handle/2438/28954
Appears in Collections: Dept of Mathematics Theses
Mathematical Sciences

Files in This Item:
File: FulltextThesis.pdf
Size: 967.52 kB
Format: Adobe PDF


Items in BURA are protected by copyright, with all rights reserved, unless otherwise indicated.