Bayesian Markov chain Monte Carlo imputation for the transiting exoplanets with an application in clustering analysis

Huei-Wen Teng, Wen Liang Hung*, Yen Ju Chao

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

To impute the missing mass values in the transiting exoplanet data, this paper uses the Frank copula to combine two Pareto marginal distributions and then proposes a Bayesian Markov chain Monte Carlo (MCMC) imputation method. The proposed method is found to outperform mean imputation. Clustering analysis can shed light on the formation and evolution of exoplanets. After the missing mass values are imputed with the proposed approach, the similarity-based clustering method (SCM) is applied to the logarithms of mass and period for the completed data set. The SCM result indicates two clusters. Furthermore, the intracluster Spearman rank-order correlation coefficients (Formula presented.) for mass and period in these two clusters are 0.401 and (Formula presented.), respectively, at a significance level of 0.01. This result shows that mass and period correlate in opposite ways in the two clusters, which implies that the formation and evolution processes of the two clusters differ.
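
The imputation idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes type I Pareto marginals, treats the copula parameter theta and the Pareto parameters as known (the paper's Bayesian method presumably also samples them), and uses an independence Metropolis-Hastings step whose proposal is the mass marginal, so the acceptance ratio reduces to a ratio of Frank copula densities. All function names and parameter values are hypothetical.

```python
import math
import random

def pareto_cdf(x, scale, shape):
    """CDF of a type I Pareto with minimum `scale` and tail index `shape` (assumed form)."""
    return 1.0 - (scale / x) ** shape if x >= scale else 0.0

def pareto_inv_cdf(u, scale, shape):
    """Quantile function of the same Pareto distribution, for u in [0, 1)."""
    return scale / (1.0 - u) ** (1.0 / shape)

def frank_density(u, v, theta):
    """Density of the Frank copula at (u, v); requires theta != 0."""
    a = 1.0 - math.exp(-theta)
    num = theta * a * math.exp(-theta * (u + v))
    den = (a - (1.0 - math.exp(-theta * u)) * (1.0 - math.exp(-theta * v))) ** 2
    return num / den

def impute_mass(period, theta, mass_par, period_par,
                n_iter=2000, burn_in=500, seed=0):
    """Draw masses from the conditional law of mass given an observed period.

    Works on the uniform scale: the target for u = F_mass(m) is proportional
    to c(u, v), and the proposal is Uniform(0, 1), i.e. the mass marginal.
    """
    rng = random.Random(seed)
    v = pareto_cdf(period, *period_par)   # observed period on the uniform scale
    u = rng.random()                      # initial state of the chain
    draws = []
    for i in range(n_iter):
        u_new = rng.random()
        # Independence sampler: accept with probability min(1, c(u_new, v) / c(u, v)).
        if rng.random() < frank_density(u_new, v, theta) / frank_density(u, v, theta):
            u = u_new
        if i >= burn_in:
            draws.append(pareto_inv_cdf(u, *mass_par))  # back to the mass scale
    return draws

if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for illustration.
    samples = impute_mass(period=10.0, theta=10.0,
                          mass_par=(1.0, 2.0), period_par=(1.0, 1.0))
    print(sum(samples) / len(samples))  # posterior-mean style imputed mass
```

A single imputed value could be taken as the mean or median of the retained draws, or several draws could be kept to reflect imputation uncertainty. The Frank copula allows both positive and negative dependence through the sign of theta, which is consistent with the opposite-signed correlations the abstract reports for the two clusters.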

Original language: English
Pages (from-to): 1120-1132
Number of pages: 13
Journal: Journal of Applied Statistics
Volume: 42
Issue number: 5
State: Published - 4 May 2015

Keywords

  • copula
  • hot Jupiters
  • Metropolis–Hastings algorithm
  • missing data
  • transiting exoplanets
