TITLE:  An Affine-Invariant Bayesian Cluster Process with Split-Merge Gibbs Sampler

ABSTRACT:

We develop a clustering algorithm that does not require knowing the number of clusters in advance. Furthermore, our clustering method is rotation-, scale-, and translation-invariant. We call it the “Affine-invariant Bayesian (AIB) process”. We propose a highly efficient split-merge Gibbs sampling algorithm. Using the Ewens sampling distribution as the prior on the partition and the profile residual likelihoods of the responses under three different covariance matrix structures, we obtain inferences in the form of a posterior distribution on partitions. The proposed split-merge MCMC algorithm successfully and efficiently estimates the partition. Our experimental results indicate that the AIB process outperforms competing methods. In addition, the proposed algorithm is irreducible and aperiodic, so that the estimate is guaranteed to converge to the true partition.
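
For readers unfamiliar with the Ewens sampling distribution used as the partition prior above, the following is a minimal sketch of its log-probability in its standard textbook form. It is an illustration only, not the speaker's implementation; the function name `ewens_log_prior` and the concentration parameter name `theta` are assumptions introduced here.

```python
import math
from collections import Counter


def ewens_log_prior(labels, theta=1.0):
    """Log-probability of the partition induced by `labels` under the
    Ewens distribution with concentration parameter `theta` (assumed name).

    P(partition) = theta^k * Gamma(theta) / Gamma(n + theta) * prod_j (n_j - 1)!
    where n is the number of items, k the number of clusters,
    and n_j the size of cluster j.
    """
    n = len(labels)
    sizes = list(Counter(labels).values())
    k = len(sizes)
    log_p = k * math.log(theta) + math.lgamma(theta) - math.lgamma(n + theta)
    # log((n_j - 1)!) equals lgamma(n_j)
    log_p += sum(math.lgamma(s) for s in sizes)
    return log_p


# All five partitions of three items, written as cluster-label vectors.
partitions_of_three = [
    [0, 0, 0],
    [0, 0, 1],
    [0, 1, 0],
    [1, 0, 0],
    [0, 1, 2],
]
total = sum(math.exp(ewens_log_prior(p, theta=1.0)) for p in partitions_of_three)
```

Because the prior depends only on the cluster sizes and not on the number of clusters being fixed in advance, it places positive probability on every partition, which is what allows the number of clusters to be inferred from the data.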

BIO:   Dr. Hsin-Hsiung Huang is an Assistant Professor in the Department of Statistics at the University of Central Florida (UCF). Dr. Huang received his Ph.D. in Statistics from the University of Illinois at Chicago (UIC), two MS degrees from the Georgia Institute of Technology and National Taiwan University, and BA and BS degrees from National Taiwan University. His scholarly interests and expertise include Bayesian clustering, classification, genome comparison, robust dimension reduction, and text categorization. His research addresses challenges in analyzing big data in bioinformatics and cybersecurity by developing and evaluating new statistical methods. Examples of his research projects include classifying multiple-segmented viruses, discovering associations between biomarkers and hypertension, and business intelligence classification. Dr. Huang received the 2013 Taiwanese study-abroad student research award while he was a doctoral student at UIC, and the 2016 In-House grant at UCF.