Dissertations, Theses, and Capstone Projects
Date of Degree
9-2021
Document Type
Dissertation
Degree Name
Ph.D.
Program
Computer Science
Advisor
Robert Haralick
Committee Members
Mikael Vejdemo-Johansson
Michael Grossberg
Yuri Katz
Subject Categories
Artificial Intelligence and Robotics | Data Science
Keywords
clustering, topological data analysis, information theory
Abstract
This work studies the application of topological analysis to non-linear manifold clustering. A novel method that exploits the clustering structure of the data generates a topological representation of the point dataset. An analysis of the topological construction under different simulated conditions explores the capabilities and limitations of the method and demonstrates statistically significant improvements in performance. Furthermore, we introduce a new information-theoretic validation measure for clustering that exploits geometrical properties of clusters to estimate clustering compressibility, allowing the clustering goodness-of-fit to be evaluated without any prior information about true class assignments. We show how the new validation measure, when used as a regularization criterion, allows the creation of clusters that are more informative. A final contribution is a new metaclustering technique that creates model-based clusterings beyond point-shaped and linear structures. Driven by topological structure and our information-theoretic criteria, this technique provides a structured view of the data at a new level of comprehension and interpretation. Improvements of our clustering approach are demonstrated on a variety of synthetic and real datasets, including image and climatological data.
Recommended Citation
Diky, Artyom, "Piecewise Linear Manifold Clustering" (2021). CUNY Academic Works.
https://academicworks.cuny.edu/gc_etds/4441