Data-dependent

Towards a Persistence Diagram that is Robust to Noise and Varied Densities

We propose a new filter function for Topological Data Analysis (TDA) based on a new data-dependent kernel.

Kernel-based clustering via Isolation Distributional Kernel

We propose the first clustering algorithm that employs an adaptive distributional kernel without any optimization, while achieving a similar optimization objective function.

A new distributional treatment for time series and an anomaly detection investigation

We propose a distributional treatment for anomalous subsequence detection with a linear runtime.

Streaming Hierarchical Clustering Based on Point-Set Kernel

We propose a novel, efficient hierarchical clustering algorithm called StreaKHC that enables massive streaming data to be mined.

Improving the Effectiveness and Efficiency of Stochastic Neighbour Embedding with Isolation Kernel

We present a new insight into improving the performance of Stochastic Neighbour Embedding (t-SNE) by using the Isolation kernel instead of the Gaussian kernel.

Improving the Effectiveness and Efficiency of Stochastic Neighbour Embedding with Isolation Kernel

Replacing the Gaussian kernel with the Isolation kernel in t-SNE significantly improves the quality of the final visualisation output.

Data-dependent Similarity

Investigating data-dependent similarity measures for distance-based learning algorithms. The source code of the latest data-dependent similarity measure **aNNE** (AAAI-19) can be obtained from **[here](https://github.com/zhuye88/anne-dbscan-demo)**.

Nearest-Neighbour-Induced Isolation Similarity and Its Impact on Density-Based Clustering

We identify shortcomings of using a tree method to implement Isolation Similarity and propose a nearest-neighbour method instead.
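
To make the nearest-neighbour idea concrete, here is a minimal sketch, assuming the common formulation in which the similarity of two points is the fraction of random partitionings that place them in the same nearest-sample (Voronoi) cell. It is not the authors' implementation (see the linked repository for that); the function names `sq_dist` and `isolation_similarity`, the parameter names `psi` and `t`, and their default values are illustrative assumptions.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def isolation_similarity(data, psi=8, t=200, seed=0):
    """Estimate pairwise similarity as the fraction of t random partitionings
    in which two points fall into the same nearest-centre cell."""
    rng = random.Random(seed)
    n = len(data)
    counts = [[0] * n for _ in range(n)]
    for _ in range(t):
        # Draw psi centres from the data without replacement.
        centres = rng.sample(data, psi)
        # Assign every point to the index of its nearest centre.
        cells = [
            min(range(psi), key=lambda c: sq_dist(p, centres[c]))
            for p in data
        ]
        # Count the pairs that share a cell in this partitioning.
        for i in range(n):
            for j in range(n):
                if cells[i] == cells[j]:
                    counts[i][j] += 1
    return [[c / t for c in row] for row in counts]


if __name__ == "__main__":
    points = [(0.0, 0.0), (0.1, 0.0), (10.0, 10.0), (10.1, 10.0)]
    sim = isolation_similarity(points, psi=2, t=200)
    print(sim[0][1], sim[0][2])
```

Because the centres are sampled from the data itself, the cells are small in dense regions and large in sparse ones, which is what makes the resulting similarity data-dependent. This sketch builds the full pairwise matrix with quadratic cost per partitioning, so it is only suited to small examples.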

Lowest probability mass neighbour algorithms: relaxing the metric constraint in distance-based neighbourhood algorithms

We propose to use mass-based dissimilarity, which employs estimates of the probability mass to measure dissimilarity, to replace the distance metric.

Overcoming key weaknesses of Distance-based Neighbourhood Methods using a Data Dependent Dissimilarity Measure

A generic data dependent dissimilarity, named mass-based dissimilarity, is proposed to allow for different implementations.