Abstract:
A novel algorithm built on the ROCK (A Robust Clustering Algorithm for Categorical Attributes) model is proposed to improve clustering quality; it is efficient for high-dimensional, sparse, categorical data. A novel concept called "common neighbors" (links), which rests on an appropriate selection of nearest neighbors, is adopted as the similarity measure between a pair of points. The key step, computing the adjacency matrix, has a significant effect on the time complexity; it can be implemented on the GPU to exploit its high floating-point throughput (operations per second) and its parallel fragment vector processing, while the remaining steps are carried out by the central processing unit (CPU). Experiments conducted on a PC with an AMD 64 3500+ CPU and an NVIDIA GeForce 6800 GT graphics card demonstrate that the present algorithm is faster than previous CPU-based algorithms; it is therefore applicable to clustering data streams that require high-speed processing and high-quality clustering results.
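
To make the "common neighbors" (links) measure concrete, the following is a minimal CPU-only sketch in Python, not the GPU implementation described in the abstract. It rests on several assumptions that the abstract does not state: each point is a set of categorical items, two points are neighbors when their Jaccard similarity is at least a threshold `theta`, and a point is not counted as its own neighbor. The function names `jaccard`, `adjacency_matrix`, and `link_matrix` are hypothetical and chosen only for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity of two item sets (assumed similarity function)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def adjacency_matrix(points, theta):
    """0/1 neighbor matrix: adj[i][j] = 1 when sim(i, j) >= theta.

    Assumption: the diagonal stays 0, i.e. a point is not its own neighbor.
    Every (i, j) entry is independent of the others, which is the property
    that makes this step a natural candidate for parallel hardware.
    """
    n = len(points)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if jaccard(points[i], points[j]) >= theta:
                adj[i][j] = adj[j][i] = 1
    return adj


def link_matrix(adj):
    """links[i][j] = number of neighbors shared by points i and j.

    For i != j this equals entry (i, j) of the matrix product adj * adj.
    """
    n = len(adj)
    links = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            common = sum(adj[i][k] & adj[j][k] for k in range(n))
            links[i][j] = links[j][i] = common
    return links


# Toy data: three overlapping baskets and one unrelated basket.
points = [{"a", "b", "c"}, {"a", "b", "d"}, {"a", "c", "d"}, {"x", "y"}]
adj = adjacency_matrix(points, theta=0.5)
links = link_matrix(adj)
```

In this toy data set the first three points are pairwise neighbors, so each pair among them shares exactly one common neighbor, while the fourth point has no neighbors and therefore no links. Both loops are quadratic in the number of points and every matrix entry can be computed independently, which is consistent with the abstract's choice to move the adjacency computation to the GPU.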