This is related to what I have been working on recently. Essentially, clustering statistics consider the relationships among more than one point (a histogram is one example). A good example is the two-point correlation function, which considers the statistics of edges (pairs of points) rather than of individual points. The following articles might be good starting points:

## Survey

Vicent J. Martinez, Enn Saar
Clustering statistics in cosmology
In SPIE Proceedings Vol. 4847, 2002, "Astronomical Data Analysis II."
http://arxiv.org/abs/astro-ph/0209208

Nicholas M. Ball, Robert J. Brunner
Data Mining and Machine Learning in Astronomy
Int. J. Mod. Phys. D 19:1049-1106, 2010
http://arxiv.org/abs/0906.2173

Alexander S. Szalay, Jim Gray, Jan vandenBerg
Petabyte Scale Data Mining: Dream or Reality?
In SPIE Astronomy Telescopes and Instruments, 22-28 August 2002, Waikoloa, Hawaii
http://arxiv.org/abs/cs/0208013

## Fast Statistics Computation

Joshua Dolence, Robert Brunner
Fast two-point correlations of extremely large data sets
9th LCI International Conference on High-Performance Clustered Computing (2008)

Jeffrey P. Gardner, Andrew Connolly, Cameron McBride
Enabling Rapid Development of Parallel Tree Search Applications
In Challenges of Large Applications in Distributed Environments (CLADE) 2007, Monterey, CA.
http://arxiv.org/abs/0709.1967

Andrew Moore (CMU), Andy Connolly (UPitt), Chris Genovese (CMU), Alex Gray (CMU), Larry Grone (CMU), Nick Kanidoris II (CMU), Robert Nichol (CMU), Jeff Schneider (CMU), Alex Szalay (JHU), Istvan Szapudi (CITA), Larry Wasserman (CMU)
Fast Algorithms and Efficient Statistics: N-point Correlation Functions
In Proceedings of MPA/MPE/ESO Conference "Mining the Sky", July 31 - August 4, 2000, Garching, Germany
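
To make the "statistics of edges" idea from the top of this post concrete, here is a minimal sketch of the naive pair counting that a two-point statistic starts from: visit every pair of points and histogram the separations. The function name `pair_separation_histogram`, the toy points, and the bin edges are all hypothetical choices for illustration. This is the brute-force O(N²) baseline, not the method of any paper listed above.

```python
import math
from itertools import combinations


def pair_separation_histogram(points, bin_edges):
    """Count point pairs (edges) whose separation falls in each bin.

    Bins are half-open: [bin_edges[i], bin_edges[i + 1]).
    Pairs outside all bins are ignored.
    """
    counts = [0] * (len(bin_edges) - 1)
    # Every unordered pair is visited once, so the cost is O(N^2).
    for p, q in combinations(points, 2):
        r = math.dist(p, q)  # Euclidean distance (Python 3.8+)
        for i in range(len(counts)):
            if bin_edges[i] <= r < bin_edges[i + 1]:
                counts[i] += 1
                break
    return counts


# Toy data (assumed for illustration): the four corners of a unit square.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
bin_edges = [0.0, 1.2, 2.0]
counts = pair_separation_histogram(points, bin_edges)
print(counts)  # four sides in the first bin, two diagonals in the second
```

An actual correlation-function estimate would go one step further and compare such counts against those from a random catalogue (for instance the natural estimator DD/RR - 1). That step is omitted here. The quadratic loop over pairs is also the cost that fast algorithms aim to reduce.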
