Recent News
UNM receives $1.5 million to support computational workforce development
May 8, 2023
Tapia elected to Computing Research Association Board of Directors
March 3, 2023
UNM computer science students take part in HPC competition
March 3, 2023
Computer science professor, student part of AI panel on March 8
February 24, 2023
News Archives
A Sublinear-Time Randomized Approximation Scheme for the Robinson-Foulds Metric
January 24, 2006
- Date: Tuesday, January 24, 2006
- Time: 11:00 am — 12:15 pm.
- Place: Woodward 149
Nick Pattengale
Department of Computer Science, UNM and Sandia National Labs
The Robinson-Foulds (RF) metric is the measure most widely used in comparing phylogenetic trees; it can be computed in linear time using Day’s algorithm. When faced with the need to compare large numbers of large trees, however, even linear time becomes prohibitive. We present a randomized approximation scheme that provides, with high probability, a (1+epsilon) approximation of the true RF metric for all pairs of trees in a given collection. Our approach is to use a sublinear-space embedding of the trees, combined with an application of the Johnson-Lindenstrauss lemma to approximate vector norms very rapidly. We discuss the consequences of various parameter choices (in the embedding and in the approximation requirements). We also implemented our algorithm as a Java class that can easily be combined with popular packages such as Mesquite; in consequence, we present experimental results illustrating the precision and running-time tradeoffs as well as demonstrating the speed of our approach.