Department of Computer Science

University of California, Santa Barbara

Abstract

PSI: Indexing Protein Structures for Fast Similarity Search

by: Orhan Camoglu, Tamer Kahveci, and Ambuj Singh

Abstract:

We consider the problem of finding similarities in protein structure databases. Our techniques extract feature vectors from triplets of SSEs (Secondary Structure Elements). These feature vectors are then indexed using a multidimensional index structure. Our first technique finds proteins similar to a query protein in a protein dataset. This technique uses the index structure to quickly prune unpromising proteins. The remaining proteins are then aligned using a popular alignment tool such as VAST. We also develop a novel statistical model to estimate the goodness of a match using the SSEs. Our second technique considers the problem of joining two protein datasets to find all-to-all similarities. Experimental results show that our techniques improve the pruning time of VAST by a factor of 3 to 3.5 while keeping the sensitivity similar.
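
The filter-then-align idea in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the method of the report: each SSE is modeled as a 3D line segment, the feature vector of a triplet is taken to be three midpoint distances plus three pairwise angles, and a uniform grid hash stands in for the multidimensional index structure. The names `triplet_features`, `cell`, `build_index`, and `candidates` are hypothetical.

```python
import math
from collections import defaultdict
from itertools import combinations

# Assumption: an SSE is a line segment ((x, y, z) start, (x, y, z) end).

def midpoint(sse):
    a, b = sse
    return tuple((x + y) / 2 for x, y in zip(a, b))

def direction(sse):
    a, b = sse
    d = [y - x for x, y in zip(a, b)]
    norm = math.sqrt(sum(c * c for c in d))
    return [c / norm for c in d]

def angle(u, v):
    # Clamp the dot product so rounding error cannot leave acos's domain.
    dot = max(-1.0, min(1.0, sum(x * y for x, y in zip(u, v))))
    return math.acos(dot)

def triplet_features(sses):
    """Yield one 6-D vector per SSE triplet (hypothetical feature choice):
    three midpoint distances followed by three pairwise angles."""
    pairs = [(0, 1), (0, 2), (1, 2)]
    for trio in combinations(sses, 3):
        mids = [midpoint(s) for s in trio]
        dirs = [direction(s) for s in trio]
        dists = [math.dist(mids[p], mids[q]) for p, q in pairs]
        angles = [angle(dirs[p], dirs[q]) for p, q in pairs]
        yield tuple(dists + angles)

def cell(vec, dist_step=2.0, angle_step=0.5):
    """Quantize a feature vector into a grid cell (stand-in for a real index)."""
    return (tuple(int(v // dist_step) for v in vec[:3])
            + tuple(int(v // angle_step) for v in vec[3:]))

def build_index(dataset):
    """Map each occupied grid cell to the set of proteins with a triplet in it."""
    index = defaultdict(set)
    for name, sses in dataset.items():
        for vec in triplet_features(sses):
            index[cell(vec)].add(name)
    return index

def candidates(index, query_sses, min_votes=1):
    """Keep proteins sharing at least min_votes grid cells with the query."""
    votes = defaultdict(int)
    for vec in triplet_features(query_sses):
        for name in index.get(cell(vec), ()):
            votes[name] += 1
    return {name for name, count in votes.items() if count >= min_votes}

# Toy data: B is A translated rigidly; C has an unrelated geometry.
dataset = {
    "A": [((0, 0, 0), (0, 0, 10)), ((5, 0, 0), (5, 0, 10)), ((0, 7, 10), (0, 7, 0))],
    "B": [((10, 20, 30), (10, 20, 40)), ((15, 20, 30), (15, 20, 40)),
          ((10, 27, 40), (10, 27, 30))],
    "C": [((0, 0, 0), (10, 0, 0)), ((0, 30, 0), (0, 30, 10)), ((40, 0, 0), (40, 10, 0))],
}
index = build_index(dataset)
print(sorted(candidates(index, dataset["A"])))  # ['A', 'B']
```

In this sketch only the surviving candidates would be handed to an alignment tool such as VAST, which is not invoked here. A plain grid hash misses near matches that fall on opposite sides of a cell boundary, so a real multidimensional index with range queries would be the more robust choice.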

Keywords:

Protein structures, feature vectors, indexing, dataset join

Date:

January 2003

Document: 2003-03

Updated 14-Nov-2005