
Constructing similarity graphs from large-scale biological sequence collections

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Scopus citations

Abstract

Detecting similar pairs in large biological sequence collections is one of the most commonly performed tasks in computational biology. With the advent of high-throughput sequencing technologies, the problem has regained significance as data sets with millions of sequences have become ubiquitous. This paper is an initial report on our parallel, distributed-memory, sketching-based approach to constructing large-scale sequence similarity graphs. We develop load balancing techniques, derived from multi-way number partitioning and work stealing, to manage computational imbalance and ensure scalability on thousands of processors. Our experimental results show that the method is efficient and can be used to analyze data sets with millions of DNA sequences within acceptable time limits.
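
As a rough illustration of two ideas the abstract names, the sketch below estimates k-mer set similarity with min-wise hashing and assigns uneven tasks to workers with a greedy heuristic for multi-way number partitioning. It is not the authors' implementation: the function names, the k-mer length `k`, the number of hash functions `num_hashes`, the use of salted BLAKE2b as the hash family, and the largest-first greedy rule are all assumptions chosen for brevity. Work stealing and the distributed-memory layer are not shown.

```python
import hashlib
import heapq


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def minhash_sketch(seq, k=4, num_hashes=64):
    """Keep the minimum hash value of the k-mer set under each salted hash.

    Assumes len(seq) >= k, so the k-mer set is not empty.
    """
    kms = kmers(seq, k)
    sketch = []
    for salt in range(num_hashes):
        salt_bytes = salt.to_bytes(8, "little")
        sketch.append(min(
            int.from_bytes(
                hashlib.blake2b(km.encode(), digest_size=8,
                                salt=salt_bytes).digest(), "big")
            for km in kms))
    return sketch


def estimate_similarity(sketch_a, sketch_b):
    """Fraction of agreeing positions, an estimate of Jaccard similarity."""
    matches = sum(a == b for a, b in zip(sketch_a, sketch_b))
    return matches / len(sketch_a)


def greedy_partition(costs, num_bins):
    """Largest-first greedy heuristic for multi-way number partitioning.

    Each task (by index) goes to the currently least-loaded bin.
    """
    heap = [(0, b) for b in range(num_bins)]  # (current load, bin id)
    bins = [[] for _ in range(num_bins)]
    for idx in sorted(range(len(costs)), key=lambda i: -costs[i]):
        load, b = heapq.heappop(heap)
        bins[b].append(idx)
        heapq.heappush(heap, (load + costs[idx], b))
    return bins


if __name__ == "__main__":
    a = minhash_sketch("ACGTACGTTGCAACGT")
    b = minhash_sketch("ACGTACGTTGCAACGA")
    print(estimate_similarity(a, b))
    print(greedy_partition([7, 5, 4, 3, 1], 2))
```

In a similarity-graph setting, pairs whose estimated similarity exceeds a chosen threshold would become candidate edges, and the per-task costs passed to the partitioner would be estimates of the comparison work in each task.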

Original language: English
Title of host publication: Proceedings - IEEE 28th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2014
Publisher: IEEE Computer Society
Pages: 500-507
Number of pages: 8
ISBN (Electronic): 9780769552088
State: Published - Nov 27 2014
Event: 28th IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2014 - Phoenix, United States
Duration: May 19 2014 to May 23 2014

Publication series

Name: Proceedings - IEEE 28th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2014

Conference

Conference: 28th IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2014
Country/Territory: United States
City: Phoenix
Period: 05/19/14 to 05/23/14

Keywords

  • Load balancing
  • Min-wise independent permutations
  • Parallel computational biology
  • Sequence similarity
