Pinboard (jm)
https://pinboard.in/u:jm/public/
recent bookmarks from jm
CloudBurst
2012-07-23T10:02:33+00:00
http://sourceforge.net/apps/mediawiki/cloudburst-bio/index.php?title=CloudBurst
jm
CloudBurst uses well-known seed-and-extend algorithms to map reads to a reference genome. It can map reads with any number of differences or mismatches. [..] Given an exact seed, CloudBurst attempts to extend the alignment into an end-to-end alignment with at most k mismatches or differences, either by counting the mismatches between the two sequences or by using a dynamic programming algorithm that allows for gaps. CloudBurst uses [Hadoop] to catalog and extend the seeds. In the map phase, the map function emits all length-s k-mers from the reference sequences, and all non-overlapping length-s k-mers from the reads. In the shuffle phase, read and reference k-mers are brought together. In the reduce phase, the seeds are extended into end-to-end alignments. The power of MapReduce and CloudBurst is that the map and reduce functions run in parallel over dozens or hundreds of processors.
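The map, shuffle, and reduce phases described above can be sketched in a few lines of single-process Python. This is only an illustration under stated assumptions, not CloudBurst's actual Hadoop code: the function names (map_reference, map_read, shuffle, reduce_seed, align) are hypothetical, the extension step only counts mismatches (no gapped dynamic programming), and the seed length s is passed in by the caller. A common choice, assumed here and not stated in the quoted text, is s = read_length // (k + 1), so that a read with at most k mismatches must contain at least one exact seed.

```python
from collections import defaultdict

def map_reference(ref_id, ref, s):
    # Map phase (reference): emit every overlapping length-s k-mer.
    for pos in range(len(ref) - s + 1):
        yield ref[pos:pos + s], ("ref", ref_id, pos)

def map_read(read_id, read, s):
    # Map phase (reads): emit only non-overlapping length-s k-mers.
    for off in range(0, len(read) - s + 1, s):
        yield read[off:off + s], ("read", read_id, off)

def shuffle(pairs):
    # Shuffle phase: group all emitted values by their k-mer key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_seed(values, refs, reads, k):
    # Reduce phase: for each shared exact seed, extend to an end-to-end
    # alignment by counting mismatches (hypothetical simplification).
    ref_hits = [v for v in values if v[0] == "ref"]
    read_hits = [v for v in values if v[0] == "read"]
    for _, ref_id, pos in ref_hits:
        for _, read_id, off in read_hits:
            read, ref = reads[read_id], refs[ref_id]
            start = pos - off  # where the whole read would begin
            if start < 0 or start + len(read) > len(ref):
                continue
            window = ref[start:start + len(read)]
            mismatches = sum(a != b for a, b in zip(read, window))
            if mismatches <= k:
                yield read_id, ref_id, start, mismatches

def align(refs, reads, s, k):
    pairs = []
    for ref_id, ref in refs.items():
        pairs.extend(map_reference(ref_id, ref, s))
    for read_id, read in reads.items():
        pairs.extend(map_read(read_id, read, s))
    hits = set()  # several seeds can report the same alignment
    for values in shuffle(pairs).values():
        hits.update(reduce_seed(values, refs, reads, k))
    return sorted(hits)

refs = {"chr": "ACGTACGTTTGACCA"}
reads = {"r1": "GTTAGA"}
print(align(refs, reads, s=3, k=1))
```

In a real MapReduce job each map and reduce call would run on a different worker; here they run in sequence so that the data flow between the three phases is easy to follow.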
JM_SOUGHT -- the next generation ;)
bioinformatics mapreduce hadoop read-alignment dna sequencing sought antispam algorithms
https://pinboard.in/
https://pinboard.in/u:jm/b:d5988af4cc00/