2015-10-05 75 views

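The SeqAn tutorial for Pattern Matching mentions that a StringSet can be used as either the haystack or the needle. I am trying to run an online pattern search with a StringSet as the haystack, as follows: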

StringSet<Dna5String> seqs; 

/* do stuff to load sequences into seqs */ 

Finder<StringSet<Dna5String> > finder(seqs); 
Pattern<Dna5String, Simple> pattern(Dna5String("GAATTC")); 

if (find(finder, pattern)) 
{ 
    std::cout << '[' << beginPosition(finder) << ',' << endPosition(finder) 
      << ")\t" << infix(finder) << std::endl; 
} else 
{ 
    std::cout << "No match!"; 
} 

The error I get:

error: use of overloaded operator '==' is ambiguous (with operand types 'const const seqan::String, seqan::Alloc >' and 'const seqan::SimpleType')

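Does anyone have an idea how this should be handled properly?

Using a single Dna5String in the Finder works fine. The tutorial does show how to perform an offline search (i.e. with an index), but that is not what I want. If the Finder-Pattern facility in SeqAn already handles this, I would rather not iterate over the StringSet manually.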

Answer


You could try:

#include <iostream> 
#include <seqan/sequence.h> // CharString, ... 
#include <seqan/find.h> 
#include <seqan/stream.h> 

using namespace seqan; 

typedef Iterator<StringSet<Dna5String> >::Type TStringSetIterator; 

int main(int, char const **) 
{ 
    StringSet<Dna5String> seqs; 
    Dna5String seq1 = 
     "TAGGTTTTCCGAAAAGGTAGCAACTTTACGTGATCAAACCTCTGACGGGGTTTTCCCCGTCGAAATTGGGTG" 
     "TTTCTTGTCTTGTTCTCACTTGGGGCATCTCCGTCAAGCCAAGAAAGTGCTCCCTGGATTCTGTTGCTAACG" 
     "AGTCTCCTCTGCATTCCTGCTTGACTGATTGGGCGGACGGGGTGTCCACCTGACGCTGAGTATCGCCGTCAC" 
     "GGTGCCACATGTCTTATCTATTCAGGGATCAGAATTCATTCAGGAAATCAGGAGATGCTACACTTGGGTTAT" 
     "CGAAGCTCCTTCCAAGGCGTAGCAAGGGCGACTGAGCGCGTAAGCTCTAGATCTCCTCGTGTTGCAACTACA" 
     "CGCGCGGGTCACTCGAAACACATAGTATGAACTTAACGACTGCTCGTACTGAACAATGCTGAGGCAGAAGAT" 
     "CGCAGACCAGGCATCCCACTGCTTGAAAAAACTATNNNNCTACCCGCCTTTTTATTATCTCATCAGATCAAG"; 
    Dna5String seq2 = 
     "ACCGACGATTAGCTTTGTCCGAGTTACAACGGTTCAATAATACAAAGGATGGCATAAACCCATTTGTGTGAA" 
     "AGTGCCCATCACATTATGATTCTGTCTACTATGGTTAATTCCCAATATACTCTCGAAAAGAGGGTATGCTCC" 
     "CACGGCCATTTACGTCACTAAAAGATAAGATTGCTCAAANNNNNNNNNACTGCCAACTTGCTGGTAGCTTCA" 
     "GGGGTTGTCCACAGCGGGGGGTCGTATGCCTTTGTGGTATACCTTACTAGCCGCGCCATGGTGCCTAAGAAT" 
     "GAAGTAAAACAATTGATGTGAGACTCGACAGCCAGGCTTCGCGCTAAGGACGCAAAGAAATTCCCTACATCA" 
     "GACGGCCGCGNNNAACGATGCTATCGGTTAGGACATTGTGCCCTAGTATGTACATGCCTAATACAATTGGAT" 
     "CAAACGTTATTCCCACACACGGGTAGAAGAACNNNNATTACCCGTAGGCACTCCCCGATTCAAGTAGCCGCG"; 

    clear(seqs); 
    appendValue(seqs, seq1); 
    appendValue(seqs, seq2); 

    Pattern<Dna5String, Simple> pattern(Dna5String("GAATTC")); 

    // For each sequence in seqs
    for (TStringSetIterator it = begin(seqs); it != end(seqs); ++it)
    {
        std::cout << *it << std::endl;
        // Create a separate finder for each sequence in seqs
        Finder<Dna5String> finder(*it);
        if (find(finder, pattern))
        {
            std::cout << '[' << beginPosition(finder) << ',' << endPosition(finder)
                      << ")\t" << infix(finder) << std::endl;
        }
        else
        {
            std::cout << "No match!" << std::endl;
        }
    }
    return 0; 
} 

Output:

 
TAGGTTTTCCGAAAAGGTAGCAACTTTACGTGATCAAACCTCTGACGGGGTTTTCCCCGTCGAAATTGGGTGTTTCTTGTCTTGTTCTCACTTGGGGCATCTCCGTCAAGCCAAGAAAGTGCTCCCTGGATTCTGTTGCTAACGAGTCTCCTCTGCATTCCTGCTTGACTGATTGGGCGGACGGGGTGTCCACCTGACGCTGAGTATCGCCGTCACGGTGCCACATGTCTTATCTATTCAGGGATCAGAATTCATTCAGGAAATCAGGAGATGCTACACTTGGGTTATCGAAGCTCCTTCCAAGGCGTAGCAAGGGCGACTGAGCGCGTAAGCTCTAGATCTCCTCGTGTTGCAACTACACGCGCGGGTCACTCGAAACACATAGTATGAACTTAACGACTGCTCGTACTGAACAATGCTGAGGCAGAAGATCGCAGACCAGGCATCCCACTGCTTGAAAAAACTATNNNNCTACCCGCCTTTTTATTATCTCATCAGATCAAG 
[247,253) GAATTC 
ACCGACGATTAGCTTTGTCCGAGTTACAACGGTTCAATAATACAAAGGATGGCATAAACCCATTTGTGTGAAAGTGCCCATCACATTATGATTCTGTCTACTATGGTTAATTCCCAATATACTCTCGAAAAGAGGGTATGCTCCCACGGCCATTTACGTCACTAAAAGATAAGATTGCTCAAANNNNNNNNNACTGCCAACTTGCTGGTAGCTTCAGGGGTTGTCCACAGCGGGGGGTCGTATGCCTTTGTGGTATACCTTACTAGCCGCGCCATGGTGCCTAAGAATGAAGTAAAACAATTGATGTGAGACTCGACAGCCAGGCTTCGCGCTAAGGACGCAAAGAAATTCCCTACATCAGACGGCCGCGNNNAACGATGCTATCGGTTAGGACATTGTGCCCTAGTATGTACATGCCTAATACAATTGGATCAAACGTTATTCCCACACACGGGTAGAAGAACNNNNATTACCCGTAGGCACTCCCCGATTCAAGTAGCCGCG 
No match! 

Edit: I hope this helps.

.... 
#include <seqan/index.h> 
.... 

Pattern<Dna5String> pattern(Dna5String("GAATTC")); 
Index< StringSet<Dna5String > > myIndex(seqs); 
Finder< Index<StringSet<Dna5String > > > finder(myIndex); 
while (find(finder, pattern)){ 
    std::cout << '[' << beginPosition(finder) << ',' << endPosition(finder) 
       << ")\t" << infix(finder) << std::endl; 
} 
.... 

Output:

 
[< 0 , 247 >,< 0 , 253 >) GAATTC 
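
To make the shape of that result easier to read: each position in the index-based output appears to be a pair of (sequence number, offset within that sequence), and the interval is half-open. The sketch below mimics this with the standard library only, so it is not SeqAn code. `Hit`, `findAll` and `printHits` are hypothetical names made up for illustration, and unlike the per-sequence loop above it reports every occurrence rather than only the first one in each sequence.

```cpp
#include <cstddef>
#include <iostream>
#include <string>
#include <vector>

// Hypothetical helper type (not part of SeqAn): one occurrence of the needle.
struct Hit
{
    std::size_t seqNo;  // index of the sequence within the set
    std::size_t begin;  // first matching position (inclusive)
    std::size_t end;    // one past the last matching position (exclusive)
};

// Hypothetical helper: scans every sequence in turn and collects all
// occurrences of the needle, including overlapping ones.
std::vector<Hit> findAll(std::vector<std::string> const & haystacks,
                         std::string const & needle)
{
    std::vector<Hit> hits;
    if (needle.empty())
        return hits;  // an empty needle is treated as "no matches"

    for (std::size_t i = 0; i < haystacks.size(); ++i)
    {
        std::size_t pos = haystacks[i].find(needle);
        while (pos != std::string::npos)
        {
            hits.push_back(Hit{i, pos, pos + needle.size()});
            pos = haystacks[i].find(needle, pos + 1);
        }
    }
    return hits;
}

// Prints each hit as a half-open interval of (sequence, offset) pairs,
// in a layout similar to the index-based output above.
void printHits(std::vector<Hit> const & hits, std::ostream & out)
{
    for (Hit const & h : hits)
        out << "[< " << h.seqNo << " , " << h.begin << " >,< "
            << h.seqNo << " , " << h.end << " >)\n";
}
```

Calling `printHits(findAll(seqs, "GAATTC"), std::cout)` on a `std::vector<std::string>` prints one line per occurrence. This is only a stand-in for the manual iteration; it does not answer whether `Finder<StringSet<...> >` is meant to work with an online `Pattern`.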

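As stated in the question, I realize this is an option, but the library documentation makes it sound as though you don't need to iterate manually. – merv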


I don't know; I will read the documentation. –


@merv I added an alternative solution.... I had to remove 'Simple' from the 'Pattern' declaration –