2010-12-08
#!/usr/local/bin/perl 
use warnings; 
use 5.012; 
use Text::Document; 
use Text::DocumentCollection; 

my $c = Text::DocumentCollection->new(file => 'coll.db' ); 

my $doc_one = Text::Document->new(lowercase => 0, compressed => 0); 
my $doc_two = Text::Document->new(lowercase => 0, compressed => 0); 
my $doc_three = Text::Document->new(lowercase => 0, compressed => 0); 

$doc_one->AddContent('foo bar biz buu muu muu'); 
$doc_two->AddContent('foo foofoo Foo foo'); 
$doc_three->AddContent('one two three foo foo'); 

$c->Add('key_one', $doc_one); 
$c->Add('key_two', $doc_two); 
$c->Add('key_three', $doc_three); 

Can someone show me a sensible, understandable example of a callback function? How do I use a callback function with Perl's Text::Document?

#!/usr/local/bin/perl 
use warnings; 
use 5.012; 
use Text::Document; 
use Text::DocumentCollection; 

my $c = Text::DocumentCollection->NewFromDB(file => 'coll.db'); 

my @result = $c->EnumerateV(\&Callback, 'the rock'); 
say "@result"; 

sub Callback { 
    ... 
    ... 
} 

# The function Callback will be called on each element of the collection as: 
# my @l = Callback($c, $key, $doc, $rock); 
# where $rock is the second argument passed to EnumerateV ('the rock' above). 
# Since $c is the first argument, the callback may be an instance method of Text::DocumentCollection. 
# The final result is obtained by concatenating all the partial results (@l in the example above). 
# If you do not want a result, simply return the empty list (). 

Answer

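Inside EnumerateV, the callback function is called once for every document in the collection, and the return values of all those calls are collected and returned as one list. There is probably a very simple, equivalent way to write this with the map function.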
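To make that mechanism concrete, here is a minimal sketch in plain Perl that does not use the module at all. The hash %docs and the helper enumerate_v are hypothetical stand-ins invented for illustration; they only imitate the documented behaviour (call the callback once per document, concatenate whatever it returns) and are not how Text::DocumentCollection is actually implemented. Visiting the keys in sorted order is also an assumption of this sketch.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for a collection: document key => { term => count }.
# The counts mirror the sample documents from the question.
my %docs = (
    key_one   => { foo => 1, bar => 1, biz => 1, buu => 1, muu => 2 },
    key_two   => { foo => 2, foofoo => 1, Foo => 1 },
    key_three => { one => 1, two => 1, three => 1, foo => 2 },
);

# Hypothetical helper imitating EnumerateV: call the callback once per
# document and let map concatenate the lists it returns.
sub enumerate_v {
    my ($docs, $callback, $rock) = @_;
    return map { $callback->($docs, $_, $docs->{$_}, $rock) } sort keys %$docs;
}

# Callback: return the key if the term occurs at least twice,
# otherwise return the empty list so nothing is added to the result.
sub has_twice {
    my ($docs, $key, $terms, $term) = @_;
    my $count = $terms->{$term} // 0;
    return $count >= 2 ? ($key) : ();
}

my @r = enumerate_v(\%docs, \&has_twice, 'foo');
print "@r\n";    # prints "key_three key_two"
```

The point of the sketch is that the "rock" is simply passed through unchanged to every callback call, and that a callback which returns the empty list contributes nothing to the final result.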

In any case, here is an example callback function for your sample data:

sub document_has_twice { 
    # return document key if term appears at least twice in the document 
    # (note: this reads the document object's internal 'terms' hash directly) 
    my ($collection_object, $key, $document, $search_term) = @_; 
    if ($document->{terms}{$search_term} 
        && $document->{terms}{$search_term} >= 2) { 
        return $key; 
    } 
    return; 
} 

my @r = $c->EnumerateV(\&document_has_twice, "foo"); 
print "These documents contain the word 'foo' at least twice: @r\n"; 

@r = $c->EnumerateV(\&document_has_twice, "muu"); 
print "These documents contain the word 'muu' at least twice: @r\n"; 

@r = $c->EnumerateV(\&document_has_twice, "stackoverflow"); 
print "These documents contain the word 'stackoverflow' at least twice: @r\n"; 

Output:

These documents contain the word 'foo' at least twice: key_three key_two 
These documents contain the word 'muu' at least twice: key_one 
These documents contain the word 'stackoverflow' at least twice: 