Reading files line by line


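I have two files that I want to read line by line (the first contains one word per line, the second one sentence per line).

The goal is to count, for each word in file 1, the number of sentences in file 2 that contain it.

Here is my code: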

# 'test' is the file that contains my words
open(my $words, '<:utf8', 'test') or die "Unable to open for read: $!";
# 'sentences' is the file that contains one sentence per line
open(my $sentences, '<:utf8', 'sentences') or die "Unable to open for read: $!";
open my $fh_resultat, ">:utf8", 'result';
my $word;
# I want to count the sentences in $sentences that contain each word from $words
while(defined($word = <$words>)) { 
    chomp $word ; 
    $word =~ s/^\s*|\s*$//g; 
    my $nb = 0; 
    my $idf; 
    my $ph; 
    while (defined ($ph = <$sentences>)){ 
     my @tab = split(/ /, $ph); 
     chomp @tab ; 
     foreach my $val(@tab) { 
      if($word eq $val){ 
       $nb = $nb + 1; 
       last; 
      } 
     } 
    } 
    print $fh_resultat "$word:$nb\n"; 
}

But it only processes the first word of the first file!


2017-05-16 rim

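If you are asking a large number of people to read and understand your code, make it as easy to read as possible. I have done some light reformatting to add indentation and make your use of whitespace more consistent. Please do this yourself in future.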

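Once you have read a filehandle to the end of the file, the next read from that filehandle returns undef. It will keep returning undef no matter how many times you call it.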
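The behaviour at end of file can be seen in a minimal sketch. It assumes an in-memory filehandle (opened on a reference to the hypothetical string `$data`) in place of a real file, so that it runs without any files on disk.

```perl
use strict;
use warnings;

# In-memory "file" with two lines, standing in for a real file on disk.
my $data = "first\nsecond\n";
open(my $fh, '<', \$data) or die "Unable to open in-memory file: $!";

my @lines     = <$fh>;   # list context: reads every remaining line
my $after_eof = <$fh>;   # scalar context at end of file: undef
my $again     = <$fh>;   # another read at end of file: still undef

print "lines read: ", scalar(@lines), "\n";
print "read after EOF is ", (defined $after_eof ? "defined" : "undef"), "\n";
```

This is why the inner `while` loop in the question finds nothing to read once the first word has been processed.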

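You cannot iterate over the sentences file a second time without using the seek() function to reset the file pointer to the start of the file.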

seek $sentences, 0, 0;
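As a rough sketch of where that call would go, the rewind belongs inside the outer loop, just before the inner loop starts reading. The helper name `count_sentences` and the sample data are hypothetical, in-memory filehandles stand in for the real files, and `split ' '` (any run of whitespace) is used here instead of the question's `split(/ /, ...)`.

```perl
use strict;
use warnings;

# Hypothetical helper: for each word, count the sentences that contain it,
# rewinding the sentences handle before every pass.
sub count_sentences {
    my ($words_fh, $sentences_fh) = @_;
    my @results;
    while (defined(my $word = <$words_fh>)) {
        chomp $word;
        $word =~ s/^\s+|\s+$//g;
        next if $word eq '';    # skip blank lines (an addition, not in the question)

        # Rewind to offset 0 from the start of the file (whence 0) so the
        # inner loop sees every sentence again.
        seek($sentences_fh, 0, 0) or die "Unable to seek: $!";

        my $nb = 0;
        while (defined(my $ph = <$sentences_fh>)) {
            chomp $ph;
            # Count the sentence once if any of its tokens equals the word.
            $nb++ if grep { $_ eq $word } split ' ', $ph;
        }
        push @results, "$word:$nb";
    }
    return @results;
}

# Sample data (assumed for illustration only).
my $words     = "cat\ndog\nbird\n";
my $sentences = "the cat sat\nthe dog and the cat\nno pets here\n";
open(my $wfh, '<', \$words)     or die "Unable to open words: $!";
open(my $sfh, '<', \$sentences) or die "Unable to open sentences: $!";

my @results = count_sentences($wfh, $sfh);
print "$_\n" for @results;
```

With real files the same loop applies unchanged, although every word then costs a full re-read of the sentences file.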

Alternatively, you might consider reading one (or both) of your files into memory, so that you do not need to keep re-reading the files.


2017-05-16 12:53:08

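Looking at your code: processing is only performed for the first word because you iterate through the entire "sentences" file while handling the first line read from the "word" file, which leaves nothing to read for the remaining words.

The two solutions have already been mentioned above: using seek, or loading the file into memory.

I am an advocate of loading the file into memory and processing it from there.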

# 'test' is the file that contains my words
open(my $words, '<:utf8', 'test') or die "Unable to open for read: $!";

# 'sentences' is the file that contains one sentence per line
open(my $sentences, '<:utf8', 'sentences') or die "Unable to open for read: $!";
open(my $fh_resultat, '>:utf8', 'result') or die "Unable to open for write: $!";
my $word;

# I want to count the sentences in $sentences that contain each word from $words

# load sentences into memory
my @process;
while (my $line = <$sentences>) {
    push(@process, $line);
}
close($sentences);

while(defined($word = <$words>)) { 
    chomp $word ; 
    $word =~ s/^\s*|\s*$//g; 
    my $nb = 0; 
    my $idf; 
    my $ph; 

    for $ph (@process) { 
     my @tab = split(/ /, $ph); 
     chomp @tab ; 
     foreach my $val(@tab) { 
      if($word eq $val){ 
       $nb = $nb + 1; 
       last; 
      } 
     } 
    } 
    print $fh_resultat "$word:$nb\n"; 
}


2017-05-16 15:22:08 Carlos

You have a rather long-winded way of writing 'my @process = <$sentences>;' :-)
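The shorter form works because reading a filehandle in list context returns all remaining lines at once. A minimal sketch, assuming an in-memory filehandle and made-up sample sentences in place of the real 'sentences' file:

```perl
use strict;
use warnings;

# Sample data standing in for the 'sentences' file (assumed for illustration).
my $data = "the cat sat\nthe dog and the cat\n";
open(my $sentences, '<', \$data) or die "Unable to open for read: $!";

# List context: one assignment reads every line into the array.
my @process = <$sentences>;
close($sentences);

# Strip the trailing newline from every element in one call.
chomp @process;

print scalar(@process), " sentences loaded\n";
```

Chomping the whole array once up front also removes the need to chomp the tokens of every sentence inside the loop.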
