After some optimization, I have:
my %ebcdic_fields;
# Read the format definition once and keep only the fields flagged 'A'.
while (my $line = <DATA>) {
    my (undef, undef, $start, undef, $length, $indicator) = split /,/, $line;
    next if $indicator !~ m/^A/;
    # Key: zero-based offset. Value: width of the integer part
    # (the last two characters of the field are the decimals).
    $ebcdic_fields{$start-1} = $length - 2;
}
while (my $line = <>) {
    while (my ($start, $length) = each %ebcdic_fields) {
        my $fpos = $start + $length + 1; # position of the overpunched sign character
        my $before = substr ($line, $start, $length);
        my $format = substr ($line, $fpos, 1);
        my $trimmed_before = $before + 0; # keep at least one 0 before the dot
        if ($format ge 'J' and $format ne '{') {
            # Negative: '}' and J..R stand for the digits 0..9.
            substr ($line, $fpos, 1) =~ tr/}JKLMNOPQR/0123456789/;
            substr ($line, $start, $length) = ' ' x ($length - length($trimmed_before) - 2) . '-' . $trimmed_before . '.';
        } else {
            # Positive: '{' and A..I stand for the digits 0..9.
            substr ($line, $fpos, 1) =~ tr/{ABCDEFGHI/0123456789/;
            substr ($line, $start, $length) = ' ' x ($length - length($trimmed_before) - 1) . $trimmed_before . '.';
        }
    }
    print $line;
}
__DATA__
1,SEQ_NO,1,11,11,N
2,CTR_REC_ID,12,14,3,N
3,CTR_SEQ_AMT,15,23,9,A
4,CTR_CONTRACT_NO,24,46,23,N
5,CTR_CONTRACT_AMT,47,59,13,A
6,CTR_TRACK_NO,60,62,3,N
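To make the intended conversion concrete, here is a minimal sketch that decodes a single field in isolation. It assumes what the listing above appears to aim for: the field is zoned decimal with two implied decimals, and the last character is an overpunched sign (`{` and `A`..`I` for positive 0..9, `}` and `J`..`R` for negative 0..9). The helper name `decode_field` is hypothetical and not part of the original code.

```perl
use strict;
use warnings;

# Hypothetical helper: decode one fixed-width field with two implied
# decimals and a trailing overpunched sign character. The result is
# right-justified to the original field width.
sub decode_field {
    my ($raw) = @_;
    my $width = length $raw;
    my $sign  = substr($raw, -1, 1);

    # tr/// returns the number of characters it translated, so a non-zero
    # count here means the sign character was one of the negative codes.
    my $negative = ($sign =~ tr/}JKLMNOPQR/0123456789/);
    $sign =~ tr/{ABCDEFGHI/0123456789/ unless $negative;

    my $integer = substr($raw, 0, $width - 2) + 0;   # drops leading zeros, keeps one 0
    my $tenths  = substr($raw, -2, 1);
    my $text    = ($negative ? '-' : '') . $integer . '.' . $tenths . $sign;
    return sprintf '%*s', $width, $text;
}

print decode_field('00001234E'), "\n";   # prints "   123.45"
print decode_field('00001234N'), "\n";   # prints "  -123.45"
```

The output keeps the original width because the dot (and the minus sign, if any) takes the place of leading zeros. A value with too few leading zeros would make the field wider, which is also how the listing above behaves.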
Sadly, the fastest version I have managed is this:
my %ebcdic_fields;
# Read the format definition once and keep only the fields flagged 'A'.
while (my $line = <DATA>) {
    my (undef, undef, $start, undef, $length, $indicator) = split /,/, $line;
    next if $indicator !~ m/^A/;
    $ebcdic_fields{$start-1} = $length - 2;
}
while (my $line = <>) {
    while (my ($start, $length) = each %ebcdic_fields) {
        my $format = substr ($line, $start + $length + 1, 1); # overpunched sign character
        my $trimmed_before = (substr ($line, $start, $length) + 0); # keep at least one 0 before the dot
        # Positive values: '{' and A..I stand for a final digit of 0..9.
        if ($format eq '{') {
            substr ($line, $start, $length+2) = ' ' x ($length - length($trimmed_before) - 1) . $trimmed_before . '.' . substr ($line, $start + $length, 1) . '0';
        } elsif ($format eq 'A') {
            substr ($line, $start, $length+2) = ' ' x ($length - length($trimmed_before) - 1) . $trimmed_before . '.' . substr ($line, $start + $length, 1) . '1';
        } elsif ($format eq 'B') {
            substr ($line, $start, $length+2) = ' ' x ($length - length($trimmed_before) - 1) . $trimmed_before . '.' . substr ($line, $start + $length, 1) . '2';
        } elsif ($format eq 'C') {
            substr ($line, $start, $length+2) = ' ' x ($length - length($trimmed_before) - 1) . $trimmed_before . '.' . substr ($line, $start + $length, 1) . '3';
        } elsif ($format eq 'D') {
            substr ($line, $start, $length+2) = ' ' x ($length - length($trimmed_before) - 1) . $trimmed_before . '.' . substr ($line, $start + $length, 1) . '4';
        } elsif ($format eq 'E') {
            substr ($line, $start, $length+2) = ' ' x ($length - length($trimmed_before) - 1) . $trimmed_before . '.' . substr ($line, $start + $length, 1) . '5';
        } elsif ($format eq 'F') {
            substr ($line, $start, $length+2) = ' ' x ($length - length($trimmed_before) - 1) . $trimmed_before . '.' . substr ($line, $start + $length, 1) . '6';
        } elsif ($format eq 'G') {
            substr ($line, $start, $length+2) = ' ' x ($length - length($trimmed_before) - 1) . $trimmed_before . '.' . substr ($line, $start + $length, 1) . '7';
        } elsif ($format eq 'H') {
            substr ($line, $start, $length+2) = ' ' x ($length - length($trimmed_before) - 1) . $trimmed_before . '.' . substr ($line, $start + $length, 1) . '8';
        } elsif ($format eq 'I') {
            substr ($line, $start, $length+2) = ' ' x ($length - length($trimmed_before) - 1) . $trimmed_before . '.' . substr ($line, $start + $length, 1) . '9';
        # Negative values: '}' and J..R stand for a final digit of 0..9.
        } elsif ($format eq '}') {
            substr ($line, $start, $length+2) = ' ' x ($length - length($trimmed_before) - 2) . '-' . $trimmed_before . '.' . substr ($line, $start + $length, 1) . '0';
        } elsif ($format eq 'J') {
            substr ($line, $start, $length+2) = ' ' x ($length - length($trimmed_before) - 2) . '-' . $trimmed_before . '.' . substr ($line, $start + $length, 1) . '1';
        } elsif ($format eq 'K') {
            substr ($line, $start, $length+2) = ' ' x ($length - length($trimmed_before) - 2) . '-' . $trimmed_before . '.' . substr ($line, $start + $length, 1) . '2';
        } elsif ($format eq 'L') {
            substr ($line, $start, $length+2) = ' ' x ($length - length($trimmed_before) - 2) . '-' . $trimmed_before . '.' . substr ($line, $start + $length, 1) . '3';
        } elsif ($format eq 'M') {
            substr ($line, $start, $length+2) = ' ' x ($length - length($trimmed_before) - 2) . '-' . $trimmed_before . '.' . substr ($line, $start + $length, 1) . '4';
        } elsif ($format eq 'N') {
            substr ($line, $start, $length+2) = ' ' x ($length - length($trimmed_before) - 2) . '-' . $trimmed_before . '.' . substr ($line, $start + $length, 1) . '5';
        } elsif ($format eq 'O') {
            substr ($line, $start, $length+2) = ' ' x ($length - length($trimmed_before) - 2) . '-' . $trimmed_before . '.' . substr ($line, $start + $length, 1) . '6';
        } elsif ($format eq 'P') {
            substr ($line, $start, $length+2) = ' ' x ($length - length($trimmed_before) - 2) . '-' . $trimmed_before . '.' . substr ($line, $start + $length, 1) . '7';
        } elsif ($format eq 'Q') {
            substr ($line, $start, $length+2) = ' ' x ($length - length($trimmed_before) - 2) . '-' . $trimmed_before . '.' . substr ($line, $start + $length, 1) . '8';
        } elsif ($format eq 'R') {
            substr ($line, $start, $length+2) = ' ' x ($length - length($trimmed_before) - 2) . '-' . $trimmed_before . '.' . substr ($line, $start + $length, 1) . '9';
        }
    }
    print $line;
}
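A table-driven variant may be worth timing against the chain above. The sketch below replaces the twenty branches with one hash lookup per field and uses four-argument `substr` for the in-place replacement. It has not been benchmarked, so whether it is faster is an open question. The names `%overpunch`, `convert_line`, and `@fields` are hypothetical, and the field list is written as `[start - 1, length - 2]` pairs, the same values the loop above stores in `%ebcdic_fields`.

```perl
use strict;
use warnings;

# Map each overpunch character to [sign prefix, final digit].
my %overpunch;
my @positive = ('{', 'A' .. 'I');
my @negative = ('}', 'J' .. 'R');
for my $digit (0 .. 9) {
    $overpunch{ $positive[$digit] } = [ '',  $digit ];
    $overpunch{ $negative[$digit] } = [ '-', $digit ];
}

# Hypothetical helper: rewrite every listed field of one line and return
# the converted line. $fields is a reference to a list of
# [zero-based start, integer-part length] pairs.
sub convert_line {
    my ($line, $fields) = @_;
    for my $field (@$fields) {
        my ($start, $length) = @$field;
        my $entry = $overpunch{ substr($line, $start + $length + 1, 1) }
            or next;    # unknown sign character: leave the field untouched
        my ($sign, $digit) = @$entry;
        my $text = $sign
                 . (substr($line, $start, $length) + 0) . '.'
                 . substr($line, $start + $length, 1) . $digit;
        # Right-justify within the original field width ($length + 2).
        substr($line, $start, $length + 2, sprintf('%*s', $length + 2, $text));
    }
    return $line;
}

# Fields 3 and 5 from the format data above.
my @fields = ([14, 7], [46, 11]);

# Made-up sample record, 62 characters wide, for illustration only.
my $sample = '00000000001' . 'ABC' . '00001234E' . ('X' x 23)
           . '000000009876N' . '001';
print convert_line($sample, \@fields), "\n";
```

For the made-up sample record, the two amount fields come out as `   123.45` and `      -987.65`, and the rest of the line is unchanged. Unlike the chain, this version skips a field whose sign character is not one of the twenty known codes instead of falling through silently, which is a design choice rather than something the original specifies.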
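I find this description hard to follow. What are the 5, 22, 24 at the start of the source file? And where is the indicator in the source file - it seems to be in the format file? –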
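Sorry for the confusion. I added the pipes above the source file data to show the position of each field. The pipes are not part of the source file data. Now, if we look at the format file, the first field is SEQ_NO, which is 11 characters long and runs from position 1 to position 11 in the source file. The INDICATOR is in the format file. We need to read the format file to learn the details of each field in the source file. The indicator for the CTR_CONTRACT_AMT field in the format file is A, which means the field needs to be converted. I hope this clears up your doubts @Mark Setchell – user1768029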
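It still doesn't make sense! If you have 10 fields to convert, you read the format file **once** at the start, and then, as you read each line of the data file, you perform all 10 conversions and write the corresponding output line. You don't have to hold multiple lines in memory, because you never need any other line to process the current one, so memory will not be a problem. –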