2013-03-20 100 views
0

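I want to use a Perl script to split one file into two different files. For example, if the file is named example.txt, it should be split into two files such as EX1.txt and EX2.txt.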

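The split depends on the second field of each line. For example, if an HDR line has TEA003890459 as its second field, the output goes to EX1.txt, but if the HDR line has TEA003886004, the output goes to EX2.txt. I also want to count the claim numbers.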

I want to do this using the following logic:

if Header-Row then 
    if Dummy cost center then 
     write to Gas file 
     keep in mind: Claim-Nummer (eg. Array or Hash) 
    else 
     write to normal file 
    end if 
else if Detail-Row then 
    if kept Claim-Nummer then 
     write to Gas file 
    else 
     write to normal file 
    end if 
end if 

The file contains the following data:

HDR^TEA003890459^082582^Mohd Jamil^Jamili Fahmi Bin^^458^+^92000^+^92000^+^0000^+^0000^+^0000^^0^^0^^0^^0^^0^^0^20130307^^^^^^^222^MY0BD^2^[email protected]^  MY0BCC#6482362304         
DTL^TEA003890459^E^MY0BCC#6482362304    641301137^+^47000^MFA^20130209^Medical Expenses [Family]^^^0^^0^^0^^0^^0^^0^^0^^0^^C16Medical 
DTL^TEA003890459^E^MY0BCC#6482362304    641301137^+^45000^MGE^20130304^Medical Expenses (Employee clinica^^^0^^0^^0^^0^^0^^0^^0^^0^^C16Medical 
HDR^TEA003886004^082770^Bin Omar^Mohamad Fadzlizam^^458^+^135800^+^135800^+^0000^+^0000^+^0000^^0^^0^^0^^0^^0^^0^20130307^^^^^^^222^MY0BD^4^[email protected]^  MY0BCC#6485163100         
DTL^TEA003886004^E^MY0BCC#6485163100    641301137^+^25000^MFA^20130221^Medical Expenses [Family]^^^0^^0^^0^^0^^0^^0^^0^^0^^C16Claim 
DTL^TEA003886004^E^MY0BCC#6485163100    641301137^+^37150^MFA^20130224^Medical Expenses [Family]^^^0^^0^^0^^0^^0^^0^^0^^0^^C16Claim 
DTL^TEA003886004^E^MY0BCC#6485163100    641301137^+^23650^MFA^20130226^Medical Expenses [Family]^^^0^^0^^0^^0^^0^^0^^0^^0^^C16Claim 
DTL^TEA003886004^E^MY0BCC#6485163100    641301137^+^50000^MGE^20130304^Medical Expenses (Employee clinica^^^0^^0^^0^^0^^0^^0^^0^^0^^C16Claim 
HDR^TEA003886162^082792^Lim^Jia Jieh^^458^+^280400^+^280400^+^0000^+^0000^+^0000^^0^^0^^0^^0^^0^^0^20130305^^^^^^^222^MY0BD^4^[email protected]^  MY0BCC#6482363474         
DTL^TEA003886162^E^MY0BCC#6482363474    641301137^+^110000^MGE^20130131^Medical Expenses (Employee clinica^^^0^^0^^0^^0^^0^^0^^0^^0^^C16Medical claim 31/1,20/2,28/2 
DTL^TEA003886162^E^MY0BCC#6482363474    641301137^+^60000^MGE^20130220^Medical Expenses (Employee clinica^^^0^^0^^0^^0^^0^^0^^0^^0^^C16Medical claim 31/1,20/2,28/2 
DTL^TEA003886162^E^MY0BCC#6482363474    641301137^+^50400^MGE^20130220^Medical Expenses (Employee clinica^^^0^^0^^0^^0^^0^^0^^0^^0^^C16Medical claim 31/1,20/2,28/2 
DTL^TEA003886162^E^MY0BCC#6482363474    641301137^+^60000^MGE^20130228^Medical Expenses (Employee clinica^^^0^^0^^0^^0^^0^^0^^0^^0^^C16Medical claim 31/1,20/2,28/2 
+1

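I tried to rephrase your question. Please check whether I accidentally changed the meaning. **Questions:** ① What is "TDR"? ② What is a "Dummy cost center"? ③ What is the "*Gas* file"? ④ Won't your example file be split into three files? ⑤ Your file looks like comma-separated data (CSV) using '^' as the field separator. Can you confirm this? – amon 2013-03-20 13:40:01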

Answers

0

Your explanation, your pseudocode, and your sample data each seem to tell a different story.

But to read the second field once the file is open, and sort the lines in the manner described:

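use strict; 
use warnings; 

# The input name is taken from the question's example; adjust as needed. 
open(my $ifile, '<', 'example.txt') || die "example.txt $!"; 
open(my $ex1, '>', 'EX1.txt') || die "EX1.txt $!"; 
open(my $ex2, '>', 'EX2.txt') || die "EX2.txt $!"; 
my $wanted = "TEA003890459"; 
while (my $line = <$ifile>) { 
    my @field = split(/\^/, $line); 
    if (($field[1] // '') eq $wanted) { # fields start from 0 so 1 is the second field 
        print $ex1 $line; 
    } else { 
        print $ex2 $line; 
    } 
} 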

EDIT: fixed the split argument
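For comparison, the pseudocode in the question can be followed more literally: remember the claim number of each matching header, then route detail rows by looking that number up. This is only a minimal sketch. The `route_lines` helper is hypothetical, and treating a fixed set of claim numbers as the "Dummy cost center" test is an assumption, since the question never defines that term.

```perl
use strict;
use warnings;

# Hypothetical helper: sorts lines into two buckets following the
# question's pseudocode. ASSUMPTION: a header counts as "dummy cost
# center" when its claim number (second field) is a key of %$is_dummy.
sub route_lines {
    my ($is_dummy, @lines) = @_;
    my (%kept, @gas, @normal);
    for my $line (@lines) {
        my ($type, $claim) = split /\^/, $line;
        next unless defined $claim;            # skip blank or malformed lines
        if ($type eq 'HDR' && $is_dummy->{$claim}) {
            $kept{$claim} = 1;                 # remember the claim number
        }
        # Header and detail rows follow the same rule once a claim is kept.
        if ($kept{$claim}) { push @gas, $line } else { push @normal, $line }
    }
    return (\@gas, \@normal, scalar keys %kept);
}

# Abbreviated rows modelled on the question's sample data.
my @sample = (
    "HDR^TEA003890459^082582\n",
    "DTL^TEA003890459^E\n",
    "HDR^TEA003886004^082770\n",
    "DTL^TEA003886004^E\n",
);
my ($gas, $normal, $kept_count) = route_lines({ TEA003890459 => 1 }, @sample);
printf "gas: %d lines, normal: %d lines, kept claims: %d\n",
    scalar @$gas, scalar @$normal, $kept_count;
```

The third return value is the number of distinct claim numbers that were remembered, which is one possible reading of "count the claim numbers".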

+0

The first argument of 'split' is a regular expression, in which '^' is a metacharacter. → 'split /\^/' or something. – amon 2013-03-20 14:03:04
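The difference is easy to check on a single row. The following is a small sketch with an abbreviated row from the question's data; the variable names are illustrative only.

```perl
use strict;
use warnings;

my $row = "HDR^TEA003890459^082582";

# Unescaped: ^ is an anchor, so the caret character is never split on.
my @unescaped = split /^/, $row;

# Escaped: \^ matches a literal caret, giving one element per field.
my @escaped = split /\^/, $row;

printf "unescaped: %d field(s), escaped: %d field(s)\n",
    scalar @unescaped, scalar @escaped;
```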

+0

Quite right, fixed – Vorsprung 2013-03-21 08:43:09

+0

Instead of adding 'EDIT: fix split arg', why don't you put it in the edit summary? – 2013-03-22 01:26:11

0

Something like:

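#!/usr/bin/perl 
use strict; 
use warnings; 

my $out_fh;                           # handle for the current header's file 
while (<>) { 
     my @out = split(/\^/, $_); 
     if ($out[0] eq 'HDR') { 
       close $out_fh if $out_fh; 
       # one output file per header, named after the second field 
       open $out_fh, '>>', "$out[1].txt" or die "$out[1].txt: $!"; 
     } elsif ($out[0] eq 'DTL') { 
       # only detail rows are written; skipped if no header was seen yet 
       print {$out_fh} $_ if $out_fh; 
     } 
} 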

Run it with:

./split.pl < infile.txt 

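This will split the file by header, writing the detail rows for each claim number to a separate file. You can then use the Linux wc command to count the entries in each file.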
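The counting can also be done in Perl rather than with wc. This is a minimal sketch, assuming the goal is the number of DTL rows per claim number (the second field); the `count_details` helper is hypothetical and not part of the script above.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a hash ref mapping claim number => DTL row count.
sub count_details {
    my @lines = @_;
    my %count;
    for my $line (@lines) {
        my ($type, $claim) = split /\^/, $line;
        next unless defined $claim && $type eq 'DTL';
        $count{$claim}++;
    }
    return \%count;
}

# Abbreviated rows modelled on the question's sample data.
my $count = count_details(
    "HDR^TEA003890459^082582\n",
    "DTL^TEA003890459^E\n",
    "DTL^TEA003890459^E\n",
    "HDR^TEA003886004^082770\n",
    "DTL^TEA003886004^E\n",
);
print "$_: $count->{$_}\n" for sort keys %$count;
```

To count a real file, the literal rows could be replaced with `count_details(<>)` and the script run as `./count.pl < infile.txt`, where `count.pl` is just a placeholder name.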

+1

Why are you using 2 arg ['open'](http://p3rl.org/open "perldoc -f open") instead of 3 arg ['open'](http://p3rl.org/open "perldoc -f open")? Why aren't you including ['$!'](http://perldoc.perl.org/perlvar.html#%24! "perldoc -v '$!'") in 'or' ['die'](http://p3rl.org/die "perldoc -f die")? – 2013-03-22 01:22:14