2013-03-13 45 views
1

How to use awk, sed, or a Perl script to get the desired file

I have a file formatted like this:

101 start_time 
102 start_time 
101 end_time 
103 start_time 
103 end_time 
102 end_time 
104 start_time 
104 end_time 
102 start_time 
102 end_time 

I would like an output file like this:

101 start_time end_time 
102 start_time end_time 
103 start_time end_time 
104 start_time end_time 
102 start_time end_time 

How can this be done with basic sed or awk operations, or with Perl? Please help!

+0

grep -v end | sed 's/$/end_time/' | sort | uniq (or something like that; I haven't tested it). – 2013-03-13 22:41:45

Answers

2

How about:

awk '$1 in a{ print $1, a[$1], $2; delete a[$1]; next} {a[$1] = $2}' input 
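
For readers who want to see what the one-liner does step by step, here is the same logic spread out with comments. This is only a sketch: the function name pair_by_end and the sample values (s1, e1, ...) are made up for illustration. Note that a row is printed when its second line is read, so rows come out in the order the end lines arrive, which for the sample input in the question would put 103 before 102.

```shell
# pair_by_end: hypothetical helper name; reads "id value" lines on stdin.
pair_by_end() {
  awk '
    # Second sighting of an id: print "id start end", then forget the id
    # so that a later start line for the same id begins a fresh pair.
    $1 in a { print $1, a[$1], $2; delete a[$1]; next }
    # First sighting of an id: remember its start value.
    { a[$1] = $2 }
  '
}

# Small demonstration with made-up values.
printf '%s\n' '101 s1' '102 s2' '101 e1' '102 e2' | pair_by_end
```
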
+0

You have duplicate lines – 2013-03-13 22:44:02

+0

@sputnick I'm not sure I understand your comment. The OP wants duplicate lines in the output, and this provides them as required. – 2013-03-13 22:48:35

+1

I would have used $1, but anyway, +1 ... – 2013-03-13 22:59:43

0
perl -anE'say "@F end_time" if $F[1] eq "start_time"' 
0

Here is a Perl approach.

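  • Note 1: it is not very well written, but it works.
  • Note 2: my answer assumes that "start_time" and "end_time" are not just literal strings but some kind of timestamp or similar value.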

Here you go:

#!/usr/bin/perl 
use warnings; 
use strict; 

my @waiting; #here we will keep track of the order 
my %previous; #here we will save previous rows that still can't be printed 
open (my $IN,'<','file.txt') or die "$!"; #assuming that your data is in file.txt 
while (<$IN>) { 
    chomp; 
    my ($id,$time)=split/ /; 
    if (exists $previous{$id}) { #if this is the end_time 
     $previous{$id}->[1]=$time; 
     if ($waiting[0]==$id) { #if we are not waiting for another row's end_time 
      my $count=0; 
      for (@waiting) { #print anything you have available 
       last if !defined $previous{$_}->[1]; 
       print join(' ',$_,@{$previous{$_}}),"\n"; 
       delete $previous{$_}; 
       $count++; 
      } 
      shift @waiting for 1..$count; 
     } 
    } 
    else { #if this is the start_time 
     push @waiting,$id; 
     $previous{$id}=[$time,undef]; 
    } 
} 
close $IN; 
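
The same queue idea can also be sketched in awk, for anyone who prefers to stay with the tools named in the question. This is not the script above translated line by line, just a rough equivalent under the same assumption (the first line for an id is its start, the next one its end). The function name pair_by_start and all array names are made up for illustration. Rows are printed in the order their start lines appeared, and an id whose end line never arrives is silently dropped, along with every row queued behind it.

```shell
# pair_by_start: hypothetical helper name; reads "id value" lines on stdin.
pair_by_start() {
  awk '
    # Start line: the id is not pending yet, so append it to the queue.
    !($1 in pending) {
      pending[$1] = 1
      id[++tail] = $1; st[tail] = $2; closed[tail] = 0
      slot[$1] = tail          # where this id sits in the queue
      next
    }
    # End line: fill in the end value, then print every finished row
    # at the head of the queue, stopping at the first unfinished one.
    {
      i = slot[$1]; en[i] = $2; closed[i] = 1
      delete pending[$1]
      while (head < tail && closed[head + 1]) {
        head++
        print id[head], st[head], en[head]
      }
    }
  '
}

# Small demonstration with made-up values.
printf '%s\n' '101 s1' '102 s2' '101 e1' '102 e2' | pair_by_start
```
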