2017-04-04 66 views
0

This is a raw awk script that I want to format. How can I merge two awk files into one?

Input: the original awk script, saved as test.txt

awk 'BEGIN {maxlength = 0}\ 
    {\ 
      if (length($0) > maxlength) {\ 
       maxlength = length($0);\ 
       longest = $0;\ 
      }\ 
    }\ 
    END {print longest}' somefile 

Expected output: a well-formatted awk script

awk 'BEGIN {maxlength = 0}                          \
    {                                               \
      if (length($0) > maxlength) {                 \
       maxlength = length($0);                      \
       longest = $0;                                \
      }                                             \
    }                                               \
    END {print longest}' somefile

Step 1: get the length (in characters) of the longest line

step1.awk

#!/usr/bin/awk -f
BEGIN {max =0 } 
{ 
    if (length($0) > max) { max = length($0)} 
} 
END {print max} 

awk -f step1.awk test.txt

The maximum length over all lines is now known to be 50.

Step 2: put the \ at position 50 + 2 = 52.
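As a quick check of step 1, here is a minimal sketch run on a made-up three-line sample. The file contents and the variable names `sample` and `max` are assumptions for illustration; they are not the asker's test.txt.

```shell
# Hypothetical sample file; the contents are made up for illustration.
sample=$(mktemp)
printf '%s\n' 'short \' 'a much longer line \' 'end' > "$sample"

# Same logic as step1.awk: remember the largest length seen, print it at the end.
max=$(awk 'BEGIN { max = 0 }
           { if (length($0) > max) max = length($0) }
           END { print max }' "$sample")
echo "$max"    # 20 for this sample: the longest line has 20 characters

rm -f "$sample"
```

The value printed here is what would then be passed to step 2, plus whatever gap is wanted before the backslash.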

step2.awk

#!/usr/bin/awk -f
{ 
if($0 ~ /\\$/){ 
    gsub(/\\$/,"",$0); 
    printf("%-*s\\\n",n,$0); 
    } 
else{ 
    printf("%s\n",$0); 
    } 
} 

awk -f step2.awk -v n=52 test.txt > well_formatted.txt

How can I combine step 1 and step 2 into a single step, merging step1.awk and step2.awk into one awk file?
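To see what the `%-*s` format in step2.awk does, here is a small sketch on a made-up sample. In that format `-` left-justifies the text and `*` takes the field width from the next argument, here `n`. The sample contents, the width 10, and the variable names `sample` and `formatted` are assumptions for illustration.

```shell
# Hypothetical sample: two continuation lines and one plain line.
sample=$(mktemp)
printf '%s\n' 'ab \' 'abcdef \' 'end' > "$sample"

# The longest line has 8 characters, so n = 8 + 2 = 10 is used as the pad width.
formatted=$(awk -v n=10 '{
    if ($0 ~ /\\$/) {
        sub(/\\$/, "")              # drop the trailing backslash
        printf "%-*s\\\n", n, $0    # pad to n columns, then put the backslash back
    } else {
        print                       # lines without a backslash pass through
    }
}' "$sample")
printf '%s\n' "$formatted"

rm -f "$sample"
```

With this sample both continuation lines come out 11 characters long, so their backslashes line up in column 11, and the last line is printed unchanged.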

+0

Post your input and expected output to get help quickly. – RomanPerekhrest

Answers

4

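A better version: you can use sub() instead of gsub(), and avoid testing the same regex twice by using the substitution itself as the pattern, sub(/\\$/,""){ ... }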

awk 'FNR==NR{ 
      if(length>max)max = length 
      next 
    } 
    sub(/\\$/,""){ 
      printf "%-*s\\\n", max+2, $0 
      next 
    }1' test.txt test.txt 

Explanation

awk 'FNR==NR{                        # first pass over the file: find the
                                     # maximum line length;
                                     # FNR==NR is true only while awk reads the first file

      if(length>max)max = length     # remember the longest length so far
      next                           # skip the rest, go to the next line
    }
    sub(/\\$/,""){                   # second pass over the same file:
                                     # true if a trailing backslash was removed from the record

      printf "%-*s\\\n", max+2, $0   # left-justify to width max+2, then re-append the backslash
      next                           # go to the next line
    }1                               # 1 is an always-true pattern with the default action, print $0;
                                     # it plays the role of the else branch printf("%s\n",$0) in step2
    ' test.txt test.txt
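The `FNR==NR` test is what splits the work into two passes, so it may help to see it in isolation. This is a minimal sketch on a made-up two-line file; the variable names `f` and `passes` and the labels are assumptions for illustration.

```shell
# Hypothetical two-line file, named twice on the command line.
f=$(mktemp)
printf '%s\n' 'one' 'two' > "$f"

# NR counts records across all files, FNR restarts at 1 for each file,
# so FNR == NR holds only while the first file argument is being read.
passes=$(awk 'FNR == NR { print "pass 1:", $0; next }
              { print "pass 2:", $0 }' "$f" "$f")
printf '%s\n' "$passes"

rm -f "$f"
```

This prints both lines labelled "pass 1:" and then both again labelled "pass 2:". One caveat: if the first file is empty, NR is still 0 when the second file starts, so `FNR==NR` also holds during the second read.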

You haven't told us what your input and expected output are, so I'm making some assumptions.

If the input looks like this:

$ cat f
123  \ 
\ 
12345  
123456 \ 
1234567 \ 
123456789 
12345  

You get the following:

$ awk 'FNR==NR{ if(length>max)max = length; next}
sub(/\\$/,"",$0){ printf "%-*s\\\n",max+2,$0; next }1' f f 
123   \ 
      \ 
12345  
123456  \ 
1234567  \ 
123456789 
12345 
+0

You can use `sub()` instead of `gsub()`. To keep your code as you wrote it, I didn't modify it; I only merged it into one. –

+1

Almost the same code, +1 (but it is objective :-D). – NeronLeVelu

+0

@NeronLeVelu Thank you. The OP didn't show anything except code :) –

0

Maybe something like this?

wc -L test.txt | cut -f1 -d' ' | xargs -I{} sed -i -e :a -e 's/^.\{1,'{}'\}$/& /;ta' test.txt && sed -i -r 's/(\\)([ ]*)$/\2\1/g' test.txt 
+1

What about readability? –

1
awk '
    # first pass
    FNR == NR {
     # keep the longest length seen so far (compared line by line)
     M = M < (l = length($0)) ? l : M
     # go to the next line
     next
     }

    # second pass (reached only because of the next above): lines that end with /
    /[/]$/ {
     # only if padding is needed
     if ((l = length($0)) < M) {
     # insert the missing spaces (sprintf "%9s" with "" yields 9 spaces)
     sub(/[/]$/, sprintf("%" (M - l) "s/", ""))
     }
     }
    # print every line, modified or not (7 is a private joke; any non-zero value works)
    7
    ' test.txt test.txt

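Notes on getting the output:

  • The file name is given twice at the end on purpose, so that the file is read twice.
  • It assumes nothing follows the final / (no trailing spaces). This can easily be adapted, but that is not the point here.
  • It assumes lines without a trailing / are not modified but are still printed.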
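If reading the file twice is not wanted (for example when the input comes from a pipe), a different technique is to buffer the lines in an array and do the padding in the END block. This is not what any of the answers here do; it is a minimal sketch under the assumption that the whole file fits in memory, and the sample contents and the names `sample`, `aligned` and `line` are made up for illustration.

```shell
# Hypothetical sample: two continuation lines and one plain line.
sample=$(mktemp)
printf '%s\n' 'ab \' 'abcdef \' 'end' > "$sample"

aligned=$(awk '
    {
        line[NR] = $0                            # remember every line
        if (length($0) > max) max = length($0)   # track the longest one
    }
    END {
        for (i = 1; i <= NR; i++) {
            if (sub(/\\$/, "", line[i]))         # trailing backslash removed?
                printf "%-*s\\\n", max + 2, line[i]
            else
                print line[i]                    # plain line, printed as is
        }
    }
' "$sample")
printf '%s\n' "$aligned"

rm -f "$sample"
```

The file name appears only once here, and the same program works on standard input. The trade-off is memory: every line is held until the end of the input.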
+0

There seems to be some confusion between forward slash and backslash (possibly my own confusion :). Also, `/[/]$/` gives me a syntax error. – jas

+0

I like your '7 is private joke' ++. I think it should be a backslash, shouldn't it? –

+2

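No, it comes from another Stack user, who explained that 7 is faster to type on his keyboard than 1, so he uses 7. I have kept the habit every time I use this idiom in my code (the other reason is that it makes people wonder why when they read the code). – NeronLeVelu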

1

Here is one for GNU awk. It runs over the file twice: the first pass finds the maximum length and the second prints the output.

$ awk 'BEGIN{FS=OFS=""}NR==FNR{m=(m<NF?NF:m);next}$NF=="\\"{$NF=sprintf("% "m-NF+2"s",$NF)}1' file file 

Output:

awk 'BEGIN {maxlength = 0}    \ 
    {         \ 
      if (length($0) > maxlength) { \ 
       maxlength = length($0); \ 
       longest = $0;   \ 
      }        \ 
    }         \ 
    END {print longest}' somefile 

Explanation:

BEGIN  { FS=OFS="" }       # each char on different field 
NR==FNR { m=(m<NF?NF:m); next }    # find max length 
$NF=="\\" { $NF=sprintf("% " m-NF+2 "s",$NF) } # NF gets space padded 
1            # output 

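Setting FS to "" puts each character in its own field, so the last character ends up in $NF. If you want the \ characters farther away from the code, change the sprintf to suit your taste.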
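To illustrate the empty-FS behaviour this answer relies on, here is a minimal sketch. The input string and the variable name `fields` are assumptions for illustration. Splitting on an empty FS is an extension that GNU awk documents as not specified by POSIX, so other awk implementations may behave differently.

```shell
# With FS = "" every character becomes its own field,
# so NF is the line length and $NF is the last character.
fields=$(printf '%s\n' 'ab \' |
    awk 'BEGIN { FS = "" } { print NF ": last field is [" $NF "]" }')
printf '%s\n' "$fields"    # 4: last field is [\]
```

That is why the one-liner can test `$NF=="\\"` to detect a trailing backslash and use NF in place of length($0).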