2009-11-18 234 views
29

How do I remove duplicates from a file using COBOL?

The input file has these records: 8712351, 8712353, 8712353, 8712354, 8712356, 8712352, 8712355, 8712352, 8712355

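Using COBOL, I need to remove the duplicates from the file above and write the result to an output file. So far I have written simple logic that reads the records and writes them to the output file.

I need logic that removes the duplicates (for example 8712353 and 8712352) from the file above. Here is the program logic: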

       IDENTIFICATION DIVISION.
       PROGRAM-ID. RemoveDup.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT INPUTFILEDUP ASSIGN TO 'C:\Cobol\INPUTFILEDUP.txt'
               ORGANIZATION IS LINE SEQUENTIAL.
           SELECT OUTFILEDUP ASSIGN TO 'C:\Cobol\OUTFILEDUP.txt'
               ORGANIZATION IS LINE SEQUENTIAL.

       DATA DIVISION.

       FILE SECTION.
       FD  INPUTFILEDUP.
       01  INPUTFILEDUPREC.
           88 EOFINPUTFILEDUP  VALUE HIGH-VALUES.
           02 INPUTFILEID      PIC 9(07).

       FD  OUTFILEDUP.
       01  OUTFILEDUPREC       PIC 9(07).

       WORKING-STORAGE SECTION.
       77  WS-VARIABLE         PIC 9(09).
       77  REC-NOT-MATCH       PIC 9(01).
       77  CUR-VARIABLE        PIC 9(09).

       PROCEDURE DIVISION.
       BEGIN.
           OPEN INPUT INPUTFILEDUP
           OPEN OUTPUT OUTFILEDUP

      *> Priming read, then copy every record unchanged.
           READ INPUTFILEDUP
               AT END SET EOFINPUTFILEDUP TO TRUE
           END-READ
           PERFORM UNTIL EOFINPUTFILEDUP
               WRITE OUTFILEDUPREC FROM INPUTFILEID
               READ INPUTFILEDUP
                   AT END SET EOFINPUTFILEDUP TO TRUE
               END-READ
           END-PERFORM

           CLOSE INPUTFILEDUP
           CLOSE OUTFILEDUP
           STOP RUN.

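I sorted the input file (8712351, 8712353, 8712353, 8712354, 8712356, 8712352, 8712355, 8712352, 8712355) into ascending order, and with that the modified code below works.

However, suppose my file is in neither ascending nor descending order. Then I need to write sort logic before removing the dups. Could you please update my code below? This is what I tried, but I did not fully succeed when the input file is structured like this: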

8712351,8712353,8712353,8712354,8712356,8712352,8712355,8712352,8712355

       IDENTIFICATION DIVISION.
       PROGRAM-ID. RemoveDup2.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT INPUTFILEDUP ASSIGN TO 'C:\Cobol\INPUTFILEDUP.txt'
               ORGANIZATION IS LINE SEQUENTIAL.
           SELECT OUTFILEDUP ASSIGN TO 'C:\Cobol\OUTFILEDUP.txt'
               ORGANIZATION IS LINE SEQUENTIAL.

       DATA DIVISION.

       FILE SECTION.
       FD  INPUTFILEDUP.
       01  INPUTFILEDUPREC.
           88 EOFINPUTFILEDUP  VALUE HIGH-VALUES.
           02 INPUTFILEID      PIC 9(07).

       FD  OUTFILEDUP.
       01  OUTFILEDUPREC       PIC 9(07).

       WORKING-STORAGE SECTION.
       77  WS-VARIABLE         PIC 9(09) VALUE ZERO.
       77  REC-NOT-MATCH       PIC 9(01).
       77  CUR-VARIABLE        PIC 9(7) VALUE ZERO.

       PROCEDURE DIVISION.
       BEGIN.
           OPEN INPUT INPUTFILEDUP
           OPEN OUTPUT OUTFILEDUP

           READ INPUTFILEDUP
               AT END SET EOFINPUTFILEDUP TO TRUE
           END-READ
      *> WS-VARIABLE holds the last ID written, so this only
      *> catches duplicates that sit next to each other.
           PERFORM UNTIL EOFINPUTFILEDUP
               IF INPUTFILEID NOT EQUAL TO WS-VARIABLE
                   MOVE INPUTFILEID TO WS-VARIABLE
                   WRITE OUTFILEDUPREC FROM INPUTFILEID
               ELSE
                   DISPLAY "DUPLICATE FOUND " INPUTFILEID
               END-IF
               READ INPUTFILEDUP
                   AT END SET EOFINPUTFILEDUP TO TRUE
               END-READ
           END-PERFORM

           CLOSE INPUTFILEDUP
           CLOSE OUTFILEDUP
           STOP RUN.
+0

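WOW, a new favorite tag! :) A question about the data you are removing duplicates from: will numbers like 8712351 all fall within a relatively compact range, e.g. 8700000-8800000? Or can the numbers vary over a huge range, from 1 to N? – 2009-11-18 19:10:46

Answers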

2

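When the file is opened for Sequential access, the record that is deleted is the last record read. The Delete statement is valid only if the last operation on the file was a successful Read statement. If it was not, Delete returns a File Status value of 43. Because Delete cannot return a File Status value starting with 2 when the file is Open with Sequential access, coding Invalid Key on such a Delete is not allowed.

When Dynamic or Random access is selected for the file, the Delete statement, like Rewrite, becomes somewhat less restrictive. The record being deleted does not need to have been read first. Simply fill in the primary Key information in the file's record description and issue the Delete statement. If the record does not exist, a File Status of 23 is returned and an Invalid Key condition exists.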

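From page 274 of Sams Teach Yourself COBOL in 24 Hours (which I just dusted off from my bookshelf). So in your case, you would probably set up your records so that they are sorted by INPUTFILEID, note the first occurrence of each INPUTFILEID as you go through, and Delete accordingly (after writing it to the output file).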
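
To make the quoted Random-access rule concrete, here is a minimal sketch, not a definitive implementation. It assumes a compiler built with indexed-file support. The names DeleteDemo, IDFILE, 'IDFILE.DAT', IDKEY and WS-STATUS are hypothetical and do not come from the question. The program builds a two-record indexed file, then issues Delete by key without a prior Read, once for a key that exists and once for one that does not.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. DeleteDemo.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      *> IDFILE and 'IDFILE.DAT' are hypothetical names.
           SELECT IDFILE ASSIGN TO 'IDFILE.DAT'
               ORGANIZATION IS INDEXED
               ACCESS MODE IS RANDOM
               RECORD KEY IS IDKEY
               FILE STATUS IS WS-STATUS.

       DATA DIVISION.
       FILE SECTION.
       FD  IDFILE.
       01  IDREC.
           02 IDKEY            PIC 9(07).

       WORKING-STORAGE SECTION.
       77  WS-STATUS           PIC X(02).

       PROCEDURE DIVISION.
       BEGIN.
      *> Build a small indexed file holding two of the IDs.
           OPEN OUTPUT IDFILE
           MOVE 8712351 TO IDKEY
           WRITE IDREC
           MOVE 8712353 TO IDKEY
           WRITE IDREC
           CLOSE IDFILE

      *> Random access: fill in the key, then Delete.
      *> No Read is needed beforehand.
           OPEN I-O IDFILE
           MOVE 8712353 TO IDKEY
           DELETE IDFILE RECORD
               INVALID KEY
                   DISPLAY "NOT FOUND, STATUS " WS-STATUS
               NOT INVALID KEY
                   DISPLAY "DELETED " IDKEY
           END-DELETE

      *> This key was never written, so Invalid Key applies.
           MOVE 8712399 TO IDKEY
           DELETE IDFILE RECORD
               INVALID KEY
                   DISPLAY "NOT FOUND, STATUS " WS-STATUS
               NOT INVALID KEY
                   DISPLAY "DELETED " IDKEY
           END-DELETE
           CLOSE IDFILE
           STOP RUN.
```

Going by the quoted passage, the second Delete should take the Invalid Key branch with a File Status of 23. Note that the question's files are LINE SEQUENTIAL, so this only applies if the data is first loaded into an indexed file.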

1

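If you sort the file externally before the COBOL program reads it, you can remove the duplicates with the sort's EQUALS keyword. If you sort the file before the COBOL program and do not remove the duplicates there, a simple IF statement and a save field will let you remove the dups.

Set up a save field for INPUTFILEID. After the read, if inputfileid equals inputfileid-save, read again; if not, write the record after moving inputfileid to inputfileid-save. You will have to break up your current PERFORM to do this.

If you don't fully understand what I'm saying, just let me know and I will help you change the code.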
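
A minimal sketch of the save-field comparison described above, under stated assumptions: the question's nine IDs are already sorted and sit in a WORKING-STORAGE table rather than a file, so the example runs without any input. The names SaveFieldDemo, SORTED-VALUES, SORTED-TABLE, SORTED-ID and IDX are hypothetical; INPUTFILEID-SAVE follows the naming used in this answer.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SaveFieldDemo.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      *> The question's IDs, already sorted ascending.
       01  SORTED-VALUES.
           02 FILLER           PIC 9(07) VALUE 8712351.
           02 FILLER           PIC 9(07) VALUE 8712352.
           02 FILLER           PIC 9(07) VALUE 8712352.
           02 FILLER           PIC 9(07) VALUE 8712353.
           02 FILLER           PIC 9(07) VALUE 8712353.
           02 FILLER           PIC 9(07) VALUE 8712354.
           02 FILLER           PIC 9(07) VALUE 8712355.
           02 FILLER           PIC 9(07) VALUE 8712355.
           02 FILLER           PIC 9(07) VALUE 8712356.
       01  SORTED-TABLE REDEFINES SORTED-VALUES.
           02 SORTED-ID        PIC 9(07) OCCURS 9 TIMES.
       77  IDX                 PIC 9(02).
       77  INPUTFILEID-SAVE    PIC 9(07) VALUE ZERO.

       PROCEDURE DIVISION.
       BEGIN.
      *> Emit an ID only when it differs from the saved one.
      *> Assumes no real ID is zero, the initial save value.
           PERFORM VARYING IDX FROM 1 BY 1 UNTIL IDX > 9
               IF SORTED-ID (IDX) NOT = INPUTFILEID-SAVE
                   MOVE SORTED-ID (IDX) TO INPUTFILEID-SAVE
                   DISPLAY SORTED-ID (IDX)
               END-IF
           END-PERFORM
           STOP RUN.
```

With this data the loop should display each of the six distinct IDs once. The comparison only works because equal IDs are adjacent after sorting, which is why the sort has to come first.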

5

Finally it is working. Here is the code:

       IDENTIFICATION DIVISION.
       PROGRAM-ID. RemoveDup2.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT INPUTFILEDUP ASSIGN TO 'C:\Cobol\INPUTFILEDUP.txt'
               ORGANIZATION IS LINE SEQUENTIAL.
           SELECT OUTFILEDUP ASSIGN TO 'C:\Cobol\OUTFILEDUP.txt'
               ORGANIZATION IS LINE SEQUENTIAL.
           SELECT WorkFile ASSIGN TO "WORK.TMP".

       DATA DIVISION.

       FILE SECTION.
       FD  INPUTFILEDUP.
       01  INPUTFILEDUPREC.
           88 EOFINPUTFILEDUP  VALUE HIGH-VALUES.
           02 INPUTFILEID      PIC 9(07).

       FD  OUTFILEDUP.
       01  OUTFILEDUPREC       PIC 9(07).

       SD  WorkFile.
       01  WORKREC.
           02 WINPUTFILEID     PIC 9(07).

       WORKING-STORAGE SECTION.
       77  WS-VARIABLE         PIC 9(09) VALUE ZERO.
       77  REC-NOT-MATCH       PIC 9(01).
       77  CUR-VARIABLE        PIC 9(7) VALUE ZERO.

       PROCEDURE DIVISION.
       BEGIN.
      *> Sort the input file in place: USING and GIVING
      *> name the same file.
           SORT WorkFile ON ASCENDING KEY WINPUTFILEID
               USING INPUTFILEDUP GIVING INPUTFILEDUP

           OPEN INPUT INPUTFILEDUP
           OPEN OUTPUT OUTFILEDUP

           READ INPUTFILEDUP
               AT END SET EOFINPUTFILEDUP TO TRUE
           END-READ
      *> After the sort, duplicates are adjacent, so comparing
      *> with the last ID written is enough.
           PERFORM UNTIL EOFINPUTFILEDUP
               IF INPUTFILEID NOT EQUAL TO WS-VARIABLE
                   MOVE INPUTFILEID TO WS-VARIABLE
                   WRITE OUTFILEDUPREC FROM INPUTFILEID
               ELSE
                   DISPLAY "DUPLICATE FOUND " INPUTFILEID
               END-IF
               READ INPUTFILEDUP
                   AT END SET EOFINPUTFILEDUP TO TRUE
               END-READ
           END-PERFORM

           CLOSE INPUTFILEDUP
           CLOSE OUTFILEDUP

           STOP RUN.
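
The program above rewrites INPUTFILEDUP with its sorted contents before reading it back. One possible variation, sketched here as an alternative and not as the poster's method, leaves the input file untouched by giving SORT an OUTPUT PROCEDURE that writes only the first record of each run of equal keys. The names RemoveDup3, WRITE-UNIQUE, WS-PREVIOUS, WS-FIRST-FLAG and WS-EOF-FLAG are hypothetical; the file names and paths are taken from the question.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. RemoveDup3.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT INPUTFILEDUP ASSIGN TO 'C:\Cobol\INPUTFILEDUP.txt'
               ORGANIZATION IS LINE SEQUENTIAL.
           SELECT OUTFILEDUP ASSIGN TO 'C:\Cobol\OUTFILEDUP.txt'
               ORGANIZATION IS LINE SEQUENTIAL.
           SELECT WorkFile ASSIGN TO "WORK.TMP".

       DATA DIVISION.
       FILE SECTION.
       FD  INPUTFILEDUP.
       01  INPUTFILEDUPREC     PIC 9(07).

       FD  OUTFILEDUP.
       01  OUTFILEDUPREC       PIC 9(07).

       SD  WorkFile.
       01  WORKREC.
           02 WINPUTFILEID     PIC 9(07).

       WORKING-STORAGE SECTION.
       77  WS-PREVIOUS         PIC 9(07) VALUE ZERO.
       77  WS-FIRST-FLAG       PIC X VALUE "Y".
           88 FIRST-RECORD     VALUE "Y".
       77  WS-EOF-FLAG         PIC X VALUE "N".
           88 END-OF-SORT      VALUE "Y".

       PROCEDURE DIVISION.
       BEGIN.
      *> SORT reads INPUTFILEDUP itself, then hands the sorted
      *> records to WRITE-UNIQUE one RETURN at a time.
           SORT WorkFile ON ASCENDING KEY WINPUTFILEID
               USING INPUTFILEDUP
               OUTPUT PROCEDURE IS WRITE-UNIQUE
           STOP RUN.

       WRITE-UNIQUE.
           OPEN OUTPUT OUTFILEDUP
           RETURN WorkFile
               AT END SET END-OF-SORT TO TRUE
           END-RETURN
           PERFORM UNTIL END-OF-SORT
      *> Write the first record and every ID that differs
      *> from the one written just before it.
               IF FIRST-RECORD OR WINPUTFILEID NOT = WS-PREVIOUS
                   MOVE WINPUTFILEID TO WS-PREVIOUS
                   MOVE "N" TO WS-FIRST-FLAG
                   WRITE OUTFILEDUPREC FROM WINPUTFILEID
               END-IF
               RETURN WorkFile
                   AT END SET END-OF-SORT TO TRUE
               END-RETURN
           END-PERFORM
           CLOSE OUTFILEDUP.
```

The first-record flag is there so that an ID of zero is not mistaken for a duplicate of the initial WS-PREVIOUS value. The trade-off is one extra paragraph in exchange for a single pass over the sorted data and an unmodified input file.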
1

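The sort utility is standard on these operating systems. Following the DRY principle, hand the work off to it: -t sets the field separator and -u keeps only the unique entries. It is written in C.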