2017-08-17 70 views
4

Skipping the header of a text file in C

I am reading data from the file orderedfile.txt. Sometimes this file has a header of the form:

BEGIN header 

     Real Lattice(A)    Lattice parameters(A) Cell Angles 
    2.4675850 0.0000000 0.0000000  a = 2.467585 alpha = 90.000000 
    0.0000000 30.0000000 0.0000000  b = 30.000000 beta = 90.000000 
    0.0000000 0.0000000 30.0000000  c = 30.000000 gamma = 90.000000 

1       ! nspins 
25 300 300    ! fine FFT grid along <a,b,c> 
END header: data is "<a b c> pot" in units of Hartrees 

1  1  1   0.042580 
1  1  2   0.049331 
1  1  3   0.038605 
1  1  4   0.049181 

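Sometimes there is no header and the data starts on the first line. My code for reading the data is shown below. It works when the data starts on the first line, but not when a header is present. Is there a way to fix this?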

int readinputfile() { 
    FILE *potential = fopen("orderedfile.txt", "r"); 
    for (i=0; i<size; i++) { 
     fscanf(potential, "%lf %lf %*f %lf", &x[i], &y[i], &V[i]); 
    } 
    fclose(potential); 
} 
+3

Switch to reading whole lines. That lets you detect the header and then keep reading until the data starts. – Yunnosch

Answers

2

The following code uses fgets() to read each line. For each line, sscanf() scans the string and stores the values in double variables.
See a running example (with stdin) at ideone.

#include <stdio.h> 

int main() 
{ 
    /* enlarge the buffer if lines can exceed 255 characters */
    char lineBuffer[256];
    FILE *potential = fopen("orderedfile.txt", "r");

    if (potential == NULL)
    {
        perror("orderedfile.txt");
        return 1;
    }

    /* loop through every line */
    while (fgets(lineBuffer, sizeof(lineBuffer), potential) != NULL)
    {
        double a, b, c;
        /* print the values only if all three conversions succeeded */
        if (3 == sscanf(lineBuffer, "%lf %lf %*f %lf", &a, &b, &c))
        {
            printf("%f %f %f\n", a, b, c);
        }
    }
    fclose(potential); 

    return 0; 
} 

It works with the header you provided, but if a header line such as

1  1  2   0.049331 

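appears, that line will be read as data too. Another possibility is to search for the words END header when BEGIN header is present in your file, or to count lines if the number of header lines is known.
To search for a substring, you can use the function strstr().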
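The marker search suggested above could be sketched as follows. This is only an illustration under a few assumptions: the helper name `read_potential` and its parameters are made up for this example, the markers are taken to be exactly "BEGIN header" and "END header" as in the sample file, and every line is assumed to fit in the 256-byte buffer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the original post): reads at most max_rows
 * data rows from fp into x, y and V, ignoring everything between a line
 * containing "BEGIN header" and a line containing "END header".
 * Returns the number of rows stored. */
int read_potential(FILE *fp, double *x, double *y, double *V, int max_rows)
{
    char line[256];      /* assumption: no line is longer than 255 characters */
    int in_header = 0;
    int n = 0;

    assert(fp != NULL);

    while (n < max_rows && fgets(line, sizeof line, fp) != NULL) {
        if (in_header) {
            /* still inside the header: only look for its end marker */
            if (strstr(line, "END header") != NULL)
                in_header = 0;
            continue;
        }
        if (strstr(line, "BEGIN header") != NULL) {
            in_header = 1;
            continue;
        }
        /* outside the header: keep the line only if it parses as a data row */
        if (sscanf(line, "%lf %lf %*f %lf", &x[n], &y[n], &V[n]) == 3)
            n++;
    }
    return n;
}
```

Because header lines are never passed to sscanf(), a header line that happens to look like a data row is no longer picked up. A file without any header is read from its first line, since the header state is only entered when the BEGIN marker is seen.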

2

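Check the return value of fscanf. If it returns three, your input is correct; otherwise you are still in the header, so you have to skip the rest of the line: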

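int readinputfile() {
    FILE *potential = fopen("orderedfile.txt", "r");
    int res;
    int i = 0;
    if (potential == NULL) {
        return -1;
    }
    /* Compare against EOF: a result of 0 only means this line did not match,
       and EOF itself is non-zero, so testing the bare value would be wrong.
       size is the array capacity from the question. */
    while (i < size && (res = fscanf(potential, "%lf %lf %*f %lf", &x[i], &y[i], &V[i])) != EOF) {
        if (res != 3) {
            /* header line: discard the rest of it */
            fscanf(potential, "%*[^\n]");
            continue;
        }
        i++;
        ... // Optionally, do anything else with the data that you read
    }
    fclose(potential);
    return i; /* number of data rows read */
}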

Demo.

+1

@chqrlie Sure - the function isn't that big, so I added the rest. Thanks! – dasblinkenlight

+0

The final fix was to add i = i - 1 to stop the loop counter from advancing when a line is skipped! –
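For readers who want to keep the original for loop, the i = i - 1 idea from the comment above might look roughly like this. It is a sketch, not the asker's actual code: the function name `read_rows` and its parameters are assumptions, and an explicit EOF check is added so the loop cannot spin forever on a short file.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: fills x, y and V with up to size data rows from fp
 * and returns how many rows were actually read. */
int read_rows(FILE *fp, double *x, double *y, double *V, int size)
{
    int i;

    assert(fp != NULL);

    for (i = 0; i < size; i++) {
        int res = fscanf(fp, "%lf %lf %*f %lf", &x[i], &y[i], &V[i]);

        if (res == EOF)
            break;                    /* no more input */
        if (res != 3) {
            fscanf(fp, "%*[^\n]");    /* drop the rest of the header line */
            i = i - 1;                /* cancel the upcoming i++ so the slot is reused */
        }
    }
    return i;
}
```

As with the while-loop version, a header line that happens to consist of four numbers would still be accepted as data.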

2

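I think it is much more reliable to look explicitly for the start and end of the header than to rely on the header never containing a string that matches the scanf()-style format string: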

FILE *fp = fopen(...); 

int inHeader = 0; 

size_t lineLen = 128; 
char *linePtr = malloc(lineLen); 

// skip header lines 
while (getline(&linePtr, &lineLen, fp) >= (ssize_t) 0) 
{ 
    // check for the start of the header (need to do this first to 
    // catch the first line) 
    if (!inHeader) 
    { 
     inHeader = !strncmp(linePtr, "BEGIN header", strlen("BEGIN header")); 
    } 
    else 
    { 
     // if we were in the header, check for the end line and go to next line 
     inHeader = strncmp(linePtr, "END header", strlen("END header")); 

     // need to skip this line no matter what because it's in the header 
     continue; 
    } 

    // if we're not in the header, either break this loop 
    // which leaves the file at the first non-header line, 
    // or process the line in this loop 
    if (!inHeader) 
    { 
     ... 
    } 
} 
... 

You may prefer to use strstr() instead of strncmp(). That way the header start/end strings don't have to be at the beginning of the line.
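The difference between the two checks can be illustrated with a pair of small helpers. The names `starts_with` and `contains` are invented for this sketch and do not appear in the answer.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: 1 if line begins with marker, as in the strncmp() test */
int starts_with(const char *line, const char *marker)
{
    assert(line != NULL && marker != NULL);
    return strncmp(line, marker, strlen(marker)) == 0;
}

/* Hypothetical helper: 1 if marker occurs anywhere in line, as in the strstr() test */
int contains(const char *line, const char *marker)
{
    assert(line != NULL && marker != NULL);
    return strstr(line, marker) != NULL;
}
```

For an indented line such as `"  BEGIN header\n"`, `starts_with()` returns 0 while `contains()` returns 1. The looser test also fires if the marker text shows up in the middle of an unrelated line, so which one fits depends on how strictly the files are formatted.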

+1

Why the 'malloc()' call? 'size_t lineLen = 0; char *linePtr = NULL;' is perfectly fine for POSIX.1 ['getline()'](http://man7.org/linux/man-pages/man3/getline.3.html). –

+0

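@NominalAnimal *Why the 'malloc()' call?* Just to avoid having getline() grow the buffer slowly over several calls when lines are long. It doesn't really matter much. –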
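The NULL/0 initialisation mentioned in the first comment can be sketched like this, assuming a POSIX system where getline() is available. The function `count_lines` is a made-up example that only counts lines. The point is the buffer handling: getline() allocates and grows the buffer itself, and the caller frees it once at the end.

```c
#define _POSIX_C_SOURCE 200809L   /* request the POSIX.1-2008 getline() declaration */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: returns the number of lines readable from fp */
long count_lines(FILE *fp)
{
    char *line = NULL;   /* getline() allocates the buffer on first use */
    size_t cap = 0;      /* and records its capacity here */
    long count = 0;

    assert(fp != NULL);

    while (getline(&line, &cap, fp) != -1)
        count++;

    free(line);          /* release whatever getline() allocated */
    return count;
}
```

Starting from a preallocated buffer, as in the answer, behaves the same way. It only changes how large the buffer is before getline() first has to grow it.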