Getting parts of a file path

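I have a file path, obtained from the __FILE__ macro, and I want to extract two parts of it.

The format is: /some/path/to/a/file/AAA/xxx/BBB.cc. I want the AAA and BBB parts of the path. xxx is usually src, inc, tst, etc., and the file extension is usually .cc, but that is not guaranteed.

I know I could use string.find(), or even split the string into an array on the / character, but neither seems very efficient because of the number of searches required. I thought of sscanf and figured it might be the best approach, but I have been unable to define a format that skips most of the beginning and captures the parts I need. How can I do this with sscanf, or is there a better way?

Thanks for your help.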

Source

2011-09-09 steveo225

Have you tried using ['strtok()'](http://www.cplusplus.com/reference/clibrary/cstring/strtok/)? – Kusalananda

@KAK 'strtok' is one of the most flawed C functions; I avoid it whenever possible. – CodesInChaos

@CodeInChaos, why? – Kusalananda

Use rfind, so that you start at the end and work backwards:

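std::string s = "/some/path/to/a/file/AAA/xxx/BBB.cc";

// rfind returns std::string::npos on failure, so keep positions in
// size_type rather than unsigned int (which may be narrower).
std::string::size_type a = s.rfind('.');   // last dot
std::string::size_type b = s.rfind('/');   // slash before BBB
std::string BBB = s.substr(b+1, a-b-1);

a = s.rfind('/', b-1);                     // slash before xxx
b = s.rfind('/', a-1);                     // slash before AAA
std::string AAA = s.substr(b+1, a-b-1);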
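
The snippet above assumes the path always has the expected shape. As a rough sketch of the same rfind idea with the npos cases checked, something like the following could work. The name split_path and the choice of std::optional (C++17) are assumptions made for illustration, not part of the answer.

```cpp
#include <optional>
#include <string>
#include <utility>

// Hypothetical helper: returns {AAA, BBB} for a path shaped like
// ".../AAA/xxx/BBB.ext", or std::nullopt if the shape is not there.
std::optional<std::pair<std::string, std::string>>
split_path(const std::string& s) {
    const auto npos = std::string::npos;

    // The last '/' separates the directories from the file name.
    const auto file_slash = s.rfind('/');
    if (file_slash == npos || file_slash == 0) return std::nullopt;

    // Only a '.' inside the file name counts; otherwise keep the whole name.
    auto dot = s.rfind('.');
    if (dot == npos || dot < file_slash) dot = s.size();

    // Slash before "xxx"; AAA must sit in front of it.
    const auto xxx_slash = s.rfind('/', file_slash - 1);
    if (xxx_slash == npos || xxx_slash == 0) return std::nullopt;

    // Slash before "AAA"; a relative path may start directly with AAA.
    const auto aaa_slash = s.rfind('/', xxx_slash - 1);
    const auto aaa_begin = (aaa_slash == npos) ? 0 : aaa_slash + 1;

    return std::make_pair(s.substr(aaa_begin, xxx_slash - aaa_begin),
                          s.substr(file_slash + 1, dot - file_slash - 1));
}
```

A caller would test the returned optional before reading ->first (AAA) and ->second (BBB).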

Source

2011-09-09 12:35:10 Beta

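Make it right
If it is not fast enough, improve it

It is easier to just write this yourself than to try to get sscanf to do it. Your code will be easier to understand, and faster (although I doubt that matters).

Just loop from the back of the string. When you find the first dot, remember its position, then extract BBB when you find the first slash. Remember the position of the second slash, and extract AAA when you find the third slash.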
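
The loop described above could be sketched roughly as follows. The function name extract_parts and its bool/out-parameter interface are assumptions made for illustration; dots inside directory names are ignored, which is a design choice of this sketch rather than something the answer specifies.

```cpp
#include <string>

// Hypothetical helper: one backward pass over `path`, filling `aaa` and `bbb`.
// Returns false if the ".../AAA/xxx/BBB.ext" shape is not found.
bool extract_parts(const std::string& path, std::string& aaa, std::string& bbb) {
    std::size_t name_end = path.size();  // end of BBB (exclusive)
    std::size_t xxx_slash = 0;           // position of the second slash
    bool seen_dot = false;
    int slashes = 0;

    for (std::size_t i = path.size(); i-- > 0;) {
        const char c = path[i];
        if (c == '.' && slashes == 0 && !seen_dot) {
            name_end = i;       // first dot from the back: BBB ends here
            seen_dot = true;
        } else if (c == '/') {
            ++slashes;
            if (slashes == 1) {
                bbb = path.substr(i + 1, name_end - i - 1);   // first slash
            } else if (slashes == 2) {
                xxx_slash = i;                                // second slash
            } else {
                aaa = path.substr(i + 1, xxx_slash - i - 1);  // third slash
                return !aaa.empty() && !bbb.empty();
            }
        }
    }
    return false;  // fewer than three slashes
}
```

Because the loop stops at the third slash from the end, the cost does not depend on how long the leading part of the path is.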

Source

2011-09-09 12:06:17

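This kind of thing is done so often that finding a standard solution (such as using 'strtok()') is better than writing the code yourself and possibly getting it wrong. That is not the way to go. – Kusalananda

Apart from 'strtok' not being a very nice function (it destroys the source string, so if you have a string literal you need to copy it somewhere else first, and it is not reentrant), it is fine for scanning left to right, but if I understand correctly, he needs to scan right to left. –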

// requires <cstring> for strlen and memcpy
const char *path = ... /* fill this however you like, for example function argument */
const char *AAA_start, *AAA_end;
const char *BBB_start, *BBB_end;
// go to the end of the string and scan back to the last '.'
for (BBB_end = path+strlen(path); *BBB_end != '.'; --BBB_end);
// continue back to the nearest '/'
for (BBB_start = BBB_end; *BBB_start != '/'; --BBB_start);
// BBB now lies between BBB_start (the '/') and BBB_end (the '.')
// continue from there to find the next '/'
for (AAA_end = BBB_start-1; *AAA_end != '/'; --AAA_end);
// continue from there to find the next '/'
for (AAA_start = AAA_end-1; *AAA_start != '/'; --AAA_start);
// AAA now lies between AAA_start and AAA_end (both '/')

// Now you can do whatever you want with AAA and BBB, for example copy them.
// Each start/end pointer sits on a delimiter, so the name begins at start+1
// and has length end-start-1.
size_t AAA_len = AAA_end-AAA_start-1;
size_t BBB_len = BBB_end-BBB_start-1;
char *AAA = new char[AAA_len+1];   // +1 for the terminating '\0'
char *BBB = new char[BBB_len+1];
memcpy(AAA, AAA_start+1, AAA_len);
memcpy(BBB, BBB_start+1, BBB_len);
AAA[AAA_len] = '\0';
BBB[BBB_len] = '\0';

That is the basic idea. Now you need to add error checking to it:

const char *path = ... /* fill this however you like, for example function argument */
const char *AAA_start, *AAA_end;
const char *BBB_start, *BBB_end;
// every scan stops at the start of the string instead of running past it
for (BBB_end = path+strlen(path); *BBB_end != '.' && BBB_end != path; --BBB_end);
if (BBB_end == path) return FAIL;
for (BBB_start = BBB_end; *BBB_start != '/' && BBB_start != path; --BBB_start);
if (BBB_start == path) return FAIL;
for (AAA_end = BBB_start-1; *AAA_end != '/' && AAA_end != path; --AAA_end);
if (AAA_end == path) return FAIL;
for (AAA_start = AAA_end-1; *AAA_start != '/' && AAA_start != path; --AAA_start);
if (AAA_start == path && *AAA_start != '/') return FAIL;

// each start/end pointer sits on a delimiter, so skip it when copying
size_t AAA_len = AAA_end-AAA_start-1;
size_t BBB_len = BBB_end-BBB_start-1;
char *AAA = new char[AAA_len+1];   // +1 for the terminating '\0'
char *BBB = new char[BBB_len+1];
memcpy(AAA, AAA_start+1, AAA_len);
memcpy(BBB, BBB_start+1, BBB_len);
AAA[AAA_len] = '\0';
BBB[BBB_len] = '\0';

Source

2011-09-09 13:36:01 Shahbaz

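If 'path' does not have the right format, this code will read before the start of the string and therefore exhibit undefined behavior. Making it a bit more robust would not hurt. – CodesInChaos

@CodeInChaos, I did not want to bloat the code with error checking, so that the OP would not miss the point. Either way, I will update the answer. – Shahbaz

Posting code that is not production-ready is fine IMO. But the answer should point out the flaws, so that users know what they should change instead of just copy-pasting. Personally, I would implement a helper function similar to 'rfind', because this code is a bit hard to read. – CodesInChaos

A regular expression can do the trick: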

#include <boost/regex.hpp>
#include <cstdlib>
#include <iostream>
#include <string>

int main() {
    std::string path("/some/path/to/a/file/AAA/xxx/BBB.cc");

    // Group 1: AAA, group 2: BBB, group 3: extension. The last two groups
    // exclude '/' so they cannot spill over into a directory name.
    boost::regex path_re(".+/([^/]+)/[^/]+/([^/.]+)\\.([^/]+)", boost::regex::perl);
    boost::smatch m;
    if (boost::regex_match(path, m, path_re)) {
        std::cout << "part 1 " << m[1] << '\n';
        std::cout << "part 2 " << m[2] << '\n';
        std::cout << "part 3 " << m[3] << '\n';
    }
    else {
        std::abort();
    }
}

Output:

part 1 AAA 
part 2 BBB 
part 3 cc

Note that it does not handle non-normalized paths, i.e. ones that contain /./ elements.
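
One possible way around that limitation, swapping the regex for a different technique, is to normalize the path lexically with C++17 std::filesystem and then read the components directly. This is only a sketch under the assumption that C++17 is available; the helper name parts_of is made up for illustration.

```cpp
#include <filesystem>
#include <string>
#include <utility>

namespace fs = std::filesystem;

// Hypothetical helper: returns {AAA, BBB}; either string may be empty
// if the path has too few components.
std::pair<std::string, std::string> parts_of(const std::string& raw) {
    // lexically_normal() removes "." elements and resolves ".." textually,
    // without touching the file system.
    const fs::path p = fs::path(raw).lexically_normal();

    // stem() is the file name without its last extension.
    const std::string bbb = p.stem().string();

    // Two levels up from the file is the AAA directory.
    const std::string aaa = p.parent_path().parent_path().filename().string();

    return {aaa, bbb};
}
```

For a name with several dots, such as BBB.tar.gz, stem() strips only the last extension and yields BBB.tar, which differs from what the regex above captures.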

Source

2011-09-09 15:58:38
