UTF-8 names with boost::filestream under MinGW

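I'm having trouble with Boost file streams: I need to create and modify files in the user's directory under Windows. However, the user name contains an umlaut, which makes this fail when compiled under MinGW, because the standard library lacks the wide-char open() API for file streams that Boost uses. See "Read/Write file with unicode file name with plain C++/Boost", "UTF-8-compliant IOstreams" and https://svn.boost.org/trac10/ticket/9968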

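However, I came across the point that this problem mainly occurs when one tries to use characters outside the system code page. In my case I only use characters from the system code page, since the user directory obviously exists. This makes me think it should work if I could tell boost::filesystem::path to treat all std::strings as being UTF-8, but to convert them to the system encoding when the string() member function is called (which happens in boost::fstream::open).

So basically: is there a way to have Boost (and Boost.Locale) do that conversion (UTF-8 -> system encoding) automatically?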

For completeness, here is the code I use to set the locale:

#ifdef _WIN32 
     // On windows we want to enforce the encoding (mostly UTF8). Also using "" would use the default which uses "wrong" separators 
     std::locale::global(boost::locale::generator().generate("C")); 
#else 
     // In linux/OSX this suffices 
     std::locale::global(std::locale::classic()); 
#endif // _WIN32 
     // Use also the encoding (mostly UTF8) for bfs paths 
     bfs::path::imbue(std::locale());
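
For illustration only, the conversion being asked about (UTF-8 -> system code page) can be sketched with plain Win32 calls. This is not a Boost or Boost.Locale facility; the name utf8_to_system is hypothetical, and the sketch assumes the "system encoding" means the active ANSI code page (CP_ACP). On other platforms it simply returns its argument.

```cpp
#include <string>

#ifdef _WIN32
#include <vector>
#ifndef NOMINMAX
#define NOMINMAX
#endif
#include <windows.h>

// Hypothetical helper (not part of Boost): re-encode a UTF-8 string in the
// active ANSI code page by going through UTF-16.
std::string utf8_to_system(const std::string& utf8)
{
    if (utf8.empty()) return std::string();

    // First call asks how many UTF-16 code units are needed.
    const int srclen = static_cast<int>(utf8.size());
    int wlen = MultiByteToWideChar(CP_UTF8, 0, utf8.data(), srclen, NULL, 0);
    if (wlen <= 0) return utf8;  // invalid input: hand back the argument
    std::vector<wchar_t> wide(wlen);
    MultiByteToWideChar(CP_UTF8, 0, utf8.data(), srclen, &wide[0], wlen);

    // Same two-step pattern for UTF-16 -> ANSI code page.
    int len = WideCharToMultiByte(CP_ACP, 0, &wide[0], wlen, NULL, 0, NULL, NULL);
    if (len <= 0) return utf8;
    std::string out(len, '\0');
    WideCharToMultiByte(CP_ACP, 0, &wide[0], wlen, &out[0], len, NULL, NULL);
    return out;
}

#else

// Assumption: other systems accept UTF-8 file names as they are.
std::string utf8_to_system(const std::string& utf8)
{
    return utf8;
}

#endif
```

A call such as stream.open(utf8_to_system(name).c_str()) would then pass a code-page string to the narrow open(). Characters that do not exist in the code page are replaced by a default character, so this only helps in the case described above, where every character is representable.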

2017-09-25 Flamefire

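I found 2 solutions using other libraries, each with its own drawbacks.

Pathie (Docu): It looks like a complete replacement for boost::filesystem, providing UTF-8-aware streams and path handling as well as symlink creation and other file/folder operations. Really cool is the built-in support for getting special directories (temp, home, programs folder and more).
Drawback: Only works as a dynamic library, because static builds have bugs. It may also be overkill if you already use Boost.
Boost.Nowide (Docu): Provides alternatives to almost all file and stream handling functions to support UTF-8 on Windows, and falls back to the standard functions on other platforms. The file streams accept UTF-8 encoded values (for the name), and it uses Boost itself.
Drawback: No path handling, and it accepts neither bfs::path nor wide strings (the internal format of bfs::path on Windows is UTF-16), so a patch would be required, although it is a simple one. It also covers std::cout etc. with UTF-8 strings (yes, that works directly!).
Another cool thing: it provides a class to convert argc/argv to UTF-8 on Windows.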
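
The general idea behind such UTF-8-aware wrappers can be sketched without either library: widen the UTF-8 name to UTF-16 and hand it to a wide-character open call. The sketch below is not taken from Pathie or Boost.Nowide. The names widen_utf8 and fopen_utf8 are hypothetical, and it assumes the toolchain exposes the Microsoft CRT function _wfopen (some MinGW setups hide it in strict ANSI mode).

```cpp
#include <cstdio>
#include <string>

#ifdef _WIN32
#include <wchar.h>
#ifndef NOMINMAX
#define NOMINMAX
#endif
#include <windows.h>

// Hypothetical helper: UTF-8 -> UTF-16. Returns an empty string on failure.
static std::wstring widen_utf8(const std::string& s)
{
    if (s.empty()) return std::wstring();
    const int srclen = static_cast<int>(s.size());
    int n = MultiByteToWideChar(CP_UTF8, 0, s.data(), srclen, NULL, 0);
    if (n <= 0) return std::wstring();
    std::wstring w(n, L'\0');
    MultiByteToWideChar(CP_UTF8, 0, s.data(), srclen, &w[0], n);
    return w;
}
#endif

// Hypothetical helper: open a file whose name is given in UTF-8.
// Unlike the short-name approach, this can also create new files.
std::FILE* fopen_utf8(const std::string& name, const std::string& mode)
{
#ifdef _WIN32
    return _wfopen(widen_utf8(name).c_str(), widen_utf8(mode).c_str());
#else
    // Assumption: non-Windows systems take UTF-8 names directly.
    return std::fopen(name.c_str(), mode.c_str());
#endif
}
```

This returns a C FILE* rather than a stream, so it is only a starting point; wrapping the handle in an iostream-compatible buffer is the part the libraries above take care of.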

2017-09-27 21:55:05 Flamefire

This is a problem on Windows because Windows uses UTF-16, not UTF-8. I regularly use this function to solve exactly your problem (I have trimmed a few things to post it here):

// get_filename_token.cpp 

// Turns a UTF-8 filename into something you can pass to fstream::open() on 
// Windows. Returns the argument on other systems. 

// Copyright 2013 Michael Thomas Greer 
// Distributed under the Boost Software License, Version 1.0. 
// (See accompanying file LICENSE_1_0.txt 
// or copy at   http://www.boost.org/LICENSE_1_0.txt) 

#include <string> 

#ifdef _WIN32 

#include <vector> 

#ifndef NOMINMAX 
#define NOMINMAX 
#endif 
#include <windows.h> 

std::string get_filename_token(const std::string& filename) 
    { 
    // Convert the UTF-8 argument path to a Windows-friendly UTF-16 path. 
    // UTF-16 never needs more code units than the UTF-8 form has bytes. 
    std::vector<wchar_t> widepath(filename.length() + 1); 
    if (!MultiByteToWideChar(CP_UTF8, 0, filename.c_str(), -1, &widepath[0], (int)widepath.size())) 
        return filename; 

    // Now get the 8.3 version of the name. This returns 0 on failure, 
    // e.g. when the file does not exist; then return the argument unchanged. 
    DWORD n = GetShortPathNameW(&widepath[0], NULL, 0); 
    if (n == 0) return filename; 
    std::vector<wchar_t> shortpath(n); 
    if (GetShortPathNameW(&widepath[0], &shortpath[0], n) == 0) return filename; 

    // Convert the short version back to a C++-friendly char version 
    int len = WideCharToMultiByte(CP_UTF8, 0, &shortpath[0], -1, NULL, 0, NULL, NULL); 
    if (len <= 0) return filename; 
    std::vector<char> ansipath(len); 
    WideCharToMultiByte(CP_UTF8, 0, &shortpath[0], -1, &ansipath[0], len, NULL, NULL); 

    return std::string(&ansipath[0]); 
    } 

#else 

std::string get_filename_token(const std::string& filename) 
    { 
    // For all other systems, just return the argument UTF-8 string. 
    return filename; 
    } 

#endif

2017-09-25 03:50:07

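Yes, I guess I'll need to go that way. I'm even thinking of creating a new iofstream class that provides the new open function and ctors, just like the Boost classes. Why do you convert back to UTF-8? Wouldn't CP_ACP be better? And why doesn't Boost do it this way, since it looks so simple? Are there drawbacks, like 8.3 names not always being representable in ANSI? – Flamefire

Found a drawback: this only works for existing files. So one needs to make sure the file really exists, which may open the door to race conditions when the file is first created with a wide-char implementation and then opened via the short path. – Flamefire

Because it is cross-platform. All modern *nixen will open a UTF-8 filename, and it won't break old code. Likewise, _all_ Windows filenames can be converted to an OS-acceptable 8.3 filename "token", which is technically a subset of UTF-8. –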
