So, I really want VCMI to support Unicode file paths ASAP, because Windows users with a non-ASCII username currently can’t run VCMI (and I’m one of them). I’ve implemented a solution using short paths (github.com/vcmi/vcmi/pull/151), but I was told it is not a very good solution.
So I thought about a better solution. While MSVC supports a non-standard wchar_t overload of std::fstream::open(), it does not exist in MinGW, so that’s not portable. I could, however, rewrite CFileInputStream to use a C-style FILE* internally instead of an ifstream. Then supporting Unicode file paths would be easy and portable, with very little platform-specific code. The interface of CFileInputStream wouldn’t need to change at all - the beauty of hiding implementation details.
Do the VCMI developers think this would be a good solution? If so, I’m willing to implement it.
Internally, the boost::filesystem::basic_ifstream::open() method (and the constructor) that takes a boost::filesystem::path converts the path to a C string and passes it to std::basic_ifstream::open(). On non-Windows systems, the path is passed as a UTF-8 encoded char*, which works perfectly fine. Windows, however, does not support opening files through a UTF-8 name. So with MSVC, the path is passed as a UTF-16 encoded wchar_t* to a non-standard overload of std::basic_ifstream::open() that takes a wchar_t* instead of a char*. That overload does not exist in MinGW, so there the path ends up being passed as a UTF-8 encoded char*, even though Windows does not support it. The code compiles just fine, but you get an error every time you try to open a file with a non-ASCII name.
That wouldn’t work either. boost::filesystem::path already stores the path in UTF-16 on Windows, so conversion is not the issue. boost::filesystem::basic_ifstream relies on std::basic_ifstream to open files, and there is no standardized way to open a file with a Unicode path as a std::basic_ifstream on Windows. I’ve seen a hack that accomplishes it with MinGW, but it is very ugly and relies on implementation details of libstdc++. However, opening a FILE* with a Unicode path is easy: there’s a function _wfopen that takes a wchar_t*, and it exists in both MSVC and MinGW. That’s why I proposed that solution.
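To make the idea concrete, here is a minimal sketch of such an open call. It is not VCMI code: the helper name openFile is hypothetical, and it uses std::filesystem::path (C++17) as a stand-in for boost::filesystem::path so that the example has no external dependencies.

```cpp
#include <cstdio>
#include <filesystem>

// Hypothetical helper (not part of VCMI): open a file in binary mode,
// going through the wide-character API on Windows.
inline std::FILE* openFile(const std::filesystem::path& path, bool forWriting)
{
#ifdef _WIN32
    // On Windows, path::c_str() returns const wchar_t* (UTF-16),
    // which is what _wfopen expects.
    return _wfopen(path.c_str(), forWriting ? L"wb" : L"rb");
#else
    // Elsewhere, path::c_str() returns const char* and the bytes
    // are handed to the OS unchanged.
    return std::fopen(path.c_str(), forWriting ? "wb" : "rb");
#endif
}
```

The only platform-specific part is the single #ifdef; everything after the call works on a plain FILE* and is identical on all platforms.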
Yes, that’s what I think is needed. Since CFileInputStream currently doesn’t use any iostream functionality, it would be easy to switch it to C-style I/O. I’m not sure how to handle zip files, because minizip doesn’t seem to support Unicode file names.
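As a rough illustration of what a FILE*-backed input stream could look like, here is a sketch. The class name and its methods (read, seek, tell, getSize) are assumptions made for the example; the real CFileInputStream interface may differ.

```cpp
#include <cstddef>
#include <cstdio>
#include <stdexcept>

// Hypothetical stand-in for CFileInputStream; not the actual VCMI class.
class FileInputStream
{
public:
    // Takes ownership of an already opened FILE* (e.g. from fopen or _wfopen).
    explicit FileInputStream(std::FILE* file) : handle(file)
    {
        if (!handle)
            throw std::runtime_error("file could not be opened");
    }
    ~FileInputStream() { std::fclose(handle); }

    FileInputStream(const FileInputStream&) = delete;
    FileInputStream& operator=(const FileInputStream&) = delete;

    // Reads up to size bytes and returns how many were actually read.
    std::size_t read(unsigned char* data, std::size_t size)
    {
        return std::fread(data, 1, size, handle);
    }

    // Moves to an absolute position; returns the new position or -1 on failure.
    long seek(long position)
    {
        if (std::fseek(handle, position, SEEK_SET) != 0)
            return -1;
        return std::ftell(handle);
    }

    long tell() { return std::ftell(handle); }

    // Determines the file size by seeking to the end and back.
    long getSize()
    {
        const long current = std::ftell(handle);
        std::fseek(handle, 0, SEEK_END);
        const long size = std::ftell(handle);
        std::fseek(handle, current, SEEK_SET);
        return size;
    }

private:
    std::FILE* handle;
};
```

Since callers only see read/seek/tell/getSize, they would not notice whether an ifstream or a FILE* sits underneath. Note that long is 32 bits wide on Windows, so this sketch does not handle files larger than 2 GB.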
I have updated my GitHub with some changes for better Unicode support. Most notably, I made a class called FileStream that is a subtype of std::iostream and uses a FILE* internally. I also merged Ivan’s serialization branch so I wouldn’t need to duplicate some work. Some things probably don’t work as intended right now; in particular, I’m not sure how to do things in CClient::loadGame().
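For readers unfamiliar with the technique, this is roughly how a standard stream can sit on top of a FILE*. It is a simplified, read-only sketch, not the FileStream class from my branch: the names FileBuf and FileIStream are made up for the example, and the write side and seeking are left out.

```cpp
#include <cstddef>
#include <cstdio>
#include <istream>
#include <streambuf>

// Hypothetical read-only stream buffer that pulls its data from a FILE*.
class FileBuf : public std::streambuf
{
public:
    explicit FileBuf(std::FILE* file) : handle(file)
    {
        // Start with an empty get area so the first read calls underflow().
        setg(buffer, buffer, buffer);
    }

protected:
    // Called by the stream when the get area is exhausted: refill it.
    int_type underflow() override
    {
        if (gptr() < egptr())
            return traits_type::to_int_type(*gptr());
        const std::size_t count =
            handle ? std::fread(buffer, 1, sizeof(buffer), handle) : 0;
        if (count == 0)
            return traits_type::eof();
        setg(buffer, buffer, buffer + count);
        return traits_type::to_int_type(*gptr());
    }

private:
    std::FILE* handle; // not owned by the buffer
    char buffer[4096];
};

// Hypothetical stream class: owns the FILE* and exposes it as a std::istream.
class FileIStream : public std::istream
{
public:
    explicit FileIStream(std::FILE* file)
        : std::istream(nullptr), handle(file), buf(file)
    {
        rdbuf(&buf); // attach the buffer once it is fully constructed
        if (!handle)
            setstate(std::ios_base::failbit);
    }
    ~FileIStream() override
    {
        if (handle)
            std::fclose(handle);
    }

private:
    std::FILE* handle;
    FileBuf buf;
};
```

Code that expects a std::istream& (operator>>, std::getline and so on) can then be handed a FileIStream, while the FILE* itself can come from _wfopen on Windows.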
I’ve decided to try a different approach instead. I’ll revert all my changes, then only implement changes to CFileInputStream. It seems I’ve misunderstood how parts of VCMI work. I’ll also use your MinizipExtensions.cpp/h files to make the functions in the ZipArchive namespace support Unicode filenames.
I’ve decided to somewhat narrow the scope of my task. I’m only aiming to add support for Unicode characters in the part of the path that comes before the VCMI-specific folders. For example, “C:\Users\<non-ascii name>\My Games\vcmi” is supported, but “C:\Program Files\vcmi\config\<non-ascii file>” is not. Implementing the latter would mean major changes to how VCMI handles resources. And it would be a whole lot harder to support Unicode names inside zip archives.
This way, users with non-ASCII usernames are not “discriminated” against any more (meaning everything works just as well for them as for everyone else). But savegames, for example, still can’t have non-ASCII filenames.
What I have left to do now is to test things some more, and to make some of the new code cleaner.
[s]So, I’ve finished Unicode support in most places, but as so often, 20% of the task requires 80% of the work…
ISimpleResourceLoader has a method load() returning std::unique_ptr, but there is no corresponding method for writing. I’d like to add a method write() which returns std::unique_ptr (or similar). Currently, a lot of code looks like “std::ofstream(ISimpleResourceLoader::GetResourceName())”, which is a bit messy, and I don’t think those call sites need to actually know the file name; they only need an output stream.[/s]
EDIT: I don’t think the above is needed right now; maybe it’s something to be done at a later time.