Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

c++ - What is the idiomatic C++17 standard approach to reading binary files?

Normally I would just use C style file IO, but I'm trying a modern C++ approach, including using the C++17 specific features std::byte and std::filesystem.

Reading an entire file into memory, traditional method:

#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>

char *readFileData(char *path)
{
    FILE *f;
    struct stat fs;
    char *buf;

    stat(path, &fs);
    buf = (char *)malloc(fs.st_size);

    f = fopen(path, "rb");
    fread(buf, fs.st_size, 1, f);
    fclose(f);

    return buf;
}

Reading an entire file into memory, modern approach:

#include <cstddef>
#include <filesystem>
#include <fstream>
#include <memory>
#include <string>
using namespace std;
using namespace std::filesystem;

auto readFileData(string path)
{
    auto fileSize = file_size(path);
    auto buf = make_unique<byte[]>(fileSize);
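    // NOTE: the standard library has no std::char_traits<std::byte> specialization,
    // so basic_ifstream<byte> is not portable.
    basic_ifstream<byte> ifs(path, ios::binary);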
    ifs.read(buf.get(), fileSize);
    return buf;
}

Does this look about right? Can this be improved?

Fight dragons for too long and you become a dragon yourself; gaze into the abyss for too long and the abyss gazes back…

1 Answer

Personally I prefer std::vector<std::byte> to std::string unless you are reading an actual text document. The problem with make_unique<byte[]>(fileSize); is that you instantly lose the size of the data and have to carry it in a separate variable. It is unlikely to be any faster than a std::vector<std::byte> either: make_unique<byte[]>(fileSize) value-initializes (zeroes) the array just as the vector constructor does, and in any case that cost will probably always be overshadowed by the time taken reading off the disk.
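
To illustrate the point about losing the size, here is a minimal sketch. The helper names make_raw_buffer and make_vector_buffer are made up for this example and are not part of any library. With the unique_ptr version the length has to be returned alongside the pointer, while the vector carries it along.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical helper: a unique_ptr<byte[]> does not know its own length,
// so the caller has to receive the size as a second value.
std::pair<std::unique_ptr<std::byte[]>, std::size_t> make_raw_buffer(std::size_t n)
{
    return {std::make_unique<std::byte[]>(n), n};
}

// Hypothetical helper: the vector stores its length, so nothing extra
// needs to be passed around.
std::vector<std::byte> make_vector_buffer(std::size_t n)
{
    return std::vector<std::byte>(n);
}
```

Any function that later consumes the raw buffer needs both members of the pair, whereas a function taking the vector can simply call size() on it.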

So for a binary file I use something like this:

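#include <cerrno>
#include <cstddef>
#include <cstring>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

std::vector<std::byte> load_file(std::string const& filepath)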
{
    std::ifstream ifs(filepath, std::ios::binary|std::ios::ate);

    if(!ifs)
        throw std::runtime_error(filepath + ": " + std::strerror(errno));

    auto end = ifs.tellg();
    ifs.seekg(0, std::ios::beg);

    auto size = std::size_t(end - ifs.tellg());

    if(size == 0) // avoid undefined behavior 
        return {}; 

    std::vector<std::byte> buffer(size);

    if(!ifs.read(reinterpret_cast<char*>(buffer.data()), static_cast<std::streamsize>(buffer.size())))
        throw std::runtime_error(filepath + ": " + std::strerror(errno));

    return buffer;
}

This is the fastest method I know of. It also avoids a common mistake when determining the size of the data in the file: after opening the file at the end, ifs.tellg() is not necessarily the same as the file size, and ifs.seekg(0) is not, in theory, the correct way to locate the start of the file (even though it works in practice in most places). Taking the difference between the end position and the position after seeking to std::ios::beg sidesteps both issues.

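The error message from std::strerror(errno) relies on the stream implementation setting errno when it fails, which the C++ standard itself does not require. It works in practice on POSIX systems; I'm not sure whether the same holds for Microsoft's implementation.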
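
If you would rather keep the error code than only its text, one option is std::system_error. The sketch below uses a made-up helper name, open_binary_or_throw, and assumes the stream implementation leaves a meaningful value in errno on failure, which is not something the standard promises.

```cpp
#include <cerrno>
#include <fstream>
#include <string>
#include <system_error>

// Hypothetical helper: open a file for binary reading, or throw a
// std::system_error carrying whatever errno holds after the failure.
std::ifstream open_binary_or_throw(std::string const& filepath)
{
    errno = 0; // clear any stale value from an earlier call
    std::ifstream ifs(filepath, std::ios::binary);

    if (!ifs)
        throw std::system_error(errno, std::generic_category(), filepath);

    return ifs; // std::ifstream is movable
}
```

The caller can then inspect e.code() as well as e.what(). If errno was not set by the implementation, the code will simply be zero.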

Obviously you can use std::filesystem::path const& filepath in place of std::string if you want.
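
As a rough sketch of that substitution (load_file_path is a made-up name to keep it apart from the function above, and the error messages are simplified placeholders), the C++17 std::ifstream constructor accepts a std::filesystem::path directly:

```cpp
#include <cstddef>
#include <filesystem>
#include <fstream>
#include <stdexcept>
#include <vector>

// Hypothetical variant of load_file that takes a std::filesystem::path.
std::vector<std::byte> load_file_path(std::filesystem::path const& filepath)
{
    std::ifstream ifs(filepath, std::ios::binary | std::ios::ate);

    if (!ifs)
        throw std::runtime_error(filepath.string() + ": cannot open file");

    auto end = ifs.tellg();
    ifs.seekg(0, std::ios::beg);

    auto size = std::size_t(end - ifs.tellg());

    if (size == 0) // nothing to read
        return {};

    std::vector<std::byte> buffer(size);

    if (!ifs.read(reinterpret_cast<char*>(buffer.data()),
                  static_cast<std::streamsize>(buffer.size())))
        throw std::runtime_error(filepath.string() + ": read failed");

    return buffer;
}
```

Callers that already hold a std::string can still pass it, since std::filesystem::path converts implicitly from it.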

Also, especially for pre-C++17 code, you can use std::vector<unsigned char> or std::vector<char> if you don't have, or don't want to use, std::byte.

