3
\$\begingroup\$

I proposed several techniques to answer to https://stackoverflow.com/q/78672409/21691539.

A classical way to answer would be to use std::index_sequence-based solution but op had two constraints:

  • it should work with a non default-constructible type
  • it should work for large std::array sizes, larger than what common std::index_sequence implementations are supporting.

I thus proposed this solution:

// creates an std::array<T, N> whose element i is constructed with gen(i)
template <std::size_t N, typename T, typename Generator>
std::array<T, N> make_array(Generator gen) {

    // checking std::array layout
    static_assert(sizeof(std::array<T, N>) == N * sizeof(T));

    // auxiliary struct used to trigger the call of destructors on the stored objects
    // also managed alignement
    struct ArrayWrapper {
        unsigned char storage[sizeof(std::array<T, N>) + alignof(T)];
        std::size_t tmpOffset =
            std::size_t{reinterpret_cast<std::uintptr_t>(storage) % alignof(T)};
        std::size_t Offset = tmpOffset == 0 ? 0 : alignof(T) - tmpOffset;
        unsigned char *alignedStorage = storage+Offset;
        ~ArrayWrapper() {
            T* toDelete = reinterpret_cast<T*>(alignedStorage);
            for (std::size_t i = 0; i != N; ++i) {
                toDelete[i].~T();
            }
        }
    };

    // create storage
    ArrayWrapper storage;
    // construct objects in storage through placement new
    for (std::size_t i = 0; i != N; ++i) {
        new (storage.storage + storage.Offset + i * sizeof(T))
            T(gen(static_cast<int>(i)));
    }
    
    // forcing move semantic for moving the contained objects instead of copying
    // should be legal as std::array has the correct layout and is implicit lifetime
return std::move(
        *reinterpret_cast<std::array<T, N>*>(storage.storage + storage.Offset));
};

Live with sanitizer
Live without sanitizer
Live without sanitizer, nor string

The idea is to allocate a raw storage, construct the objects in place, alias the storage to an std::array<T,N> and "move-initialize" the created array from the aliased one (see full example below). The temporary resource is released on leaving the creator function (objects are deleted).

I have several concerns regarding this code.

The first one is: it is UB? I believe not (see discussion there but I'm unsure).

The second one is: what is it's footprint? Honestly I didn't profile it properly, neither with respect to execution time (as I couldn't compare to the std::index_sequence-based solution which does not compile at all), nor with respect to memory usage (I believe I've got an overhead of one array).

Eventually would there be ways to improve this design?

Here is a full example, similar to the compiler explorer links above:

#include <array>
#include <cstddef>
#include <cstdint>
#include <utility>

#define USESTRING
#ifdef USESTRING
#include <string>
#endif

struct Int {
    int i;
#ifdef USESTRING
    std::string s;
    Int(int val) : i(val), s{} {}
#else
    constexpr Int(int val) : i(val) {}
#endif
    Int() = delete;
    Int(const Int&) = default;
    Int(Int&&) = default;
    Int& operator=(const Int&) = default;
    Int& operator=(Int&&) = default;
    ~Int() = default;
};

template <std::size_t N, typename T, typename Generator>
std::array<T, N> make_array(Generator gen) {
    static_assert(sizeof(std::array<T, N>) == N * sizeof(T));
    struct ArrayWrapper {
        unsigned char storage[sizeof(std::array<T, N>) + alignof(T)];
        std::size_t tmpOffset =
            std::size_t{reinterpret_cast<std::uintptr_t>(storage) % alignof(T)};
        std::size_t Offset = tmpOffset == 0 ? 0 : alignof(T) - tmpOffset;
        unsigned char *alignedStorage = storage+Offset;
        ~ArrayWrapper() {
            T* toDelete = reinterpret_cast<T*>(alignedStorage);
            for (std::size_t i = 0; i != N; ++i) {
                toDelete[i].~T();
            }
        }
    };

    ArrayWrapper storage;
    for (std::size_t i = 0; i != N; ++i) {
        new (storage.alignedStorage + i * sizeof(T))
            T(gen(static_cast<int>(i)));
    }
    return std::move(
        *reinterpret_cast<std::array<T, N>*>(storage.alignedStorage));
};

int main() {
    constexpr std::size_t N = 50000;
    auto gen = [](int i) { return Int(i); };
    std::array<Int, N> arr = make_array<N, Int>(gen);
    // following line does not compile
    // constexpr std::array<Int, N> arr2 = make_array<N, Int>(gen);
    arr[10] = Int(2);
    return (arr[10].i) * (arr[3].i);
}

EDIT: as suggested in @G.Sliepen answer, I propose also a dynamic-temporary version for review: Presence of UB and memory usage of a std::array initialization: version with temporary array on heap

\$\endgroup\$

1 Answer 1

3
\$\begingroup\$

It's inefficient

The idea of constructing an array from a generator is really nice. However, it is not efficient at all. You are first constructing the whole array in a temporary storage location, which is then moved into the return value, but before returning from the function it first has to destroy the moved-from elements of the temporary storage. Only in very trivial cases will the compiler be able to optimize this away.

If T does have a default constructor, then the naive implementation:

template <std::size_t N, typename T, typename Generator>
std::array<T, N> make_array2(Generator gen) {
    std::array<T, N> storage;
    for (std::size_t i = 0; i != N; ++i) {
        storage[i] = gen(i);
    }
    return storage;
};

Is actually vastly superior, as it will assign values directly in the return value. You could make two overloads or use if constexpr to choose to use the naive version or your version depending on whether T is trivial or has a default constructor.

It always uses stack space

While std::array has the advantage that it can live completely on the stack, you don't have to. You can store a std::array on the heap as well. That is important for large arrays, as stack space can be quite limited. However, your make_array() always puts the temporary storage space on the stack, which could cause a stack overflow if N is very large.

Match the order of template parameters of std::array

I would switch the order of the template parameters N and T so that it matches that of std::array.

\$\endgroup\$
5
  • \$\begingroup\$ Thanks for this feedback. Regarding your first comment, the prerequisite is that T may be not default-constructible. Yet an improvement would be to "if constexpr" and use your implementation when T is default-constructible. Regarding the stack space usage, you can see in the linked question a first version where I used dynamic storage . I thought that using the stack was an improvement but it's not so obvious if I follow your reasoning. May you be so kind as having a look at the dynamic version? As for the last remark: I totally agree, thanks. \$\endgroup\$
    – Oersted
    Commented 2 days ago
  • \$\begingroup\$ NB I had another version where I removed the explicit destructors call when Twas trivially destructible but it seemed to change nothing for the 3 tested compilers. \$\endgroup\$
    – Oersted
    Commented 2 days ago
  • \$\begingroup\$ Sorry, I read too fast your answer and I see now that you suggested the "if constexpr". \$\endgroup\$
    – Oersted
    Commented 2 days ago
  • \$\begingroup\$ Regarding the cost of deallocating, I heard of a technique using pmr to efficiently release memory but I'm unsure of its validity. See stackoverflow.com/questions/78667752/…. Do you think it would be applicable? \$\endgroup\$
    – Oersted
    Commented 2 days ago
  • \$\begingroup\$ If you want us to review the version that uses dynamic storage, just create a new question here on Code Review. As for polymorphic memory resources: these will only optimize the memory (de)allocation, but do nothing for the construction and destruction of the objects that live in them. That still needs to happen, otherwise you will get UB. \$\endgroup\$
    – G. Sliepen
    Commented yesterday

Not the answer you're looking for? Browse other questions tagged or ask your own question.