The Complete Guide to Return X;

The Complete Guide to Return X;

The Complete Guide to return x; I also do C++ training! [email protected] Arthur O’Dwyer 2021-05-04 Outline ● The “return slot”; NRVO; C++17 “deferred materialization” [4–23] ● C++11 implicit move [24–29]. Question break. ● Problems in C++11; solutions in C++20 [30–46]. Question break. ● The reference_wrapper saga; pretty tables of vendor divergence [47–55] ● Quick sidebar on coroutines and related topics [56–65]. Question break. ● P2266 proposed for C++23 [66–79]. Questions! Hey look! Slide numbers! 3 x86-64 calling convention int f() { _Z1fv: int i = 42; movl $42, -4(%rsp) return i; movl -4(%rsp), %eax } retq int test() _Z4testv: { callq _Z1fv int j = f(); addl $1, %eax return j + 1; retq } On x86-64, the function’s return value usually goes into the %eax register. 4 x86-64 calling convention Stack Segment int f() { Since f and test each have their own int i = 42; f i printf("%p\n", &i); stack frame, i and j naturally are different return i; prints “0x9ff00020” variables. } test j j is initialized with a int test() { copy of i — C++ : int j = f(); loves copy semantics. : printf("%p\n", &j); : return j + 1; prints “0x9ff00040” } main 5 x86-64 calling convention Stack Segment struct S { int m; }; Even for class types, C++ does “return by f i S f() { copy.” prints “ ” S i = S{42}; 0x9ff00020 The return value is printf("%p\n", &i); still passed in a test j return i; machine register } when possible. : : S test() { : S j = f(); prints “0x9ff00040” printf("%p\n", &j); main return j; } 6 x86-64 calling convention But what about when Stack Segment struct S { int m[3]; }; S is too big to fit in a register? S f() { f i Then x86-64 says that S i = S{{1,3,5}}; the caller should pass printf("%p\n", &i); an extra parameter, return i; pointing to space in } the caller’s own : return slot stack frame big : S test() { enough to hold the test: S j = f(); result. printf("%p\n", &j); This is the “return j return j; slot.” } 7 x86-64 calling convention Stack Segment struct S { int m[3]; }; The following explanation is the baseline C++98 explanation, with no optimizations or tricks. S f() { f i S i = S{{1,3,5}}; printf("%p\n", &i); Don’t worry, the tricks are return i; coming. } : return slot : S test() { test: S j = f(); printf("%p\n", &j); j return j; } 8 x86-64 calling convention Stack Segment struct S { int m[3]; }; S f() { f i S i = S{{1,3,5}}; printf("%p\n", &i); f knows the return slot’s return i; address because test passed that address to f as a hidden } : return parameter (in register %rdi). slot : At the return statement, test: S test() { i’s value is copied into the S j = f(); return slot. j printf("%p\n", &j); return j; } 9 x86-64 calling convention Stack Segment struct S { int m[3]; }; S f() { S i = S{{1,3,5}}; Now f is done and i is gone. printf("%p\n", &i); Think of the value of f() as the return i; value “in the return slot.” } : return slot : S test() { test: S j = f(); printf("%p\n", &j); j return j; } 10 x86-64 calling convention Stack Segment struct S { int m[3]; }; S f() { S i = S{{1,3,5}}; printf("%p\n", &i); return i; } : return slot : test: S test() { Finally, the value in the return S j = f(); slot is used to copy-initialize j. printf("%p\n", &j); j return j; } 11 Again! Faster! Stack Segment struct S { int m[3]; }; This step is slow. S f() { test can do better. S i = S{{1,3,5}}; Since test controls the printf("%p\n", &i); allocations of both the return return i; slot and j, and test knows that the return slot will be used } : return to copy-initialize j, test can slot allocate them both at the same : S test() { memory address and avoid test: S j = f(); having to do the copy! printf("%p\n", &j); j return j; } 12 Again! Faster! Stack Segment struct S { int m[3]; }; S f() { S i = S{{1,3,5}}; printf("%p\n", &i); return i; } f i S test() { S j = f(); return printf("%p\n", &j); test slot, also j return j; } 13 Again! Faster! Stack Segment struct S { int m[3]; }; S f() { S i = S{{1,3,5}}; printf("%p\n", &i); return i; } f i S test() { S j = f(); return printf("%p\n", &j); test slot, also j return j; } 14 Again! Faster! Stack Segment struct S { int m[3]; }; S f() { And we’re done! S i = S{{1,3,5}}; Notice that the optimizer can printf("%p\n", &i); do this optimization all by itself, under the as-if rule, because return i; the “copying” of S has no f i } user-visible side effects. S test() { S j = f(); return printf("%p\n", &j); test slot, also j return j; } 15 C++98 “copy elision” Stack Segment struct S { int m[3]; S(S&&); }; S f() { If we give the “copy” visible S i = S{{1,3,5}}; side effects, guess what? printf("%p\n", &i); We can still do the optimization! The C++98 return i; standard permitted us to elide f i } even a visible constructor call when initializing an object with S test() { a temporary of the same type. S j = f(); return printf("%p\n", &j); But wait, C++17 made it even test slot, also j return j; better!... } 16 C++17 “deferred prvalue materialization” C++17 changed the high-level formal struct S { int m[3]; S(S&&); }; semantics of a prvalue expression like “f()”. S f() { In C++14, f() eagerly evaluated into a S i = S{{1,3,5}}; temporary object that had to be moved printf("%p\n", &i); into j. Copy-elision was permitted by a return i; special case. } In C++17, a prvalue is more like a recipe for initializing an object, known S test() { as the expression’s “result object.” S j = f(); Here, that result object ultimately turns printf("%p\n", &j); out to be j itself. return j; The formal semantics changed. } The machine code did not! 17 Again! Faster! Stack Segment struct S { int m[3]; S(S&&); }; S f() { This step is slow. S i = S{{1,3,5}}; f can do better. printf("%p\n", &i); Since f controls the allocation return i; of i, and knows that it will be } used to initialize the return slot, f i f can allocate i in the return slot, and avoid having to do S test() { the copy! S j = f(); Let’s run through that... return printf("%p\n", &j); test slot, also j return j; } 18 Named Return Value Optimization Stack Segment struct S { int m[3]; S(S&&); }; S f() { S i = S{{1,3,5}}; printf("%p\n", &i); test allocates its stack frame as usual, and passes f a return i; pointer to the return slot, in } which f will construct its result. S test() { (...in which the result object of S j = f(); the prvalue f() will be return printf("%p\n", &j); materialized.) test slot, also j return j; } 19 Named Return Value Optimization Stack Segment struct S { int m[3]; S(S&&); }; S f() { S i = S{{1,3,5}}; printf("%p\n", &i); return i; } f S test() { S j = f(); i, also return test printf("%p\n", &j); slot, return j; also j } 20 Named Return Value Optimization i is already located in the Stack Segment struct S { int m[3]; S(S&&); }; return slot, so this return statement corresponds to S f() { zero machine code. S i = S{{1,3,5}}; This optimization is done printf("%p\n", &i); automatically under the as-if rule, but even when the return i; constructor call would be } visible, C++98 permits the compiler to elide it as a f S test() { special case. S j = f(); Note that i is not a prvalue, i, also return so this was not affected by test printf("%p\n", &j); C++17’s “deferred slot, also j return j; materialization” business. } 21 Conditions for NRVO to kick in Many things have to go right for NRVO to happen. ● There must be a return slot. Trivial types can just be returned in registers ○ But types with non-trivial SMFs will always be returned via return slot ● The allocation of “return variable” i must be under f’s control ○ Otherwise f can’t allocate i into the return slot! ● i must have the exact same type (modulo cv-qualification) as f’s return slot ○ Otherwise i won’t fit into the return slot! ● One mental — not physical — caveat: The return’s operand must be exactly a (possibly parenthesized) id-expression, such as i. Nothing more complicated. ○ Otherwise things could get very confusing for the human programmer! 22 Examples of NRVO not happening struct Trivial { int m; }; struct S { S(); ~S(); }; struct D : public S { int n; }; Trivial f() { Trivial x; return x; } // no return slot S x; S g1() { return x; } // g doesn’t control allocation of x S g2() { static S x; return x; } // same deal S g3(S x) { return x; } // same deal (params are caller-allocated!) S h() { D x; return x; } // D is too big for the return slot 23 Introducing move semantics ● C++11 added move constructors and move-only types, such as unique_ptr.

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    84 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us