Bits of Bytes
bits of coding, C++, Qt, git, gamedev, linux and other tech stuff
October 13, 2016
benchmark, C++, C++11, cpp, performance, pointer, raw pointer, shared_ptr, smart pointer, STD, test, unique_ptr
In this post I analyse and discuss the performance of raw pointers vs smart pointers in a few C++11 benchmarks. The operations tested are creation, copy, data access and destruction.
Did you activate optimizations? If not, this comparison isn’t really useful.
Last line: “They were compiled using g++ 4.8.4 with the following flags: -O3 -s -Wall -std=c++11.”
So optimizations were on.
Yes, of course; I added the compilation flags to the post.
Your comment about weak_ptr keeping an object alive is incorrect.
If a weak_ptr is created from a shared_ptr, then it will participate in reference counting. If it is created from a raw pointer, it will not.
FWIW, this is also true for shared_ptr — if two shared_ptr’s are created from the same *raw* pointer, they don’t know about each other, will have separate reference counts, and one of them is guaranteed to dangle.
If you want to have multiple smart pointers (weak or shared makes no difference), the second and subsequent smart pointers MUST be created from a smart pointer, not from a raw pointer.
A good explanation can be found at: http://thispointer.com/create-shared_ptr-objects-carefully/
The OP’s comment about weak pointers is definitely incorrect, but in a different way from what you suggest. The whole point of a weak pointer is that it DOES NOT participate in the reference counting that keeps the shared pointer alive; the weak_ptr API enforces this by ensuring that the only thing one can do with a weak_ptr is copy it or convert it to a shared_ptr of the same type, which is itself a copy of the shared_ptr from which the weak_ptr was obtained.
I’m confused by your assertion that “if a weak_ptr is created from a shared_ptr, then it will participate in the reference counting.” There is no other way to obtain a weak_ptr than through a shared_ptr or copy/move of another weak_ptr. Most importantly, there is no way to create a weak_ptr from a raw pointer (http://en.cppreference.com/w/cpp/memory/weak_ptr/weak_ptr).
And weak_ptr’s definitely do not participate in the count of strong references. Most shared_ptr implementations do track the number of extant weak_ptrs, but a non-zero weak_ptr count does not prevent destruction of the object owned by the shared_ptr.
I am just reporting what is stated in the Stack Overflow discussion I posted below.
It’s something I haven’t tested myself yet, but I will, and eventually I’ll blog about it.
What he means is when you create a std::shared_ptr using new vs std::make_shared. In the former case, the control block and the object are allocated separately, so when the last std::shared_ptr is destroyed, the object can be deallocated (the control block stays alive if there are any remaining std::weak_ptr). In the latter case, both the object and the control block are allocated in a single operation. This means that the object’s memory will remain allocated until the last std::weak_ptr is destroyed.
What I wanted to get across was that a weak_ptr will keep the *control block* alive as long as any weak_ptr to the object exists.
Unfortunately I couldn’t find the diagram I was looking for, and my choice of words was unclear. (Still can’t find the diagram I was looking for, but this is not bad: https://goo.gl/images/cWbKz2)
That, plus the fact that it is all too easy to create two separate, unrelated shared_ptr’s from a single raw pointer.
Thanks for the clarification; makes much more sense.
What you are referring to are common pitfalls when using smart pointers.
What I am talking about is something completely different; you can read more about it here: http://stackoverflow.com/questions/20895648/difference-in-make-shared-and-normal-shared-ptr-in-c
Good post! People are really too afraid of using smart pointers for performance reasons, whereas they are in fact really cheap to use.
Not sure I believe the results here without further analysis. Measuring things in a very small loop that is executed millions of times tends to create results not representative of real-world performance, due to techniques like loop unrolling and the superscalar nature of CPUs, which lets them parallelize certain things in a loop that they may not be able to otherwise.
I suspect that the cost of shared_ptr is quite a bit higher than presented here, since it involves an atomic increment/decrement every time the shared_ptr is passed around. People tend to pass shared_ptr’s around by value for some reason, mistaking them for raw ptrs. This is quite an expensive operation, and probably outweighs the cost of creation and destruction by quite a bit. Unique_ptr’s don’t really have this issue, since they force developers to do the right thing (move instead of copy).
Except for some very specific use cases, I don’t really see the need for anybody to use shared_ptr’s in single-threaded code. A combination of unique_ptr’s and raw ptrs/references should be preferred. There is a reason why there is no shared_ptr implementation without an atomic integer inside; that pretty much gives a hint that shared_ptr’s are primarily for multi-threaded code, and their overhead should be avoided in single-threaded applications.
I am always surprised how many experts really believe that they can write the most efficient code in the world by manually optimizing their C/C++ code through obscure techniques for a final gain of 12 microseconds in the main loop, only to forget to sort their main data structure once instead of over and over again inside each iteration. That is where people should focus, IMHO.
The most important statement of Davide’s article, I think, definitely is: “we are still talking about 23ms per 1M pointers. That means that unless your code looks like the one in the benchmark (and it shouldn
Concern about copy performance…
Are you sure that copying really occurs, and that this is not a result of copy elision by the optimizer?
Your test is wrong: if it were true, it would imply that the atomic reference counter is free, which it is not; it is actually expensive and non-deterministic.