c++ memory management

C++ Memory Management Anil Bapat

Upload: reachanil

Post on 22-Dec-2014


DESCRIPTION

This talk is about the most error prone part of C++ - The memory management part.

TRANSCRIPT

Page 1: C++ Memory Management

C++ Memory Management

Anil Bapat

Page 2: C++ Memory Management

Agenda

• Why pointers are evil

• new/delete operators

• Overloading new/delete

• Memory pools

• Smart Pointers
  – Scoped pointers
  – Reference counting
  – Copy on write

Page 3: C++ Memory Management

Common Pointer related pitfalls

• Bitwise copying of objects could lead to dangling pointers + memory leak

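• Treating operator= the same as a copy constructor could lead to memory leaks, because the assignment target already owns memory that must be released first

• Treating arrays polymorphically (indexing or deleting an array of derived objects through a base class pointer) is undefined behavior and usually crashes

• A non-virtual destructor in a base class makes deleting a derived object through a base pointer undefined behavior; in practice the derived part is never destroyed, which leaks memory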

Page 4: C++ Memory Management

Common Pointer related pitfalls (Contd)

• Pairing up new with delete [] or new [] with delete leads to undefined behavior (a.k.a late night debugging)

• Pairing up new with free or malloc with delete also leads to undefined behavior

• Incorrect casts (picking a base class pointer from a container and casting it to the wrong derived class) lead to crashes

Page 5: C++ Memory Management

Common Pointer related pitfalls (Contd)

• Returning a pointer from within a function
  – Returning a pointer to local data (crash)
  – Returning a pointer to a static buffer (the next call overwrites the data, and the function is neither reentrant nor thread safe)

• Un-allocated pointers being used in programs

• Too many exit paths (including exceptions) from functions: memory allocated within the function needs to be freed explicitly on every one of these paths

• Overlapping src and dst pointers in memcpy (undefined behavior; memmove handles overlap)

Page 6: C++ Memory Management

new/delete operators

• The memory related operators are
  – new
  – delete
  – new []
  – delete []
  – placement new
  – placement delete

Page 7: C++ Memory Management

new/delete operators (Contd)

• operator new allocates memory for the specified type. If memory is unavailable, it invokes the installed new_handler function and retries; if no new_handler is installed, it throws a bad_alloc exception

• operator new (std::nothrow) does all of the above, but doesn’t throw an exception, returns NULL instead
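
The two failure styles can be compared side by side. This is a minimal sketch: `NeverFits` is a hypothetical type whose class specific allocation functions always report exhaustion, so both paths can be shown without actually running out of memory.

```cpp
#include <cstddef>
#include <new>

// Hypothetical type whose allocation functions always report exhaustion,
// so both failure paths can be shown without running out of real memory.
struct NeverFits {
    int value;

    static void* operator new(std::size_t) { throw std::bad_alloc(); }
    static void* operator new(std::size_t, const std::nothrow_t&) noexcept {
        return nullptr;
    }
    static void operator delete(void* p) noexcept { ::operator delete(p); }
    static void operator delete(void* p, const std::nothrow_t&) noexcept {
        ::operator delete(p);
    }
};

// Returns true if plain new reported failure by throwing std::bad_alloc.
bool plain_new_throws() {
    try {
        NeverFits* p = new NeverFits;
        delete p;
        return false;
    } catch (const std::bad_alloc&) {
        return true;
    }
}

// Returns true if the nothrow form reported failure with a null pointer.
bool nothrow_new_returns_null() {
    NeverFits* p = new (std::nothrow) NeverFits;
    bool failed = (p == nullptr);
    delete p;   // deleting a null pointer is harmless
    return failed;
}
```

With the nothrow form every call site has to check the result, which is easy to forget; the throwing form cannot be ignored silently.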

Page 8: C++ Memory Management

new/delete operators (Contd)

• One could use the set_new_handler function to install a custom new_handler which will be called when new cannot allocate any more memory

• The custom new_handler could free up some excess capacity in the memory pools and make more memory available, or simply abort/throw an exception, or even install ANOTHER new_handler

Page 9: C++ Memory Management

new/delete operators (Contd)

• The operator delete frees up the memory allocated by operator new

• For a class specific operator delete declared with a size_t parameter, the size of the object to be deleted is passed along with the raw pointer

• OK to call delete on NULL/0

• new[]/delete[] do alloc/de-alloc for arrays

• Doing delete [] pBase with pBase pointing to a derived class array is undefined behavior and usually crashes

Page 10: C++ Memory Management

placement new/delete

• With placement new, no memory allocation done, but the object is constructed and ‘Placed’ at the memory location provided via the void* passed to new

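• With placement delete, no memory de-allocation is done. It is called automatically only when the constructor invoked through placement new throws; otherwise the object is destroyed with an explicit destructor call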

• Used for ‘Memory mapping (mmap)’, ‘Cache Friendly Code’ and such advanced applications
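
A minimal sketch of placement new on a stack buffer follows. `Record` and `build_in_buffer` are hypothetical names chosen for illustration; in a real application the storage might come from mmap or a pool instead.

```cpp
#include <new>
#include <string>
#include <utility>

struct Record {
    std::string name;
    int id;
    Record(std::string n, int i) : name(std::move(n)), id(i) {}
};

// Constructs a Record inside caller-provided storage and returns its id.
int build_in_buffer() {
    // Raw storage with the right size and alignment; nothing is constructed yet.
    alignas(Record) unsigned char buffer[sizeof(Record)];

    // Placement new: no allocation, it only runs the constructor at 'buffer'.
    Record* r = new (buffer) Record("example", 7);
    int id = r->id;

    // No delete here, because the storage did not come from operator new.
    // The object's lifetime is ended with an explicit destructor call.
    r->~Record();
    return id;
}
```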

Page 11: C++ Memory Management

the new_handler

• When new/new[] is not able to allocate the required memory, it invokes the new_handler

• Users install new_handlers that could somehow ‘Create more memory’ (Usually by reducing the capacity of certain pools)

• Users typically have 2 new_handlers defined – CreateMoreMemoryHdlr() and GiveUpHdlr(). The former is invoked the first time; after it has created more memory, it installs the latter as the new_handler. The latter simply aborts the program
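
The two-handler arrangement could look roughly like this. It is a sketch under assumptions: `g_reserve` and `install_handlers` are hypothetical, and a single reserve buffer stands in for the "excess pool capacity" that a real handler would release.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical emergency reserve standing in for excess pool capacity.
static char* g_reserve = nullptr;

// Last resort: nothing is left to release, so stop the program.
void GiveUpHdlr() {
    std::abort();
}

// First resort: release the reserve, then make GiveUpHdlr the next handler.
void CreateMoreMemoryHdlr() {
    delete[] g_reserve;
    g_reserve = nullptr;
    std::set_new_handler(GiveUpHdlr);
}

// Called once at startup: set the reserve aside and install the first handler.
void install_handlers(std::size_t reserveBytes) {
    g_reserve = new char[reserveBytes];
    std::set_new_handler(CreateMoreMemoryHdlr);
}
```

After a handler returns, operator new retries the allocation, so a handler that cannot free anything has to abort, throw bad_alloc, or install a different handler; otherwise the retry loop never ends.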

Page 12: C++ Memory Management

Overloading new/delete operators

• new/delete/new[]/delete[] operators can be replaced ‘Globally’, not the placement new/delete

• All the 6 operators could be replaced for a specific class

• If a class replaces new/delete, it can still use the global new[] and delete[], but not the global placement new/delete or any other overloaded new/delete form, because the class specific operator hides them. The class has to define its own placement/overloaded new/delete as well

Page 13: C++ Memory Management

Overloading new/delete operators (Contd)

• To be safe, when you decide to go for class specific replacement, replace all of new/delete/new[] and delete[] together in the class. If you need them, replace the placement new/delete as well

• In the base class methods, compare the size of the object with the base class type, if unequal, direct the call to the global ::operator new/delete method (To account for derived classes missing new/delete operators)
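
The size check described above can be sketched as follows. The counters and the use of std::malloc as the "class specific allocator" are assumptions made only to keep the example short and observable.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Counters that exist only to make the dispatch visible in this sketch.
static int g_customAllocs = 0;
static int g_forwardedAllocs = 0;

class Base {
public:
    virtual ~Base() {}

    static void* operator new(std::size_t size) {
        if (size != sizeof(Base)) {
            // A derived class inherited this operator: hand the request
            // to the global allocator instead of the Base-sized one.
            ++g_forwardedAllocs;
            return ::operator new(size);
        }
        ++g_customAllocs;
        void* p = std::malloc(size);   // stand-in for a class specific allocator
        if (!p) throw std::bad_alloc();
        return p;
    }

    static void operator delete(void* p, std::size_t size) noexcept {
        if (!p) return;
        if (size != sizeof(Base)) {
            ::operator delete(p);      // mirrors the forwarding in operator new
            return;
        }
        std::free(p);
    }

private:
    int baseData_ = 0;
};

class Derived : public Base {
    int extra_[4] = {};   // makes sizeof(Derived) differ from sizeof(Base)
};
```

The check in operator delete has to mirror the one in operator new exactly, otherwise memory obtained from one allocator is returned to the other.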

Page 14: C++ Memory Management

Overloading new/delete operators (Contd)

• If base class dtor is virtual, then operator delete in the derived class will get the correct size_t value, regardless of which type of pointer is used for the delete (Surprise!)

• However, the behavior is undefined if we do this for deleting a derived class array using base pointer
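
A small experiment makes the size_t behavior for single objects visible. `Shape`, `Polygon` and the global `g_lastDeletedSize` are hypothetical names used only for this sketch.

```cpp
#include <cstddef>
#include <new>

// Records the size most recently handed to operator delete (sketch only).
static std::size_t g_lastDeletedSize = 0;

struct Shape {
    virtual ~Shape() {}

    static void operator delete(void* p, std::size_t size) noexcept {
        g_lastDeletedSize = size;
        ::operator delete(p);
    }

    double area = 0.0;
};

struct Polygon : Shape {
    double vertices[16] = {};
};

// Deletes a Polygon through a Shape* and reports the size operator delete saw.
std::size_t size_seen_when_deleting_through_base() {
    Shape* p = new Polygon;   // static type Shape*, dynamic type Polygon
    delete p;                 // the virtual destructor selects the right size
    return g_lastDeletedSize;
}
```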

Page 15: C++ Memory Management

Why replace new/delete?

• Better performance

• Debugging

• Collecting statistics

Page 16: C++ Memory Management

Performance

• Pre-allocating memory at init time, instead of getting it at runtime from the system, saves us time in the critical path and also reduces ‘Heap fragmentation’

• Custom new/deletes could make some simplifying assumptions to increase speed
  – All objects of same size [simplified search algorithm – no best fit/worst fit strategy required]
  – Same thread creating and freeing memory for a specific type (No locks required)

Page 17: C++ Memory Management

Debugging

• Custom new/delete methods could be made to insert a ‘Magic number’ a few bytes before and after the actual memory allocated and then detect when this gets overwritten

• Custom new/deletes could build a list of allocated memory elements and later check for leaks
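
The magic number idea can be sketched with three helper functions that a custom operator new/delete could call. The names `guarded_alloc`, `guards_intact` and `guarded_free`, the header layout and the value of `kMagic` are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

const std::uint32_t kMagic = 0xDEADBEEF;

// Header placed in front of every block. alignas keeps the user pointer
// that follows it suitably aligned for any object type.
struct alignas(std::max_align_t) GuardHeader {
    std::size_t size;      // user-visible size, needed to find the tail guard
    std::uint32_t magic;   // front guard
};

void* guarded_alloc(std::size_t size) {
    unsigned char* raw = static_cast<unsigned char*>(
        std::malloc(sizeof(GuardHeader) + size + sizeof(kMagic)));
    if (!raw) return nullptr;
    GuardHeader* header = reinterpret_cast<GuardHeader*>(raw);
    header->size = size;
    header->magic = kMagic;
    unsigned char* user = raw + sizeof(GuardHeader);
    std::memcpy(user + size, &kMagic, sizeof(kMagic));   // tail guard
    return user;
}

// Returns false if either guard around the user block was overwritten.
bool guards_intact(const void* userPtr) {
    const unsigned char* user = static_cast<const unsigned char*>(userPtr);
    const GuardHeader* header =
        reinterpret_cast<const GuardHeader*>(user - sizeof(GuardHeader));
    std::uint32_t tail;
    std::memcpy(&tail, user + header->size, sizeof(tail));
    return header->magic == kMagic && tail == kMagic;
}

void guarded_free(void* userPtr) {
    if (!userPtr) return;
    std::free(static_cast<unsigned char*>(userPtr) - sizeof(GuardHeader));
}
```

A debugging operator delete would typically call `guards_intact` before releasing the block and report the corruption at that point.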

Page 18: C++ Memory Management

Collecting statistics

• While doing system engineering of a product, one could use custom new/deletes to collect the statistics of the peak/average memory usage on a global or a class wise basis.

• Tests run on the system can help size our memory pools appropriately and accurately determine our memory requirements and their spread
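
One way to gather class-wise statistics is a small mixin that a class inherits from. This is only a sketch: `AllocStats` and `Session` are hypothetical, the counters are not thread safe, and only live and peak object counts are tracked.

```cpp
#include <cstddef>
#include <new>

// Mixin that counts live and peak instances of T. Not thread safe.
template <class T>
class AllocStats {
public:
    static void* operator new(std::size_t size) {
        void* p = ::operator new(size);
        ++live_;
        if (live_ > peak_) peak_ = live_;
        return p;
    }

    static void operator delete(void* p) noexcept {
        if (p) --live_;
        ::operator delete(p);
    }

    static std::size_t live() { return live_; }
    static std::size_t peak() { return peak_; }

private:
    static std::size_t live_;
    static std::size_t peak_;
};

template <class T> std::size_t AllocStats<T>::live_ = 0;
template <class T> std::size_t AllocStats<T>::peak_ = 0;

// Hypothetical class that opts in to the statistics.
struct Session : AllocStats<Session> {
    int id = 0;
};
```

The peak value observed during system tests is the kind of number that can then feed the NumElements parameter of a class specific pool.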

Page 19: C++ Memory Management

Memory pools

• Maximum benefit by defining ‘Class specific memory pools’ as we can make most simplifying assumptions on a class basis (Fixed size and threading)

• Not much benefit in defining global memory pools
  – No simplifying assumptions can be made
  – The default new/delete are very efficient general purpose memory allocators, and it is very tough to outdo them in speed

• Profile and compare the default new/delete with your own new/delete version before you replace the global new/delete operators.

Page 20: C++ Memory Management

Class Specific Pools

• Have a static member of the class as your own memory pool instance (Memory pool ctor takes NumElements and FixedSizeOfElems as parameters)

• Define new/delete operators in your class to alloc/free elements from your own pool

• Alternative – Hierarchy specific pools: define a memory pool in the base class with size_of_elems = max(sizes of all possible derived classes) and num_elems = total expected number of all objects in the hierarchy. Easier, less code, but be careful to check the requested size against size_of_elems.
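
A class specific pool along these lines might be sketched as below. `MemoryPool` takes the two constructor parameters named on the slide; everything else (the free list layout, the `Particle` class, the pool size of 4) is an assumption, and the pool is single-threaded by design.

```cpp
#include <cstddef>
#include <new>
#include <vector>

// Fixed-size pool: one up-front allocation carved into equal slots that are
// chained into a free list. Single-threaded by assumption, so no locks.
class MemoryPool {
public:
    MemoryPool(std::size_t numElements, std::size_t fixedSizeOfElems)
        : slotSize_(roundUp(fixedSizeOfElems)),
          storage_(numElements * slotSize_),
          freeList_(nullptr),
          available_(0) {
        for (std::size_t i = 0; i < numElements; ++i)
            deallocate(&storage_[i * slotSize_]);
    }

    void* allocate() {
        if (!freeList_) throw std::bad_alloc();   // pool exhausted
        Slot* slot = freeList_;
        freeList_ = slot->next;
        --available_;
        return slot;
    }

    void deallocate(void* p) {
        freeList_ = new (p) Slot{freeList_};      // reuse the slot as a list node
        ++available_;
    }

    std::size_t available() const { return available_; }

private:
    struct Slot { Slot* next; };

    // Every slot must be able to hold a Slot and keep the next slot aligned.
    static std::size_t roundUp(std::size_t n) {
        const std::size_t a = alignof(std::max_align_t);
        if (n < sizeof(Slot)) n = sizeof(Slot);
        return (n + a - 1) / a * a;
    }

    std::size_t slotSize_;
    std::vector<unsigned char> storage_;
    Slot* freeList_;
    std::size_t available_;
};

// Hypothetical class that allocates its objects from its own static pool.
class Particle {
public:
    static void* operator new(std::size_t size) {
        if (size != sizeof(Particle)) return ::operator new(size);
        return pool_.allocate();
    }

    static void operator delete(void* p, std::size_t size) noexcept {
        if (!p) return;
        if (size != sizeof(Particle)) { ::operator delete(p); return; }
        pool_.deallocate(p);
    }

    static std::size_t slotsAvailable() { return pool_.available(); }

private:
    static MemoryPool pool_;
    double x_ = 0.0;
    double y_ = 0.0;
};

MemoryPool Particle::pool_(4, sizeof(Particle));
```

Allocation and release are each a couple of pointer operations, which is where the speed advantage over a general purpose allocator comes from. This sketch throws bad_alloc when the pool is exhausted; falling back to the global allocator is another possible policy.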

Page 21: C++ Memory Management

Memory debuggers are tricked

• Memory debuggers like valgrind can’t account for what happens within your pools.

• There should be a DEBUG flag setting which can disable all pool dependency in your code so you could use that flag for valgrind debugging

• Valgrind works on ‘Binary instrumentation’

• Get familiar with valgrind – We identified several vicious memory leaks and corruptions with this tool in multiple products – valgrind.org

Page 22: C++ Memory Management

Smart Pointers

• Looks and feels like a pointer, but is smarter

• operators ->, * and ! are overloaded so that sp->, *sp and if (!sp) work as if sp was actually a pointer type

• Behaves just like pointers w.r.t. upcasts and downcasts in the inheritance hierarchy. This is done using a template member function in the sp class that does all valid conversions of the internal pointer

Page 23: C++ Memory Management

The ‘Smart’ part

• Prevent memory leaks – dtor automatically frees the contained pointer

• Prevent dangling pointers – copy ctor and assignment operators implement specific ‘Transfer of ownership’ policies

• Improve performance – By implementing reference counting and copy on write

Page 24: C++ Memory Management

Smarter, and invisible too!

• All operations happen ‘Under the hood’ without the users knowing about it (Fully encapsulated)

• Except the place where the sp is defined, everywhere else it is just like using a pointer

• Existing code that uses pointers, could be made to use smart pointers with minimal code change

Page 25: C++ Memory Management

Smart pointer class

template <class T>
class SmartPointer
{
public:
    SmartPointer(T* pointee = NULL) : _pointee(pointee) {}
    SmartPointer(const SmartPointer& rhs);
    SmartPointer& operator=(const SmartPointer& rhs);
    ~SmartPointer();
    T* operator->() const;
    T& operator*() const;
    bool operator!() const;

    // Converts to a smart pointer of any type that T* converts to implicitly
    template <class Destination>
    operator SmartPointer<Destination>()
    {
        return SmartPointer<Destination>(_pointee);
    }

private:
    T* _pointee;
};

Page 26: C++ Memory Management

Smart pointer usage

Base* pBase = new Base;
Derived* pDerived = new Derived;

extern void func(SmartPointer<Base> spBase); // Function that takes a
                                             // base smart pointer as argument

SmartPointer<Derived> sp(pDerived);

sp->mData = 0;
cout << (*sp).mData;
if (!sp)
    cout << "pClass pointer value is NULL";

func(sp); // Can pass smart pointer of derived class type to a
          // function that accepts smart pointer of base class type

Page 27: C++ Memory Management

Types of smart pointers

• Categorized based on the ‘Ownership policy’ they implement
  – Scoped pointers (Eg: auto_ptr)
  – Shared pointers (The boost shared_ptr)

• Scoped pointers are light weight and don’t allow multiple pointers to same data

• Shared pointers implement reference counting and copy on write and hence are heavy duty beasts

Page 28: C++ Memory Management

Scoped Pointers

• dtor destroys the contained pointer (Preventing memory leaks and providing exception safety)

• Either prevent copying (Like boost::scoped_ptr) or transfer ownership to destination (Like std::auto_ptr) or do a deep copy (Like nonstd::auto_ptr)

• std::auto_ptr is dangerous. Passing it to a function by value silently transfers ownership, leaving the caller’s auto_ptr pointing to NULL
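
std::auto_ptr was deprecated in C++11 and removed in C++17; std::unique_ptr implements the same transfer-of-ownership policy but requires the transfer to be written out. A minimal sketch, with `Widget` and `consume` as hypothetical names:

```cpp
#include <memory>
#include <utility>

struct Widget {
    int value;
    explicit Widget(int v) : value(v) {}
};

// Takes ownership: the Widget is destroyed when 'w' goes out of scope.
int consume(std::unique_ptr<Widget> w) {
    return w->value;
}

// Returns true if the caller's pointer is empty after the hand-off.
bool source_is_empty_after_move() {
    std::unique_ptr<Widget> source(new Widget(42));
    // consume(source);                      // would not compile: copying is disabled
    int seen = consume(std::move(source));   // the transfer is spelled out
    return seen == 42 && source == nullptr;
}
```

The caller's pointer still ends up empty, but the std::move at the call site makes that visible to the reader instead of happening silently.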

Page 29: C++ Memory Management

Reference counting

• Keep a count of the number of references to a particular piece of data and delete the data when this count goes to 0

• This is the C++ ‘Garbage collection’ mechanism

• Significant performance gains for frequently referenced huge data structures

Page 30: C++ Memory Management

Reference counting (Contd)

• Value - A helper class, held by the smart pointer, that contains the raw pointer to the data along with the reference count value

• SmartPointer ctor sets the pValue->refcount to 1

• SmartPointer copy ctor increments the pValue->refcount

• SmartPointer dtor decrements the pValue->refcount and if it is 0, deletes the contained raw pointer

• SmartPointer = operator increments the pValue->refcount of the RHS object, decrements the pValue->refcount of the LHS object and deletes if 0, and then sets the LHS pValue to point to the RHS pValue (incrementing first keeps self-assignment safe)
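
These rules translate into a fairly small class. The sketch below is not thread safe, and the names `RefCountedPtr`, `useCount` and `Tracked` are hypothetical; only `Value`, `pValue` and `refcount` follow the slide.

```cpp
#include <cstddef>

template <class T>
class RefCountedPtr {
public:
    // ctor: a fresh Value starts with refcount 1
    explicit RefCountedPtr(T* data) : pValue_(new Value(data)) {}

    // copy ctor: share the Value and bump the count
    RefCountedPtr(const RefCountedPtr& rhs) : pValue_(rhs.pValue_) {
        ++pValue_->refcount;
    }

    RefCountedPtr& operator=(const RefCountedPtr& rhs) {
        ++rhs.pValue_->refcount;   // increment first: safe for self-assignment
        release();                 // drop our old Value, deleting it if unused
        pValue_ = rhs.pValue_;
        return *this;
    }

    // dtor: drop our reference, deleting the data when the count reaches 0
    ~RefCountedPtr() { release(); }

    T* operator->() const { return pValue_->data; }
    T& operator*() const { return *pValue_->data; }
    std::size_t useCount() const { return pValue_->refcount; }

private:
    struct Value {
        T* data;
        std::size_t refcount;
        explicit Value(T* d) : data(d), refcount(1) {}
        ~Value() { delete data; }
    };

    void release() {
        if (--pValue_->refcount == 0) delete pValue_;
    }

    Value* pValue_;
};

// Hypothetical payload that counts destructor calls.
struct Tracked {
    static int destroyed;
    ~Tracked() { ++destroyed; }
};
int Tracked::destroyed = 0;
```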

Page 31: C++ Memory Management

Copy on Write

• Copy on Write technique adds further smarts to reference counting

• When some data is changed by some reference, don’t change the existing data, instead make a copy of the data, and have the modifying reference point to the new copy, breaking away from the earlier shared group

• Works like ref-counting as long as there’s only read-only access, but deviates when the data is changed

• Non-const member functions implement the COW technique: each one makes a private copy of shared data before modifying it
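
A compact sketch of copy on write, assuming a hypothetical `CowText` class: std::shared_ptr stands in for the hand-written reference count, and the code is single-threaded.

```cpp
#include <memory>
#include <string>

// Copy-on-write text buffer. std::shared_ptr stands in for the hand-written
// reference count; the sketch is single-threaded.
class CowText {
public:
    explicit CowText(const std::string& s)
        : data_(std::make_shared<std::string>(s)) {}

    // const access: never copies
    const std::string& str() const { return *data_; }

    // non-const access: break away from the shared group before modifying
    void append(const std::string& suffix) {
        detach();
        data_->append(suffix);
    }

    bool sharesDataWith(const CowText& other) const {
        return data_ == other.data_;
    }

private:
    void detach() {
        if (data_.use_count() > 1)
            data_ = std::make_shared<std::string>(*data_);
    }

    std::shared_ptr<std::string> data_;
};
```

Copying a `CowText` only copies the shared_ptr, so the string itself is duplicated at the first write, and only for the object being written to.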

Page 32: C++ Memory Management

Smart pointer techniques beyond pointers

• Scoped locks automate locking and unlocking of mutexes

• Extend to any resource (Eg: File handles)

• Used for implementing ‘Proxy’ objects: -> and * could get some data across the network and make it available locally (‘Under the hood’), allowing us to access remote objects using pointers
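
The scoped lock case can be illustrated with std::lock_guard. To keep the lock/unlock pairing observable, this sketch uses a hypothetical `CountingLock` instead of a real mutex; std::lock_guard accepts any type that provides lock() and unlock().

```cpp
#include <mutex>
#include <stdexcept>

// Hypothetical lock that only counts calls, so the pairing is observable.
struct CountingLock {
    int locks = 0;
    int unlocks = 0;
    void lock() { ++locks; }
    void unlock() { ++unlocks; }
};

void update_shared_state(CountingLock& m, bool fail) {
    std::lock_guard<CountingLock> guard(m);   // locks here
    if (fail) throw std::runtime_error("update failed");
    // ... modify the protected data ...
}                                             // unlocks here, on every exit path

// Returns true if every lock() was matched by an unlock(), even when the
// protected code threw an exception.
bool unlocks_even_on_exception() {
    CountingLock m;
    try {
        update_shared_state(m, true);
    } catch (const std::runtime_error&) {
        // the guard has already released the lock during stack unwinding
    }
    update_shared_state(m, false);
    return m.locks == 2 && m.unlocks == 2;
}
```

The same destructor-based pattern applies to file handles and other resources: acquire in the constructor, release in the destructor.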