
Upload: regina-norcutt

Post on 14-Dec-2015


This lecture is a bit of a departure in that we’ll cover how C++’s features are actually implemented. This implementation will look suspiciously similar to pointers to functions in C, so we’ll take a detour to check this out (for those of you who haven’t seen C pointers to functions before). When we’re done, we’ll see that virtual functions can actually do things in a type-safe way that pointers to functions cannot.

Virtual function implementation

(detour) Pointers to functions in C

(back to C++) Why C++’s type system is more powerful than C’s

How virtual functions are (typically) implemented

If a class has any virtual functions, an extra (hidden) pointer is added to the beginning of each object of that class. This pointer points to the virtual function table (v-table for short), which holds pointers to the code for each virtual function supported by the object.

[Figure: an object (vtable ptr, data, data, ...) whose vtable ptr points to a v-table (ptr to code, ptr to code, ...); each v-table entry points to a function’s machine code.]

Example:

class Shape {

int xCoord, yCoord; // coordinates of center

ShapeColor color; // current color

public:

void move(int xNew, int yNew);

virtual void draw();

virtual void rotate(double angle);

virtual double area();

};

Shape s1;

The object s1 looks like:

[Figure: object s1 (vtable ptr, color, xCoord, yCoord); its vtable ptr points to the Shape v-table (ptr to Shape::draw, ptr to Shape::rotate, ptr to Shape::area).]

Multiple objects of type Shape share the same v-table:

Shape s1;

Shape s2;

[Figure: s1 and s2 (each: vtable ptr, color, xCoord, yCoord) both point to the single Shape v-table (ptr to Shape::draw, ptr to Shape::rotate, ptr to Shape::area).]

Each class derived from Shape gets its own v-table, which contains pointers to the code for both inherited and overridden member functions. Suppose Circle inherits Shape::rotate (without overriding it), and overrides Shape::draw and Shape::area.

To make things interesting, let’s also suppose that Circle adds a new virtual function, circumference.

class Circle: public Shape {

int radius;

public:

void draw();

double area();

virtual double circumference();

};

Shape s1;

Shape s2;

Circle c1;

Now the objects and v-tables look like:

[Figure: s1 and s2 (each: vtable ptr, color, xCoord, yCoord) point to the Shape v-table (ptr to Shape::draw, ptr to Shape::rotate, ptr to Shape::area). c1 (vtable ptr, color, xCoord, yCoord, radius) points to the Circle v-table (ptr to Circle::draw, ptr to Shape::rotate, ptr to Circle::area, ptr to Circle::circ...).]

When a virtual function is invoked, C++ looks up the right code at run-time (it knows at compile-time what offset in the v-table to look in) and calls it:

Shape *sPtr = new Circle();

double a = sPtr->area();

So Circle::area is called, and passed an implicit this pointer:

Circle::area(sPtr); // sPtr is implicitly passed

// as “this”

[Figure: sPtr points to a Circle object (vtable ptr, color, xCoord, yCoord, radius), whose vtable ptr points to the Circle v-table (ptr to Circle::draw, ptr to Shape::rotate, ptr to Circle::area, ptr to Circle::circ...).]

Circle::area can then do its thing, making use of Circle-specific data like radius:

double Circle::area() {

return PI * radius * radius;

}

Remember, there is an implicit this pointer being passed to the function, so the function really works like:

double Circle::area(Circle *this) {

return PI * this->radius * this->radius;

}

This lecture was inspired in part by a conversation about C++ I had with a colleague once. He asked me what C++ could do that C couldn’t do, so I told him about virtual functions. When he asked how virtual functions worked, I told him about the pointer to the v-table in each object, and the corresponding run-time dispatch to functions in the v-table.

“No big deal,” he said, “you can do all that in C with pointers to functions.”

Was he right?

Pointers to functions

C and C++ provide datatypes for pointers to functions. Suppose there is a function named compareInt:

int compareInt(int i1, int i2) {

if(i1 < i2) return -1;

else if(i1 == i2) return 0;

else return 1;

}

We can create a pointer to this function as follows:

int (*ptrToCompareInt)(int i1, int i2);

ptrToCompareInt = &compareInt;

ptrToCompareInt is a pointer to a function. Its declaration means:

int (*ptrToCompareInt)(int i1, int i2);

• int: return type of the function pointed to

• (*ptrToCompareInt): the star means “pointer”, and ptrToCompareInt is the name of the variable

• (int i1, int i2): argument types of the function pointed to

Example:

int addInt(int i1, int i2) {return i1 + i2;}

int multiplyInt(int i1, int i2) {return i1 * i2;}

...

int (*ptrToFunc)(int i1, int i2);

ptrToFunc = &addInt;

int result1 = (*ptrToFunc)(7, 6);

ptrToFunc = &multiplyInt;

int result2 = (*ptrToFunc)(7, 6);

cout << result1 << " " << result2 << endl;

This prints:

13 42

A practical example of pointers to functions is the qsort routine, which is provided in stdlib.h in C and C++:

void qsort(void *base, size_t num, size_t width, int (*compare)(const void *elem1, const void *elem2));

qsort takes an array of elements and sorts them:

• base and num specify the array and the size of the array

• width specifies the size, in bytes, of each element in the array

• compare is a pointer to a function which compares two elements from the array


What’s all this void* stuff?

void* is C’s way of giving up - C is saying, “I don’t have a powerful enough type system to precisely describe what’s going on here”.

“void* elem1” means that C says “I have no idea what type elem1 has”.

As we’ll see later, the void* in this case stems from a lack of polymorphism - qsort can be implemented in a much nicer way with the polymorphism provided by templates.


Instead of writing the nice compareInt function

int compareInt(int i1, int i2) {

if(i1 < i2) return -1;

else if(i1 == i2) return 0;

else return 1;

}

to use qsort, we have to write a function that takes void* arguments and does a bunch of casts:

int compareInt(const void *i1Ptr, const void *i2Ptr) {

if(*((const int*)i1Ptr) < *((const int*)i2Ptr)) return -1;

else if(*((const int*)i1Ptr) == *((const int*)i2Ptr)) return 0;

else return 1;

}

Ugh! This is the first clue that pointers to functions, without any additional polymorphism, may not solve all our problems.

The next clue is that pointers to functions refer to code but hold no other data.

In languages with higher-order functions, function values can hold both data and code. C does not have such function values, and C++ only gained them with the lambdas of C++11 - plain pointers to functions refer to code but contain no data.

So here’s what we get in C:

• structs: data, but no code

• pointers to functions: code, but no data

But we can fix this. Just mix the two together to get both code and data.

Based on this, let’s take a crack at implementing v-tables with structs and pointers to functions.

class Shape {

int xCoord, yCoord; // coordinates of center

public:

virtual void draw();

virtual void rotate(double angle);

virtual double area();

};

We’ll start with a slightly simplified version of Shape. We need a struct to hold the data in a Shape object:

struct Shape {

ShapeVTable *vTable;

int xCoord, yCoord;

};


We also need a struct full of pointers to functions to implement the v-table:


struct ShapeVTable {

void (*draw)(Shape *myself);

void (*rotate)(Shape *myself, double angle);

double (*area)(Shape *myself);

};

The myself argument is an imitation of C++’s “this”.


So now we have something that looks like an object with a pointer to a v-table:

[Figure: a Shape struct (vtable ptr, xCoord, yCoord) whose vtable ptr points to a ShapeVTable (ptr to Shape’s draw, ptr to Shape’s rotate, ptr to Shape’s area).]


So far, so good. But now let’s try to implement the data and v-table for a simplified version of Circle, which overrides the area function:

class Circle: public Shape {

int radius;

public:

double area();

};


First, let’s look at the data part. We run into an immediate problem. We could define a completely new struct:

struct Circle {

CircleVTable *vTable;

int xCoord, yCoord;

int radius;

};

But this Circle has no relation to Shape, so they can’t be used interchangeably (we don’t get the polymorphism we wanted).


Maybe we could embed a Shape inside a Circle:

struct Circle {

Shape shape;

int radius;

};

But let’s just punt on this, and assume we can use a simple C++-like inheritance (but we won’t assume that the inheritance mechanism supports virtual functions - the whole point is to implement virtual functions ourselves):

struct Circle: public Shape {

int radius;

};


We can almost sort of use this now to imitate virtual function calls:

Circle c;

Shape *s = &c;

// imitation of “double a = s->area()”:

double a = (*(s->vTable->area))(s);


But we have trouble when we try to implement a Circle::area function that conforms to the type in ShapeVTable:

double circle_area(Shape *myself) {

return PI * myself->radius * myself->radius;

}

The problem is that myself has type Shape*, and Shapes don’t have a radius member.

circle_area could do a cast, but this subverts the type system:

double circle_area(Shape *myself) {

return PI * ((Circle *) myself)->radius *

((Circle *) myself)->radius;

}

Maybe the solution is that Shape should have a vtable pointing to a ShapeVTable, and Circle should have a vtable pointing to a CircleVTable (which holds functions that take myself arguments of type Circle*, instead of Shape*):

struct Shape {

ShapeVTable *vTable;

...

};

struct Circle {

CircleVTable *vTable;

...

};

But we’ve already seen that this causes Shapes and Circles to be incompatible.


In particular, the following code thinks that it is looking up a function in a ShapeVTable, when it is actually using a CircleVTable:

Circle c;

Shape *s = &c;

double a = (*(s->vTable->area))(s);

The type system can’t be sure that ShapeVTable and CircleVTable are compatible, so it won’t allow this code to typecheck.

So we’re stuck with a circle_area that takes a Shape* and has to cast:

double circle_area(Shape *myself) {

return PI * ((Circle *) myself)->radius *

((Circle *) myself)->radius;

}

Compare this to C++’s virtual functions:

double Circle::area(Circle *this) {

return PI * this->radius * this->radius;

}

With C++’s virtual functions, Circle::area somehow knows that it gets a this pointer of type Circle*, not Shape*.


Circle::area knows that it gets a this pointer of type Circle*, even when the caller doesn’t know that it has a Circle:

Circle c;

Shape *s = &c;

double a = s->area();

// at run-time (after the v-table lookup), this means:

// double a = Circle::area(s); // s is the “this” ptr

Even though the caller is dealing with a type Shape*, the function that gets called knows that it gets a Circle*, not just a Shape*.


So virtual member functions know more about their own data than the caller knows. This is a very powerful form of data hiding, or abstraction.

C++’s support for this abstraction is the reason that its type system is more expressive than C’s or Pascal’s, and is why C++ is particularly suitable for object-oriented programming.

Note: languages with higher-order functions share some of this abstraction expressiveness, and higher-order functions are closely related to objects in this respect.

“What is a ‘virtual member function’?

From an OO perspective, it is the single most important feature of C++”

- from the C++ FAQ Lite by Marshall Cline

http://www.cerfnet.com/~mpcline/C++-FAQs-Lite/

Object-oriented programming consists of building objects that know how to operate on their own data. Virtual functions are the mechanism that binds code tightly to data, and lets the code know more about the data than anyone else knows about the data.

Remember to turn in homework 5

Good luck on your projects!