Behind the Performance of Quake 3 Engine: Fast Inverse Square Root

Post on 04-Jul-2015

DESCRIPTION

Quake 3 was probably the most famous first-person shooter back in 1999. It had fascinating graphics and very high responsiveness, the result of performance optimization and high-quality code written by the id Software team. One of the most famous optimization tricks is the function that computes an approximation of the inverse (reciprocal) square root through some clever bit hacking. This function is the subject of investigations by mathematicians and programmers even today. In this presentation we try to understand how it works, and we also try to find the author.

TRANSCRIPT

Behind the Performance of Quake 3 Engine:

Fast Inverse Square Root

Maksym Zavershynskyi

Quake 3 Arena

First Person Shooter

Released: 1999

Engine: id Tech 3

Average reviewers' score: ~9/10

Architecture

• C language

• Client-Server separation

• Virtual Machine

• Local C Compiler for Scripts

• Highly Optimized Code

Shading

Creates the depth of perception

Material Based Shading

[1]

What makes a nice picture?

• Shading

• Lighting

• Reflections

• ...

Angle of Incidence

[Figure: angle α between the normal and the view direction]

Greater α: darker shading

Vector Normalization

(x,y,z) → (a,b,c), a vector of length 1 pointing in the same direction

Fast Inverse Square Root

Inverse Square Root

#include <math.h>

float Q_rsqrt( float number )
{
    return 1.0f / sqrt( number );
}

Fast Approximate Inverse Square Root

float Q_rsqrt( float number )
{
    long i;
    float x2, y;
    const float threehalfs = 1.5F;

    x2 = number * 0.5F;
    y  = number;
    i  = * ( long * ) &y;                       // evil floating point bit level hacking
    i  = 0x5f3759df - ( i >> 1 );               // what the f☀✿k?
    y  = * ( float * ) &i;
    y  = y * ( threehalfs - ( x2 * y * y ) );   // 1st iteration
//  y  = y * ( threehalfs - ( x2 * y * y ) );   // 2nd iteration, this can be removed

    return y;
}

(1) Interpret float as integer

(2) Good initial guess with magic number 0x5f3759df

(3) One iteration of Newton’s approximation
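The original code assumes that `long` is 32 bits wide and reads the float through a pointer cast, which modern C compilers may treat as a strict-aliasing violation. The three steps can be sketched portably as follows; the name `rsqrt_fast` and the use of `memcpy` with `uint32_t` are choices made for this sketch (assuming a 32-bit IEEE 754 `float`), not part of the Quake 3 source.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Portable variant of the three steps. Assumes float is a 32-bit IEEE 754 value. */
static float rsqrt_fast(float number)
{
    assert(number > 0.0f);                  /* only meaningful for positive input */

    uint32_t i;
    float y = number;
    const float x2 = number * 0.5f;

    memcpy(&i, &y, sizeof i);               /* (1) read the float's bits as an integer */
    i = 0x5f3759dfu - (i >> 1);             /* (2) initial guess from the magic number */
    memcpy(&y, &i, sizeof y);               /*     reinterpret the bits as a float again */
    y = y * (1.5f - x2 * y * y);            /* (3) one iteration of Newton's method */

    return y;
}
```

For example, `rsqrt_fast(4.0f)` returns a value close to 0.5 rather than exactly 0.5, since the result is an approximation.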

(1) Interpret float as integer

32-bit float (sign, exponent E, mantissa M):

0 01111100 01000000000000000000000

0.15625, which is 1.01 × 2^-3 in binary

E = -3 + 127 = 124, or 01111100 in binary

M = .01
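The field layout above can be checked by splitting the bit pattern with shifts and masks. This is a small sketch under the assumption of a 32-bit IEEE 754 `float`; the `float_fields` struct and `float_decode` function are hypothetical names introduced here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The three fields of a 32-bit float. Names are illustrative. */
typedef struct {
    uint32_t sign;      /* 1 bit */
    uint32_t exponent;  /* 8 bits, stored with a bias of 127 */
    uint32_t mantissa;  /* 23 bits, the fraction after the implicit leading 1 */
} float_fields;

static float_fields float_decode(float x)
{
    assert(sizeof(float) == sizeof(uint32_t));  /* sketch assumes a 32-bit float */

    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);             /* copy the raw bits, no conversion */

    float_fields f;
    f.sign     = bits >> 31;
    f.exponent = (bits >> 23) & 0xFFu;
    f.mantissa = bits & 0x7FFFFFu;
    return f;
}
```

For 0.15625 this yields sign 0, exponent 124 and a mantissa whose only set bit is the second fraction bit (binary .01).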

float x=0.15625, x as integer i:

0 01111100 01000000000000000000000

shift right i>>1 (E → E/2):

0 00111110 00100000000000000000000

the magic number 0x5f3759df:

0 10111110 01101110101100111011111

0x5f3759df - (i>>1):

0 10000000 01001110101100111011111

result: 2.614 (exact value 1/sqrt(x)=2.52982..)
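The worked example can be reproduced step by step. In it the stored exponent 124 is halved to 62 by the shift, and subtracting from the magic number's exponent field 190 gives 128, i.e. a factor of 2^1. The sketch below assumes a 32-bit IEEE 754 `float`; the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Raw bit pattern of a float. */
static uint32_t float_to_bits(float x)
{
    assert(sizeof(float) == sizeof(uint32_t));  /* sketch assumes a 32-bit float */
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    return bits;
}

/* Float whose raw bit pattern is the given integer. */
static float bits_to_float(uint32_t bits)
{
    float x;
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* The initial guess only, before any Newton iteration is applied. */
static float rsqrt_initial_guess(float x)
{
    uint32_t i = float_to_bits(x);      /* x as integer i */
    i = 0x5f3759dfu - (i >> 1);         /* magic number minus the shifted bits */
    return bits_to_float(i);
}
```

For x = 0.15625 the intermediate integers are 0x3E200000 (i), 0x1F100000 (i>>1) and 0x402759DF (the difference), matching the bit rows above.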

(2) Magic Number: 0x5f3759df

• Gives a good initial guess.

• Minimizes the relative error.

• Trying to find a better number that minimizes the error of the initial guess, we come up with: 0x5f37642f [4]

Did we find a better magical number? ;)

(3) One iteration of Newton’s method

Newton’s method: given a suitable approximation y_n to the root of f(y), it gives a better one y_{n+1} using

$$y_{n+1} = y_n - \frac{f(y_n)}{f'(y_n)}$$

In our case:

y = y * ( 1.5f - ( 0.5f * x * y * y ) );
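The line of code above follows from Newton's update if we assume the function being solved is f(y) = 1/y² − x, whose positive root is y = 1/√x. The slides do not state f explicitly, so this choice is an assumption, although it is the standard one that reproduces the code:

```latex
\begin{aligned}
f(y) &= \frac{1}{y^{2}} - x, \qquad f'(y) = -\frac{2}{y^{3}} \\
y_{n+1} &= y_n - \frac{f(y_n)}{f'(y_n)}
         = y_n + \frac{y_n^{3}}{2}\left(\frac{1}{y_n^{2}} - x\right) \\
        &= y_n + \frac{y_n}{2} - \frac{x\,y_n^{3}}{2}
         = y_n\left(\frac{3}{2} - \frac{x}{2}\,y_n^{2}\right)
\end{aligned}
```

The last form contains only multiplications and one subtraction, which is why the update is cheap: no division and no square root are needed.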

After one iteration of Newton’s method, our magic number 0x5f37642f gives a worse approximation than the original magic number 0x5f3759df!!! [4]

Open Question: How was the original magic number 0x5f3759df derived?

• Lomont in 2003 numerically found a slightly better magic number 0x5f375a86 [4]

• Robertson in 2012 analytically found the same better magic number 0x5f375a86 [3]

How good?

Max relative error: 0.177% [3]

With the 2nd iteration of Newton’s method: 0.00047% [3]

How fast?

In 1999: ???

Today, on CPUs: 3-4 times faster than the straightforward 1.0f/sqrt(number)

With the 2nd iteration of Newton’s method: 2-2.5 times faster [3]

Who wrote it?

Who?

John Carmack? Lead Programmer of Quake, Doom, Wolfenstein 3D

“...Not me, and I don’t think it is Michael (Abrash). Terje Mathison perhaps?...” [8]

Michael Abrash? Author of: Zen of Assembly Language, Zen of Graphics Programming

Who?

Terje Mathisen? Assembly language optimization for x86 microprocessors.

“... I wrote fast & accurate invsqrt()... for a computational fluid chemistry problem... The code is not the same as I wrote...” [8]

Who?

Gary Tarolli? Co-founder of 3dfx (whose assets were later acquired by Nvidia)

“It did pass by my keyboard many many years ago, I may have tweaked the hex constant a bit or so, but other than that I can’t take credit for it, except that I used it a lot and probably contributed to its popularity and longevity.” [8]

This hack is older than 1990!!!

Who?

Cleve Moler (inspiration): Creator of the first MATLAB, one of the founders of MathWorks, currently Chief Mathematician there.

Greg Walsh (author, most probably): Has been working on Internet and distributed computing technologies since before it was even the Internet, and helped to engineer the first WYSIWYG word processor at Xerox PARC while at Stanford University. [9]

Who?

Inspired by Cleve Moler, who learned the trick from code written by Velvel Kahan and K.C. Ng at Berkeley around 1986!!!

http://www.netlib.org/fdlibm/e_sqrt.c

[10]

Finally

It is Fast: 3-4 times faster than the straightforward code

It is Good: 0.17% maximum relative error

It can be Improved

It dates back to 1986

Thank you!

http://zavermax.github.io

Quake 1,3 Architecture

1) Fabien Sanglard, Quake 3 source code review. 2012 http://fabiensanglard.net/quake3/

2) Michael Abrash, Ramblings in Realtime http://www.bluesnews.com/abrash/

Inverse Square Root

3) Matthew Robertson, A Brief History of InvSqrt. Bachelor’s Thesis, University of New Brunswick, 2012

4) Chris Lomont, Fast Inverse Square root, Indiana: Purdue University, 2003

5) Jim Blinn, Floating-point tricks, IEEE Comp. Graphics and Applications 17, no 4, 1997

6) David Eberly, Fast Inverse Square Root (Revisited), Geometric Tools, LLC, 2010

7) Charles McEniry, The Mathematics Behind the Fast Inverse Square Root Function Code, 2007

Investigation of the Authorship

8) Rys Sommefeldt, Origin of Quake3’s Fast InvSqrt() 2006 http://www.beyond3d.com/content/articles/8/

9) Rys Sommefeldt, Origin of Quake3’s Fast InvSqrt() - Part Two 2007 http://www.beyond3d.com/content/articles/15/

10) http://blogs.mathworks.com/cleve/2012/06/19/symplectic-spacewar/#comment-13

Additional

11) http://en.wikipedia.org/wiki/Fast_inverse_square_root

12) https://github.com/id-Software/Quake-III-Arena
