
Behind the Performance of Quake 3 Engine: Fast Inverse Square Root

Maksym Zavershynskyi

Quake 3 Arena

First Person Shooter

Released: 1999

Engine: id Tech 3

Average reviewers score: ~9/10

Architecture

• C-Language

• Client-Server separation

• Virtual Machine

• Local C Compiler for Scripts

• Highly Optimized Code

Shading

Creates the perception of depth

Material Based Shading

What makes a nice picture?

• Shading

• Lighting

• Reflections

• ...

Angle of Incidence

[Figure: angle α between the incoming light and the surface normal]

The greater α, the darker the shading

Vector Normalization

(x,y,z) → (a,b,c), a unit-length vector in the same direction


Fast Inverse Square Root

Inverse Square Root

float Q_rsqrt( float number )
{
    return 1.0f/sqrt(number);
}

Fast Approximate Inverse Square Root

float Q_rsqrt( float number )
{
    long i;
    float x2, y;
    const float threehalfs = 1.5F;

    x2 = number * 0.5F;
    y  = number;
    i  = * ( long * ) &y;                       // evil floating point bit level hacking
    i  = 0x5f3759df - ( i >> 1 );               // what the f☀✿k?
    y  = * ( float * ) &i;
    y  = y * ( threehalfs - ( x2 * y * y ) );   // 1st iteration
//  y  = y * ( threehalfs - ( x2 * y * y ) );   // 2nd iteration, this can be removed

    return y;
}


(1) Interpret float as integer

(2) Good initial guess with magic number 0x5f3759df

(3) One iteration of Newton’s approximation
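The three steps can also be written without the pointer casts. The original relies on `long` being 32 bits wide, which is not the case on many 64-bit platforms, and the casts conflict with C's strict aliasing rules. The sketch below is one possible rewrite, not the Quake 3 code: the name `rsqrt_approx` is hypothetical, and it assumes a 32-bit IEEE 754 `float`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical portable variant of the three steps; assumes 32-bit IEEE 754 floats. */
static float rsqrt_approx(float number) {
    uint32_t i;
    float y = number;
    const float x2 = number * 0.5f;

    assert(sizeof i == sizeof y);   /* the bit copy below needs equal sizes */

    memcpy(&i, &y, sizeof i);       /* (1) read the float's bits as an integer */
    i = 0x5f3759dfu - (i >> 1);     /* (2) initial guess from the magic number */
    memcpy(&y, &i, sizeof y);       /*     read the integer's bits as a float again */
    y = y * (1.5f - x2 * y * y);    /* (3) one iteration of Newton's method */
    return y;
}
```

Calling `rsqrt_approx(4.0f)` gives a value close to the exact result 0.5, within a fraction of a percent.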

(1) Interpret float as integer

32-bit float (sign, exponent E, mantissa M):

0 01111100 01000000000000000000000

0.15625, which is 1.01 × 2^-3 in binary

E = -3 + 127 = 124, or 01111100 in binary

M = .01
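The field layout can be checked by copying the bits of a float into an integer and masking out the three parts. This is a small sketch assuming a 32-bit IEEE 754 `float`; the `float_fields` type and `float_decompose` function are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical container for the three fields of a 32-bit float. */
typedef struct {
    uint32_t sign;      /* 1 bit */
    uint32_t exponent;  /* 8 bits, stored with a bias of 127 */
    uint32_t mantissa;  /* 23 bits, the leading 1 is implicit and not stored */
} float_fields;

static float_fields float_decompose(float x) {
    uint32_t bits;
    float_fields f;

    assert(sizeof bits == sizeof x);
    memcpy(&bits, &x, sizeof bits);      /* same bits, now readable as an integer */

    f.sign     = bits >> 31;
    f.exponent = (bits >> 23) & 0xFFu;
    f.mantissa = bits & 0x7FFFFFu;
    return f;
}
```

For 0.15625 this yields sign 0, exponent 124, and mantissa 0x200000, which is the stored pattern 0100...0 for the fraction .01.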


float x = 0.15625:

0 01111100 01000000000000000000000

x as integer i:

0 01111100 01000000000000000000000

shift right, i>>1:

0 00111110 00100000000000000000000

E → E/2


the magic number 0x5f3759df:

0 10111110 01101110101100111011111

0x5f3759df - (i>>1):

0 10000000 01001110101100111011111

result: 2.614 (exact value 1/sqrt(x) = 2.52982...)
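The worked example can be traced in code by stopping before the Newton step. The sketch below assumes a 32-bit IEEE 754 `float`; the helper names are hypothetical and exist only to make each intermediate value visible.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers: copy bits between float and uint32_t. */
static uint32_t float_to_bits(float x) {
    uint32_t bits;
    assert(sizeof bits == sizeof x);
    memcpy(&bits, &x, sizeof bits);
    return bits;
}

static float bits_to_float(uint32_t bits) {
    float x;
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Initial guess only: shift, subtract from the magic number, reinterpret. */
static float rsqrt_initial_guess(float x) {
    uint32_t i       = float_to_bits(x);      /* 0x3E200000 for x = 0.15625 */
    uint32_t shifted = i >> 1;                /* 0x1F100000 */
    uint32_t guess   = 0x5f3759dfu - shifted; /* 0x402759DF */
    return bits_to_float(guess);
}
```

For x = 0.15625 the three integers match the bit patterns above, and the reinterpreted guess lies between 2.614 and 2.615, a few percent above the exact 2.52982.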


(2) Magic Number: 0x5f3759df

• Gives a good initial guess.

• Minimizes the relative error.

• Trying to find a better number that minimizes the error of the initial guess, we come up with: 0x5f37642f [4]


Did we find a better magical number? ;)

(3) One iteration of Newton’s method

Newton’s method: given a suitable approximation $y_n$ to the root of $f(y)$, it gives a better one, $y_{n+1}$, using

$y_{n+1} = y_n - \frac{f(y_n)}{f'(y_n)}$

In our case:

y = y * ( 1.5f - ( 0.5f * x * y * y ) );
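The update line follows from Newton's method if the function is chosen so that its root is the inverse square root of x. A short derivation, assuming the standard choice $f(y) = 1/y^2 - x$ (the slide does not state the function explicitly):

```latex
\begin{aligned}
f(y) &= \frac{1}{y^{2}} - x, \qquad f'(y) = -\frac{2}{y^{3}} \\[4pt]
y_{n+1} &= y_n - \frac{f(y_n)}{f'(y_n)}
         = y_n + \frac{y_n^{3}}{2}\left(\frac{1}{y_n^{2}} - x\right) \\[4pt]
        &= y_n + \frac{y_n}{2} - \frac{x\,y_n^{3}}{2}
         = y_n\left(\frac{3}{2} - \frac{x}{2}\,y_n^{2}\right)
\end{aligned}
```

The last form uses only multiplications and one subtraction, with no division, which is what makes the iteration cheap.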

After one iteration of Newton’s method, our magic number 0x5f37642f gives a worse approximation than the original magic number 0x5f3759df! [4]


Open Question: How was the original magic number 0x5f3759df derived?

• Lomont in 2003 numerically found a slightly better magic number 0x5f375a86 [4]

• Robertson in 2012 analytically found the same better magic number 0x5f375a86 [3]

How good?

Max relative error: 0.177% [3]

With the 2nd iteration of Newton’s method: 0.00047% [3]

How fast?

In 1999: ???

Today: 3-4 times faster than the straightforward code on CPUs

With the 2nd iteration of Newton’s method: 2-2.5 times faster

Who wrote it?


Who?

John Carmack? Lead Programmer of Quake, Doom, Wolfenstein 3D

“...Not me, and I don’t think it is Michael (Abrash). Terje Mathison perhaps?...”

Michael Abrash? Author of: Zen of Assembly Language, Zen of Graphics Programming

Terje Mathisen? Assembly language optimization for x86 microprocessors.

“... I wrote fast & accurate invsqrt()... for a computational fluid chemistry problem... The code is not the same as I wrote...” [8]

Who?

Gary Tarolli? Co-founder of 3dfx (whose assets were later acquired by Nvidia)

“It did pass by my keyboard many many years ago, I may have tweaked the hex constant a bit or so, but other than that I can’t take credit for it, except that I used it a lot and probably contributed to its popularity and longevity.”


This hack is older than 1990!!!

Who?

Cleve Moler (inspiration): creator of the first MATLAB, one of the founders of MathWorks, and currently its Chief Mathematician.

Greg Walsh (most probably the author): has been working on Internet and distributed computing technologies since before it was even the Internet, and helped engineer the first WYSIWYG word processor at Xerox PARC while at Stanford University.

Inspired by Cleve Moler, from the code written by Velvel Kahan and K.C. Ng at Berkeley around 1986!!!

http://www.netlib.org/fdlibm/e_sqrt.c

Finally

It is Fast: 3-4 times faster than the straightforward code

It is Good: 0.17% maximum relative error

It can be Improved

Dates back to 1986

Thank you!

http://zavermax.github.io

Some literature here

Quake 1, 3 Architecture

1) Fabien Sanglard, Quake 3 Source Code Review, 2012. http://fabiensanglard.net/quake3/

2) Michael Abrash, Ramblings in Realtime. http://www.bluesnews.com/abrash/

Inverse Square Root

3) Matthew Robertson, A Brief History of InvSqrt, Bachelor’s thesis, University of New Brunswick, 2012

4) Chris Lomont, Fast Inverse Square Root, Indiana: Purdue University, 2003

5) Jim Blinn, Floating-point tricks, IEEE Comp. Graphics and Applications 17, no. 4, 1997

6) David Eberly, Fast Inverse Square Root (Revisited), Geometric Tools, LLC, 2010

7) Charles McEniry, The Mathematics Behind the Fast Inverse Square Root Function Code, 2007

Investigation of the Authorship

8) Rys Sommefeldt, Origin of Quake3’s Fast InvSqrt(), 2006. http://www.beyond3d.com/content/articles/8/

9) Rys Sommefeldt, Origin of Quake3’s Fast InvSqrt() - Part Two, 2007. http://www.beyond3d.com/content/articles/15/

10) http://blogs.mathworks.com/cleve/2012/06/19/symplectic-spacewar/#comment-13

Additional

11) http://en.wikipedia.org/wiki/Fast_inverse_square_root

12) https://github.com/id-Software/Quake-III-Arena
