
"The Magic Constant" 0x5f3759df


In this article, we'll talk about the "magic" constant 0x5f3759df, which lies at the heart of an elegant algorithmic trick for quickly computing the inverse square root.

Here is the complete implementation of this algorithm:

float FastInvSqrt(float x) {
  float xhalf = 0.5f * x;
  int i = *(int*)&x;          // represent the float's bits as an integer
  i = 0x5f3759df - (i >> 1);  // what the hell is going on here?
  x = *(float*)&i;
  x = x * (1.5f - (xhalf * x * x));
  return x;
}
This code computes a (fairly good) approximation of the formula

$$ y = \frac{1}{\sqrt{x}} $$

Today this implementation is well known; it became famous after the release of the Quake III Arena source code in 2005. Its creation was once attributed to John Carmack, but it turned out that its roots go much further back, to Ardent Computer, where Greg Walsh wrote it in the mid-80s. The specific version of the code shown above (with the funny comments) really is from the Quake source.
In this article we will try to understand this hack, mathematically derive that very constant, and generalize the method to compute arbitrary powers between -1 and 1.

Yes, it will take a bit of math, but a school course will be more than enough.
Why?
Why would we need to compute the inverse square root at all, and why so quickly that special hacks are justified? Because it is one of the core operations of 3D programming. 3D graphics relies on surface normals: vectors (with three coordinates) of length one, needed to describe lighting, reflection and so on. There are many such vectors. No, not just many: a LOT. How do we normalize a vector (reduce its length to one)? We divide each coordinate by the vector's current length. Or, to put it another way, we multiply each coordinate of the vector by the value:

$$ \frac{1}{\sqrt{x^2 + y^2 + z^2}} $$

Computing the sum of squares is relatively simple and fast on all kinds of processors. But the square root and the division are expensive operations. And that is why we need a fast inverse square root algorithm: FastInvSqrt.
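
To make the motivation concrete, here is a minimal sketch of how FastInvSqrt (defined above) is typically used for normalization; the Vec3 type and the Normalize function are illustrative names of mine, not code from Quake:

// A hypothetical 3-component vector type.
typedef struct { float x, y, z; } Vec3;

// Scale v to unit length: one FastInvSqrt call and three
// multiplications replace a square root and three divisions.
void Normalize(Vec3 *v) {
  float inv_len = FastInvSqrt(v->x * v->x + v->y * v->y + v->z * v->z);
  v->x *= inv_len;
  v->y *= inv_len;
  v->z *= inv_len;
}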
What does it do?
What does the function above do to compute its result? It consists of four key steps. First, it takes the input number (which arrives as a float) and reinterprets its bits as the value of a new integer variable i.

int i = *(int*)&x; // represent the float's bits as an integer

Next, an integer arithmetic operation is performed on the resulting integer; it works quite fast and gives us some approximation of the desired result:

i = 0x5f3759df - (i >> 1); // what the hell is going on here?

What we get from this operation is not yet the result itself. It is just an integer whose bits represent some other floating-point number, the one we actually need. So we have to convert the int back to a float:

x = *(float*)&i;

And finally, one iteration of Newton's method is performed to improve the approximation:

x = x*(1.5f-(xhalf*x*x));

And now we have an excellent approximation of the inverse square root. The last step (Newton's method) is trivial enough that I will not dwell on it. The key part of the algorithm is step 2: the tricky arithmetic operation on the integer obtained by reinterpreting the bits of a floating-point number as an int. This is where we will focus.
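
(Though, for the curious, the Newton step can be derived in one line. We are looking for a root of $f(y) = y^{-2} - x$, and Newton's method gives:

$$ y_{n+1} = y_n - \frac{f(y_n)}{f'(y_n)} = y_n - \frac{y_n^{-2} - x}{-2\,y_n^{-3}} = y_n\left(\frac{3}{2} - \frac{x}{2}\,y_n^2\right) $$

which is exactly x*(1.5f-(xhalf*x*x)), given that xhalf = 0.5f*x.)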
What the hell is going on here?
To understand what follows, we need to recall the format in which floating-point numbers are stored in memory. I will describe only what matters for us here (the rest can always be looked up on Wikipedia). A floating-point number is stored as a combination of three components: a sign, an exponent and a mantissa. Here are the bits of a 32-bit floating-point number:

[ sign: 1 bit | exponent E: 8 bits | mantissa M: 23 bits ]

First comes the sign bit, then 8 bits of exponent and 23 bits of mantissa. Since we are dealing with square roots here, I will assume that we only work with positive numbers (the first bit is always 0).

Looking at a floating-point number as just a set of bits, the exponent and the mantissa can be treated as two positive integers. Let us denote them E and M (we will refer to them often). On the other hand, interpreting the bits as a floating-point value, we treat the mantissa as a number between 0 and 1: all zeros in the mantissa mean 0, and all ones mean a number that is very close to (but not equal to) 1. And instead of interpreting the exponent as an unsigned 8-bit integer, we subtract a bias (denote it B) to get a signed integer in the range from -127 to 128. Let us denote the floating-point interpretations of these values e and m. To avoid confusion, we will use capital letters (E, M) for the integer values and lowercase letters (e, m) for the floating-point ones.

Converting from one to the other is trivial:

$$ m = \frac{M}{L} $$

$$ e = E - B $$

In these formulas, for 32-bit numbers, L = 2^23 and B = 127. Given values of e and m, the number they represent is:

$$ x = (1 + m) \cdot 2^e $$

and the value of the corresponding integer interpretation of the number:

$$ I_x = E \cdot L + M = L\,(e + B + m) $$
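
As a quick sanity check (a small sketch of mine, not from the original article), we can pull E and M out of a float's bits and reconstruct its value by the formulas above:

#include <stdio.h>
#include <math.h>

int main(void) {
  float x = 6.75f;
  int i = *(int*)&x;           // reinterpret the float's bits as an int
  int E = (i >> 23) & 0xFF;    // 8 exponent bits
  int M = i & 0x7FFFFF;        // 23 mantissa bits
  float e = (float)(E - 127);  // e = E - B
  float m = M / 8388608.0f;    // m = M / L, with L = 2^23
  printf("%f\n", (1.0f + m) * powf(2.0f, e)); // prints 6.750000
  return 0;
}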

Now we have almost all the pieces of the puzzle needed to explain the "hack" in the code above. So, a number x arrives as input, and we need to compute its inverse square root:

$$ y = \frac{1}{\sqrt{x}} $$

For reasons that will soon become clear, I will start by taking the base-2 logarithm of both sides of the equation:

$$ \log_2 y = -\frac{1}{2} \log_2 x $$

Since the numbers we are working with are floating-point numbers, we can represent x and y according to the representation formula above:

$$ e_y + \log_2(1 + m_y) = -\frac{1}{2}\left(e_x + \log_2(1 + m_x)\right) $$

Ah, those logarithms. You may not have used them since school and may have forgotten them a little. Don't worry, we will get rid of them shortly, but for now we still need them, so let's see how they behave here.
On both sides of the equation we have an expression of the form:

$$ \log_2(1 + v) $$

where v lies in the range from 0 to 1. You may notice that on this interval the function is fairly close to a straight line:

[Graph: log2(1 + v) for v in [0, 1], nearly coinciding with a straight line]

Or, written as an expression:

$$ \log_2(1 + v) \approx v + \sigma $$

where σ is a constant. This is not a perfect approximation, but we can try to pick σ so that it is quite good. Using it, we can turn the equation with the logarithms above into another one, not strictly equal, but close, and, most importantly, LINEAR:

$$ e_y + m_y + \sigma \approx -\frac{1}{2}\left(e_x + m_x + \sigma\right) $$
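
To get a feel for how good this straight-line approximation is, here is a quick sketch of mine that measures the worst deviation of v + σ from log2(1 + v) on [0, 1], using the value of σ that is chosen later in the article:

#include <stdio.h>
#include <math.h>

int main(void) {
  const double sigma = 0.0450465; // the value chosen below
  double worst = 0.0;
  for (double v = 0.0; v <= 1.0; v += 0.0001) {
    double err = fabs(log2(1.0 + v) - (v + sigma));
    if (err > worst) worst = err;
  }
  printf("max error: %f\n", worst); // about 0.045
  return 0;
}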

This is already something! Now it is time to stop working with floating-point representations and move to the integer representations of the mantissa and the exponent:

$$ \frac{I_y}{L} - B + \sigma \approx -\frac{1}{2}\left(\frac{I_x}{L} - B + \sigma\right) $$

After a few more trivial transformations (the details can be skipped), we get something that looks quite familiar:

$$ \frac{I_y}{L} \approx B - \sigma - \frac{1}{2}\left(\frac{I_x}{L} - B + \sigma\right) $$

$$ \frac{I_y}{L} \approx \frac{3}{2}\,(B - \sigma) - \frac{1}{2} \cdot \frac{I_x}{L} $$

$$ I_y \approx \frac{3}{2}\,L\,(B - \sigma) - \frac{1}{2}\,I_x $$

Look closely at the left and right sides of the last equation. We have obtained an expression for the integer form of the desired value y as a linear function of the integer representation of x:

$$ I_y \approx K - \frac{1}{2}\,I_x, \qquad K = \frac{3}{2}\,L\,(B - \sigma) $$

In plain words: "y (in integer form) equals some constant minus half of the integer form of x". In code:

i = K - (i >> 1);

Very similar to the formula in the function at the beginning of the article, right?

It remains to find the constant K. We already know the values of B and L, but we do not yet know σ.
As you remember, σ is a "correction" we introduced to improve the straight-line approximation of the logarithm on the segment from 0 to 1. That is, we can choose this number ourselves. I will take 0.0450465: it gives a good approximation and is the value used in the original implementation. With it, we get:

$$ K = \frac{3}{2} \cdot 2^{23} \cdot (127 - 0.0450465) = 1597463007 $$

Guess how the number 1597463007 looks in hex? Of course: 0x5f3759df. As it should be, since I chose σ precisely so as to arrive at this number.
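
The arithmetic is easy to reproduce; a few lines of C suffice (a sketch, with L, B and σ as defined above):

#include <stdio.h>

int main(void) {
  const double L = 8388608.0; // 2^23
  const double B = 127.0;
  const double sigma = 0.0450465;
  long K = (long)(1.5 * L * (B - sigma));
  printf("K = %ld = 0x%lx\n", K, K); // 1597463007 = 0x5f3759df
  return 0;
}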

So this number is not a bitmask (as some people assume simply because it is written in hex), but the result of computing an approximation.

But, as Knuth would say: "We have only proved that this should work, we have not checked that it actually works." To evaluate the quality of our formula, let us plot the inverse square root computed this way next to the true, exact implementation:

[Graph: exact 1/sqrt(x) vs. the approximation, for x from 1 to 100; the two curves almost coincide]

The graph covers numbers from 1 to 100. Not bad, right? And it is not magic and not a conjuring trick, just the correct use of a somewhat exotic technique: reinterpreting floating-point numbers as integers and back.
But that's not all!
The transformations above give us not only an explanation of the constant 0x5f3759df but also several more valuable conclusions.

First, about the numbers L and B. They are determined not by our task of computing the inverse square root but by the floating-point storage format. This means the same trick can be pulled off for 64-bit and 128-bit floating-point numbers as well; you just need to redo the calculations to obtain the corresponding constants.

Second, the chosen value of σ is not that important to us. It may not (and in fact does not) give the best straight-line approximation v + σ of the logarithm. This σ was chosen because it gives the best result in combination with the subsequent Newton step. Without that step, choosing σ would be an interesting problem in its own right, and the topic is explored in other publications.

Finally, look at the coefficient "-1/2" in the final formula. It appeared because of what we set out to compute (the inverse square root). In general, though, the power here can be anything between -1 and 1. If we denote the power p and repeat the same transformations, then instead of "-1/2" we get:

$$ I_y \approx (1 - p)\,L\,(B - \sigma) + p\,I_x $$

Let us take p = 0.5. This computes the ordinary (not inverse) square root of the number:

$$ I_y \approx \frac{1}{2}\,L\,(B - \sigma) + \frac{1}{2}\,I_x $$

$$ \frac{1}{2}\,L\,(B - \sigma) = 532487669 = \texttt{0x1fbd1df5} $$

In code this becomes:

i = 0x1fbd1df5 + (i >> 1);

And does it work? Of course it works:

[Graph: exact sqrt(x) vs. the approximation]
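
Wrapped into a complete function, the square-root variant might look like this (a sketch of mine by analogy with FastInvSqrt; no Newton refinement is applied here, since for the ordinary square root the refinement step would reintroduce a division):

float FastSqrt(float x) {
  int i = *(int*)&x;
  i = 0x1fbd1df5 + (i >> 1); // (1/2)*L*(B - sigma) + I_x/2
  return *(float*)&i;
}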

This is probably a well-known way of quickly approximating the square root, but a quick round of googling did not turn up its name. Perhaps you can tell me?
The method also works for stranger powers, such as the cube root:

$$ I_y \approx \frac{2}{3}\,L\,(B - \sigma) + \frac{1}{3}\,I_x $$

$$ \frac{2}{3}\,L\,(B - \sigma) \approx \texttt{0x2a517d3c} $$

In code this is expressed as:

i = (int)(0x2a517d3c + (0.333f * i));

Unfortunately, because the power is 1/3, we cannot use a bitwise shift and have to multiply by 0.333f instead. The approximation is still quite good:

[Graph: exact cube root vs. the approximation]
And even more than that!
By this point you may already have noticed that changing the power of the computed function changes our calculation trivially: we simply compute a new constant. That is a very cheap operation, and we can do it even at runtime, for whatever power is required. If we multiply two constants we already know:

$$ L\,(B - \sigma) = 1064975338 = \texttt{0x3f7a3bea} $$

then we can compute the required value on the fly, for an arbitrary power from -1 to 1:

i = (1 - p) * 0x3f7a3bea + (p * i);

Simplifying the expression slightly, we can even save one multiplication:

i = 0x3f7a3bea + p * (i - 0x3f7a3bea);

This formula gives us a "magic" constant with which we can compute arbitrary powers of numbers on the fly (for powers from -1 to 1). For complete happiness we only lack certainty that an approximation computed this way can be improved by Newton's method as effectively as in the original inverse-square-root implementation. I have not studied this question more deeply, and it will probably be the topic of a separate publication (probably not mine).
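
As a self-contained function this might look as follows (a sketch; the name FastPow is mine, and as noted above no Newton step is applied):

// Approximate pow(x, p) for x > 0 and p in [-1, 1] via the bit trick.
float FastPow(float x, float p) {
  int i = *(int*)&x;
  i = (int)(0x3f7a3bea + p * (i - 0x3f7a3bea));
  return *(float*)&i;
}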

The expression above contains a new "magic constant": 0x3f7a3bea. In a sense (because of its universality) it is even "more magical" than the constant in the original code. Let us call it C and look at it a little more closely.

Let us check what our formula does when p = 0. As you remember from school mathematics, any number to the power 0 equals one. What happens to our formula? Very simple: multiplication by 0 kills the second term and we are left with:

i = 0x3f7a3bea;

This really is a constant, and interpreted as a float it gives 0.977477, i.e. "almost 1". Since we are dealing with approximations here, that is a good one. And it tells us something else: our constant C is not at all arbitrary. It is the number one in floating-point format (well, "almost one").
That is interesting. Let's take a closer look:

The integer representation of C is:

$$ I_C = L\,(B - \sigma) $$

This is almost, but not quite, the integer form of a floating-point number. The only problem is that the second term is subtracted where it should be added. That is easy to fix:

$$ I_C = L\left((-1) + B + (1 - \sigma)\right) $$

Now it looks exactly like the integer representation of a floating-point number. To find out which one, we compute the exponent and the mantissa, and then C itself. The exponent:

$$ e_C = -1 $$

And the mantissa:

$$ m_C = 1 - \sigma $$

And therefore the value of the number itself is:

$$ C = 2^{e_C}\,(1 + m_C) = \frac{2 - \sigma}{2} = 1 - \frac{\sigma}{2} $$

And indeed, if we divide our σ (which was 0.0450465) by 2 and subtract the result from one, we get the already familiar 0.97747675, the one that is "almost 1". This lets us look at C from the other side and compute it at runtime:

float sigma = 0.0450465f;
float c_sigma = 1.0f - (0.5f * sigma);
int C_sigma = *(int*)&c_sigma;
Note that for a fixed σ all of these values are constants, and the compiler can evaluate them at compile time. The result is 0x3f7a3beb, which is not exactly the 0x3f7a3bea from the calculations above but differs from it only in the least significant bit. To get the original constant from the code (and from the title of this article), multiply it by another 1.5.
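
Put together as a runnable check (a sketch; as just discussed, the hex values can differ from the derived ones in the last bit or two because of rounding):

#include <stdio.h>

int main(void) {
  float sigma = 0.0450465f;
  float c_sigma = 1.0f - (0.5f * sigma);
  int C_sigma = *(int*)&c_sigma;
  printf("C = 0x%x\n", C_sigma);                    // 0x3f7a3beb
  printf("1.5 * C = 0x%x\n", (int)(1.5 * C_sigma)); // within a bit of 0x5f3759df
  return 0;
}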

With all these calculations we have come to see that the code at the beginning of the article contains no "magic", no "tricks", no "intuition", no "guesswork" and no other dirty hacks, only pure mathematics in all its pristine beauty. For me, the main takeaway from this story was that converting float to int and back by reinterpreting the same set of bits is not a programmer error and not a "hack", but a perfectly reasonable operation at times. Slightly exotic, of course, but very fast and practically useful. And it seems to me that other applications can be found for it; we shall see.
xially 19 september 2017, 9:34