Resources

People often ask me "How did you learn how to hack?" The answer: by reading. This page is a collection of the blog posts and other articles that I have accumulated over the years of my journey. Enjoy!

Tricks With the Floating-Point Format- 645

Bruce Dawson, Posted 4 Years Ago
  • The IEEE 754-1985 standard specifies the 32-bit floating-point format, which is known as the float type in many languages. The format for a floating-point number is much more complicated than the integer format because of the sheer range of numbers it can represent.
  • A 32-bit float has three fields, in this order:
    1. One-bit sign field.
    2. Eight-bit exponent field.
    3. Twenty-three-bit mantissa field. This is where the less significant bits are stored.
  • The author suggests experimenting with the format using the provided code, which was actually really helpful! Here are a few things that I learned:
    • The exponent field is 127 for a value of 1.0. It then moves down or up by one each time the value is halved or doubled.
    • Sign-and-magnitude representation is used. As a result, every number (including 0) has both a positive and a negative form.
  • Apart from a few special cases, the formula for decoding a float is quite simple: (1.0 + mantissa-field / 0x800000) * 2^(exponent-field-127). The sign bit then makes the result positive or negative. The 1.0 is an implied leading 1 in front of the mantissa field, and it applies whenever the exponent (after subtracting 127) is between -126 and 127. If the exponent field is zero, the exponent is treated as -126 and the leading one is not added.
  • There are a few special cases that are interesting to talk about. First, if the exponent field is 255 and the mantissa is zero, then the value is infinity. If the exponent field is 255 and the mantissa is non-zero, then the value is Not a Number (NaN).
  • Why the implicit 1 though? In a normalized binary number the first significant bit is always a 1; otherwise, the exponent would be smaller. Not storing it saves a bit, which goes toward extra precision instead.
  • This is a long series of posts that I am really excited to learn all about! Floats have caused issues for developers for many years; I am hyped to finally understand why. More on these articles will be coming!
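
To make the notes above concrete, here is a minimal sketch of the field layout and the decoding formula in Python. This is not the code from the article; the function names float_fields and decode_fields are my own, and it assumes Python's struct module to get at the raw 32 bits.

```python
import math
import struct


def float_fields(x):
    """Round x to a 32-bit float and return its (sign, exponent, mantissa) fields."""
    # Pack as a little-endian float, then reread the same 4 bytes as an unsigned int.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    sign = bits >> 31                # 1 bit
    exponent = (bits >> 23) & 0xFF   # 8 bits, biased by 127
    mantissa = bits & 0x7FFFFF       # 23 bits
    return sign, exponent, mantissa


def decode_fields(sign, exponent, mantissa):
    """Rebuild the value from the three fields (hypothetical helper)."""
    if exponent == 255:
        # Special cases: infinity when the mantissa is zero, NaN otherwise.
        value = math.inf if mantissa == 0 else math.nan
    elif exponent == 0:
        # Exponent field of zero: exponent is -126 and there is no implied leading 1.
        value = (mantissa / 0x800000) * 2.0 ** -126
    else:
        # Normal case: implied leading 1 plus the mantissa fraction.
        value = (1.0 + mantissa / 0x800000) * 2.0 ** (exponent - 127)
    return -value if sign else value
```

For example, float_fields(1.0) returns (0, 127, 0), and float_fields(-0.0) returns (1, 0, 0), which shows the negative zero that sign-and-magnitude allows.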