Why does Python's hash of infinity have the digits of π?












235















The hash of infinity in Python has digits matching pi:



>>> import math
>>> inf = float('inf')
>>> hash(inf)
314159
>>> int(math.pi*1e5)
314159


Is that just a coincidence or is it intentional?
































  • 9





    Not certain, but my guess would be that it's as deliberate as hash(float('nan')) being 0.

    – cs95
    May 20 at 20:04






  • 121





    Ask Tim Peters. Here's the commit where he introduced this constant, 19 years ago: github.com/python/cpython/commit/…. I kept those special values when I reworked the numeric hash in bugs.python.org/issue8188

    – Mark Dickinson
    May 20 at 20:38








  • 8





    @MarkDickinson Thanks. It looks like Tim may have also used the digits of e for hash of -inf originally.

    – wim
    May 20 at 20:42








  • 17





    @wim Ah yes, true. And apparently I changed that to -314159. I'd forgotten about that.

    – Mark Dickinson
    May 20 at 20:44






  • 4





    Did you test this in python implementations other than CPython? PyPy, Jython? If not, you should tag this with the appropriate runtime, as it's likely specific to that implementation (whether intentional or not).

    – jpmc26
    May 22 at 6:56




















python math hash floating-point pi














asked May 20 at 20:00 by wim

edited May 24 at 1:24 by Andy Lester

























3 Answers





































213





+100









Summary: It's not a coincidence; _PyHASH_INF is hardcoded as 314159 in the default CPython implementation of Python, and was picked as an arbitrary value (obviously from the digits of π) by Tim Peters in 2000.





The value of hash(float('inf')) is one of the system-dependent parameters of the built-in hash function for numeric types, and is also available as sys.hash_info.inf in Python 3:



>>> import sys
>>> sys.hash_info
sys.hash_info(width=64, modulus=2305843009213693951, inf=314159, nan=0, imag=1000003, algorithm='siphash24', hash_bits=64, seed_bits=128, cutoff=0)
>>> sys.hash_info.inf
314159


(Same results with PyPy too.)
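To check this on whichever interpreter you have at hand without hardcoding 314159, a small sketch like the following may help. The helper name infinity_hash_matches_hash_info is made up for illustration, and the check assumes Python 3.3 or later (for sys.implementation).

```python
import sys

def infinity_hash_matches_hash_info():
    """Return True if hash(+inf) and hash(-inf) agree with sys.hash_info.inf."""
    inf = float("inf")
    return hash(inf) == sys.hash_info.inf and hash(-inf) == -sys.hash_info.inf

# sys.implementation.name is a string such as "cpython" or "pypy".
print(sys.implementation.name, sys.hash_info.inf, infinity_hash_matches_hash_info())
```

Comparing against sys.hash_info.inf rather than a literal keeps the check meaningful on an implementation that chose a different constant.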





In terms of code, hash is a built-in function. Calling it on a Python float object invokes the function whose pointer is given by the tp_hash attribute of the built-in float type (PyTypeObject PyFloat_Type), which is the float_hash function, defined as return _Py_HashDouble(v->ob_fval), which in turn has



    if (Py_IS_INFINITY(v))
        return v > 0 ? _PyHASH_INF : -_PyHASH_INF;


where _PyHASH_INF is defined as 314159:



#define _PyHASH_INF 314159
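For readers more comfortable with Python than C, the infinity branch can be mirrored roughly as below. This is only a sketch of that one special case, not the full float hash; hash_infinite_float is a hypothetical name, and it reads the constant from sys.hash_info.inf on the assumption that Python 3 exposes _PyHASH_INF there.

```python
import math
import sys

def hash_infinite_float(v):
    """Hypothetical Python mirror of the C branch; it only handles infinities."""
    if not math.isinf(v):
        raise ValueError("this sketch only covers +inf and -inf")
    # Same sign test as the C code: positive infinity gets the constant,
    # negative infinity gets its negation.
    return sys.hash_info.inf if v > 0 else -sys.hash_info.inf

for value in (float("inf"), float("-inf")):
    print(value, hash_infinite_float(value), hash(value))
```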




In terms of history, the first mention of 314159 in this context in the Python code (you can find this with git bisect or git log -S 314159 -p) was added by Tim Peters in August 2000, in what is now commit 39dce293 in the cpython git repository.



The commit message says:




Fix for http://sourceforge.net/bugs/?func=detailbug&bug_id=111866&group_id=5470.
This was a misleading bug -- the true "bug" was that hash(x) gave an error
return when x is an infinity. Fixed that. Added new Py_IS_INFINITY macro to
pyport.h. Rearranged code to reduce growing duplication in hashing of float and
complex numbers, pushing Trent's earlier stab at that to a logical conclusion.
Fixed exceedingly rare bug where hashing of floats could return -1 even if there
wasn't an error (didn't waste time trying to construct a test case, it was simply
obvious from the code that it could happen). Improved complex hash so that
hash(complex(x, y)) doesn't systematically equal hash(complex(y, x)) anymore.




In particular, in this commit he ripped out the code of static long float_hash(PyFloatObject *v) in Objects/floatobject.c and made it just return _Py_HashDouble(v->ob_fval);, and in the definition of long _Py_HashDouble(double v) in Objects/object.c he added the lines:



        if (Py_IS_INFINITY(intpart))
            /* can't convert to long int -- arbitrary */
            v = v < 0 ? -271828.0 : 314159.0;


So as mentioned, it was an arbitrary choice. Note that 271828 is formed from the first few decimal digits of e.



Related later commits:




  • By Mark Dickinson in Apr 2010, making the Decimal type behave similarly.


  • By Mark Dickinson in Apr 2010, moving this check to the top and adding test cases.


  • By Mark Dickinson in May 2010 as issue 8188, completely rewriting the hash function to its current implementation but retaining this special case and giving the constant the name _PyHASH_INF. This commit also removed the 271828, which is why hash(float('-inf')) returns -314159 in Python 3 rather than -271828 as it does in Python 2.


  • By Raymond Hettinger in Jan 2011, adding an explicit example of sys.hash_info showing the above value to the "What's new" document for Python 3.2.


  • By Stefan Krah in Mar 2012, modifying the Decimal module but keeping this hash.


  • By Christian Heimes in Nov 2013, moving the definition of _PyHASH_INF from Include/pyport.h to Include/pyhash.h, where it now lives.






























  • 43





    The choice of -271828 for -Inf eliminates any doubt that the pi association was accidental.

    – Russell Borogove
    May 21 at 4:30






  • 23





    @RussellBorogove No but it makes it about one million times less likely ;)

    – pipe
    May 21 at 15:01






  • 3





    @RussellBorogove Well there was no doubt in my mind anyway after seeing “314159” in the source code :-) I mean, the probability P(programmer wanted something arbitrary and picked a few digits of well-known constant) >> P(programmer typed some digits at random) * P(it happened to be a specific sequence of six digits), where the second factor is 1/1000000 but even the first factor is probably smaller than the LHS already! (I've many times written “random” numbers like 123456 or 314159, but never checked-in random strings of digits.) But sure, seeing the other constant is nice too.

    – ShreevatsaR
    May 21 at 15:32








  • 2





    @pipe Let's just say "removes any reasonable doubt" and call it a day.

    – jpmc26
    May 22 at 6:04






  • 8





    @cmaster: See the part above where it says May 2010, namely the documentation section on hashing of numeric types and issue 8188 — the idea is that we want hash(42.0) to be the same as hash(42), also the same as hash(Decimal(42)) and hash(complex(42)) and hash(Fraction(42, 1)). The solution (by Mark Dickinson) is an elegant one IMO: defining a mathematical function that works for any rational number, and using the fact that floating-point numbers are rational numbers too.

    – ShreevatsaR
    May 22 at 13:22
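The rational-number hash mentioned in the comment above can be sketched in Python. The function below is adapted from the example in the Python documentation section on hashing of numeric types; treat it as an illustration of the idea rather than the C implementation, and note that it assumes Python 3.2 or later with n positive.

```python
import sys
from decimal import Decimal
from fractions import Fraction

def hash_fraction(m, n):
    """Hash of the rational number m / n, for integers m and n with n > 0."""
    P = sys.hash_info.modulus
    # Remove common factors of P (unnecessary if m and n are already coprime).
    while m % P == n % P == 0:
        m, n = m // P, n // P
    if n % P == 0:
        # No modular inverse exists, so the "infinity" constant is used.
        hash_value = sys.hash_info.inf
    else:
        # P is prime, so pow(n, P - 2, P) is the inverse of n modulo P.
        hash_value = (abs(m) % P) * pow(n, P - 2, P) % P
    if m < 0:
        hash_value = -hash_value
    if hash_value == -1:
        # -1 is reserved as an error return in CPython's C API.
        hash_value = -2
    return hash_value

# Equal numbers of different types share one hash value.
print(hash_fraction(42, 1), hash(42), hash(42.0), hash(Decimal(42)),
      hash(complex(42)), hash(Fraction(42, 1)))
print(hash_fraction(3, 4), hash(0.75), hash(Fraction(3, 4)))
```

A denominator divisible by the modulus is mapped to sys.hash_info.inf, which is where the infinity constant enters the general scheme.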



















11














Indeed,



sys.hash_info.inf


returns 314159. The value is not generated, it's built into the source code.
In fact,



hash(float('-inf'))


returns -271828, formed from the leading digits of e, in Python 2 (it's -314159 now).



The fact that the two most famous irrational numbers of all time are used as the hash values makes it very unlikely to be a coincidence.
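The digit match for both constants is easy to confirm. The helper leading_digits below is a made-up name for this sketch, and it assumes its argument lies between 1 and 10.

```python
import math

def leading_digits(x, count=6):
    """Return the first `count` decimal digits of x (1 <= x < 10) as an int."""
    return int(x * 10 ** (count - 1))

print(leading_digits(math.pi))  # 314159
print(leading_digits(math.e))   # 271828
```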





























































    45







    _PyHASH_INF is defined as a constant equal to 314159.



    I can't find any discussion about this, or comments giving a reason. I think it was chosen more or less arbitrarily. I imagine that as long as they don't use the same meaningful value for other hashes, it shouldn't matter.
















    answered May 20 at 20:19 by Patrick Haugh











    • 3





      Small nitpick: it is almost inevitable by definition that the same value will be used for other hashes, e.g. in this case hash(314159) is also 314159. Also try, in Python 3, hash(2305843009214008110) == 314159 (this input is 314159 + sys.hash_info.modulus) etc.

      – ShreevatsaR
      May 21 at 11:43






    • 1





      @ShreevatsaR I just meant that as long as they don't choose this value to be the hash of other values by definition, then choosing a meaningful value like this doesn't increase the chance of hash collisions

      – Patrick Haugh
      May 21 at 13:37






    • 1





      Ah ok, that makes sense (something like: as long as they don't use the same value as hash for other meaningful values... or something like that). Meanwhile, I had fun thinking about how to find all numeric values that hash to the same value; have answered it on the only question I could find, here. :-)

      – ShreevatsaR
      May 22 at 1:52
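The collisions described in these comments can be reproduced with a short sketch. It assumes Python 3, where non-negative integers hash to their value modulo sys.hash_info.modulus; the variable names are only illustrative.

```python
import sys

P = sys.hash_info.modulus
target = sys.hash_info.inf

# Infinity, the integer equal to the constant, and integers congruent to it
# modulo P all end up with the same hash value.
colliding = [float("inf"), target, target + P, target + 2 * P]
for value in colliding:
    print(repr(value), hash(value))

# Equal hashes do not make the values equal, so they remain distinct dict keys.
table = {value: index for index, value in enumerate(colliding)}
print(len(table))
```

A shared hash only means these keys probe the same slot first; the equality comparison still tells them apart.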














    • 3





      Small nitpick: it is almost inevitable by definition that the same value will be used for other hashes, e.g. in this case hash(314159) is also 314159. Also try, in Python 3, hash(2305843009214008110) == 314159 (this input is 314159 + sys.hash_info.modulus) etc.

      – ShreevatsaR
      May 21 at 11:43






    • 1





      @ShreevatsaR I just meant that as long as they don't choose this value to be the hash of other values by definition, then choosing a meaningful value like this doesn't increase the chance of hash collisions

      – Patrick Haugh
      May 21 at 13:37






    • 1





      Ah ok, that makes sense (something like: as long as they don't use the same value as hash for other meaningful values... or something like that). Meanwhile, I had fun thinking about how find all numeric values that hash to the same value; have answered it on the only question I could find, here. :-)

      – ShreevatsaR
      May 22 at 1:52








    3




    3





    Small nitpick: it is almost inevitable by definition that the same value will be used for other hashes, e.g. in this case hash(314159) is also 314159. Also try, in Python 3, hash(2305843009214008110) == 314159 (this input is 314159 + sys.hash_info.modulus) etc.

    – ShreevatsaR
    May 21 at 11:43





    Small nitpick: it is almost inevitable by definition that the same value will be used for other hashes, e.g. in this case hash(314159) is also 314159. Also try, in Python 3, hash(2305843009214008110) == 314159 (this input is 314159 + sys.hash_info.modulus) etc.

    – ShreevatsaR
    May 21 at 11:43




    1




    1





    @ShreevatsaR I just meant that as long as they don't choose this value to be the hash of other values by definition, then choosing a meaningful value like this doesn't increase the chance of hash collisions

    – Patrick Haugh
    May 21 at 13:37





    @ShreevatsaR I just meant that as long as they don't choose this value to be the hash of other values by definition, then choosing a meaningful value like this doesn't increase the chance of hash collisions

    – Patrick Haugh
    May 21 at 13:37




    1




    1





    Ah ok, that makes sense (something like: as long as they don't use the same value as hash for other meaningful values... or something like that). Meanwhile, I had fun thinking about how find all numeric values that hash to the same value; have answered it on the only question I could find, here. :-)

    – ShreevatsaR
    May 22 at 1:52





    Ah ok, that makes sense (something like: as long as they don't use the same value as hash for other meaningful values... or something like that). Meanwhile, I had fun thinking about how find all numeric values that hash to the same value; have answered it on the only question I could find, here. :-)

    – ShreevatsaR
    May 22 at 1:52













    213





    +100









    Summary: It's not a coincidence; _PyHASH_INF is hardcoded as 314159 in the default CPython implementation of Python, and was picked as an arbitrary value (obviously from the digits of π) by Tim Peters in 2000.





    The value of hash(float('inf')) is one of the system-dependent parameters of the built-in hash function for numeric types, and is also available as sys.hash_info.inf in Python 3:



    >>> import sys
    >>> sys.hash_info
    sys.hash_info(width=64, modulus=2305843009213693951, inf=314159, nan=0, imag=1000003, algorithm='siphash24', hash_bits=64, seed_bits=128, cutoff=0)
    >>> sys.hash_info.inf
    314159


    (Same results with PyPy too.)





    In terms of code, hash is a built-in function. Calling it on a Python float object invokes the function whose pointer is given by the tp_hash attribute of the built-in float type (PyTypeObject PyFloat_Type), which is the float_hash function, defined as return _Py_HashDouble(v->ob_fval), which in turn has



        if (Py_IS_INFINITY(v))
    return v > 0 ? _PyHASH_INF : -_PyHASH_INF;


    where _PyHASH_INF is defined as 314159:



    #define _PyHASH_INF 314159




    In terms of history, the first mention of 314159 in this context in the Python code (you can find this with git bisect or git log -S 314159 -p) was added by Tim Peters in August 2000, in what is now commit 39dce293 in the cpython git repository.



    The commit message says:




    Fix for http://sourceforge.net/bugs/?func=detailbug&bug_id=111866&group_id=5470.
    This was a misleading bug -- the true "bug" was that hash(x) gave an error
    return when x is an infinity. Fixed that. Added new Py_IS_INFINITY macro to
    pyport.h. Rearranged code to reduce growing duplication in hashing of float and
    complex numbers, pushing Trent's earlier stab at that to a logical conclusion.
    Fixed exceedingly rare bug where hashing of floats could return -1 even if there
    wasn't an error (didn't waste time trying to construct a test case, it was simply
    obvious from the code that it could happen). Improved complex hash so that
    hash(complex(x, y)) doesn't systematically equal hash(complex(y, x)) anymore.




    In particular, in this commit he ripped out the code of static long float_hash(PyFloatObject *v) in Objects/floatobject.c and made it just return _Py_HashDouble(v->ob_fval);, and in the definition of long _Py_HashDouble(double v) in Objects/object.c he added the lines:



            if (Py_IS_INFINITY(intpart))
    /* can't convert to long int -- arbitrary */
    v = v < 0 ? -271828.0 : 314159.0;


    So as mentioned, it was an arbitrary choice. Note that 271828 is formed from the first few decimal digits of e.



    Related later commits:




    • By Mark Dickinson in Apr 2010 (also), making the Decimal type behave similarly


    • By Mark Dickinson in Apr 2010 (also), moving this check to the top and adding test cases


    • By Mark Dickinson in May 2010 as issue 8188, completely rewriting the hash function to its current implementation, but retaining this special case, giving the constant a name _PyHASH_INF (also removing the 271828 which is why in Python 3 hash(float('-inf')) returns -314159 rather than -271828 as it does in Python 2)


    • By Raymond Hettinger in Jan 2011, adding an explicit example in the "What's new" for Python 3.2 of sys.hash_info showing the above value. (See here.)


    • By Stefan Krah in Mar 2012 modifying the Decimal module but keeping this hash.


    • By Christian Heimes in Nov 2013, moved the definition of _PyHASH_INF from Include/pyport.h to Include/pyhash.h where it now lives.







    share|improve this answer























    • 43





      The choice of -271828 for -Inf eliminates any doubt that the pi association was accidental.

      – Russell Borogove
      May 21 at 4:30






    • 23





      @RussellBorogove No but it makes it about one million times less likely ;)

      – pipe
      May 21 at 15:01






    • 3





      @RussellBorogove Well there was no doubt in my mind anyway after seeing “314159” in the source code :-) I mean, the probability P(programmer wanted something arbitrary and picked a few digits of well-known constant) >> P(programmer typed some digits at random) * P(it happened to be a specific sequence of six digits), where the second factor is 1/1000000 but even the first factor is probably smaller than the LHS already! (I've many times written “random” numbers like 123456 or 314159, but never checked-in random strings of digits.) But sure, seeing the other constant is nice too.

      – ShreevatsaR
      May 21 at 15:32








    • 2





      @pipe Let's just say "removes any reasonable doubt" and call it a day.

      – jpmc26
      May 22 at 6:04






    • 8





      @cmaster: See the part above where it says May 2010, namely the documentation section on hashing of numeric types and issue 8188 — the idea is that we want hash(42.0) to be the same as hash(42), also the same as hash(Decimal(42)) and hash(complex(42)) and hash(Fraction(42, 1)). The solution (by Mark Dickinson) is an elegant one IMO: defining a mathematical function that works for any rational number, and using the fact that floating-point numbers are rational numbers too.

      – ShreevatsaR
      May 22 at 13:22
















    213





    +100









    Summary: It's not a coincidence; _PyHASH_INF is hardcoded as 314159 in the default CPython implementation of Python, and was picked as an arbitrary value (obviously from the digits of π) by Tim Peters in 2000.





    The value of hash(float('inf')) is one of the system-dependent parameters of the built-in hash function for numeric types, and is also available as sys.hash_info.inf in Python 3:



    >>> import sys
    >>> sys.hash_info
    sys.hash_info(width=64, modulus=2305843009213693951, inf=314159, nan=0, imag=1000003, algorithm='siphash24', hash_bits=64, seed_bits=128, cutoff=0)
    >>> sys.hash_info.inf
    314159


    (Same results with PyPy too.)





    In terms of code, hash is a built-in function. Calling it on a Python float object invokes the function whose pointer is given by the tp_hash attribute of the built-in float type (PyTypeObject PyFloat_Type), which is the float_hash function, defined as return _Py_HashDouble(v->ob_fval), which in turn has



        if (Py_IS_INFINITY(v))
    return v > 0 ? _PyHASH_INF : -_PyHASH_INF;


    where _PyHASH_INF is defined as 314159:



    #define _PyHASH_INF 314159




    In terms of history, the first mention of 314159 in this context in the Python code (you can find this with git bisect or git log -S 314159 -p) was added by Tim Peters in August 2000, in what is now commit 39dce293 in the cpython git repository.



    The commit message says:




    Fix for http://sourceforge.net/bugs/?func=detailbug&bug_id=111866&group_id=5470.
    This was a misleading bug -- the true "bug" was that hash(x) gave an error
    return when x is an infinity. Fixed that. Added new Py_IS_INFINITY macro to
    pyport.h. Rearranged code to reduce growing duplication in hashing of float and
    complex numbers, pushing Trent's earlier stab at that to a logical conclusion.
    Fixed exceedingly rare bug where hashing of floats could return -1 even if there
    wasn't an error (didn't waste time trying to construct a test case, it was simply
    obvious from the code that it could happen). Improved complex hash so that
    hash(complex(x, y)) doesn't systematically equal hash(complex(y, x)) anymore.




    In particular, in this commit he ripped out the code of static long float_hash(PyFloatObject *v) in Objects/floatobject.c and made it just return _Py_HashDouble(v->ob_fval);, and in the definition of long _Py_HashDouble(double v) in Objects/object.c he added the lines:



        if (Py_IS_INFINITY(intpart))
            /* can't convert to long int -- arbitrary */
            v = v < 0 ? -271828.0 : 314159.0;


    So as mentioned, it was an arbitrary choice. Note that 271828 is formed from the first few decimal digits of e.
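As a quick sanity check of where the two constants come from, this small sketch truncates π and e to six significant digits, in the same way the question does for π:

```python
import math

# Shift the decimal point five places and drop the fractional part.
pi_digits = int(math.pi * 1e5)
e_digits = int(math.e * 1e5)

print(pi_digits, e_digits)  # 314159 271828
```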



    Related later commits:




    • By Mark Dickinson in Apr 2010, making the Decimal type behave similarly.


    • By Mark Dickinson in Apr 2010, moving this check to the top and adding test cases.


    • By Mark Dickinson in May 2010 as issue 8188, completely rewriting the hash function to its current implementation while retaining this special case and naming the constant _PyHASH_INF. This commit also removed the 271828, which is why hash(float('-inf')) returns -314159 in Python 3 rather than -271828 as in Python 2.


    • By Raymond Hettinger in Jan 2011, adding an explicit example of sys.hash_info showing the above value to the "What's new" for Python 3.2.


    • By Stefan Krah in Mar 2012, modifying the Decimal module but keeping this hash.


    • By Christian Heimes in Nov 2013, moving the definition of _PyHASH_INF from Include/pyport.h to Include/pyhash.h, where it now lives.
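The May 2010 rewrite listed above hashes any rational number m/n by reducing it modulo the prime sys.hash_info.modulus, and falls back to sys.hash_info.inf when the denominator has no modular inverse. The sketch below is adapted from the pure-Python illustration in the "Hashing of numeric types" section of the Python documentation; it is not the C implementation, and it assumes CPython 3.

```python
import sys
from fractions import Fraction


def hash_fraction(m, n):
    """Hash of the rational number m / n, assuming n > 0."""
    P = sys.hash_info.modulus
    # Cancel common factors of P (only relevant when m and n are not coprime).
    while m % P == n % P == 0:
        m, n = m // P, n // P

    if n % P == 0:
        # n has no inverse modulo P: use the same constant as for infinity.
        hash_value = sys.hash_info.inf
    else:
        # Fermat's little theorem: pow(n, P - 2, P) is the inverse of n mod P.
        hash_value = (abs(m) % P) * pow(n, P - 2, P) % P
    if m < 0:
        hash_value = -hash_value
    if hash_value == -1:
        # -1 is reserved as an error marker in CPython's C API.
        hash_value = -2
    return hash_value


print(hash_fraction(42, 1) == hash(42) == hash(42.0))
print(hash_fraction(1, 2) == hash(0.5) == hash(Fraction(1, 2)))
print(hash_fraction(1, sys.hash_info.modulus) == sys.hash_info.inf)
```

The last line shows that 314159 is not only the hash of float('inf'): the same constant is returned for any rational whose denominator is divisible by the modulus.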



































    edited May 23 at 18:24

    answered May 20 at 20:42

    ShreevatsaR











    • 43





      The choice of -271828 for -Inf eliminates any doubt that the pi association was accidental.

      – Russell Borogove
      May 21 at 4:30






    • 23





      @RussellBorogove No but it makes it about one million times less likely ;)

      – pipe
      May 21 at 15:01






    • 3





      @RussellBorogove Well there was no doubt in my mind anyway after seeing “314159” in the source code :-) I mean, the probability P(programmer wanted something arbitrary and picked a few digits of well-known constant) >> P(programmer typed some digits at random) * P(it happened to be a specific sequence of six digits), where the second factor is 1/1000000 but even the first factor is probably smaller than the LHS already! (I've many times written “random” numbers like 123456 or 314159, but never checked-in random strings of digits.) But sure, seeing the other constant is nice too.

      – ShreevatsaR
      May 21 at 15:32








    • 2





      @pipe Let's just say "removes any reasonable doubt" and call it a day.

      – jpmc26
      May 22 at 6:04






    • 8





      @cmaster: See the part above where it says May 2010, namely the documentation section on hashing of numeric types and issue 8188 — the idea is that we want hash(42.0) to be the same as hash(42), also the same as hash(Decimal(42)) and hash(complex(42)) and hash(Fraction(42, 1)). The solution (by Mark Dickinson) is an elegant one IMO: defining a mathematical function that works for any rational number, and using the fact that floating-point numbers are rational numbers too.

      – ShreevatsaR
      May 22 at 13:22

























    11














    Indeed,



    sys.hash_info.inf


    returns 314159. The value is not computed; it is hardcoded in the source.
    In fact,



    hash(float('-inf'))


    returns -271828, the negated leading digits of e, in Python 2 (in Python 3 it returns -314159).



    The use of the leading digits of π and e, two of the most famous irrational numbers, for these hash values makes a coincidence very unlikely.














    edited May 24 at 2:04

    answered May 21 at 16:39

    Alec Alameddine
































