Why does Python's hash of infinity have the digits of π?
The hash of infinity in Python has digits matching pi:
>>> import math
>>> inf = float('inf')
>>> hash(inf)
314159
>>> int(math.pi*1e5)
314159
Is that just a coincidence or is it intentional?
python math hash floating-point pi
9
Not certain, but my guess would be that it's as deliberate as hash(float('nan')) being 0.
– cs95
May 20 at 20:04
121
Ask Tim Peters. Here's the commit where he introduced this constant, 19 years ago: github.com/python/cpython/commit/…. I kept those special values when I reworked the numeric hash in bugs.python.org/issue8188
– Mark Dickinson
May 20 at 20:38
8
@MarkDickinson Thanks. It looks like Tim may have also used the digits of e for hash of -inf originally.
– wim
May 20 at 20:42
17
@wim Ah yes, true. And apparently I changed that to -314159. I'd forgotten about that.
– Mark Dickinson
May 20 at 20:44
4
Did you test this in Python implementations other than CPython? PyPy, Jython? If not, you should tag this with the appropriate runtime, as it's likely specific to that implementation (whether intentional or not).
– jpmc26
May 22 at 6:56
asked May 20 at 20:00 by wim; edited May 24 at 1:24 by Andy Lester
3 Answers
_PyHASH_INF is defined as a constant equal to 314159.
I can't find any discussion about this, or comments giving a reason. I think it was chosen more or less arbitrarily. I imagine that as long as they don't use the same meaningful value for other hashes, it shouldn't matter.
3
Small nitpick: it is almost inevitable by definition that the same value will be used for other hashes, e.g. in this case hash(314159) is also 314159. Also try, in Python 3, hash(2305843009214008110) == 314159 (this input is 314159 + sys.hash_info.modulus) etc.
– ShreevatsaR
May 21 at 11:43
1
@ShreevatsaR I just meant that as long as they don't choose this value to be the hash of other values by definition, then choosing a meaningful value like this doesn't increase the chance of hash collisions.
– Patrick Haugh
May 21 at 13:37
1
Ah ok, that makes sense (something like: as long as they don't use the same value as hash for other meaningful values... or something like that). Meanwhile, I had fun thinking about how to find all numeric values that hash to the same value; have answered it on the only question I could find, here. :-)
– ShreevatsaR
May 22 at 1:52
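The collisions mentioned in the comments above can be checked directly. This is a minimal sketch, assuming CPython 3 (other implementations may hash differently); the variable names are illustrative only and do not come from the original posts.

```python
import sys

inf_hash = hash(float("inf"))      # the special-cased constant
modulus = sys.hash_info.modulus    # prime P used by the numeric hash

# Small non-negative ints hash to themselves, so the int equal to
# sys.hash_info.inf collides with the hash of infinity.
same_as_int = hash(sys.hash_info.inf) == inf_hash

# Int hashes are reduced modulo P, so adding P yields another collision.
same_after_wrap = hash(sys.hash_info.inf + modulus) == inf_hash

print(inf_hash, same_as_int, same_after_wrap)
```

Reading the constant from sys.hash_info rather than hardcoding 314159 keeps the check valid even if an implementation picks a different value.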
Summary: It's not a coincidence; _PyHASH_INF is hardcoded as 314159 in CPython, the default implementation of Python, and was picked as an arbitrary value (obviously from the digits of π) by Tim Peters in 2000.
The value of hash(float('inf')) is one of the system-dependent parameters of the built-in hash function for numeric types, and is also available as sys.hash_info.inf in Python 3:
>>> import sys
>>> sys.hash_info
sys.hash_info(width=64, modulus=2305843009213693951, inf=314159, nan=0, imag=1000003, algorithm='siphash24', hash_bits=64, seed_bits=128, cutoff=0)
>>> sys.hash_info.inf
314159
(Same results with PyPy too.)
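As a quick cross-check of the session above, the following sketch compares the published parameter with the actual hashes of both infinities. It assumes Python 3, where the negative infinity hash is the negated constant; the variable names are illustrative.

```python
import sys

pos = hash(float("inf"))     # hash of +infinity
neg = hash(float("-inf"))    # hash of -infinity

# Both should be derived from the single parameter sys.hash_info.inf.
print(sys.hash_info.inf, pos, neg)
print(pos == sys.hash_info.inf, neg == -sys.hash_info.inf)
```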
In terms of code, hash is a built-in function. Calling it on a Python float object invokes the function whose pointer is given by the tp_hash attribute of the built-in float type (PyTypeObject PyFloat_Type), which is the float_hash function, defined as return _Py_HashDouble(v->ob_fval), which in turn contains:
if (Py_IS_INFINITY(v))
return v > 0 ? _PyHASH_INF : -_PyHASH_INF;
where _PyHASH_INF is defined as 314159:
#define _PyHASH_INF 314159
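The same dispatch can be observed from Python, where float.__hash__ is the Python-level view of the tp_hash slot. The sketch below is an illustration only: infinity_branch is a hypothetical helper that mirrors the C special case quoted above, not a function that exists in CPython, and it assumes Python 3.

```python
import math
import sys


def infinity_branch(v):
    """Hypothetical mirror of the C infinity check; None for finite values."""
    if math.isinf(v):
        return sys.hash_info.inf if v > 0 else -sys.hash_info.inf
    return None


inf = float("inf")

# hash(x) looks up __hash__ on type(x), which wraps the tp_hash slot.
via_builtin = hash(inf)
via_slot = type(inf).__hash__(inf)

print(via_builtin == via_slot == infinity_branch(inf))
```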
In terms of history, the first mention of 314159 in this context in the Python code (you can find this with git bisect or git log -S 314159 -p) was added by Tim Peters in August 2000, in what is now commit 39dce293 in the cpython git repository.
The commit message says:
Fix for http://sourceforge.net/bugs/?func=detailbug&bug_id=111866&group_id=5470.
This was a misleading bug -- the true "bug" was that hash(x) gave an error return when x is an infinity. Fixed that. Added new Py_IS_INFINITY macro to pyport.h. Rearranged code to reduce growing duplication in hashing of float and complex numbers, pushing Trent's earlier stab at that to a logical conclusion. Fixed exceedingly rare bug where hashing of floats could return -1 even if there wasn't an error (didn't waste time trying to construct a test case, it was simply obvious from the code that it could happen). Improved complex hash so that hash(complex(x, y)) doesn't systematically equal hash(complex(y, x)) anymore.
In particular, in this commit he ripped out the code of static long float_hash(PyFloatObject *v) in Objects/floatobject.c and made it just return _Py_HashDouble(v->ob_fval);, and in the definition of long _Py_HashDouble(double v) in Objects/object.c he added the lines:
if (Py_IS_INFINITY(intpart))
/* can't convert to long int -- arbitrary */
v = v < 0 ? -271828.0 : 314159.0;
So as mentioned, it was an arbitrary choice. Note that 271828 is formed from the first few decimal digits of e.
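Both constants can be recovered by truncating the usual floating-point approximations of π and e. A small sketch, with illustrative variable names:

```python
import math

# Scale by 10**5 and truncate to keep the first six significant digits.
pi_digits = int(math.pi * 1e5)
e_digits = int(math.e * 1e5)

print(pi_digits, e_digits)
```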
Related later commits:
- By Mark Dickinson in Apr 2010, making the Decimal type behave similarly.
- By Mark Dickinson in Apr 2010, moving this check to the top and adding test cases.
- By Mark Dickinson in May 2010 as issue 8188, completely rewriting the hash function to its current implementation, but retaining this special case and giving the constant a name, _PyHASH_INF (also removing the 271828, which is why in Python 3 hash(float('-inf')) returns -314159 rather than -271828 as it does in Python 2).
- By Raymond Hettinger in Jan 2011, adding an explicit example of sys.hash_info in the "What's new" for Python 3.2, showing the above value.
- By Stefan Krah in Mar 2012, modifying the Decimal module but keeping this hash.
- By Christian Heimes in Nov 2013, moving the definition of _PyHASH_INF from Include/pyport.h to Include/pyhash.h, where it now lives.
43
The choice of -271828 for -Inf eliminates any doubt that the pi association was accidental.
– Russell Borogove
May 21 at 4:30
23
@RussellBorogove No but it makes it about one million times less likely ;)
– pipe
May 21 at 15:01
3
@RussellBorogove Well there was no doubt in my mind anyway after seeing “314159” in the source code :-) I mean, the probability P(programmer wanted something arbitrary and picked a few digits of well-known constant) >> P(programmer typed some digits at random) * P(it happened to be a specific sequence of six digits), where the second factor is 1/1000000 but even the first factor is probably smaller than the LHS already! (I've many times written “random” numbers like 123456 or 314159, but never checked-in random strings of digits.) But sure, seeing the other constant is nice too.
– ShreevatsaR
May 21 at 15:32
2
@pipe Let's just say "removes any reasonable doubt" and call it a day.
– jpmc26
May 22 at 6:04
8
@cmaster: See the part above where it says May 2010, namely the documentation section on hashing of numeric types and issue 8188: the idea is that we want hash(42.0) to be the same as hash(42), also the same as hash(Decimal(42)) and hash(complex(42)) and hash(Fraction(42, 1)). The solution (by Mark Dickinson) is an elegant one IMO: defining a mathematical function that works for any rational number, and using the fact that floating-point numbers are rational numbers too.
– ShreevatsaR
May 22 at 13:22
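The cross-type agreement described in the last comment can be sketched in pure Python. The function below follows the rule given in the Python documentation's section on hashing of numeric types; rational_hash is a hypothetical name used for illustration and is not CPython's C implementation. It assumes Python 3 and a positive denominator.

```python
import sys
from decimal import Decimal
from fractions import Fraction

P = sys.hash_info.modulus  # prime modulus of the numeric hash


def rational_hash(m, n):
    """Sketch of the documented hash of the rational m/n, assuming n > 0."""
    # Cancel shared factors of P so at most one of m, n is divisible by P.
    while m % P == 0 and n % P == 0:
        m, n = m // P, n // P
    if n % P == 0:
        # No inverse of n modulo P exists: treated like infinity.
        h = sys.hash_info.inf
    else:
        # Fermat's little theorem: n**(P-2) is the inverse of n modulo P.
        h = (abs(m) % P) * pow(n, P - 2, P) % P
    if m < 0:
        h = -h
    return -2 if h == -1 else h  # -1 is reserved as an error code


# Equal numbers of different types share one hash value.
forty_two = {hash(42), hash(42.0), hash(Decimal(42)),
             hash(complex(42)), hash(Fraction(42, 1)), rational_hash(42, 1)}
one_half = {hash(0.5), hash(Fraction(1, 2)), hash(Decimal("0.5")),
            rational_hash(1, 2)}
print(len(forty_two), len(one_half))

# A denominator divisible by P takes the same branch as infinity.
print(rational_hash(1, P) == sys.hash_info.inf)
```

The last line shows where the infinity constant reappears: a rational whose denominator is a multiple of the modulus has no well-defined residue, so the scheme assigns it the same value as hash(float('inf')).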
Indeed, sys.hash_info.inf is 314159. The value is not generated; it's built into the source code.
In fact, hash(float('-inf')) returns -271828, the leading digits of e negated, in Python 2 (it's -314159 in Python 3).
The fact that digits of the two most famous irrational numbers of all time are used as the hash values makes it very unlikely to be a coincidence.
(The first answer above, beginning "_PyHASH_INF is defined as a constant", was posted by Patrick Haugh on May 20 at 20:19.)
3
Small nitpick: it is almost inevitable by definition that the same value will be used for other hashes, e.g. in this casehash(314159)
is also314159
. Also try, in Python 3,hash(2305843009214008110) == 314159
(this input is314159 + sys.hash_info.modulus
) etc.
– ShreevatsaR
May 21 at 11:43
1
@ShreevatsaR I just meant that as long as they don't choose this value to be the hash of other values by definition, then choosing a meaningful value like this doesn't increase the chance of hash collisions
– Patrick Haugh
May 21 at 13:37
1
Ah ok, that makes sense (something like: as long as they don't use the same value as hash for other meaningful values... or something like that). Meanwhile, I had fun thinking about how find all numeric values that hash to the same value; have answered it on the only question I could find, here. :-)
– ShreevatsaR
May 22 at 1:52
add a comment |
3
Small nitpick: it is almost inevitable by definition that the same value will be used for other hashes, e.g. in this casehash(314159)
is also314159
. Also try, in Python 3,hash(2305843009214008110) == 314159
(this input is314159 + sys.hash_info.modulus
) etc.
– ShreevatsaR
May 21 at 11:43
1
@ShreevatsaR I just meant that as long as they don't choose this value to be the hash of other values by definition, then choosing a meaningful value like this doesn't increase the chance of hash collisions
– Patrick Haugh
May 21 at 13:37
1
Ah ok, that makes sense (something like: as long as they don't use the same value as hash for other meaningful values... or something like that). Meanwhile, I had fun thinking about how find all numeric values that hash to the same value; have answered it on the only question I could find, here. :-)
– ShreevatsaR
May 22 at 1:52
3
3
Small nitpick: it is almost inevitable by definition that the same value will be used for other hashes, e.g. in this case
hash(314159)
is also 314159
. Also try, in Python 3, hash(2305843009214008110) == 314159
(this input is 314159 + sys.hash_info.modulus
) etc.– ShreevatsaR
May 21 at 11:43
Small nitpick: it is almost inevitable by definition that the same value will be used for other hashes, e.g. in this case
hash(314159)
is also 314159
. Also try, in Python 3, hash(2305843009214008110) == 314159
(this input is 314159 + sys.hash_info.modulus
) etc.– ShreevatsaR
May 21 at 11:43
1
1
@ShreevatsaR I just meant that as long as they don't choose this value to be the hash of other values by definition, then choosing a meaningful value like this doesn't increase the chance of hash collisions
– Patrick Haugh
May 21 at 13:37
@ShreevatsaR I just meant that as long as they don't choose this value to be the hash of other values by definition, then choosing a meaningful value like this doesn't increase the chance of hash collisions
– Patrick Haugh
May 21 at 13:37
1
1
Ah ok, that makes sense (something like: as long as they don't use the same value as hash for other meaningful values... or something like that). Meanwhile, I had fun thinking about how find all numeric values that hash to the same value; have answered it on the only question I could find, here. :-)
– ShreevatsaR
May 22 at 1:52
Ah ok, that makes sense (something like: as long as they don't use the same value as hash for other meaningful values... or something like that). Meanwhile, I had fun thinking about how find all numeric values that hash to the same value; have answered it on the only question I could find, here. :-)
– ShreevatsaR
May 22 at 1:52
add a comment |
Summary: It's not a coincidence; _PyHASH_INF
is hardcoded as 314159 in the default CPython implementation of Python, and was picked as an arbitrary value (obviously from the digits of π) by Tim Peters in 2000.
The value of hash(float('inf'))
is one of the system-dependent parameters of the built-in hash function for numeric types, and is also available as sys.hash_info.inf
in Python 3:
>>> import sys
>>> sys.hash_info
sys.hash_info(width=64, modulus=2305843009213693951, inf=314159, nan=0, imag=1000003, algorithm='siphash24', hash_bits=64, seed_bits=128, cutoff=0)
>>> sys.hash_info.inf
314159
(Same results with PyPy too.)
In terms of code, hash
is a built-in function. Calling it on a Python float object invokes the function whose pointer is given by the tp_hash
attribute of the built-in float type (PyTypeObject PyFloat_Type
), which is the float_hash
function, defined as return _Py_HashDouble(v->ob_fval)
, which in turn has
if (Py_IS_INFINITY(v))
return v > 0 ? _PyHASH_INF : -_PyHASH_INF;
where _PyHASH_INF
is defined as 314159:
#define _PyHASH_INF 314159
In terms of history, the first mention of 314159
in this context in the Python code (you can find this with git bisect
or git log -S 314159 -p
) was added by Tim Peters in August 2000, in what is now commit 39dce293 in the cpython
git repository.
The commit message says:
Fix for http://sourceforge.net/bugs/?func=detailbug&bug_id=111866&group_id=5470.
This was a misleading bug -- the true "bug" was thathash(x)
gave an error
return whenx
is an infinity. Fixed that. Added newPy_IS_INFINITY
macro to
pyport.h
. Rearranged code to reduce growing duplication in hashing of float and
complex numbers, pushing Trent's earlier stab at that to a logical conclusion.
Fixed exceedingly rare bug where hashing of floats could return -1 even if there
wasn't an error (didn't waste time trying to construct a test case, it was simply
obvious from the code that it could happen). Improved complex hash so that
hash(complex(x, y))
doesn't systematically equalhash(complex(y, x))
anymore.
In particular, in this commit he ripped out the code of static long float_hash(PyFloatObject *v)
in Objects/floatobject.c
and made it just return _Py_HashDouble(v->ob_fval);
, and in the definition of long _Py_HashDouble(double v)
in Objects/object.c
he added the lines:
if (Py_IS_INFINITY(intpart))
/* can't convert to long int -- arbitrary */
v = v < 0 ? -271828.0 : 314159.0;
So as mentioned, it was an arbitrary choice. Note that 271828 is formed from the first few decimal digits of e.
Related later commits:
By Mark Dickinson in Apr 2010 (also), making the
Decimal
type behave similarlyBy Mark Dickinson in Apr 2010 (also), moving this check to the top and adding test cases
By Mark Dickinson in May 2010 as issue 8188, completely rewriting the hash function to its current implementation, but retaining this special case, giving the constant a name
_PyHASH_INF
(also removing the 271828 which is why in Python 3hash(float('-inf'))
returns-314159
rather than-271828
as it does in Python 2)By Raymond Hettinger in Jan 2011, adding an explicit example in the "What's new" for Python 3.2 of
sys.hash_info
showing the above value. (See here.)By Stefan Krah in Mar 2012 modifying the Decimal module but keeping this hash.
By Christian Heimes in Nov 2013, moved the definition of
_PyHASH_INF
fromInclude/pyport.h
toInclude/pyhash.h
where it now lives.
43
The choice of -271828 for -Inf eliminates any doubt that the pi association was accidental.
– Russell Borogove
May 21 at 4:30
23
@RussellBorogove No but it makes it about one million times less likely ;)
– pipe
May 21 at 15:01
3
@RussellBorogove Well there was no doubt in my mind anyway after seeing “314159” in the source code :-) I mean, the probability P(programmer wanted something arbitrary and picked a few digits of well-known constant) >> P(programmer typed some digits at random) * P(it happened to be a specific sequence of six digits), where the second factor is 1/1000000 but even the first factor is probably smaller than the LHS already! (I've many times written “random” numbers like 123456 or 314159, but never checked-in random strings of digits.) But sure, seeing the other constant is nice too.
– ShreevatsaR
May 21 at 15:32
2
@pipe Let's just say "removes any reasonable doubt" and call it a day.
– jpmc26
May 22 at 6:04
8
@cmaster: See the part above where it says May 2010, namely the documentation section on hashing of numeric types and issue 8188 — the idea is that we wanthash(42.0)
to be the same ashash(42)
, also the same ashash(Decimal(42))
andhash(complex(42))
andhash(Fraction(42, 1))
. The solution (by Mark Dickinson) is an elegant one IMO: defining a mathematical function that works for any rational number, and using the fact that floating-point numbers are rational numbers too.
– ShreevatsaR
May 22 at 13:22
|
show 4 more comments
Summary: It's not a coincidence; _PyHASH_INF
is hardcoded as 314159 in the default CPython implementation of Python, and was picked as an arbitrary value (obviously from the digits of π) by Tim Peters in 2000.
The value of hash(float('inf'))
is one of the system-dependent parameters of the built-in hash function for numeric types, and is also available as sys.hash_info.inf
in Python 3:
>>> import sys
>>> sys.hash_info
sys.hash_info(width=64, modulus=2305843009213693951, inf=314159, nan=0, imag=1000003, algorithm='siphash24', hash_bits=64, seed_bits=128, cutoff=0)
>>> sys.hash_info.inf
314159
(Same results with PyPy too.)
In terms of code, hash
is a built-in function. Calling it on a Python float object invokes the function whose pointer is given by the tp_hash
attribute of the built-in float type (PyTypeObject PyFloat_Type
), which is the float_hash
function, defined as return _Py_HashDouble(v->ob_fval)
, which in turn has
if (Py_IS_INFINITY(v))
return v > 0 ? _PyHASH_INF : -_PyHASH_INF;
where _PyHASH_INF
is defined as 314159:
#define _PyHASH_INF 314159
In terms of history, the first mention of 314159
in this context in the Python code (you can find this with git bisect
or git log -S 314159 -p
) was added by Tim Peters in August 2000, in what is now commit 39dce293 in the cpython
git repository.
The commit message says:
Fix for http://sourceforge.net/bugs/?func=detailbug&bug_id=111866&group_id=5470.
This was a misleading bug -- the true "bug" was thathash(x)
gave an error
return whenx
is an infinity. Fixed that. Added newPy_IS_INFINITY
macro to
pyport.h
. Rearranged code to reduce growing duplication in hashing of float and
complex numbers, pushing Trent's earlier stab at that to a logical conclusion.
Fixed exceedingly rare bug where hashing of floats could return -1 even if there
wasn't an error (didn't waste time trying to construct a test case, it was simply
obvious from the code that it could happen). Improved complex hash so that
hash(complex(x, y))
doesn't systematically equalhash(complex(y, x))
anymore.
In particular, in this commit he ripped out the code of static long float_hash(PyFloatObject *v)
in Objects/floatobject.c
and made it just return _Py_HashDouble(v->ob_fval);
, and in the definition of long _Py_HashDouble(double v)
in Objects/object.c
he added the lines:
if (Py_IS_INFINITY(intpart))
/* can't convert to long int -- arbitrary */
v = v < 0 ? -271828.0 : 314159.0;
So as mentioned, it was an arbitrary choice. Note that 271828 is formed from the first few decimal digits of e.
Related later commits:
By Mark Dickinson in Apr 2010 (also), making the
Decimal
type behave similarlyBy Mark Dickinson in Apr 2010 (also), moving this check to the top and adding test cases
By Mark Dickinson in May 2010 as issue 8188, completely rewriting the hash function to its current implementation, but retaining this special case, giving the constant a name
_PyHASH_INF
(also removing the 271828 which is why in Python 3hash(float('-inf'))
returns-314159
rather than-271828
as it does in Python 2)By Raymond Hettinger in Jan 2011, adding an explicit example in the "What's new" for Python 3.2 of
sys.hash_info
showing the above value. (See here.)By Stefan Krah in Mar 2012 modifying the Decimal module but keeping this hash.
By Christian Heimes in Nov 2013, moved the definition of
_PyHASH_INF
fromInclude/pyport.h
toInclude/pyhash.h
where it now lives.
43
The choice of -271828 for -Inf eliminates any doubt that the pi association was accidental.
– Russell Borogove
May 21 at 4:30
23
@RussellBorogove No but it makes it about one million times less likely ;)
– pipe
May 21 at 15:01
3
@RussellBorogove Well there was no doubt in my mind anyway after seeing “314159” in the source code :-) I mean, the probability P(programmer wanted something arbitrary and picked a few digits of well-known constant) >> P(programmer typed some digits at random) * P(it happened to be a specific sequence of six digits), where the second factor is 1/1000000 but even the first factor is probably smaller than the LHS already! (I've many times written “random” numbers like 123456 or 314159, but never checked-in random strings of digits.) But sure, seeing the other constant is nice too.
– ShreevatsaR
May 21 at 15:32
2
@pipe Let's just say "removes any reasonable doubt" and call it a day.
– jpmc26
May 22 at 6:04
8
@cmaster: See the part above where it says May 2010, namely the documentation section on hashing of numeric types and issue 8188 — the idea is that we wanthash(42.0)
to be the same ashash(42)
, also the same ashash(Decimal(42))
andhash(complex(42))
andhash(Fraction(42, 1))
. The solution (by Mark Dickinson) is an elegant one IMO: defining a mathematical function that works for any rational number, and using the fact that floating-point numbers are rational numbers too.
– ShreevatsaR
May 22 at 13:22
|
show 4 more comments
Summary: It's not a coincidence; _PyHASH_INF
is hardcoded as 314159 in the default CPython implementation of Python, and was picked as an arbitrary value (obviously from the digits of π) by Tim Peters in 2000.
The value of hash(float('inf'))
is one of the system-dependent parameters of the built-in hash function for numeric types, and is also available as sys.hash_info.inf
in Python 3:
>>> import sys
>>> sys.hash_info
sys.hash_info(width=64, modulus=2305843009213693951, inf=314159, nan=0, imag=1000003, algorithm='siphash24', hash_bits=64, seed_bits=128, cutoff=0)
>>> sys.hash_info.inf
314159
(Same results with PyPy too.)
In terms of code, hash
is a built-in function. Calling it on a Python float object invokes the function whose pointer is given by the tp_hash
attribute of the built-in float type (PyTypeObject PyFloat_Type
), which is the float_hash
function, defined as return _Py_HashDouble(v->ob_fval)
, which in turn has
if (Py_IS_INFINITY(v))
return v > 0 ? _PyHASH_INF : -_PyHASH_INF;
where _PyHASH_INF
is defined as 314159:
#define _PyHASH_INF 314159
In terms of history, the first mention of 314159
in this context in the Python code (you can find this with git bisect
or git log -S 314159 -p
) was added by Tim Peters in August 2000, in what is now commit 39dce293 in the cpython
git repository.
The commit message says:
Fix for http://sourceforge.net/bugs/?func=detailbug&bug_id=111866&group_id=5470.
This was a misleading bug -- the true "bug" was that hash(x) gave an error
return when x is an infinity. Fixed that. Added new Py_IS_INFINITY macro to
pyport.h. Rearranged code to reduce growing duplication in hashing of float and
complex numbers, pushing Trent's earlier stab at that to a logical conclusion.
Fixed exceedingly rare bug where hashing of floats could return -1 even if there
wasn't an error (didn't waste time trying to construct a test case, it was simply
obvious from the code that it could happen). Improved complex hash so that
hash(complex(x, y)) doesn't systematically equal hash(complex(y, x)) anymore.
In particular, in this commit he ripped out the code of static long float_hash(PyFloatObject *v) in Objects/floatobject.c and made it just return _Py_HashDouble(v->ob_fval);, and in the definition of long _Py_HashDouble(double v) in Objects/object.c he added the lines:
if (Py_IS_INFINITY(intpart))
/* can't convert to long int -- arbitrary */
v = v < 0 ? -271828.0 : 314159.0;
So as mentioned, it was an arbitrary choice. Note that 271828 is formed from the first few decimal digits of e.
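Both constants can be reproduced by truncating the familiar constants after five decimal places, in the same way the question does for π. This is only an arithmetic check, with made-up variable names, and says nothing about how the values were actually chosen:

```python
import math

# Shift five decimal places to the left of the point, then truncate.
pi_digits = int(math.pi * 1e5)
e_digits = int(math.e * 1e5)

print(pi_digits)  # 314159
print(e_digits)   # 271828
```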
Related later commits:
- By Mark Dickinson in Apr 2010 (also), making the Decimal type behave similarly
- By Mark Dickinson in Apr 2010 (also), moving this check to the top and adding test cases
- By Mark Dickinson in May 2010 as issue 8188, completely rewriting the hash function to its current implementation, but retaining this special case, giving the constant a name _PyHASH_INF (also removing the 271828, which is why in Python 3 hash(float('-inf')) returns -314159 rather than -271828 as it does in Python 2)
- By Raymond Hettinger in Jan 2011, adding an explicit example in the "What's new" for Python 3.2 of sys.hash_info showing the above value. (See here.)
- By Stefan Krah in Mar 2012, modifying the Decimal module but keeping this hash.
- By Christian Heimes in Nov 2013, moving the definition of _PyHASH_INF from Include/pyport.h to Include/pyhash.h, where it now lives.
edited May 23 at 18:24
answered May 20 at 20:42
ShreevatsaR
43
The choice of -271828 for -Inf eliminates any doubt that the pi association was accidental.
– Russell Borogove
May 21 at 4:30
23
@RussellBorogove No but it makes it about one million times less likely ;)
– pipe
May 21 at 15:01
3
@RussellBorogove Well there was no doubt in my mind anyway after seeing “314159” in the source code :-) I mean, the probability P(programmer wanted something arbitrary and picked a few digits of well-known constant) >> P(programmer typed some digits at random) * P(it happened to be a specific sequence of six digits), where the second factor is 1/1000000 but even the first factor is probably smaller than the LHS already! (I've many times written “random” numbers like 123456 or 314159, but never checked-in random strings of digits.) But sure, seeing the other constant is nice too.
– ShreevatsaR
May 21 at 15:32
2
@pipe Let's just say "removes any reasonable doubt" and call it a day.
– jpmc26
May 22 at 6:04
8
@cmaster: See the part above where it says May 2010, namely the documentation section on hashing of numeric types and issue 8188; the idea is that we want hash(42.0) to be the same as hash(42), also the same as hash(Decimal(42)) and hash(complex(42)) and hash(Fraction(42, 1)). The solution (by Mark Dickinson) is an elegant one IMO: defining a mathematical function that works for any rational number, and using the fact that floating-point numbers are rational numbers too.
– ShreevatsaR
May 22 at 13:22
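The rational-number idea in the comment above can be sketched in pure Python. The function below is adapted from the description in the Python documentation's section on hashing of numeric types; it is an illustrative sketch assuming CPython 3, not the C implementation itself. Note where sys.hash_info.inf appears: it is the value used when the denominator is divisible by the modulus, so no modular inverse exists.

```python
import sys
from decimal import Decimal
from fractions import Fraction


def hash_fraction(m, n):
    """Hash of the rational number m / n (m, n integers, n > 0)."""
    P = sys.hash_info.modulus
    # Remove common factors of P (unnecessary if m and n are coprime).
    while m % P == n % P == 0:
        m, n = m // P, n // P

    if n % P == 0:
        # n has no inverse modulo P: fall back to the "infinity" hash.
        hash_value = sys.hash_info.inf
    else:
        # P is prime, so pow(n, P - 2, P) is the inverse of n modulo P
        # (Fermat's little theorem).
        hash_value = (abs(m) % P) * pow(n, P - 2, P) % P
    if m < 0:
        hash_value = -hash_value
    if hash_value == -1:
        # -1 is reserved as an error code at the C level.
        hash_value = -2
    return hash_value


# Equal numbers hash equally, whatever their type.
assert (hash_fraction(42, 1) == hash(42) == hash(42.0)
        == hash(Decimal(42)) == hash(complex(42)) == hash(Fraction(42, 1)))

# A finite float is a rational number, so the same function covers it.
assert hash_fraction(*(0.1).as_integer_ratio()) == hash(0.1)
assert hash_fraction(*(-0.75).as_integer_ratio()) == hash(-0.75)

# A denominator divisible by the modulus yields the infinity hash.
assert hash_fraction(1, sys.hash_info.modulus) == sys.hash_info.inf
```

The float checks go through float.as_integer_ratio(), which gives the exact numerator and denominator of the stored binary value.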
Indeed, sys.hash_info.inf is 314159. The value is not generated, it's built into the source code.
In fact, hash(float('-inf')) returns -271828, or approximately -e × 10^5, in Python 2 (it's -314159 now).
The fact that the digits of the two most famous irrational numbers of all time are used as the hash values makes it very unlikely to be a coincidence.
edited May 24 at 2:04
answered May 21 at 16:39
Alec Alameddine