this post was submitted on 14 May 2024
312 points (91.5% liked)
Programmer Humor
32562 readers
398 users here now
Post funny things about programming here! (Or just rant about your favourite programming language.)
Rules:
- Posts must be relevant to programming, programmers, or computer science.
- No NSFW content.
- Jokes must be in good taste. No hate speech, bigotry, etc.
founded 5 years ago
MODERATORS
you are viewing a single comment's thread
view the rest of the comments
view the rest of the comments
They are two different data types with potentially different in-memory representations.
Well, yeah, but they do mean the exact same thing, hopefully: true or false
Although thinking about it, someone above mentioned that the numpy
bool_
is an object, so I guess that is really: true or false or null/NoneIn an abstract sense, they do mean the same things but, in a technical sense, the one most relevant to programming, they do not.
The standard Python
bool
type is a subclass of the integer type. This means that it is stored as either 4 bytes (int32
) or 8 bytes (int64
).The
numpy.bool_
type is something closer to a native C boolean and is stored in 1 byte.So, memory-wise, one could store a
numpy.bool_
in a Pythonbool
but that now leaves 3-7 extra bytes that are unused in the variable. This introduces not just unnecessary memory usage but potential space for malicious data injection or extraction. Now, if one tries to store a Pythonbool
in anumpy.bool_
, if the interpreter or OS don't throw an error and kill the process, you now have a buffer overflow/illegal memory access problem.What about converting on the fly? Well, that can be done but will come at a performance cost as every function that can accept a
numpy.bool_
now has to perform additional type checking, validation, and conversion on every single function call. That adds up quick when processing data on scales where numpy is called for.