diff --git a/docs/learning-from-bidict.rst b/docs/learning-from-bidict.rst index 2aaeaf3..12870b1 100644 --- a/docs/learning-from-bidict.rst +++ b/docs/learning-from-bidict.rst @@ -1,19 +1,17 @@ Learning from bidict -------------------- -Below is an outline of -some of the more fascinating Python corners -I got to explore further +Below is an outline of some of the more fascinating +and lesser-known Python corners I got to explore further thanks to working on bidict. If you are interested in learning more about any of the following, I highly encourage you to -`read bidict's code `__. +`read bidict's code `__. -I've sought to optimize the code not just for correctness and for performance, -but also to make it a pleasure to read, -to share the `joys of computing `__ -in bidict with others. +I've sought to optimize the code not just for correctness and performance, +but also to make for a clear and enjoyable read, +illuminating anything that could otherwise be obscure or subtle. I hope it brings you some of the joy it's brought me. 😊 @@ -21,16 +19,15 @@ I hope it brings you some of the joy it's brought me. 😊 Python syntax hacks =================== -bidict used to support -`slice syntax `__ -for looking up keys by value: +Bidict used to support (ab)using a specialized form of Python's :ref:`slice ` syntax +for getting and setting keys by value: .. code-block:: python >>> element_by_symbol = bidict(H='hydrogen') - >>> element_by_symbol['H'] # normal syntax for the forward mapping + >>> element_by_symbol['H'] # [normal] syntax for the forward mapping 'hydrogen' - >>> element_by_symbol[:'hydrogen'] # :slice syntax for the inverse + >>> element_by_symbol[:'hydrogen'] # [:slice] syntax for the inverse (no longer supported) 'H' See `this code `__ @@ -38,37 +35,105 @@ for how this was implemented, and `#19 `__ for why this was dropped. 
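To illustrate how this kind of syntax hack can work in general: Python passes ``obj[:x]`` to ``__getitem__`` as ``slice(None, x, None)``, so a mapping can inspect the slice and route the lookup to an inverse dict. The following is a minimal sketch under that assumption. The ``SliceLookup`` class is hypothetical and written only for this illustration; it is not bidict's former implementation.

```python
class SliceLookup:
    """Hypothetical toy mapping that (ab)uses slice syntax for inverse lookups."""

    def __init__(self, **items):
        self._fwd = dict(items)
        # Assumes values are unique and hashable, as in a one-to-one mapping.
        self._inv = {val: key for key, val in items.items()}

    def __getitem__(self, key):
        if isinstance(key, slice):
            # obj[:val] arrives here as slice(None, val, None).
            if key.start is None and key.step is None and key.stop is not None:
                return self._inv[key.stop]
            raise TypeError('only [:value] slices are supported')
        return self._fwd[key]


element_by_symbol = SliceLookup(H='hydrogen')
forward = element_by_symbol['H']           # normal lookup by key
inverse = element_by_symbol[:'hydrogen']   # slice lookup by value
```

Because the slice check runs before any dict lookup, the slice object itself is never used as a dict key.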
-Efficient ordered mappings -========================== +Code structure +============== -**It's a real, live, industrial-strength linked list in the wild!** -If you've only ever seen the tame kind in those boring data structures courses, -you may be in for a treat: -see `_orderedbase.py `__. -Inspired by Python's own :class:`~collections.OrderedDict` -`implementation `_. +Bidicts come in every combination of mutable, immutable, ordered, and unordered types, +implementing Python's various +:class:`relevant ` +:class:`collections ` +:class:`interfaces ` +as appropriate. + +Factoring the code to maximize reuse, modularity, and +adherence to `SOLID `__ design principles +has been one of the most fun parts of working on bidict. + +To see how this is done, check out this code: + +- `_base.py `__ +- `_frozenbidict.py `__ +- `_mut.py `__ +- `_bidict.py `__ +- `_orderedbase.py `__ +- `_frozenordered.py `__ +- `_orderedbidict.py `__ + +Data structures are amazing +=========================== + +Data structures are among the most fascinating and important +building blocks of programming and computer science. + +It's all too easy to lose sight of the magic when having to implement them +for computer science courses or job interview questions. +Part of this is because many of the most interesting real-world details get left out, +and you miss all the value that comes from ongoing, direct practical application. + +Bidict shows how fundamental data structures +can be implemented in Python for important real-world usage, +with practical concerns at top of mind. +Come to catch sight of a real, live, industrial-strength linked list in the wild. +Stay for the exotic breeds of bidirectional mapping you'll rarely see at home. +[#fn-data-struct]_ + +.. [#fn-data-struct] To give you a taste: + + A regular :class:`~bidict.bidict` + encapsulates two regular dicts, + keeping them in sync to preserve the bidirectional mapping invariants. 
+ Since dicts are unordered, regular bidicts are unordered too. + How should we extend this to implement an ordered bidict? + + We'll still need two backing mappings to store the forward and inverse associations. + To store the ordering, we use a (circular, doubly-) linked list. + This allows us to e.g. delete an item in any position in O(1) time. + + Interestingly, the nodes of the linked list encode only the ordering of the items; + the nodes themselves contain no key or value data. + The two backing mappings associate the key and value data + with the nodes, providing the final pieces of the puzzle. + + Can we use dicts for the backing mappings, as we did for the unordered bidict? + It turns out that dicts aren't enough—the backing mappings must actually be + (unordered) bidicts themselves! + +Check out `_orderedbase.py `__ +to see this in action. -Property-based testing is amazing -================================= +Property-based testing is revolutionary +======================================= -Dramatically increase test coverage by -asserting that your properties hold for ~all valid inputs. -Don't just automatically run the testcases you happened to think of manually, -generate your testcases automatically -(and a whole lot more of the ones you'd never think of) too. +When your automated tests run, +are they only checking the test cases +you happened to hard-code into your test suite? +How do you know these test cases aren't missing +some important edge cases? + +With property-based testing, +you describe the types of test case inputs your functions accept, +along with the properties that should hold for all inputs. +Rather than having to think up your test case inputs manually +and hard-code them into your test suite, +they get generated for you dynamically, +in much greater quantity and edge case-exercising diversity +than you could come up with by hand. +This dramatically increases test coverage +and confidence that your code is correct. 
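The idea can be sketched with nothing but the standard library. The example below is a hand-rolled stand-in that uses ``random`` to generate inputs; it is not how Hypothesis works internally and it lacks Hypothesis's shrinking and edge-case heuristics. The ``invert`` function and the helper names are hypothetical, chosen only for this illustration.

```python
import random


def invert(mapping):
    """Hypothetical function under test: swap keys and values."""
    return {value: key for key, value in mapping.items()}


def random_one_to_one_mapping(rng, max_size=20):
    """Generate a dict with unique keys and unique values."""
    size = rng.randint(0, max_size)
    keys = rng.sample(range(-1000, 1000), size)
    values = rng.sample(range(-1000, 1000), size)
    return dict(zip(keys, values))


def check_properties(num_cases=500, seed=0):
    """Assert that the properties hold for many generated inputs."""
    rng = random.Random(seed)
    for _ in range(num_cases):
        mapping = random_one_to_one_mapping(rng)
        inverse = invert(mapping)
        # Property 1: inverting a one-to-one mapping loses no items.
        assert len(inverse) == len(mapping)
        # Property 2: inverting twice gives back the original mapping.
        assert invert(inverse) == mapping
        # Property 3: every value maps back to its key.
        assert all(inverse[val] == key for key, val in mapping.items())
    return num_cases


checked = check_properties()
```

The properties are stated once and checked against hundreds of generated mappings, including the empty one, rather than against a handful of hard-coded cases.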
Bidict never would have survived so many refactorings with so few bugs if it weren't for property-based testing, enabled by the amazing `Hypothesis `__ library. It's game-changing. -See `bidict's property-based tests -`__. +Check out `bidict's property-based tests +`__ +to see this in action. -Python's surprises, gotchas, and a mistake -========================================== +Python surprises, gotchas, regrets +================================== - See :ref:`addendum:nan as key`. @@ -110,10 +175,13 @@ Python's surprises, gotchas, and a mistake >>> len({od, od2, d}) 2 - According to Raymond Hettinger (the author of :class:`~collections.OrderedDict`), + According to Raymond Hettinger + (Python core developer responsible for much of Python's collections), this design was a mistake - (it violates the `Liskov substitution principle - `__), + (e.g. it violates the `Liskov substitution principle + `__ + and the `transitive property of equality + `__), but it's too late now to fix. Fortunately, it wasn't too late for bidict to learn from this. @@ -121,94 +189,6 @@ Python's surprises, gotchas, and a mistake and their separate :meth:`~bidict.FrozenOrderedBidict.equals_order_sensitive` method. -Python's data model -=================== - -- What happens when you implement a custom :meth:`~object.__eq__`? - e.g. What's the difference between ``a == b`` and ``b == a`` - when only ``a`` is an instance of your class? - See the great write-up in https://eev.ee/blog/2012/03/24/python-faq-equality/ - for the answer. - -- If an instance of your special mapping type - is being compared against a mapping of some foreign mapping type - that contains the same items, - should your ``__eq__()`` method return true? - - bidict says yes, again based on the `Liskov substitution principle - `__. - Only returning true when the types matched exactly would violate this. 
- And returning :obj:`NotImplemented` would cause Python to fall back on - using identity comparison, which is not what is being asked for. - - (Just for fun, suppose you did only return true when the types matched exactly, - and suppose your special mapping type were also hashable. - Would it be worth having your ``__hash__()`` method include your type - as well as your items? - The only point would be to reduce collisions when multiple instances of different - types contained the same items - and were going to be inserted into the same :class:`dict` or :class:`set`, - since they'd now be unequal but would hash to the same value otherwise.) - -- Making an immutable type hashable - (so it can be inserted into :class:`dict`\s and :class:`set`\s): - Must implement :meth:`~object.__hash__` such that - ``a == b ⇒ hash(a) == hash(b)``. - See the :meth:`object.__hash__` and :meth:`object.__eq__` docs, and - the `implementation `__ - of :class:`~bidict.frozenbidict`. - - - Consider :class:`~bidict.FrozenOrderedBidict`: - its :meth:`~bidict.FrozenOrderedBidict.__eq__` - is :ref:`order-insensitive `. - So all contained items must participate in the hash order-insensitively. - - - Can use `collections.abc.Set._hash `__ - which provides a pure Python implementation of the same hash algorithm - used to hash :class:`frozenset`\s. - (Since :class:`~collections.abc.ItemsView` extends - :class:`~collections.abc.Set`, - :meth:`bidict.frozenbidict.__hash__` - just calls ``ItemsView(self)._hash()``.) - - - Does this argue for making :meth:`collections.abc.Set._hash` non-private? - - - Why isn't the C implementation of this algorithm directly exposed in - CPython? The only way to use it is to call ``hash(frozenset(self.items()))``, - which wastes memory allocating the ephemeral frozenset, - and time copying all the items into it before they're hashed. - - - Unlike other attributes, if a class implements ``__hash__()``, - any subclasses of that class will not inherit it. 
- It's like Python implicitly adds ``__hash__ = None`` to the body - of every class that doesn't explicitly define ``__hash__``. - So if you do want a subclass to inherit a base class's ``__hash__()`` - implementation, you have to set that manually, - e.g. by adding ``__hash__ = BaseClass.__hash__`` in the class body. - See :class:`~bidict.FrozenOrderedBidict`. - - This is consistent with the fact that - :class:`object` implements ``__hash__()``, - but subclasses of :class:`object` - that override :meth:`~object.__eq__` - are not hashable by default. - -- Using :meth:`~object.__new__` to bypass default object initialization, - e.g. for better :meth:`~bidict.bidict.copy` performance. - See `_base.py `__. - -- Overriding :meth:`object.__getattribute__` for custom attribute lookup. - See :ref:`extending:Sorted Bidict Recipes`. - -- Using - :meth:`object.__getstate__`, - :meth:`object.__setstate__`, and - :meth:`object.__reduce__` to make an object pickleable - that otherwise wouldn't be, - due to e.g. using weakrefs, - as bidicts do (covered further below). - - Better memory usage through ``__slots__`` ========================================= @@ -290,11 +270,19 @@ of :func:`~bidict.namedbidict`. API Design ========== -How to deeply integrate with Python's :mod:`collections`? +How to deeply integrate with Python's :mod:`collections` and other built-in APIs? -- Thanks to :class:`~collections.abc.Hashable` +- Beyond implementing :class:`collections.abc.Mapping`, + bidicts implement additional APIs + that :class:`dict` and :class:`~collections.OrderedDict` implement + (e.g. :func:`setdefault`, :func:`popitem`, etc.). + + - When creating a new API, making it familiar, memorable, and intuitive + is hugely important to a good user experience. 
+ +- Thanks to :class:`~collections.abc.Hashable`'s implementing :meth:`abc.ABCMeta.__subclasshook__`, - any class that implements all the required methods of the + any class that implements the required methods of the :class:`~collections.abc.Hashable` interface (namely, :meth:`~collections.abc.Hashable.__hash__`) makes it a virtual subclass already, no need to explicitly extend. @@ -302,15 +290,8 @@ How to deeply integrate with Python's :mod:`collections`? ``issubclass(Foo, Hashable)`` will always be True, no need to explicitly subclass via ``class Foo(Hashable): ...`` -- :class:`collections.abc.Mapping` and - :class:`collections.abc.MutableMapping` - don't implement :meth:`~abc.ABCMeta.__subclasshook__`, - so must either explicitly subclass - (if you want to inherit any of their implementations) - or use :meth:`abc.ABCMeta.register` - (to register as a virtual subclass without inheriting any implementation) - -- How to make your own open ABC like :class:`~collections.abc.Hashable`? +- How to make your own open ABC like :class:`~collections.abc.Hashable`, + i.e. how does :class:`~bidict.BidirectionalMapping` work? - Override :meth:`~abc.ABCMeta.__subclasshook__` to check for the interface you require. @@ -324,18 +305,20 @@ How to deeply integrate with Python's :mod:`collections`? :class:`list` is a subclass of :class:`object`, but :class:`list` is not a subclass of :class:`~collections.abc.Hashable`. +- :class:`collections.abc.Mapping` and + :class:`collections.abc.MutableMapping` + don't implement :meth:`~abc.ABCMeta.__subclasshook__`, + so must either explicitly subclass + (if you want to inherit any of their implementations) + or use :meth:`abc.ABCMeta.register` + (to register as a virtual subclass without inheriting any implementation) + - Notice we have :class:`collections.abc.Reversible` but no ``collections.abc.Ordered`` or ``collections.abc.OrderedMapping``. Proposed in `bpo-28912 `__ but rejected. 
Would have been useful for bidict's ``__repr__()`` implementation (see ``_base.py``), and potentially for interop with other ordered mapping implementations - such as `SortedDict `__ - -- Beyond :class:`collections.abc.Mapping`, bidicts implement additional APIs - that :class:`dict` and :class:`~collections.OrderedDict` implement. - - - When creating a new API, making it familiar, memorable, and intuitive - is hugely important to a good user experience. + such as `SortedDict `__. How to make APIs Pythonic? @@ -359,6 +342,94 @@ How to make APIs Pythonic? ``.inverse``-based spellings. +Python's data model +=================== + +- What happens when you implement a custom :meth:`~object.__eq__`? + e.g. What's the difference between ``a == b`` and ``b == a`` + when only ``a`` is an instance of your class? + See the great write-up in https://eev.ee/blog/2012/03/24/python-faq-equality/ + for the answer. + +- If an instance of your special mapping type + is being compared against a mapping of some foreign mapping type + that contains the same items, + should your ``__eq__()`` method return true? + + Bidict says yes, again based on the `Liskov substitution principle + `__. + Only returning true when the types matched exactly would violate this. + And returning :obj:`NotImplemented` would cause Python to fall back on + using identity comparison, which is not what is being asked for. + + (Just for fun, suppose you did only return true when the types matched exactly, + and suppose your special mapping type were also hashable. + Would it be worth having your ``__hash__()`` method include your type + as well as your items? + The only point would be to reduce collisions when multiple instances of different + types contained the same items + and were going to be inserted into the same :class:`dict` or :class:`set`, + since they'd now be unequal but would hash to the same value otherwise.) 
+ +- Making an immutable type hashable + (so it can be inserted into :class:`dict`\s and :class:`set`\s): + Must implement :meth:`~object.__hash__` such that + ``a == b ⇒ hash(a) == hash(b)``. + See the :meth:`object.__hash__` and :meth:`object.__eq__` docs, and + the `implementation `__ + of :class:`~bidict.frozenbidict`. + + - Consider :class:`~bidict.FrozenOrderedBidict`: + its :meth:`~bidict.FrozenOrderedBidict.__eq__` + is :ref:`order-insensitive `. + So all contained items must participate in the hash order-insensitively. + + - Can use `collections.abc.Set._hash `__ + which provides a pure Python implementation of the same hash algorithm + used to hash :class:`frozenset`\s. + (Since :class:`~collections.abc.ItemsView` extends + :class:`~collections.abc.Set`, + :meth:`bidict.frozenbidict.__hash__` + just calls ``ItemsView(self)._hash()``.) + + - Does this argue for making :meth:`collections.abc.Set._hash` non-private? + + - Why isn't the C implementation of this algorithm directly exposed in + CPython? The only way to use it is to call ``hash(frozenset(self.items()))``, + which wastes memory allocating the ephemeral frozenset, + and time copying all the items into it before they're hashed. + + - Unlike other attributes, if a class implements ``__hash__()``, + any subclasses of that class that override ``__eq__()`` will not inherit it. + It's like Python implicitly adds ``__hash__ = None`` to the body + of every class that defines ``__eq__`` but doesn't explicitly define ``__hash__``. + So if you do want such a subclass to inherit a base class's ``__hash__()`` + implementation, you have to set that manually, + e.g. by adding ``__hash__ = BaseClass.__hash__`` in the class body. + See :class:`~bidict.FrozenOrderedBidict`. + + This is consistent with the fact that + :class:`object` implements ``__hash__()``, + but subclasses of :class:`object` + that override :meth:`~object.__eq__` + are not hashable by default. + +- Using :meth:`~object.__new__` to bypass default object initialization, + e.g.
for better :meth:`~bidict.bidict.copy` performance. + See `_base.py `__. + +- Overriding :meth:`object.__getattribute__` for custom attribute lookup. + See :ref:`extending:Sorted Bidict Recipes`. + +- Using + :meth:`object.__getstate__`, + :meth:`object.__setstate__`, and + :meth:`object.__reduce__` to make an object pickleable + that otherwise wouldn't be, + due to e.g. using weakrefs, + as bidicts do (covered further below). + + Portability =========== @@ -413,6 +484,6 @@ Other interesting stuff in the standard library Tools ===== -See :ref:`thanks:Projects` for some of the fantastic tools +See the :ref:`Thanks ` page for some of the fantastic tools for software verification, performance, code quality, etc. that bidict has provided an excuse to play with and learn.