The LRI implementation this replaces had a bug: after multiple inserts
of the same key, cache eviction could raise a KeyError when trying to
delete a key that had already been evicted from the cache.
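To make that failure mode concrete, here is a toy sketch (not the old
implementation itself, just the same class of design) where eviction
trusts every entry in the insertion-order queue:

```python
from collections import deque


class NaiveLRI(dict):
    """Toy sketch of the failure mode; NOT the actual replaced code."""

    def __init__(self, max_size=2):
        super().__init__()
        self.max_size = max_size
        self._order = deque()  # one entry per insert, duplicates included

    def __setitem__(self, key, value):
        self._order.append(key)
        if len(self) >= self.max_size:      # only the dict size is checked
            oldest = self._order.popleft()  # may be a stale duplicate
            del self[oldest]                # KeyError if already evicted
        super().__setitem__(key, value)


cache = NaiveLRI(max_size=2)
cache['a'] = 1
cache['a'] = 2   # duplicate insert leaves two 'a' entries in the queue
cache['b'] = 3
cache['c'] = 4   # evicts 'a' via the first queue entry
cache['d'] = 5   # pops the stale second 'a' -> del raises KeyError
```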
Furthermore, the original implementation had an issue where the memory
used by the cache was unbounded, since eviction only looked at the
number of keys in the dictionary, not at the size of the underlying
queue keeping track of key insertion order. While the size of that
data structure may not be a problem on all systems, the unbounded
queue could lead to poor performance in cases where many stale keys
have to be discarded before finding a key present in the current
dictionary.
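A similarly contrived illustration of the unbounded growth (again, not
the original code): overwriting one hot key never grows the
dictionary, but every insert still lands in the order-tracking queue.

```python
from collections import deque

cache = {}       # the actual key/value mapping
order = deque()  # stands in for the insertion-order queue

for i in range(100_000):
    order.append('hot_key')   # every insert is recorded...
    cache['hot_key'] = i      # ...but the dict never gets bigger

print(len(cache))   # 1
print(len(order))   # 100000 stale entries to wade through at eviction
```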
The problem encountered is similar to the one in Python's priority
queue reference implementation, where rebalancing the queue is
expensive, so instead you keep track of entries that have been
superseded by a later insertion and no-op on them when they are popped
(https://docs.python.org/3/library/heapq.html#priority-queue-implementation-notes).
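For comparison, a rough sketch of that lazy-invalidation idea applied
to an insertion-order cache (hypothetical; this is not the code in
this change, and deletion support is omitted for brevity): a queue
entry is only honored if it is the key's most recent insertion,
otherwise eviction no-ops and keeps looking.

```python
from collections import deque


class LazyLRI(dict):
    """Hypothetical sketch: no-op on stale queue entries at eviction."""

    def __init__(self, max_size=128):
        super().__init__()
        self.max_size = max_size
        self._order = deque()
        self._pending = {}  # live queue entries per key

    def __setitem__(self, key, value):
        self._order.append(key)
        self._pending[key] = self._pending.get(key, 0) + 1
        super().__setitem__(key, value)
        while len(self) > self.max_size:
            candidate = self._order.popleft()
            self._pending[candidate] -= 1
            if self._pending[candidate]:    # superseded by a later insert:
                continue                    # no-op and keep looking
            del self._pending[candidate]
            super().__delitem__(candidate)  # genuinely the oldest insert
```

The trade-off is the same as in the heapq notes: the queue can still
hold stale entries, but eviction skips them cheaply instead of paying
to rebalance or rewrite the queue on every insert.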
This new version benefits from the work done on LRU (so now we get
things like thread safety) but implements the same algorithm as the old
version of LRI. This does come at a cost: testing locally (Intel Core
i5-8350U CPU @ 1.70GHz × 8), I found that the cache takes about 10^-7
seconds longer to access on average (sample of 1 million accesses of a
cache of size 100). In addition, inserts are now 3 times as expensive
as in the old version of this code. Given the bugs in the old code,
this solution, with its slowdown, was deemed acceptable. I'm not sure
a faster solution can be implemented without sacrificing some
correctness, so I am personally satisfied with this.
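For anyone who wants to sanity-check those numbers on their own
hardware, a micro-benchmark along these lines reproduces the shape of
the measurement (a sketch only; it assumes the `boltons.cacheutils.LRI`
import path, and the old implementation would be timed the same way
for the comparison):

```python
import timeit

from boltons.cacheutils import LRI

# Sketch: numbers will vary by machine and Python version.
cache = LRI(max_size=100)
for i in range(100):
    cache[i] = i

# Average cost of a hit, over one million accesses of a size-100 cache.
reads = timeit.timeit('cache[50]', globals={'cache': cache},
                      number=1_000_000)
print('per-access: %.2e seconds' % (reads / 1_000_000))

# Average cost of an insert (overwriting keeps the cache at size 100).
writes = timeit.timeit('cache[50] = 1', globals={'cache': cache},
                       number=1_000_000)
print('per-insert: %.2e seconds' % (writes / 1_000_000))
```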
!NOTICE!
The new version of LRI is backward incompatible in the following ways:
- `__repr__` returns information about the cache in addition to the
values it stores. Previously it returned the `__repr__` of the
underlying `dict` (see the sketch after this list).
- The new version is thread safe
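A small illustration of the `__repr__` change (the comments describe
the behavior in general terms; the exact new format is whatever the
new class produces):

```python
from boltons.cacheutils import LRI

cache = LRI(max_size=128)
cache['a'] = 1

# Old behavior: repr(cache) was just the dict repr, e.g. "{'a': 1}".
# New behavior: repr(cache) also reports details about the cache
# itself (its configuration) alongside the stored values.
print(repr(cache))
```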