Description
The caching algorithm for re._compile()
is one of the hottest code paths in the stdlib. It has been rewritten many times (this is not a complete list of changes):
- Save OrderedDict import in re #76519
- Don't completely dump the regex cache when full #72480
- Faster bypass re cache when DEBUG is passed #66700
- re._compiled_typed's lru_cache causes significant degradation of the mako_v2 bench #60593
- Option to make the lru_cache type specific #57436
- d9e8cc6
- Standardise (and publish?) cache handling in standard library #53642
- 5a63183
The patch for using lru_cache()
was applied and reverted 3 times! Eventually it turned out to be slower than a simple dict lookup. lru_cache() does not fit this case very well, because some results should not be cached, and additional checks have to be made before calling the cached function, which adds significant overhead.
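A rough illustration of that overhead (hypothetical names, not the actual re module source; re.compile stands in for the uncached compilation step): every call has to pass pre-checks and an extra call layer before reaching the cache, even on hits.

```python
from functools import lru_cache
import re

@lru_cache(maxsize=512)
def _cached_compile(pattern, flags):
    return re.compile(pattern, flags)        # stand-in for the real compile step

def compile_with_checks(pattern, flags=0):
    if isinstance(pattern, re.Pattern):      # already compiled: must not be cached
        return pattern
    if flags & re.DEBUG:                     # DEBUG results must bypass the cache
        return re.compile(pattern, flags)
    return _cached_compile(pattern, flags)   # extra function call even on cache hits
```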
But the LRU caching algorithm can have an advantage over the current simple algorithm if not all compiled patterns fit in the cache: it removes entries from the cache more intelligently.
I tested with random keys drawn from an exponential distribution. With a cache size of 512, the largest difference is for lambda between 1/70 and 1/80: the LRU caching algorithm has about 3 times fewer misses, 0.16% to 0.33% versus 0.5% to 1.1% for the current algorithm. For significantly larger lambda (almost no misses) or significantly smaller lambda (tens of percent of misses), both algorithms give almost the same results.
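A rough sketch of the kind of simulation described above. The details are assumptions, not the methodology behind the quoted numbers: integer keys drawn with random.expovariate(), one million lookups, and the current algorithm modelled as a plain dict that drops its oldest inserted entry when full and never reorders on hits.

```python
import random
from collections import OrderedDict

CACHE_SIZE = 512

def miss_rate_lru(keys):
    cache, misses = OrderedDict(), 0
    for k in keys:
        if k in cache:
            cache.move_to_end(k)             # LRU: refresh recency on every hit
        else:
            misses += 1
            if len(cache) >= CACHE_SIZE:
                cache.popitem(last=False)    # evict the least recently used entry
            cache[k] = True
    return misses / len(keys)

def miss_rate_simple(keys):
    cache, misses = {}, 0
    for k in keys:
        if k not in cache:
            misses += 1
            if len(cache) >= CACHE_SIZE:
                del cache[next(iter(cache))] # drop the oldest inserted entry
            cache[k] = True
    return misses / len(keys)

random.seed(0)
lambd = 1 / 75                               # the region where LRU wins the most
keys = [int(random.expovariate(lambd)) for _ in range(1_000_000)]
print(f"LRU misses:    {miss_rate_lru(keys):.2%}")
print(f"simple misses: {miss_rate_simple(keys):.2%}")
```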
A direct implementation of the LRU caching algorithm using OrderedDict or even a plain dict would be slower, because every hit requires moving the found entry to the end. But we can use double caching: a primary, smaller and faster cache that does not reorder entries, and a secondary LRU cache that is consulted only when the key is not found in the primary cache. This algorithm has the same number of misses as the LRU caching algorithm, but its hits are faster.
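A minimal sketch of the double-caching idea. The names, sizes, and the eviction policy for the primary cache are illustrative assumptions, not the actual re module code; re.compile stands in for the real compilation step.

```python
import re

_MAXSECONDARY = 512   # secondary cache: full LRU
_MAXPRIMARY = 256     # primary cache: small, never reordered on hits

_primary = {}
_secondary = {}       # insertion order doubles as recency order (oldest first)

def cached_compile(pattern, flags=0):
    key = (pattern, flags)
    # Fast path: a single plain dict lookup, no reordering of any kind.
    try:
        return _primary[key]
    except KeyError:
        pass
    # Slow path: consult the LRU-ordered secondary cache.
    value = _secondary.pop(key, None)
    if value is None:
        value = re.compile(pattern, flags)          # real miss: compile
        if len(_secondary) >= _MAXSECONDARY:
            del _secondary[next(iter(_secondary))]  # evict least recently used
    _secondary[key] = value                         # (re)insert as most recent
    # Promote into the primary cache; drop its oldest entry when full
    # (assumption: the report does not specify the primary eviction policy).
    if len(_primary) >= _MAXPRIMARY:
        del _primary[next(iter(_primary))]
    _primary[key] = value
    return value

print(cached_compile(r"\d+").findall("a1 b22 c333"))
```

Because the primary cache is never reordered, a hit there costs only one dict lookup; the pop-and-reinsert work that makes LRU bookkeeping expensive happens only on the (rarer) primary misses.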