mirror of https://github.com/python/cpython.git
Move comment about permanent generation to gcmodule.c (GH-17718)
A comment in gcmodule.c referred readers to the description of the collection rules for the permanent generation "at the beginning of this file", but that description had long since been moved into a header file, so the reference was broken. Moving the description next to the relevant code improves readability and removes the broken reference.
parent 91874bb071
commit 90913985b6
@@ -16,42 +16,6 @@ extern "C" {
 /* If we change this, we need to change the default value in the
    signature of gc.collect. */
 #define NUM_GENERATIONS 3
-
-/*
-   NOTE: about the counting of long-lived objects.
-
-   To limit the cost of garbage collection, there are two strategies;
-     - make each collection faster, e.g. by scanning fewer objects
-     - do less collections
-   This heuristic is about the latter strategy.
-
-   In addition to the various configurable thresholds, we only trigger a
-   full collection if the ratio
-    long_lived_pending / long_lived_total
-   is above a given value (hardwired to 25%).
-
-   The reason is that, while "non-full" collections (i.e., collections of
-   the young and middle generations) will always examine roughly the same
-   number of objects -- determined by the aforementioned thresholds --,
-   the cost of a full collection is proportional to the total number of
-   long-lived objects, which is virtually unbounded.
-
-   Indeed, it has been remarked that doing a full collection every
-   <constant number> of object creations entails a dramatic performance
-   degradation in workloads which consist in creating and storing lots of
-   long-lived objects (e.g. building a large list of GC-tracked objects would
-   show quadratic performance, instead of linear as expected: see issue #4074).
-
-   Using the above ratio, instead, yields amortized linear performance in
-   the total number of objects (the effect of which can be summarized
-   thusly: "each full garbage collection is more and more costly as the
-   number of objects grows, but we do fewer and fewer of them").
-
-   This heuristic was suggested by Martin von Löwis on python-dev in
-   June 2008. His original analysis and proposal can be found at:
-   http://mail.python.org/pipermail/python-dev/2008-June/080579.html
-*/
-
 /*
    NOTE: about untracking of mutable objects.
 
@@ -1381,8 +1381,40 @@ collect_generations(PyThreadState *tstate)
     for (int i = NUM_GENERATIONS-1; i >= 0; i--) {
         if (gcstate->generations[i].count > gcstate->generations[i].threshold) {
             /* Avoid quadratic performance degradation in number
-               of tracked objects. See comments at the beginning
-               of this file, and issue #4074.
+               of tracked objects (see also issue #4074):
+
+               To limit the cost of garbage collection, there are two strategies;
+                 - make each collection faster, e.g. by scanning fewer objects
+                 - do less collections
+               This heuristic is about the latter strategy.
+
+               In addition to the various configurable thresholds, we only trigger a
+               full collection if the ratio
+
+                long_lived_pending / long_lived_total
+
+               is above a given value (hardwired to 25%).
+
+               The reason is that, while "non-full" collections (i.e., collections of
+               the young and middle generations) will always examine roughly the same
+               number of objects -- determined by the aforementioned thresholds --,
+               the cost of a full collection is proportional to the total number of
+               long-lived objects, which is virtually unbounded.
+
+               Indeed, it has been remarked that doing a full collection every
+               <constant number> of object creations entails a dramatic performance
+               degradation in workloads which consist in creating and storing lots of
+               long-lived objects (e.g. building a large list of GC-tracked objects would
+               show quadratic performance, instead of linear as expected: see issue #4074).
+
+               Using the above ratio, instead, yields amortized linear performance in
+               the total number of objects (the effect of which can be summarized
+               thusly: "each full garbage collection is more and more costly as the
+               number of objects grows, but we do fewer and fewer of them").
+
+               This heuristic was suggested by Martin von Löwis on python-dev in
+               June 2008. His original analysis and proposal can be found at:
+               http://mail.python.org/pipermail/python-dev/2008-June/080579.html
             */
             if (i == NUM_GENERATIONS - 1
                 && gcstate->long_lived_pending < gcstate->long_lived_total / 4)
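The effect of the 25% guard described in the moved comment can be illustrated with a small cost model. This is only a sketch under simplifying assumptions, not CPython's collector: objects are created one at a time, every object survives, and a full collection is charged one visit per long-lived object. The function names and the `interval` parameter are hypothetical; `pending` and `total` merely stand in for `long_lived_pending` and `long_lived_total`.

```c
#include <assert.h>

/* Toy model (an assumption for illustration, not CPython's implementation):
   `pending` counts objects added since the last full collection and `total`
   counts objects that survived it. A full collection visits every long-lived
   object, so it is charged the updated `total`. */

/* Policy A: run a full collection every `interval` object creations. */
unsigned long long
full_scan_cost_fixed(unsigned long long n, unsigned long long interval)
{
    assert(interval > 0);
    unsigned long long total = 0, pending = 0, cost = 0;
    for (unsigned long long i = 0; i < n; i++) {
        pending++;
        if (pending >= interval) {
            total += pending;   /* pending objects join the long-lived set */
            pending = 0;
            cost += total;      /* the full collection visits all of them */
        }
    }
    return cost;
}

/* Policy B: same trigger, but skip the full collection while
   pending < total / 4, mirroring the guard shown in the diff. */
unsigned long long
full_scan_cost_ratio(unsigned long long n, unsigned long long interval)
{
    assert(interval > 0);
    unsigned long long total = 0, pending = 0, cost = 0;
    for (unsigned long long i = 0; i < n; i++) {
        pending++;
        if (pending >= interval) {
            if (pending < total / 4) {
                continue;       /* too few new long-lived objects: skip */
            }
            total += pending;
            pending = 0;
            cost += total;
        }
    }
    return cost;
}
```

In this model, with n = 100000 and interval = 100, the fixed policy is charged 50,050,000 visits, which grows quadratically in n. Under the ratio policy each full collection visits at most about five times the number of objects that were pending, so the total charge stays within a small constant multiple of n.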