cpython/Parser
Gregory P. Smith cec1e9dfd7
[3.9] gh-95778: CVE-2020-10735: Prevent DoS by very large int() (#96502)
* Correctly pre-check for int-to-str conversion (#96537)

Converting a large enough `int` to a decimal string raises `ValueError` as expected. However, the raise comes _after_ the quadratic-time base-conversion algorithm has run to completion. For effective DoS prevention, we need some kind of check before entering the quadratic-time loop. Oops! =)

The quick fix: essentially we catch _most_ values that exceed the threshold up front. Those that slip through will still be on the small side (read: sufficiently fast), and will get caught by the existing check so that the limit remains exact.

Here is the justification for the current check. The check in the C code is:
```c
max_str_digits / (3 * PyLong_SHIFT) <= (size_a - 11) / 10
```

In GitHub markdown math-speak, writing $M$ for `max_str_digits`, $L$ for `PyLong_SHIFT` and $s$ for `size_a`, that check is:
$$\left\lfloor\frac{M}{3L}\right\rfloor \le \left\lfloor\frac{s - 11}{10}\right\rfloor$$

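Since $x < \lfloor x \rfloor + 1$ and $\lfloor y \rfloor \le y$ for all real $x$ and $y$, it follows that
$$\frac{M}{3L} < \left\lfloor\frac{M}{3L}\right\rfloor + 1 \le \left\lfloor\frac{s - 11}{10}\right\rfloor + 1 \le \frac{s-11}{10} + 1 = \frac{s-1}{10}$$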
hence that
$$\frac{L(s-1)}{M} > \frac{10}{3} > \log_2(10).$$
So
$$2^{L(s-1)} > 10^M.$$
But our input integer $a$ satisfies $|a| \ge 2^{L(s-1)}$, so $|a|$ is larger than $10^M$. This shows that we don't accidentally capture anything _below_ the intended limit in the check.
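
As a sanity check, the argument above can be replayed numerically. The following is only a sketch, not the CPython implementation: the helper names are hypothetical, `PYLONG_SHIFT = 30` is an assumption (the value depends on the build), and the check is written with floor division as in the formula above, so it is only meant for `size_a >= 11`.

```python
# Sketch only: hypothetical helpers mirroring the pre-check described above.
PYLONG_SHIFT = 30  # assumption: 30-bit internal digits; some builds use 15


def precheck_trips(max_str_digits, size_a, shift=PYLONG_SHIFT):
    """Floor form of the check: floor(M / 3L) <= floor((s - 11) / 10)."""
    return max_str_digits // (3 * shift) <= (size_a - 11) // 10


def smallest_tripping_size(max_str_digits, shift=PYLONG_SHIFT):
    """Smallest digit count s >= 11 for which the pre-check fires."""
    size_a = 11
    while not precheck_trips(max_str_digits, size_a, shift):
        size_a += 1
    return size_a


def lower_bound_exceeds_limit(max_str_digits, size_a, shift=PYLONG_SHIFT):
    """True if 2**(L*(s-1)), a lower bound on |a|, is larger than 10**M."""
    return 2 ** (shift * (size_a - 1)) > 10 ** max_str_digits


if __name__ == "__main__":
    for limit in (640, 4300, 100_000):
        s = smallest_tripping_size(limit)
        print(limit, s, lower_bound_exceeds_limit(limit, s))
```

Only integer comparisons are used here, so the sketch never performs an int-to-str conversion of the large values itself. Because the lower bound grows with `size_a`, checking the smallest tripping size for a given limit covers every larger size as well.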

<!-- gh-issue-number: gh-95778 -->
* Issue: gh-95778
<!-- /gh-issue-number -->

Co-authored-by: Gregory P. Smith [Google LLC] <greg@krypto.org>
Co-authored-by: Christian Heimes <christian@python.org>
Co-authored-by: Mark Dickinson <dickinsm@gmail.com>
2022-09-05 11:21:03 +02:00
..
pegen [3.9] gh-95778: CVE-2020-10735: Prevent DoS by very large int() (#96502) 2022-09-05 11:21:03 +02:00
pgen Fix typo in the parser generator (GH-18603) 2020-03-09 02:58:24 +00:00
Python.asdl bpo-40528: Improve and clear several aspects of the ASDL definition code for the AST (GH-19952) 2020-05-06 15:29:32 +01:00
acceler.c
asdl.py bpo-40528: Improve and clear several aspects of the ASDL definition code for the AST (GH-19952) 2020-05-06 15:29:32 +01:00
asdl_c.py [3.9] bpo-11105: Do not crash when compiling recursive ASTs (GH-20594) (GH-26522) 2021-06-03 22:22:34 +01:00
grammar1.c bpo-39882: Add _Py_FatalErrorFormat() function (GH-19157) 2020-03-25 19:27:36 +01:00
listnode.c bpo-40268: Remove a few pycore_pystate.h includes (GH-19510) 2020-04-14 17:52:15 +02:00
myreadline.c bpo-38156: Fix compiler warning in PyOS_StdioReadline() (GH-21721) 2020-08-03 17:56:54 -07:00
node.c bpo-40502: Initialize n->n_col_offset (GH-19988) 2020-05-08 17:58:28 -03:00
parser.c bpo-39882: Py_FatalError() logs the function name (GH-18819) 2020-03-07 00:54:20 +01:00
parser.h
parsetok.c bpo-40335: Correctly handle multi-line strings in tokenize error scenarios (GH-19619) 2020-04-21 01:53:04 +01:00
token.c
tokenizer.c bpo-45738: Fix computation of error location for invalid continuation characters in the parser (GH-29550) (GH-29552) 2021-11-14 01:47:27 +00:00
tokenizer.h closes bpo-39721: Fix constness of members of tok_state struct. (GH-18600) 2020-02-27 18:44:52 -08:00