Discussion:
pytest segfault, not with -v
Marco Sulla
2021-11-19 23:44:56 UTC
I have a battery of tests done with pytest. My tests break with a
segfault if I run them normally. If I run them using pytest -v, the
segfault does not happen.
What could cause this quantum-like phenomenon?
Are you testing an extension that you're compiling? That kind of problem
can occur if there's an uninitialised variable or incorrect reference
counting (Py_INCREF/Py_DECREF).
Ok, I know. But why can't it be reproduced if I do pytest -v? This way
I don't know which test fails.
Furthermore, I noticed that if I remove the __pycache__ dir of the tests,
pytest does not crash, until I re-run it with the __pycache__ dir
present.
This makes it very hard for me to understand what caused the segfault.
I'm starting to think pytest is not good for testing C extensions.
MRAB
2021-11-20 00:44:26 UTC
If there are too few Py_INCREF or too many Py_DECREF, it'll free the
object too soon, and whether or when that will cause a segfault will
depend on whatever other code is running. That's the nature of the
beast: it's unpredictable!

You could try running each of the tests in a loop to see which one
causes a segfault. (Trying several in a loop will let you narrow it down
more quickly.)
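The loop MRAB suggests can be sketched with the stdlib subprocess module; on POSIX a negative returncode means the child process was killed by a signal, so a crashing test file stands out immediately. The test paths in the commented usage are placeholders, not files from this thread:

```python
import subprocess
import sys

def exit_status(args):
    """Run a command in a child process; on POSIX a negative
    returncode means the child was killed by a signal
    (-11 would be SIGSEGV on Linux)."""
    return subprocess.run(args, capture_output=True).returncode

# Hypothetical usage: run each test file in its own process so one
# crash does not abort the whole run.
# for path in ["test/test_a.py", "test/test_b.py"]:
#     rc = exit_status([sys.executable, "-m", "pytest", path])
#     if rc < 0:
#         print(f"{path} crashed with signal {-rc}")

# Demonstration with a child that exits normally:
print(exit_status([sys.executable, "-c", "pass"]))  # → 0
```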

pytest et al. are good for testing behaviour, but not for narrowing down
segfaults.
Marco Sulla
2021-11-20 18:07:32 UTC
I know how to check the refcounts, but I don't know how to check the
memory usage, since it's not a program, it's a simple library. Is
there no way to check the memory usage from inside Python? Do I have
to use a bash script (I'm on Linux)?
Indeed I have introduced a command line parameter in my bench.py
script that simply specifies the number of times the benchmarks are
performed. This way I have a sort of segfault checker.
But I don't bench any part of the library. I suppose I have to create
a separate script that does a simple loop for all the cases, and
remove the optional parameter from bench. How boring.
PS: is there a way to monitor the memory consumed by Python from
inside Python itself? That way I could also trap memory leaks.
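On the question of watching memory from inside Python: the stdlib tracemalloc module reports how much memory has been allocated through Python's allocators. It will not see raw malloc calls made inside a C extension, but it does catch anything routed through the PyMem/PyObject API. A minimal sketch:

```python
import tracemalloc

tracemalloc.start()
payload = [bytes(1000) for _ in range(1000)]  # allocate roughly 1 MB
current, peak = tracemalloc.get_traced_memory()
print(current > 900_000)  # → True
tracemalloc.stop()
```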
I'm on Windows 10, so I debug in Microsoft Visual Studio. I also have a
look at the memory usage in Task Manager. If the program uses more
memory when there are more iterations, then that's a sign of a memory
leak. For some objects I'd look at the reference count to see if it's
increasing or decreasing for each iteration when it should be constant
over time.
--
https://mail.python.org/mailman/listinfo/python-list
Dan Stromberg
2021-11-20 18:59:55 UTC
Post by Marco Sulla
I know how to check the refcounts, but I don't know how to check the
memory usage, since it's not a program, it's a simple library. Is
there not a way to check inside Python the memory usage? I have to use
a bash script (I'm on Linux)?
ps auxww
...can show you how much memory is in use for the entire process.

It's commonly combined with grep, like:
ps auxww | head -1
ps auxww | grep my-program-name

Have a look at the %MEM, VSZ and RSS columns.

But being out of memory doesn't necessarily lead to a segfault - it can
(e.g. if a malloc failed, and some C programmer neglected to do decent
error checking), but an OOM kill is more likely.
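The same numbers ps reports can also be read from inside the process on POSIX systems with the stdlib resource module (ru_maxrss is the peak resident set size; its unit is kilobytes on Linux but bytes on macOS):

```python
import resource

# Query resource usage for the current process.
usage = resource.getrusage(resource.RUSAGE_SELF)
print(usage.ru_maxrss > 0)  # → True
```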
Dan Stromberg
2021-11-20 19:22:06 UTC
The ps commands above can be used to detect a leak in the _process_.

Once it's been established (if it's established) that the process is
getting oversized, you can sometimes see where the memory is going with:
https://www.fugue.co/blog/diagnosing-and-fixing-memory-leaks-in-python.html

But again, a memory leak isn't necessarily going to lead to a segfault.
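One cheap in-process check along these lines: count live instances of a suspect type with the gc module before and after a batch of iterations; a count that grows without bound is a leak. Widget here is a hypothetical stand-in for whatever object type you suspect:

```python
import gc

class Widget:
    """Hypothetical stand-in for an object type suspected of leaking."""

def live_widgets():
    # gc.get_objects() returns every object the collector tracks.
    return sum(1 for o in gc.get_objects() if isinstance(o, Widget))

before = live_widgets()
kept = [Widget() for _ in range(10)]   # simulate 10 retained instances
print(live_widgets() - before)         # → 10
kept.clear()                           # drop the references; CPython frees them
print(live_widgets() - before)         # → 0
```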
Dieter Maurer
2021-11-21 06:14:13 UTC
Post by Marco Sulla
I know how to check the refcounts, but I don't know how to check the
memory usage, since it's not a program, it's a simple library. Is
there not a way to check inside Python the memory usage? I have to use
a bash script (I'm on Linux)?
If Python was compiled appropriately (with "PYMALLOC_DEBUG"), `sys` contains
the function `_debugmallocstats`, which prints details
about Python's memory allocation and free lists.

I was not able to compile Python 2.7 in this way. But the (system) Python 3.6
of Ubuntu was compiled appropriately.
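A small check on this: in modern CPython, sys._debugmallocstats() is present in ordinary release builds as well (it is a private, CPython-only API); what PYMALLOC_DEBUG adds is the extra allocator debug hooks, which since Python 3.6 can also be switched on at startup with the PYTHONMALLOC=debug environment variable, no recompile needed. The stats dump goes to the C-level stderr:

```python
import sys

print(hasattr(sys, "_debugmallocstats"))  # → True
# Prints arena/pool/free-list statistics directly to stderr:
sys._debugmallocstats()
```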


Note that memory leaks usually do not cause segfaults (unless the application
runs out of memory due to the leak).

Your observation shows (apparently) non-deterministic behavior. In those cases,
minor differences (e.g. with/without "-v") can significantly change
the behavior (e.g. segfault or not). Memory management bugs (releasing memory
still in use) are a primary cause for this kind of behavior in Python
applications.
Barry
2021-11-21 13:30:30 UTC
I would run the whole set of tests under gdb and wait for the segv to happen.
You may find that an isolated test will pass. Sometimes it is a sequence of
tests that leads to the segv.
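A lighter-weight complement to gdb that stays inside Python is the stdlib faulthandler module: once enabled, it dumps the Python-level tracebacks of all threads when the process receives SIGSEGV, SIGFPE, SIGABRT or SIGBUS, which often identifies the failing test directly (recent pytest versions enable a faulthandler plugin by default). By hand:

```python
import faulthandler

faulthandler.enable()             # dump Python tracebacks on fatal signals
print(faulthandler.is_enabled())  # → True
# Equivalent from the command line:  python -X faulthandler -m pytest
```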

Barry
Marco Sulla
2021-12-18 13:10:53 UTC
Ok, I created the script:

https://github.com/Marco-Sulla/python-frozendict/blob/master/test/debug.py

The problem is that it does _not_ crash, while I get a segfault using
pytest with Python 3.9 on macOS 10.15.

Maybe it's because I'm using eval / exec in the script?
Marco Sulla
2021-12-18 20:01:34 UTC
Ehm, maybe I was not clear. I created a C extension and it segfaults.
So I created that script to see where it segfaults. But the script
does not segfault. My doubt is: is that because I'm using eval and
exec in the script?
Post by Marco Sulla
https://github.com/Marco-Sulla/python-frozendict/blob/master/test/debug.py
The problem is that it does _not_ crash, while I get a segfault using
pytest with Python 3.9 on macOS 10.15.
Maybe it's because I'm using eval / exec in the script?
Segfaults can result from C stack overflow which in turn can
be caused in special cases by too deeply nested function calls
(usually, Python's "maximal recursion depth exceeded" prevents
this before a C stack overflow).
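That guard is easy to see from Python: a RecursionError is raised long before the C stack is exhausted (though a C extension that recurses on the C stack itself bypasses this protection):

```python
import sys

def dive(n=0):
    return dive(n + 1)  # unbounded Python-level recursion

try:
    dive()
except RecursionError:
    print("RecursionError raised")  # → RecursionError raised

print(sys.getrecursionlimit())  # the default limit is 1000 in CPython
```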
Otherwise, whatever you do in Python (this includes "eval/exec")
should not cause a segfault. The cause for it likely comes from
a memory management bug in some C implemented part of your
application.
Note that memory management bugs may not show deterministic
behavior. Minor changes (such as "with/without -v")
can significantly change the outcome.
Dieter Maurer
2021-12-19 18:12:54 UTC
Post by Marco Sulla
Emh, maybe I was not clear. I created a C extension and it segfaults.
So I created that script to see where it segfaults. But the script
does not segfault. My doubt is: is that because I'm using eval and
exec in the script?
The segfault in your C extension is likely caused by a memory management
error. The effects of such errors are typically non-local and
apparently non-deterministic: small things can decide whether
you see or do not see such an effect.

Use other tools (than a script) to hunt memory management errors.

Python has a compile-time option which can help (PYMALLOC_DEBUG, the
option mentioned earlier in this thread): it puts marks before and
after the memory areas used for objects allocated via Python's API and
checks that those marks remain intact. There is some chance that the
effects of memory management errors are detected earlier by those
checks and are therefore easier to analyse.
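Since Python 3.6 these debug hooks can also be enabled on a normal release build by setting PYTHONMALLOC=debug in the environment before startup, which avoids recompiling. A sketch of launching a run that way (the pytest invocation is illustrative):

```python
import os
import subprocess
import sys

# Enable CPython's allocator debug hooks in a child interpreter; they
# surround every Python-API allocation with guard bytes and fill freed
# blocks with dead-byte patterns, so over/underruns abort early.
env = dict(os.environ, PYTHONMALLOC="debug")

# Illustrative: run the test suite under the hooks.
# subprocess.run([sys.executable, "-m", "pytest"], env=env)

proc = subprocess.run(
    [sys.executable, "-c", "print('hooks on')"],
    env=env, capture_output=True, text=True,
)
print(proc.stdout.strip())  # → hooks on
```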

There are specialized tools for the analysis of memory management
errors, e.g. `valgrind/memcheck`. Use one of those for complex problems.


Python's memory management rules (to be observed by C extensions)
are complex. It is quite easy to violate them.

For this reason, I try not to write C extensions for Python directly,
but let them be generated by `cython`.
The source of a `cython` program resembles Python source code
with extensions. The extensions control the compilation to C,
mostly for optimization purposes. For example, there is a
declaration to mark a variable as a C (rather than Python) variable.
When `cython` compiles the program to "C", it observes
all the complex rules of the Python C API.
