Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Emit a warning in case of possible memory leaks #428

Open
wants to merge 27 commits into
base: main
Choose a base branch
from

Conversation

antocuni
Copy link
Contributor

@antocuni antocuni commented Feb 3, 2022

Highlights of this PR:

  • kill AutoFreeBuffers and free_all_other_buffers: as discussed previously, they are broken and cannot be possibly fixed
  • introduce a new compiler pass which tries to detect suspicious cases of possible memory leaks
  • make it possible to silence the warnings on a per-function basis

The warning is triggered under very specific conditions, i.e. if inside a certain function:

  1. we call a buffer constructor (such as e.g. xp.Array()) AND
  2. we never call Array.free() or free_buffer(buf).

The idea is that by doing this way, the casual user will learn that they needs to manage memory manually as soon as they starts to use arrays, but then they are on their own.

If the function emits a spurious warning, it is possible to disable it by passing disable_leak_warnings=True to the @rjit decorator.

Possible follow-ups/improvements:

  1. improve the detection logic: for example, if the function creates only one array and always return it, we might want to disable the warning
  2. improve the error message with a link to the relevant docs, once we write them
  3. change/improve the name of the disable_leak_warnings flag (suggestions are welcome)

@antocuni antocuni added enhancement New feature or request WIP Work In Progress labels Feb 3, 2022
@antocuni antocuni changed the title WIP: Emit a warning in case of possible memory leaks Emit a warning in case of possible memory leaks Feb 4, 2022
@antocuni antocuni marked this pull request as ready for review February 4, 2022 16:50
Copy link
Contributor

@pearu pearu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, this looks good. Thanks, @antocuni!

I committed a change that includes some information about the function where a possible memory leak is detected. Feel free to adjust this change if needed.

I have a couple of concerns that should be addressed before merging the PR.

First, this PR disables automatic memory management of buffers and replaces it with a detection method of possible memory leaks. There are a number of tests that rely on automatic memory management and by disabling it, the PR actually may introduce memory leaks in cases where the automatic memory management worked. All these tests eject memory leak warnings and are therefore easy to locate.
Could you revise all the memory leak detection warnings reported in CI and updated the corresponding test functions to manage memory correctly?

Second, there exists false positive detection of possible memory leaks in cases where a function constructs a buffer object and returns it. For example, a false positive memory leak is detected in
https://github.com/xnd-project/rbc/blob/6d6ad969cb0de338f111d95695852c9a869432e7/rbc/tests/test_omnisci_text.py#L125-L133
Would it be possible to extend the memory detection method to the case where a buffer is constructed and no free is called then the return of the buffer will qualify as the existence of free call?

Comment on lines +99 to +121
class MissingFreeWarning(Warning):

_msg = """
Possible memory leak detected in function `{}` defined in {}#{}!

In RBC, arrays and buffers must be freed manually by calling the method
.free() or the function free_buffer(): see the relevant docs.
"""

def __init__(self, functionname, filename, firstlineno):
msg = self.make_message(functionname, filename, firstlineno)
Warning.__init__(self, msg)

@classmethod
def make_message(cls, functionname, filename, firstlineno):
return cls._msg.format(functionname, filename, firstlineno)


class MissingFreeError(Exception):

def __init__(self, functionname, filename, firstlineno):
msg = MissingFreeWarning.make_message(functionname, filename, firstlineno)
Exception.__init__(self, msg)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you move those to rbc/errors.py?

@antocuni antocuni force-pushed the antocuni/detect-missing-free branch 2 times, most recently from 689a87d to f45b032 Compare February 8, 2022 16:11
@antocuni antocuni changed the title Emit a warning in case of possible memory leaks WIP: Emit a warning in case of possible memory leaks Feb 9, 2022
@antocuni
Copy link
Contributor Author

antocuni commented Feb 9, 2022

I committed a change that includes some information about the function where a possible memory leak is detected. Feel free to adjust this change if needed.

brilliant idea, thank you!

I have a couple of concerns that should be addressed before merging the PR.

First, this PR disables automatic memory management of buffers and replaces it with a detection method of possible memory leaks. There are a number of tests that rely on automatic memory management and by disabling it, the PR actually may introduce memory leaks in cases where the automatic memory management worked. All these tests eject memory leak warnings and are therefore easy to locate. Could you revise all the memory leak detection warnings reported in CI and updated the corresponding test functions to manage memory correctly?

good point. To address it, I slightly changed the behavior: I removed the option disable_leak_warnings but I introduce a new option on_missing_free which can be one the following values:

  • 'warn': emit a warning in case rbc detects a missing free. This is the default as it is the most useful for the casual rbc user.
  • 'fail': raise an exception instead of a warning. This is the default inside rbc own tests, because we want to be sure not to introduce memory leaks by mistake
  • 'ignore': as the name implies, just disable the detection of missing free. This is useful if you know that your code is correct but the rbc static checker think otherwise. Currently I had to use it on all the functions which construct-and-return a new array.

Second, there exists false positive detection of possible memory leaks in cases where a function constructs a buffer object and returns it. For example, a false positive memory leak is detected in

https://github.com/xnd-project/rbc/blob/6d6ad969cb0de338f111d95695852c9a869432e7/rbc/tests/test_omnisci_text.py#L125-L133

Would it be possible to extend the memory detection method to the case where a buffer is constructed and no free is called then the return of the buffer will qualify as the existence of free call?

yes, I think it's possible and also a good idea but it's not completely straightforward. For example, consider this artificial case:

def foo(flag):
    a = Array(10)
    if flag:
        a = Array(20)
    return a

do we want RBC to emit a warning or not?
Defining a sane logic and implement is not straightforward, so I though it is better to start with a simpler PR which is useful on its own, and possibly improve it later.

@antocuni
Copy link
Contributor Author

antocuni commented Feb 9, 2022

Status update
TL;DR: the CI tests continue to crash and I don't know why.

The tests keep failing as soon as they run rbc/tests/test_omnisci_array_operators.py::test_array_operators[abs]. This is what I found so far:

  1. the test reliably passes on my machine
  2. the test reliably fails on CI. It seems it causes a segfault in the omnisci server with ALL combinations of omnisci/numba/python
  3. the only difference between master and this PR is the call to a.free(). If I comment out the call, the CI tests are green again

Some info on my machine:

  • python 3.9.7
  • libllvml11 11.1.0
  • llvmlite 0.37.0
  • numba 0.54.1
  • OmniSci Version: 5.9.0dev-20211020-e1ebffa825

I temporarily pushed commit 82e54bd which keeps only test_abs and dumps the llvm IR:

the two dumps look a bit differently, in particular the second seems to do more in the entry: block although I don't know why. But apart for that, they look correct to me. In particular, I see the following in both dumps:

  %.5.i = tail call i8* @allocate_varlen_buffer(i64 %.2, i64 4), !noalias !1
[...]
  tail call void @free(i8* %.5.i) #0, !noalias !28

i.e., the buffer is allocated by calling allocate_varlen_buffer, which indirectly calls malloc, and freed by calling the libc free(), which looks correct.

So, at the moment I have no clue about what's wrong.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request WIP Work In Progress
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants