-
Notifications
You must be signed in to change notification settings - Fork 280
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PERF: Use grid tree for counting grid cell inclusion #4529
base: main
Are you sure you want to change the base?
Conversation
Looks like some real failures, which I can probably replicate here and work on! |
Current status is that there's a difference in the slice methods for the grid tree and the grid objects, which leads to a mismatch of roughly 1/2 of the first dimension in one of the slices. Working on tracking it down. |
I'm working on what I hope is among the last set of issues, and it's pointing to some uninitialized memory somewhere. |
This passes locally, but answer tests may show other issues. |
To be honest, I am a little stunned that we are seeing >1k failures on They all follow this rough backtrace, from what I can tell:
|
me too. I suspect this could be indicative of a problem with the answer test framework rather than an actual issue with your refactor, but that's just a wild (and vague) guess |
Actually, I figured it out. It's only relevant in situations where grids can have multiple parents, but there it occurs. I'm going to address that. |
I have to handle something else for a bit, but I just pushed a change that disabled AMRVAC using the Grid Tree and that should enable AMR types with multiple parents for a child to work. I had to do some git stashing so it's possible I missed something, but I will return to this ASAP. I'm going to open an issue about AMRVAC, because I think we could potentially have it load the non-leaf node AMR data as well, which would allow it to use the grid tree. |
Interestingly, passing an array as a parameter defined in a Cython argument list as |
Enzo-E has negative-level grids, which confused how we were computing the number of "root" grids. |
I'm still attempting to get this working; I think I'm running into some fencepost errors. I am glad the tests are catching it quickly. |
@yt-fido test this please |
That's an interesting one:
|
@matthewturk that's a flaky build error we've been seeing since Jenkins was migrated to Python 3.10 . IRCC the core problem is with parallel builds, but it seems arch-specific |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I got started reviewing but stopped when I stumbled upon a comment that said the code didn't work. Waiting for further instructions :)
if not hasattr(self.Parent, "dds"): | ||
self.Parent._setup_dx() | ||
self.dds = self.Parent.dds.d / self.ds.refine_by | ||
parents = list(always_iterable(self.Parent, base_type=(AMRGridPatch,))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is the base_type
argument necessary here ? or in other words, is AMRGridPatch
iterable ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is indeed necessary, because AMRGridPatch
exposes an iteration-like interface. This is a long-time issue, unfortunately.
for k, v in sp.items(): | ||
if getattr(v, "size", None) == 1: | ||
sp[k] = v.item(0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see a couple problems here
- modifying a dict while iterating over it is unsafe and may raise warnings/errors (use a copy)
- using the existence of a
size
attribute to detect array-like objects seems a bit fragile. Could this be done withisinstance
instead ? If not, I would suggest using a EAFP pattern instead of LBYL (basically, switch to try/except) - the comment also doesn't explain why this conversion is needed. It'd be more helpful if it did.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the reason I'm using size
rather than a base class is to avoid h5py as an instance; we're actually checking here if it's an HDF5 attribute array, rather than a numpy array. I'm not sure that I agree with the suggested change -- particularly because we really do want this specific check, less so than whether or not it can be accessed like a dict.
yt/frontends/stream/definitions.py
Outdated
@@ -47,20 +47,21 @@ def assign_particle_data(ds, pdata, bbox): | |||
if len(ds.stream_handler.fields) > 1: | |||
pdata.pop("number_of_particles", None) | |||
num_grids = len(ds.stream_handler.fields) | |||
parent_ids = ds.stream_handler.parent_ids | |||
# We have to convert these to a list of lists |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment seems to relate to line 58 ? If so, it should be moved closer to it.
I would also like to know the why here.
yt/geometry/grid_container.pyx
Outdated
@@ -55,12 +55,23 @@ cdef class GridTree: | |||
np.ndarray[np.float64_t, ndim=2] left_edge, | |||
np.ndarray[np.float64_t, ndim=2] right_edge, | |||
np.ndarray[np.int32_t, ndim=2] dimensions, | |||
np.ndarray[np.int64_t, ndim=1] parent_ind, | |||
list parent_ind, # Not an array anymore! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment will probably not age well. Also, I think Cython 3 supports more expressive uses of Python type annotations like list[list[float]]
, might be worth a try
list parent_ind, # Not an array anymore! | |
list parent_ind, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't know that -- I'll give that a go.
Co-authored-by: Clément Robert <[email protected]>
I'm trying to let the tests go so we have a current version that includes #4525. |
I think I've addressed all the comments. Unfortunately I still need to wait for the tests to pass or fail to make any more efforts to make it working :) |
Tests all pass! |
Instead of using the grid objects, by default let's use the grid tree to count the number of cells included in a data object.
This is an independent PR, and may have test failures that can be corrected within this PR. It's independent of #4525, but related.