Offer a "low memory mode" and multiprocessing of array chunks #79
Comments
This would be a huge improvement for the scalability of Omniscape. I feel like solution 1 would be infinitely scalable, which isn't the case for solution 2, no?
Yeah, on a second read after leaving this for a while, I'm preferring option 1 above... I think it would be faster and also easy to test for correctness.
But thinking more about it, option 2, if implemented well, could be better suited for HPC, where each thread uses very little memory. Ultimately, this might be the best solution.
Yeah, I think option 1 has the opportunity to be much faster, though (lazy reads/writes, plus the associated locks/unlocks needed for thread safety across that many operations, could involve massive overhead). Also, if in option 2 there were N output GeoTiffs written to separately for thread safety, where N is the number of threads, then each of those N tiles would be the full size of the landscape, which could introduce additional memory bottlenecks when adding them together.
I tested option 1 by first splitting the input resistance map into overlapping tiles with Bash+GDAL, then running Omniscape on each tile, and finally remosaicking the tiles in Python using rasterio.merge(method="max"). It works really well, and you don't see any stitch marks in the final output. I looked at the Omniscape code, and I see it would require some significant refactoring to do this within Omniscape. It also seems quite difficult to do the final merge in GDAL using only the max pixel value. I don't know if there is an equivalent merge function in Julia.
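As an illustration of the tiling step of this workflow, here is a small Julia sketch; the helper name, tile size, and raster dimensions are all hypothetical. It computes overlapping tile extents buffered by `radius` so that moving windows at tile edges see complete data:

```julia
# Hypothetical helper: compute overlapping 1-D tile extents along an axis of
# length n, with core tiles of size `tile` buffered by `radius` pixels.
function tile_ranges(n::Int, tile::Int, radius::Int)
    ranges = UnitRange{Int}[]
    for start in 1:tile:n
        stop = min(start + tile - 1, n)
        push!(ranges, max(start - radius, 1):min(stop + radius, n))
    end
    return ranges
end

# e.g. a 2000 x 3000 raster, 500-pixel core tiles, radius = 100:
row_windows = tile_ranges(2000, 500, 100)  # 1:600, 401:1100, 901:1600, 1401:2000
col_windows = tile_ranges(3000, 500, 100)
# Every (row window, col window) pair defines one buffered tile to run Omniscape on.
```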
Awesome! Glad it worked well. I think there might be a way to mosaic the TIFFs using GDAL.jl or ArchGDAL.jl? But actually, I use gdal_merge.py from the command line when merging rasters, and maybe that's not available in Julia... We could possibly do it with GeoData.jl. There may be a way to create an empty GDALarray, then write to it lazily using GeoData.jl. The key is lazy writing, to ensure the most efficient use of RAM. cc @rafaqz, who might have some ideas/thoughts. I think GeoData.jl (or maybe ArchGDAL?) might have what we need. I see the workflow as having these steps: (1) split the inputs into tiles buffered by radius, (2) run the moving window solves on each tile, and (3) mosaic the tile outputs lazily, taking the max of overlapping pixels.
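For illustration, the lazy-read half of this is available today; the incremental lazy-write half is the open question above. A minimal sketch using Rasters.jl, GeoData.jl's successor (the file name and window indices are hypothetical):

```julia
using Rasters
import ArchGDAL  # GDAL backend used by Rasters.jl for GeoTIFFs

# Lazy open: pixel data stays on disk until it is indexed.
resistance = Raster("resistance.tif"; lazy=true)

# Materialize only the data needed for one moving window solve.
window = resistance[X(101:300), Y(101:300)]
```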
I'd like to start working on these components when I have time -- it might be a bit until I have a chunk of time to do it, but I think we're starting to get a somewhat clear plan of action at this point.
Just a note to mention that gdal_merge.py doesn't allow taking the maximum value of overlapping pixels. Without that, there will be some serious stitch marks.
Also, tiling the landscape could end up giving us a solid foundation from which to build out hierarchical parallelism.
GeoData.jl has the `mosaic` function, which can combine overlapping rasters using a reducing function like `max`.
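For reference, a minimal sketch of a max-mosaic using `mosaic` as it exists in Rasters.jl, GeoData.jl's successor (tile file names are hypothetical):

```julia
using Rasters
import ArchGDAL  # GDAL backend for reading/writing GeoTIFFs

# Read the per-tile Omniscape outputs.
tiles = [Raster("tile_output_$i.tif") for i in 1:4]

# Overlapping pixels are reduced with `max`, mirroring the
# rasterio.merge(method="max") step described earlier in the thread.
merged = mosaic(max, tiles...)

write("merged_output.tif", merged)
```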
Thanks @rafaqz! I'll keep you in the loop when I find time to work on this, and may solicit some feedback as I go 🙂
Option 1 will be the way I go, as it has several benefits, including making multiprocessing easy, which will enable hierarchical parallel processing (e.g. see #106).
Grids are often so large that allocating the working arrays requires tons of memory. Each moving window solve is independent, so a low-memory mode may be relatively simple to implement.
Two options come to mind:
1. Split the landscape into chunks, and solve the chunks in serial
The landscape could be split into adjacent, non-overlapping rectangular or square chunks. Each chunk (and its corresponding inputs) would need to be buffered by `radius` (the omniscape argument) so that the pixels at the edge of the unbuffered chunk aren't missing any data in their moving windows. Then, results for each chunk could be stored and mosaicked, taking the max value of overlapping pixels. If done correctly, this would result in the same exact map as would be produced in Omniscape v0.4.2.

2. Lazy reading and writing for each moving window solve
A given moving window solve only needs the data directly pertinent to it, so there is no need to hold the entire raster inputs in memory. Just load what is needed for each moving window (GeoData.jl has functionality to do this), solve it, lazily write the result to an output raster, then free/garbage collect the data and move on to the next solve.

Thread safety could be a little tricky. Each thread could write to its own output component, and the components could then be summed, but the summing would also need to be done lazily/by chunk; otherwise you'd end up loading all the components into memory, which would take as much memory as the current method in Omniscape. Another consideration is computing the coordinates and source strengths of the target pixels (the ones on which the moving windows are centered).

The biggest memory hog in Omniscape is the allocation of the array(s) that store current values. They have the same number of rows and columns as the resistance surface, but have a 3rd dimension equal to the number of parallel workers. If computing normalized current flow in addition to cumulative current, there are two of them! Because of this, even if the source strength layer needs to be loaded fully into memory, the memory savings would still be substantial.
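A toy sketch of the per-thread accumulator idea; everything here is hypothetical, `solve_window` stands in for the actual circuit solve, and the final sum is where the lazy/chunked reduction would be needed:

```julia
using Base.Threads

# Stand-in for the real moving window circuit solve.
solve_window(window) = ones(size(window))

function solve_all(resistance::Matrix{Float64}, targets, radius::Int)
    nr, nc = size(resistance)
    # One full-size accumulator per thread: no locks needed, but nthreads()
    # copies of the landscape in memory -- the bottleneck described above.
    accs = [zeros(nr, nc) for _ in 1:nthreads()]
    @threads :static for t in targets      # :static keeps threadid() stable
        acc = accs[threadid()]
        rows = max(t[1] - radius, 1):min(t[1] + radius, nr)
        cols = max(t[2] - radius, 1):min(t[2] + radius, nc)
        window = @view resistance[rows, cols]  # read only this solve's data
        acc[rows, cols] .+= solve_window(window)
    end
    # In a true low-memory mode this reduction would happen lazily, by chunk.
    return reduce(+, accs)
end

targets = vec([(r, c) for r in 1:10:200, c in 1:10:300])
cumulative = solve_all(rand(200, 300), targets, 20)
```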
I tend to like option 2; it's the slickest, and if done in a thread-safe way, I think it would be more straightforward to implement such that errors from edge cases would be less likely. IO overhead might be significant, though -- not sure.