[Suggestion] Increased compatibility for multiple AUX data channels to be stored in a single AUX block #158

Open
DamianNeurode opened this issue Sep 19, 2024 · 3 comments
Labels
enhancement New feature or request

Comments

@DamianNeurode

When creating SNIRF files with a large number of auxiliary channels, the per-channel overhead can make the data file unnecessarily large, especially if all the auxiliary channels share a common time axis. For example, when a 6-axis accelerometer is acquired by a single device, the time axis is stored five more times than needed, inflating this part of the SNIRF file by almost 2x. While the current format does allow auxiliary data to be a 2D matrix and not just a 1D vector, only one label is allowed, so it would be difficult to infer what each column corresponds to without recording it in an external file.
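The size estimate can be checked with a rough count of stored values. This is only a sketch: it assumes time and data use the same numeric type, ignores HDF5 metadata and compression, and the sample count is a made-up figure.

```python
def stored_values(n_channels: int, n_samples: int, shared_time: bool) -> int:
    """Count the numeric values written for the aux data plus its time axis."""
    data_values = n_channels * n_samples
    # One time vector in total if shared, otherwise one per aux block.
    time_vectors = 1 if shared_time else n_channels
    return data_values + time_vectors * n_samples


# Hypothetical recording: 6-axis accelerometer, 100,000 samples per axis.
n_channels, n_samples = 6, 100_000
separate = stored_values(n_channels, n_samples, shared_time=False)
shared = stored_values(n_channels, n_samples, shared_time=True)
ratio = separate / shared
print(f"{separate} vs {shared} values, ratio {ratio:.2f}")
```

Under these assumptions the ratio is 12/7, roughly 1.7x, which is in line with the "almost 2x" figure.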

Instead, I propose that we allow “/nirs(i)/aux(j)/name” to be a list of strings (or equivalent) instead of just a single string. This would allow “/nirs(i)/aux(j)/dataTimeSeries” to be used as a 2D matrix with interpretable columns. Where a common time axis is not used, the existing spec still applies: different auxiliary data are given different blocks.
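A minimal in-memory sketch of what such a block might look like, assuming rows are time points and columns are channels. The `AuxBlock` class, its `column` helper, and the labels are hypothetical and only illustrate the proposal; they are not part of the SNIRF spec or of any library.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AuxBlock:
    """Stand-in for one /nirs(i)/aux(j) group under the proposed layout."""
    name: List[str]                    # proposed: one label per column
    time: List[float]                  # single shared time axis
    dataTimeSeries: List[List[float]]  # rows = time points, columns = channels

    def __post_init__(self) -> None:
        if len(self.dataTimeSeries) != len(self.time):
            raise ValueError("one row is required per time point")
        for row in self.dataTimeSeries:
            if len(row) != len(self.name):
                raise ValueError("one label is required per column")

    def column(self, label: str) -> List[float]:
        """Return the time series of the column carrying `label`."""
        idx = self.name.index(label)
        return [row[idx] for row in self.dataTimeSeries]


# Hypothetical labels and values for a 6-axis device.
accel = AuxBlock(
    name=["ACCEL_X", "ACCEL_Y", "ACCEL_Z", "GYRO_X", "GYRO_Y", "GYRO_Z"],
    time=[0.0, 0.01, 0.02],
    dataTimeSeries=[
        [0.1, 0.2, 9.8, 0.0, 0.0, 0.1],
        [0.1, 0.3, 9.7, 0.0, 0.1, 0.1],
        [0.2, 0.3, 9.8, 0.1, 0.1, 0.0],
    ],
)
print(accel.column("ACCEL_Z"))
```

With one label per column, a reader can pick out a channel by name without needing an external file.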

The only other solution I can see would be to place all the column names into “/nirs(i)/aux(j)/name”, separated by a common delimiter such as "," or ";".
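For comparison, the delimiter approach could be read back roughly as follows. `split_aux_names` is a hypothetical helper, and the choice of ";" as the default delimiter is an assumption made for illustration.

```python
from typing import List


def split_aux_names(name: str, n_columns: int, delimiter: str = ";") -> List[str]:
    """Split a delimited aux name into one label per data column."""
    labels = [part.strip() for part in name.split(delimiter)]
    if any(not label for label in labels):
        raise ValueError("empty label in aux name")
    if len(labels) != n_columns:
        raise ValueError(
            f"expected {n_columns} labels, found {len(labels)}"
        )
    return labels


labels = split_aux_names("ACCEL_X; ACCEL_Y;ACCEL_Z", n_columns=3)
print(labels)
```

One drawback of this option is that a label which itself contains the delimiter can no longer be represented, and existing readers would see the whole string as a single name.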

@DamianNeurode DamianNeurode added the enhancement New feature or request label Sep 19, 2024
@samuelpowell
Collaborator

@DamianNeurode for the time axis "an array of length <2> is allowed where the first entry is the start time and the second entry is the sample time spacing in TimeUnit specified in the metaDataTags".

Recording the data in this format should ameliorate your concerns regarding file size.
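As a sketch of how a reader might handle this compact form, the helper below expands a length-2 time entry into a full vector. `expand_time` is a hypothetical name, and treating a length-2 entry as a full vector when there are exactly two samples is an assumption of this example.

```python
from typing import List, Sequence


def expand_time(time: Sequence[float], n_samples: int) -> List[float]:
    """Return one time stamp per sample from a full or compact time entry."""
    if len(time) == n_samples:
        # Already a full time vector (also taken when n_samples == 2).
        return list(time)
    if len(time) == 2:
        # Compact form: [start time, sample time spacing].
        start, spacing = time
        return [start + i * spacing for i in range(n_samples)]
    raise ValueError("time must have length 2 or match the number of samples")


print(expand_time([0.0, 0.5], n_samples=4))
```

With this representation each aux block stores two time values regardless of recording length, so the repeated time axis stops being a meaningful cost for uniformly sampled data.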

@DamianNeurode
Author

@samuelpowell Thanks for pointing that out! It's true that this would solve the majority of the size issue. Do you see any benefit in this change for auxiliary data that shares a common time axis but is either non-uniformly sampled or recorded in high-frequency bursts over a long recording duration?
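To make the distinction concrete, here is a small sketch that checks whether a time vector can be reduced to the length-2 form. `compact_time`, its tolerance, and the sample values are all hypothetical.

```python
import math
from typing import List, Optional, Sequence


def compact_time(time: Sequence[float], rel_tol: float = 1e-9) -> Optional[List[float]]:
    """Return [start, spacing] if `time` is uniformly spaced, else None."""
    if len(time) < 3:
        return None  # too short to tell a compact entry from a full vector
    start = time[0]
    spacing = time[1] - time[0]
    if spacing <= 0:
        return None
    for i, t in enumerate(time):
        expected = start + i * spacing
        if not math.isclose(t, expected, rel_tol=rel_tol, abs_tol=rel_tol * spacing):
            return None
    return [start, spacing]


uniform = [0.0, 0.5, 1.0, 1.5]
bursts = [0.0, 0.01, 0.02, 60.0, 60.01, 60.02]  # two short bursts a minute apart
print(compact_time(uniform))
print(compact_time(bursts))
```

Burst or non-uniform data fails this check, so every aux block would still need its own full time vector under the current layout.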

@samuelpowell
Collaborator

If an example were provided of multi-channel data with these characteristics then it would certainly be worth further consideration.

I would want to be convinced that there was a fairly broad use case for any change to the standard. But in this case there is the added complication of how this would interact with the current specification in which the second dimension is used when there are multiple channels of the same data.

There are obviously ways around this, such as having a higher-dimensional array, or rules about how the second dimension is used based upon the dimensionality of the auxiliary name field, but none of them immediately pass the simplicity smell test.


2 participants