-
Notifications
You must be signed in to change notification settings - Fork 468
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Blocking commands may block permanently due to lost wakeup #2473
Comments
@PokIsemaine I'm not sure if I understood your point correctly:
Is that right? |
Yeah |
I see. Thank you for raising this issue. I knew this issue indeed, but didn't find a proper way to solve the data race. It's nice to have an issue to track this. |
/assign |
I forgot this a bit, is this issue fixed? |
@RiversJin tells me and I just realized that Blocking commands might need a mechenism for lock and blocking. Some commands, like bzpop, would block on multiple keys. If multiple bzpop comming with same key and different order, would this being a dead lock? |
We can try using |
It's probably not fixed yet, but I think it might be resolved after #2620. I'll test this part later. |
That's not lock, I think "BlockKeys" might following the same ordering? Or we should have a very well-defined order for "Lock-acquire - blockOnKey - LockRelease". The later part can be solved by #2620 or being a different patch The current problem is that the sequence is "Lock-acquire - LockRelease - blockOnKey", which between release and blockOn the whole things can be done by another thread |
The blockKeys itself might not blocking, my bad, this can(and might should) be handled by well-designed sequence |
But after #2620, the lock is released after the blockkey |
Aha, you're right. So this should be fixed after that. Maybe we can also denote that the lock should be held when |
Search before asking
Version
unstable
Minimal reproduce step
I discovered this issue while testing a PR(#2332), even though the PR increased the likelihood of it occurring, the problem actually existed beforehand.
It is difficult to reproduce manually without modifying the code, and it almost exclusively occurs during Go's integration tests.
Here's an example with
TestRegression
:By printing some information, I was able to trace the error:
The issue occurs here:
To manually reproduce the issue consistently, you can add a sleep and complete a PUSH within 5 seconds:
What did you expect to see?
Blocking commands will not be blocked after new key values are added.
What did you see instead?
Blocking command blocks permanently.
Anything Else?
No response
Are you willing to submit a PR?
The text was updated successfully, but these errors were encountered: