fix: recover from panics in LoadSectorStates #1115
base: master
frrist
commented
Jan 19, 2023
- related to Lotus daemon being killed and causing websocket errors lotus#10050
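For readers unfamiliar with the pattern named in the title, here is a minimal sketch of recovering from a panic and turning it into an ordinary error. It is not the code from this PR: `safeCall` and `loadStates` are hypothetical names, and `loadStates` is only a stand-in for a loader that may panic.

```go
package main

import "fmt"

// safeCall runs fn and converts any panic into a returned error, so one
// failing unit of work does not bring down the whole process.
// (Hypothetical helper, not taken from this PR.)
func safeCall(name string, fn func() error) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered from panic in %s: %v", name, r)
		}
	}()
	return fn()
}

// loadStates is a hypothetical stand-in for a state loader that may panic.
func loadStates(sectors []int, idx int) error {
	_ = sectors[idx] // panics at runtime when idx is out of range
	return nil
}

func main() {
	// The out-of-range index panics; safeCall reports it as an error instead.
	err := safeCall("loadStates", func() error { return loadStates([]int{1, 2}, 5) })
	fmt.Println("bad index: ", err)

	// A well-behaved call passes through unchanged.
	err = safeCall("loadStates", func() error { return loadStates([]int{1, 2}, 0) })
	fmt.Println("good index:", err)
}
```

The named return value is what lets the deferred function overwrite the result after a panic. Note that `recover` only catches panics raised on the same goroutine, so each goroutine would need its own wrapper.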
Thanks!
I'm having some issues with this PR. This is the
The daemon is kept running but the walk is not completing. The list command outputs this:
[
{
"ID": 1,
"Name": "walk_1674391696",
"Type": "walk",
"Error": "",
"Tasks": [
"block_header",
"block_parent",
"drand_block_entrie",
"data_cap_balance",
"miner_beneficiary",
"miner_sector_deal",
"miner_sector_infos_v7",
"miner_sector_infos",
"miner_sector_post",
"miner_pre_commit_info",
"miner_sector_event",
"miner_current_deadline_info",
"miner_fee_debt",
"miner_locked_fund",
"miner_info",
"market_deal_proposal",
"market_deal_state",
"message",
"block_message",
"receipt",
"message_gas_economy",
"parsed_message",
"internal_messages",
"internal_parsed_messages",
"vm_messages",
"multisig_transaction",
"chain_power",
"power_actor_claim",
"chain_reward",
"actor",
"actor_state",
"id_addresses",
"derived_gas_outputs",
"chain_economics",
"chain_consensus",
"multisig_approvals",
"verified_registry_verifier",
"verified_registry_verified_client"
],
"Running": true,
"RestartOnFailure": false,
"RestartOnCompletion": false,
"RestartDelay": 0,
"Params": {
"maxHeight": "2399050",
"minHeight": "2396162",
"storage": "CSV",
"window": "0s"
},
"StartedAt": "2023-01-22T12:48:16.859521259Z",
"EndedAt": "0001-01-01T00:00:00Z"
}
]
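A listing like the one above can be checked mechanically for jobs that never finished. The sketch below is an assumption-laden illustration: the `job` struct mirrors only a few of the fields shown in the output, `unfinished` is a hypothetical helper, and the sample input is an abbreviated copy of the listing. A job counts as unfinished when `Running` is true and `EndedAt` is still the zero timestamp.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// job mirrors only the fields of the listing that this check needs.
type job struct {
	ID        int       `json:"ID"`
	Name      string    `json:"Name"`
	Error     string    `json:"Error"`
	Running   bool      `json:"Running"`
	StartedAt time.Time `json:"StartedAt"`
	EndedAt   time.Time `json:"EndedAt"`
}

// unfinished returns the jobs that are still marked as running and whose
// EndedAt is the zero time ("0001-01-01T00:00:00Z").
func unfinished(raw []byte) ([]job, error) {
	var jobs []job
	if err := json.Unmarshal(raw, &jobs); err != nil {
		return nil, fmt.Errorf("decoding job list: %w", err)
	}
	var out []job
	for _, j := range jobs {
		if j.Running && j.EndedAt.IsZero() {
			out = append(out, j)
		}
	}
	return out, nil
}

func main() {
	// Abbreviated copy of the listing from this thread.
	raw := []byte(`[{"ID": 1, "Name": "walk_1674391696", "Type": "walk", "Error": "",
		"Running": true, "StartedAt": "2023-01-22T12:48:16.859521259Z",
		"EndedAt": "0001-01-01T00:00:00Z"}]`)

	jobs, err := unfinished(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, j := range jobs {
		fmt.Printf("job %d (%s) started at %s and has not ended\n",
			j.ID, j.Name, j.StartedAt.Format(time.RFC3339))
	}
}
```

This only flags candidates. A job that is legitimately still walking looks the same, so the start time has to be compared against how long a walk over that height range normally takes.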
@davidgasquez is it failing for all epochs or just some?
I think it's just some (will check tomorrow when I'm back). That said, it happens in most of them and should be easy to reproduce in any recent partial archive.
I've used The
There is also a bunch of empty files:
Finally, in
That said, the walk seems to continue without problems after these websocket errors.
- prevents panic
Tested things locally and it seems to be working fine. I got some errors, but I think they're unrelated to this PR:
Will send some jobs and keep an eye on things.
Heads up that I just got the first
@frrist do you have any ideas about why it might be panicking in the first place?
@hsanjuan current theory is filecoin-project/filet#22 (comment)
The "maybe there are blocks missing in a datastore" theory doesn't fit the randomness that seems to affect this issue.
@davidgasquez can you point me to the GitHub issue for the websocket bug?
There is one at a higher level in Filet and another in Lotus.
@frrist should we merge? |