Hi, I am curious about why n_blocks_per_split is calculated using params.seqlen_k instead of actual_seqlen_k in the following code:
flash-attention/csrc/flash_attn/src/flash_fwd_kernel.h
Line 525 in b443207
It seems to be wrong in some cases.
Consider seqlen_k = 1024, seqlen_k_new = 1, BlockN = 128, and num_split = 4.
Then n_blocks_per_split is equal to 2 (1024 / 128 = 8 blocks, divided across 4 splits), so n_block_max can reach at most 8, that is (3 + 1) * 2 for the last split, according to:
Line 529 in b443207
If we attempt to append KV, n_block_copy_min is also equal to 8 (1024 / 128), so no block satisfies the condition under which gKNew is appended to gK:
Line 727 in b443207
Line 730 in b443207
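To make the arithmetic above easy to check, here is a minimal standalone sketch of the block-range computation as I understand it. It is not the kernel code: `ceil_div` and `total_blocks_copied` are hypothetical helpers, and the formulas (in particular clamping n_block_copy_min to the split's first block, and a copy loop that runs from n_block_max - 1 down to n_block_copy_min) are my assumptions from reading the referenced lines. I also assume the KV cache already holds 1024 tokens, so actual_seqlen_k = 1024 + 1.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper: integer ceiling division.
constexpr int ceil_div(int a, int b) { return (a + b - 1) / b; }

// Hypothetical helper, not part of flash-attention.
// Counts, over all splits, how many K blocks would fall in the append range
// [n_block_copy_min, n_block_max). `len_for_split` is the length used to
// derive n_blocks_per_split, i.e. the quantity this question is about.
int total_blocks_copied(int len_for_split, int seqlen_k_cache, int seqlen_k_new,
                        int block_n, int num_splits) {
    assert(block_n > 0 && num_splits > 0);
    const int actual_seqlen_k = seqlen_k_cache + seqlen_k_new;
    const int n_blocks_per_split =
        ceil_div(ceil_div(len_for_split, block_n), num_splits);

    int copied = 0;
    for (int split = 0; split < num_splits; ++split) {
        const int n_block_min = split * n_blocks_per_split;
        const int n_block_max = std::min(ceil_div(actual_seqlen_k, block_n),
                                         (split + 1) * n_blocks_per_split);
        // Assumption: copying starts at the first block that can hold new tokens.
        const int n_block_copy_min =
            std::max(n_block_min, seqlen_k_cache / block_n);
        copied += std::max(0, n_block_max - n_block_copy_min);
    }
    return copied;
}
```

Under these assumptions, `total_blocks_copied(1024, 1024, 1, 128, 4)` returns 0, which matches the case described above, while passing the actual length (`total_blocks_copied(1025, 1024, 1, 128, 4)`) returns 1, because the third split then covers block 8.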
Am I missing something here?