The configs provided in the repo can reproduce the results from the paper. This means that the vertical_and_slash settings are sufficient to pass the Needle In A Haystack test for long sequences.
The search_pattern function reroutes to vertical_and_slash because our tests have shown that this setting offers better generalization and efficiency across different context windows and tasks.
The sparse pattern search is based on Algorithm 1 in the paper and aligns the different patterns with the kernel's runtime. Block sparse falls back to the vertical_and_slash (VS) pattern because our tests showed that this adjustment generalizes better across different lengths and tasks. The adjustment was made empirically.
Describe the issue
Regardless of the pattern observed, the config saves it as "vertical_and_slash" when using the search_patterns function.
MInference/minference/modules/minference_forward.py, lines 198 to 216 at commit b5b8745
The configs saved in the repo appear to contain only this pattern type ("vertical_and_slash").
Specifically those lines:
I think this means that the forward pass never routes to anything other than the vertical_and_slash implementation and kernels.
Is this a bug or intended? The experiment docs cite the use of this search patterns function.
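To check the observation about the saved configs, a small tally of pattern names can help. This is a minimal sketch, not code from the repo: the function name `count_pattern_types` is hypothetical, and it assumes each config is a JSON list with one dict per layer, mapping a head id to a list whose first element is the pattern name. The example values are made up for illustration.

```python
import json
from collections import Counter


def count_pattern_types(config):
    """Count how often each pattern name appears across all layers and heads.

    Assumed layout (an assumption, not confirmed by the thread):
    config is a list with one dict per layer; each dict maps a head id
    to a list whose first element is the pattern name.
    """
    counts = Counter()
    for layer in config:
        for entry in layer.values():
            counts[entry[0]] += 1
    return counts


# Hypothetical two-layer config; the numeric values are placeholders.
example_config = json.loads(
    '[{"0": ["vertical_and_slash", 1000, 6096, 0.9],'
    '  "1": ["vertical_and_slash", 500, 3000, 0.8]},'
    ' {"0": ["vertical_and_slash", 30, 800, 0.95]}]'
)

counts = count_pattern_types(example_config)
print(dict(counts))
```

If the tally has a single key, every head in that config would be routed to the same implementation, which matches what is described above.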
On the other hand, search pattern v2 (MInference/minference/modules/minference_forward.py, line 220 at commit b5b8745) does appear to save the pattern type under specific names, which are then used to route to the different pattern implementations.
Do we need to use search patterns v2 to replicate the results of the paper? Or are the vertical_and_slash settings actually enough to pull off needle-in-a-haystack for long sequences?
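The routing concern can be illustrated with a toy dispatch table. This is a sketch under stated assumptions, not the actual forward pass: the handler functions and the `KERNELS` mapping are hypothetical stand-ins, and the pattern names other than "vertical_and_slash" are assumed for illustration.

```python
# Hypothetical stand-ins for the per-pattern implementations.
def run_vertical_and_slash():
    return "vertical_and_slash kernel"


def run_block_sparse():
    return "block_sparse kernel"


def run_stream_llm():
    return "stream_llm kernel"


# Assumed name-to-implementation mapping; only "vertical_and_slash"
# is a name that appears in this thread.
KERNELS = {
    "vertical_and_slash": run_vertical_and_slash,
    "block_sparse": run_block_sparse,
    "stream_llm": run_stream_llm,
}


def reachable_kernels(saved_names):
    """Return the set of implementations that a config can ever select."""
    return {KERNELS[name]() for name in saved_names}


# If every head is saved as "vertical_and_slash", only one branch is reachable.
only_vs = reachable_kernels(["vertical_and_slash"] * 4)
mixed = reachable_kernels(["vertical_and_slash", "block_sparse"])
print(only_vs)
print(sorted(mixed))
```

With a config that stores a single name, the other branches exist but are never taken, which is the behavior the question asks about.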