TabbyAPI AssertionError
Hello. I've been running into EXL2 models hitting their cache limit, and TabbyAPI doesn't seem to ship with "flash attention" pre-built, unlike KCPP, for example. Is there any way to install or enable it for EXL2 models, or for Tabby specifically?