How do I set KV cache quantisation in Ollama?
Judging by GitHub, this seems to have been added recently, but I'm struggling to find more information. How do I set it up for the models I want to run?