For people self-hosting LLMs: I maintain a couple of Docker images: github.com/…/text-generation-webui-docker (updated to 1.3.1, with a GQA fix so it can run Llama 2 70B)...