Large Language Models up to 4x Faster on RTX With TensorRT-LLM for Windows

cross-posted from: lemdro.id/post/2377716 (!aistuff)