New AI/LLM Breakthrough - FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness

cross-posted from: lemmy.world/post/1709025...