Reborn2966,

I was able to run the llama demo on my phone, how cool is that!

andrefsp,
@andrefsp@lemmy.world

Very cool! At my company I had to serve BERT in Rust using the TensorFlow C API. If I had known about this framework, I would have given it a shot — the examples are there and they look easy to understand.

egeres,
@egeres@lemmy.ml

I can’t believe I’ll get excited about phone specs again 🙌🏻✨. Do you think it could be possible to parallelize computation among various phones to run inference on transformer models? I assume it’s not worth it, since you would need to transfer a ton of data among devices to run attention per layer, but the llama people have pulled so many tricks at this point…
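For what it's worth, a rough back-of-envelope sketch of the communication cost being discussed, assuming LLaMA-7B-like dimensions (32 layers, hidden size 4096, fp16 activations — assumed values, not from the thread) and a simple pipeline split where each phone runs a contiguous chunk of layers and only the hidden-state vector crosses the network at each split point:

```python
# Back-of-envelope estimate: bytes sent over the network per generated
# token if a LLaMA-7B-style model were pipeline-split across phones.
# Assumed dimensions (hypothetical, for illustration only):
HIDDEN = 4096        # hidden size
BYTES_PER_VAL = 2    # fp16 activations

def transfer_per_token(num_phones: int) -> int:
    """Bytes crossing the wire per generated token: one hidden-state
    vector is sent at each of the (num_phones - 1) split points."""
    splits = num_phones - 1
    return splits * HIDDEN * BYTES_PER_VAL

for phones in (2, 4, 8):
    kib = transfer_per_token(phones) / 1024
    print(f"{phones} phones: {kib:.0f} KiB per token")
```

Under these assumptions the per-token payload is tiny (a few KiB), so for a pipeline split the cost is dominated by network round-trip latency per token rather than bandwidth; the "ton of data" problem really bites with tensor parallelism, where attention activations and KV caches would have to be exchanged within each layer.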
