Who uses Google TPUs for inference in production?
19 by arthurdelerue | 2 comments on Hacker News.
I am really puzzled by TPUs. I've been reading everywhere that TPUs are powerful and a great alternative to NVIDIA. I have been playing with TPUs for a couple of months now, and to be honest I don't understand how people can use them in production for inference:
- almost no resources online showing how to run modern generative models like Mistral, Yi 34B, etc. on TPUs
- poor compatibility between JAX and PyTorch
- very hard to understand the memory consumption of the TPU chips (no nvidia-smi equivalent)
- rotating IP addresses on TPU VMs
- almost impossible to get my hands on a TPU v5
Is it only me? Or did I miss something? I totally understand that TPUs can be useful for training though.