Yeah, you don’t need a VPN, as there is also a relay component that forms a sort of Syncthing network. The data is always encrypted, but with relaying you are using external servers to route the traffic. Relaying isn’t required either; it just ensures data can be synced even when a direct connection isn’t possible (e.g. you aren’t home and aren’t on your VPN).
I was looking for something similar back at the start of summer, and the best I could find at the time was a Microsoft model on Hugging Face: huggingface.co/microsoft/speecht5_tts. It’s a bit robotic, but it’s pretty versatile, and since it outputs a .wav file it’s easy to integrate into any system you might be working on/with.
The only difficult part is that you need to understand sampling rates to make sure the voice is generated correctly, but I think the example on the Hugging Face page works as is.
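To show what I mean about sampling rates, here is a rough sketch that uses only Python’s standard library instead of the model itself. It assumes 16 kHz, which as far as I know is the rate the model card example uses when it writes the file; the tone generator just stands in for the model’s output, and all the helper names are made up for illustration.

```python
import io
import math
import struct
import wave

# Assumption: 16 kHz is the rate the SpeechT5 example writes its output at.
MODEL_RATE = 16_000


def make_tone(seconds, rate, freq=440.0):
    """Stand-in for model output: a sine tone as floats in [-1.0, 1.0]."""
    n = int(seconds * rate)
    return [math.sin(2 * math.pi * freq * i / rate) for i in range(n)]


def write_wav(target, samples, rate):
    """Write mono 16-bit PCM samples to a path or file-like object."""
    with wave.open(target, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)      # 2 bytes per sample = 16-bit audio
        wav.setframerate(rate)   # this header value controls playback speed
        frames = b"".join(
            struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767))
            for s in samples
        )
        wav.writeframes(frames)


def wav_duration(source):
    """Return the playback length in seconds according to the WAV header."""
    with wave.open(source, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def duration_when_written_at(samples, rate):
    """Write samples to an in-memory WAV at `rate` and report its duration."""
    buf = io.BytesIO()
    write_wav(buf, samples, rate)
    buf.seek(0)
    return wav_duration(buf)


if __name__ == "__main__":
    audio = make_tone(2.0, MODEL_RATE)  # 2 seconds of audio at 16 kHz
    for rate in (MODEL_RATE, 44_100, 8_000):
        print(rate, round(duration_when_written_at(audio, rate), 3))
```

The same 2 seconds of samples play back as 2 seconds when the file is written at 16 kHz, but stretch to 4 seconds (slow and low-pitched) at 8 kHz and shrink to well under a second (fast and high-pitched) at 44.1 kHz. So whatever rate the model produces is the rate you pass when saving the file.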