I’m a senior network engineer who has complete control and visibility (if desired) over every packet that flows through my WAN connection.
All of my IoT devices sit on a network that is segregated from my main network, which makes them easy to monitor.
After about three years of monitoring the devices on my IoT network, I have only observed a minuscule amount of traffic directed towards Amazon servers, and only immediately after the devices are triggered.
Only several hundred megabytes of data have been sent to Amazon in total from three Echo devices over the last several years. Terabytes of data have been received, mainly from Spotify and SiriusXM, but very little has been sent. A few gigabytes (single digits) have been received directly from Amazon, which is explained by firmware updates of about 130 MB each for the three devices.
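For anyone who wants to produce the same kind of per-destination totals, here is a rough sketch of the accounting. The record layout (`dest`, `bytes_sent`, `bytes_received`) and the sample numbers are made up for illustration and are not my actual export format or measurements; adapt the field names to whatever your flow collector produces.

```python
from collections import defaultdict

def tally_by_destination(flows):
    """Sum bytes sent and received per destination from flow records."""
    totals = defaultdict(lambda: {"sent": 0, "received": 0})
    for flow in flows:
        entry = totals[flow["dest"]]
        entry["sent"] += flow["bytes_sent"]
        entry["received"] += flow["bytes_received"]
    return dict(totals)

# Hypothetical sample records, not real measurements.
sample_flows = [
    {"dest": "amazon", "bytes_sent": 40_000, "bytes_received": 130_000_000},
    {"dest": "spotify", "bytes_sent": 12_000, "bytes_received": 900_000_000},
    {"dest": "amazon", "bytes_sent": 35_000, "bytes_received": 20_000},
]

totals = tally_by_destination(sample_flows)
# Print destinations with the largest upload totals first.
for dest, t in sorted(totals.items(), key=lambda kv: kv[1]["sent"], reverse=True):
    print(f"{dest}: sent {t['sent']:,} B, received {t['received']:,} B")
```

The number to watch is the "sent" column per destination, since upload volume is what would give away an audio stream.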
I haven’t looked into the TLS-encrypted outgoing traffic, but its volume is so small that, no matter what compression is used, the Echo devices cannot be sending Amazon any audio that isn’t directly related to a query made to the device. The only unencrypted traffic is a periodic GET request for an Amazon-hosted webpage, on the order of several dozen kilobytes, which is almost certainly an internet connectivity check.
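The volume argument can be checked with back-of-the-envelope arithmetic. The sketch below assumes "several hundred megabytes" means 500 MB and "several years" means three; the two bitrates are illustrative stand-ins for very aggressive speech compression (about 700 bit/s, roughly the lowest modes of codecs such as Codec 2) and a more typical low-bitrate voice setting (about 6 kbit/s). None of these figures are measurements.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600

def continuous_upload_bytes(bitrate_bps, years, devices):
    """Bytes needed to stream audio nonstop at bitrate_bps from every device."""
    return bitrate_bps / 8 * SECONDS_PER_YEAR * years * devices

def average_bitrate_bps(total_bytes, years, devices):
    """Average per-device upload rate implied by a total byte count."""
    return total_bytes * 8 / (SECONDS_PER_YEAR * years * devices)

OBSERVED_BYTES = 500e6  # assumption: "several hundred megabytes" taken as 500 MB
YEARS = 3               # assumption: "several years" taken as three
DEVICES = 3

for label, bps in [("700 bit/s", 700), ("6 kbit/s", 6000)]:
    needed = continuous_upload_bytes(bps, YEARS, DEVICES)
    ratio = needed / OBSERVED_BYTES
    print(f"{label}: {needed / 1e9:.1f} GB needed, {ratio:.0f}x the observed total")

avg = average_bitrate_bps(OBSERVED_BYTES, YEARS, DEVICES)
print(f"observed average per device: {avg:.1f} bit/s")
```

Under these assumptions, even a continuous 700 bit/s stream from three devices would add up to roughly 25 GB over three years, about fifty times the total I actually see going to Amazon.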
It may be possible to set up an sslstrip-like service to decrypt and examine the encrypted outgoing streams, but, as I said, the amount of outgoing data is so small that it isn’t worth the effort just to find out what format Amazon chose for sending “Alexa please turn on the bedroom lights”.
I would notice even an extremely low-bitrate, highly compressed outgoing audio stream if it existed, especially since I have three Echo devices.
I just glance at some graphs about once per week to check on the health of my networks, but many people who are much more skilled and motivated than I am keep Echo (and other IoT) devices under close and constant scrutiny.
They have found no evidence that Alexa-enabled devices are sending any information to any third party except for small audio snippets directly related to the queries they need to process.
What signatures have you noticed with Amazon devices that I should be on the lookout for?