It is the holiday season, so I wasn’t expecting much to happen with the Comcast issue (and, to be honest, I haven’t been online at 10pm as much to notice it either). But on December 30th I did get a call from Mark N. in Executive Customer Relations. He told me that the engineering team had not been able to find anything and that they were closing the ticket. He also said that if I saw the issue again I should let him know and he would re-open it.
At the time, I commented that while I had not been online at the affected hour (10pm Pacific for those new to this thread) to be able to update him on my experience, I felt it was unlikely to have changed given that it has been ongoing for over a year now.
Then, on January 1st, I happened to be online and noticed the typical stalls in video streaming and poor performance in Twitter and Facebook, with images & videos not loading. I ran some speed tests and traces too, and got these:
Notice that the download chart starts high and then drops quickly to a very low level. That is a bit different to previous results (although this was 10:35pm too, and things were starting to improve, especially the video stream). The average is also much higher than it is at the peak of the problem (where I typically see under 1 Mbps download). But at other times of the day I see 50 Mbps or more for download speeds, and a nice flat intra-test chart.
The trace shows the packet losses at Sunnyvale though, same as always. I suspect that this is the cause of the fast drop in the speed test too, as the TCP congestion window adapts down to cope with the higher-than-ideal packet loss.
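To put a rough number on why packet loss would drag the speed down, here is a minimal sketch using the well-known Mathis approximation for steady-state TCP throughput, which says the ceiling is roughly (MSS / RTT) × (C / √p) for a loss rate p. The MSS of 1460 bytes, the 20 ms round-trip time and the loss rates below are illustrative assumptions, not values measured on my connection, and the model describes classic Reno-style congestion avoidance, so treat the output as an order-of-magnitude guide only.

```python
import math


def mathis_throughput_mbps(mss_bytes: int, rtt_s: float, loss_rate: float) -> float:
    """Approximate TCP throughput ceiling in Mbps (Mathis et al. approximation).

    throughput ~= (MSS / RTT) * (C / sqrt(p)), with C = sqrt(3/2) ~= 1.22.
    """
    if rtt_s <= 0:
        raise ValueError("rtt_s must be positive")
    if not 0 < loss_rate <= 1:
        raise ValueError("loss_rate must be in (0, 1]")
    c = math.sqrt(3 / 2)
    bytes_per_second = (mss_bytes / rtt_s) * (c / math.sqrt(loss_rate))
    return bytes_per_second * 8 / 1e6


if __name__ == "__main__":
    # Assumed values for illustration only: 1460-byte MSS, 20 ms RTT.
    for loss in (0.0001, 0.001, 0.01, 0.05):
        mbps = mathis_throughput_mbps(1460, 0.020, loss)
        print(f"loss {loss:.2%}: about {mbps:.1f} Mbps ceiling")
```

The point is the shape of the curve rather than the exact figures: throughput falls with the square root of the loss rate, so every fourfold increase in loss halves the ceiling, and a few percent of loss at a single hop is enough to turn a fast connection into a slow one.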
The initial burst also explains why my command line speed test script didn’t see anything even when the GUI version was showing degradation. I suspect the command line tool downloads only a small file, whereas I know the GUI version adapts how much it downloads to make sure it gets a good average.
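Here is a small sketch of how that could happen. It uses a made-up transfer that runs at 50 Mbps for two seconds and then drops to 1 Mbps; the two rates echo what I see at good and bad times, but the durations and the helper names (`synthetic_transfer`, `average_mbps`) are hypothetical and chosen purely for illustration. I have not checked how either speed test tool actually measures.

```python
def synthetic_transfer(burst_mbps=50.0, burst_s=2, slow_mbps=1.0, total_s=20):
    """Build hypothetical (elapsed_seconds, cumulative_bytes) samples.

    The transfer runs at burst_mbps for burst_s seconds, then at slow_mbps.
    """
    samples = [(0, 0.0)]
    total_bytes = 0.0
    for second in range(1, total_s + 1):
        rate_mbps = burst_mbps if second <= burst_s else slow_mbps
        total_bytes += rate_mbps * 1e6 / 8  # bytes moved during this second
        samples.append((second, total_bytes))
    return samples


def average_mbps(samples, until_s=None):
    """Average throughput in Mbps from the start up to until_s (or the end)."""
    if until_s is not None:
        samples = [s for s in samples if s[0] <= until_s]
    elapsed_s, total_bytes = samples[-1]
    if elapsed_s <= 0:
        raise ValueError("need at least one sample after time zero")
    return total_bytes * 8 / 1e6 / elapsed_s


if __name__ == "__main__":
    samples = synthetic_transfer()
    print("short test (first 2 s):", average_mbps(samples, until_s=2), "Mbps")
    print("long test (full 20 s):", average_mbps(samples), "Mbps")
```

With these assumed numbers, a test that finishes inside the first two seconds reports the full burst speed, while one that keeps downloading for twenty seconds reports an average several times lower. That is consistent with a small-file test missing a problem that a longer, adaptive test catches.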
Tonight, I happened to be on Twitter and noticed this tweet about another Comcast customer in the SF Bay Area having issues, and pointing at the Sunnyvale router:
When I reached out to @Pixel, he got back to me with the comment that he sees issues almost every night too around 10pm ± 1 hour. And he is in Santa Clara. So, now I have reports of the same performance problems from Alameda, Oakland, San Francisco and Santa Clara. But still I get this from Comcast support:
I seriously think that Comcast needs to invest in some training for their support people. When Frank Eliason was running the Twitter support team it seemed to be staffed with people who were at the top of their game. Sadly, that doesn’t seem to be the case any longer.
The same support person did, however, let me know that the ticket (ESL02794458) had been closed with the comment that it was resolved. Let me be crystal clear now:
I have never stated that the issue was resolved.
The request to close the ticket came from Comcast because they were unable to find anything. As I mentioned above, my response to that was that I felt it was unlikely to have changed since it had been the same for over a year, but I had not been home at the right time to be able to confirm or deny it was resolved. That is most definitely not the same as me reporting the issue as resolved. Furthermore, for the record:
The issue is not resolved.
And, as should be apparent to the people at Comcast by now, I am not the only customer affected by this. I strongly suggest they get their act together and find out what is causing these performance issues. And I would also suggest that perhaps rather than sitting in the Hayward head end trying to find ways to pin the blame on my modem/router/Wi-Fi, they take a trip to the Sunnyvale router’s location and have a look at what is happening there. It seems too much of a coincidence that that one node always shows high packet loss rates when there are performance issues, and not when everything is working OK.
Not Just Traceroute & Speedtest
One more thing I’ve heard is that these tools might not show problems when the rest of the network performance is OK. Let me be very clear on this for the folks at Comcast too:
I only run Speedtest and traceroute when I see other things failing.
More specifically, the applications I have seen failing are:
- Video streaming from Amazon and Netflix (stalling, dropping down to the lowest quality stream, and even failing to load the catalog on our Roku box sometimes)
- Media loading in Facebook & Twitter (text loads, but no images, including avatars/profile pics in many cases)
- Web pages timing out on loading (especially more complex sites like the NY Times, the Washington Post or even the Amazon home page)
- Video calling (Google Hangouts, FaceTime, etc.) disconnecting repeatedly
Only when I see something like that failing do I think to run speed tests or traceroute on my home network.