Do Naive xG Models underestimate Expected Goals for Top Teams?

After documenting the implementation of a simple xG model, I have spent quite a bit of time thinking about what makes a good model and how one could go about quantifying its quality. Coincidentally, a few weeks ago Tom Worville posted the chart below, which sparked some discussion around the relationship between high shooting output and overperformance of xG:

"Look at how clinical Son's been in terms of scoring above expected (h/t @Torvaney for the gridline idea here)"