This seems pretty damning if you ask me.
It depends on a number of factors that aren't accounted for in the tweet. I'm not saying it isn't suspicious, but statistical analysis is very easy to feck up, and my questions would be as follows:
1. Did Carlsen and Niemann play a similar number of games over the sample period? Rule 101 of statistical analysis is to express this kind of comparison as a rate - absolute numbers tell us very little on their own. If Carlsen played 100 games and Niemann played 1,000, then Carlsen's rate of 100%-correlation games (2/100 = 2%) would be twice Niemann's (10/1,000 = 1%), despite the headline numbers being 2 vs 10.
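To make the arithmetic concrete, here's a quick Python sketch using the hypothetical game counts above (the totals are invented for illustration, not taken from either player's actual record):

```python
# Toy illustration: comparing counts of 100%-engine-correlation games
# only makes sense as a rate. All numbers here are hypothetical.
carlsen_100_games, carlsen_total = 2, 100     # assumed figures
niemann_100_games, niemann_total = 10, 1000   # assumed figures

carlsen_rate = carlsen_100_games / carlsen_total   # 2 / 100  = 2%
niemann_rate = niemann_100_games / niemann_total   # 10 / 1000 = 1%

print(f"Carlsen: {carlsen_rate:.1%}, Niemann: {niemann_rate:.1%}")
# The headline counts say 2 vs 10, but the rate runs the other way.
```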
2. Were their opponents of a similar stature? My intuition is that it's easier to play close to perfectly when the opponent plays badly: your opportunities are more plentiful, often more obvious, and the best path forward is often simpler to infer or deduce. My suspicion is that Niemann's pool of games probably contains a greater proportion of open tournaments than Carlsen's, who likely plays more frequently in invitationals against fellow super-GMs. If so, Hans likely has a greater number and proportion of games against inferior opponents, against whom it was easier to "play well". This single difference might fundamentally skew the two datasets, and for that reason Magnus may be a less than ideal point of comparison.
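A toy mixture model shows how opponent pools alone could drive the gap. Every number below is invented purely to illustrate the mechanism - it assumes only that a 100%-correlation game is more likely against weak opposition than against super-GMs:

```python
# Hypothetical per-game probabilities of producing a 100%-correlation
# game (both values invented for illustration).
p_weak, p_elite = 0.05, 0.005

# Two made-up schedules: one heavy on open tournaments, one heavy on
# super-GM invitationals. The expected rate is just the weighted mix.
open_heavy  = 0.7 * p_weak + 0.3 * p_elite   # mostly weaker opposition
elite_heavy = 0.1 * p_weak + 0.9 * p_elite   # mostly super-GMs

print(f"open-heavy schedule:  {open_heavy:.2%} of games at 100%")
print(f"elite-heavy schedule: {elite_heavy:.2%} of games at 100%")
```

Under these assumed inputs the open-heavy schedule produces nearly four times the rate of 100% games without anyone cheating - which is why the two datasets need comparable opposition before the comparison means anything.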
3. What were the parameters of the analysis, and are we sure they were the same for both players? As far as I can work out, ChessBase uses a pool of cloud-sourced chess engines to analyse the games, subject to some set of user-defined constraints. I *think* that for a move to count as 'engine correlated' it must be the top suggested line of any one of the 15-25 engines of differing strengths currently providing analysis - already quite a broad net. Because the engines in the pool change in real time, no two analyses can ever be truly identical, even with identical inputs on identical data (though they would probably be quite close). I've also heard tell (not sure) that the user may be able to broaden the definition of 'engine correlated' to include the top 3 lines, which would obviously greatly expand the number of moves counting as such. At any rate, what is certain is that the tool's sensitivity can be adjusted by the user in several other ways before a run (engine depth, time constraint, use of an opening book, etc.). Given that the analysis of Niemann was done by a different person, at a different time, with unclear user-defined parameters, it's difficult to know to what extent that analysis can be compared with the one done here on Carlsen.
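To see why the top-1 vs top-3 setting matters so much, here's a sketch of "any engine's top-N lines" counting. This is not ChessBase's actual algorithm - the engines, moves, and suggestions are all made up - it only demonstrates how widening the definition inflates the correlation count:

```python
# Invented game and invented engine suggestions, purely illustrative.
game_moves = ["Nf3", "d4", "Bg5", "e4"]

# Each engine's ranked suggestions for each move of the game (made up).
engine_tops = {
    "EngineA": [["Nf3", "c4", "d4"], ["d4", "Nc3", "e4"],
                ["Be2", "Bg5", "h3"], ["e4", "d4", "c4"]],
    "EngineB": [["c4", "Nf3", "g3"], ["e4", "d4", "c4"],
                ["h3", "Be2", "Bg5"], ["d4", "c4", "Nf3"]],
}

def correlated(move_idx, move, depth):
    """A move counts as 'correlated' if it appears in the top-`depth`
    suggestions of ANY engine - the net widens as `depth` grows."""
    return any(move in tops[move_idx][:depth]
               for tops in engine_tops.values())

for depth in (1, 3):
    hits = sum(correlated(i, m, depth) for i, m in enumerate(game_moves))
    print(f"top-{depth} definition: {hits}/{len(game_moves)} moves correlated")
```

Even in this tiny made-up game, moving from top-1 to top-3 pushes the count from 3/4 to 4/4 - so two analyses run with different settings simply aren't measuring the same thing.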
In short, far more rigour needs to be employed in the production and comparison of the analyses for us to be able to derive meaning from them.