It is not a "balancing", it is the variable. What you cannot follow with your sheet is the rating variations, which may tell you a bit about the expected upcoming game performance. That's why I'm asking you to share the ratings from the two tests and from the main team page, where you have the last 5 ratings of each player; there you can see the behaviour. A player has a maximum of about 3 games at a max rating: after a 10-9-10 run, a drop comes, no matter the opponent quality, because this exists to create scenario variability. That's probably why you end up saying: "It should be the opposite, to score 2 yesterday and to struggle today against an almost equal match-up, but no, not in this game..."
To create unique scenarios, the engine uses only a few players of each squad as determinant in a match (it does a player simplification), so those who have an expected rating drop can either vanish from the match or underperform.
Following my criteria, this is how you can manipulate the ratings to earn a rating push in a particular match: you synchronize a drop before a final to force a push, and it works most of the time, because the engine has a fixed way of proceeding with the simulation.
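The streak-then-drop rule described above can be sketched as a toy model. Everything here (the streak length of 3, the rating ranges, the function name) is an assumption made for illustration, not the real engine's code:

```python
import random

def next_rating(history, max_rating=10, streak_cap=3):
    """Toy model of the claimed engine rule: after about 3 near-max
    ratings in a row, a drop is forced regardless of opponent quality.
    streak_cap and the rating ranges are assumptions, not engine facts."""
    recent = history[-streak_cap:]
    if len(recent) == streak_cap and all(r >= max_rating - 1 for r in recent):
        # forced drop to create scenario variability
        return random.randint(4, 6)
    # otherwise a normal-to-good rating
    return random.randint(7, max_rating)

# A 10-9-10 run is followed by a forced low rating:
print(next_rating([10, 9, 10]))  # always lands in 4..6 in this model
```

Under this model, "synchronizing a drop before a final" simply means spending the forced-drop game on an unimportant match, so the final falls on the push side of the cycle.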
Then, remember that:
if you play against a % team that is several levels below you, what you do is equalize all your players at the same star level,
because the simulator only covers a limited margin: beyond a distance of 180%, all players sit on the same line. By doing this you can expose the internal programming, and you will notice that a 3* may even score more than a 9*, because against a team 50 levels or 2000% below you, your team is absolutely uniform, the simulator having reached the maximum distance it can cover.
That's how you expose the internal programming to test the players.
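The claimed distance cap can be sketched the same way. The 180% cap, the function name, and the flattening formula are all assumptions for illustration of the idea, not the engine's actual internals:

```python
def effective_quality(player_quality, team_quality, opponent_quality, cap=180):
    """Toy model of the claimed cap: once the quality gap between the
    two teams exceeds ~180%, individual player quality stops mattering
    and every player in the stronger squad is treated identically."""
    gap = team_quality - opponent_quality
    if gap >= cap:
        # flattened: a 3* counts the same as a 9* past the cap
        return team_quality
    return player_quality

# Against a far weaker team (gap >= 180), all players become equal:
print(effective_quality(30, 200, 10))   # 200
print(effective_quality(90, 200, 10))   # 200
# Against a close opponent, individual quality still matters:
print(effective_quality(30, 200, 150))  # 30
```

This is why equalizing all your stars against a vastly weaker opponent is a useful test: if the flattening is real, the individual differences between your players should stop showing in the ratings.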