A player's overall rating is only boosted by the size of the sample and by how much it depends on Impact, ADR and KPR rather than on a low DPR.
The rest is worked out separately: ratings vs top5/10/20 teams, which reveal whether a player padded his overall stats against weak teams or was actually good against any level of opposition, ratings at big and elite events, ratings in the playoffs of such events, and so on.
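To make the "worked out separately" part concrete, here is a rough sketch of how such splits could be computed. It is only an illustration: the function name `ratings_by_opposition`, the `(rating, opponent_rank)` record layout and the sample numbers are all made up for the example, and this is not how HLTV actually does it.

```python
from statistics import mean

def ratings_by_opposition(maps, tiers=(5, 10, 20)):
    """Average rating overall and against top-N opposition.

    maps: list of (rating, opponent_rank) tuples, one per map played.
    This layout is a hypothetical one chosen for the sketch.
    Returns {label: (average_rating or None, maps_in_sample)}.
    """
    def summarize(ratings):
        # Keep the sample size next to the average so that thin
        # samples are visible instead of hidden behind one number.
        if not ratings:
            return (None, 0)
        return (round(mean(ratings), 2), len(ratings))

    result = {"overall": summarize([r for r, _ in maps])}
    for tier in tiers:
        # Each split is its own metric; none of them is added to the overall one.
        result[f"top{tier}"] = summarize([r for r, rank in maps if rank <= tier])
    return result

# Made-up sample: strong maps against low-ranked teams, weaker ones against the top.
sample = [(1.40, 45), (1.35, 30), (1.00, 8), (0.90, 3), (1.10, 15)]
splits = ratings_by_opposition(sample)
for label, (avg, n) in splits.items():
    print(label, avg, n)
```

With numbers like these the overall average sits well above the top10 split, which is exactly the stat-padding pattern the split is meant to expose.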
All of these are different metrics that aren't added to the main one; being just parts of the overall rating, they only reveal its true essence.
EVPs don't actually double the value of ratings either; they just show whether those ratings led to a good event for the team or not. That, of course, increases the weight to some degree, but doesn't quite double it, since M/EVPs (or just high peaks) are considered a part of the overall rating, not some kind of addition to it.
You fairly noted that comparing two players with different sample sizes and sample qualities is somewhat questionable, yet it's not necessarily impossible. If you have some arbitrary criteria, you can come up with a method to deal with such a task.
In my book, HLTV manages to do it quite well even though their system isn't flawless.