A major weakness of the classical Monte Carlo test is that it is biased when the null hypothesis is composite. This problem persists even when the number of simulations tends to infinity. A standard remedy is to perform a double bootstrap test involving two stages of Monte Carlo simulation: under suitable conditions, this test is asymptotically exact for any fixed significance level. However, the two-stage test is shown to perform poorly in some common applications: for a given number of simulations, the test with the smallest achievable significance level can be strongly biased. A ‘balanced’ version of the two-stage test is proposed, which is exact, for all achievable significance levels, when the null hypothesis is simple, and which performs well for composite null hypotheses.
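As a point of reference, the classical (one-stage) Monte Carlo test for a simple null hypothesis can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure proposed here: the function names `monte_carlo_p_value` and `smallest_achievable_level`, the choice of test statistic, and the convention that large values of the statistic count against the null are all hypothetical choices made for the example. With m simulations the p-value is restricted to multiples of 1/(m+1), so the smallest achievable significance level is 1/(m+1).

```python
import random


def monte_carlo_p_value(t_obs, simulate_stat, m, rng):
    """Rank-based Monte Carlo p-value from m simulations under a simple null.

    simulate_stat(rng) must return one draw of the test statistic
    simulated under the (fully specified) null hypothesis.
    """
    sims = [simulate_stat(rng) for _ in range(m)]
    # Count simulated statistics at least as extreme as the observed one
    # (assumption: large values are evidence against the null).
    exceed = sum(1 for t in sims if t >= t_obs)
    return (1 + exceed) / (m + 1)


def smallest_achievable_level(m):
    """Smallest significance level attainable with m simulations."""
    return 1.0 / (m + 1)


def null_mean_stat(rng, n=10):
    # Hypothetical example statistic: mean of n standard normal draws.
    return sum(rng.gauss(0.0, 1.0) for _ in range(n)) / n


if __name__ == "__main__":
    rng = random.Random(0)
    p = monte_carlo_p_value(0.9, null_mean_stat, m=19, rng=rng)
    print(p, smallest_achievable_level(19))
```

When the null hypothesis is composite, `simulate_stat` would have to draw from a fitted null model rather than the true one, which is the source of the bias discussed above; the sketch does not attempt to correct for it.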