Here's an update. I did some research and came across the following paper:
https://www.jstor.org/stable/2682861
The paper describes how to compare the probabilities of two outcomes of a multinomial sample. The idea is to calculate a confidence interval for the odds ratio (rho) of two of the outcomes. The lower and upper limits of the confidence interval can be calculated as:
rho_ij_min = rho( alpha ; Xi ; Xj ) = {[( Xj + 1 ) / Xi ] * F( alpha/2 ; 2Xj + 2 ; 2Xi )}^(-1)
rho_ij_max = rho( alpha' ; Xi ; Xj ) = [( Xi + 1 ) / Xj ] * F( alpha'/2 ; 2Xi + 2 ; 2Xj )
where alpha is the significance level, Xi and Xj are the observed counts of outcomes i and j, and F( a ; d1 ; d2 ) is the upper critical value of the F-distribution with d1 and d2 degrees of freedom at tail probability a.
If the value 1 is not contained in the confidence interval, we can discard the hypothesis that the two probabilities are equal. I'll provide an example as well:
Outcome 1: 13 times
Outcome 2: 2 times
Outcome 3: 9 times
The null hypothesis is that all three outcomes have the same probability (1/3). I calculate the lower and upper limits of the odds ratio for all three pairs at the 10% significance level:
rho_12_min = 1.75
rho_12_max = 40.31
The null hypothesis is discarded at the 10% significance level: outcomes 1 and 2 do not have the same probability.
rho_13_min = 0.65
rho_13_max = 3.31
The null hypothesis is not discarded: there is no evidence that outcomes 1 and 3 have different probabilities.
rho_23_min = 0.03
rho_23_max = 0.92
The null hypothesis is discarded at the 10% significance level: outcomes 2 and 3 do not have the same probability.
The corresponding 99% confidence intervals are:
rho_12 = [1.06 ; 139.7]
rho_13 = [0.44 ; 5.21]
rho_23 = [0.01 ; 1.65]
Only outcomes 1 and 2 are different at the 1% significance level.
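For anyone who wants to recompute such intervals, here is a minimal sketch in Python. It is not the paper's code and makes one assumption that should be stated loudly: instead of looking up F critical values, it uses the equivalent exact binomial (Clopper-Pearson) limits for p = p_i / (p_i + p_j), conditional on Xi + Xj, and converts them to rho = p / (1 - p). This keeps the sketch within the standard library. The function names are my own, and both counts are assumed to be positive.

```python
from math import comb

def binom_tail_ge(x, n, p):
    """P(X >= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x, n + 1))

def solve_increasing(f, target, tol=1e-12):
    """Bisection on (0, 1) for f(p) = target, where f is increasing in p."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def ratio_interval(xi, xj, alpha):
    """Two-sided interval for rho = p_i / p_j (assumes xi > 0 and xj > 0)."""
    n = xi + xj
    # Lower limit: the p at which P(X >= xi) equals alpha/2.
    p_low = solve_increasing(lambda p: binom_tail_ge(xi, n, p), alpha / 2)
    # Upper limit: P(X <= xi) = alpha/2, i.e. P(X >= xi + 1) = 1 - alpha/2.
    p_high = solve_increasing(lambda p: binom_tail_ge(xi + 1, n, p), 1 - alpha / 2)
    # Convert the conditional probability limits to limits on the ratio.
    return p_low / (1 - p_low), p_high / (1 - p_high)

counts = {1: 13, 2: 2, 3: 9}
for i, j in [(1, 2), (1, 3), (2, 3)]:
    low, high = ratio_interval(counts[i], counts[j], alpha=0.10)
    verdict = "keep" if low <= 1 <= high else "discard"
    print(f"rho_{i}{j}: [{low:.2f} ; {high:.2f}] -> {verdict} H0")
```

Calling ratio_interval with alpha=0.01 gives the 99% intervals in the same way.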
Is this a feasible approach?
Here, I have tried the chi-square test on the same data, two groups at a time:
X2_12 = [(13-7.5)^2 + (2-7.5)^2] / 7.5 = 8.067
Critical values:
10% significance: 6.64 ---> Discard H0
1% significance: 10.83 ---> Keep H0
X2_13 = [(13-11)^2 + (9-11)^2] / 11 = 0.73
10% significance: 6.64 ---> Keep H0
X2_23 = [(2-5.5)^2 + (9-5.5)^2] / 5.5 = 4.45
10% significance: 6.64 ---> Keep H0
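A small sketch for cross-checking the pairwise statistics, using only the Python standard library. It assumes each pairwise test has one degree of freedom; in that case the upper-tail probability of the chi-square distribution equals erfc(sqrt(x/2)), so a p-value can be computed without a table. The function names are my own.

```python
from math import erfc, sqrt

def pairwise_chi2(a, b):
    """Chi-square statistic for H0: the two outcomes are equally likely."""
    expected = (a + b) / 2
    return ((a - expected) ** 2 + (b - expected) ** 2) / expected

def p_value_1df(stat):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return erfc(sqrt(stat / 2))

counts = {1: 13, 2: 2, 3: 9}
for i, j in [(1, 2), (1, 3), (2, 3)]:
    stat = pairwise_chi2(counts[i], counts[j])
    print(f"X2_{i}{j} = {stat:.3f}, p = {p_value_1df(stat):.4f}")
```

The printed p-values can be compared directly with the chosen significance level, which gives an independent check on both the statistics and the table lookups.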
As you can see, the conclusions differ depending on which test you choose. Is this because one (or both) of the tests is erroneous, or are my calculations incorrect?
I would appreciate feedback on this matter.