The skillup formula, when S<=200 and skill<=190, is
P= S*(200-skill) / (Y*F*100*200) where S is the adjusted best relevant stat; the difficulty Y is 2, 3 or 4; F is 1 for a skillup on a successful combine or 2 for a skillup on a failed combine; and skill is your skill when you make the attempt. Y was left for us to find out.
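As a minimal sketch, the formula can be written as a small Python function. The function and parameter names are my own, and it only covers the range the formula is stated for (S<=200 and skill<=190).

```python
def skillup_probability(stat, skill, difficulty, failed_combine):
    """Chance of a skillup on one combine, per the formula above.

    stat is S, difficulty is Y, and failed_combine selects F (2 if True, else 1).
    Only meant for stat <= 200 and skill <= 190.
    """
    f = 2 if failed_combine else 1
    return stat * (200 - skill) / (difficulty * f * 100 * 200)


# With S=100 and skill=0 on a successful combine, the chance is exactly 1/Y.
print(skillup_probability(100, 0, 4, failed_combine=False))
```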
In looking for ways to estimate it, I came across the negative binomial distribution. It gives the probability P of seeing x failures before the r-th success when the probability of success is p for one attempt. (I use the Excel docs' definition of r.)
Using Excel notation, the general formula is P= negbinomdist(failures, successes, p) and the formula for one success is
P= p * (1-p)^x.
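Here is a small Python sketch of the one-success case, to show what the formula gives for a few values of x. The function name is hypothetical; it is plain arithmetic, not Excel's negbinomdist.

```python
def prob_failures_before_first_success(x, p):
    """P(exactly x failures, then one success) = p * (1 - p) ** x."""
    return p * (1 - p) ** x


# Chances of 0, 1 and 2 failures before the first success when p = 1/2.
for x in range(3):
    print(x, prob_failures_before_first_success(x, 0.5))
```

Summed over all x, these probabilities add up to 1, since the first success has to come eventually.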
Our situation is more complex because p when skilling up on successful combines (c) is different from p when skilling up on failed combines (f). So we can't write P= p * (1-p)^(c+f-1). It's still worth finding a way to reduce the failures to a single number, to take advantage of what is known about the negative binomial distribution.
We know that the chance of skilling up on a successful combine is twice that of skilling up on a failure, so let's say that the number of attempts for a single skillup is c/2+f+0.5. (The analysis works better with the +0.5; don't ask me why.)
If we set things up so that p=1/Y, we can calculate the following table for the percentage of occurrence of each number of attempts when Y is 2, 3 or 4. (The number of attempts can go to infinity, but I accumulated all the attempts greater than or equal to 10 into one line.) We can test for the closest match between the distribution of our skillup attempts and these theoretical distributions. I used the chi-squared test. I also noticed from these distributions that the median plus 1 is an estimate of Y.
Code:
Trials  Y=2  Y=3  Y=4
1       50%  33%  25%
2       25%  22%  19%
3       13%  15%  14%
4        6%  10%  11%
5        3%   7%   8%
6        2%   4%   6%
7        1%   3%   4%
8        0%   2%   3%
9        0%   1%   3%
10+      0%   3%   8%
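The table can be regenerated with a few lines of Python, and the same shares can feed a chi-squared comparison. This is only a sketch: the function names are mine, the observed counts are made up for illustration, and the statistic is the plain Pearson sum rather than whatever the spreadsheet computes.

```python
def attempt_distribution(y, max_attempts=10):
    """Share of skillups taking 1, 2, ..., max_attempts-or-more attempts when p = 1/y."""
    p = 1 / y
    shares = [p * (1 - p) ** (n - 1) for n in range(1, max_attempts)]
    # Last bucket: everything at max_attempts or beyond.
    shares.append((1 - p) ** (max_attempts - 1))
    return shares


def chi_squared(observed, expected_shares):
    """Pearson chi-squared statistic of observed counts against expected shares."""
    total = sum(observed)
    return sum(
        (o - total * e) ** 2 / (total * e)
        for o, e in zip(observed, expected_shares)
    )


# Made-up counts of skillups by attempts needed (1, 2, ..., 10+); not real data.
observed = [7, 5, 3, 2, 1, 1, 0, 1, 0, 0]

for y in (2, 3, 4):
    shares = attempt_distribution(y)
    percentages = [round(100 * s) for s in shares]
    print(y, percentages, round(chi_squared(observed, shares), 2))
```

The Y with the smallest statistic is the closest match. With this few data points many expected counts are small, so the statistic is only a rough guide.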
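We can also simplify the formulas for the negative binomial statistics.
With r=1 and p=1/Y, the average number of failures r*(1-p)/p simplifies to Y-1. In other words, the average number of attempts (the failures plus the one success) is equal to Y when p=1/Y.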
Every skillup gives an estimate of Y which we can then average.
The variance r*(1-p)/p^2 simplifies to Y*(Y-1). I used the F ratio to test the difference between results and theory. (The F here is different from the F in the skillup formula; it's just the name of that statistic in the literature.)
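Both simplifications can be checked numerically. The sketch below (function name hypothetical) sums the one-success distribution directly, truncating the infinite sum at a point where the remaining terms are negligible.

```python
def attempt_moments(y, terms=2000):
    """Mean and variance of the number of attempts to the first success when p = 1/y."""
    p = 1 / y
    # Probability that the first success comes on attempt n, for n = 1..terms.
    probs = [(n, p * (1 - p) ** (n - 1)) for n in range(1, terms + 1)]
    mean = sum(n * q for n, q in probs)
    variance = sum((n - mean) ** 2 * q for n, q in probs)
    return mean, variance


for y in (2, 3, 4):
    mean, variance = attempt_moments(y)
    print(y, round(mean, 6), round(variance, 6))
```

The mean comes out at Y attempts (that is, Y-1 failures plus the success) and the variance at Y*(Y-1).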
We now need to adjust the skillup data so that p=1/Y. From the skillup formula, we see that p=1/Y when S=100, F=1 and skill=0. So we adjust each individual skillup to get
Yi = ((c/2+f) +0.5) * (S/100) * ((200-skill)/200) when the skillup came on a combine; and
Yi = ((c/2+f)/2 +0.5) * (S/100) * ((200-skill)/200) when the skillup came on a failed combine.
We have to round those fractional frequencies, and we have to set the minimum frequency to 1.
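A sketch of the adjustment in Python, following the two Yi formulas as written. The names are my own, and the rounding rule (round half up) is an assumption, since the post doesn't say which rounding the spreadsheet uses.

```python
import math


def adjusted_y(c, f, stat, skill, skillup_on_failure):
    """One skillup's estimate of Y, following the two Yi formulas above.

    c and f are the successful and failed combines counted for this skillup.
    """
    attempts = c / 2 + f
    if skillup_on_failure:
        attempts /= 2
    return (attempts + 0.5) * (stat / 100) * ((200 - skill) / 200)


def to_frequency(yi):
    """Round Yi to a whole number of attempts, never below 1.

    Round-half-up is an assumption, not something the post specifies.
    """
    return max(1, math.floor(yi + 0.5))


# Hypothetical skillup: 4 successes and 2 failures at S=150 and skill=100.
yi = adjusted_y(4, 2, 150, 100, skillup_on_failure=False)
print(yi, to_frequency(yi))
```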
We're working with so few data points that hell runs can distort the results. The best solution is more data, but I did a version of the mean and of the analysis of variance with the Yi capped at 10. I think this is a better compromise than excluding the hell runs altogether.
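To see what the cap does, here is a sketch with made-up estimates that include one hell run. The function name and the sample values are hypothetical; only the cap of 10 comes from the text above.

```python
def mean_y(estimates, cap=None):
    """Average the per-skillup estimates of Y, optionally capping each one first."""
    if cap is not None:
        estimates = [min(y, cap) for y in estimates]
    return sum(estimates) / len(estimates)


# Made-up estimates with one hell run of 31 attempts; not real data.
sample = [1, 2, 2, 3, 1, 4, 31]

print(round(mean_y(sample), 2))          # uncapped mean, pulled up by the hell run
print(round(mean_y(sample, cap=10), 2))  # mean with each Yi capped at 10
```

The capped version still counts the hell run as a long streak, but it no longer dominates the average the way it does uncapped.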
I got the following results so far:
Baking 3
Brewing 3
Fletching 4
Jewelcraft 4 <- new estimate
Pottery 4
Smithing 2
Tailoring 2
Data for the level-capped skills is not as easy to get. With the technique I used to gather data (explained here), I can get only 4 skillups per character.
I attached the spreadsheet I used to this message. To switch tradeskills, just select the one you want from the drop-down list on the "Analysis" sheet. Trivial levels are included although this analysis does not need them.
Edit:
I added data from a monk on Live and from four magicians with S=150 to the data section of the spreadsheet.
I also changed the estimate for jewelcraft to 4.
I analyse the new data in a following thread.