Hi there @Lloyd_Hightower,
I was curious about how the shuffling algorithm works, so I looked for the implementation in the JS files and found this:
x3 = function(a) {
    return a.jb.sort(() => .5 - Math.random()).slice(0, 3)
}
This function does the following:
- It takes an array, a.jb, which I assume contains all the projects.
- It then uses the sort() method with a comparison function that returns a random value between -0.5 and 0.5: () => .5 - Math.random()
- After shuffling, it uses slice(0, 3) to select the first 3 items from the shuffled array.
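To make the steps above easy to try out, here is a minimal sketch that counts how often each item lands in the first 3 slots, once with the sort-based approach and once with a Fisher-Yates shuffle for comparison. The names pickBySort, pickByFisherYates, countSelections and mulberry32 are my own and not from the site's code. It also assumes every run starts from the same original order, whereas the real sort() reorders a.jb in place.

```javascript
// Small seedable PRNG (mulberry32) so runs are repeatable; Math.random works too.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Same idea as the minified function: sort with a random comparator, keep the first k.
function pickBySort(items, k, rng) {
  return [...items].sort(() => 0.5 - rng()).slice(0, k);
}

// Comparison: Fisher-Yates shuffle of a copy, keep the first k.
function pickByFisherYates(items, k, rng) {
  const arr = [...items];
  for (let i = arr.length - 1; i > 0; i--) {
    const j = Math.floor(rng() * (i + 1));
    [arr[i], arr[j]] = [arr[j], arr[i]];
  }
  return arr.slice(0, k);
}

// Count how many times each of n items is picked over `runs` runs.
function countSelections(pick, n, k, runs, rng) {
  const items = Array.from({ length: n }, (_, i) => i);
  const counts = new Array(n).fill(0);
  for (let r = 0; r < runs; r++) {
    for (const id of pick(items, k, rng)) counts[id]++;
  }
  return counts;
}

const sortCounts = countSelections(pickBySort, 20, 3, 20000, mulberry32(1));
const fyCounts = countSelections(pickByFisherYates, 20, 3, 20000, mulberry32(1));
console.log("sort-based min/max:", Math.min(...sortCounts), Math.max(...sortCounts));
console.log("Fisher-Yates min/max:", Math.min(...fyCounts), Math.max(...fyCounts));
```

The exact counts for the sort-based version depend on the JavaScript engine's sort implementation, so the useful thing to compare is the spread between min and max, not specific numbers.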
I was also curious whether it is statistically “fair” or “unfair”, so I ran a distribution analysis: 10,000 iterations of the exact same algorithm over 3046 projects, with 3 shown at every shuffle. These were my findings:
Total Projects: 3046
Projects Selected per Run: 3
Basic Statistics:
Expected Selections per Project: 9.85
Mean Selection Rate: 0.000985
Standard Deviation: 0.000574
Minimum Selection Rate: 0.000000
Median Selection Rate: 0.000900
Maximum Selection Rate: 0.004700
Fairness Metrics:
Coefficient of Variation: 0.583104
Gini Coefficient: 0.299487
Kolmogorov-Smirnov Statistic: 0.097298
The Coefficient of Variation (CV) of 0.583104, which is just the standard deviation (0.000574) divided by the mean (0.000985), indicates high variability in project selection rates. For comparison, a perfectly fair draw with about 9.85 expected selections per project would still show a CV of roughly 0.32 from sampling noise alone, so the observed spread is well above that. The Gini Coefficient of 0.299487 reveals moderate inequality in selection frequency, meaning some projects are chosen more often than others. The Kolmogorov-Smirnov (KS) statistic of 0.097298 means the selection distribution deviates significantly from uniformity (p < 0.05), which is strong evidence of non-uniform selection. The range of selection rates, from 0 to 0.004700 (almost five times the mean), shows some projects are heavily favored while others are never selected. Collectively, these metrics show that it’s definitely not a very uniform shuffling algorithm.
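For anyone who wants to recompute these numbers from their own selection counts, here is a rough sketch of the CV and Gini calculations. The function names are mine, the Gini formula is one common sorted-values form (not necessarily the one used for the figures above), and fairBaselineCV assumes each project's count is binomial with probability k/n per run.

```javascript
function mean(xs) {
  return xs.reduce((s, x) => s + x, 0) / xs.length;
}

// Coefficient of variation: population standard deviation divided by the mean.
function coefficientOfVariation(xs) {
  const m = mean(xs);
  const variance = mean(xs.map((x) => (x - m) ** 2));
  return Math.sqrt(variance) / m;
}

// Gini coefficient from ascending-sorted values:
// G = 2 * sum(i * x_i) / (n * sum(x_i)) - (n + 1) / n, with i starting at 1.
function giniCoefficient(xs) {
  const sorted = [...xs].sort((a, b) => a - b);
  const n = sorted.length;
  const total = sorted.reduce((s, x) => s + x, 0);
  const weighted = sorted.reduce((s, x, i) => s + (i + 1) * x, 0);
  return (2 * weighted) / (n * total) - (n + 1) / n;
}

// CV that sampling noise alone would produce if each of n projects is shown
// with probability k/n in every run (binomial counts, an assumption).
function fairBaselineCV(n, k, runs) {
  const p = k / n;
  return Math.sqrt((1 - p) / (runs * p));
}

// 10,000 runs is an assumption: it is the run count that gives
// 9.85 expected selections per project for 3046 projects, 3 per run.
console.log("fair baseline CV:", fairBaselineCV(3046, 3, 10000).toFixed(4));
```

Comparing the measured CV against fairBaselineCV is more informative than looking at the CV alone, since even a perfectly fair draw is never perfectly even over a finite number of runs.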
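This probably isn’t a very big deal, but if this algorithm is actually being used to shuffle the projects, it isn’t doing a great job: sort() expects a consistent comparison function, and with a random one the resulting order is biased in a way that depends on the browser’s sort implementation.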
Thanks for reading!
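edit: I put together a much more uniform algorithm for the shuffling. It shuffles the pool with a Fisher-Yates shuffle and deals projects out without replacement, reshuffling once the pool runs out. I’m just going to leave it here in case it helps: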
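// Deals projects out of a pool without replacement, so every project is
// shown once per cycle before the pool is refilled and reshuffled.
class UniformProjectSelector {
    constructor(totalProjects, selectCount) {
        this.totalProjects = totalProjects;
        this.selectCount = selectCount;
        // Note: the very first pool is in ID order; it is only shuffled on refill.
        this.projectPool = this.initializeProjectPool();
        this.cycleCount = 0;
    }

    // Project IDs 1..totalProjects, in order.
    initializeProjectPool() {
        return Array.from({ length: this.totalProjects }, (_, i) => i + 1);
    }

    // In-place Fisher-Yates shuffle.
    shuffleArray(array) {
        for (let i = array.length - 1; i > 0; i--) {
            const j = Math.floor(Math.random() * (i + 1));
            [array[i], array[j]] = [array[j], array[i]];
        }
    }

    selectProjects() {
        // Not enough projects left: hand out the leftovers and top up
        // from a freshly shuffled pool.
        if (this.projectPool.length < this.selectCount) {
            this.cycleCount++;
            const remainingProjects = this.projectPool.slice();
            this.projectPool = this.initializeProjectPool();
            this.shuffleArray(this.projectPool);
            // Caveat: the items taken from the new pool here are not removed
            // from it, so they are dealt a second time during the new cycle.
            return [...remainingProjects, ...this.projectPool.slice(0, this.selectCount - remainingProjects.length)];
        }
        // Normal case: take the next selectCount projects off the front of the pool.
        const selectedProjects = this.projectPool.slice(0, this.selectCount);
        this.projectPool = this.projectPool.slice(this.selectCount);
        return selectedProjects;
    }

    // Snapshot of the selector's state, for reporting.
    getStats() {
        return {
            totalProjects: this.totalProjects,
            selectCount: this.selectCount,
            remainingInPool: this.projectPool.length,
            completeCycles: this.cycleCount
        };
    }
}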
And here are the statistics on the new algorithm:
Final stats: {
totalProjects: 3046,
selectCount: 3,
remainingInPool: 478,
completeCycles: 9
}
Uniformity Test Results:
Chi-squared statistic: 42.697999999999155
Degrees of freedom: 3045
Expected selections per project: 9.848982271831911
Actual min selections: 9
Actual max selections: 11
- The chi-squared statistic (42.698) is far lower than the degrees of freedom (3045); for independent random draws it would be expected to land near 3045. This shows that the distribution is even more uniform than we would expect by chance, which makes sense because the pool is dealt out without replacement.
- The difference between the minimum (9) and maximum (11) selections is only 2, which is remarkably small given the large number of projects and selections. No project is significantly over- or under-represented.
- The completion of 9 full cycles ensures that every project has been selected at least 9 times. Most of the rest sit at 10 because of the current incomplete cycle, and the few that reach 11 come from the refill step, which hands out the first items of the reshuffled pool without removing them, so they are dealt again in the same cycle.
- The expected selections (9.849) fall inside the actual range (9-11), which is consistent with the algorithm being fair.
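As a possible follow-up, here is a sketch of a variant that shuffles the very first pool as well and removes the top-up items from the refilled pool, together with the chi-squared calculation used for the uniformity check. CyclingSelector and chiSquared are names I made up for this sketch, and it assumes selectCount is not larger than totalProjects. Skipping duplicates during the top-up is my own choice and means the order is not a pure Fisher-Yates order at cycle boundaries.

```javascript
class CyclingSelector {
  // rng is injectable for testing; it defaults to Math.random.
  constructor(totalProjects, selectCount, rng = Math.random) {
    this.totalProjects = totalProjects;
    this.selectCount = selectCount;
    this.rng = rng;
    this.pool = this.freshPool(); // shuffled from the start
  }

  // New pool of IDs 1..totalProjects, shuffled with Fisher-Yates.
  freshPool() {
    const pool = Array.from({ length: this.totalProjects }, (_, i) => i + 1);
    for (let i = pool.length - 1; i > 0; i--) {
      const j = Math.floor(this.rng() * (i + 1));
      [pool[i], pool[j]] = [pool[j], pool[i]];
    }
    return pool;
  }

  selectProjects() {
    // splice removes what it returns, so nothing is dealt twice in a cycle.
    const picked = this.pool.splice(0, this.selectCount);
    if (picked.length < this.selectCount) {
      this.pool = this.freshPool();
      // Top up from the new pool, skipping IDs already in this selection.
      while (picked.length < this.selectCount) {
        const idx = this.pool.findIndex((id) => !picked.includes(id));
        picked.push(this.pool.splice(idx, 1)[0]);
      }
    }
    return picked;
  }
}

// Pearson chi-squared statistic against a uniform expected count.
function chiSquared(counts, expected) {
  return counts.reduce((s, c) => s + (c - expected) ** 2 / expected, 0);
}

// Same setup as the test above: 3046 projects, 3 per run, 10,000 runs.
const total = 3046, perRun = 3, runs = 10000;
const selector = new CyclingSelector(total, perRun);
const counts = new Array(total).fill(0);
for (let r = 0; r < runs; r++) {
  for (const id of selector.selectProjects()) counts[id - 1]++;
}
console.log("min/max:", Math.min(...counts), Math.max(...counts));
console.log("chi-squared:", chiSquared(counts, (runs * perRun) / total));
```

Because each project is dealt exactly once per cycle in this variant, the counts of any two projects should never differ by more than 1.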