How do I calculate if a test like this is statistically significant?
I let people rate how much they like different things on a scale of 1-10. How do I actually tell if people like one thing more than another thing if the sample sizes are different? This is not about any real scientific study, more like a personal test :)
For example, if one thing got voted on 10 times and has an average value of 6.5, and another thing got voted on 6 times and has a 6.1, is the 6.5 thing actually more liked? Or are these samples so small that the result is mostly random and could easily go either way?
I've never done anything like this, so if someone could explain it or point me to the right keywords/links, that would be hugely appreciated :)
I've read up a bit on p-value determination, but I'm not sure what my "null hypothesis" is here actually, numerically. If I'd put it in words I guess my hypothesis would be "this thing is more liked than the other thing", but honestly, it seems like my specific case would be much simpler than all the stuff I'm reading here :D
Your null hypothesis is the thing you're trying to disprove. For example, if I wanted to run a study to assess the effect of adding a certain growth hormone to a cell culture, my null hypothesis would be "there is no effect". In your case, it would be "there is no difference in how much different things are liked". Numerically, that means the true average ratings of the two things are equal, i.e. their difference is zero. From there, you'd run your study and do your statistical analysis. There are different methods for that depending on the type of data, the number of groups you're comparing, the sample size, etc. I'm not a statistician, so I can't say which methods are best for what you're planning.
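When it comes to the p-value, the simple version is this: it's the probability of seeing a difference at least as big as the one you observed if your null hypothesis were true. A small p-value (below 0.05 is the usual convention) means your data would be surprising if there were really no difference. It's tempting to read it as the likelihood that your null hypothesis is true, and that's a common shorthand, but it's not what the number means.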
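To make this concrete for the 10-votes-vs-6-votes example, here is a minimal sketch of one possible approach, a permutation test, using only the Python standard library. It's not the only reasonable method, and the individual ratings below are made up (the question only gives the averages); they were picked so the means come out at 6.5 and roughly 6.2. The function name `permutation_p_value` is just an illustrative choice. The idea: if there's really no difference, the labels "thing A" and "thing B" are interchangeable, so we reshuffle them in every possible way and count how often the gap between the averages is at least as big as the one actually observed.

```python
from itertools import combinations


def permutation_p_value(a, b):
    """Exact two-sided permutation test for a difference in means.

    Returns the fraction of all possible relabelings of the pooled
    ratings whose absolute mean difference is at least as large as
    the observed one.
    """
    pooled = list(a) + list(b)
    total = sum(pooled)
    n_a, n_b = len(a), len(b)
    observed = abs(sum(a) / n_a - sum(b) / n_b)

    at_least_as_extreme = 0
    n_splits = 0
    # Every way of choosing which n_b ratings get the "B" label.
    for idx in combinations(range(len(pooled)), n_b):
        sum_b = sum(pooled[i] for i in idx)
        diff = abs((total - sum_b) / n_a - sum_b / n_b)
        # Small tolerance so floating-point ties count as ties.
        if diff >= observed - 1e-12:
            at_least_as_extreme += 1
        n_splits += 1
    return at_least_as_extreme / n_splits


# Hypothetical ratings, NOT real data from the question.
ratings_a = [7, 5, 8, 6, 9, 4, 7, 6, 5, 8]  # 10 votes, mean 6.5
ratings_b = [6, 7, 5, 8, 4, 7]              # 6 votes, mean about 6.2

p_value = permutation_p_value(ratings_a, ratings_b)
print(f"p-value: {p_value:.3f}")
```

With ratings this spread out and samples this small, the p-value comes out well above 0.05, so a gap of a few tenths of a point could easily be chance. Useful keywords if you want to read further: "Welch's t-test" (in SciPy that's `scipy.stats.ttest_ind(a, b, equal_var=False)`), "Mann-Whitney U test", and "permutation test".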