Anybody Interested in Analyzing Last Summer’s /r/statistics user survey?

The last year has really taken off in terms of opportunities for me and I've been a less than ideal mod. I've had the survey results sitting in an R script since last summer but never got around to finishing the analysis. What's the community's interest in finishing it? It wasn't the greatest survey, but there are definitely some interesting results. I can either post the data here publicly or PM it to whoever is interested. submitted by /u/keepitsalty

Comparing Categorical Data

Hi, I have two categorical data sets that I wish to compare. Each data set consists of a probability reading vs. a letter reading. This image illustrates it with the red and green lines: What suggestions do you have for comparing the two data sets – the area overlap perhaps? Any other ideas? Thanks. submitted by /u/largecontainer95

What test to use: distance from forest edge vs. plant abundance

I did an ecology study where I measured how the average abundance of 5 plant species changes moving from a meadow across the edge of a forest. I took 10 measurements along a transect. 3 of the species show an approximately normal distribution and 2 show a linear increase. What statistical test should I use to show how distance affects abundance for each of the species? Perhaps Pearson's correlation or a t-test with pairs of distance measurements for the normally distributed species and Spearman's rank for the non-normally distributed species? I am having trouble picking the most suitable test. Thanks for helping an absolute statistics noob! submitted by /u/opezpzlmnsqz
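
For reference, here is a minimal sketch of Spearman's rank correlation between distance and abundance, written with only the Python standard library. The transect values are made up for illustration (they are not the poster's data), and `average_ranks`, `pearson` and `spearman` are hypothetical helper names. It also shows one thing worth keeping in mind: a species whose abundance rises and then falls along the transect can have a rank correlation near zero even though distance clearly matters.

```python
import math

def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of ranks pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical transect: 10 sampling points, meadow (0) to forest interior (9).
distance = list(range(10))
steady_increase = [1, 2, 4, 5, 7, 8, 10, 11, 13, 14]  # made-up abundances
hump_shaped = [1, 3, 6, 9, 10, 10, 9, 6, 3, 1]        # made-up abundances

rho_increase = spearman(distance, steady_increase)
rho_hump = spearman(distance, hump_shaped)
print(f"steady increase: rho = {rho_increase:.3f}")
print(f"hump-shaped:     rho = {rho_hump:.3f}")
```

Because a correlation coefficient only measures a monotonic (or, for Pearson, linear) trend, a peak in the middle of the transect would need a different approach, for example fitting a curve with a quadratic term.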

What’s the best correlation test?

Hello guys, my statistical knowledge is less than basic. I'm a newbie. I am doing a medical study (as a medical student). I want to correlate spleen stiffness values, which are measured on a continuous scale in kPa (from 10 kPa to 60 kPa), with the presence/absence of esophageal varices, coded as 0 (absence) or 1 (presence). What is the best statistical test that I could use to see if there is a statistically significant correlation? I'm using SPSS. submitted by /u/AcceptableDesigner
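
One common option for a continuous variable paired with a 0/1 variable is the point-biserial correlation, which is simply Pearson's correlation with the binary variable coded as 0 and 1. The sketch below is in Python rather than SPSS and uses invented numbers (not real patient data). `point_biserial` is a hypothetical helper name.

```python
import math

def point_biserial(values, groups):
    """Pearson correlation between a continuous variable and a 0/1 variable."""
    n = len(values)
    mv, mg = sum(values) / n, sum(groups) / n
    cov = sum((v - mv) * (g - mg) for v, g in zip(values, groups))
    var_v = sum((v - mv) ** 2 for v in values)
    var_g = sum((g - mg) ** 2 for g in groups)
    return cov / math.sqrt(var_v * var_g)

# Invented example: spleen stiffness in kPa and varices status (0 = absent, 1 = present).
stiffness_kpa = [12.0, 18.5, 22.0, 25.5, 31.0, 38.0, 44.5, 47.0, 52.0, 58.5]
varices = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]

r_pb = point_biserial(stiffness_kpa, varices)
print(f"point-biserial r = {r_pb:.3f}")
```

Testing whether this coefficient differs from zero is equivalent to an independent-samples t-test (equal variances assumed) comparing mean stiffness between the two groups. If the goal is to predict varices from stiffness, logistic regression is another route to consider.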

G-Test instead of Chi-Squared Test

Hello, today in class our lecturer mentioned that we should use the G-Test when an observed value (in a cell) is below 5. Unfortunately, he had no slides about the G-Test, but I've looked it up and I pretty much understand the formula. Here are the questions though: What exactly does the G-Test do? And: What do I do after I've calculated the G-Test statistic? I get the formula but I can't find what to do afterwards or what to do with the statistic. Thanks in advance. submitted by /u/VeronsFabulousBeard
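
As a rough illustration of the "what to do afterwards" step: the G statistic, G = 2 Σ O ln(O/E), is compared against a chi-squared distribution with the same degrees of freedom as the ordinary chi-squared test, (rows − 1)(columns − 1) for a contingency table. The sketch below handles only a 2x2 table, so df = 1 and the p-value can be computed with the standard library. The counts are invented and `g_test_2x2` is a hypothetical helper name.

```python
import math

def g_test_2x2(table):
    """G-test of independence for a 2x2 table of observed counts.

    Returns (G, p_value), with the p-value from a chi-squared
    distribution with 1 degree of freedom.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    g = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if observed > 0:  # a cell with zero count contributes 0
                g += observed * math.log(observed / expected)
    g = max(2.0 * g, 0.0)  # guard against tiny negative rounding error
    # Upper tail of chi-squared with 1 df: P(X > g) = erfc(sqrt(g / 2))
    p_value = math.erfc(math.sqrt(g / 2.0))
    return g, p_value

# Invented counts, for illustration only.
observed = [[12, 3],
            [4, 11]]
g_stat, p = g_test_2x2(observed)
print(f"G = {g_stat:.3f}, p = {p:.4f}")
```

A small p-value is read the same way as for the chi-squared test: the observed counts would be unlikely if the row and column variables were independent.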

Best options for repeated measures analysis?

Trying to identify statistical differences in testosterone levels in a longitudinal study. Group of 12 (n=12) measured (standardized) at 15 points (k=15) throughout the year. It's been a while since I've tackled stats, so I'm working through some of the options I have. Just looking for some advice and pointers on the best options to analyze this data (I'm using an old version of SPSS). IV: Time. DV: Testosterone level. What's my best strategy for identifying statistical differences between the 15 time points? I originally looked at repeated measures ANOVA, but there are two individuals with missing data points (one of 15 points missing each, so I don't want to throw the whole set of data out). Also (correct me if I'm wrong), normality has to hold at each of the time points, which isn't the case with this data set, so my understanding is that the assumptions of RM ANOVA are violated? Would really appreciate some direction, thanks. submitted by /u/HockeyAndThings

Can data be a combination of missing at random, missing not at random, mcar?

If I have a dataset which is drawn from multiple sources – part hospital records, part telephone calls, and part nurse visits – would it make sense if I said the missing data is both missing completely at random and missing not at random? I have a variable from the survey portion where the missingness is correlated with the variable itself, a case of MNAR. But some data is missing from the hospitals, which I know has nothing to do with the survey, and it appears to be missing completely at random. Can I say the data is a combination of MCAR and MNAR? It makes sense to me… but I have never seen it in the literature. submitted by /u/windupcrow
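
To make the distinction concrete, here is a small simulation sketch in which two variables in the same dataset have different missingness mechanisms. Everything in it is an assumption for illustration: the variable names, the standard normal distributions, and the missingness rates are invented and have nothing to do with the poster's data.

```python
import random

random.seed(0)  # fixed seed so the illustration is repeatable

n = 10_000
# Hypothetical complete data for two variables.
hospital_value = [random.gauss(0, 1) for _ in range(n)]
survey_value = [random.gauss(0, 1) for _ in range(n)]

# MCAR: every hospital value has the same 30% chance of being lost,
# regardless of its value or anything else.
hospital_observed = [v for v in hospital_value if random.random() >= 0.3]

# MNAR: survey values above 0 go unreported 80% of the time, so the
# chance of being missing depends on the unobserved value itself.
survey_observed = [v for v in survey_value
                   if not (v > 0 and random.random() < 0.8)]

def mean(xs):
    return sum(xs) / len(xs)

hospital_bias = mean(hospital_observed) - mean(hospital_value)
survey_bias = mean(survey_observed) - mean(survey_value)
print(f"MCAR variable: bias in observed mean = {hospital_bias:+.3f}")
print(f"MNAR variable: bias in observed mean = {survey_bias:+.3f}")
```

In this setup the observed mean of the MCAR variable stays close to the true mean, while the MNAR variable's observed mean is pulled well below it. One way to read this is that mechanisms can reasonably be described variable by variable, but any analysis that involves the MNAR variable still has to deal with the MNAR part.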