## Need some suggestions for papers!

Hello /r/statistics, I am starting my master's in statistics early and want to finish the thesis as soon as possible. I had a few ideas, and one of them involves gene expression data for which I also have each subject's race, age, and whether or not they developed Crohn's disease.

So far, all I have done is a few data-science-style applications aimed at getting a good prediction. However, I want to approach this more rigorously so that I can gain some meaningful insights.

Do any of you know some good papers that focus on analyzing gene expression data? What are some good introductory papers on the field and on analyzing it statistically? Which are the most important ones?

Also, any advice or suggestions about my master's (and perhaps life in general) would be greatly appreciated. Thank you.

submitted by /u/bthi

## Unsure what test to use for particular experiment involving colors and popularity.

I have conducted an experiment to determine whether environmental factors can influence participants' responses. Specifically, I wonder if wearing all blue will influence the participant to favor the color blue when asked what their favorite color is.

I randomly surveyed 80 people, asking each what their favorite clothing color is. The independent variable is the color of my clothing, and the dependent variable is the response. I have come back with data that I entered into Excel.

I feel like it should be a chi-squared test to check whether blue has gained in popularity. I would really appreciate any ideas on how to run a significance test to see whether the difference is due to chance or a sign that participants are more likely to pick blue when the researcher is wearing blue. I am using a TI-84 Plus CE to do the tests.

Edit: I have been trying chi-squared goodness-of-fit tests. The issue is that the other colors don't change much, other than that more people pick blue. Is there a way to isolate just blue's variance, or to see how popular blue is, rather than comparing how well the entire experiment's distribution matches the control? I get the CNTRB list of each category's χ² contribution, but how would I find the p-value of just one category? I can do χ²cdf, but what would I put in for the degrees of freedom?

Edit 2: Would a one-proportion z-test work? I could use the null p = .2125 against the alternative p > .2125, with blue/total as the sample proportion.
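For reference, the one-proportion z-test described in Edit 2 could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the function name `one_proportion_z_test` is made up here, and the count of 28 blue responses out of 80 is a hypothetical placeholder, not the survey's actual data. Only the baseline proportion .2125 comes from the post.

```python
import math

def one_proportion_z_test(successes, n, p0):
    """Upper-tail z-test of H0: p = p0 against H1: p > p0."""
    p_hat = successes / n                      # observed sample proportion
    se = math.sqrt(p0 * (1 - p0) / n)          # standard error under H0
    z = (p_hat - p0) / se
    # P(Z >= z) for a standard normal, via the complementary error function
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value

# HYPOTHETICAL count: 28 of 80 respondents picking blue (not the real data).
z_one, p_one = one_proportion_z_test(28, 80, 0.2125)
print(f"z = {z_one:.3f}, one-sided p = {p_one:.4f}")
```

Note that this treats .2125 as a fixed, known baseline. If the baseline itself was estimated from a control sample, it carries its own sampling error that this test ignores.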

Edit 3: Messing around more with tests. I think I should use a two-proportion z-test with null p1=p2 and alternative p1
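A pooled two-proportion z-test like the one mentioned in Edit 3 might look something like the sketch below. Everything specific in it is an assumption: the function name is invented, the counts (28 of 80 versus 17 of 80) and the equal group sizes are hypothetical, and the direction of the alternative (group 1 higher than group 2) is picked only for illustration since the post does not state it.

```python
import math

def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled z-test of H0: p1 = p2; returns z and the upper-tail p-value."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)             # common proportion under H0
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Upper tail, i.e. the p-value for H1: p1 > p2 (ASSUMED direction).
    # For H1: p1 < p2, use 1 - p_upper instead.
    p_upper = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_upper

# HYPOTHETICAL counts: group 1 = researcher wearing blue, group 2 = control.
z_two, p_two = two_proportion_z_test(28, 80, 17, 80)
print(f"z = {z_two:.3f}, one-sided p = {p_two:.4f}")
```

Unlike the one-proportion version, this accounts for sampling variability in both groups, which is why the same difference in proportions gives a smaller z here.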

submitted by /u/AgentHunt_

## Prerequisite Knowledge Going into PhD

Hi,

So I will be starting my PhD in Statistics this fall, and I was wondering how much knowledge programs assume entering students have. The main reason I ask is that the prof I want to work with gave me some of his current papers to read, and there is a decent amount of material in them that I have only vaguely seen (specifically concentration inequalities and martingales). I have done some measure-theoretic probability, but not to the point where I can totally understand what is being done in the papers.

Thanks

submitted by /u/alphabetaglamma

