[Q] Best way to represent ranked data?

I am not a student and this is not for homework. I will try to keep it brief.

Imagine this scenario: six widgets from six different manufacturers.

Six people are asked to rank the widgets in order from best (1) to worst (6). For each individual, no two widgets can occupy the same rank (no ties).

Is there a way to represent the collective ranking of the widgets across all six people?

Thanks in advance.
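One common summary, offered as a sketch rather than the definitive answer, is the mean rank of each widget. With complete, tie-free rankings this gives the same ordering as a Borda count. The ballots below are made up for illustration, and the widget labels A to F are hypothetical.

```python
from statistics import mean

def mean_ranks(ballots):
    """Average rank per widget; each ballot lists widgets from best to worst."""
    ranks = {}
    for ballot in ballots:
        for position, widget in enumerate(ballot, start=1):
            ranks.setdefault(widget, []).append(position)
    return {widget: mean(positions) for widget, positions in ranks.items()}

def collective_order(ballots):
    """Widgets sorted by mean rank (lower is better), ties broken by label."""
    averages = mean_ranks(ballots)
    return sorted(averages, key=lambda widget: (averages[widget], widget))

# Hypothetical data: six people, six widgets, each string is one person's
# ranking from best (first letter) to worst (last letter).
ballots = ["ABCDEF", "ABDCEF", "BACDFE", "ACBEDF", "BADCEF", "ABCDFE"]

for widget in collective_order(ballots):
    print(widget, round(mean_ranks(ballots)[widget], 2))
```

The mean rank throws away how much the six people disagree, so reporting the full table of rank counts alongside it may be worthwhile.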

submitted by /u/sandysanBAR

[C] Finished my associates in math. Looking for a job while I work on a statistics degree.


I’m finishing up my associate’s degree in mathematics and psychology this semester and have just finished applying to universities. Once there, I’m hoping to become a statistics major and pursue that as a career.

I’m wondering if there are any online jobs, perhaps in data management, that I could look for while I’m at university with the degrees that I have. I’d like to get my foot in the door for a job in statistics, but I don’t know what to look for.

Thank you for any advice.

submitted by /u/TheDarcEye

[Q] Odds of a given combination of two partly correlated variables?

If two variables are completely unrelated to each other, the combined odds for a given combination (say two dice rolls) are oddsofhappeningX * oddsofhappeningY (1/6 * 1/6 = 1/36 to get two sixes). Say two variables have a correlation coefficient of about 0.8, which means an explained variance of around 64% (0.8^2 = 0.64). What are the odds of a given combination now (for example, if rolling a high number with the first die somehow made it more likely to roll high the second time), and how do I calculate it?
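A minimal sketch of one way to answer this, under a strong assumption: the 0.8 is taken to be the Pearson correlation between the two yes/no event indicators (for example "first roll is a six" and "second roll is a six"), not between the raw dice values. Under that assumption the joint probability follows directly from the definition of covariance. The function name is hypothetical.

```python
from math import sqrt

def joint_probability(p_x, p_y, rho):
    """P(X and Y) for two yes/no events whose 0/1 indicators correlate at rho.

    Uses Cov(X, Y) = P(X and Y) - p_x * p_y and
    rho = Cov(X, Y) / sqrt(p_x * (1 - p_x) * p_y * (1 - p_y)).
    """
    joint = p_x * p_y + rho * sqrt(p_x * (1 - p_x) * p_y * (1 - p_y))
    # A joint probability cannot exceed either marginal or drop below
    # p_x + p_y - 1, so not every rho is attainable for given p_x and p_y.
    lower = max(0.0, p_x + p_y - 1)
    upper = min(p_x, p_y)
    if not (lower - 1e-12 <= joint <= upper + 1e-12):
        raise ValueError("rho is not attainable for these marginal probabilities")
    return joint

print(joint_probability(1 / 6, 1 / 6, 0.0))  # independent case, 1/36
print(joint_probability(1 / 6, 1 / 6, 0.8))  # correlated case, 5/36
```

If the correlation is instead between the numeric outcomes, the correlation alone does not pin down the joint probability; a full joint distribution (or a model for it) is needed.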

submitted by /u/RagnarDa

[Q] How do I explain in words how multivariate linear regression analyses work?

For a class, I am doing a mock quantitative study in which I have to evaluate how several factors correlate with participation in environmentally friendly behavior (the dependent variable). The factors are independent variables (knowledge, attitudes, etc.) and control variables (race, gender, etc.). I have to explain how multivariate linear regression analysis works with respect to my variables in my research paper. I have never taken a stats class in my life, and it wasn’t a prerequisite for the research methods class that I’m in, so I’m not sure HOW to explain it in a paper…

This is what I have so far:

“I will use multivariate linear regression analysis to determine which factors are the most predictive or restrictive of pro-environmental behavior. I will have several models within the linear regression, enter different sets of continuous and discrete variables, and regress them on the dependent variable (behavior). Then I would enter control variables (race/ethnicity, gender, level of education) to determine how these variables influence the independent variables.”

Sorry if that makes no sense. Like I said, I don’t know how linear regression works. Pls help if you can, thank you!
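To make the draft paragraph concrete, here is a rough sketch of entering variables in blocks: fit one model with the independent variables, fit a second with a control added, and compare how much variance each explains. Everything here is an assumption for illustration: the data are invented, `female` is a hypothetical dummy-coded control, and the plain-Python least-squares solver stands in for whatever statistics package the class uses.

```python
def ols(X, y):
    """Least-squares coefficients; each row of X starts with 1 for the intercept."""
    k = len(X[0])
    # Augmented normal equations: (X'X | X'y)
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * p for a, p in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

def r_squared(X, y, coefs):
    """Share of the variance in y explained by the fitted model."""
    fitted = [sum(c * x for c, x in zip(coefs, row)) for row in X]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

# Invented scores for eight respondents (not from any real study).
knowledge = [2, 4, 5, 3, 6, 7, 1, 5]
attitude = [3, 3, 6, 2, 5, 7, 2, 4]
female = [0, 1, 0, 1, 1, 0, 0, 1]
behavior = [3.1, 4.8, 6.2, 3.3, 6.9, 7.8, 2.0, 5.6]

block1 = [[1, k, a] for k, a in zip(knowledge, attitude)]
block2 = [[1, k, a, f] for k, a, f in zip(knowledge, attitude, female)]
b1, b2 = ols(block1, behavior), ols(block2, behavior)

print("Model 1 coefficients:", [round(b, 3) for b in b1])
print("Model 2 coefficients:", [round(b, 3) for b in b2])
print("R^2:", round(r_squared(block1, behavior, b1), 3),
      "->", round(r_squared(block2, behavior, b2), 3))
```

Each coefficient is the expected change in behavior for a one-unit change in that variable with the other variables in the model held fixed, which is the sentence most write-ups need. Comparing the knowledge and attitude coefficients between the two models shows how much they shift once the control is accounted for.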

submitted by /u/rosetintedmuse

[Q] Appropriate way to analyze survival data


I have a dataset of ~150 patients and about 20 variables. My goal is to evaluate simultaneously the effects of several factors on survival, so I proceeded to do a Cox proportional hazards model.

I first did a univariate Cox regression to identify the variables which were significant on their own. I then did a multivariate Cox regression analysis to generate a model (which ended up having 6 variables).

However, I am now not sure this is the right approach. All 20 variables are of clinical interest, and I am not sure I should have dropped the non-significant ones from the final model. I would have liked to report hazard ratios for those other variables as well. Any recommendations?

submitted by /u/DasRite

[Q] Selecting OR of sub-categories to represent parent category

This is going to be a tough question to describe, but let’s see what happens…

I am looking at a study that reports the effect pancreatic gland texture (soft vs hard) has on the risk of developing an infection. It sub-categorises ‘infection’ into two: ‘superficial infection’ and ‘deep infection’. An OR and confidence interval are then reported for each sub-category:

Soft gland vs hard gland -> superficial infection: OR 1.39 (1.05–1.85)

Soft gland vs hard gland -> deep infection: OR 2.22 (1.79–2.74)

I need to report a statistic for ‘infection’ (i.e. EITHER superficial or deep) for this study. My guess is to choose the option with the higher OR, as this would represent the odds of any infection.

Is this reasonable? Does the CI change this assumption?
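A small worked example may help check the guess. The counts below are entirely hypothetical (the study's 2x2 tables are not given here), and the sketch assumes each patient falls into exactly one of superficial infection, deep infection, or no infection. With those counts, the OR for any infection can be computed directly and compared with the two sub-category ORs.

```python
def odds_ratio(events_exposed, n_exposed, events_unexposed, n_unexposed):
    """Odds ratio of an outcome for exposed (soft gland) vs unexposed (hard gland)."""
    odds_exposed = events_exposed / (n_exposed - events_exposed)
    odds_unexposed = events_unexposed / (n_unexposed - events_unexposed)
    return odds_exposed / odds_unexposed

# Hypothetical counts, NOT taken from the study.
soft = {"n": 1000, "superficial": 100, "deep": 200}
hard = {"n": 1000, "superficial": 75, "deep": 100}

or_superficial = odds_ratio(soft["superficial"], soft["n"],
                            hard["superficial"], hard["n"])
or_deep = odds_ratio(soft["deep"], soft["n"], hard["deep"], hard["n"])
or_any = odds_ratio(soft["superficial"] + soft["deep"], soft["n"],
                    hard["superficial"] + hard["deep"], hard["n"])

print(round(or_superficial, 2), round(or_deep, 2), round(or_any, 2))
```

In this made-up table the OR for any infection lands between the two sub-category ORs rather than matching the larger one. The combined OR depends on the underlying counts, so it cannot be recovered from the two reported ORs alone.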

submitted by /u/1Surgeon

[Q] How can I calculate the probability of getting any one of 3 particular numbers using any basic function (add subtract multiply divide) to combine a set of 8 random numbers between 1 and 6?

Alright, so let me explain. In a Pathfinder campaign, a spell caster wants to use a particular feat that allows him to get bonuses for a spell if he rolls a number of d6 equal to his ranks in a particular skill (which is currently 8) and can combine the numbers rolled in any way to get one of 3 prime numbers associated with the spell’s level. If the 3 numbers matter, right now his best spells are level 4, meaning the prime numbers he could get are 31, 37, or 41 for level 4 spells. If he does not get one of those prime numbers, he fails the spell.

I wanted to tell him the probability of being successful, but I don’t know where to start and don’t have K2SO here to calculate that for me.
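One place to start is brute force: for a given roll, try every way of combining the dice with the four operations and see whether a target comes out, then repeat over all possible rolls. The sketch below assumes every die must be used exactly once and that intermediate results may be fractions or negative; neither is confirmed by the description above. The function names are hypothetical.

```python
from fractions import Fraction
from functools import lru_cache
from itertools import product

def can_reach(numbers, targets):
    """True if the numbers can be combined with + - * / to hit any target."""
    target_set = frozenset(Fraction(t) for t in targets)
    return _search(tuple(sorted(Fraction(n) for n in numbers)), target_set)

@lru_cache(maxsize=None)
def _search(values, targets):
    if len(values) == 1:
        return values[0] in targets
    # Pick any two values, replace them with one combined result, and recurse.
    for i in range(len(values)):
        for j in range(i + 1, len(values)):
            a, b = values[i], values[j]
            rest = values[:i] + values[i + 1:j] + values[j + 1:]
            candidates = {a + b, a * b, a - b, b - a}
            if b != 0:
                candidates.add(a / b)
            if a != 0:
                candidates.add(b / a)
            for c in candidates:
                if _search(tuple(sorted(rest + (c,))), targets):
                    return True
    return False

def success_probability(n_dice, targets):
    """Exact chance that a roll of n_dice d6 can be combined into a target."""
    rolls = list(product(range(1, 7), repeat=n_dice))
    hits = sum(can_reach(roll, targets) for roll in rolls)
    return Fraction(hits, len(rolls))

# Small demonstration with 3 dice; the level 4 primes come from the post.
print(success_probability(3, {31, 37, 41}))
```

For the full case, `success_probability(8, {31, 37, 41})` walks through 6^8 = 1,679,616 ordered rolls. The cache means only 1,287 distinct sorted rolls are actually searched, but the run can still be slow in pure Python, so sampling random rolls is a reasonable fallback.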

submitted by /u/bhughey24

Weekly /r/Statistics Discussion – What problems, research, or projects have you been working on? – December 04, 2019

Please use this thread to discuss whatever problems, projects, or research you have been working on lately. The purpose of this sticky is to help community members gain perspective and exposure to different domains and facets of Statistics that others are interested in. Hopefully, both seasoned veterans and newcomers will be able to walk away from these discussions satisfied, and intrigued to learn more.

It’s difficult to lay ground rules around a discussion like this, so I ask you all to remember Reddit’s sitewide rules and the rules of our community. We are an inclusive community and will not tolerate derogatory comments towards other users’ sex, race, gender, politics, character, etc. Keep it professional. Downvote posts that contribute nothing or detract from the conversation. Do not downvote merely because you disagree with the person. Use the report button liberally if you feel something needs moderator attention.

Homework questions are (generally) not appropriate! That being said, I think at this point we can often discern between someone genuinely curious and making efforts to understand an exercise problem and a lazy student. We don’t want this thread filling up with a ton of homework questions, so please exhaust other avenues before posting here. I would suggest looking to /r/homeworkhelp, /r/AskStatistics, or CrossValidated first before posting here.

Surveys and shameless self-promotion are not allowed! Consider this your only warning. Violating this rule may result in a temporary or permanent ban.

I look forward to reading and participating in these discussions and building a more active community! Please feel free to message me if you have any feedback, concerns, or complaints.



submitted by /u/AutoModerator

[Question] Simple Slope Analysis and Both Significant & Non Significant Interactions.

Dear Community,

I calculated a multiple regression model: Y = B0 + B1*Gender + B2*RT1 + B3*RT2 + B4*(Gender*RT1) + B5*(Gender*RT2)

The interaction Gender*RT1 is significant.

Now, I want to calculate a simple slope for this interaction to find out where my effect is coming from. For that, I have to rearrange the formula. My question is: do I rearrange the above formula, or do I rearrange Y = B0 + B1*Gender + B2*RT1 + B3*(Gender*RT1), which would be the formula for just the significant interaction term and its lower-order terms?
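For reference, a sketch of the rearrangement using the full model, assuming Gender is coded 0/1. Collecting the RT1 terms gives Y = (B0 + B1*Gender + B3*RT2 + B5*Gender*RT2) + (B2 + B4*Gender)*RT1, so the simple slope of RT1 is B2 + B4*Gender. The coefficient values below are hypothetical placeholders, not estimates from the data.

```python
def simple_slope_rt1(b2, b4, gender):
    """Slope of RT1 at a given Gender code: B2 + B4 * Gender."""
    return b2 + b4 * gender

# Hypothetical coefficients from the full five-predictor model.
b2, b4 = 0.5, -0.3

print("RT1 slope when Gender = 0:", simple_slope_rt1(b2, b4, 0))
print("RT1 slope when Gender = 1:", simple_slope_rt1(b2, b4, 1))
```

Taking B2 and B4 from the full model keeps the simple slopes adjusted for RT2 and Gender*RT2; refitting the reduced model would generally give different estimates.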

Thank you for your help.

Best Regards.

submitted by /u/AyraLightbringer

[S] [R] G*Power Analysis for a 2×4 ANOVA with all participants doing all conditions

Hi r/statistics,

I am currently conducting research and will test the results for a 2×4 interaction. I have 8 conditions (stimulus types) that form a 2×4 design, and all participants will complete all conditions.

What kind of G*Power analysis am I supposed to do in this situation?


submitted by /u/LizardInASuit