submitted by /u/curryeater259 [link] [comments]

# Tag: nevin manimala

## Pros and Cons of MS/PhD in Biostats vs MS/PhD in Stats?

submitted by /u/the_siloviki [link] [comments]

## Regression analysis help

Hi guys, just checking whether I can run a regression analysis on my survey data. I have 60 surveys, 30 from one location and 30 from another. I asked the respondents how much time and how much money they usually spend when visiting the location. Then I asked them to fill out a 12-question semantic differential table to gauge their feelings towards the atmosphere of that location. Put simply, the hypothesis is that one location’s atmosphere makes people feel better, so they spend more time and money there. Is this enough? Do you have any ideas for how else I could present my data? Thanks a lot.

submitted by /u/Wagamamamany [link] [comments]
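One possible way to set this up, sketched in R under stated assumptions: the 12 semantic differential items are averaged into a single atmosphere score, and time and money are each regressed on that score plus location. The data below are simulated, and every column name (`location`, `atmosphere`, `minutes`, `money`) is hypothetical, so treat this as an outline of the analysis and not a result.

```r
set.seed(3)

# Hypothetical data: 60 respondents, 30 per location, 12 items rated 1-7.
n <- 60
survey <- data.frame(location = factor(rep(c("A", "B"), each = 30)))
items <- matrix(sample(1:7, n * 12, replace = TRUE), nrow = n)

# Composite atmosphere score: mean of the 12 items per respondent.
survey$atmosphere <- rowMeans(items)

# Simulated outcomes (made-up relationship, for illustration only).
survey$minutes <- 30 + 5 * survey$atmosphere + rnorm(n, sd = 10)
survey$money   <- 10 + 2 * survey$atmosphere + rnorm(n, sd = 5)

# Step 1: do the two locations differ in perceived atmosphere?
atmosphere.test <- t.test(atmosphere ~ location, data = survey)

# Step 2: is atmosphere related to time and money, adjusting for location?
fit.time  <- lm(minutes ~ atmosphere + location, data = survey)
fit.money <- lm(money ~ atmosphere + location, data = survey)

summary(fit.time)
summary(fit.money)
```

With 30 respondents per location the estimates will be noisy, so side-by-side boxplots of time and money by location may present the data more clearly than coefficients alone.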

## How to best test reliability of translated scales in small scale pilot (N = 35)

Hey guys, for my research I translated validated scales and piloted them to improve the translation. However, I’m wondering if there’s a way to check the reliability of these translated scales before the main study. Most sources recommend 5 respondents per item (I have about 30 items in total across ~8 scales), which I’m nowhere near reaching; I have about 35 respondents. I can’t find a clear source on the minimum requirements for calculating Cronbach’s alpha, which would be my next move. Many people seem to say “if it’s just a pilot, go ahead”, but I couldn’t find any articles supporting this. Does anyone have an idea how best to approach this? It’s for a small project and I don’t have time to gather many more participants for the pilot. Greatly appreciated!

submitted by /u/Eu4iaa [link] [comments]
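A minimal sketch of one way to look at this, assuming alpha is computed separately for each scale: the standard formula is alpha = k/(k-1) × (1 − sum of item variances / variance of the total score), and a percentile bootstrap over respondents gives a rough idea of how uncertain the estimate is at N = 35. The data are simulated and the helper `cronbach_alpha` is a hypothetical name, not a function from any package.

```r
# Cronbach's alpha for one scale; rows are respondents, columns are items.
cronbach_alpha <- function(items) {
  items <- as.matrix(items)
  k <- ncol(items)
  item.var  <- apply(items, 2, var)   # variance of each item
  total.var <- var(rowSums(items))    # variance of the summed scale score
  (k / (k - 1)) * (1 - sum(item.var) / total.var)
}

set.seed(7)

# Hypothetical pilot: 35 respondents, one 4-item scale driven by a shared trait.
n <- 35
trait <- rnorm(n)
scale.items <- sapply(1:4, function(j) trait + rnorm(n))

alpha.hat <- cronbach_alpha(scale.items)

# Percentile bootstrap: resample respondents (rows) with replacement.
boot.alpha <- replicate(2000, {
  idx <- sample(n, replace = TRUE)
  cronbach_alpha(scale.items[idx, , drop = FALSE])
})
ci <- quantile(boot.alpha, c(0.025, 0.975))

alpha.hat
ci
```

A wide interval would suggest treating the pilot alpha as a rough screening check for badly translated items, with the main study providing the actual reliability estimate.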

## Probability of a single score belonging to one distribution vs. another distribution

I apologize if this is a very simplistic question but I just can’t seem to find a clear answer anywhere. I am wondering if there is a way to determine the likelihood that a single score or value along a continuum is part of one distribution or another, given the means and standard deviations for each distribution.

To elaborate a bit more: I’m a clinical neuropsychologist and am looking to enhance my diagnostic impressions, mostly in determining whether someone has dementia or not. The research literature is full of studies showing means and standard deviations for healthy people and for people with dementia on standard tests. I’d like to take a patient’s single score on a test and be able to write in a report something like, “Given this person’s score on X test, there is a Y likelihood of belonging to a healthy group and a Z likelihood of belonging to a dementia group.”

I don’t think that I’m looking for a likelihood ratio because that’s associated with a cutoff score and the sensitivity/specificity values associated with that cutoff. I’m looking for probabilities associated with a single score that don’t depend on a cutoff. I guess I may be able to use just a simple z-score or percentile, which I already do all the time, but that speaks to the single score and all scores above or below it. I really want a method that can take two different means/standard deviations into account. In other words, if an effect size is thought to be pretty big, I should be able to take advantage of that discrepancy between groups and utilize it clinically.

Hope that makes sense, thanks in advance for your help.

submitted by /u/NPDoc [link] [comments]
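One way to get cutoff-free probabilities, sketched under explicit assumptions: if scores are treated as normally distributed within each group, the ratio of the two densities at the observed score is a pointwise likelihood ratio that needs no cutoff, and Bayes' rule combines it with a base rate to give a probability of group membership. The base rate (`prior_dementia`) has to be supplied by the clinician and strongly affects the answer. The function name and all numbers below are made up for illustration and are not taken from any published norms.

```r
# Posterior probability that a score came from the dementia group, assuming
# scores are normally distributed within each group.
posterior_dementia <- function(score, mean_healthy, sd_healthy,
                               mean_dementia, sd_dementia, prior_dementia) {
  # Height of each group's density at the observed score.
  lik_healthy  <- dnorm(score, mean = mean_healthy,  sd = sd_healthy)
  lik_dementia <- dnorm(score, mean = mean_dementia, sd = sd_dementia)

  # Bayes' rule: weight each likelihood by its prior (base rate).
  num <- prior_dementia * lik_dementia
  num / (num + (1 - prior_dementia) * lik_healthy)
}

# Hypothetical test: healthy mean 28 (SD 2), dementia mean 20 (SD 4),
# and an assumed 30% base rate of dementia in the referral population.
p.dementia <- posterior_dementia(score = 22,
                                 mean_healthy = 28, sd_healthy = 2,
                                 mean_dementia = 20, sd_dementia = 4,
                                 prior_dementia = 0.30)
p.healthy <- 1 - p.dementia

c(dementia = p.dementia, healthy = p.healthy)
```

The larger the separation between the group means relative to their standard deviations, the faster the posterior moves away from the base rate, which is how a big effect size gets used here. The result is only as good as the normality assumption and the chosen base rate.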

## Correcting heteroscedasticity

Hi guys, sorry if this is really simple but I haven’t used a statistical program for a few years and I’m very rusty. After running a standard regression analysis on some product data I scraped from the web, I found that it suffers from heteroscedasticity. How would I go about correcting this?

submitted by /u/kinkwik [link] [comments]
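Two common remedies, sketched in base R on simulated data (the variables `x` and `y` are hypothetical, and no software was named in the post): keep the ordinary least squares coefficients but replace the standard errors with heteroscedasticity-robust (White, HC0) ones, or refit with weighted least squares if the form of the variance is roughly known. The `sandwich` package's `vcovHC()` provides the robust covariance along with small-sample variants; the manual version below is only meant to show what it computes.

```r
set.seed(42)

# Simulated data whose error spread grows with x (hypothetical example).
n <- 200
x <- runif(n, 1, 10)
y <- 2 + 3 * x + rnorm(n, sd = 0.5 * x)
fit <- lm(y ~ x)

# White (HC0) sandwich estimator: (X'X)^-1 X' diag(e^2) X (X'X)^-1
X     <- model.matrix(fit)
e     <- residuals(fit)
bread <- solve(crossprod(X))
meat  <- crossprod(X * e)   # row i of X is scaled by residual i
vcov.hc0  <- bread %*% meat %*% bread
robust.se <- sqrt(diag(vcov.hc0))

# Same coefficients, classical vs. robust standard errors side by side.
cbind(estimate     = coef(fit),
      classical.se = sqrt(diag(vcov(fit))),
      robust.se    = robust.se)

# Alternative: weighted least squares, assuming Var(error) is proportional
# to x^2 (true for this simulation, an assumption for real data).
fit.wls <- lm(y ~ x, weights = 1 / x^2)
summary(fit.wls)
```

For skewed, strictly positive outcomes such as prices, modelling `log(y)` is another option that often stabilises the variance, at the cost of changing how the coefficients are interpreted.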

## I feel that a major reason for these statistical misconceptions in science is that statistics as a discipline isn’t respected as much as it should be

In most scientific departments statistics is taught and viewed simply as a technical formality that you need in order to do proper research. Instead of the introductory stats classes being presented as a general overview of the interdisciplinary language of the sciences (= statistics), students get the impression that they can actually conduct statistical analysis by themselves. Imagine if non-law students who take civil law courses (most social sciences and humanities disciplines have those) believed that they wouldn’t need a lawyer if they got into legal trouble. Or if biologists thought that they could do a doctor’s job because they had taken some courses on human physiology. Yet in the case of statistics there are so many researchers who run quantitative research and think that they don’t need to consult an actual statistician. How low must your opinion of statistics as a discipline be in order to think that? I may sound elitist or like a gatekeeper, but that isn’t my intention. I just want statistics to get the same respect and recognition that any other scientific specialty gets. Firstly because I think that the field deserves it, and secondly because it will benefit scientific progress as a whole.

submitted by /u/Sorokose [link] [comments]

## Factor Analysis- individual correlations

Hello statisticians. I have a question regarding factor analysis. I am working on an assignment for school in which we have created a model with one latent variable explaining 6 manifest variables. We are happy with the outcome and indicators of the model, but I am wondering if I can then compare the model results against the individual correlations of each observation (i.e., running the factor analysis as many times as there are observations). How would this work, and is it even possible? Maybe this is a stupid question; I’m new to statistics. Thanks for your help.

submitted by /u/salgranon [link] [comments]

## Training a Linear Model and making Prediction on the Test Set

Hey folks, somehow we are not getting the right number of predicted values when we fit a linear model on the training set and predict on the test set. We end up with 194 predicted values, while the test set only has 49 values. Therefore the vectors cannot be combined to calculate the mean squared error. Any suggestions on how to resolve this?

    n = dim(complete.data)[1] # Output: 243
    train.index = sample(1:n, n*0.8) # index for train set
    train = complete.data[train.index,] # training set 80 % of data # Length: 194
    test = complete.data[-train.index,] # test set 20 % of data # Length: 49
    lin.mod = lm(Average_Score ~ ., data = train)
    pred.lin = predict(lin.mod, newx = as.matrix(test[1:12]))
    lin.mse = mean(lin.mod$residuals^2) # This does not compare the difference between the predicted & true values of Y
    lin.mse = mean((pred.lin-test[,13])^2)
    # Warning message: In pred.lin - test[, 13] :
    #   longer object length is not a multiple of shorter object length
    length(pred.lin) # Output: 194
    length(test[,13]) # Output: 49

submitted by /u/dnzsn [link] [comments]
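A likely cause, sketched on simulated data because `complete.data` itself is not shown: `predict()` for an `lm` object takes its new observations through the `newdata` argument (a data frame), while `newx` is the argument name used by `glmnet`. An unrecognised `newx` is silently absorbed by `...`, so the call returns the fitted values for the 194 training rows. The stand-in data frame below is hypothetical; only the 243 rows, 12 predictors and the `Average_Score` column mirror the post.

```r
set.seed(1)

# Hypothetical stand-in for complete.data: 243 rows, 12 predictors, 1 response.
n <- 243
complete.data <- as.data.frame(matrix(rnorm(n * 12), nrow = n))
complete.data$Average_Score <- rowSums(complete.data[, 1:3]) + rnorm(n)

train.index <- sample(seq_len(n), size = floor(0.8 * n))
train <- complete.data[train.index, ]   # 194 rows
test  <- complete.data[-train.index, ]  # 49 rows

lin.mod <- lm(Average_Score ~ ., data = train)

# `newx` is ignored by predict.lm(), so this returns the training fits.
pred.wrong <- predict(lin.mod, newx = as.matrix(test[1:12]))

# `newdata` must be a data frame containing the predictor columns.
pred.lin <- predict(lin.mod, newdata = test)

length(pred.wrong)  # matches nrow(train)
length(pred.lin)    # matches nrow(test)

# Test-set mean squared error: predicted vs. observed on held-out rows.
test.mse <- mean((test$Average_Score - pred.lin)^2)
test.mse
```

Passing the whole `test` data frame is fine because `predict()` only looks up the columns named in the model formula, so the response column does not need to be dropped first.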

## Simple Questions – March 22, 2019

This recurring thread will be for questions that might not warrant their own thread. We would like to see more conceptual-based questions posted in this thread, rather than “what is the answer to this problem?”. For example, here are some kinds of questions that we’d like to see in this thread: Can someone explain the concept of manifolds to me? What are the applications of Representation Theory? What’s a good starter book for Numerical Analysis? What can I do to prepare for college/grad school/getting a job? Including a brief description of your mathematical background and the context for your question can help others give you an appropriate answer. For example, consider which subject your question is related to, or the things you already know or have tried.

submitted by /u/AutoModerator [link] [comments]