Help interpreting SPSS BLR output

Hi, I am doing a binary logistic regression on smoking data from a well-known study. I am doing the BLR with 25-year survival status as the DV (Alive=0, Dead=1) and smoking status (No=0, Yes=1) as the predictor. I am asked to first run the analysis with the smoking data only, and then run a second analysis that includes age as a variable (categorized into 5 groups). In the first run, with only the smoking variable, I get an Exp(B) of 1.46, which I believe means that a smoker is 1.46 times more likely to be in the Dead group. Now, running the analysis with the interaction of smoking status and age (both categorical), I get negative B’s in all categories and Exp(B) values of .025, .051, .095, .282, .785, and .645. Am I correct in saying that when age is considered, the likelihood of smokers belonging to the Dead group is actually reduced? I am using an interaction; should I not use an interaction between the covariates? Thanks!
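One point that may help with reading the output: in a logistic regression, Exp(B) is an odds ratio, so 1.46 multiplies the odds of being in the Dead group, not the probability. Below is a minimal sketch of the conversion. The baseline probability of 0.30 for non-smokers is a made-up number for illustration and is not taken from the study, and `prob_from_odds_ratio` is a hypothetical helper name.

```python
def prob_from_odds_ratio(p_ref, odds_ratio):
    """Apply an odds ratio to a reference probability and return the new probability."""
    odds_ref = p_ref / (1.0 - p_ref)      # probability -> odds
    odds_new = odds_ref * odds_ratio      # Exp(B) acts on the odds scale
    return odds_new / (1.0 + odds_new)    # odds -> probability


p_nonsmoker = 0.30                        # hypothetical baseline, NOT from the study
p_smoker = prob_from_odds_ratio(p_nonsmoker, 1.46)
risk_ratio = p_smoker / p_nonsmoker       # "times more likely" in the probability sense

print(f"P(dead | smoker) = {p_smoker:.3f}, risk ratio = {risk_ratio:.2f}")
```

With any baseline probability above zero, the risk ratio comes out smaller than the odds ratio of 1.46. Separately, in a model that contains interaction terms, each Exp(B) is expressed relative to the reference categories, so the individual values cannot be read as the overall effect of smoking.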

New to experiments and statistical testing

I have three experiments. In each, I give subjects a reading assignment and then ask them comprehension questions. While I give them multiple options, I only care about whether they got each answer right or wrong. In each experiment, the control and treatment groups differ only in the formatting of the text, with the hypothesis that the format change will decrease their understanding of the text. The reading assignment in the three experiments is very similar but not identical; the format change is exactly the same.

My questions are:

  1. Is the proper testing method here Chi-Square/Fisher’s based on a contingency table?
  2. If (1) is the right way to go about it, the p-values are too high. Is it permissible to combine the outcomes of all three experiments? If so, how do I go about testing it?
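One common way to combine several 2x2 tables without simply adding the counts together is the Cochran-Mantel-Haenszel (CMH) test, which treats each experiment as a stratum. The sketch below is a plain-Python version without continuity correction. The counts are invented purely for illustration, `cmh_test` is a hypothetical helper name, and the test assumes the format effect points in the same direction in every experiment.

```python
import math


def cmh_test(tables):
    """Cochran-Mantel-Haenszel test for a list of 2x2 tables.

    Each table is (a, b, c, d) =
    (control right, control wrong, treatment right, treatment wrong).
    Returns the chi-square statistic (1 df, no continuity correction) and its p-value.
    """
    diff = 0.0
    var = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        expected_a = (a + b) * (a + c) / n          # expected count under no association
        diff += a - expected_a
        var += (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
    stat = diff * diff / var
    p_value = math.erfc(math.sqrt(stat / 2.0))      # chi-square survival function, 1 df
    return stat, p_value


# Hypothetical counts for three experiments, NOT real data.
experiments = [(18, 12, 13, 17), (20, 10, 15, 15), (17, 13, 12, 18)]

for i, table in enumerate(experiments, start=1):
    print(f"experiment {i} alone: p = {cmh_test([table])[1]:.3f}")

stat, p = cmh_test(experiments)
print(f"combined CMH: chi2 = {stat:.2f}, p = {p:.3f}")
```

With these made-up counts, no single experiment reaches the 0.05 level, but the stratified test does, because the three small effects all point the same way. For very small cell counts an exact method would be the safer choice.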

As a graduate student in Statistics, what can one do to earn money and experience?

Hi everyone! I am considering graduate studies in Statistics and am curious about opportunities to gain statistical research or consulting experience while getting a little cash out of it. So far, I can only imagine a grad student working as a TA or working towards the thesis. Please help me learn more about this topic. Thank you so much!

Do I have to learn programming?

I am in my second year of college and I decided to try out a computer science course. However, I really am not enjoying programming, and the thought of having to use it in my career is pretty daunting. Do I have to force myself to learn programming in order to get a good career in mathematics or statistics? I’ve thought about becoming an actuary, but I don’t think it’s for me. Should I just tough it out and force myself to get good at programming? Thanks in advance.

MS in statistics and PhD in biostatistics?

I’m currently going for the master’s in statistics (second semester here, gonna have 1-2 more years left depending on whether or not I do well in a theoretical class). I recently took a liking to a biostatistics class I’m taking (as well as the field in general and how applied it is). I’m curious whether people have gotten a master’s in statistics and then gone on to get a PhD in biostatistics. I have no interest in academia, but research and making further advancements in the workforce do sound appealing. Not sure yet if I want to go for the PhD though. I could use someone to talk to about this.

Negative values when calculating two standard deviations from mean

I have a normal distribution of rock porosity data that range from 0.002 to 0.179 (as fractions, i.e. 0.2% to 17.9%). The mean is 0.0662 and the standard deviation is 0.0355. Based on this, two standard deviations below the mean gives a negative value (0.0662 − 2 × 0.0355 = −0.0048). Porosity cannot be negative.

I am writing an algorithm which selects a random value from the porosity normal distribution and pairs it with a value of permeability from another normal distribution. For instance, if a value of porosity that is two standard deviations lower than the mean is selected, then my permeability should be a random value that is also two standard deviations less than the permeability mean.

The issue here is that porosity can never be less than 0. I am not very good with statistics and I am not sure how to deal with this. Is there a specific method to keep values positive while still honouring the distribution?
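One possible approach is to sample from a normal distribution truncated at zero and reuse the same z-score for permeability. The sketch below uses inverse-CDF sampling with Python's standard library. The porosity mean and standard deviation are the ones quoted above; the permeability parameters are made up for illustration, and `sample_pair` is a hypothetical helper name. Note that truncation shifts the realised mean and spread slightly away from the nominal values, so a distribution that is positive by construction (for example a lognormal) may be worth considering instead.

```python
import random
from statistics import NormalDist

POROSITY = NormalDist(mu=0.0662, sigma=0.0355)    # values quoted in the post
PERMEABILITY = NormalDist(mu=100.0, sigma=20.0)   # hypothetical, for illustration only


def sample_pair(rng):
    """Draw a non-negative porosity and a permeability with the same z-score."""
    lower = POROSITY.cdf(0.0)                     # probability mass below zero
    u = rng.uniform(lower, 1.0)                   # sample only the allowed part of the CDF
    u = min(max(u, 1e-12), 1.0 - 1e-12)           # inv_cdf requires 0 < u < 1
    porosity = max(0.0, POROSITY.inv_cdf(u))      # guard against rounding just below zero
    z = (porosity - POROSITY.mean) / POROSITY.stdev
    permeability = PERMEABILITY.mean + z * PERMEABILITY.stdev
    return porosity, permeability


rng = random.Random(0)                            # fixed seed for reproducibility
for _ in range(3):
    por, perm = sample_pair(rng)
    print(f"porosity = {por:.4f}, permeability = {perm:.2f}")
```

Because only the part of the distribution above zero is sampled, no draws are wasted on rejection, and every porosity value still maps to a permeability at the same number of standard deviations from its mean.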

Who invented the z-test?

I was working on a workshop a few months ago about hypothesis testing and t-tests, and came across the details of William Gosset and his time at Guinness – which is pretty awesome and I think makes the history and stat much more interesting and relatable to students. Anyway, I’m working on something similar for z-tests but I can’t find any sources that identify the author of the z-test. I see some sketchy sources that attribute it to Gosset, but I have a feeling that those are not correct. Thoughts?

Making specific predictions using correlation coefficient

Let’s say r = 0.5.

I was told by a professor that this can be interpreted as follows: “If A were to increase by 1 of its standard deviations, B would be predicted to increase by 0.5 of its standard deviations.”

However, this should work in both directions (e.g., “If B were to increase by 1 of its standard deviations…”), and that doesn’t seem to be the case, assuming they have different standard deviations.


SD for A = 4
SD for B = 8

r = 0.5

“If A were to increase by 4, B would be predicted to increase by 4.”
“If B were to increase by 8, A would be predicted to increase by 2.”

Those two statements do not work in conjunction.

So, what do I have wrong? Did my statistics professor lie to me? How can I make a specific prediction using a correlation coefficient? If you can use my example and show me how to predict a value for B given an increase in A (or vice versa), I will bestow you with positive karma for the rest of the day. Thank you so much for your help!!

Necessary caveats:

  1. I am aware that correlation does not imply causation.
  2. I am aware of the relationship between correlation and regression.
  3. I am aware of how to interpret the strength of the relationship based on a correlation coefficient.
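A short sketch of the arithmetic in the example may help. In simple linear regression, the slope for predicting Y from X is r × (SD of Y / SD of X), so the two statements come from two different regression lines (B on A, and A on B) and do not have to be inverses of each other. The function name `predicted_change` is a hypothetical helper for illustration.

```python
def predicted_change(r, sd_x, sd_y, delta_x):
    """Predicted change in Y for a change of delta_x in X (simple linear regression)."""
    slope = r * sd_y / sd_x
    return slope * delta_x


r, sd_a, sd_b = 0.5, 4.0, 8.0

# Predicting B from A: slope = 0.5 * 8 / 4 = 1.0
print(predicted_change(r, sd_a, sd_b, delta_x=4.0))   # 4.0

# Predicting A from B: slope = 0.5 * 4 / 8 = 0.25
print(predicted_change(r, sd_b, sd_a, delta_x=8.0))   # 2.0
```

Both predictions shrink toward the mean (regression to the mean), which is why increasing A by one SD predicts only half an SD in B, and increasing B by one SD predicts only half an SD in A. The two slopes multiply to r² = 0.25 rather than to 1.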

[University Statistics] Sampling Project Ideas

Hello! I am in the second year of my statistics program at uni. I would like to ask you guys if you have any topics that you think would be interesting to cover for a project that I have.

Basically, the project consists of designing a sampling questionnaire that will be given out to a specific study population. What will be on the questionnaire and to whom it will be given are determined by the goal of the project. One idea I have in my head is along the lines of relating lifestyle choices to the success of students in post-secondary education. However, I would like to hear some other ideas! If you want to know whether there is an association between certain causes and effects, or have any topics in general, please comment! I would LOVE to hear more ideas. 🙂 Thank you!

What is SAS/IML and how do I get it?

1) Decided that I want to learn SAS. 2) Downloaded free software (“base SAS”). 3) Have been following tutorials, but they are all about data manipulation and basic plotting and printing. 4) Started wondering where the hell all the programming is. 5) Realized I need something called “SAS/IML”.


1) What is it? 2) How do I get it?

