## What term is the reducible error and what is the irreducible error?

My textbook is discussing prediction intervals and confidence intervals. I understand the intuition behind them: we do not include the irreducible error in confidence intervals because we assume the error is normally distributed. However, how do you find this value? Say I ran a multiple linear regression on my data. To get the confidence interval I would use the standard error term, right? So for prediction intervals, what other error term is used? My textbook did a horrible job of explaining this. Thank you for any help!

EDIT: Is the MSE (mean squared error) our estimate for the irreducible error?
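To make the two error terms concrete, here is a minimal sketch for simple linear regression in plain Python. It assumes the usual textbook setup: the residual mean square `s2 = SSE / (n - 2)` estimates the irreducible error variance, the confidence interval uses the standard error of the fitted mean, and the prediction interval adds `s2` on top of that. The data and the function name `interval_standard_errors` are made up for illustration.

```python
import math

def interval_standard_errors(x, y, x_new):
    """Return (fit, se_mean, se_pred, s2) for simple linear regression at x_new."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar

    # Residual mean square: SSE / (n - 2). This is the usual estimate of the
    # irreducible error variance (two coefficients were estimated).
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    s2 = sse / (n - 2)

    leverage = 1 / n + (x_new - x_bar) ** 2 / sxx
    se_mean = math.sqrt(s2 * leverage)        # reducible part: confidence interval
    se_pred = math.sqrt(s2 * (1 + leverage))  # reducible + irreducible: prediction interval
    return intercept + slope * x_new, se_mean, se_pred, s2

# Made-up illustrative data.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]
fit, se_mean, se_pred, s2 = interval_standard_errors(x, y, x_new=3.5)
```

Either interval is then `fit ± t * se`, with `t` the critical value of a t distribution on `n - 2` degrees of freedom. In multiple regression with `p` estimated coefficients the same idea should carry over, with `n - p` in the denominator and the leverage computed from the full design matrix.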

## Can someone give me an example of questions answered by normal distribution?

Binomial distribution answers questions like, “For 60 coin tosses, what’s the probability of x heads?”

The Poisson distribution answers questions like, “If my measurements average lambda during a fixed time interval, what’s the probability of the next measurement being x in the upcoming interval?”
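For reference, both of those questions can be written out in a few lines of plain Python. This is only a sketch: the fair coin (`p = 0.5`), the choice of 30 heads, and the rate `lam = 4.0` are assumptions picked for illustration.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """Probability of exactly k events when the mean count per interval is lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

# "For 60 coin tosses, what's the probability of 30 heads?" (fair coin assumed)
p_30_heads = binomial_pmf(30, 60, 0.5)

# "If I average 4 events per interval, what's the probability of seeing 3 next time?"
p_3_events = poisson_pmf(3, 4.0)
```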

What questions would be answered in a similar manner by the normal distribution? Thanks!
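One kind of question that seems to fit the same pattern: "If a measurement is normally distributed with a known mean and standard deviation, what's the probability it falls in a given range?" A minimal sketch using the standard library, where the heights example (mean 170, standard deviation 10) is purely an assumption for illustration:

```python
from statistics import NormalDist

# Hypothetical population: heights in cm, mean 170, standard deviation 10.
heights = NormalDist(mu=170, sigma=10)

# Probability that a randomly chosen value lies between 160 and 180.
p_between = heights.cdf(180) - heights.cdf(160)

# The value below which 95% of the population falls.
cutoff_95 = heights.inv_cdf(0.95)
```

Because the normal distribution is continuous, the natural questions are about ranges ("between a and b", "above c") rather than about hitting one exact value x.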

## When interested in only the interaction within a regression model, do I need to include the main effects?

I’m only interested in the interaction between 2 variables within a linear regression model. Is it necessary to include the non-interaction terms when building the model?
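To see what can happen when the non-interaction terms are left out, here is a small sketch in plain Python. The data are constructed (not real) so that `y` depends only on the two main effects and has no interaction at all; the helper `ols` is a hypothetical name for a bare-bones least-squares solver written just for this example.

```python
def ols(rows, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    p = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(p):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * w for v, w in zip(a[r], a[col])]
    return [a[i][p] / a[i][i] for i in range(p)]

# Constructed data: y = 1 + 2*x1 + 3*x2, with no interaction term.
grid = [(x1, x2) for x1 in range(5) for x2 in range(5)]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in grid]

# Columns: intercept, x1, x2, x1*x2.
full = ols([[1, x1, x2, x1 * x2] for x1, x2 in grid], y)

# Columns: intercept, x1*x2 only (main effects omitted).
interaction_only = ols([[1, x1 * x2] for x1, x2 in grid], y)
```

In this constructed case the full model recovers an interaction coefficient of essentially zero, while the interaction-only model assigns a clearly non-zero coefficient to `x1 * x2`, because the product term soaks up the omitted main effects.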

## Can I use a two sample independent t-test here?

Hi everyone, I’ve applied a two-sample independent t-test to my data, but I’m not sure whether what I did is statistically correct.

Data

The data consists of 40,000 customer transactions over a period of 7 months. It contains the variables:

• Product the customer bought (A or B)
• Date (Day and month)
• Unique customer ID
• Money spent

Now I want to know whether product B is bought more often on the weekend (Saturday – Sunday) than on a weekday (Monday – Friday). What I did is as follows:

• Calculated the total number of distinct dates (213 days)
• Calculated the total number of transactions per date and the percentage of those that were product B (I called this the chance of product B)
• Created a dummy variable (1 for weekend day, 0 for a normal weekday)
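The three steps above can be sketched in plain Python as follows. The transaction rows are invented, and the year is an assumption (the data only has day and month, but a year is needed to work out the day of the week).

```python
from collections import defaultdict
from datetime import date

# Hypothetical rows: (transaction date, product). The year 2023 is assumed.
transactions = [
    (date(2023, 1, 6), "A"), (date(2023, 1, 6), "B"),
    (date(2023, 1, 7), "B"), (date(2023, 1, 7), "B"), (date(2023, 1, 7), "A"),
    (date(2023, 1, 9), "A"), (date(2023, 1, 9), "A"),
    (date(2023, 1, 9), "B"), (date(2023, 1, 9), "A"),
]

# date -> [total transactions, product B transactions]
counts = defaultdict(lambda: [0, 0])
for day, product in transactions:
    counts[day][0] += 1
    counts[day][1] += product == "B"

# One row per date: share of product B and the weekend dummy.
# date.weekday() returns 0 for Monday through 6 for Sunday.
daily = [
    {"date": day, "share_b": b / total, "weekend": int(day.weekday() >= 5)}
    for day, (total, b) in sorted(counts.items())
]
```

Each entry of `daily` is one observation for the test, so the sample size is the number of dates, not the number of transactions.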

Then I applied a Welch t-test with the weekend/weekday dummy as the independent variable and the chance of product B being bought as the dependent variable.
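For clarity, this is roughly the calculation I mean, written out in plain Python. The daily shares below are made-up numbers, not my real data, and `welch_t` is just a name chosen for this sketch.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t-test of mean(b) - mean(a)."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(b) - mean(a)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va**2 / (len(a) - 1) + vb**2 / (len(b) - 1))
    return t, df

# Made-up daily shares of product B (one value per date).
weekday_share = [0.28, 0.30, 0.32, 0.29, 0.31]
weekend_share = [0.35, 0.37, 0.36]
t_stat, df = welch_t(weekday_share, weekend_share)
```

The p-value comes from a t distribution with `df` degrees of freedom, which the standard library does not provide; if SciPy is available, `scipy.stats.ttest_ind(weekday_share, weekend_share, equal_var=False)` performs the same test and reports the p-value.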

The outcome was significant at any conventional level of alpha, so the means differ between weekdays and weekends (on average a 30% chance of product B on a weekday versus 36% on the weekend). I’m hesitating because the raw data consists only of categorical variables and I turned one of them into a continuous variable.

