Why do so many people think they know stats after taking a watered-down intro course?

I honestly wouldn't mind if it weren't for them being surprised when I tell them that I major in stats (they don't believe that you can fill an entire curriculum with that).

There are so many people who don't know that statistics is built on very complex mathematical foundations and who think it's all about calculating medians and averages.

Apparently the only way for the discipline of statistics to get recognition (at least in Europe) is to be merged with some bullshit marketing term like “data science”. So frustrating.

submitted by /u/Asosas

Stars and Bars formula – does the analogous formula with Stirling Numbers of the Second Kind mean anything?

I've been doing a bit of combinatorics work, and this interesting thing popped out. We know that the number of ways to put n identical objects into k distinct bins is C(n+k-1, n), where C(n, k) is the binomial coefficient (or choose function). Is there a physical interpretation for the analogous S(n+k-1, k) with Stirling Numbers of the Second Kind? I was writing a tricky problem and that expression popped out of some generating function manipulations, so I'm not sure what its actual combinatorial meaning is supposed to be.

I’d really appreciate any insights anyone could share.
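Not an interpretation, but in case it helps with experimenting, here is a small sketch that computes the stars-and-bars count C(n+k-1, n) next to the analogous S(n+k-1, k), using the standard recurrence for Stirling numbers of the second kind:

```python
# Compare C(n+k-1, n) (stars and bars) with S(n+k-1, k) (Stirling, second kind).
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def stirling2(n: int, k: int) -> int:
    """Number of ways to partition an n-element set into k nonempty blocks."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)

for n, k in [(3, 2), (4, 3), (5, 2)]:
    print(n, k, comb(n + k - 1, n), stirling2(n + k - 1, k))
```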

submitted by /u/ShisukoDesu

How do I find out which predictor explains the most variance in the model?

Experiment: there are 300 rats.

I give them medicine A, medicine B, and medicine C, and I let them run in the wheel for 15 minutes every day.

I'm interested in modelling how the blood pressure of the rats changes over time. My dependent variable (Y) is the blood pressure of the rats.

Predictors are: medicine A, medicine B, medicine C, and running in the wheel for 15 minutes per day during the first week, then gradually increasing the "sport activity" by 15 minutes per week

(first week 15 minutes, second week 15 minutes + 15 minutes, third week 15 minutes of running activity × 3, and so on).

I measure the blood pressure of the rats in January, then in February, then in March (monthly), and I find that it is increasing.

Now I want to build a model that tells me which of the predictors has had the greatest impact on the increase in blood pressure. How do I know whether medicine A, medicine B, medicine C, or letting them run in the wheel is the most impactful predictor of the increase in blood pressure? Which predictor best explains the dependent variable (Y)? Which predictor has THE most influence on Y?
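Not a full recipe, but here is a minimal sketch of one common approach, under assumptions: the data sits in a table with hypothetical column names (med_A, med_B, med_C, exercise_min, blood_pressure), the predictors are z-scored so their coefficients are on a comparable scale, and a plain OLS fit is used just to illustrate the comparison. With repeated monthly measurements per rat, a mixed-effects model (e.g. statsmodels' mixedlm) would be the more appropriate tool, but the "compare standardized coefficients and variance explained" idea is the same.

```python
# A minimal sketch (simulated data, hypothetical column names), not the actual experiment:
# z-score the predictors, then compare standardized OLS coefficients and R^2.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 300
df = pd.DataFrame({
    "med_A": rng.integers(0, 2, n),               # 1 if the rat received medicine A
    "med_B": rng.integers(0, 2, n),
    "med_C": rng.integers(0, 2, n),
    "exercise_min": rng.choice([15, 30, 45], n),  # weekly running dose in minutes
})
# Fake response just so the example runs end to end
df["blood_pressure"] = (120 + 4 * df.med_A + 1 * df.med_B - 2 * df.med_C
                        + 0.1 * df.exercise_min + rng.normal(0, 5, n))

# Put predictors on a common scale so their coefficients are comparable
for col in ["med_A", "med_B", "med_C", "exercise_min"]:
    df[col] = (df[col] - df[col].mean()) / df[col].std()

model = smf.ols("blood_pressure ~ med_A + med_B + med_C + exercise_min", data=df).fit()
print(model.params)     # largest |coefficient| = strongest standardized association
print(model.rsquared)   # share of variance in Y explained by the model
```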

submitted by /u/luchins

Basic Stats – Linear Regression

Hi guys. I tried to Google my issue, but can someone guide me on how to solve this?

I have: the standard deviation (sd) of x is 13, the sd of y is 16, and r is 0.65. From there I found the slope, which is 0.8. But for the life of me I cannot find a way to get the y-intercept without the mean values of x and y. We don't have a data table. They just ask us for a prediction of a y value, but without the intercept I cannot calculate it.

I know there's some hint involving the mean-mean point, but I don't know how to use it.

Edit: Question:

In a large class, there were 267 students who wrote both the midterm and the final exam. The standard deviation of the midterm grades was 13, and that of the final exam was 16. The correlation between the grades on the midterm and the final was 0.65. Based on the least squares regression line fitted to the data of the 267 students, if a student scored 20 points below the mean on the midterm, then how many points below the mean on the final would you predict her final exam grade to be?
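In case it helps: the hint about the mean-mean point is that the least-squares line always passes through (x̄, ȳ), so you can work entirely in deviations from the means and never need the intercept. A quick sketch with the numbers from the question:

```python
# Regression prediction in deviation-from-the-mean form (no intercept needed).
sd_x, sd_y, r = 13, 16, 0.65

slope = r * sd_y / sd_x     # b = r * (sd_y / sd_x) = 0.8
dev_x = -20                 # 20 points below the midterm mean
dev_y = slope * dev_x       # the line passes through (mean_x, mean_y)
print(slope, dev_y)         # 0.8, -16.0 -> predict 16 points below the final-exam mean
```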

submitted by /u/maybe_babyyy_

Topological density as an “approximate” notion of cardinality

Cantor's theorem shows us that there are sets of uncountable cardinality, such that we will never be able to write an unambiguous finite description of each element in the set (an enumeration). The real numbers (cardinality 2^ℕ) and the first uncountable ordinal ω_1 are the prototypical examples of such sets. This has lent itself at times to the notion that most reals or large countable ordinals are just non-constructible "junk" numbers that can never be described or talked about in any fashion.

The interesting thing about the reals, though, is that we can relax the above criterion a bit to get something slightly better for the real world by looking at finite approximate descriptions that are arbitrarily accurate. That is, if we are willing to allow an arbitrarily small choice of error ε, we can finitely represent any real number r to within a difference of at most ε. (Trivially, we can just pick some rational number within ε of r.)

So there is some sense in which the reals are “approximately countable” in a way that other sets may not be. I thought this might be a nontrivial property and worth generalizing to see how sets might be drawn into “approximate equivalence classes” this way.

Clearly, there needs to be more structure than just that of a set for the notion of "approximable" to make any sense. Topological spaces seem to fit the bill: there is the notion of the "density" of a space, which is the smallest cardinality of a dense subset. In particular, "separable" spaces have countable density, so you can always give a finite description for a point arbitrarily close to any other point in the space.
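If it helps to pin the definition down, here is one standard way to write the density notion sketched above (just notation, nothing new):

```latex
% Density of a topological space X: the least cardinality of a dense subset.
% X is separable exactly when its density is at most countable.
d(X) = \min\{\, |D| \;:\; D \subseteq X,\ \overline{D} = X \,\}, \qquad
X \text{ separable} \iff d(X) \le \aleph_0
```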

Is there a better way to formalize this intuition though? For instance, the space of real-valued bounded sequences isn’t separable, but is there some deeper sense in which it is approximable? Likewise with the first uncountable ordinal ω_1?

submitted by /u/nothingtoseeherelol

Can someone help me figure out an expected value calculation?

If there is a better sub to submit this to, please let me know.

I am in a football pick 'em pool where each week we pick the game winners and receive points for correct picks.

This week, there are 2 games to pick with 4 teams to choose from. Each game is worth 5 points this week for a possible total of 10 points.

The current point rankings before this week are

Player 1 – 222 pts

Player 2 (me) – 218 pts

Player 3 – 215 pts

At the beginning of the season, we made preseason picks for the teams that would make the Superbowl that are worth bonus points. Each preseason pick to make the Superbowl is worth 3 points and picking the winner of the Superbowl preseason is worth 5 points.

Player 1 and Player 2 (me) each have 1 team live for Superbowl bonus points (both picked New Orleans). Best case scenario = 3 bonus points, plus 5 points if New Orleans wins the Superbowl.

Player 3 has 2 teams live for the Superbowl (New England and Los Angeles). Best case scenario is 6 bonus points, plus 5 points if Los Angeles wins the Superbowl.

So, taking into account current points, the points from this week's picks, and possible bonus points, how would I find the best expected-value picks for this week?

New Orleans vs Los Angeles

Kansas City vs New England

I know that all of this so far assumes each team has an equal projected chance of winning. In actuality, New Orleans is projected to have a 63% chance to win and Kansas City a 60% chance. I don't know if this is necessary info or if it makes things too convoluted, but I wanted to provide as much info as possible.

So, this week’s picks are worth 5 points each.

Bonus picks are worth 3 points each for picking the 2 Superbowl participants. 5 points for picking the Superbowl winner.

I can't know the other players' picks, but I can guess. That makes it a bit harder to write out decision trees for all of this, because I wouldn't know which tree would be accurate.

Is there any way someone could help me?
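Not a complete solution, but as a starting point, here is a minimal sketch of the expected-value mechanics for this week's two games alone, using the 63% and 60% win probabilities from the post. It deliberately ignores the bonus points and the other players' unknown picks, which is where the real decision problem lives, so treat it as a building block rather than the answer:

```python
# Expected points this week for each possible pair of picks (bonus points and
# opponents' picks intentionally left out of this sketch).
from itertools import product

P_NO_BEATS_LA = 0.63   # New Orleans over Los Angeles
P_KC_BEATS_NE = 0.60   # Kansas City over New England
POINTS_PER_GAME = 5

def expected_points(pick_no: bool, pick_kc: bool) -> float:
    """Expected points if you pick New Orleans (else LA) and Kansas City (else NE)."""
    ev = POINTS_PER_GAME * (P_NO_BEATS_LA if pick_no else 1 - P_NO_BEATS_LA)
    ev += POINTS_PER_GAME * (P_KC_BEATS_NE if pick_kc else 1 - P_KC_BEATS_NE)
    return ev

for pick_no, pick_kc in product([True, False], repeat=2):
    label = f"{'NO' if pick_no else 'LA'} + {'KC' if pick_kc else 'NE'}"
    print(label, expected_points(pick_no, pick_kc))
```

Extending this to the full question would mean enumerating the Superbowl-relevant outcomes and, for each guess about the other players' picks, computing the probability of finishing first rather than just the expected points.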

submitted by /u/OrangeYouExcited

Is there a physical representation of a 4-term expansion?

That might've been a little confusing, sorry. We all know about Pascal's triangle, used for binomial expansions: a geometric representation of the binomial theorem. Move one step up, to trinomials, and the expansion yields a pattern that can be viewed three-dimensionally as a pyramid. I'm wondering if there is a geometric figure, perhaps multiple pyramids, for a 4-term expansion.
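For what it's worth, the coefficients of a 4-term expansion can at least be generated and inspected layer by layer, the same way Pascal's triangle stacks into the trinomial pyramid; each layer is the set of multinomial coefficients n!/(i! j! k! l!) over i+j+k+l = n. A small sketch (the function name is just for illustration):

```python
# Coefficients of (a+b+c+d)^n, indexed by the exponent tuple (i, j, k, l).
from math import factorial

def quadrinomial_layer(n: int):
    """All multinomial coefficients n!/(i! j! k! l!) with i+j+k+l = n."""
    layer = {}
    for i in range(n + 1):
        for j in range(n + 1 - i):
            for k in range(n + 1 - i - j):
                l = n - i - j - k
                layer[(i, j, k, l)] = factorial(n) // (
                    factorial(i) * factorial(j) * factorial(k) * factorial(l)
                )
    return layer

print(quadrinomial_layer(2))  # e.g. coefficient of a*b is 2, of a^2 is 1
```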

submitted by /u/mybetafish

What is the justification for identifying R^n with the intuitive notion of “flat space” that we all have in our heads?

I'm asking about something more than the formal properties of R^n. I have some intuitive sense of why R, as the complete ordered field, should have "no holes", but I really have no sense of why we should expect it to be "flat" or "straight" (in a general, intuitive sense of those words), or why we should expect R^2 to represent the plane we all visualize when thinking about 2-D (Euclidean) geometry, etc. Rather, throughout my mathematical education, I have had trouble convincing myself that this image should correspond any more to the set {(x,y) in R^2 | sqrt(x^2 + y^2) = r} than to any other circle in any other metric space. Maybe my question here isn't well-formed, or maybe I simply need to learn more math and the answer will be obvious (edit: or maybe the answer is just empirical, i.e., R^2 just turns out to be a really good model of the average flat surface, for some reason). But I've felt uneasy about this for a long time, so if anyone is able to enlighten me, that would be wonderful!

submitted by /u/Max1461

If we rely only on mean, median, and mode to describe data, what are we missing?

In previous classes I have studied frequency distributions and measures of central tendency such as mean, median, and mode. If we rely only on those tools to describe data, what important component is missing? What would be an example of the missing component?

Is what I'm looking for qualitative data? Is there anything else I could be missing to answer the question?
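One small illustration of the kind of thing those three summaries alone cannot distinguish (spread is a commonly cited missing piece): two datasets can share the same mean, median, and mode and still look very different. A quick sketch:

```python
# Same mean, median, and mode; very different spread.
import statistics as st

a = [4, 5, 5, 6]
b = [0, 5, 5, 10]

for name, data in [("a", a), ("b", b)]:
    print(name, st.mean(data), st.median(data), st.mode(data), st.pstdev(data))
# Both have mean 5, median 5, mode 5, but the standard deviations differ a lot.
```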

submitted by /u/layshea