Confused out of my mind regarding variance-related analyses (x-post r/statistics)

I am looking at baseball exit velocities (how fast the ball comes off the bat) in different stadiums over 2017, 2018, and 2019.

2017 variance: .4
2018 variance: .43
2019 variance: .54

So I am thinking that maybe there were some tracking errors in 2019.

For each stadium, I looked at the hitters who hit there. For each hitter who hit at a stadium, I took their variance at that one stadium minus their variance at all other stadiums. Then, for each stadium, I took the mean of all these differences. For example, I looked at everybody who hit at Yankee Stadium. For each hitter, I calculated the difference between their Yankee Stadium variance and their variance anywhere other than Yankee Stadium. And then I took the average of all these differences for Yankee Stadium.

After computing this average for each stadium, I looked at the stadiums with the biggest average differences. There was one stadium, Stadium A, that had a difference of .15, whereas all other stadiums had differences of less than .1. Stadium A's variance in 2019 also increased .4 from its variance in 2018. So I was thinking that Stadium A would explain the overall increase in variance from 2018 to 2019.

The problem is, removing Stadium A from the 2019 dataset BARELY impacted the 2019 variance (a change of less than 0.01). So this big year-to-year change in variance can't be attributed to Stadium A.

There are other outlier-looking stadiums that had a big change in variance from 2018 to 2019. But in the first analysis, where I compared the average difference in variance at one stadium against all other stadiums, the batters at these outlier stadiums had SIMILAR VARIANCES WHEN COMPARED TO OTHER STADIUMS. And even when I removed these outlier stadiums from the dataset, MY OVERALL VARIANCE BARELY CHANGED FOR 2019 AND WAS STILL GREATER THAN .5.

So yeah, I'm really f*cking confused as to why there was that .1 increase from 2018 to 2019.

I am thinking maybe my methodology of comparing each stadium vs. all other stadiums is wrong because I use an average, and I also remove null values and stuff like that, and some stadiums have more nulls than others. Any ideas?

submitted by /u/hapatheorist
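
One way to see why dropping a single stadium barely moves the overall number is to split the pooled variance into a within-stadium part and a between-stadium part, each weighted by the stadium's share of batted balls. The sketch below is only an illustration of that decomposition (the law of total variance, using population variances); the function name and the toy data are made up, not taken from the post.

```python
from statistics import fmean


def variance_decomposition(groups):
    """Split pooled (population) variance into within- and between-group parts.

    groups: dict mapping a stadium name to a list of exit velocities.
    Returns (total, within, between), where total == within + between.
    """
    all_values = [v for values in groups.values() for v in values]
    n = len(all_values)
    grand_mean = fmean(all_values)

    within = 0.0
    between = 0.0
    for values in groups.values():
        share = len(values) / n  # this stadium's share of all batted balls
        group_mean = fmean(values)
        group_var = fmean([(v - group_mean) ** 2 for v in values])
        within += share * group_var
        between += share * (group_mean - grand_mean) ** 2

    total = fmean([(v - grand_mean) ** 2 for v in all_values])
    return total, within, between


# Hypothetical toy data, only to show the call.
toy = {"A": [1.0, 2.0, 3.0], "B": [10.0, 12.0]}
total, within, between = variance_decomposition(toy)
```

Because each stadium enters with weight equal to its share of the data, one stadium out of about thirty can have a much larger variance and still shift the pooled figure only slightly. Running the decomposition per year would show whether the 2019 increase is spread across most stadiums (within part up everywhere) or comes from stadium means drifting apart (between part up).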

Best way to statistically analyse weight loss?

I record my weight daily and compute a weekly average, along with the SEM. I then plot my weekly weight with the SEM as error bars and note a downward trend. I have two questions:

1) If I weigh the same or more this week than last week, how can I tell if this is likely a real failure to lose weight, or just a result of measurement variability or random chance?

2) How should I best extrapolate my weight loss curve in order to predict my weight in the future? A linear function fits poorly, while a polynomial fit produces weird trends outside of my data range. Exponential decays plateau quickly, which is demotivating.

submitted by /u/womerah
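
For question 1, a rough check is to compare the week-to-week difference in means against the combined standard error of the two weekly averages. The sketch below assumes the daily readings are independent, which real daily weights probably are not (water retention carries over from day to day), so treat the result as optimistic. The function names and the sample readings are invented for illustration.

```python
import math
from statistics import mean, stdev


def weekly_summary(weights):
    """Return (mean, SEM) for one week of daily weights."""
    return mean(weights), stdev(weights) / math.sqrt(len(weights))


def week_change(last_week, this_week):
    """Return (difference in means, its standard error, z-like ratio)."""
    m1, sem1 = weekly_summary(last_week)
    m2, sem2 = weekly_summary(this_week)
    diff = m2 - m1
    se = math.sqrt(sem1 ** 2 + sem2 ** 2)  # SEs of independent means add in quadrature
    return diff, se, diff / se


# Hypothetical readings in kg.
last_week = [80.0, 80.4, 79.8, 80.2, 80.1, 79.9, 80.3]
this_week = [80.1, 80.5, 79.9, 80.3, 80.2, 80.0, 80.4]
diff, se, ratio = week_change(last_week, this_week)
```

If the ratio is well inside roughly plus or minus 2, the change between the two weeks is hard to distinguish from noise under these assumptions.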

car fuse panel problem

ok. so i bought a car. i test drove it. checked everything i thought i should check. worked just fine. great.

not great. a week after i bought it, it started having problems with starting, upshifting, rear view mirror control didn’t function, no cruise control, and trunk wouldn’t open. fek.

first things first: transmission.

after saving up the money i thought i needed to fix the transmission, i went to take it to a shop.

disaster: i started to smell fuel in quite a high concentration. stopped at a gas station to check it out.

fuel was pouring out from the fuel tank under the car.

ohfuckmyshitwithaburger.jpg

called a mobile mechanic and he came by to check it out.

turns out my fuel leak is coming from the line above the fuel tank. that’s gonna cost me 200-300 bucks. god damn it. but it also turns out that the fuel line being jank might also be causing my starting and shifting issue, so there’s hope that the overall costs could be lower than expected. hopehopehopehope.

talked to him about my other issues. dude who sold me the car said there were some fuse problems and left me with a bundle of fuses. ok cool.

mechanic shows me where the interior fuse panel is (idk if that’s what it’s called. please correct me)

so i check it out. i check out the schematic. find out several of the fuses are missing.

check out the fuse bundle the dude left me. i had 1 of the missing fuses. it’s the trunk one. fantastic. now i have a trunk for when i move. not being sarcastic, this is actually pretty great.

now what about the rear view mirror control and cruise control?

can't find mirror control fuse slot. fek.

find cruise control. slot melted. also don't have right fuse. triple fek.

so that brings me to the problem and my question: is there a way to replace the fuse panel so that i can actually install a fuse to make cruise control work? if so, how much you think it’s gonna cost?

it’s a 2005 buick lacrosse, btw.

thanks in advance.

submitted by /u/Tyranim

The Long Train Problem

I thought of a problem while waiting at a railroad crossing. Very simply, it goes like this:

When you approach a railroad crossing (with a train crossing it), how many cars do you want to see waiting there?

Assuming, of course, that you want to spend as little time waiting at the intersection as possible, there are arguments to be made for fewer and for more cars. If you see fewer cars, it’s likelier that the total time the train will have spent at the crossing will be small. If there are more cars, it’s likelier that the train has already spent most of its total time at the intersection. So it’s a bit of a paradox, right?

For the sake of this problem, assume that you’re not able to measure how fast or how long the given train is. However, you have two values to work with: R and f(t). R is the number of cars per unit time which approach the crossing, which is assumed to be constant. f(t) is the continuous probability distribution that a train will spend a total time t at the intersection, given that you encountered it at the intersection. (Without that last given condition, the problem would look different, and probably harder.)

I haven’t spent too much time on this problem, but I think it might be a little tricky. If anyone has any thoughts on it, I’d love to hear them!

(Also, let’s say that you consider only the cars on your side of the tracks.)
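
One possible way to set this up, under assumptions that go beyond the post: you arrive at a uniformly random moment during the train's crossing, and cars arrive steadily at rate R, so seeing N cars suggests the train has been there for about s = N / R. Given a total time t, your elapsed time is then uniform on [0, t], so the total time conditional on elapsed time s has density proportional to f(t) / t for t >= s. The sketch below integrates that numerically with the trapezoid rule; the function name, the grid size, and the two example densities are all illustrative choices.

```python
import math


def expected_remaining_wait(f, s, t_max, steps=20000):
    """Expected remaining wait E[T - s | elapsed time = s], with s > 0.

    f: density of total crossing time T, given that you met the train.
    s: elapsed time so far, estimated as (cars waiting) / R.
    t_max: upper cutoff beyond which f is treated as negligible.
    """
    h = (t_max - s) / steps
    numerator = 0.0
    denominator = 0.0
    for i in range(steps + 1):
        t = s + i * h
        weight = 0.5 if i in (0, steps) else 1.0  # trapezoid end weights
        g = f(t) / t  # unnormalised density of T given elapsed time s
        numerator += weight * (t - s) * g
        denominator += weight * g
    return numerator / denominator  # the step size h cancels in the ratio


def memoryless_case(t):
    # f(t) = t * exp(-t): a length-biased exponential with unit rate.
    return t * math.exp(-t)


def uniform_case(t):
    # f uniform on (0, 10].
    return 0.1 if 0.0 < t <= 10.0 else 0.0


wait_few_cars = expected_remaining_wait(uniform_case, 1.0, 10.0)
wait_many_cars = expected_remaining_wait(uniform_case, 5.0, 10.0)
```

Under these assumptions the answer depends on the shape of f. With f(t) proportional to t * exp(-t), the expected remaining wait is the same for every s, so the number of cars tells you nothing. With a uniform f, the expected wait shrinks as s grows, so more cars is better.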

submitted by /u/Squirrels_are_neat

Fench article on Akshay Venjatesh, Fields Medal 2018

I’m just a French lover of math and was reading the following article about Akshay Venkatesh in the French edition of Scientific American, and wow! (I’m lost for words.)

https://www.pourlascience.fr/sd/mathematiques/akshay-venkatesh-un-mathematicien-batisseur-de-ponts-16726.php

Edit: sorry for typo in title, i’m on mobile

submitted by /u/One_Philosopher

Estimating change in time series A when perturbing time series B

Say we have historical data on two time series. Time series B is its own process and is not influenced at all by time series A, but time series A is highly dependent on time series B. For example, say they have an 80% correlation at lag 0, both in the original series and in their first differences.

What are appropriate models that could help me answer the following: given a perturbation in B in the near future, by how much will A move?

To be more concrete, say B is the oil price and A is an oil company's stock price. If oil increases or decreases by 5 bucks, by how much will the oil firm’s stock price move?
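
A simple starting point, not necessarily the best model for this, is an ordinary least squares regression of the changes in A on the changes in B. The slope (beta) is the expected move in A per unit move in B, which is what a correlation on its own cannot give you: beta equals the correlation times the ratio of the two standard deviations. The sketch below assumes a linear, contemporaneous relationship; the function names and the toy series are invented for illustration.

```python
from statistics import mean


def first_differences(series):
    """Period-to-period changes of a series."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


def ols_slope(x, y):
    """Slope of the least squares line of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def expected_move(a, b, shock):
    """Expected change in A for a given change (shock) in B."""
    beta = ols_slope(first_differences(b), first_differences(a))
    return beta * shock


# Hypothetical series: A is built as exactly 2 * B + 3, so beta is 2.
oil = [50.0, 52.0, 51.0, 55.0, 54.0, 58.0]
stock = [2.0 * price + 3.0 for price in oil]
move = expected_move(stock, oil, 5.0)
```

Regressing on differences (or on returns, for prices) rather than on levels helps avoid spurious fits between trending series. Richer options along the same line include distributed-lag regressions or a VAR with impulse responses.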

submitted by /u/its-trivial

How do I Project a Fiber of the Hopf Fibration?

Yes, I do know the formula for stereographic projection from R^4 to R^3.

(a,b,c,d) gets sent to (a/(1-d), b/(1-d), c/(1-d))

What I don’t quite know how to do is project a fiber around S^3 (one of its great circles).

I would assume that we project “sets of points” for each circle by individually “tracing” out each point on the fiber and then projecting each point, forming a Villarceau circle in R^3.

I do indeed know how to “trace” out a fiber in C^2. I have no problem with this. I was having some trouble learning how to trace out a fiber with the versors. I assume, as the Wikipedia page has it written, that a fiber around S^3 contains the set of quaternion elements where q = u + vp, under the condition that u^2 + v^2 = 1. Also, in this formula, p is a vector quaternion of the form (0 + bi + cj + dk) with a magnitude of 1, which lies on S^2.

So, is that all I need to know in order to trace out a fiber using quaternions? I was reading the wiki page around the section titled, “geometric interpretation using rotations.” They were saying things about p, its antipodal point, and 180 degree rotations which I just couldn’t follow. For example, I have no idea how they derived the other formulas such as this one: https://wikimedia.org/api/rest_v1/media/math/render/svg/d60d533f953eed6b8975fd4c6ca39c7f48d41107.

Anyway, when projecting a fiber, I wanted to know if we strictly have to do this using quaternions. Do we take each quaternion of the form (a + bi + cj + dk), extract its real components (a,b,c,d), project this point, and then repeat this process for every point around the fiber?

Or, do we project from C^2 by taking two complex numbers (z,w), breaking them up into their components (a + bi), (c + di), extracting all of the real numbers (a,b,c,d), and then repeating this process for each point of the fiber?

Or, is there a way to project a fiber directly from C^2 to C x R, instead of the traditional projection from R^4 to R^3?
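
For what it's worth, under the usual identification of C^2 with R^4 (and with the quaternions), the pair (z, w) = (a + bi, c + di) and the quaternion a + bi + cj + dk give the same four real numbers (a, b, c, d), so the two routes should lead to the same projected points. Here is a minimal sketch of the C^2 route: the fiber through a unit-length (z, w) is traced by multiplying both coordinates by the same phase e^(i*theta), and each point is then sent through the projection formula from the post. The function names and the sample base point are illustrative only.

```python
import cmath
import math


def hopf_fiber(z, w, n=100):
    """Return n points (z_k, w_k) on the Hopf fiber through (z, w) in S^3."""
    norm = math.sqrt(abs(z) ** 2 + abs(w) ** 2)
    z, w = z / norm, w / norm  # rescale onto the unit 3-sphere
    points = []
    for k in range(n):
        phase = cmath.exp(1j * 2.0 * math.pi * k / n)
        points.append((phase * z, phase * w))  # same phase on both coordinates
    return points


def stereographic(z, w):
    """Send (a, b, c, d) to (a/(1-d), b/(1-d), c/(1-d)); undefined at d = 1."""
    a, b, c, d = z.real, z.imag, w.real, w.imag
    return (a / (1.0 - d), b / (1.0 - d), c / (1.0 - d))


def hopf_map(z, w):
    """Hopf map S^3 -> S^2; it takes the same value on a whole fiber."""
    p = 2.0 * z * w.conjugate()
    return (p.real, p.imag, abs(z) ** 2 - abs(w) ** 2)


# Hypothetical base point; z is nonzero, so the fiber never reaches d = 1.
fiber = hopf_fiber(1.0 + 0.5j, 0.3 - 0.2j, n=50)
circle_in_r3 = [stereographic(z, w) for z, w in fiber]
```

The `hopf_map` helper is only there as a sanity check: every point of one fiber should land on the same point of S^2.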

submitted by /u/adam717

Average of probabilities

Hi. I’m working on a project where multiple people classify an image. Now I need to process the data I have, but I’m really bad with statistics and would like some help.

Imagine that each person classifies an image and attributes a probability to each option regarding that image.

For example, Person 1 classifies the image as: A=60%, B=20%, C=10%, D=10%.
Person 2 classifies it as: A=40%, B=10%, C=10%, D=40%.
And so on with multiple people.

I want to get a final result with the global probabilities for each option. How do I calculate the average and assign different weights to each person (for example, Person 3 has double the weight of Person 1)? Are there any examples of similar problems that I can read to understand all this stuff?
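
One common approach is a weighted arithmetic mean of the probability vectors (sometimes called a linear opinion pool): multiply each person's probabilities by that person's weight, add them up per option, and divide by the total weight. As long as each person's probabilities sum to 1, the result does too. The sketch below uses the two ratings from the post; the weights and the function name are made up for illustration.

```python
def pool_probabilities(ratings, weights):
    """Weighted average of per-person probability dictionaries.

    ratings: list of dicts mapping option label -> probability.
    weights: list of non-negative weights, one per person.
    """
    total_weight = sum(weights)
    labels = ratings[0].keys()
    return {
        label: sum(w * r[label] for r, w in zip(ratings, weights)) / total_weight
        for label in labels
    }


person_1 = {"A": 0.60, "B": 0.20, "C": 0.10, "D": 0.10}
person_2 = {"A": 0.40, "B": 0.10, "C": 0.10, "D": 0.40}

# Equal weights: a plain average.
equal = pool_probabilities([person_1, person_2], [1, 1])

# Hypothetical weights: the second person counts double.
weighted = pool_probabilities([person_1, person_2], [1, 2])
```

With equal weights this gives A=50%, B=15%, C=10%, D=25%. Doubling a person's weight has the same effect as counting their rating twice.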

submitted by /u/cardrig