How do you handle a continuous variable whose “missing” values are themselves informative?

I work in a field related to the automotive industry. We run models on a variety of topics, but recently ran into an issue with a model that uses horsepower as an input.

The data I receive is aggregated from various manufacturers. Recently, there has been some criticism that electric cars do not have internal combustion engines and therefore do not have a proper horsepower rating.

The issue: I have thousands of cars in my data with a wide range of horsepower values (certainly a continuous measure). This variable has proven predictive in this model time and again, but recently Tesla and other manufacturers have had the horsepower measure removed. From a data perspective, this is an incredibly important data point. While horsepower is still reported for other cars, its absence indicates that a car is electric, and those cars should still be included.

My question to you: how would you best treat this variable so that these cars can stay in the model? Remove it altogether and use other variables? Make it categorical, with 500+ levels? Apply the average HP to electrics? (I would be concerned about applying the “average”, since when HP was still reported for these cars it could be at the upper end of the spectrum.) Any thoughts on the matter would be greatly appreciated!
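One commonly suggested option for this kind of problem, sketched here under assumptions and not as the definitive answer: keep horsepower continuous, add a 0/1 flag for cars with no reported value, and fill the gaps with a placeholder such as the median of the reported values. In a regression-type model the flag can then absorb the difference between the electrics and the placeholder, so the choice of placeholder matters less. The function name `add_missing_indicator`, the use of `None` for unreported values, and the sample numbers are all hypothetical.

```python
from statistics import median


def add_missing_indicator(hp_values):
    """Return (filled_hp, is_missing) for a list where None marks unreported HP."""
    observed = [v for v in hp_values if v is not None]
    # Placeholder fill value (an assumption; any constant works with the flag).
    fill = median(observed)
    filled = [fill if v is None else v for v in hp_values]
    # 1 where horsepower was not reported (e.g. an electric car), else 0.
    is_missing = [1 if v is None else 0 for v in hp_values]
    return filled, is_missing


# Made-up horsepower values; None stands for a car with no reported HP.
hp = [150, 300, None, 210, None]
filled, flag = add_missing_indicator(hp)
print(filled)  # [150, 300, 210, 210, 210]
print(flag)    # [0, 0, 1, 0, 1]
```

Both columns would then go into the model together. Tree-based models can often split on the flag directly, so the same two-column setup tends to work there as well.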

submitted by /u/zarjaa

Need help with power analysis

I have to do a power analysis for my dissertation. However, my chair stated he does not know how to do one. One committee member says I need one as well, but he is unresponsive, and the other committee member says I don’t need it at all. I’d rather be on the safe side and have one. Here are the tests we will be running for my study, which is focused on establishing the psychometric properties of a self-report instrument under development:

Split-half reliability

Internal consistency (Cronbach’s alpha)

Test-retest reliability

Differential validity (comparing my instrument to another one)

I would really appreciate it if someone could help me run a power analysis for each of these. I’m just feeling so stuck. 🙁 Feel free to message me if you need more details or would like to help!
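For the correlation-type measures in the list (test-retest reliability, or the correlation with another instrument), one possible starting point is the Fisher z approximation for the sample size needed to detect a correlation `r1` against a null value `r0`. This is only a rough sketch under assumptions: a two-sided test, approximately bivariate normal scores, and illustrative values for `r1`, `r0`, alpha and power that are not from the study. The function name is hypothetical, and Cronbach’s alpha and split-half reliability would need different methods.

```python
from math import atanh, ceil
from statistics import NormalDist


def sample_size_for_correlation(r1, r0=0.0, alpha=0.05, power=0.80):
    """Approximate n to detect correlation r1 against null r0 (two-sided test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    # Difference between the two correlations on the Fisher z scale.
    effect = abs(atanh(r1) - atanh(r0))
    return ceil(((z_alpha + z_beta) / effect) ** 2 + 3)


# Illustrative inputs only: detect r = 0.3 against a null of zero.
n_vs_zero = sample_size_for_correlation(0.3)
print(n_vs_zero)  # 85

# For reliability, a non-zero null is often more meaningful,
# e.g. showing that reliability exceeds 0.7 when the true value is 0.8.
n_vs_benchmark = sample_size_for_correlation(0.8, r0=0.7)
print(n_vs_benchmark)
```

Exact methods in dedicated power software can give slightly different numbers; the approximation is mainly useful for seeing how the required sample size grows as the gap between `r1` and `r0` shrinks.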

submitted by /u/smalldoglady

Ways to find people interested in math in your area

I know that the title sounds a bit like a porn ad, but I genuinely want to know how to find people I can talk with about math. I am also rather young and would prefer to talk to people my age (~16 years). Is there a website that lists math clubs? I live in Europe and I can only find websites from America or alumni ones (for universities).

submitted by /u/candlelightener

Is this paper incorrectly omitting the use of false discovery rate correction methods?


See this paper; Table 3 is what I’m focusing on. They used Mann-Whitney p-values with a cutoff of .05, but don’t seem to make any correction for the false discovery rate, which seems wrong given the large number of comparisons they made (268 in total).

Am I right in saying that using this p-value cutoff without correcting for the false discovery rate probably gave them some erroneous results?
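To make the concern concrete: if all 268 null hypotheses were true, a .05 cutoff would be expected to flag about 268 × 0.05 ≈ 13 comparisons by chance alone. Below is a minimal sketch of the Benjamini-Hochberg step-up procedure, one standard false discovery rate correction, to show how a corrected cutoff can differ from the raw one. The p-values are made up for illustration and are not taken from the paper, and the function name is hypothetical.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return booleans marking which hypotheses are rejected at FDR level q."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= (k / m) * q.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            max_k = rank
    # Reject every hypothesis whose rank is at most max_k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            rejected[idx] = True
    return rejected


# Expected chance findings if every one of the 268 nulls were true.
expected_false_positives = 268 * 0.05  # about 13

# Made-up p-values, not from the paper.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
uncorrected = sum(p <= 0.05 for p in pvals)
corrected = sum(benjamini_hochberg(pvals))
print(uncorrected)  # 5
print(corrected)    # 2
```

In this toy example, three of the five results that pass the raw .05 cutoff do not survive the correction.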

submitted by /u/runninggartman

Dream Car Monday: You can buy anything you want, but the ONLY thing you can listen to through the stereo is music released the model year of your vehicle.

You want to drive a 2005 Ford GT? Great! Enjoy listening to 50 Cent, Kelly Clarkson, and The Killers.

submitted by /u/Smitty_Oom