Confusion on calculating sum of squared errors in R with a linear model

I have a linear model (excerpt below). To my understanding, the SSE would be calculated with summarize(sum(.resid^2)), but the DataCamp exercise is telling me that SSE is calculated by summarize(SSE = var(.resid)). I can’t understand why the equation changed from the presentation to the exercise.

> glimpse(mod_null)
Observations: 507
Variables: 32
$ bia.di 42.9, 43.7, 40.1, 44.3, 42.5, 43.3, 43.5, 44.4, 43.5, 42…
$ bii.di 26.0, 28.5, 28.2, 29.9, 29.9, 27.0, 30.0, 29.8, 26.5, 28…
$ bit.di 31.5, 33.5, 33.3, 34.0, 34.0, 31.5, 34.0, 33.2, 32.1, 34…
$ 17.7, 16.9, 20.9, 18.4, 21.5, 19.6, 21.9, 21.8, 15.5, 22…
$ che.di 28.0, 30.8, 31.7, 28.2, 29.4, 31.3, 31.7, 28.8, 27.5, 28…
$ elb.di 13.1, 14.0, 13.9, 13.9, 15.2, 14.0, 16.1, 15.1, 14.1, 15…
$ wri.di 10.4, 11.8, 10.9, 11.2, 11.6, 11.5, 12.5, 11.9, 11.2, 12…
$ kne.di

submitted by /u/Geologist2010
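A quick numeric check of how the two expressions in the question relate. For a least-squares fit that includes an intercept, the residuals average to zero, so var(.resid) works out to sum(.resid^2) / (n - 1): the variance is the SSE rescaled by a constant, not the SSE itself. The sketch below is plain Python rather than R, and the data (xs, ys) are made up for illustration, not the body-dimension data from the post.

```python
from statistics import mean, variance

# Hypothetical toy data, not the data set from the post.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]

def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

intercept, slope = fit_line(xs, ys)
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(xs, ys)]

n = len(residuals)
sse = sum(r ** 2 for r in residuals)   # sum of squared errors
resid_var = variance(residuals)        # sample variance, divides by n - 1

print(f"SSE           = {sse:.6f}")
print(f"var(resid)    = {resid_var:.6f}")
print(f"SSE / (n - 1) = {sse / (n - 1):.6f}")
```

Because the two quantities differ only by the factor n - 1, a ratio of residual variances from two models fitted to the same rows equals the ratio of their SSEs.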

ICC & F values?

I have computed two different intraclass correlations for some data I’m working on. Both sets have an ICC of 1. In RStudio the output has a column labelled ‘F’. Each data set has a different value for this. I’m really unclear on what this value means. Any help? I have to decide between the two datasets, but as they both have an ICC of 1 I’m stuck. Thanks in advance.
submitted by /u/clickily
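In the usual ANOVA formulation, the F value reported next to an ICC is the ratio of the between-subject mean square to the within-subject mean square, and the ICC is a function of those same two mean squares. The sketch below assumes the one-way random-effects form, ICC(1); the post does not say which ICC form was used, and both data sets here are made up. It shows how two data sets can both display an ICC of 1 after rounding while having very different F values.

```python
from statistics import mean

def icc1_and_f(ratings):
    """One-way random-effects ICC(1) and its F statistic.

    ratings: list of rows, one row per subject, k ratings in each row.
    """
    n = len(ratings)
    k = len(ratings[0])
    grand = mean([v for row in ratings for v in row])
    row_means = [mean(row) for row in ratings]
    ss_between = k * sum((m - grand) ** 2 for m in row_means)
    ss_within = sum((v - m) ** 2
                    for row, m in zip(ratings, row_means)
                    for v in row)
    msb = ss_between / (n - 1)        # between-subject mean square
    msw = ss_within / (n * (k - 1))   # within-subject mean square
    f_stat = msb / msw
    icc = (msb - msw) / (msb + (k - 1) * msw)
    return icc, f_stat

# Hypothetical data: 4 subjects, 2 raters, near-perfect agreement.
set_a = [[10.0, 10.1], [20.0, 20.1], [30.0, 29.9], [40.0, 40.0]]
set_b = [[10.0, 10.01], [20.0, 20.01], [30.0, 29.99], [40.0, 40.0]]

icc_a, f_a = icc1_and_f(set_a)
icc_b, f_b = icc1_and_f(set_b)
print(f"set A: ICC = {icc_a:.2f}, F = {f_a:.1f}")
print(f"set B: ICC = {icc_b:.2f}, F = {f_b:.1f}")
```

Under this form, ICC = (F - 1) / (F + k - 1), so a larger F means the unrounded ICC is closer to 1. An ICC of exactly 1 would require zero within-subject variation, which would make F undefined.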

Preliminary results of the M4 forecast competition: hybrid approaches and combinations of forecasting methods produce greater accuracy

The combination of methods was the king of the M4. Of the 17 most accurate methods, 12 were “combinations” of mostly statistical approaches. The biggest surprise was a “hybrid” approach that utilised both statistical and ML features. This method produced both the most accurate forecasts and the most precise PIs, and was submitted by Slawek Smyl, a Data Scientist at Uber Technologies. According to sMAPE, it was close to 10% more accurate than the combination benchmark. The second most accurate method was a combination of seven statistical methods and an ML one, with the weights for the averaging calculated by an ML algorithm that was trained to minimise the forecasting error through holdout tests. This method was submitted jointly by Spain’s University of A Coruña and Australia’s Monash University. The most accurate and second most accurate methods also achieved an amazing success in specifying the 95% PIs correctly. These are the first methods we are aware of that have done so, rather than underestimating the uncertainty considerably. The six pure ML methods that were submitted in the M4 all performed poorly, with none of them being more accurate than Comb and only one being more accurate than Naïve2. This supports the findings of the latest PLOS ONE paper by Makridakis, Spiliotis and Assimakopoulos.
submitted by /u/true_unbeliever
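For readers unfamiliar with the metric, here is a minimal sketch of sMAPE and of the simplest possible combination, an equal-weight average of two forecasts. The series and both forecasts are made up, and the equal weighting is an assumption for illustration; the competition entries described above used more elaborate schemes, such as weights learned by an ML algorithm.

```python
def smape(actual, forecast):
    """Symmetric mean absolute percentage error, in percent."""
    terms = [2 * abs(a - f) / (abs(a) + abs(f))
             for a, f in zip(actual, forecast)]
    return 100 * sum(terms) / len(terms)

# Hypothetical holdout values and two hypothetical forecasts.
actual = [100.0, 110.0, 120.0, 130.0]
forecast_a = [92.0, 104.0, 111.0, 125.0]    # tends to under-forecast
forecast_b = [109.0, 118.0, 131.0, 138.0]   # tends to over-forecast

# Equal-weight combination (an assumption, not the M4 winners' method).
combined = [(a + b) / 2 for a, b in zip(forecast_a, forecast_b)]

for name, fc in [("A", forecast_a), ("B", forecast_b), ("combined", combined)]:
    print(f"sMAPE {name}: {smape(actual, fc):.2f}%")
```

In this toy case the two forecasts err in opposite directions, so averaging cancels much of the error. That is one intuition for why combinations can do well, though nothing guarantees it on real data.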

Help (a dumb clinician) with sample size calculation for clinical field study

Hi, medical physician working on my PhD here, familiar with basic statistics but considering myself in general very average at statistics (at least calculations!). Grateful for any help 🙂 In our research group we are planning to launch a field trial to validate a novel technique for pap smear analysis (screening for cervical cancer; big problem especially in many low-resource areas!). This technique could potentially improve cancer screening significantly in areas lacking adequate screening. The research question/hypothesis is that our technique is comparable to traditional diagnosis (e.g. microscopy analysis of samples) for the detection of high grade pre-cancerous lesions. The problem is that I am trying to calculate the number of patients/samples needed for the study to be able to say confidently that our technique is not significantly worse than the gold standard, i.e. traditional microscopy analysis (reject the null hypothesis). For the data, we can assume that the prevalence of the pre-cancerous lesions we want to detect is about 5% in the study population. Light microscopy, to which we are comparing our method, has a sensitivity of about 60% and a specificity of about 90% for the detection of these lesions. For the alpha parameter, the traditional 0.05 value is good, and for statistical power 80% would probably be enough (beta = 0.2). I apologise if the question is too simple, but as a more “clinically” oriented person, I’m having a hard time figuring out the best way to estimate the required sample size, perform power calculations etc 🙂 Would it make sense, for example, to compare the methods with kappa statistics, say by assuming that the agreement is better than moderate (k > 0.4)? Thank you so much if you can help explain what would be the most sensible way to solve this! Any help appreciated 🙂 I have worked mainly with Stata, but apparently power calculators etc. are also available online?
submitted by /u/kattenfreja
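One rough way to get an order of magnitude for the question above, offered as a sketch rather than a study design. It uses the normal-approximation sample size for a one-sided non-inferiority test on sensitivity, then scales by prevalence to get the number of women to screen. Several inputs are assumptions not stated in the post: the non-inferiority margin of 10 percentage points, treating alpha = 0.05 as one-sided, assuming the new method's true sensitivity equals the reference 60%, and treating the two methods as independent groups. If both methods are applied to the same samples, a paired calculation would be more appropriate and would give a different number.

```python
from math import ceil
from statistics import NormalDist

def noninferiority_n(p_ref, p_new, margin, alpha=0.05, power=0.80):
    """Cases needed per method for a one-sided non-inferiority test of
    two independent proportions (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_beta = NormalDist().inv_cdf(power)
    variance_sum = p_ref * (1 - p_ref) + p_new * (1 - p_new)
    # Distance between the assumed true difference and the margin.
    effect = (p_new - p_ref) + margin
    return ceil((z_alpha + z_beta) ** 2 * variance_sum / effect ** 2)

sensitivity_ref = 0.60   # from the post (light microscopy)
prevalence = 0.05        # from the post
margin = 0.10            # hypothetical margin, not from the post

# Women with high grade lesions needed, assuming equal true sensitivity.
n_cases = noninferiority_n(sensitivity_ref, sensitivity_ref, margin)

# Women to screen so that about n_cases of them have lesions.
n_screened = ceil(n_cases / prevalence)

print(f"lesion-positive cases needed: {n_cases}")
print(f"approximate number to screen: {n_screened}")
```

The result is very sensitive to the margin, since halving it roughly quadruples the required number of cases. The same function can be reused for specificity with p_ref = 0.90 and the 95% of women without lesions.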

Hypocrisy – Icky competitive intelligence yields much lulz

I’m seeing someone who has no idea this site exists, and I aim to keep it that way. One night she came over for my delicious green chile chicken enchiladas and a piratebay movie and chill. So we’re sitting on the couch and she’s playing with my bitchin’ ThinkPad T520 with the clit mouse and Linux Mint. She signs into her Facebook, and we’re talking about my ex a little, and she admits the reason she reached out was my ex’s post about us breaking up. I thought that was pretty funny. We’d met at a party a few years back (she with her ex, me with mine), and she caught me peeking up her skirt at some pretty white panties. She saw me looking and smiled and spread her legs a little more. It was hot as fuck. We flirted a bit at the time, innocently, until my ex stomped over and started talking at 80 dB. Years later she gave me a call when my ex announced the end on FB and we went for some drinks and a little blow. Girl can drink some whiskey. I actually found a picture of her and her then boyfriend on that glider today. You can see the panties in it. Back on the couch, my lady friend keeps creeping on my ex’s FB, I’m watching the movie and getting annoyed that she’s not paying attention, missing crucial details. Then she says, “Oh, fuck you bitch.” I’m like, “what?” And she hands me the lappy, so I can read a post by the ex saying something idiotic like “I hate pot. Pot destroys lives.” That really annoyed me. It was about me. It was further proof that she still didn’t believe that I was having legit mental health issues. I found some other stuff, confirming suspicions about more fucking lies, but never mind that. I broke cover, and switched to a parachute account to comment on how retarded it was considering her constant alcohol consumption and the obvious discrepancies in all the various alcohol related morbidity rates. Maybe being a drunk for so long had shrunk her brain while enlarging her liver?
Then my friend had to leave early, which was disappointing since I was looking forward to a nice half-n-half. Crazy girls got that good pussy. A month later, I’m checking my email and I see one from our joint account that says overdraft protection was activated. I don’t want this shit in my inbox, so I’m trying to get my name off all these joint accounts, which was impossible apparently (fuck you Wells Fargo). My brain is tired, and the Adderall wore off hours ago. I’m just sort of aimlessly clicking around when the loa of perversity mounted his chwal, and I decide for shits and giggles to look at her account activity. It’s unethical, and a bad idea, not illegal, but still pretty shitty. But my name is on those statements, so. The first thing I see is a transaction to her growler place, and I remember her pot comment, and I get an idea. For reference I spend $167/month on weed. I buy an oz for $250, and it lasts about a month and a half. Longer these days, since I started on the Lamictal. I looked at 5 statements, from 2017-12-23 to 2018-05-23, and averaged out, using just her growler place and watering holes, we’re exactly even. $167/month. However, I noticed that she eats out at least once a day, and if I know my ex, she’s having a beer with every meal she can. She’s also taking out an average of about $500/month in cash, so who knows how much of that is going to booze as well. April was an especially busy month with almost $500(!) spent on beer. Now, the next thing I noticed was interesting. She directly transferred $1,000 in two $500 installments to her new beau. There’s also a bunch of Venmo transfers, often around $150, for a total of $870. The reasons for these transfers are a mystery, and I can only speculate. So I will. For my own amusement, I’m going to assume she’s paying him to be her boyfriend.
submitted by /u/LikeTotesObvi