SPSS SYNTAX for A’ (A-prime)

I am conducting a test with signal detection theory (‘old’/‘new’ and ‘same’/‘different’), and have the syntax and results for hit rates, correct rejections, adjusted HR, adjusted false alarms (to ensure d’ does not go to infinity) and also bias (c). However, some of my data is negatively skewed, and I have found that d’ is not appropriate in that case and that I should use A’ instead.

When I have looked at articles with A’, I managed to find this:

COMPUTE APRIME = .5 + (ABS(HR - FA)/(HR - FA))*((HR - FA)**2 + ABS(HR - FA)) / (4*MAX(HR,FA) - 4*HR*FA).


(Stanislaw & Todorov, 1999)
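
For cross-checking the SPSS output, the same expression can be sketched in Python. This is only an illustration: the function name `a_prime` is made up here, and returning 0.5 when HR equals FA is an assumption, since the formula above divides zero by zero in that case.

```python
def a_prime(hit_rate, fa_rate):
    """A' from a hit rate and a false-alarm rate, following the COMPUTE line above."""
    diff = hit_rate - fa_rate
    if diff == 0:
        # ASSUMPTION: treat HR == FA as chance performance.
        # The original expression is 0/0 here and has no defined value.
        return 0.5
    sign = 1.0 if diff > 0 else -1.0  # same role as ABS(HR - FA)/(HR - FA)
    denom = 4 * max(hit_rate, fa_rate) - 4 * hit_rate * fa_rate
    return 0.5 + sign * (diff ** 2 + abs(diff)) / denom


# Example with made-up rates: HR = 0.8, FA = 0.2
print(a_prime(0.8, 0.2))
```

If a few rows computed by hand this way match the APRIME column in SPSS, the syntax is doing what the formula says.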

However, when I input this into SPSS with my own variable names some of the functions do not come up in red (i.e., the second ABS – the first one does appear red). Note: I am fairly new to SPSS Syntax, but I understand that bright red means an error somewhere.

My question is therefore:

Will it matter in computing my data if the functions (i.e., ABS, MAX etc.) are not coloured red? Will the SYNTAX ignore them or include them?

Thank you all in advance. Feel free to ask if I have not explained something properly or if you have any questions.

submitted by /u/Commander_Nayr

PyCM 1.7 released: Machine learning library for confusion matrix statistical analysis


Changelog :

  • Gini Index (GI) added
  • Example-7 added
  • pycm_profile.py added
  • class_name argument added to stat, save_stat, save_csv and save_html methods
  • overall_param and class_param arguments empty list bug fixed
  • matrix_params_calc, matrix_params_from_table and vector_filter functions optimized
  • overall_MCC_calc, CEN_misclassification_calc and convex_combination functions optimized
  • Document modified

submitted by /u/sepandhaghighi

Calculating the relative likelihood with AIC values

I’m using AIC for model selection, and would like to use a relative likelihood measure to quantify how many times a model with minimum AIC (AICmin) fits better than the alternative (with AICi).

For that, I’m using the formula from Burnham et al. (2011), which is:

RL = exp ( 0.5 * ( AICmin - AICi )) 

The expression is quite easy. However, I’m worried that I’m missing something. In my case, AICmin is negative (AICmin = -239.10, AICi = 210.43), which makes the difference term (AICmin - AICi) also negative, and thus gives a relative likelihood on the order of zero (RL = 2.43e-98). In the original article I don’t find any reference saying that the difference should be absolute, but if so, the ratio becomes too high (RL = 4.11e+97) to be credible. Am I missing something? Thank you!
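
The arithmetic in the post can be reproduced with a few lines of Python. This is a minimal sketch using only the two AIC values quoted above; the names `relative_likelihood` and `evidence_ratio` are chosen here for illustration.

```python
import math


def relative_likelihood(aic_min, aic_i):
    """RL = exp(0.5 * (AICmin - AICi)), as written in the post."""
    return math.exp(0.5 * (aic_min - aic_i))


aic_min, aic_i = -239.10, 210.43  # values quoted in the post

rl = relative_likelihood(aic_min, aic_i)

# Taking the absolute difference instead just flips the sign of the exponent,
# so that number is the reciprocal of RL.
evidence_ratio = 1.0 / rl

print(f"RL = {rl:.3g}, 1/RL = {evidence_ratio:.3g}")
```

The two numbers in the post are therefore the same quantity read in opposite directions: one is the reciprocal of the other.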

submitted by /u/EuGENE87

Box-Cox and Tidwell transformation

I’ll make this as short as possible:

Data on sales price for houses as well as their size in square meters:

Simple linear model: Sales_price ~ Sq_m

Residual analysis (residuals vs. fitted values and Q-Q plot) looks good; initially no apparent need for transformation.

Run Box-Cox and discover that the 95% confidence interval for the lambda that maximizes the log-likelihood is between 0.5 and 0.55; further analysis suggests 0.537. Since lambda = 1 isn’t inside the interval, that suggests a transformation is needed, or at least statistically justified, correct?

Now the question is that once the response has been Box-Cox transformed with the above lambda, does it then make sense to seek a Box-Tidwell transformation of ‘Sq_m’ as well? And if so should the analysis be done using the newly transformed response or using the original data?
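
To make the confidence-interval reasoning concrete, here is a rough Python sketch of a Box-Cox profile log-likelihood for a simple linear model. Everything in it is an assumption for illustration: the data are simulated (not the housing data from the post), the names `sq_m` and `price` are placeholders, and the interval is the usual likelihood-ratio one (all lambda whose log-likelihood is within half the 95% chi-square(1) quantile of the maximum).

```python
import math
import random


def boxcox(y, lam):
    """Box-Cox transform of a single positive value."""
    return math.log(y) if abs(lam) < 1e-12 else (y ** lam - 1.0) / lam


def profile_loglik(x, y, lam):
    """Profile log-likelihood of lambda for the model boxcox(y) ~ x."""
    n = len(y)
    z = [boxcox(v, lam) for v in y]
    mx, mz = sum(x) / n, sum(z) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxz = sum((a - mx) * (b - mz) for a, b in zip(x, z))
    slope = sxz / sxx
    rss = sum((b - mz - slope * (a - mx)) ** 2 for a, b in zip(x, z))
    # Gaussian log-likelihood up to a constant, plus the Jacobian term.
    return -0.5 * n * math.log(rss / n) + (lam - 1.0) * sum(math.log(v) for v in y)


# SIMULATED data (hypothetical): sqrt(price) is linear in size plus noise.
rng = random.Random(0)
sq_m = [50 + 2.5 * i for i in range(60)]
price = [(10 + 0.2 * s + rng.gauss(0, 0.5)) ** 2 for s in sq_m]

grid = [i / 100 for i in range(0, 151)]  # lambda from 0.00 to 1.50
loglik = [profile_loglik(sq_m, price, lam) for lam in grid]
ll_max = max(loglik)
lam_hat = grid[loglik.index(ll_max)]

cutoff = ll_max - 0.5 * 3.841458820694124  # 95% quantile of chi-square(1)
inside = [lam for lam, v in zip(grid, loglik) if v >= cutoff]
ci = (min(inside), max(inside))

print(f"lambda_hat = {lam_hat}, 95% CI = {ci}")
```

If lambda = 1 falls outside `ci`, the untransformed response is rejected at the 5% level, which is the same logic as in the post.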

submitted by /u/DrChrispeee

Are Imaginary/Complex numbers or Trig relevant to Stats?

I looked this up and the only thing I could find was regarding Characteristic Functions (the stats version of what I as an engineer know better as the Fourier Transform, but with a different sign).

Are there any other places where they are used?

How about Trigonometry? I have not seen an area where Trig is very relevant either.

submitted by /u/ice_shadow

Correlating ordinal and categorical data

Correlating continuous variables is straightforward even for a noob like me.

But if I have ordinal data (not normally distributed) and categorical or other ordinal data, how can I correlate them or show a relationship?

Someone suggested Kruskal-Wallis as a one way ANOVA for non parametric data. Is this correct?
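
For reference, the Kruskal-Wallis H statistic mentioned above can be sketched in plain Python. This is only an illustration under assumptions: the function name `kruskal_wallis_h` and the example scores are hypothetical, and the tie correction used is the standard one (dividing by 1 - sum(t^3 - t)/(N^3 - N)).

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H with tie correction for a list of samples."""
    pooled = sorted(v for g in groups for v in g)
    n_total = len(pooled)

    # Assign average ranks to tied values and accumulate the tie term.
    rank_of = {}
    tie_term = 0
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        t = j - i
        tie_term += t ** 3 - t
        i = j

    rank_sum_term = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12 / (n_total * (n_total + 1)) * rank_sum_term - 3 * (n_total + 1)
    correction = 1 - tie_term / (n_total ** 3 - n_total)
    return h / correction


# HYPOTHETICAL ordinal scores (1-5) for three categories.
scores_by_category = [[1, 2, 2, 3], [2, 3, 4, 4], [3, 4, 5, 5]]
print(kruskal_wallis_h(scores_by_category))
```

H is compared against a chi-square distribution with (number of groups - 1) degrees of freedom to get a p-value.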

submitted by /u/JSS35

Advice needed – applied statistics

Hello everyone, I’m studying actuarial science at university, and many of the subjects have to do with math and statistics, which I really like and I’m generally good at. But one thing that’s bugging me is that even though I’m halfway through my career and “know” quite a lot about statistics, I realize that I have no idea how to use it or apply it whatsoever. Can anyone recommend any good resources (books, or whatever is fine) to learn a little more about applied statistics, especially regarding data analysis and social science investigation?

Thank you in advance, and please excuse my bad English.

submitted by /u/hitmelikethatsnare

anova on a categorical response and continuous predictors?

Here’s my sample of data where tissue is my response and IncytePD.1818836 is my predictor:

   IncytePD.1818836 tissue
1         0.0835033     PL
2         0.1258060     PL
3        -0.0123343     PL
4         0.0232523     PL
5        -0.0048043     PL
56        0.0748163    MML
57       -0.1668530    MML
58       -0.1090200    MML
59        0.1392490    MML
80        0.0013013   PLNM
81       -0.1555230   PLNM
82       -0.1817740   PLNM
83       -0.3269790   PLNM

How do I do anova on this?

I’m supposed to get the mean of each class and then see if the difference is significant.

Is this correct??

aov(liver_data$`IncytePD.1818836` ~ tissue, data = liver_data) 

I’m not sure if the aov function automatically takes the means of the IncytePD.1818836 column, grouped by the 3 classes in the tissue response, and then compares them three ways.

The formula looks a bit weird to me, so I am a bit confused about flipping the response and the predictor:
liver_data$IncytePD.1818836 ~ tissue
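
To show what a one-way ANOVA computes from group means, here is a small Python sketch using only the 13 sample rows printed above (not the full data set). The function name `one_way_anova` is made up for this illustration; it returns the F statistic and its degrees of freedom, and is not a replacement for `aov`.

```python
def one_way_anova(groups):
    """F statistic and degrees of freedom for a dict of {label: values}."""
    all_values = [v for g in groups.values() for v in g]
    grand_mean = sum(all_values) / len(all_values)

    # Between-group variation: how far each group mean is from the grand mean.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    # Within-group variation: how far each value is from its own group mean.
    ss_within = sum(
        (v - sum(g) / len(g)) ** 2 for g in groups.values() for v in g
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Only the sample rows shown in the post.
groups = {
    "PL": [0.0835033, 0.1258060, -0.0123343, 0.0232523, -0.0048043],
    "MML": [0.0748163, -0.1668530, -0.1090200, 0.1392490],
    "PLNM": [0.0013013, -0.1555230, -0.1817740, -0.3269790],
}

f_stat, df_between, df_within = one_way_anova(groups)
print(f"F({df_between}, {df_within}) = {f_stat:.3f}")
```

Note that the continuous column is the quantity being averaged within each tissue class, which is why it sits on the left-hand side of the formula.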

Thank you.

submitted by /u/urmyheartBeatStopR

Help! How do I use RStudio (or not?) to complete a hypothesis test (paired t-test) without the data set?

I need to conduct a hypothesis test with a significance level of .05 to determine whether the mean differs significantly before and after an exercise. I need to find the test statistic, p-value, and degrees of freedom and interpret them with respect to the “data”. However, the only information given is the following:

Before exercise Mean: 255.63 SD: 115.48

After exercise Mean: 284.75 SD: 132.64

Difference Mean: 29.13 SD: 21.01

Additionally, if you actually calculate the difference between the before and after means and SDs yourself, the result is different from what’s given.

What’s the deal?
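
A paired t-test only needs the mean and SD of the differences plus the number of pairs, so it can be done from the summary above. The sketch below is in Python rather than R and rests on one big assumption: the post does not state the sample size, so `n = 8` is a hypothetical placeholder. The p-value is obtained by numerically integrating the t density; the helper names are made up here.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value via Simpson's rule on [0, |t|] (steps must be even)."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3  # P(0 < T < |t|)
    return max(0.0, 1.0 - 2.0 * central)


mean_diff, sd_diff = 29.13, 21.01  # "Difference" row from the post
n = 8  # HYPOTHETICAL: the number of pairs is not given in the post

t_stat = mean_diff / (sd_diff / math.sqrt(n))
df = n - 1
p_value = two_sided_p(t_stat, df)

# The SD of the differences is not the difference of the SDs:
# Var(D) = s1^2 + s2^2 - 2*r*s1*s2, so the three SDs imply a correlation r.
s_before, s_after = 115.48, 132.64
implied_r = (s_before**2 + s_after**2 - sd_diff**2) / (2 * s_before * s_after)

print(f"t = {t_stat:.3f}, df = {df}, p = {p_value:.4f}, implied r = {implied_r:.4f}")
```

The last lines address the mismatch: the means differ by 284.75 - 255.63 = 29.12 (29.13 up to rounding), while the SD of the differences depends on how strongly the before and after values are correlated, so it cannot be found by subtracting the two SDs.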

submitted by /u/thaDEA2470