The SingleCaseES package provides R functions for calculating basic, within-case effect size indices for single-case designs, including several non-overlap measures and parametric effect size measures, and for estimating the gradual effects model developed by Swan & Pustejovsky (2018). Estimation procedures for standard errors and confidence intervals are provided for the subset of effect sizes indices with known sampling distributions. This vignette covers the mathematical definitions of the basic non-overlap and parametric effect size measures, along with some details about how they are estimated. Parker, Vannest, & Davis (2011) provides a review of the non-overlap measures, including worked examples of the calculations. Pustejovsky (2019) provides a critical review of non-overlap measures and parametric effect sizes. However, neither of these reviews include details about standard error calculations.
All of the within-case effect size measures are defined in terms of a comparison of observations between two phases (call them phase A and phase B) within a single-case design. Let m and n denote the number of observations in phase A and phase B, respectively. Let y1A, ..., ymA denote the observations from phase A and y1B, ..., ynB denote the observations from phase B.
The non-overlap effect size measures are all defined in terms of ordinal comparisons between data points from phase A and data points from phase B. It will therefore be helpful to have notation for the data points from each phase, sorted in rank order. Thus, let y(1)A, y(2)A, ..., y(m)A denote the values of the baseline phase data, sorted in increasing order, and let y(1)B, y(2)B, ..., y(n)B denote the values of the sorted treatment phase data.
The parametric effect size measures are all defined under a simple model for the data-generating process, in which observations in phase A are sampled from a distribution with constant mean μA and standard deviation σA, while observations in phase B are sampled from a distribution with constant mean μB and standard deviation σB. Let ȳA and ȳB denote the sample means for phase A and phase B, respectively. Let sA and sB denote the sample standard deviations for phase A and phase B, respectively. Let zα/2 denote the 1 − α/2 critical value from a standard normal distribution. Finally, we use ln () to denote the natural logarithm function.
Parker & Vannest (2009) proposed non-overlap of all pairs (NAP) as an effect size index for use in single-case research. NAP is defined in terms of all pair-wise comparisons between the data points in two different phases for a given case (i.e., a treatment phase versus a baseline phase). For an outcome that is desirable to increase, NAP is the proportion of all such pair-wise comparisons where the treatment phase observation exceeds the baseline phase observation, with pairs that are exactly tied getting a weight of 1/2. NAP is exactly equivalent to the modified Common Language Effect Size (Vargha & Delaney, 2000) and has been proposed as an effect size index in other contexts too (e.g., Acion, Peterson, Temple, & Arndt, 2006).
NAP can be interpreted as an estimate of the probability that a randomly selected observation from the B phase improves upon a randomly selected observation from the A phase. For an outcome where increase is desirable, the effect size parameter is
θ = Pr(YB > YA) + 0.5 × Pr(YB = YA).
For an outcome where decrease is desirable, the effect size parameter is
θ = Pr(YB < YA) + 0.5 × Pr(YB = YA).
For an outcome where increase is desirable, calculate
qij = I(yjB > yiA) + 0.5I(yjB = yiA) for i = 1, ..., m and j = 1, ..., n. For an outcome where decrease is desirable, one would instead use
qij = I(yjB < yiA) + 0.5I(yjB = yiA).
The NAP effect size index is then calculated as
$$ \text{NAP} = \frac{1}{m n} \sum_{i=1}^m \sum_{j=1}^n q_{ij}. $$
The SingleCaseES package provides several different methods for estimating the standard error of NAP. The default method is calculated based on the exactly unbiased variance estimator described by Sen (1967; cf. Mee, 1990), which assumes that the observations are mutually independent and are identically distributed within each phase. Let
$$ \begin{aligned} Q_1 &= \frac{1}{m n^2} \sum_{i=1}^m \left[\sum_{j=1}^n \left(q_{ij} - \text{NAP}\right)\right]^2, \\ Q_2 &= \frac{1}{m^2 n} \sum_{j=1}^n \left[\sum_{i=1}^m \left(q_{ij} - \text{NAP}\right)\right]^2, \qquad \text{and} \\ Q_3 &= \frac{1}{m n} \sum_{i=1}^m \sum_{j=1}^n \left(q_{ij} - \text{NAP}\right)^2. \end{aligned} $$
The SE is then calculated as
$$ SE_{\text{unbiased}} = \sqrt{\frac{\text{NAP}(1 - \text{NAP}) + n Q_1 + m Q_2 - 2 Q_3}{(m - 1)(n - 1)}}. $$
Another method for estimating a standard error was introduced by Hanley & McNeil (1982). This standard error is calculated as
$$ SE_{\text{Hanley}} = \sqrt{\frac{1}{mn}\left(\text{NAP}(1 - \text{NAP}) + (n - 1)Q_1 + (m - 1)Q_2\right)}, $$
with Q1 and Q2 defined as above. This standard error is based on the same assumptions as the unbiased SE.
A limitation of SEunbiased and SEHanley is that they will be equal to zero when there is complete non-overlap (i.e., when NAP is equal to zero or equal to one). In order to ensure a strictly positive standard error for NAP, the SingleCaseES package calculates SEunbiased and SEHanley using a truncation of NAP. Specifically, the formulas are evaluated using
$$ \widetilde{\text{NAP}} = \text{max}\left\{\frac{1}{2 mn}, \ \text{min}\left\{\frac{2mn - 1}{2mn}, \ \text{NAP} \right\} \right\} $$ in place of NAP.
A final method for estimating a standard error is to work under the null hypothesis that there is no effect—i.e., that the data points from each phase are sampled from the same distribution. Under the null hypothesis, the sampling variance of NAP depends only on the number of observations in each phase:
$$ SE_{\text{null}} = \sqrt{\frac{m + n + 1}{12 m n}} $$ (cf. Grissom & Kim, 2001, p. 141). If null hypothesis is not true—that is, if the observations in phase B are drawn from a different distribution than the observations in phase A—then this standard error will tend to be too large.
A confidence interval for θ can be calculated using a method proposed by Newcombe [Newcombe (2006); Method 5], which assumes that the observations are mutually independent and are identically distributed within each phase. Using a confidence level of 100% × (1 − α), the endpoints of the confidence interval are defined as the values of θ that satisfy the equality
$$ (\text{NAP} - \theta)^2 = \frac{z^2_{\alpha / 2} h \theta (1 - \theta)}{mn}\left[\frac{1}{h} + \frac{1 - \theta}{2 - \theta} + \frac{\theta}{1 + \theta}\right], $$
where h = (m + n)/2 − 1 and zα/2 is 1 − α/2 critical value from a standard normal distribution. This equation is a fourth-degree polynomial in θ, solved using a numerical root-finding algorithm.
Scruggs, Mastropieri, & Casto (1987) proposed the percentage of non-overlapping data (PND) as an effect size index for single-case designs. For an outcome where increase is desirable, PND is defined as the proportion of observations in the B phase that exceed the highest observation from the A phase. For an outcome where decrease is desirable, PND is the proportion of observations in the B phase that are less than the lowest observation from the A phase.
This effect size does not have a stable parameter definition because the magnitude of the maximum (or minimum) value from phase A depends on the number of observations in the phase (Allison & Gorman, 1994; Pustejovsky, 2019).
For an outcome where increase is desirable,
$$ \text{PND} = \frac{1}{n} \sum_{j=1}^n I(y^B_j > y^A_{(m)}), $$
where y(m)A is the maximum value of y1A, ..., ymA. For an outcome where decrease is desirable,
$$ \text{PND} = \frac{1}{n} \sum_{j=1}^n I(y^B_j < y^A_{(1)}), $$
where y(1)A is the minimum value of y1A, ..., ymA.
The sampling distribution of PND has not been described, and so standard errors and confidence intervals are not available.
Ma (2006) proposed the percent exceeding the median, defined as the proportion of observations in phase B that improve upon the median of phase A. Ma (2006) did not specify an effect size parameter corresponding to this index. However, it would be reasonable to define the parameter as the probability that a randomly selected observation from the B phase represents an improvement over the median of the distribution of A phase outcomes. Let ηA denote the median of the distribution of outcomes in phase A. For an outcome where increase is desirable, the PEM parameter would then be
ξ = Pr(YB > ηA) + 0.5 × Pr(YB = ηA). For an outcome where decrease is desirable, it would be ξ = Pr(YB < ηA) + 0.5 × Pr(YB = ηA).
For an outcome where increase is desirable,
$$ \text{PEM} = \frac{1}{n}\sum_{j=1}^n \left[ I(y^B_j > m_A) + 0.5 \times I(y^B_j = m_A) \right], $$
where mA = median(y1A, ..., ymA). For an outcome where decrease is desirable,
$$ \text{PEM} = \frac{1}{n}\sum_{j=1}^n \left[ I(y^B_j < y^A_{(1)}) + 0.5 \times I(y^B_j = m_A) \right]. $$
The sampling distribution of PEM has not been described, and so standard errors and confidence intervals are not available.
For an outcome where increase (decrease) is desirable, Parker, Vannest, & Davis (2011) defined PAND as the proportion of observations remaining after removing the fewest possible number of observations from either phase so that the highest remaining point from the baseline phase is less than the lowest remaining point from the treatment phase (lowest remaining point from the baseline phase is larger than the highest remaining point from the treatment phase).
This effect size does not have a stable parameter definition because its magnitude depends on the number of observations in each phase (Pustejovsky, 2019).
For an outcome where increase is desirable, PAND is calculated as
$$ \text{PAND} = \frac{1}{m + n} \max \left\{\left(i + j\right) I\left(y^A_{(i)} < y^B_{(n + 1 - j)}\right)\right\}, $$
where y(0)A = −∞, y(n + 1)B = ∞, and the maximum is taken over the values 0 ≤ i ≤ m and 0 ≤ j ≤ n. For an outcome where decrease is desirable, PAND is calculated as
$$ \text{PAND} = \frac{1}{m + n} \max \left\{\left(i + j\right) I\left(y^A_{(m + 1 - i)} > y^B_{(j)}\right)\right\}, $$
where y(m + 1)A = ∞, y(0)B = −∞, and the maximum is taken over the values 0 ≤ i ≤ m and 0 ≤ j ≤ n.
The sampling distribution of PAND has not been described, and so standard errors and confidence intervals are not available.
The robust improvement rate difference is defined as the robust phi coefficient corresponding to a certain 2 × 2 table that is a function of the degree of overlap between the observations each phase (Parker, Vannest, & Davis, 2011). This effect size does not have a stable parameter definition because its magnitude depends on the number of observations in each phase (Pustejovsky, 2019).
For notational convenience, let y(0)A = y(0)B = −∞ and y(m + 1)A = y(n + 1)B = ∞. For an outcome where increase is desirable, let ĩ and j̃ denote the values that maximize the quantity
(i + j)I(y(i)A < y(n + 1 − j)B) for 0 ≤ i ≤ m and 0 ≤ j ≤ n. For an outcome where decrease is desirable, let ĩ and j̃ instead denote the values that maximize the quantity
(i + j)I(y(m + 1 − i)A > y(j)B).
Now calculate the 2 × 2 table
$$ \begin{array}{|c|c|} \hline m - \tilde{i} & \tilde{j} \\ \hline \tilde{i} & n - \tilde{j} \\ \hline \end{array} $$
Parker, Vannest, & Brown (2009) proposed the non-robust improvement rate difference, which is equivalent to the phi coefficient from this table. Parker, Vannest, & Davis (2011) proposed to instead use the robust phi coefficient, which involves modifying the table so that the row- and column-margins are equal. Robust IRD is thus equal to
$$ \text{IRD} = \frac{n - m - \tilde{i} - \tilde{j}}{2 n} - \frac{m + n - \tilde{i} - \tilde{j}}{2 m}. $$
Robust IRD is algebraically related to PAND as
$$ \text{IRD} = 1 - \frac{(m + n)^2}{2mn}\left(1 - \text{PAND}\right). $$ Just as with PAND, the sampling distribution of robust IRD has not been described, and so standard errors and confidence intervals are not available.
Tau is one of several effect sizes proposed by Parker, Vannest, Davis, & Sauber (2011) and known collectively as “Tau-U.” The basic estimator Tau does not make any adjustments for time trends. For an outcome where increase is desirable, the effect size parameter is
τ = Pr(YB > YA) − Pr(YB < YA)
(for an outcome where decrease is desirable, the effect size parameter would have the opposite sign). This parameter is a simple linear transformation of the NAP parameter θ:
τ = 2θ − 1.
For an outcome where increase is desirable, calculate
wij = I(yjB > yiA) − I(yjB < yiA)
For an outcome where decrease is desirable, one would instead use
wij = I(yjB < yiA) − I(yjB > yiA).
The Tau effect size index is then calculated as
$$ \text{Tau} = \frac{1}{m n} \sum_{i=1}^m \sum_{j=1}^n w_{ij} = 2 \times \text{NAP} - 1. $$
Standard errors and confidence intervals for Tau are calculated using transformations of the corresponding SEs and CIs for NAP. All of the methods assume that the observations are mutually independent and are identically distributed within each phase.
Standard errors for Tau are calculated as SETau = 2SENAP, where SENAP is the standard error for NAP calculated based on one of the available methods (unbiased, Hanley, or null).
The CI for τ is calculated as
[Lτ, Uτ] = [2Lθ − 1, 2Uθ − 1],
where Lθ and Uθ are the lower and upper bounds of the CI for θ, calculated using a method proposed by Newcombe (2006).
Tau-U is one of several effect sizes proposed by Parker, Vannest, Davis, & Sauber (2011). The Tau-U variant is similar to Tau, but includes an adjustment term that is a function of the baseline time trend. For an outcome where increase is desirable, the index is calculated as Kendall’s S statistic for the comparison between the phase B data and the phase A data, plus Kendall’s S statistic for the A phase observations, scaled by the product of the number of observations in each phase.
This effect size does not have a stable parameter definition and its feasible range depends on the number of observations in each phase (Tarlow, 2017).
For an outcome where increase is desirable, calculate
wijAB = I(yjB > yiA) − I(yjB < yiA)
and
wijAA = I(yjA > yiA) − I(yjA < yiA)
For an outcome where decrease is desirable, one would instead use
wijAB = I(yjB < yiA) − I(yjB > yiA)
and
wijAA = I(yjA < yiA) − I(yjA > yiA).
The Tau-U effect size index is then calculated as
$$ \text{Tau-U} = \frac{1}{m n} \left(\sum_{i=1}^m \sum_{j=1}^n w^{AB}_{ij} - \sum_{i=1}^{m - 1} \sum_{j=i + 1}^m w^{AA}_{ij}\right). $$
The sampling distribution of Tau-U has not been described, and so standard errors and confidence intervals are not available.
Tarlow (2017) proposed to modify the Tau effect size index by first adjusting the observations for a linear trend in the A phase. The index can be calculated with or without conducting a pre-test for significance of the A phase time trend. We provide two approaches to calculate Tau no matter whether the baseline trend is significant or not. The first approach is using Kendall’s rank correlation (with adjustment for ties), as used in Tarlow (2017). The second one is using Tau (non-overlap) index (without adjustment for ties).
If the pre-test for A phase time trend is used, then slope of the baseline trend is first tested using Kendall’s rank correlation. If the baseline slope is significantly different from zero, the outcomes are adjusted for baseline trend using Theil-Sen regression, and the residuals from Theil-Sen regression are used to calculate the Kendall’s rank correlation or Tau (non-overlap) index. If the baseline slope is not significantly different from zero, then no baseline trend adjustment is made, and the Tau-BC effect size is calculated using Kendall’s rank correlation or Tau (non-overlap) index.
If the pre-test for A phase time trend is not used, then the outcomes are adjusted for baseline trend using Theil-Sen regression, regardless of whether the slope is significantly different from zero. The residuals from Theil-Sen regression are then used to calculate the Kendall’s rank correlation or Tau (non-overlap) index.
The formal definition of Tau-BC require positing a model for the time trend in the data series. Thus, suppose that the outcomes can be expressed in terms of a linear time trend and an error term:
$$ \begin{aligned} y_i^A &= \alpha + \beta (i) + \epsilon_i^A, \quad \text{for} \quad i = 1,...,m \\ y_j^B &= \alpha + \beta (m + j) + \epsilon_j^B \quad \text{for} \quad j = 1,...,n. \end{aligned} $$ Within each phase, assume that the error terms are independent and share a common distribution. The Tau-BC parameter can then be defined as the Tau parameter for the distribution of the error terms, or
τBC = Pr(ϵB > ϵA) − Pr(ϵB < ϵA). An equivalent definition in terms of the outcome distributions is
τBC = Pr[YjB − β(m + j − i) > YiA] − Pr[YjB − β(m + j − i) < YiA] for i = 1, ..., m and j = 1, ..., n.
Estimation of τBC entails correcting the data series for the baseline slope β. If using the baseline trend pre-test, the null hypothesis of H0 : β = 0 is first tested using Kendall’s rank correlation. If the test is not significant, then set β̂ = 0 and α̂ = 0. If the test is significant or if the pre-test for baseline time trend is not used, then the slope is estimated by Theil-Sen regression. Specifically, we calculate the slope between every pair of observations in the A phase: $$ s_{hi} = \frac{y_i^A - y_h^A}{i - h} $$ for i = 1, ..., m − 1 and h = i + 1, ..., m. The overall slope estimate is taken to be the median over all m(m − 1)/2 slope pairs:
β̂ = median{s12, ..., s(m − 1)m}. The intercept term is estimated by taking the median observation in the A phase after correcting for the estimated linear time trend:
α̂ = median{y1A − β̂ × 1, y2A − β̂ × 2, ..., ymA − β̂ × m}. However, the intercept estimate is irrelevant for purposes of estimating Tau-BC because the Tau estimator is a function of ranks and is invariant to a linear shift of the data series.
After estimating the phase A time trend, τBC is estimated by de-trending the full data series and calculating Kendall’s rank correlation or Tau (non-overlap) on the de-trended observations. Specifically, set ϵ̂iA = yiA − β̂(i) − α̂ for i = 1, ..., m and ϵ̂jB = yjB − β̂(m + j) − α̂. For an outcome where increase is desirable, calculate wijϵ = I(ϵ̂jB > ϵ̂iA) − I(ϵ̂jB < ϵ̂iA)
or, for an outcome where decrease is desirable, calculate wijϵ = I(ϵ̂jB < ϵ̂iA) − I(ϵ̂jB > ϵ̂iA).
Tau-BC (non-overlap) is then estimated by $$ \text{Tau}_{BC} = \frac{1}{m n} \sum_{i=1}^m \sum_{j=1}^n w^\epsilon_{ij}. $$
If calculated with Kendall’s rank correlation, Tau-BC is estimated as the rank correlation between {ϵ̂1A, …, ϵ̂mA, ϵ̂1B, …, ϵ̂nB} and a dummy coded variable {01, …, 0m, 11, …, 1n}, with an adjustment for ties (Kendall, 1970, p. 35). Specifically,
$$ \text{Tau}_{BC}^* = \frac{1}{D} \sum_{i=1}^m \sum_{j=1}^n w^\epsilon_{ij}, $$ where
$$ D = \sqrt{m \times n \times \left(\frac{(m+n)(m+n-1)}{2}-U\right)} $$ and U is the number of ties between all possible pairs of observations (including pairs within phase A, pairs within phase B, and pairs of one phase A and one phase B data point). U can be computed as
$$ U = \sum_{i=1}^{m - 1} \sum_{j = i+1}^m I\left(\hat\epsilon^A_i = \hat\epsilon^A_j\right) + \sum_{i=1}^{n - 1} \sum_{j = i+1}^n I\left(\hat\epsilon^B_i = \hat\epsilon^B_j\right) + \sum_{i=1}^m \sum_{j=1}^n I\left(\hat\epsilon^A_i = \hat\epsilon^B_j\right). $$
We prefer and recommend to use the Tau-AB form, which divides by m × n rather than by D, because it leads to a simpler interpretation. Furthermore, using D means that TauBC* may be sensitive to variation in phase lengths. To see this sensitivity, consider a scenario where there are no tied values and so every value {ϵ̂1A, …, ϵ̂mA, ϵ̂1B, …, ϵ̂nB} is unique. In this case, U = 0 and $$ D = \sqrt{\frac{1}{2} m n (m + n)(m + n - 1)} = m n \sqrt{1 + \frac{m - 1}{2n} + \frac{n - 1}{2m}}. $$ Thus, the denominator will always be larger than mn, meaning that TauBC* will always be smaller than TauBC. Further, the largest and smallest possible values of TauBC* will be ±mn/D, or about $1 / \sqrt{2}$ when m and n are close to equal. In contrast, the largest and smallest possible values of TauBC are always -1 and 1, respectively.
The exact sampling distribution of TauBC* (Kendall, adjusted for ties) has not been described. Tarlow (2017) proposed to approximate its sampling variance using $$ SE_{Kendall} = \sqrt{\frac{2 (1 - \text{Tau}_{BC}^2)}{m + n}}, $$ arguing that this would generally be conservative (in the sense of over-estimating the true sampling error). When Tau-BC is calculated using Kendall’s rank correlation, the SingleCaseES package reports a standard error based on this approximation.
When calculated without adjustment for ties, the SingleCaseES package takes a different approach for estimating the standard error for TauBC (non-overlap), reporting approximate standard errors and confidence intervals for TauBC based on the methods described above for Tau (non-overlap, without baseline trend correction). An important limitation of this approach is that it does not account for the uncertainty introduced by estimating the phase A time trend (i.e., the uncertainty in β̂).
Gingerich (1984) and Busk & Serlin (1992) proposed a within-case standardized mean difference for use in single-case designs (within-case because it is based on the data for a single individual, rather than across individuals). The standardized mean difference parameter δ is defined as the difference in means between phase B and phase A, scaled by the standard deviation of the phase A outcome distribution:
$$ \delta = \frac{\mu_B - \mu_A}{\sigma_A}. $$
Note that σA represents within-individual variability only. In contrast, the SMD applied to a between-groups design involves scaling by a measure of between- and within-individual variability. Thus, the scale of the within-case SMD is not comparable to the scale of the SMD from a between-groups design.
The SMD δ can be estimated under the assumption that the observations are mutually independent and have constant variance within each phase. There are two ways that the SMD, depending on whether it is reasonable to assume that the standard deviation of the outcome is constant across phases (i.e., σA = σB).
Gingerich (1984) and Busk & Serlin (1992) originally suggested scaling by the SD from phase A only, due to the possibility of non-constant variance across phases. Without assuming constant SDs, an estimate of the standardized mean difference is
$$ d_A = \left(1 - \frac{3}{4m - 5}\right) \frac{\bar{y}_B - \bar{y}_A}{s_A}. $$
The term in parentheses is a small-sample bias correction term (cf. Hedges, 1981; Pustejovsky, 2019). The standard error of this estimate is calculated as
$$ SE_{d_A} = \left(1 - \frac{3}{4m - 5}\right)\sqrt{\frac{1}{m} + \frac{s_B^2}{n s_A^2} + \frac{d_A^2}{2(m - 1)}}. $$
If it is reasonable to assume that the SDs are constant across phases, then one can use the pooled sample SD, defined as
$$ s_p = \sqrt{\frac{(m - 1)s_A^2 + (n - 1) s_B^2}{m + n - 2}}. $$
The SMD can then be estimated as
$$ d_p = \left(1 - \frac{3}{4(m + n) - 9}\right) \frac{\bar{y}_B - \bar{y}_A}{s_p}, $$
with approximate standard error
$$ SE_{d_A} = \left(1 - \frac{3}{4(m + n) - 9}\right)\sqrt{\frac{1}{m} + \frac{1}{n} + \frac{d_p^2}{2(m + n - 2)}}. $$
Whether the estimator is based on the baseline or pooled standard deviation, an approximate confidence interval for δ is given by
[d − zα/2 × SEd, d + zα/2 × SEd].
The log-response ratio (LRR) is an effect size index that quantifies the change from phase A to phase B in proportionate terms. Pustejovsky (2015) proposed to use it as an effect size index for single-case designs (see also Pustejovsky, 2018). The LRR is appropriate for use with outcomes on a ratio scale—that is, where zero indicates the total absence of the outcome. The LRR parameter is defined as
ψ = ln (μB/μA),
The logarithm is used so that the range of the index is less restricted.
There are two variants of the LRR (Pustejovsky, 2018), corresponding to whether therapeutic improvements correspond to negative values of the index (LRR-decreasing or LRRd) or positive values of the index (LRR-increasing or LRRi). For outcomes measured as frequency counts or rates, LRRd and LRRi are identical in magnitude but have opposite sign. However, for outcomes measured as proportions (ranging from 0 to 1) or percentages (ranging from 0% to 100%), LRRd and LRRi will differ in both sign and magnitude because the outcomes are first transformed to be consistent with the selected direction of therapeutic improvement.
To account for the possibility that the sample means may be equal to zero, even if the mean levels are strictly greater than zero, the LRR is calculated using truncated sample means, given by $$ \tilde{y}_A = \text{max} \left\{ \bar{y}_A, \ \frac{1}{2 D m}\right\} \qquad \text{and} \qquad \tilde{y}_B = \text{max} \left\{ \bar{y}_B, \ \frac{1}{2 D n}\right\}, $$ where D is a constant that depends on the scale and recording procedure used to measure the outcomes (Pustejovsky, 2018). To ensure that the standard error of LRR is strictly positive, it is calculated using truncated sample variances, given by $$ \tilde{s}_A^2 = \text{max}\left\{s_A^2, \ \frac{1}{D^2 m^3}\right\} \qquad \text{and} \qquad \tilde{s}_B^2 = \text{max} \left\{ s_B^2, \ \frac{1}{D^2 n^3}\right\}. $$
A basic estimator of the LRR is then given by
R1 = ln (ỹB) − ln (ỹA).
However, R1 will be biased when one or both phases include only a small number of observations. A bias-corrected estimator is given by
$$ R_2 = \ln\left(\tilde{y}_B\right) + \frac{\tilde{s}_B^2}{2 n \tilde{y}_B^2} - \ln\left(\tilde{y}_A\right) - \frac{\tilde{s}_A^2}{2 m \tilde{y}_A^2}. $$ The bias-corrected estimator is the default option in SingleCaseES.
Under the assumption that the outcomes in each phase are mutually independent, an approximate standard error for R1 or R2 is given by
$$ SE_R = \sqrt{\frac{\tilde{s}_A^2}{m \tilde{y}_A^2} + \frac{\tilde{s}_B^2}{n \tilde{y}_B^2}}. $$
Under the same assumptions, an approximate confidence interval for ψ is
[R − zα/2 × SER, R + zα/2 × SER].
The log-odds ratio is an effect size index that quantifies the change from phase A to phase B in terms of proportionate change in the odds that a behavior is occurring (Pustejovsky, 2015). It is appropriate for use with outcomes on a percentage or proportion scale. The LOR parameter is defined as
$$ \psi = \ln\left(\frac{\mu_B/(1-\mu_B)}{\mu_A/(1-\mu_A)}\right), $$
where the outcomes are measured in proportions. The log odds ratio ranges from −∞ to ∞, with a value of zero corresponding to no change in mean levels.
To account for the possibility that the sample means may be equal to zero or one, even if the mean levels are strictly between zero and one, the LOR is calculated using truncated sample means, given by
$$ \tilde{y}_A = \text{max} \left\{ \text{min}\left[\bar{y}_A, 1 - \frac{1}{2 D m}\right], \frac{1}{2 D m}\right\} $$ and
$$ \tilde{y}_B = \text{max} \left\{ \text{min}\left[\bar{y}_B, 1 - \frac{1}{2 D n}\right], \frac{1}{2 D n}\right\}, $$
where D is a constant that depends on the scale and recording procedure used to measure the outcomes (Pustejovsky, 2018).To ensure that the corresponding standard error is strictly positive, it is calculated using truncated sample variances, given by
$$ \tilde{s}_A^2 = \text{max}\left\{s_A^2, \ \frac{1}{D^2 m^3}\right\} \qquad \text{and} \qquad \tilde{s}_B^2 = \text{max} \left\{ s_B^2, \ \frac{1}{D^2 n^3}\right\}. $$
A basic estimator of the LOR is given by
LOR1 = ln (ỹB) − ln (1 − ỹB) − ln (ỹA) + ln (1 − ỹA).
However, like the LRR, this estimator will be biased when the one or both phases include only a small number of observations. A bias-corrected estimator of the LOR is given by
$$ LOR_2 = \ln\left(\tilde{y}_B\right) - \ln\left(1-\tilde{y}_B\right) - \frac{\tilde{s}_B^2(2 \tilde{y}_B - 1)}{2 n_B (\tilde{y}_B)^2(1-\tilde{y}_B)^2} - \ln\left(\tilde{y}_A\right) + \ln\left(1-\tilde{y}_A\right) + \frac{\tilde{s}_A^2(2 \tilde{y}_A - 1)}{2 n_A (\tilde{y}_A)^2(1-\tilde{y}_A)^2}. $$ This estimator uses a small-sample correction to reduce bias when one or both phases include only a small number of observations.
Under the assumption that the outcomes in each phase are mutually independent, an approximate standard error for LOR is given by
$$ SE_{LOR} = \sqrt{\frac{\tilde{s}^2_A}{n_A \tilde{y}_A^2 (1 - \tilde{y}_A)^2} + \frac{\tilde{s}^2_B}{n_B \tilde{y}_B^2 (1 - \tilde{y}_B)^2}}. $$
Under the same assumption, an approximate confidence interval for ψ is
[LOR − zα/2 × SELOR, LOR + zα/2 × SELOR],
Bonett & Price (2020b) described the log ratio of medians (LRM) effect size, which can be used to quantify the change in medians from phase A to phase B. The LRM is the natural logarithm of the ratio of medians. This effect size is appropriate for outcomes that are skewed or right-censored (Bonett & Price, 2020b). For an outcome where increase is desirable, the LRM parameter is defined as
λ = ln (ηB/ηA) = ln (ηB) − ln (ηA), where ηB and ηA are the population medians for phase B and phase A, respectively. For an outcome where decrease is desirable, the LRM parameter has the opposite sign:
λ = ln (ηA/ηB) = ln (ηA) − ln (ηB).
A natural estimator of the λ is given by
LRM = ln (mB) − ln (mA), where mB and mA are the sample medians for phase B and phase A, respectively. Note that the sample median might be zero for either phase B and phase A in some single-case design data, resulting in infinite LRM.
Standard errors and confidence intervals for LRM can be obtained under the assumption that the outcome data within each phase are mutually independent and follow a common distribution. Using the fact that the logarithm of the median is the same or close to the median of the log-transformed outcomes, the standard error for LRM can be calculated using the order statistics within each phase (Bonett & Price, 2020a). Let $$ \begin{aligned} l_A &= \text{max}\left\{1, \ \frac{m}{2} - \sqrt{m}\right\}, \quad &u_A &= m - l_A + 1, \\ l_B &= \text{max}\left\{1, \ \frac{n}{2} - \sqrt{n}\right\}, \quad &u_B &= n - l_B + 1, \end{aligned} $$ and find $$ q_A = \Phi^{-1}\left(\frac{1}{2^m}\sum_{i=0}^{l_A - 1} \frac{m!}{i!(m - i)!}\right) \quad \text{and} \quad q_B = \Phi^{-1}\left(\frac{1}{2^n}\sum_{j=0}^{l_B - 1} \frac{n!}{j!(n - j)!}\right). $$ The standard error of LRM is then $$ SE_{LRM} = \sqrt{\left(\frac{\ln\left(y^B_{(u_B)}\right)-\ln\left(y^B_{(l_B)}\right)}{2\ q_B}\right)^2 + \left(\frac{\ln\left(y^A_{(u_A)}\right)-\ln\left(y^A_{(l_A)}\right)}{2\ q_A}\right)^2}, $$ (Bonett & Price, 2020a) where y(lA)A, y(uA)A are the lA and uA order statistics of the phase A outcomes and y(lB)B, y(uB)B are the lB and uB order statistics of the phase B outcomes.
An approximate confidence interval for λ is [LRM − zα/2 × SELRM, LRM + zα/2 × SELRM], where zα/2 is 1 − α/2 critical value from a standard normal distribution.
Ferron, Goldstein, Olszewski, & Rohrer (2020) proposed a percent of goal obtained (PoGO) effect size metric for use in single-case designs. Let γ denote the goal level of behavior, which must be specified by the analyst or researcher. Percent of goal obtained quantifies the change in the mean level of behavior relative to the goal. The PoGO parameter θ is defined as: $$ \theta = \frac{\mu_B - \mu_A}{\gamma - \mu_A} \times 100\%. $$
Approaches for estimation of PoGO depend on one’s assumption about the stability of the observations in phases A and B. Under the assumption that the observations are temporally stable, a natural estimator of PoGO is $$ PoGO = \frac{\bar{y}_B - \bar{y}_A}{\gamma - \bar{y}_A} \times 100\%. $$
Patrona, Ferron, Olszewski, Kelley, &
Goldstein (2022) proposed a
method for calculating a standard error for the PoGO estimator under the
assumptions that the observations within each phase are mutually
independent. The standard error uses an approximation for the standard
error of two independent, normally distributed random variables due to
Dunlap & Silver (1986). It is calculated as
$$
SE_{PoGO} = \frac{1}{\gamma - \bar{y}_A} \sqrt{\frac{s_A^2}{n_A} +
\frac{s_B^2}{n_B} + \left(\frac{\bar{y}_B - \bar{y}_A}{\gamma -
\bar{y}_A}\right)^2 \frac{s_A^2}{n_A}}.
$$ Patrona et al. (2022) also provided a more
general approximation, which can be applied when PoGO is estimated using
regressions that control for time trends or auto-correlation. However,
these methods are not implemented in SingleCaseES
.
An approximate confidence interval for PoGO is given by [PoGO − zα/2 × SEPoGO, PoGO + zα/2 × SEPoGO], where zα/2 is the 1 − α/2 critical value from a standard normal distribution (Patrona et al., 2022).