====Confidence interval for comparing two log normals====
Comparing two log-normal distributions is often of interest, for example when comparing a treatment group with a control group (e.g., in an [[A/B testing|A/B test]]). Suppose we have samples from two independent log-normal distributions with parameters <math>(\mu_1, \sigma_1^2)</math> and <math>(\mu_2, \sigma_2^2)</math>, with sample sizes <math>n_1</math> and <math>n_2</math> respectively.

Comparing the medians of the two is straightforward: take the log of each sample, construct the usual confidence interval for the difference in means on the log scale, and transform it back by exponentiating.
<math display="block">\mathrm{CI}(e^{\mu_1-\mu_2}): \exp\left(\hat \mu_1 - \hat \mu_2 \pm z_{1-\frac{\alpha}{2}} \sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2} } \right)</math>

Intervals of this form are commonly used in epidemiology to calculate the CI for the [[relative-risk]] and the [[odds-ratio]].<ref>[https://sphweb.bumc.bu.edu/otlt/MPH-Modules/PH717-QuantCore/PH717-Module8-CategoricalData/PH717-Module8-CategoricalData5.html?fbclid=IwY2xjawFeH3JleHRuA2FlbQIxMAABHbmxa15uyyzJuzEwh9PIUr_m2Jsc9NGiPuS6IwfA36Ca5r1wV1EoPEz3MQ_aem_03PRd_jlRfbsnr6xCPkZmw Confidence Intervals for Risk Ratios and Odds Ratios]</ref> There, we have two approximately normal distributions (e.g., those of the estimators of p<sub>1</sub> and p<sub>2</sub>, for RR), and we wish to build an interval for their ratio.{{efn|The difficulty is that we do not know how to do this directly, so we take the logs of the estimators and use the [[delta method]] to argue that each log is itself (approximately) normal. This trick allows us to treat the original estimators as if they were log-normal, and to use that approximation to build the CI. Notice that in the RR case, the median and the mean of the base distribution (i.e., before taking the log) are actually identical (since the estimators are originally normal, and not log-normal).
For example, <math>\hat p_1 \dot \sim N(p_1, p_1(1-p_1)/n)</math> and <math>\ln \hat{p}_1 \dot \sim N(\ln p_1, (1-p_1)/(p_1 n))</math>. Hence, building a CI on the log scale and then back-transforming gives <math>\mathrm{CI}(p_1): \exp\left(\ln \hat{p}_1 \pm z_{1-\frac{\alpha}{2}} \sqrt{(1 - \hat{p}_1)/(\hat{p}_1 n)}\right)</math>. So while we would expect this CI to be for the median, in this case it is also for the mean of the original distribution. That is, if <math>\hat p_1</math> were truly log-normal, we would expect <math>\operatorname{E}[\hat p_1] = e^{\ln p_1 + \tfrac{1}{2} (1 - p_1)/(p_1 n)}</math>. But in practice we know that <math>\operatorname{E}[\hat p_1] = e^{\ln p_1} = p_1</math>. Hence, the approximation enters in the second step (the delta method), and the CI is for the expectation (not just the median). This is because we start from a base distribution that is approximately normal, and then apply a second normal approximation after taking the log. A large part of the approximation in the CI therefore comes from the delta method.}} However, the ratio of the expectations (means) of the two samples might also be of interest, although it requires more work to develop.
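
The median-ratio interval above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name <code>median_ratio_ci</code> and the sample data are hypothetical, <math>S_i^2</math> is taken to be the usual sample variance of the logged data (denominator <math>n_i - 1</math>), and the normal quantile <math>z_{1-\alpha/2}</math> is used as in the formula.

```python
import math
from statistics import NormalDist, mean, variance


def median_ratio_ci(x1, x2, alpha=0.05):
    """Approximate CI for exp(mu_1 - mu_2), the ratio of the two medians.

    x1, x2: positive observations assumed to come from two independent
    log-normal distributions. Returns (lower, point_estimate, upper).
    """
    log1 = [math.log(v) for v in x1]
    log2 = [math.log(v) for v in x2]
    n1, n2 = len(log1), len(log2)

    # Difference of the log-scale sample means estimates mu_1 - mu_2.
    diff = mean(log1) - mean(log2)

    # Standard error of the difference, using the sample variances S_1^2, S_2^2.
    se = math.sqrt(variance(log1) / n1 + variance(log2) / n2)

    # Normal quantile z_{1 - alpha/2}.
    z = NormalDist().inv_cdf(1 - alpha / 2)

    # Build the interval on the log scale, then exponentiate.
    return math.exp(diff - z * se), math.exp(diff), math.exp(diff + z * se)


# Illustrative (made-up) data for a treatment and a control group.
treatment = [2.1, 3.4, 1.9, 5.6, 4.2, 2.8]
control = [1.2, 2.5, 1.7, 3.1, 2.0, 1.4]
lower, point, upper = median_ratio_ci(treatment, control)
print(lower, point, upper)
```

Because the interval is symmetric on the log scale, the point estimate is the geometric mean of the two endpoints, not their midpoint.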
The ratio of their means is:
<math display="block">\frac{\operatorname{E}(X_1)}{\operatorname{E}(X_2)} = \frac{e^{\mu_1 + \sigma_1^2 / 2}}{e^{\mu_2 + \sigma_2^2 /2}} = e^{(\mu_1 - \mu_2) + \frac{1}{2} \left(\sigma_1^2 - \sigma_2^2\right)}</math>
Plugging in the estimators for each of these parameters yields a quantity that is also (approximately) log-normally distributed, which means that the Cox method, discussed above, can similarly be used for this use case:
<math display="block">\mathrm{CI}\left( \frac{\operatorname{E}(X_1)}{\operatorname{E}(X_2)} = \frac{e^{\mu_1 + \sigma_1^2 / 2}}{e^{\mu_2 + \sigma_2^2 / 2}} \right): \exp\left(\left(\hat \mu_1 - \hat \mu_2 + \tfrac{1}{2}S_1^2 - \tfrac{1}{2}S_2^2\right) \pm z_{1-\frac{\alpha}{2}} \sqrt{ \frac{S_1^2}{n_1} + \frac{S_2^2}{n_2} + \frac{S_1^4}{2(n_1-1)} + \frac{S_2^4}{2(n_2-1)} } \right)</math>
{{hidden begin|style=width:100%|ta1=center|border=1px #aaa solid|title=[Proof]}}
To construct a confidence interval for this ratio, we first note that <math>\hat \mu_1 - \hat \mu_2</math> follows a normal distribution, and that both <math>S_1^2</math> and <math>S_2^2</math> have a (scaled) [[chi-squared distribution]], which is [[Chi-squared distribution#Related distributions|approximately]] normally distributed (via the [[Central limit theorem|CLT]], with the relevant [[Variance#Distribution of the sample variance|parameters]]).
This means that, approximately,
<math display="block">\left(\hat \mu_1 - \hat \mu_2 + \frac{1}{2}S_1^2 - \frac{1}{2}S_2^2\right) \dot \sim N\left((\mu_1 - \mu_2) + \frac{1}{2}(\sigma_1^2 - \sigma_2^2), \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2} + \frac{\sigma_1^4}{2(n_1-1)} + \frac{\sigma_2^4}{2(n_2-1)} \right)</math>
Based on the above, standard [[Normal distribution#Confidence intervals|confidence intervals]] can be constructed (using a [[Pivotal quantity]]) as:
<math display="block">\left(\hat \mu_1 - \hat \mu_2 + \frac{1}{2}S_1^2 - \frac{1}{2}S_2^2\right) \pm z_{1-\frac{\alpha}{2}} \sqrt{ \frac{S_1^2}{n_1} + \frac{S_2^2}{n_2} + \frac{S_1^4}{2(n_1-1)} + \frac{S_2^4}{2(n_2-1)} } </math>
And since confidence intervals are preserved under monotonic transformations, we get that:
<math display="block">\mathrm{CI}\left( \frac{\operatorname{E}(X_1)}{\operatorname{E}(X_2)} = \frac{e^{\mu_1 + \frac{\sigma_1^2}{2}}}{e^{\mu_2 + \frac{\sigma_2^2}{2}}} \right): \exp\left(\left(\hat \mu_1 - \hat \mu_2 + \frac{1}{2}S_1^2 - \frac{1}{2}S_2^2\right) \pm z_{1-\frac{\alpha}{2}} \sqrt{ \frac{S_1^2}{n_1} + \frac{S_2^2}{n_2} + \frac{S_1^4}{2(n_1-1)} + \frac{S_2^4}{2(n_2-1)} } \right)</math>
As desired.
{{hidden end}}

It is worth noting that naively using the [[Maximum likelihood estimation|MLE]] in the ratio of the two expectations to create a [[ratio estimator]] leads to a [[Consistency (statistics)|consistent]], yet biased, point estimate (we use the fact that the estimator of the ratio is approximately log-normally distributed):{{efn|The formula can be found by treating the estimated means and variances as approximately normal, which implies that the term is itself approximately log-normal, enabling us to quickly get the expectation.
The bias can be partially reduced by dividing out the estimated inflation factor implied by that expectation:
<math display="block">\begin{align} \widehat{\left[ \frac{\operatorname{E}(X_1)}{\operatorname{E}(X_2)} \right]} &= \left[ \frac{\widehat{\operatorname{E}}(X_1)}{\widehat{\operatorname{E}}(X_2)} \right] \exp\left[ -\frac{1}{2} \widehat{\left( \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2} + \frac{\sigma_1^4}{2(n_1-1)} + \frac{\sigma_2^4}{2(n_2-1)} \right)} \right] \\ &\approx \left[e^{(\widehat \mu_1 - \widehat \mu_2) + \frac{1}{2}\left(S_1^2 - S_2^2\right)}\right] \exp\left[ -\frac{1}{2} \left( \frac{S_1^2}{n_1} + \frac{S_2^2}{n_2} + \frac{S_1^4}{2(n_1-1)} + \frac{S_2^4}{2(n_2-1)} \right) \right] \end{align} </math>}}{{citation needed|date=December 2024}}
<math display="block">\begin{align} \operatorname{E}\left[ \frac{\widehat{\operatorname{E}}(X_1)}{\widehat{\operatorname{E}}(X_2)} \right] &= \operatorname{E}\left[\exp\left(\left(\widehat \mu_1 - \widehat \mu_2\right) + \tfrac{1}{2} \left(S_1^2 - S_2^2\right)\right)\right] \\ &\approx \exp\left[{(\mu_1 - \mu_2) + \frac{1}{2}(\sigma_1^2 - \sigma_2^2) + \frac{1}{2}\left( \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2} + \frac{\sigma_1^4}{2(n_1-1)} + \frac{\sigma_2^4}{2(n_2-1)} \right) }\right] \end{align} </math>
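
As a rough illustration of the mean-ratio interval and of the expectation formula above, the following Python sketch computes the plug-in estimate, its confidence interval, and the estimated multiplicative inflation factor <math>\exp(\hat V / 2)</math>, where <math>\hat V</math> is the estimated variance on the log scale. The function name <code>mean_ratio_ci</code> and the data are hypothetical, and <math>S_i^2</math> is assumed to be the sample variance of the logged data with denominator <math>n_i - 1</math>.

```python
import math
from statistics import NormalDist, mean, variance


def mean_ratio_ci(x1, x2, alpha=0.05):
    """Approximate CI for E(X_1) / E(X_2) for two independent log-normal samples.

    Returns (lower, point_estimate, upper, log_scale_variance).
    """
    log1 = [math.log(v) for v in x1]
    log2 = [math.log(v) for v in x2]
    n1, n2 = len(log1), len(log2)
    m1, m2 = mean(log1), mean(log2)
    s1, s2 = variance(log1), variance(log2)  # S_1^2 and S_2^2

    # Plug-in estimate of (mu_1 - mu_2) + (sigma_1^2 - sigma_2^2) / 2.
    est = (m1 - m2) + 0.5 * (s1 - s2)

    # Estimated variance of that quantity: the variance of the mean difference
    # plus the variance of the half-difference of the sample variances.
    var = (s1 / n1 + s2 / n2
           + s1 ** 2 / (2 * (n1 - 1))
           + s2 ** 2 / (2 * (n2 - 1)))

    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * math.sqrt(var)

    # Interval on the log scale, mapped back by exponentiating.
    return (math.exp(est - half_width), math.exp(est),
            math.exp(est + half_width), var)


# Illustrative (made-up) data for a treatment and a control group.
treatment = [2.1, 3.4, 1.9, 5.6, 4.2, 2.8]
control = [1.2, 2.5, 1.7, 3.1, 2.0, 1.4]
lower, point, upper, var = mean_ratio_ci(treatment, control)

# By the expectation formula, the plug-in point estimate is inflated by
# roughly exp(V / 2); this is the estimated size of that factor.
inflation = math.exp(var / 2)
print(lower, point, upper, inflation)
```

The inflation factor tends to 1 as both sample sizes grow, which matches the statement that the plug-in estimator is consistent yet biased.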