There are two Perl modules available on CPAN that deal with Chi-squared analysis (`Statistics::ChiSquare` and `Statistics::Distributions`). However, neither one outputs the Chi-squared value for the analysis of two binary populations.

We can use the formula below to calculate the Chi-squared value with one degree of freedom.

χ^{2} = [n(ad – bc)^{2}] / [(a + b) (c + d) (a + c) (b + d)]

n = a + b + c + d

Where:

| variable | population 1 | population 2 |
|---|---|---|
| + | a | b |
| – | c | d |
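The shortcut formula above should give the same value as the general Pearson statistic, the sum of (observed – expected)^{2} / expected over the four cells. The sketch below checks this numerically. The names `pearson_chi_squared` and `shortcut_chi_squared` are hypothetical helpers, the counts are illustrative only, and all row and column totals are assumed to be non-zero.

```perl
use strict;
use warnings;

# Hypothetical helper: Pearson's chi-squared from observed vs expected counts.
# Takes the two table rows as array refs: ([a, b], [c, d]).
sub pearson_chi_squared {
    my @rows = @_;
    my @row_totals = map { $_->[0] + $_->[1] } @rows;
    my @col_totals = (0, 0);
    for my $row (@rows) {
        $col_totals[$_] += $row->[$_] for 0 .. 1;
    }
    my $n = $row_totals[0] + $row_totals[1];

    my $sum = 0;
    for my $i (0 .. 1) {
        for my $j (0 .. 1) {
            # Expected count under independence (assumed non-zero).
            my $expected = $row_totals[$i] * $col_totals[$j] / $n;
            $sum += ($rows[$i][$j] - $expected)**2 / $expected;
        }
    }
    return $sum;
}

# Shortcut formula from the text. a, b, c, d are renamed because
# $a and $b are special variables used by Perl's sort().
sub shortcut_chi_squared {
    my ($pos1, $pos2, $neg1, $neg2) = @_;
    my $n = $pos1 + $pos2 + $neg1 + $neg2;
    return $n * ($pos1 * $neg2 - $pos2 * $neg1)**2
         / (($pos1 + $pos2) * ($neg1 + $neg2) * ($pos1 + $neg1) * ($pos2 + $neg2));
}

# Illustrative counts only; both lines should print the same value.
printf "%.4f\n", pearson_chi_squared([10, 20], [30, 40]);
printf "%.4f\n", shortcut_chi_squared(10, 20, 30, 40);
```

The shortcut form is convenient for a 2×2 table because it avoids computing the four expected counts explicitly.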

Example:

Suppose we wish to determine the relationship between disease in two species. Both disease and the species are binary variables, so the Chi-squared test is applied:

| Diseased | species 1 | species 2 |
|---|---|---|
| No | 57 | 36 |
| Yes | 63 | 88 |

n = (57 + 36 + 63 + 88) = 244

χ^{2} = [244*(57*88 – 36*63)^{2}] / [(57 + 36) (63 + 88) (57 + 63) (36 + 88)]

χ^{2} = 8.82
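For anyone checking the result by hand, the intermediate arithmetic can be written out as follows, taking a = 57, b = 36, c = 63 and d = 88 as in the expression above:

```latex
\begin{aligned}
ad - bc &= 57 \cdot 88 - 36 \cdot 63 = 5016 - 2268 = 2748 \\
n(ad - bc)^2 &= 244 \cdot 2748^2 = 244 \cdot 7\,551\,504 = 1\,842\,566\,976 \\
(a+b)(c+d)(a+c)(b+d) &= 93 \cdot 151 \cdot 120 \cdot 124 = 208\,959\,840 \\
\chi^2 &= \frac{1\,842\,566\,976}{208\,959\,840} \approx 8.82
\end{aligned}
```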

The critical values of the Chi-squared distribution at 1 degree of freedom, for a range of P-values, are:

| D.F. | 0.1 | 0.05 | 0.025 | 0.01 | 0.005 |
|---|---|---|---|---|---|
| 1 | 2.71 | 3.84 | 5.02 | 6.63 | 7.88 |

The χ^{2} value (8.82) exceeds 7.88, the critical value for P = 0.005, so the P-value is below 0.005.

Since the P-value is less than 0.05 (P<0.05), we reject the null hypothesis that disease is independent of species. The data suggest that the prevalence of disease is significantly higher in species 2 (88 of 124 diseased, versus 63 of 120 in species 1).
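The table lookup can also be done in code. The following is a minimal sketch, assuming the critical values listed above for 1 degree of freedom; `smallest_significant_level` is a hypothetical helper name, not part of any CPAN module.

```perl
use strict;
use warnings;

# Critical values for 1 degree of freedom, copied from the table above.
# Each pair is [P level, critical value], ordered by decreasing P.
my @critical = (
    [0.1,   2.71],
    [0.05,  3.84],
    [0.025, 5.02],
    [0.01,  6.63],
    [0.005, 7.88],
);

# Hypothetical helper: return the smallest tabulated P level whose critical
# value the statistic exceeds, or undef if it exceeds none of them.
sub smallest_significant_level {
    my ($chi2) = @_;
    my $level;
    for my $pair (@critical) {
        my ($p, $cutoff) = @$pair;
        $level = $p if $chi2 > $cutoff;
    }
    return $level;
}

my $level = smallest_significant_level(8.82);
print defined $level ? "P < $level\n" : "not significant at P = 0.1\n";
```

This only brackets the P-value between tabulated levels; it does not compute an exact probability.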

Below is a Perl subroutine to automatically calculate Chi-squared.

```
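use strict;
use warnings;

# Chi-squared (1 d.f.) for a 2x2 table with rows (a, b) and (c, d).
# Named $n11 etc. because $a and $b are special variables used by sort().
sub chi_squared {
    my ($n11, $n12, $n21, $n22) = @_;
    my $n = $n11 + $n12 + $n21 + $n22;
    my $denominator = ($n11 + $n12) * ($n21 + $n22) * ($n11 + $n21) * ($n12 + $n22);
    return 0 if $denominator == 0;    # a zero row or column total: statistic undefined
    return $n * ($n11 * $n22 - $n12 * $n21)**2 / $denominator;
}

print chi_squared(57, 36, 63, 88), "\n";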
```

Output: