Signal Detection and Parameter Estimation
Statistics 101: definitions
Bayes factor
\[p(m|D) = \frac{p(D|m)\,p(m)}{p(D)}\]-
$p(m D)$ – probability of model $m$, given data $D$ (hard) -
$p(D m)$ – probability of data, given model $m$ (easier) - $p(m)$ – prior (prior probability of given model)
- $p(D)$ – normalisation
Evidence
\[p(D|m) = \int{\rm d}\theta\,{p(D|\theta,m)\,p(\theta|m)}\]- Also called marginal likelihood
- Integrates likelihood over all model parameters
- Penalises overly complex models (Occam factor)
- Used to compare models
Likelihood
\[\mathcal{L}(\theta) = p(D|\theta,m)\]- Probability of observing data $D$
- Assuming model $m$ with specific parameters $\theta$
- Central quantity for parameter estimation
- Often maximise $\mathcal{L}$ or $\log\mathcal{L}$
Posterior
\[p(\theta|D,m) = \frac{p(D|\theta,m)\,p(\theta|m)}{p(D|m)}\]- Probability of parameters after observing the data
- Combines likelihood and prior
- Central quantity in Bayesian inference
Interpretation:
- peak → most probable parameters
- width → parameter uncertainty
Odds ratio
Compare two models $m_1$ and $m_2$:
\[\frac{p(m_1|D)}{p(m_2|D)} = \frac{p(D|m_1)}{p(D|m_2)} \times \frac{p(m_1)}{p(m_2)}\]- Posterior odds = Bayes factor × prior odds
- Bayes factor:
- $B_{12} > 1$ favours $m_1$
- $B_{12} < 1$ favours $m_2$
Priors
\[p(\theta|m)\]- Probability distribution for parameters before seeing the data
- Encodes prior knowledge or assumptions
- Must be specified in Bayesian inference
Examples:
- Uniform prior — all values equally likely
- Log prior — equal probability per decade
- Informative prior — based on previous measurements
Frequentist vs Bayesian (signal searches)
Frequentist
- Parameters $\theta$ are fixed but unknown
- Data are random realisations of noise
Parameter estimation: maximise likelihood (or related statistic)
Detection:
- compare null hypothesis (noise) with signal model
- compute test statistic → obtain p-value
Decision rule:
\[p < p_{\rm thresh}\](e.g. $5\sigma$ significance)
Bayesian
- Parameters $\theta$ are random variables
- incorporate prior information
Parameter estimation:
- use posterior distribution
Model comparison:
- compare evidences using the Bayes factor
$\chi^2$ likelihood (Gaussian noise)
For data vector $\mathbf{d}$ and model prediction $\mathbf{m}(\theta)$:
\[\chi^2 = (\mathbf{d}-\mathbf{m})^{T} C^{-1} (\mathbf{d}-\mathbf{m})\]Likelihood:
\[p(D|\theta,m) \propto \exp\!\left(-\frac{1}{2}\chi^2\right)\]- $C$ - noise covariance matrix
- Includes correlations between data points
Special case: uncorrelated Gaussian noise
\[\chi^2 = \sum_i \frac{(d_i - m_i)^2}{\sigma_i^2}\]Finding signals
Matched filter and signal-to-noise ratio (SNR)
Method for detecting a known signal shape in noise.
Inner product (overlap function):
\[(a|b) = a^T C^{-1} b\]- $C$ - noise covariance matrix $\approx\sigma^2$
- weights data by inverse noise power
For constant uncorrelated noise:
\[(a|b) = \vec{a}\cdot\vec{b} / \sigma^2\]Matched filter output:
\[z = (d|s)\]- $d$ - data
- Assumed to be signal + noise:
- $d = s + n$
- $s$ - template signal
- $s = A \bar s$
- Only is cases where signal is linear in amplitude (most cases)
- $\bar s$ is shape of signal, $A$ is amplitude
Interpretation:
- weighted correlation of data with expected signal
- Overlap of signal and data
Signal-to-noise ratio:
\[\rho = \frac{(d|s)}{\sqrt{(s|s)}} = \frac{(d|\bar s)}{\sqrt{(\bar s|\bar s)}}\]- normalised matched-filter output
- dimensionless detection statistic
Expected SNR for true signal:
\[\rho_{\rm exp}^2 = (s|s)\]Optimality:
- matched filter maximises SNR in Gaussian noise
Detection:
- candidate signal if
Typical thresholds:
- $\rho \sim 5$–$10$ depending on background and number of trials
Matched filter / SNR (brief derivation)
Assume additive noise:
\[d = A h + n\]Gaussian likelihood, noise with covariance $C$:
\[p(d|A) \propto \exp\!\left[-\frac12(d-Ah)^T C^{-1}(d-Ah)\right]\]| Maximise $\ln p(d | A)$ w.r.t. $A$: |
Expand:
\[(d-Ah|d-Ah) = (d|d) - 2A(d|h) + A^2(h|h)\]Differentiate and set to zero:
\[\frac{\partial}{\partial A}\Big[(d|d) - 2A(d|h) + A^2(h|h)\Big]=0\] \[-2(d|h) + 2A(h|h)=0 \quad\Rightarrow\quad \hat A = \frac{(d|h)}{(h|h)}\]Define the matched-filter (amplitude) statistic:
\[z = (d|h)\]Normalise by its noise RMS to get SNR:
Under noise-only ($d=n$),
\[\langle z \rangle = 0, \qquad {\rm Var}(z)=\langle (n|h)^2\rangle = (h|h)\]So
\[\rho = \frac{z}{\sqrt{(h|h)}} = \frac{(d|h)}{\sqrt{(h|h)}}\]-
matched filter: compute $(d h)$ (noise-weighted correlation) - SNR: matched filter output in units of its noise standard deviation
Template searches
Signal parameters unknown → search over templates:
\[s(t;\theta_k)\]Procedure:
- Generate template bank
- Compute matched filter for each template
- Find maximum SNR
- Candidate signals correspond to large peaks in SNR
- Either have a threshold for how large we consider as the trigger
- Or find the largest in a data set, investigate, etc.
Parameter estimation
Signal model:
\[s(t;\theta)\]with parameters
\[\theta = \{\theta_1,\theta_2,\dots\}\]Goal:
- find parameters that best match the data
Statistic to maximise:
\[\mathcal{L}(\theta) = p(D|\theta) ,\quad \rho(\theta),\quad\text{etc.}\]Interpretation:
- parameters that maximise the detection statistic
- give the best-fit signal model
Special case: unknown amplitude
If the signal has the form
\[s(t) = A\,\bar s(t)\]then the best-fit amplitude can be found analytically:
\[\hat A = \frac{(d|\bar s)}{(\bar s|\bar s)}\]So the matched filter automatically finds the optimal amplitude.
Estimating uncertainty
Uncertainty may determined by the shape of the likelihood peak around the best-fit parameters.
- Very sharp peak: low uncertainty in best-fit value
- Very narrow peak: high uncertainty
Approximate rule:
\[\Delta\chi^2 \sim 1\]gives the $1\sigma$ uncertainty on a parameter.
Key idea:
- sharper peak in likelihood → smaller uncertainty
- uncertainty typically scales as