30 Exercises
30.1 Likelihood
- Consider a homogeneous Poisson process with intensity \(\lambda\) on a window \(W\).
Starting from the Poisson distribution of \(N(W)\), derive the likelihood \[ L(\lambda; X) = e^{-\lambda |W|} \lambda^n. \]
Show that the MLE is \[ \hat{\lambda} = \frac{n}{|W|}. \]
Answer
\(N(W) \sim \text{Poisson}(\lambda |W|)\), so
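\[ \mathbb{P}(N(W)=n) = \frac{(\lambda |W|)^n}{n!} e^{-\lambda |W|}. \] Given \(N(W)=n\), the points are independent and uniform on \(W\), so their joint density \(|W|^{-n}\) does not involve \(\lambda\). Ignoring constants not depending on \(\lambda\) gives
\[ L(\lambda) \propto e^{-\lambda |W|} \lambda^n. \]
Log-likelihood: \[ \ell(\lambda) = n \log \lambda - \lambda |W|. \]
Differentiate: \[ \frac{d\ell}{d\lambda} = \frac{n}{\lambda} - |W| = 0 \Rightarrow \hat{\lambda} = \frac{n}{|W|}. \]
For \(n > 0\), \(\ell''(\lambda) = -n/\lambda^2 < 0\), so this stationary point is a maximum.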
- Starting from the partition argument in the notes, explain precisely where the term \[ \int_W \lambda(u)\,du \] comes from.
Then explain why this term corresponds to the expected number of points.
Answer
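It comes from the cells that contain no points. A small cell \(B_j\) with representative point \(u_j \in B_j\) is empty with probability approximately \(1 - \lambda(u_j)|B_j| \approx e^{-\lambda(u_j)|B_j|}\), and multiplying over the cells gives \(\exp\{-\sum_j \lambda(u_j)|B_j|\}\). As the partition is refined, the sum over cells converges to the integral: \[ \sum_j \lambda(u_j)|B_j| \to \int_W \lambda(u)\,du. \]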
For a Poisson process, \[ \mathbb{E}[N(W)] = \int_W \lambda(u)\,du, \] so this term represents the expected total number of points.
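The convergence of the cell sum to the integral can be checked numerically. The sketch below is only an illustration: the window is taken to be the unit square, and the intensity `100 * exp(-(x + y))` is a hypothetical choice made because its integral is known in closed form.

```python
import numpy as np

def cell_sum(intensity, m):
    """Sum of lambda(u_j) * |B_j| over an m-by-m grid of square cells on [0, 1]^2."""
    edges = np.linspace(0.0, 1.0, m + 1)
    corners = edges[:-1]                  # lower-left corner of each cell is used as u_j
    xx, yy = np.meshgrid(corners, corners)
    cell_area = (1.0 / m) ** 2            # |B_j|
    return float(np.sum(intensity(xx, yy)) * cell_area)

def intensity(x, y):
    # Hypothetical intensity, chosen only because its integral has a closed form.
    return 100.0 * np.exp(-(x + y))

# Exact integral of the intensity over the unit square.
exact = 100.0 * (1.0 - np.exp(-1.0)) ** 2

for m in (4, 16, 64, 256):
    print(m, cell_sum(intensity, m), exact)
```

Because this intensity is decreasing in both coordinates, using lower-left corners overestimates the integral, and the gap shrinks as the cells get smaller.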
- Let \(\Lambda(u)\) be a random intensity field.
Show that the Cox process likelihood can be written as \[ L(\theta;X) = \mathbb{E}_\Lambda\left[L(\theta;X \mid \Lambda)\right]. \]
Explain why this is analogous to a mixture model.
Answer
By definition: \[ L(\theta;X) = \int L(\theta;X \mid \Lambda)\, dP(\Lambda). \]
This is identical in structure to mixture models: \[ f(x) = \int f(x \mid z)\, dP(z). \]
Here, \(\Lambda\) plays the role of a latent variable.
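The mixture structure can be illustrated with the simplest Cox process, where the random intensity is constant in space. The sketch below assumes \(\Lambda \sim \text{Gamma}\) (a choice made here for illustration, not taken from the notes) and looks only at the count \(N(W)\), since the locations are uniform given the count. Averaging the conditional Poisson probability over draws of \(\Lambda\) should then reproduce the negative binomial probability, which is the known closed form of a Gamma-Poisson mixture.

```python
import math
import numpy as np

rng = np.random.default_rng(0)

def poisson_count_pmf(n, mean):
    """P(N = n) for N ~ Poisson(mean), on the log scale; mean may be an array."""
    return np.exp(n * np.log(mean) - mean - math.lgamma(n + 1))

def mixed_poisson_pmf_mc(n, shape, scale, area, draws=200_000):
    """Monte Carlo estimate of E_Lambda[P(N = n | Lambda)], Lambda ~ Gamma(shape, scale)."""
    lam = rng.gamma(shape, scale, size=draws)      # draws of the latent intensity
    return float(np.mean(poisson_count_pmf(n, lam * area)))

def negative_binomial_pmf(n, shape, scale, area):
    """Closed form of the same Gamma-Poisson mixture."""
    p = 1.0 / (1.0 + scale * area)
    log_pmf = (math.lgamma(n + shape) - math.lgamma(n + 1) - math.lgamma(shape)
               + shape * math.log(p) + n * math.log(1.0 - p))
    return math.exp(log_pmf)

# Hypothetical values used only for the comparison.
n, shape, scale, area = 12, 4.0, 2.5, 1.0
print(mixed_poisson_pmf_mc(n, shape, scale, area))
print(negative_binomial_pmf(n, shape, scale, area))
```

For a spatially varying \(\Lambda(u)\) the same averaging applies, but the expectation is over a random field and usually has no closed form.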
30.2 Minimum Contrast
- Show that the minimum contrast estimator based on the PCF,
\[ C(\theta) = \int w(r)\,[\hat{g}(r) - g(r;\theta)]^2 dr, \]
is a functional analogue of ordinary least squares.
What plays the role of:
- response variable?
- predictor?
- residual?
Answer
- response: \(\hat{g}(r)\)
- predictor: \(g(r;\theta)\), the model's fitted value at distance \(r\) (with \(r\) acting as the covariate)
- residual: \(\hat{g}(r) - g(r;\theta)\)
So the contrast is a weighted \(L^2\) loss over functions, with \(w(r)\) playing the role of regression weights.
- Suppose \[ g(r;\theta) = 1 + A e^{-2r/s}. \]
Show that \[ \log(g(r)-1) = \log A - \frac{2}{s} r. \]
Explain how this leads to a linear regression estimator.
Discuss when this transformation may be unstable.
Answer
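Subtract 1 and take logs: \[ g(r;\theta) - 1 = A e^{-2r/s} \Rightarrow \log(g(r)-1) = \log A - \frac{2}{s}r. \]
Regress \(y = \log(\hat{g}(r)-1)\) on \(r\). The intercept estimates \(\log A\) and the slope estimates \(-2/s\), so \(\hat{A} = e^{\text{intercept}}\) and \(\hat{s} = -2/\text{slope}\).
When \(\hat{g}(r) \approx 1\) (typically at large \(r\)), \(\hat{g}(r)-1\) is close to zero and the log amplifies estimation noise; where \(\hat{g}(r) \le 1\) the log is undefined.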
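The regression can be sketched in a few lines. This is a minimal illustration on a noise-free curve with hypothetical values \(A = 2\), \(s = 0.1\); the function name `log_linear_fit` and the rule of dropping points with \(\hat{g}(r) \le 1\) are assumptions made here, not part of the notes.

```python
import numpy as np

def log_linear_fit(r, g_hat):
    """Estimate (A, s) by regressing log(g_hat - 1) on r.

    Points with g_hat <= 1 are dropped because the log is undefined there.
    This is a simple workaround for illustration, and it can bias the fit.
    """
    usable = g_hat > 1.0
    y = np.log(g_hat[usable] - 1.0)
    slope, intercept = np.polyfit(r[usable], y, deg=1)   # y = intercept + slope * r
    return np.exp(intercept), -2.0 / slope               # A = e^intercept, s = -2 / slope

r = np.linspace(0.01, 0.3, 60)
g_exact = 1.0 + 2.0 * np.exp(-2.0 * r / 0.1)   # hypothetical truth: A = 2, s = 0.1
A_hat, s_hat = log_linear_fit(r, g_exact)
print(A_hat, s_hat)
```

Adding noise to `g_exact` before fitting shows the instability: the largest values of \(r\), where \(\hat{g}(r) - 1\) is smallest, dominate the scatter in \(y\).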
- Explain why the choice of fitting range \([r_{\min}, r_{\max}]\) is critical.
Your answer should discuss:
- behaviour near \(r=0\)
- behaviour for large \(r\)
- bias–variance trade-off
- Suppose two models have identical PCFs \(g(r)\) but different higher-order structure.
Show that minimum contrast based on the PCF cannot distinguish between them.
30.3 Composite Likelihood
- Starting from \[ \lambda^{(2)}(u,v) = \lambda(u)\lambda(v) g(u,v), \]
derive the pairwise composite likelihood
\[ L_{\text{pair}}(\theta) \propto \prod_{i<j} \lambda^{(2)}(x_i,x_j). \]
Then express it in terms of \(\lambda\) and \(g\).
- In the stationary case, show that the pairwise log-likelihood reduces to
\[ \ell(\theta) = \sum_{i<j} \log g(r_{ij}) + \text{constant}. \]
Explain why this makes the estimator depend only on the PCF.
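The sum \(\sum_{i<j} \log g(r_{ij})\) is straightforward to evaluate for a given point pattern and candidate PCF. The sketch below only computes that sum; it does not reproduce whatever is absorbed into the constant, and it applies no edge correction. The function name and the optional truncation distance `r_max` are assumptions introduced here for illustration.

```python
import numpy as np

def pairwise_log_pcf(points, g, r_max=np.inf):
    """Sum of log g(r_ij) over pairs i < j with r_ij <= r_max (no edge correction)."""
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt(np.sum(diffs**2, axis=-1))
    i, j = np.triu_indices(len(points), k=1)   # each unordered pair appears once
    r_ij = dists[i, j]
    r_ij = r_ij[r_ij <= r_max]
    return float(np.sum(np.log(g(r_ij))))

def exponential_pcf(r, A=2.0, s=0.1):
    # PCF form used elsewhere in these exercises, with hypothetical parameter values.
    return 1.0 + A * np.exp(-2.0 * r / s)

rng = np.random.default_rng(3)
points = rng.uniform(size=(40, 2))             # placeholder pattern on the unit square
print(pairwise_log_pcf(points, exponential_pcf))
```

The data enter only through the inter-point distances \(r_{ij}\) and the model enters only through \(g\), which is the point of the exercise.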
- Compare minimum contrast and composite likelihood:
- What data do they use?
- What structure do they rely on?
- Why might they give similar estimates?
30.4 Pseudolikelihood
- Show that for a Poisson process,
\[ \lambda(u \mid X) = \lambda(u). \]
Explain why this implies pseudolikelihood equals the true likelihood.
- For a Cox process, show that
\[ \lambda(u \mid X) = \mathbb{E}[\Lambda(u) \mid X]. \]
Explain why this is difficult to compute.
- Compare Cox and Gibbs processes in terms of conditional intensity:
- which has tractable \(\lambda(u \mid X)\)?
- which is easier for pseudolikelihood?
30.5 Simulation and Applied Exercises
- Poisson likelihood simulation
Simulate a homogeneous Poisson process with known \(\lambda\).
- Estimate \(\lambda\) using the likelihood.
- Repeat over many simulations.
- Verify empirically that the estimator is unbiased.
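One possible way to set up this experiment is sketched below. The rectangular window, the value \(\lambda = 50\), and the number of replicates are arbitrary choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

def simulate_poisson(lam, width, height):
    """Homogeneous Poisson process on [0, width] x [0, height].

    Draw the count from Poisson(lam * area), then place the points uniformly.
    """
    n = rng.poisson(lam * width * height)
    return rng.uniform([0.0, 0.0], [width, height], size=(n, 2))

def mle_intensity(points, area):
    """Maximum likelihood estimate n / |W|."""
    return len(points) / area

lam_true, width, height = 50.0, 2.0, 1.0
estimates = np.array([
    mle_intensity(simulate_poisson(lam_true, width, height), width * height)
    for _ in range(5000)
])
# The mean should be close to lam_true, and the variance close to lam_true / |W|.
print(estimates.mean(), estimates.var())
```

The variance comparison uses \(\text{Var}(\hat{\lambda}) = \text{Var}(N(W))/|W|^2 = \lambda/|W|\), which also shows why the estimator improves as the window grows.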
- Minimum contrast fitting
Simulate a clustered point process (e.g. LGCP or Thomas).
- Estimate \(\hat{g}(r)\)
- Fit parameters using minimum contrast
- Plot \(\hat{g}(r)\) and fitted \(g(r)\)
Investigate sensitivity to:
- fitting range
- weighting
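The sensitivity study can be prototyped before any point pattern is simulated. In the sketch below, `g_hat` is a synthetic noisy curve standing in for an estimated PCF (it is not computed from data), the model is the exponential-type PCF \(1 + A e^{-2r/s}\) from the earlier exercise, and the integral in the contrast is replaced by a sum on an equally spaced grid. For fixed \(s\) the contrast is quadratic in \(A\), so \(A\) is solved in closed form and \(s\) is found by grid search; this is one convenient approach, not necessarily the one used in the notes.

```python
import numpy as np

def model_pcf(r, A, s):
    """Exponential-type PCF: g(r) = 1 + A exp(-2 r / s)."""
    return 1.0 + A * np.exp(-2.0 * r / s)

def min_contrast_fit(r, g_hat, w, s_grid):
    """Minimise sum_k w_k [g_hat_k - g(r_k; A, s)]^2 dr over A and s.

    For each s on the grid, A has a closed-form weighted least-squares
    solution; the (contrast, A, s) triple with the smallest contrast is returned.
    """
    dr = r[1] - r[0]                       # assumes an equally spaced grid
    best = (np.inf, np.nan, np.nan)
    for s in s_grid:
        basis = np.exp(-2.0 * r / s)
        A = np.sum(w * (g_hat - 1.0) * basis) / np.sum(w * basis**2)
        contrast = np.sum(w * (g_hat - model_pcf(r, A, s)) ** 2) * dr
        if contrast < best[0]:
            best = (contrast, A, s)
    return best

rng = np.random.default_rng(1)
r = np.linspace(0.01, 0.5, 100)
# Synthetic stand-in for an estimated PCF: true curve plus noise that grows as r -> 0.
g_hat = model_pcf(r, A=2.0, s=0.1) + rng.normal(0.0, 0.02 / np.sqrt(r))
s_grid = np.linspace(0.02, 0.4, 400)

# Refit on different fitting ranges [r_min, r_max] with unit weights.
for r_min, r_max in [(0.01, 0.5), (0.05, 0.5), (0.01, 0.15)]:
    keep = (r >= r_min) & (r <= r_max)
    c, A, s = min_contrast_fit(r[keep], g_hat[keep], np.ones(keep.sum()), s_grid)
    print(f"range [{r_min}, {r_max}]: A = {A:.3f}, s = {s:.3f}")
```

Replacing `np.ones(keep.sum())` with another weight vector, for example one that downweights small \(r\), gives the weighting comparison.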
- Composite likelihood vs minimum contrast
Using the same simulated dataset:
- Fit parameters using minimum contrast
- Fit parameters using pairwise likelihood (if implemented)
- Compare estimates and fitted PCFs
- Identifiability experiment (important)
Simulate two processes with similar PCFs:
- one LGCP
- one CSCP
- Compute empirical \(\hat{g}(r)\)
- Fit both models using minimum contrast
- Compare fitted curves
Discuss:
- how similar the fits are
- what this implies about identifiability
- Effect of sample size
Repeat the identifiability experiment (Exercise 17) for increasing window sizes.
Explain:
- how estimation improves
- whether model distinguishability improves
30.6 Challenging / Theory Extensions
- Suppose \(g(r) = 1 + \epsilon h(r)\) for small \(\epsilon\).
Show that, to first order,
\[ \log g(r) \approx \epsilon h(r). \]
Explain why this suggests different models may look similar under weak clustering.
- Consider two models:
- Model A: \(g_A(r) = \exp(C(r))\)
- Model B: \(g_B(r) = 1 + C(r)\)
Show that for small \(C(r)\),
\[ g_A(r) \approx g_B(r). \]
Explain why this is relevant for LGCP vs CSCP.
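A quick numerical check of the approximation is sketched below. The exponential form of \(C(r)\) and its parameter values are hypothetical, chosen only to show how the gap between the two PCFs scales.

```python
import numpy as np

def covariance(r, sigma2, phi):
    """Hypothetical C(r) = sigma2 * exp(-r / phi), used only for illustration."""
    return sigma2 * np.exp(-r / phi)

def max_gap(sigma2, phi=0.1):
    """Largest |g_A(r) - g_B(r)| on a grid of distances."""
    r = np.linspace(0.0, 0.5, 200)
    C = covariance(r, sigma2, phi)
    g_a = np.exp(C)        # Model A
    g_b = 1.0 + C          # Model B
    return float(np.max(np.abs(g_a - g_b)))

for sigma2 in (1.0, 0.3, 0.1, 0.03):
    print(f"sigma2 = {sigma2}: max gap = {max_gap(sigma2):.5f}, "
          f"C(0)^2 / 2 = {sigma2**2 / 2:.5f}")
```

The gap behaves like \(C(r)^2/2\), the first neglected term in the expansion of \(\exp(C(r))\), so it shrinks quadratically as the clustering weakens.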