Cayula-Cornillon#

A classical front detection method looking at the field’s bimodality in a moving-window. Based on Cayula & Cornillon (1992).

Important

This only implements the histogram analysis and cohesion check. This does not include the cloud detection or contour following.

Definition#

Histogram analysis#

The algorithm first does an histogram analysis inside the moving window to measure bimodality and find a threshold temperature that separates two clusters of values, hopefully corresponding to cold and hot water masses.

The histogram of valid values inside the window is computed. The width of the bins can be adjusted (default is 0.1°C wide).

Note

Some data can be compressed with linear packing. This means it is discretized which can cause numerical noise in the histogram. In that case it is useful to shift the bins by half the discretization step. See Linear packing and shifting histogram bins and the Xarray function documentation for details.

For each possible threshold value, the bimodality is estimated by looking at intra-cluster and inter-cluster variance. For a threshold \(\tau\), we compute the number of values in each cluster and their average value:

\[\begin{split}\begin{cases} N_1 = \sum_{t<\tau} h(t) \\ N_2 = \sum_{t>\tau} h(t) \end{cases} ,\; \begin{cases} \mu_1 = \sum_{t<\tau} th(t) / N_1 \\ \mu_2 = \sum_{t>\tau} th(t) / N_2 \end{cases}\end{split}\]

We can then compute the variance resulting from the separation into two cluster (inter-cluster variance):

\[J_b = \frac{N_1 N_2}{(N_1+N_2)^2} (\mu_1 - \mu_2)^2\]

The separation temperature \(\tau_{\text{opt}}\) is taken as the one that maximizes the inter-cluster variance contribution to the total variance \(\sigma\). The distribution is considered bimodal if the ratio \(J_b / \sigma\) exceeds a fixed criteria. By default, that criteria is 0.7, as recommended by the article. See Cayula & Cornillon (1992) for more details on that choice.

Cohesion check#

So far, the algorithm only looks at the distribution of values. This distribution can be bimodal even if the two clusters are not spatially coherent water masses. Bimodality could be the result of patchiness because of clouds, land, or noise. Hence the two clusters are tested for spatial coherence.

In the window, we count the total numbers \(T_1\) and \(T_2\) of valid neighbors for each cluster (cold and warm respectively). We also count the numbers \(R_1\) and \(R_2\) of neighbors that are of the same cluster. We only consider the four closest neighbors.

The clusters are considered spatially coherent (and the fronts inside this window kept) if the following criteria are met:

\[\frac{R_1}{T_1} > 0.92,\; \frac{R_2}{T_2} > 0.92,\; \frac{R_1 + R_2}{T_1 + T_2} > 0.90\]

Edges#

If the distribution is bimodal, edges given by the separation temperature \(\tau_{\text{opt}}\) are found. We select pixels inside the moving window that have at least one adjacent pixel on the opposite side of the threshold.

This gives fronts that are one-pixel wide. However, if the moving-window is shifted in increments smaller than its size, there can be overlap in edges found in two windows. The returned values (the count of detected front in each pixel) can thus exceed one, and fronts can be wider than one pixel.

Note

By default, the window steps are equal to its size, so there is no overlap. However the detected fronts can be sensitive to the window placement.

Dask support#

By default there is no overlap between two subsequent window and the result the result is potentially sensitive to the absolute placement of the window. When having a Dask array chunked along the core dimensions (latitude and/or longitude), there is no guarantee that the window placement will be the same as if the image was a single chunk.

In the example below, the chunk size is slightly too small. The window placement in the second block will not be the same as if the image was not chunked (the dashed red lines).

../_images/cayula_cornillon_blocks.svg

Because of this, if the core dimensions are chunked, the function will only accept input arrays where the block size is a multiple of the window step ( which is equal to the window size by default).

Be careful in the combination of window size and step that you choose if your array in chunked in the core dimensions.

Functions#

Detect fronts:

Supported types and requirements#

Supported input types: Numpy, Dask, Xarray

Requirements:

  • numpy

  • numba

Other implementations#

Language

Notes

MGET

C

Toolbox for ArcGIS. It seems the plugins only ships the compiled code. It has not yet been ported for the new version, whose code is available publicly at jjrob/MGET.

C

C

JUNO

Pure Python (Xarray, Pandas)

This is pure Python and may be slow

References#

[cayula_1992]

Cayula J.-F., Cornillon P. “Edge Detection Algorithm for SST Images”. J. Atmos. Oceanic Tech. 9.1, p. 67-80, DOI:10.1175/1520-0426(1992)009<0067:edafsi>2.0.co;2, 1992