# ChArUco: 2D identification strategy (baseline)

Goal: obtain ChArUco 2D corner positions that are as stable as possible (sub-pixel) in order to quantify the impact of blur, compression, and aberrations, and to prepare calibration/reconstruction stages.

The project deliberately separates:

- a **geometric prior** (planar board + ArUco/ChArUco correspondences);
- an **image observation** (blur, compression, contrast, etc.);
- methods that rely on a **parametric model** (pinhole + distortion) or a **non-parametric model** (smoothed field).

## Error measurement

On synthetic datasets, the error is computed against the ground truth stored in `gt_charuco_corners.npz`:

- matching by `corner_id` (stable ID);
- per-view metrics (left/right): RMS, p50, p95, max, bias dx/dy.

Command:

```bash
.venv/bin/python -m stereocomplex.cli eval-charuco-detection dataset/v0 --method
```

## Pixel-center convention (important)

The project uses a “pixel centers at integer coordinates” convention (see `docs/CONVENTIONS.md`). OpenCV often reports corners in a convention shifted by 0.5 px; the evaluation code compensates for that shift for `--method charuco`.

## Available methods (CLI `--method`)

### 1) `charuco` (direct OpenCV)

- OpenCV ArUco pipeline → ChArUco interpolation (inner chessboard corners).
- Pro: simple, no camera model assumption.
- Limitation: accuracy is often limited (sensitivity to blur/compression + conventions + internal heuristics).

### 2) `homography` (2nd-pass planar geometry)

- Detects ArUco corners, estimates a global homography (RANSAC), then projects all ChArUco corners.
- Works well when the image is well explained by a “simple” planar projective mapping.
- Limitation: degrades in the presence of out-of-model distortions (e.g. strong radial distortion).
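
The projection step of the `homography` method can be illustrated with a minimal sketch. It assumes the homography has already been estimated (the RANSAC fit on ArUco corners is not shown), and the function name `project_homography`, the matrix `H_example`, and the corner coordinates are all hypothetical values chosen for illustration, not taken from the repository.

```python
def project_homography(H, points):
    """Map board-plane points (x, y) to image points (u, v) through a 3x3 homography H."""
    projected = []
    for x, y in points:
        # Homogeneous product H @ [x, y, 1].
        u_h = H[0][0] * x + H[0][1] * y + H[0][2]
        v_h = H[1][0] * x + H[1][1] * y + H[1][2]
        w = H[2][0] * x + H[2][1] * y + H[2][2]
        if abs(w) < 1e-12:
            raise ValueError("point maps to infinity under H")
        # Perspective division back to pixel coordinates.
        projected.append((u_h / w, v_h / w))
    return projected


# Hypothetical homography: 10 px per board unit, offset (100, 50),
# plus a small perspective term along x.
H_example = [
    [10.0, 0.0, 100.0],
    [0.0, 10.0, 50.0],
    [0.001, 0.0, 1.0],
]

# Hypothetical ChArUco corner positions in board coordinates.
board_corners = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
image_corners = project_homography(H_example, board_corners)
```

Because every ChArUco corner is predicted from the same eight-parameter mapping, any lens distortion that a homography cannot represent shows up directly as a position error, which is the limitation noted above.
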
### 3) `pnp` (2nd-pass parametric K + distortion)

- Uses `meta.json` (pitch/crop/resize + `f_um`) to build `K` and distortion coefficients, then:
  - runs `solvePnPRansac` on ArUco 3D→2D corners,
  - uses `projectPoints` to predict ChArUco corners.
- Pro: robust when the optics can be modeled as pinhole + (Brown) distortion.
- Limitation: not applicable / biased for non-pinhole systems (e.g. non-central microscope/CMO models).

**Important note (focal length)**

In the current synthetic dataset, `f_um` is known because it is generated and stored in `meta.json` (`sim_params.f_um`). Therefore, method `pnp` uses it as a **known** parameter to isolate the “point identification” effect.

In real data, `f_um` (and more generally `K` and distortion) are not known a priori:

- either they are estimated by a classical multi-view calibration (e.g. Zhang) before running `pnp`,
- or they are part of an auto-calibration problem (latent variables to estimate),
- or one avoids the pinhole assumption and uses a non-parametric method (e.g. `rayfield`).

### 4) `rayfield` (2nd-pass non-parametric “smoothed field” on the board plane)

Goal: replace a pinhole model by a weaker assumption: the mapping from the board plane to the image is **low-frequency**.

Implementation (plane-only):

- global homography `H` (RANSAC) as a stable baseline;
- residual field `r(x,y)` estimated on a grid (bilinear), regularized with a smoothing term (Laplacian) and robust to outliers (Huber);
- prediction: `u(x,y) = H(x,y) + r(x,y)`.

Pros:

- does not depend on a pinhole optical model;
- captures slow variations (complex aberrations) while remaining stable.

Limitation:

- this “ray-field” is **restricted to the plane** (a 2D warp); for a full 3D per-pixel ray field, calibration across multiple poses/planes is required.
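
The prediction step `u(x,y) = H(x,y) + r(x,y)` can be sketched as follows. This covers only the evaluation of an already fitted model: the Laplacian-regularized, Huber-robust fit of the residual grid is not shown. The helper names (`bilinear`, `predict`), the grid layout (nodes at board coordinates `(col * cell, row * cell)`), and all numeric values are assumptions made for illustration and may differ from the repository code.

```python
def bilinear(grid, gx, gy):
    """Bilinearly interpolate grid[row][col] at fractional column gx and row gy.

    The grid must have at least 2 rows and 2 columns.
    """
    rows, cols = len(grid), len(grid[0])
    # Clamp so that queries on or outside the border use the nearest valid cell.
    gx = min(max(gx, 0.0), cols - 1.0)
    gy = min(max(gy, 0.0), rows - 1.0)
    c0 = min(int(gx), cols - 2)
    r0 = min(int(gy), rows - 2)
    tx, ty = gx - c0, gy - r0
    top = (1 - tx) * grid[r0][c0] + tx * grid[r0][c0 + 1]
    bottom = (1 - tx) * grid[r0 + 1][c0] + tx * grid[r0 + 1][c0 + 1]
    return (1 - ty) * top + ty * bottom


def predict(H, res_u, res_v, x, y, cell):
    """Return (u, v) = H(x, y) + r(x, y) for one board-plane point."""
    # Baseline: global homography with perspective division.
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    u = (H[0][0] * x + H[0][1] * y + H[0][2]) / w
    v = (H[1][0] * x + H[1][1] * y + H[1][2]) / w
    # Residual: assumed grid with nodes at (col * cell, row * cell).
    gx, gy = x / cell, y / cell
    return u + bilinear(res_u, gx, gy), v + bilinear(res_v, gx, gy)


# Hypothetical fitted model: pure scale + offset homography and a 2x2 residual grid (in px).
H_fit = [[10.0, 0.0, 100.0], [0.0, 10.0, 50.0], [0.0, 0.0, 1.0]]
residual_u = [[0.0, 0.4], [0.0, 0.4]]  # grows along x
residual_v = [[0.0, 0.0], [0.2, 0.2]]  # grows along y

u_pred, v_pred = predict(H_fit, residual_u, residual_v, 0.5, 0.5, cell=1.0)
```

Keeping the homography as the baseline means the residual grid only has to represent small, slowly varying corrections, so a coarse grid with strong smoothing is enough.
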
### 4b) `rayfield_tps` and `rayfield_tps_robust` (recommended in this repo)

The current default used in the examples/paper is `rayfield_tps_robust`:

- base homography `H`,
- TPS (thin-plate spline) residual field,
- robustification by IRLS (Huber).

Compared to the grid backend, TPS is usually more stable when the residual field is only observed sparsely (at the ArUco corners).

### 5) `kfield` (a “local K” field approximated by smoothed affines)

This method was an intermediate step: the idea is to replace a global `K` by a spatially varying field, under a low-frequency assumption.

Note: in the current code, `kfield` does **not** interpolate a pinhole matrix $K$ in the strict sense. Instead, it builds a smoothed field of local **affine** (first-order) models obtained by **linearizing** the plane→image mapping.

#### Linearization (Jacobian)

Consider an unknown (potentially complex) mapping between the board plane and the image:

- `u = u(x,y)`
- `v = v(x,y)`

Around a reference point $(x_q, y_q)$, we can write a first-order Taylor expansion:

- `u(x,y) ≈ u_q + (∂u/∂x)_q · (x-x_q) + (∂u/∂y)_q · (y-y_q)`
- `v(x,y) ≈ v_q + (∂v/∂x)_q · (x-x_q) + (∂v/∂y)_q · (y-y_q)`

The local **Jacobian** (the linear part) is:

```
J(x_q,y_q) = [[∂u/∂x, ∂u/∂y],
              [∂v/∂x, ∂v/∂y]]   (evaluated at (x_q,y_q))
```

The `kfield` idea is to estimate this local Jacobian (and offset) from the ArUco correspondences available in the image, then smooth/interpolate it to obtain a low-frequency approximation.

#### Construction (what the code does)

- choose an anchor grid in board coordinates $(x,y)$;
- at each anchor, fit a local affine model by weighted least squares (nearest ArUco neighbors):

  ```{math}
  u(x,y)=a_0 + a_1 x + a_2 y,\quad v(x,y)=b_0 + b_1 x + b_2 y
  ```

  where $a_1, a_2, b_1, b_2$ estimate the local Jacobian $(\partial u/\partial x, \partial u/\partial y, \partial v/\partial x, \partial v/\partial y)$;
- smooth each parameter $(a_0,a_1,a_2,b_0,b_1,b_2)$ on the grid (Gaussian);
- for a query point $(x,y)$, bilinearly interpolate these parameters and apply the affine mapping.

Why this is not sufficient:

- a local affine model does not capture projective effects (and even less distortion) over the full board;
- directly interpolating a matrix $K$ is not “geometrically stable” (constraints on $f_x,f_y$, etc.).

In practice, `rayfield` (homography + smoothed residual field) matches the “low-frequency” intuition while remaining numerically stable.

## Assumptions per method (what it “assumes”)

Summary of dependencies (as of the current code):

- `charuco`: does not require `K`/distortion, but depends on OpenCV heuristics.
- `homography`: does not require `K`/distortion; assumes a global homography explains the board image well.
- `tps`: does not require `K`/distortion; assumes a smooth 2D warp (thin-plate spline) and can extrapolate unstably if underconstrained.
- `pnp`: **requires** an optical model (pinhole + distortion) and its parameters (or a prior step estimating them).
- `rayfield`: does not require `K`/distortion; assumes a low-frequency planar warp and uses only ArUco correspondences plus regularization.
- `rayfield_tps`: a `rayfield` variant where the residual is reconstructed by regularized TPS (instead of a bilinear grid + Laplacian).
- `rayfield_tps_robust`: TPS residual + robust loss (recommended default).

## Photometric refinements (CLI `--refine`)

Refinements based on structure tensor/gradients exist (`tensor`, `lines`, `lsq`, `noble`), but on the current datasets they often move corners toward a photometric optimum that does not match the GT geometric center. They should be considered ablations/experiments rather than the recommended method.

## Current recommendation

- If the optics are well approximated by pinhole + distortion: prefer `pnp`.
- If the optics are complex/non-central: prefer `rayfield_tps_robust` (low-frequency assumption) and increase regularization if needed.

## Paper comparison (reproducible script)

The manuscript includes an automatically generated table (methods vs errors). To regenerate it:

```bash
.venv/bin/python paper/experiments/compare_charuco_methods.py dataset/v0_png --splits train
bash paper/build_pdflatex.sh
```

## Worked example (raw OpenCV vs ray-field + plots)

See `docs/RAYFIELD_WORKED_EXAMPLE.md` (includes a detailed explanation of why a global homography + a smoothed residual field can correct part of the aberrations/distortions on the board plane).