API Reference

Complete API documentation for phasecurvefit.

Main Function#

phasecurvefit.walk_local_flow(*args: object, **kwargs: object)

Order tracers with the local-flow walk (deprecated; use order).

Deprecated since version 0.3: walk_local_flow will be removed in v0.4. Use pcf.order(positions, velocities) – or pcf.orderers.LocalFlowOrderer(...).order(...) for non-default walk parameters – which routes through the same implementation without a warning.

Thin wrapper that emits a DeprecationWarning and forwards to the private implementation (plain and Quantity dispatch), returning an unchanged WalkLocalFlowResult.
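The wrapper pattern described above can be sketched in plain Python. This is an illustrative sketch only: `_order_impl` and `walk_local_flow_sketch` are hypothetical names, and the warning text is an assumption, not the library's actual message.

```python
import warnings


def _order_impl(*args, **kwargs):
    # Hypothetical stand-in for the private implementation.
    return ("result", args, kwargs)


def walk_local_flow_sketch(*args, **kwargs):
    """Emit a DeprecationWarning, then forward all arguments unchanged."""
    warnings.warn(
        "walk_local_flow is deprecated; use order instead.",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not this wrapper
    )
    return _order_impl(*args, **kwargs)


# Record warnings so the deprecation can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    out = walk_local_flow_sketch(1, 2, metric_scale=0.5)
```

Because the wrapper only forwards its arguments, the returned object is whatever the implementation produces; migrating to the replacement call should not change results.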

Parameters:

args (object)
kwargs (object)

Return type:

WalkLocalFlowResult

Result Accessor#

Helper function to extract ordered data from results.

phasecurvefit.order_w(res: WalkLocalFlowResult, /)

Get xs and vs in the ordered sequence from a WalkLocalFlowResult.

Filters out unvisited indices (marked as -1) and returns only the visited observations in the order they were traversed.

Parameters:

res (WalkLocalFlowResult) – The result from walk_local_flow.

Return type:

tuple[dict[str, Union[Real[Array, 'N'], Real[ndarray, 'N'], Real[TypedNdArray, 'N']]], dict[str, Union[Real[Array, 'N'], Real[ndarray, 'N'], Real[TypedNdArray, 'N']]]]

Returns:

xs (dict[str, Array]) – Position arrays reordered according to the algorithm’s output, with unvisited observations removed.
vs (dict[str, Array]) – Velocity arrays reordered according to the algorithm’s output, with unvisited observations removed.

Examples

>>> import jax.numpy as jnp
>>> import phasecurvefit as pcf
>>> pos = {"x": jnp.array([3.0, 1.0, 2.0])}
>>> vel = {"x": jnp.array([1.0, 1.0, 1.0])}
>>> lfo = pcf.orderers.LocalFlowOrderer(start_idx=1, metric_scale=0.0)
>>> result = pcf.order(pos, vel, lfo)
>>> ordered_pos, ordered_vel = pcf.order_w(result)
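The filtering step can be illustrated with a minimal NumPy sketch. `order_w_sketch` is a hypothetical helper that takes the raw dictionaries and index array directly; it assumes that `indices` lists original observation indices in traversal order, with -1 marking unvisited entries, and it is not the library's implementation.

```python
import numpy as np


def order_w_sketch(positions, velocities, indices):
    """Drop -1 entries from `indices`, then gather each component in walk order."""
    walk = indices[indices >= 0]  # visited observations, in traversal order
    xs = {k: v[walk] for k, v in positions.items()}
    vs = {k: v[walk] for k, v in velocities.items()}
    return xs, vs


pos = {"x": np.array([3.0, 1.0, 2.0, 9.0])}
vel = {"x": np.array([1.0, 1.0, 1.0, -1.0])}
indices = np.array([1, 2, 0, -1])  # observation 3 was never visited
xs, vs = order_w_sketch(pos, vel, indices)
```

Here the returned arrays have length 3 rather than 4, because the unvisited observation is removed rather than padded.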

Distance Metrics#

Pluggable distance metrics for controlling how the algorithm selects the next point. See the Metrics Guide for usage examples.

class phasecurvefit.metrics.AbstractDistanceMetric

Bases: Module

Abstract base class for distance metrics in phase-space walks.

A distance metric computes modified distances between a current point and all candidate next points, incorporating both spatial and velocity information. Different metrics can implement different weighting schemes or use different phase-space representations.

Examples

>>> import phasecurvefit as pcf
>>> metric = pcf.metrics.AlignedMomentumDistanceMetric()
>>> # Use with walk_local_flow via metric parameter

final class phasecurvefit.metrics.AlignedMomentumDistanceMetric

Bases: AbstractDistanceMetric

Default momentum-based distance metric.

Computes modified distance as:

$$ d = d_0 + \lambda (1 - \cos\theta) $$

where $d_0$ is the Euclidean distance in position space, $\theta$ is the angle between the current velocity and the direction to the candidate point, and $\lambda$ controls the relative importance of momentum alignment.

When $\lambda = 0$, the metric reduces to pure nearest-neighbor search in position space. As $\lambda$ increases, points aligned with the current velocity direction are increasingly favored.

This is the original phase-flow walk metric from Nibauer et al. (2022).

Examples

>>> import jax.numpy as jnp
>>> import phasecurvefit as pcf
>>> metric = pcf.metrics.AlignedMomentumDistanceMetric()
>>> pos = {"x": jnp.array([0.0, 1.0, 2.0]), "y": jnp.array([0.0, 0.5, 1.0])}
>>> vel = {"x": jnp.array([1.0, 1.0, 1.0]), "y": jnp.array([0.5, 0.5, 0.5])}
>>> current_pos = {k: v[0] for k, v in pos.items()}
>>> current_vel = {k: v[0] for k, v in vel.items()}
>>> distances = metric(current_pos, current_vel, pos, vel, metric_scale=1.0)
>>> distances.shape
(3,)
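To make the formula concrete, here is a minimal NumPy sketch that evaluates the modified distance from one point to several candidates. The function name, the array layout (rows are candidates), and the handling of zero separation are assumptions made for illustration; the library operates on component dictionaries instead.

```python
import numpy as np


def aligned_momentum_distance(q0, v0, qs, lam):
    """Modified distance d0 + lam * (1 - cos(theta)) to every candidate.

    q0, v0: arrays of shape (D,); qs: array of shape (N, D).
    """
    sep = qs - q0                         # vectors from current point to candidates
    d0 = np.linalg.norm(sep, axis=1)      # Euclidean distances
    vhat = v0 / np.linalg.norm(v0)        # unit velocity of the current point
    # Assumption: at zero separation the direction is undefined, so use cos = 1.
    safe = np.where(d0 > 0, d0, 1.0)
    cos_theta = np.where(d0 > 0, (sep @ vhat) / safe, 1.0)
    return d0 + lam * (1.0 - cos_theta)


q0 = np.array([0.0, 0.0])
v0 = np.array([1.0, 0.0])
qs = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0]])  # ahead, behind, sideways
d = aligned_momentum_distance(q0, v0, qs, lam=2.0)
```

All three candidates are one unit away, yet the one ahead of the velocity scores 1, the sideways one 3, and the one behind 5, which is how the walk prefers to continue along the flow.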

final class phasecurvefit.metrics.SpatialDistanceMetric

Bases: AbstractDistanceMetric

Position-only distance metric.

Computes pure Euclidean distance in position space, ignoring velocity information entirely. This reduces to standard nearest-neighbor search.

$$ d = d_0 $$

where $d_0$ is the Euclidean distance between positions. The metric_scale parameter is ignored.

This metric is useful when:

Velocity information is unreliable or unavailable
Pure spatial proximity is the desired ordering criterion
Comparing against baseline nearest-neighbor approaches

Examples

>>> import jax.numpy as jnp
>>> import phasecurvefit as pcf
>>> metric = pcf.metrics.SpatialDistanceMetric()
>>> pos = {"x": jnp.array([0.0, 1.0, 2.0]), "y": jnp.array([0.0, 0.5, 1.0])}
>>> vel = {"x": jnp.array([1.0, 1.0, 1.0]), "y": jnp.array([0.5, 0.5, 0.5])}
>>> current_pos = {k: v[0] for k, v in pos.items()}
>>> current_vel = {k: v[0] for k, v in vel.items()}
>>> distances = metric(current_pos, current_vel, pos, vel, metric_scale=0.0)
>>> distances.shape
(3,)

final class phasecurvefit.metrics.FullPhaseSpaceDistanceMetric

Bases: AbstractDistanceMetric

Full 6D phase-space distance metric.

Computes the Euclidean distance in the full 6-dimensional phase space by combining position and velocity differences. The parameter metric_scale (with time units) converts velocity differences to position units.

$$ d = \sqrt{d_0^2 + (\tau \cdot d_v)^2} $$

where $d_0$ is the Euclidean distance in position space, $d_v$ is the Euclidean distance in velocity space, and $\tau$ is the time parameter (metric_scale) that converts velocity to position units.

This metric treats position and velocity symmetrically in phase space, without directional bias from momentum alignment. The metric_scale parameter determines the relative weighting of velocity differences.

Physically, if we think of phase space as having position coordinates measured in kpc and velocity coordinates measured in kpc/Myr, then metric_scale with units of Myr converts velocity differences to kpc, allowing us to compute a true Euclidean distance in a uniformly scaled phase space.

This metric is useful when:

Position and velocity information are equally important
You want true 6D proximity without momentum direction bias
The natural time scale of the system is known

Examples

>>> import jax.numpy as jnp
>>> import phasecurvefit as pcf
>>> metric = pcf.metrics.FullPhaseSpaceDistanceMetric()
>>> pos = {"x": jnp.array([0.0, 1.0, 2.0]), "y": jnp.array([0.0, 0.5, 1.0])}
>>> vel = {"x": jnp.array([1.0, 1.5, 2.0]), "y": jnp.array([0.5, 1.0, 1.5])}
>>> current_pos = {k: v[0] for k, v in pos.items()}
>>> current_vel = {k: v[0] for k, v in vel.items()}
>>> # metric_scale=1.0 means 1 unit of velocity diff = 1 unit of position diff
>>> distances = metric(current_pos, current_vel, pos, vel, metric_scale=1.0)
>>> distances.shape
(3,)
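The role of the time scale can be seen in a small NumPy sketch of the same formula. The function name and the row-per-candidate array layout are illustrative assumptions, not the library's interface.

```python
import numpy as np


def full_phase_space_distance(q0, v0, qs, vs, tau):
    """sqrt(d0**2 + (tau * dv)**2) from one phase-space point to N candidates."""
    d0 = np.linalg.norm(qs - q0, axis=1)  # position separation, e.g. kpc
    dv = np.linalg.norm(vs - v0, axis=1)  # velocity separation, e.g. kpc/Myr
    return np.sqrt(d0**2 + (tau * dv) ** 2)


q0 = np.array([0.0, 0.0])
v0 = np.array([0.0, 0.0])
qs = np.array([[3.0, 0.0]])
vs = np.array([[0.0, 4.0]])

d_tau1 = full_phase_space_distance(q0, v0, qs, vs, tau=1.0)  # 3-4-5 triangle
d_tau0 = full_phase_space_distance(q0, v0, qs, vs, tau=0.0)  # velocity ignored
```

With `tau=0.0` the velocity term drops out and the result is the plain position distance; larger values of `tau` weight velocity differences more heavily.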

Phase-Space Utilities#

Low-level functions for phase-space operations. Available in the phasecurvefit.w submodule.

Compute Euclidean distance between two position points in Cartesian space.

This function operates on scalar components only (single points). Use jax.vmap to compute distances for arrays of points.

Parameters:

q_a (ScalarComponents) – Position dictionaries with scalar Cartesian components (keys: “x”, “y”, “z”).
q_b (ScalarComponents) – Position dictionaries with scalar Cartesian components (keys: “x”, “y”, “z”).

Returns:

The Euclidean distance.

Return type:

FLikeSz0

Examples

>>> import jax.numpy as jnp
>>> q_a = {"x": jnp.array(0.0), "y": jnp.array(0.0)}
>>> q_b = {"x": jnp.array(3.0), "y": jnp.array(4.0)}
>>> float(euclidean_distance(q_a, q_b))
5.0

phasecurvefit.w.euclidean_distance(q_a: Mapping[str, jaxtyping.Real[AbstractQuantity, '']], q_b: Mapping[str, jaxtyping.Real[AbstractQuantity, '']], /) → jaxtyping.Real[AbstractQuantity, '']

Parameters:

q_a (dict[str, Real[Array, ''] | Real[ndarray, ''] | number | int | float | Real[TypedNdArray, '']])
q_b (dict[str, Real[Array, ''] | Real[ndarray, ''] | number | int | float | Real[TypedNdArray, '']])

Return type:

Float[Array, ''] | Float[ndarray, ''] | number | float | Float[TypedNdArray, '']

Euclidean distance between Quantity-valued component dictionaries.

Computes the distance between two phase-space positions represented as dictionaries with unxt Quantity scalar values.

Parameters:

q_a (Mapping[str, unxt.AbstractQuantity]) – Position dictionaries with Quantity-valued components. Must have the same keys. All values must have compatible length dimensions.
q_b (Mapping[str, unxt.AbstractQuantity]) – Position dictionaries with Quantity-valued components. Must have the same keys. All values must have compatible length dimensions.

Returns:

The Euclidean distance with the unit of the input components.

Return type:

unxt.Quantity

Examples

>>> import jax.numpy as jnp
>>> import unxt as u
>>> q_a = {"x": u.Q(0.0, "m"), "y": u.Q(0.0, "m")}
>>> q_b = {"x": u.Q(3.0, "m"), "y": u.Q(4.0, "m")}
>>> euclidean_distance(q_a, q_b)
Quantity(Array(5., dtype=float32, weak_type=True), unit='m')

Compute unit direction vector from position a to b in Cartesian space.

This function operates on scalar components only (single points). Use jax.vmap to compute directions for arrays of points.

Parameters:

q_a (ScalarComponents) – Position dictionaries with scalar Cartesian components (keys: “x”, “y”, “z”).
q_b (ScalarComponents) – Position dictionaries with scalar Cartesian components (keys: “x”, “y”, “z”).

Returns:

Dictionary of unit direction Cartesian components.

Return type:

ScalarComponents

Examples

>>> import jax.numpy as jnp
>>> q_a = {"x": jnp.array(0.0), "y": jnp.array(0.0)}
>>> q_b = {"x": jnp.array(3.0), "y": jnp.array(4.0)}
>>> udir = unit_direction(q_a, q_b)
>>> float(udir["x"]), float(udir["y"])
(0.6..., 0.8...)

phasecurvefit.w.unit_direction(q_a: Mapping[str, jaxtyping.Real[AbstractQuantity, '']], q_b: Mapping[str, jaxtyping.Real[AbstractQuantity, '']], /) → Mapping[str, jaxtyping.Real[AbstractQuantity, '']]

Parameters:

q_a (dict[str, Real[Array, ''] | Real[ndarray, ''] | number | int | float | Real[TypedNdArray, '']])
q_b (dict[str, Real[Array, ''] | Real[ndarray, ''] | number | int | float | Real[TypedNdArray, '']])

Return type:

Compute unit direction vector from q_a to q_b for Quantity-valued components.

Computes the unit direction vector pointing from position q_a to q_b, where both positions are represented as dictionaries with unxt Quantity scalar values.

Parameters:

q_a (Mapping[str, unxt.AbstractQuantity]) – Position dictionaries with Quantity-valued components. Must have the same keys. All values must have compatible length dimensions.
q_b (Mapping[str, unxt.AbstractQuantity]) – Position dictionaries with Quantity-valued components. Must have the same keys. All values must have compatible length dimensions.

Returns:

A dictionary representing the unit direction vector. The components are dimensionless Quantities.

Return type:

Mapping[str, unxt.AbstractQuantity]

Examples

>>> import jax.numpy as jnp
>>> import unxt as u
>>> q_a = {"x": u.Q(0.0, "m"), "y": u.Q(0.0, "m")}
>>> q_b = {"x": u.Q(3.0, "m"), "y": u.Q(4.0, "m")}
>>> unit_direction(q_a, q_b)
{'x': Quantity(Array(0.6, dtype=float32, weak_type=True), unit=''),
 'y': Quantity(Array(0.8, dtype=float32, weak_type=True), unit='')}

Compute the unit velocity vector in Cartesian space.

This function operates on scalar components only (single velocity vector). Use jax.vmap to compute unit velocities for arrays of velocities.

Parameters:

velocity (ScalarComponents) – Velocity dictionary with scalar Cartesian components (keys: “x”, “y”, “z”).

Returns:

Dictionary of unit velocity Cartesian components.

Return type:

ScalarComponents

Examples

>>> import jax.numpy as jnp
>>> vel = {"x": jnp.array(3.0), "y": jnp.array(4.0)}
>>> uvel = unit_velocity(vel)
>>> float(uvel["x"]), float(uvel["y"])
(0.6..., 0.8...)

phasecurvefit.w.unit_velocity(velocity: Mapping[str, jaxtyping.Real[AbstractQuantity, '']], /) → Mapping[str, jaxtyping.Real[AbstractQuantity, '']]

Parameters:

velocity (dict[str, Real[Array, ''] | Real[ndarray, ''] | number | int | float | Real[TypedNdArray, '']])

Return type:

dict[str, Real[Array, ''] | Real[ndarray, ''] | number | int | float | Real[TypedNdArray, '']]

Compute unit velocity vector for Quantity-valued components.

Computes the unit velocity vector from a velocity represented as a dictionary with unxt Quantity scalar values.

Parameters:

velocity (Mapping[str, unxt.AbstractQuantity]) – Velocity dictionary with Quantity-valued components. All values must have compatible velocity dimensions (length/time).

Returns:

A dictionary representing the unit velocity vector. The components are dimensionless Quantities.

Return type:

Mapping[str, unxt.AbstractQuantity]

Examples

>>> import jax.numpy as jnp
>>> import unxt as u
>>> vel = {"x": u.Q(3.0, "m/s"), "y": u.Q(4.0, "m/s")}
>>> unit_velocity(vel)
{'x': Quantity(Array(0.6, dtype=float32, ...), unit=''),
 'y': Quantity(Array(0.8, dtype=float32, ...), unit='')}

Compute cosine similarity between two vectors in Cartesian space.

The cosine similarity is defined as the dot product of the vectors. For unit vectors, this equals the cosine of the angle between them.

This function operates on scalar components only (single vectors). Use jax.vmap to compute similarities for arrays of vectors.

Parameters:

vec_a (ScalarComponents) – Vector dictionaries with scalar Cartesian components (keys: “x”, “y”, “z”).
vec_b (ScalarComponents) – Vector dictionaries with scalar Cartesian components (keys: “x”, “y”, “z”).

Returns:

The cosine similarity (dot product).

Return type:

FLikeSz0

Examples

>>> import jax.numpy as jnp
>>> # Parallel vectors
>>> a = {"x": jnp.array(1.0), "y": jnp.array(0.0)}
>>> b = {"x": jnp.array(1.0), "y": jnp.array(0.0)}
>>> float(cosine_similarity(a, b))
1.0

>>> # Orthogonal vectors
>>> a = {"x": jnp.array(1.0), "y": jnp.array(0.0)}
>>> b = {"x": jnp.array(0.0), "y": jnp.array(1.0)}
>>> float(cosine_similarity(a, b))
0.0
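Since the function is described as a plain dot product, the result is a true cosine only when both inputs are already unit vectors. The pure-Python sketch below illustrates the difference; `dot` and `unit` are hypothetical helpers written for this example and are not part of the library.

```python
import math


def dot(a, b):
    """Dot product of two component dictionaries with identical keys."""
    return sum(a[k] * b[k] for k in a)


def unit(v):
    """Scale a component dictionary to unit length."""
    norm = math.sqrt(dot(v, v))
    return {k: c / norm for k, c in v.items()}


a = {"x": 3.0, "y": 4.0}
b = {"x": 3.0, "y": 0.0}

raw = dot(a, b)                 # not a cosine: a and b are not unit vectors
cos_ab = dot(unit(a), unit(b))  # cosine of the angle between a and b
```

Normalizing first (for example with unit_velocity or unit_direction) keeps the result within [-1, 1].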

phasecurvefit.w.cosine_similarity(vel_a: Mapping[str, jaxtyping.Real[AbstractQuantity, '']], vel_b: Mapping[str, jaxtyping.Real[AbstractQuantity, '']], /) → jaxtyping.Real[AbstractQuantity, '']

Parameters:

vec_a (dict[str, Real[Array, ''] | Real[ndarray, ''] | number | int | float | Real[TypedNdArray, '']])
vec_b (dict[str, Real[Array, ''] | Real[ndarray, ''] | number | int | float | Real[TypedNdArray, '']])

Return type:

Float[Array, ''] | Float[ndarray, ''] | number | float | Float[TypedNdArray, '']

Compute cosine similarity between Quantity-valued velocity components.

Computes the cosine similarity (dimensionless) between two vectors represented as dictionaries with unxt Quantity scalar values. The result is the cosine of the angle between the two vectors.

Parameters:

vel_a (Mapping[str, unxt.AbstractQuantity]) – Velocity or direction dictionaries with Quantity-valued components. Must have the same keys. All values must have compatible dimensions.
vel_b (Mapping[str, unxt.AbstractQuantity]) – Velocity or direction dictionaries with Quantity-valued components. Must have the same keys. All values must have compatible dimensions.

Returns:

The dimensionless cosine similarity between the two vectors.

Return type:

unxt.Quantity

Examples

>>> import jax.numpy as jnp
>>> import unxt as u
>>> vel_a = {"x": u.Q(1.0, "m/s"), "y": u.Q(0.0, "m/s")}
>>> vel_b = {"x": u.Q(0.0, "m/s"), "y": u.Q(1.0, "m/s")}
>>> cosine_similarity(vel_a, vel_b)
Quantity(Array(0., dtype=float32, ...), unit='')

Extract a phase-space point at the given index.

This function uses standard phase-space notation where:

q = position (generalized Cartesian coordinates: “x”, “y”, “z”)
p = momentum/velocity (generalized Cartesian momenta: “x”, “y”, “z”)
w = (q, p) = full phase-space point

This extracts a single point (scalar components) from arrays. For extracting multiple points, use jax.vmap or array indexing directly.

Parameters:

q (VectorComponents) – Position dictionary with 1D array Cartesian values of shape (N,).
p (VectorComponents) – Velocity/momentum dictionary with 1D array Cartesian values of shape (N,).
idx (int | Array) – Index or indices to extract. Can be:
    int: Extract a single point (returns scalar arrays)
    0-d Array: Extract a single point (returns scalar arrays)
    1-d Array: Extract multiple points (returns 1D arrays)

Returns:

The (position, velocity) tuple at the given index/indices.

Return type:

tuple[dict[str, Array], dict[str, Array]]

Examples

>>> import jax.numpy as jnp
>>> pos = {"x": jnp.array([1.0, 2.0, 3.0]), "y": jnp.array([4.0, 5.0, 6.0])}
>>> vel = {"x": jnp.array([0.1, 0.2, 0.3]), "y": jnp.array([0.4, 0.5, 0.6])}

Extract a single point:

>>> q, p = get_w_at(pos, vel, 1)
>>> float(q["x"]), float(p["y"])
(2.0, 0.5)

Extract multiple points:

>>> q, p = get_w_at(pos, vel, jnp.array([0, 2]))
>>> list(q["x"])
[Array(1., dtype=float32), Array(3., dtype=float32)]

Types#

class phasecurvefit.WalkLocalFlowResult(positions: dict[str, Real[Array, 'N'] | Real[ndarray, 'N'] | Real[TypedNdArray, 'N']], velocities: dict[str, Real[Array, 'N'] | Real[ndarray, 'N'] | Real[TypedNdArray, 'N']], indices: Int[Array, 'N'], *, gamma_range: tuple[float, float] = (0.0, 1.0), backbone: dict[str, Real[Array, 'N'] | Real[ndarray, 'N'] | Real[TypedNdArray, 'N']] | None = None)

Bases: OrderingResult

Result of the local flow walk algorithm.

This class represents the complete output of the phase-flow walk algorithm. It contains the walk ordering, original phase-space data, and provides methods for examining and interpolating along the discovered stream.

positions

Position dictionary with keys (e.g., “x”, “y”, “z”) and values as 1D arrays of shape (n_obs,). These are the original positions from the input, not reordered.

Type: dict[str, Array]

velocities

Velocity dictionary with same keys and shape as positions. These are the original velocities from the input, not reordered.

Type: dict[str, Array]

indices

Ordered indices of visited observations. Shape (n_obs,). Unvisited observations are marked with -1.

The walk order can be extracted by filtering: indices[indices >= 0]. See ordering property for a convenience accessor.

Type: Int[Array, "n_obs"]

gamma_range

Valid range of the ordering parameter in __call__. Default is (0.0, 1.0). This is a static field and cannot be changed after construction.

Type: tuple[float, float]

Notes

The walk algorithm discovers a path through phase-space by following the local flow defined by the velocity field. The ordering encodes which observations form a coherent sequence along this path.

Key distinction: indices is an array of length n_obs where the position in the array indicates the order in the walk, and the value at that position is the original observation index. For example:

indices = [3, 7, 1, -1, 5, ...]
#          ^ 1st visited observation is index 3
#             ^ 2nd visited observation is index 7
#                ^ 3rd visited observation is index 1
#                   ^ 4th observation was not visited
#                      ^ 5th visited observation is index 5

Properties provide convenient access to:

visited: Boolean mask of visited observations
ordering: Indices in walk order (filtered non-negative)
ordered: Positions/velocities reordered by walk
skipped_indices: Indices of unvisited observations
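How these quantities relate to the raw `indices` array can be sketched with NumPy. This is one plausible reading, assuming the boolean mask is indexed by original observation number; the library may define the mask differently, and the variable names below are illustrative.

```python
import numpy as np

indices = np.array([3, 0, 1, -1, -1])  # walk over 5 observations; 2 never visited
n_obs = indices.shape[0]

# Walk order: original observation indices, with the -1 padding removed.
ordering = indices[indices >= 0]

# Assumed mask convention: visited[i] is True if observation i appears in the walk.
visited = np.zeros(n_obs, dtype=bool)
visited[ordering] = True

# Observations that never appear in the walk.
skipped = np.flatnonzero(~visited)

n_visited = int(visited.sum())
n_skipped = int((~visited).sum())
```

In this example the walk visits observations 3, 0 and 1 in that order, leaving observations 2 and 4 skipped.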

The interpolation method (__call__()) enables smooth spatial interpolation along the discovered path using a continuous ordering parameter $\gamma \in [0, 1]$.
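One simple way to picture interpolation along the path is piecewise-linear interpolation with the ordered points spaced uniformly in the ordering parameter. The sketch below is written under exactly that assumption; `interpolate_along_walk` is a hypothetical helper, and the scheme actually used by the library is not specified here.

```python
import numpy as np


def interpolate_along_walk(ordered_pos, gamma, gamma_range=(0.0, 1.0)):
    """Linearly interpolate each component at `gamma`.

    Assumes the ordered points sit at uniformly spaced values across gamma_range.
    """
    n = len(next(iter(ordered_pos.values())))
    knots = np.linspace(gamma_range[0], gamma_range[1], n)
    return {k: np.interp(gamma, knots, v) for k, v in ordered_pos.items()}


t = np.linspace(0.0, 2.0 * np.pi, 20)
ordered = {"x": np.linspace(0.0, 10.0, 20), "y": np.sin(t)}

mid = interpolate_along_walk(ordered, 0.5)       # halfway along the path
quarter = interpolate_along_walk(ordered, 0.25)  # a quarter of the way along
```

For evenly spaced data such as this, the midpoint lands at x = 5 and the quarter point at x = 2.5.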

Examples

Basic Usage: Extract Ordering and Properties

>>> import jax.numpy as jnp
>>> import phasecurvefit as pcf
>>> pos = {
...     "x": jnp.linspace(0, 10, 20),
...     "y": jnp.sin(jnp.linspace(0, 2 * 3.14159, 20)),
... }
>>> vel = {"x": jnp.ones(20), "y": jnp.cos(jnp.linspace(0, 2 * 3.14159, 20))}
>>> result = pcf.order(pos, vel)
>>> result.indices
Array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19], dtype=int32)
>>> result.n_visited
Array(20, dtype=int32)
>>> result.n_skipped
Array(0, dtype=int32)

Accessing Ordered Data

>>> qs_ordered, vs_ordered = result.ordered
>>> qs_ordered["x"].shape
(20,)

Spatial Interpolation with Gamma Parameter

The walk result can be called as a function to interpolate spatial positions from an ordering parameter $\gamma \in [0, 1]$:

>>> gamma = jnp.array([0.0, 0.5, 1.0])
>>> positions_interp = result(gamma)
>>> positions_interp["x"]
Array([ 0.,  5., 10.], dtype=float32)

Scalar Interpolation

>>> pos_at_midpoint = result(0.5)
>>> pos_at_midpoint["x"]
Array(5., dtype=float32)

JAX Transformations: JIT Compilation

The interpolator is JIT-compatible for efficient compilation:

>>> import jax
>>> @jax.jit
... def get_position(gamma):
...     return result(gamma)
>>> get_position(0.25)
{'x': Array(2.5, dtype=float32), 'y': Array(0.9897884, dtype=float32)}

JAX Transformations: Vectorization with vmap

Interpolate multiple gamma values efficiently:

>>> gamma_batch = jnp.linspace(0, 1, 100)
>>> @jax.jit
... def interpolate_many(gammas):
...     return jax.vmap(result)(gammas)
>>> positions_batch = interpolate_many(gamma_batch)
>>> positions_batch["x"].shape
(100,)

JAX Transformations: Automatic Differentiation

Compute gradients of positions with respect to the ordering parameter:

>>> def loss(gamma):
...     pos = result(gamma)
...     return jnp.sum(pos["x"] ** 2 + pos["y"] ** 2)
>>> grad_fn = jax.grad(loss)
>>> grad_at_half = grad_fn(0.5)

Composition: JIT + vmap + grad

Combine transformations for maximum efficiency:

>>> @jax.jit
... def compute_gradients(gammas):
...     return jax.vmap(jax.grad(loss))(gammas)
>>> compute_gradients(jnp.linspace(0, 1, 50))
Array([  0. , 5.6351056, 11.270211 , ...],  dtype=float32)

>>> result.visited.shape
(20,)

Parameters:

positions (dict[str, Union[Real[Array, 'N'], Real[ndarray, 'N'], Real[TypedNdArray, 'N']]])
velocities (dict[str, Union[Real[Array, 'N'], Real[ndarray, 'N'], Real[TypedNdArray, 'N']]])
indices (Int[Array, 'N'])
gamma_range (tuple[float, float])
backbone (dict[str, Union[Real[Array, 'N'], Real[ndarray, 'N'], Real[TypedNdArray, 'N']]] | None)

property all_visited: Bool[Array, '']: Whether all observations were visited (no skips).

backbone: VectorComponents | None = None

gamma_range: tuple[float, float] = (0.0, 1.0)

property n_skipped: Int[Array, '']: Number of observations that were not visited (skipped).

property n_visited: Int[Array, '']: Number of observations that were visited (not skipped).

property ordered: tuple[dict[str, Real[Array, 'N'] | Real[ndarray, 'N'] | Real[TypedNdArray, 'N']], dict[str, Real[Array, 'N'] | Real[ndarray, 'N'] | Real[TypedNdArray, 'N']]]: Positions and velocities ordered by the walk (visited only).

property ordering: Int[Array, 'n_visited']: Indices of visited observations in the order they were visited.

property skipped_indices: Int[Array, 'n_skipped']: Indices of skipped observations (marked as -1 in indices).

property visited: Bool[Array, 'N']: Boolean array indicating which observations were visited.

positions: VectorComponents

velocities: VectorComponents

indices: ISzN

phasecurvefit.ScalarComponents : TypeAlias = Mapping[str, FLikeSz0]#

Type alias for dictionaries mapping component names to scalar JAX arrays.

Used for single phase-space points. Keys are coordinate/component names (e.g., “x”, “y”, “z”), values are 0-dimensional JAX arrays.

Example:

position: ScalarComponents = {
    "x": jnp.array(1.0),
    "y": jnp.array(2.0),
}

phasecurvefit.VectorComponents : TypeAlias = Mapping[str, FLikeSzN]#

Type alias for dictionaries mapping component names to 1D JAX arrays.

Used for arrays of phase-space points. Keys are coordinate/component names (e.g., “x”, “y”, “z”), values are 1-dimensional JAX arrays of shape (N,).

Example:

position: VectorComponents = {
    "x": jnp.array([0.0, 1.0, 2.0]),
    "y": jnp.array([0.0, 1.0, 2.0]),
}

Autoencoder Module#

Neural network for interpolating skipped tracers. See Autoencoder Guide for details.

Classes#

class phasecurvefit.nn.PathAutoencoder(encoder: OrderingNet, decoder: AbstractTrackNet, normalizer: AbstractNormalizer, width: WidthNet | None = None)

Bases: AbstractAutoencoder

Autoencoder combining OrderingNet and TrackNet.

This autoencoder is trained to assign $\gamma$ values to stream tracers that were skipped by the phase-flow walk algorithm. It consists of two parts:

Interpolation Network: Maps phase-space coordinates $(x, v) \to (\gamma, p)$, where $\gamma \in [0, 1]$ is the ordering parameter and $p \in [0, 1]$ is the membership probability.
Param-Net (Decoder): Maps $\gamma \to x$, reconstructing the position from the ordering parameter.

Parameters:

encoder (OrderingNet)
decoder (AbstractTrackNet)
normalizer (AbstractNormalizer)
width (WidthNet | None)

encoder: OrderingNet

decoder: AbstractTrackNet

normalizer: AbstractNormalizer

width: WidthNet | None = None

Foreground half-width $\sigma(\gamma)$ (e.g. a stream’s), or None.

Present only when the autoencoder was trained with mixture-model membership (see MixtureMembershipConfig). It is kept on the model because it is needed at inference to turn the encoder’s prior membership $\pi$ into a calibrated posterior; see posterior_membership.

property gamma_range: tuple[float, float]: Return the gamma range from the encoder.

classmethod make(normalizer: AbstractNormalizer, *, gamma_range: tuple[float, float], ordering_width_size: int = 100, ordering_depth: int = 2, track_width_size: int = 128, track_depth: int = 3, decoder: AbstractTrackNet | None = None, key: Key[Array, ''] | UInt32[Array, '2'] | UInt32[Array, '4'])

Parameters:

normalizer (AbstractNormalizer)
gamma_range (tuple[float, float])
ordering_width_size (int)
ordering_depth (int)
track_width_size (int)
track_depth (int)
decoder (AbstractTrackNet | None)
key (Union[Key[Array, ''], UInt32[Array, '2'], UInt32[Array, '4']])

Return type:

PathAutoencoder

decode(gamma: Float[Array, 'N'], /, *, key: Key[Array, ''] | UInt32[Array, '2'] | UInt32[Array, '4'] | None = None)

Decode $\gamma$ to reconstructed position.

Parameters:

gamma (Array) – Ordering parameter, typically in the range defined by gamma_range, shape (N,). Some decoders may support extrapolation beyond this range.
key (PRNGKeyArray, optional) – JAX random key for stochastic decoding (if applicable).

Returns:

position – Reconstructed dict of positions.

Return type:

VectorComponents

Encode phase-space coordinates to ($\gamma$, $p$).

Parameters:

qs (VectorComponents) – Spatial / velocity coordinates of shape (N, n_dims).
ps (VectorComponents) – Spatial / velocity coordinates of shape (N, n_dims).
key (PRNGKeyArray, optional) – JAX random key for stochastic encoding (if applicable).

Return type:

tuple[Float[Array, 'N'], Float[Array, 'N']]

Returns:

gamma (Array) – Ordering parameter in [0, 1], shape (N,).
prob (Array) – Membership probability in [0, 1], shape (N,).

class phasecurvefit.nn.OrderingNet(in_size: int = 6, width_size: int = 100, depth: int = 2, *, gamma_range: tuple[float, float] = (0.0, 1.0), key: Key[Array, ''] | UInt32[Array, '2'] | UInt32[Array, '4'])

Bases: Module

Interpolation network: $(x, v) \mapsto (\gamma, p)$.

This network takes N-D phase-space coordinates and outputs:

$\gamma \in [0, 1]$: The ordering parameter along the stream
$p \in [0, 1]$: The membership probability (1 = likely stream member)

The architecture follows Appendix B.3 of Nibauer et al. (2022).

Uses scan-over-layers for improved compilation speed. See: https://docs.kidger.site/equinox/tricks/#improve-compilation-speed-with-scan-over-layers

Parameters:

in_size (int) – Number of spatial + kinematic dimensions (6 for 3D: x, y, z, vx, vy, vz).
width_size (int) – The size of each hidden layer.
depth (int, optional) –
The number of hidden layers, not including the input layer or output heads. For example, depth=2 results in a network with layers:

[Linear(in_size, width_size), Linear(width_size, width_size), Linear(width_size, out_size), (output_heads)]
key (PRNGKeyArray) – JAX random key for initialization.
gamma_range (tuple[float, float])
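The idea of a shared trunk feeding two bounded output heads can be sketched in NumPy. Everything here is an assumption made for illustration: the tanh activation, the sigmoid squashing, the initialization scale, and the simplified trunk (which does not reproduce the exact layer layout listed above). The names `init_params` and `forward` are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def init_params(rng, in_size=6, width=100, depth=2):
    """Random weights for a small trunk plus one linear head per output."""
    sizes = [in_size] + [width] * depth
    trunk = [
        (rng.normal(0.0, 0.1, (a, b)), np.zeros(b))
        for a, b in zip(sizes[:-1], sizes[1:])
    ]
    heads = {
        name: (rng.normal(0.0, 0.1, (width, 1)), np.zeros(1))
        for name in ("gamma", "prob")
    }
    return trunk, heads


def forward(params, w, gamma_range=(0.0, 1.0)):
    """Map phase-space rows w of shape (N, in_size) to (gamma, prob), each (N,)."""
    trunk, heads = params
    h = w
    for W, b in trunk:
        h = np.tanh(h @ W + b)  # assumed activation
    lo, hi = gamma_range
    # Sigmoid keeps both outputs bounded; gamma is rescaled into gamma_range.
    gamma = lo + (hi - lo) * sigmoid(h @ heads["gamma"][0] + heads["gamma"][1])[..., 0]
    prob = sigmoid(h @ heads["prob"][0] + heads["prob"][1])[..., 0]
    return gamma, prob


rng = np.random.default_rng(0)
params = init_params(rng)
w = rng.normal(size=(5, 6))  # five tracers in 6D phase space
gamma, prob = forward(params, w)
```

Bounding the outputs inside the network, rather than clipping afterwards, keeps both heads differentiable everywhere, which matters for gradient-based training.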

in_size: int

width_size: int

depth: int

gamma_range: tuple[float, float]

mlp: MLP

gamma_head: Linear

prob_head: Linear

class phasecurvefit.nn.TrackNet(out_size: int = 3, width_size: int = 100, depth: int = 3, *, key: Key[Array, ''] | UInt32[Array, '2'] | UInt32[Array, '4'])

Bases: AbstractTrackNet

Param-Net (decoder): maps $\gamma \to$ position (x, y, z).

This network reconstructs the stream track position from the ordering parameter $\gamma$. It serves as the second half of the autoencoder.

The architecture follows Appendix B.1 of Nibauer et al. (2022).

Uses scan-over-layers for improved compilation speed. See: https://docs.kidger.site/equinox/tricks/#improve-compilation-speed-with-scan-over-layers

Parameters:

key (PRNGKeyArray) – JAX random key for initialization.
out_size (int) – Number of spatial dimensions (2 for 2D, 3 for 3D) for the track speed.
width_size (int, optional) – Size of hidden layers. Default: 100.
depth (int, optional) – Number of hidden layers. Default: 3.

out_size: int

width_size: int

depth: int

mlp: MLP

class phasecurvefit.nn.TrainingConfig(*, optimizer: ~optax._src.base.GradientTransformation = (<function chain.<locals>.init_fn>, <function chain.<locals>.update_fn>), batch_size: int = 100, show_pbar: bool = True, member_threshold: float = 0.5, n_epochs_encoder: int = 800, lambda_prob: float = 1.0, n_epochs_decoder: int = 100, n_epochs_both: int = 200, lambda_q: float = 1.0, lambda_p: tuple[float, float] = (1.0, 5.0), weight_by_density: bool | ~collections.abc.Mapping[str, object] = False, freeze_encoder_final_training: bool = False, membership: ~phasecurvefit._src.nn.membership.MixtureMembershipConfig | None = None)

Bases: object

Configuration for three-phase autoencoder training.

Parameters:

optimizer (GradientTransformation)
batch_size (int)
show_pbar (bool)
member_threshold (float)
n_epochs_encoder (int)
lambda_prob (float)
n_epochs_decoder (int)
n_epochs_both (int)
lambda_q (float)
lambda_p (tuple[float, float])
weight_by_density (bool | Mapping[str, object])
freeze_encoder_final_training (bool)
membership (MixtureMembershipConfig | None)

optimizer: GradientTransformation = (<function chain.<locals>.init_fn>, <function chain.<locals>.update_fn>): Optax optimizer for training.

batch_size: int = 100: Batch size for training.

show_pbar: bool = True: Show an epoch progress bar via tqdm.

member_threshold: float = 0.5: Membership p > threshold for identifying stream members.

n_epochs_encoder: int = 800: Number of epochs for Phase 1 training (OrderingNet)

lambda_prob: float = 1.0: Weight for probability loss terms.

n_epochs_decoder: int = 100: Number of epochs for Phase 2 training (TrackNet)

n_epochs_both: int = 200: Number of epochs for Phase 3 training (joint encoder + decoder)

lambda_q: float = 1.0: Weight for phase-2 spatial training.

lambda_p: tuple[float, float] = (1.0, 5.0): Weight range for phase-2 velocity training.

weight_by_density: bool | Mapping[str, object] = False: Whether to inverse density weight the samples. USE WITH CARE.

freeze_encoder_final_training: bool = False: Whether to freeze the encoder during phase 2 training.

membership: MixtureMembershipConfig | None = None

Opt in to mixture-model membership (outlier rejection).

None (the default) preserves the existing behaviour exactly. Supply a MixtureMembershipConfig to model the field as a stream + background mixture (Hogg, Bovy & Lang 2010, §3) so that membership is a calibrated posterior driven by the distance from the fitted track.

Applies to phase 3 (joint encoder+decoder training), which is where the decoder – and therefore the residual – exists.
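The stream + background mixture idea can be illustrated with a generic two-component posterior. The sketch below assumes a one-dimensional Gaussian foreground in the residual from the track and a constant background density; those choices, the function name `posterior_membership_sketch`, and all argument names are assumptions for illustration and may differ from what the library does.

```python
import math


def posterior_membership_sketch(prior, residual, sigma, background_density):
    """Posterior probability of stream membership in a two-component mixture.

    prior: prior membership probability (pi).
    residual: distance of the tracer from the fitted track.
    sigma: foreground half-width at the tracer's location.
    background_density: assumed constant density of the background component.
    """
    gauss = math.exp(-0.5 * (residual / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
    foreground = prior * gauss
    background = (1.0 - prior) * background_density
    return foreground / (foreground + background)


p_near = posterior_membership_sketch(0.5, 0.0, 1.0, 0.1)  # on the track
p_far = posterior_membership_sketch(0.5, 5.0, 1.0, 0.1)   # five widths away
```

With the same prior, a tracer on the track gets a posterior close to 0.8 while one five widths away gets a value near zero, which is the outlier-rejection behaviour the mixture is meant to provide.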

property n_epochs: int: Return the total number of epochs.

encoderonly_config()

Construct the OrderingNet config.

Return type: OrderingTrainingConfig

decoderonly_config()

Construct the TrackNet config.

Return type:: TrackTrainingConfig

autoencoder_config()

Construct the Autoencoder config.

Return type:: EncoderDecoderTrainingConfig

Training Functions#

phasecurvefit.nn.train_autoencoder(model: PathAutoencoder, all_ws: Float[Array, 'N TwoF'], ordering_indices: Int[Array, 'N'], /, *, config: TrainingConfig | None = None, key: Key[Array, ''] | UInt32[Array, '2'] | UInt32[Array, '4'])

Train the PathAutoencoder in two phases.

This function orchestrates the complete two-phase training procedure:

Phase 1 (OrderingNet/Encoder): Trains the encoder to predict $\gamma$ (ordering parameter) and $p$ (membership probability) from phase-space coordinates. Uses the ordering from the walk algorithm as supervision.

Phase 2 (TrackNet/Decoder): Trains the decoder to reconstruct spatial positions from $\gamma$ while aligning with velocity directions. Uses the trained encoder to filter stream members based on the membership probability threshold.

Parameters:

model (PathAutoencoder | EncoderExternalDecoder) – The autoencoder model to train; may be untrained or partially trained. In the EncoderExternalDecoder overload, its encoder is updated.
all_ws (Array, shape (N, 2*n_dims)) – All phase-space coordinates (positions + velocities), in normalized form.
ordering_indices (Array, shape (N,)) – Ordering indices from the walk algorithm. Valid indices (>= 0) indicate ordered tracers; -1 indicates skipped/unordered tracers.
config (TrainingConfig | OrderingTrainingConfig, optional) – Training configuration: the complete configuration for all phases (PathAutoencoder) or the encoder configuration (EncoderExternalDecoder). If None (default), uses the default configuration.
decoder_kwargs (Mapping, optional) – Keyword arguments passed to decoder function creation. For the running-mean decoder, can include 'window_size'. If None, uses defaults.
key (PRNGKeyArray) – Random key for training (split internally for each phase).
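
The -1 convention for ordering_indices can be made concrete with a short sketch. It assumes, following the description of order_w, that the array lists tracer indices in visit order and pads the unfilled slots with -1. The helper name split_ordering is hypothetical and is not part of phasecurvefit.

```python
def split_ordering(ordering_indices):
    """Split a walk ordering into visited and skipped tracer indices.

    Assumes entries >= 0 are tracer indices in visit order and -1 marks
    a slot the walk never filled (hypothetical helper, not library API).
    """
    n = len(ordering_indices)
    visited = [int(i) for i in ordering_indices if i >= 0]
    seen = set(visited)
    skipped = [i for i in range(n) if i not in seen]
    return visited, skipped


# Five tracers; the walk visited 2 -> 0 -> 3 and never reached 1 or 4.
visited, skipped = split_ordering([2, 0, 3, -1, -1])
print(visited)  # [2, 0, 3]
print(skipped)  # [1, 4]
```

The skipped tracers are the ones a trained encoder can later place on the ordering, which is the job of fill_ordering_gaps.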

Return type:

tuple[AutoencoderResult, dict[str, PyTree], Float[Array, '{config.n_epochs}']]

Returns:

result (AutoencoderResult) – Result containing the fully trained autoencoder and ordering data.
opt_states (dict[str, optax.OptState]) – Dictionary with ‘encoder’, ‘decoder’ and ‘both’ optimizer states.
losses (Array, shape (n_epochs_encoder + n_epochs_both,)) – Concatenated training losses from both phases.

Overloads:

train_autoencoder(model: AbstractAutoencoder, walk_results: OrderingResult, /, *, config: TrainingConfig | None = None, key: Key[Array, ''] | UInt32[Array, '2'] | UInt32[Array, '4']) -> tuple[AutoencoderResult, dict[str, PyTree], Float[Array, '{config.n_epochs}']]

train_autoencoder(model: EncoderExternalDecoder, all_ws: Float[Array, 'N TwoF'], ordering_indices: Int[Array, 'N'], /, *, config: OrderingTrainingConfig | TrainingConfig | None = None, key: Key[Array, ''] | UInt32[Array, '2'] | UInt32[Array, '4']) -> tuple[AutoencoderResult, dict[str, PyTree], Float[Array, '{config.n_epochs}']]

Train the EncoderExternalDecoder encoder and create a running-mean decoder.

This overload provides a simplified training workflow:

1. Train the encoder (OrderingNet) using supervised learning from ordering indices.
2. Create a running-mean decoder using the trained encoder and training data.

Unlike train_autoencoder for PathAutoencoder, this does not train a decoder network. Instead, it uses the provided (or default) decoder function.

Returns:

result (AutoencoderResult) – Result containing the trained autoencoder and ordering data.
opt_state (dict[str, PyTree]) – Optimizer state from encoder training (wrapped in dict for consistency).
losses (Array, shape (n_epochs,)) – Training losses from encoder training.

Examples

>>> import jax.numpy as jnp
>>> import jax.random as jr
>>> import phasecurvefit as pcf

>>> key = jr.key(0)
>>> N = 50
>>> positions = {"x": jnp.arange(N, dtype=float), "y": jnp.zeros(N)}
>>> velocities = {"x": jnp.ones(N), "y": jnp.zeros(N)}
>>> ordering = jnp.arange(N)

>>> model = pcf.nn.EncoderExternalDecoder(
...     pcf.nn.OrderingNet(in_size=4, width_size=32, depth=2, key=jr.key(1)),
...     pcf.nn.RunningMeanDecoder(window_size=0.05),
...     pcf.nn.StandardScalerNormalizer(positions, velocities),
... )

Train (with minimal epochs for demonstration)

>>> qs_norm, ps_norm = model.normalizer.transform(positions, velocities)
>>> ws_norm = jnp.concat([qs_norm, ps_norm], axis=1)
>>> config = pcf.nn.OrderingTrainingConfig(
...     n_epochs=10, batch_size=16, show_pbar=False
... )
>>> result, opt_state, losses = pcf.nn.train_autoencoder(
...     model, ws_norm, ordering, config=config, key=jr.key(2)
... )
>>> losses.shape
(10,)

phasecurvefit.nn.fill_ordering_gaps(model: AbstractAutoencoder, result: AbstractResult, /, prob_threshold: float = 0.5)

Use trained autoencoder to fill gaps in phase-flow walk ordering.

This function predicts $\gamma$ values for all tracers (including those skipped by phase-flow walk) and returns a complete ordering.

Parameters:

model (PathAutoencoder) – Trained autoencoder model.
result (AbstractResult) – Result from pcf.order (or the deprecated walk_local_flow).
prob_threshold (float, optional) – Minimum membership probability to include. Default: 0.5.

Returns:

result – Complete ordering including previously skipped tracers.

Return type:

AutoencoderResult

Examples

>>> import jax
>>> import jax.numpy as jnp
>>> import phasecurvefit as pcf

>>> pos = {"x": jnp.linspace(0, 5, 20), "y": jnp.zeros(20)}
>>> vel = {"x": jnp.ones(20), "y": jnp.zeros(20)}
>>> result = pcf.order(pos, vel)
>>> keys = jax.random.split(jax.random.key(0), 2)
>>> normalizer = pcf.nn.StandardScalerNormalizer(pos, vel)
>>> model = pcf.nn.PathAutoencoder.make(
...     normalizer, gamma_range=result.gamma_range, key=keys[0]
... )
>>> cfg = pcf.nn.TrainingConfig(show_pbar=False)
>>> result, *_ = pcf.nn.train_autoencoder(model, result, config=cfg, key=keys[1])

Membership & Outlier Rejection#

Mixture-model membership, after Hogg, Bovy & Lang (2010), §3. See Outlier rejection.

class phasecurvefit.nn.MixtureMembershipConfig(*, sigma_init: float = 0.1, sigma_min: float = 0.001, sigma_ceiling: tuple[float, float] = (0.5, 0.1), warmup_frac: float = 0.3, background_density: float | None = None, background_inflate: float = 1.0, width_size: int = 32, depth: int = 2, lambda_velocity: float = 1.0)

Bases: object

Configuration for mixture-model membership (outlier rejection).

Pass an instance of this to phasecurvefit.nn.EncoderDecoderTrainingConfig.membership (or TrainingConfig.membership) to swap the classifier-style membership loss for the generative mixture model of Hogg, Bovy & Lang (2010), §3.

Leaving membership=None (the default) preserves the existing behaviour exactly.

sigma_init

Stream half-width the WidthNet is initialised to predict. Set it to your best guess at the stream width, in the normalised coordinates the networks see (StandardScalerNormalizer makes this roughly “in units of the field’s standard deviation”).

Type:: float

sigma_min

Hard floor on the width, guarding the collapse-onto-a-single-star degeneracy that afflicts every mixture likelihood.

Type:: float

sigma_ceiling

(start, stop) for the annealing ceiling on the width, applied geometrically across epochs; see sigma_ceiling. start should be a few times the expected width, stop of order the width. This is one of the two knobs that matter: with no ceiling, a free width will inflate to swallow nearby outliers rather than reject them.

Type:: tuple[float, float]

warmup_frac

Fraction of training spent ramping the membership term in; see membership_rampup. This is the other knob that matters: with no warm-up, the model can collapse to “everything is background” before the track has fitted, and never recover. Set to 0 only if you are supplying an already-good track.

Type:: float

background_density

$\rho_{\mathrm{bg}}$. If None (default), computed from the extent of the training positions via uniform_background_density.

Type:: float | None

background_inflate

Passed to uniform_background_density when background_density is None. Increase if the observed stars do not fill the survey footprint.

Type:: float

width_size, depth

Architecture of the WidthNet.

Type:: int

lambda_velocity

Weight on the velocity-alignment term, which is reweighted by the posterior responsibility so that outliers cannot drag the track’s tangent. Set to 0 to disable the velocity term entirely.

Type:: float

Examples

>>> import phasecurvefit as pcf

Enable outlier rejection with a stream you expect to be ~0.1 wide:

>>> membership = pcf.nn.MixtureMembershipConfig(
...     sigma_init=0.1, sigma_ceiling=(0.5, 0.1)
... )
>>> cfg = pcf.nn.TrainingConfig(membership=membership, show_pbar=False)
>>> cfg.membership.sigma_init
0.1

The default is None, i.e. the legacy classifier membership:

>>> pcf.nn.TrainingConfig().membership is None
True

Parameters:

sigma_init (float)
sigma_min (float)
sigma_ceiling (tuple[float, float])
warmup_frac (float)
background_density (float | None)
background_inflate (float)
width_size (int)
depth (int)
lambda_velocity (float)

sigma_init: float = 0.1

sigma_min: float = 0.001

sigma_ceiling: tuple[float, float] = (0.5, 0.1)

warmup_frac: float = 0.3

background_density: float | None = None

background_inflate: float = 1.0

width_size: int = 32

depth: int = 2

lambda_velocity: float = 1.0

make_width_net(*, key: Key[Array, ''] | UInt32[Array, '2'] | UInt32[Array, '4'])

Build the WidthNet this config describes.

Parameters:: key (Union[Key[Array, ''], UInt32[Array, '2'], UInt32[Array, '4']])
Return type:: WidthNet

resolve_background_density(qs: Real[Array, 'N D'], /)

Return the configured background density, or derive it from qs.

Parameters:: qs (Real[Array, 'N D'])
Return type:: Float[Array, ''] | float

class phasecurvefit.nn.WidthNet(width_size: int = 32, depth: int = 2, *, sigma_init: float = 0.1, sigma_min: float = 0.001, key: Key[Array, ''] | UInt32[Array, '2'] | UInt32[Array, '4'])

Bases: Module

Width $\sigma$ of the foreground component along the ordering parameter.

The foreground density need not have a uniform width. A stream, for example, is typically narrow near the progenitor and fans out towards the tidal tails; the same is true of many curved distributions. A single scalar $\sigma$ forces one compromise, which is both a worse fit and – more importantly – a worse outlier detector, because the compromise width is too generous wherever the stream is genuinely thin.

This is a small MLP $\gamma \mapsto \log \sigma_{\mathrm{raw}}(\gamma)$, exponentiated to guarantee positivity and then capped from above by an annealing ceiling (see sigma_ceiling):

\[\sigma(\gamma, t) = \min\!\big( \sigma_{\mathrm{raw}}(\gamma),\; \sigma_{\mathrm{ceil}}(t) \big)\]

The floor is enforced by construction (exp is positive) plus sigma_min, which keeps the Gaussian from collapsing onto individual stars – the classic degenerate maximum of any mixture likelihood, where one component shrinks to zero width around a single point and the likelihood diverges.

Parameters:

width_size (int) – Hidden-layer size of the MLP.
depth (int) – Number of hidden layers of the MLP.
sigma_init (float) – Width the network is initialised to predict, everywhere. Choose something comparable to the expected stream width; it only sets the starting point.
sigma_min (float) – Hard floor on the returned width. Guards the collapse-onto-a-star degeneracy. Should be well below the true stream width (e.g. the typical positional uncertainty).
key (PRNGKeyArray) – JAX random key for initialisation.

Examples

>>> import jax.numpy as jnp
>>> import jax.random as jr
>>> import phasecurvefit as pcf

>>> width = pcf.nn.WidthNet(sigma_init=0.2, key=jr.key(0))

At initialisation the predicted width is close to sigma_init everywhere:

>>> sigma = width(jnp.array(0.3))
>>> bool(0.1 < sigma < 0.4)
True

It is a scalar-in, scalar-out function, so vmap it over a batch:

>>> import jax
>>> gammas = jnp.linspace(-1.0, 1.0, 5)
>>> jax.vmap(width)(gammas).shape
(5,)

Widths are strictly positive, always:

>>> bool(jnp.all(jax.vmap(width)(gammas) > 0))
True
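
The bounding of the width described above can be sketched without the network. This is a minimal illustration only: it assumes the floor is applied as a plain lower clamp at sigma_min (the text says sigma_min is a hard floor but not how it is combined with the exponential), and clamp_width is a hypothetical name rather than anything exported by phasecurvefit.

```python
import math


def clamp_width(log_sigma_raw, sigma_min, sigma_ceil):
    """Turn a raw log-width into a bounded width.

    exp() guarantees positivity, sigma_ceil caps the width from above,
    and sigma_min is applied as a lower clamp (an assumption).
    """
    sigma_raw = math.exp(log_sigma_raw)
    return max(sigma_min, min(sigma_raw, sigma_ceil))


inside = clamp_width(math.log(0.2), 0.001, 0.5)  # about 0.2, untouched
capped = clamp_width(math.log(3.0), 0.001, 0.5)  # held at the ceiling
floored = clamp_width(-50.0, 0.001, 0.5)         # held at the floor
print(inside, capped, floored)
```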

sigma_min: float

width_size: int

depth: int

mlp: MLP

phasecurvefit.nn.posterior_membership(model: PathAutoencoder, ws: Float[Array, 'N TwoF'], /, *, background_density: float | None = None, key: Key[Array, ''] | UInt32[Array, '2'] | UInt32[Array, '4'] | None = None)

Posterior probability that each point belongs to the foreground component.

The foreground is whatever the model fits a track through – a stream is the running example. This is the number you want when deciding which points are members. It is not the encoder’s raw prob output: that is only the prior $\pi_n$, formed from the star’s phase-space coordinates before the model has looked at how far the star actually landed from the fitted track. This function folds in the residual, giving the posterior of Hogg, Bovy & Lang (2010), §3:

\[\hat{q}_n = \frac{\pi_n \, \mathcal{N}(r_n; 0, \sigma^2(\gamma_n))} {\pi_n \, \mathcal{N}(r_n; 0, \sigma^2(\gamma_n)) + (1 - \pi_n) \, \rho_{\mathrm{bg}}}\]

Requires a model trained with MixtureMembershipConfig (so that model.width exists).

Parameters:

model (PathAutoencoder) – A model trained with mixture membership.
ws (Array, shape (N, 2D)) – Phase-space coordinates, in the same normalised frame used for training.
background_density (float | None) – $\rho_{\mathrm{bg}}$. If None, recomputed from the extent of ws. Pass the same value used at training time if you are scoring a different field from the one you trained on – otherwise the posterior is calibrated against the wrong background.
key (PRNGKeyArray | None) – Optional key for the (deterministic) networks.

Returns:

responsibility – Posterior membership in [0, 1]. Threshold at 0.5 for a hard cut, or keep it as a weight – Hogg et al. recommend the latter.

Return type:

Array, shape (N,)

Raises:

ValueError – If the model was not trained with mixture membership.

phasecurvefit.nn.mixture_membership_loss(qs_meas: Float[Array, 'N D'], qs_pred: Float[Array, 'N D'], prob: Float[Array, 'N'], sigma: Float[Array, 'N'], mask: Bool[Array, 'N'], /, *, log_bg_density: float, rampup: Float[Array, ''] | float = 1.0)

Negative log-likelihood of the stream/background mixture.

Implements the “mixture” model of Hogg, Bovy & Lang (2010), §3, with the bad-data fraction amortised into a per-star, network-predicted $\pi_n$:

\[-\log \mathcal{L} = -\frac{1}{N} \sum_n \log\!\left[ \pi_n \, \mathcal{N}\!\left(q_n; x_\theta(\gamma_n), \sigma_n^2 \mathbb{I}\right) + (1 - \pi_n) \, \rho_{\mathrm{bg}} \right]\]

Minimising this simultaneously fits the track ($x_\theta$), the width ($\sigma$), and the membership ($\pi$) – and the last of these is now driven by the residual, which is the entire point.

Evaluated with jax.nn.logsumexp semantics (via jnp.logaddexp) so that a star which is implausible under both components – very far from the track and in a low-density corner – does not underflow to -inf.

Parameters:

qs_meas (Array, shape (N, D)) – Observed positions.
qs_pred (Array, shape (N, D)) – Decoded track positions, $x_\theta(\gamma_n)$.
prob (Array, shape (N,)) – Encoder membership output $\pi_n$, in [0, 1].
sigma (Array, shape (N,)) – Stream half-width at each star’s $\gamma$. Must be positive.
mask (Array, shape (N,)) – True for real, usable stars; False for padding or ignorable rows. Only masked-in stars contribute.
log_bg_density (float) – $\log \rho_{\mathrm{bg}}$; see uniform_background_density.
rampup (Array, shape () | float, optional) – Membership ramp $w \in [0, 1]$ from membership_rampup. The effective membership is $\pi^{\mathrm{eff}} = 1 - w (1 - \pi)$, so w=0 gives a pure reconstruction NLL and w=1 the full mixture. Defaults to 1 (no warm-up); see membership_rampup for why you almost certainly want to ramp.

Return type:

tuple[Float[Array, ''], Float[Array, 'N']]

Returns:

loss (Array, shape ()) – Mean negative log-likelihood over the masked-in stars.
responsibility (Array, shape (N,)) – Posterior membership $\hat{q}_n$ for every star (including masked-out ones, whose values are meaningless). Returned because the caller almost always wants it – to weight the velocity-alignment term, and to report membership – and recomputing it would duplicate the whole forward pass.

Computed from the effective membership, so during warm-up it is (correctly) close to 1 everywhere.

Examples

>>> import jax.numpy as jnp
>>> from phasecurvefit.nn import mixture_membership_loss

Two stars on the track and one far off it. With a confident prior, the on-track stars get responsibility ~1 and the outlier ~0:

>>> qs_meas = jnp.array([[0.0, 0.0], [1.0, 0.0], [0.0, 9.0]])
>>> qs_pred = jnp.array([[0.0, 0.0], [1.0, 0.0], [0.0, 0.0]])
>>> prob = jnp.array([0.9, 0.9, 0.9])
>>> sigma = jnp.array([0.1, 0.1, 0.1])
>>> mask = jnp.ones(3, dtype=bool)
>>> loss, resp = mixture_membership_loss(
...     qs_meas, qs_pred, prob, sigma, mask, log_bg_density=float(jnp.log(0.01))
... )
>>> bool(resp[0] > 0.99), bool(resp[1] > 0.99), bool(resp[2] < 0.01)
(True, True, True)

The loss is finite even for the wildly-discrepant star:

>>> bool(jnp.isfinite(loss))
True
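
The following standard-library sketch mirrors the formula above for a handful of stars, to show how the log-sum-exp evaluation and the rampup interact. It is not the library's implementation: it takes squared residuals directly instead of positions, omits the mask, assumes an isotropic D-dimensional Gaussian, and every function name in it is hypothetical.

```python
import math


def log_gaussian(r2, sigma, n_dims):
    """Log density of an isotropic n_dims-dimensional Gaussian at squared residual r2."""
    return (-0.5 * r2 / sigma**2 - n_dims * math.log(sigma)
            - 0.5 * n_dims * math.log(2.0 * math.pi))


def safe_log(x):
    """log(x), returning -inf at x == 0 instead of raising."""
    return math.log(x) if x > 0.0 else -math.inf


def logaddexp(a, b):
    """log(exp(a) + exp(b)) without overflow or underflow."""
    hi, lo = max(a, b), min(a, b)
    if hi == -math.inf:
        return -math.inf
    return hi + math.log1p(math.exp(lo - hi))


def mixture_nll(r2s, probs, sigmas, log_bg_density, n_dims, rampup=1.0):
    """Mean negative log-likelihood of the stream/background mixture."""
    total = 0.0
    for r2, prob, sigma in zip(r2s, probs, sigmas):
        prob_eff = 1.0 - rampup * (1.0 - prob)  # effective membership
        log_fg = safe_log(prob_eff) + log_gaussian(r2, sigma, n_dims)
        log_bg = safe_log(1.0 - prob_eff) + log_bg_density
        total -= logaddexp(log_fg, log_bg)
    return total / len(r2s)


# Same setup as the example above: two on-track stars, one 9 units away.
r2s, probs, sigmas = [0.0, 0.0, 81.0], [0.9, 0.9, 0.9], [0.1, 0.1, 0.1]
full = mixture_nll(r2s, probs, sigmas, math.log(0.01), 2)
warm = mixture_nll(r2s, probs, sigmas, math.log(0.01), 2, rampup=0.0)
print(full, warm)
```

With rampup=0 the background term drops out and the outlier is charged its full Gaussian penalty, so warm is far larger than full; with rampup=1 the background component absorbs it.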

phasecurvefit.nn.membership_responsibility(prob: Float[Array, 'N'], r2: Float[Array, 'N'], sigma: Float[Array, 'N'], /, *, log_bg_density: Float[Array, ''] | float, n_dims: int)

Posterior probability that each star is a stream member.

This is the E-step of the mixture model: given the current track, width, and prior membership $\pi_n$, the posterior membership of star $n$ is

\[\hat{q}_n = \frac{\pi_n \, \mathcal{N}(r_n; 0, \sigma_n^2)} {\pi_n \, \mathcal{N}(r_n; 0, \sigma_n^2) + (1 - \pi_n)\, \rho_{\mathrm{bg}}}\]

which is Hogg, Bovy & Lang (2010)’s $q_i$ after marginalisation – the quantity they recommend reporting instead of a hard cut.

Note the distinction from prob: $\pi_n$ is what the encoder believes from the star’s phase-space coordinates alone, before seeing how far it actually landed from the track. $\hat{q}_n$ folds in the residual. The responsibility is the number you want when identifying members; prob is only its prior.

Parameters:

prob (Array, shape (N,)) – Encoder membership output $\pi_n$, in [0, 1].
r2 (Array, shape (N,)) – Squared residual $\lVert q_n - x_\theta(\gamma_n) \rVert^2$.
sigma (Array, shape (N,)) – Stream half-width at each star’s $\gamma$.
log_bg_density (float) – $\log \rho_{\mathrm{bg}}$.
n_dims (int) – Spatial dimensionality $D$.

Returns:

responsibility – Posterior membership in [0, 1].

Return type:

Array, shape (N,)

Examples

>>> import jax.numpy as jnp
>>> from phasecurvefit.nn import membership_responsibility

A star sitting exactly on the track, with a flat prior, is confidently a member; one far away is confidently not:

>>> prob = jnp.array([0.5, 0.5])
>>> r2 = jnp.array([0.0, 100.0])  # on the track / far off it
>>> sigma = jnp.array([0.1, 0.1])
>>> q = membership_responsibility(
...     prob, r2, sigma, log_bg_density=float(jnp.log(0.01)), n_dims=2
... )
>>> bool(q[0] > 0.99), bool(q[1] < 0.01)
(True, True)
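
For a single star, the posterior above reduces to a logistic function of the log-odds between the two components. The sketch below works this out with the standard library only; it assumes an isotropic D-dimensional Gaussian for the stream term and a prior strictly between 0 and 1. The function name responsibility is hypothetical and this is not the library's code.

```python
import math


def responsibility(prob, r2, sigma, log_bg_density, n_dims):
    """Posterior stream membership of one star, evaluated in log space."""
    log_gauss = (-0.5 * r2 / sigma**2 - n_dims * math.log(sigma)
                 - 0.5 * n_dims * math.log(2.0 * math.pi))
    # Log-odds of "stream" versus "background"; requires 0 < prob < 1.
    z = math.log(prob) - math.log1p(-prob) + log_gauss - log_bg_density
    # Numerically stable logistic function.
    if z >= 0.0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


log_rho = math.log(0.01)
on_track = responsibility(0.5, 0.0, 0.1, log_rho, 2)
far_off = responsibility(0.5, 100.0, 0.1, log_rho, 2)
print(on_track > 0.99, far_off < 0.01)  # True True
```

Working with the log-odds avoids forming the Gaussian density itself, which underflows to zero long before the ratio stops being meaningful.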

phasecurvefit.nn.sigma_ceiling(epoch_idx: Int[Array, ''] | int, num_epochs: int, /, *, start: float, stop: float)

Annealing ceiling on the stream width, geometric from start to stop.

The mixture likelihood has a well-known failure mode: rather than lowering the membership of a few nearby outliers, it can simply widen the stream component until they are explained. The likelihood genuinely prefers this, so no amount of training fixes it – it is not a convergence problem.

Annealing removes the option. Early on the ceiling is generous, which is what we want: the decoded track is still garbage, so every star is far from it, and an aggressive width would reject the entire stream. As the track sharpens, the ceiling is squeezed and stars that stay far from the track are progressively forced to explain themselves as background.

The schedule is geometric (linear in $\log \sigma$), which is the natural choice for a scale parameter.

Parameters:

epoch_idx (Array, shape () | int) – Current epoch, in [0, num_epochs). May be a tracer.
num_epochs (int) – Total number of epochs. Must be static.
start (float) – Ceiling at the first epoch. Should be comfortably larger than the expected stream width (a few times is fine).
stop (float) – Ceiling at the last epoch. Should be of order the stream width.

Returns:

ceiling – The width ceiling for this epoch.

Return type:

Array, shape ()

Examples

>>> import jax.numpy as jnp
>>> from phasecurvefit.nn import sigma_ceiling

The ceiling starts at start and ends at stop:

>>> float(sigma_ceiling(0, 5, start=1.0, stop=0.1))
1.0
>>> round(float(sigma_ceiling(4, 5, start=1.0, stop=0.1)), 6)
0.1

…and is geometric in between, so the midpoint is the geometric mean:

>>> round(float(sigma_ceiling(2, 5, start=1.0, stop=0.01)), 6)
0.1

A single-epoch run just uses start:

>>> float(sigma_ceiling(0, 1, start=1.0, stop=0.1))
1.0
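
A geometric schedule consistent with the examples above can be written in a few lines. This is a sketch under the assumption that the interpolation fraction is epoch_idx / (num_epochs - 1); the name geometric_ceiling is hypothetical and the code is plain Python, not the traced JAX implementation.

```python
def geometric_ceiling(epoch_idx, num_epochs, start, stop):
    """Interpolate from start to stop linearly in log-space."""
    if num_epochs <= 1:
        return start  # a single-epoch run just uses start
    frac = epoch_idx / (num_epochs - 1)
    return start * (stop / start) ** frac


# Ceiling per epoch for a 5-epoch run from 1.0 down to 0.01.
schedule = [geometric_ceiling(i, 5, 1.0, 0.01) for i in range(5)]
print([round(s, 6) for s in schedule])
```

Each epoch multiplies the ceiling by the same factor, which is what makes the midpoint the geometric mean of start and stop.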

phasecurvefit.nn.membership_rampup(epoch_idx: Int[Array, ''] | int, num_epochs: int, /, *, warmup_frac: float)

Ramp the membership term in from zero: 0 during warm-up, 1 afterwards.

Why this is not optional#

The mixture likelihood has a second degenerate optimum, and it is worse than the width-inflation one because it is self-reinforcing.

At initialisation the decoded track is meaningless, so every star has a huge residual. The likelihood’s cheapest move is to declare the whole dataset background: push $\pi_n \to 0$ for all $n$. But once $\pi_n = 0$ the stream component carries no weight, so no gradient reaches the decoder – the track is free to drift anywhere, residuals grow further, and $\pi$ is pinned at zero forever. Training collapses to “there is no stream”. We have measured this: with outliers at 20 stream-widths, an un-ramped mixture drove the median inlier residual to ~8 (stream width 0.15) and rejected all 400 genuine members.

The cure is the standard EM initialisation: fit the track first, with membership effectively pinned at 1, and only then let the model start disowning stars. Concretely we use

\[\pi^{\mathrm{eff}}_n = 1 - w(t)\,(1 - \pi_n)\]

so that at $w = 0$ the loss degenerates to the plain Gaussian reconstruction NLL (the background term vanishes), and at $w = 1$ it is the full mixture. Gradients still flow into $\pi$ during warm-up (through the $w \pi_n$ term); they simply cannot yet act on the reconstruction.

Together with sigma_ceiling this brackets the problem from both sides: the ramp stops membership collapsing before the track exists, and the ceiling stops the width inflating once it does.

Parameters:

epoch_idx (Array, shape () | int) – Current epoch, in [0, num_epochs). May be a tracer.
num_epochs (int) – Total number of epochs. Must be static.
warmup_frac (float) – Fraction of training spent ramping, in [0, 1). w rises linearly from 0 to 1 across the first warmup_frac * (num_epochs - 1) epochs and stays at 1 thereafter. 0 disables the ramp (full mixture from epoch zero) – don’t, unless you have a good reason.

Returns:

w – Ramp value in [0, 1].

Return type:

Array, shape ()

Examples

>>> from phasecurvefit.nn import membership_rampup

With a 50% warm-up the mixture is fully on by the halfway point:

>>> float(membership_rampup(0, 10, warmup_frac=0.5))
0.0
>>> float(membership_rampup(5, 10, warmup_frac=0.5))
1.0
>>> float(membership_rampup(9, 10, warmup_frac=0.5))
1.0

…and rises linearly in between:

>>> round(float(membership_rampup(2, 10, warmup_frac=0.5)), 3)
0.444

warmup_frac=0 means no warm-up at all:

>>> float(membership_rampup(0, 10, warmup_frac=0.0))
1.0
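
The ramp and the effective membership it produces can be reproduced in plain Python. The sketch assumes the linear form stated for warmup_frac, with the ramp lasting warmup_frac * (num_epochs - 1) epochs; linear_rampup and effective_membership are hypothetical names, not the library's functions.

```python
def linear_rampup(epoch_idx, num_epochs, warmup_frac):
    """Ramp weight w: 0 at epoch 0, rising linearly to 1, then flat."""
    ramp_epochs = warmup_frac * (num_epochs - 1)
    if ramp_epochs <= 0.0:
        return 1.0  # no warm-up: full mixture from epoch zero
    return min(1.0, epoch_idx / ramp_epochs)


def effective_membership(prob, w):
    """pi_eff = 1 - w * (1 - pi): pinned at 1 while w == 0."""
    return 1.0 - w * (1.0 - prob)


# A star the encoder currently gives prob = 0.2, over a 10-epoch run.
for epoch in (0, 2, 5, 9):
    w = linear_rampup(epoch, 10, warmup_frac=0.5)
    print(epoch, round(w, 3), round(effective_membership(0.2, w), 3))
```

At epoch 0 the star is treated as a certain member whatever the encoder says; only once w reaches 1 does its low prior take full effect.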


phasecurvefit.nn.uniform_background_density(qs: Float[Array, 'N D'], /, *, inflate: float = 1.0)

Flat background density $\rho_{\mathrm{bg}}$ from the field’s extent.

The mixture model needs a density for the “not a stream member” component. Hogg, Bovy & Lang (2010) §3 use a broad Gaussian $(Y_b, V_b)$ and marginalise over it; when the field is a bounded footprint – as it is here – the natural analogue is a uniform density over that footprint,

\[\rho_{\mathrm{bg}} = \frac{1}{\prod_d \left(\max_d q - \min_d q\right)} .\]

This is not a free knob to tune: it is the reciprocal of the field volume, and it is what makes membership a calibrated posterior rather than an arbitrary cut. It is, however, the quantity that sets where the stream/background crossover sits, so it is worth being deliberate about the footprint you hand it.
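
To make the crossover concrete, set the stream and background terms of the posterior equal, i.e. $\hat{q}_n = 1/2$. Assuming the isotropic $D$-dimensional Gaussian of the mixture likelihood, the crossover residual $r_\ast$ satisfies

\[r_\ast^2 = 2 \sigma^2 \left[ \ln \frac{\pi_n}{(1 - \pi_n)\, \rho_{\mathrm{bg}}} - \frac{D}{2} \ln\!\left( 2 \pi \sigma^2 \right) \right]\]

where the unsubscripted $\pi$ inside the last logarithm is the circle constant, not a membership. As a worked case, $\pi_n = 0.5$, $\sigma = 0.1$, $\rho_{\mathrm{bg}} = 0.01$ and $D = 2$ give $r_\ast^2 \approx 0.147$, so $r_\ast \approx 0.38$, about $3.8\sigma$. A larger footprint lowers $\rho_{\mathrm{bg}}$ and pushes the crossover outwards, but only logarithmically.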

Parameters:

qs (Array, shape (N, D)) – Positions defining the field.
inflate (float, optional) – Multiply the extent of each axis by this factor before taking the volume. Use > 1 if the observed points do not fill the true survey footprint. Default 1.

Returns:

rho_bg – Background density, in units of (length)^-D. Returned as a 0-d array so that it stays trace-transparent under jit (e.g. inside posterior_membership); use float(...) if you need a Python scalar.

Return type:

Array, scalar

Examples

>>> import jax.numpy as jnp
>>> from phasecurvefit.nn import uniform_background_density

A unit square has unit background density:

>>> qs = jnp.array([[0.0, 0.0], [1.0, 1.0]])
>>> float(uniform_background_density(qs))
1.0

A 2x2 square has a quarter the density:

>>> qs = jnp.array([[0.0, 0.0], [2.0, 2.0]])
>>> float(uniform_background_density(qs))
0.25
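
The reciprocal-volume formula, including the effect of inflate, can be checked with a small pure-Python sketch. It assumes inflate multiplies the extent of every axis before the product is taken, as described for the inflate parameter; uniform_density is a hypothetical stand-in for the library function and works on plain lists instead of arrays.

```python
def uniform_density(points, inflate=1.0):
    """Reciprocal of the axis-aligned bounding-box volume of the points."""
    n_dims = len(points[0])
    volume = 1.0
    for d in range(n_dims):
        column = [p[d] for p in points]
        volume *= inflate * (max(column) - min(column))
    return 1.0 / volume


unit_square = [[0.0, 0.0], [1.0, 1.0]]
print(uniform_density(unit_square))               # 1.0
print(uniform_density(unit_square, inflate=2.0))  # 0.25
```

In D dimensions, inflating every axis by a factor f lowers the density by f to the power D, so even a modest inflate noticeably moves the stream/background crossover in high-dimensional fields.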
