API Reference

This page provides comprehensive documentation for all public functions in YAXArraysToolbox.jl.


Basic Operations

These are the core functions for everyday data analysis tasks.

Time Series Plotting

Visualize how variables change over time by aggregating spatial dimensions.

YAXArraysToolbox.plot_time - Function

Plot time

This function plots the time series of a given variable in a cube, or of all the variables present in the cube. Because cubes are expected to contain spatial dimensions, these are collapsed with a function, e.g., by taking the mean of the variable over the pixels of an area at each time step.
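
To make the idea of collapsing the spatial dimensions concrete, here is a minimal sketch on a plain Julia array. It is not the package's implementation: the toy array `data` and its `(lon, lat, time)` layout are assumptions made for illustration only.

```julia
using Statistics

# Hypothetical toy data: 4 x 3 pixels observed at 5 time steps (lon, lat, time)
data = reshape(collect(1.0:60.0), 4, 3, 5)

# Collapse both spatial dimensions with the mean: one value per time step
ts_mean = vec(mean(data; dims = (1, 2)))

# The same idea with a quantile, which is what fun = "quant" with p = 0.2 refers to
ts_q20 = [quantile(vec(data[:, :, t]), 0.2) for t in axes(data, 3)]
```

Each resulting vector has one entry per time step, which is the series that ends up on the plot.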

## Arguments:

- ```cube_in```: YAXArray Cube.
- ```time_axis```: String. Name of the time axis.
- ```var_axis```: String. Name of the axis containing the variables.
- ```var```: String or nothing. Name of the variable to be plotted. If nothing, all the variables present in the cube are plotted.
- ```lat_axis```: String. Name of the latitude axis.
- ```lon_axis```: String. Name of the longitude axis.
- ```fun```: String. Name of the function used to collapse the spatial dimensions. It must be "median", "mean", "std", "var", "sum", "quant", "min", or "max".
- ```plot_type```: String. Name of the plot type. By default: "lines". It can also be "scatter".
- ```p```: Float64 in the interval [0,1]. If ```fun = "quant"```, p is the value of the quantile.
- ```resolution```: Tuple. Plot resolution. By default ```resolution = (600, 400)```.
- ```ncol```: Number of plots by column. By default ```ncol = 1```.
- ```nrow```: Number of plots by row. By default ```nrow = 1```.
- ```showprog```: Boolean. Show a progress bar.
- ```max_cache```: String. Maximum cache used to read the data. It must be given in MB, e.g. "100MB", or in GB, e.g. "10GB".


## Examples

```julia
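using YAXArrays, Zarr, CairoMakie, GeoMakie, Statistics, DimensionalData
using Dates             # provides Date, used to subset the time axis
using YAXArraysToolbox  # provides plot_time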

metric = ["median", "mean", "std", "var", "sum", "quant", "min", "max"]


cube_in = open_dataset(
    "https://s3.bgc-jena.mpg.de:9000/esdl-esdc-v2.1.1/esdc-8d-0.25deg-1x720x1440-2.1.1.zarr",
)

cube_in = Cube(cube_in)
cube_in.Variable

cube_in = cube_in[
    lon = (-9.0 .. 0.0),
    lat = (35 .. 40),
    Ti = (Date(2010) .. Date(2014)),
    Variable = At(["leaf_area_index", "sensible_heat"]),
]



plot_time(
    cube_in;
    time_axis = :Ti,
    var_axis = :Variable,
    lon_axis = :lon,
    lat_axis = :lat,
    var = "sensible_heat",
    fun = "median",
    p = 0.2,
    showprog = true,
    max_cache = "100MB",
)

plot_time(
    cube_in;
    time_axis = :Ti,
    var_axis = :Variable,
    lon_axis = :lon,
    lat_axis = :lat,
    var = nothing,
    fun = "median",
    resolution = (900, 600),
    p = 0.2,
    showprog = true,
    max_cache = "100MB",
    ncol = 2,
)

for i in eachindex(metric)
    println(metric[i])
    plot_time(
        cube_in;
        time_axis = :Ti,
        var_axis = :Variable,
        lon_axis = :lon,
        lat_axis = :lat,
        var = "sensible_heat",
        fun = metric[i],
        p = 0.2,
        showprog = true,
        max_cache = "100MB",
    )
end

```

Example:

plot_time(
    cube;
    fun = "mean",           # Aggregation function
    var = "temperature",    # Variable to plot
    time_axis = :time,      # Name of time dimension
    resolution = (900, 600) # Figure size
)

Spatial Mapping

Create maps by aggregating the temporal dimension.
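
As a rough illustration of aggregating the temporal dimension, the sketch below reduces a plain array along its third axis so that one value per pixel remains. The toy array and its `(lon, lat, time)` layout are assumptions; the package itself works on YAXArray cubes.

```julia
using Statistics

# Hypothetical toy data: 2 x 3 pixels observed at 4 time steps (lon, lat, time)
data = reshape(collect(1.0:24.0), 2, 3, 4)

# Collapse the time dimension with the median, leaving a 2 x 3 map
map_median = dropdims(median(data; dims = 3); dims = 3)
```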

YAXArraysToolbox.plot_space - Function

Plot Space/Maps

Arguments

  • cube_in: YAXArray Cube.
  • time_axis: String. Name of the time axis.
  • var_axis: String. Name of the axis containing the variables.
  • var: String or nothing. Name of the variable to be plotted. If nothing, all the variables present in the cube are plotted.
  • lat_axis: String. Name of the latitude axis.
  • lon_axis: String. Name of the longitude axis.
  • fun: String. Name of the function used to collapse the time dimension. It must be "median", "mean", "std", "var", "sum", "quant", "min", or "max".
  • p: Float64 in the interval [0,1]. If fun = "quant", p is the value of the quantile.
  • colormap: Color Map. By default: colormap = Reverse(:batlow)
  • coastlines: Boolean. Plot coast lines. By default coastlines = false
  • resolution: Plot resolution. By default resolution = (800, 300).
  • xticklabel_pad: Int. X labels padding. By default xticklabel_pad = 20.
  • yticklabel_pad: Int. Y labels padding. By default yticklabel_pad = 20.
  • ncol: Number of plots by column. By default ncol = 1.
  • nrow: Number of plots by row. By default nrow = 1.
  • showprog: Boolean. Progress Bar.
  • max_cache: String. Maximum cache to read the data. It must be in MB e.g. "100MB" or in GB "10GB".

Examples


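using YAXArraysToolbox
using CairoMakie
using Statistics
using GeoMakie
using YAXArrays
using Zarr              # backend needed to open the .zarr dataset
using Dates             # provides Date, used to subset the time axis
using DimensionalData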


cube_in = open_dataset(
    "https://s3.bgc-jena.mpg.de:9000/esdl-esdc-v2.1.1/esdc-8d-0.25deg-1x720x1440-2.1.1.zarr",
)

cube_in = Cube(cube_in)


cube_in = cube_in[
    lon = (-9.0 .. 0.0),
    lat = (35 .. 40),
    Ti = (Date(2010) .. Date(2014)),
    Variable = At(["leaf_area_index", "sensible_heat"]),
]

plot_space(cube_in; time_axis = :Ti, resolution = (900, 500), xticklabel_pad = 25, yticklabel_pad = 25, var_axis = :Variable, var = "leaf_area_index", fun = "median")


metric = ["median", "mean", "std", "var", "sum", "quant", "min", "max"]


for i in eachindex(metric)
    println(metric[i])
    plot_space(
        cube_in;
        time_axis = :Ti,
        var_axis = :Variable,
        lon_axis = :lon,
        lat_axis = :lat,
        var = "sensible_heat",
        fun = metric[i],
        p = 0.2,
        showprog = true,
        max_cache = "100MB",
    )
end



plot_space(
    cube_in;
    time_axis = :Ti,
    var_axis = :Variable,
    lon_axis = :lon,
    lat_axis = :lat,
    var = nothing,
    fun = "median",
    resolution = (1200, 300),
    p = 0.2,
    showprog = true,
    max_cache = "100MB",
    ncol = 2,
)

Example:

plot_space(
    cube;
    fun = "median",         # Aggregation function
    var = "lai",            # Variable to plot
    time_axis = :time,      # Name of time dimension
    resolution = (900, 600) # Figure size
)

Temporal Aggregation

Resample data to different temporal resolutions (e.g., 8-day to monthly).

YAXArraysToolbox.aggregate_time - Function

Aggregate by time

Arguments:

  • cube_in: YAXArray Cube.
  • time_axis: String. Name of the time axis.
  • new_resolution: String. New temporal resolution. It can be "day", "month", or "year".
  • new_time_step: Int64. Time step to be computed in the new time series. For example, new_resolution = "day" with new_time_step = 8 computes the function every 8 days, and the new time dimension contains only the days corresponding to each 8th day.
  • fun: String. Function to be applied to aggregate the time. It can be "median", "mean", "std", "var", "sum", "quant", "min", "max".
  • p: Float64 in the interval [0,1]. If fun=quant p is the value of the quantile.
  • skipMissing: Boolean. Skip missing values when aggregating the data. If all values are missing, NaN is returned.
  • skipnan: Boolean. Skip NaN values when aggregating the data. If all values are NaN, NaN is returned.
  • showprog: Boolean. Progress Bar.
  • max_cache: String. Maximum cache to read the data. It must be in MB e.g. "100MB" or in GB "10GB".
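
The behaviour described by new_resolution, skipMissing, and skipnan can be sketched for a single pixel with the standard library alone. This is only an illustration under stated assumptions: the series `vals`, the helper `clean_mean`, and the grouping by `(year, month)` are hypothetical and do not reflect how the package is implemented internally.

```julia
using Dates, Statistics

# Hypothetical 8-day time series for one pixel, with a missing value and a NaN
dates = collect(Date(2010, 1, 1):Day(8):Date(2010, 3, 31))
vals = Union{Missing, Float64}[Float64(i) for i in eachindex(dates)]
vals[2] = missing
vals[5] = NaN

# Mean that skips missing and NaN values; returns NaN when nothing is left
function clean_mean(x)
    kept = [v for v in skipmissing(x) if !isnan(v)]
    return isempty(kept) ? NaN : mean(kept)
end

# Group the time steps by (year, month) and aggregate each group
keys_ym = [(year(d), month(d)) for d in dates]
monthly = Dict(k => clean_mean(vals[keys_ym .== Ref(k)]) for k in unique(keys_ym))
```

Here `monthly` holds one aggregated value per calendar month, mirroring new_resolution = "month" with new_time_step = 1.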

Examples


using YAXArrays, Zarr, DimensionalData, YAXArraysToolbox

esds = open_dataset("https://s3.bgc-jena.mpg.de:9000/esdl-esdc-v2.1.1/esdc-8d-0.25deg-184x90x90-2.1.1.zarr")
esdc = Cube(esds)

# Estimating the monthly LAI

lai_month = aggregate_time(esdc[Variable = At("leaf_area_index")]; time_axis = :Ti, new_resolution = "month", new_time_step=1, fun="mean", p=nothing, skipMissing=true, skipnan=true, showprog=true, max_cache="1GB")


Example:

# Aggregate from 8-day to monthly means
monthly_cube = aggregate_time(
    cube;
    new_resolution = "month",  # Target resolution
    fun = "mean",              # Aggregation function
    skipMissing = true         # Handle missing values
)

Supported resolutions: "day", "month", "year"

Supported functions: "mean", "median", "std", "var", "sum", "min", "max", "quant"


Masking Functions

Functions for filtering and subsetting data based on various criteria.

Temporal Masking

Mask values along the time dimension, using either a threshold on one variable or a quantile of each variable's time series.

YAXArraysToolbox.masking_time - Function

Masking using time dimension.

The function implements two methods:

  1. Masking based on a threshold value for one of the variables present in the cube, e.g., masking the values of all the variables in the cube where radiation is lower than X.
  2. Masking based on a quantile threshold, where the quantile is estimated from the time series of each variable present in the cube.

The masked values are set to NaN.

Arguments:

  • cube_in YAXArray Cube.
  • time_axis: String. Name of the time axis.
  • var_axis: String. Name of the axis containing the variables.
  • var_mask: String or nothing. Name of the variable used to mask the other variables. If var_mask is a String, val must be an Int64 or Float64 number. If var_mask is nothing, val must be nothing and p must be a Float64 in the interval [0,1].
  • val: Float64 or nothing. Threshold value in var_mask used to mask all the variables in the cube. If var_mask = nothing, then val = nothing.
  • p: Float64 or nothing. Quantile value used as a threshold to mask the variables.
  • comp: String. Standard comparison operation between the threshold value and each element. comp must be one of the following: "==", "!=", "<", "<=", ">", ">=".
  • showprog: Boolean. Progress Bar.
  • max_cache: String. Maximum cache to read the data. It must be in MB e.g. "100MB" or in GB "10GB".
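
The two methods can be sketched on plain vectors as follows. This is an illustrative sketch, not the package's code: the helper `mask_by_threshold`, the `compare` lookup table, and the toy series are hypothetical, and it assumes that elements for which the comparison holds are the ones set to NaN.

```julia
using Statistics

# Map the documented `comp` strings to Julia comparison functions
compare = Dict("==" => (==), "!=" => (!=), "<" => (<), "<=" => (<=), ">" => (>), ">=" => (>=))

# Set to NaN every element of `target` where `mask_var comp val` holds
function mask_by_threshold(target, mask_var, val, comp, table)
    op = table[comp]
    return [op(m, val) ? NaN : t for (t, m) in zip(target, mask_var)]
end

lai = [0.1, 0.5, 0.15, 0.8]
heat = [10.0, 20.0, 30.0, 40.0]

# Method 1: mask another variable where LAI is below a fixed threshold
masked_heat = mask_by_threshold(heat, lai, 0.2, "<", compare)

# Method 2: the threshold is a quantile of the variable's own time series
thr = quantile(lai, 0.5)
masked_lai = mask_by_threshold(lai, lai, thr, "<", compare)
```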

Examples

using YAXArrays, Statistics, Zarr, Dates, DimensionalData, YAXArraysToolbox

esds = open_dataset(
    "https://s3.bgc-jena.mpg.de:9000/esdl-esdc-v2.1.1/esdc-8d-0.25deg-184x90x90-2.1.1.zarr",
)
esdc = Cube(esds)

# Subset with the same selector syntax and axis names as the other examples on this page
esdc_small = esdc[
    lon = (-86 .. -35),
    lat = (-56 .. 14),
    Ti = (Date(2010) .. Date(2014)),
    Variable = At(["leaf_area_index", "sensible_heat", "potential_evaporation"]),
]

# Mask all variables where leaf_area_index is lower than 0.2
test = masking_time(
    esdc_small;
    time_axis = :Ti,
    var_axis = :Variable,
    var_mask = "leaf_area_index",
    val = 0.2,
    comp = "<",
    showprog = true,
    max_cache = "1GB",
)

# Compare the minimum LAI before and after masking
plot_time(esdc_small; time_axis = :Ti, var_axis = :Variable, var = "leaf_area_index", lat_axis = :lat, lon_axis = :lon, fun = "min")

plot_time(test; time_axis = :Ti, var_axis = :Variable, var = "leaf_area_index", lat_axis = :lat, lon_axis = :lon, fun = "min")

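Example:

# Mask all variables where leaf_area_index is lower than 0.2
masked = masking_time(
    cube;
    time_axis = :time,             # Name of time dimension
    var_axis = :Variable,          # Name of variable dimension
    var_mask = "leaf_area_index",  # Variable used as mask
    val = 0.2,                     # Threshold value
    comp = "<"                     # Comparison operator
)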

Spatial Masking

Apply spatial masks based on another data cube.

YAXArraysToolbox.masking_space - Function

Masking using spatial dimension

The masked values are set to NaN.

Arguments:

  • cube_in: YAXArray Cube to be masked.
  • mask: YAXArray Cube without time dimension and with a single variable to be used as mask. All values equal to NaN or missing will be masked in cube_in. The mask is applied to all the variables and time steps present in cube_in.

  • lat_axis: String. Name of the latitude axis.
  • lon_axis: String. Name of the longitude axis.
  • val_mask: NaN or missing. Value present in mask to be used as reference to mask cube_in. Must be NaN or missing.
  • showprog: Boolean. Progress Bar.
  • max_cache: String. Maximum cache to read the data. It must be in MB e.g. "100MB" or in GB "10GB".
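
The way a 2D mask is applied to every time step can be sketched with broadcasting on plain arrays. The arrays below are hypothetical toy data and the sketch makes no claim about the package internals.

```julia
# Hypothetical data cube (x, y, time) and a 2D mask with NaN at the pixels to mask
data = ones(3, 2, 4)
mask = [1.0 NaN; 1.0 1.0; NaN 1.0]

# Broadcasting repeats the 2D mask along the third (time) dimension
masked = ifelse.(isnan.(mask), NaN, data)
```

Pixels that are NaN in `mask` become NaN at every time step, while all other pixels keep their original values.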

Examples

using YAXArrays, Zarr, DimensionalData, Test, YAXArraysToolbox

# Cube to be masked: time x space x space x variable
axlist = (
    Dim{:Ti}(range(1, 20, length = 20)),
    Dim{:x}(range(1, 10, length = 10)),
    Dim{:y}(range(1, 5, length = 15)),
    Dim{:Variable}(["var1", "var2"]),
)

data = rand(20, 10, 15, 2)

ds = YAXArray(axlist, data)

# Mask cube: no time dimension and a single variable
axlist = (
    Dim{:x}(range(1, 10, length = 10)),
    Dim{:y}(range(1, 5, length = 15)),
    Dim{:Variable}(["var1"]),
)

data = rand(10, 15, 1)

# Pixels set to NaN will be masked in every variable and time step of ds
data[3, 5, 1] = NaN
data[1, 10, 1] = NaN
data[9, 5, 1] = NaN

ds_mask = YAXArray(axlist, data)

test_cube = masking_space(ds, ds_mask; lat_axis = :x, lon_axis = :y)

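Example:

# Mask using a land/water mask (NaN marks the pixels to be masked)
masked = masking_space(
    data_cube,
    land_mask_cube;
    lat_axis = :lat,  # Name of latitude dimension
    lon_axis = :lon,  # Name of longitude dimension
    val_mask = NaN    # Value in the mask that triggers masking
)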

General Masking

Apply combined masks with multiple criteria.

YAXArraysToolbox.masking_proc - Function

Masking processor

Arguments:

  • cube_in_to_mask: YAXArray cube to be masked.

  • cube_rsquare: Nothing, or YAXArray cube with the $R^{2}$ variable. If set to nothing, no mask is applied.

  • rsquare_thr: Float64. $R^{2}$ threshold. All values lower than rsquare_thr are set to NaN.

  • cube_co_occurrence: Nothing, or YAXArray cube with the co-occurrence variable. If set to nothing, no mask is applied.

  • co_occurence_thr: Float64. Co-occurrence threshold. All values lower than co_occurence_thr are set to NaN.

  • cube_delta: Nothing, or YAXArray cube with the delta variable. If set to nothing, no mask is applied.

  • minmax_delta: Tuple. Minimum and maximum thresholds of the delta variable. Values lower than the minimum or higher than the maximum are set to NaN. Either threshold can be set to nothing, e.g. (-1, nothing) or (nothing, 1), in which case only the other threshold is applied.

  • time_dim: Nothing, or String. Name of the time dimension. This dimension must be present in all the cubes. If set to nothing, no time dimension is considered (this can result in slower computation). By default time_dim = time.

  • showprog: Boolean. Show progress bar. By default showprog = true

Output:

  • YAXArray cube masked.
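
How the criteria combine for a single value can be sketched as below. The helper `mask_value`, its keyword names, and the default thresholds are hypothetical; the sketch only mirrors the documented rules (values below a threshold are set to NaN, a criterion set to nothing is skipped, and either bound of minmax_delta may be nothing).

```julia
# Keep `x` only if every active criterion passes; otherwise return NaN
function mask_value(x; r2 = nothing, r2_thr = 0.2,
                    co_occ = nothing, co_occ_thr = 0.5,
                    delta = nothing, minmax = (nothing, nothing))
    # R^2 and co-occurrence: values lower than the threshold are masked
    r2 !== nothing && r2 < r2_thr && return NaN
    co_occ !== nothing && co_occ < co_occ_thr && return NaN
    # Delta: each bound is applied only if it is not `nothing`
    if delta !== nothing
        lo, hi = minmax
        lo !== nothing && delta < lo && return NaN
        hi !== nothing && delta > hi && return NaN
    end
    return x
end

kept = mask_value(1.0; r2 = 0.5, delta = 2.0, minmax = (-1, nothing))
dropped = mask_value(1.0; delta = 2.0, minmax = (nothing, 1))
```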

Space-for-Time Analysis

Functions for analyzing land cover change impacts using spatial variability as a proxy for temporal change.

Main Processing Function

YAXArraysToolbox.space4time_proc - Function

Space for time processor

Arguments:

  • cube_con: YAXArray with the continuous variable to be analyzed.
  • cube_classes: YAXArray with the discrete classes to be used in the space-for-time analysis.
  • time_axis_name: String or nothing. Name of the time axis in the input cubes. By default time_axis_name = :time. If time_axis_name = nothing, no time dimension is considered.
  • lon_axis_name: String. Name of the longitude axis in the input cubes. By default lon_axis_name = "lon".
  • lat_axis_name: String. Name of the latitude axis in the input cubes. By default lat_axis_name = "lat".
  • classes_var_name: String. Name of the variable containing the discrete classes. By default classes_var_name = "classes".
  • winsize: Edge size of the moving window in pixels. By default winsize = 5. E.g. winsize = 5 produces a moving window with 5^2 pixels.
  • minpxl: Minimum number of pixels in the moving window. By default minpxl = 25. Change it according to your winsize parameter.
  • minDiffPxls: Minimum number of pixels in the moving window that must have different compositions. It must be a value in the interval 1 to winsize^2. By default minDiffPxls = 15.
  • classes_vec: A string vector with the names of the classes in cube_classes to be used, e.g. from the MPI-BGC internal structure classes_vec = ["Evergreen_Needleleaf_Forests", "Evergreen_Broadleaf_Forests", "Deciduous_Needleleaf_Forests", "Deciduous_Broadleaf_Forests", "Mixed_Forests", "Closed_Shrublands", "Open_Shrublands", "Woody_Savannas", "Savannas", "Grasslands", "Permanent_Wetlands", "Croplands", "Urban_and_Built-up_Lands", "Cropland/Natural_Vegetation_Mosaics", "Permanent_Snow_and_Ice", "Barren", "Water_Bodies"]

  • max_value: Indicates whether the presence of the discrete classes is scaled from 0 to 1 or from 0 to 100. If max_value = 100, the data are rescaled to the range 0 to 1. By default max_value = 1.

  • showprog: Show progress bar. By default showprog = true.

  • max_cache: Size of the cache used to temporarily allocate sections of the cubes. By default max_cache = 1e8.
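
The relationship between winsize and minpxl can be illustrated with a small moving-window helper on a plain matrix. The function `window` and the toy matrix are hypothetical and only meant to show why a 5 x 5 window holds at most 25 pixels and why windows near the border hold fewer.

```julia
# Hypothetical helper: extract a winsize x winsize window centred on pixel (i, j),
# clipped at the borders of the matrix
function window(A, i, j, winsize)
    half = winsize ÷ 2
    rows = max(i - half, 1):min(i + half, size(A, 1))
    cols = max(j - half, 1):min(j + half, size(A, 2))
    return A[rows, cols]
end

A = reshape(collect(1.0:100.0), 10, 10)

# Interior pixel: the full 5 x 5 window is available
w = window(A, 5, 5, 5)
nvalid = count(!isnan, w)

# Corner pixel: the clipped window has fewer pixels than minpxl = 25
nvalid_corner = count(!isnan, window(A, 1, 1, 5))
```

With winsize = 5 and minpxl = 25, only windows like `w` with all 25 valid pixels would qualify, so a smaller minpxl is needed if border or partly missing windows should be processed.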

Output:

The space4time_proc function produces a YAXArrays Dataset with three cubes:

  • summary_mov_window cube has one axis, summary_stat, and three variables:

    • rsquared: Coefficient of determination. Fraction of variance explained by the model.
    • cumulative_variance: Variance preserved after the singular value decomposition of the classes matrix.
    • predicted: Mean prediction of Z for the moving window with the real combination of values.
  • metrics_for_classes cube has one axis, Values of Z for pure classes, and two variables:

    • estimated: Value of the biophysical variable when the class is 1.
    • estimated_error: Estimated error of the value of the biophysical variable when the class is 1.
  • metrics_for_transitions cube has two axes, transitions (all pairwise transitions between the different classes) and Differences, with three variables:

    • delta: Potential change in the biophysical variable produced by going from one class to another.
    • delta_error: Error estimate of the potential change in the biophysical variable produced by going from one class to another.
    • co_occurrence: Metric that represents the gradient from no presence of either class (0) to 'full and evenly balanced presence of both classes' (1).
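
To give an intuition for the estimated, delta, and rsquared outputs, the sketch below fits an ordinary least-squares model inside one hypothetical window. It is a simplified stand-in and not the package's algorithm (which, per the description above, also involves a singular value decomposition of the classes matrix); the matrix `X`, the synthetic values, and the sign convention of `delta` are assumptions.

```julia
using LinearAlgebra, Statistics

# Hypothetical window: 6 pixels with fractions of 2 classes (each row sums to 1)
X = [1.0 0.0; 0.8 0.2; 0.6 0.4; 0.4 0.6; 0.2 0.8; 0.0 1.0]

# Synthetic biophysical variable: class 1 has value 10, class 2 has value 14
z = X * [10.0, 14.0]

# Least-squares estimate of the variable for each pure class
β = X \ z

# Potential change when going from class 1 to class 2 (assumed sign convention)
delta = β[2] - β[1]

# Coefficient of determination of the fit
pred = X * β
rsquared = 1 - sum(abs2, z .- pred) / sum(abs2, z .- mean(z))
```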

Example:

results = space4time_proc(
    climate_cube,           # Climate variable (e.g., LST)
    landcover_cube,         # Land cover fractions
    altitude_cube;          # Altitude data (optional)
    classes_vec = ["forest", "grassland", "cropland"],
    winsize = 5,            # Moving window size
    showprog = true
)

Type Reference

All functions in YAXArraysToolbox work with YAXArray objects from YAXArrays.jl.

Common Parameters

| Parameter | Type | Description |
|-----------|------|-------------|
| cube | YAXArray | Input data cube |
| fun | String | Aggregation function: "mean", "median", "std", "var", "sum", "min", "max", "quant" |
| time_axis | Symbol | Name of the time dimension (typically :time or :Ti) |
| var | String or Nothing | Variable name to process, or nothing for all |
| showprog | Bool | Show progress bar |
| max_cache | String | Maximum memory cache (e.g., "1GB") |