qlat_utils.data — Data Analysis Utilities¶
Source: qlat-utils/qlat_utils/data.py
Note: Update this document when updating the source file.
Outline¶
Overview¶
The qlat_utils.data module provides data analysis tools for lattice QCD
computations. It includes:
Interpolation — linear interpolation for arrays with fractional indices.
Jackknife resampling — standard, super-jackknife, and randomized jackknife-bootstrap (RJK) methods for error estimation.
The
Dataclass — a wrapper that supports arithmetic on nested numeric structures (scalars, lists, NumPy arrays).Value display — formatting of
(value, error)pairs for publication.Context managers — temporary override of global jackknife and display settings.
import qlat_utils as q
import numpy as np
avg, err = q.avg_err([1.0, 1.1, 0.9, 1.05])
print(q.show_val_err((avg, err)))
Physical Constants¶
Name |
Value |
Description |
|---|---|---|
|
|
Fine-structure constant |
|
|
Conversion factor: hbar*c / (1 fm * 1 GeV) |
import qlat_utils as q
print(q.alpha_qed) # 0.0072973525693...
print(q.fminv_gev) # 0.197326979
Type Tuples¶
Module-level tuples used for type-checking throughout the library. Extended
types (float128, complex256) are included when the platform supports them.
Name |
Contents |
|---|---|
|
|
|
|
|
|
|
|
|
|
Interpolation¶
interp_i_arr(data_x_arr, x_arr)¶
Return index array i_arr such that q.interp(data_x_arr, i_arr) is
approximately x_arr. Useful for mapping x-coordinates to fractional indices.
Parameter |
Type |
Description |
|---|---|---|
|
array-like |
Known x-values (must be monotonic) |
|
float or array-like |
Target x-values |
interp(data_arr, i_arr, axis=-1)¶
Return approximately data_arr[..., i_arr] using linear interpolation.
The index i_arr may be non-integer (fractional indices are interpolated
between adjacent elements).
Parameter |
Type |
Description |
|---|---|---|
|
array-like |
Source data |
|
float or 1-D array |
Fractional index or indices |
|
int |
Axis along which to interpolate (default |
interp_x(data_arr, data_x_arr, x_arr, axis=-1)¶
Interpolate data_arr at arbitrary x-values. Combines interp_i_arr and
interp.
Parameter |
Type |
Description |
|---|---|---|
|
array-like |
Source data |
|
array-like |
x-values for |
|
float or 1-D array |
Target x-values |
|
int |
Axis along which to interpolate (default |
get_threshold_idx(arr, threshold)¶
Return the fractional index x such that interp(arr, [x]) is approximately
threshold. Uses binary search on a 1-D array.
get_threshold_i_arr(data_arr, threshold_arr, axis=-1)¶
Broadcast version of get_threshold_idx over an array. Returns an index array
where each entry satisfies the threshold condition along the given axis.
get_threshold_x_arr(data_arr, data_x_arr, threshold_arr, axis=-1)¶
Like get_threshold_i_arr, but returns x-values instead of indices.
Data Wrapper Class¶
class Data¶
A wrapper around numeric values that supports arithmetic operations on nested
structures (scalars, lists, NumPy arrays, LatData).
Supported value types: numeric scalars, numpy.ndarray, q.LatData, and
list (element-wise operations).
import qlat_utils as q
d1 = q.Data([1.0, 2.0, 3.0])
d2 = q.Data([0.5, 0.5, 0.5])
d3 = d1 + d2 # Data([1.5, 2.5, 3.5])
d4 = d1 * 2.0 # Data([2.0, 4.0, 6.0])
d5 = -d1 # Data([-1.0, -2.0, -3.0])
Method |
Description |
|---|---|
|
Return the wrapped value |
|
Return the squared norm |
|
MPI global sum (requires |
|
Addition |
|
Subtraction |
|
Scalar or element-wise multiplication |
|
Unary negation and identity |
|
Copy support |
Basic Statistics¶
check_zero(x)¶
Return True if x is a real type and equals zero.
qnorm(x)¶
Return the squared norm of x. For scalars: x*x. For complex:
re^2 + im^2. For arrays: abs(vdot(x, x)). For lists/tuples: sum of
qnorm of each element.
q.qnorm(2) # 4
q.qnorm(1 + 2j) # 5 (1*1 + 2*2)
average(data_list)¶
Return the arithmetic mean of data_list.
average_ignore_nan(value_arr_list)¶
Return element-wise average across a list of NumPy arrays, ignoring NaN
values. Returns NaN for elements where all inputs are NaN.
block_data(data_list, block_size, is_overlapping=True)¶
Return a list of block averages. If is_overlapping is True (default),
blocks overlap by block_size - 1 entries.
avg_err(data_list, *, eps=1, block_size=1)¶
Compute (avg, err) of data_list using blocking. The error estimate is:
Parameter |
Type |
Default |
Description |
|---|---|---|---|
|
list |
— |
Data values |
|
float |
|
Additional scaling factor for error |
|
int |
|
Blocking size |
Returns (avg, err) where both have the same type as the data.
partial_sum(x, *, is_half_last=False)¶
Modify x in-place to its cumulative (partial) sum, preserving length. If
is_half_last is True, each entry becomes the average of the current and
previous partial sums (trapezoidal rule). Works for 1-D and 2-D arrays.
fsqr(data) / fsqrt(data)¶
Component-wise square and square root. For complex types, real and imaginary
parts are processed separately: fsqr(a + bi) = a^2 + b^2 i, fsqrt(a + bi) = sqrt(a) + sqrt(b) i. Supports scalars, Data, and NumPy arrays.
err_sum(*vs)¶
Return the quadrature sum of errors: sqrt(sum(fsqr(v_i))).
q.err_sum(1.4, 2.1, 1.0) # 2.7147743920996454
Jackknife Resampling¶
jackknife(data_list, *, eps=1)¶
Perform standard jackknife. Returns jk_arr of length N + 1 where:
jk_arr[0]= averagejk_arr[i]=avg - (eps / N) * (data[i] - avg)fori >= 1
jk_avg(jk_arr)¶
Return the average (first element) of a jackknife array.
jk_err(jk_arr, *, eps=1, block_size=1)¶
Return the jackknife error estimate:
The eps and block_size must match those used in the corresponding
jackknife call. Note: len(jk_arr) = N + 1.
jk_avg_err(jk_arr, *, eps=1, block_size=1)¶
Return (jk_avg, jk_err).
Super-Jackknife¶
sjackknife(data_list, jk_idx_list, *, avg=None, ...)¶
Perform super-jackknife resampling. Data from different ensembles (identified
by jk_idx_list) are combined into a single jackknife array.
Parameter |
Type |
Default |
Description |
|---|---|---|---|
|
list/ndarray |
— |
Original data |
|
list |
— |
Index for each data point (e.g., |
|
any |
|
Pre-computed average (auto-computed if |
|
bool |
|
Use hash when jk_idx not in |
|
int |
|
Hash table size |
|
|
|
RNG state (default: |
|
list |
|
All possible indices; |
|
callable |
|
Function returning |
|
callable |
|
|
|
float |
|
Scaling factor |
sjk_avg(jk_arr) / sjk_err(jk_arr, *, eps=1) / sjk_avg_err(jk_arr, *, eps=1)¶
Average, error, and (avg, err) for super-jackknife arrays. The error formula
differs from standard jackknife: no N/(N-1) factor.
sjk_mk_jk_val(rs_tag, val, err, *, ...)¶
Create a synthetic jackknife array from a central value and error using Gaussian random numbers.
Randomized Jackknife-Bootstrap Hybrid¶
rjackknife(data_list, jk_idx_list, *, avg=None, ...)¶
Jackknife-bootstrap hybrid resampling. Returns jk_arr of length
1 + n_rand_sample. The distribution of jk_arr approximates the
distribution of the average.
Parameter |
Type |
Default |
Description |
|---|---|---|---|
|
list/ndarray |
— |
Original data |
|
list |
— |
Index for each data point |
|
any |
|
Pre-computed average |
|
|
|
RNG state |
|
int |
|
Number of random samples |
|
callable |
|
|
|
bool |
|
Normalize random vectors |
|
bool |
|
Shift blocking per sample |
|
float |
|
Scaling factor |
The formula is:
where \(r_{i,j} \sim \mathcal{N}(0, 1)\) and \(b(i,j)\) is the block size.
rjk_avg(jk_arr) / rjk_err(jk_arr, eps=1) / rjk_avg_err(rjk_list, eps=1)¶
Average, error, and (avg, err) for randomized jackknife arrays.
rjk_mk_jk_val(rs_tag, val, err, *, ...)¶
Create a synthetic RJK array from a central value and error.
Unified Jackknife API¶
The g_* functions provide a unified interface that dispatches to either
super-jackknife or RJK based on global settings in default_g_jk_kwargs.
default_g_jk_kwargs¶
Global dictionary controlling jackknife behavior. Key settings:
Key |
Default |
Description |
|---|---|---|
|
|
|
|
|
Scaling factor |
|
|
Number of random samples (RJK only) |
|
|
Normalize random vectors (RJK only) |
|
|
Hash unknown jk indices (super only) |
|
|
Hash table size (super only) |
|
|
Default blocking size |
|
|
Per- |
|
|
RNG state |
g_mk_jk(data_list, jk_idx_list, *, avg=None, ...)¶
Create a (randomized) super-jackknife dataset from un-jackknifed data.
Dispatches to sjackknife or rjackknife based on jk_type.
g_mk_jk_val(rs_tag, val, err, *, ...)¶
Create a synthetic jackknife array from a value and error. Dispatches to
sjk_mk_jk_val or rjk_mk_jk_val.
g_jk_avg(jk_arr) / g_jk_err(jk_arr) / g_jk_avg_err(jk_arr)¶
Unified average, error, and (avg, err) extraction.
g_jk_avg_err_arr(jk_arr)¶
Return an array with shape jk_arr[0].shape + (2,) where the last axis is
(avg, err).
g_jk_size(*, jk_type, ...)¶
Return the number of samples in the jackknife array (1 + n_samples).
g_jk_blocking_func(i, jk_idx)¶
Apply the configured blocking function.
g_jk_sample_size(job_tag, traj_list)¶
Return the number of distinct blocks for a given job_tag and trajectory list.
get_jk_state() / set_jk_state(state)¶
Save and restore the current default_g_jk_kwargs state (for use with
@cache_call).
Value Display¶
show_val(val, *, is_latex=True, num_float_digit=None, num_exp_digit=None, exponent=None)¶
Format a single numeric value for display.
Parameter |
Type |
Default |
Description |
|---|---|---|---|
|
int/float |
— |
Value to format |
|
bool/None |
|
Use LaTeX exponent notation |
|
int/bool/None |
|
Number of decimal digits (auto if |
|
int/bool/None |
|
Significant digits in scientific notation |
|
int/None |
|
Force a specific exponent |
q.show_val(0.00123) # "1.23 \\times 10^{-3}"
q.show_val(0.00123, is_latex=False) # "1.23E-3"
q.show_val(1234.0) # "1234.0"
show_val_err(val_err, *, is_latex=True, num_float_digit=None, num_exp_digit=None, exponent=None)¶
Format a (value, error) pair. Error is shown in parentheses. If val_err is
a single number, it is formatted as a plain value.
q.show_val_err((1.12e16, 12e6)) # auto scientific notation
q.show_val_err((1.12e16, 12e7), exponent=10) # force exponent
q.show_val_err((1.12e16, 12e7), exponent=10, is_latex=False) # "1.12000(120)E10"
Context Managers¶
class NewDictValues(dictionary, **kwargs)¶
Context manager that temporarily overrides keys in dictionary and restores
them on exit.
class JkKwargs(**kwargs)¶
Context manager that temporarily overrides default_g_jk_kwargs.
with q.JkKwargs(n_rand_sample=2048, block_size=10):
jk_arr = q.g_mk_jk(data_list, jk_idx_list)
class ShowKwargs(**kwargs)¶
Context manager that temporarily overrides default_show_val_kwargs.
with q.ShowKwargs(is_latex=False, exponent=-10):
print(q.show_val_err((1.23e-10, 0.05e-10)))
Examples¶
Interpolation¶
import qlat_utils as q
import numpy as np
# Interpolate data at fractional indices
data = np.array([10.0, 20.0, 30.0, 40.0])
result = q.interp(data, 1.5) # 25.0 (midpoint between 20 and 30)
result_arr = q.interp(data, [0.5, 1.5, 2.5]) # [15.0, 25.0, 35.0]
# Interpolate with explicit x-coordinates
x_data = np.array([0.0, 1.0, 2.0, 3.0])
y_data = np.array([0.0, 1.0, 4.0, 9.0])
x_new = np.array([0.5, 1.5, 2.5])
y_new = q.interp_x(y_data, x_data, x_new) # interpolated y-values
Basic Error Estimation¶
import qlat_utils as q
import numpy as np
# Generate correlated data
data = [1.0 + 0.1 * np.random.randn() for _ in range(100)]
# Simple average and error
avg, err = q.avg_err(data)
print(f"avg = {avg:.4f}, err = {err:.4f}")
# With blocking to reduce autocorrelation
avg_b, err_b = q.avg_err(data, block_size=5)
print(f"avg = {avg_b:.4f}, err = {err_b:.4f}")
Jackknife Resampling¶
import qlat_utils as q
import numpy as np
data = [1.0, 1.1, 0.9, 1.05, 0.95, 1.02, 0.98, 1.03]
# Standard jackknife
jk_arr = q.jackknife(data)
avg = q.jk_avg(jk_arr)
err = q.jk_err(jk_arr)
print(f"Jackknife: avg = {avg:.4f}, err = {err:.4f}")
# Unified API (uses RJK by default)
jk_arr = q.g_mk_jk(data, list(range(len(data))))
avg, err = q.g_jk_avg_err(jk_arr)
print(f"RJK: avg = {avg:.4f}, err = {err:.4f}")
Formatting Values¶
import qlat_utils as q
# Format a single value
print(q.show_val(0.00123)) # "1.23 \times 10^{-3}"
print(q.show_val(0.00123, is_latex=False)) # "1.23E-3"
# Format value with error
print(q.show_val_err((1.12e16, 12e6))) # auto-notation
print(q.show_val_err((1.12e16, 12e7), exponent=10)) # "1.1200(12) \times 10^{10}"
Context Managers¶
import qlat_utils as q
# Temporarily change jackknife settings
with q.JkKwargs(n_rand_sample=2048, block_size=10):
jk_arr = q.g_mk_jk(data_list, jk_idx_list)
avg, err = q.g_jk_avg_err(jk_arr)
# Temporarily change display settings
with q.ShowKwargs(is_latex=False):
print(q.show_val_err((3.14, 0.01)))
Data Wrapper¶
import qlat_utils as q
d1 = q.Data([1.0, 2.0, 3.0])
d2 = q.Data([0.1, 0.2, 0.3])
d3 = d1 + d2 # Data([1.1, 2.2, 3.3])
d4 = d1 * 2.0 # Data([2.0, 4.0, 6.0])
d5 = d1 - d2 # Data([0.9, 1.8, 2.7])
norm = d1.qnorm() # 14.0 (1 + 4 + 9)