obspy.Stream
objects to
internally represent simulated and observed data. While obspy.Stream
's are
a great representation type for seismological data and come with convenient
processing and filtering functions, there are a few limitations to the
original data interface used within Salvus.obspy.Stream
objects are mainly used within the seismological community.
For non-seismology users, learning how to work with these objects would
provide an unnecessary extra challenge.2025.1.0
, we have changed the way
that data is represented internally within Salvus. Salvus now uses
xarray data structures
with a specific format to represent data internally. In this tutorial, you
will learn what this format looks like and how you can start using it within
Salvus. If you prefer to continue to use the old interface based on
obspy.Stream
objects, do not worry, all the old functionality is still
supported!import salvus.namespace as sn
import matplotlib.pyplot as plt
import numpy as np
import xarray as xr
import scipy.signal
# Frequency of the plane-harmonic wave in Hz.
f = 10.0
# Velocity in m/s.
velocity = 3000.0
# Wavenumber
k = 2 * np.pi * f / velocity
# Total number of periods included in the data.
number_of_periods = 4
# Temporal resolution of the data
samples_per_period = 20
# Time axis
t = np.linspace(
0,
number_of_periods * (1 / f),
int(number_of_periods * (1 / f) / (1 / (samples_per_period * f))),
)
# Space axis
x = np.arange(25) * 10
# Generate plane-harmonic data at the sampling points
u = np.sin(2 * np.pi * f * t[np.newaxis, :] - k * x[:, np.newaxis])
# Let's visualize the data
plt.imshow(u, aspect="auto")
plt.xlabel("t (# of samples)")
plt.ylabel("x (# of samples)")
# What it would look like to include axis data manually
# plt.imshow(u,extent=[t.min(), t.max(), x.min(), x.max()], aspect="auto")
# The axes are now labeled in whatever units the dimensions are defined.
# plt.xlabel("t")
# plt.ylabel("x")
Text(0, 0.5, 'x (# of samples)')
xarray
can
help us to additionally store this information in a single data structure.
Let us therefore wrap our data into an xarray.DataArray
structure. Upon
creation of the xarray
object can provide the following information:data
: The numpy array containing our data.dims
: A list with the labels for each dimension of our dataset. Our data
is 2D, so we add two labels, one for the time and one for the space
dimension.coords
: Here, we can optionally provide the coordinates for each
dimension. We pass the time and space axis as a dictionary. The keys of the
dictionary have to agree with the dimensions the we set in the previous
row.name
: An optional name for our data.u_xr = xr.DataArray(
data=u,
dims=["x", "t"],
coords={"t": t, "x": x},
name="u(x, t)",
)
u_xr
<xarray.DataArray 'u(x, t)' (x: 25, t: 80)> Size: 16kB array([[ 0.00000000e+00, 3.12796607e-01, 5.94201029e-01, ..., -5.94201029e-01, -3.12796607e-01, -9.79717439e-16], [-2.07911691e-01, 1.08482541e-01, 4.13989494e-01, ..., -7.48443128e-01, -5.03439960e-01, -2.07911691e-01], [-4.06736643e-01, -1.00572732e-01, 2.15684631e-01, ..., -8.69974671e-01, -6.72080571e-01, -4.06736643e-01], ..., [ 9.94521895e-01, 9.11920769e-01, 7.37799515e-01, ..., 8.62021355e-01, 9.77313066e-01, 9.94521895e-01], [ 9.94521895e-01, 9.77313066e-01, 8.62021355e-01, ..., 7.37799515e-01, 9.11920769e-01, 9.94521895e-01], [ 9.51056516e-01, 9.99992093e-01, 9.48568727e-01, ..., 5.81332295e-01, 8.06673158e-01, 9.51056516e-01]]) Coordinates: * t (t) float64 640B 0.0 0.005063 0.01013 0.01519 ... 0.3899 0.3949 0.4 * x (x) int64 200B 0 10 20 30 40 50 60 ... 180 190 200 210 220 230 240
<xarray.DataArray 'u(x, t)' (x: 25, t: 80)> Size: 16kB array([[ 0.00000000e+00, 3.12796607e-01, 5.94201029e-01, ..., -5.94201029e-01, -3.12796607e-01, -9.79717439e-16], [-2.07911691e-01, 1.08482541e-01, 4.13989494e-01, ..., -7.48443128e-01, -5.03439960e-01, -2.07911691e-01], [-4.06736643e-01, -1.00572732e-01, 2.15684631e-01, ..., -8.69974671e-01, -6.72080571e-01, -4.06736643e-01], ..., [ 9.94521895e-01, 9.11920769e-01, 7.37799515e-01, ..., 8.62021355e-01, 9.77313066e-01, 9.94521895e-01], [ 9.94521895e-01, 9.77313066e-01, 8.62021355e-01, ..., 7.37799515e-01, 9.11920769e-01, 9.94521895e-01], [ 9.51056516e-01, 9.99992093e-01, 9.48568727e-01, ..., 5.81332295e-01, 8.06673158e-01, 9.51056516e-01]]) Coordinates: * t (t) float64 640B 0.0 0.005063 0.01013 0.01519 ... 0.3899 0.3949 0.4 * x (x) int64 200B 0 10 20 30 40 50 60 ... 180 190 200 210 220 230 240
xarray.DataArray
, we can use the
plot()
method for a quick visualization. Note that the time and space
coordinates are now represented correctly, as well as adding an appropriate
color scale automatically.# Note how we can perform conventional numpy
# operations on the xarray data arrays, such as
# the applying the transform operator here.
u_xr.T.plot()
<matplotlib.collections.QuadMesh at 0x7dec3f73a7d0>
data
attribute:# Get the data as a numpy array
u = u_xr.data
u
array([[ 0.00000000e+00, 3.12796607e-01, 5.94201029e-01, ..., -5.94201029e-01, -3.12796607e-01, -9.79717439e-16], [-2.07911691e-01, 1.08482541e-01, 4.13989494e-01, ..., -7.48443128e-01, -5.03439960e-01, -2.07911691e-01], [-4.06736643e-01, -1.00572732e-01, 2.15684631e-01, ..., -8.69974671e-01, -6.72080571e-01, -4.06736643e-01], ..., [ 9.94521895e-01, 9.11920769e-01, 7.37799515e-01, ..., 8.62021355e-01, 9.77313066e-01, 9.94521895e-01], [ 9.94521895e-01, 9.77313066e-01, 8.62021355e-01, ..., 7.37799515e-01, 9.11920769e-01, 9.94521895e-01], [ 9.51056516e-01, 9.99992093e-01, 9.48568727e-01, ..., 5.81332295e-01, 8.06673158e-01, 9.51056516e-01]])
# Get the time axis as a numpy array
t = u_xr.t.data
t
array([0. , 0.00506329, 0.01012658, 0.01518987, 0.02025316, 0.02531646, 0.03037975, 0.03544304, 0.04050633, 0.04556962, 0.05063291, 0.0556962 , 0.06075949, 0.06582278, 0.07088608, 0.07594937, 0.08101266, 0.08607595, 0.09113924, 0.09620253, 0.10126582, 0.10632911, 0.11139241, 0.1164557 , 0.12151899, 0.12658228, 0.13164557, 0.13670886, 0.14177215, 0.14683544, 0.15189873, 0.15696203, 0.16202532, 0.16708861, 0.1721519 , 0.17721519, 0.18227848, 0.18734177, 0.19240506, 0.19746835, 0.20253165, 0.20759494, 0.21265823, 0.21772152, 0.22278481, 0.2278481 , 0.23291139, 0.23797468, 0.24303797, 0.24810127, 0.25316456, 0.25822785, 0.26329114, 0.26835443, 0.27341772, 0.27848101, 0.2835443 , 0.28860759, 0.29367089, 0.29873418, 0.30379747, 0.30886076, 0.31392405, 0.31898734, 0.32405063, 0.32911392, 0.33417722, 0.33924051, 0.3443038 , 0.34936709, 0.35443038, 0.35949367, 0.36455696, 0.36962025, 0.37468354, 0.37974684, 0.38481013, 0.38987342, 0.39493671, 0.4 ])
# Get the space axis as a numpy array
x = u_xr.x.data
x
array([ 0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240])
# Generate a list of unique receiver identifiers
receiver_id = [f"R{i:02d}" for i in range(u.shape[0])]
print(f"`receiver_id` is a list of strings, of length: {len(receiver_id)}")
print(receiver_id, "\n")
data_array = xr.DataArray(
data=u,
dims=["receiver_id", "time"],
coords={"time": t, "receiver_id": receiver_id},
name="wave_recording", # Only used for visualization & labelling
)
data_array.plot()
`receiver_id` is a list of strings, of length: 25 ['R00', 'R01', 'R02', 'R03', 'R04', 'R05', 'R06', 'R07', 'R08', 'R09', 'R10', 'R11', 'R12', 'R13', 'R14', 'R15', 'R16', 'R17', 'R18', 'R19', 'R20', 'R21', 'R22', 'R23', 'R24']
<matplotlib.collections.QuadMesh at 0x7dec33fec610>
sel()
method on the
"receiver_id"
dimension:# Select the receiver "R15"
data_array.sel(receiver_id="R15").plot()
[<matplotlib.lines.Line2D at 0x7dec33d46450>]
isel()
method:# Select the three receivers by their index
data_array.isel(receiver_id=[15, 18, 21]).plot()
<matplotlib.collections.QuadMesh at 0x7dec33d2e490>
ds = xr.Dataset(
{
"var_1": data_array,
"var_2": data_array**2,
"var_3": data_array - 0.5,
}
)
ds
<xarray.Dataset> Size: 49kB Dimensions: (time: 80, receiver_id: 25) Coordinates: * time (time) float64 640B 0.0 0.005063 0.01013 ... 0.3899 0.3949 0.4 * receiver_id (receiver_id) <U3 300B 'R00' 'R01' 'R02' ... 'R22' 'R23' 'R24' Data variables: var_1 (receiver_id, time) float64 16kB 0.0 0.3128 ... 0.8067 0.9511 var_2 (receiver_id, time) float64 16kB 0.0 0.09784 ... 0.6507 0.9045 var_3 (receiver_id, time) float64 16kB -0.5 -0.1872 ... 0.3067 0.4511
<xarray.Dataset> Size: 49kB Dimensions: (time: 80, receiver_id: 25) Coordinates: * time (time) float64 640B 0.0 0.005063 0.01013 ... 0.3899 0.3949 0.4 * receiver_id (receiver_id) <U3 300B 'R00' 'R01' 'R02' ... 'R22' 'R23' 'R24' Data variables: var_1 (receiver_id, time) float64 16kB 0.0 0.3128 ... 0.8067 0.9511 var_2 (receiver_id, time) float64 16kB 0.0 0.09784 ... 0.6507 0.9045 var_3 (receiver_id, time) float64 16kB -0.5 -0.1872 ... 0.3067 0.4511
"receiver_id"
and the "component"
. Let us
see how this works in practice."receiver_id"
, the second dimension to the "component"
, and the third
dimension to "time"
.# Let's mock some multicomponent data
# where each component corresponds to a
# scaled version of our original data.
theta = np.pi / 8
# X component
u_x = np.cos(theta) * u
# Y component
u_y = np.sin(theta) * u
# 3D array with multicomponent data,
# where the first dimension corresponds
# to the receiver id, the second to
# the component, and the third to time.
u_multicomponent = np.stack([u_x, u_y], axis=1)
u_multicomponent.shape
(25, 2, 80)
xarray.DataArray
from that.components = ["X", "Y"]
# 3D data structure
data_array_multicomponent = xr.DataArray(
# Data as 3D numpy array
data=u_multicomponent,
# Labels for each dimension
dims=["receiver_id", "component", "time"],
# Coordinates for each dimension
coords={
"receiver_id": receiver_id,
"component": components,
"time": t,
},
)
data_array_multicomponent
<xarray.DataArray (receiver_id: 25, component: 2, time: 80)> Size: 32kB array([[[ 0.00000000e+00, 2.88986383e-01, 5.48970169e-01, ..., -5.48970169e-01, -2.88986383e-01, -9.05140890e-16], [ 0.00000000e+00, 1.19702079e-01, 2.27390889e-01, ..., -2.27390889e-01, -1.19702079e-01, -3.74921632e-16]], [[-1.92085356e-01, 1.00224799e-01, 3.82476420e-01, ..., -6.91471287e-01, -4.65117875e-01, -1.92085356e-01], [-7.95643595e-02, 4.15144712e-02, 1.58426920e-01, ..., -2.86416785e-01, -1.92658132e-01, -7.95643595e-02]], [[-3.75775660e-01, -9.29170890e-02, 1.99266616e-01, ..., -8.03751792e-01, -6.20921484e-01, -3.75775660e-01], [-1.55651375e-01, -3.84875184e-02, 8.25389350e-02, ..., -3.32924893e-01, -2.57194100e-01, -1.55651375e-01]], ..., [[ 9.18818424e-01, 8.42504934e-01, 6.81637871e-01, ..., 7.96403887e-01, 9.02919539e-01, 9.18818424e-01], [ 3.80587052e-01, 3.48976970e-01, 2.82343651e-01, ..., 3.29881291e-01, 3.74001519e-01, 3.80587052e-01]], [[ 9.18818424e-01, 9.02919539e-01, 7.96403887e-01, ..., 6.81637871e-01, 8.42504934e-01, 9.18818424e-01], [ 3.80587052e-01, 3.74001519e-01, 3.29881291e-01, ..., 2.82343651e-01, 3.48976970e-01, 3.80587052e-01]], [[ 8.78661650e-01, 9.23872227e-01, 8.76363232e-01, ..., 5.37081009e-01, 7.45268820e-01, 8.78661650e-01], [ 3.63953572e-01, 3.82680406e-01, 3.63001536e-01, ..., 2.22466238e-01, 3.08700453e-01, 3.63953572e-01]]]) Coordinates: * receiver_id (receiver_id) <U3 300B 'R00' 'R01' 'R02' ... 'R22' 'R23' 'R24' * component (component) <U1 8B 'X' 'Y' * time (time) float64 640B 0.0 0.005063 0.01013 ... 0.3899 0.3949 0.4
<xarray.DataArray (receiver_id: 25, component: 2, time: 80)> Size: 32kB array([[[ 0.00000000e+00, 2.88986383e-01, 5.48970169e-01, ..., -5.48970169e-01, -2.88986383e-01, -9.05140890e-16], [ 0.00000000e+00, 1.19702079e-01, 2.27390889e-01, ..., -2.27390889e-01, -1.19702079e-01, -3.74921632e-16]], [[-1.92085356e-01, 1.00224799e-01, 3.82476420e-01, ..., -6.91471287e-01, -4.65117875e-01, -1.92085356e-01], [-7.95643595e-02, 4.15144712e-02, 1.58426920e-01, ..., -2.86416785e-01, -1.92658132e-01, -7.95643595e-02]], [[-3.75775660e-01, -9.29170890e-02, 1.99266616e-01, ..., -8.03751792e-01, -6.20921484e-01, -3.75775660e-01], [-1.55651375e-01, -3.84875184e-02, 8.25389350e-02, ..., -3.32924893e-01, -2.57194100e-01, -1.55651375e-01]], ..., [[ 9.18818424e-01, 8.42504934e-01, 6.81637871e-01, ..., 7.96403887e-01, 9.02919539e-01, 9.18818424e-01], [ 3.80587052e-01, 3.48976970e-01, 2.82343651e-01, ..., 3.29881291e-01, 3.74001519e-01, 3.80587052e-01]], [[ 9.18818424e-01, 9.02919539e-01, 7.96403887e-01, ..., 6.81637871e-01, 8.42504934e-01, 9.18818424e-01], [ 3.80587052e-01, 3.74001519e-01, 3.29881291e-01, ..., 2.82343651e-01, 3.48976970e-01, 3.80587052e-01]], [[ 8.78661650e-01, 9.23872227e-01, 8.76363232e-01, ..., 5.37081009e-01, 7.45268820e-01, 8.78661650e-01], [ 3.63953572e-01, 3.82680406e-01, 3.63001536e-01, ..., 2.22466238e-01, 3.08700453e-01, 3.63953572e-01]]]) Coordinates: * receiver_id (receiver_id) <U3 300B 'R00' 'R01' 'R02' ... 'R22' 'R23' 'R24' * component (component) <U1 8B 'X' 'Y' * time (time) float64 640B 0.0 0.005063 0.01013 ... 0.3899 0.3949 0.4
"receiver_id"
and "component"
dimensions into a multi-indexed "trace"
dimension. To do
so, we use the stack()
method. This is exactly how data is represented
internally within Salvus: any dimension that is not time
is stacked into
trace
.da_multi_indexed = data_array_multicomponent.stack(
trace=("receiver_id", "component")
)
da_multi_indexed
<xarray.DataArray (time: 80, trace: 50)> Size: 32kB array([[ 0.00000000e+00, 0.00000000e+00, -1.92085356e-01, ..., 3.80587052e-01, 8.78661650e-01, 3.63953572e-01], [ 2.88986383e-01, 1.19702079e-01, 1.00224799e-01, ..., 3.74001519e-01, 9.23872227e-01, 3.82680406e-01], [ 5.48970169e-01, 2.27390889e-01, 3.82476420e-01, ..., 3.29881291e-01, 8.76363232e-01, 3.63001536e-01], ..., [-5.48970169e-01, -2.27390889e-01, -6.91471287e-01, ..., 2.82343651e-01, 5.37081009e-01, 2.22466238e-01], [-2.88986383e-01, -1.19702079e-01, -4.65117875e-01, ..., 3.48976970e-01, 7.45268820e-01, 3.08700453e-01], [-9.05140890e-16, -3.74921632e-16, -1.92085356e-01, ..., 3.80587052e-01, 8.78661650e-01, 3.63953572e-01]]) Coordinates: * time (time) float64 640B 0.0 0.005063 0.01013 ... 0.3899 0.3949 0.4 * trace (trace) object 400B MultiIndex * receiver_id (trace) <U3 600B 'R00' 'R00' 'R01' 'R01' ... 'R23' 'R24' 'R24' * component (trace) <U1 200B 'X' 'Y' 'X' 'Y' 'X' ... 'Y' 'X' 'Y' 'X' 'Y'
<xarray.DataArray (time: 80, trace: 50)> Size: 32kB array([[ 0.00000000e+00, 0.00000000e+00, -1.92085356e-01, ..., 3.80587052e-01, 8.78661650e-01, 3.63953572e-01], [ 2.88986383e-01, 1.19702079e-01, 1.00224799e-01, ..., 3.74001519e-01, 9.23872227e-01, 3.82680406e-01], [ 5.48970169e-01, 2.27390889e-01, 3.82476420e-01, ..., 3.29881291e-01, 8.76363232e-01, 3.63001536e-01], ..., [-5.48970169e-01, -2.27390889e-01, -6.91471287e-01, ..., 2.82343651e-01, 5.37081009e-01, 2.22466238e-01], [-2.88986383e-01, -1.19702079e-01, -4.65117875e-01, ..., 3.48976970e-01, 7.45268820e-01, 3.08700453e-01], [-9.05140890e-16, -3.74921632e-16, -1.92085356e-01, ..., 3.80587052e-01, 8.78661650e-01, 3.63953572e-01]]) Coordinates: * time (time) float64 640B 0.0 0.005063 0.01013 ... 0.3899 0.3949 0.4 * trace (trace) object 400B MultiIndex * receiver_id (trace) <U3 600B 'R00' 'R00' 'R01' 'R01' ... 'R23' 'R24' 'R24' * component (trace) <U1 200B 'X' 'Y' 'X' 'Y' 'X' ... 'Y' 'X' 'Y' 'X' 'Y'
# The shape of this new array is now 2D
print(f"Original shape:\t{data_array_multicomponent.shape}")
print(f"Stacked shape:\t{da_multi_indexed.shape}")
Original shape: (25, 2, 80) Stacked shape: (80, 50)
da_multi_indexed.sel(component="X").T.plot()
<matplotlib.collections.QuadMesh at 0x7dec33ccadd0>
"trace"
dimension
for a specific receiver and component pair like this.# Select the Y component of receiver R11.
da_multi_indexed.sel(trace=("R11", "Y")).plot()
[<matplotlib.lines.Line2D at 0x7dec33c780d0>]
unstack
method, which gives us our original 3D array.da_3D = da_multi_indexed.unstack()
da_3D
<xarray.DataArray (time: 80, receiver_id: 25, component: 2)> Size: 32kB array([[[ 0.00000000e+00, 0.00000000e+00], [-1.92085356e-01, -7.95643595e-02], [-3.75775660e-01, -1.55651375e-01], ..., [ 9.18818424e-01, 3.80587052e-01], [ 9.18818424e-01, 3.80587052e-01], [ 8.78661650e-01, 3.63953572e-01]], [[ 2.88986383e-01, 1.19702079e-01], [ 1.00224799e-01, 4.15144712e-02], [-9.29170890e-02, -3.84875184e-02], ..., [ 8.42504934e-01, 3.48976970e-01], [ 9.02919539e-01, 3.74001519e-01], [ 9.23872227e-01, 3.82680406e-01]], [[ 5.48970169e-01, 2.27390889e-01], [ 3.82476420e-01, 1.58426920e-01], [ 1.99266616e-01, 8.25389350e-02], ..., ... ..., [ 7.96403887e-01, 3.29881291e-01], [ 6.81637871e-01, 2.82343651e-01], [ 5.37081009e-01, 2.22466238e-01]], [[-2.88986383e-01, -1.19702079e-01], [-4.65117875e-01, -1.92658132e-01], [-6.20921484e-01, -2.57194100e-01], ..., [ 9.02919539e-01, 3.74001519e-01], [ 8.42504934e-01, 3.48976970e-01], [ 7.45268820e-01, 3.08700453e-01]], [[-9.05140890e-16, -3.74921632e-16], [-1.92085356e-01, -7.95643595e-02], [-3.75775660e-01, -1.55651375e-01], ..., [ 9.18818424e-01, 3.80587052e-01], [ 9.18818424e-01, 3.80587052e-01], [ 8.78661650e-01, 3.63953572e-01]]]) Coordinates: * receiver_id (receiver_id) <U3 300B 'R00' 'R01' 'R02' ... 'R22' 'R23' 'R24' * component (component) <U1 8B 'X' 'Y' * time (time) float64 640B 0.0 0.005063 0.01013 ... 0.3899 0.3949 0.4
<xarray.DataArray (time: 80, receiver_id: 25, component: 2)> Size: 32kB array([[[ 0.00000000e+00, 0.00000000e+00], [-1.92085356e-01, -7.95643595e-02], [-3.75775660e-01, -1.55651375e-01], ..., [ 9.18818424e-01, 3.80587052e-01], [ 9.18818424e-01, 3.80587052e-01], [ 8.78661650e-01, 3.63953572e-01]], [[ 2.88986383e-01, 1.19702079e-01], [ 1.00224799e-01, 4.15144712e-02], [-9.29170890e-02, -3.84875184e-02], ..., [ 8.42504934e-01, 3.48976970e-01], [ 9.02919539e-01, 3.74001519e-01], [ 9.23872227e-01, 3.82680406e-01]], [[ 5.48970169e-01, 2.27390889e-01], [ 3.82476420e-01, 1.58426920e-01], [ 1.99266616e-01, 8.25389350e-02], ..., ... ..., [ 7.96403887e-01, 3.29881291e-01], [ 6.81637871e-01, 2.82343651e-01], [ 5.37081009e-01, 2.22466238e-01]], [[-2.88986383e-01, -1.19702079e-01], [-4.65117875e-01, -1.92658132e-01], [-6.20921484e-01, -2.57194100e-01], ..., [ 9.02919539e-01, 3.74001519e-01], [ 8.42504934e-01, 3.48976970e-01], [ 7.45268820e-01, 3.08700453e-01]], [[-9.05140890e-16, -3.74921632e-16], [-1.92085356e-01, -7.95643595e-02], [-3.75775660e-01, -1.55651375e-01], ..., [ 9.18818424e-01, 3.80587052e-01], [ 9.18818424e-01, 3.80587052e-01], [ 8.78661650e-01, 3.63953572e-01]]]) Coordinates: * receiver_id (receiver_id) <U3 300B 'R00' 'R01' 'R02' ... 'R22' 'R23' 'R24' * component (component) <U1 8B 'X' 'Y' * time (time) float64 640B 0.0 0.005063 0.01013 ... 0.3899 0.3949 0.4
# A 2-D Box.
d = sn.domain.dim2.BoxDomain(x0=0.0, x1=350.0, y0=-100.0, y1=0.0)
# Create a project.
p = sn.Project.from_domain(domain=d, path="project_2d", load_if_exists=True)
# Model.
mc = sn.ModelConfiguration(
background_model=sn.model.background.homogeneous.IsotropicElastic(
vp=3000.0, vs=1500.0, rho=2000.0
)
)
# Event specific configuration.
ec = sn.EventConfiguration(
# 10 Hz Ricker wavelet
wavelet=sn.simple_config.stf.Ricker(center_frequency=10.0),
waveform_simulation_configuration=sn.WaveformSimulationConfiguration(
end_time_in_seconds=0.3
),
)
# Combine everything into a complete configuration.
p += sn.SimulationConfiguration(
name="simulation",
elements_per_wavelength=1.2,
max_frequency_in_hertz=25.0,
model_configuration=mc,
event_configuration=ec,
# Absorbing boundaries everywhere but the top.
absorbing_boundaries=sn.AbsorbingBoundaryParameters(
reference_frequency=10.0,
number_of_wavelengths=0.0,
reference_velocity=1500.0,
free_surface=["y1"],
),
)
# Add an event.
p += sn.Event(
event_name="event_a",
# A single source in the center.
sources=[
sn.simple_config.source.cartesian.VectorPoint2D(
x=50.0, y=0.0, fy=-10.0, fx=0.0
)
],
# Line of 25 receivers spaced @ 10 meters
receivers=[
sn.simple_config.receiver.cartesian.Point2D(
x=50.0 + (i + 1) * 10.0,
y=0.0,
station_code=f"AA_{i}",
fields=["displacement"],
)
for i in range(25)
],
)
# Visualize the setup.
p.viz.nb.simulation_setup("simulation", events=p.events.list())
[2025-05-21 03:01:00,193] INFO: Creating mesh. Hang on.
<salvus.flow.simple_config.simulation.waveform.Waveform object at 0x7dec33a22110>
# Run it.
p.simulations.launch(
simulation_configuration="simulation",
events=p.events.list(),
site_name="local",
ranks_per_job=1,
)
p.simulations.query(block=True)
[2025-05-21 03:01:00,557] INFO: Submitting job ... Uploading 1 files... 🚀 Submitted job_2505210301237559_63235d1c27@local
True
EventData
object from which we can then get the data using the
get_waveform_data()
method. For example, to retrieve data for the first
receiver with id XX.AA_1.
, we would do the following.# Get the EventData object.
ed = p.waveforms.get("simulation", "event_a")[0]
# Retrieve an obspy stream object for receiver AA_1
st = ed.get_waveform_data(
receiver_name="XX.AA_1.", receiver_field="displacement"
)
st.plot(show=False)
<Figure size 800x500 with 2 Axes>
xarray.DataArray
using the function
get_waveform_data_xarray()
that was introduced with Salvus release
2025.1.0
.da = ed.get_waveform_data_xarray(receiver_field="displacement")
da
<xarray.DataArray 'displacement' (trace: 50, time: 266)> Size: 53kB array([[ 0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ..., -4.1478032e-12, -4.4830905e-12, -4.7161615e-12], [ 0.0000000e+00, 0.0000000e+00, -2.0031971e-19, ..., 5.2360377e-12, 5.3063691e-12, 5.2815118e-12], [ 0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ..., -4.4507397e-12, -4.5939342e-12, -4.6322750e-12], ..., [ 0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ..., -1.6913059e-11, -1.5586617e-11, -1.4017106e-11], [ 0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ..., 5.6617484e-12, 5.1259292e-12, 4.5559281e-12], [ 0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ..., -3.0636202e-11, -2.7980450e-11, -2.5016451e-11]], dtype=float32) Coordinates: * time (time) float64 2kB -0.1559 -0.1542 -0.1525 ... 0.2983 0.3 * trace (trace) object 400B MultiIndex * receiver_id (trace) <U9 2kB 'XX.AA_0.' 'XX.AA_0.' ... 'XX.AA_24.' * component (trace) <U1 200B 'X' 'Y' 'X' 'Y' 'X' ... 'Y' 'X' 'Y' 'X' 'Y' Attributes: start_time_in_seconds: -0.15593936024673521 sampling_rate_in_hertz: 581.217642312331 start_time_in_seconds_utc_datetime: 1969-12-31T23:59:59.844061Z end_time_in_seconds_utc_datetime: 1970-01-01T00:00:00.300000Z reference_time_utc: 1970-01-01T00:00:00.000000Z reference_time_in_seconds: 0.0 npts: 266 location: [[ 60. 0.]\n [ 70. 0.]\n [ 80. ...
<xarray.DataArray 'displacement' (trace: 50, time: 266)> Size: 53kB array([[ 0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ..., -4.1478032e-12, -4.4830905e-12, -4.7161615e-12], [ 0.0000000e+00, 0.0000000e+00, -2.0031971e-19, ..., 5.2360377e-12, 5.3063691e-12, 5.2815118e-12], [ 0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ..., -4.4507397e-12, -4.5939342e-12, -4.6322750e-12], ..., [ 0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ..., -1.6913059e-11, -1.5586617e-11, -1.4017106e-11], [ 0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ..., 5.6617484e-12, 5.1259292e-12, 4.5559281e-12], [ 0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ..., -3.0636202e-11, -2.7980450e-11, -2.5016451e-11]], dtype=float32) Coordinates: * time (time) float64 2kB -0.1559 -0.1542 -0.1525 ... 0.2983 0.3 * trace (trace) object 400B MultiIndex * receiver_id (trace) <U9 2kB 'XX.AA_0.' 'XX.AA_0.' ... 'XX.AA_24.' * component (trace) <U1 200B 'X' 'Y' 'X' 'Y' 'X' ... 'Y' 'X' 'Y' 'X' 'Y' Attributes: start_time_in_seconds: -0.15593936024673521 sampling_rate_in_hertz: 581.217642312331 start_time_in_seconds_utc_datetime: 1969-12-31T23:59:59.844061Z end_time_in_seconds_utc_datetime: 1970-01-01T00:00:00.300000Z reference_time_utc: 1970-01-01T00:00:00.000000Z reference_time_in_seconds: 0.0 npts: 266 location: [[ 60. 0.]\n [ 70. 0.]\n [ 80. ...
"displacement"
. The array has the following two
dimensions, always in this order:"trace"
: A multi-index dimension that combines the dimensions
"receiver_id"
and "component"
. For details on multi-indices, see
section 1 of this tutorial."time
: The time dimension."start_time_in_seconds"
: The time of the first sample."sampling_rate_in_hertz"
: The sampling rate of the data."npts"
: The number of time samples."location"
: The coordinates of each receiver location."X"
and "Y"
and plot them.plt.figure()
# Note that the xarray plotting routines use
# matplotlib under the hood and accept the
# usual input arguments. Here, we make sure
# that both components are plotted with the same
# color range using the vmin and vmax arguments.
da.sel(component="X").plot(vmin=-6e-10, vmax=6e-10)
plt.figure()
da.sel(component="Y").plot(vmin=-6e-10, vmax=6e-10)
# Attributes are stored as a dictionary and can be accessed like this (e.g.,
# for the sampling rate):
<matplotlib.collections.QuadMesh at 0x7dec32a35050>
sr = da.attrs["sampling_rate_in_hertz"]
sr
581.217642312331
unstack()
method to obtain a 3D representation of the
data with dimensions ("receiver_id", "component", "time")
.da_3D = da.unstack()
print(
f"The unstacked array has a shape of {da_3D.shape}, with the dimensions "
f"corresponding to {da_3D.dims}."
)
The unstacked array has a shape of (266, 25, 2), with the dimensions corresponding to ('time', 'receiver_id', 'component').
2025.1.0
, you can now
define custom functions that operate on the new xarray
data structures
directly and add them to a project. Trace-by-trace based custom functions
operating on obspy.Stream
objects are still supported as well. However, you
should observe a significant performance improvement when using the xarray
interface for custom processing functions, especially when working with data
with a large number of receivers.xarray.DataArray
object.data_array
: The Salvus xarray.DataArray
object on which the filtering
will be performed.event
: The Salvus Event
object of the Event that this data corresponds
to.def bandpass_filter(
data_array: xr.DataArray,
event: sn.Event,
) -> xr.DataArray:
"""
A custom bandpass filtering function that operates on an xarray.DataArray
object.
Args:
data_array: The input data array.
event: The Salvus event that the data belongs to.
Returns:
data_array: The filtered output data array.
"""
# Minimum frequency in Hz
min_frequency = 10.0
# Maximum frequency in Hz
max_frequency = 20.0
# The order of the filter.
N = 8
# If true, the resulting filter will be zero phase.
zerophase = True
# Here, we make a copy of the original data
# to avoid modifying the input data when
# applying the filter.
data_array_filtered = data_array.copy()
sampling_rate_in_hertz = data_array_filtered.attrs[
"sampling_rate_in_hertz"
]
nyquist = 0.5 * sampling_rate_in_hertz
sos = scipy.signal.iirfilter(
# The order of the filter.
N=N,
# The critical frequencies.
Wn=(min_frequency / nyquist, max_frequency / nyquist),
btype="band",
# Choose a butterworth filter for the linear phase shift.
ftype="butter",
# Directly output the second order section coefficients.
output="sos",
)
if zerophase:
data_array_filtered.data[:] = scipy.signal.sosfiltfilt(
sos, data_array_filtered.data, axis=1
)
else:
data_array_filtered.data[:] = scipy.signal.sosfilt(
sos, data_array_filtered.data, axis=1
)
# Apply the filter to the data
return data_array_filtered
# Plot the unfiltered data
plt.figure()
da.sel(component="X").plot()
# Plot the filtered data
plt.figure()
da_filt = bandpass_filter(da, ed)
da_filt.sel(component="X").plot()
<matplotlib.collections.QuadMesh at 0x7ded1046fbd0>
ProcessingConfiguration
.# Define a processing configuration
pc = sn.processing.ProcessingConfiguration(
name="bandpass",
data_source_name="SYNTHETIC_DATA:simulation",
processing_function=bandpass_filter,
)
# Add it to the project
p.add_to_project(
pc,
overwrite=True,
)
:
to the data that is being
queried:ed_filt = p.waveforms.get("PROCESSED_DATA:bandpass", "event_a")[0]
da_filt = ed_filt.get_waveform_data_xarray(receiver_field="displacement")
da_filt.sel(component="X").plot()
<matplotlib.collections.QuadMesh at 0x7dec32d53750>
xarray
interface. Here, we want to define a data
selection function that performs a couple of operations on our input data:data_array
: The Salvus xarray.DataArray
object on which the data
selection will be performed.event
: The Salvus Event
object of the event that this data corresponds
to.DataArray
, with the same
dimensions as the DataArray
on which the selection was performed. However,
please note that in this case, not all coordinates of the original DataArray
need to be present: the omission of e.g. a receiver_id
or component
implicitly sets its weights to zero. The data of the weights array itself
should correspond to the temporal weighting function (window) that will be
applied to the data.def select_data(
data_array: xr.DataArray,
event: sn.Event,
):
"""
A custom data selection function that operates on an xarray.DataArray
object.
Args:
data_array: The input data array.
event: The Salvus event that the data belongs to.
Returns:
weights: A data array with the desired selection applied and data
corresponding to the temporal weights to be applied to the data.
"""
# Some stations that we wish to keep.
selected_receiver_ids = ["XX.AA_1.", "XX.AA_2.", "XX.AA_3."]
# The component that we wish to keep.
selected_components = ["Y"]
# This is an intersection of the selection (selected stations) and the
# available (station, component) pairs. Important if not every station
# has every component.
selection = [
(receiver_id, component)
for receiver_id, component in set(data_array.trace.data)
if receiver_id in selected_receiver_ids
and component in selected_components
]
weights = data_array.sel(trace=selection)
# Set the weights to 1, if kept like this, no temporal selection on the
# data is performed
weights.data[:] = 1
# Now we additionally apply a box window on the time dimension,
# where the first half of the data is disregarded.
weights.data[:, : weights.data.shape[1] // 2] = 0
# With the coordinate "trace_weights", we can optionally give a specific
# scalar weight to each trace that will be considered in the computation
# of misfits. You can use it by uncommenting the following line.
# weights.coords["trace_weights"] = ("trace", np.ones((weights.shape[0])))
return weights
DataSelectionConfiguration
and add it to our
project.dsc = sn.DataSelectionConfiguration(
name="data_selection",
receiver_field="displacement",
data_selection_function=select_data,
)
p.add_to_project(
dsc,
overwrite=True,
)
MisfitConfiguration
. To check whether the data selection works as intended,
we retrieve data from the project and specify that we want to use our new
data selection configuration.ed_filt_sel = p.waveforms.get(
"PROCESSED_DATA:bandpass",
"event_a",
data_selection_configuration="data_selection",
)[0]
da_filt_sel = ed_filt_sel.get_waveform_data_xarray(
receiver_field="displacement"
)
da_filt_sel
<xarray.DataArray 'displacement' (trace: 3, time: 266)> Size: 3kB array([[ 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, -0.00000000e+00, -0.00000000e+00, -0.00000000e+00, -0.00000000e+00, -0.00000000e+00, -0.00000000e+00, -0.00000000e+00, -0.00000000e+00, -0.00000000e+00, -0.00000000e+00, -0.00000000e+00, -0.00000000e+00, -0.00000000e+00, -0.00000000e+00, -0.00000000e+00, -0.00000000e+00, -0.00000000e+00, -0.00000000e+00, -0.00000000e+00, -0.00000000e+00, -0.00000000e+00, -0.00000000e+00, -0.00000000e+00, -0.00000000e+00, -0.00000000e+00, -0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, ... -2.90627435e-11, -2.62216724e-11, -2.28313167e-11, -1.90024246e-11, -1.48555161e-11, -1.05167671e-11, -6.11381041e-12, -1.77160532e-12, 2.39147287e-12, 6.26719466e-12, 9.76056909e-12, 1.27923748e-11, 1.53010521e-11, 1.72439233e-11, 1.85977067e-11, 1.93583587e-11, 1.95402444e-11, 1.91747347e-11, 1.83082664e-11, 1.70000004e-11, 1.53191453e-11, 1.33420896e-11, 1.11494312e-11, 8.82304299e-12, 6.44325912e-12, 4.08628730e-12, 1.82192846e-12, -2.88330639e-13, -2.19280670e-12, -3.85065851e-12, -5.23246680e-12, -6.32036264e-12, -7.10773150e-12, -7.59853118e-12, -7.80627559e-12, -7.75275243e-12, -7.46653260e-12, -6.98135820e-12, -6.33446334e-12, -5.56491441e-12, -4.71202287e-12, -3.81388714e-12, -2.90611719e-12, -2.02076637e-12, -1.18550536e-12, -4.23043897e-13, 2.49197391e-13, 8.19173288e-13, 1.28007858e-12, 1.63001708e-12, 1.87150816e-12, 2.01086673e-12, 2.05749567e-12, 2.02313145e-12, 1.92107658e-12, 1.76545714e-12, 1.57053331e-12, 1.35008693e-12, 1.11690626e-12]], dtype=float32) Coordinates: * time (time) float64 2kB -0.1559 -0.1542 -0.1525 ... 0.2983 0.3 * trace (trace) object 24B MultiIndex * receiver_id (trace) <U9 108B 'XX.AA_3.' 'XX.AA_1.' 'XX.AA_2.' * component (trace) <U1 12B 'Y' 'Y' 'Y' Attributes: start_time_in_seconds: -0.15593936024673521 sampling_rate_in_hertz: 581.217642312331 start_time_in_seconds_utc_datetime: 1969-12-31T23:59:59.844061Z end_time_in_seconds_utc_datetime: 1970-01-01T00:00:00.300000Z reference_time_utc: 1970-01-01T00:00:00.000000Z reference_time_in_seconds: 0.0 npts: 266 location: [[ 60. 0.]\n [ 70. 0.]\n [ 80. ... temporal_weights: <xarray.DataArray 'displacement' (tr...
<xarray.DataArray 'displacement' (trace: 3, time: 266)> Size: 3kB array([[ 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, -0.00000000e+00, -0.00000000e+00, -0.00000000e+00, -0.00000000e+00, -0.00000000e+00, -0.00000000e+00, -0.00000000e+00, -0.00000000e+00, -0.00000000e+00, -0.00000000e+00, -0.00000000e+00, -0.00000000e+00, -0.00000000e+00, -0.00000000e+00, -0.00000000e+00, -0.00000000e+00, -0.00000000e+00, -0.00000000e+00, -0.00000000e+00, -0.00000000e+00, -0.00000000e+00, -0.00000000e+00, -0.00000000e+00, -0.00000000e+00, -0.00000000e+00, -0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, 0.00000000e+00, ... -2.90627435e-11, -2.62216724e-11, -2.28313167e-11, -1.90024246e-11, -1.48555161e-11, -1.05167671e-11, -6.11381041e-12, -1.77160532e-12, 2.39147287e-12, 6.26719466e-12, 9.76056909e-12, 1.27923748e-11, 1.53010521e-11, 1.72439233e-11, 1.85977067e-11, 1.93583587e-11, 1.95402444e-11, 1.91747347e-11, 1.83082664e-11, 1.70000004e-11, 1.53191453e-11, 1.33420896e-11, 1.11494312e-11, 8.82304299e-12, 6.44325912e-12, 4.08628730e-12, 1.82192846e-12, -2.88330639e-13, -2.19280670e-12, -3.85065851e-12, -5.23246680e-12, -6.32036264e-12, -7.10773150e-12, -7.59853118e-12, -7.80627559e-12, -7.75275243e-12, -7.46653260e-12, -6.98135820e-12, -6.33446334e-12, -5.56491441e-12, -4.71202287e-12, -3.81388714e-12, -2.90611719e-12, -2.02076637e-12, -1.18550536e-12, -4.23043897e-13, 2.49197391e-13, 8.19173288e-13, 1.28007858e-12, 1.63001708e-12, 1.87150816e-12, 2.01086673e-12, 2.05749567e-12, 2.02313145e-12, 1.92107658e-12, 1.76545714e-12, 1.57053331e-12, 1.35008693e-12, 1.11690626e-12]], dtype=float32) Coordinates: * time (time) float64 2kB -0.1559 -0.1542 -0.1525 ... 0.2983 0.3 * trace (trace) object 24B MultiIndex * receiver_id (trace) <U9 108B 'XX.AA_3.' 'XX.AA_1.' 'XX.AA_2.' * component (trace) <U1 12B 'Y' 'Y' 'Y' Attributes: start_time_in_seconds: -0.15593936024673521 sampling_rate_in_hertz: 581.217642312331 start_time_in_seconds_utc_datetime: 1969-12-31T23:59:59.844061Z end_time_in_seconds_utc_datetime: 1970-01-01T00:00:00.300000Z reference_time_utc: 1970-01-01T00:00:00.000000Z reference_time_in_seconds: 0.0 npts: 266 location: [[ 60. 0.]\n [ 70. 0.]\n [ 80. ... temporal_weights: <xarray.DataArray 'displacement' (tr...
select_data
is not present at all in the retrieved data.# Present components and stations
list(da_filt_sel.unstack().component.data), list(
da_filt_sel.unstack().receiver_id.data
)
(['Y'], ['XX.AA_3.', 'XX.AA_1.', 'XX.AA_2.'])
da_filt_sel.sel(component="Y").plot()
<matplotlib.collections.QuadMesh at 0x7dec32f6c610>
xarray
structures, which allows us
to vectorize the misfit computations. If you want to define a custom misfit
function, you can now take advantage of this. Additionally, it is now
possible to define time-and-space dependent misfit functions.xarray.DataArray
object.xarray.DataArray
object.xarray.Dataset
object. A xarray.Dataset
object
combines multiple xarray.DataArray
's. The output Dataset
will contain two
DataArray
's, one for the adjoint sources and one for the per-trace misfits.def l2_misfit_and_adjoint_source(
data_array_synthetic: xr.DataArray,
data_array_observed: xr.DataArray,
event: sn.Event,
) -> xr.Dataset:
"""
L2 misfit and adjoint source.
Using this only serves for demonstration purposes right now as Salvus
already comes with this misfit implemented.
Both data traces have been resampled to exactly the same points in time.
Args:
data_array_synthetic: synthetic waveforms.
data_array_observed: observed waveforms.
event: Event object.
"""
diff = data_array_observed - data_array_synthetic
misfits = (
0.5
* (diff**2).sum(axis=1)
/ data_array_observed.attrs["sampling_rate_in_hertz"]
)
adj_src = diff / data_array_observed.attrs["sampling_rate_in_hertz"]
return xr.Dataset(
data_vars={
"adjoint_sources": adj_src,
"misfits": misfits,
},
attrs={
"sampling_rate_in_hertz": data_array_synthetic.attrs[
"sampling_rate_in_hertz"
],
"start_time_in_seconds": data_array_synthetic.attrs.get(
"start_time_in_seconds", 0.0
),
},
)
mc = sn.MisfitConfiguration(
name="custom_L2",
observed_data="PROCESSED_DATA:bandpass",
misfit_function=l2_misfit_and_adjoint_source,
receiver_field="displacement",
data_selection_configuration="data_selection",
)
p.add_to_project(mc)
p
<salvus.project.project.Project at 0x7dec33a34c50>