- “minimal”: Only coordinates in which the dimension already appears
  are included.
- “different”: Coordinates which are not equal (ignoring attributes)
  across all datasets are also concatenated (as well as all for which
  the dimension already appears). Beware: this option may load the data
  payload of coordinate variables into memory if they are not already
  loaded.
- “all”: All coordinate variables will be concatenated, except
  those corresponding to other dimensions.
- list of str: The listed coordinate variables will be concatenated,
  in addition to the “minimal” coordinates.
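The same option names are accepted by xarray.concat, which open_mfdataset uses under the hood, so their effect can be sketched on in-memory datasets (the datasets and the "station" coordinate below are invented for illustration):

```python
import xarray as xr

# Two datasets sharing dimension "time"; "station" is a non-dimension
# coordinate whose value differs between them.
ds1 = xr.Dataset(
    {"temp": ("time", [10.0, 11.0])},
    coords={"time": [0, 1], "station": 3},
)
ds2 = xr.Dataset(
    {"temp": ("time", [12.0, 13.0])},
    coords={"time": [2, 3], "station": 7},
)

# With coords="different", "station" differs across the datasets, so it
# is also concatenated along "time" instead of being required to match.
combined = xr.concat([ds1, ds2], dim="time", coords="different")
print(combined.station.dims)  # station gains the "time" dimension
```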
- parallel (bool, default: False) – If True, the open and preprocess
  steps of this function will be performed in parallel using
  dask.delayed.
- join ({"outer", "inner", "left", "right", "exact", "override"},
  default: "outer") – String indicating how to combine differing indexes
  (excluding concat_dim) in objects:
  - “outer”: use the union of object indexes
  - “inner”: use the intersection of object indexes
  - “left”: use indexes from the first object with each dimension
  - “right”: use indexes from the last object with each dimension
  - “exact”: instead of aligning, raise ValueError when indexes to be
    aligned are not equal
  - “override”: if indexes are of same size, rewrite indexes to be
    those of the first object with that dimension. Indexes for the same
    dimension must have the same size in all objects.
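These are the same join options accepted by xr.align, so a minimal in-memory sketch (with made-up coordinates) shows what "inner" and "outer" do to differing indexes:

```python
import xarray as xr

# Two arrays whose "x" indexes overlap only partially.
a = xr.DataArray([1, 2, 3], dims="x", coords={"x": [0, 1, 2]})
b = xr.DataArray([10, 20, 30], dims="x", coords={"x": [1, 2, 3]})

# "inner" keeps the intersection of the indexes; "outer" keeps the
# union (introducing NaN where an object has no value).
inner_a, inner_b = xr.align(a, b, join="inner")
outer_a, outer_b = xr.align(a, b, join="outer")
print(list(inner_a.x.values))  # [1, 2]
print(list(outer_a.x.values))  # [0, 1, 2, 3]
```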
- attrs_file (str or path-like, optional) – Path of the file used to
  read global attributes from. By default global attributes are read
  from the first file provided, with wildcard matches sorted by
  filename.
- combine_attrs ({"drop", "identical", "no_conflicts", "drop_conflicts",
  "override"} or callable, default: "override") – A callable or a
  string indicating how to combine attrs of the objects being merged:
  - “drop”: empty attrs on returned Dataset.
  - “identical”: all attrs must be the same on every object.
  - “no_conflicts”: attrs from all objects are combined; any that have
    the same name must also have the same value.
  - “drop_conflicts”: attrs from all objects are combined; any that have
    the same name but different values are dropped.
  - “override”: skip comparing and copy attrs from the first dataset to
    the result.
  If a callable, it must expect a sequence of attrs dicts and a context
  object as its only parameters.
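The same combine_attrs options are accepted by xr.merge, so their behavior can be sketched in memory (the attrs below are invented):

```python
import xarray as xr

# Two datasets with one conflicting attr ("title") and one shared one.
ds1 = xr.Dataset(attrs={"title": "run A", "institution": "lab"})
ds2 = xr.Dataset(attrs={"title": "run B", "institution": "lab"})

# "drop_conflicts": equal attrs survive, conflicting ones are dropped.
merged_dc = xr.merge([ds1, ds2], combine_attrs="drop_conflicts")
print(merged_dc.attrs)  # {'institution': 'lab'}

# "override": attrs are copied wholesale from the first object.
merged_ov = xr.merge([ds1, ds2], combine_attrs="override")
print(merged_ov.attrs)  # {'title': 'run A', 'institution': 'lab'}
```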
- **kwargs (optional) – Additional arguments passed on to
  xarray.open_dataset(). For an overview of some of the possible
  options, see the documentation of xarray.open_dataset().
Returns
xarray.Dataset
Notes
open_mfdataset opens files with read-only access. When you modify
values of a Dataset, even one linked to files on disk, only the
in-memory copy you are manipulating in xarray is modified: the original
file on disk is never touched.
See also
combine_by_coords, combine_nested, open_dataset
Examples
A user might want to pass additional arguments into preprocess when
applying some operation to many individual files that are being opened.
One route to do this is through the use of functools.partial.
>>> from functools import partial
>>> def _preprocess(x, lon_bnds, lat_bnds):
... return x.sel(lon=slice(*lon_bnds), lat=slice(*lat_bnds))
>>> lon_bnds, lat_bnds = (-110, -105), (40, 45)
>>> partial_func = partial(_preprocess, lon_bnds=lon_bnds, lat_bnds=lat_bnds)
>>> ds = xr.open_mfdataset(
... "file_*.nc", concat_dim="time", preprocess=partial_func
... )
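Before wiring such a partial into open_mfdataset, it can be checked in memory, since preprocess simply receives each opened Dataset. A minimal sketch with a toy dataset (the variable name, coordinates, and bounds below are invented):

```python
from functools import partial

import numpy as np
import xarray as xr

def _preprocess(x, lon_bnds, lat_bnds):
    # Restrict the dataset to the given lon/lat window.
    return x.sel(lon=slice(*lon_bnds), lat=slice(*lat_bnds))

# A toy dataset standing in for one of the files being opened.
ds = xr.Dataset(
    {"t2m": (("lat", "lon"), np.zeros((5, 5)))},
    coords={
        "lat": [38, 40, 42, 44, 46],
        "lon": [-112, -110, -108, -106, -104],
    },
)

partial_func = partial(_preprocess, lon_bnds=(-110, -105), lat_bnds=(40, 45))
subset = partial_func(ds)
print(dict(subset.sizes))  # only points inside the bounds remain
```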
It is also possible to use any argument to open_dataset together with
open_mfdataset, such as, for example, drop_variables:
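As a minimal sketch of that forwarding, drop_variables can be demonstrated with open_dataset, to which open_mfdataset passes its **kwargs; the file, variable names, and round-trip through a temporary directory below are invented for illustration and assume a writable netCDF backend is installed:

```python
import os
import tempfile

import numpy as np
import xarray as xr

# A toy dataset with two variables, one of which we will drop at open time.
ds = xr.Dataset(
    {"temp": ("time", np.arange(3.0)), "pressure": ("time", np.arange(3.0))},
    coords={"time": [0, 1, 2]},
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "file_0.nc")
    ds.to_netcdf(path)
    # drop_variables is handled at open time, so "pressure" is never loaded.
    with xr.open_dataset(path, drop_variables="pressure") as opened:
        kept = list(opened.data_vars)
print(kept)  # ['temp']
```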