Note
Download this example as a Jupyter notebook here: pypsa/atlite
Creating a Cutout with ERA5#
In this example we download ERA5 data on demand for a cutout we want to create. (Atlite also works with other data sources, but ERA5 is the easiest one to get started with.)
This only works if you have beforehand:
Installed the Copernicus Climate Data Store cdsapi package
Registered and set up your CDS API key as described on their website here
Import the package first:
[1]:
import atlite
We implement notifications in atlite using loggers from the logging library.
We recommend you always set up a logger to get information on what is going on. For debugging, you can use the more verbose level=logging.DEBUG:
[2]:
import logging
logging.basicConfig(level=logging.INFO)
Defining the Cutout extent#
A cutout is the basis for any of your work and calculations. Defining it will not yet trigger any major operations.
The cutout is created in the directory and file specified by the relative path.
If a cutout at the given location already exists, this command will simply load it again. If the cutout does not yet exist, it will specify the new to-be-created cutout.
Note
Before ERA5 data can be downloaded, it has to be processed by the CDS servers. This might take a while depending on the amount of data requested. You can check the status of your request here.
[3]:
cutout = atlite.Cutout(
path="western-europe-2011-01.nc",
module="era5",
x=slice(-13.6913, 1.7712),
y=slice(49.9096, 60.8479),
time="2011-01",
)
INFO:atlite.cutout:Building new cutout western-europe-2011-01.nc
For creating the cutout, you need to specify:
The dataset to create the cutout with (era5)
The time period it covers
The longitude x and latitude y ranges it stretches over
Here we went with the ERA5 dataset from ECMWF (module="era5").
We decided to provide the time period of the cutout as a string, because it is only a month. You could also have specified it as a time range, slice("2011-01", "2011-01").
The regional bounds (the space the cutout stretches over) were specified by
x=slice(-13.6913, 1.7712) # Longitude
y=slice(49.9096, 60.8479) # Latitude
and describe a rectangle’s edges. In this case we drew a rectangle containing parts of the Atlantic Ocean, the Republic of Ireland and the UK.
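If you already have a region’s bounding box as a (minx, miny, maxx, maxy) tuple (the convention used by shapely and geopandas .bounds), a small helper can turn it into the two slices. This is a hypothetical convenience function, not part of atlite’s API:

```python
# Hypothetical helper (not part of atlite): turn a (minx, miny, maxx, maxy)
# bounding box into the x/y slices expected by atlite.Cutout.
def bounds_to_slices(bounds):
    minx, miny, maxx, maxy = bounds
    return slice(minx, maxx), slice(miny, maxy)

# The bounding box of the rectangle used above:
x, y = bounds_to_slices((-13.6913, 49.9096, 1.7712, 60.8479))
```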
Preparing the Cutout#
If the cutout does not yet exist or is missing some features, we have to tell atlite to go ahead and prepare it.
No matter which dataset you use, this is where all the work actually happens. This can be fast or take a lot of time and resources, depending among other things on your computer's resources and (for downloading e.g. ERA5 data) your internet connection.
[4]:
cutout.prepare()
INFO:atlite.data:Storing temporary files in /tmp/tmpc92pd1sr
INFO:atlite.data:Calculating and writing with module era5:
INFO:atlite.datasets.era5:Requesting data for feature runoff...
INFO:atlite.datasets.era5:Requesting data for feature influx...
INFO:atlite.datasets.era5:Requesting data for feature temperature...
INFO:atlite.datasets.era5:Requesting data for feature wind...
INFO:numexpr.utils:NumExpr defaulting to 4 threads.
INFO:atlite.datasets.era5:CDS: Downloading variables
* 100m_u_component_of_wind (2011)
* 100m_v_component_of_wind (2011)
* forecast_surface_roughness (2011)
INFO:atlite.datasets.era5:CDS: Downloading variables
* 2m_temperature (2011)
* soil_temperature_level_4 (2011)
INFO:atlite.datasets.era5:Requesting data for feature height...
INFO:atlite.datasets.era5:CDS: Downloading variables
* runoff (2011)
INFO:atlite.datasets.era5:CDS: Downloading variables
* surface_net_solar_radiation (2011)
* surface_solar_radiation_downwards (2011)
* toa_incident_solar_radiation (2011)
* total_sky_direct_solar_radiation_at_surface (2011)
INFO:atlite.datasets.era5:CDS: Downloading variables
* orography (2011)
[########################################] | 100% Completed | 0.5s
[4]:
<Cutout "western-europe-2011-01">
x = -13.50 ⟷ 1.75, dx = 0.25
y = 50.00 ⟷ 60.75, dy = 0.25
time = 2011-01-01 ⟷ 2011-01-31, dt = H
module = era5
prepared_features = ['height', 'wind', 'influx', 'temperature', 'runoff']
The cutout.prepare() function takes a list of features which should be prepared. When this is not specified, all available features are built.
After execution, all downloaded data is stored at cutout.path. By default it is not loaded into memory, but into dask arrays. This keeps the memory consumption extremely low.
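For instance, to (re)build only some of the variables instead of all features, a subset of feature names can be passed. A sketch, assuming the cutout object from above:

```python
# Prepare only selected features; the names must come from the module's
# feature set (here "wind" and "influx", as listed in the log output above).
cutout.prepare(features=["wind", "influx"])
```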
The data is accessible in cutout.data, which is an xarray.Dataset. Querying the cutout gives us some basic information on which data is contained in it.
[5]:
cutout.data
[5]:
<xarray.Dataset>
Dimensions:           (time: 744, x: 62, y: 44)
Coordinates:
  * x                 (x) float64 -13.5 -13.25 -13.0 -12.75 ... 1.25 1.5 1.75
  * y                 (y) float64 50.0 50.25 50.5 50.75 ... 60.25 60.5 60.75
  * time              (time) datetime64[ns] 2011-01-01 ... 2011-01-31T23:00:00
    lon               (x) float64 dask.array<chunksize=(62,), meta=np.ndarray>
    lat               (y) float64 dask.array<chunksize=(44,), meta=np.ndarray>
Data variables:
    height            (y, x) float32 dask.array<chunksize=(44, 62), meta=np.ndarray>
    wnd100m           (time, y, x) float32 dask.array<chunksize=(100, 44, 62), meta=np.ndarray>
    roughness         (time, y, x) float32 dask.array<chunksize=(100, 44, 62), meta=np.ndarray>
    influx_toa        (time, y, x) float32 dask.array<chunksize=(100, 44, 62), meta=np.ndarray>
    influx_direct     (time, y, x) float32 dask.array<chunksize=(100, 44, 62), meta=np.ndarray>
    influx_diffuse    (time, y, x) float32 dask.array<chunksize=(100, 44, 62), meta=np.ndarray>
    albedo            (time, y, x) float32 dask.array<chunksize=(100, 44, 62), meta=np.ndarray>
    temperature       (time, y, x) float32 dask.array<chunksize=(100, 44, 62), meta=np.ndarray>
    soil temperature  (time, y, x) float32 dask.array<chunksize=(100, 44, 62), meta=np.ndarray>
    runoff            (time, y, x) float32 dask.array<chunksize=(100, 44, 62), meta=np.ndarray>
Attributes:
    module:             era5
    prepared_features:  ['influx', 'temperature', 'wind', 'height', 'runoff']
    chunksize_time:     100
We can again break down which data array belongs to which feature.
[6]:
cutout.prepared_features
[6]:
module feature
era5 height height
wind wnd100m
wind roughness
influx influx_toa
influx influx_direct
influx influx_diffuse
influx albedo
temperature temperature
temperature soil temperature
runoff runoff
dtype: object
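Individual variables can be pulled out of cutout.data with standard xarray indexing. The selection stays lazy (dask-backed) until the values are actually needed. A sketch, assuming the cutout from above:

```python
# Select the 100 m wind speed field at the first time step; this only builds
# a lazy dask graph and does not yet read the data from disk.
wnd = cutout.data.wnd100m.isel(time=0)

# .compute() (or accessing .values) triggers the actual read into memory.
wnd_values = wnd.compute()
```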
If you have matplotlib installed, you can directly use the plotting functionality from xarray to plot features from the cutout’s data.
Warning
This will trigger xarray to load all the corresponding data from disk into memory!
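For example, to draw a map of the static terrain height field (a sketch assuming matplotlib is installed and the cutout from above exists):

```python
# Plot the 2D height field; xarray dispatches to matplotlib and labels
# the axes with the x/y coordinates automatically.
cutout.data.height.plot()
```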
Now that your cutout is created and prepared, you can call conversion functions such as cutout.pv or cutout.wind. Note that these require a bit more information, like what kind of PV panels to use, where they stand, etc. Please have a look at the other examples to get a picture of the application cases.
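As a first taste, a hedged sketch of such a conversion call. The turbine name is assumed to be one of the turbine configurations shipped with atlite, and capacity_factor=True is assumed to return per-grid-cell capacity factors rather than absolute generation; see the dedicated examples for the full set of options:

```python
# Per-grid-cell wind capacity factors for an example turbine configuration.
cf = cutout.wind(turbine="Vestas_V112_3MW", capacity_factor=True)
```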
Reducing Cutout file sizes#
Cutouts can become quite large, depending on the spatial and temporal scope they cover. By default atlite uses a trade-off between speed and compression to reduce the file size of cutouts.
Stronger compression can be selected when creating a new cutout by choosing a higher complevel (1 to 9, default: 4):
cutout.prepare(compression={"zlib": True, "complevel": 9})
To change the compression for an existing cutout:
# Load the existing cutout
cutout = atlite.Cutout("cutout-path.nc")
# Update the netCDF encoding of every data variable, then rewrite the file
compression = {"zlib": True, "complevel": 9}
for var in cutout.data.data_vars:
    cutout.data[var].encoding.update(compression)
cutout.to_file()
For more details and further arguments for compression, see the xarray documentation.
Alternatively, a cutout can also be compressed using the netCDF utility nccopy from the command line:
nccopy -d4 -s <input cutout .nc file> <output cutout .nc file>