
Creating a Cutout with ERA5

In this example we download ERA5 data on demand for a cutout we want to create. (Atlite also works with other data sources, but ERA5 is the easiest one to get started with.)

This only works if you have already

  • Installed the Copernicus Climate Data Store cdsapi package

  • Registered and set up your CDS API key as described on the CDS website

Import the package first:

import atlite

We implement notifications in atlite using loggers from the logging library.

We recommend you always launch a logger to get information on what is going on. For debugging, you can use the more verbose level=logging.DEBUG:

import logging
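The import alone does not configure any output. The following is a minimal sketch of one possible setup using only the standard library. The choice of INFO as the general level is an assumption, since the text only names DEBUG as the more verbose option, and raising only the "atlite" logger to DEBUG is an illustration rather than a recommendation from the atlite authors:

```python
import logging

# Send log records to the console. INFO is an assumed default level.
logging.basicConfig(level=logging.INFO)

# Optionally make only atlite more verbose while debugging. Its messages are
# emitted under logger names starting with "atlite", e.g. "atlite.cutout".
atlite_logger = logging.getLogger("atlite")
atlite_logger.setLevel(logging.DEBUG)
```

Child loggers such as "atlite.cutout" inherit the level set on "atlite", so other libraries stay at the general level.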

Defining the Cutout extent

Defining the cutout does not yet trigger any major operations.

A cutout is the basis for any of your work and calculations.

The cutout is created in the directory and file specified by the relative path passed as path. If a cutout already exists at the given location, this command simply loads it again. If the cutout does not yet exist, the command only specifies the new, to-be-created cutout.

cutout = atlite.Cutout(path="",
                       module="era5",
                       x=slice(-13.6913, 1.7712),
                       y=slice(49.9096, 60.8479),
                       time="2011-01")
INFO:atlite.cutout:Building new cutout

To create the cutout, you need to specify:

  • The dataset to create the cutout with (era5)

  • The time period it covers

  • The longitude (x) and latitude (y) ranges it spans

Here we went with the ERA5 dataset from ECMWF.

We provided the time period of the cutout as a string because it covers only a single month. You could also have specified it as a time range.
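As a rough sketch of the alternative, the same month can be written as a slice with an explicit start and end. That atlite treats both forms the same way is an assumption here, not something this example verifies. The helper `hourly_steps` is hypothetical and only counts the hourly time steps the month contains:

```python
import calendar

# Two ways to describe January 2011. The slice form is the assumed
# "time range" variant mentioned in the text.
time_as_string = "2011-01"
time_as_range = slice("2011-01-01", "2011-01-31")


def hourly_steps(year: int, month: int) -> int:
    """Number of hourly time steps in a calendar month (hypothetical helper)."""
    days_in_month = calendar.monthrange(year, month)[1]
    return days_in_month * 24


n_steps = hourly_steps(2011, 1)
print(n_steps)  # 744
```

The result matches the time dimension of 744 reported in the dataset summary later in this example.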


The regional bounds (the area the cutout spans) were specified by

x=slice(-13.6913, 1.7712) # Longitude
y=slice(49.9096, 60.8479) # Latitude

and describe the edges of a rectangle. In this case we drew a rectangle containing parts of the Atlantic Ocean, the Republic of Ireland and the UK.
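To see how these bounds relate to the data grid, the sketch below lists the grid points that fall inside the rectangle. It assumes that grid points sit at multiples of 0.25 degrees and that only points inside the bounds are kept. This is consistent with the cutout summary printed later in this example, but it is not taken from atlite's code. The helper `grid_points_within` is hypothetical:

```python
import math


def grid_points_within(lower: float, upper: float, step: float = 0.25):
    """Return (first, last, count) of the multiples of `step` in [lower, upper]."""
    first_index = math.ceil(lower / step)
    last_index = math.floor(upper / step)
    count = last_index - first_index + 1
    return first_index * step, last_index * step, count


x_first, x_last, x_count = grid_points_within(-13.6913, 1.7712)
y_first, y_last, y_count = grid_points_within(49.9096, 60.8479)
print(x_first, x_last, x_count)  # -13.5 1.75 62
print(y_first, y_last, y_count)  # 50.0 60.75 44
```

Under these assumptions the requested bounds shrink slightly to the nearest grid points inside the rectangle, giving 62 points in x and 44 points in y.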

Preparing the Cutout

If the cutout does not yet exist, or if some features are not yet included in it, we have to tell atlite to go ahead and prepare it.

No matter which dataset you use, this is where all the work actually happens. Depending on your computer resources and, when downloading data such as ERA5, on your internet connection, this step can be fast or take considerable time and resources.

cutout.prepare()
temporary files in /tmp/tmpc92pd1sr and writing with module era5:
INFO:atlite.datasets.era5:Requesting data for feature runoff...
INFO:atlite.datasets.era5:Requesting data for feature influx...
INFO:atlite.datasets.era5:Requesting data for feature temperature...
INFO:atlite.datasets.era5:Requesting data for feature wind...
INFO:numexpr.utils:NumExpr defaulting to 4 threads.
INFO:atlite.datasets.era5:CDS: Downloading variables
         * 100m_u_component_of_wind (2011)
         * 100m_v_component_of_wind (2011)
         * forecast_surface_roughness (2011)

INFO:atlite.datasets.era5:CDS: Downloading variables
         * 2m_temperature (2011)
         * soil_temperature_level_4 (2011)

INFO:atlite.datasets.era5:Requesting data for feature height...
INFO:atlite.datasets.era5:CDS: Downloading variables
         * runoff (2011)

INFO:atlite.datasets.era5:CDS: Downloading variables
         * surface_net_solar_radiation (2011)
         * surface_solar_radiation_downwards (2011)
         * toa_incident_solar_radiation (2011)
         * total_sky_direct_solar_radiation_at_surface (2011)

INFO:atlite.datasets.era5:CDS: Downloading variables
         * orography (2011)

[########################################] | 100% Completed |  0.5s
<Cutout "western-europe-2011-01">
 x = -13.50 ⟷ 1.75, dx = 0.25
 y = 50.00 ⟷ 60.75, dy = 0.25
 time = 2011-01-01 ⟷ 2011-01-31, dt = H
 module = era5
 prepared_features = ['height', 'wind', 'influx', 'temperature', 'runoff']

The cutout.prepare() function accepts an optional list of features to prepare. If no list is given, all available features are built.
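When only some features are needed, it can help to work out which ones are still missing before calling prepare. The sketch below is plain Python and does not touch atlite. The helper `missing_features` is hypothetical, and the keyword name `features` in the final comment is an assumption about the prepare signature:

```python
from typing import Iterable, List


def missing_features(requested: Iterable[str], prepared: Iterable[str]) -> List[str]:
    """Return the requested features that are not prepared yet, in request order."""
    already_prepared = set(prepared)
    return [name for name in requested if name not in already_prepared]


# Feature names as they appear in the cutout summary of this example.
prepared = ["height", "wind"]
todo = missing_features(["wind", "influx", "temperature"], prepared)
print(todo)  # ['influx', 'temperature']

# With a real cutout, the list could then be passed on, assuming the
# keyword is named `features`:
# cutout.prepare(features=todo)
```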

After execution, all downloaded data is stored at cutout.path. By default, the data is not loaded into memory but exposed as dask arrays, which keeps memory consumption low.

The data is accessible as an xarray.Dataset. Querying it gives us some basic information on which data the cutout contains.

Dimensions:           (time: 744, x: 62, y: 44)
Coordinates:
  * x                 (x) float64 -13.5 -13.25 -13.0 -12.75 ... 1.25 1.5 1.75
  * y                 (y) float64 50.0 50.25 50.5 50.75 ... 60.25 60.5 60.75
  * time              (time) datetime64[ns] 2011-01-01 ... 2011-01-31T23:00:00
    lon               (x) float64 dask.array<chunksize=(62,), meta=np.ndarray>
    lat               (y) float64 dask.array<chunksize=(44,), meta=np.ndarray>
Data variables:
    height            (y, x) float32 dask.array<chunksize=(44, 62), meta=np.ndarray>
    wnd100m           (time, y, x) float32 dask.array<chunksize=(100, 44, 62), meta=np.ndarray>
    roughness         (time, y, x) float32 dask.array<chunksize=(100, 44, 62), meta=np.ndarray>
    influx_toa        (time, y, x) float32 dask.array<chunksize=(100, 44, 62), meta=np.ndarray>
    influx_direct     (time, y, x) float32 dask.array<chunksize=(100, 44, 62), meta=np.ndarray>
    influx_diffuse    (time, y, x) float32 dask.array<chunksize=(100, 44, 62), meta=np.ndarray>
    albedo            (time, y, x) float32 dask.array<chunksize=(100, 44, 62), meta=np.ndarray>
    temperature       (time, y, x) float32 dask.array<chunksize=(100, 44, 62), meta=np.ndarray>
    soil temperature  (time, y, x) float32 dask.array<chunksize=(100, 44, 62), meta=np.ndarray>
    runoff            (time, y, x) float32 dask.array<chunksize=(100, 44, 62), meta=np.ndarray>
Attributes:
    module:             era5
    prepared_features:  ['influx', 'temperature', 'wind', 'height', 'runoff']
    chunksize_time:     100

We can also break down which data array belongs to which feature.

module  feature
era5    height                   height
        wind                    wnd100m
        wind                  roughness
        influx               influx_toa
        influx            influx_direct
        influx           influx_diffuse
        influx                   albedo
        temperature         temperature
        temperature    soil temperature
        runoff                   runoff
dtype: object

If you have matplotlib installed, you can directly use the plotting functionality from xarray to plot features from the cutout’s data.

Warning: This will trigger xarray to load all the corresponding data from disk into memory!
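To get a feeling for how much memory such a load can take, here is a back-of-the-envelope estimate based on the dimensions and dtype in the dataset summary of this example. It ignores coordinate arrays and any overhead, so treat the numbers as a lower bound:

```python
import math

# Dimensions and dtype taken from the dataset summary in this example.
n_time, n_y, n_x = 744, 44, 62
bytes_per_value = 4      # float32
n_time_variables = 9     # data variables with a time dimension (all but height)
chunksize_time = 100     # from the dataset attributes

bytes_per_variable = n_time * n_y * n_x * bytes_per_value
bytes_all_variables = n_time_variables * bytes_per_variable
chunks_per_variable = math.ceil(n_time / chunksize_time)

print(bytes_per_variable)                    # 8118528
print(round(bytes_per_variable / 1e6, 1))    # 8.1
print(round(bytes_all_variables / 1e6, 1))   # 73.1
print(chunks_per_variable)                   # 8
```

Plotting a single variable for this one-month cutout therefore loads roughly 8 MB, read in 8 chunks along the time axis. Longer periods or larger regions scale these numbers accordingly.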

Now that your cutout is created and prepared, you can call conversion functions such as cutout.pv or cutout.wind. Note that these require a bit more information, such as what kind of PV panels to use and where they stand. Please have a look at the other examples to get a picture of typical applications.