MACA Training Data


Training Data

Statistical downscaling methods require a training dataset: an observational dataset of the downscaled variables at a fine resolution. After downscaling, the statistical distribution of each variable matches that of the training dataset, and the final downscaled data have the same resolution as the training dataset. The importance of the training dataset cannot be overemphasized, as flaws in the training dataset are passed on to the downscaled dataset. The MACA products use two training datasets: the METDATA and Livneh observational datasets.
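
To illustrate what it means for a downscaled variable to take on the distribution of the training data, the following is a minimal sketch of empirical quantile mapping in plain Python. It is not the MACA algorithm itself; the function name quantile_map and the sample values are hypothetical and chosen only for illustration.

```python
from bisect import bisect_left


def quantile_map(model_values, training_values):
    """Map each model value onto the training distribution by rank.

    Illustrative only: a generic empirical quantile mapping, not the
    MACA procedure. Names and behaviour here are assumptions.
    """
    model_sorted = sorted(model_values)
    training_sorted = sorted(training_values)
    n_model = len(model_sorted)
    n_train = len(training_sorted)

    mapped = []
    for value in model_values:
        # Rank of the value within the model sample (0 .. n_model - 1).
        rank = bisect_left(model_sorted, value)
        # Fractional position of the same quantile in the training sample.
        position = rank * (n_train - 1) / max(n_model - 1, 1)
        lower = int(position)
        upper = min(lower + 1, n_train - 1)
        weight = position - lower
        # Linear interpolation between neighbouring training values.
        mapped.append(
            training_sorted[lower] * (1 - weight)
            + training_sorted[upper] * weight
        )
    return mapped


# Hypothetical coarse-model values and fine-resolution training values.
model = [30.0, 10.0, 40.0, 20.0]
training = [1.0, 2.0, 3.0, 4.0]
mapped = quantile_map(model, training)
```

The mapped values keep the ordering of the model values but are drawn from the training distribution. This is also why any flaw in the training data carries straight through to the output.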


As training data for MACAv1/v2-METDATA, we use the gridded surface meteorological dataset METDATA (Abatzoglou, 2013), which has high spatial resolution (1/24-degree, or approximately 4 km) and daily timescales for near-surface minimum/maximum temperature, minimum/maximum relative humidity, precipitation, downward solar radiation, wind components, and specific humidity. For MACAv1-METDATA, METDATA years 1979-2010 were used. For MACAv2-METDATA, METDATA years 1979-2012 were used.
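
As a rough check on how a resolution in degrees relates to a distance in kilometers, the sketch below converts a grid spacing to approximate cell dimensions. It assumes a spherical Earth with a mean radius of 6371 km. The function name cell_size_km and the example latitude of 40 degrees are assumptions for illustration, not part of the dataset documentation.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius of a spherical Earth


def cell_size_km(resolution_deg, latitude_deg):
    """Return approximate (north_south_km, east_west_km) for one grid cell."""
    km_per_degree = 2.0 * math.pi * EARTH_RADIUS_KM / 360.0
    north_south = resolution_deg * km_per_degree
    # Meridians converge toward the poles, so east-west extent shrinks
    # with the cosine of latitude.
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


# 1/24-degree grid evaluated at a hypothetical latitude of 40 degrees north.
ns_km, ew_km = cell_size_km(1.0 / 24.0, 40.0)
```

Under these assumptions a 1/24-degree cell is a little over 4.6 km north to south and narrower east to west at mid-latitudes, which is consistent with the "approximately 4 km" description. The same function can be applied to a 1/16-degree grid.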

This dataset was created by bias-correcting daily and sub-daily mesoscale reanalysis and assimilated precipitation from NASA's North American Land Data Assimilation System (NLDAS-2, Mitchell et al., 2004) using monthly temperature, precipitation, and humidity from the Parameter-elevation Regressions on Independent Slopes Model (PRISM, Daly et al., 2008). The data were validated against an extensive network of weather stations, including RAWS, AgriMet, AgWeatherNet, and USHCN-2, showing skill in correlation and RMSE comparable to that derived from interpolation of station observations. This dataset has the advantage of providing spatially and temporally complete data across several variables, even across vast unmonitored areas of the United States.

Download the METDATA data:



Note that the training years of METDATA used for the MACA datasets are as follows:
  • MACAv1-METDATA training years: METDATA years 1979-2010
  • MACAv2-METDATA training years: METDATA years 1979-2012

As training data for MACAv2-LIVNEH, we use the Livneh gridded surface meteorological dataset (Livneh et al., 2013) at high spatial resolution (1/16-degree, or approximately 6 km) and daily timescales for near-surface minimum/maximum temperature, precipitation, and wind speed for the years 1950-2011. (Note that the Livneh dataset covers 1915-2011, but because of spurious spatial patterns and the scarcity of observing networks prior to 1950, only the years from 1950 onward were used for training.) We use Livneh version L14 for Canada and Livneh version L13 for the USA.

This dataset was created by incorporating daily observations of maximum and minimum temperature, as well as accumulated precipitation, from National Weather Service Cooperative Observer stations across the country. Climatological estimates of monthly precipitation from the Parameter-elevation Regressions on Independent Slopes Model (PRISM, Daly et al., 2008) are further used to account for spatial variability in precipitation. Temperatures are adjusted across elevated terrain using a standard lapse rate of 6.5 °C/km.
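
The lapse-rate adjustment can be sketched as a simple linear correction for the elevation difference between a station and a grid cell. The function name adjust_temperature and the example elevations are hypothetical; only the 6.5 °C/km rate comes from the description above.

```python
LAPSE_RATE_C_PER_KM = 6.5  # standard lapse rate quoted in the text


def adjust_temperature(station_temp_c, station_elev_m, target_elev_m,
                       lapse_rate=LAPSE_RATE_C_PER_KM):
    """Shift a station temperature to a different elevation.

    Temperature falls by lapse_rate degrees Celsius per kilometer of
    elevation gained and rises by the same amount per kilometer lost.
    """
    elevation_gain_km = (target_elev_m - station_elev_m) / 1000.0
    return station_temp_c - lapse_rate * elevation_gain_km


# Hypothetical station at 500 m reporting 20 °C, grid cell at 1500 m.
cell_temp_c = adjust_temperature(20.0, 500.0, 1500.0)
```

In this example the grid cell sits 1 km above the station, so the adjusted temperature is 6.5 °C lower than the station value.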

The training dataset for MACAv2-LIVNEH also includes the additional variables of downward solar radiation and specific humidity. These variables have been estimated from the temperature and precipitation data in the Livneh dataset using the MT-CLIM (Glassy et al., 1994) algorithm. The MT-CLIM algorithm utilizes the latitude, elevation, slope, and aspect of the location. Radiation estimates are based on the observation that the diurnal temperature range (the difference between maximum and minimum temperature each day) is closely related to the daily average atmospheric transmittance.
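
To make the link between diurnal temperature range and transmittance concrete, here is a minimal sketch using a saturating exponential of the kind commonly associated with Bristow and Campbell. The functional form is chosen for illustration, and the coefficients a, b, and c are placeholder values, not the ones used by MT-CLIM or in the MACAv2-LIVNEH training data.

```python
import math


def transmittance_from_dtr(tmax_c, tmin_c, a=0.7, b=0.01, c=2.4):
    """Estimate daily atmospheric transmittance from temperature range.

    Illustrative form: a * (1 - exp(-b * dtr ** c)).
    a is the assumed clear-sky maximum transmittance; b and c are
    assumed shape coefficients. All three are placeholders.
    """
    dtr = max(tmax_c - tmin_c, 0.0)  # diurnal temperature range, deg C
    return a * (1.0 - math.exp(-b * dtr ** c))


# Hypothetical days: a cloudy day with a small range, a clear day with a large one.
cloudy = transmittance_from_dtr(tmax_c=12.0, tmin_c=9.0)
clear = transmittance_from_dtr(tmax_c=25.0, tmin_c=8.0)
```

A small temperature range, typical of overcast days, gives a low transmittance. A large range, typical of clear days, approaches the assumed clear-sky maximum. Multiplying the transmittance by the radiation arriving at the top of the atmosphere for the location and date would give a surface radiation estimate, a step not shown here.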

Download the exact LIVNEH data used in MACAv2-LIVNEH
Elevations on Livneh Grid

Note that the training years of LIVNEH used for the MACA datasets are as follows:

  • MACAv2-LIVNEH training years: LIVNEH years 1950-2011