(1) Overview


Spatial coverage

Description: This dataset is clipped to the EDENext [1] extent, which covers the continent of Europe and parts of North Africa down to 34 degrees north latitude. The coordinate reference system is WGS84 (EPSG:4326).

  • Northern boundary: 72.3
  • Southern boundary: 34.0
  • Eastern boundary: 47.6
  • Western boundary: -12.0

Temporal coverage

01 April 2014 (current).


Red deer, Cervus elaphus

(2) Methods


Binary presence and absence

Five sets of distribution data were combined to produce a single presence/absence mask. The data sets used were as follows:

  • The EMMA Database: Mapping Europe’s mammals using data from the Atlas of European Mammals [2]
  • The Global Biodiversity Information Facility (GBIF) [3]
  • IUCN Red List Dataset [4]
  • The National Biodiversity Network UK 10k Data [5]
  • Spanish Ministry of Agriculture National Inventory of Biodiversity [6]

Habitat definition

For much of the indicated range, the distributions above were by their nature simple presence limits, with no indication of absence within the designated boundaries. To introduce absences within these limits, suitability masks were defined using species-specific habitat preferences derived from GLOBCOVER [7] land cover classes at 1 km resolution. Suitable habitat was defined as more than 25% woodland or moorland, following Tapper (1999) [8], and the definition is thus somewhat UK-centric.

The 300 m GLOBCOVER dataset was reclassified as woodland or moorland = 1 and other = 0, as per Table 1. The data were then aggregated to 1 km, and cells with more than 25% woodland or moorland were classed as suitable. All data processing was undertaken in ESRI ArcGIS 10.0.
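The reclassify, aggregate and threshold steps can be sketched in plain Python. This is a minimal illustration rather than the ArcGIS workflow the authors used: the set `SUITABLE_CLASSES` is a hypothetical subset of land cover codes, the 3 x 3 block size assumes that three 300 m cells approximate one 1 km cell, and all function names are invented for the example.

```python
# Illustrative subset of GLOBCOVER codes treated as woodland or moorland.
# This is an assumption for the sketch, not the authors' full reclassification.
SUITABLE_CLASSES = {50, 60, 70, 90, 100}


def reclassify(grid):
    """Map each land cover code to 1 (suitable class) or 0 (other)."""
    return [[1 if code in SUITABLE_CLASSES else 0 for code in row] for row in grid]


def aggregate_fraction(binary, factor=3):
    """Fraction of 1-cells in each factor x factor block.

    Trailing rows or columns that do not fill a whole block are dropped.
    """
    rows = len(binary) // factor
    cols = len(binary[0]) // factor
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            block = [
                binary[r * factor + i][c * factor + j]
                for i in range(factor)
                for j in range(factor)
            ]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out


def suitability_mask(fractions, threshold=0.25):
    """1 where strictly more than `threshold` of the block is suitable."""
    return [[1 if f > threshold else 0 for f in row] for row in fractions]


# Toy 3 x 6 land cover grid: two 3 x 3 blocks side by side.
landcover = [
    [50, 50, 14, 14, 14, 14],
    [50, 14, 14, 14, 14, 14],
    [14, 14, 14, 14, 14, 70],
]
fractions = aggregate_fraction(reclassify(landcover))
mask = suitability_mask(fractions)
```

In the toy grid the left block is one third forest and passes the 25% threshold, while the right block has a single forest cell and does not.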

Table 1

Reclassed values defining the GLOBCOVER suitability layers.

Value | Label | Grass Pasture | No Urban & Urban Fringe | Roe
11 | Post-flooding or irrigated croplands (or aquatic) | 0 | 1 | 0
14 | Rainfed croplands | 0 | 1 | 0
20 | Mosaic cropland (50-70%) / vegetation (grassland/shrubland/forest) (20-50%) | 1 | 1 | 1
30 | Mosaic vegetation (grassland/shrubland/forest) (50-70%) / cropland (20-50%) | 1 | 1 | 1
40 | Closed to open (>15%) broadleaved evergreen or semi-deciduous forest (>5m) | 0 | 1 | 1
50 | Closed (>40%) broadleaved deciduous forest (>5m) | 0 | 1 | 1
60 | Open (15-40%) broadleaved deciduous forest/woodland (>5m) | 0 | 1 | 1
70 | Closed (>40%) needleleaved evergreen forest (>5m) | 0 | 1 | 1
90 | Open (15-40%) needleleaved deciduous or evergreen forest (>5m) | 0 | 1 | 1
100 | Closed to open (>15%) mixed broadleaved and needleleaved forest (>5m) | 0 | 1 | 1
110 | Mosaic forest or shrubland (50-70%) / grassland (20-50%) | 1 | 1 | 1
120 | Mosaic grassland (50-70%) / forest or shrubland (20-50%) | 1 | 1 | 1
130 | Closed to open (>15%) (broadleaved or needleleaved, evergreen or deciduous) shrubland (<5m) | 0 | 1 | 0
140 | Closed to open (>15%) herbaceous vegetation (grassland, savannas or lichens/mosses) | 1 | 1 | 0
150 | Sparse (<15%) vegetation | 0 | 1 | 0
160 | Closed to open (>15%) broadleaved forest regularly flooded (semipermanently or temporarily) | 0 | 1 | 1
170 | Closed (>40%) broadleaved forest or shrubland permanently flooded - Saline or brackish water | 0 | 1 | 0
180 | Closed to open (>15%) grassland or woody vegetation on regularly flooded or waterlogged soil | 1 | 1 | 0
190 | Artificial surfaces and associated areas (Urban areas >50%) | 0 | 0 | 0
200 | Bare areas | 0 | 1 | 0
210 | Water bodies | 0 | 1 | 0
220 | Permanent snow and ice | 0 | 1 | 0
230 | No data (burnt areas, clouds,…) | 0 | 1 | 0

The 1km resolution habitat suitability masked data was then combined with the presence data and converted to a percentage of suitable habitat at a 20km resolution.
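A small sketch of this step, under stated assumptions: a 1 km cell is counted only where both the habitat mask and the presence mask equal 1, and each coarse cell reports the percentage of such cells in its block. How the authors combined the two layers in ArcGIS is not specified beyond the sentence above, so the function `percent_suitable` and its rule are illustrative only.

```python
def percent_suitable(suitable, presence, factor=20):
    """Percentage of cells that are both suitable and within the presence mask,
    computed per factor x factor block (20 x 20 1 km cells = one 20 km cell)."""
    rows = len(suitable) // factor
    cols = len(suitable[0]) // factor
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            count = 0
            for i in range(factor):
                for j in range(factor):
                    y, x = r * factor + i, c * factor + j
                    # Product of two 0/1 values is 1 only when both are 1.
                    count += suitable[y][x] * presence[y][x]
            out_row.append(100.0 * count / (factor * factor))
        out.append(out_row)
    return out


# Toy example with factor=2 so the grids stay readable.
suitable = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
]
presence = [
    [1, 1, 1, 0],
    [1, 1, 1, 0],
]
percent = percent_suitable(suitable, presence, factor=2)
```

Here the left block has three of four cells both suitable and present (75%), and the right block has none.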

Model predictor suite

The spatial modelling required a comprehensive predictor variable suite, which included a wide range of remotely sensed variables as follows:

  • Remotely sensed climatic indicators derived by Temporal Fourier Analysis (TFA) of MODIS satellite imagery of several temperature parameters, and vegetation indices for the period 2001-2008 [9]
  • Digital Elevation from the Shuttle Radar Topography Mission, together with derived aspect and ruggedness [10]
  • Temporal Fourier Analysis (TFA) of Precipitation, and allied Bioclimatic Indicator (Bioclim) precipitation variables from the WORLDCLIM datasets [11]
  • Length of Growing Period from United Nations Food and Agriculture Organisation [12]
  • Travel Time to major towns from the Joint Research Centre at Ispra [13]
  • Human population density derived from the Global Rural-Urban Mapping Project at CIESIN [14]
  • A distance weighted human population index layer [15] representing the likelihood of human visits based on the population within 30km
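For readers unfamiliar with Temporal Fourier Analysis, the general idea is to summarise a seasonal time series by its mean and by the amplitude and phase of a few harmonics (annual, bi-annual and so on). The sketch below shows that generic decomposition for a single pixel. It is not the processing chain of [9]; the function name and the monthly series are hypothetical.

```python
import cmath
import math


def tfa_summary(series, harmonic=1):
    """Mean, amplitude and phase of one harmonic of an evenly spaced series.

    Valid for 1 <= harmonic < len(series) / 2. With one year of data,
    harmonic=1 is the annual cycle and harmonic=2 the bi-annual cycle.
    """
    n = len(series)
    mean = sum(series) / n
    # Discrete Fourier coefficient for the requested harmonic.
    coeff = sum(
        v * cmath.exp(-2j * math.pi * harmonic * t / n)
        for t, v in enumerate(series)
    ) / n
    amplitude = 2 * abs(coeff)   # peak deviation from the mean
    phase = cmath.phase(coeff)   # radians; encodes the timing of the cycle
    return mean, amplitude, phase


# Hypothetical monthly temperatures: mean 10 with an annual swing of +/- 5.
monthly = [10 + 5 * math.cos(2 * math.pi * t / 12) for t in range(12)]
mean, amplitude, phase = tfa_summary(monthly)
```

Applied per pixel, each of these summary values becomes one predictor layer.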

Habitat suitability modelling

The percentage of suitable habitat layer was then offered to three modelling techniques: GLM [16] multivariate regression and Random Forest [17], both using R-project [18] modules embedded within the VECMAP [19] software suite, and the FAO FARMS [20] regression tool developed for livestock density modelling. All three methods were bootstrapped at least 25 times, and were further refined by using a zoned approach whereby separate models were produced for a series of 50 eco-climatic zones based on climate, vegetation and seasonality. Such zonation tends to produce more accurate sub-models, which can then be combined into a single output.
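The zoned approach can be illustrated with a deliberately simple stand-in: one ordinary least squares line per zone, with each cell then predicted by its own zone's model so that the sub-models merge into one output. The single-predictor regression, the zone labels and the function names are assumptions for the sketch; the actual models were fitted with the GLM, Random Forest and FARMS tools described above.

```python
from collections import defaultdict


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return slope, mean_y - slope * mean_x


def fit_zoned(samples):
    """samples: iterable of (zone, x, y). Returns {zone: (slope, intercept)}."""
    by_zone = defaultdict(lambda: ([], []))
    for zone, x, y in samples:
        by_zone[zone][0].append(x)
        by_zone[zone][1].append(y)
    return {zone: fit_line(xs, ys) for zone, (xs, ys) in by_zone.items()}


def predict_zoned(models, cells):
    """cells: iterable of (zone, x). Each cell uses its own zone's model."""
    return [models[zone][0] * x + models[zone][1] for zone, x in cells]


# Two hypothetical zones with different predictor-response relationships.
samples = [
    ("A", 0, 1), ("A", 1, 3), ("A", 2, 5),     # y = 2x + 1
    ("B", 0, 10), ("B", 1, 9), ("B", 2, 8),    # y = -x + 10
]
models = fit_zoned(samples)
combined = predict_zoned(models, [("A", 3), ("B", 4)])
```

A single model fitted across both zones would blur these opposite trends, which is the motivation for zonation given in the text.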

The outputs of the three models were then averaged to produce an ensemble consensus product for each species.
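A cell-by-cell average of this kind can be written in a few lines. The sketch assumes an unweighted mean over co-registered grids of equal size with no missing values; the layer names and values are invented.

```python
def ensemble_mean(*layers):
    """Cell-by-cell unweighted mean of equally sized 2D grids."""
    return [
        [sum(values) / len(values) for values in zip(*rows)]
        for rows in zip(*layers)
    ]


# Hypothetical percentage-suitability outputs from the three models.
glm = [[10.0, 40.0]]
random_forest = [[20.0, 50.0]]
farms = [[30.0, 60.0]]
ensemble = ensemble_mean(glm, random_forest, farms)
```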

Output datasets

Both the presence/absence layer and the ensemble modelled habitat suitability layer are provided as quick-look maps in JPEG format, which can be opened in any image viewer. The data themselves are distributed as GIS raster data in two formats: GeoTIFF, a standard proprietary GIS raster format, and GeoJP2 (JPEG 2000), a non-proprietary format.

To access and analyse the raster data directly, GeoTIFF and GeoJP2 files can be read by most GIS software and some other software packages. These formats are compatible with proprietary software (ESRI ArcGIS) and open source software (Quantum GIS (QGIS) [21] or the R-project [18] raster package).

Folder structure

  • quicklooks - JPEG maps for viewing only
  • tiff - GeoTIFF data, 0.008333 degree (~1km), 32-bit floating point
  • geoJPG2k - GeoJP2 (JPEG 2000) data, 0.008333 degree (~1km), 16-bit unsigned integer
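As a rough guide to file size, the grid dimensions can be estimated from the cell size and the spatial coverage. The calculation below assumes that 0.008333 degree is 1/120 degree (30 arc-seconds), that the grid spans exactly the stated boundaries with -12.0 as the western and 47.6 as the eastern limit, and that the rasters are stored uncompressed; none of this is confirmed by the file headers.

```python
CELL = 1 / 120  # degrees; assumed exact value behind the rounded 0.008333

north, south, west, east = 72.3, 34.0, -12.0, 47.6

cols = round((east - west) / CELL)
rows = round((north - south) / CELL)

# Uncompressed size in memory for each pixel type listed above.
bytes_float32 = rows * cols * 4   # 32-bit floating point GeoTIFF
bytes_uint16 = rows * cols * 2    # 16-bit unsigned integer GeoJP2
mib_float32 = bytes_float32 / 2**20
```

Under these assumptions a full layer is a few thousand cells on each side, so a single floating point layer occupies on the order of a hundred megabytes once loaded.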

Sampling strategy

Sample points were extracted for input into the three different models from a 20km matrix defining the percentage of habitat suitability. Depending on the model, 1000-3000 sample points were used in each of 25 bootstraps.
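The sampling scheme can be sketched as repeated random draws from the list of 20 km cells. Drawing with replacement, the fixed seed, the grid size and the function name are assumptions made for the illustration; the text does not state how points were drawn within each bootstrap.

```python
import random


def bootstrap_samples(cells, n_points, n_bootstraps=25, seed=0):
    """Draw `n_bootstraps` independent samples of `n_points` cells each.

    Sampling is with replacement (an assumption), and the seed makes the
    draws repeatable.
    """
    rng = random.Random(seed)
    return [rng.choices(cells, k=n_points) for _ in range(n_bootstraps)]


# Hypothetical 20 km grid of 50 x 100 cells, identified by (row, column).
cells = [(r, c) for r in range(50) for c in range(100)]
samples = bootstrap_samples(cells, n_points=1000)
```

Each of the 25 samples would then be used to fit one model run, and the runs averaged.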

Quality control

These models are a first attempt at quantifying the red deer distribution at this scale, and there has been no ground truth validation of these maps so far. The model outputs do, however, all satisfy standard accuracy metrics (AIC and R squared), which gives some assurance of statistical reliability. They have also been informally reviewed by project deer experts.


There were no constraints involved in data production.



(3) Dataset description

Object name


Data type

Primary data, processed data, interpretation of data.

Format names and versions


Creation dates

28 April 2014

Dataset creators

  • Dr. William Wint: Senior Research Associate, Environmental Research Group Oxford (ERGO), Department of Zoology, Oxford, OX1 3PS, UK
  • Dr. David Morley: Research Assistant, Environmental Research Group Oxford (ERGO), Department of Zoology, Oxford, OX1 3PS, UK
  • Neil S. Alexander: Research Assistant, Environmental Research Group Oxford (ERGO), Department of Zoology, Oxford, OX1 3PS, UK. Corresponding email: neil.alexander@zoo.ox.ac.uk




This work is licensed under the Creative Commons Attribution 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by/3.0/ or send a letter to Creative Commons, 444 Castro Street, Suite 900, Mountain View, California, 94041, USA.



Publication date

28 April 2014

(4) Reuse potential

These layers are a first attempt to provide a description of red deer habitat as a proxy for abundance at a continental scale. They have been developed in the hope that they will help epidemiologists test hypotheses relating to the role of red deer in the spread of vector-borne disease.

Areas of future development on the dataset itself might be to: assess the accuracy of the maps through ground truthing; compare the three different models used in this analysis and assess which provides the most accurate outputs; and attempt a more systems-based approach to modelling deer abundance at a country scale.