Getting Started with SDSM Version 3.1
The Statistical Downscaling Model (SDSM) is a decision support tool, developed by Drs. Robert Wilby and Christian Dawson in the UK, for assessing local climate change impacts using a robust statistical downscaling technique. It is a hybrid of a stochastic weather generator and regression-based downscaling methods and facilitates the rapid development of multiple, low-cost, single-site scenarios of daily surface weather variables under current and future climate forcing. SDSM is designed to help the user identify those large-scale climate variables (the predictors) which explain most of the variability in the climate (the predictand) at a particular site and statistical models are then built based on this information. Statistical models are built using daily observed data – local climate data for a specific location for the predictand and larger-scale NCEP data for the predictors – and these models are then used with GCM-derived predictors to obtain daily weather data at the site in question for a future time period.
Getting Started
Where can I get SDSM?
You can download the software, user manual and a demonstration data set (for Blogsville) for free from: www.sdsm.org.uk
How do I prepare my own data for use in SDSM Version 3.1?
It is best to set up a new directory for each site you wish to downscale using SDSM. This directory should contain both the observed daily data (i.e. the predictand) and the observed (NCEP, i.e. National Centers for Environmental Prediction; Kalnay et al., 1996) and GCM-derived predictors. You will need to supply the predictand information, but predictor information can be obtained from the CCCSN web site (see Statistical Downscaling Input).
Observed daily data for Canada from the Historical Adjusted Climate Database for Canada may be requested from Lucie Vincent for temperature and Eva Mekis for precipitation. These data files are in row format, i.e. each row contains daily data for one month for each year. These data need to be converted to single column format (i.e. the data values only) to be compatible with SDSM. Date information should not be included. Have a look at the observed daily data included with the Blogsville example to ensure that you understand the format required. Reformatting can be done using a programming language, such as FORTRAN, or in a spreadsheet package, such as Microsoft Excel. Once you have correctly formatted the data, you will need to make sure that SDSM will recognise the code used to identify missing data values in the data set. This is done by entering the correct code in the Missing Data Identifier window on the Settings screen.
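The reformatting step described above can be sketched in Python as well as in FORTRAN or a spreadsheet. The sketch below is only an illustration: it assumes that each input row holds a year, a month and up to 31 daily values separated by spaces, that rows are already in chronological order, and that -999 is the missing data code. The row layout, the function names and the one-decimal output format are all assumptions, so compare the result with the Blogsville example files before using it.

```python
import calendar

MISSING = -999.0  # assumed missing-data code; must match the SDSM Settings screen


def rows_to_column(lines, missing=MISSING):
    """Flatten rows of 'year month d1 ... d31' into one list of daily values."""
    values = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        year, month = int(fields[0]), int(fields[1])
        daily = [float(f) for f in fields[2:]]
        n_days = calendar.monthrange(year, month)[1]  # true length of this month
        # Pad short rows with the missing code and drop padding past month end.
        daily = (daily + [missing] * n_days)[:n_days]
        values.extend(daily)
    return values


def write_column(values, path):
    """Write one value per line, with no date information."""
    with open(path, "w") as handle:
        for value in values:
            handle.write(f"{value:.1f}\n")
```

A typical use would be to read the row-format file with open(...).readlines(), pass the lines to rows_to_column, and save the result with write_column in the site directory.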
Predictor files downloaded from the CCCSN are in zipped format, and the files contained within the zip file follow the naming convention and format used in SDSM. Each zip file contains both the GCM-derived predictors and the observed predictors derived from the NCEP reanalyses. Unzip these files into the appropriate site directory, making sure that you preserve the sub-directory structure. You will have downloaded these zip files based on which GCM you wish to use. Currently, GCM-derived predictors are available for experiments undertaken with CGCM1, CGCM2 and HadCM3. The data set is limited because daily GCM data are required for the construction of predictors, and not all climate modelling centres archive daily data from their climate change experiments. The observed predictors are interpolated to the GCM grid in question and thus differ slightly from GCM to GCM. If you are using more than one GCM for downscaling at a site, you should set up a separate directory for each GCM so that you do not confuse the observed predictor data sets. You will also have to repeat the SDSM calibration process for each GCM, using the NCEP predictors supplied with that GCM, since the observed predictors, and hence the statistical relationships, will be slightly different from GCM to GCM.
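If you prefer to script the unzipping, the following Python sketch shows one way to keep a separate directory per GCM while preserving the sub-directory structure of the archive. The function name, the directory layout and the GCM label are hypothetical and chosen only for illustration; they are not part of SDSM or of the CCCSN download.

```python
import zipfile
from pathlib import Path


def unpack_predictors(zip_path, site_dir, gcm_name):
    """Extract a predictor archive into <site_dir>/<gcm_name>, keeping sub-directories."""
    target = Path(site_dir) / gcm_name
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)  # extractall keeps the archive's folder layout
    # Return the extracted files (relative paths) so the result can be checked.
    return sorted(str(p.relative_to(target)) for p in target.rglob("*") if p.is_file())
```

Calling the function once per downloaded zip file, with a different gcm_name each time, keeps the observed predictors for each GCM grid apart.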
How do I use SDSM?
Download Software
Download the software from the SDSM web site into a directory of your choice on your computer. Double-click on the setup.exe file in the SDSM Install directory and follow the instructions in the installation wizard.
Launch SDSM
To launch SDSM, click on the Start button on the Windows desktop, then on All Programs and then on the SDSM icon (a small cloud) which appears when you click on SDSM in the list of available programs. Click on the Start button on the SDSM title page to continue to the SDSM main menu.
The task of statistically downscaling daily weather series is divided into seven discrete processes within the SDSM software. These are: quality control and data transformation; screening of the predictor variables; model calibration; weather generation using observed predictors; statistical analyses; graphing of model output; and scenario generation using climate model predictors. Before starting the downscaling process, it is necessary to check the input data ranges, type and integrity using the Settings screen. This can be accessed by clicking on the wrench symbol at the top of the main menu. For more details concerning the global preferences contained in the Settings pages, please refer to the SDSM user manual.
Quality Control and Data Transformation
To check an input file for missing data and/or suspect values it is necessary to undertake some quality control of the data. Click on the Analyse button at the top of the SDSM main menu and then on Quality Control in the drop-down menu that appears. Click on the Select File button and browse through to the correct location for the data file you wish to check in the Open file window that appears. Then click on the Check File button at the top of the screen. A Quality Check Complete confirmation box will appear – click on OK to view the results of this operation. If the results of this check are satisfactory, click on Next to proceed to Screen Variables. If you wish to transform a particular data variable, this can be done by clicking on the Transform button at the top of the Quality Control screen. The regression technique used in SDSM assumes that the input data are normally distributed. If this is not the case, as is typical for precipitation, it may be necessary to transform the data so that their distribution is closer to normal. For more details regarding this option, please refer to the user manual.
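To make the idea of a quality check and a transformation concrete, here is a minimal Python sketch. It is not the SDSM implementation: the function names are hypothetical, -999 is an assumed missing data code, and the fourth root is used only as a common example of a transformation for skewed, non-negative data such as precipitation. The transformations actually offered by SDSM are described in the user manual.

```python
def quality_check(values, missing=-999.0):
    """Summarise a daily series: counts, range and mean of the non-missing values."""
    valid = [v for v in values if v != missing]
    return {
        "total": len(values),
        "missing": len(values) - len(valid),
        "minimum": min(valid) if valid else None,
        "maximum": max(valid) if valid else None,
        "mean": sum(valid) / len(valid) if valid else None,
    }


def fourth_root(values, missing=-999.0):
    """Apply a fourth-root transform, leaving missing values untouched."""
    transformed = []
    for v in values:
        if v == missing:
            transformed.append(v)
        elif v < 0:
            raise ValueError("fourth root is only defined here for non-negative data")
        else:
            transformed.append(v ** 0.25)
    return transformed
```

Suspect values show up as an unexpected minimum or maximum, and a large missing count suggests that the missing data code in the file does not match the one entered on the Settings screen.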
Screen Variables
The Screen Variables screen assists the user in choosing appropriate downscaling predictor variables for model calibration. There are three options available to help with this task – seasonal correlation analysis, partial correlation analysis and scatter plots – but it is ultimately up to the user to decide whether or not the identified relationships are physically sensible for the site and predictand in question. If you are not already on the Screen Variables screen you can access it by clicking on the Analyse button at the top of the main menu and then on Screen Variables on the drop-down menu that appears. Click on the Select Predictand File button and browse through to the correct location of the predictand file (e.g. observed daily maximum temperature for the site in question).
Similarly, locate and select the desired Predictor Variables by identifying the correct drive and directory from the drop-down menu in the centre of this screen. A list of the available predictor variables is given in the Predictor Variables window. Click on any of the file names in this list to select a file (it will be highlighted in blue) and a definition of the selected variable will be listed in the Predictor Description window. Also on this screen you need to Select Analysis Period (i.e. annual, seasonal or monthly), identify whether or not the Process is Conditional (as it would be for precipitation where amounts depend on wet-day occurrence) or Unconditional (as for temperature), and the Significance Level used to test the significance of the predictor-predictand correlations.
Once you have made your selections, click on the Analyse button to investigate the percentage of variance explained by the specific predictand-predictor pairs. The Results screen shows the strongest correlation in each month in red, whilst blanks indicate insignificant relationships at the selected significance level. Also on this screen, by clicking on the Correlation button you are able to explore inter-variable correlations and partial correlations which help to identify the amount of explanatory power which is unique to each predictor. Correlations between the different predictors and the predictand may result in a reduced correlation value – the partial correlation statistics indicate which predictors have the strongest association with the predictand once the influence of the other predictors has been removed. You can also use the Scatter button on this screen to visually inspect inter-variable behaviour and to determine whether or not data transformations may be required and the importance of outliers. Please refer to the user manual for a more thorough discussion of these options. After you have completed the variable screening you should have identified those predictors which appear to explain most of the variance in the data. Click on Next to go to the Calibrate Model screen, or select Calibrate Model from the drop-down menu when you click on the Analyse option at the top of this screen.
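The statistics behind this screening can be illustrated with a short Python sketch. It computes the percentage of variance explained by a single predictor and the first-order partial correlation between the predictand and one predictor after the linear influence of a second predictor has been removed. This is the standard textbook formula, shown under the assumption that it conveys the idea; it is not taken from the SDSM source code, and the function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def explained_variance_percent(predictor, predictand):
    """Percentage of predictand variance explained by one predictor (100 * r^2)."""
    return 100.0 * pearson(predictor, predictand) ** 2


def partial_correlation(y, x, z):
    """Correlation of y with x after removing the linear influence of z."""
    r_yx, r_yz, r_xz = pearson(y, x), pearson(y, z), pearson(x, z)
    return (r_yx - r_yz * r_xz) / math.sqrt((1 - r_yz ** 2) * (1 - r_xz ** 2))
```

A predictor whose partial correlation stays high once the other candidates are controlled for contributes explanatory power of its own, which is the property the Correlation button helps you to identify.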
Calibrate Model
This process constructs the downscaling models based on multiple linear regression equations given the daily predictand data (e.g. maximum temperature for the site in question) and regional-scale atmospheric predictor variables. On this screen you need to Select Predictand File, identify the correct directory path to the observed predictor variables and highlight the predictors which were identified as being important in the Screen Variables section. Next, decide upon the temporal resolution of the downscaling model by clicking on Monthly, Seasonal or Annual in the Model Type box, and specify whether or not the downscaling process should be conditional by checking the appropriate option in the Process box. Finally, select the subset of the available data to which the model will be fitted by changing the Fit Start or Fit End dates in the Data section, and then select an appropriate output file to which the model parameters will be written. The data not used to calibrate the model can be used for independent model validation. Again, refer to the user manual for more details about these options. Once you have made your selections, click on the Calibrate button at the top of the screen. When the calibration process is completed, a dialogue box will appear indicating how much of the variance in the local predictand is explained by the regional forcing (the R-squared value) and the standard error of the model. Click on OK to return to the Calibrate Model screen. Depending on the results obtained in the calibration process, you may wish to proceed directly to the Weather Generator screen (click on Next), or to return to Screen Variables to determine whether or not you selected the most appropriate predictor variables, or whether it is necessary to transform some of the predictors you selected.
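The two numbers reported after calibration can be illustrated with a deliberately simplified Python sketch. SDSM fits multiple linear regression models; the sketch below uses a single predictor to keep the arithmetic visible, so it should be read as an illustration of R-squared and standard error rather than as the SDSM algorithm. The function names are hypothetical.

```python
import math


def fit_linear(x, y):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx
    return my - b * mx, b


def skill(x, y, a, b):
    """Return (R-squared, standard error) of the line y = a + b * x on the given data."""
    n = len(x)
    my = sum(y) / n
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    # n - 2 degrees of freedom: two parameters (a and b) were estimated.
    return 1 - ss_res / ss_tot, math.sqrt(ss_res / (n - 2))
```

For a split-sample test, fit_linear would be applied to the calibration years only, and skill would then be evaluated separately on the calibration years and on the years held back for validation.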
Weather Generator
Once the regression models have been calibrated, the next step in the SDSM downscaling process is to use these models to produce synthetic daily weather series using atmospheric predictor variables in the Weather Generator process. This operation allows the user to verify the calibrated models by using them with the independent data excluded from the calibration process, as well as to synthesise artificial time series representative of current climate conditions. If you are not already on the Weather Generator screen, click on the Analyse button at the top of the screen and select Weather Generator from the drop-down menu. First, you need to select the appropriate parameter file – click on the Select Parameter File button, and browse through the Open file window until you have located the correct directory and file. Click on the file name, e.g. TMAX.PAR. Then specify the location of the predictor variable files by selecting the correct directory and drive in the Select Predictor Directory window. Click on the Save To .OUT File button in the Select Output File window and enter a suitable file name for the synthetic data generated in this step in the Open file window and save this file in an appropriate directory. Click on the View Details button and the files used in the model calibration are listed in the window below.
To generate synthetic data for verification purposes, enter the time period details of the independent data set in the Synthesis Start and Synthesis Length boxes. If you wish to generate synthetic data for the whole of the observed data period, i.e. using the complete predictor record, then enter the Record Start and Record Length values in the Synthesis Start and Synthesis Length boxes, respectively. Finally, you need to decide on how many ensembles of synthetic data are required (up to a maximum of 100) and enter this value in the Ensemble Size box. Each ensemble member is considered to be an equally-plausible representation of local climate resulting from using the same set of predictor variables in the calibrated models. More details concerning this option can be found in the user manual. Once you have made all the selections on this screen, click on the Synthesize button at the top of the menu and after a few seconds the Synthesis Completed dialogue box will appear. Click on OK to return to the Weather Generator screen. Click on Next to proceed to Analyse Data.
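Why the same predictors can yield many equally plausible series is easier to see with a small example. The Python sketch below adds random noise, scaled by the standard error of a calibrated model, to a regression estimate, once per ensemble member. This is an assumption made for illustration: the single-predictor regression, the Gaussian noise and the function names are not taken from SDSM, whose stochastic scheme is described in the user manual.

```python
import random


def generate_ensemble(predictor, a, b, std_error, members, seed=0):
    """Return `members` synthetic series: regression estimate plus random noise."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    ensemble = []
    for _ in range(members):
        # Same predictors for every member; only the random component differs.
        series = [a + b * x + rng.gauss(0.0, std_error) for x in predictor]
        ensemble.append(series)
    return ensemble


def ensemble_mean(ensemble):
    """Average the ensemble members day by day."""
    return [sum(day) / len(day) for day in zip(*ensemble)]
```

The members differ from one another only through the random component, so their spread gives a rough indication of the variability that the regression alone does not explain.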
Analyse Data
A number of statistics, including the mean, maximum, minimum, variance, peaks above/below thresholds, percentiles, percent wet-days and wet-/dry-day spell lengths, can be computed for both observed and synthetic data using SDSM, on a calendar month, seasonal or annual basis, in order to evaluate either observed or downscaled data. If you are not already on the Analyse Data screen, click on the Analyse button at the top of any screen and select Analyse Data from the drop-down menu. First, select the Data Source, clicking on either Modelled or Observed for synthetic or observed data analysis, respectively. Then, click on the Select Input File button and browse through to the correct directory and file in the Open file window that appears. Click on the appropriate file name. If you are using synthetic data, click on the View Details button in the Modelled Scenario window to check that the basic information about the downscaling experiment is correct. Then specify the Analysis Period by entering appropriate Analysis Start and Analysis End dates if they are different from the Standard Data Start and Standard Data End dates contained in the global Settings (see Launch SDSM above).
You can analyse individual ensemble members or obtain mean diagnostics for all ensemble members by unchecking or checking the Use Ensemble Mean? box, respectively, in the Ensemble Size window. If you wish to analyse individual ensembles, then you should enter the number of the ensemble member in the Ensemble Member box. You also need to specify the location of the analysis results – click on the Save Summary File As button in the Select Output File window and enter a suitable file name in the correct directory in the Open file window that appears. Finally, you need to select which statistics are calculated from the input data – click on the Statistics button at the top of the menu and then check up to eight statistics for analysis. Click on Back to return to the Analyse Data screen and then on the Analyse button at the top of the menu. After a few moments the Results screen will appear. SDSM enables the user to graphically compare results of downscaled and observed data to get an indication of model skill in the Compare Results operation. More details of this operation and options for adjusting chart appearance are given in the user manual.
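Several of the statistics listed for this screen are simple to reproduce by hand, which can be useful for checking results. The Python sketch below computes a few of them for a single-column daily series and groups means by calendar month. It assumes a standard calendar, -999 as the missing data code and a wet-day threshold of 0; these values and the function names are hypothetical and should be matched to your own Settings.

```python
from datetime import date, timedelta


def summarise(values, wet_threshold=0.0, missing=-999.0):
    """Mean, extremes, sample variance and percent wet days of the valid values."""
    valid = [v for v in values if v != missing]
    n = len(valid)
    mean = sum(valid) / n
    variance = sum((v - mean) ** 2 for v in valid) / (n - 1) if n > 1 else None
    wet = sum(1 for v in valid if v > wet_threshold)
    return {
        "mean": mean,
        "maximum": max(valid),
        "minimum": min(valid),
        "variance": variance,
        "percent_wet": 100.0 * wet / n,
    }


def monthly_means(values, start, missing=-999.0):
    """Mean per calendar month (January first) for a daily series beginning on `start`."""
    sums, counts = [0.0] * 12, [0] * 12
    for offset, value in enumerate(values):
        if value == missing:
            continue
        month = (start + timedelta(days=offset)).month
        sums[month - 1] += value
        counts[month - 1] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]
```

Applying the same functions to the observed series and to each ensemble member gives numbers that can be set side by side, in the spirit of the Compare Results operation.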
Scenario Generation
This is the final step in the SDSM downscaling process and allows the user to produce ensembles of synthetic daily weather series using daily atmospheric predictor variables supplied by a global climate model (GCM). These predictor variables must be in the same format as the observed predictors, i.e. normalised with respect to the reference period and available for all variables used in model calibration. The Scenario Generation procedure is identical to that of the Weather Generator operation (see Weather Generator above), except that it may be necessary to specify different GCM dates and a different source directory for the predictor variables.
Click on the Analyse button at the top of any screen and then select Generate Scenario from the drop-down menu. First, check the options in the Settings menu by clicking on the Settings button (wrench symbol) at the top of the screen, and check the Year Length and the Standard Start and Standard End dates in the Data window. Most GCMs have a calendar year of 365 days, with the exception of HadCM2 and HadCM3, which have a 360-day year. Make any necessary changes and then click on Back to return to the Generate Scenario screen. Then, click on the Select Parameter File button and browse through to the correct directory and file to select the appropriate downscaling model parameter file in the Open file window. Click on View Details and the files used in the calibration process are listed (both the predictand and the predictors). Under the GCM Directory header select the correct drive and directory for the location of the GCM predictors. Then decide on how many ensembles you want to generate and enter this number in the Ensemble Size box. Finally, click on the Save To .OUT File button in the Select Output File window and browse through to the appropriate directory and enter a suitable file name for the generated data in the Open file window that appears. Then click on the Generate button at the top of the screen and after a few seconds a Scenario Generated dialogue box will appear. Click on OK to return to the Generate Scenario screen. You can examine the output by using the Compare Results operation, more details of which are given in the user manual.
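The effect of the Year Length setting can be checked with a little arithmetic. The Python sketch below counts how many daily values a predictor file should contain for a given period and year length, and maps a position in a 360-day series to a year, month and day, assuming twelve 30-day months. The function names and the "calendar" label for a standard calendar with leap years are hypothetical; the year lengths of 365 and 360 days are those mentioned above.

```python
import calendar


def expected_days(start_year, end_year, year_length):
    """Daily values expected for whole years; year_length is 360, 365 or 'calendar'."""
    years = range(start_year, end_year + 1)
    if year_length == "calendar":
        # Standard calendar: count leap years individually.
        return sum(366 if calendar.isleap(y) else 365 for y in years)
    return len(years) * year_length


def date_360(day_index, start_year):
    """Map a 0-based day index to (year, month, day) in a 360-day calendar."""
    year_offset, rest = divmod(day_index, 360)
    month_offset, day_offset = divmod(rest, 30)  # assumes twelve 30-day months
    return start_year + year_offset, month_offset + 1, day_offset + 1
```

If the number of lines in a GCM predictor file does not match expected_days for the dates entered, the Year Length or the Standard Start and Standard End dates are probably set incorrectly for that GCM.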
Further Reading
SDSM tool
Wilby, R.L. & Dawson, C.W. (2004): Using SDSM Version 3.1 – A Decision Support Tool for the Assessment of Regional Climate Change Impacts. User Manual. 67pp.
Wilby, R.L., Dawson, C.W. & Barrow, E.M. (2002): SDSM – a decision support tool for the assessment of regional climate change impacts. Environmental Modelling & Software 17: 145-157.
Wilby, R.L., Hassan, H. & Hanaki, K. (1998): Statistical downscaling of hydrometeorological variables using general circulation model output. Journal of Hydrology 205: 1-19.
Wilby, R.L. (2003): Past and projected trends in London’s urban heat island. Weather 58: 251-260.
SDSM application and evaluation
Dibike Y. B. and Coulibaly P. (2005): Hydrologic Impact of Climate Change in the Saguenay Watershed: Comparison of Downscaling Methods and Hydrologic Models, Journal of Hydrology 307(1-4): 145-163.
Dibike Y. B., Gachon P., St-Hilaire A., Ouarda T. B.M.J. and Nguyen V-T-V. (2007): Uncertainty Analysis of Statistically Downscaled Temperature and Precipitation Regimes in Northern Canada, Theoretical and Applied Climatology (in press).
Gachon P. and Dibike Y.B. (2007): Temperature change signals in northern Canada: Convergence of statistical downscaling results using two driving GCMs, International Journal of Climatology (In review, December 2006).
Gachon, P., St-Hilaire A., Ouarda T.B.M.J., Nguyen V.T.V., Lin C., Milton J., Chaumont D., Goldstein J., Hessami M., Nguyen T.D., Selva F., Nadeau M., Roy P., Parishkura D., Major N., Choux M. and Bourque A. (2005): A first evaluation of the strength and weaknesses of statistical downscaling methods for simulating extremes over various regions of eastern Canada. Sub-component, Climate Change Action Fund (CCAF), Environment Canada, Final report, Montréal, Québec, Canada, 209 pp. (available from the 1st author).
Goodess C.M., Osborn T.J., and Hulme M. (2003): The identification and evaluation of suitable scenario development methods for the estimation of future probabilities of extreme weather events. Tyndall Centre Technical Report 4: Norwich, 69 pp.
Harpham C. and Wilby R.L. (2005): Multi-site downscaling of heavy daily precipitation occurrence and amounts. Journal of Hydrology 312(1-4):235-255.
Haylock, M.R., Cawley G.C., Harpham C., Wilby R.L., and Goodess C.M. (2006): Downscaling heavy precipitation over the UK: A comparison of dynamical and statistical methods and their future scenarios, International Journal of Climatology 26(10): 1397-1415.
Khan M.S., Coulibaly P. and Dibike Y. (2006): Uncertainty Analysis of Statistical Downscaling Methods, Journal of Hydrology 319(1-4):357-382.
Nguyen V.T.V., Nguyen T.D., and Gachon P. (2006): On the linkage of large-scale climate variability with local characteristics of daily precipitation and temperature extremes: an evaluation of statistical downscaling methods. Advances in Geosciences (WSPC/SPI-B368) 4(16): 1-9.


