A major element of the CDMP 19th Century Forts and Voluntary Observers Database Build Project is quality control of the digitized U.S. daily weather observations. A number of computerized tests are run on the CDMP Forts data to improve and ensure their quality. When these tests find outliers, the values are checked manually and corrected only if they are clearly in error.
The first group of tests concentrates on gross errors in the entirety of the keyed data and inconsistencies between keyed data and metadata. The primary functions of these three tests are:
To scan through each data type and identify values that are logically or physically impossible. An example of the former might be a non-existent cloud type abbreviation and an example of the latter, a relative humidity reading of 0%.
To check for consistency between the date of the keyed data and the date identified in the metadata.
To check the keyed data against the metadata in detail, element by element. Any element that is requested in the metadata but not keyed, or keyed but not requested, is flagged.
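The first and third of these checks can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names, the record layout, the set of valid cloud type abbreviations, and the upper humidity bound are all assumptions made for the example.

```python
# Hypothetical set of valid cloud type abbreviations (assumed for illustration).
VALID_CLOUD_TYPES = {"Ci", "Cu", "St", "Ni"}


def find_impossible_values(records):
    """Return (record index, element, value) for logically or physically impossible values."""
    flagged = []
    for i, rec in enumerate(records):
        cloud = rec.get("cloud_type")
        # Logically impossible: an abbreviation that does not exist.
        if cloud is not None and cloud not in VALID_CLOUD_TYPES:
            flagged.append((i, "cloud_type", cloud))
        rh = rec.get("relative_humidity")
        # Physically impossible: 0% (as in the text) or, by assumption, above 100%.
        if rh is not None and not (0 < rh <= 100):
            flagged.append((i, "relative_humidity", rh))
    return flagged


def compare_elements(requested, keyed):
    """Flag elements requested in the metadata but not keyed, and keyed but not requested."""
    requested, keyed = set(requested), set(keyed)
    return {
        "requested_not_keyed": sorted(requested - keyed),
        "keyed_not_requested": sorted(keyed - requested),
    }
```

Flagged items would then go to manual review rather than being changed automatically.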
The second set of tests focuses on the accuracy of the temperature, precipitation, and snowfall elements. For all three variables, the monthly sum and mean are calculated from the keyed values and compared with those recorded by the observer. Large differences are flagged for review. Values are also tested against subjectively determined thresholds for extremes.
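The comparison of computed and observer-recorded monthly statistics might look like the sketch below. The function name and the `tolerance` value are assumptions; the source does not state how large a difference must be to be flagged.

```python
def check_monthly_summary(daily_values, observer_sum=None, observer_mean=None, tolerance=0.5):
    """Compare the computed monthly sum and mean with the observer's recorded values.

    Returns a list of (statistic, computed, recorded) tuples for large differences.
    `tolerance` is an illustrative threshold, not the project's.
    """
    values = [v for v in daily_values if v is not None]
    if not values:
        return []
    computed_sum = sum(values)
    computed_mean = computed_sum / len(values)
    flags = []
    if observer_sum is not None and abs(computed_sum - observer_sum) > tolerance:
        flags.append(("sum", computed_sum, observer_sum))
    if observer_mean is not None and abs(computed_mean - observer_mean) > tolerance:
        flags.append(("mean", computed_mean, observer_mean))
    return flags
```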
Other monthly tests for temperature include element cross-comparisons between mean, maximum, minimum and hourly temperatures, temperature range, and dry bulb and wet bulb temperatures. Daily tests include examination of climatologically extreme values and daily spikes, very large and very small diurnal temperature ranges, and comparisons of maximum, minimum, and hourly temperatures for inconsistencies.
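A few of the daily temperature tests can be illustrated with a short sketch. The thresholds, the field names `tmax` and `tmin`, and the definition of a spike as a large day-to-day jump in maximum temperature are assumptions for the example, not the project's actual criteria.

```python
def check_daily_temperatures(days, min_range=1.0, max_range=50.0, spike=30.0):
    """Flag inconsistent or suspicious daily temperatures.

    `days` is a list of dicts with 'tmax' and 'tmin'; all thresholds are illustrative.
    Returns a list of (day index, reason) tuples.
    """
    flags = []
    prev_tmax = None
    for i, day in enumerate(days):
        tmax, tmin = day.get("tmax"), day.get("tmin")
        if tmax is not None and tmin is not None:
            if tmax < tmin:
                # Maximum below minimum is an internal inconsistency.
                flags.append((i, "tmax_below_tmin"))
            else:
                diurnal_range = tmax - tmin
                if diurnal_range > max_range:
                    flags.append((i, "range_too_large"))
                elif diurnal_range < min_range:
                    flags.append((i, "range_too_small"))
        # Spike: a large change in the maximum from the previous available day.
        if tmax is not None and prev_tmax is not None and abs(tmax - prev_tmax) > spike:
            flags.append((i, "tmax_spike"))
        if tmax is not None:
            prev_tmax = tmax
    return flags
```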
Monthly precipitation assessments include a manual check of the five largest monthly precipitation totals. Daily tests include examination of climatologically extreme values and checks that hourly values sum to the daily totals and daily values sum to the monthly totals. When the sum of the keyed daily values equals the monthly total reported by the observer, all remaining days are filled with zeroes to create the finalized precipitation data.
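The zero-fill rule can be sketched as below, assuming that days without a keyed value are represented as `None`. The function name and the small `tolerance` used to absorb floating-point rounding are assumptions for the example.

```python
def fill_missing_precip(daily, observer_total, tolerance=0.005):
    """Fill unkeyed days with 0.0 when the keyed days already account for the monthly total.

    If the keyed sum does not match the observer's total, the data are returned unchanged.
    """
    keyed_sum = sum(v for v in daily if v is not None)
    if abs(keyed_sum - observer_total) <= tolerance:
        return [0.0 if v is None else v for v in daily]
    return list(daily)
```

When the sums disagree, the blanks are left alone because a missing day could hold unrecorded precipitation.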
The five largest monthly snowfall and snow depth values are also manually assessed. Comparisons are made between the monthly precipitation and snowfall sums, as well as between the monthly snowfall and snow depth sums. Snowfall is also checked against the monthly mean minimum temperature.
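The source does not say how snowfall is checked against the monthly mean minimum temperature. One plausible form, shown purely as an assumption, is to flag months that report snowfall while the mean minimum temperature is well above freezing. The function name, field names, and the 40 °F threshold are hypothetical.

```python
def flag_warm_snow_months(months, max_mean_tmin=40.0):
    """Return the labels of months that report snowfall despite a warm mean minimum temperature.

    `months` is a list of dicts with 'month', 'snowfall', and 'mean_tmin' (degrees F).
    The threshold is an illustrative assumption.
    """
    return [
        m["month"]
        for m in months
        if m["snowfall"] > 0 and m["mean_tmin"] > max_mean_tmin
    ]
```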
For further details on quality control of the 19th century weather data, see this report (pdf).