R/reformat_GSOD.R
reformat_GSOD.Rd
This function automates cleaning and reformatting of GSOD station files
that have been downloaded from the United States National Centers for
Environmental Information's (NCEI) download page, either in “STATION.csv”
format or as an annual “YEAR.tar.gz” archive, provided that the archive has
been untarred. Three additional useful elements, saturation vapour pressure
(es), actual vapour pressure (ea) and relative humidity (RH), are calculated
using the improved August-Roche-Magnus approximation (Alduchov and Eskridge
1996) and returned in the final data frame. All units are converted to the
International System of Units (SI), e.g., Fahrenheit to Celsius and
inches to millimetres.
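The derived elements and unit conversions described above can be sketched in a few lines of R. This is a minimal illustration, not the package's internal code: the helper names `magnus_vp()`, `fahrenheit_to_celsius()` and `inches_to_mm()` are hypothetical, and the constants are assumed to be the Magnus-form coefficients from Alduchov and Eskridge (1996), expressed in kPa. With these assumptions, the first row of the example output (TEMP 21.2, DEWP 17.9) is reproduced.

```r
# Magnus-form vapour pressure (kPa) for a temperature in degrees Celsius.
# Assumption: coefficients follow Alduchov and Eskridge (1996); GSODR's own
# implementation may differ in detail.
magnus_vp <- function(temp_c) {
  0.61094 * exp((17.625 * temp_c) / (temp_c + 243.04))
}

# Unit conversions of the kind applied to the raw GSOD values.
fahrenheit_to_celsius <- function(temp_f) (temp_f - 32) * 5 / 9
inches_to_mm <- function(x_in) x_in * 25.4

temp_c <- 21.2 # mean air temperature, degrees C
dewp_c <- 17.9 # mean dew point, degrees C

es <- magnus_vp(temp_c) # saturation vapour pressure
ea <- magnus_vp(dewp_c) # actual vapour pressure, evaluated at the dew point
rh <- ea / es * 100     # relative humidity, percent

round(c(ea = ea, es = es, rh = rh), 1)
#>   ea   es   rh
#>  2.0  2.5 81.5
```

Evaluating the same formula at the dew point gives the actual vapour pressure, so relative humidity is simply the ratio of the two results.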
reformat_GSOD(dsn = NULL, file_list = NULL)
dsn | User-supplied full file path to the location of data files on local disk for tidying. |
---|---|
file_list | User-supplied list of file paths to individual files of data on local disk for tidying. Ignored if |
A data frame as a data.table object of GSOD data.
If multiple stations are given, data are summarised for each year by station, including vapour pressure and relative humidity elements calculated from existing data in GSOD. Otherwise, a single station is tidied and a data frame is returned.
All missing values in resulting files are represented as NA regardless of which field they occur in.
Only station files in the original “csv” file format are supported by this function. If you have downloaded the full annual (“YYYY.tar.gz”) file you will need to extract the individual station files from the tar file first to use this function.
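A minimal sketch of that extraction step, using `utils::untar()` from base R. The helper name `extract_gsod_archive()` and the default output directory are hypothetical, chosen only for illustration.

```r
# Hypothetical helper: extract an annual archive and return the paths of the
# station ".csv" files it contained.
extract_gsod_archive <- function(tar_path,
                                 exdir = file.path(tempdir(), "gsod_extracted")) {
  dir.create(exdir, showWarnings = FALSE, recursive = TRUE)
  utils::untar(tar_path, exdir = exdir)
  list.files(exdir, pattern = "\\.csv$", full.names = TRUE)
}

# Usage, assuming "2010.tar.gz" has already been downloaded to tempdir():
# station_files <- extract_gsod_archive(file.path(tempdir(), "2010.tar.gz"))
# tbar <- GSODR::reformat_GSOD(file_list = station_files)
```

Extracting into a dedicated directory keeps the station files apart from any unrelated “.csv” files.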
Note that reformat_GSOD() will attempt to reformat any “.csv” files found in the dsn that you provide. If there are non-GSOD files present this will lead to errors.
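One way to sidestep this, sketched below, is to build the list of station files yourself and pass it through `file_list` rather than `dsn`. The helper `list_gsod_files()` is hypothetical, and the file-name pattern is an assumption based on the example station file “95551099999.csv” (an 11-character station identifier followed by “.csv”); adjust it if your files are named differently.

```r
# Hypothetical helper: keep only files whose names look like GSOD station files.
# Assumption: station files are named "<11-character station ID>.csv".
list_gsod_files <- function(dir) {
  list.files(dir, pattern = "^[[:alnum:]]{11}\\.csv$", full.names = TRUE)
}

gsod_files <- list_gsod_files(tempdir())

# Only call reformat_GSOD() when GSODR is installed and matching files exist.
if (requireNamespace("GSODR", quietly = TRUE) && length(gsod_files) > 0) {
  tbar <- GSODR::reformat_GSOD(file_list = gsod_files)
}
```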
For a complete list of the fields and a description of their contents and units, please refer to Appendix 1 in the GSODR vignette, vignette("GSODR", package = "GSODR").
While GSODR does not distribute GSOD weather data, users of the data should note the conditions that the U.S. NCEI places upon the GSOD data. “The following data and products may have conditions placed on their international commercial use. They can be used within the U.S. or for non-commercial international activities without restriction. The non-U.S. data cannot be redistributed for commercial purposes. Re-distribution of these data by others must provide this same notification. A log of IP addresses accessing these data and products will be maintained and may be made available to data providers.”
Alduchov, O.A. and Eskridge, R.E., 1996. Improved Magnus form approximation of saturation vapor pressure. Journal of Applied Meteorology and Climatology, 35(4), pp.601-609. DOI: <10.1175
For automated downloading and tidying see the get_GSOD function, which provides expanded functionality for automatically downloading and expanding annual GSOD files and cleaning station files.
Adam H. Sparks, adamhsparks@gmail.com
# \donttest{
# Download data to 'tempdir()'
download.file(
  url = "https://www.ncei.noaa.gov/data/global-summary-of-the-day/access/2010/95551099999.csv",
  destfile = file.path(tempdir(), "95551099999.csv"),
  mode = "wb"
)

# Reformat station data files in R's tempdir() directory
tbar <- reformat_GSOD(dsn = tempdir())

tbar
#>             STNID              NAME CTRY COUNTRY_NAME ISO2C ISO3C STATE
#>   1: 955510-99999 TOOWOOMBA AIRPORT   AS    AUSTRALIA    AU   AUS
#>   2: 955510-99999 TOOWOOMBA AIRPORT   AS    AUSTRALIA    AU   AUS
#>   3: 955510-99999 TOOWOOMBA AIRPORT   AS    AUSTRALIA    AU   AUS
#>   4: 955510-99999 TOOWOOMBA AIRPORT   AS    AUSTRALIA    AU   AUS
#>   5: 955510-99999 TOOWOOMBA AIRPORT   AS    AUSTRALIA    AU   AUS
#>  ---
#> 361: 955510-99999 TOOWOOMBA AIRPORT   AS    AUSTRALIA    AU   AUS
#> 362: 955510-99999 TOOWOOMBA AIRPORT   AS    AUSTRALIA    AU   AUS
#> 363: 955510-99999 TOOWOOMBA AIRPORT   AS    AUSTRALIA    AU   AUS
#> 364: 955510-99999 TOOWOOMBA AIRPORT   AS    AUSTRALIA    AU   AUS
#> 365: 955510-99999 TOOWOOMBA AIRPORT   AS    AUSTRALIA    AU   AUS
#>      LATITUDE LONGITUDE ELEVATION    BEGIN      END   YEARMODA YEAR MONTH DAY
#>   1:   -27.55   151.917       642 19980301 20210116 2010-01-01 2010     1   1
#>   2:   -27.55   151.917       642 19980301 20210116 2010-01-02 2010     1   2
#>   3:   -27.55   151.917       642 19980301 20210116 2010-01-03 2010     1   3
#>   4:   -27.55   151.917       642 19980301 20210116 2010-01-04 2010     1   4
#>   5:   -27.55   151.917       642 19980301 20210116 2010-01-05 2010     1   5
#>  ---
#> 361:   -27.55   151.917       642 19980301 20210116 2010-12-27 2010    12  27
#> 362:   -27.55   151.917       642 19980301 20210116 2010-12-28 2010    12  28
#> 363:   -27.55   151.917       642 19980301 20210116 2010-12-29 2010    12  29
#> 364:   -27.55   151.917       642 19980301 20210116 2010-12-30 2010    12  30
#> 365:   -27.55   151.917       642 19980301 20210116 2010-12-31 2010    12  31
#>      YDAY TEMP TEMP_ATTRIBUTES DEWP DEWP_ATTRIBUTES    SLP SLP_ATTRIBUTES   STP
#>   1:    1 21.2               8 17.9               8 1013.4              8 942.0
#>   2:    2 23.2               8 19.4               8 1010.5              8 939.3
#>   3:    3 21.4               8 18.9               8 1012.3              8 940.9
#>   4:    4 18.9               8 16.4               8 1015.7              8 944.1
#>   5:    5 20.5               8 16.4               8 1015.5              8 944.0
#>  ---
#> 361:  361 18.9               8 17.3               8 1005.8              8 934.8
#> 362:  362 17.7               8 14.7               8 1013.2              8 941.6
#> 363:  363 19.8               8 15.4               8 1014.4              8 942.7
#> 364:  364 21.4               8 16.1               8 1014.6              8 942.9
#> 365:  365 21.1               8 15.7               8 1013.9              8 942.3
#>      STP_ATTRIBUTES VISIB VISIB_ATTRIBUTES WDSP WDSP_ATTRIBUTES MXSPD GUST  MAX
#>   1:              8    NA                0  4.3               8   6.7   NA 25.8
#>   2:              8    NA                0  3.7               8   5.1   NA 26.5
#>   3:              8  14.3                6  7.6               8  10.3   NA 28.7
#>   4:              8  23.3                4  8.7               8  10.3   NA 24.1
#>   5:              8    NA                0  7.5               8  10.8   NA 24.6
#>  ---
#> 361:              8  25.3                5  7.2               8  11.8   NA 22.1
#> 362:              8    NA                0 11.2               8  14.4   NA 21.5
#> 363:              8    NA                0  6.6               8   9.3   NA 24.0
#> 364:              8    NA                0  7.2               8   9.3   NA 26.5
#> 365:              8    NA                0  7.5               8   9.3   NA 26.5
#>      MAX_ATTRIBUTES  MIN MIN_ATTRIBUTES PRCP PRCP_ATTRIBUTES SNDP I_FOG
#>   1:           <NA> 17.8           <NA>  1.5               G   NA    NA
#>   2:           <NA> 19.1           <NA>  0.3               G   NA    NA
#>   3:           <NA> 19.3              * 19.8               G   NA    NA
#>   4:           <NA> 16.9              *  1.0               G   NA    NA
#>   5:           <NA> 16.7           <NA>  0.3               G   NA    NA
#>  ---
#> 361:           <NA> 17.9              * 42.9               G   NA    NA
#> 362:           <NA> 15.4              *  0.3               G   NA    NA
#> 363:           <NA> 15.0           <NA>  0.0               G   NA    NA
#> 364:           <NA> 15.0           <NA>  0.0               G   NA    NA
#> 365:           <NA> 16.5           <NA>  0.0               G   NA    NA
#>      I_RAIN_DRIZZLE I_SNOW_ICE I_HAIL I_THUNDER I_TORNADO_FUNNEL  EA  ES   RH
#>   1:             NA         NA     NA        NA               NA 2.0 2.5 81.5
#>   2:             NA         NA     NA        NA               NA 2.2 2.8 79.2
#>   3:             NA         NA     NA        NA               NA 2.2 2.5 85.7
#>   4:             NA         NA     NA        NA               NA 1.9 2.2 85.4
#>   5:             NA         NA     NA        NA               NA 1.9 2.4 77.3
#>  ---
#> 361:             NA         NA     NA        NA               NA 2.0 2.2 90.4
#> 362:             NA         NA     NA        NA               NA 1.7 2.0 82.6
#> 363:             NA         NA     NA        NA               NA 1.7 2.3 75.8
#> 364:             NA         NA     NA        NA               NA 1.8 2.5 71.8
#> 365:             NA         NA     NA        NA               NA 1.8 2.5 71.3
# }