Introduction

This R package provides access to the OpenAQ API. OpenAQ is a community of scientists, software developers, and lovers of open environmental data who are building an open, real-time database that provides programmatic and historical access to air quality data. See their website at https://openaq.org/ and the API documentation at https://docs.openaq.org/. The package contains five functions that correspond to the five types of query offered by the OpenAQ API: cities, countries, latest, locations and measurements. The package uses the dplyr package: all output tables are data.frame (dplyr “tbl_df”) objects that can be further processed and analysed.

What data can you get?

Since November 2017, the API only provides access to the latest 90 days of OpenAQ data. The complete OpenAQ data can be accessed via Amazon S3. See this announcement. You can interact with Amazon S3 using the aws.s3 package. The maintainer of ropenaq plans to write tutorials about how to access OpenAQ data and will keep the documentation of ropenaq up to date regarding data access changes.

Finding measurements availability

Three functions of the package return lists of available information. Measurements are obtained from locations that are in cities that are in countries.

The aq_countries function

The aq_countries function shows for which countries information is available within the platform. It is the easiest function to use because it can be called without any argument. The code for each country is its ISO 3166-1 alpha-2 code.

library("ropenaq")
countries_table <- aq_countries()
library("knitr")
kable(countries_table)
name code cities locations count
Andorra AD 2 3 61803
Argentina AR 1 4 14976
Australia AU 20 104 5271548
Austria AT 16 306 1521351
Bahrain BH 1 1 30603
Bangladesh BD 1 2 24911
Belgium BE 14 198 1547173
Bosnia and Herzegovina BA 10 22 961764
Brazil BR 72 119 2812094
Canada CA 12 200 3854349
Chile CL 140 128 5961662
China CN 388 1444 26292400
Colombia CO 11 32 191475
Croatia HR 17 50 561913
Curaçao CW 1 1 10042
Czech Republic CZ 15 200 3625468
Denmark DK 7 25 376859
Ethiopia ET 1 2 38291
Finland FI 35 108 1276496
France FR 135 1175 15449102
Germany DE 36 1027 10585082
Ghana GH 1 11 1595
Gibraltar GI 2 6 40842
Hong Kong HK 9 16 803178
Hungary HU 14 50 1278268
India IN 106 335 11399682
Indonesia ID 2 3 54435
Iraq IQ 1 1 355
Ireland IE 16 38 188081
Israel IL 14 159 192119580
Italy IT 45 104 2204825
Kazakhstan KZ 1 1 5448
Kenya KE 1 2 2842
Kosovo XK 8 9 74041
Kuwait KW 1 1 15617
Kyrgyzstan KG 1 1 397
Latvia LV 4 4 102351
Lithuania LT 8 17 295130
Luxembourg LU 3 7 254313
Macedonia, the Former Yugoslav Republic of MK 16 30 779954
Malta MT 4 4 146519
Mexico MX 5 97 2579658
Mongolia MN 25 42 2774703
Nepal NP 1 4 58345
Netherlands NL 68 112 6718365
Nigeria NG 1 1 2541
Norway NO 37 83 2287249
Peru PE 1 21 689689
Philippines PH 1 1 958
Poland PL 146 204 2426752
Portugal PT 16 67 1837362
Russian Federation RU 1 49 187117
Serbia RS 4 5 74774
Singapore SG 1 1 1275
Slovakia SK 8 38 1364750
Slovenia SI 8 8 85272
South Africa ZA 1 11 571193
Spain ES 115 1070 12857377
Sri Lanka LK 1 1 11107
Sweden SE 3 15 333209
Switzerland CH 14 25 863160
Taiwan, Province of China TW 30 77 3784558
Thailand TH 33 68 4111113
Turkey TR 43 152 5640285
Uganda UG 1 1 15597
United Arab Emirates AE 2 2 26984
United Kingdom GB 112 162 8041583
United States US 774 2083 45487757
Uzbekistan UZ 1 1 5448
Viet Nam VN 2 3 50913

The aq_cities function

Using the aq_cities function, one can get all cities for which information is available within the platform. For each city, one gets the number of locations, the count of measurements, the URL-encoded city name, and the country it is in.

city country locations count cityURL
Escaldes-Engordany AD 2 58565 Escaldes-Engordany
unused AD 1 3238 unused
Abu Dhabi AE 1 16997 Abu+Dhabi
Dubai AE 1 9987 Dubai
Buenos Aires AR 4 14976 Buenos+Aires
Amt der Nieder�sterreichischen Landesregierung AT 39 322499 Amt+der+Nieder%EF%BF%BDsterreichischen+Landesregierung

The optional country argument restricts the result to a given country instead of the whole world.

cities_tableIndia <- aq_cities(country="IN", limit = 10, page = 1)
kable(cities_tableIndia)
city country locations count cityURL
Delhi IN 79 1976481 Delhi
Hyderabad IN 16 769525 Hyderabad
Jodhpur IN 2 197137 Jodhpur
Fatehabad IN 1 864 Fatehabad+
Alwar IN 2 16546 Alwar
Satna IN 3 18758 Satna
Howrah IN 5 132199 Howrah
Barddhaman IN 3 2470 Barddhaman
Muzaffarpur IN 3 156139 Muzaffarpur
Siliguri IN 3 20378 Siliguri

If one inputs a country that is not in the platform (or misspells a code), then an error message is thrown.
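
A cheap way to avoid that error is to check the code locally before sending a query. The sketch below is an illustration only: known_countries is a small hand-copied excerpt of the countries table shown earlier (in practice one would use the result of aq_countries()), and is_known_country is a hypothetical helper, not part of ropenaq.

```r
# Excerpt of the countries table shown above, copied by hand for illustration.
# In a real session this would be the data frame returned by aq_countries().
known_countries <- data.frame(
  name = c("India", "Netherlands", "Viet Nam"),
  code = c("IN", "NL", "VN"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of ropenaq): TRUE when `code` looks like an
# ISO 3166-1 alpha-2 code and appears in the table of available countries.
is_known_country <- function(code, countries = known_countries) {
  grepl("^[A-Z]{2}$", code) & code %in% countries$code
}

is_known_country("IN")     # TRUE
is_known_country("India")  # FALSE: a name, not a two-letter code
is_known_country("ZZ")     # FALSE: well-formed but not in the table
```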

The aq_locations function

The aq_locations function has far more arguments than the first two functions. One can filter locations by country, city or location; by parameter (valid values are “pm25”, “pm10”, “so2”, “no2”, “o3”, “co” and “bc”); from a given date and/or up to a given date; for values between a minimum and a maximum; and within a circle around a central point, using the latitude, longitude and radius arguments. In the output table one also gets URL-encoded strings for the city and the location. Below are several examples.
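
To make the radius filter concrete, the sketch below computes the great-circle distance between two of the Chennai locations listed in this vignette and checks whether one lies within a given radius of the other. The helper haversine_m is hypothetical and not part of ropenaq. Treating the radius as metres and the Earth as a sphere of radius 6371 km are assumptions of this example, and the API may compute distances differently.

```r
# Great-circle distance in metres between two points given in decimal degrees
# (haversine formula, spherical Earth with mean radius 6371 km).
haversine_m <- function(lat1, lon1, lat2, lon2, earth_radius_m = 6371000) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * earth_radius_m * asin(sqrt(a))
}

# Coordinates of "Alandur Bus Depot" and "IIT" from the Chennai table.
d <- haversine_m(12.99711, 80.19151, 12.99251, 80.23745)

# A location is "within the circle" when its distance to the centre is at
# most the radius; 10000 is an arbitrary illustrative radius.
radius_m <- 10000
d <= radius_m
```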

Here we only look for locations with PM2.5 information in Chennai, India.

locations_chennai <- aq_locations(country = "IN", city = "Chennai", parameter = "pm25")
kable(locations_chennai)
location city country count sourceNames lastUpdated firstUpdated distance sourceName latitude longitude pm25 pm10 no2 so2 o3 co bc cityURL locationURL
Alandur Bus Depot Chennai IN 13224 CPCB 1519271100 1487450700 8196207 CPCB 12.99711 80.19151 TRUE FALSE FALSE FALSE FALSE FALSE FALSE Chennai Alandur+Bus+Depot
Alandur Bus Depot, Chennai - CPCB Chennai IN 4817 c(“data.gov.in”, “caaqm”) 1550121300 1520573400 8197392 caaqm 12.90992 80.10765 TRUE FALSE FALSE FALSE FALSE FALSE FALSE Chennai Alandur+Bus+Depot%2C+Chennai+-+CPCB
IIT Chennai IN 16204 CPCB 1519271100 1487442600 8199905 CPCB 12.99251 80.23745 TRUE FALSE FALSE FALSE FALSE FALSE FALSE Chennai IIT
IIT, Chennai - CPCB Chennai IN 220 c(“data.gov.in”, “caaqm”) 1523859300 1520573400 8199021 caaqm 13.00522 80.23981 TRUE FALSE FALSE FALSE FALSE FALSE FALSE Chennai IIT%2C+Chennai+-+CPCB
Manali Chennai IN 19515 CPCB 1519271100 1487452500 8187465 CPCB 13.16454 80.26285 TRUE FALSE FALSE FALSE FALSE FALSE FALSE Chennai Manali
Manali, Chennai - CPCB Chennai IN 8111 c(“caaqm”, “data.gov.in”) 1550108700 1520573400 8187465 caaqm 13.16454 80.26285 TRUE FALSE FALSE FALSE FALSE FALSE FALSE Chennai Manali%2C+Chennai+-+CPCB
US Diplomatic Post: Chennai Chennai IN 25944 StateAir_Chennai 1553517000 1449869400 8194955 StateAir_Chennai 13.08784 80.27847 TRUE FALSE FALSE FALSE FALSE FALSE FALSE Chennai US+Diplomatic+Post%3A+Chennai
Velachery Res. Area, Chennai - CPCB Chennai IN 5864 caaqm 1550121300 1523862900 8199021 caaqm 13.00522 80.23981 TRUE FALSE FALSE FALSE FALSE FALSE FALSE Chennai Velachery+Res.+Area%2C+Chennai+-+CPCB
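
The lastUpdated and firstUpdated columns in this table are printed as bare integers. They look like Unix timestamps (seconds since 1970-01-01 UTC). Assuming that interpretation is right, base R can turn them into date-times. The values below are copied from the first row of the table, and to_utc is a hypothetical helper.

```r
# Values copied from the "Alandur Bus Depot" row above; interpreting them as
# seconds since the Unix epoch is an assumption of this example.
first_updated <- 1487450700
last_updated  <- 1519271100

# Hypothetical helper: seconds since the epoch to a POSIXct value in UTC.
to_utc <- function(seconds) {
  as.POSIXct(seconds, origin = "1970-01-01", tz = "UTC")
}

format(to_utc(last_updated), "%Y-%m-%d %H:%M:%S")
# "2018-02-22 03:45:00"

# Length of the period covered by this location, in days.
as.numeric(difftime(to_utc(last_updated), to_utc(first_updated), units = "days"))
```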

Getting measurements

Two functions return data: aq_measurements and aq_latest. In both of them, the arguments city and location need to be given as URL-encoded strings.
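
The cityURL and locationURL columns of the output tables already contain these encoded strings, so the simplest route is to copy them from there. For illustration only, the sketch below reproduces the encoding seen in the tables (percent-encoding with spaces written as +) using base R. encode_for_query is a hypothetical helper, not a ropenaq function, and the package may build its strings differently.

```r
# Hypothetical helper: percent-encode reserved characters, then write spaces
# as "+" to match the cityURL/locationURL strings shown in the tables.
encode_for_query <- function(x) {
  encoded <- vapply(
    x,
    function(s) utils::URLencode(s, reserved = TRUE),
    character(1),
    USE.NAMES = FALSE
  )
  gsub("%20", "+", encoded, fixed = TRUE)
}

encode_for_query("US Diplomatic Post: Chennai")
# "US+Diplomatic+Post%3A+Chennai"
encode_for_query("Alandur Bus Depot, Chennai - CPCB")
# "Alandur+Bus+Depot%2C+Chennai+-+CPCB"
```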

The aq_measurements function

The aq_measurements function has many arguments for making a query specific to, say, a given parameter in a given location, or a circle around a central point defined by the latitude, longitude and radius arguments. Below we get the PM2.5 measurements for Delhi in India.

results_table <- aq_measurements(country = "IN", city = "Delhi", parameter = "pm25", limit = 10, page = 1)
kable(results_table)
location parameter value unit country city latitude longitude dateUTC dateLocal cityURL locationURL
US Diplomatic Post: New Delhi pm25 34.9 µg/m³ IN Delhi 28.63576 77.22445 2019-03-25 13:30:00 2019-03-25 19:00:00 Delhi US+Diplomatic+Post%3A+New+Delhi
US Diplomatic Post: New Delhi pm25 34.0 µg/m³ IN Delhi 28.63576 77.22445 2019-03-25 12:30:00 2019-03-25 18:00:00 Delhi US+Diplomatic+Post%3A+New+Delhi
US Diplomatic Post: New Delhi pm25 32.0 µg/m³ IN Delhi 28.63576 77.22445 2019-03-25 11:30:00 2019-03-25 17:00:00 Delhi US+Diplomatic+Post%3A+New+Delhi
US Diplomatic Post: New Delhi pm25 27.1 µg/m³ IN Delhi 28.63576 77.22445 2019-03-25 10:30:00 2019-03-25 16:00:00 Delhi US+Diplomatic+Post%3A+New+Delhi
US Diplomatic Post: New Delhi pm25 33.7 µg/m³ IN Delhi 28.63576 77.22445 2019-03-25 09:30:00 2019-03-25 15:00:00 Delhi US+Diplomatic+Post%3A+New+Delhi
US Diplomatic Post: New Delhi pm25 43.6 µg/m³ IN Delhi 28.63576 77.22445 2019-03-25 08:30:00 2019-03-25 14:00:00 Delhi US+Diplomatic+Post%3A+New+Delhi
US Diplomatic Post: New Delhi pm25 55.6 µg/m³ IN Delhi 28.63576 77.22445 2019-03-25 07:30:00 2019-03-25 13:00:00 Delhi US+Diplomatic+Post%3A+New+Delhi
US Diplomatic Post: New Delhi pm25 68.5 µg/m³ IN Delhi 28.63576 77.22445 2019-03-25 06:30:00 2019-03-25 12:00:00 Delhi US+Diplomatic+Post%3A+New+Delhi
US Diplomatic Post: New Delhi pm25 94.1 µg/m³ IN Delhi 28.63576 77.22445 2019-03-25 05:30:00 2019-03-25 11:00:00 Delhi US+Diplomatic+Post%3A+New+Delhi
US Diplomatic Post: New Delhi pm25 110.4 µg/m³ IN Delhi 28.63576 77.22445 2019-03-25 04:30:00 2019-03-25 10:00:00 Delhi US+Diplomatic+Post%3A+New+Delhi

One could also get all possible parameters in the same table.
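
Because the result is an ordinary data frame, it can be summarised with the usual tools. As a small illustration, the sketch below rebuilds two columns of the Delhi table above by hand and computes a few summary statistics with base R. In a real session one would work on results_table directly.

```r
# PM2.5 values copied from the Delhi table above (same unit as in the table),
# newest first, with the matching hourly UTC timestamps from 13:30 to 04:30.
delhi_pm25 <- data.frame(
  dateUTC = as.POSIXct(
    sprintf("2019-03-25 %02d:30:00", 13:4),
    tz = "UTC"
  ),
  value = c(34.9, 34.0, 32.0, 27.1, 33.7, 43.6, 55.6, 68.5, 94.1, 110.4)
)

mean(delhi_pm25$value)                            # 53.39
max(delhi_pm25$value)                             # 110.4
delhi_pm25$dateUTC[which.max(delhi_pm25$value)]   # the 04:30 UTC reading
```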

The aq_latest function

This function returns a table with the newest measurements for the locations selected by the arguments. If all arguments are NULL, it returns the newest measurements for all locations. Below are the latest values for Hyderabad at the time this vignette was compiled.

tableLatest <- aq_latest(country="IN", city="Hyderabad")
kable(head(tableLatest))
location city country distance latitude longitude parameter value lastUpdated unit sourceName averagingPeriod_value averagingPeriod_unit cityURL locationURL
Bollaram Industrial Area Hyderabad IN NA NA NA so2 16.8 2017-02-17 05:15:00 µg/m³ CPCB 0.25 hours Hyderabad Bollaram+Industrial+Area
Bollaram Industrial Area Hyderabad IN NA NA NA pm10 137.0 2017-02-17 05:15:00 µg/m³ CPCB 0.25 hours Hyderabad Bollaram+Industrial+Area
Bollaram Industrial Area Hyderabad IN NA NA NA no2 16.2 2017-02-17 05:15:00 µg/m³ CPCB 0.25 hours Hyderabad Bollaram+Industrial+Area
Bollaram Industrial Area Hyderabad IN NA NA NA pm25 55.0 2017-02-17 05:15:00 µg/m³ CPCB 0.25 hours Hyderabad Bollaram+Industrial+Area
Bollaram Industrial Area Hyderabad IN NA NA NA co 420.0 2017-02-17 05:15:00 µg/m³ CPCB 0.25 hours Hyderabad Bollaram+Industrial+Area
Bollaram Industrial Area, Hyderabad - TSPCB Hyderabad IN 7704866 17.54089 78.35853 no2 26.3 2019-02-14 05:15:00 µg/m³ caaqm 0.25 hours Hyderabad Bollaram+Industrial+Area%2C+Hyderabad+-+TSPCB

Paging and limit

For all endpoints/functions, there are limit and page arguments, which indicate, respectively, how many results per page should be returned and which page should be queried. If you do not set these arguments, all results for the query are retrieved by default using asynchronous requests, which might still take a while depending on the total number of results.

aq_measurements(city = "Delhi", parameter = "pm25")
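
To get a feel for how many requests such a query implies, one can divide the total count by the page size. The sketch below does this arithmetic for the Delhi count shown in the cities table (which covers all parameters, so it overstates a PM2.5-only query). The page size of 10000 is an assumed value chosen for illustration, not a documented limit, and pages_needed is a hypothetical helper.

```r
# Number of pages needed to retrieve `total` results at `limit` results per page.
pages_needed <- function(total, limit) {
  ceiling(total / limit)
}

delhi_count <- 1976481   # count for Delhi in the cities table above
page_size   <- 10000     # assumed page size, for illustration only

n_pages <- pages_needed(delhi_count, page_size)
n_pages
# 198

# The page numbers one would request, one call per page.
head(seq_len(n_pages))
```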

Rate limiting

In October 2017 the API introduced a rate limit of 2,000 requests every 5 minutes. Please keep this in mind. When a request receives a response status of 429 (too many requests), the package waits 5 minutes.
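
The waiting behaviour can be pictured with the following sketch. It is not the package's actual code: fake_request stands in for an HTTP call and simply returns a status code, with_rate_limit_retry is a hypothetical wrapper, and the wait is a parameter so that the example can run instantly.

```r
# Hypothetical stand-in for an HTTP call: answers 429 on the first call and
# 200 afterwards. A counter in an environment keeps track of the calls.
state <- new.env()
state$calls <- 0
fake_request <- function() {
  state$calls <- state$calls + 1
  if (state$calls == 1) 429L else 200L
}

# Call `request` until it stops returning 429, sleeping `wait_seconds`
# between attempts and giving up after `max_tries` attempts.
with_rate_limit_retry <- function(request, wait_seconds = 5 * 60, max_tries = 3) {
  for (attempt in seq_len(max_tries)) {
    status <- request()
    if (status != 429L) {
      return(status)
    }
    if (attempt < max_tries) {
      Sys.sleep(wait_seconds)
    }
  }
  stop("Still rate limited after ", max_tries, " attempts")
}

# wait_seconds = 0 keeps the demonstration fast; the text above describes a
# 5-minute wait.
with_rate_limit_retry(fake_request, wait_seconds = 0)
# 200
```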

Other packages of interest for getting air quality data

  • The rdefra package, also part of the rOpenSci project, lets you interact with the UK AIR pollution database from DEFRA, including historical measures.

  • The openair package gives access to the same data as rdefra but relies on a local and compressed copy of the data on servers at King’s College (UK), periodically updated.

  • The usaqmindia package provides data from the US air quality monitoring program in India for Delhi, Mumbai, Chennai, Hyderabad and Kolkata from 2013.

Meta

  • Please report any issues or bugs.
  • License: GPL
  • Get citation information for ropenaq in R by running citation(package = 'ropenaq')
  • Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.