
This vignette provides a brief introduction to the npi package.

npi is an R package that allows R users to access the U.S. National Provider Identifier (NPI) Registry API, which is maintained by the Centers for Medicare & Medicaid Services (CMS).

The package makes it easy to obtain administrative data linked to a specific individual or organizational healthcare provider. Additionally, users can perform advanced searches based on provider name, location, type of service, credentials, and many other attributes.

Search registry

To explore providers with primary locations in New York City, we can use the city argument of npi_search(). The nyc dataset here contains 10 providers (both individuals and organizations) with primary locations in New York City, since 10 is the default number of records returned by npi_search(). The response is a tibble in which high-cardinality data is organized into list columns.

nyc <- npi_search(city = "New York City")
#> 10 records requested
#> Requesting records 0-10...
nyc
#> # A tibble: 10 × 11
#>    npi    enume…¹ basic    other_…² identi…³ taxono…⁴ addres…⁵ practi…⁶ endpoi…⁷
#>  * <chr>  <chr>   <list>   <list>   <list>   <list>   <list>   <list>   <list>  
#>  1 13262… Indivi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#>  2 13564… Indivi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#>  3 14972… Indivi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#>  4 19728… Organi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#>  5 14079… Indivi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#>  6 13665… Indivi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#>  7 18516… Indivi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#>  8 16594… Indivi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#>  9 16695… Indivi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#> 10 10938… Indivi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#> # … with 2 more variables: created_date <dttm>, last_updated_date <dttm>, and
#> #   abbreviated variable names ¹​enumeration_type, ²​other_names, ³​identifiers,
#> #   ⁴​taxonomies, ⁵​addresses, ⁶​practice_locations, ⁷​endpoints
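
To make the idea of list columns concrete, the following is a minimal sketch that uses a toy tibble rather than a real API response. The column names code and primary and all values are made up for illustration; real search results contain different fields. It assumes the tibble and tidyr packages are installed.

```r
library(tibble)
library(tidyr)

# Toy stand-in for a search result: one row per NPI, with a list column
# holding a nested tibble per row. All values are invented for illustration.
toy <- tibble(
  npi = c("0000000001", "0000000002"),
  taxonomies = list(
    tibble(code = c("A1", "B2"), primary = c(TRUE, FALSE)),
    tibble(code = "C3", primary = TRUE)
  )
)

# Pull out the nested tibble for the first record
first_taxonomies <- toy$taxonomies[[1]]

# Expand the list column so that each nested row becomes its own row
long <- unnest(toy, cols = taxonomies)
nrow(long)
```

The same two access patterns, indexing with [[ ]] and unnesting, should apply to the list columns of an actual npi_search() result such as nyc.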

Other search arguments for the function include number, enumeration_type, taxonomy_description, first_name, last_name, use_first_name_alias, organization_name, address_purpose, state, postal_code, country_code, and limit.

Additionally, more than one search argument can be used at once, for example to restrict the results to organizational providers in New York City.

nyc_multi <- npi_search(city = "New York City", state = "NY", enumeration_type = "org")
#> 10 records requested
#> Requesting records 0-10...
nyc_multi
#> # A tibble: 10 × 11
#>    npi    enume…¹ basic    other_…² identi…³ taxono…⁴ addres…⁵ practi…⁶ endpoi…⁷
#>  * <chr>  <chr>   <list>   <list>   <list>   <list>   <list>   <list>   <list>  
#>  1 19728… Organi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#>  2 15886… Organi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#>  3 16292… Organi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#>  4 15383… Organi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#>  5 10637… Organi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#>  6 12354… Organi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#>  7 12452… Organi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#>  8 12353… Organi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#>  9 11849… Organi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#> 10 16799… Organi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#> # … with 2 more variables: created_date <dttm>, last_updated_date <dttm>, and
#> #   abbreviated variable names ¹​enumeration_type, ²​other_names, ³​identifiers,
#> #   ⁴​taxonomies, ⁵​addresses, ⁶​practice_locations, ⁷​endpoints

Visit the function’s help page via ?npi_search after installing and loading the package for more details.
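
As a further illustration of combining arguments, here is a small sketch of a wrapper around npi_search(). The helper name search_individuals is hypothetical and not part of npi, and the value "ind" for enumeration_type is assumed to be the individual-provider counterpart of the "org" value used above; check ?npi_search to confirm the accepted values.

```r
# Hypothetical helper (not part of npi): look up individual providers in a
# given state by last name. Calling it sends a request to the NPI Registry
# API, so it needs network access and the npi package installed.
search_individuals <- function(last_name, state, limit = 10) {
  npi::npi_search(
    last_name = last_name,
    state = state,
    enumeration_type = "ind",  # assumed value for individual providers
    limit = limit
  )
}

# Example call (commented out because it contacts the live API):
# smiths_ny <- search_individuals("Smith", state = "NY", limit = 5)
```

Wrapping a recurring combination of arguments this way keeps the fixed filters in one place while leaving the varying ones as parameters.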

Increasing number of records returned

The limit argument of npi_search() sets the maximum number of records to return, from 1 to 1200 inclusive. It defaults to 10 records if no value is specified.

nyc_25 <- npi_search(city = "New York City", limit = 25)
#> 25 records requested
#> Requesting records 0-25...
nyc_25
#> # A tibble: 25 × 11
#>    npi    enume…¹ basic    other_…² identi…³ taxono…⁴ addres…⁵ practi…⁶ endpoi…⁷
#>  * <chr>  <chr>   <list>   <list>   <list>   <list>   <list>   <list>   <list>  
#>  1 13262… Indivi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#>  2 13564… Indivi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#>  3 14972… Indivi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#>  4 19728… Organi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#>  5 14079… Indivi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#>  6 13665… Indivi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#>  7 18516… Indivi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#>  8 16594… Indivi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#>  9 16695… Indivi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#> 10 10938… Indivi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#> # … with 15 more rows, 2 more variables: created_date <dttm>,
#> #   last_updated_date <dttm>, and abbreviated variable names ¹​enumeration_type,
#> #   ²​other_names, ³​identifiers, ⁴​taxonomies, ⁵​addresses, ⁶​practice_locations,
#> #   ⁷​endpoints

When using npi_search(), requests for more than 200 records (for example, 300 records) may result in multiple API calls. This is because the API itself returns at most 200 records per request, but allows previously requested records to be skipped. npi_search() automatically makes additional API calls, up to the API's limit of 1200 records for a unique set of query parameter values, and still returns a single tibble. To save time, the function only makes additional requests when needed. For example, if you request 1200 records and only 199 are returned by the first request, the function does not make a second request because there are no more records to return.

nyc_300 <- npi_search(city = "New York City", limit = 300)
#> 300 records requested
#> Requesting records 0-200...
#> Requesting records 200-300...
nyc_300
#> # A tibble: 300 × 11
#>    npi    enume…¹ basic    other_…² identi…³ taxono…⁴ addres…⁵ practi…⁶ endpoi…⁷
#>  * <chr>  <chr>   <list>   <list>   <list>   <list>   <list>   <list>   <list>  
#>  1 13262… Indivi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#>  2 13564… Indivi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#>  3 14972… Indivi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#>  4 19728… Organi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#>  5 14079… Indivi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#>  6 13665… Indivi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#>  7 18516… Indivi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#>  8 16594… Indivi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#>  9 16695… Indivi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#> 10 10938… Indivi… <tibble> <tibble> <tibble> <tibble> <tibble> <tibble> <tibble>
#> # … with 290 more rows, 2 more variables: created_date <dttm>,
#> #   last_updated_date <dttm>, and abbreviated variable names ¹​enumeration_type,
#> #   ²​other_names, ³​identifiers, ⁴​taxonomies, ⁵​addresses, ⁶​practice_locations,
#> #   ⁷​endpoints
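
The batching visible in the messages above (0-200, then 200-300) can be sketched with a little arithmetic. The helper plan_batches below is hypothetical and is not the package's internal code; it only shows how a requested record count could be split into requests of at most 200 records, each with a skip offset.

```r
# Hypothetical helper (not the package's internal code): split a requested
# record count into batches of at most `page_size`, each with a `skip` offset.
plan_batches <- function(limit, page_size = 200L, max_records = 1200L) {
  stopifnot(limit >= 1, limit <= max_records)
  # Offsets of the first record in each batch: 0, 200, 400, ...
  skip <- seq(0L, limit - 1L, by = page_size)
  data.frame(
    skip = skip,
    n = pmin(page_size, limit - skip)  # the last batch may be smaller
  )
}

plan_batches(300)
```

For a limit of 300 this yields two batches: 200 records starting at offset 0 and 100 records starting at offset 200. Unlike this sketch, npi_search() also stops early when a batch comes back with fewer records than requested.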

The NPPES API documentation does not specify additional API rate limitations. However, if you need more than 1200 NPI records for a set of search terms, you will need to download the NPPES Data Dissemination File.

Obtaining more human-readable output

npi_summarize() provides a more human-readable overview of output already obtained through npi_search().

npi_summarize(nyc)
#> # A tibble: 10 × 6
#>    npi        name               enumeration_type primary_practi…¹ phone prima…²
#>    <chr>      <chr>              <chr>            <chr>            <chr> <chr>  
#>  1 1326214693 BENJAMIN BOWLING   Individual       NA               212-… Psychi…
#>  2 1356498703 MICHAEL SCHMIDT    Individual       4401 BRONX BOUL… 718-… Intern…
#>  3 1497228076 VIVIAN AYALA       Individual       NA               212-… Social…
#>  4 1972840189 BEVERLY SUAREZ LLC Organization     220-18 HORACE H… 718-… Psychi…
#>  5 1407906092 MELINDA SCHROEDER  Individual       NA               212-… Social…
#>  6 1366591505 ANNE GRIFFIN       Individual       205 EAST 78TH S… 212-… Social…
#>  7 1851622625 TOD GRAPES         Individual       169 MANHATTAN A… 212-… Social…
#>  8 1659422525 ELLEN FEINSTEIN    Individual       441 W END AVE S… 212-… Intern…
#>  9 1669524237 LEE SHECHTMAN      Individual       247 3RD AVE SUI… 212-… Kinesi…
#> 10 1093868499 BENJAMIN SADOCK    Individual       NA               212-… Day Tr…
#> # … with abbreviated variable names ¹​primary_practice_address,
#> #   ²​primary_taxonomy

Additionally, users can flatten all the list columns using npi_flatten().

npi_flatten(nyc)
#> # A tibble: 30 × 48
#>    npi   basic…¹ basic…² basic…³ basic…⁴ basic…⁵ basic…⁶ basic…⁷ basic…⁸ basic…⁹
#>    <chr> <chr>   <chr>   <chr>   <chr>   <chr>   <chr>   <chr>   <chr>   <chr>  
#>  1 1093… BENJAM… SADOCK  JAMES   MD      NO      M       2007-0… 2007-0… A      
#>  2 1093… BENJAM… SADOCK  JAMES   MD      NO      M       2007-0… 2007-0… A      
#>  3 1326… BENJAM… BOWLING DOUGLAS M.D.    NO      M       2008-0… 2014-0… A      
#>  4 1326… BENJAM… BOWLING DOUGLAS M.D.    NO      M       2008-0… 2014-0… A      
#>  5 1356… MICHAEL SCHMIDT THOMAS  MSW LC… NO      M       2007-0… 2007-0… A      
#>  6 1356… MICHAEL SCHMIDT THOMAS  MSW LC… NO      M       2007-0… 2007-0… A      
#>  7 1366… ANNE    GRIFFIN MCLEAN  MD      NO      F       2007-0… 2007-0… A      
#>  8 1366… ANNE    GRIFFIN MCLEAN  MD      NO      F       2007-0… 2007-0… A      
#>  9 1407… MELINDA SCHROE… LUCY    LCSW    NO      F       2007-0… 2007-0… A      
#> 10 1407… MELINDA SCHROE… LUCY    LCSW    NO      F       2007-0… 2007-0… A      
#> # … with 20 more rows, 38 more variables: basic_name_prefix <chr>,
#> #   basic_name_suffix <chr>, basic_organization_name <chr>,
#> #   basic_organizational_subpart <chr>,
#> #   basic_authorized_official_first_name <chr>,
#> #   basic_authorized_official_last_name <chr>,
#> #   basic_authorized_official_middle_name <chr>,
#> #   basic_authorized_official_telephone_number <chr>, …
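
Note that the flattened result above has 30 rows for 10 providers. One plausible explanation, assuming that npi_flatten() combines the unnested list columns by joining them on npi (an assumption about its internals, not something documented here), is that a provider with several rows in more than one list column appears once per combination. The toy sketch below uses made-up data and base R's merge() to show the effect.

```r
# Made-up nested data for a single provider: two addresses, two taxonomies
addresses <- data.frame(
  npi = c("0000000001", "0000000001"),
  address_purpose = c("LOCATION", "MAILING")
)
taxonomies <- data.frame(
  npi = c("0000000001", "0000000001"),
  taxonomy_code = c("A1", "B2")
)

# Joining on npi pairs every address with every taxonomy for that provider
flat <- merge(addresses, taxonomies, by = "npi")
nrow(flat)
```

Here one provider produces four rows (2 addresses times 2 taxonomies), so counting providers in flattened output is safer with the number of distinct npi values than with the number of rows.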

Alternatively, use the cols argument to flatten only selected list columns. Only the specified columns are flattened and returned, along with the npi column by default.

npi_flatten(nyc, cols = c("basic", "taxonomies"))
#> # A tibble: 10 × 26
#>    npi   basic…¹ basic…² basic…³ basic…⁴ basic…⁵ basic…⁶ basic…⁷ basic…⁸ basic…⁹
#>    <chr> <chr>   <chr>   <chr>   <chr>   <chr>   <chr>   <chr>   <chr>   <chr>  
#>  1 1093… BENJAM… SADOCK  JAMES   MD      NO      M       2007-0… 2007-0… A      
#>  2 1326… BENJAM… BOWLING DOUGLAS M.D.    NO      M       2008-0… 2014-0… A      
#>  3 1356… MICHAEL SCHMIDT THOMAS  MSW LC… NO      M       2007-0… 2007-0… A      
#>  4 1366… ANNE    GRIFFIN MCLEAN  MD      NO      F       2007-0… 2007-0… A      
#>  5 1407… MELINDA SCHROE… LUCY    LCSW    NO      F       2007-0… 2007-0… A      
#>  6 1497… VIVIAN  AYALA   ROSE    NA      NO      F       2019-0… 2019-0… A      
#>  7 1659… ELLEN   FEINST… MARCH   CSW     YES     F       2007-0… 2007-0… A      
#>  8 1669… LEE     SHECHT… NA      M.D.    YES     M       2007-0… 2012-0… A      
#>  9 1851… TOD     GRAPES  T       B.S. e… YES     M       2010-0… 2010-0… A      
#> 10 1972… NA      NA      NA      NA      NA      NA      2013-0… 2013-0… A      
#> # … with 16 more variables: basic_name_prefix <chr>, basic_name_suffix <chr>,
#> #   basic_organization_name <chr>, basic_organizational_subpart <chr>,
#> #   basic_authorized_official_first_name <chr>,
#> #   basic_authorized_official_last_name <chr>,
#> #   basic_authorized_official_middle_name <chr>,
#> #   basic_authorized_official_telephone_number <chr>,
#> #   basic_authorized_official_title_or_position <chr>, …