`bold` is an R package to connect to BOLD Systems via their API. Functions in `bold` let you search for sequence data, specimen data, sequence + specimen data, and download raw trace files.
## Installation
You can install the stable version from CRAN
```r
install.packages("bold")
```
Or the development version from GitHub
```r
install.packages("devtools")
devtools::install_github("ropensci/bold")
```
Then load the package into the R session
```r
library("bold")
```
## Usage
### Search for taxonomic names via names
`bold_tax_name` looks up taxonomic records by taxon name. A name can match more than one record, as `Diplura` does below.
```r
bold_tax_name(name = 'Diplura')
#> input taxid taxon tax_rank tax_division parentid parentname
#> 1 Diplura 734358 Diplura class Animals 20 Arthropoda
#> 2 Diplura 603673 Diplura genus Protists 53974 Scytosiphonaceae
#> taxonrep
#> 1 Diplura
#> 2
```
```r
bold_tax_name(name = c('Diplura', 'Osmia'))
#> input taxid taxon tax_rank tax_division parentid parentname
#> 1 Diplura 734358 Diplura class Animals 20 Arthropoda
#> 2 Diplura 603673 Diplura genus Protists 53974 Scytosiphonaceae
#> 3 Osmia 4940 Osmia genus Animals 4962 Megachilinae
#> taxonrep
#> 1 Diplura
#> 2
#> 3 Osmia
```
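Because one name can match records in several taxonomic divisions, it can help to filter the result before using the `taxid` values. The following is a minimal sketch, not part of the package: it builds a small data frame by hand that mirrors some of the columns shown above (so it runs without a network connection) and keeps only the rows in the `Animals` division. The object names `tax` and `animals` are illustrative assumptions.

```r
# Hand-built stand-in for part of the output of
# bold_tax_name(name = c('Diplura', 'Osmia')); values copied from the output above.
tax <- data.frame(
  input        = c("Diplura", "Diplura", "Osmia"),
  taxid        = c(734358, 603673, 4940),
  taxon        = c("Diplura", "Diplura", "Osmia"),
  tax_rank     = c("class", "genus", "genus"),
  tax_division = c("Animals", "Protists", "Animals"),
  stringsAsFactors = FALSE
)

# Keep only the animal records, dropping the homonymous protist genus.
animals <- tax[tax$tax_division == "Animals", ]

# One taxid per input name remains after filtering.
animals[, c("input", "taxid", "tax_rank")]
```

With a real result, the same subsetting should apply as long as the `tax_division` column is present.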
### Search for taxonomic names via BOLD identifiers
`bold_tax_id` looks up taxonomic names by BOLD taxonomic identifier.
```r
bold_tax_id(id = 88899)
#> input taxid taxon tax_rank tax_division parentid parentname
#> 1 88899 88899 Momotus genus Animals 88898 Momotidae
```
```r
bold_tax_id(id = c(88899, 125295))
#> input taxid taxon tax_rank tax_division parentid parentname
#> 1 88899 88899 Momotus genus Animals 88898 Momotidae
#> 2 125295 125295 Helianthus genus Plants 100962 Asteraceae
```
### Search for sequence data only
The BOLD sequence API gives back sequence data, with a bit of metadata.
The default is to get a list back
```r
bold_seq(taxon = 'Coelioxys')[1:2]
#> [[1]]
#> [[1]]$id
#> [1] "BCHYM1514-13"
#>
#> [[1]]$name
#> [1] "Coelioxys conica"
#>
#> [[1]]$gene
#> [1] "BCHYM1514-13"
#>
#> [[1]]$sequence
#> [1] "GATAATATATATAATTTTTGCAATATGATCAGGAATAATAGGATCCTCTTTAAGAATAATTATTCGTATAGAATTAAGAATTCCAGGATCTTGAATTAATAATGATCAAATTTATAACTCCTTTATTACAGCACATGCATTTTTAATAATTTTTTTTTTAGTTATACCTTTTCTTATTGGAGGATTTGGAAATTGATTAGTACCTTTAATATTAGGATCACCAGATATAGCTTTCCCACGAATAAATAATATTAGATTTTGATTATTACCTCCTTCTTTATTAATATTATTATTAAGTAATTTAATAAATCCCAGACCAGGAACAGGCTGAACAGTTTATCCTCCTTTATCTTTATACACATACCACCCTTCTCCCTCAGTTGATTTAGCAATTTTTTCACTACATCTATCAGGAATCTCTTCTATTATTGGATCTATAAATTTTATTGTTACAATTTTAATAATAAAAAACTTTTCAATAAATTATAATCAAATACCATTATTCCCATGATCTATTTTAATTACTACTATTTTATTATTATTATCACTACCTGTATTAGCTGGTGCTATTACTATATTATTATTTGATCGAAATTTAAATTCTTCTTTTTTTGACCCTATAGGAGGAGGAGACCCAATTTTATACCAACATTTATTT"
#>
#>
#> [[2]]
#> [[2]]$id
#> [1] "FBAPB481-09"
#>
#> [[2]]$name
#> [1] "Coelioxys afra"
#>
#> [[2]]$gene
#> [1] "FBAPB481-09"
#>
#> [[2]]$sequence
#> [1] "----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------TTTCCACGAATAAATAATGTAAGATTTTGACTATTACCTCCCTCAATTTTCTTATTATTATCAAGAACCCTAATTAACCCAAGTGCTGGTACTGGATGAACTGTATATCCTCCTTTATCCTTATATACATTTCATGCCTCACCTTCCGTTGATTTAGCAATTTTTTCACTTCATTTATCAGGAATTTCATCAATTATTGGATCAATAAATTTTATTGTTACAATCTTAATAATAAAAAATTTTTCTTTAAATTATAGACAAATACCATTATTTTCATGATCAGTTTTAATTACTACAATTTTACTTTTATTATCATTACCAATTTTAGCTGGAGCAATTACTATACTCCTATTTGATCGAAATTTAAATACCTCATTCTTTGACCCAATAGGAGGAGGAGATCCAATTTTATATCAACATTTATTT"
```
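A list of records is convenient to inspect, but a `data.frame` is often easier to filter and summarise. The sketch below is illustrative and not part of the package: it uses a hand-built list with the same four fields shown above (`id`, `name`, `gene`, `sequence`), with sequences shortened for readability, and flattens it into one row per record. The names `seqs`, `seq_df` and `n_bases` are assumptions made for this example.

```r
# Hand-built stand-in for the list returned by bold_seq();
# the sequences are shortened placeholders, not real BOLD data.
seqs <- list(
  list(id = "BCHYM1514-13", name = "Coelioxys conica",
       gene = "BCHYM1514-13", sequence = "GATAATATAT"),
  list(id = "FBAPB481-09", name = "Coelioxys afra",
       gene = "FBAPB481-09", sequence = "----TTTCCACG")
)

# Turn each record into a one-row data.frame and stack the rows.
seq_df <- do.call(
  rbind,
  lapply(seqs, function(x) data.frame(x, stringsAsFactors = FALSE))
)

# Count bases per record, ignoring the "-" gap characters.
seq_df$n_bases <- nchar(gsub("-", "", seq_df$sequence, fixed = TRUE))

seq_df[, c("id", "name", "n_bases")]
```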
You can optionally get back the `httr` response object
```r
res <- bold_seq(taxon = 'Coelioxys', response = TRUE)
res$headers
#> $date
#> [1] "Mon, 02 May 2016 15:52:53 GMT"
#>
#> $server
#> [1] "Apache/2.2.15 (Red Hat)"
#>
#> $`x-powered-by`
#> [1] "PHP/5.3.15"
#>
#> $`content-disposition`
#> [1] "attachment; filename=fasta.fas"
#>
#> $connection
#> [1] "close"
#>
#> $`transfer-encoding`
#> [1] "chunked"
#>
#> $`content-type`
#> [1] "application/x-download"
#>
#> attr(,"class")
#> [1] "insensitive" "list"
```
You can do geographic searches
```r
bold_seq(geo = "USA")
#> [[1]]
#> [[1]]$id
#> [1] "GBAN1777-08"
#>
#> [[1]]$name
#> [1] "Macrobdella decora"
#>
#> [[1]]$gene
#> [1] "GBAN1777-08"
#>
#> [[1]]$sequence
#> [1] "---------------------------------ATTGGAATCTTGTATTTCTTATTAGGTACATGATCTGCTATAGTAGGGACCTCTATA---AGAATAATTATTCGAATTGAATTAGCTCAACCTGGGTCGTTTTTAGGAAAT---GATCAAATTTACAATACTATTGTTACTGCTCATGGATTAATTATAATTTTTTTTATAGTAATACCTATTTTAATTGGAGGGTTTGGTAATTGATTAATTCCGCTAATA---ATTGGTTCTCCTGATATAGCTTTTCCACGTCTTAATAATTTAAGATTTTGATTACTTCCGCCATCTTTAACTATACTTTTTTGTTCATCTATAGTCGAAAATGGAGTAGGTACTGGATGGACTATTTACCCTCCTTTAGCAGATAACATTGCTCATTCTGGACCTTCTGTAGATATA---GCAATTTTTTCACTTCATTTAGCTGGTGCTTCTTCTATTTTAGGTTCATTAAATTTTATTACTACTGTAGTTAATATACGATGACCAGGGATATCTATAGAGCGAATTCCTTTATTTATTTGATCCGTAATTATTACTACTGTATTGCTATTATTATCTTTACCAGTATTAGCAGCT---GCTATTTCAATATTATTAACAGATCGTAACTTAAATACTAGATTTTTTGACCCAATAGGAGGAGGGGATCCTATTTTATTCCAACATTTATTTTGATTTTTTGGCCACCCTGAAGTTTATATTTTAATTTTACCAGGATTTGGAGCTATTTCTCATGTAGTAAGTCATAACTCT---AAAAAATTAGAACCGTTTGGATCATTAGGGATATTATATGCAATAATTGGAATTGCAATTTTAGGTTTTATTGTTTGAGCACATCATATATTTACAGTAGGTCTTGATGTAGATACACGAGCTTATTTTACAGCAGCTACAATAGTTATTGCTGTTCCTACAGGAATTAAAGTATTTAGGTGATTG---GCAACT"
#>
#>
#> [[2]]
#> [[2]]$id
#> [1] "GBAN1780-08"
#>
#> [[2]]$name
#> [1] "Haemopis terrestris"
#>
#> [[2]]$gene
#> [1] "GBAN1780-08"
#>
#> [[2]]$sequence
#> [1] "---------------------------------ATTGGAACWTTWTATTTTATTTTNGGNGCTTGATCTGCTATATTNGGGATCTCAATA---AGGAATATTATTCGAATTGAGCCATCTCAACCTGGGAGATTATTAGGAAAT---GATCAATTATATAATTCATTAGTAACAGCTCATGGATTAATTATAATTTTCTTTATGGTTATGCCTATTTTGATTGGTGGGTTTGGTAATTGATTACTACCTTTAATA---ATTGGAGCCCCTGATATAGCTTTTCCTCGATTAAATAATTTAAGTTTTTGATTATTACCACCTTCATTAATTATATTGTTAAGATCCTCTATTATTGAAAGAGGGGTAGGTACAGGTTGAACCTTATATCCTCCTTTAGCAGATAGATTATTTCATTCAGGTCCATCGGTAGATATA---GCTATTTTTTCATTACATATAGCTGGAGCATCATCTATTTTAGGCTCATTAAACTTTATTTCTACAATTATTAATATACGAATTAAAGGTATAAGATCTGATCGAGTACCTTTATTTGTATGATCAGTTGTTATTACAACAGTTCTGTTATTATTGTCTTTACCTGTTTTAGCTGCA---GCTATTACTATATTATTAACAGATCGTAATTTAAATACTACTTTTTTTGATCCTATAGGAGGTGGAGATCCAGTATTGTTTCAACACTTATTTTGATTTTTTGGTCATCCAGAAGTATATATTTTGATTTTACCAGGATTTGGAGCAATTTCTCATATTATTACAAATAATTCT---AAAAAATTGGAACCTTTTGGATCTCTTGGTATAATTTATGCTATAATTGGAATTGCAGTTTTAGGGTTTATTGTATGAGCCCATCATATATTTACTGTAGGATTAGATGTTGATACTCGAGCTTATTTTACAGCAGCTACTATAGTTATTGCTGTTCCTACTGGTATTAAAGTTTTTAGGTGATTA---GCAACA"
#>
#>
#> [[3]]
#> [[3]]$id
#> [1] "GBNM0293-06"
#>
#> [[3]]$name
#> [1] "Steinernema carpocapsae"
#>
#> [[3]]$gene
#> [1] "GBNM0293-06"
#>
#> [[3]]$sequence
#> [1] "---------------------------------------------------------------------------------ACAAGATTATCTCTTATTATTCGTTTAGAGTTGGCTCAACCTGGTCTTCTTTTGGGTAAT---GGTCAATTATATAATTCTATTATTACTGCTCATGCTATTCTTATAATTTTTTTCATAGTTATACCTAGAATAATTGGTGGTTTTGGTAATTGAATATTACCTTTAATATTGGGGGCTCCTGATATAAGTTTTCCACGTTTGAATAATTTAAGTTTTTGATTGCTACCAACTGCTATATTTTTGATTTTAGATTCTTGTTTTGTTGACACTGGTTGTGGTACTAGTTGAACTGTTTATCCTCCTTTGAGG---ACTTTAGGTCACCCTGGYAGAAGTGTAGATTTAGCTATTTTTAGTCTTCATTGTGCAGGAATTAGCTCAATTTTAGGGGCTATTAATTTTATATGTACTACAAAAAATCTTCGTAGTAGTTCTATTTCTTTGGAACATATAAGACTTTTTGTTTGGGCTGTTTTTGTTACTGTTTTTTTATTAGTTTTATCTTTACCTGTTTTAGCTGGTGCTATTACTATGCTTTTAACAGACCGTAATTTAAATACTTCTTTTTTT------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------"
#>
#>
#> [[4]]
#> [[4]]$id
#> [1] "NEONV108-11"
#>
#> [[4]]$name
#> [1] "Aedes thelcter"
#>
#> [[4]]$gene
#> [1] "NEONV108-11"
#>
#> [[4]]$sequence
#> [1] "AACTTTATACTTCATCTTCGGAGTTTGATCAGGAATAGTTGGTACATCATTAAGAATTTTAATTCGTGCTGAATTAAGTCAACCAGGTATATTTATTGGAAATGACCAAATTTATAATGTAATTGTTACAGCTCATGCTTTTATTATAATTTTCTTTATAGTTATACCTATTATAATTGGAGGATTTGGAAATTGACTAGTTCCTCTAATATTAGGAGCCCCAGATATAGCTTTCCCTCGAATAAATAATATAAGTTTTTGAATACTACCTCCCTCATTAACTCTTCTACTTTCAAGTAGTATAGTAGAAAATGGATCAGGAACAGGATGAACAGTTTATCCACCTCTTTCATCTGGAACTGCTCATGCAGGAGCCTCTGTTGATTTAACTATTTTTTCTCTTCATTTAGCCGGAGTTTCATCAATTTTAGGGGCTGTAAATTTTATTACTACTGTAATTAATATACGATCTGCAGGAATTACTCTTGATCGACTACCTTTATTCGTTTGATCTGTAGTAATTACAGCTGTTTTATTACTTCTTTCACTTCCTGTATTAGCTGGAGCTATTACAATACTATTAACTGATCGAAATTTAAATACATCTTTCTTTGATCCAATTGGAGGAGGAGACCCAATTTTATACCAACATTTATTT"
#>
#>
#> [[5]]
#> [[5]]$id
#> [1] "NEONV109-11"
#>
#> [[5]]$name
#> [1] "Aedes thelcter"
#>
#> [[5]]$gene
#> [1] "NEONV109-11"
#>
#> [[5]]$sequence
#> [1] "AACTTTATACTTCATCTTCGGAGTTTGATCAGGAATAGTTGGTACATCATTAAGAATTTTAATTCGTGCTGAATTAAGTCAACCAGGTATATTTATTGGAAATGACCAAATTTATAATGTAATTGTTACAGCTCATGCTTTTATTATAATTTTCTTTATAGTTATACCTATTATAATTGGAGGATTTGGAAATTGACTAGTTCCTCTAATATTAGGAGCCCCAGATATAGCTTTCCCTCGAATAAATAATATAAGTTTTTGAATACTACCTCCCTCATTAACTCTTCTACTTTCAAGTAGTATAGTAGAAAATGGGTCAGGAACAGGATGAACAGTTTATCCACCTCTTTCATCTGGAACTGCTCATGCAGGAGCCTCTGTTGATTTAACTATTTTTTCTCTTCATTTAGCCGGAGTTTCATCAATTTTAGGGGCTGTAAATTTTATTACTACTGTAATTAATATACGATCTGCAGGAATTACTCTTGATCGACTACCTTTATTCGTTTGATCTGTAGTAATTACAGCTGTTTTATTACTTCTTTCACTTCCTGTATTAGCTGGAGCTATTACAATACTATTAACTGATCGAAATTTAAATACATCTTTCTTTGACCCAATTGGAGGGGGAGACCCAATTTTATACCAACATTTATTT"
```
And you can search by researcher name
```r
bold_seq(researchers = 'Thibaud Decaens')[[1]]
#> $id
#> [1] "AMAZ108-09"
#>
#> $name
#> [1] "Arsenura armida"
#>
#> $gene
#> [1] "AMAZ108-09"
#>
#> $sequence
#> [1] "TACTTTATATTTTATTTTTGGAATTTGAGCAGGTATAATTGGAACCTCCTTAAGTTTATTAATTCGAGCTGAATTAGGAATACCTGGATTTTTAATTGGTAATGATCAAATTTATAATACTATTGTAACAGCTCATGCTTTTATTATAATTTTTTTTATAGTTATACCTATTATAATTGGAGGATTTGGTAACTGATTAATTCCTTTAATATTAGGTGCCCCTGATATAGCTTTCCCCCGAATAAATAACATAAGCTTTTGATTACTCCCCCCCTCATTAATACTTTTAATTTCGAGAAGAATTGTAGAAAATGGAGCAGGAACAGGATGAACAGTTTATCCCCCACTTTCATCTAATATTGCTCATAGAGGCTCTTCAATTGATTTAGCTATTTTTTCCCTTCATTTAGCTGGAATTTCTTCAATTTTAGGTGCTATTAATTTCATTACAACAATTATTAATATACGATTAAATAATATAGCTTTTGATCAAATACCTTTATTTGTTTGATCTGTAGGTATTACTGCTTTCCTTCTTCTTCTTTCTCTTCCAGTATTAGCTGGTGCTATTACTATATTATTAACTGATCGAAATTTAAATACATCTTTTTTTGACCCTGCAGGAGGAGGAGATCCAATTCTTTATCAACATTTATTT"
```
by record IDs (the values below are BOLD process IDs)
```r
bold_seq(ids = c('ACRJP618-11', 'ACRJP619-11'))
#> [[1]]
#> [[1]]$id
#> [1] "ACRJP618-11"
#>
#> [[1]]$name
#> [1] "Lepidoptera"
#>
#> [[1]]$gene
#> [1] "ACRJP618-11"
#>
#> [[1]]$sequence
#> [1] "------------------------TTGAGCAGGCATAGTAGGAACTTCTCTTAGTCTTATTATTCGAACAGAATTAGGAAATCCAGGATTTTTAATTGGAGATGATCAAATCTACAATACTATTGTTACGGCTCATGCTTTTATTATAATTTTTTTTATAGTTATACCTATTATAATTGGAGGATTTGGTAATTGATTAGTTCCCCTTATACTAGGAGCCCCAGATATAGCTTTCCCTCGAATAAACAATATAAGTTTTTGGCTTCTTCCCCCTTCACTATTACTTTTAATTTCCAGAAGAATTGTTGAAAATGGAGCTGGAACTGGATGAACAGTTTATCCCCCACTGTCATCTAATATTGCCCATAGAGGTACATCAGTAGATTTAGCTATTTTTTCTTTACATTTAGCAGGTATTTCCTCTATTTTAGGAGCGATTAATTTTATTACTACAATTATTAATATACGAATTAACAGTATAAATTATGATCAAATACCACTATTTGTGTGATCAGTAGGAATTACTGCTTTACTCTTATTACTTTCTCTTCCAGTATTAGCAGGTGCTATCACTATATTATTAACGGATCGAAATTTAAATACATCATTTTTTGATCCTGCAGGAGGAGGAGATCCAATTTTATATCAACATTTATTT"
#>
#>
#> [[2]]
#> [[2]]$id
#> [1] "ACRJP619-11"
#>
#> [[2]]$name
#> [1] "Lepidoptera"
#>
#> [[2]]$gene
#> [1] "ACRJP619-11"
#>
#> [[2]]$sequence
#> [1] "AACTTTATATTTTATTTTTGGTATTTGAGCAGGCATAGTAGGAACTTCTCTTAGTCTTATTATTCGAACAGAATTAGGAAATCCAGGATTTTTAATTGGAGATGATCAAATCTACAATACTATTGTTACGGCTCATGCTTTTATTATAATTTTTTTTATAGTTATACCTATTATAATTGGAGGATTTGGTAATTGATTAGTTCCCCTTATACTAGGAGCCCCAGATATAGCTTTCCCTCGAATAAACAATATAAGTTTTTGGCTTCTTCCCCCTTCACTATTACTTTTAATTTCCAGAAGAATTGTTGAAAATGGAGCTGGAACTGGATGAACAGTTTATCCCCCACTGTCATCTAATATTGCCCATAGAGGTACATCAGTAGATTTAGCTATTTTTTCTTTACATTTAGCAGGTATTTCCTCTATTTTAGGAGCGATTAATTTTATTACTACAATTATTAATATACGAATTAACAGTATAAATTATGATCAAATACCACTATTTGTGTGATCAGTAGGAATTACTGCTTTACTCTTATTACTTTCTCTTCCAGTATTAGCAGGTGCTATCACTATATTATTAACGGATCGAAATTTAAATACATCATTTTTTGATCCTGCAGGAGGAGGAGATCCAATTTTATATCAACATTTATTT"
```
by container (containers include project codes and dataset codes)
```r
bold_seq(container = 'ACRJP')[[1]]
#> $id
#> [1] "ACRJP008-09"
#>
#> $name
#> [1] "Lepidoptera"
#>
#> $gene
#> [1] "ACRJP008-09"
#>
#> $sequence
#> [1] "AACTTTATATTTTATTTTTGGTATTTGATCTGGAATAATTGGAACATCTTTAAGTTTACTAATTCGAACAGAATTAGGTAACCCAGGGTCCTTAATTGGAGATGATCAAATTTATAATACTATTGTTACAGCCCATGCTTTTATTATAATTTTTTTTATAGTTATACCAATTATAATTGGTGGATTTGGAAATTGACTTGTACCTTTAATATTAGGAGCTCCTGATATAGCTTTCCCCCGAATAAATAATATAAGTTTTTGACTTTTACCCCCCTCATTAATTTTATTAATTTCTAGAAGAATTGTTGAAAATGGAGCAGGTACAGGATGAACAGTTTACCCCCCACTTTCATCAAATATTGCCCATGGTGGATCATCTGTTGATTTAGCCATTTTTTCTCTTCATTTAGCCGGAATTTCATCTATTTTAGGAGCAATTAATTTTATTACAACTATTATTAATATACGAGTAAATAATTTATCTTTTGACCAAATACCTTTATTTGTTTGAGCAGTTGGTATCACAGCTCTTCTTTTACTTCTATCTTTACCAGTTTTAGCAGGAGCTATTACTATATTATTAACCGATCGTAATTTAAATACTTCATTTTTTGATCCTGCCGGGGGTGGAGACCCAATTTTATACCAACATTTATTT"
```
by bin (a bin is a _Barcode Index Number_)
```r
bold_seq(bin = 'BOLD:AAA5125')[[1]]
#> $id
#> [1] "BLPAC438-06"
#>
#> $name
#> [1] "Eacles ormondei"
#>
#> $gene
#> [1] "BLPAC438-06"
#>
#> $sequence
#> [1] "NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNAGCAGAATTAGGTACCCCCGGATCTTTAATTGGAGATGACCAAATTTATAATACCATTGTAACAGCTCATGCTTTTATTATAATTTTTTTTATAGTTATACCTATTATAATTGGAGGATTTGGAAATTGATTAGTACCCCTAATACTAGGAGCTCCTGATATAGCTTTCCCCCGAATAAATAATATAAGATTTTGACTATTACCCCCATCTTTAACTCTTTTAATTTCTAGAAGAATTGTCGAAAATGGAGCTGGAACTGGATGAACAGTTTATCCCCCCCTTTCATCTAATATTGCTCATGGAGGCTCTTCTGTTGATTTAGCTATTTTTTCCCTTCATCTAGCTGGAATCTCATCAATTTTAGGAGCTATTAATTTTATCACAACAATCATTAATATACGACTAAATAATATAATATTTGACCAAATACCTTTATTTGTATGAGCTGTTGGTATTACAGCATTTCTTTTATTGTTATCTTTACCTGTACTAGCTGGAGCTATTACTATACTTTTAACAGATCGAAACTTAAATACATCATTTTTTGACCCAGCAGGAGGAGGAGATCCTATTCTCTATCAACATTTATTT"
```
There are more ways to query; see the documentation at `?bold_seq`.
### Search for specimen data only
The BOLD specimen API doesn't give back sequences, only specimen data. By default you download `tsv` format data, which is given back to you as a `data.frame`
```r
res <- bold_specimens(taxon = 'Osmia')
head(res[,1:8])
#> processid sampleid recordID catalognum fieldnum
#> 1 ASGCB255-13 BIOUG07489-F04 3955532 BIOUG07489-F04
#> 2 BCHYM412-13 BC ZSM HYM 18272 3896353 BC ZSM HYM 18272 BC ZSM HYM 18272
#> 3 FBAPB679-09 BC ZSM HYM 02154 1289040 BC ZSM HYM 02154 BC ZSM HYM 02154
#> 4 FBAPB730-09 BC ZSM HYM 02205 1289091 BC ZSM HYM 02205 BC ZSM HYM 02205
#> 5 FBAPB748-09 BC ZSM HYM 02223 1289109 BC ZSM HYM 02223 BC ZSM HYM 02223
#> 6 FBAPB753-09 BC ZSM HYM 02228 1289114 BC ZSM HYM 02228 BC ZSM HYM 02228
#> institution_storing bin_uri phylum_taxID
#> 1 Biodiversity Institute of Ontario BOLD:ABZ2181 20
#> 2 SNSB, Zoologische Staatssammlung Muenchen BOLD:AAP2416 20
#> 3 SNSB, Zoologische Staatssammlung Muenchen BOLD:AAI1788 20
#> 4 SNSB, Zoologische Staatssammlung Muenchen BOLD:AAK5820 20
#> 5 SNSB, Zoologische Staatssammlung Muenchen BOLD:AAI1998 20
#> 6 SNSB, Zoologische Staatssammlung Muenchen BOLD:ACF3653 20
```
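Since the result is an ordinary `data.frame`, base R tools work on it directly. As a minimal sketch (using a hand-built data frame with three of the columns shown above rather than a live query), the following counts records per storing institution. The names `specimens` and `by_institution` are assumptions made for this example.

```r
# Hand-built stand-in for a few rows of bold_specimens(taxon = 'Osmia');
# values are copied from the output above.
specimens <- data.frame(
  processid = c("ASGCB255-13", "BCHYM412-13", "FBAPB679-09"),
  institution_storing = c(
    "Biodiversity Institute of Ontario",
    "SNSB, Zoologische Staatssammlung Muenchen",
    "SNSB, Zoologische Staatssammlung Muenchen"
  ),
  bin_uri = c("BOLD:ABZ2181", "BOLD:AAP2416", "BOLD:AAI1788"),
  stringsAsFactors = FALSE
)

# Tabulate the number of records held by each institution.
by_institution <- as.data.frame(
  table(institution = specimens$institution_storing),
  stringsAsFactors = FALSE
)

# Sort so the institution with the most records comes first.
by_institution <- by_institution[order(-by_institution$Freq), ]
by_institution
```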
You can optionally get back the data in `XML` format
```r
bold_specimens(taxon = 'Osmia', format = 'xml')
```
```
1470124
BOM1525-10
BOLD:AAN3337
DHB 1011
DHB 1011
DHB1011
Marjorie Barrick Museum
```
You can choose to get the `httr` response object back if you'd rather work with the raw data returned from the BOLD API.
```r
res <- bold_specimens(taxon = 'Osmia', format = 'xml', response = TRUE)
res$url
#> [1] "http://www.boldsystems.org/index.php/API_Public/specimen?taxon=Osmia&specimen_download=xml"
res$status_code
#> [1] 200
res$headers
#> $date
#> [1] "Mon, 02 May 2016 15:53:38 GMT"
#>
#> $server
#> [1] "Apache/2.2.15 (Red Hat)"
#>
#> $`x-powered-by`
#> [1] "PHP/5.3.15"
#>
#> $`content-disposition`
#> [1] "attachment; filename=bold_data.xml"
#>
#> $connection
#> [1] "close"
#>
#> $`transfer-encoding`
#> [1] "chunked"
#>
#> $`content-type`
#> [1] "application/x-download"
#>
#> attr(,"class")
#> [1] "insensitive" "list"
```
### Search for specimen plus sequence data
The combined specimen/sequence API gives back both specimen and sequence data. Like the specimen API, it returns `tsv` format data by default, which is given back to you as a `data.frame`. Here, we set `sepfasta=TRUE` so that the sequences are removed from the `data.frame` and returned separately as a list, which keeps the `data.frame` more manageable.
```r
res <- bold_seqspec(taxon = 'Osmia', sepfasta = TRUE)
res$fasta[1:2]
#> $`ASGCB255-13`
#> [1] "-------------------------------GGAATAATTGGTTCTGCTATAAGTATTATTATTCGAATAGAATTAAGAATTCCTGGATCATTCATTTCTAATGATCAAACTTATAATTCTTTAGTAACAGCTCATGCTTTTTTAATAATTTTTTTTCTTGTAATACCATTTTTAATTGGTGGATTTGGAAATTGATTAATTCCATTAATATTAGGAATCCCAGATATAGCATTTCCTCGAATAAATAATATTAGATTTTGACTTTTACCCCCATCCTTAATAATTTTACTTTTAAGAAATTTCTTAAATCCAAGTCCAGGAACAGGTTGAACTGTATATCCCCCCCTTTCTTCTTATTTATTTCATTCTTCCCCTTCTGTTGATTTAGCTATTTTTTCTCTTCATATTTCTGGTTTATCTTCCATCATAGGTTCTTTAAATTTTATTGTTACAATTATTATAATAAAAAATATTTCATTAAAACATATTCAATTACCTTTATTTCCTTGATCCGTTTTTATTACAACTATTTTACTATTATTTTCTTTACCTGTTCTAGCAGGAGCTATTACTATATTATTATTTGATCGAAACTTTAATACTTCATTTTTTGATCCAACTGGAGGAGGAGATCCAATTTTATATCAACATTTATTC"
#>
#> $`BCHYM412-13`
#> [1] "AGTTCTATATATAATCTTTGCTATATGATCAGGAATAATTGGTTCAGCAATAAGAATTATTATTCGTATAGAATTAAGAATTCCAGGATCATTTATTTCTAATGATCAAACTTATAATTCTTTAGTAACTGCTCATGCTTTTTTAATAATTTTTTTTCTTGTTATACCTTTTTTGATTGGAGGATTCGGAAATTGATTAATTCCAATAATATTAGGAATTCCAGATATAGCTTTTCCCCGAATAAATAATATTAGATTTTGACTTTTACCCCCATCTTTAATAATTTTACTTTTAAGAAATTTTTTCAATCCTAGTCCAGGAACTGGATGAACTGTTTATCCTCCTCTTTCTTCTTATTTATTTCATTCTTCCCCTTCTGTTGATTTAGCAATTTTTTCTTTACATATTTCTGGCTTATCCTCTATTATAGGTTCTTTAAATTTTATTGTAACAATTATTATAATAAAAAATATTTCATTAAAACATATTCAACTTCCCTTATTTCCCTGATCTGTTTTTATTACTACTATCTTATTATTATTTTCTTTACCAGTATTAGCCGGAGCAATTACAATATTATTATTTGATCGAAATTTTAATACTTCATTTTTTGATCCAACTGGGGGTGGGGACCCAATTCTCTATCAACATTTATTT"
```
Or you can select a specific sequence by its process ID
```r
res$fasta['GBAH0293-06']
#> $`GBAH0293-06`
#> [1] "------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------TTAATGTTAGGGATTCCAGATATAGCTTTTCCACGAATAAATAATATTAGATTTTGACTGTTACCTCCATCTTTAATATTATTACTTTTAAGAAATTTTTTAAATCCAAGTCCTGGAACAGGATGAACAGTTTATCCTCCTTTATCATCAAATTTATTTCATTCTTCTCCTTCAGTTGATTTAGCAATTTTTTCTTTACATATTTCAGGTTTATCTTCTATTATAGGTTCATTAAATTTTATTGTTACAATTATTATAATAAAAAATATTTCTTTAAAATATATTCAATTACCTTTATTTTCTTGATCTGTATTTATTACTACTATTCTTTTATTATTTTCTTTACCTGTATTAGCTGGAGCTATTACTATATTATTATTTGATCGAAATTTTAATACATCTTTTTTTGATCCAACAGGAGGGGGAGATCCAATTCTTTATCAACATTTATTTTGATTTTTTGGTCATCCTGAAGTTTATATTTTAATTTTACCTGGATTTGGATTAATTTCTCAAATTATTTCTAATGAAAGAGGAAAAAAAGAAACTTTTGGAAATATTGGTATAATTTATGCTATATTAAGAATTGGACTTTTAGGTTTTATTGTT---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------"
```
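The named list in `res$fasta` maps naturally onto a FASTA file, with each name as a header line. The sketch below is one possible way to write such a file with base R only; it is not a function provided by `bold`. It uses a hand-built list with shortened placeholder sequences so that it runs offline, and the names `fasta`, `fasta_lines` and `out_file` are assumptions made for this example.

```r
# Hand-built stand-in for res$fasta: a named list of sequences.
# The sequences are shortened placeholders, not real BOLD data.
fasta <- list(
  `ASGCB255-13` = "----GGAATAATTGG",
  `BCHYM412-13` = "AGTTCTATATATAAT"
)

# Interleave ">id" header lines with their sequences:
# rbind() stacks headers over sequences, and as.vector() reads column by column.
fasta_lines <- as.vector(rbind(
  paste0(">", names(fasta)),
  unlist(fasta, use.names = FALSE)
))

# Write to a temporary file; replace with a real path as needed.
out_file <- tempfile(fileext = ".fas")
writeLines(fasta_lines, out_file)

readLines(out_file)
```

Writing one sequence per line keeps the example short; tools that expect wrapped sequence lines may need an extra wrapping step.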
### Get trace files
`bold_trace` downloads trace files to your machine. It does not load them into your R session, but it prints the paths of the downloaded files for your information.
```r
bold_trace(taxon='Osmia', quiet=TRUE)
#> Downloading: 51 MB
#>
#>
#> .../bold/bold_trace_files/BBHYL361-10[LepF1,LepR1]_F.ab1
#> .../bold/bold_trace_files/BBHYL361-10[LepF1,LepR1]_R.ab1
#> .../bold/bold_trace_files/BBHYL363-10[LepF1,LepR1]_F.ab1
#> .../bold/bold_trace_files/BBHYL363-10[LepF1,LepR1]_R.ab1
#> .../bold/bold_trace_files/BBHYL365-10[LepF1,LepR1]_F.ab1
#> .../bold/bold_trace_files/BBHYL365-10[LepF1,LepR1]_R.ab1
#> .../bold/bold_trace_files/FBAPB666-09[LepF1,LepR1]_F.ab1
#> .../bold/bold_trace_files/FBAPB666-09[LepF1,LepR1]_R.ab1
#> .../bold/bold_trace_files/FBAPB667-09[LepF1,LepR1]_R.ab1
```
## Citing
To cite `bold` in publications use:
> Scott Chamberlain (2016). bold: Interface to Bold Systems API. R package version 0.3.5. https://github.com/ropensci/bold
## License and bugs
* License: [MIT](http://opensource.org/licenses/MIT)
* Report bugs at [our GitHub repo for bold](https://github.com/ropensci/bold/issues?state=open)