bold tutorial

for v0.2.0

bold is an R package to connect to BOLD Systems via their API. Functions in bold let you search for sequence data, specimen data, sequence + specimen data, and download raw trace files.


## Installation

You can install the stable version from CRAN

```r
install.packages("bold")
```

Or the development version from GitHub

```r
install.packages("devtools")
devtools::install_github("ropensci/bold")
```

Then load the package into the R session

```r
library("bold")
```
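If you run this tutorial from a script, you may prefer to install only when the package is missing. The following is a minimal sketch using base R's `requireNamespace()`. The helper name `ensure_installed()` and its `installer` argument are made up for illustration and are not part of `bold`.

```r
# Hypothetical helper (not part of bold): install a package only if it
# cannot be loaded. `installer` defaults to install.packages() and is a
# parameter only so the behaviour can be checked without network access.
ensure_installed <- function(pkg, installer = utils::install.packages) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    return(invisible(FALSE))  # already available, nothing installed
  }
  installer(pkg)
  invisible(TRUE)             # an install was attempted
}

# Typical use before library("bold"):
# ensure_installed("bold")
```

The function returns `FALSE` invisibly when nothing had to be installed and `TRUE` when the installer was called.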
## Usage

### Search for taxonomic names via names

`bold_tax_name` looks up taxa by taxonomic name.

```r
bold_tax_name(name = 'Diplura')
#>     input  taxid   taxon tax_rank tax_division parentid       parentname
#> 1 Diplura 734358 Diplura    class      Animals       20       Arthropoda
#> 2 Diplura 603673 Diplura    genus     Protists    53974 Scytosiphonaceae
#>   taxonrep
#> 1  Diplura
#> 2
```

```r
bold_tax_name(name = c('Diplura', 'Osmia'))
#>     input  taxid   taxon tax_rank tax_division parentid       parentname
#> 1 Diplura 734358 Diplura    class      Animals       20       Arthropoda
#> 2 Diplura 603673 Diplura    genus     Protists    53974 Scytosiphonaceae
#> 3   Osmia   4940   Osmia    genus      Animals     4962     Megachilinae
#>   taxonrep
#> 1  Diplura
#> 2
#> 3    Osmia
```

### Search for taxonomic names via BOLD identifiers

`bold_tax_id` looks up taxa by BOLD taxonomic identifier.

```r
bold_tax_id(id = 88899)
#>   input taxid   taxon tax_rank tax_division parentid parentname
#> 1 88899 88899 Momotus    genus      Animals    88898  Momotidae
```

```r
bold_tax_id(id = c(88899, 125295))
#>    input  taxid      taxon tax_rank tax_division parentid parentname
#> 1  88899  88899    Momotus    genus      Animals    88898  Momotidae
#> 2 125295 125295 Helianthus    genus       Plants   100962 Asteraceae
```

### Search for sequence data only

The BOLD sequence API gives back sequence data, with a bit of metadata.
The default is to get a list back ```r bold_seq(taxon = 'Coelioxys')[1:2] #> [[1]] #> [[1]]$id #> [1] "BCHYM1514-13" #> #> [[1]]$name #> [1] "Coelioxys conica" #> #> [[1]]$gene #> [1] "BCHYM1514-13" #> #> [[1]]$sequence #> [1] "GATAATATATATAATTTTTGCAATATGATCAGGAATAATAGGATCCTCTTTAAGAATAATTATTCGTATAGAATTAAGAATTCCAGGATCTTGAATTAATAATGATCAAATTTATAACTCCTTTATTACAGCACATGCATTTTTAATAATTTTTTTTTTAGTTATACCTTTTCTTATTGGAGGATTTGGAAATTGATTAGTACCTTTAATATTAGGATCACCAGATATAGCTTTCCCACGAATAAATAATATTAGATTTTGATTATTACCTCCTTCTTTATTAATATTATTATTAAGTAATTTAATAAATCCCAGACCAGGAACAGGCTGAACAGTTTATCCTCCTTTATCTTTATACACATACCACCCTTCTCCCTCAGTTGATTTAGCAATTTTTTCACTACATCTATCAGGAATCTCTTCTATTATTGGATCTATAAATTTTATTGTTACAATTTTAATAATAAAAAACTTTTCAATAAATTATAATCAAATACCATTATTCCCATGATCTATTTTAATTACTACTATTTTATTATTATTATCACTACCTGTATTAGCTGGTGCTATTACTATATTATTATTTGATCGAAATTTAAATTCTTCTTTTTTTGACCCTATAGGAGGAGGAGACCCAATTTTATACCAACATTTATTT" #> #> #> [[2]] #> [[2]]$id #> [1] "FBAPB481-09" #> #> [[2]]$name #> [1] "Coelioxys afra" #> #> [[2]]$gene #> [1] "FBAPB481-09" #> #> [[2]]$sequence #> [1] "----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------TTTCCACGAATAAATAATGTAAGATTTTGACTATTACCTCCCTCAATTTTCTTATTATTATCAAGAACCCTAATTAACCCAAGTGCTGGTACTGGATGAACTGTATATCCTCCTTTATCCTTATATACATTTCATGCCTCACCTTCCGTTGATTTAGCAATTTTTTCACTTCATTTATCAGGAATTTCATCAATTATTGGATCAATAAATTTTATTGTTACAATCTTAATAATAAAAAATTTTTCTTTAAATTATAGACAAATACCATTATTTTCATGATCAGTTTTAATTACTACAATTTTACTTTTATTATCATTACCAATTTTAGCTGGAGCAATTACTATACTCCTATTTGATCGAAATTTAAATACCTCATTCTTTGACCCAATAGGAGGAGGAGATCCAATTTTATATCAACATTTATTT" ``` You can optionally get back the `httr` response object ```r res <- bold_seq(taxon = 'Coelioxys', response = TRUE) res$headers #> $date #> [1] "Mon, 02 May 2016 15:52:53 GMT" #> #> $server #> [1] "Apache/2.2.15 (Red Hat)" #> #> $`x-powered-by` #> [1] "PHP/5.3.15" #> #> $`content-disposition` 
#> [1] "attachment; filename=fasta.fas" #> #> $connection #> [1] "close" #> #> $`transfer-encoding` #> [1] "chunked" #> #> $`content-type` #> [1] "application/x-download" #> #> attr(,"class") #> [1] "insensitive" "list" ``` You can do geographic searches ```r bold_seq(geo = "USA") #> [[1]] #> [[1]]$id #> [1] "GBAN1777-08" #> #> [[1]]$name #> [1] "Macrobdella decora" #> #> [[1]]$gene #> [1] "GBAN1777-08" #> #> [[1]]$sequence #> [1] "---------------------------------ATTGGAATCTTGTATTTCTTATTAGGTACATGATCTGCTATAGTAGGGACCTCTATA---AGAATAATTATTCGAATTGAATTAGCTCAACCTGGGTCGTTTTTAGGAAAT---GATCAAATTTACAATACTATTGTTACTGCTCATGGATTAATTATAATTTTTTTTATAGTAATACCTATTTTAATTGGAGGGTTTGGTAATTGATTAATTCCGCTAATA---ATTGGTTCTCCTGATATAGCTTTTCCACGTCTTAATAATTTAAGATTTTGATTACTTCCGCCATCTTTAACTATACTTTTTTGTTCATCTATAGTCGAAAATGGAGTAGGTACTGGATGGACTATTTACCCTCCTTTAGCAGATAACATTGCTCATTCTGGACCTTCTGTAGATATA---GCAATTTTTTCACTTCATTTAGCTGGTGCTTCTTCTATTTTAGGTTCATTAAATTTTATTACTACTGTAGTTAATATACGATGACCAGGGATATCTATAGAGCGAATTCCTTTATTTATTTGATCCGTAATTATTACTACTGTATTGCTATTATTATCTTTACCAGTATTAGCAGCT---GCTATTTCAATATTATTAACAGATCGTAACTTAAATACTAGATTTTTTGACCCAATAGGAGGAGGGGATCCTATTTTATTCCAACATTTATTTTGATTTTTTGGCCACCCTGAAGTTTATATTTTAATTTTACCAGGATTTGGAGCTATTTCTCATGTAGTAAGTCATAACTCT---AAAAAATTAGAACCGTTTGGATCATTAGGGATATTATATGCAATAATTGGAATTGCAATTTTAGGTTTTATTGTTTGAGCACATCATATATTTACAGTAGGTCTTGATGTAGATACACGAGCTTATTTTACAGCAGCTACAATAGTTATTGCTGTTCCTACAGGAATTAAAGTATTTAGGTGATTG---GCAACT" #> #> #> [[2]] #> [[2]]$id #> [1] "GBAN1780-08" #> #> [[2]]$name #> [1] "Haemopis terrestris" #> #> [[2]]$gene #> [1] "GBAN1780-08" #> #> [[2]]$sequence #> [1] 
"---------------------------------ATTGGAACWTTWTATTTTATTTTNGGNGCTTGATCTGCTATATTNGGGATCTCAATA---AGGAATATTATTCGAATTGAGCCATCTCAACCTGGGAGATTATTAGGAAAT---GATCAATTATATAATTCATTAGTAACAGCTCATGGATTAATTATAATTTTCTTTATGGTTATGCCTATTTTGATTGGTGGGTTTGGTAATTGATTACTACCTTTAATA---ATTGGAGCCCCTGATATAGCTTTTCCTCGATTAAATAATTTAAGTTTTTGATTATTACCACCTTCATTAATTATATTGTTAAGATCCTCTATTATTGAAAGAGGGGTAGGTACAGGTTGAACCTTATATCCTCCTTTAGCAGATAGATTATTTCATTCAGGTCCATCGGTAGATATA---GCTATTTTTTCATTACATATAGCTGGAGCATCATCTATTTTAGGCTCATTAAACTTTATTTCTACAATTATTAATATACGAATTAAAGGTATAAGATCTGATCGAGTACCTTTATTTGTATGATCAGTTGTTATTACAACAGTTCTGTTATTATTGTCTTTACCTGTTTTAGCTGCA---GCTATTACTATATTATTAACAGATCGTAATTTAAATACTACTTTTTTTGATCCTATAGGAGGTGGAGATCCAGTATTGTTTCAACACTTATTTTGATTTTTTGGTCATCCAGAAGTATATATTTTGATTTTACCAGGATTTGGAGCAATTTCTCATATTATTACAAATAATTCT---AAAAAATTGGAACCTTTTGGATCTCTTGGTATAATTTATGCTATAATTGGAATTGCAGTTTTAGGGTTTATTGTATGAGCCCATCATATATTTACTGTAGGATTAGATGTTGATACTCGAGCTTATTTTACAGCAGCTACTATAGTTATTGCTGTTCCTACTGGTATTAAAGTTTTTAGGTGATTA---GCAACA" #> #> #> [[3]] #> [[3]]$id #> [1] "GBNM0293-06" #> #> [[3]]$name #> [1] "Steinernema carpocapsae" #> #> [[3]]$gene #> [1] "GBNM0293-06" #> #> [[3]]$sequence #> [1] 
"---------------------------------------------------------------------------------ACAAGATTATCTCTTATTATTCGTTTAGAGTTGGCTCAACCTGGTCTTCTTTTGGGTAAT---GGTCAATTATATAATTCTATTATTACTGCTCATGCTATTCTTATAATTTTTTTCATAGTTATACCTAGAATAATTGGTGGTTTTGGTAATTGAATATTACCTTTAATATTGGGGGCTCCTGATATAAGTTTTCCACGTTTGAATAATTTAAGTTTTTGATTGCTACCAACTGCTATATTTTTGATTTTAGATTCTTGTTTTGTTGACACTGGTTGTGGTACTAGTTGAACTGTTTATCCTCCTTTGAGG---ACTTTAGGTCACCCTGGYAGAAGTGTAGATTTAGCTATTTTTAGTCTTCATTGTGCAGGAATTAGCTCAATTTTAGGGGCTATTAATTTTATATGTACTACAAAAAATCTTCGTAGTAGTTCTATTTCTTTGGAACATATAAGACTTTTTGTTTGGGCTGTTTTTGTTACTGTTTTTTTATTAGTTTTATCTTTACCTGTTTTAGCTGGTGCTATTACTATGCTTTTAACAGACCGTAATTTAAATACTTCTTTTTTT------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------" #> #> #> [[4]] #> [[4]]$id #> [1] "NEONV108-11" #> #> [[4]]$name #> [1] "Aedes thelcter" #> #> [[4]]$gene #> [1] "NEONV108-11" #> #> [[4]]$sequence #> [1] 
"AACTTTATACTTCATCTTCGGAGTTTGATCAGGAATAGTTGGTACATCATTAAGAATTTTAATTCGTGCTGAATTAAGTCAACCAGGTATATTTATTGGAAATGACCAAATTTATAATGTAATTGTTACAGCTCATGCTTTTATTATAATTTTCTTTATAGTTATACCTATTATAATTGGAGGATTTGGAAATTGACTAGTTCCTCTAATATTAGGAGCCCCAGATATAGCTTTCCCTCGAATAAATAATATAAGTTTTTGAATACTACCTCCCTCATTAACTCTTCTACTTTCAAGTAGTATAGTAGAAAATGGATCAGGAACAGGATGAACAGTTTATCCACCTCTTTCATCTGGAACTGCTCATGCAGGAGCCTCTGTTGATTTAACTATTTTTTCTCTTCATTTAGCCGGAGTTTCATCAATTTTAGGGGCTGTAAATTTTATTACTACTGTAATTAATATACGATCTGCAGGAATTACTCTTGATCGACTACCTTTATTCGTTTGATCTGTAGTAATTACAGCTGTTTTATTACTTCTTTCACTTCCTGTATTAGCTGGAGCTATTACAATACTATTAACTGATCGAAATTTAAATACATCTTTCTTTGATCCAATTGGAGGAGGAGACCCAATTTTATACCAACATTTATTT" #> #> #> [[5]] #> [[5]]$id #> [1] "NEONV109-11" #> #> [[5]]$name #> [1] "Aedes thelcter" #> #> [[5]]$gene #> [1] "NEONV109-11" #> #> [[5]]$sequence #> [1] "AACTTTATACTTCATCTTCGGAGTTTGATCAGGAATAGTTGGTACATCATTAAGAATTTTAATTCGTGCTGAATTAAGTCAACCAGGTATATTTATTGGAAATGACCAAATTTATAATGTAATTGTTACAGCTCATGCTTTTATTATAATTTTCTTTATAGTTATACCTATTATAATTGGAGGATTTGGAAATTGACTAGTTCCTCTAATATTAGGAGCCCCAGATATAGCTTTCCCTCGAATAAATAATATAAGTTTTTGAATACTACCTCCCTCATTAACTCTTCTACTTTCAAGTAGTATAGTAGAAAATGGGTCAGGAACAGGATGAACAGTTTATCCACCTCTTTCATCTGGAACTGCTCATGCAGGAGCCTCTGTTGATTTAACTATTTTTTCTCTTCATTTAGCCGGAGTTTCATCAATTTTAGGGGCTGTAAATTTTATTACTACTGTAATTAATATACGATCTGCAGGAATTACTCTTGATCGACTACCTTTATTCGTTTGATCTGTAGTAATTACAGCTGTTTTATTACTTCTTTCACTTCCTGTATTAGCTGGAGCTATTACAATACTATTAACTGATCGAAATTTAAATACATCTTTCTTTGACCCAATTGGAGGGGGAGACCCAATTTTATACCAACATTTATTT" ``` And you can search by researcher name ```r bold_seq(researchers = 'Thibaud Decaens')[[1]] #> $id #> [1] "AMAZ108-09" #> #> $name #> [1] "Arsenura armida" #> #> $gene #> [1] "AMAZ108-09" #> #> $sequence #> [1] 
"TACTTTATATTTTATTTTTGGAATTTGAGCAGGTATAATTGGAACCTCCTTAAGTTTATTAATTCGAGCTGAATTAGGAATACCTGGATTTTTAATTGGTAATGATCAAATTTATAATACTATTGTAACAGCTCATGCTTTTATTATAATTTTTTTTATAGTTATACCTATTATAATTGGAGGATTTGGTAACTGATTAATTCCTTTAATATTAGGTGCCCCTGATATAGCTTTCCCCCGAATAAATAACATAAGCTTTTGATTACTCCCCCCCTCATTAATACTTTTAATTTCGAGAAGAATTGTAGAAAATGGAGCAGGAACAGGATGAACAGTTTATCCCCCACTTTCATCTAATATTGCTCATAGAGGCTCTTCAATTGATTTAGCTATTTTTTCCCTTCATTTAGCTGGAATTTCTTCAATTTTAGGTGCTATTAATTTCATTACAACAATTATTAATATACGATTAAATAATATAGCTTTTGATCAAATACCTTTATTTGTTTGATCTGTAGGTATTACTGCTTTCCTTCTTCTTCTTTCTCTTCCAGTATTAGCTGGTGCTATTACTATATTATTAACTGATCGAAATTTAAATACATCTTTTTTTGACCCTGCAGGAGGAGGAGATCCAATTCTTTATCAACATTTATTT" ``` by taxon IDs ```r bold_seq(ids = c('ACRJP618-11', 'ACRJP619-11')) #> [[1]] #> [[1]]$id #> [1] "ACRJP618-11" #> #> [[1]]$name #> [1] "Lepidoptera" #> #> [[1]]$gene #> [1] "ACRJP618-11" #> #> [[1]]$sequence #> [1] "------------------------TTGAGCAGGCATAGTAGGAACTTCTCTTAGTCTTATTATTCGAACAGAATTAGGAAATCCAGGATTTTTAATTGGAGATGATCAAATCTACAATACTATTGTTACGGCTCATGCTTTTATTATAATTTTTTTTATAGTTATACCTATTATAATTGGAGGATTTGGTAATTGATTAGTTCCCCTTATACTAGGAGCCCCAGATATAGCTTTCCCTCGAATAAACAATATAAGTTTTTGGCTTCTTCCCCCTTCACTATTACTTTTAATTTCCAGAAGAATTGTTGAAAATGGAGCTGGAACTGGATGAACAGTTTATCCCCCACTGTCATCTAATATTGCCCATAGAGGTACATCAGTAGATTTAGCTATTTTTTCTTTACATTTAGCAGGTATTTCCTCTATTTTAGGAGCGATTAATTTTATTACTACAATTATTAATATACGAATTAACAGTATAAATTATGATCAAATACCACTATTTGTGTGATCAGTAGGAATTACTGCTTTACTCTTATTACTTTCTCTTCCAGTATTAGCAGGTGCTATCACTATATTATTAACGGATCGAAATTTAAATACATCATTTTTTGATCCTGCAGGAGGAGGAGATCCAATTTTATATCAACATTTATTT" #> #> #> [[2]] #> [[2]]$id #> [1] "ACRJP619-11" #> #> [[2]]$name #> [1] "Lepidoptera" #> #> [[2]]$gene #> [1] "ACRJP619-11" #> #> [[2]]$sequence #> [1] 
"AACTTTATATTTTATTTTTGGTATTTGAGCAGGCATAGTAGGAACTTCTCTTAGTCTTATTATTCGAACAGAATTAGGAAATCCAGGATTTTTAATTGGAGATGATCAAATCTACAATACTATTGTTACGGCTCATGCTTTTATTATAATTTTTTTTATAGTTATACCTATTATAATTGGAGGATTTGGTAATTGATTAGTTCCCCTTATACTAGGAGCCCCAGATATAGCTTTCCCTCGAATAAACAATATAAGTTTTTGGCTTCTTCCCCCTTCACTATTACTTTTAATTTCCAGAAGAATTGTTGAAAATGGAGCTGGAACTGGATGAACAGTTTATCCCCCACTGTCATCTAATATTGCCCATAGAGGTACATCAGTAGATTTAGCTATTTTTTCTTTACATTTAGCAGGTATTTCCTCTATTTTAGGAGCGATTAATTTTATTACTACAATTATTAATATACGAATTAACAGTATAAATTATGATCAAATACCACTATTTGTGTGATCAGTAGGAATTACTGCTTTACTCTTATTACTTTCTCTTCCAGTATTAGCAGGTGCTATCACTATATTATTAACGGATCGAAATTTAAATACATCATTTTTTGATCCTGCAGGAGGAGGAGATCCAATTTTATATCAACATTTATTT" ``` by container (containers include project codes and dataset codes) ```r bold_seq(container = 'ACRJP')[[1]] #> $id #> [1] "ACRJP008-09" #> #> $name #> [1] "Lepidoptera" #> #> $gene #> [1] "ACRJP008-09" #> #> $sequence #> [1] "AACTTTATATTTTATTTTTGGTATTTGATCTGGAATAATTGGAACATCTTTAAGTTTACTAATTCGAACAGAATTAGGTAACCCAGGGTCCTTAATTGGAGATGATCAAATTTATAATACTATTGTTACAGCCCATGCTTTTATTATAATTTTTTTTATAGTTATACCAATTATAATTGGTGGATTTGGAAATTGACTTGTACCTTTAATATTAGGAGCTCCTGATATAGCTTTCCCCCGAATAAATAATATAAGTTTTTGACTTTTACCCCCCTCATTAATTTTATTAATTTCTAGAAGAATTGTTGAAAATGGAGCAGGTACAGGATGAACAGTTTACCCCCCACTTTCATCAAATATTGCCCATGGTGGATCATCTGTTGATTTAGCCATTTTTTCTCTTCATTTAGCCGGAATTTCATCTATTTTAGGAGCAATTAATTTTATTACAACTATTATTAATATACGAGTAAATAATTTATCTTTTGACCAAATACCTTTATTTGTTTGAGCAGTTGGTATCACAGCTCTTCTTTTACTTCTATCTTTACCAGTTTTAGCAGGAGCTATTACTATATTATTAACCGATCGTAATTTAAATACTTCATTTTTTGATCCTGCCGGGGGTGGAGACCCAATTTTATACCAACATTTATTT" ``` by bin (a bin is a _Barcode Index Number_) ```r bold_seq(bin = 'BOLD:AAA5125')[[1]] #> $id #> [1] "BLPAC438-06" #> #> $name #> [1] "Eacles ormondei" #> #> $gene #> [1] "BLPAC438-06" #> #> $sequence #> [1] 
"NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNAGCAGAATTAGGTACCCCCGGATCTTTAATTGGAGATGACCAAATTTATAATACCATTGTAACAGCTCATGCTTTTATTATAATTTTTTTTATAGTTATACCTATTATAATTGGAGGATTTGGAAATTGATTAGTACCCCTAATACTAGGAGCTCCTGATATAGCTTTCCCCCGAATAAATAATATAAGATTTTGACTATTACCCCCATCTTTAACTCTTTTAATTTCTAGAAGAATTGTCGAAAATGGAGCTGGAACTGGATGAACAGTTTATCCCCCCCTTTCATCTAATATTGCTCATGGAGGCTCTTCTGTTGATTTAGCTATTTTTTCCCTTCATCTAGCTGGAATCTCATCAATTTTAGGAGCTATTAATTTTATCACAACAATCATTAATATACGACTAAATAATATAATATTTGACCAAATACCTTTATTTGTATGAGCTGTTGGTATTACAGCATTTCTTTTATTGTTATCTTTACCTGTACTAGCTGGAGCTATTACTATACTTTTAACAGATCGAAACTTAAATACATCATTTTTTGACCCAGCAGGAGGAGGAGATCCTATTCTCTATCAACATTTATTT" ``` And there are more ways to query, check out the docs for `?bold_seq`. ### Search for specimen data only The BOLD specimen API doesn't give back sequences, only specimen data. By default you download `tsv` format data, which is given back to you as a `data.frame` ```r res <- bold_specimens(taxon = 'Osmia') head(res[,1:8]) #> processid sampleid recordID catalognum fieldnum #> 1 ASGCB255-13 BIOUG07489-F04 3955532 BIOUG07489-F04 #> 2 BCHYM412-13 BC ZSM HYM 18272 3896353 BC ZSM HYM 18272 BC ZSM HYM 18272 #> 3 FBAPB679-09 BC ZSM HYM 02154 1289040 BC ZSM HYM 02154 BC ZSM HYM 02154 #> 4 FBAPB730-09 BC ZSM HYM 02205 1289091 BC ZSM HYM 02205 BC ZSM HYM 02205 #> 5 FBAPB748-09 BC ZSM HYM 02223 1289109 BC ZSM HYM 02223 BC ZSM HYM 02223 #> 6 FBAPB753-09 BC ZSM HYM 02228 1289114 BC ZSM HYM 02228 BC ZSM HYM 02228 #> institution_storing bin_uri phylum_taxID #> 1 Biodiversity Institute of Ontario BOLD:ABZ2181 20 #> 2 SNSB, Zoologische Staatssammlung Muenchen BOLD:AAP2416 20 #> 3 SNSB, Zoologische Staatssammlung Muenchen BOLD:AAI1788 20 #> 4 SNSB, Zoologische Staatssammlung Muenchen BOLD:AAK5820 20 #> 5 SNSB, Zoologische Staatssammlung Muenchen BOLD:AAI1998 20 #> 6 SNSB, Zoologische Staatssammlung Muenchen BOLD:ACF3653 20 ``` You can optionally get back the data in `XML` format ```r bold_specimens(taxon = 'Osmia', format = 'xml') 
```

```r
1470124 BOM1525-10 BOLD:AAN3337 DHB 1011 DHB 1011 DHB1011 Marjorie Barrick Museum
```

You can choose to get the `httr` response object back if you'd rather work with the raw data returned from the BOLD API.

```r
res <- bold_specimens(taxon = 'Osmia', format = 'xml', response = TRUE)
res$url
#> [1] "http://www.boldsystems.org/index.php/API_Public/specimen?taxon=Osmia&specimen_download=xml"
res$status_code
#> [1] 200
res$headers
#> $date
#> [1] "Mon, 02 May 2016 15:53:38 GMT"
#>
#> $server
#> [1] "Apache/2.2.15 (Red Hat)"
#>
#> $`x-powered-by`
#> [1] "PHP/5.3.15"
#>
#> $`content-disposition`
#> [1] "attachment; filename=bold_data.xml"
#>
#> $connection
#> [1] "close"
#>
#> $`transfer-encoding`
#> [1] "chunked"
#>
#> $`content-type`
#> [1] "application/x-download"
#>
#> attr(,"class")
#> [1] "insensitive" "list"
```

### Search for specimen plus sequence data

The specimen/sequence combined API gives back specimen and sequence data. Like the specimen API, it returns `tsv` format data by default, which is given back to you as a `data.frame`. Here, we set `sepfasta = TRUE` so that the sequences are returned as a separate list and removed from the `data.frame`, which keeps the `data.frame` manageable.
```r res <- bold_seqspec(taxon = 'Osmia', sepfasta = TRUE) res$fasta[1:2] #> $`ASGCB255-13` #> [1] "-------------------------------GGAATAATTGGTTCTGCTATAAGTATTATTATTCGAATAGAATTAAGAATTCCTGGATCATTCATTTCTAATGATCAAACTTATAATTCTTTAGTAACAGCTCATGCTTTTTTAATAATTTTTTTTCTTGTAATACCATTTTTAATTGGTGGATTTGGAAATTGATTAATTCCATTAATATTAGGAATCCCAGATATAGCATTTCCTCGAATAAATAATATTAGATTTTGACTTTTACCCCCATCCTTAATAATTTTACTTTTAAGAAATTTCTTAAATCCAAGTCCAGGAACAGGTTGAACTGTATATCCCCCCCTTTCTTCTTATTTATTTCATTCTTCCCCTTCTGTTGATTTAGCTATTTTTTCTCTTCATATTTCTGGTTTATCTTCCATCATAGGTTCTTTAAATTTTATTGTTACAATTATTATAATAAAAAATATTTCATTAAAACATATTCAATTACCTTTATTTCCTTGATCCGTTTTTATTACAACTATTTTACTATTATTTTCTTTACCTGTTCTAGCAGGAGCTATTACTATATTATTATTTGATCGAAACTTTAATACTTCATTTTTTGATCCAACTGGAGGAGGAGATCCAATTTTATATCAACATTTATTC" #> #> $`BCHYM412-13` #> [1] "AGTTCTATATATAATCTTTGCTATATGATCAGGAATAATTGGTTCAGCAATAAGAATTATTATTCGTATAGAATTAAGAATTCCAGGATCATTTATTTCTAATGATCAAACTTATAATTCTTTAGTAACTGCTCATGCTTTTTTAATAATTTTTTTTCTTGTTATACCTTTTTTGATTGGAGGATTCGGAAATTGATTAATTCCAATAATATTAGGAATTCCAGATATAGCTTTTCCCCGAATAAATAATATTAGATTTTGACTTTTACCCCCATCTTTAATAATTTTACTTTTAAGAAATTTTTTCAATCCTAGTCCAGGAACTGGATGAACTGTTTATCCTCCTCTTTCTTCTTATTTATTTCATTCTTCCCCTTCTGTTGATTTAGCAATTTTTTCTTTACATATTTCTGGCTTATCCTCTATTATAGGTTCTTTAAATTTTATTGTAACAATTATTATAATAAAAAATATTTCATTAAAACATATTCAACTTCCCTTATTTCCCTGATCTGTTTTTATTACTACTATCTTATTATTATTTTCTTTACCAGTATTAGCCGGAGCAATTACAATATTATTATTTGATCGAAATTTTAATACTTCATTTTTTGATCCAACTGGGGGTGGGGACCCAATTCTCTATCAACATTTATTT" ``` Or you can index to a specific sequence like ```r res$fasta['GBAH0293-06'] #> $`GBAH0293-06` #> [1] 
"------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------TTAATGTTAGGGATTCCAGATATAGCTTTTCCACGAATAAATAATATTAGATTTTGACTGTTACCTCCATCTTTAATATTATTACTTTTAAGAAATTTTTTAAATCCAAGTCCTGGAACAGGATGAACAGTTTATCCTCCTTTATCATCAAATTTATTTCATTCTTCTCCTTCAGTTGATTTAGCAATTTTTTCTTTACATATTTCAGGTTTATCTTCTATTATAGGTTCATTAAATTTTATTGTTACAATTATTATAATAAAAAATATTTCTTTAAAATATATTCAATTACCTTTATTTTCTTGATCTGTATTTATTACTACTATTCTTTTATTATTTTCTTTACCTGTATTAGCTGGAGCTATTACTATATTATTATTTGATCGAAATTTTAATACATCTTTTTTTGATCCAACAGGAGGGGGAGATCCAATTCTTTATCAACATTTATTTTGATTTTTTGGTCATCCTGAAGTTTATATTTTAATTTTACCTGGATTTGGATTAATTTCTCAAATTATTTCTAATGAAAGAGGAAAAAAAGAAACTTTTGGAAATATTGGTATAATTTATGCTATATTAAGAATTGGACTTTTAGGTTTTATTGTT---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------" ``` ### Get trace files This function downloads files to your machine - it does not load them into your R session - but prints out where the files are for your information. 
```r
bold_trace(taxon = 'Osmia', quiet = TRUE)
#> Downloading: 51 MB
#>
#>
#> .../bold/bold_trace_files/BBHYL361-10[LepF1,LepR1]_F.ab1
#> .../bold/bold_trace_files/BBHYL361-10[LepF1,LepR1]_R.ab1
#> .../bold/bold_trace_files/BBHYL363-10[LepF1,LepR1]_F.ab1
#> .../bold/bold_trace_files/BBHYL363-10[LepF1,LepR1]_R.ab1
#> .../bold/bold_trace_files/BBHYL365-10[LepF1,LepR1]_F.ab1
#> .../bold/bold_trace_files/BBHYL365-10[LepF1,LepR1]_R.ab1
#> .../bold/bold_trace_files/FBAPB666-09[LepF1,LepR1]_F.ab1
#> .../bold/bold_trace_files/FBAPB666-09[LepF1,LepR1]_R.ab1
#> .../bold/bold_trace_files/FBAPB667-09[LepF1,LepR1]_R.ab1
```
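Because the trace files are only written to disk, a natural next step is to check which process IDs have both a forward and a reverse read. The following is a minimal sketch, not part of `bold`: the helper `pair_trace_files()` is hypothetical, and it assumes file names follow the `<processid>[<primers>]_F.ab1` / `_R.ab1` pattern seen in the output above. In practice you could pass it the result of `list.files()` on the download directory.

```r
# Hypothetical helper (not part of bold): report which read directions exist
# for each process ID, based only on the trace file names.
pair_trace_files <- function(paths) {
  files <- basename(paths)
  # Keep only names ending in "]_F.ab1" or "]_R.ab1" (assumed naming pattern)
  files <- files[grepl("\\]_[FR]\\.ab1$", files)]
  # Process ID is everything before the first "["
  processid <- sub("\\[.*$", "", files)
  # Direction is the single letter between "_" and ".ab1"
  direction <- sub("^.*_([FR])\\.ab1$", "\\1", files)
  ids <- unique(processid)
  data.frame(
    processid = ids,
    forward = ids %in% processid[direction == "F"],
    reverse = ids %in% processid[direction == "R"],
    stringsAsFactors = FALSE
  )
}

# File names copied from the bold_trace() output above
example_files <- c(
  "BBHYL361-10[LepF1,LepR1]_F.ab1",
  "BBHYL361-10[LepF1,LepR1]_R.ab1",
  "FBAPB667-09[LepF1,LepR1]_R.ab1"
)
pairs <- pair_trace_files(example_files)

# Process IDs that are missing one of the two directions
pairs[!(pairs$forward & pairs$reverse), ]
```

Working from file names alone keeps the check cheap; it says nothing about the quality of the reads inside the `.ab1` files.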
## Citing

To cite `bold` in publications use:

> Scott Chamberlain (2016). bold: Interface to Bold Systems API. R package version 0.3.5. https://github.com/ropensci/bold

## License and bugs

* License: [MIT](http://opensource.org/licenses/MIT)
* Report bugs at [our GitHub repo for bold](https://github.com/ropensci/bold/issues?state=open)