If output_dir is specified, files will have the .xml file extension.

tika_xml(input, ...)

Arguments

input

Character vector of paths and/or URLs to the input documents.

...

Other parameters to be sent to tika().

Value

A character vector of unparsed XHTML, in the same order and with the same length as input. Unprocessed files are as.character(NA).

Examples

batch <- c(
  system.file("extdata", "jsonlite.pdf", package = "rtika"),
  system.file("extdata", "curl.pdf", package = "rtika"),
  system.file("extdata", "table.docx", package = "rtika"),
  system.file("extdata", "xml2.pdf", package = "rtika"),
  system.file("extdata", "R-FAQ.html", package = "rtika"),
  system.file("extdata", "calculator.jpg", package = "rtika"),
  system.file("extdata", "tika.apache.org.zip", package = "rtika")
)
xml <- tika_xml(batch)
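
Because the result has the same order and length as input, and unprocessed files come back as NA, the failures can be separated from the usable XHTML before any parsing. The following is a minimal sketch in base R rather than part of the package. The `input` and `xml` vectors are hypothetical stand-ins for the paths passed to tika_xml() and for its return value, so the snippet runs without calling Tika.

```r
# Hypothetical stand-ins: `input` mimics the paths passed to tika_xml(), and
# `xml` mimics its return value (one XHTML string, one unprocessed file).
input <- c("report.pdf", "missing.pdf")
xml <- c(
  "<html xmlns=\"http://www.w3.org/1999/xhtml\"><body><p>Hello</p></body></html>",
  NA_character_
)

# The result is aligned with the input, so one logical index serves both.
processed <- !is.na(xml)

failed_inputs <- input[!processed]  # inputs that came back as NA
xhtml <- xml[processed]             # XHTML strings that are safe to parse
names(xhtml) <- input[processed]    # record which input produced each string
```

With a real result, the same indexing would apply to `batch` and the output of tika_xml(batch). The remaining strings could then be handed to an XML parser of your choice.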