This downloads and installs the Tika App jar (~60 MB) into a user directory, and verifies the integrity of the file using a checksum. The default settings should work fine.

install_tika(version = "1.22",
             digest = paste0("64975d79211bc5c37f866abb2f4077687eff55b7615",
                             "67f7ad0b36a221a2ae3457b748fac9b288a31a641f3",
                             "7dfc8679260413f972b5b60cf6deb6721329cad001"),
             mirrors = c("http://mirrors.ocf.berkeley.edu/apache/tika/",
                         "http://apache.cs.utah.edu/tika/",
                         "http://mirror.cc.columbia.edu/pub/software/apache/tika/"),
             retries = 2, url = character())

Arguments

version

The declared Tika version

digest

The sha512 checksum, as a hexadecimal string. Set to an empty string "" to skip the check.

mirrors

A vector of Apache mirror sites. One is picked randomly.

retries

The number of times to try the download.

url

Optional url to a particular location of the tika app. Setting this to any character string overrides downloading from random mirrors.

Value

A logical value indicating whether the installation was successful.

Details

The default settings of install_tika() should typically be left as they are.

This function will download the version of the Tika jar tested to work with this package, and can verify file integrity using a checksum.
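The checksum comparison can be illustrated with a minimal sketch. It is not the code install_tika() runs internally: verify_sha512() is a hypothetical helper, and the sketch assumes the digest package is installed. A throwaway temporary file stands in for the jar.

```r
# Sketch only: verify_sha512() is a hypothetical helper, not part of rtika.
# Assumes the 'digest' package is installed.
verify_sha512 <- function(path, expected) {
  # An empty string means "skip the check", mirroring the digest argument
  if (identical(expected, "")) {
    return(TRUE)
  }
  # Hash the file contents (file = TRUE) rather than the path string
  actual <- digest::digest(path, algo = "sha512", file = TRUE)
  isTRUE(as.character(actual) == tolower(expected))
}

# A small temporary file stands in for the downloaded jar
tmp <- tempfile(fileext = ".jar")
writeBin(charToRaw("not really a jar"), tmp)

good <- digest::digest(tmp, algo = "sha512", file = TRUE)
bad <- strrep("0", 128)

matches_good <- verify_sha512(tmp, good)   # hash matches
matches_bad <- verify_sha512(tmp, bad)     # hash does not match
skipped <- verify_sha512(tmp, "")          # check skipped
```

A sha512 hash is 512 bits, so its hexadecimal form has 128 characters, which is the length of the default digest value shown in the usage above.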

It will normally download from a random Apache mirror. If the mirror fails, it tries the archive at http://archive.apache.org/dist/tika/. You can also enter a value for url directly to override this.
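The download order described above can be sketched as follows. This is an illustration under stated assumptions, not rtika's actual implementation: try_sources() and the fetch argument are hypothetical names, and applying retries to each source separately is a guess about how the retries argument is used.

```r
# Sketch only: try_sources() and fetch are hypothetical, not part of rtika.
# fetch(base) should return TRUE when a download from 'base' succeeds.
try_sources <- function(mirrors, archive, retries, fetch) {
  # One randomly chosen mirror first, then the archive as a fallback
  candidates <- c(sample(mirrors, 1), archive)
  for (base in candidates) {
    # Assumption: each source gets up to 'retries' attempts
    for (attempt in seq_len(retries)) {
      ok <- tryCatch(isTRUE(fetch(base)), error = function(e) FALSE)
      if (ok) {
        return(base)
      }
    }
  }
  NA_character_  # every source failed
}

# Stand-in for a real download: pretend every mirror is down
fake_fetch <- function(base) grepl("archive.apache.org", base, fixed = TRUE)

result <- try_sources(
  mirrors = c("http://mirrors.ocf.berkeley.edu/apache/tika/",
              "http://apache.cs.utah.edu/tika/"),
  archive = "http://archive.apache.org/dist/tika/",
  retries = 2,
  fetch = fake_fetch
)
```

With the simulated mirror failure, result holds the archive address. Passing the download step in as a function keeps the sketch runnable without network access.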

It will download into a directory determined by the rappdirs::user_data_dir() function, specific to the operating system.
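To see where that directory is on the current machine, a short check along these lines may help. It assumes the rappdirs package is installed and only inspects the folder; it does not download anything.

```r
# Assumes the 'rappdirs' package is installed.
# Path of the rtika data directory for the current user and operating system
install_dir <- rappdirs::user_data_dir("rtika")
print(install_dir)

# Has anything been installed there yet?
has_dir <- dir.exists(install_dir)
print(has_dir)

# Files in the directory; an empty character vector if it does not exist
installed_files <- list.files(install_dir)
print(installed_files)
```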

If tika() stops with an error complaining about the jar, try running install_tika() again.

Uninstalling

If you are uninstalling the entire rtika package and want to remove the Tika App jar also, run:

unlink(rappdirs::user_data_dir('rtika'), recursive = TRUE)

Alternatively, navigate to the install folder and delete it manually. Its path is the value returned by rappdirs::user_data_dir('rtika'). The path is OS-specific, as explained here: https://github.com/r-lib/rappdirs .

Distribution

Tika is distributed under the Apache License Version 2.0, which generally permits distribution of the code in "Object" form without the "Source". The master copy of the Apache Tika source code is held in Git. You can fetch (clone) the large source repository from GitHub ( https://github.com/apache/tika ).

Examples

install_tika()
#> Downloading the Tika App .jar version 1.22 into "/Users/sasha/Library/Application Support/rtika". The file is approximately 60 MB - this may take a while.
#> The download is successful.
#> The file integrity is good.
#> The installation is successful.