This function prepares the text for analysis by cleaning punctuation, checking spelling against the lexicons, mapping terms according to the lexicons, and converting everything to lower case. For ease of use, it wraps several of the other functions in the package.

textPrep(inputText, delim)

Arguments

inputText

The relevant pathology text column

delim

The delimiters, supplied so that the extractor can be used

Value

This returns a string vector.

See also

Other NLP - Text Cleaning and Extraction: ColumnCleanUp(), DictionaryInPlaceReplace(), Extractor(), NegativeRemoveWrapper(), NegativeRemove()

Examples

mywords <- c("Hospital Number", "Patient Name:", "DOB:", "General Practitioner:",
             "Date received:", "Clinical Details:", "Macroscopic description:",
             "Histology:", "Diagnosis:")
CleanResults <- textPrep(PathDataFrameFinal$PathReportWhole, mywords)
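
To give a rough feel for the kind of cleaning described above, the following base R sketch lower-cases a string, strips punctuation, and maps terms using a small lookup. It is an illustration only, not the source of textPrep(): the function prep_sketch() and the lookup toy_lexicon are hypothetical names invented for this example, and the package's own lexicons and spell checking are not reproduced here.

```r
# Hypothetical sketch only: this is NOT how textPrep() is implemented.
# toy_lexicon is an invented lookup: names are variants, values are preferred terms.
toy_lexicon <- c("bx" = "biopsy", "oesophageal" = "oesophagus")

prep_sketch <- function(text, lexicon) {
  text <- tolower(text)                    # lower-case everything
  text <- gsub("[[:punct:]]", " ", text)   # replace punctuation with spaces
  for (variant in names(lexicon)) {
    # replace whole-word matches of each variant with its preferred term
    pattern <- paste0("\\b", variant, "\\b")
    text <- gsub(pattern, lexicon[[variant]], text, perl = TRUE)
  }
  trimws(gsub("[[:space:]]+", " ", text))  # collapse repeated whitespace
}

prep_sketch("Histology: Oesophageal Bx, x3!!", toy_lexicon)
```

Note that this sketch removes colons along with other punctuation, so it would not preserve delimiters such as "Histology:" for later extraction. It is meant only to show the cleaning and mapping steps in isolation.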