Standardises biopsy locations by cleaning up the typos and abbreviations commonly found in the free text of pathology reports.
TermStandardLocation(dataframe, SampleLocation)
Argument | Description |
---|---|
dataframe | The dataframe containing the pathology reports |
SampleLocation | Column describing the macroscopic sample from histology |
    # Firstly we extract histology from the raw report
    # using the Extractor function
    mywords <- c("Hospital Number", "Patient Name:", "DOB:", "General Practitioner:",
                 "Date received:", "Clinical Details:", "Macroscopic description:",
                 "Histology:", "Diagnosis:")
    MypathExtraction <- Extractor(PathDataFrameFinal, "PathReportWhole", mywords)
    names(MypathExtraction)[names(MypathExtraction) == 'Datereceived'] <- 'Dateofprocedure'
    MypathExtraction$Dateofprocedure <- as.Date(MypathExtraction$Dateofprocedure)
    # The function then standardises the location terms in the histology
    # text through a series of regular expressions
    ll <- TermStandardLocation(MypathExtraction, 'Histology')
    rm(MypathExtraction)
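
The example above mentions that terms are standardised through a series of regular expressions. As a rough illustration of that idea only, the sketch below applies a small lookup of patterns to a text column using base R. The function name `standardise_location_sketch`, the `location_patterns` lookup and the patterns themselves are hypothetical and are not the rules used by `TermStandardLocation`.

```r
# Hypothetical lookup: regex pattern -> standard term.
# Illustrative only; not the package's actual rule set.
location_patterns <- c(
  "\\b(oesophagus|esophagus|oesophageal|oesoph)\\b" = "Oesophagus",
  "\\b(duodenum|duodenal)\\b" = "Duodenum",
  "\\b(antrum|antral)\\b" = "Antrum"
)

# Apply each pattern in turn to the named column and return the dataframe
standardise_location_sketch <- function(dataframe, SampleLocation) {
  text <- as.character(dataframe[[SampleLocation]])
  for (i in seq_along(location_patterns)) {
    text <- gsub(names(location_patterns)[i], location_patterns[[i]], text,
                 ignore.case = TRUE, perl = TRUE)
  }
  dataframe[[SampleLocation]] <- text
  dataframe
}

# Small made-up input to show the effect
demo <- data.frame(
  Histology = c("2 biopsies from the oesoph", "Duodenal bx x3", "antral biopsy"),
  stringsAsFactors = FALSE
)
result <- standardise_location_sketch(demo, "Histology")
print(result$Histology)
```

Keeping the patterns in a named vector means a new abbreviation or misspelling can be handled by adding an alternative to the lookup rather than by changing the function.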