Specific diseases - Barrett’s oesophagus

The functions above act as building blocks for the more complex analyses that can be done for specific endoscopic-pathological disease sets.

One such disease is the premalignant oesophageal condition Barrett’s oesophagus. It is characterised by the growth of cells (called columnar lined epithelium) in the oesophagus. These cells usually occupy the lower part of the oesophagus as a continuous sheet extending from the top of the stomach to a variable length up the oesophagus. The condition requires endoscopic surveillance, and the timing of surveillance depends on the prior endoscopic features (namely the length of the Barrett’s segment as measured by the Prague score, explained below) and the pathological stage at that endoscopy (which for non-dysplastic samples, since the revised 2013 UK guidelines, means the presence or absence of intestinal metaplasia). This is summarised in the guideline flowchart (from Fitzgerald RC, et al. Gut 2013;0:1–36. doi:10.1136/gutjnl-2013-305372).



Pre-processing Barrett’s samples

Such a dataset needs some processing before analysis, and for this we can turn to a set of functions specific to Barrett’s oesophagus.


Prague score

First we need to extract the length of the Barrett’s segment, which is recorded as the Prague score. The score has two components. The circumferential extent (C) is the length from the top of the gastric folds (just below the gastro-oesophageal junction) to the top of the circumferential part of the Barrett’s segment. The maximal extent (M) is the length from the top of the gastric folds to the top of the tongues of the Barrett’s segment. Together these give an overall score such as C1M2.

After filtering for endoscopic indication (eg “Surveillance-Barrett’s”, which is stored in the ‘Indications’ column in our dataset), the following function extracts the C and M stages (the Prague score) for Barrett’s samples. It uses a regular expression, so it only works where the C and M stages are explicitly mentioned in the free text. The score is usually mentioned in the ‘Findings’ column in our dataset, but the user can define which column should be searched.

v<-Barretts_PragueScore(Myendo,'Findings')
  Indications ProcedurePerformed Findings CStage MStage
23 Weight Loss Gastroscopy (OGD) STOMACH: diffuse gastritis with angiodysplasia and punctate bleeding site on greater curve mid body - no obvious ulcer- antrum scar ?,OESOPHAGUS: Barrett’s osophagus C3M5 34-39cm ,He also has erosive gastritis in the fundus ,STOMACH: non-erosive gastritis from mid-body to antrum, CLO test: negative ,STOMACH: numerous fundic glands looking polyps at body and fundus with size ranging from 2 mm to 10 mm ,Otherwise entirely normal study ,Scope easily passed through A lesion underwent EMR 3 5
24 Haematemesis or Melaena/Blood PR Gastroscopy (OGD) Top of gastrisc folds:43cm ,No inlet patch ,Biopsies taken from distal, mid and proximal third ,STOMACH: Normal ,Hiatus Hernia- Small ,Oesophagus normal with no hiatus hernia or oesophagitis ,D2 biopsies taken in view of weight loss ,STOMACH: gastritis in antrum - CLO test: negative ,Nodules between 38 anc 34cm, Paris Type IIa annd covered 75% of the circumference of the oesophagus NA NA
25 Dyspepsia Gastroscopy (OGD) No varices anywhere in the upper GI tract/ no PHG ,OESOPHAGUS: Barrett’s oesophagus C0 M1 ,No blood or sign of recent bleeding 0 1
26 Small Bowel Biopsy Gastroscopy (OGD) There is a duodenal diverticulum, just after the D1/D2 flexure but there is no inflammation, ulceration or erosions ,Biopsies from D2 and stomach ,Biopsies from D2 and stomach ,Several erosions/small ulcers in inflammed antrum ,Hiatus hernia ,The antrum looks spared ,Stomach - mild gastritis CLO - negative ,D1 inspected carefully and no othe abnormalities seen NA NA
27 Follow-up ULCER HEALING Gastroscopy (OGD) GOJ at 40cm with small sliding hiatus hernia ,Forrest Ulcer classification: IIc ,Gastritis and duodenitis A lesion underwent EMR NA NA
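
The regular-expression step can be illustrated with a short base R sketch. This is not the package's implementation: extract_prague is a hypothetical helper, and the pattern is an assumption that only covers scores written as "C3M5" or "C0 M1".

```r
# Hypothetical helper (not part of EndoMineR): extract Prague C and M values
# from free text such as "C3M5" or "C0 M1".
extract_prague <- function(text) {
  # \b stops the match starting mid-word; whitespace between parts is optional
  pattern <- "\\bC\\s*([0-9]+)\\s*M\\s*([0-9]+)"
  matches <- regmatches(text, regexec(pattern, text, perl = TRUE))
  # Each match holds the full match plus two capture groups, or is empty
  pick <- function(i) {
    vapply(matches,
           function(m) if (length(m) == 3) as.integer(m[i]) else NA_integer_,
           integer(1), USE.NAMES = FALSE)
  }
  data.frame(CStage = pick(2), MStage = pick(3))
}

findings <- c("OESOPHAGUS: Barrett's oesophagus C3M5 34-39cm",
              "OESOPHAGUS: Barrett's oesophagus C0 M1",
              "GOJ at 40cm with small sliding hiatus hernia")
extract_prague(findings)
```

Reports that do not state the score explicitly give NA for both stages, which matches the behaviour seen in the output above.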

Worst pathological stage

We also need to extract the worst pathological stage for a sample and, if it is non-dysplastic, determine whether the sample has intestinal metaplasia or not. This is done by ‘degradation’: the function looks for the worst overall grade in the histology specimen and, if that is not found, looks for the next worst, and so on. It works per report rather than per biopsy (histopathology reports more commonly state the worst overall grade than individual biopsy grades).

# The histology is then merged with the Endoscopy dataset. The merge occurs
# according to date and Hospital number
b<-Endomerge2(Myendo,'Dateofprocedure','HospitalNumber',Mypath,'Dateofprocedure',
'HospitalNumber')

#Picking out only the Barrett's related endoscopies
b<-b[grepl("Surv",b$Indications),]

b<-Barretts_PathStage(b,'Histology')
  Indications ProcedurePerformed Findings Days IMorNoIM
327 Surveillance-Barrett’s Gastroscopy (OGD) Normal upper GI endoscopy to the Angularis ,D2 biopsies taken in view of weight loss ,Modalities used to achieve haemostasis: 0 days No_IM
346 Surveillance-Barrett’s Gastroscopy (OGD) D1 - ulcer healing A lesion underwent EMR 0 days No_IM
355 Surveillance-Barrett’s Gastroscopy (OGD) G: no Fundal varices ,Small oesophageal diverticulu at 20cms and florid candida ,There seemed to be two loops 0 days No_IM
372 Surveillance-Barrett’s Gastroscopy (OGD) STOMACH: Gastritis- Mild No ulceration seen ,Stomach and duodenum normal ,Normal upper GI endoscopy 0 days No_IM
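
The ‘degradation’ approach can be sketched in base R as an ordered list of patterns in which the first match wins. The function worst_grade and its patterns are hypothetical and much simpler than a real implementation would need to be. In particular, the sketch does not handle negation, so "no intestinal metaplasia" would still be labelled IM.

```r
# Hypothetical sketch of 'degradation': try the worst grade first and fall
# back to the next worst. The patterns are illustrative assumptions.
worst_grade <- function(histology) {
  grades <- c(
    T1a = "T1a|intramucosal",
    HGD = "high[- ]grade dysplasia",
    LGD = "low[- ]grade dysplasia",
    IM  = "intestinal metaplasia"
  )
  out <- rep("No_IM", length(histology))     # default when nothing matches
  assigned <- rep(FALSE, length(histology))
  for (g in names(grades)) {
    # Only label reports that have not already received a worse grade
    hit <- !assigned & grepl(grades[[g]], histology, ignore.case = TRUE)
    out[hit] <- g
    assigned <- assigned | hit
  }
  out
}

histology <- c("Low grade dysplasia on a background of intestinal metaplasia",
               "Intestinal metaplasia present",
               "Columnar mucosa only",
               "High-grade dysplasia with focal low grade dysplasia")
worst_grade(histology)
```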

Therapeutic events

We also need to process any endoscopies where an event other than surveillance happened (unless the data have been filtered specifically for surveillance). This function extracts the event, usually a therapeutic one such as endoscopic mucosal resection (EMR) or radiofrequency ablation (RFA), from the text. It does not currently include stricture dilatation. At the moment the event is extracted from the endoscopy only, but for procedures that produce a specimen, eg EMR, future iterations may also examine the pathology data, as it is likely to describe the presence of an EMR more robustly.


v<-Endomerge2(Myendo,"Dateofprocedure","HospitalNumber",Mypath,"Dateofprocedure","HospitalNumber")

#Pick out the therapeutic procedures:
b<-v[grepl("Therap",v$Indications),]
b<-Barretts_EventType(b,'Histology', 'ProcedurePerformed','OGDReportWhole','Findings')
  Indications ProcedurePerformed Findings Days EVENT
204 Therapeutic- RFA Gastroscopy (OGD) 2 biopsies from D2 ,OESOPHAGUS: Varices- 3 columns grade 2 0 days RFA
208 Therapeutic- Dilatation Gastroscopy (OGD) No active bleeding/no visible vessel seen ,Micronodular gastritis in the antrum ,No fresh or altered blood in stomach ,STOMACH: previous bypass surgery in 1997 - both loops look normal 0 days nothing
213 Therapeutic- Dilatation Gastroscopy (OGD) Stomach and duodenum normal ,Biopsies showed no evidence of H ,STOMACH: hiatus hernia 0 days nothing
218 Therapeutic- RFA Gastroscopy (OGD) EMR scar unchanged ,Oesophageal biopsies taken from three levels as requested ,Oesophagus normal; SCJ at 40 cm; no oesophagitis, hiatus hernia or dilatation ,Known GOJ adeno-Ca ,Total of 6 ablations ,On treatment dose LMWH for bilateral PE and portal vein thrombosis ,STOMACH: streaks of erythematous mucosa at distal body/proximal antrum ,OESOPHAGUS: Normal to GOJ at 40 cm ,Stomach and duodenum: normal with no mucosal lesion ,Oesophageal mucosae very friable and inflamed 0 days RFA
221 Therapeutic- Dilatation Gastroscopy (OGD) DUODENUM: not entered ,Normal oesophagus; SCJ at 40 cm widely patent with 20 mm balloon fitting loosely; both limbs of jejunum entered for 20 cm HALO 90 done with good effect 0 days RFA
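
As an illustration of this kind of keyword search across several columns, here is a minimal base R sketch. The function extract_event, its patterns and the priority order (EMR over RFA over APC) are all assumptions made for the example, not the logic used by Barretts_EventType.

```r
# Hypothetical sketch: label each endoscopy with a therapeutic event by
# searching one or more free-text columns.
extract_event <- function(df, cols) {
  # Join the chosen columns into one string per row
  text <- do.call(paste, c(unname(as.list(df[cols])), sep = " "))
  has <- function(pattern) grepl(pattern, text, ignore.case = TRUE, perl = TRUE)
  event <- rep("nothing", nrow(df))
  # Assigned in reverse priority, so later lines overwrite earlier ones.
  # Word boundaries stop "RFA" matching inside words such as "surface".
  event[has("\\b(APC|argon plasma)\\b")] <- "APC"
  event[has("\\b(RFA|HALO|radiofrequency)\\b")] <- "RFA"
  event[has("\\b(EMR|mucosal resection)\\b")] <- "EMR"
  event
}

endo <- data.frame(
  ProcedurePerformed = rep("Gastroscopy (OGD)", 4),
  Findings = c("HALO 90 done with good effect",
               "Mucosal surface looks normal",
               "A lesion underwent EMR",
               "APC applied to a residual island")
)
extract_event(endo, c("ProcedurePerformed", "Findings"))
```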

So far we have extracted the worst pathological stage per sample as well as the length of the Barrett’s segment where available. We have also extracted any therapeutic events that have occurred. These separate columns are the building blocks for further analyses.


Follow-up groups

Having done these pre-processing steps, we can determine the follow-up group to which the last endoscopy belongs. The group is assigned to the endoscopy rather than the patient because a patient’s biopsy results or Barrett’s segment length, and therefore their follow-up timing, may fluctuate over time. The follow-up timing, as explained in the original guideline flowchart above, depends on the length of the Barrett’s segment and the presence of intestinal metaplasia (a type of columnar lined epithelium). If abnormal cells (dysplasia) are present then there is a different follow-up regime, which we won’t concern ourselves with at the moment. The follow-up group is assigned by the function Barretts_FUType, which relies on Barretts_PathStage, Barretts_EventType and Barretts_PragueScore having been run first. Barretts_FUType tells you which follow-up rule the patient should be on, so that the timing of the next endoscopy can be determined.


v<-Endomerge2(Myendo,"Dateofprocedure","HospitalNumber",Mypath,"Dateofprocedure","HospitalNumber")
b<-Barretts_PathStage(v,'Histology')
b1<-Barretts_PragueScore(b,'Findings')
b2<-Barretts_EventType(b1,'Histology','ProcedurePerformed','OGDReportWhole','Findings')
b3<-Barretts_FUType(b2,'Findings')
   IMorNoIM CStage MStage EVENT   FU_Group
23 No_IM    NA     2      RFA     Rule1
24 No_IM    3      5      nothing Rule3
25 IM       NA     0      nothing Rule2
26 No_IM    0      1      nothing Rule1
27 No_IM    NA     0      nothing Rule1
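
The rule assignment can be sketched as a small lookup on segment length and intestinal metaplasia status. The mapping below is inferred from the example output above (M stage of 3 or more gives Rule3; shorter segments give Rule2 with IM and Rule1 without) and is an assumption, not a statement of how Barretts_FUType is written. The function name fu_group is hypothetical.

```r
# Hypothetical sketch of follow-up rule assignment, with the mapping inferred
# from the example output. Anything else (eg dysplasia) is left as "NoRules".
fu_group <- function(IMorNoIM, MStage) {
  out <- rep("NoRules", length(MStage))
  short <- !is.na(MStage) & MStage < 3
  long  <- !is.na(MStage) & MStage >= 3
  # %in% is used so that a missing pathology stage counts as "no match"
  out[short & IMorNoIM %in% "No_IM"] <- "Rule1"
  out[short & IMorNoIM %in% "IM"]    <- "Rule2"
  out[long & IMorNoIM %in% c("IM", "No_IM")] <- "Rule3"
  out
}

fu_group(IMorNoIM = c("No_IM", "No_IM", "IM", "No_IM", "No_IM"),
         MStage   = c(2, 5, 0, 1, 0))
```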

Running all the Barrett’s functions:

It is sometimes useful to run all the Barrett’s functions together rather than one by one. This is provided by the function BarrettsAll, which is a parent function for Barretts_PathStage, Barretts_EventType and Barretts_FUType. It relies on the columns being named in a standardised way, but not all the columns need to be present: if a column is missing, BarrettsAll will just skip it.

 # The histology is then merged with the Endoscopy dataset. The merge occurs
 # according to date and Hospital number
 v<-Endomerge2(Myendo,'Dateofprocedure','HospitalNumber',Mypath,'Dateofprocedure',
 'HospitalNumber')
 #The function relies on the other Barrett's functions being run as well:
 b3<-BarrettsAll(v)
   IMorNoIM CStage MStage FU_Group EVENT
23 No_IM    NA     2      Rule1    RFA
24 No_IM    3      5      Rule3    nothing
25 IM       NA     0      Rule2    nothing
26 No_IM    0      1      Rule1    nothing
27 No_IM    NA     0      Rule1    nothing

Assessment of Barrett’s therapeutics

Barrett’s patients also undergo therapeutic procedures, and there are various metrics for analysing the performance of the therapy by endoscopist, as well as which catheters are being used, etc. To start with we can do simple things like plotting the pathological grades of all the EMRs. This should only be run after all the Barrett’s functions (as explained above, all contained within BarrettsAll) so that the EVENT column is present in the dataframe.

The therapeutic functions are fairly self-explanatory. The function to determine the post-EMR grade is BarrettsEMRGrades, which also correlates the grade with the endoscopist’s Paris classification (a standardised method of describing lesions in the upper GI tract). Using the dataset extracted in the code chunk above:

# As long as the code in the chunk above has been run, all that needs to be done is:
BarrettsEMRGrades(b3)


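To determine the basic numbers of EMR, RFA and APC (argon plasma coagulation) procedures, a function called BarrettsBasicNumbers is provided. It returns a list containing a table of procedure counts per year (ProcNumbers) and a geom_line plot of the same information (ProcNumbersPlot).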

## $ProcNumbers
## # A tibble: 43 x 3
## # Groups:   EVENT [?]
##    EVENT  year     n
##    <chr> <dbl> <int>
##  1 APC   2001.     7
##  2 APC   2002.     8
##  3 APC   2003.     7
##  4 APC   2004.     8
##  5 APC   2005.     5
##  6 APC   2006.    10
##  7 APC   2007.    11
##  8 APC   2008.     8
##  9 APC   2009.     7
## 10 APC   2010.     3
## # ... with 33 more rows
## 
## $ProcNumbersPlot
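
A table like ProcNumbers can be approximated in base R by tabulating events against the year of the procedure. This is a sketch under assumptions: count_events_by_year is a hypothetical helper, and the date column is taken to be something as.Date can parse.

```r
# Hypothetical sketch: count therapeutic events per calendar year.
count_events_by_year <- function(df, event_col, date_col) {
  year <- format(as.Date(df[[date_col]]), "%Y")
  counts <- as.data.frame(table(EVENT = df[[event_col]], year = year),
                          responseName = "n", stringsAsFactors = FALSE)
  counts$year <- as.integer(counts$year)
  # Drop event/year combinations that never occurred
  counts <- counts[counts$n > 0, ]
  counts[order(counts$EVENT, counts$year), ]
}

therapy <- data.frame(
  EVENT = c("APC", "APC", "APC", "RFA"),
  Dateofprocedure = c("2001-03-01", "2001-07-01", "2002-01-01", "2001-05-05")
)
counts <- count_events_by_year(therapy, "EVENT", "Dateofprocedure")
counts
```

The resulting data frame has one row per event and year, which is the shape a line plot of n against year, grouped by EVENT, would need.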


To assess the number of different catheter types used (for remuneration purposes, for example), the function BarrettssRFACath can be used. Because the catheter may be recorded in the ProcedurePerformed column or in the free text of Findings, more than one column can be searched.

BarrettssRFACath(b3,"ProcedurePerformed","Findings")


One of the most important aspects of therapeutics is the assessment of outcome, which is defined as clearance of intestinal metaplasia (CRIM), usually on at least two endoscopies. As most patients undergoing Barrett’s ablation will require radiofrequency ablation after the initial endoscopic mucosal resection, we can define CRIM as an endoscopy where the therapeutic procedure is listed as ‘nothing’ in the EVENT column after a course of RFA or EMR. This will only work if the Barrett’s pre-processing functions have been run.

ds<-Barretts_CRIM(b3,'pHospitalNum',"EVENT")
ds2<-data.frame(ds$pHospitalNum,ds$ind)
ds.pHospitalNum ds.ind
A1648588 TRUE
A1648588 FALSE
A1648588 TRUE
A1648588 FALSE
A1648588 TRUE
A1648588 FALSE
A3962018 TRUE
A3962018 FALSE
A3962018 TRUE
A3962018 FALSE
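
The definition above can be expressed directly in base R: within each patient, ordered by date, flag endoscopies with EVENT equal to ‘nothing’ that come after at least one RFA or EMR. This sketch is one reading of that definition, not the code behind Barretts_CRIM. The function name flag_crim and the Date column are hypothetical, and the ‘at least two endoscopies’ criterion is not applied.

```r
# Hypothetical sketch: flag a 'nothing' endoscopy that follows earlier therapy.
flag_crim <- function(df, id_col, date_col, event_col) {
  df <- df[order(df[[id_col]], df[[date_col]]), ]
  is_therapy <- as.integer(df[[event_col]] %in% c("RFA", "EMR"))
  # Per patient, count the therapies that happened strictly before each row
  prior <- ave(is_therapy, df[[id_col]], FUN = function(x) cumsum(x) - x)
  df$ind <- prior > 0 & df[[event_col]] %in% "nothing"
  df
}

course <- data.frame(
  pHospitalNum = c("A", "A", "A", "A", "B", "B"),
  Date = as.Date(c("2010-01-01", "2011-01-01", "2012-01-01", "2013-01-01",
                   "2010-06-01", "2011-06-01")),
  EVENT = c("EMR", "RFA", "nothing", "nothing", "nothing", "nothing")
)
crim <- flag_crim(course, "pHospitalNum", "Date", "EVENT")
crim[, c("pHospitalNum", "EVENT", "ind")]
```

Patient B never had therapy, so none of their endoscopies are flagged even though EVENT is ‘nothing’.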


Quality assessment in Barrett’s surveillance

Quality of documentation for Barrett’s surveillance endoscopies

We can also assess the documentation used in Barrett’s endoscopies. Documentation quality is a cornerstone of endoscopic quality, as listed in the EndoMineR principles and also in the Analysis tutorial. There are standards for these endoscopies that all endoscopists should adhere to. This function assesses the Barrett’s documentation by counting how many reports contain the mandatory report fields specified in the BSG standards on Barrett’s endoscopic reporting. It should be run after Barretts_PragueScore, as the Prague score is part of this assessment:

BarrettsDocumentQual(b3,"Findings")


Quality of performance of Barrett’s surveillance endoscopies as judged by tissue sampling

One of the essential requirements for demonstrating adequate sampling of Barrett’s oesophagus during endoscopy is that the endoscopist adheres to the ‘Seattle protocol’ for biopsies, which is to take 4 equally spaced biopsies at 2cm intervals in the circumferential part of the oesophagus. The macroscopic description of the pathological specimen tells us how many samples were taken overall (it only rarely tells us how many were taken at each level), so we can determine the shortfall in the number of biopsies taken, per endoscopist. Again, pre-processing the Barrett’s samples is a prerequisite. The number of biopsies and their size should also be extracted using the histopathology functions.

# The average number of biopsies is then calculated and
# compared to the average Prague C score so that those who are taking
# too few biopsies can be identified
b4<-HistolNumbOfBx(b3,'Macroscopicdescription','specimen')
b4<-HistolBxSize(b4,'Macroscopicdescription')
BarrettsBxQual(b4,'Date.x','HospitalNumber','Endoscopist')
## $BxShortfallPre
##                        Endoscopist    MeanDiff
## 1    Dr\n al-Arif, Ummu Kulthoom\n -12.8888889
## 2           Dr\n Anderson, Alana\n  -1.8500000
## 3   Dr\n Avitia-Ramirez, Alondra\n  -6.3181818
## 4           Dr\n Greimann, Phoua\n  -2.7222222
## 5             Dr\n Ives, Rashiah\n  -8.5416667
## 6         Dr\n Kekich, Annabelle\n  -0.8333333
## 7     Dr\n Kola-Kehinde, Karisma\n   0.7619048
## 8          Dr\n Martinez, Maegen\n -11.9090909
## 9            Dr\n Moreno, Lauren\n  -0.3478261
## 10         Dr\n Sullivan, Shelby\n  -9.8333333
## 
## $t


This function will again return a list with a ggplot showing the shortfall per endoscopist as well as a table with the same values.
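
To make the shortfall calculation concrete, here is a base R sketch. It rests on a simplifying assumption that is not taken from the package: the expected number of biopsies is 4 for every 2cm of circumferential length, with a minimum of one level. The function biopsy_shortfall and the column names CStage and NumbOfBx are hypothetical.

```r
# Hypothetical sketch: mean difference between biopsies taken and biopsies
# expected under a simple reading of the Seattle protocol.
biopsy_shortfall <- function(df, endoscopist_col, c_col, bx_col) {
  # Assumption: 4 biopsies per 2 cm of circumferential length, at least one level
  levels_expected <- pmax(1, ceiling(df[[c_col]] / 2))
  diff <- df[[bx_col]] - 4 * levels_expected
  out <- aggregate(diff, by = list(Endoscopist = df[[endoscopist_col]]),
                   FUN = mean, na.rm = TRUE)
  names(out)[2] <- "MeanDiff"
  out
}

bx <- data.frame(
  Endoscopist = c("A", "A", "B", "B"),
  CStage      = c(4, 2, 6, 0),
  NumbOfBx    = c(8, 4, 6, 4)
)
shortfall <- biopsy_shortfall(bx, "Endoscopist", "CStage", "NumbOfBx")
shortfall
```

A negative MeanDiff means that, on average, the endoscopist took fewer biopsies than expected for the segment lengths they examined.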

Quality of performance of Barrett’s surveillance endoscopies

As we saw with some of the generic functions above, one way to assess the quality of endoscopy is to look at the pathology of specimens taken at surveillance per year. This function outputs a plot showing the overall number of pathologies (low/high grade dysplasia and cancer) for patients on surveillance.


The plot gives the absolute numbers of pathology detected. It does not tell us the per-endoscopist rate, so a further function is provided which looks specifically at the detection of dysplasia by endoscopist as a function of the number of surveillance endoscopies done by that endoscopist. The output is a table of proportions per endoscopist. This is called the dysplasia detection rate and gives a good idea of how robustly an endoscopist is examining a segment of Barrett’s mucosa.

bDDR<-BarrettsDDRQual(b3,'Endoscopist','IMorNoIM')
  HGD IM LGD No_IM T1a
Dr al-Arif, Ummu Kulthoom 0.007109 0.02085 0.004739 0.07773 0.001896
Dr Anderson, Alana 0.006161 0.02133 0.003791 0.07251 0.00237
Dr Avitia-Ramirez, Alondra 0.006161 0.02085 0.005213 0.06161 0.001422
Dr Greimann, Phoua 0.004739 0.01659 0.003791 0.06351 0.0004739
Dr Ives, Rashiah 0.005687 0.02322 0.003791 0.06872 0.0009479
Dr Kekich, Annabelle 0.005213 0.01801 0.00237 0.06493 0
Dr Kola-Kehinde, Karisma 0.007583 0.01611 0.007583 0.0654 0.0009479
Dr Martinez, Maegen 0.009005 0.01896 0.004265 0.06635 0.0004739
Dr Moreno, Lauren 0.007109 0.01801 0.01043 0.07441 0.0004739
Dr Sullivan, Shelby 0.009479 0.02133 0.006635 0.05924 0.0004739
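
A table of proportions per endoscopist can be sketched with table and prop.table. The sketch normalises within each endoscopist so that every row sums to 1, which is an assumption about the intended definition and may differ from how BarrettsDDRQual scales its output. The function detection_rate is hypothetical.

```r
# Hypothetical sketch: proportion of each pathological grade per endoscopist.
detection_rate <- function(df, endoscopist_col, grade_col) {
  counts <- table(df[[endoscopist_col]], df[[grade_col]])
  # margin = 1 divides each row by that endoscopist's total
  prop.table(counts, margin = 1)
}

surv <- data.frame(
  Endoscopist = c("A", "A", "A", "A", "B", "B"),
  IMorNoIM    = c("HGD", "No_IM", "No_IM", "IM", "No_IM", "No_IM")
)
rate <- detection_rate(surv, "Endoscopist", "IMorNoIM")
rate
```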

Surveillance enrollment and lost to follow-ups

Enrollment rates to surveillance programme

We can determine how many patients who underwent endoscopy for an indication other than Barrett’s surveillance, and in whom Barrett’s oesophagus was found, went on to have further follow-up endoscopies, ie the rate of enrollment to surveillance among those who were not known to have Barrett’s when their endoscopy was performed.

This function takes the patients who were not on a surveillance programme and graphs how many then had a further endoscopy. It should be run after Barretts_PragueScore and Barretts_PathStage.

Enroll<-BarrettsSurveil(Myendo,'HospitalNumber','Dateofprocedure','Indications')
MyEnroll<-data.frame(Enroll["HospitalNumber"],Enroll["Years"])
HospitalNumber Years
J1337672 5.005495 weeks
M5148114 3.769231 weeks
P2080807 3.313187 weeks
Q7176341 3.818681 weeks
Q7729897 3.684066 weeks
R3882435 3.357143 weeks
S1951882 3.354396 weeks
W2120051 3.241758 weeks

Patients undergoing surveillance

We may also be interested in how many patients fall under each follow-up category, so that we can plan how many Barrett’s surveillance endoscopies we will have to do over a certain time period. The function to do this lists the unique patient IDs associated with a rule (‘Rule1’, ‘Rule2’, ‘Rule3’, ‘NoRules’). This allows us to determine how many patients will need follow-up at specific time intervals. It should be run after Barretts_PragueScore, Barretts_PathStage and Barretts_FUType. The following example shows the individual patients who are currently under a Rule1 follow-up.

v<-Endomerge2(Myendo,"Dateofprocedure","HospitalNumber",Mypath,"Dateofprocedure","HospitalNumber")
b4<-BarrettsAll(v)
colnames(b4)[colnames(b4) == 'pHospitalNum'] <- 'HospitalNumber'
Rule<-BarrettsSurveil_HospNum(b4,'Rule1','HospitalNumber')
x
J6044658
Y6417773
B6072011
G1449886
O7163832
L4378217
K2657390
N4378127
R8004923
M5148114
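
The underlying idea is a filter followed by de-duplication, which can be sketched in base R as follows. The function patients_by_rule is hypothetical, and the sketch assumes the follow-up group is held in a column called FU_Group, as in the output shown earlier.

```r
# Hypothetical sketch: unique patient IDs that fall under a follow-up rule.
patients_by_rule <- function(df, rule, id_col = "HospitalNumber",
                             rule_col = "FU_Group") {
  unique(df[[id_col]][df[[rule_col]] %in% rule])
}

fu <- data.frame(
  HospitalNumber = c("J6044658", "J6044658", "Y6417773", "B6072011"),
  FU_Group       = c("Rule1", "Rule1", "Rule3", "Rule1")
)
patients_by_rule(fu, "Rule1")

# Number of unique patients under each rule, for planning capacity
rule_counts <- tapply(fu$HospitalNumber, fu$FU_Group,
                      function(x) length(unique(x)))
rule_counts
```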