Can we accurately define a group of pregnancies of unknown location (PULs) as low risk in order to safely reduce follow-up for these pregnancies and allocate resources to pregnancies at an increased risk of being ectopic? Prediction model M4 classified around 70% of PULs as low risk, of which around 97% were later characterized as failed PULs or intrauterine pregnancies (IUPs), while still classifying 88% of ectopic pregnancies as high risk. Depending on the level of suspicion of ectopic pregnancy (EP), women with a PUL receive a lengthy follow-up in order to confirm the location and viability of the pregnancy. A multi-centre diagnostic accuracy study of 1962 patients was carried out between 2003 and 2007 for retrospective temporal validation and between 2009 and 2011 for prospective external validation. The reference standard is the final characterization of PUL as failed pregnancies or IUPs (low risk), or as ectopic pregnancies (high risk). M4 is a multinomial logistic regression model based on the serum human chorionic gonadotrophin (hCG) levels at presentation and 48 h later. Temporal validation data from 1341 PULs collected at St George's Hospital in London were available, of which 53% were failed, 39% were intrauterine and 8% were ectopic pregnancies. External validation data from 621 PULs collected at four other London-based teaching hospitals were available, of which 63% were failed, 22% were intrauterine and 15% were ectopic pregnancies. The EP rate varied between 8 and 16% across the five hospitals. At St George's, 980 [73.1%, 95% confidence interval (CI): 70.5-75.4] PULs were considered low risk. Of these, 963 were failed PULs or IUPs (98.3%, 95% CI: 97.2-98.9) and 17 were ectopic pregnancies. At the other four hospitals, 62-75% were considered low risk, with 96-98% of these turning out to be failed PUL or IUP. 
Eighty-five percent (95% CI: 76.8-90.2) of the ectopic pregnancies were considered high risk at St George's, compared with 80-92% in the other hospitals. In total, 120 patients had been excluded due to loss to follow-up, and a further 102 patients because of missing hCG levels due to differences in local clinical practice. There are variations in the definition of a PUL used in different countries. The suggested protocol could safely reduce the follow-up in the majority of PUL such that units could increase the focus on women at risk of complications. This would lead to a change in the management of the majority of women with a PUL and a more efficient use of resources. At the end of the manuscript, we provide a link to enable clinicians to use the protocol. B.V.C. is supported by a postdoctoral fellowship from the Research Foundation Flanders (FWO). K.V.H. is supported by a fellowship from the Flanders' Agency for Innovation by Science and Technology (IWT-Vlaanderen), by the Research Council KU Leuven (GOA MaNet), by the Flemish Government (iMinds) and by the Belgian Federal Science Policy Office (IUAP P7/DYSCO). T.B. is supported by the Imperial Healthcare NHS Trust NIHR Biomedical Research Centre. No competing interests are declared.
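The abstract above reports proportions with 95% confidence intervals but does not state how the intervals were computed. As a rough sanity check, the following sketch applies the Wilson score interval to the St George's counts given in the text (980 of 1341 PULs classified as low risk, and 963 of those 980 ending as failed PUL or IUP). The choice of the Wilson method and the function name `wilson_interval` are assumptions made for illustration, not the authors' stated procedure.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion.

    Assumption: the source does not name its CI method; Wilson is
    used here only as one common choice. z=1.96 gives ~95% coverage.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    )
    return centre - half_width, centre + half_width


# Counts taken from the abstract (St George's temporal validation data).
low_risk = wilson_interval(980, 1341)    # PULs classified as low risk
non_ectopic = wilson_interval(963, 980)  # low-risk PULs that were failed PUL or IUP

print("low risk: %.1f%% to %.1f%%" % (100 * low_risk[0], 100 * low_risk[1]))
print("failed PUL or IUP: %.1f%% to %.1f%%" % (100 * non_ectopic[0], 100 * non_ectopic[1]))
```

Under this assumption the second interval comes out at approximately 97.2-98.9%, in line with the reported values, while the first is approximately 70.6-75.4%, within about 0.1 percentage points of the reported lower bound.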
How does a protocol based on a single serum progesterone measurement perform as a triage tool in women with pregnancy of unknown location (PUL) in comparison to protocols based on serial hCG measurement? Triage based on the logistic regression model M4 (using initial hCG and hCG ratio (48 h/0 h)) classifies the majority of PUL into low and high risk groups, in contrast to a progesterone protocol based on a serum level threshold of 10 nmol/l. Low progesterone has been shown to identify failing pregnancies and those at low risk of complications. A prediction model (M4) based on the initial hCG and the hCG ratio at 0 and 48 h can successfully classify PUL into low and high risk groups. A multi-centre diagnostic accuracy study of 1271 women was performed retrospectively on data from women at St. George's Hospital (SGH, London, UK) between February 2005 and 2006, Queen Charlotte's & Chelsea Hospital (QCCH, London, UK) between April 2009 and August 2012, and the Royal Prince Alfred Hospital (RPAH, Sydney, Australia) between February 2008 and October 2011. The end-points were the final observed outcome for each pregnancy as a failed PUL (low risk), intrauterine pregnancy (IUP, low risk), or ectopic pregnancy (EP, high risk), and any interventions or complications for EP during the follow-up period. Complete data were available for initial progesterone, 0/48 h hCG and final outcome in 431 of 534 women (81%) at SGH, 396/585 (68%) at QCCH and 96/152 (63%) at RPAH. Missing values were handled using multiple imputation. Three diagnostic approaches were used to classify PUL as high risk: a serum progesterone level above a threshold for the progesterone protocol (thresholds of >10, 16 and 20 nmol/l were evaluated), a risk of EP given by the M4 model of ≥5% for the M4-based protocol, and an hCG ratio between 0.87 and 1.66 for the hCG cut-offs as previously published. 
Results were analysed using random intercept models or stratified analysis to account for variability between centres. The progesterone protocol based on levels of >10 nmol/l classified 24% (95% confidence interval 20-28%) of failed PUL, 95% (92-97%) of IUP and 76% (67-83%) of EP as high risk. The M4 protocol classified 14% (11-17%) of failed PUL, 37% (31-43%) of IUP and 84% (76-90%) of EP as high risk. The hCG ratio cut-offs classified 10% (8-12%) of failed PUL, 15% (11-20%) of IUP and 63% (53-71%) of EP as high risk. Using complete cases only, 67% of EP treated with methotrexate (n = 48) and 89% surgically managed (n = 37) were correctly classified by the progesterone protocol, 96 and 81% by the M4 protocol and 75 and 65% by the hCG ratio cut-offs, respectively. Data were incomplete for 103 (19%), 189 (32%) and 56 (37%) patients at SGH, QCCH and RPAH, respectively; however, we are reassured by the minimal differences seen between the results of complete cases and those following imputation of missing values. The variation in the inclusion criteria between the three centres is also a potential limitation of this study; however, it reflects real clinical practice. Furthermore, the hCG ratio cut-offs were not originally developed to optimize triage. The results show that serum progesterone is less efficient for triage than serial hCG measurements assessed using the M4 model, the striking difference being that serum progesterone places nearly all IUP in the high-risk category. A two-step strategy combining single-visit and two-visit approaches should be investigated. Funding was from Research Foundation-Flanders (FWO). There are no competing interests.
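The three triage rules compared in the abstract above can be written down as simple threshold checks. This is a minimal sketch under stated assumptions: the function names are hypothetical, the M4 risk is taken as an input because the abstract does not give the model coefficients, and treating the hCG ratio bounds 0.87 and 1.66 as inclusive is an assumption (the abstract says only "between").

```python
def hcg_ratio(hcg_0h, hcg_48h):
    """hCG ratio as defined in the abstract: 48 h level divided by 0 h level."""
    if hcg_0h <= 0:
        raise ValueError("initial hCG must be positive")
    return hcg_48h / hcg_0h


def high_risk_by_progesterone(progesterone_nmol_l, threshold=10.0):
    """Progesterone protocol: high risk when the level exceeds the threshold.

    The abstract evaluates thresholds of >10, 16 and 20 nmol/l.
    """
    return progesterone_nmol_l > threshold


def high_risk_by_m4(ep_risk, threshold=0.05):
    """M4-based protocol: high risk when the predicted EP risk is >= 5%.

    ep_risk must come from the M4 model itself, which is not reproduced here.
    """
    return ep_risk >= threshold


def high_risk_by_hcg_cutoffs(ratio, low=0.87, high=1.66):
    """hCG ratio cut-offs: high risk when the ratio lies between the bounds.

    Inclusive bounds are an assumption for this sketch.
    """
    return low <= ratio <= high


# Hypothetical patient values, for illustration only.
ratio = hcg_ratio(1000.0, 1200.0)
print(high_risk_by_progesterone(12.0), high_risk_by_m4(0.03), high_risk_by_hcg_cutoffs(ratio))
```

Keeping each rule as a separate function mirrors the way the study compares the protocols on the same women, so the three classifications can be tabulated side by side against the final outcome.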
What is the inter-/intra-observer agreement and diagnostic accuracy among gynaecological and non-gynaecological ultrasound specialists in the prediction of pouch of Douglas (POD) obliteration (secondary to endometriosis) at offline analysis of two-dimensional videos using the dynamic real-time transvaginal ultrasound (TVS) 'sliding sign' technique? The inter-/intra-observer agreement and diagnostic accuracy for the interpretation of the TVS 'sliding sign' in the prediction of POD obliteration were found to be very acceptable, ranging from substantial to almost perfect agreement for the observers who specialized in gynaecological ultrasound. Women with POD obliteration at laparoscopy are at an increased risk of bowel endometriosis; therefore, the pre-operative diagnosis of POD obliteration is important in the surgical planning for these women. Previous studies have used TVS to predict POD obliteration prior to laparoscopy, with a sensitivity of 72-83% and specificity of 97-100%. However, there have not been any reproducibility studies performed to validate the use of TVS in the prediction of POD obliteration pre-operatively. This was a reproducibility study which involved the offline viewing of pre-recorded video sets of 30 women presenting with chronic pelvic pain, in order to determine POD obliteration using the TVS 'sliding sign' technique. The videos were selected on real-time representative quality/quantity; they were not obtained from sequential patients. There were a total of six observers, including four gynaecological ultrasound specialists and two fetal medicine specialists. The study was conducted over a period of 1 month (March 2012-April 2012). The four gynaecological ultrasound observers performed daily gynaecological scanning, while the other two observers were primarily fetal medicine sonologists. 
Each sonologist viewed the TVS 'sliding sign' video in two anatomical locations (retro-cervix and posterior uterine fundus), i.e. 60 videos in total. The POD was deemed not obliterated if the 'sliding sign' was positive in both anatomical locations (i.e. anterior rectum/rectosigmoid glided smoothly across the retro-cervix/posterior fundus, respectively). If the 'sliding sign' was negative (i.e. anterior rectum/rectosigmoid did not glide smoothly over retro-cervix/posterior fundal region, respectively), the POD was deemed obliterated. Diagnostic accuracy and inter-observer agreement among the six sonologists were evaluated. Each sonologist was also asked to reanalyse the same videos, albeit in a different order, at least 7 days later to assess for intra-observer agreement. A separate analysis of the inter- and intra-observer correlation was also performed to determine the agreement among the four observers who specialized in gynaecological ultrasound. Cohen's κ coefficient <0 meant that there was poor agreement, 0.01-0.20 slight agreement, 0.21-0.40 fair agreement, 0.41-0.60 moderate agreement, 0.61-0.80 substantial agreement and 0.81-0.99 almost perfect agreement. Agreement (Cohen's κ) between all six observers for the interpretation of the 'sliding sign' for both sets of videos in both regions (retro-cervix and fundus) ranged from 0.354 to 0.927 (fair agreement to almost perfect agreement) compared with 0.630-0.927 (substantial agreement to almost perfect agreement) when only the gynaecological sonologists were included. The overall multiple rater agreement for the interpretation of the 'sliding sign' for both video sets and both regions was Fleiss' κ 0.454 (P-value <0.01) for all six observers and 0.646 (P-value <0.01) for the four gynaecological ultrasound specialists. 
The multiple rater agreement for all six or all four observers was higher for the retro-cervical region versus the fundal region (Fleiss' κ 0.542 versus 0.370 and 0.732 versus 0.560, respectively). The intra-observer agreement among the six observers for the interpretation of the 'sliding sign' and prediction of POD obliteration ranged from Cohen's κ 0.60-0.95 and 0.46-1.0 (P-value <0.01), respectively. After excluding the fetal medicine specialists, the intra-observer agreement for the interpretation of the 'sliding sign' and the prediction of POD obliteration ranged from Cohen's κ 0.71-0.95 and 0.67-1.0, respectively, indicating substantial to almost perfect agreement. When comparing the four gynaecological observers for the prediction of POD obliteration using the TVS 'sliding sign' (after excluding cases with the POD outcome classified as 'unsure' by the observers), the results for accuracy, sensitivity, specificity, positive and negative predictive value were 93.1-100, 92.9-100, 90.9-100, 77.8-100 and 97.7-100%, respectively. The 'gold standard' for the diagnosis of POD obliteration is laparoscopy; however, laparoscopic data were available only for 24 out of 30 (80%) TVS 'sliding…
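The agreement statistics in the abstract above rest on Cohen's κ and the interpretation bands it lists. The sketch below, offered only as an illustration, computes Cohen's κ for two raters from paired labels and maps a κ value onto those bands. The function names are hypothetical, and the handling of values the abstract leaves unspecified (exactly 0, values between bands, and exactly 1.0) is an assumption noted in the comments.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


def agreement_band(kappa):
    """Map kappa to the bands quoted in the abstract.

    Assumptions: 0 is grouped with 'poor', values falling between the
    quoted bands go to the lower band, and 1.0 is 'almost perfect'.
    """
    if kappa <= 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical ratings (1 = 'sliding sign' positive, 0 = negative).
example = cohens_kappa([1, 1, 0, 0], [1, 0, 0, 0])
print(example, agreement_band(example))
```

Applied to the values quoted in the abstract, `agreement_band(0.354)` gives "fair" and `agreement_band(0.927)` gives "almost perfect", consistent with the labels the authors attach to that range.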
Papers by G. Condous