3. Materials and Methods
To present the methods and the research methodology adopted in this work, we rely mainly on a scenario describing how a doctor chooses a treatment protocol for a patient suffering from BC.
Our primary outcomes included: (i) SEs’ prediction and detection rate using knowledge-based reasoning in relation to BC, including NIBC, NMIBC and MIBC; and (ii) overall survival and progression-free survival within AEs’ severity grades. Secondary outcomes were SE management, preventing recurrences and occurrences of SE risks and the impact on quality of life.
3.1. Knowledge and Data Collection
In this study, we used a crawler, “pubCrawler”, as a selective information dissemination service (SDIS) to extract information from various online knowledge sources, particularly evidence-based knowledge and data about past events and facts in BC treatments. This systematic review framework was applied to search, extract, and assess scientific papers. We used a keyword search strategy to find relevant articles that contain knowledge about BC treatment risks and effects and research about BC ontologies. Journals, books, guidelines, and taxonomies were also included in these crawling events. Additionally, we searched for anonymized, non-identifiable clinical data and indications about patients and related clinical cases used in previously published studies about BC. When a study refers to random samples or is broadly applicable to many different types of samples, its data are labelled as having good generalizability.
We adopted a comprehensive literature search that included RCTs and data-based knowledge from PubMed/Medline, Embase, Cochrane CENTRAL and the Allied and Complementary Medicine Database (AMED). Our review process was based on the Cochrane guidelines for systematic reviews of interventions [46]. Terms and their combinations were searched following specific criteria, as described in Table 1, including the ongoing trials. Within some databases, such as PubMed, we used the “related articles” function to refine the search. We also manually searched the reference lists of the retrieved studies. We retained the most complete and up-to-date studies whose outcomes differed in measures and time, in order to avoid multiplicity and overlap among reports with the same samples and results. All BC treatment SE-related knowledge and RCTs were included. Our search focused on studies with extended research findings and conclusions, from small-sample studies to larger populations: the larger the population, the more generalizable the data. This method provided information on a dedicated web page whenever new hits on articles appeared in PubMed and the US National Library of Medicine (NLM) or when new sequences were found in Science Direct or GenBank that were specific to our customized queries. These targets provided access to their databases.
Only studies with BC treatment AEs and SE severity grades related to the clinical state of patients were included. Furthermore, any other treatment of BC different from the performed one was used as a comparator, including SEs. On the other hand, we avoided any use of studies with RCTs implicating the same treatments/procedures and the same AE severity grades.
Figure 1 gives details about the excluded and remaining records and papers. As detailed in Table 1, four queries were launched together and returned 3858 hits, which were received by our crawler as shown in Figure 1.
Inclusion and exclusion criteria are the characteristics used to screen and review the retrieved studies. These criteria are based on our PICOs of interest and were agreed by the whole research team. The PICO method is a process used in evidence-based practice to develop search strategies. It contextualizes and answers questions about healthcare and clinical observations. We referred to the PICO model to define our clinical questions, or PICOs, and to help find relevant evidence in the studies searched. The model consists of concepts relating to (P) patient problem/characteristics, (I) intervention, (C) comparison with other interventions, where applicable, and (O) outcomes to measure.
For general knowledge, we used background questions in the form of wh-questions about aspects of healthcare in BC treatment. For specific knowledge, we used foreground questions that affect clinical decisions and include indications about medical and clinical problems, such as treatments’ SEs. Examples of our PICO questions are as follows:
In a 65-year-old male patient, smoker, diagnosed with BC Papillary Carcinoma (P), does a chemotherapy treatment (I) worsen the situation of the patient more than immunotherapy (C) for more serious SEs (O)?
In adults with NMIBC (P), what does a TURBT (I) cause as SEs (O)?
First, duplicates were removed. Then, we focused on checking the titles and abstracts. This helped us to reduce the number of articles to 279 to refine the set of records gradually. In this screening phase, we found that there were many papers discussing cancers, and general knowledge about treatments such as numbers and statistics. However, few of these papers reviewed the risks and effects of BC treatments regarding evidence-based medicine and semantic web technologies. Moreover, we only considered articles that strongly focused on providing data about BC patients and cases, which was our main concern in this information gathering. We also focused on knowledge acquisition. We mainly excluded research that did not include examples and patients’ parameters and characteristics related to BC treatments. Only 93 remaining papers and records met the criteria to be included for quantitative synthesis and utilization as medical evidence and cases.
3.2. Data Collection and Analysis
For study selection, we used EndNote X7.8 to import our crawled reports and to deduplicate the information gathered from previous studies (literature). Two of the authors, working independently, browsed the content of all relevant studies and scanned their taxonomy, abstracts, location IDs and titles to extract knowledge data from the included RCTs. Furthermore, based on our predefined eligibility criteria, as mentioned in Table 1, unassociated contents were excluded from our identified studies. Then, full-text evaluation was carried out for eligibility and inclusion, as shown in Figure 1. Both researchers independently extracted, selected, and evaluated the quality of the recorded studies using the recommended Cochrane risk of bias tool (RoB 2) [47] within the included RCTs and PICOs, based on the consolidated standards of reporting trials (CONSORT) [48]. Any conflicting views, disagreements or inconsistencies were resolved by consensus with the help of another researcher, the adjudicating senior author, who reached a final decision after discussion.
We used Cohen’s kappa to determine the agreement between the two investigators involved in data collection and analysis. The interpretation of kappa results is as follows: poor (κ < 0), slight (κ = 0.00–0.20), fair (κ = 0.21–0.40), moderate (κ = 0.41–0.60), substantial (κ = 0.61–0.80), or almost perfect (κ = 0.81–1.00). We used the following Cohen’s kappa formula for agreement between our two investigators:
$$\kappa = \frac{p_0 - p_e}{1 - p_e}$$
where $p_0$ is the relative observed agreement among raters and $p_e$ is the hypothetical probability of chance agreement.
The methodological quality of all included RCTs was appraised by two independent researchers.
We used the crosstabulation in Table 2 to understand the degree to which both raters agreed and disagreed. As described in Table 2, the researchers rated 93 RCT studies targeted for inclusion after evaluation. Both investigators agreed that 80 RCT studies were confirmed for further study and that 6 RCT studies were not. Thus, there were 7 RCT studies on whose status the investigators could not agree.
Based on the data in Table 2, we have $p_0 = 0.92$ and $p_e = 0.81$, with $\kappa = (p_0 - p_e)/(1 - p_e)$; then κ = 0.57. Our Cohen’s kappa (κ) is 0.57, which indicates a moderate strength of agreement. It is statistically significant (95% CI, p > 0.0005).
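The kappa computation and the interpretation bands quoted above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical and are not part of the study's tooling. With the rounded inputs $p_0 = 0.92$ and $p_e = 0.81$, the formula evaluates to roughly 0.58, which lies in the same moderate band as the reported value.

```python
def cohens_kappa(p_o: float, p_e: float) -> float:
    """Cohen's kappa from observed agreement p_o and chance agreement p_e."""
    if p_e >= 1.0:
        raise ValueError("p_e must be below 1")
    return (p_o - p_e) / (1.0 - p_e)


def interpret_kappa(kappa: float) -> str:
    """Map a kappa value to the agreement bands listed in the text."""
    if kappa < 0:
        return "poor"
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label
    raise ValueError("kappa cannot exceed 1")


# Rounded agreement values quoted in the text.
kappa = cohens_kappa(0.92, 0.81)
print(f"kappa = {kappa:.2f} ({interpret_kappa(kappa)})")
```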
Moreover, we used calculable decision-making markers of treatment effects to evaluate studies, including RCTs, about the rate of AEs in BC treatments (for example, the rate of life-threatening AEs). We also compared the SEs of a prescribed treatment (intervention) with those of another treatment used as a standard of care (control). For risk quantification of treatments, we computed the absolute risk reduction (ARR) as the difference in risk between the two treatments (intervention and control). The ARR indicates the treatment with the lower life-threatening risk. Additionally, we computed the number needed to treat (NNT), NNT = 1/ARR, which indicates the number of patients that must receive the treatment for one patient to benefit. We used the relative risk (RR) as the ratio of the risk in the intervention treatment subjects to the risk in the control treatment subjects. An RR > 1 indicates a treatment with a higher risk of a bad outcome than the control, whereas an RR < 1 indicates greater treatment benefit with decreased risk. The relative risk reduction (RRR), RRR = 1 − RR, indicates the amount of risk reduction achieved by the treatment. For the assessment of treatment AEs, we used the absolute risk increase (ARI), which measures the difference between the treatment event rate and the control event rate. The inverse of the ARI is the number needed to harm (NNH), NNH = 1/ARI, which indicates the number of patients that must receive the treatment for one patient to have an AE. In this case, an RR > 1 indicates a greater treatment risk. Likewise, the higher the relative risk increase (RRI), RRI = RR − 1, the higher the harm rate.
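The markers defined above can be computed from two event rates, as in the following minimal sketch. The function name and the dictionary layout are hypothetical; the event rates of 0.7% and 2.8% are the ones used in the worked BCG versus mitomycin example in the Results.

```python
def risk_measures(intervention_rate: float, control_rate: float) -> dict:
    """Benefit or harm markers from two event rates (proportions in [0, 1])."""
    rr = intervention_rate / control_rate
    diff = control_rate - intervention_rate  # positive when the intervention lowers risk
    if diff > 0:
        # Intervention is safer: report reduction markers.
        return {"RR": rr, "ARR": diff, "RRR": 1 - rr, "NNT": 1 / diff}
    if diff < 0:
        # Intervention is riskier: report increase markers.
        return {"RR": rr, "ARI": -diff, "RRI": rr - 1, "NNH": 1 / -diff}
    return {"RR": rr}


# Mitomycin (0.7%) as intervention against intravesical BCG (2.8%) as control.
benefit = risk_measures(0.007, 0.028)
# The same rates with the roles swapped give the harm-side markers.
harm = risk_measures(0.028, 0.007)

for name, value in {**benefit, **harm}.items():
    print(f"{name}: {value:.3f}")
```

Keeping benefit and harm in one function makes the symmetry explicit: ARR and ARI are the same absolute difference seen from opposite sides, so NNT and NNH coincide when the roles of the two treatments are swapped.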
Statistical Analysis
For meta-analysis, we used the Cochrane Collaboration’s software Review Manager (RevMan 5.3), which uses the Cochran–Mantel–Haenszel (CMH) test method [49] to carry out statistical analysis [50]. Treatment SEs were expressed as a standardized mean difference for continuous data and as a risk ratio for dichotomous data, with 95% confidence intervals (CI) provided. A p-value ≤ 0.05 was considered statistically significant: it indicates strong evidence against the null hypothesis (no significant difference), which is rejected, while the alternative hypothesis (a difference is anticipated) is retained, meaning that the results support the investigated study.
The CMH χ² test was used to evaluate statistical heterogeneity within the included studies, and a p-value < 0.1 was considered significant. In addition, the I² statistic was used to quantify heterogeneity across the included RCTs and to examine the null hypothesis. When I² ≤ 50%, homogeneity is assumed and a fixed-effects model is applied. When I² > 50%, significant heterogeneity is suggested and a random-effects meta-analytical model is used, with subgroup analysis to characterize this heterogeneity.
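The model-selection rule above can be illustrated with the commonly used definition of I² in terms of the heterogeneity statistic Q and its degrees of freedom, I² = max(0, (Q − df)/Q) × 100%. The sketch below is an assumption-laden illustration, not a description of RevMan internals: the function names and the Q values are hypothetical.

```python
def i_squared(q: float, df: int) -> float:
    """I² (in percent) from the heterogeneity statistic Q and its degrees of freedom."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


def choose_model(i2: float, threshold: float = 50.0) -> str:
    """Apply the rule stated in the text: fixed effects up to 50%, random effects above."""
    return "fixed-effects" if i2 <= threshold else "random-effects"


# Hypothetical Q statistics for two meta-analyses of five studies each (df = 4).
for q in (5.0, 12.0):
    i2 = i_squared(q, df=4)
    print(f"Q = {q:.1f}: I² = {i2:.1f}% -> {choose_model(i2)} model")
```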
3.4. Building an Ontology for BC Knowledge Representation
In this study, a patient was represented by an instance of the class Patient, which was mainly expressed by two main subclasses, PatientBiophysicalInformation and PatientClinicalInformation.
We also introduced Pathologies as a class, in which we mainly focused on BladderCancer as a subclass describing its grades, characteristics and malignancy using class hierarchy. Besides the BladderCancer subclass, we identified BodyDisease as a subclass describing other tumors and diseases/illnesses that could be bound to BC or mentioned as a possible-related complication.
The Treatment class was designed to categorize BC therapy strategies and protocols in which we record applied treatment-related clinical evidence. This was to model knowledge about TreatmentType, Drug, ClinicalTechnique and TherapyProtocol. This class is related to both the RiskSideEffect and the BladderCancer (grades) classes to obtain the possible complications of each suggested clinical act. It is the clinical evidence basis of the semantic prediction rules used to reason and decide about the prediction results of treatment SEs.
We included a RiskSideEffect class in which we created two main subclasses: RiskSeverity and Complication. This class describes possible treatment SEs threatening the bladder and other organs when applying a therapy. When matched with BC grade and treatment, severity can help in managing complications correctly. The class also includes the assessment criteria used to compare a patient’s BC details with the BC treatment evidence. Moreover, it includes clinical evidence outcomes, as good and bad effects, to be compared and anticipated for treatment decision making.
Another important concept in our ontology is the Anatomy class. It contains knowledge and gender-based details about human body organs and describes AnatomyAbnormality and ConventionalAnatomy. This is to obtain more precise predictive results when locating treatment SEs damages to the patient’s health. The subclass OrgansBiophysicalSensitivity helps in understanding BC behavior and complications in addition to the body’s physical responses to undertaken treatments.
The inferred results were supported by the TreatmentSEStandard class, including texts and standards about treatment-related SEs, extracted from the international standard CTCAE and the published guidelines of the international cancer research foundations with reference to BC treatment clinical practice guidelines [55,56,57,58,59,60].
The construction of our ontology was semi-automatic. Information and knowledge representation design was applied to obtain a pre-formal metamodel ontology for rule-based prediction. It included classes, properties and attributes as static components. Additionally, the BC treatment process with SE prediction diagram was a part of the model. This model was a convenient tool to represent and formalize our domain knowledge. Furthermore, our descriptive representation was also understandable by medical team actors (clinicians and technicians) for both assessment and follow-up. This was to provide more semantic clinical evidence details, used as instances and concept values, so that we could obtain an ontology model.
The next step was to represent complex concepts in our ontology, using semantic and logic-based specifications. Given the variable conditions of each patient, our semantic-based explicit criteria rules were needed to help in discovering complications and to tag the source causes. We used Protégé to manually transform our model into an OWL-DL ontology. Our domain knowledge was represented by a set of classes and instances. OWL:Thing is the root class of this model, and its set of subclasses represents the common domain terminology used to describe our conceptual knowledge model.
The main concepts in our ontology were modelled in a hierarchical manner, which is a feature that OWL inherits from earlier languages for graph representation, such as RDF [61]. The higher level of the hierarchy was initially reserved for BC treatment domains. The main concepts were modeled as classes or categories of concepts. One class can be a subclass of several classes.
Creating a Cancer sub-class within the Pathology class made this evident. For example, BladderCancerPapillaryCarcinoma is both a Carcinoma and a Cancer. Therefore, we defined the BladderCancerPapillaryCarcinoma class as having two levels of super-classes on the hierarchical scale of the Pathology class: Carcinoma and Cancer. All instances of the BladderCancerPapillaryCarcinoma class are therefore instances of both the Carcinoma class and the Cancer class: a super-class can access its sub-classes’ instances and attributes. In turn, the BladderCancerPapillaryCarcinoma class inherits attributes and facets from its parent classes. We specify a class attribute using OWL properties (binary predicates, that is, formulae with two free variables). A property is a logic descriptor of a given concept, modelled as an object property or a datatype property; for example, tumPapillaryCarcinomaOf(x,y) denotes that y is a papillary carcinoma of a given tumor x. Atomic (unary) predicates denote the set of objects (instances) classified under a defined class, for example, Cancer(x), InvasiveCarcinoma(x), etc.
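The subsumption behaviour described above has a loose analogue in object-oriented inheritance, sketched below in Python. This is only an analogy under stated assumptions: Carcinoma is assumed to sit under Cancer, the attribute names are hypothetical, and OWL classes are sets of individuals interpreted under an open-world assumption, which Python classes do not capture.

```python
class Pathology:
    """Root of the illustrative pathology hierarchy."""


class Cancer(Pathology):
    malignant = True  # hypothetical attribute defined at the Cancer level


class Carcinoma(Cancer):
    tissue_origin = "epithelial"  # hypothetical attribute defined at the Carcinoma level


class BladderCancerPapillaryCarcinoma(Carcinoma):
    growth_pattern = "papillary"  # hypothetical attribute specific to this class


case = BladderCancerPapillaryCarcinoma()

# An instance of the most specific class is also an instance of every super-class.
print(isinstance(case, Carcinoma), isinstance(case, Cancer))

# Attributes declared on parent classes are inherited by the sub-class.
print(case.malignant, case.tissue_origin, case.growth_pattern)
```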
We can also use Data type properties which are a type of predicate capable of linking an instance (or object of a specific class type) to a range of data values (similar to the data types used to define the columns of a database).
An example of the use of a property is given by the following triple:
<NonInvasivePapillaryCarcinoma hasTreatmentName, TURBT>
That is, a BC NonInvasivePapillaryCarcinoma object is linked to the data value TURBT (transurethral resection of bladder tumor) by the datatype property hasTreatmentName.
We can also represent the level of risk severity that is associated with a specific treatment option using datatype properties, as shown in the following triple:
<IntravesicalTherapy hasRiskSeverity, 4>
This example links the object class IntravesicalTherapy to the data value 4, indicating its risk severity grade, by the use of the datatype property hasRiskSeverity. On the CTCAE SEs scale, the grade 4 refers to life-threatening consequences with an urgent intervention to be indicated.
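The two statements above can be held as plain subject, predicate, object tuples, as in the following minimal sketch. The in-memory list and the helper function are hypothetical stand-ins for the ontology store, and the grade labels paraphrase the CTCAE scale.

```python
# CTCAE severity grade labels (paraphrased).
CTCAE_GRADES = {
    1: "mild",
    2: "moderate",
    3: "severe",
    4: "life-threatening",
    5: "death related to AE",
}

# The two example statements from the text as (subject, predicate, object) tuples.
triples = [
    ("NonInvasivePapillaryCarcinoma", "hasTreatmentName", "TURBT"),
    ("IntravesicalTherapy", "hasRiskSeverity", 4),
]


def objects(subject: str, predicate: str) -> list:
    """Return every value linked to the subject by the given property."""
    return [o for s, p, o in triples if s == subject and p == predicate]


grade = objects("IntravesicalTherapy", "hasRiskSeverity")[0]
print(objects("NonInvasivePapillaryCarcinoma", "hasTreatmentName"))
print(f"IntravesicalTherapy: grade {grade} ({CTCAE_GRADES[grade]})")
```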
In OWL, we can also use Restrictions on attributes (facets, sometimes called role restrictions) to map pre-identified classes into definitions in a language that is understandable by a reasoning engine and therefore will be understood by a computer or a digital device. Here, our knowledge was explicitly included in our ontology in the form of restrictions. To define these restrictions, we rely on the notation allowed by the OWL language. In these definitions we also use the Existential Restriction most common in OWL ontologies. This restriction describes a class of individuals, which maintains at least one (some, ∃) relationship with an instance of a specific class.
Example 1. BladderCancerPapillaryCarcinoma is a Cancer which is not manifested by Adenoma. We can define this class as the combination of the following definitions:
not Adenoma, denoted as ¬ Adenoma
not BladderCancerPapillaryCarcinoma and (manifestedBy some Adenoma) denoted in OWL as ¬ BladderCancerPapillaryCarcinoma ⊓ (manifestedBy ∃ Adenoma)
In addition, we can use the OWL universal restriction (only, ∀) to describe classes of individuals that, for a given relationship, are related only to individuals of a specific class. An example is the class of individuals that are related through hasAnatomy only to individuals belonging to the bladder, denoted as BladderAnatomy.
Example 2. BladderCancer has Anatomy only, the DetrusorMuscle and Adventitia, which is only, a part of some BladderWall or Peritoneum.
BladderCancer hasAnatomy only (DetrusorMuscle and Adventitia only (partOf some (BladderWall or Peritoneum)))
BladderCancer ⊓ hasAnatomy ∀ (DetrusorMuscle ⊓ Adventitia ∀ (partOf ∃ (BladderWall ⊔ Peritoneum)))
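The difference between the existential (some) and universal (only) restrictions can be illustrated with a small set-based check. This is a closed-world simplification with hypothetical individuals and helper names; an OWL reasoner works under the open-world assumption and would not conclude "only" from the mere absence of other asserted relations.

```python
# Class membership and property assertions for a few hypothetical individuals.
members = {
    "DetrusorMuscle": {"detrusor1"},
    "Adventitia": {"adventitia1"},
    "Adenoma": set(),
}
relations = {
    ("tumor1", "hasAnatomy"): {"detrusor1"},
    ("tumor2", "hasAnatomy"): {"detrusor1", "adventitia1"},
}


def some(individual: str, prop: str, cls: str) -> bool:
    """Existential restriction: at least one prop-successor belongs to cls."""
    return any(t in members[cls] for t in relations.get((individual, prop), ()))


def only(individual: str, prop: str, cls: str) -> bool:
    """Universal restriction: every prop-successor belongs to cls (vacuously true if none)."""
    return all(t in members[cls] for t in relations.get((individual, prop), ()))


print(some("tumor2", "hasAnatomy", "DetrusorMuscle"))  # one successor is a detrusor muscle
print(only("tumor2", "hasAnatomy", "DetrusorMuscle"))  # but not all of them are
```

Note that `only` holds vacuously for an individual with no successors at all, which mirrors the semantics of the universal restriction and is a common source of modelling surprises.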
We have also used many features of OWL that can be found in the World Wide Web Consortium reports and Jain et al. [62,63]. These include enclosure, intersection, union, inverse, and equivalence (of classes and properties). All these elements were created to identify the quantitative assessment model and predictions generated by our knowledge-based approach.
3.5. A Semantic Rule-Base for Decision Support in BC Treatment Selection
Based on the given conceptualized domain vocabulary (as provided by our OWL2 ontology in the previous section) and an extended syntax with logic-based descriptions, we can produce sets of decision support rules using the semantic web rule language (SWRL) [64]. As an OWL-DL based language combined with the Horn-like logic rules of the rule markup language (RuleML), SWRL is used for developing rule-based approaches. The rule assessment and firing tasks were passed to the rule engine (Pellet 2) [65]. This engine played the role of a deductive system which, starting from formulas of the language chosen as premises (axioms already represented in the ontology), made it possible to construct new formulas in the form of new premises to be added to the ontology. The knowledge newly generated by the rule engine is offered to medical practitioners as decision support guidelines. The set of rules is run in a query-like mode of knowledge reasoning and retrieval. In this sense, a condition that is tested by the first part of a rule is true if its query is valid. For example, “All patients who are taking external beam radiation therapies are at risk” can be written as:
where x and y represent some given patient and health risk, respectively.
We could then deduce “there are radiation therapies that are presenting a health risk to the patient” which could be recorded in the OWL ontology formalism as:
∃ x RadiationTherapy (x) ∧ atRiskOf (x,y)
This could be recorded in the ontology thanks to the generalization hierarchy, by minimizing differences between classes by way of extracting their common characteristics. The combination represents a specialization of a superclass with a multileveled hierarchy using inheritance.
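The deductive behaviour of the rule engine described in this section (premises in, new premises out, repeated until nothing new is derived) can be sketched as a naive forward-chaining loop. This is not how Pellet is implemented; the fact encoding, the rule function and the individual names are hypothetical and serve only to illustrate the radiation therapy example.

```python
def apply_rules(facts: set, rules: list) -> set:
    """Fire every rule repeatedly until no new fact is produced (fixed point)."""
    inferred = set(facts)
    changed = True
    while changed:
        changed = False
        for rule in rules:
            new_facts = rule(inferred) - inferred
            if new_facts:
                inferred |= new_facts
                changed = True
    return inferred


def radiation_risk_rule(facts: set) -> set:
    """Patient on external beam radiation therapy -> patient is at risk."""
    return {
        ("atRiskOf", subject, "RadiationRelatedRisk")
        for predicate, subject, obj in facts
        if predicate == "hasTreatment" and obj == "ExternalBeamRadiationTherapy"
    }


# Hypothetical asserted facts, encoded as (predicate, subject, object).
facts = {
    ("type", "PMIBC001", "Patient"),
    ("hasTreatment", "PMIBC001", "ExternalBeamRadiationTherapy"),
    ("type", "PNMIBC002", "Patient"),
    ("hasTreatment", "PNMIBC002", "TURBT"),
}

closure = apply_rules(facts, [radiation_risk_rule])
print(sorted(f for f in closure - facts))
```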
To ask about and determine cases and possible undesirable events according to the defined evidence, we used semantic query-enhanced web rule language (SQWRL) queries. As a SWRL-based query language, SQWRL queries OWL ontologies and affords SQL-like services to model the retrieved knowledge [66]. The following example shows a query we used in our ontology:
Patient (?P) ∧ hasStage (?P, ?S) ∧ swrlb:greaterThan (?S, 1) ∧ hasSideEffect (?P, ?SE) ∧ hasStandardReference (?SE, ?SR) ∧ hasTreatment (?P, RadiationTherapy) → sqwrl:select (?P, ?SE) ∧ sqwrl:select (?SR)
This query aims to detect all BC SEs within an advanced stage (greater than 1) that are associated with a given patient/treatment combination when the treatment is radiation therapy, as shown in Table 4. This result is also supported by the afforded standard reference.
Patient (?P) ∧ hasStage (?P, ?S) ∧ swrlb:greaterThan (?S, 1) ∧ hasSeverityGrade (?SE, ?SG) ∧ hasSideEffect (?P, ?SE) ∧ hasTreatment (?P, RadiationTherapy) → sqwrl:select (?P, ?SG) ∧ sqwrl:sum (?SG) ∧ sqwrl:avg (?SG)
In this query, we added an SE severity grade criterion to detect the severity grade of each predicted SE related to a BC patient treated with radiation therapy (stage > I) (Table 4). Since hasSeverityGrade (SG) is a datatype property of the class RiskSideEffect, we can use this semantic relationship to relate each SE to its own severity grade. Moreover, this query gives the total score and the average value of the obtained severity grades using the SQWRL aggregation operators sum and avg. Results are shown in Table 5. Severity grades are ranked from 0 to 5 according to the CTCAE of the cancer therapy evaluation program (CTEP).
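The selection and aggregation performed by the second query can be mirrored over plain Python records, as in the sketch below. The patient records, field names and grades are hypothetical illustration data, not the contents of Tables 4 and 5, and the function is an analogue of the query logic rather than a SQWRL implementation.

```python
from statistics import mean

# Hypothetical patient records standing in for ontology individuals.
patients = [
    {
        "id": "PMIBC001",
        "stage": 2,
        "treatment": "RadiationTherapy",
        "side_effects": [
            {"name": "Anemia", "grade": 2},
            {"name": "Blistering", "grade": 2},
        ],
    },
    {
        "id": "PNMIBC002",
        "stage": 1,
        "treatment": "RadiationTherapy",
        "side_effects": [{"name": "Fatigue", "grade": 1}],
    },
]


def severity_query(records: list, treatment: str, min_stage: int = 1):
    """Select (patient, SE, grade) rows for stage > min_stage, then sum and average the grades."""
    rows = [
        (p["id"], se["name"], se["grade"])
        for p in records
        if p["stage"] > min_stage and p["treatment"] == treatment
        for se in p["side_effects"]
    ]
    grades = [grade for _, _, grade in rows]
    total = sum(grades)
    average = mean(grades) if grades else None
    return rows, total, average


rows, total, average = severity_query(patients, "RadiationTherapy")
for row in rows:
    print(row)
print("sum:", total, "avg:", average)
```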
Here, the result shows a severity grade of 2, which refers to moderate AEs related to the indicated intervention. The strategic approach adopted for the design and development of our ontology is detailed in Figure 3.
Both SWRL and SQWRL queries rely on OWL inferences since they were both built primarily on OWL-DL. The complementarity of these languages therefore made it easier to reach our objective. We relied on Protégé, version 3.5, as the ontology and rule-base editor [67]. It is a development platform and environment that we used to build and manage our ontology, with tools for the construction of oncology and oncotherapy conceptual models. Clinicians and patients are the potential users of this ontology. Many plugins helped us to accomplish this task and to enhance the performance of our ontology while editing the main semantics and test cases for our predictive rules.
4. Results
To describe our ontology of BC treatment risks and SEs, we rely on the graph and figures presented in Figure 4. These describe the composition of the ontology in terms of various structured components, including:
- 6 super-classes targeting BC, treatments, procedures, risks, and evidence, each of which contains a large number of hierarchical subclasses (all types of examinations and existing oncotherapy techniques) linked to instances describing concrete objects about BC cases: 42 subclasses at the second level, 60 at the third level, 198 at the fourth level, 251 at the fifth level, and 284 at the sixth level. Concepts are classified by types and families of medical examination techniques.
- 80 object properties between classes and 176 datatype properties between classes and instances, indicating the values and the parameters related to the examinations performed.
- 1825 instances with actual objects of knowledge: 35% of these instances are data, and 65% are presented as the finest values of knowledge and evidence. Here, we define data as elements of analysis, while knowledge is the synthesis of evidence and information flow that presents data with a context.
- 621 different rules for checking the SQWRL and SWRL risk and SE identification queries. These tests take as parameters the type of examination in the treatment procedure as well as the probable risks, with selective results of examinations.
The validation of our ontology is an essential step. We followed specific criteria according to which the ontology had to be validated in terms of consistency, taxonomy, and inference (Figure 5). This checking validates the SWRL rules, which are based on valid relationships, to predict and detect SEs from the given prediction criteria (as mentioned in Section 3.3).
Furthermore, formal correctness was evaluated according to criteria such as disjunction errors, which aimed to identify a class as a conjunction of distinct classes. We also checked consistency and coherence to verify the accuracy and the semantic and syntactic representation of BC treatment and SE knowledge without contradictory conclusions. Moreover, we checked duplication errors to remove redundant elements that can be deduced from the others. Completeness was also evaluated to measure the conformance and compatibility of the ontology with our domain model. This criterion was based on covering all elements and terms related to BC treatments by proving the incompleteness of elements to check the completeness of the ontology, as mentioned by Gómez et al. [68,69]. To do this, we used Pellet, a Java-based OWL 2 DL reasoner [70].
The semantic reasoner and the SWRL rules are the constructive elements of the system’s rule engine within our model. In practice, the SWRLTab interface plugin allows the creation and management of our SWRL rules and SQWRL queries (SQL-like query features). A sample of these rules in graphical format is shown in Figure 6.
To reason about a prescribed treatment and anticipate SEs and possible complications, the model is performed within a clinical administration unit. When the healthcare provider supplies the queries by the required predefined criteria through an application programming interface (API), the model requests its query processor. Accordingly, the model checks the relations between SEs, complications, and treatments with reference to guidelines and elements identified in the edited SQWRL and SWRL rules. The model manages the interactions of both KB and user. Therefore, the model generates a detailed decision about the treatment SEs, its severity grade and the required reference standard. Then, results are displayed on the user interface as an output of the model.
JessTab is used in Protégé to perform the querying tasks. The key forms of performance for retrieving knowledge from the ontology are SWRL and SQWRL queries. SWRL rules are launched in the Jess inference engine after the class consistency checking. The results and the validation process of the Jess inference tab are shown in Figure 7.
The obtained outputs were considered as new evidence to supply the model by being stored in the properties of class instances.
As shown in Figure 9, the instance PMIBC001 of the class Patient, who has a stage 2 MIBC and radiation therapy as a planned treatment, is predicted to have anemia, blistering, etc., as listed in Figure 9, with a severity grade of 2. This is the output of the SWRL rule that we defined previously, including all the factors for predicting treatment SEs. The obtained outcome is stored in our model to supply our ontology for future reasoning (Figure 9). Before rule processing, the content of the SEs, severity grade and standard reference was empty; it was then filled in by the inferred prediction results.
Thus, we customized an easily accessible Java Swing-based user interface API to let healthcare providers and clinicians anticipate AEs of a prescribed BC treatment and to support their decision making. This provided options to query our ontology model directly and to predict BC treatment SE elements according to the required criteria. Inspired by the work of Kayes et al. [71], we present, in Table 6, examples of operator-defined reasoning rules from our rule-base, followed by a descriptive simplified conceptual model of our ontology with details of the Pathology concept, as shown in Figure 10.
Based on the previously published RCTs in the literature, the performed model resulted in elements constructing prediction conclusions that were displayed on the user interface for consideration by the clinicians (Figure 11).
When the prediction inference is processed, the newly registered patient (e.g., PMIBC001) within the Patient class of the model is tagged, and the patient’s demographic (e.g., age, gender) and biophysical (e.g., medical history, BC type and stage) data are imported into the queries for the required reasoning. Launching the reasoning process through the interface involved patient identification, selection of the required treatment from the imported list of treatments prescribed for the patient’s whole therapy (including procedure details and treatment dose), and retrieval of the BC stage. All the data, knowledge, rules, and queries used were retrieved automatically from our knowledge model. The output of the Java-based API is displayed on the user interface as treatment SE prediction information (AEs and severity grade) with reference to the prediction knowledge standard references and guidelines (related to the obtained SE severity grade), as shown in the validation example of Figure 11. A decision is also displayed within the interface, importing the content of the instance (Decision_001) generated by the model and stored in the KB.
For clinical practice, the proposed model is implemented using OWL, as described in the implementation and clinical application process presented in Figure 2. Furthermore, adopting the W3C-based web service activity ensures interoperability and service communication. This will aid in the identification of the desired and most effective treatment, as well as the restrictions that must be adhered to.
This helped to verify the absence of contradictions between ontological elements, in addition to the conceptual matching between them. We also checked the completeness, ensuring that all ontological elements are either explicitly declared or inferable. In our ontology all defined elements obey the principle of concision. Furthermore, we can add new knowledge without changing the old ones originally specified in our ontology, which satisfies the criteria of extensibility which an effective conceptual model should hold.
After formal validation and the evaluation of inconsistencies, we turned to the validation of the domain conceptualization. Experts’ feedback was an important step in validating and modifying the ontology. This step allowed us to confirm the accuracy and correctness of our knowledge with object-oriented explanations. Positive feedback had no impact on our knowledge model, but it served as a valid confirmation. Only feedback with domain conceptual notes was considered when making updates and modifications using RDFS rules. To help in the validation task, we used factual questions to improve and enrich our ontology. Feedback showed a rate of 0.48% (compared with our ontology composition: 2933 elements), representing 14 elements to add as knowledge representation items (concepts, relationships, and individuals) and logical chaining suggestions. Moreover, Boolean-like questions about BC treatments relating SEs to risk indicators helped in correcting and updating our knowledge model. This survey resulted in nine questions with “no” answers from a total of 210 Boolean questions, representing a rate of 18.90%. Modifications and suggestions represented 19.38%. Hence, this step showed a total satisfaction rate of 80.62%.
For the calculation of decision-making markers of treatment effects, we took the following example: in a stage I BC patient, in whom the tumor has spread to the connective tissue layer of the bladder but has not reached the muscle layer, the tumor is removed, and then intravesical BCG (immunotherapy) or mitomycin (chemotherapy) is delivered. Table 7 shows the AE results of an RCT that compares the intravesical BCG treatment to the mitomycin treatment.
A major efficacy prediction endpoint is the life-threatening AE rate, which was higher in intravesical BCG subjects (2.8%) than in mitomycin subjects (0.7%). The RR of a life-threatening AE for mitomycin relative to intravesical BCG is 0.007/0.028 = 0.25 (25%). The RRR is 1 − 0.25 = 0.75, which means that there is a 75% decreased risk of life-threatening AE in patients receiving the mitomycin treatment compared to those receiving intravesical BCG. The ARR is 0.028 − 0.007 = 0.021 (2.1%). The NNT is 1/0.021 = 47.6 (≈48), which means that 48 patients need to receive mitomycin instead of intravesical BCG for one patient to avoid a life-threatening AE.
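The arithmetic of this example can be reproduced with a short sketch. The function name `benefit_measures` and its parameter names are assumptions made for illustration; only the two event rates (2.8% and 0.7%) come from the text.

```python
import math


def benefit_measures(reference_rate, comparison_rate):
    """Compute RR, RRR, ARR and NNT when the comparison arm has fewer events.

    reference_rate:  event rate in the reference arm (here intravesical BCG)
    comparison_rate: event rate in the comparison arm (here mitomycin)
    """
    if not 0 < comparison_rate < reference_rate:
        raise ValueError("expected 0 < comparison_rate < reference_rate")
    rr = comparison_rate / reference_rate    # relative risk
    rrr = 1 - rr                             # relative risk reduction
    arr = reference_rate - comparison_rate   # absolute risk reduction
    nnt = math.ceil(1 / arr)                 # number needed to treat, rounded up
    return {"RR": rr, "RRR": rrr, "ARR": arr, "NNT": nnt}


measures = benefit_measures(reference_rate=0.028, comparison_rate=0.007)
for name, value in measures.items():
    print(name, round(value, 3))
```

The NNT is rounded up to a whole number of patients, which matches the rounding of 47.6 to 48 used in the text.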
For the prediction and comparison of treatment SE measures, we take the example of peripheral neuropathy. Based on the data in Table 7, we find that mitomycin treatment subjects (4.5%) had a significantly higher rate of peripheral neuropathy than intravesical BCG subjects (2.8%). The associated RR for peripheral neuropathy is 0.045/0.028 = 1.6. The RRI is 1.6 − 1 = 0.6 (60%), which means that the rate of peripheral neuropathy occurrence is 60% higher in subjects receiving mitomycin. The ARI for mitomycin treatment subjects relative to intravesical BCG is the difference between the rates of peripheral neuropathy: 0.045 − 0.028 = 0.017 (1.7%). The NNH is 1/0.017 = 58.8, which means that 59 subjects need to receive mitomycin to arrive at one more case of peripheral neuropathy.
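The harm measures of this example can be checked in the same way. The function name `harm_measures` and its parameters are assumptions made for illustration; the rates (4.5% and 2.8%) are those quoted from Table 7. Without intermediate rounding, the RR is about 1.607 and the RRI about 0.607; the text rounds the RR to 1.6 before subtracting, which gives 0.6.

```python
import math


def harm_measures(reference_rate, comparison_rate):
    """Compute RR, RRI, ARI and NNH when the comparison arm has more events.

    reference_rate:  event rate in the reference arm (here intravesical BCG)
    comparison_rate: event rate in the comparison arm (here mitomycin)
    """
    if not 0 < reference_rate < comparison_rate:
        raise ValueError("expected 0 < reference_rate < comparison_rate")
    rr = comparison_rate / reference_rate    # relative risk
    rri = rr - 1                             # relative risk increase
    ari = comparison_rate - reference_rate   # absolute risk increase
    nnh = math.ceil(1 / ari)                 # number needed to harm, rounded up
    return {"RR": rr, "RRI": rri, "ARI": ari, "NNH": nnh}


measures = harm_measures(reference_rate=0.028, comparison_rate=0.045)
for name, value in measures.items():
    print(name, round(value, 3))
```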
To evaluate our approach, we performed tests on 110 anonymized BC cases collected from different sources. Of these, 67 open BC stories were found to be useful and informative, as they include details about specific BC treatment journeys and witnessed SEs. These type 1 cases were extracted from sources such as the Urology Care Foundation and American Urological Association guides and patients’ stories [72], Fox Chase Cancer Center Health [73], Temple Health [74], BC Advocacy Network (BCAN) [75,76], Action BC UK and Patient Resource Publishing [77]. In addition, we obtained 43 more cases from reviewed research papers, including pre- and post-treatment information and clinical details collected from studies involving BC patients. These type 2 cases were found in 43 papers among our 93 selected studies, as described in the “Materials and Methods” section. From each paper, we chose one case that meets the testing criteria (treatment type, cancer type and effects).
The number of these cases is considered adequate, given the type of BC related to each case, the age, sex, weight and performed treatments (as detailed within patients’ treatment protocol stories, including procedures and delivered therapeutic doses), along with some behavioral factors mentioned in the descriptions, such as post/perioperative smoking, alcohol drinking and chronic diseases [78]. To cover most possible scenarios of our ontology-aided BC treatment SE prediction, we referred to BC types, to each of which a sample of cases was assigned. We describe these types as non-invasive bladder cancer (NIBC), which covers non-invasive papillary carcinoma (stage 0a) and carcinoma in situ (Cis) (stage 0is), non-muscle invasive bladder cancer (NMIBC) (stage I) and muscle invasive bladder cancer (MIBC) (stages > I). The tested scenarios were processed as described in Table 8.
These cases were carefully collected and formed the core of the KB upon which our ontology was created. The queries that were tested and executed by the Pellet2 rule engine [79] focused on inferring the possible SEs related to each type of treatment administered for the considered BC cases, in order to predict its impact on patient health and safety. Each treatment type was tested by a query. The results included the possible SEs related to each type of treatment, together with a reference to the patient identifier (as mentioned in Table 4). An average severity grade score was also derived for each predicted SE using semantic queries (as mentioned in Section 3.5). In Table 9, we present the resulting average severity grade score per sample. We then compared our predicted SEs to the AEs reported in patients’ descriptions and stories; this comparison was performed manually. For example, as described in Table 9, the patient represented by sample S1, diagnosed with low grade NIBC (stage 0a), was predicted to develop mild SEs (SG = 1) considering all the indicators that we defined previously. This patient’s prediction results showed SEs such as pain when urinating, low grade bladder infection, hematuria, and incontinence when applying TURBT treatment, and bladder irritation and a burning feeling in the bladder when receiving post-TURBT intravesical chemotherapy (mitomycin). However, the reported AEs showed that the patient developed pain when urinating, hematuria, and low grade bladder infection, but no incontinence, as post-TURBT SEs. The real SEs for post-intravesical chemotherapy were bladder irritation, a burning feeling in the bladder, loss of appetite and insomnia. As a result, our reasoning predicted 75% of AEs for the post-TURBT procedure and 80% for post-intravesical chemotherapy. This means that, for this case, the results were compatible at 77%.
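The comparison between predicted and reported SEs, performed manually in this study, could be partly supported by a set comparison such as the sketch below. The scoring formula is not stated in the text, so the overlap ratio used here (shared SEs divided by all distinct SEs, predicted or reported) is an assumption; it was chosen because it reproduces the 75% figure given for the post-TURBT procedure of sample S1, and it is not claimed to be the formula behind the other percentages. The function name `compare_side_effects` is hypothetical.

```python
def compare_side_effects(predicted, reported):
    """Compare predicted and reported SEs for one treatment of one patient.

    Returns the overlap ratio (assumed scoring rule, see the lead-in), the SEs
    that were predicted but not reported, and those reported but not predicted.
    """
    predicted, reported = set(predicted), set(reported)
    union = predicted | reported
    overlap = len(predicted & reported) / len(union) if union else 1.0
    return {
        "overlap": overlap,
        "predicted_not_reported": sorted(predicted - reported),
        "reported_not_predicted": sorted(reported - predicted),
    }


# Post-TURBT SEs of sample S1, as listed in the text.
predicted_turbt = ["pain when urinating", "low grade bladder infection",
                   "hematuria", "incontinence"]
reported_turbt = ["pain when urinating", "hematuria",
                  "low grade bladder infection"]

print(compare_side_effects(predicted_turbt, reported_turbt))
```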
5. Discussion
Thanks to the capability of OWL’s terminology, in particular its expressiveness, our ontology’s concepts have been represented in a syntax as close as possible to natural language. Composite concepts were classified according to their semantics. Hence, the initial semantics were preserved by the proposed formal representation.
In general, our approach predicted 80.3% of the BC treatment SEs that were actually reported. These results also showed that the more advanced the stage of BC, the more complex the treatment protocol becomes and the more serious the SEs it presents. Moreover, the results obtained from our ontology prediction approach were close to the real AEs recorded within the collected test samples.
To assess heterogeneity, we considered a subset of our records: four cross-sectional studies [80,81,82,83] that reported on the overall cystitis AEs and complications, allowing for different time intervals relative to the treatment of patients with high-risk T1G3 BC (TURBT-Radiation therapy versus TURBT-BCG Immunotherapy). We excluded all BC types and stages other than T1G3 (cross-sectional studies).
Cystitis occurred in 24.5% of patients who received TURBT-Radiation therapy compared with 16.8% of patients who received TURBT-BCG Immunotherapy (controls). According to Figure 12 (df: degrees of freedom; M-H: Mantel-Haenszel), the meta-analysis showed that the overall cystitis AEs and complications relative to TURBT-Radiation therapy versus TURBT-BCG Immunotherapy ranged from 2% to 27%. This showed a significantly higher cystitis SE rate with TURBT-Radiation therapy than with TURBT-BCG Immunotherapy (rate difference 11%; 95% CI: 2–27%; p = 0.03; I² = 81%; χ² = 16.13).
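The reported heterogeneity statistic can be related to the χ² value through the usual definition I² = max(0, (Q − df)/Q) × 100, where Q is the χ² heterogeneity statistic. The sketch below assumes df = 3, i.e., the number of pooled studies (four) minus one; this value is not stated explicitly in the text and should be checked against Figure 12. The function name `i_squared` is chosen for illustration.

```python
def i_squared(q, df):
    """Return the I² heterogeneity statistic (in percent) from Q (chi-squared) and df."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# chi-squared = 16.13 as reported; df = 3 is an assumption (four studies minus one).
print(round(i_squared(16.13, 3), 1))
```

Under this assumption the result is approximately 81%, consistent with the value reported above.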
Our evidence-based reasoning approach, combined with the semantic KB model, helped to generate predictions related to possible patient health issues during and even before treatment application. Integrating a diversity of knowledge and evidence into a single KB and ontology has improved the process of predicting treatment risks and the SEs associated with oncotherapy in BC. Furthermore, improving and adding more inheritance edges between concepts helped to obtain better prediction accuracy. This makes it more robust, especially in the BC domain, involving highly complex and specialized knowledge and semantics.
With reference to previous studies, our approach highlighted digital SE prediction through information technology to optimize oncotherapy treatment processes in BC oncology. Besides the knowledge representation method, we adopted a crawler-based knowledge gathering method to overcome the difficulties encountered with computerized patient records or knowledge management in clinical processes, as reported in the studies of Masters et al. and Manika et al. This helped in monitoring most of the useful digital resources and covered more specific reports to enhance the validation of our results. Our ontological approach combined conceptual semantics with the fundamentals of probabilistic models and Bayesian networks, instead of using them separately as reported in previous studies, which revealed some major fuzziness and inaccuracies in results. Our results represented treatments with their detailed consequences and explanations, which enhanced their credibility as compared to patient-reported SEs. Identifying patients at high risk of BC treatment SEs through a semantic-based prediction method included medical proof and evidence. Compared to the studies of Brooks et al. and Tramèr et al., which only abstracted risk factors about patients and used statistical models for prediction, we found that predictions made for categories or groups (classes) can suit individuals belonging to the group. The approach also relies on past treatment outcomes as well as the latest medical research published in scientific journals and databases to uncover previously unsuspected predictive associations. This adds to the conclusions of Malterud et al. and Hang et al. in their studies of rule-based and EBM approaches. Our approach incorporates various personal factors and improves patient safety by predicting SEs that can be hidden for some patients with specific factors.
Our approach could be extended in the future to cover other cancers and help in solving theoretical and technical problems, such as the real-time procedural confusions and difficulties that oncologists can face within medical protocols and standards for clinical pathways and treatment provision.
Our study findings could influence decisions and clinical practices at many levels of BC treatment. Clinicians can apply the presented methods of predicting and estimating BC treatment SEs that are commonly reported in published studies. Furthermore, measures of treatment SEs can be calculated by clinicians themselves for use in clinical practice, even when these indicators are not directly provided by the system. As a result, both patients and clinicians can predict and interpret the results. Following this assessment, users can enter these measures of treatment AEs into the system to extend its KB, thus contributing to generating clinical decisions for future medical cases.
Our knowledge-based model can be easily deployed within clinical information systems (CIS), communicating with many unit information systems. Combining our findings with electronic health record (EHR) information allows access to evidence-based tools that providers can use to make decisions about a patient’s care. Information about patient treatment and SEs embedded in a clinical document architecture (CDA) [84] document, a health level 7 (HL7 V.3) [85] standard, within an EHR can communicate easily with our OWL mapping model to supply our ontology with information. Thus, our semantic decision rules are fired to predict and decide about treatments’ SEs. Moreover, within a decision support system (DSS), our findings improve patient safety and health quality through computerized alerts that prompt clinicians regarding possible treatment SEs and their severity grades. This helps in better optimizing treatment management plans in a risk-aware manner. Our approach serves as a supplementary source of proof for evidence-based practice. This includes the integration of available evidence, clinical expertise, and health policy decision-making. Furthermore, clinicians can agree or disagree with the system’s output in relation to a treatment of their choice and recommend or rate the decision for future use. If they disagree, they can override the decision by anonymously providing a justification from their own previous experience. Security issues should also be studied in the future, following the approach that data and documents should normally be shared on a health information system (HIS).
Using a knowledge-based approach provided the study with an extensive KB. This allows healthcare providers to shape a strategy for working with the patient based on trusted, credible resources and to improve both focus and precision within healthcare practice. Thus, clinicians with a knowledge-based background have a considerable competitive edge. As a main feature of our approach, heterogeneous data inclusiveness enables the support and reuse of the semantics when building decisions. This is essential to prepare the environment to host evidence-based practice knowledge in an extensible way and to integrate additional utilities in the future. It is important to model the flow of information necessary to simulate a cancer treatment case and to allow reasoning about possible undesirable effects and consequences. It is also reassuring to link the system generating results to the scientific references and publications from which the generated knowledge was retrieved. This approach shows how to benefit from previous research studies and how to consider their outcomes as historical knowledge (how to benefit from past clinical and research experience to enhance patient experience in an actual clinical context). In comparison with data-driven approaches, there is less dependence on human participation and primary research. In our context, we did not have to deal with occurrences of data incompleteness. It is a fact that the outcome of data-driven studies is usually affected by an insufficient amount of data. A knowledge-based model can predict effects and future events more effectively, and with high prediction accuracy, than data-driven models [86]. Moreover, calculating measures of treatment effect and providing clinicians with SEs’ severity grades empower clinical practice. This also helps clinicians to set a solid strategy for AE management when referring to our prediction results, including the knowledge standard references and guidelines provided by the TreatmentSEStandard class, as defined in our ontology.
Despite our rigorous methodology, our approach only supports clinical decisions and does not afford an ultimate commitment regarding clinical trials. The inherent limitations of the included studies prevented us from reaching definitive conclusions. Statistical results are also influenced by patient historical backgrounds, biological mutations, and clinical analysis methods. The community-based transfer of outcomes may also need more effort, because of the extended range of studies and their high-volume of organization. Through our feasibility study, we are looking to test other types of cancer in the future, to cover both cancerology and oncology disciplines. The feeding of data into the ontology was only partially automated, which makes human intervention necessary. In addition, we would like to optimize our method of risk severity evaluation. In fact, treatment SEs and AEs cannot be adequately characterized using a single measure. Along with measures of treatment SEs, it is also recommended to report the standard of care and the control event rate (CER) [87] as a standard rate. We are looking forward to extending our work with a data-driven approach in which tests will use data extracted directly from electronic health records and cancer registries. Thus, integrating the work with a hospital information system and a real electronic health record management system will require adoption of the Fast Healthcare Interoperability Resources (FHIR) [88] standard data model.