The Canon Group's Effort: Working Toward a Merged Model

C. Friedman; S. M. Huff; W. R. Hersh; E. Pattison-Gordon; J. J. Cimino

Downloaded from jamia.bmj.com on July 15, 2011 - Published by group.bmj.com

Journal of the American Medical Informatics Association, Volume 2, Number 1, Jan/Feb 1995

CAROL FRIEDMAN, PHD, STANLEY M. HUFF, MD, WILLIAM R. HERSH, MD, EDWARD PATTISON-GORDON, MS, JAMES J. CIMINO, MD

Abstract

Objective: To develop a representational schema for clinical data for use in exchanging data and applications, using a collaborative approach.

Design: Representational models for clinical radiology were independently developed manually by several Canon Group members who had diverse application interests, using sample reports. These models were merged into one common model through an iterative process by means of workshops, meetings, and electronic mail.

Results: A core merged model for radiologic findings present in a set of reports that subsumed the models that were developed independently.

Conclusions: The Canon Group's modeling effort focused on a collaborative approach to developing a representational schema for clinical concepts, using chest radiography reports as the initial experiment. This effort resulted in a core model that represents a consensus. Further efforts in modeling will extend the representational coverage and will also address issues such as scalability, automation, evaluation, and support of the collaborative effort.

J Am Med Informatics Assoc. 1995;2:4-18.

Affiliations of the authors: Queens College of the City University of New York, Department of Computer Science (CF), and Columbia University, Center for Medical Informatics (CF, JJC), New York, NY; University of Utah, Salt Lake City, UT (SMH); Oregon Health Sciences University, Portland, OR (WRH); and Brigham and Women's Hospital, Boston, MA (EP-G).

Correspondence and reprints: Carol Friedman, PhD, Queens College of CUNY, Computer Science Department, 65-30 Kissena Boulevard, Flushing, NY 11367. e-mail: friedma@cucis.cis.columbia.edu

Received for publication: 7/23/94; accepted for publication: 8/30/94.

One point of general agreement in the medical informatics community is that no common representation for medical information now exists that is generally accepted for use across clinical systems. A possible approach to addressing this need is through collaborative efforts. Several collaborators might effectively divide the labor of modeling a large domain. More importantly, collaborators working in a variety of application areas and representing a diversity of motivations should produce a broader, more useful representation. Well-defined methods for collaborative development have yet to be described. Simply combining contributions may lead to misunderstandings, redundancies, or ambiguities. "Design by committee" may produce results that end up satisfying no one.1 The collaborative development of objective, reproducible methods for representing medical information would be a valuable contribution to the field.

The Canon Group was formed by a group of medical informatics researchers with the common interest of addressing the scientific and technical issues surrounding medical vocabularies,2 such as the need for coherent conceptual representation across applications and subject domains. A primary tenet of this group is that terms in controlled medical vocabularies should correspond to concepts that have their meanings made explicit through a deep representational structure3 that may have a variety of uses. Five important aspects associated with an adequate representational structure were identified by the Group: controlled vocabulary, typology, concept model, notation, and granularity. We describe an initial effort in the development of such a structure.

Another tenet of the Canon Group is that the establishment of a broad collaborative effort is necessary to achieve a sharable representational structure and controlled vocabulary. Additionally, because many collaborators are involved, it is especially necessary to delineate and establish procedures, methods, and tools to support the collaborative effort. A recent editorial, which appeared in this journal, challenged the Canon Group to demonstrate the collaborative approach to controlled vocabulary construction.4 The construction of a sharable representational model for chest radiography report findings is the first experiment in testing the above hypotheses.

Background

The broad goal of the Canon Group, the establishment of a basis for the canonical representation of medical concepts, has been previously described in this journal.2 The need for such a representation is based on increasing demands for computerized medical information and automated health care systems. The quality of the information and the health care systems is closely related to the quality of the medical-concept representation. Additionally, a common representational model would facilitate the sharing of applications and data among health care institutions. For example, if clinical data had the same representation, it would be possible to collect specified data from different institutions in order to obtain a large amount of data associated with a broad patient population; these data could then be used to perform medical research studies. Even if the different institutions retained their own individual representations, the data could still be shared if they were mapped into the representational form of the common model.

One of the first steps toward establishment of a canonical representation of medical concepts was a meeting of the Canon Group for the purposes of discussing requirements for a representational model and gaining a broader perspective concerning the modeling of medical language. The following tasks were agreed on to initiate the Canon Group's effort:

- An initial domain consisting of chest radiography reports was chosen.
- Text from this domain was obtained from four sites to serve as a source for automated text processing.
- Eighteen reports were chosen for the purposes of information modeling and detailed comparison of each group's approach.
- A meeting was arranged in January 1993, in Harriman, New York, where different participants presented their models and demonstrated how they would represent the medical information in the chosen reports.

The meeting did not result in agreement on methods or on details of the model. However, a better understanding of the different models and an initial attempt to develop a "merged" model by combining desirable features of the different models did emerge. Although a merged model was not achieved due to time constraints, enough groundwork was covered so that it was possible to begin development of an experimental, yet tangible, model resulting from a collaborative effort.

The Individual Models

Models simplify reality by ignoring certain details of a system in order to focus attention on aspects of the world that suit the purpose of the modeler. The structure and content of a model are driven by the purpose of the model, and models of a given domain will typically converge to the degree that the underlying purposes of the models converge. In the case of the models presented at the Harriman retreat, there are differences relating to the purposes (requirements) of the models. For instance, models such as MedSORT,5 MOSE,6 and Galen7,8 emphasize knowledge-intensive models of medical concepts. MOSE and Galen both aim to define an extensible and application-independent framework that is suitable for building and integrating different terminologies. MedSORT and Galen aim to represent all (and only) valid medical concepts, and to reveal all implicit relationships associated with the concepts.
The Queens/Columbia model9 and the Utah model10 emphasize medical-data representation geared for decision support and natural-language applications. These models aim to represent data found in clinical events in a way that is convenient for access by database management systems and for use by clinical applications. The models from Stanford11 and Harvard12 emphasize the support of structured data collection. The Stanford model was based on formal logic because it is well suited for reasoning. The Carnegie Mellon/Oregon Health Sciences University (CMU/OHSU) model13 focuses on language normalization to allow translation across representations, from controlled vocabularies to natural-language text.

However, the individual models presented at the Harriman retreat also share common factors, and therefore it is possible to frame a discussion of the individual models around five highly interacting and overlapping themes: terminology (controlled vocabulary), typology, concept model, notation, and granularity.

Every model mentioned controlled vocabulary as a distinct component of the model. Cimino et al. have defined a set of requirements for a controlled vocabulary. These requirements help to define the functional role of the vocabulary in the Queens/Columbia model. The Galen group8 separates knowledge about terminology from knowledge about contents, an approach also taken by the Queens/Columbia9 and Harvard12 groups. Besides the recognition of vocabulary as a part of the model, there are similarities in the requirements and use of vocabularies among the models. In particular, several groups stated the need for definitions, the need to manage synonyms and homonyms, and the need for domain completeness and nonredundancy. The models expressed the idea that vocabulary terms are symbolic names for underlying medical concepts.
Additionally, the CMU/OHSU group13 expressed the need for molecular to atomic mappings (decomposition) and illustrated the utility of semantically typing the concepts as part of the process of creating canonical representations. Similarly, the Queens/Columbia group15 expressed the need for compositional mappings that also specified the semantic relations of the atomic components.

A second common theme in most of the models is typology: the use of a semantic network or a hierarchy to organize the terms (representing concepts) into semantic domains that are then referenced within the models.

The third theme discussed at the Harriman retreat involves the way in which concepts are combined to make meaningful expressions of more complex and complete medical concepts. Every model used some formalism to express relationships between vocabulary terms (the names of concepts) to form more complex terms. The Utah group10 used a frame-oriented paradigm where the vocabulary elements were used as fillers of slots. The name of each slot corresponds to the semantic role that the vocabulary element plays in the model. For instance, the term lung is used to fill the slot body part and the term nodule is used to fill the slot radiology finding. However, the most common approach of the modelers was to treat the vocabulary items as nodes in a network and connect the nodes by links that were named relationships. Using this methodology, the radiology finding above would be expressed as nodule-has location-lung. In this example, the controlled vocabulary terms nodule and lung have been connected by the relationship has location. The second approach was used in all of the models except the model proposed by the Utah group,10 but it was noted during discussions that the frame-oriented representation could easily be converted to the named-relationship form.

The fourth theme is related to the notation used to represent the models.
The Utah group10 used frames and slots to describe the model, while the most popular mode of expression used was the conceptual graph (CG) notation,16 which was used by the Stanford, Columbia, and Harvard groups. The Galen group used the Semantic Meta Knowledge (SMK) notation associated with the Galen work,7 while the MOSE group described the notation used with the MOSE project.6 Basically, these different notations are all similar and convertible.

The final theme of the Harriman retreat relates to the granularity of terms. Specific differences in the models are attributable to differences in the granularity of concepts in the vocabulary, the granularity of the hierarchy, and the symbols used to represent the concepts. With regard to concept granularity, there were classic "lumpers" and "splitters." What lumpers expressed as a single concept, splitters would express as closely related simpler concepts. Lumpers preferred to think of hilar adenopathy as a single concept, whereas splitters preferred to think of adenopathy as the concept and hilum as a body part that is the location. Again, it is obvious that the two forms of expression are equivalent and interchangeable as long as the simpler terms are presented in the vocabulary and their relationship to the more complex terms is understood.

In the case of the granularity of the hierarchy, some models had a very fine network of semantic classes, whereas others had a coarser network. For example, those with a flatter semantic network preferred to classify modifier terms such as large, round, and white as children of one term, property, whereas those with a deeper hierarchy would create intermediate classes such as size, shape, and color that would then be classified as children of property and parents of large, round, and white, respectively.
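
The equivalence between the lumped and split forms can be illustrated with a small lookup table. The following Python sketch is only an illustration under stated assumptions: the table DECOMPOSITION and the functions split and lump are hypothetical names that do not belong to any of the models, and the relation names has-observation and has-location are borrowed from the merged model described later in this article.

```python
# Hypothetical decomposition table: each "lumped" term maps to named
# relationships over simpler terms. Only the hilar adenopathy example comes
# from the text; the data structure itself is an assumption.
DECOMPOSITION = {
    "hilar adenopathy": (
        ("has-observation", "adenopathy"),
        ("has-location", "hilum"),
    ),
}


def split(term):
    """Return the split form of a term.

    A term without a known decomposition is returned as a lone observation.
    """
    return DECOMPOSITION.get(term, (("has-observation", term),))


def lump(relations):
    """Return the lumped term whose decomposition matches, or None."""
    wanted = frozenset(relations)
    for term, parts in DECOMPOSITION.items():
        if frozenset(parts) == wanted:
            return term
    return None


if __name__ == "__main__":
    parts = split("hilar adenopathy")
    print(parts)
    print(lump(parts))
```

The round trip succeeds only when the simpler terms and their relationship to the complex term are recorded, which is the condition stated above for the two forms to be interchangeable.
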
Another example of a difference in granularity occurred when specifying synonymous terms. For some models enlarged cardiac silhouette was synonymous with enlarged heart, whereas for others these were similar but nonsynonymous concepts.

Another difference was associated with the actual symbolic names assigned to the concepts. For example, one group would use the name enlarged heart, whereas another would use cardiomegaly. This difference can be resolved by straightforward substitution, if there is a one-to-one mapping and a precise understanding of the underlying concept.

Two other interesting phenomena that caused some differences in the detailed models were noted. The differences were due to the ambiguous nature of the source material and to the task of mapping the relevant text of the reports to corresponding concepts in the model.

The clinical information was not always interpreted in the same way by different participants. Because the clinical information being modeled was obtained from natural-language reports, the expression of the underlying clinical information often contained ambiguities that were sometimes resolved differently by different participants. For example, the phrase increased paramediastinal opacity was interpreted by some participants as referring to a temporal concept denoting an increase in opacity over time, whereas others interpreted the phrase as referring to a degree concept denoting an above normal opacity. Because differences in interpretations of the actual reports were not part of the modeling exercise per se, it was frequently necessary to distinguish differences in interpretations from differences in the model itself. It was decided that for the modeling exercise, it was more critical to understand the interpretation being modeled than to decide which interpretation was correct.
Although differences in the interpretations of the reports were not particularly relevant to the modeling exercise, the test data supported the argument that free text is inherently ambiguous.

The second problem occurred when the textual expressions in the actual reports were mapped to the model. There was a tendency for the models to associate their symbolic terms with identical or similar surface forms. For example, the word cardiomegaly in the reports was typically associated with a concept called cardiomegaly in the models. This method of naming concepts was adequate when the word itself was unambiguous. However, when the word was ambiguous, the corresponding concept was likely to be ambiguous also. For example, the word increased frequently occurred in the reports, but had at least two different meanings. Therefore, the symbolic name increased would be a poor symbol to use in the model because the underlying meaning would probably be misunderstood inadvertently, even if it were described precisely. A better approach would be to use symbols that are unambiguous. For example, it would be more appropriate to use the symbols temporal increase in and above normal degree to represent the two different concepts associated with the word increased.

Methods

To facilitate the modeling effort, the task of representing the entire report was broken up into several well-defined subtasks. The initial subtask chosen was the modeling of individual findings from a small number of sample reports. For example, the phrase new plate like opacities in left lower lung zone compatible with atelectasis contains two interrelated findings: new plate like opacities and atelectasis. These two findings were modeled independently and the connective relation compatible with was ignored.
This stepwise method of model building made possible the development of a tangible core merged model within a reasonable amount of time so that it could be critiqued by those involved in the collaborative effort and by others working in medical informatics.

Three conventions were adopted that were considered prerequisites for the merging of the models: a common notation was adopted, a common database was established, and a common convention for commenting in the notation system was adopted.

A common notation, Sowa's conceptual graph formalism,16 was chosen as the representational notation for the initial effort because it is widely used in medical informatics11,17-19 and can be mapped into other knowledge-representation schemas and database forms. Currently, the Knowledge Interchange Format (KIF)20 developed by the knowledge-sharing project21 sponsored by the Advanced Research Projects Agency (ARPA) provides a means whereby a representation consisting of CGs can be translated into other knowledge-representation schemas.

A database of individual findings from the set of selected reports was established. This database was necessary solely for the collaborative modeling exercise because it provided uniquely labeled findings for identification purposes and helped to disambiguate the interpretations of the clinical information in the sample reports. Because the initial task was restricted to the individual findings only, connective relations were included for completeness but were enclosed in parentheses, as were comments. In addition, conjoined phrases were expanded (when appropriate) and words added (enclosed in square brackets) to make the conjunction more explicit. For example, new plate like opacities in left mid and lower lung zone contains the conjoined body location phrase in left mid and lower lung zone.
That finding was represented as new plate like opacities in left mid [lung zone] and [left] lower lung zone because in the original sentence lung zone does not immediately follow left mid and left does not immediately precede lower lung zone. The individual findings from a sample report called BWH22 are shown in Appendix A.

A convention was also established for representing comments within the notation. A percent sign (%) indicates the start of a comment, which continues to the end of the line. Although comments did not represent information in the reports, they facilitated collaboration and were useful for documentation purposes.

In addition, an Internet FTP site was established and members of the Canon Group were given access to it. The different versions of the model and the modeling exercises are maintained at the FTP site.

The current version of the merged model was developed in an iterative fashion by means of workshops, meetings, and electronic interchanges. The first version of the model was developed at a workshop by a subcommittee. It was presented to all of the Canon Group participants, who analyzed it and discussed its problems. It was subsequently modified in accordance with the discussion and placed in a directory at the FTP site. In addition, findings from the database of sample reports were also modeled in accordance with each version of the merged model and added to the server. Every participant was asked to review the latest merged model and the modeling of the findings. The current model represents a consensus that was reached after several rounds of reviews and modifications.

Following the CG formalism, the merged model specifies canonical medical concepts in a form that consists of two major components: one component specifying the semantic classification and hierarchical organization of the concepts, the other containing canonical graphs.
Every concept must be associated with a place in the overall hierarchy. The model supports two different versions of a hierarchical organization. The first version is called the "core" hierarchy and classifies the concepts for the purpose of supporting exchange of data using the model. The core hierarchy is a minimal hierarchy consisting of broad classes or axes. A minimal hierarchy was chosen to simplify classification, avoid inconsistencies in classification, and facilitate collaborative efforts because it simplifies the task of mapping to different hierarchies that are likely to be developed by individual sites in support of particular applications. The second version of the hierarchy, the "specialized" hierarchy, consists of specializations on the core hierarchy. It was chosen to support particular applications and views of the concepts. In the specialized hierarchy, a concept may frequently have multiple parents in order to provide as many classificatory views of the concept as are necessary to support the functions for applications. The specialized hierarchy is application-dependent and is not shown here.

The second component of the model contains canonical graphs consisting of terminologic knowledge about the structure of the concepts and their semantic relationships with each other. Every concept in the model is associated with a unique preferred symbolic name that corresponds to a unique, well-defined concept. Symbolic names were chosen carefully so that the underlying meaning of the concept would be as unambiguous as possible. For example, as mentioned in the Background Section, the symbolic name increased would not be appropriate because its meaning is ambiguous; in the merged model, two different symbolic names were assigned to represent the different meanings of the word increased: temporal-increase-in and more-than-normal-degree.

    % TYPE HIERARCHY
    concept2 < concept1.     %          concept1
    concept3 < concept1.     %          /      \
    concept4 < concept2.     %    concept2    concept3
    concept5 < concept2.     %     /    \      /    \
    concept5 < concept3.     % concept4  concept5  concept6
    concept6 < concept3.

    % CANONICAL GRAPHS
    [concepti:{"expression1","expression2",...,"expressionM"}] -
        (relation1) -> [concepti1:c1]
        (relation2) -> [concepti2:c2]
        (relationN) -> [conceptiN:cN].

Figure 1 A schematic overview of the organization of the merged model. The first component specifies the type hierarchy. A graph-like version of this hierarchy can be seen on the right-hand side of this component. The second component consists of canonical conceptual graphs that specify the components of concepts along with associated relationships.

A schematic overview of the organization of the merged model is shown in Figure 1 as a CG. The first component specifies the type hierarchy. A hierarchical classification specifying that concept2 is a subclass of concept1 has the form concept2 < concept1. In Figure 1, a graph-like version of the hierarchy is also shown (on the right-hand side) with the CG statements, because it is easier to visualize than the CG subtype statements. According to Figure 1, the highest concept is concept1 because it has no parent. Concept2 and concept3 are subclasses of concept1 because they appear to the left of the < symbol and concept1 appears to the right. Concept5 is a subclass of both concept2 and concept3.

The second component of the model, as shown in Figure 1, consists of canonical CGs. A canonical CG specifies the components of a complex concept along with the associated relationships, and it may also specify surface form (i.e., textual) expressions of the concept. For example, in Figure 1, concepti is related to N other concepts. It has a relation called relation1 to a concept called concepti1.
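
The subtype statements of Figure 1 can be read as a directed graph in which a concept may have more than one parent. The following Python sketch illustrates that reading; the names SUBTYPE_STATEMENTS, build_parents, ancestors, and is_subtype are hypothetical and are not part of the merged model or of the CG notation.

```python
# Subtype statements transcribed from Figure 1, as (child, parent) pairs.
SUBTYPE_STATEMENTS = [
    ("concept2", "concept1"),
    ("concept3", "concept1"),
    ("concept4", "concept2"),
    ("concept5", "concept2"),
    ("concept5", "concept3"),
    ("concept6", "concept3"),
]


def build_parents(statements):
    """Map every concept to the set of its direct parents."""
    parents = {}
    for child, parent in statements:
        parents.setdefault(child, set()).add(parent)
        parents.setdefault(parent, set())
    return parents


def ancestors(concept, parents):
    """Collect all direct and indirect parents of a concept."""
    seen = set()
    stack = list(parents.get(concept, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, ()))
    return seen


def is_subtype(child, parent, parents):
    """True if child equals parent or lies below it in the hierarchy."""
    return child == parent or parent in ancestors(child, parents)


PARENTS = build_parents(SUBTYPE_STATEMENTS)
# The highest concept is the one without any parent.
roots = sorted(c for c, p in PARENTS.items() if not p)

if __name__ == "__main__":
    print(roots)
    print(sorted(PARENTS["concept5"]))
    print(is_subtype("concept4", "concept1", PARENTS))
```

Storing a set of parents per concept, instead of a single parent, is what allows concept5 to sit under both concept2 and concept3.
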
A referent of a concept may be expressed by placing a colon (:) after the concept name, followed by a unique identifier or a set; the referent corresponds to a specific instance of the concept, a set of instances, or a cardinality constraint.

An example of the canonical CG of a concept named cardiomegaly is shown in Figure 2. As illustrated in Figure 2, the concept cardiomegaly is associated with a set of two expressions found in the text, "cardiomegaly" and "enlarged heart", that are represented as literal elements of a set following the name of the concept cardiomegaly. Although the name of the concept is unique, the mapping from the surface form strings (i.e., textual expressions) to the concept is not necessarily unique. For example, the string "enlarged heart" may be specified in a referent set of another concept, providing a mechanism whereby the ambiguous nature of natural-language expressions may be represented in the model because the possibility exists that a mapping from the text to a concept is not unique. This also serves to differentiate the linguistic level of expression from the unambiguous, well-defined conceptual level. The concept cardiomegaly consists of two more basic components. One component is the core observation, heart. The other component is the concept enlarged that describes a property of the heart. Both concepts must also be defined in the model.

    [cardiomegaly : {"cardiomegaly", "enlarged heart"}] -
        (has-observation) -> [heart]
        (has-property) -> [enlarged].

Figure 2 The canonical conceptual graph of the concept cardiomegaly consists of two more elementary components, an observation concept, heart, and a property concept, enlarged.

The representation of this concept illustrates a phenomenon that is likely to occur when there are collaborators from different orientations developing a model.
The modelers did not agree on the representation of this concept, nor on the representation of other concepts consisting of body locations and associated properties. Basically, there were two different views of cardiomegaly: one view represented cardiomegaly as described above, the other view preferred to represent cardiomegaly so that the core observation is enlarged and the body location is heart. The former view was agreed on so that we could proceed with the merged model. It was realized that there would be differing viewpoints in certain instances, and that compromises would have to be made in the process. This was acceptable to the group members as long as the model was associated with a well-defined semantics that was consistent.

Results

The merged model contains concepts closely associated with terms in the reports, such as cardiomegaly and lung, but also contains higher level, more abstract concepts that are not generally seen in actual reports because these concepts contain generic structural descriptions of the information rather than the information itself. For example, in the merged model there is a high-level concept called rad_finding, as shown in Figure 3, that represents the structure of a generic radiology finding that contains an observation and optional qualifiers.

    [rad_finding] -
        (has-observation) -> [observation]
        (has-location) -> [body-location : {*}]
        (has-location-qualifier) -> [location-qualifier : {*}]
        (has-presence) -> [certainty : {*}]
        (has-degree) -> [degree : {*}]
        (has-temporal) -> [temporal : {*}]
        (has-quantity) -> [quantity : {*}]
        (has-property) -> [property : {*}].

Figure 3 The canonical conceptual graph associated with the concept rad_finding, the generic radiology finding. Rad_finding is considered a high-level concept because it is not actually seen in the reports but instead contains a description of the structure of report findings. A finding consists of a core observation relation, observation, with different qualifiers (e.g., body-location, location-qualifier, certainty, degree, temporal, quantity, and property), all of which are optional and may occur zero or more times.

According to Figure 3, a rad_finding is a complex concept with components that are also concepts that are interrelated in predetermined ways. For example, the core component of rad_finding is a concept that is classified as an observation. Observation concepts are also specified elsewhere in the model, and represent the different observations that occur in radiologic examinations of the chest, such as pleural effusion and coronary artery bypass graft. Since the domain being modeled consists of only radiologic examinations of the chest, observation actually is meant to refer specifically to a radiologic observation concept rather than to a broader observation concept encompassing other observations; in future models it will be renamed radiological observation.

In rad_finding, the naming of the core concept underwent several rounds of changes. Some previous names were focus, subject, central finding, and core. In this case, the modelers agreed on the semantics of the component but found it difficult to assign an appropriate name to the concept. The preferred symbolic name observation was chosen because it seemed to be the most appropriate and generic name for the concept.

The remaining components of rad_finding are all optional and may occur zero or more times because they contain the cardinality constraint denoted by {*}. They further qualify the observation, and contain different information, such as body location, certainty, degree, and temporal and other descriptive information.

The canonical CG for any concept that occurs in a radiologic observation should be representable according to the specifications set forth for rad_finding. For example, hilar adenopathy and platelike atelectasis are complex terms that are frequently found in radiologic examinations. They could be included in the model as new concepts using CGs similar to the one shown in Figure 3. The CGs for hilar adenopathy and platelike atelectasis are shown in Figure 4. The concept hilar adenopathy has two components, an observation called adenopathy and a body location called hilum, which is the location of the observation. Similarly, platelike atelectasis has two components, an observation, atelectasis, and a property qualifier, platelike. If the concepts hilar adenopathy and platelike atelectasis were included in the model as new concepts, they would also have to be given hierarchical assignments.

The concept rad_finding can also be used to represent findings from specific radiologic examinations instead of canonical concepts. In this case the CG is not a canonical CG but instead represents an instantiation of the canonical CG rad_finding. The instantiation of a CG is represented by an identification marker (a # symbol and an identifier) following the name of the CG. Thus, if the first two findings in a report identified as CXR123 were possible hilar adenopathy and moderate platelike atelectasis, the corresponding CGs would be as shown in Figure 5.

In Figure 5, identifiers are associated with the two rad_finding concepts because, for comparison purposes, it is convenient to identify individual findings using a common notation. When the findings in the reports are modeled independently by different modelers, the levels of granularity tend to differ, and therefore the values of the observations may differ.
For example, an equivalent CG convention could associate the identifier with instances of hilar adenopathy or platelike atelectasis instead of with rad_finding instances. If applications exist where this representation is necessary, a mapping could be used to transform the report findings so that the observations (i.e., hilar adenopathy and platelike atelectasis) are associated with identifiers instead.

[hilar adenopathy] -
    (has-observation) -> [adenopathy]
    (has-location) -> [hilum].

[platelike atelectasis] -
    (has-observation) -> [atelectasis]
    (has-property) -> [platelike].

Figure 4 Two canonical conceptual graphs that correspond to the concepts hilar adenopathy and platelike atelectasis. These concepts are considered lower level concepts because they are associated with actual phrases that are found in reports.

[rad_finding:#CXR123.1] -
    (has-observation) -> [hilar adenopathy]
    (has-presence) -> [possible].

[rad_finding:#CXR123.2] -
    (has-observation) -> [platelike atelectasis]
    (has-degree) -> [moderate degree].

Figure 5 An instance of a canonical conceptual graph of the core hierarchy is represented by a referent, an identifier that is preceded by a # symbol. The referent is shown following the name of the concept. The representations of two radiology findings, identified as CXR123.1 and CXR123.2 (possible hilar adenopathy and moderate platelike atelectasis), that occurred in a report identified as CXR123 are shown.

[body-location] -
    (has-location) -> [body_location:{*}]
    (has-location-qualifier) -> [location_qualifier:{*}]
    (location-relation) -> [locative:{*}].

Figure 6 The canonical conceptual graph for body-location contains the representation of a generic body location. A body location concept has a component that is the primary body location and an optional qualifier, location-qualifier, that is associated with concepts such as left. Body-location also has another optional relation, locative, that is associated with locative prepositions such as under.

[left upper lobe of lung] -
    (has-location) -> [lobe of lung]
    (has-location-qualifier) -> [left]
    (has-location-qualifier) -> [upper].

[medial anterior segment of left upper lobe of lung] -
    (has-location) -> [left upper lobe of lung]
    (has-location-qualifier) -> [segment]
        (has-location-qualifier) -> [medial]
        (has-location-qualifier) -> [anterior].

Figure 7 Examples of the conceptual graphs of two complex body location concepts, left upper lobe of lung and medial anterior segment of left upper lobe. The primary location of left upper lobe of lung is lobe of lung, which is qualified by two location qualifiers, left and upper. The primary location of medial anterior segment of left upper lobe is left upper lobe of lung, which has a location qualifier, segment. Segment is qualified by medial and anterior.

Other CGs were developed to represent the structure of concepts found in body location information and qualifiers. Body location information is complex because it encompasses spatial information that is difficult to represent. Presently, the CG associated with body location requires more work, but a large variety of body location concepts can be represented properly. Figure 6 illustrates the CG called body-location, consisting of three components. One component, represented by the relation has-location, is needed when one body location is used to identify another, as in lymph nodes of right hilum. The component called has-location-qualifier corresponds to the possible qualifiers of a body location. These could be relative locations, such as upper or base, or other body locations.
For example, in the merged model, a concept joint of left hand consists of a body location, joint, with a qualifier, hand. Some developers may want to model joint of left hand using a more detailed representation. For example, joint may be viewed as having the relation part-of to the body location hand. Although a more detailed model of body location may be desirable at a later time, the simpler model was chosen for the current merged model to shorten development time. The remaining relation, has-location-relation, refers to qualifiers, such as under or along, which specify the locative relation of the finding to the body location.

Examples of the CGs of two complex body location concepts, left upper lobe of lung and medial anterior segment of left upper lobe, are shown in Figure 7. The concept left upper lobe of lung consists of a more elementary concept called lobe of lung that is qualified by left and upper. The second concept in Figure 7 is more complex. It consists of the more elementary concept left upper lobe of lung that is qualified by the concept segment. Similarly, segment is qualified by the concepts medial and anterior.

The representations of other qualifiers, such as temporal, certainty, and degree, are shown in Appendix B, which contains the current version of the core model. A listing of the CGs that represent the findings in the sample report is given in Appendix C.

Discussion

The current version of the merged model was deliberately restricted to a subtask that consisted of the modeling of individual findings. However, other subtasks, such as modeling the overall structure of the report and modeling the interrelations among findings, are considered essential for the final model. Subsequent versions of the model will be extended to handle this information.
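As a rough illustration of the finding structure described in the Results section, the sketch below holds one instantiated rad_finding in memory and prints it in a linear notation. It is only a sketch under stated assumptions: the class names `BodyLocation` and `RadFinding`, the method `to_linear`, and the exact output layout are hypothetical and are not part of the merged model; the relation names follow the underscore spelling used in Appendix B.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BodyLocation:
    """Hypothetical holder for a primary location, its qualifiers, and a locative."""
    location: str
    qualifiers: List[str] = field(default_factory=list)
    relation: Optional[str] = None  # locative such as "along" or "under"


@dataclass
class RadFinding:
    """Hypothetical holder for one rad_finding instance.

    The observation is required; every qualifier list may be empty or hold
    several values, mirroring the {*} cardinality in the canonical graph.
    """
    ident: str
    observation: str
    locations: List[BodyLocation] = field(default_factory=list)
    certainty: List[str] = field(default_factory=list)
    degree: List[str] = field(default_factory=list)
    temporal: List[str] = field(default_factory=list)
    quantity: List[str] = field(default_factory=list)
    prop: List[str] = field(default_factory=list)

    def to_linear(self) -> str:
        """Render the finding as indented relation lines (assumed layout)."""
        lines = [
            f"[rad_finding:#{self.ident}] -",
            f"  (has_observation) -> [{self.observation}]",
        ]
        for loc in self.locations:
            lines.append(f"  (has_location) -> [{loc.location}]")
            for qualifier in loc.qualifiers:
                lines.append(f"    (has_location_qualifier) -> [{qualifier}]")
            if loc.relation:
                lines.append(f"    (location_relation) -> [{loc.relation}]")
        for relation, values in (
            ("has_certainty", self.certainty),
            ("has_degree", self.degree),
            ("has_temporal", self.temporal),
            ("has_quantity", self.quantity),
            ("has_property", self.prop),
        ):
            for value in values:
                lines.append(f"  ({relation}) -> [{value}]")
        return "\n".join(lines) + "."


# The first finding of the hypothetical report CXR123 used in the text.
finding = RadFinding("CXR123.1", "hilar_adenopathy", certainty=["possible"])
print(finding.to_linear())
```

Keeping every qualifier as a list, rather than a single optional value, is one simple way to respect the "zero or more times" constraint without special cases.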
Adequately representing information that involves spatial relations, uncertainty, fuzzy information, anatomic descriptions, temporal information, and causality is difficult in each case, and each is an area of much active research.22-29 This information is presently represented in the model in very simplistic ways. To develop deeper models within these subareas, a long-range sustained effort will be needed.

Another open issue is how to represent complex concepts in the model. When there is a finding like hilar adenopathy, which occurs frequently in the results sections of radiology reports, there are generally two equivalent ways in which the information may be represented. One way is the method described in the Results section, which consists of adding a new definition of a canonical concept, hilar adenopathy, to the model. Adding complex terms can result in the proliferation of many new concepts. An alternative method does not involve adding a new concept to the model, and requires that only the more elementary concepts hilum and adenopathy be in the model. The finding hilar adenopathy in a report may then be represented as a rad_finding that consists of two related concepts: the observation adenopathy and the body location hilum. This method does not involve the proliferation of concepts, but retrieval of the information may be more complicated.

A need identified by the Canon Group is for tools that support collaboration. To support model building, tools are needed for parsing, browsing, and editing concept models. Our modeling effort is a collaboration of geographically separated participants using a variety of computer platforms. Therefore, our tools also need to facilitate the communication of model content and proposed changes across computer platforms and wide distances.
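To make the idea of a parsing tool concrete, the following sketch reads type assignment statements of the form used in Appendix B (`child < parent.`, with `%` starting a comment), reports lines it cannot parse, and answers simple subsumption questions. It is a minimal illustration, not a tool used by the Canon Group; the function names `parse_hierarchy` and `is_a`, the regular expression, and the error message format are all assumptions made for this example.

```python
import re
from typing import Dict, List, Tuple

# One statement per line: "child < parent." (assumed concrete syntax).
STATEMENT = re.compile(r"^([A-Za-z0-9_'>-]+)\s*<\s*([A-Za-z0-9_-]+)\s*\.$")


def parse_hierarchy(text: str) -> Tuple[Dict[str, List[str]], List[str]]:
    """Return (parents, errors) for a block of type assignment statements."""
    parents: Dict[str, List[str]] = {}
    errors: List[str] = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("%", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        match = STATEMENT.match(line)
        if match is None:
            errors.append(f"line {lineno}: cannot parse {line!r}")
            continue
        child, parent = match.groups()
        # A concept may have more than one parent, so keep a list.
        parents.setdefault(child, []).append(parent)
    return parents, errors


def is_a(parents: Dict[str, List[str]], concept: str, ancestor: str) -> bool:
    """True if `ancestor` is reachable from `concept` via parent links."""
    seen = set()
    stack = [concept]
    while stack:
        current = stack.pop()
        if current == ancestor:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents.get(current, []))
    return False


sample = "\n".join([
    "% Laterality",
    "bilateral < laterality.",
    "laterality < location_qualifier.",
    "anterior_posterior < orientation",  # missing final period
    "slight < mild_degree.  % a degree qualifier",
    "mild_degree < degree.",
])
parents, errors = parse_hierarchy(sample)
print(errors)
print(is_a(parents, "slight", "degree"))
```

Even a checker this small can be run on every platform that has an interpreter, which matters when models are exchanged as plain text between sites.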
Researchers have noted that participants in any collaborative design effort use a "shared drawing space" to both convey and store information and to mediate interaction.30 When the participants are gathered together in a design meeting, for example, the shared drawing space is often a white board. Our shared drawing space consists of the current state of the model, as well as the set of individual chest x-ray findings that we are trying to model. The model was represented both by diagrams (either on the white board or on paper) and by CG statements in the linear notation.

Four different collaborative interactions have been identified and classified according to whether the times and locations of the interactions are the same.31 These interactions are 1) face-to-face (same time, same place), 2) distributed synchronous (same time, different places), 3) asynchronous (different times, same place), and 4) distributed asynchronous (different times, different places). Obviously, when the participants are geographically separated, the same physical white board cannot serve as the shared drawing space. Instead, other tools are needed to support the different collaborative interactions. ("Groupware" is the term that has come to be applied to computer software tools designed to facilitate collaborative interaction.)

Our choice of interaction was limited by the tools for collaboration available to us. Face-to-face collaboration occurred in meetings and was supported by the usual white boards, overheads, and paper handouts. E-mail supported distributed asynchronous collaboration, especially while writing papers (like this one), but we did not have, for example, tools that support distributed synchronous model building. In fact, e-mail and access to the Internet were about the only computer tools shared by all of us.
The choice of CGs as a formalism enabled collaborators to speak the same language and, using the linear notation, to exchange models via e-mail. But not every collaborator had a parser for the linear notation or CG editing and browsing tools. Parsing CGs readily reveals errors made with linear notation syntax and, occasionally, semantic errors as well. Display tools can make it easier to see a large number of concepts, and their relationships, at once. For example, an outline viewer (in which hierarchies are viewed as outlines, with descendants indented beneath ancestors) implemented by one of the Canon Group members (Bell) has helped reveal semantic errors. As we have progressed, and as the model has grown in complexity, the need for CG parsers, browsers, and editors on multiple platforms has become more pronounced. There is also a need for a tool, such as the Standard Generalized Markup Language (SGML),32 to provide a means of specifying and standardizing the format of the reports so that they can readily be shared by others. These tools would support distributed asynchronous collaboration, whereas tools that display CGs could also support face-to-face collaboration. If this tool set were augmented with some sort of real-time messaging system, it would enable distributed synchronous collaboration as well.

The merged model is an experimental model, developed by merging the models of some of the participants at the Canon Group's meeting.9-13 It is an incomplete model and in its present form accounts for a small subset of clinical information. Even though it encompasses a very small piece of the overall goal, it represents an important achievement in that it subsumes independent work at four sites associated with four applications and orientations. It provides a medium whereby participants can communicate using a common language, and thus makes it possible for those in the group to analyze and criticize the actual modeling effort more accurately.
Since it is a partial and an experimental model, it will continue to change and evolve as it is extended and applied.

Future Plans

In this article, a model was described that emerged from a unified vision, as well as from continued collaborative efforts. However, it is a small part of what is truly necessary to meet the goals described in the Canon Group's position paper.2 This section provides an overview of future work deemed necessary for the Canon Group, consisting of four major themes: 1) scalability and generalizability, 2) automation, 3) evaluation, and 4) collaboration and support.

The Canon Group's work must scale up and generalize, both in the chest x-ray (CXR) report domain and in other domains. To this end, we must take advantage of the 10,000 remaining reports that we collected in our initial work. Creating a workable model for all of these reports will require some resources that currently do not exist. We advocated in our position paper that we needed a grammar for the formation of medical concepts, consisting of basic resources (i.e., basic lexical units, typology, and an inventory of basic concepts) and procedures (i.e., rules of composition). For the current model, we have focused more on the collaborative process than on the resources. As we scale up and, particularly, as applications are built, we need to explicitly create these resources.

Another aspect of scalability concerns making all of our methods computationally tractable. This includes not only devising efficient algorithms, but also adopting technologies that make the product useful. One such technology is an Internet-based client-server architecture. This architecture will allow collaborators and their applications to access Canon Group resources.
Once we achieve scale in the CXR report domain, we must determine which aspects of our work generalize to other domains. There is considerable interest in handling what is probably the most unstructured of all medical data, the physical examination. As with the CXR reports, we will obtain data from diverse sites and will repeat the process. Initially we will focus on one aspect of the physical examination, such as the cardiac or abdominal examination, and will build outward.

The second consideration is automation. With the large volume of medical data generated daily, there must be considerable, if not complete, automation of these processes. Modeling CXR reports by hand may enable us to collaboratively understand and analyze the underlying conceptual issues, but ultimately the modeling must be nearly fully automated. The natural-language processor developed by Friedman et al.15 was used to automatically process and structure the CXR reports in accordance with the model proposed by their group. It is possible that the same natural-language processing system could be used to automate the processing of the CXR reports in accordance with the merged model. Another part of the process that must be automated is the building of the model itself. As mentioned by Evans and Hersh,13 the CLARIT system provided automated noun-phrase extraction and first-order thesaurus construction, allowing large numbers of terms and modifiers to be discovered.

The third aspect of our future plans is the need for evaluation, both to provide us with a measure of our work and, if we are successful, to convince others to adopt our approach. There are several planned approaches to evaluation. These approaches are not mutually exclusive and include:

1. Evaluation of the model in each individual group's application, such as decision support and structured data entry.
The benefit of this approach is the establishment of the operational use of the system, while the drawback is the possible inability to control for variables outside the context of the vocabulary.
2. Evaluation of the model in different sites by sharing clinical data. The benefit of this approach is the direct operational assessment of the model between sites and facilitation of sharing of data.
3. Ensuring consistent mapping back and forth between the model and the original text. The benefit of this approach is the direct assessment of mapping back and forth. The drawback is the lack of evaluation in an operational setting.
4. Presenting the model to clinicians for evaluation. The benefit of this approach is to have assessment by the people whose language we are modeling, while the drawback is its inherent subjectivity.

The fourth theme of future work concerns collaboration and support. While we have found that a small focused group has enabled us to move beyond mere ideas, we will not consider our work a success unless it is adopted for use in operational systems. Collaboration, of course, requires support in many forms. We will obviously need the support of the producers of existing vocabularies, not only to map our representations to their terminologies, but also to utilize their terminologies.

Conclusions

The development of a core merged model is a small but critical step in addressing what we identify as the central challenge of medical informatics: development of a generally accepted model for representing clinical information. Because the merged model has been developed collaboratively, it has been deemed acceptable on an experimental basis by a number of different sites involved in medical informatics. This is a substantial step in the required direction.
The effort so far raises the level of discussion about, and activity toward, developing a standard model for medical-concept representation. The work described here is an initial effort that provides a foundation we hope will ultimately be appropriated for use in tangible clinical applications. A widely accepted standard model for medical-concept representation would provide a mechanism whereby sharable applications and data could become a reality and true collaboration could be feasible.

Members of the Canon Group who collaborated in the development of the merged model include (alphabetically): Douglas S. Bell, MD; Keith E. Campbell, MD; Christopher G. Chute, MD, DrPH; James J. Cimino, MD; David A. Evans, PhD; Carol Friedman, PhD; Robert A. Greenes, MD, PhD; William R. Hersh, MD; Stanley M. Huff, MD; Stephen B. Johnson, MD; Robert C. McClure, MD; Mark A. Musen, MD, PhD; Edward Pattison-Gordon, MS; Alan Rector, MD, PhD; Roberto Rocha, MD.

References

1. Feinstein AR. ICD, POR, and DRG: unsolved scientific problems in the nosology of clinical medicine. Arch Intern Med. 1988;148(10):2269-74.
2. Evans DA, Cimino JJ, Hersh WR, Huff SM, Bell DS. Toward a medical-concept representation language. J Am Med Informatics Assoc. 1994;1:207-17.
3. Musen MA. Dimensions of knowledge sharing and reuse. Comput Biomed Res. 1992;25(5):435-67.
4. Tuttle MS. The position of the Canon Group: a reality check. J Am Med Informatics Assoc. 1994;1:298-9.
5. Evans D. Final Report on the MedSORT-II Project: Developing and Managing Medical Thesauri. Technical report. Pittsburgh, PA: Laboratory for Computational Linguistics, Carnegie Mellon University, 1987.
6. Rossi-Mori A, Bernauer J, Pakarinen V, et al. CEN/TC251/PT003 models for representation of terminologies and coding systems in medicine. In: Proceedings of the Seminar Opportunities for European and U.S. Cooperation in Standardization in Health Care Informatics. Geneva, Switzerland, September 1992.
7. Rector AL, Nowlan WA, Kay S. Conceptual knowledge: the core of medical information systems. In: Lun KC, Degoulet P, Piemme TE, Rienhoff O, eds. Proceedings of MEDINFO 92. Amsterdam: North-Holland, 1992:1420-6.
8. Rector AL, Nowlan WA, Glowinski A. Goals for concept representation in the GALEN project. In: Safran C, ed. Proceedings of the Seventeenth Annual Symposium on Computer Applications in Medical Care. New York: McGraw-Hill, 1994:414-8.
9. Friedman C, Cimino JJ, Johnson SB. A schema for representing medical language applied to clinical radiology. J Am Med Informatics Assoc. 1994;1:233-48.
10. Huff SM, Rocha RA, Haug PJ, Bray BE, Warner HR. An Event Model of Medical Information Representation. Technical report. Salt Lake City, UT: University of Utah, 1994.
11. Campbell KE, Das AK, Musen MA. A logical foundation for representation of clinical data. J Am Med Informatics Assoc. 1994;1:218-32.
12. Bell DS, Pattison-Gordon E, Greenes RA. Experiments in concept modeling for radiographic image reports. J Am Med Informatics Assoc. 1994;1:249-62.
13. Evans D, Hersh B. The CXR Reports: Model and Analysis: Background paper for CANON Group workshop. Technical report. Pittsburgh, PA: Laboratory for Computational Linguistics, Carnegie Mellon University, 1993.
14. Cimino JJ, Clayton PD, Hripcsak G, Johnson SB. Knowledge-based approaches to the maintenance of a large controlled medical terminology. J Am Med Informatics Assoc. 1994;1:35-50.
15. Friedman C, Alderson PO, Austin JHM, Cimino JJ, Johnson SB. A general natural-language text processor for clinical radiology. J Am Med Informatics Assoc. 1994;1:161-74.
16. Sowa JF. Conceptual Structures. Reading, MA: Addison-Wesley, 1984.
17. Bernauer J. Conceptual graphs as an operational model for descriptive findings. In: Clayton PD, ed. Fifteenth Annual Symposium on Computer Applications in Medical Care. New York: McGraw-Hill, 1991:214-8.
18. Schroder M. Knowledge based analysis of radiological reports using conceptual graphs. In: Pfeiffer HD, ed. Proceedings of the Seventh Annual Workshop on Conceptual Graphs, Las Cruces, New Mexico. Berlin: Springer-Verlag, 1992:213-22.
19. Rassinoux AM, Baud RH, Scherrer JR. Conceptual graphs model extension for knowledge representation of medical texts. In: Lun KC et al., eds. Proceedings of MEDINFO 92. Amsterdam: North-Holland, 1992:1368-74.
20. Genesereth MR, Fikes RE. Knowledge Interchange Format, Version 3.0 Reference Manual. Technical report Logic-92-1. Stanford, CA: Stanford University, 1992.
21. Neches R, Fikes RE, Finin T, et al. Enabling technology for knowledge sharing. AI Magazine. 1991;12(3):16-36.
22. Kahn DC. Modeling time in medical decision support programs. Med Decis Making. 1991;11:249-64.
23. Console L, Furno A, Torasso P. Dealing with time in diagnostic reasoning based on causal models. In: Methodologies for Intelligent Systems, vol. 3. Amsterdam: North-Holland, 1988.
24. Allen JF. Towards a general theory of action and time. Artif Intell. 1984;23(2):123-54.
25. Zadeh LA. Commonsense knowledge representation based on fuzzy logic. IEEE Comput. 1983;16(10):61-6.
26. Maida AS, Shapiro SC. Intensional concepts in propositional semantic networks. In: Readings in Knowledge Representation. San Mateo, CA: Morgan Kaufmann, 1985:169-89.
27. Davis E. Representations of commonsense knowledge. In: Morgan Kaufmann Series in Representation and Reasoning. San Mateo, CA: Morgan Kaufmann, 1990.
28. Han R. Knowledge Representation: An AI Perspective. Tutorial monographs in cognition science. Norwood, NJ: Ablex Publishing, 1991.
29. Chen S, ed. Advances in Spatial Reasoning. Norwood, NJ: Ablex Publishing, 1991.
30. Ishii H, Miyake N. Toward an open shared workspace: computer and video fusion approach of TeamWorkstation. Communications ACM. 1991;34(12):37-50.
31. Hsu J, Lockwood T. Collaborative computing. Byte. 1993;18(3):113-20.
32. Goldfarb CF. The SGML Handbook. Oxford, UK: Clarendon Press, 1990.

APPENDIX A

Individual Findings in a Sample X-ray Report

BWH22.07/compatible with atelectasis (based on finding BWH22.06)
BWH22.12/consistent with coronary artery bypass graft (based on finding BWH22.11)
BWH22.14/[consistent with] previous lobectomy on the right (based on finding BWH22.13)
BWH22.10/left lower lobe atelectasis
BWH22.03/[new] 4 intact sternotomy wires
BWH22.06/new plate like opacities in left [mid lung zone and left] lower lung zone
BWH22.02/new surgical clips in distribution of circumflex artery
BWH22.04/persistent increased right paramediastinal opacity
BWH22.05/possibly related to previous radiation therapy (based on finding BWH22.04)
BWH22.11/post-operative changes
BWH22.13/[post-operative changes]
BWH22.09/slight interval decrease in left pleural effusion
BWH22.08/some interval improvement in left pleural effusion
BWH22.01/surgical clips again along right mediastinum and [along] right hilar region

APPENDIX B

The Core Merged Model

/* Medical Core Merged Model */
% Statements bracketed by "/* */" and statements following "%" are comments
% The concepts specified in the type assignment statements
% specify the controlled names that are present in the model.

% TYPE HIERARCHY

% Rad Findings
% higher level findings
rad_finding < finding.
% lower level findings
air_bronchogram < rad_finding.
atelectasis < rad_finding.
blunting_of_costophrenic_angle < rad_finding.
bony_changes < rad_finding.
calcification_of_lymph_node < rad_finding.
cardiac_silhouette_upper_limits_of_normal < rad_finding.
cardiac_silhouette_within_normal_limits < rad_finding.
cardiomegaly < rad_finding.
clear_lung < rad_finding.
coronary_artery_bypass_graft < rad_finding.
degeneration_of_thoracic_spine < rad_finding.
degenerative_joint_disease < rad_finding.
discoid_atelectasis < atelectasis.
elevation_of_hemidiaphragm < rad_finding.
granulomatous_disease < rad_finding.
interstitial_markings_in_lung < rad_finding.
kyphosis < rad_finding.
nasogastric_tube < rad_finding.
peribronchial_cuffing < rad_finding.
pneumothorax < rad_finding.
pleural_effusion < rad_finding.
pleural_thickening < rad_finding.
scoliosis < rad_finding.
s_shaped_scoliosis < scoliosis.
sternotomy < rad_finding.
sternotomy_wire < rad_finding.
subsegmental_atelectasis < atelectasis.
tortuous_aorta < rad_finding.
widening_of_mediastinum < rad_finding.

% Observations
rad_finding < observation.
body_location < observation.
active_disease < observation.
acute_disease < observation.
air_accumulation < observation.
calcified_granuloma < granuloma.
cancer < observation. % disease
circumscribed_density < observation.
collapse < observation.
compression_fracture < fracture.
% is "lung" implied
consolidation < observation.
curvilinear_density < density.
edema < observation.
effusion < observation.
fibrosis < observation.
fluid < observation.
fluid_overload < observation.
focal_opacity < circumscribed_density.
fracture < observation.
granuloma < observation.
granulomatous_disease < observation.
infiltrate < observation. % is "lung" implied
linear_fibrosis < fibrosis.
linear_opacity < circumscribed_density.
multiple_sclerosis < observation. % disease
nodular_opacity < circumscribed_density.
nonunionized_fracture < fracture.
pleural_effusion < effusion.
rounded_density < circumscribed_density.
scarring < observation.
trauma < observation.

% Devices
nasogastric_tube < observation.
prosthetic_valve_ring < observation.
sternotomy_wire < observation.
surgical_clip < observation.
wire < observation.

% Surgical procedures
therapeutic_procedure < observation.
surgical_procedure < observation.
coronary_artery_bypass_graft < observation.
lobectomy < observation.
sternotomy < observation.

% Body Location Concepts
7th_rib < rib.
aorta < body_location.
aortic_valve < body_location.
aorto_pulmonary_window < body_location.
blood_vessels < body_location.
cardiac_silhouette < body_location.
cardiopulmonary < body_location.
chest_wall < body_location. % part of chest
costophrenic_angles < body_location.
diaphragm < body_location.
distribution_of_circumflex_artery < body_location.
extrathoracic < body_location.
heart < body_location.
hemidiaphragm < body_location.
hilum < body_location.
left_lower_lobe_of_lung < body_location. % part_of lung
left_lower_lung_zone < body_location. % part_of lung
left_upper_lobe_of_lung < body_location. % part_of lung
lobe_of_lung < body_location.
lung < body_location.
lymph_node < body_location.
major_fissure < body_location.
mediastinum < body_location.
paramediastinum < body_location.
pleura < body_location.
pleural_space < body_location.
pulmonary_blood_vessels < blood_vessels.
rib < body_location.
right_lower_lobe_of_lung < body_location. % part_of lung
soft_tissue < body_location.
spine < body_location.
subpulmonic < body_location.
thoracic_spine < body_location. % is this same as thoracic vertebral body?
thoracic_vertebral_body < body_location.
vertebral_body < body_location.

% Location Qualifiers
body_location < location_qualifier.
body_location_part < location_qualifier.
laterality < location_qualifier.
locative < location_qualifier.
orientation < location_qualifier.
quantity < location_qualifier.
relative_location < location_qualifier.

% Laterality
bilateral < laterality.
right < laterality.
left < laterality.

% Body_location_part
area_of < body_location_part.
bibasilar < body_location_part.
border < body_location_part.
field < body_location_part.
lobe < body_location_part.
region < body_location_part.
segment < body_location_part.
wall < body_location_part.
zone < body_location_part.

% Relative Locations
anterior < relative_location.
base < relative_location.
inferior < relative_location.
lateral < relative_location.
lower < relative_location.
median < relative_location.
mid < relative_location.
posterior < relative_location.
upper < relative_location.

% Orientation
anterior_posterior < orientation.
horizontal < orientation.
lateral < orientation.
transverse < orientation.

% Qualifiers
% types of qualifiers
degree < qualifier.
orientation < qualifier.
position < qualifier.
quantity < qualifier.
temporal < qualifier.
property < qualifier.
% lower level qualifiers
calcification < property.
coarse < property.
density < property.
elevated < position.
focal < property.
hazy < property.
intact < property.
scattered < property.
smooth < property.

% Density Qualifiers
clear < property.
curvilinear < property.
density < property.
opaqueness < property.

% Shape Qualifiers
shape_qualifier < property.
lateral_deviation < shape_qualifier.
platelike < shape_qualifier.
round < shape_qualifier.
s_shaped_deviation < lateral_deviation.
tortuous < shape_qualifier.

% Degree Qualifiers
extensive < high_degree.
high_degree < degree.
large_amount < degree.
mild_degree < degree.
minimal < mild_degree.
moderate_degree < degree.
more_than_normal_degree < degree.
severe_degree < degree.
slight < mild_degree.
some < mild_degree.

% Size Qualifiers
qualitative_size < property.
quantitative_size < property.
enlargement < qualitative_size.
large < qualitative_size.
normal_size < qualitative_size.
prominent < qualitative_size.
size_within_normal_limits < qualitative_size.
small < qualitative_size.
thickening < qualitative_size.
widening < qualitative_size.

% Temporal Qualifiers
change < temporal.
again < temporal.
chronic < temporal.
decrease_in < change.
decrease_in_size < decrease_in.
healed < change.
improved < change.
temporal_increase_in < change.
temporal_increase_in_intensity < temporal_increase_in.
temporal_increase_in_number < temporal_increase_in.
temporal_increase_in_size < temporal_increase_in.
interval < temporal.
interval_development < temporal.
new < temporal.
no_change < change.
no_change_from_previous_exam < no_change.
no_change_in_intensity < no_change.
no_change_in_number < no_change.
no_change_in_position < no_change.
no_change_in_size < no_change.
persistent < temporal.
post_operative_change < change.
previous < temporal.
remain < change.
remain_in_place < remain.
resolved < change.
status_post < temporal.

% Certainty Qualifiers
absent < certainty.
cannot_rule_out < low_certainty.
connective < certainty.
evidence_of < moderate_certainty.
high_certainty < certainty.
history_of < high_certainty.
likely < high_certainty.
low_certainty < certainty.
moderate_certainty < certainty.
possible < moderate_certainty.
present < high_certainty.
probable < moderate_certainty.
unlikely < low_certainty.
undetermined < certainty.

% Quantities
'>1' < fuzzy_quantity.
a_few < fuzzy_quantity.
fuzzy_quantity < quantity.
many < fuzzy_quantity.
multiple < fuzzy_quantity.
number < quantity.

% Connective Relations
compatible_with < connective.
consistent_with < connective.
may_represent < connective.
most_likely_represent < connective.
related_to < connective.

% Dimensions
diameter < dimension.
length < dimension.
volume < dimension.
width < dimension.

% Units
cm < unit.
mm < unit.

% Locative
along < locative.
under < locative.
adjoining < locative.
in < locative.

% CANONICAL GRAPHS
% Canonical graphs are defined only for concepts which are complex
% i.e. they are composed of other concepts that are more elementary.

[rad_finding] -
    (has_observation) -> [observation]
    (has_location) -> [body_location]
    (has_location_qualifier) -> [location_qualifier:{*}]
    (has_certainty) -> [certainty:{*}]
    (has_degree) -> [degree:{*}]
    (has_temporal) -> [temporal:{*}]
    (has_quantity) -> [quantity:{*}]
    (has_property) -> [property:{*}].

[body_location] -
    (has_location) -> [body_location:{*}]
    (has_location_qualifier) -> [location_qualifier:{*}]
    (location_relation) -> [locative:{*}].

[location_qualifier] -
    (has_location_qualifier) -> [location_qualifier:{*}].

[quantitative_size] -
    (has_dimension) -> [dimension]
    (has_measurement) -> [measurement]
    (has_orientation) -> [orientation].

[measurement] -
    (has_quantity) -> [number]
    (has_unit) -> [unit].

[certainty] -
    (has_degree) -> [degree:{*}].

[temporal] -
    (has_degree) -> [degree:{*}]
    (has_certainty) -> [certainty:{*}].

% Rad Findings
[calcification_of_lymph_node] -
    (has_observation) -> [calcification]
    (has_location) -> [lymph_node].

[cardiac_silhouette_within_normal_limits:{"cardiac silhouette within normal limits","normal size heart"}] -
    (has_observation) -> [cardiac_silhouette]
    (has_property) -> [size_within_normal_limits].

[cardiomegaly:{"cardiomegaly","cardiac enlargement","enlargement of heart","enlargement_of_cardiac_silhouette"}] -
    (has_observation) -> [heart]
    (has_property) -> [enlargement].

[clear_lung] -
    (has_observation) -> [lung]
    (has_property) -> [clear].

[elevation_of_hemidiaphragm] -
    (has_observation) -> [hemidiaphragm]
    (has_property) -> [elevated].

[pleural_effusion] -
    (has_observation) -> [effusion]
    (has_location) -> [pleural_space].

[pleural_thickening] -
    (has_observation) -> [pleura]
    (has_property) -> [thickening].

[pneumothorax:{"pneumothorax","air in pleural space"}] -
    (has_observation) -> [air_accumulation]
    (has_location) -> [pleural_space].

[scoliosis:{"scoliosis","lateral deviation of spine"}] -
    (has_observation) -> [spine]
    (has_property) -> [lateral_deviation].

[s_shaped_scoliosis] -
    (has_observation) -> [spine]
    (has_property) -> [s_shaped_deviation].

[tortuous_aorta:{"tortuous aorta","uncoiled aorta","unrolled aorta"}] -
    (has_observation) -> [aorta]
    (has_property) -> [tortuous].

[widening_of_mediastinum] -
    (has_observation) -> [mediastinum]
    (has_property) -> [widening].

% Body Locations
[costophrenic_angles:{"costophrenic angle","costophrenic sulci","costophrenic angles","costophrenic sulcus"}].

[left_upper_lobe_of_lung:{"left upper lobe of lung","left upper lobe"}] -
    (has_location) -> [lobe_of_lung]
    (has_location_qualifier) -> [left]
    (has_location_qualifier) -> [upper].

% Density Qualifiers
[opaqueness:{"opacity","density","opaque","opaqueness"}].

APPENDIX C

Structured Findings in X-ray Report BWH22

/* BWH22.01. surgical clips again along right mediastinum and right hilar region */
[rad_finding:#BWH22.01] -
    (has_observation) -> [surgical_clip]
    (has_location) -> [mediastinum]
        (has_location_qualifier) -> [right]
        (location_relation) -> [along].
    (has_location) -> [hilum]
        (has_location_qualifier) -> [right]
        (has_location_qualifier) -> [region]
        (location_relation) -> [along].
    (has_temporal) -> [again]
    (has_quantity) -> [">1"].

/* BWH22.02. new surgical clips in distribution of circumflex artery */
[rad_finding:#BWH22.02] -
    (has_observation) -> [surgical_clip]
    (has_location) -> [distribution_of_circumflex_artery]
    (has_temporal) -> [new]
    (has_quantity) -> [">1"].

/* BWH22.03. 4 intact sternotomy wires */
[rad_finding:#BWH22.03] -
    (has_observation) -> [sternotomy_wire]
    (has_temporal) -> [new]
    (has_quantity) -> [4]
    (has_property) -> [intact].

/* BWH22.04. persistent increased right paramediastinal opacity */
[rad_finding:#BWH22.04] -
    (has_observation) -> [circumscribed_density]
    (has_location) -> [paramediastinum]
        (has_location_qualifier) -> [right].
    (has_temporal) -> [persistent]
    (has_degree) -> [more_than_normal_degree].

/* BWH22.05. (possibly related to) previous radiation therapy */
[rad_finding:#BWH22.05] -
    (has_observation) -> [radiation_therapy]
    (has_temporal) -> [previous].

/* BWH22.06. new plate like opacities left mid lung zone */
[rad_finding:#BWH22.06] -
    (has_observation) -> [opacity]
    (has_location) -> [lung]
        (has_location_qualifier) -> [zone]
        (has_location_qualifier) -> [left]
        (has_location_qualifier) -> [mid],
    (has_temporal) -> [new]
    (has_property) -> [platelike].

/* BWH22.07. (compatible with) atelectasis (based on finding BWH22.06) */
[rad_finding:#BWH22.07] -
    (has_observation) -> [atelectasis].

/* BWH22.08. some interval improvement in left pleural effusion */
[rad_finding:#BWH22.08] -
    (has_observation) -> [pleural_effusion]
    (has_location_qualifier) -> [left]
    (has_temporal) -> [improved]
    (has_degree) -> [some]
    (has_temporal) -> [interval].

/* BWH22.09. slight interval decrease in left pleural effusion. */
[rad_finding:#BWH22.09] -
    (has_observation) -> [pleural_effusion]
    (has_location_qualifier) -> [left]
    (has_temporal) -> [decrease_in]
    (has_degree) -> [slight]
    (has_temporal) -> [interval].

/* BWH22.10. left lower lung zone and left lower lobe atelectasis */
[rad_finding:#BWH22.10] -
    (has_observation) -> [atelectasis]
    (has_location) -> [left_lower_lobe_lung]
    (has_location_qualifier) -> [zone]
    (has_location_qualifier) -> [left]
    (has_location_qualifier) -> [lower].

/* BWH22.11 (* BWH22.13). post-operative changes */
[rad_finding:#BWH22.11] -
    (has_observation) -> [observation]
    (has_temporal) -> [post_operative_changes].

/* BWH22.12. (consistent with) coronary artery bypass graft (based on finding BWH22.11) */
[rad_finding:#BWH22.12] -
    (has_observation) -> [coronary_artery_bypass_graft].

/* BWH22.14. (consistent with) previous lobectomy on the right (based on finding BWH22.13) */
[rad_finding:#BWH22.14] -
    (has_observation) -> [lobectomy]
    (has_location_qualifier) -> [right]
    (has_temporal) -> [previous].

The Canon Group's Effort: Working Toward a Merged Model. Carol Friedman, Stanley M Huff, William R Hersh, et al. JAMIA 1995 2: 4-18. doi: 10.1136/jamia.1995.95202547
Updated information and services can be found at: http://jamia.bmj.com/content/2/1/4
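As an illustration of how the type hierarchy and the [rad_finding] canonical graph listed in the appendices can be used together, the following minimal Python sketch reads `a < b` as "a is a kind of b" and checks that the concept filling each qualifier relation of a structured finding falls under the type the canonical graph requires. It covers only a small subset of the hierarchy, and the names `PARENT`, `RAD_FINDING_SLOTS`, `is_a`, and `slot_errors` are hypothetical; they are not part of the Canon Group's model or of any tool described in the paper.

```python
# Illustrative subset of the type hierarchy (child -> parent).
# Each entry mirrors a statement of the form "child < parent."
PARENT = {
    "degree": "qualifier",
    "temporal": "qualifier",
    "property": "qualifier",
    "mild_degree": "degree",
    "slight": "mild_degree",
    "some": "mild_degree",
    "change": "temporal",
    "interval": "temporal",
    "improved": "change",
    "decrease_in": "change",
    "shape_qualifier": "property",
    "platelike": "shape_qualifier",
}

# Relation -> required type of its target concept, following the
# [rad_finding] canonical graph (assumption: only three relations shown).
RAD_FINDING_SLOTS = {
    "has_degree": "degree",
    "has_temporal": "temporal",
    "has_property": "property",
}


def is_a(concept, ancestor):
    """Return True if `concept` equals `ancestor` or lies below it."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = PARENT.get(concept)
    return False


def slot_errors(finding):
    """List (relation, concept) pairs whose concept has the wrong type."""
    return [
        (relation, concept)
        for relation, concept in finding
        if relation in RAD_FINDING_SLOTS
        and not is_a(concept, RAD_FINDING_SLOTS[relation])
    ]


# Qualifier relations of finding BWH22.08
# ("some interval improvement in left pleural effusion").
bwh22_08 = [
    ("has_temporal", "improved"),
    ("has_degree", "some"),
    ("has_temporal", "interval"),
]

print(slot_errors(bwh22_08))
print(slot_errors([("has_degree", "platelike")]))
```

The first call reports no errors because `improved`, `some`, and `interval` each sit below the type their relation expects. The second call flags `platelike`, which is a shape qualifier under `property` and therefore not a valid filler for `has_degree`.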
