1. Introduction
In traditional computer-aided design (CAD) practice, most interdisciplinary data exchange takes place through two-dimensional (2D) drawings, documents, or reports. Because these files are unstructured, other software tools can hardly use them, and each discipline has to remodel the data for its own tasks. Because elements in electronic documents (e.g., points, lines, and planes) carry no semantic meaning, designers have to interpret them manually, based on experience, to identify and extract the required data [1]. The CAD-based process therefore limits the reusability of project data across the building life cycle. In contrast, Building Information Modeling (BIM) technology represents the geometry, properties, and relations of building objects using an object-oriented method [2]. In its early stage, the purpose of BIM technology was to make a complete model available to every participant. Nowadays, BIM is increasingly used in the architecture, engineering, construction, and facility management (AEC/FM) industry [3], producing large amounts of structured data in various domains that can be interpreted by different software tools and used for different business tasks. Here, a business task means that participants use project data to carry out activities for a professional application; it can take place at any point in the building life cycle. The complete model is useful to participants because its information is rich and precise.
However, as projects become larger and more complicated, project information increases dramatically [4], resulting in very large BIM model files. Such large models are difficult for participants to use, and processing the complete model to find the required information inevitably takes a long time. In general, a designer or engineer requires only part of the model data for a business task, rather than the complete model. For example, a structural engineer mainly needs the structural objects (such as columns, walls, and slabs) from the architectural model for structural design and analysis, not the overall architectural model. A method or system that automatically extracts the required information from BIM models can improve quality and productivity by preventing unnecessary work. Lacking effective extraction tools, designers and engineers generally process the original model manually. The sheer volume of information makes direct processing difficult and leads to inefficient data sharing and exchange between software tools. Extracting the required data from the original BIM model has therefore become one of the problems that must be addressed in BIM use [5].
In general, commercial software tools in the AEC/FM industry provide functions for querying or extracting objects. However, only the specific partial model data predefined in these tools can be exported. Alternatively, designers and engineers can use the embedded filters in BIM authoring tools to select the components of interest (e.g., make only the structural components or only the beams visible) and then save this model for use. With this approach, it is difficult to extract other model data according to specific requirements or purposes (e.g., extracting only the concrete components). Consequently, a number of plug-ins for designated software tools were developed to extract the required information. Their functions, however, can be used only in those specific tools and are unavailable elsewhere. Although objects can be selected manually one by one for extraction, this process is cumbersome and error-prone, and manual selections cannot be stored as templates for reuse. Hence, extracting partial models based on a public data schema is necessary to meet the requirements of diverse business tasks.
Industry Foundation Classes (IFC) was developed to support the full range of data exchange [6]. Many studies on partial model extraction have been carried out; they are reviewed in the following section. However, most methods were developed to extract specific model data for specific business tasks. Users can hardly extract model data according to their own intent, and sometimes have to select the required objects manually. A method that extracts the required model data based on users’ requirements is therefore needed. In this paper, a generic language based on the eXtensible Markup Language (XML) format was designed to extract partial models from IFC models. With the proposed language, users write a configuration file that defines the extraction requirements, and the required model data is then automatically extracted and assembled into a partial model. The proposed method supports diverse definitions of extraction requirements, including object types, attributes, and relations. To make these definitions more rigorous and standardized, the Selection Set was proposed to represent extraction requirements. Furthermore, mathematical logic and set theory [7] were used to describe rules for partial model extraction, so that IFC data with multiple representations can be processed to form valid partial IFC models.
The rest of the paper is organized as follows. First, related work on partial model extraction is reviewed from two aspects: task-specific and user-defined methods. Second, based on a comparative analysis, the concept of the Selection Set is proposed to integrate users’ requirements. Third, seven rules for extracting syntactically and semantically valid partial IFC models are defined. Subsequently, a generic language for partial model extraction from an IFC model based on the Selection Set (PMESS) is developed using XML schema, and the proposed method is validated through a test case. The final section summarizes the most important conclusions.
2. Related Work
The methods of extracting partial models can be classified in different ways [8,9]. This section presents these methods according to task-specific and user-defined requirements.
2.1. Partial Model Extraction According to Specific Tasks
At present, many commercial software tools, such as Revit, ArchiCAD, and Tekla Structures, provide data interfaces for specific tasks. The required objects and attributes can be extracted from the original model by using these tools. However, these data interfaces cannot be applied to other software tools. Because the algorithms and code of commercial tools are not public, it is difficult to modify them to extract different model data according to users’ intentions. Besides native model filtering for export, some software products provide functionality for exporting IFC files. For example, the IFC Translator in ArchiCAD [10] can export different IFC models according to three options: (1) selected elements only; (2) visible elements, on either all stories or the current story; (3) entire project. The first and second options produce partial models, and the third produces the complete one. Similarly, the ‘IFC Export Setup Options’ functionality in Revit [11] can export only the elements visible in the view. This functionality is mainly used to extract physical objects according to type or view; it can hardly extract a partial model according to other requirements, such as relationships and required attributes.
IFC is a de facto standard to support a full range of business tasks in the AEC/FM industry [12]. It is a rich schema for representing diverse information throughout the building life cycle. Rather than exchange requirements (ERs) for specific tasks, the IFC schema focuses on the complete building information shared among all actors at every stage of a building. Consequently, the Information Delivery Manual (IDM) and Model View Definition (MVD) were proposed by buildingSMART International (bSI). The main purpose of the IDM is to define the exchange requirements of specific tasks in non-technical terms; an MVD, as a subset of the IFC schema, is a technical standard that translates IDM-based exchange requirements into Model Views for software implementation. Hence, software tools are able to export the required partial models by defining different MVDs. So far, bSI has published several MVDs, such as IFC4 Reference View and IFC4 Design Transfer View [13]. Based on the Georgia Tech Process to Product Modeling (GTPPM) approach [14], Lee et al. [15] developed an eXtended Process to Product Modeling (xPPM) tool to automatically generate process maps (PMs), ERs, functional parts (FPs), and MVDs. The xPPM promotes consistent and reliable implementations of IDM and MVD, which standardize the information exchanged in domain-specific tasks. For example, recent versions of Revit can export IFC models according to a set of MVDs, such as IFC2x2 Singapore BCA e-Plan Check, IFC2x3 Coordination View 2.0, IFC2x3 Basic FM Handover View, and IFC4 Reference View. However, apart from these MVDs, other published ones are not yet widely supported by BIM software tools. Several tools, such as IfcDoc [16] and ViewEdit [17], can be used to define and document new MVDs, so users can generate their own MVDs according to the requirements of business tasks. However, additional work is still needed to develop corresponding software tools that implement these MVDs.
Currently, numerous research projects address how to efficiently extract geometric information from data models [18,19,20]. However, business tasks require different kinds of information, not just geometric information. Hence, different frameworks and methods were developed to extract the required information for specific business tasks. For example, in the structural domain, information such as the type, location, geometry, and material of objects must be extracted from architectural models. Qin et al. [21] proposed the Structural General Format (SGF) based on XML and developed an algorithm that automatically extracts structural information from IFC-based architectural models to generate SGF-based models. Besides the IFC data format, the SGF-based model can be translated into different finite element models for structural analysis. Hu et al. [22] proposed an IFC-based Unified Information Model with conversion algorithms between architectural and structural models, and among various structural analysis models. Beyond the structural domain, other business tasks also need to extract information from upstream models, e.g., energy simulation [23] and construction scheduling [24]. The aforementioned studies mainly extract the required information for specific business tasks.
2.2. Partial Model Extraction According to User-Defined Requirements
Due to the multidisciplinary, multi-stage, and multi-party nature of building projects, business tasks require different information. Studies on partial model extraction can be divided into two types: schema-level and instance-level methods. The former develops a definition format with various mappings for data exchange, whereas the latter deals directly with the data in the original model. A schema-level method extracts model data according to a predefined model data structure, so all possible transformations must be defined in advance. An instance-level method focuses on the specific information of project objects, which in an IFC file means the corresponding IFC instances, and enables the extraction of designated model data according to user-defined requirements. The instance-level method gives users more flexibility and is easier to understand, but it requires more complex querying algorithms than the schema-level method.
2.2.1. Methods at the Schema Level
Partial Model Query Language (PMQL) [25] and the Generalized Model Subset Definition schema (GMSD) [26] are two early schema-level partial model extraction methods. PMQL was developed based on XML, Structured Query Language (SQL), and Simple Object Access Protocol (SOAP). It aims to extract a partial instance model from the original IFC model through select, update, and delete operations. Inspired by PMQL and the SPARQL Protocol and RDF Query Language (SPARQL), Mazairac and Beetz [9] used the BIM Server to develop an open query language, the Building Information Model Query Language (BIMQL). The purpose of this language is to select, set, create, and delete IFC model data for managing BIM models; so far, the ‘select’ and ‘set’ functions have been developed. However, the IFC schema includes numerous logical relations, and extracting specific relations with PMQL requires many iterative request-response cycles [26]. Furthermore, PMQL has room for improvement in areas such as path expressions, nested queries, and inheritance hierarchies. As a result, GMSD was developed to support the dynamic selection of object instances and the filtering of a building model through predefined model view definitions. GMSD was designed to support EXPRESS-based models for consistency with IFC. Users need to define or edit MVDs within the GMSD method, but defining an MVD is challenging for users. To solve this problem, bSI developed mvdXML [27], an official XML-based data format for capturing MVDs. The mvdXML format is a machine-interpretable representation of information exchange in the IFC schema and can be easily processed by software tools. More and more software tools are expected to support mvdXML. Inspired by this approach, the method proposed in this paper was also developed based on XML.
New model view definitions generated at the schema level need to be validated syntactically and semantically, so Yang and Eastman [28] defined a series of rules for subset generation using set theory, aiming to support specific exchanges through the generation of valid model views. Furthermore, Lee [29] proposed the ‘minimal set’, the smallest complete subset of a schema related to a concept. Several conditions for extracting valid subsets from an EXPRESS schema were defined to match the concepts, and a tool called ‘IFC Model View Extractor’ (alpha version) was developed for generating subschemas from the IFC schema. Following the subset generation rules [28], Yu et al. [30] proposed a semi-automatic generation method for MVDs, which can extract partial models according to the core concepts of specific tasks. However, these core concepts need to be accurately predefined by users. In this study, with reference to the rule-based subset generation method [28], the rules for Selection Set-based partial model extraction were designed based on mathematical logic and set theory.
Some other researchers attempted to convert IFC models into a generic data schema. To meet the requirements of spatial analysis, Daum and Borrmann [31,32] carried out a topological analysis of BIM models and proposed the Query Language for Building Information Models (QL4BIM). This method enables users to extract a partial model by defining a boundary representation. Fuchs and Scherer [33] developed the Multi-Model Query Language (MMQL), which requires homogeneous data access to link and filter multi-model information, such as the bill of quantities, building, and schedule. Nevertheless, the export results are documented in a textual format rather than in an IFC-based or other model format. Zhang and El-Gohary [34,35] developed an automated BIM information extraction method that extracts the required information from IFC models using semantic Natural Language Processing techniques and the Java Data Access Interface. Some limitations remain in this method; for example, IFC relations in the extracted model are not yet fully aligned with the proposed semantic logic-based representation. Pauwels and Terkaj [36] proposed a procedure to convert the IFC EXPRESS schema into an ifcOWL ontology for the construction industry.
2.2.2. Methods at the Instance Level
A schema-level extraction method always needs to be defined in a formal language for the data schema [15], which is a complex and difficult task for users in the AEC/FM industry. Instance-level methods (operating on, e.g., objects and properties) were therefore proposed to meet user-defined exchange requirements. Katranuschkov et al. [8] adopted a semantic query method to extend the GMSD definition to the instance level and developed the Multi-model View Generator to extract partial models from BIM and non-BIM data. Based on the GMSD work, Windisch et al. [37] proposed a generic framework for the consistent generation of BIM-based model views, which aims to provide filtering at the class and object levels and the generation of ad-hoc and multi-model views. However, the relations between these levels need to be further studied and implemented. To avoid defining a data schema, Won et al. [38] proposed a no-schema algorithm that extracts a partial model from an IFC model depending on user-defined object types or predefined ERs. The current version cannot extract a partial model under combinatorial conditional expressions.
Currently, several open source software libraries help users and software developers work with IFC files, such as IfcOpenShell [39], xBIM [40], and IfcPlusPlus [41]. Users can choose any of these libraries to read IFC files according to their requirements; IfcPlusPlus was selected as the IFC library in this study.
2.3. Summary on Related Research Works
The partial model extraction methods in the first subsection are mainly used in domain-specific tasks. The second subsection presents two types of generic partial model extraction methods related to user-defined requirements. The first extracts partial models through definitions at the schema level. Although it is general enough to meet the various requirements of business tasks, users must be familiar with the definitions used by these methods. The second provides a selection function to extract partial models at the instance level. However, current tools mainly extract common physical elements and do not fully support extraction under restrictions.
To support the extraction with user-defined requirements, the Selection Set was proposed to integrate user-defined requirements, and the XML format was used to design a generic language to automatically extract partial models. By using the proposed method, the required objects and their attributes can be extracted from the original model according to the user-defined requirements, and other objects that are not required can be filtered out. The key characteristics of the proposed method compared to related studies are summarized as follows:
- (1) The mathematical logic and set theory were used to define Selection Set and extraction rules for partial model extraction. The mappings between IFC data and user-defined requirements were developed by using the mathematical method, ensuring the stability of the proposed algorithm. When the version of IFC schema is updated, only some definitions in the Selection Set or extraction rules need to be updated or revised, rather than the entire extraction algorithm. The structure of the proposed language is stable and independent of IFC versions.
- (2) A generic language for partial model extraction was designed based on Selection Set (PMESS). In order to extract different model data, data extraction requirements were analyzed and classified according to IFC schema. Subsequently, the technical structure of the partial model extraction language was developed by using the software-independent XML schema. The purpose of Selection Set is to standardize and integrate the users’ extraction requirements, and its elements can be used to map into different user-defined requirements. By using the proposed language, the partial model can be extracted according to the user-defined extraction requirements, including objects, properties, and relations.
3. Concept of the Selection Set
During partial model extraction, the software tool first identifies the extraction requirements defined by users, then extracts the information that meets these requirements, and finally forms a valid data model from the extracted information. The extraction requirements can therefore be regarded as input parameters. In this study, user-defined extraction requirements are integrated into the Selection Set, which can be viewed as a collection of basic sets with specific semantics used to extract partial models.
The Selection Set is an information set that is formed based on the requirements, such as object types, attributes, relations, and mixed ones. Elements in the Selection Set are used as input parameters of the proposed method. According to referencing relations between IFC data, the proposed method queries IFC data based on input parameters and then exports the required IFC model data, that is, a partial model or sub-model.
Extraction requirements can be classified as different semantics and relationships, such as object types, properties, relations, and mixed cases. In terms of data representation in the IFC schema, the entities and rules can be used to describe these extraction requirements. Hence, the first condition for the Selection Set is defined as follows:
Condition 1: A Selection Set includes a set of Entities and Rules, and has at least one Entity.
where $e$ is an ENTITY; $r$ is a Rule; $S$ is the Selection Set; $E$ is a non-null set of Entities; and $R$ is a set of Rules.
In an IFC model, every IFC instance represents a specific meaning, and it is illegal to have an abstract IFC entity in the IFC model. The proposed method queries and extracts IFC instances from the IFC model according to the Selection Set, so abstract IFC entities should not be included in the Selection Set. Condition 2 for the Selection Set is listed as follows. This condition is similar to rule BR02 defined by Yang and Eastman [28].
Condition 2: A Selection Set cannot include an abstract entity.
where $A_{abs}$ is the set of abstract entity data types.
Business tasks require diverse information, so a set of exchange requirements may need to be defined for partial model extraction. In some cases, the partial model needs to meet only one of several Selection Sets, while in other cases it must meet all of them. The relation among Selection Sets is ‘union’ in the former case and ‘intersection’ in the latter. Hence, the theorem for forming new Selection Sets is defined as follows:
Theorem 1: Forming new Selection Sets
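The union of many Selection Sets is still a Selection Set, and the intersection of many Selection Sets is also a Selection Set. Let $s_i$ denote a Selection Set; then $\bigcup_i s_i$ and $\bigcap_i s_i$ are also Selection Sets.
Proof:
- (1) Let $a$ be an element of $\bigcup_i s_i$, that is, $a \in \bigcup_i s_i$. There is at least one $s_i$ such that $a$ belongs to $s_i$. Because $a \in s_i$ and $s_i$ is a Selection Set, $a$ satisfies Conditions 1 and 2;
- (2) Let $b$ be an element of $\bigcap_i s_i$, that is, $b \in \bigcap_i s_i$. For any $s_i$, $b$ belongs to $s_i$. Because $b \in s_i$ and $s_i$ is a Selection Set, $b$ satisfies Conditions 1 and 2. □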
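Conditions 1 and 2 and the union and intersection of Theorem 1 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `SelectionSet` class, the small `ABSTRACT_ENTITIES` registry, and the representation of Rules as plain strings are assumptions made only for illustration. The sketch applies set union and intersection directly to the Entity and Rule sets and re-checks both conditions on the result, so an intersection that contains no Entity is rejected by Condition 1.

```python
from dataclasses import dataclass

# Hypothetical registry holding a few abstract IFC entity names (illustration only).
ABSTRACT_ENTITIES = frozenset({"IfcRoot", "IfcProduct", "IfcElement"})


@dataclass(frozen=True)
class SelectionSet:
    entities: frozenset              # E: entity type names
    rules: frozenset = frozenset()   # R: rules, kept as plain strings in this sketch

    def __post_init__(self):
        # Condition 1: the Selection Set has at least one Entity.
        if not self.entities:
            raise ValueError("Condition 1 violated: no Entity in the Selection Set")
        # Condition 2: the Selection Set cannot include an abstract entity.
        abstract = self.entities & ABSTRACT_ENTITIES
        if abstract:
            raise ValueError(f"Condition 2 violated: {sorted(abstract)}")

    def union(self, other):
        return SelectionSet(self.entities | other.entities, self.rules | other.rules)

    def intersection(self, other):
        return SelectionSet(self.entities & other.entities, self.rules & other.rules)


s1 = SelectionSet(frozenset({"IfcColumn", "IfcBeam"}), frozenset({"material=concrete"}))
s2 = SelectionSet(frozenset({"IfcColumn", "IfcSlab"}))
print(sorted(s1.union(s2).entities))
print(sorted(s1.intersection(s2).entities))
```

Re-validating in the constructor keeps every derived Selection Set subject to the same two conditions as a hand-written one.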
4. Rules for Partial Model Extraction
The output of the partial model extraction method is an IFC file, so the file must comply with the IFC schema. During extraction, the proposed method processes the original IFC model according to the rules for partial model extraction and then exports the partial IFC model by integrating the required model data. These rules fall into three types: basic rules for a valid IFC file, extraction rules based on the Selection Set, and a processing rule for redundant information, as shown in Figure 1.
Basic rules for a valid IFC file: The basic rules refer to the fundamental requirements that should be complied with when an IFC file is formed. The proposed method is to export an IFC file, so the extracted partial model should also comply with these basic rules.
Extraction rules based on Selection Set: The extraction requirements were included in the Selection Set. In order to query and extract the matching IFC data, the mapping from elements in the Selection Set to IFC model data was developed.
Processing rule for redundant information: Required data could be identified according to the basic rules and extraction rules. In the final step to export the IFC file, the information that is not required by business tasks should be filtered to ensure the validity of the IFC file.
Existing extraction methods generally process BIM data based on one IFC version. When the IFC version is updated, defining a mapping between the old version and the new one is a major undertaking, and the corresponding algorithm may need to be modified. With the proposed rules, the data processing of partial model extraction is divided into separate steps, which significantly reduces the modification work needed when the IFC version is updated and keeps the method stable. The data processing flow can be summarized as follows. Step 1 queries the corresponding IFC data according to the user-defined requirements (such as object types, properties, and relations), which are obtained from the Selection Set. Step 2 locates each target object through the IFC reference relationships and retains all of its corresponding attributes. Finally, all target objects and their attributes are extracted to form a new IFC file. The following subsections describe the proposed rules in detail.
4.1. Basic Rules for a Valid Industry Foundation Classes (IFC) File
The IfcProject is an important entity in an IFC file. It is not only a foundation of space structure entities (such as IfcSite, IfcBuilding, and IfcBuildingStorey), but also contains unit, owner history, geometric representation and other basic information of a building project. An IFC file has only one IfcProject entity [42]. Based on this entity, some basic project information can be queried from the IFC model data, such as site, building, and unit. Hence, a partial model should contain the IfcProject entity and related entities which are referenced by attributes of the IfcProject.
Rule 1: The partial model has only one IfcProject entity and an entity set which consists of other entities referenced by attributes of the IfcProject.
where $M_o$ is the original model; $e_{pro}$ is the IfcProject entity; $R_{pro}$ is a set of entities referenced by IfcProject entity’s attributes; and $M_p$ is the partial model.
In the IFC schema, the IFC entity mainly contains explicit and inverse attributes [43]. The explicit attributes are scalar values or the information computed from other attributes, while the inverse ones are identified relationally through other entities.
To ensure the completeness of the partial model, when a designated entity (the ‘target entity’) is extracted, all entities referenced by its explicit attributes should be extracted with it. The set of these referenced entities is defined as the Essential Set (ES) of the target entity.
Rule 2: The Essential Set of the target entity is included in the partial model.
where $R_e$ is the entity set referenced by the attributes of $e$.
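As a rough illustration of Rule 2, the Essential Set can be computed as the set of instances reachable through explicit-attribute references. The sketch below is an assumption-laden toy, not the paper's algorithm: the `MODEL` dictionary stands in for a parsed IFC file, the instance ids are invented, and the references are assumed to be followed transitively so that the extracted instances do not point at missing data.

```python
# Toy stand-in for an IFC model (hypothetical ids):
# instance id -> (entity type, ids referenced by explicit attributes).
MODEL = {
    1: ("IfcColumn", [2, 3]),
    2: ("IfcOwnerHistory", []),
    3: ("IfcLocalPlacement", [4]),
    4: ("IfcAxis2Placement3D", [5]),
    5: ("IfcCartesianPoint", []),
    6: ("IfcBeam", [2]),
}


def essential_set(model, target_id):
    """Collect the ids reachable from target_id via explicit-attribute references."""
    seen = set()
    stack = list(model[target_id][1])
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        # Follow the explicit references of the referenced instance as well.
        stack.extend(model[current][1])
    return seen


print(sorted(essential_set(MODEL, 1)))
```

The `seen` set guards against revisiting shared instances such as the owner history, which many instances reference.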
Besides explicit attributes, some information about the target entity is represented by other IFC instances defined in its inverse attributes, and these entities should also be extracted. Following the referencing and inheritance structure of the IFC model, the entities in the ES must likewise be queried to find the entities defined in their inverse attributes. Note that some IFC entities represent basic information of a building project and may be referenced by many IFC instances; a representative example is IfcOwnerHistory. If the inverse attributes of these entities were searched, unrequired entities and duplicate entities would be extracted. To avoid this, these entities form the Particular Set (PS) and serve as end points of the query process. Besides IfcOwnerHistory, the basic entities IfcDirection, IfcCartesianPoint, IfcAxis2Placement3D/IfcAxis2Placement2D, IfcLocalPlacement, and IfcGeometricRepresentationContext/IfcGeometricRepresentationSubContext were set as particular entities in this study. When the query of inverse attributes reaches a particular entity, the proposed method stops following that branch and moves on to the next query, so that non-target IFC entities are not extracted. The set comprising the entities in the ES and the entities defined in inverse attributes is defined as the Individual Set (IS) of the target entity. The IS contains the complete information of the target entity.
Rule 3: The partial model includes entities defined in inverse attributes of the target entity, which are not queried from the Particular Set.
where $e_{inv}$ is an entity defined in inverse attributes of the target entity $e$; $R_{inv}$ is the inverse relation from $e$ to $e_{inv}$; $E_{ps}$ is a set of particular entities in the Particular Set; and $E_{inv}$ is a set of required entities $e_{inv}$.
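The role of the Particular Set as a stopping point can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' code: the toy `MODEL`, its ids, and the helper names are hypothetical, and an inverse attribute is approximated as "any instance that references the current one". The entity names in `PARTICULAR_TYPES` are the ones listed in the text.

```python
# Particular Set: inverse attributes of these entity types are never searched.
PARTICULAR_TYPES = {
    "IfcOwnerHistory", "IfcDirection", "IfcCartesianPoint",
    "IfcAxis2Placement3D", "IfcAxis2Placement2D", "IfcLocalPlacement",
    "IfcGeometricRepresentationContext", "IfcGeometricRepresentationSubContext",
}

# Toy model (hypothetical ids): id -> (entity type, explicitly referenced ids).
MODEL = {
    1: ("IfcColumn", [2, 3]),
    2: ("IfcOwnerHistory", []),
    3: ("IfcLocalPlacement", []),
    4: ("IfcRelDefinesByProperties", [1, 2, 5]),
    5: ("IfcPropertySet", [2]),
    6: ("IfcBeam", [2, 3]),
}


def referencing(model, target_id):
    """Ids of the instances that reference target_id (a stand-in for inverse attributes)."""
    return {i for i, (_, refs) in model.items() if target_id in refs}


def individual_set(model, target_id):
    """Explicit references plus inverse references, stopping at particular entities."""
    visited = {target_id}
    result = set()
    queue = [target_id]
    while queue:
        current = queue.pop()
        entity_type, refs = model[current]
        neighbours = set(refs)                              # Rule 2: explicit attributes
        if entity_type not in PARTICULAR_TYPES:
            neighbours |= referencing(model, current)       # Rule 3: inverse attributes
        for n in neighbours - visited:
            visited.add(n)
            result.add(n)
            queue.append(n)
    return result


print(sorted(individual_set(MODEL, 1)))
```

In this toy model the beam shares the owner history and placement with the column; because both are particular entities, the search does not walk back from them to the beam. A real relation instance may also list several related objects, which this sketch does not restrict.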
4.2. Extraction Rules Based on Selection Set
According to Condition 1, the Selection Set has at least one Entity e. In the IFC model, all IFC instances matching the Entity e should be extracted. Furthermore, other IFC instances related to explicit and inverse attributes of the target entity should be extracted according to Rule 2 and Rule 3.
Rule 4: IFC instances matching Entity $e$ in the Selection Set are contained in the partial model.
where $E_e$ is the set of entities related to Entity $e$; and $f_M$ is a function from $e$ to $E_e$, working on original model $M_o$.
Numerous complex relations exist between the objects in a building project. For example, binary relations can be divided into different types, such as containment, parallel, and crosscutting relations. Consequently, relation entities should be allowed in the Selection Set. All IFC instances participating in the user-defined relations in the Selection Set are extracted to form the partial model. In this study, the object that contains, or is relied on by, other object(s) is called the relating object, and the other object(s) are called related object(s).
Rule 5: The corresponding relating entity and related entities are included in the partial model, when a relation entity is included in the Selection Set.
where $e_{rel}$ is a relation entity; $e_{relating}$ is a relating object entity; $E_{related}$ is the set of related object entities; and IFCREL is a function to test if the relation entity $e_{rel}$ exists between relating object entity $e_{relating}$ and related object entities $E_{related}$.
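Rule 5 can be illustrated with a small sketch in which each relation instance records one relating object and several related objects. The `Relation` tuple, the sample data, and `extract_by_relation` are hypothetical names introduced here; the two relation type names are standard IFC relationship entities, but the flattened representation is an assumption for illustration.

```python
from typing import NamedTuple


class Relation(NamedTuple):
    rel_type: str    # relation entity type, e.g. "IfcRelAggregates"
    relating: str    # id of the relating object
    related: tuple   # ids of the related objects


# Hypothetical sample data: a building with two storeys and a few elements.
RELATIONS = [
    Relation("IfcRelAggregates", "building", ("storey1", "storey2")),
    Relation("IfcRelContainedInSpatialStructure", "storey1", ("column1", "beam1")),
    Relation("IfcRelContainedInSpatialStructure", "storey2", ("column2",)),
]


def extract_by_relation(relations, selected_rel_types):
    """Keep the relating and related objects of every selected relation type."""
    kept = set()
    for rel in relations:
        if rel.rel_type in selected_rel_types:
            kept.add(rel.relating)
            kept.update(rel.related)
    return kept


print(sorted(extract_by_relation(RELATIONS, {"IfcRelAggregates"})))
```

Selecting a relation type therefore pulls in both ends of each matching relation, which is the behaviour the rule describes.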
The attributes of an IFC entity represent the essential characteristics that distinguish it from other entities. Rules can be formed from these attributes, so the corresponding model data can be extracted by designing different rules.
Rule 6: The entities included in the partial model satisfy the rules in the Selection Set.
where $r$ is a Rule.
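Rules 4 and 6 together amount to filtering instances by entity type and then by attribute conditions. The following sketch assumes a flattened record per instance and treats each Rule as a Python predicate; the field names (including `material`, which in a real IFC file is reached through a relation rather than stored on the element), the sample data, and `select_instances` are all hypothetical.

```python
# Hypothetical flattened instances; real IFC data would need attribute lookups.
INSTANCES = [
    {"id": 1, "type": "IfcColumn", "Name": "C1", "material": "Concrete"},
    {"id": 2, "type": "IfcColumn", "Name": "C2", "material": "Steel"},
    {"id": 3, "type": "IfcBeam", "Name": "B1", "material": "Concrete"},
]


def select_instances(instances, entity_types, rules):
    """Rule 4: match the entity type. Rule 6: satisfy every rule (predicate)."""
    return [
        inst for inst in instances
        if inst["type"] in entity_types and all(rule(inst) for rule in rules)
    ]


def is_concrete(inst):
    return inst.get("material") == "Concrete"


concrete_columns = select_instances(INSTANCES, {"IfcColumn"}, [is_concrete])
print([inst["Name"] for inst in concrete_columns])
```

With an empty rule list the function reduces to plain type matching, since `all` of no predicates is true.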
4.3. Processing Rule for Redundant Information
According to the elements in the Selection Set (Rules 4, 5, and 6), the matching IFC instances are extracted from the original model, while the related necessary IFC instances are extracted based on Rules 1, 2, and 3. IFC entities that are not defined in the Selection Set and cannot be inferred from it should not be included in the partial model. These entities are called redundant information in this study. The redundant information, including IFC instances and related attributes, must be filtered out before the partial model is exported.
Rule 7: The partial model cannot include entities which are not defined or inferred in the Selection Set.
where
is an Entity undefined in the Selection Set or cannot be inferred from the Selection Set.
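The filtering in Rule 7 can be pictured as a reachability test. The sketch below assumes, purely for illustration, that "inferred from the Selection Set" means "reachable from a selected instance by following references"; the dictionary-based model and the `filter_redundant` function are hypothetical and do not reproduce the data process engine used in this study.

```python
def filter_redundant(model, selected):
    """Keep selected instances and everything they reference, directly or not.

    model:    dict mapping an instance id to the set of ids it references
    selected: ids matched by the Selection Set
    """
    keep = set()
    stack = list(selected)
    while stack:
        current = stack.pop()
        if current in keep or current not in model:
            continue
        keep.add(current)
        stack.extend(model[current])
    # Everything outside 'keep' is treated as redundant and dropped.
    return {inst: refs for inst, refs in model.items() if inst in keep}


toy_model = {
    "#1": {"#2"},   # selected column referencing its placement
    "#2": set(),    # placement
    "#3": {"#2"},   # unselected door sharing the same placement
    "#4": set(),    # unrelated instance
}
partial_model = filter_redundant(toy_model, {"#1"})
print(sorted(partial_model))
```

In this toy model the door `#3` and the unrelated instance `#4` are removed, while the placement `#2` survives because the selected column depends on it.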
5. Generic Language for Partial Model Extraction Based on the Selection Set
To ensure that the proposed method can be interpreted by software tools, an XML schema was adopted to define a generic language for partial model extraction based on the Selection Set (PMESS). The overall architecture of the PMESS is shown in Figure 2.
The ‘PMESS’ element is the root element at the first level of the overall architecture; elements are represented by box symbols. The element at the second level is the ‘select’ element, which indicates that the partial model is extracted based on the Selection Set. The ‘select’ element is a child of ‘PMESS’, to which it is connected by an arrow with a solid line. Conditions 1 and 2 are mainly prescribed by the elements ‘item’, ‘relation’, and ‘where’. The ‘item’ element defines the entity type, including object entities and attribute entities, and the ‘relation’ element defines the relation entity. The ‘item’ element has three attributes: type, match, and function. The relationship between an element and its attribute is represented by a dotted arrow. The ‘where’ element represents the rules for extracting partial models. Note that ‘item’ and ‘cascades’ are connected by a hollow arrow with a dotted line, which means that the structure of ‘cascades’ is the same as that of ‘item’. More details are discussed in the following sections. An example of a PMESS-based configuration file used for extracting concrete columns is shown in Figure 3.
5.1. ‘Select’-Mechanism
The ‘select’ element has one attribute, ‘option’, whose value is either ‘AND’ or ‘OR’. The ‘option’ attribute is designed to comply with the Theorem mentioned above: ‘AND’ and ‘OR’ denote the intersection and union of several items, respectively. When the value of ‘option’ is ‘AND’, IFC instances are extracted from the original model only if they match all the defined items. By contrast, with ‘OR’, an instance is extracted if it matches any one of the defined items. The default value of ‘option’ is ‘OR’.
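The intersection and union semantics of ‘option’ can be sketched with plain Python sets. The `combine_items` function and the instance ids below are hypothetical and only illustrate the set logic, not the actual query engine.

```python
def combine_items(item_matches, option="OR"):
    """Combine the instance ids matched by each 'item'.

    item_matches: list of sets, one set of matched ids per 'item'
    option:       'AND' for intersection, 'OR' for union (the default)
    """
    if not item_matches:
        return set()
    if option == "AND":
        return set.intersection(*item_matches)
    if option == "OR":
        return set.union(*item_matches)
    raise ValueError(f"unsupported option: {option!r}")


concrete_objects = {"column_1", "column_2", "wall_1"}
columns = {"column_1", "column_2", "column_3"}

both = combine_items([concrete_objects, columns], option="AND")
either = combine_items([concrete_objects, columns])
print(sorted(both), sorted(either))
```

With ‘AND’ only the concrete columns remain, whereas the default ‘OR’ returns every object matched by at least one item.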
5.2. ‘Item’-Classification
The type of entities in the Selection Set is defined by the ‘type’ attribute of ‘item’; the other two attributes of ‘item’ are ‘match’ and ‘function’. The ‘type’ attribute was designed to comply with Condition 1. It takes one of two values, ELEMENT or ATTRIBUTE, and must not refer to abstract entity types (as mentioned in Condition 2). The ‘match’ attribute enables users to give the name of the ELEMENT or ATTRIBUTE. A mapping between the values of ‘match’ and IFC entities/attributes has been established, so that IFC data can be queried automatically according to the user-defined requirements.
When the ‘type’ is ‘ELEMENT’, the proposed method will extract IFC object entities which comply with the requirements defined in ‘match’ and ‘where’ (as mentioned in Rule 4). Particularly, if the value of ‘match’ is ‘SET’, the partial model extraction is required to comply with the rules defined in the ‘relation’.
If the ‘type’ is ‘ATTRIBUTE’, the matching attribute entities in the IFC model will be extracted (as mentioned in Rule 4). The proposed method queries and extracts the object entities which have the matching attribute entities.
The third attribute of ‘item’ is ‘function’, which is currently limited to ‘extract’ for partial model extraction. The ‘filter’, ‘modify’, and ‘add’ values of ‘function’ will be studied in future work.
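To make the three attributes of ‘item’ concrete, the following sketch reads a small configuration with Python's standard `xml.etree.ElementTree` module. The XML layout, the ‘match’ values, and the `MATCH_TO_IFC` mapping are assumptions made for illustration; the actual PMESS schema shown in Figures 2 and 3 may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical PMESS document; element nesting and 'match' values are assumed.
PMESS_XML = """
<PMESS>
  <select option="OR">
    <item type="ELEMENT" match="Column" function="extract"/>
    <item type="ATTRIBUTE" match="FireRating" function="extract"/>
  </select>
</PMESS>
"""

# Assumed mapping from ELEMENT 'match' values to IFC entity names.
MATCH_TO_IFC = {"Column": "IfcColumn", "Beam": "IfcBeam"}


def read_items(xml_text):
    """Return the 'option' value and a list of (type, match) pairs."""
    root = ET.fromstring(xml_text)
    select = root.find("select")
    option = select.get("option", "OR")  # 'OR' is the documented default
    items = []
    for item in select.findall("item"):
        item_type = item.get("type")
        if item_type not in ("ELEMENT", "ATTRIBUTE"):
            raise ValueError(f"unknown item type: {item_type!r}")
        if item.get("function", "extract") != "extract":
            raise ValueError("only 'extract' is supported in this sketch")
        items.append((item_type, item.get("match")))
    return option, items


option, items = read_items(PMESS_XML)
ifc_entities = [MATCH_TO_IFC[m] for t, m in items if t == "ELEMENT"]
print(option, items, ifc_entities)
```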
5.3. ‘Relation’-Rule
According to the representation of IFC relation entities, the object entities within a relationship can be divided into the relating object entity and the related object entity (entities), as shown in Figure 4a. In the IFC schema, the IfcRelationship entity has many sub-entities that represent diverse relations, such as IfcRelContainedInSpatialStructure, IfcRelAggregates, and IfcRelAssignsToGroup. Figure 4b illustrates an example of the relating object entity and the related object entities defined by the IfcRelAssignsToGroup entity.
The relating object entity and related object entity (entities) are defined in the sub-element ‘relateto’ of ‘relation’ (as shown in Figure 2), while the sub-element ‘relationtype’ gives the type of ‘relation’ (Rule 5). Currently, the proposed method supports the extraction of building storey, group, and element assembly relations. As mentioned above, in the case of ‘type=ELEMENT’ and ‘match=SET’, the proposed method queries the matching relation entity according to the definition in ‘relation’, and extracts the IFC object entities referenced by that relation entity.
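A minimal sketch of the ‘match=SET’ case is given below. It assumes that ‘relation’ is nested inside ‘item’, that ‘relateto’ holds the name of the relating object, and that the three supported relation types map to the IFC relation entities listed in `RELATION_TYPES`; all of these are illustrative assumptions rather than the published schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical 'item' with a 'relation' rule; the nesting is assumed.
RELATION_XML = """
<item type="ELEMENT" match="SET" function="extract">
  <relation>
    <relationtype>storey</relationtype>
    <relateto>B1</relateto>
  </relation>
</item>
"""

# Assumed mapping from 'relationtype' values to IFC relation entities.
RELATION_TYPES = {
    "storey": "IfcRelContainedInSpatialStructure",
    "group": "IfcRelAssignsToGroup",
    "assembly": "IfcRelAggregates",
}

# Toy relation instances: (IFC relation entity, relating object, related ids).
relation_instances = [
    ("IfcRelContainedInSpatialStructure", "B1", ["column_1", "wall_1"]),
    ("IfcRelContainedInSpatialStructure", "1F", ["beam_1"]),
    ("IfcRelAssignsToGroup", "B1", ["pump_1"]),
]


def extract_by_relation(xml_text, instances):
    """Return the related object ids referenced by the configured relation."""
    item = ET.fromstring(xml_text)
    if item.get("type") != "ELEMENT" or item.get("match") != "SET":
        return []
    relation = item.find("relation")
    ifc_relation = RELATION_TYPES[relation.findtext("relationtype")]
    relating_name = relation.findtext("relateto")
    extracted = []
    for entity, relating, related in instances:
        if entity == ifc_relation and relating == relating_name:
            extracted.extend(related)
    return extracted


objects_in_b1 = extract_by_relation(RELATION_XML, relation_instances)
print(objects_in_b1)
```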
5.4. ‘Where’-Rule
The elements mentioned above are mainly used to extract objects of a certain type or relation, but not objects with given characteristics. Hence, the ‘where’ element was designed to define rules for extracting specific objects according to user-defined semantics (Rule 6). According to the characteristics of objects defined in the IFC schema, object semantics can be classified as direct or indirect. Direct semantics can be obtained directly from IFC instances, whereas indirect semantics have to be inferred or computed from other IFC instances. The direct semantics include the identifier (ID), name, description, and predefined type; the indirect semantics include the material, storey, shape, and comparison, as shown in Figure 2.
When these semantics are defined in the PMESS document, the proposed method can query the target data to form a valid IFC model. Figure 5 presents an example of all attributes of the IfcBeam entity and the corresponding IFC entities for the ‘where’ rules. The ID, name, and description are derived from the IfcRoot entity, a root entity in the IFC schema that stores the most fundamental information. The predefined type, an attribute added to the IfcBeam entity in the IFC4 version, is used to distinguish different types of the object (IfcBeam in this example). Indirect semantics, for example the material, require querying other IFC entities. In general, the IfcMaterial entity is associated with the IFC object entity through the IfcRelAssociatesMaterial entity, a subtype of the IfcRelAssociates entity.
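The difference between direct and indirect semantics can be sketched as follows. Direct keys are read from the object's own attributes, while the material is resolved through a stand-in for IfcRelAssociatesMaterial. The toy data, the key names, and the `satisfies` function are hypothetical and only illustrate the lookup pattern.

```python
# Toy data; a real IFC model would be read with an IFC toolkit instead.
beams = {
    "beam_1": {"Name": "B-01", "PredefinedType": "BEAM"},
    "beam_2": {"Name": "B-02", "PredefinedType": "LINTEL"},
}

# Stand-in for IfcRelAssociatesMaterial: (material name, related object ids).
material_relations = [("Concrete", {"beam_1"}), ("Steel", {"beam_2"})]

DIRECT_KEYS = {"Name", "Description", "PredefinedType"}


def material_of(obj_id):
    """Indirect semantic: follow the association to find the material."""
    for material, related in material_relations:
        if obj_id in related:
            return material
    return None


def satisfies(obj_id, where):
    """Check every key of a 'where' rule against one object."""
    attrs = beams[obj_id]
    for key, expected in where.items():
        if key in DIRECT_KEYS:
            actual = attrs.get(key)        # read from the object itself
        elif key == "Material":
            actual = material_of(obj_id)   # resolved via another instance
        else:
            raise KeyError(f"unsupported 'where' key: {key}")
        if actual != expected:
            return False
    return True


rule = {"Material": "Concrete", "PredefinedType": "BEAM"}
concrete_beams = [b for b in beams if satisfies(b, rule)]
print(concrete_beams)
```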
5.5. ‘Cascades’-Rule
The definitions in ‘item’ are the overall requirements for extracting the partial model, and ‘cascades’ can be used to refine these requirements. The structure of ‘cascades’ is the same as that of ‘item’ to ensure a uniform definition. Figure 6 shows an example of a PMESS-based configuration file for extracting beams under the ‘where’ rule that the construction time is ‘2019-09-20’. In this case, the proposed method first queries all beams in the building project, and then queries the specific beams that match the ‘cascades’.
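The two-stage query described for ‘cascades’ can be sketched as two successive filters. The `ConstructionTime` key, the toy instances, and the `extract_with_cascades` function are hypothetical names chosen for this illustration.

```python
instances = [
    {"id": "beam_1", "type": "IfcBeam", "ConstructionTime": "2019-09-20"},
    {"id": "beam_2", "type": "IfcBeam", "ConstructionTime": "2019-10-05"},
    {"id": "column_1", "type": "IfcColumn", "ConstructionTime": "2019-09-20"},
]


def extract_with_cascades(instances, item_type, cascade_where):
    """Apply the 'item' first, then narrow the result with the 'cascades'."""
    # Stage 1: the 'item' selects every instance of the requested type.
    candidates = [inst for inst in instances if inst["type"] == item_type]
    # Stage 2: the 'cascades' keeps only candidates matching its rule.
    return [
        inst["id"]
        for inst in candidates
        if all(inst.get(key) == value for key, value in cascade_where.items())
    ]


selected = extract_with_cascades(
    instances, "IfcBeam", {"ConstructionTime": "2019-09-20"}
)
print(selected)
```

The column built on the same date is not returned, because the ‘cascades’ rule is applied only to the beams found in the first stage.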
The XML schema was adopted to define the PMESS elements so that they comply with Rules 4 to 6. Rules 1 to 3 are the fundamental rules for forming a valid IFC file, while Rule 7 is used to process redundant information within the extracted IFC instances. These four rules (Rules 1, 2, 3, and 7) are embedded in the data processing engine and do not need to be defined by users.
6. Test Case
The C++ programming language was used to develop two data interfaces that implement the proposed method (PMESS). One reads the PMESS-based configuration file, and the other extracts and exports the partial model. For further applications, these data interfaces were embedded into the proposed IFC-based platform. IFC models exported from several software tools (such as ArchiCAD, MagiCAD, Revit, and Tekla Structures) were used to verify the feasibility of PMESS. In this section, a practical project model created in ArchiCAD is used to demonstrate the utility of the proposed method.
The test case was conducted using the building model of a shopping mall project. The building has eight floors with a construction area of 148,564 m², including six floors above ground and two underground floors. The model was built in ArchiCAD and exported as an IFC file with the default settings. The IFC file was about 101 MB in size and contained 1,862,673 IFC instances. Figure 7 shows the visualization of this project in the proposed IFC-based platform. Because of the large IFC file size, it is necessary to extract partial models for different business tasks. Using the proposed method, several partial models were extracted under the following extraction requirements.
6.1. Partial Model Extraction for the User-Defined Extracted Objects
By setting different ‘item’ elements, the required objects can be extracted. Table 1 shows four examples of partial models extracted from the original model. The first three partial models each extract a single type of object, while the fourth extracts two types of objects. Figure 8 depicts the PMESS-based configuration file for the fourth partial model in the proposed platform. Through the PMESS, physical objects from different disciplines (architecture, structure, MEP, etc.) can be extracted from the original BIM model, such as doors and windows for architecture, beams and columns for structure, and equipment and pipelines for MEP.
Using the IFC File Analyzer [44], IFC model data can be analyzed in detail. As shown in Table 1, all IFC object entities that match the user-defined requirements were correctly extracted. Moreover, because other objects were filtered out, the resulting models contain only the required objects and their attributes. For example, the fourth partial model contained 173,763 IFC instances, so 90.7% of the instances in the original model were filtered out. Accordingly, the file size decreased to 13.8 MB (only 13.7% of the original model). The results show that the proposed method correctly identifies the user-defined requirements and extracts all the semantically required objects from the original model. In addition, the partial models are markedly smaller than the original model, which avoids manual filtering of redundant information and facilitates downstream business tasks based on the relevant building information.
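The reported reduction ratios follow directly from the instance counts and file sizes given above, as the short check below shows. The variable names are chosen for this illustration only.

```python
original_instances = 1_862_673
partial_instances = 173_763
original_size = 101.0   # file size of the original model
partial_size = 13.8     # file size of the fourth partial model, same unit

# Share of instances removed from the original model.
filtered_share = 1 - partial_instances / original_instances
# Size of the partial model relative to the original model.
size_ratio = partial_size / original_size

print(f"filtered out: {filtered_share:.1%}")  # filtered out: 90.7%
print(f"size ratio:   {size_ratio:.1%}")      # size ratio:   13.7%
```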
Besides extraction by object type, other semantics can be used to extract the required BIM data, such as relationships and rules (as shown in the following subsections).
6.2. Partial Model Extraction Based on the User-Defined Relations
This project is composed of underground and overground structures, so it needs to be designed by different designers. According to this requirement, the ‘relation’ rule was used to extract all objects in the different parts of this building. An example of the PMESS-based configuration file for extracting the underground structure is illustrated in Figure 9, and the resulting partial model is presented in Figure 10. The extracted partial model has a file size of 32.1 MB and includes 942 columns, 1380 beams, 1253 walls, 351 doors, 69 slabs, and 148 stairs. These extracted objects are consistent with the original model, which shows that the proposed method is capable of querying and extracting the required information according to the user-defined relation rule.
6.3. Partial Model Extraction Based on the User-Defined Rules
This building project contains numerous curtain walls, such as the peripheral curtain walls and the skylight on the sixth floor. Because of their complexity, the curtain walls had to be set up as separate models to be designed by different curtain wall engineers. For this purpose, curtain walls in different locations were extracted from the complete architectural model and imported back into the original software for further design and analysis, as shown in Figure 11. The main contents of the PMESS-based configuration files are presented in the middle part of Figure 11.
The file sizes of these two partial models were 2.37 MB and 0.59 MB, respectively, much smaller than that of the original model (101 MB). Engineers benefit from carrying out detailed design on the reduced models. The extracted partial models could be imported back into the original software, ArchiCAD, for detailed design. In addition, they were IFC-compliant models and could be imported into other BIM software tools (such as Revit and Tekla) for professional design, which demonstrated that the extracted IFC files were syntactically valid.
7. Conclusions
A building project always consists of different types of information from multiple disciplines. However, business tasks require only a part of the complete building information model. Meanwhile, the required information varies depending on business tasks. A common method for partial model extraction which meets user-defined extraction requirements is necessary. For this purpose, a generic language for partial model extraction based on the Selection Set was proposed to extract a partial model from the IFC model.
The Selection Set was designed to represent extraction requirements. Elements in the Selection Set work as input parameters of the partial model extraction method. Due to the complexity of requirements for business tasks, several extraction requirements could be defined in the intersection or union form.
Furthermore, seven rules were defined to extract the partial model based on mathematical logic and set theory. These rules ensure the syntactic and semantic validity of the partial IFC model during the extraction process. First, the proposed method queries the IFC data that match the requirements defined in the Selection Set, such as IFC entities, properties, and relations. Subsequently, according to the seven rules, the extracted IFC data are used as nodes from which to query other related IFC data, and redundant information is filtered out to form a valid partial model.
Considering the processability of the computer and the readability of users, the XML schema was adopted to design the generic language for partial model extraction. Given the definitions of building information in the IFC schema, this study developed a mapping between IFC data and the elements defined in the PMESS method, which could meet diverse requirements defined by users. Through the PMESS method, users can extract the required information from the original model under different extraction requirements, such as objects, properties, and relations. In addition, the PMESS-based configuration file can be saved as a common template for reuse, which improves the efficiency of the definitions of extraction requirements.
To demonstrate the feasibility of the proposed method, a practical project was used to extract different partial models under three conditions: object types, object relations, and specific rules. Compared with the original model, the required objects were correctly extracted, which showed the validity of partial models at the semantic level. Furthermore, the extracted partial models could be imported back into the original software tool, which demonstrated the syntactical validity of the extracted IFC file.
Currently, although some commercial software products can be used to extract the required objects, they require users to select the required objects manually, and the partial model cannot be extracted according to particular rules. Some researchers have carried out research on partial model extraction, but they mainly focus on specific information, such as geometric information. In this study, the proposed PMESS method enables users to extract the partial model automatically by defining requirements. Furthermore, users can define different requirements to extract the required partial model based on the PMESS, which can accommodate more applications.
This study is an important step in data sharing and exchange of building projects, but it also has room for improvement. For example, bSI proposed IDM and MVD for exchange requirements, and defined several templates for practical tasks. To extend the applicability of the proposed method, the PMESS should be mapped to IDM and MVD. Given that the PMESS was designed based on the XML schema, the mapping mechanism between the PMESS and mvdXML could be further studied.