1. CS7620-A Case-Based Reasoning December 09, 2009
CBArch Report
Urjit Bhatia, Andres Cavieres, Preetam Joshi, Radhika Shivapurkar
CS-7620 | Case-Based Reasoning | 12/09/09
1 General Problem
The general, long-term goal of this project is to build a tool that assists architects and engineers in the early
phase of the building design process known as conceptual design. The potential relevance of such a tool lies
in the fact that the most important decisions regarding the environmental impact, life-cycle performance and
operating costs of a building are taken in this phase.
The hypothesis of our work is that bringing in knowledge and expertise from good examples and relevant
cases (Figure 1) can contribute significantly to decision-making at the conceptual level. The energy
consumption of buildings is just one example of how early decisions on building shape, orientation,
construction materials and mechanical systems may have a positive impact if assessed correctly from the
beginning.
Figure 1 Relationship between amount of knowledge available for each design phase and design freedom, according to Fabrycky
(Fabrycky1991). Early design infusion is supposed to reduce design freedom (middle figure). On the right side, the image shows
the relation between levels of effort in design and distribution of added-values. Curve 1 means the ability to affect costs and
functional capabilities; Curve 2 refers to the cost of design changes; Curve 3 represents the traditional design effort distribution;
Curve 4 represents a BIM (CAD system) enhanced design effort distribution (Patrick MacLeamy). Knowledgeable efforts in early
phases clearly make more sense.
1.1 Specific Problem
The specific proposal is to use Case-Based Reasoning to support the exploration of initial building
configurations during the conceptual design phase. Conceptual design essentially focuses on the
definition of basic 3D models that represent the overall shape, size and orientation of a building, and how its main activities
are related to each other and distributed within a physical terrain (Figure 2). At small building scales this
exercise is not complex enough to justify the help of a CBR system, but in the
context of large commercial buildings such as office buildings, shopping malls, laboratories, schools or
hospitals, the assessment of relevant examples can provide valuable guidance during early design.
Georgia Institute of Technology Page | 1
Figure 2 Example of conceptual design building models.
Some knowledge and guidance can be provided by parametric models, which are smart
geometric representations driven by user-defined rules. In this way non-geometric properties of a physical
object can guide the behavior of a parametric model, while other properties get updated accordingly. For
instance, a “column” can update its diameter and/or its material properties if the loads coming from the upper
floor increase, a building’s height could be defined as a ratio of the building’s width, and its width could
be defined as a ratio of the terrain length, etc.
Libraries of such parametric components can be created based on class definitions that embed general
rules as described above. During design, these classes are instantiated with specific values and then
optimized (adapted) as needed during the process. From a CBR perspective, parametric models can be
considered similar to cases, with explicitly defined mechanisms for adaptive behavior.
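The rule-driven behavior described above can be illustrated with a minimal sketch. The class name, the ratio values and the attribute names are assumptions made for illustration only; they are not taken from the CBArch implementation or from any CAD system.

```python
from dataclasses import dataclass


@dataclass
class ParametricBuilding:
    """Hypothetical parametric template: derived dimensions follow the driving one."""
    terrain_length: float         # driving parameter, supplied by the site
    width_ratio: float = 0.5      # assumed rule: width = ratio * terrain length
    height_ratio: float = 0.25    # assumed rule: height = ratio * width

    @property
    def width(self) -> float:
        return self.width_ratio * self.terrain_length

    @property
    def height(self) -> float:
        return self.height_ratio * self.width


# Instantiating the template with specific values.
template = ParametricBuilding(terrain_length=80.0)
print(template.width, template.height)    # 40.0 10.0

# Adapting the instance: change the driving parameter, derived values update.
template.terrain_length = 120.0
print(template.width, template.height)    # 60.0 15.0
```

Because the derived values are computed from the driving parameter, adapting an instance only requires changing that parameter, which is what makes such a model resemble a case with a built-in adaptation mechanism.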
However, what is missing in parametric CAD systems is an efficient mechanism to select and retrieve
the best model cases from libraries. In this scenario Case-Based Reasoning may offer an important contribution
by complementing the adaptation capabilities of a parametric system with retrieval techniques. A third
important component is the integration of an ontology of parametric models. Since many new forms
of parametric models can be created during design, the retain step requires a proper classification of the
new types so that the parametric case base grows in a consistent and organized manner.
Furthermore, the ontology provides a semantic layer to the system to support additional automation of
design tasks. Among them we foresee reasoning capabilities to recommend unexpected alternatives to the
user based on some initial query. In this scenario the integration of the ontology with a CBR system is
intended to add a conceptual classification of instances, providing a new level of abstraction necessary to
support the exploration of a wider scope of valid options.
1.2 Context of the Study
A design problem can be tackled from many different perspectives, depending on the interests, goals and
constraints of the design team. In the same way a parametric representation can be implemented according to
such conditions. In order to build our system integration we focused on the conceptual design of large
commercial buildings and the evaluation of their potential energy consumption. This domain problem is well
known and extensive datasets are publicly available to support the retrieval stage.
The building energy consumption dataset is available from the website of the Energy Information Administration (EIA),
the U.S. government agency responsible for generating official energy statistics. The specific
datasets chosen are the CBECS public use microdata files 1 and 15. Each file contains 5210 records. These
records represent information voluntarily provided in 2003 by building owners from 50 states and the District
of Columbia.
The main link between our parametric design system and the datasets is the definition of building shapes
as an enumeration of basic standard types. The parametric instantiation of these types according to a series of
other building properties extracted from the dataset provides the basic starting point for our CBR framework.
• Squared
• Squared with patio
• Rectangular
• “I” shape
• “L” shape
• “T” shape
• “E” shape
• “F” shape
• “H” shape
• “X” shape
2 Proposed Solution and Methodology
The traditional CBR cycle is depicted in Figure 3. In this model the data structures of retrieved and
adapted cases are the same, so only one repository is needed. In our system, however, we have two
different sets of cases, each with its own representation. To design buildings that follow
the energy consumption patterns of real-world instances, our system starts the cycle from the master dataset of
existing buildings mentioned above, but instantiation and adaptation use the parametric format.
Figure 3 CBR traditional cycle: Retrieve, reuse, revise and retain steps.
For this reason we adopted a second repository for parametric cases. While the first case base is fairly large,
the second one is small, containing eight or nine initial templates. The main idea is that this small parametric
case base must grow and be managed in an efficient way. Figure 4 shows our proposed framework.
Figure 4 Proposed system framework. Each main step has a simple or a refined algorithm implementation (+ or -). The first phase
focuses on system integration with simple algorithms.
The proposed framework has the same four steps as the CBR cycle, but adds two different case repositories, the ontology
and an ontology reasoner. The cycle starts with the architect’s query, which describes the required building as a
feature vector of 12 components. Geographic location, building activities, intended shape and size are the
most important ones. The user also provides the geometry of the site in which the new building has to be
designed, along with a main orientation vector. The best match is retrieved from the building case base and
instantiated in the CAD environment as a parametric model that contains all the information from the
retrieved case.
The next step is adaptation (reuse). For this step we initially considered both automatic and user-driven
adaptation. Currently only automatic adaptation is implemented, at the topological level. In this step a simple
heuristic search algorithm generates a valid topologic adaptation to be evaluated at the revise stage.
Evaluation is intended to be twofold. A user evaluation is expected to assess the result in terms of
aesthetics, volumetric distribution, functionality, etc. The second evaluation estimates the energy
consumption of the design model by checking its performance against the real-world dataset. Currently this
evaluation is not implemented.
After evaluation the retain step is triggered by the user. In this step the ontology reasoner classifies the
adapted model according to pre-existing types. If the adapted model represents an instance of an
existing type it may be stored as such. Otherwise the reasoner infers a new type and creates a new concept in
the ontology. The adapted model then becomes the first instance of this new type, available for further
retrieval. The goal of the proposal is that, whenever the designer asks for a case in future iterations, the
system will look for a best match in the real-world database but will also suggest related shape types that
were created in the past, increasing the scope of valid design alternatives.
Each step of the framework is considered to have a simple (naïve) implementation and a more advanced
implementation (+ or – sign). Due to time constraints the team decided to focus on the system integration
first, using just simple algorithms for each aspect. Further work will explore more advanced options.
2.1 Tasks led by each team member
• RETRIEVE (Urjit): {Building Database, Fish and Shrink and Knn Retrieval}
• REUSE (Andres): {Domain knowledge, System Integration, Adaptation}
• REVISE (Preetam): {Ontology definition, Reasoner Integration, Knowledge Extraction}
• RETAIN (Radhika): {Knowledge Extraction}
3 Retrieval
Retrieval is the prime phase of the CBR cycle. The work on retrieval depends on many factors, such as
the knowledge engineering and the dataset. The following sections provide an overview of how
retrieval is handled in the CBArch system.
3.1 Raw Input Data
The data set is a very broad collection of building features: “The 2003 Commercial Buildings Energy
Consumption Survey (CBECS) building characteristics and consumption and expenditures public use files”.
This data set is available at http://www.eia.doe.gov/emeu/cbecs and contains information on about 5200
buildings. The data set had to be pruned and filtered to remove missing values. Some of the values were
interpolated and added to the database to make the data consistent.
Not all the features of the dataset were part of the input feature set, so we filtered those columns and
created another “feature-vector” table containing only the features that we needed. From a database
point of view, this gave us the advantage of working on a smaller table, which makes the
query process faster.
Figure 5 Some of the important features from the dataset.
3.2 Retrieval System Framework
The retrieval system is built around MySQL 5.0 and C#. The persistence manager, NHibernate, is a
relatively new technology and was used for Object Relational Mapping (ORM). The communication between
the database and the algorithm is limited to a one-time load of the requested feature set from the
database into the algorithm’s working memory. Each record is represented as an object-mapped model called an
“Entity”. These are the POCOs (Plain Old CLR Objects) that are mapped to the database. Some
external system references help complete the framework and provide important services.
Figure 6 Assembly References.
3.3 The Retrieval Algorithm
There are two similarity-based algorithms used in this system: Fish and Shrink and Knn. The need for
two algorithms arose from the performance challenges of the Fish and Shrink
algorithm. The basic life cycle of the retrieval algorithm is represented in Figure 7.
3.3.1 Fish and Shrink
The Fish and Shrink algorithm works in two phases. First it calculates the similarity among the cases
themselves, and then it runs a match-and-promote phase that checks the similarity of cases against the query. The
central idea is that a case that is similar to the query can guide us to other cases that are likely
candidates for being similar to the query, since they themselves are similar to this case.
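The two phases can be sketched as follows. This is not the CBArch implementation: it is a simplified, single-aspect sketch of the commonly described formulation of Fish and Shrink, which works with distances and uses the triangle inequality to narrow the possible distance of untested cases. The function and variable names, the choice of the next test case and the toy one-dimensional data are all assumptions.

```python
def fish_and_shrink(query_distance, case_ids, case_distance, max_tests=None):
    """Return {case_id: (lower, upper)} bounds on each case's distance to the query.

    query_distance(c)   -> distance between the query and case c (the "fish" step)
    case_distance(a, b) -> precomputed distance between two stored cases
    Distances are assumed to lie in [0, 1] and to obey the triangle inequality.
    """
    bounds = {c: (0.0, 1.0) for c in case_ids}
    untested = list(case_ids)
    tests = 0
    while untested and (max_tests is None or tests < max_tests):
        # Fish: directly test the untested case with the smallest lower bound
        # (an assumed selection heuristic).
        test_case = min(untested, key=lambda c: bounds[c][0])
        untested.remove(test_case)
        d_qt = query_distance(test_case)
        bounds[test_case] = (d_qt, d_qt)
        tests += 1
        # Shrink: narrow the interval of every untested case via the triangle inequality.
        for c in untested:
            d_tc = case_distance(test_case, c)
            lower, upper = bounds[c]
            bounds[c] = (max(lower, abs(d_qt - d_tc)), min(upper, d_qt + d_tc))
    return bounds


# Toy data: cases and query are points on a line, distance is the absolute difference.
positions = {"a": 0.10, "b": 0.15, "c": 0.90}
query_position = 0.12
bounds = fish_and_shrink(
    query_distance=lambda c: abs(positions[c] - query_position),
    case_ids=list(positions),
    case_distance=lambda x, y: abs(positions[x] - positions[y]),
    max_tests=1,
)
print(bounds)
```

After a single direct test, case "c" already has a large lower bound on its distance to the query and could be discarded without ever being compared to the query directly.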
Figure 7 Algorithm Lifecycle. This generic lifecycle is maintained by both the algorithms.
Data structure
The class Node contains an “Aspect Hash” of type “Aspect”. Each “Aspect” contains a list of Neighbor
objects, which point to the position of a Node in the working memory Map.
Figure 8 Relation between structures.
Figure 9 Class diagram for data structure.
3.3.2 Knn Retrieval:
The Knn retrieval is based on the same underlying data structure as shown above. It calculates the
similarity of the query to the existing cases in memory in a “just-in-time” fashion. Knn gives quicker
results than Fish and Shrink.
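A minimal sketch of such a just-in-time k-nearest-neighbour retrieval is given below. It assumes a weighted average of per-feature similarities; the feature names, weights and local similarity measures are hypothetical and are not the ones used in CBArch.

```python
import heapq


def weighted_similarity(query, case, local_sims, weights):
    """Weighted average of per-feature similarities, each expected in [0, 1]."""
    total = sum(weights.values())
    return sum(w * local_sims[f](query[f], case[f]) for f, w in weights.items()) / total


def knn_retrieve(query, cases, local_sims, weights, k=3):
    """Score every case against the query at request time and keep the k best."""
    return heapq.nlargest(
        k, cases, key=lambda case: weighted_similarity(query, case, local_sims, weights)
    )


# Hypothetical local measures, weights and cases, for illustration only.
local_sims = {
    "shape": lambda a, b: 1.0 if a == b else 0.0,
    "floors": lambda a, b: min(a, b) / max(a, b),
}
weights = {"shape": 2.0, "floors": 1.0}
cases = [
    {"id": 1, "shape": "L", "floors": 2},
    {"id": 2, "shape": "L", "floors": 10},
    {"id": 3, "shape": "H", "floors": 2},
]
query = {"shape": "L", "floors": 2}

best = knn_retrieve(query, cases, local_sims, weights, k=2)
print([c["id"] for c in best])    # [1, 2]
```

No similarities between stored cases are precomputed here, which is why this approach returns results quickly but must score the whole case base for every query.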
3.3.3 Similarity Measurement Heuristics:
There are some very important similarity measurements and heuristics used for retrieval. These define
how the algorithms compare two cases in the case base. For example, the similarity matrix shown below
gives a heuristic for matching the census divisions and calculating the similarity contribution of this feature
over the range of 0 to 1.
In a similar fashion, other heuristics include:
1. Exponential Scaling: used for features like NumberOfFloors. A building with 2 floors is very
different from a building with 5 floors, but another building with 20 floors is not as different from one
that has 27-30 floors. Thus as the base value of measurement (number of floors) increases, the
significance of the gap decreases.
2. Magnification (Linear Scaling): used to filter purely numerical features such as
TotalWeeklyOperatingHours. Mathematically, for two feature values a and b, let
x = (a – b)/(a + b). The magnification is m = x/f, where x < f and f is the
magnification factor. Cases with x ≥ f are considered too far away from the test case.
3. Direct Testing: for truth values or fixed-valued features, direct
testing is used. If the values match, the similarity is positive; otherwise it is zero.
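The three heuristics above might look as follows in code. The report does not give the exact formulas, so several choices here are assumptions: the exponential scaling is modelled as a comparison on a logarithmic scale, the absolute value of a – b is used in the magnification, and the similarity is taken as the complement of the magnified distance m.

```python
import math


def exponential_scaling_similarity(a, b):
    """Assumed form: compare positive values on a log scale, so that a fixed gap
    matters less as the base value grows. Equals min(a, b) / max(a, b)."""
    return math.exp(-abs(math.log(a) - math.log(b)))


def magnification_similarity(a, b, f):
    """x = |a - b| / (a + b) and m = x / f when x < f; cases with x >= f are too far."""
    if a + b == 0:
        return 1.0              # both values are zero, so they are identical
    x = abs(a - b) / (a + b)
    if x >= f:
        return 0.0              # beyond the magnification factor
    return 1.0 - x / f          # assumption: similarity is 1 - m


def direct_similarity(a, b):
    """Truth values or fixed-valued features: positive on a match, zero otherwise."""
    return 1.0 if a == b else 0.0


# Number of floors: 2 vs 5 is a bigger difference than 20 vs 27.
print(exponential_scaling_similarity(2, 5) < exponential_scaling_similarity(20, 27))  # True
```

Under these assumptions, two buildings with 40 and 60 weekly operating hours and f = 0.5 give x = 0.2, m = 0.4 and a similarity contribution of about 0.6.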
3.3.4 Issues and Challenges
The task of retrieval presented us with several challenges, including the choice of a good platform,
integration with the other modules of the system, and performance. During the initial phases of the retrieval
implementation, the time taken was a multiple of the current running time. The optimization was done using
micro-timers embedded in the code and third-party performance evaluation tools such as the ANTS and
EQATEC profilers. These tools helped to identify the cause of lags. We found
that some loops, like ForEach, and data structures, like ArrayLists, were performing slowly. This was
remedied by using cruder but faster constructs. Another approach was to inline much of the decision
process. This came at the cost of code modularity but significantly improved
performance.
Another issue was the way we evaluated and interpreted the Fish and Shrink algorithm. The text
describing this algorithm is not very detailed, and alternative sources do not agree on some of the
finer details, such as updating the testDistances. This penalized our work on the retrieval algorithm. Our time and
effort estimate for Fish and Shrink proved wrong, which forced us to implement the Knn algorithm. The issue
identified was that Fish and Shrink was filtering out about 90% of the cases, which seems fine overall, but given the large
size of the case base we were targeting around 98%. We had also planned to use Knn to rank and filter
the cases returned by the Fish and Shrink algorithm, in a hybrid, ensemble-like approach. Fish and
Shrink was able to return around 300 cases out of nearly 5000, so we could have ranked them again and
presented the best k cases, but time constraints prevented that implementation.
4 Reuse
Adaptation is performed once a best match or a list of best matches is retrieved. Case properties of
interest at the conceptual design level can vary according to each problem, particular goals or specific
business practices. The initial set of building properties selected for our retrieval system contains 12
properties:
- Building Shape: An enumerated set of standard building shapes.
- Square Foot Area: Size of the building.
- Census Division: Describes geographic location. Relevant to analyze climate conditions.
- Free Standing: Describes if a building is isolated or not from others.
- Number of Floors: Number of useful levels of the building.
- Main Activity: Describes what type of business or activity occurs in the building.
- Number of Businesses: Describes how many businesses exist in a building, for energy assessment.
- Number of Employees in Main Shift: Relevant for energy consumption assessment.
- Open 24 Hours: Relevant for energy consumption assessment.
- Open During Week Days: Relevant for energy consumption assessment.
- Open During Weekends: Relevant for energy consumption assessment.
- Total Weekly Operating Hours: Relevant for energy consumption assessment.
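The 12 properties listed above could be carried as a typed record such as the one sketched here. The field names, types and example values are assumptions made for illustration; the shape enumeration follows the standard types listed in Section 1.2.

```python
from dataclasses import dataclass, fields
from enum import Enum


class BuildingShape(Enum):
    SQUARED = "Squared"
    SQUARED_WITH_PATIO = "Squared with patio"
    RECTANGULAR = "Rectangular"
    I = "I"
    L = "L"
    T = "T"
    E = "E"
    F = "F"
    H = "H"
    X = "X"


@dataclass(frozen=True)
class BuildingFeatures:
    """Hypothetical record for the 12-component query and case feature vector."""
    building_shape: BuildingShape
    square_foot_area: float
    census_division: int
    free_standing: bool
    number_of_floors: int
    main_activity: str
    number_of_businesses: int
    employees_main_shift: int
    open_24_hours: bool
    open_week_days: bool
    open_weekends: bool
    total_weekly_operating_hours: float


# Example query with made-up values.
query = BuildingFeatures(
    building_shape=BuildingShape.L,
    square_foot_area=50000.0,
    census_division=5,
    free_standing=True,
    number_of_floors=3,
    main_activity="Office",
    number_of_businesses=4,
    employees_main_shift=120,
    open_24_hours=False,
    open_week_days=True,
    open_weekends=False,
    total_weekly_operating_hours=60.0,
)
print(len(fields(BuildingFeatures)))    # 12
```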
The retrieved feature vector representation of a good match gets partially replicated into the data structure
of the parametric model representation. For instance, all the information requested at the query stage gets
replicated in the CAD model, plus extra information such as building materials, façade properties, glazing
and sun protection necessary for energy consumption evaluation. One assumption was that the designer
would not normally request cases focusing explicitly on those extra properties, but would rather expect
them as useful information to be learned from the retrieval.
Once the relevant case properties are mapped into a parametric representation, the parametric model gets
instantiated. The shape of the retrieved case is instantiated by using a library of basic shape topology
templates (Figure 10). These templates are adjusted to fit the geometric characteristics of the case as well as
the orientation of the building as defined by the user. However, chances are that an instantiated version of the
case will not perfectly fit the characteristics of a given site or other contextual constraints of the new
problem. Therefore adaptation of the building layout and other associated properties must be done.
There are two basic approaches to adaptation, namely geometric adaptation and topologic adaptation.
Another important assumption made in this project is that non-geometric properties may sometimes drive the
topological or geometrical features of a building, but more commonly shape modifications
drive the values of non-geometric properties. Either kind of adaptation can be performed by the user herself or by
some automatic procedure. In this work we focus only on automatic adaptation of both building
topology and geometry.
Figure 10 Database of standard topology templates for initial retrieval instantiation.
4.1 Geometric Adaptation
Geometric adaptation is the simplest method, and the set of rules to achieve a successful adaptation can
be summarized by three basic geometric transform operators: Move, Rotate and Scale. In our
system the initial attempts at adaptation must be geometric, so as to accommodate the parametric instance in a
given polygonal site by some combination of these operators.
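The three operators and a site-fit check can be sketched in two dimensions as follows. This is an illustrative sketch, not the CBArch code: footprints and sites are assumed to be lists of (x, y) vertices, and the fit test only checks footprint vertices with a ray-casting test, which is sufficient for convex sites only.

```python
import math


def move(points, dx, dy):
    return [(x + dx, y + dy) for x, y in points]


def rotate(points, angle_deg, origin=(0.0, 0.0)):
    a = math.radians(angle_deg)
    ox, oy = origin
    cos_a, sin_a = math.cos(a), math.sin(a)
    return [(ox + (x - ox) * cos_a - (y - oy) * sin_a,
             oy + (x - ox) * sin_a + (y - oy) * cos_a) for x, y in points]


def scale(points, factor, origin=(0.0, 0.0)):
    ox, oy = origin
    return [(ox + (x - ox) * factor, oy + (y - oy) * factor) for x, y in points]


def inside(point, polygon):
    """Ray-casting point-in-polygon test (points exactly on the boundary are ambiguous)."""
    x, y = point
    result = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                result = not result
    return result


def fits(footprint, site):
    """Simplification: the footprint fits if all of its vertices are inside the site."""
    return all(inside(p, site) for p in footprint)


site = [(0, 0), (10, 0), (10, 10), (0, 10)]
footprint = [(8, 1), (12, 1), (12, 4), (8, 4)]    # sticks out of the site on the right
print(fits(footprint, site))                      # False
print(fits(move(footprint, -3, 0), site))         # True
```

A geometric adaptation step would try combinations of these operators until `fits` succeeds, and hand over to topologic adaptation when none does.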
However, although geometric adaptation makes a lot of sense from a domain perspective and is part of
the natural adaptation capabilities of parametric models, the research team decided not to implement it
to its full potential: geometric adaptation can be achieved procedurally, so there is no special need
to retain a geometrically adapted form (Figure 11).
Figure 11 Sequence from geometric adaptation to topologic adaptation. The initially instantiated "L" shaped building falls outside
the site bounds due to the chosen orientation line. The system should first try to move the shape (incomplete). If some space remains
outside, then topologic adaptation is triggered so that the shape fits the site. A different topology is reached (but not necessarily a new
one).
A more interesting and more promising scenario is provided by topologic adaptation. The idea is that
new topologies have a different meaning from a building design perspective, implying new architectural
concepts that are more worth keeping and reusing.
4.2 Topologic Adaptation
Assuming that geometric adaptation failed¹, i.e., no combination of Move, Rotate or Scale operators
could make the building shape fit completely within the polygonal site, the system performs a topologic adaptation
process.
The two algorithms that generate the topologic adaptation are called in sequence based on a simple
heuristic rule. Any part (space component) of the building that fails to fit in the site must first search for an
empty spot adjacent to its original sector, called a building branch. This rule follows the
basic criterion of compatibility between building activities, which states that only compatible spaces (rooms)
can be put together. If there is no empty spot big enough to accommodate the “misfit” space within its own
branch, then it must start a new branch attached to its original branch.
If this also fails, the misfit space has to try these two steps again with the closest branch, and so on.
After all branches have been examined at the ground level, the system looks for accommodation on a second
level, and so on. Such re-accommodation leads to a change in the topology whenever the misfit space creates a
new branch attached to the one closest to its original branch. In order to perform this search the following
adjacency graph data structure was defined:
Figure 12 Adjacency list representation for a building shape topology graph. Note that the almost complete match between an “H” and
a “U” gets differentiated by the concept of “yard”.
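The search order described in this section (own branch first, then a new branch, then the next closest branch, then the next level) can be sketched as a small function. The predicates `has_empty_spot` and `can_start_branch`, the branch names and the number of levels are hypothetical stand-ins for the geometric tests that the real system would perform.

```python
def place_misfit(origin, branches_by_closeness, has_empty_spot, can_start_branch, levels=2):
    """Return (level, branch, action) for the first option that works, or None.

    origin                  -> branch the misfit space originally belongs to
    branches_by_closeness   -> other branches, ordered from closest to farthest
    has_empty_spot(b, lvl)  -> True if branch b has a big enough empty spot on level lvl
    can_start_branch(b, lvl)-> True if a new branch can be attached to b on level lvl
    """
    ordered = [origin] + [b for b in branches_by_closeness if b != origin]
    for level in range(levels):
        for branch in ordered:
            if has_empty_spot(branch, level):
                return (level, branch, "fill_empty_spot")
            if can_start_branch(branch, level):
                # Starting a new branch is what changes the topology.
                return (level, branch, "start_new_branch")
    return None


# Toy scenario: no empty spots anywhere; only the "east" branch can take a new
# branch on the ground level.
result = place_misfit(
    origin="west",
    branches_by_closeness=["east", "north"],
    has_empty_spot=lambda branch, level: False,
    can_start_branch=lambda branch, level: (branch, level) == ("east", 0),
)
print(result)    # (0, 'east', 'start_new_branch')
```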
1 As stated in the previous section, geometric adaptation was not fully implemented, because topological adaptation was more
relevant given our time constraints.
The topologic adaptation algorithms keep track of node relationships for later classification of standard
types and subsumption inference of new types by the ontology reasoner. This also includes the recognition
of cycles for the identification of shapes such as “Square_with_courtyard”.
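Cycle recognition on an adjacency list can be sketched with a depth-first search over an undirected graph. The node names and the two example topologies below are hypothetical; the report's actual graph stores more information than plain adjacency.

```python
def has_cycle(adjacency):
    """Detect a cycle in a simple undirected graph given as {node: [neighbours]}."""
    visited = set()
    for start in adjacency:
        if start in visited:
            continue
        stack = [(start, None)]
        while stack:
            node, parent = stack.pop()
            if node in visited:
                return True          # reached an already visited node by another path
            visited.add(node)
            for neighbour in adjacency[node]:
                if neighbour != parent:
                    stack.append((neighbour, node))
    return False


# Four wings closing around a courtyard form a cycle; an "L" of two wings does not.
square_with_courtyard = {
    "north": ["east", "west"],
    "east": ["north", "south"],
    "south": ["east", "west"],
    "west": ["south", "north"],
}
l_shape = {"wing_a": ["wing_b"], "wing_b": ["wing_a"]}

print(has_cycle(square_with_courtyard))    # True
print(has_cycle(l_shape))                  # False
```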
Figure 13 Example of topology adaptation that leads to a new concept. The initial "Wide_rectangle" type, which is correctly classified
by the ontology as a child of the “U” type (two inverted U’s), gets adapted into a new shape which is classified by the reasoner
as a child of both the "U" and "H" types.
4.2.1 Issues and Challenges
The representation of the topology graph is incomplete, as the current implementation does not capture
all the information needed to compute the adaptation properly. Furthermore, the current
implementation of the topology adaptation algorithm does not support addition or deletion of space nodes,
being limited to adapting only the spaces originally instantiated from the retrieval.
Further work is needed on a more complete representation of the shape topologies and on better
algorithms to keep track of more complex outcomes, as well as to support addition or deletion of nodes.
Another important limitation of the current implementation is that space nodes are undifferentiated. This
does not reflect the complexity of real-world buildings, which are made of different types of spaces
with different requirements. This additional level of information should lead to a richer set of rules regarding
how spaces might adapt, not only with respect to themselves but also in relation to other contextual constraints
such as accessibility, sun exposure, energy optimization, etc. At this stage the current implementation works
well as a proof of concept and a starting point for further improvements in that direction.
5 Revise
The system requires an adaptation that is not only geometric and topological but also efficient in terms
of energy. This can be achieved by evaluating the design with respect to energy and modifying it. The CBArch
dataset contains energy parameters associated with the materials used for construction. We use these
parameters to evaluate the design produced by the geometric and topologic adaptations. The current
evaluation module in CBArch is a standalone module that provides an evaluation based on energy
consumption.
5.1 Energy Consumption Evaluation
The evaluation module uses knowledge extracted from the domain, i.e., our master database of
energy data for real, existing buildings. The knowledge extraction module creates a mapping from each of the
wall materials and roof materials used to the energy consumed. The energy components taken into
consideration are the fuel and electricity consumption associated with the use of materials like glass, wood, bricks, etc. The
format of this material-energy mapping is as follows:
Roof Material 1 Average associated Energy = 14888798.43
Roof Material 2 Average associated Energy = 5162388.04
Roof Material 3 Average associated Energy = 2190562.31
...
Wall Material 1 Average associated Energy = 10516642.5364
Wall Material 2 Average associated Energy = 21645429.1371
Wall Material 3 Average associated Energy = 9407973.24345
...
The evaluation module uses this mapping to look up all the materials used to build the entity and applies an
aggregation function to calculate the aggregated energy. The aggregated energy is the sum of the average
energies associated with all the materials used.
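The aggregation can be sketched as a lookup followed by a sum. The dictionary layout and function name are assumptions; the numeric values are the ones shown in the mapping above, with material identifiers used as keys.

```python
# Average associated energy per material, as listed in the mapping above.
ROOF_ENERGY = {1: 14888798.43, 2: 5162388.04, 3: 2190562.31}
WALL_ENERGY = {1: 10516642.5364, 2: 21645429.1371, 3: 9407973.24345}


def aggregated_energy(roof_materials, wall_materials):
    """Sum the average associated energy of every roof and wall material used."""
    roof_total = sum(ROOF_ENERGY[m] for m in roof_materials)
    wall_total = sum(WALL_ENERGY[m] for m in wall_materials)
    return roof_total + wall_total


# A design using roof material 2 and wall material 3.
print(aggregated_energy(roof_materials=[2], wall_materials=[3]))
```

Two adapted designs can then be compared by their aggregated energy, with the lower value indicating the materials historically associated with lower consumption.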
Figure 14 A simplified framework for evaluation of energy performance for the adapted building model.
6 Retain
After the adaptation stage there is a retain stage, which completes the classical
case-based reasoning cycle. In a design-related domain like architecture it is up to the designer to decide whether
the adapted solution is good enough to be stored, based on her expertise or on some performance
assessment such as mathematical analysis or simulation. In any case, simple geometric adaptation can be
achieved procedurally, so there is not much gain in storing its outcomes. However, topologic adaptation
processes can lead to completely different configurations that are worth storing. In this scenario a new
class of shape or building concept can emerge. The retain stage is therefore twofold: it has to store a new
meaningful concept as well as relevant instances that represent that concept.
In domains like the travel recommender system, instances were stored directly, without any regard
to their meaning. In our domain of architecture, when new shapes are generated they
are stored using an ontology. The reason is that shapes stored directly, without classification, are of no real
use in our domain. We require new shapes, which are essentially new concepts, to be classified
in relation to the already existing shapes. For example, if a new shape is generated, we first run an evaluation of
the new shape based on the evaluation approaches proposed in this paper. Then,
based on the results of these evaluations, we check the ontology model to see whether this shape already
exists; if it does not, we classify the new shape under the already existing shapes. Hence,
when the user queries for an H, she can get different varieties of an H shape, such as a combination of an H and an L
shape. CBR thus acts as a discovery process that can find new shapes after the adaptation
takes place.
6.1 Ontology Design Decisions
An ontology is a collection of Concepts and their corresponding Instances. Concepts contain
individuals, which may belong either to a single concept or to more than one concept, provided that these
concepts are not disjoint. No instance can belong to two or more disjoint concepts.
For example, consider the popular pizza ontology and two of its
concepts: Non-Vegetarian topping and Vegetarian topping. These two concepts are disjoint, i.e., any topping
which is a Vegetarian topping cannot be a Non-Vegetarian topping. A topping that
belongs to both the Non-Vegetarian and Vegetarian classes causes an inconsistency in the ontology.
Concepts also have Properties with constraints placed on them. These constraints determine
which concept a new concept belongs to. By running a reasoner on the ontology, inconsistencies can
be identified and automated classification (only in OWL-DL) can be achieved. Several reasoners are
available, such as Pellet and reasoners accessed through the DIG interface. We describe the Pellet (v1.5.2) reasoner in later
sections of this report.
Ontologies can be represented in many different formats, such as the Web Ontology Language (OWL),
N-TRIPLES, etc. OWL is a very common representation. It has three variants:
OWL Lite, OWL-DL and OWL Full. OWL Lite is the syntactically simplest species of OWL. It
is intended for situations where a simple class hierarchy and simple constraints are needed.
OWL-DL is much more expressive than OWL Lite and is based on description logics. Description logics are a
decidable fragment of first-order logic and are therefore amenable to automated reasoning. OWL Full is the
most expressive and is used in situations where very high expressiveness is more important than being able to
guarantee the decidability or computational completeness of the language. Complete automated reasoning
is therefore not guaranteed for OWL Full, so we chose OWL-DL for the representation of our ontology.
6.2 Ontology Structure
We incorporated basic building shapes into our ontology, which served as the base ontology. We used
Protégé 3.4 to create the ontology, consisting of the concepts and the constraints placed on these concepts.
Figure 15 shows the Jambalaya view of the ontology.
Figure 15 Jambalaya view of topology ontology: Basic topology concepts and set of standard topologic shapes.
As seen in Figure 15, the ontology consists of the following basic shapes:
T Shape
U Shape
H Shape
I Shape
L Shape
The Building Energy Consumption dataset describes building shapes as a larger enumeration, including
shapes that were not part of our initial ontology. Such shapes include the Square shape with courtyards and
all its derivations, the “X” shape and the “E” shape. The purpose was to define just a minimal set of basic types from which
more complicated shapes could be inferred.
Thus the super class of all these shapes is the class Shape, which will consist of all possible shapes that
would be generated in the future operations of the CBArch system. These initial shapes serve as a basis for
new shapes which would be classified under these shapes. For example a new shape which is a combination
of a “H” and an “L” will be a member of both the “H” and the “L” classes. These shapes are not disjoint with
respect to each other hence enabling newer concepts to belong to more than one of these basic concepts.
The basic classes have the following properties associated with them:
• hasYards
• hasLines
• hasBranches
• hasPoints
• hasAngles
Figure 16 Asserted and inferred hierarchy of building shapes. A new type gets classified by the Pellet reasoner according to
Description Logics rules defined on the right side.
A branch is defined as a sequence of three consecutive points that lie on a straight line. Each building
also has a number of yards: for example, a “U” shape building has one yard, an “L” shape building likewise
has one yard, and an “H” shape building has two yards. The angles, lines and points are self-explanatory.
Figure 16 shows the constraints placed on the individual basic shapes. There are two types of
constraints: necessary constraints, and necessary and sufficient constraints. Necessary constraints determine
whether a new class is eligible to be considered part of a particular class. Necessary and sufficient
constraints, on the other hand, define a closure condition, so that any new class satisfying them is
classified under that class. As shown in Figure 16, cardinality constraints were imposed on each of the
properties of each class. These cardinality constraints were selected by intuition and trial and error.
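The difference between the two kinds of constraint can be illustrated with a minimal Python sketch. It is not the ontology itself: the minimum cardinalities below are hypothetical placeholders (the values actually used appear only in the figure), and plain dictionaries stand in for OWL classes and individuals.

```python
# Hypothetical minimum-cardinality definitions (property -> minimum count).
# These numbers are illustrative assumptions, not the values used in the ontology.
DEFINITIONS = {
    "L_Shape": {"hasYards": 1, "hasLines": 6},
    "H_Shape": {"hasYards": 2, "hasLines": 12},
}


def satisfies(shape, definition):
    """Return True if the shape meets every minimum cardinality in the definition."""
    return all(shape.get(prop, 0) >= minimum for prop, minimum in definition.items())


def classify(shape):
    """Necessary and sufficient reading: meeting a definition implies membership."""
    return sorted(name for name, d in DEFINITIONS.items() if satisfies(shape, d))


def is_consistent(shape, asserted_class):
    """Necessary reading: a shape asserted to be in a class must not violate it."""
    return satisfies(shape, DEFINITIONS[asserted_class])


# A hypothetical combined shape is classified under both basic classes.
combined = {"hasYards": 3, "hasLines": 16}
print(classify(combined))

# A shape with a single yard cannot consistently be asserted to be an H.
print(is_consistent({"hasYards": 1, "hasLines": 6}, "H_Shape"))
```

Because this toy version uses only minimum cardinalities, every shape that satisfies the H definition also satisfies the L definition. A description logic reasoner would report the same relationship as an inferred subclass link.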
Initially, we started with only the hasLines, hasPoints and hasAngles properties and experimented with
the constraints in order to get a reasonable classification. However, these properties alone did not yield a
good classification of the new shapes that were being generated. We therefore looked for features that would
give better classification results, and arrived at the hasBranches and hasYards properties.
By imposing appropriate constraints on these properties and combining them with the other properties,
we were able to establish a reasonable ontology structure that can be updated as and when new concepts
arise. Figure 16 also depicts the inferred hierarchy computed using the built-in Pellet 1.5.2 (Direct) reasoner.
The various levels of classification were generated by classifying the taxonomy with the Classify
Taxonomy function provided by Protégé. Before this step is performed, the ontology should be checked for
inconsistencies, i.e., a new shape should be validated.
6.3 Invoking the Pellet reasoner through C# code:
We needed to emulate the operations performed in Protégé in our C# code because the
GenerativeComponents API was available in C#. Hence, we needed to invoke the Pellet reasoner from C# and
classify the ontology model represented using the Jena framework. Jena and the Pellet reasoner are both
Java libraries, so we used IKVM to access their functionality from our C# code. Figure 17 shows the steps
followed to achieve this. The Pellet 1.5.2 package contains the Pellet reasoner and the Jena API. We used
IKVM to convert it into a .dll file, which we then imported as a library reference in the C# code.
Figure 17 Integration of Pellet Reasoner (Java based) with the parametric representation of the building (C# based).
6.4 Retain stage:
The retain stage consists of updating the existing ontology with a new shape generated during the
CBArch cycle of operation. The reasoner is invoked to obtain either the direct super classes or all the
super classes of the new shape. When asked for all the super classes of a given shape, the reasoner returns
the full inheritance hierarchy of that shape. When only the direct super-classes function is invoked, only
the immediate super classes of the new shape are returned. For example, for a new shape which is a
combination of an H and an L, the direct super classes would be H and L. All the super classes of the new
shape would list the whole classification hierarchy, i.e., it would also list the super classes of H and L.
The importance of this step is discussed in the next section.
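The distinction between direct and all super classes can be sketched in a few lines of Python. This is an illustration of the idea only, not the Pellet or Jena API: the class names (including "HL_Shape" for the combined shape) and the parent map are hypothetical.

```python
# Hypothetical direct-superclass links, as a reasoner might infer them.
DIRECT_SUPERCLASSES = {
    "HL_Shape": {"H_Shape", "L_Shape"},
    "H_Shape": {"Shape"},
    "L_Shape": {"Shape"},
    "Shape": set(),
}


def direct_superclasses(cls):
    """Return only the immediate parents of a class."""
    return set(DIRECT_SUPERCLASSES.get(cls, ()))


def all_superclasses(cls):
    """Return every ancestor of a class by following parent links transitively."""
    seen = set()
    stack = list(direct_superclasses(cls))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(direct_superclasses(parent))
    return seen


print(sorted(direct_superclasses("HL_Shape")))  # immediate parents only
print(sorted(all_superclasses("HL_Shape")))     # parents plus their ancestors
```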
6.5 Importance of the Ontology Update:
The ontology update module is an important part of the CBArch cycle, because adding new shapes
widens the range of shapes that can be chosen from the parametric database. For example, if a user queries
for an H shape, the query consults the ontology for shapes that belong to the H class and returns the
different shapes related to an H. Hence, several varieties of an H shape can be instantiated, instead of the
single simple H shape that would result without the parametric database.
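As a rough sketch of how a shape query could be expanded through the ontology, the Python fragment below walks a subclass map and collects every shape classified under the queried class. The subclass map and the names "HL_Shape" and "HT_Shape" are hypothetical, and the lookup into the parametric database is omitted.

```python
from collections import deque

# Hypothetical subclass links, including shapes added by earlier ontology updates.
SUBCLASSES = {
    "Shape": ["H_Shape", "L_Shape"],
    "H_Shape": ["HL_Shape", "HT_Shape"],
    "L_Shape": ["HL_Shape"],
}


def shapes_matching(query_class):
    """Return the queried class followed by every class classified under it."""
    result = []
    queue = deque([query_class])
    while queue:
        cls = queue.popleft()
        if cls not in result:
            result.append(cls)
            queue.extend(SUBCLASSES.get(cls, []))
    return result


# A query for H yields the plain H plus the variants classified under it.
print(shapes_matching("H_Shape"))
```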
An additional feature to be implemented is storing the specifications (features) associated with a new
shape in a database. For example, features such as site area, region and building activity that correspond
to a new shape classified under an H shape can be stored. When a user submits a query, a stored shape is
retrieved if the query shows a reasonable level of match with that shape's features. The “level” of match is
yet to be finalized.
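One possible form of this matching is sketched below in Python. Since the level of match has not been finalized, everything here is an assumption made for illustration: the stored records, the categorical feature values, the scoring rule (fraction of query features that match exactly) and the 0.5 threshold.

```python
# Hypothetical feature records stored alongside newly classified shapes.
STORED_FEATURES = {
    "HL_Shape": {"region": "southeast", "building_activity": "office"},
    "HT_Shape": {"region": "northeast", "building_activity": "office"},
    "UL_Shape": {"region": "west", "building_activity": "retail"},
}


def match_level(query, features):
    """Fraction of the query's features whose stored value is identical."""
    if not query:
        return 0.0
    hits = sum(1 for key, value in query.items() if features.get(key) == value)
    return hits / len(query)


def retrieve(query, threshold=0.5):
    """Return stored shapes whose match level reaches the (assumed) threshold, best first."""
    scored = [(match_level(query, f), name) for name, f in STORED_FEATURES.items()]
    return [name for score, name in sorted(scored, reverse=True) if score >= threshold]


print(retrieve({"region": "southeast", "building_activity": "office"}))
```

Numeric features such as site area would need a distance-based comparison rather than exact equality, which this sketch does not attempt.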
As future work, we plan to implement Derivational Analogy, which would store the trace of operations
a user performs to arrive at a particular solution. These traces can be useful for computing new solutions.
This aspect is currently being investigated.
7 Evaluation and Conclusions
The current report introduces a novel design support system that integrates Case-Based Reasoning with
Parametric Modeling and Ontologies. The system takes as reference the domain of conceptual design of
commercial buildings.
At the current stage of development, our focus was on system integration and proof of concept only.
The goal of this integration is to take advantage of the complementary capabilities of these three systems
to support architectural design processes. While CBR provides a framework to store and retrieve good
examples at the instance level, Parametric Modeling offers a framework for rule-based form adaptation.
Finally, ontologies are intended to provide a higher layer of abstraction at the semantic level, so that new
design concepts can be created and classified. Instances can therefore be organized under this conceptual
umbrella, and new forms of design automation can be explored. One expected benefit is an improvement in
the recommendation capabilities of the system: unexpected cases can be brought to the designer, which
widens the scope of valid alternatives to be explored.
The system, as a proof of concept, shows very good initial results. It successfully retrieves and adapts
shapes according to the specified rules, and then classifies them as new concepts in the ontology when
appropriate. However, the retrieval system does not yet consult the ontology. This step remains to be done,
but the basic foundations required for it are in place. Further work has to focus on the evaluation of
adapted models, more specifically on aspects related to expected energy consumption. Additional work will
also explore fine-grained evaluation of results, alternative approaches to data representation and
performance improvement of the algorithms.