Most people in the world of smart buildings have probably heard of Brick, ASHRAE 223P, or Project Haystack. At their core, these are structured ways to create a data model for building technology systems. The image below is from one of Erik Paulson’s Medium posts, “A Systems Approach: A Comparison of the Brick Schema and Project Haystack.”

Why would we need a data model, one may ask? What it comes down to is that it can greatly speed up the way technology is deployed, integrated, and scaled. It can also assist with organization and record keeping across building systems. In many smart buildings today, if a data model exists, it is typically maintained by a smart building IoT vendor or an MSI (master systems integrator). Behind the scenes, that vendor may be using it to query data, run fault detection rules, or build some form of digital twin. This is systems engineering.
The point is that the data model is often baked into the vendor’s platform, where it is sometimes referred to as the independent data layer (IDL). This helps the vendor derive results in a bigger, faster, and more efficient fashion, but it also creates a dependency. When that vendor walks away or is replaced, the data model, and sometimes the data itself, can disappear with it. It becomes similar to the situation where a BAS or BMS vendor is removed and much of that information is gone, forcing the next team to start over from scratch during a retrofit.
What the Open-FDD project is attempting to do is create a data model, which can also be referred to as a knowledge graph, that stays with the building. It is intended to be free, open source, and vendor-neutral, while also retaining data on premises. The design intention is that it can be maintained by the client’s IT department on a local server, somewhat similar to how a BAS or BMS server is often set up and located onsite today. Ideally, the server itself would be treated as important building infrastructure and maintained with close oversight from IT.
This is the beginning of a different approach where the building owner can retain not only the data, but also the structure and meaning behind that data. Instead of the knowledge graph living inside a proprietary platform, it can remain with the building as part of its long-term digital infrastructure.
A little background on knowledge graphs
Knowledge graphs are a concept rooted in computer science theory. Brick, ASHRAE 223P, and Haystack all make use of a common framework called RDF, which stands for Resource Description Framework. RDF was developed by the W3C in the late 1990s and gained traction during the rise of Semantic Web technologies in the early 2000s, when search engines needed a computationally efficient way of querying sources and understanding relationships between pieces of information.
For example, if you were using a search engine in the pre-AI days and typed something like this into Google: “articles HVAC central plants efficiency chilled water turbo core compressors water cooled Europe,” there would need to be some efficient way for the system to understand the meaning and relationships behind those terms. This is where the power of RDF comes in. In RDF, everything is represented as subject-predicate-object statements, referred to as triples. From a computer science perspective, this makes it much more efficient to query a knowledge graph and discover how pieces of information relate to one another.
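To make the triple idea concrete, here is a minimal sketch in plain Python, with no RDF library involved. The entity names and predicates are illustrative stand-ins for the kind of statements a building graph would hold:

```python
# Minimal sketch: RDF-style triples as plain Python tuples.
# Each statement is (subject, predicate, object).
triples = [
    ("bldg:AHU-1", "brick:feeds", "bldg:VAV-101"),
    ("bldg:VAV-101", "brick:feeds", "bldg:Zone-101"),
    ("bldg:SA-T", "brick:isPointOf", "bldg:AHU-1"),
    ("bldg:SA-T", "rdf:type", "brick:Supply_Air_Temperature_Sensor"),
]

def match(pattern, store):
    """Return every triple matching the pattern; None acts as a wildcard."""
    s, p, o = pattern
    return [t for t in store
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

# "Which points belong to AHU-1?"
print(match((None, "brick:isPointOf", "bldg:AHU-1"), triples))
```

Real RDF stores index these patterns so that queries like “which points belong to AHU-1?” stay fast even across millions of statements, which is exactly the property that makes graph queries more efficient than raw text search.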
The alternative would be to rely only on plain text search across what would most likely be an enormous dataset made up of websites and documents across the internet. A simple text search, even if it could appear faster than the RDF query language SPARQL, for something like “articles HVAC central plants efficiency chilled water turbo core compressors water cooled Europe” would be far less efficient and far less structured in how it interprets meaning.
So this is where the power of a knowledge graph comes in, but in the context of a smart building, it is about how things are mechanically interconnected, which we will discuss below. There could be other interlinking technology systems as well, but for the focus of Open-FDD, it is mainly HVAC and energy efficiency.
What is interesting is that computer science PhD researchers who came into the smart building IoT world, having studied these kinds of data structures in their research, saw how messy building data organization really was. They recognized that the industry needed a better way to structure data and relationships, which led to the development of ontologies for buildings. People from outside the traditional building industry could clearly see that the smart building world was handling data in a very inefficient way, perhaps not all that different from trying to search the entire internet without the benefit of structured relationships.
If I am not mistaken, some of the key people who helped bring this way of thinking into the smart building world were Gabe Fierro from the Colorado School of Mines, Erik Paulson from UW–Madison, along with many other professionals associated with NREL and LBNL.
Getting Into the Weeds of Modeling
Modeling data for a building can get messy, and there is plenty of debate around the best way to do it. The Open-FDD project, which is focused on the HVAC technology silo, energy efficiency, and HVAC fault detection, begins by running a BACnet scan of the entire building BACnet network and capturing every point into a data model.
When you look at a raw BACnet scan represented in that data model, it can almost resemble HTML at first glance, but it is actually written in a syntax for RDF called Turtle. That structure can contain BACnet addresses, device references, and point information for the building systems being modeled. RDF support is built into the bacpypes3 Python ecosystem thanks to Joel Bender, the library maintainer and another notable figure in the smart building ontology world. That support makes it much easier to build knowledge graphs quickly from BACnet scans using the bacpypes3 library.
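As a rough illustration of what Turtle output from a scan might look like, here is a hypothetical fragment. The prefixes, property names, and values are illustrative, not the exact vocabulary emitted by bacpypes3:

```turtle
# Hypothetical sketch of a BACnet scan serialized as Turtle.
@prefix bacnet: <http://data.ashrae.org/bacnet/2020#> .
@prefix bldg:   <urn:example/building1#> .

bldg:device-3001 a bacnet:Device ;
    bacnet:device-instance 3001 ;
    bacnet:address "192.168.1.20" .

bldg:device-3001-ai-1 a bacnet:AnalogInputObject ;
    bacnet:object-name "SA-T" ;
    bacnet:present-value 55.4 .
```

Each block of semicolon-separated lines is just a compact way of writing several triples that share the same subject, which is why the scan output reads as markup at first glance.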

The next part is the daunting process of “tagging” the data model after the BACnet scan of the building. Prior to the age of generative AI, a human would need to manually gather knowledge about the building, such as blueprints, HVAC mechanical schedules, BACnet scans, and other documentation, and then piece the data model together by hand. This is very labor-intensive and time-consuming, and the human doing the work can also be prone to tagging errors.
Over the years, some efforts were likely made using machine learning models to classify point names into ontology tags, but in the current era this is where the LLM really shines. It can tag the model very quickly and efficiently for a human to review. In my testing, generative AI does an exceptional job of understanding what a point name maps to in a Brick tag. For example, generative AI can recognize that a point named SAT or SA-T refers to a Supply Air Temperature Sensor and then tag it appropriately, as shown below.

In the example above, Open-FDD assigns a unique identifier to the point and another to the piece of equipment it belongs to. It uses brick:isPointOf to link the point to that equipment, stores rdfs:label "SA-T" as the human-readable point name, and uses ofdd:mapsToRuleInput "ahu_sat" to map that point into the fault detection logic. It can also incorporate time-series database references, and a polling flag indicates that the point is intended to be polled over BACnet for time-series data collection. Open-FDD is attempting to leverage AI-assisted data modeling to improve the overall speed of the tagging process, which can lower labor costs and, with proper data model testing, improve overall accuracy.
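Put into Turtle, that tagged point could look something like the sketch below. The identifiers and the polling property name are hypothetical; the brick:isPointOf, rdfs:label, and ofdd:mapsToRuleInput predicates are the ones described above:

```turtle
# Illustrative sketch of a tagged supply air temperature point.
@prefix brick: <https://brickschema.org/schema/Brick#> .
@prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ofdd:  <urn:open-fdd#> .
@prefix bldg:  <urn:example/building1#> .

bldg:point-7f3a a brick:Supply_Air_Temperature_Sensor ;
    rdfs:label "SA-T" ;
    brick:isPointOf bldg:ahu-1c2e ;
    ofdd:mapsToRuleInput "ahu_sat" ;
    ofdd:isPolled true .
```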
Beyond BACnet telemetry
What the Open-FDD project is attempting to do is also incorporate Brick feeds and feed relationships into the model, assisted by AI, so that the data model can show the mechanical relationships between different systems. For example, a zone temperature sensor may be fed by a VAV, which is fed by an air handling unit, which in turn has chilled water cooling fed by a central plant. Can you see the power in this? A knowledge graph for the building would then work much like the earlier Google query example, where a query can be made against the model for a particular problem and potentially yield possible fixes for the issue at hand.
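Once feed relationships are in the graph, a traversal like the one described above becomes a short query. The sketch below uses a SPARQL property path to walk brick:feeds upstream an arbitrary number of hops; the entity names are hypothetical:

```sparql
# Hypothetical query: find everything upstream of zone 101's
# temperature sensor by following brick:feeds one or more hops.
PREFIX brick: <https://brickschema.org/schema/Brick#>
PREFIX bldg:  <urn:example/building1#>

SELECT ?upstream WHERE {
    bldg:zone-temp-101 brick:isPointOf ?equip .
    ?upstream brick:feeds+ ?equip .
}
```

Against the earlier example chain, a query like this would surface the VAV, the air handling unit, and the central plant as candidate places to look when the zone temperature misbehaves.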
Other things the data model could incorporate include mechanical schedule data from blueprints, such as engineered pump sizes, coil capacities, AHU fan sizing, and airflow volumes—essentially, everything the mechanical engineer used to size the system. Queries could then be made for virtual metering or engineering calculations. For example, an AFDD platform could calculate in near real time whether a mechanical system has a problem that is causing energy waste. The engineering data could all be baked into the data model along with the BACnet point address information and the devices that make up the HVAC system. It is all there, and it can be queried appropriately and automatically when fault conditions arise.
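As a sketch of what such a virtual-metering calculation could look like, the snippet below applies the standard chilled water load approximation (Q = 500 x GPM x delta-T, in BTU/hr) and compares the result against a design capacity that would have been pulled from the mechanical schedule data in the model. The numbers and names are illustrative:

```python
# Sketch of a virtual-metering check. Assumes the design capacity
# came from the mechanical schedule stored in the data model and
# the live readings came from BACnet polling; values are illustrative.

def chw_coil_load_btuh(gpm: float, ewt_f: float, lwt_f: float) -> float:
    """Approximate chilled water coil load: Q = 500 * GPM * delta-T."""
    return 500.0 * gpm * (lwt_f - ewt_f)

design_capacity_btuh = 480_000  # from the mechanical schedule

# Live telemetry: 60 GPM, 44 F entering water, 56 F leaving water
load = chw_coil_load_btuh(gpm=60.0, ewt_f=44.0, lwt_f=56.0)

print(f"Coil load: {load:,.0f} BTU/hr "
      f"({load / design_capacity_btuh:.0%} of design)")
```

A fault rule could flag the coil when the computed load sits near or above the design value for a sustained period, all without installing a physical BTU meter.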
Open-FDD is designed to provide basic insights for fault detection and analytics, with the intention that it can be integrated into other platforms, perhaps cloud-based ones. Coupled with a strong knowledge graph, those platforms could then provide enhanced troubleshooting of mechanical systems, all made possible by the data model, or knowledge graph, housed by Open-FDD.
A Practical Use Case
A use case example helps show where this could matter. Imagine if the Veterans Affairs hospitals kicked off another ten-year round of retro-commissioning at each hospital based on a standardized reporting framework. At one point in my career, after about five years of experience as a field technician in HVAC controls programming, I was hired by a consulting firm that specialized in U.S. federal retro-commissioning work. It became my job to go onsite and recommission older federal hospitals or laboratories operated by the GSA. It was daunting, but also a lot of fun for someone early in their career, spending weeks onsite recommissioning old equipment and then attempting to set up the BAS to log proper data and extract it.
Part of that process for me at the time was manually extracting data from the BAS and analyzing it in Excel to help support the RCx report writing with charts for the project lead, documenting how the building operated over time. It was tedious work. You crossed your fingers hoping you had good data, and it could take me about two weeks in an office setting just chugging through Excel. Talk about labor-intensive. On top of that, each analyst on the RCx project could arrive at widely different results, so each federal hospital in the program could end up with very different findings, some of them poor simply because of differences in analyst understanding.
This is where the power of something like Open-FDD could come into play. What if, prior to the start of the project, an organization set up a common framework to log data and model the data? Then, in phase two of the project, the onsite specialist could arrive, click a few buttons, and have the data model automatically pull the data into charts showing, for example, how the AHU leaving air temperature setpoint control tracks against the heating valve or cooling valve over time. Each contractor would then be working from a more consistent and repeatable foundation, producing faster results and more standardized reporting. I can guarantee many RCx contractors would appreciate that as well.
Of course, the project would need to be planned so that the RCx contractors were required to follow a defined data analysis and reporting framework written into the project specifications or scope. Phase one would be setting up the technology framework and the data model. Phase two would involve the RCx specialists generating reports based on standardized analytics built from the phase one framework design and aligned with whatever the RCx scope for the facility required. If that framework were free, open source, and standardized, the question becomes: why not?
Conclusion
Though Open-FDD is still far from meeting federal software security requirements, the concepts behind it point toward useful features for the future and a stronger foundation for more efficient buildings. At its core, the idea is that the building’s data model, context, and foundational analytics can remain onsite rather than being fully dependent on a cloud platform. That alone is an important shift in thinking: by design, the data stays with the client.
As mentioned earlier, Open-FDD aims to provide a foundational level of fault detection and analytics at the edge, giving building owners and operators a structured, open, and vendor-neutral base to work from. From there, cloud-based vendors or service providers could still build more advanced features on top as needed, but the building would retain the local knowledge graph, the data relationships, and the core understanding of how its systems are organized.
The initial diagrams shown at the top of the post help illustrate that vision. They show the architectural concept of Open-FDD operating as a client-hosted edge platform that is open source and free, keeping processing local while still allowing optional integration with cloud-based analytics. They also show the OpenFDD pyramid, where the ontology and knowledge graph form the foundation, data ingestion and visualization build on top of that, and fault detection and diagnostics sit higher up the stack. In many ways, that is the central idea of this article: if the building has a strong data foundation, then better analytics, better fault detection, and better long-term outcomes become much more achievable.
If Open-FDD or projects like it continue to mature, they could help move the industry toward a future where building data is more structured, more portable, and more useful over the long term. That would not only support energy efficiency and HVAC performance, but also give owners a better chance of retaining the digital knowledge of their buildings regardless of which vendor happens to be involved at a given time.
Please follow along with the project on GitHub. There is also a link at the top of the README for the OpenFDD Discord channel, and you can follow TalkShopWithBen on YouTube for OpenFDD development logs and other tutorials related to BACnet and data modeling practices.
The OpenFDD project is in active development and is looking for contributors to help enhance the project, as well as for testing. Thank you for reading, and have a great day.