I had the pleasure of hosting our latest Tech Stack roundtable with two fantastic guest perspectives: Liz Wood, Director of Media and Analytics at Intelligent Demand, and Amanda Harbison, Director of Analytics at the National Electrical Contractors Association (NECA).
The question at hand: how do we improve our data quality? How can we pull from lessons in other industries to fish the turds out of our AEC data pond?
As I said at the top of our roundtable discussion, I feel like construction deals with an awful lot of unstructured data. And roundtable participants wholeheartedly agreed. The pain points are clear — but what are the solutions?
Our current data problems
Construction companies have decades of legacy data. They all too often think it should fit seamlessly into some turnkey software solution. But anyone who’s seen the rubber meet the road knows that’s not how things really go down.
Our problem in the AEC industry isn’t unique. It’s happening in even the most digitized industries. We want software to use our data to drive efficiencies — some AEC contractors are even using upwards of 50 software programs. But we don’t truly have a big-picture idea of our end goal. We just keep bolting things on as we go.
The real issue, as Liz and Amanda pointed out, is that humans are impatient. We don’t want to do the pre-process work to make our data clean. As Liz said, in an ideal scenario, “The bulk of the work and the hours and the resourcing that go into a project — I would say 80% of it — happens before the project ever goes live.” In construction, we are making the same argument to invest in early collaboration through Design-Build or IPD contracts.
Why is so much work required before launch? Whether it’s an online ad, a member survey, or a construction project — we need clean data.
In our discussion, Liz said, “Probably the biggest learning takeaway that I've had with my team over the past year is really pushing back and identifying all of those points that need to connect and really figuring out what needs to happen to make those connections work.” Without clean data, that’s impossible.
Why clean data matters — and how to get it
Before a new solution launches, ideally, the inputted data should go through a QA process. If we launch A, will it speak to B? Will it give us what we expect in scenario C?
Liz said that her clients often want to launch tomorrow. But she has to push back, reminding them that if they do, “all the data starting tomorrow until the end of time will not be consistent. They will be broken. And they will prevent us from telling the story that we need to tell and will prevent us from gaining the insights that we need in order to inform future projects.”
Fortunately, getting to clean data can be simpler than you’d think. Our roundtable dredged up some tips:
Implement data validation procedures: Amanda pointed out that validation checks can help. They can catch incorrect data (e.g., data that’s an alpha character when it should be a numeral) or flag potential duplicates.
Spend time with your legacy data: Amanda also says that you can reach a point where you trust your legacy data. It just requires some time to go through the historic data points. Then, she said, you’ll be able to say, “After doing some digging and cleaning, we can say safely that we believe that this set of data has enough integrity — that the trend line can inform decisions.”
Identify your start date: Alternatively, Amanda suggested drawing a line in the sand. For example, you could declare that as of April 1, 2021, all of your data moving forward will be clean.
Get clarity on your goal: You don't have to go back through and clean up everything back to 1967. You can slice out the segments of data that matter so that you get the reports you want.
Set naming conventions: Something as simple as some groups using an ampersand and some using “and” can split your data. Again, data validation checks can help here.
Use AI: AI might be able to scrub your data for you. You can lean on it for your base structure, but be realistic about what you expect from artificial intelligence.
Get help: Consider bringing a data scientist in-house if your budget and overhead can accommodate it.
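To make the validation and naming-convention tips above a little more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not a tool anyone at the roundtable described: the record layout, the `vendor` and `amount` field names, and the sample values are all hypothetical. It flags values that contain letters where a number is expected, and it treats "&" and "and" as the same word so that potential duplicates surface.

```python
import re
from collections import defaultdict


def normalize_name(name: str) -> str:
    """Map spelling variants of a name onto one comparison key."""
    key = name.strip().lower()
    key = key.replace("&", " and ")       # treat "&" and "and" as the same word
    key = re.sub(r"[^a-z0-9 ]", "", key)  # drop punctuation such as commas and periods
    return re.sub(r"\s+", " ", key).strip()


def is_numeric(value: str) -> bool:
    """True if the text is a plain number such as '1200' or '850.50'."""
    return re.fullmatch(r"-?\d+(\.\d+)?", value.strip()) is not None


def check_records(records):
    """Return (errors, duplicates) for a list of dict records.

    errors: (row index, field name, bad value) for non-numeric amounts.
    duplicates: normalized vendor name -> row indexes that share it.
    """
    errors = []
    seen = defaultdict(list)
    for i, rec in enumerate(records):
        if not is_numeric(rec["amount"]):
            errors.append((i, "amount", rec["amount"]))
        seen[normalize_name(rec["vendor"])].append(i)
    duplicates = {name: rows for name, rows in seen.items() if len(rows) > 1}
    return errors, duplicates


# Hypothetical sample data, invented for illustration only.
records = [
    {"vendor": "Smith & Sons Electric", "amount": "1200"},
    {"vendor": "Smith and Sons Electric", "amount": "12OO"},  # letter O, not zero
    {"vendor": "Acme Concrete", "amount": "850.50"},
]

errors, duplicates = check_records(records)
print(errors)      # rows whose amount is not a number
print(duplicates)  # vendor names that appear under more than one spelling
```

A check like this only flags suspect rows; someone who knows the data still has to decide whether two similar names really are the same company.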
Regardless of the industry, data standards are a necessity if we want our data to work for us. But as our roundtable discussion highlighted, the primary reason the AEC industry doesn’t have standards in place is that establishing them is, to put it simply, hard. Getting things standardized across a single organization is a herculean task, and we're trying to do it across a massive, global industry.
Still, though, our subject matter experts said that all of this work is worth it. “It takes a lot of resources and budget to set these things up upfront,” Liz said. “But if you set it up, then you can forecast how much money is saved in the future because you're no longer paying 20 people to do what one person with one system could do, right? And then you can have those same people doing more productive things.”
And remember to bring your team along for the ride. The focus on data will require new processes, and you might get pushback when asking team members to learn them or comply with new standards. Amanda encouraged us to remember “the educational piece of no one's attacking anyone. It's just that this is what's going to make things more efficient. That change management piece is always going to be important,” she said.
If you want to learn more about how to clean up your data pond so you can trust your data and realize efficiency gains, watch the full recording of the Tech Stack roundtable.
Join the next Technology Stack breakout discussion, where fellow Construction Dork, Jonathan Marsh, leads the debate with Nichole Carter and DJ Phipps around why we still haven’t gone “all-in” on BIM. Register here.