Did you know that U.S. water utilities could save $17.6 billion through 2027 by using advanced solutions incorporating AI and predictive analytics to help extend asset life cycles with more robust insight into asset condition?*
For water mains, AI and predictive modeling can assess the likelihood of failure (LoF) for every water main in a given time period. Combined with an assessment of the consequence of failure (CoF), or impact of that failure, this approach gives water systems the most precise picture of risk in their underground infrastructure, enabling proactive water main management.
Predictive modeling is the new standard in water main management (and water infrastructure management broadly). However, it can be difficult to know where to begin when you’re assessing and selecting an AI or predictive analytics solution.
There are 3 core needs to consider when selecting a predictive modeling solution for water mains:
- The modeling approach
- Prediction accuracy and explainability
- Integration with your everyday tools and processes
In this blog post, we’ll walk you through these 3 core needs, highlighting questions to ask and capabilities to look for to find the best predictive modeling solution for your water system’s needs.
Choosing the best predictive modeling approach for your water mains
AI and predictive modeling can mean many different things, and with nearly every company now claiming to offer AI solutions, it’s important to understand solution capabilities and how predictions are being generated.
The type of modeling matters because you need a model that captures all of the nuances of your unique water mains data to provide the highest accuracy LoF predictions. Here are some questions to consider:
Are LoF predictions generated using machine learning, regression analysis, or another method?
When it comes to water mains, there are 3 main approaches to generating likelihood of failure predictions: simple weighted models, weighted models with regression analysis, and machine learning models. Both regression analysis and machine learning models rely on statistics to generate predictions.
The highest accuracy predictions for water mains will typically come from machine learning models and algorithms, supported by expert Data Scientists.
Why? Machine learning models account for non-linear relationships between variables, as well as variable interaction. A more traditional regression approach will miss these relationships in your data unless an analyst identifies them and adds them to the model by hand. Learn more about these approaches here.
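To make variable interaction concrete, here is a minimal sketch in Python. The break rates are made up for illustration, and the soil and material labels are hypothetical. It fits a purely additive model (an overall rate plus a soil effect plus a material effect, with no interaction term) to a small table where only one soil and material combination is a problem.

```python
# Hypothetical annual break rates by (soil type, pipe material). Made-up numbers.
observed = {
    ("corrosive", "cast_iron"): 0.30,
    ("corrosive", "pvc"): 0.02,
    ("neutral", "cast_iron"): 0.02,
    ("neutral", "pvc"): 0.02,
}

def additive_fit(rates):
    """Least-squares fit of rate = overall + soil effect + material effect.

    For a balanced table like this one, the least-squares solution is the
    overall mean plus each factor's deviation from that mean.
    """
    overall = sum(rates.values()) / len(rates)
    soils = sorted({soil for soil, _ in rates})
    materials = sorted({material for _, material in rates})
    soil_effect = {
        s: sum(rates[(s, m)] for m in materials) / len(materials) - overall
        for s in soils
    }
    material_effect = {
        m: sum(rates[(s, m)] for s in soils) / len(soils) - overall
        for m in materials
    }
    return {
        (s, m): overall + soil_effect[s] + material_effect[m]
        for s in soils
        for m in materials
    }

fitted = additive_fit(observed)
for key in sorted(observed):
    print(key, "observed:", observed[key], "additive fit:", round(fitted[key], 3))
```

In this toy table the additive fit understates the one risky combination (0.23 instead of 0.30), overstates the two mixed combinations (0.09 instead of 0.02), and even produces a negative rate for neutral soil with PVC. A model that can learn interactions on its own does not need someone to spot that combination in advance.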
Does the predictive modeling solution add additional datasets to your water main data?
Water systems often have a lot of historical data to lean on. This data is crucial for any type of predictive modeling process. That said, many variables influence water main risk beyond system-specific data, including weather, soil, parcel data, and more.
There might even be data that exists in a water system’s ecosystem but in a different tool or location. A great example of this type of data is repair history and cost, which might exist in a financial or asset management tool separate from the infrastructure data source of truth.
When choosing a modeling approach for your water main management, you need to understand what data the model will be able to access, and what data will be added beyond your existing dataset. At minimum, it’s important to select a modeling approach that considers relevant factors for water main risk, including but not limited to parcel, weather, and soil data.
Do the water main LoF predictions rely on a global model or your specific data?
Your predictions are generated based on the underlying dataset, so you need to understand what data is contributing to the likelihood of failure predictions. A water main model can rely on either a local or a global dataset to generate predictions.
- A local dataset means that all of the data contributing to your predictions is data from and related to your specific water system.
- A global dataset means that data from many water systems and locations is grouped together and used to derive predictions.
The only benefit of a global dataset is that it is bigger, which can help with a statistical analysis process. However, it’s risky to rely on predictions driven by a global dataset. If you are a water system in the Southeastern US, your predictions will be influenced by frost patterns in the Northeast, soil and pipe material interactions on the West Coast, and more. A local dataset, while smaller, is a better option for predictive modeling where interactions between variables, such as the interaction between certain types of soil and pipe materials, really matter.
How is Consequence of Failure (CoF), or criticality, calculated or assessed?
Though predictive modeling is most relevant for likelihood of failure (LoF) predictions, the approach to assessing CoF, or criticality, must also be considered. Most criticality analyses generate a CoF score, which is combined with the likelihood of failure prediction to generate an overall risk score. The most important thing regarding criticality is that the score considers multiple variables that matter to your system, including pipe size, the financial and social impact of a break, service to a critical customer (a hospital, for example), and environmental impact.
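As a rough illustration of how a CoF score and an LoF prediction can be combined, here is a minimal Python sketch. The factor weights, the 0 to 1 scoring scale, and the choice to multiply LoF by CoF are all assumptions made for this example. Multiplying is one common convention, not necessarily how any particular solution calculates risk.

```python
# Hypothetical CoF factor weights (assumed for illustration; they sum to 1).
COF_WEIGHTS = {
    "pipe_size": 0.25,
    "financial_impact": 0.25,
    "social_impact": 0.20,
    "critical_customer": 0.20,
    "environmental_impact": 0.10,
}

def cof_score(factors):
    """Weighted average of factor scores, each scored from 0 (low) to 1 (high)."""
    return sum(COF_WEIGHTS[name] * factors[name] for name in COF_WEIGHTS)

def risk_score(lof, cof):
    """Assumed convention for this sketch: risk = likelihood x consequence."""
    return lof * cof

# Two made-up mains: A is more likely to fail, B serves a hospital.
mains = {
    "main_A": {
        "lof": 0.20,
        "factors": {"pipe_size": 0.2, "financial_impact": 0.2,
                    "social_impact": 0.1, "critical_customer": 0.0,
                    "environmental_impact": 0.1},
    },
    "main_B": {
        "lof": 0.08,
        "factors": {"pipe_size": 0.9, "financial_impact": 0.8,
                    "social_impact": 0.7, "critical_customer": 1.0,
                    "environmental_impact": 0.5},
    },
}

risks = {
    name: risk_score(main["lof"], cof_score(main["factors"]))
    for name, main in mains.items()
}
ranked = sorted(risks, key=risks.get, reverse=True)
print(ranked, risks)
```

Here main_B ranks above main_A even though it is less likely to fail, because the consequence of a break there is much higher. That is the reason to look at LoF and CoF together rather than LoF alone.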
Understanding and trusting the accuracy of your water main predictions
AI and predictive modeling solutions can sometimes be challenging to understand. So when considering a predictive modeling approach for your water main management, you must factor in both performance and explainability. How accurate are the predictions? What is driving these results? Performance and explainability matter because they build trust in the data and better help you manage risk. Below are a few questions to get you started:
Does the solution show what factors drive the predictions for every water main?
Although you need a data scientist to generate predictions using machine learning models, you shouldn’t need to be a data scientist to understand the results.
Explainability is the antidote to the feeling that AI and machine learning-generated predictions are a ‘black box.’ Explainability is exactly what it sounds like: for every prediction generated, a high-quality solution helps you understand what factors drive the predictions across your water system. It also enables you to understand what’s driving the likelihood of failure prediction for each individual water main and why the prediction for one water main is different from, for example, the prediction for the next water main down the road.
How will the solution provide “proof” that the predictions reflect the risk in your water main infrastructure?
In addition to explainability, any predictive modeling solution needs to show that the predictions generated accurately represent risk and improve your ability to manage those risks in your water main infrastructure. This is a question of model performance — how good are the predictions?
The most common way to measure the performance of a predictive model is to hold out some of the data from analysis and then use that data to test the model’s performance. For water mains, this is often done as a ‘time shift study.’ A time shift study removes the last year (or several years) of data from the analysis. Then, the model, built only on the earlier data, generates likelihood of failure predictions for the held-out time period, allowing you to compare the predictions to what actually happened during that period.
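The mechanics of a time shift study can be sketched in a few lines of Python. Everything here is hypothetical: the break records, the cutoff year, and especially the "model," which is just a count of past breaks standing in for a real predictive model.

```python
# Hypothetical break records: (main_id, year_of_break). Made-up data.
break_history = [
    ("M1", 2019), ("M1", 2021), ("M1", 2023),
    ("M2", 2020), ("M2", 2021), ("M2", 2022),
    ("M3", 2022), ("M3", 2023),
    ("M4", 2018),
]
all_mains = ["M1", "M2", "M3", "M4", "M5"]
HOLDOUT_START = 2023  # breaks from this year on are hidden from the model

def time_shift_split(records, holdout_start):
    """Split break records into what the model may see and what is held out."""
    train = [r for r in records if r[1] < holdout_start]
    holdout = [r for r in records if r[1] >= holdout_start]
    return train, holdout

def score_mains(train_records, mains):
    """Stand-in for a real model: score = number of breaks seen in training."""
    counts = {m: 0 for m in mains}
    for main_id, _year in train_records:
        counts[main_id] += 1
    return counts

train, holdout = time_shift_split(break_history, HOLDOUT_START)
scores = score_mains(train, all_mains)

# Flag the two highest-scoring mains, then check them against the held-out year.
flagged = sorted(all_mains, key=lambda m: scores[m], reverse=True)[:2]
actually_broke = {main_id for main_id, _year in holdout}
hits = [m for m in flagged if m in actually_broke]

print("flagged:", flagged)
print("broke in holdout:", sorted(actually_broke))
print("flagged mains that broke:", hits)
```

The key design point is that nothing from the held-out period is allowed to influence the scores. If it does, the comparison no longer tells you how the model would have performed on a future it had not seen.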
When looking at the results of a time shift study, there are two main things to know:
- You can’t rely on a single metric
- Performance metrics must always be compared to the baseline
What metrics are used to assess the model’s performance?
You can’t rely on a single metric to assess the performance of a predictive model.
As BlueConduit’s expert Data Scientists would say, “It’s easy to lie with statistics.” Essentially, this means that you can make statistical results say whatever you want them to say, especially if you only look at a single metric. For example, let’s say you want to predict how many people, out of a random group of 100, live past age 80. If you predict that every single person will live past 80, you will have correctly identified everyone who does, regardless of how many people that is. Judged only on that measure, the prediction looks perfect, even though it is wrong about every person who does not reach 80 and tells you nothing useful.
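The age-80 example can be worked through in numbers. This Python sketch assumes, purely for illustration, that 30 of the 100 people actually live past 80, and it uses two standard metrics: recall (the share of actual positives you caught) and precision (the share of your positive predictions that were right).

```python
# Assumption for illustration: 30 of 100 people actually live past 80.
actual = [True] * 30 + [False] * 70
predicted = [True] * 100  # the "everyone lives past 80" prediction

def confusion_counts(predicted, actual):
    """Count true positives, false positives, and false negatives."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum((not p) and a for p, a in zip(predicted, actual))
    return tp, fp, fn

tp, fp, fn = confusion_counts(predicted, actual)
recall = tp / (tp + fn)     # share of people who lived past 80 that we caught
precision = tp / (tp + fp)  # share of our "lives past 80" calls that were right

print("recall:", recall)
print("precision:", precision)
```

Recall comes out at 1.0, which looks perfect, while precision is only 0.3. Reported alone, either number gives a misleading picture of the prediction.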
To assess the performance of statistical models, you must look at multiple performance measures and understand what each one is measuring. (Learn more about performance metrics in BlueConduit’s recent webinar.) Which brings us to the next question…
Are performance metrics compared to baseline models?
In addition to assessing predictive model performance using multiple metrics, you must ensure those performance metrics are being compared to the baseline.
Why? Because the core benefit of predictive analytics is improvement compared to where you are now.
If a water system is currently assuming that older water mains are more likely to fail, they’d want to compare predictive model performance to a risk assessment based only on age. How do the two different analyses compare? Which one is more accurately predicting the water mains that fail?
It’s important to compare to a baseline model because no predictive model is 100% accurate. So, you really want to understand how much better the predictions are compared to whatever approach you’re using today.
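Here is a minimal Python sketch of a baseline comparison. The mains, install years, model scores, and break outcomes are all made up, and the metric (how many actual breaks land in the top three ranked mains) is just one simple choice among many.

```python
# Hypothetical mains: (main_id, install_year, model_lof, broke_in_holdout).
# All values are made up for illustration.
mains = [
    ("M1", 1935, 0.05, False),
    ("M2", 1948, 0.22, True),
    ("M3", 1962, 0.31, True),
    ("M4", 1971, 0.04, False),
    ("M5", 1985, 0.18, True),
    ("M6", 1999, 0.02, False),
]
TOP_K = 3  # how many mains the utility can afford to act on

def breaks_captured(ranked_ids, broke, top_k):
    """Number of mains in the top_k of a ranking that actually broke."""
    return sum(1 for main_id in ranked_ids[:top_k] if main_id in broke)

broke = {m[0] for m in mains if m[3]}

# Baseline: assume the oldest mains are the riskiest.
by_age = [m[0] for m in sorted(mains, key=lambda m: m[1])]
# Model: rank by predicted likelihood of failure, highest first.
by_model = [m[0] for m in sorted(mains, key=lambda m: m[2], reverse=True)]

baseline_hits = breaks_captured(by_age, broke, TOP_K)
model_hits = breaks_captured(by_model, broke, TOP_K)

print("age-only baseline captured:", baseline_hits, "of", len(broke))
print("model captured:", model_hits, "of", len(broke))
```

In this toy data the age-only ranking catches two of the three breaks and the model ranking catches all three. The number that matters is the gap between the two, since it shows what the model adds over the approach already in use.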
At BlueConduit, we compare predictions generated using machine learning models to 3 baseline models: year installed only, past breaks only, and a weighted model that considers pipe age, soil type, and history of breaks, with variable weights assigned using regression analysis (learn more about the weighted model approach here).
Integrating water main predictions with your everyday tools and processes
AI and predictive analytics aim to help water systems and their engineering partners make better, more proactive decisions to manage the risk in their water main infrastructure. This is why the tools and delivery approach matter; it doesn’t matter how accurate the predictions are if they are delivered in a CSV file that only gets a passing glance before gathering dust on an analyst’s desk.
When considering a predictive modeling solution, remember to think about who will use those predictions, and how, to make sure you’re finding the right fit for your team. Here are some questions to think about:
How will you provide data for predictive analysis?
You likely already have a data source of truth, or maybe a few, for your water mains data. Understanding what type of data is required for modeling, and in what format, will help you understand the initial implementation lift and ongoing management needs for your team.
Solutions or software that pull data directly from your existing source of truth, like BlueConduit’s Water Main Predictions tool, make it easy to keep data up-to-date while limiting time and administrative work for key team members.
How is data delivered back to you, and can you easily engage with that data?
Likelihood of failure predictions, alongside criticality analysis and overall risk scores, can be provided back to your team in a range of formats, from a PDF to a software tool.
Again, it’s important to consider who will use this data and for what purpose. If we assume the goal of using predictive analytics is more proactive planning for condition assessments, maintenance, and/or replacement in high-risk areas, it’s likely that you want data delivered in a format where team members can easily engage with the results and use predictions to drive tactical or strategic planning. For example, BlueConduit delivers predictive data and risk analysis in an easy-to-use software tool integrated into your existing Esri environment (no coding or special expertise required!).
Do predictions update dynamically?
Predictions represent a moment in time. While static predictions might be a realistic representation of risk at the specific moment they are created, they will quickly become outdated (especially if you are looking at an area with lots of seasonality and weather/temperature shifts). Software tools with high-quality data integrations keep your dataset up-to-date, which enables up-to-date predictions and continual assessment of the highest-risk mains and the areas of greatest need.
BlueConduit’s Water Main Predictions tool lives inside your Esri environment and pulls data directly from your water main data source of truth. This keeps your predictions up-to-date and gives you insight into mains where risk has increased (or decreased!) since your last look.
Do predictions easily transfer to your asset management and/or work order tools?
The purpose of using predictive modeling to assess the likelihood of failure and overall risk for your water mains network is to enable you to proactively manage that risk. But it’s hard to be proactive if it is a pain to get the risk data into the tools you use to manage boots-on-the-ground work.
So when you are assessing predictive modeling solutions, always ask yourself, “How does data move between my data source-of-truth, asset management, and work order systems?” Solutions that integrate with your existing tools and automate data transfer from one tool to another will make it much easier for your team to utilize predictive analytics in your daily work.
Predictive modeling: The future of water main performance management
When it comes to water mains, predictive modeling presents a tremendous opportunity to more proactively assess and manage risk and likely infrastructure failure. However, it can be challenging to find the best tool for your team and processes. When selecting a tool, it’s crucial to understand the modeling approach, the accuracy of predictive results, and how your team will be able to engage with the tool for insights and planning.
BlueConduit’s Water Main Predictions tool is the easiest-to-use, highest accuracy tool on the market. Delivered directly into Esri, our technology uses world-class machine learning algorithms, supported by in-house expert Data Scientists, to build local, custom models based on your data and enriched with parcel, weather, and soil data (and more!). Our goal is simple – combine our data expertise with your water expertise to provide you with the best predictive and decision analytics tools, as well as the support you need to use them well.
*Source: Bluefield Research, Analyst Presentation: U.S. Digital Water Market, March 2024
Want to learn more about BlueConduit’s Water Main Predictions tool and modeling approach? Learn from our expert team in this on-demand webinar.