Over the years I've been involved with various projects and discussions on generating and handling data in wastewater treatment. A few years ago I was involved in a couple of WERF Projects focused on developing Decision Support Systems (DSS) to prevent plant upsets, along with Dr Nancy Love and Advanced Data Mining (ADMi). The folks at ADMi did some nice data analytics to pick out anomalies that might indicate toxins in the plant influent, but one of the major hurdles we ran into was distinguishing anomalies due to toxins from anomalies due to measurement problems. This reminded me of what my ex-boss and mentor, Dr John Watts, used to drill into me: you need to focus on good primary measurements in order to have confidence in your data. Wastewater is a tough place to try to do that! As I said, a lot of our data is bad.
So, here is my brain dump on some of the keys to making big data work in wastewater, and avoiding the pitfalls of bad big data (there's a tongue-twister there somewhere...)!
5 keys to making big data work
1. Focus on data quality rather than quantity
- Clean them - wastewater is an extremely fouling environment and not the best place to put scientific equipment. My experience has been that everyone underestimates how quickly sensors become fouled. Go for auto-cleaning whenever possible, and avoid installing anything in raw sewage or primary effluent unless you really need the measurement (see Key #2!), as these areas are particularly prone to fouling. Mixed liquor is actually an easier place to take measurements, and final effluent the easiest of all!
- Calibrate them - this is generally understood, though sensors are generally calibrated less often than is ideal, particularly those that tend to drift.
- Validate them - this is the piece that's overlooked by most instrumentation suppliers, I think. Analytics to validate the measurements, particularly during calibration, is an area that needs much more attention.
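To make the "validate them" idea a bit more concrete, here is a minimal sketch of two very basic validation checks: a range check and a flatline check. The function name, the readings, and all thresholds are hypothetical and only for illustration; real validation would need limits tuned to the specific sensor and location.

```python
from typing import List


def flag_suspect(readings: List[float], low: float, high: float,
                 flat_window: int = 4, flat_tol: float = 0.01) -> List[str]:
    """Label each reading 'ok', 'out_of_range', or 'flatline'.

    low/high, flat_window and flat_tol are illustrative assumptions,
    not recommended values.
    """
    flags = []
    for i, value in enumerate(readings):
        # Range check: physically implausible values
        if value < low or value > high:
            flags.append("out_of_range")
            continue
        # Flatline check: trailing window that barely changes
        window = readings[max(0, i - flat_window + 1): i + 1]
        if len(window) == flat_window and max(window) - min(window) <= flat_tol:
            flags.append("flatline")
        else:
            flags.append("ok")
    return flags


# Hypothetical dissolved-oxygen readings in mg/L
do_mg_l = [2.1, 2.3, 2.0, 1.8, 1.8, 1.8, 1.8, 25.0, 2.2]
flags = flag_suspect(do_mg_l, low=0.0, high=15.0)
print(flags)
```

Neither check can say why a reading is suspect (a fouled sensor and a stuck signal can look the same), but flagging suspect points before any further analytics is a cheap first step.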
2. Measure what matters most
3. Think dynamics, not steady state
[Figure: Graphic showing difference between composite sample and continuous measurement (Courtesy Dr. Leiv Rieger/WEF, taken from WEF Modeling 101 Webcast)]
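The difference between a composite sample and a continuous measurement can also be shown with a few lines of arithmetic. This is a sketch using an invented hourly ammonia profile (a simple sinusoid), and it assumes a time-proportional composite with equal-volume aliquots; a flow-proportional composite would weight the samples differently.

```python
import math

# Hypothetical hourly influent ammonia profile (mg N/L): sinusoid around 25
hourly = [25 + 10 * math.sin(2 * math.pi * (h - 6) / 24) for h in range(24)]

# A time-proportional 24-hour composite reports only the average
composite = sum(hourly) / len(hourly)

# A continuous measurement also sees the extremes
peak = max(hourly)
trough = min(hourly)

print(f"composite: {composite:.1f}, peak: {peak:.1f}, trough: {trough:.1f}")
```

In this made-up profile the composite sits at the midpoint while the continuous signal swings well above and below it, which is exactly the dynamic information a single daily number hides.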
4. Recognize different timescales
- Diurnal (daily) variations
- Weekly trends (especially weekend versus weekday differences)
- Seasonal shifts
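As a rough illustration of separating these timescales, the sketch below groups two weeks of synthetic hourly flow data by hour of day (diurnal pattern) and by weekday versus weekend. The data, the start date, and the size of the effects are all invented for the example; seasonal shifts would need a much longer record, grouped by month in the same way.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean
import math

# Synthetic hourly flow: diurnal sinusoid plus a lower weekend baseline
start = datetime(2024, 1, 1)  # a Monday
samples = []
for h in range(14 * 24):
    t = start + timedelta(hours=h)
    diurnal = 20 * math.sin(2 * math.pi * (t.hour - 6) / 24)
    weekend_drop = -10 if t.weekday() >= 5 else 0  # Saturday=5, Sunday=6
    samples.append((t, 100 + diurnal + weekend_drop))

# Group the same data two ways, one per timescale
by_hour = defaultdict(list)
by_daytype = defaultdict(list)
for t, flow in samples:
    by_hour[t.hour].append(flow)
    by_daytype["weekend" if t.weekday() >= 5 else "weekday"].append(flow)

diurnal_profile = {hour: mean(vals) for hour, vals in sorted(by_hour.items())}
daytype_means = {kind: mean(vals) for kind, vals in by_daytype.items()}

print(daytype_means)
print(diurnal_profile[0], diurnal_profile[12])
```

Averaging by hour of day exposes the daily shape, while averaging by day type exposes the weekend shift; looking only at one overall mean would blur both.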