Thinking Proactively Will Maintain the Uptime of Your Network

By Andrew Erickson

August 23, 2024

Proactive thinking isn't just about maintaining uptime. It's about safeguarding your entire operation. By anticipating potential issues, you can address them before they cause trouble. Preventing problems where you can, and responding quickly where you can't, lets you avoid disruptions and reduce operational risk.

Together, let's use snippets from the book "100% Uptime" by Bob Berry to better understand how to maintain network uptime.

Equipment Failures are Costly and Damaging

"When you can't effectively monitor your network, you can't just write a bigger check to keep things running. When serious failures happen, that can actually disrupt your service delivery. This hurts revenue, incurs regulatory fines, or - in the case of public safety - puts lives at risk."

Many clients reach out to us about issues related to network downtime. The following hypothetical scenario is not unusual.

Imagine a remote site with critical telecommunications equipment housed in a cabinet. This site is essential for maintaining your company's communication network, and any downtime could lead to significant revenue loss and potential safety hazards.

You need to monitor a range of equipment, including a diesel generator with contact closures, two strings of batteries (24 cells each, totaling 48 volts per string), and three HVAC units with MODBUS out and airflow sensors. Additionally, you need discrete sensors for door access control and water detection on the floor. Temperature and humidity sensors are required at multiple points within the cabinet.

Without proper monitoring, equipment failure could result in costly repairs, potential fires, and regulatory fines. The worst incident so far involved a generator failure during a storm, leading to a complete site blackout and a 4-hour drive for a technician to resolve the issue. Your team cannot afford such downtime again, and the operations manager would face severe consequences if it happened.

Our clients face challenges like these all year. To prevent these incidents, you need to remain as proactive as possible. Here's what to do:

Identify Critical Elements to Monitor

The book begins chapter five by posing a question for you:

"Why would you spend $100,000 or more on mission-critical equipment at a remote facility, then neglect spending less than 1% of that amount on basic monitoring to protect it?"

To ensure 100% uptime, begin by identifying all critical elements at the site. The critical elements from our hypothetical scenario include:

  • A diesel generator with contact closures.
  • Two strings of batteries (24 cells each, totaling 48 volts per string).
  • Three HVAC units with MODBUS out and airflow sensors.
  • Temperature and humidity sensors at multiple points.
  • Discrete sensors for door access control.
  • Water detection on the floor.

Identifying all the critical elements you plan to monitor lays the foundation for a proactive maintenance strategy. By understanding which components are most important to your network's operation, you can prioritize monitoring efforts.

This ensures potential issues are detected early and addressed before they escalate into significant downtime or operational failures. This targeted approach minimizes risks and enhances the overall reliability of the system.
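
As a rough illustration, the inventory above can be written down as data before any equipment is chosen, which makes it easy to count how many inputs of each type the site needs. This is only a sketch: the `MonitoredPoint` class, the point types assigned to each item, and the count of four temperature/humidity points are assumptions made for the example. The 2-volt nominal cell value is simply what the scenario's numbers imply (48 volts across 24 cells).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MonitoredPoint:
    name: str
    kind: str       # "discrete", "analog", or "modbus" (assumed categories)
    count: int = 1


# Hypothetical inventory for the cabinet scenario described above.
SITE_POINTS = [
    MonitoredPoint("generator contact closures", "discrete"),
    MonitoredPoint("battery string voltage", "analog", count=2),
    MonitoredPoint("HVAC unit", "modbus", count=3),
    MonitoredPoint("HVAC airflow", "analog", count=3),
    MonitoredPoint("temperature/humidity", "analog", count=4),  # assumed count
    MonitoredPoint("door access", "discrete"),
    MonitoredPoint("floor water detection", "discrete"),
]

NOMINAL_CELL_VOLTS = 2.0  # implied by 48 V per 24-cell string


def points_by_kind(points):
    """Total the number of inputs needed for each point type."""
    totals = {}
    for point in points:
        totals[point.kind] = totals.get(point.kind, 0) + point.count
    return totals


def nominal_string_volts(cells):
    """Nominal voltage of a series battery string with the given cell count."""
    return cells * NOMINAL_CELL_VOLTS
```

A tally like `points_by_kind(SITE_POINTS)` gives a first estimate of the discrete, analog, and MODBUS capacity your monitoring equipment would need at this site.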

Establish a Comprehensive Monitoring Strategy

Establishing a comprehensive monitoring strategy provides real-time visibility into the performance and health of all critical equipment.

"Without good remote visibility, your expensive infrastructure equipment is at risk."

By implementing a strong monitoring system, you can quickly detect problems, enabling prompt responses to potential failures.

Readiness to respond minimizes downtime and protects both your operational integrity and revenue. A proactive approach like this enhances reliability and supports informed decision-making, so you can focus effort where it is needed most.

Centralized Monitoring and Real-Time Alerts

To respond promptly to issues, set up alert mechanisms such as email and SMS alerts.

"A good remote monitoring system will collect data from all of your remote locations, process it, and alert your staff in the correct way. Serious alarms will be clear and obvious. Unimportant "nuisance alarms" will be suppressed to reduce distractions. Alerts will appear on a console screen, as an email, or as an SMS text message."

These alerts can be configured to send notifications to key personnel. By consolidating all alarms and providing a single interface for monitoring, you ensure that critical issues are identified and addressed quickly.
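
The routing idea in the quote can be sketched in a few lines of Python. This is a minimal illustration, not how any particular product works: the four severity levels, the `route_alarm` function, and the rule for which severity reaches which channel are all assumptions chosen for the example.

```python
from enum import IntEnum


class Severity(IntEnum):
    # Hypothetical severity scale for this sketch.
    NUISANCE = 0
    MINOR = 1
    MAJOR = 2
    CRITICAL = 3


def route_alarm(severity):
    """Return the notification channels an alarm of this severity should use."""
    if severity == Severity.NUISANCE:
        return []  # suppressed so it does not distract staff
    channels = ["console"]
    if severity >= Severity.MAJOR:
        channels.append("email")
    if severity >= Severity.CRITICAL:
        channels.append("sms")
    return channels
```

Under these assumed rules, a nuisance alarm produces no notification at all, while a critical alarm appears on the console and also goes out by email and SMS.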

Environmental Monitoring

Deploy sensors to monitor temperature and humidity at various points within the cabinet. This is essential for preventing equipment overheating and sustaining optimal operating conditions. The book mentions an example of sensor use:

"You can put temperature and airflow sensors on an HVAC vent to measure cooling (or heating) effectiveness."

D-Wire sensors are another example. They can be daisy-chained to cover multiple battery cells, tracking voltage, temperature, and internal resistance.
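
Analog readings such as temperature only become alarms once they are compared against thresholds. The sketch below assumes a four-threshold scheme (major under, minor under, minor over, major over). The `Thresholds` class, the `classify` function, and the Fahrenheit values are hypothetical and should be replaced with limits that suit your own equipment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    major_under: float
    minor_under: float
    minor_over: float
    major_over: float


def classify(reading, limits):
    """Map an analog reading to an alarm state using four thresholds."""
    if reading <= limits.major_under:
        return "major under"
    if reading <= limits.minor_under:
        return "minor under"
    if reading >= limits.major_over:
        return "major over"
    if reading >= limits.minor_over:
        return "minor over"
    return "normal"


# Hypothetical cabinet temperature limits in degrees Fahrenheit.
CABINET_TEMP_F = Thresholds(
    major_under=32.0, minor_under=41.0, minor_over=86.0, major_over=95.0
)
```

Two levels on each side give staff an early warning (minor) before a reading reaches the point where equipment is at risk (major).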

Power and Backup Systems

Monitor the diesel generator using contact closures. Keep track of fuel levels and generator status to prevent unexpected failures. Implementing a comprehensive power monitoring solution helps ensure that your site remains operational, even during power outages.

Implementing the Right Equipment

Select the appropriate Remote Terminal Unit (RTU) that can handle multiple types of alarms (discrete, analog, ping). The book notes that with the right RTU:

"You can log into [it] at any time to check site status."

Verify the RTU supports secure communication protocols like SNMPv3 and is compatible with MODBUS for HVAC monitoring. The "100% Uptime" book mentions that:

"If you're not tracking your HVAC and generator run times with basic sensors and analysis, you can't assess whether you're short-cycling them."

You need to know whether you're short-cycling your equipment. Short cycles reduce the total lifespan of your machinery and increase the likelihood of unexpected failures, and tracking run times is how you catch them. For the batteries, use daisy-chained sensors to cover multiple cells, tracking voltage, temperature, and internal resistance.
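
One simple way to analyze run times is to flag any run that ends sooner than a minimum duration. The sketch below assumes you already have a log of start and stop times for each HVAC or generator run. The 10-minute default and the function names are assumptions for illustration, not manufacturer guidance.

```python
from datetime import datetime, timedelta


def find_short_cycles(runs, min_run=timedelta(minutes=10)):
    """Return the (start, stop) pairs whose run time is shorter than min_run."""
    return [(start, stop) for start, stop in runs if stop - start < min_run]


def short_cycle_ratio(runs, min_run=timedelta(minutes=10)):
    """Fraction of logged runs that were short cycles (0.0 if no runs)."""
    if not runs:
        return 0.0
    return len(find_short_cycles(runs, min_run)) / len(runs)
```

A rising ratio over several weeks is a hint that the unit deserves a closer look before it fails.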

Implement door sensors and integrate them with your RTU for real-time access control. Make sure your system supports remote access. This allows for immediate response to unauthorized entries.

Setting Up Effective Alert Mechanisms

Use a master station to consolidate all alarms and notifications under a single interface for monitoring. Having a single interface streamlines the monitoring process, allowing for quicker identification and prioritization of critical issues. This not only enhances response times but also reduces the risk of alarm fatigue.

"A web interface makes it easier to handle high alarm volume, as you can see which alarms have cleared and which ones are still standing. You can also see if a coworker has 'acknowledged' an alarm (agreed to work on it) already - enabling you to focus on something else."
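
The standing, cleared, and acknowledged states described in the quote can be modeled with a small record per alarm. This is only a sketch of the idea: the `Alarm` class and the `needs_attention` helper are hypothetical names, not part of any real web interface.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Alarm:
    name: str
    standing: bool = True                  # False once the condition clears
    acknowledged_by: Optional[str] = None  # who agreed to work on it, if anyone

    def acknowledge(self, user):
        self.acknowledged_by = user

    def clear(self):
        self.standing = False


def needs_attention(alarms):
    """Names of alarms that are still standing and that nobody has acknowledged."""
    return [a.name for a in alarms if a.standing and a.acknowledged_by is None]
```

Filtering this way shows each technician only the alarms that no coworker has picked up yet.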

Implement derived controls to automatically take corrective actions, such as switching to backup power if the generator fails. Besides correcting the fault, derived controls reduce the burden on personnel during critical situations.

This proactive step minimizes the risk of human error, reinforcing the system's ability to remain both operational and efficient. This ultimately safeguards your network's uptime and stability.
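
As a rough sketch of how a derived control can be expressed, the Python below evaluates a rule over the current alarm states and reports whether a control relay should operate. The alarm names, the `evaluate_rule` function, and the rule format are hypothetical. A real RTU or master station has its own configuration syntax for this.

```python
def evaluate_rule(alarms, all_of=(), any_of=()):
    """Return True when a derived control should operate.

    alarms: dict mapping alarm name to True (standing) or False (clear).
    all_of: every alarm named here must be standing.
    any_of: if given, at least one alarm named here must be standing.
    """
    if not all_of and not any_of:
        raise ValueError("a rule needs at least one alarm name")
    all_met = all(alarms.get(name, False) for name in all_of)
    any_met = any(alarms.get(name, False) for name in any_of) if any_of else True
    return all_met and any_met


# Hypothetical snapshot of alarm states at the site.
state = {
    "commercial_power_fail": True,
    "generator_fail": True,
    "battery_low": False,
}

# Operate the relay only when both power sources have failed.
operate_relay = evaluate_rule(
    state, all_of=("commercial_power_fail", "generator_fail")
)
```

Treating an unknown alarm name as clear (`alarms.get(name, False)`) is a deliberate choice in this sketch: a missing input never triggers an automatic action on its own.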

Regular Testing and Maintenance

Make sure your monitoring system is regularly tested and maintained. Schedule regular inspections and tests of all sensors and RTUs. Keep all monitoring equipment up-to-date with the latest firmware.

Regular updates can provide enhanced features, fix vulnerabilities, and improve compatibility with other devices. This protects your network from potential failures or breaches that could disrupt operations.

"Just like equipment in your network, people with unique knowledge and training can be single points of failure. It's your job to reduce these risks."

Provide ongoing training for your technical staff so they're well-equipped to handle the complexities of monitoring systems and to respond to alarms effectively.

Future-Proof Your Monitoring Infrastructure

Invest in solutions that can adapt to future advancements.

"Once you grow beyond 10 sites, monitoring exclusively via individual RTUs is going to get very cumbersome very quickly."

Choosing future-proof equipment, like the multifunction network alarm manager T/Mon, helps maintain compatibility with both current and legacy systems. This allows your monitoring infrastructure to adapt as your network grows and evolves without the need for frequent replacements.

Take the Critical Step Towards Operational Efficiency

Upgrading your remote monitoring system is a necessary step toward attaining operational efficiency and reliability. By choosing DPS Telecom, you gain access to proven technology, tailored solutions, and comprehensive support designed to meet your specific needs.

Contact DPS Telecom at 1-800-693-0351 or email sales@dpstele.com. Our expert engineers are ready to assist you in building the right solution for monitoring your sites.

Andrew Erickson

Andrew Erickson is an Application Engineer at DPS Telecom, a manufacturer of semi-custom remote alarm monitoring systems based in Fresno, California. Andrew brings more than 17 years of experience building site monitoring solutions, developing intuitive user interfaces and documentation, and opt...