The Federal Communications Commission’s (“FCC’s” or the “Commission’s”) Public Safety and Homeland Security Bureau (the “Bureau”) released a Public Notice encouraging communications service providers to implement appropriate measures to prevent major service disruptions.
Based on submissions to the Commission’s Network Outage Reporting System (NORS) and publicly available data, the Bureau has observed a number of major service outages caused by minor changes in network management systems. These so-called “sunny day” outages do not result from a natural weather-related disaster or other unforeseeable catastrophe, and can result in “silent failures,” which are outages that occur without providing explicit notification or alarm to the service provider.
After an analysis of the facts and circumstances surrounding 2014 outages, Bureau staff have determined that service providers likely could have prevented most of these outages if they had implemented certain industry best practices. In particular, the Bureau identified seven best practices, recommended by the Commission’s Communications Security, Reliability, and Interoperability Council (“CSRIC”) II, that could help prevent sunny day outages and silent failures:
- Awareness Training: “Network Operators, Service Providers and Equipment Suppliers should provide awareness training that stresses the severe impact of network failure, the risks of various levels of threatening conditions and the roles components play in the overall architecture.”
- Required Experience and Training: “Network Operators, Service Providers, and Equipment Suppliers should establish a minimum set of work experience and training courses which must be completed before personnel may be assigned to perform maintenance activities on production network elements, especially when new technology is introduced in the network.”
- Access Privileges: “Service Providers, Network Operators, and Equipment Suppliers should have policies on changes to and removal of access privileges upon staff member status changes.”
- Network Change Verification: “Network Operators should establish policies and processes for adding and configuring network elements that include approval for additions and changes to configuration tables. Verification rules should minimize the possibility of receiving inappropriate messages.”
- Network Reconfiguration 911 Assessment: “Service Providers and Network Operators when reconfiguring their network or Emergency Services Gateway should assess the impact on the routing of 911 calls.”
- Diversity Audits: “Network Operators and Public Safety should periodically audit the physical and logical diversity called for by network design of their network segment(s) and take appropriate measures as needed.”
- Network Monitoring: “Network Operators, Service Providers, and Public Safety should monitor their network to enable quick response to network issues.”
The Bureau encourages providers to review and consider voluntarily implementing these practices as appropriate. The Notice also offers additional steps to help prevent future outages or mitigate their impact, including implementing access control, validation, and authentication procedures, software-based alarming, and enhanced outage detection, as well as examining automatic re-routing.
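For readers who want a concrete picture of what "software-based alarming" for silent failures might look like, the following is a minimal illustrative sketch in Python. It is not drawn from the Public Notice. The data model (`TrafficSample`), the function name (`detect_silent_failure`), and the 50% drop threshold are all hypothetical assumptions chosen for illustration. The idea is simply to compare current call volume against a recent baseline and raise an alert when volume falls sharply even though no explicit alarm has fired.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Optional


@dataclass
class TrafficSample:
    """Completed calls in one monitoring interval (hypothetical data model)."""
    interval: str
    completed_calls: int


def detect_silent_failure(
    history: List[TrafficSample],
    current: TrafficSample,
    explicit_alarm_raised: bool,
    drop_threshold: float = 0.5,  # assumed tuning value, not from the Notice
) -> Optional[str]:
    """Return an alert message if volume drops sharply with no explicit alarm."""
    if not history:
        return None  # no baseline to compare against yet

    baseline = mean(sample.completed_calls for sample in history)
    if baseline <= 0:
        return None  # a zero baseline cannot show a meaningful drop

    ratio = current.completed_calls / baseline
    if ratio < drop_threshold and not explicit_alarm_raised:
        return (
            f"Possible silent failure in {current.interval}: "
            f"{current.completed_calls} calls vs. baseline {baseline:.0f}"
        )
    return None


# Usage with made-up numbers: a sharp drop and no alarm from the network.
history = [
    TrafficSample("09:00", 100),
    TrafficSample("09:05", 110),
    TrafficSample("09:10", 90),
]
alert = detect_silent_failure(
    history, TrafficSample("09:15", 20), explicit_alarm_raised=False
)
if alert:
    print(alert)
```

A real deployment would need to account for normal daily and seasonal traffic patterns, and would feed alerts into the provider's existing monitoring and escalation processes. The sketch only shows the basic comparison.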