by Stuart Hardy, Business Unit Manager of EOH’s Carrier and Network Solutions Division
A digital technology environment depends on the efficiency and reliability of its IT assets, so organisations must constantly evaluate server performance and fix problems as soon as they arise. There is no time to waste.
This makes real-time monitoring essential – no one can afford the time to manually assess the functionality of physical systems when there are other issues to address. Whether a server is physical, virtual, on-premises or off, organisations must gain actionable information as quickly as possible to keep networks running smoothly, because delivering on promises to clients and maintaining a competitive edge depends on it.
The importance of constant network monitoring
Even state-of-the-art equipment and the most up-to-date software don't guarantee that systems will remain fault free, so it is critical to have constant insight into the state of any business-critical system. Such systems must be continuously monitored to prevent outages, with pre-configured alerts set up to inform those responsible of possible failures. This keeps them aware of system performance no matter where they are and enables them to react quickly.
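As a minimal sketch of the alerting idea above (the metric names and threshold values here are hypothetical, not taken from any specific monitoring product), a monitor might compare each reading against a configured limit and flag any breach immediately:

```python
# Hypothetical thresholds; real values depend on the business's requirements.
THRESHOLDS = {
    "cpu_percent": 90.0,
    "memory_percent": 85.0,
    "disk_percent": 95.0,
}

def check_readings(readings, thresholds=THRESHOLDS):
    """Return a list of alert messages for any metric over its limit."""
    alerts = []
    for metric, value in readings.items():
        limit = thresholds.get(metric)
        if limit is not None and value > limit:
            alerts.append(f"{metric} at {value:.1f}% exceeds limit of {limit:.1f}%")
    return alerts

# A server reporting high CPU triggers exactly one alert.
print(check_readings({"cpu_percent": 97.2, "memory_percent": 40.0}))
```

In a real deployment these checks would run continuously and route alerts to on-call staff, so problems surface wherever the responsible people happen to be.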
Constant network monitoring also makes it quicker to solve problems. This is because the monitoring solution will already have indicated which device may be causing the failure, reducing the time it takes to identify and address the problem. Also, patterns of failures can be identified, helping staff to better understand the overall health of the network and define improvement actions.
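To illustrate the failure-pattern point (the device names and event log below are invented for the example), simply counting failure events per device already surfaces the likely culprit and reveals recurring trouble spots:

```python
from collections import Counter

def failure_hotspots(events):
    """Rank devices by how often they appear in failure events."""
    return Counter(device for device, _ in events).most_common()

# Hypothetical event log: (device, description) pairs.
events = [
    ("switch-3", "link down"),
    ("server-1", "disk latency"),
    ("switch-3", "link down"),
    ("switch-3", "packet loss"),
]
print(failure_hotspots(events))  # switch-3 stands out with three events
```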
Real-time monitoring and situational awareness
Ensuring uptime depends on real-time monitoring, which in turn depends on two critical factors: requirements and availability.
The monitoring platform must closely match the operational requirements of the business and remain available at all times, so that emerging situations receive immediate attention in real time. It also gives staff constant visibility of the state of risk across security, data, networks, endpoints, and cloud devices and applications.
Continuous monitoring greatly improves what is known as situational awareness for IT managers. This term, coined by Mica Endsley, describes perceiving the elements in the environment, understanding what those elements mean, and using that understanding to project future states. This is key to mitigating risk: full network visibility and situational awareness make it possible to assess all the elements that are currently having an impact on the network, or may do so in the future.
Network monitoring in cloud computing
Cloud computing requires both high- and low-level monitoring.
High-level monitoring provides information on the status of the virtual platform. This is collected at the middleware, application and user layers and is generally more useful for the consumer than the provider, as it is directly related to the quality of service that is experienced.
Low-level monitoring is usually information that is not available to the consumer, being concerned with the status of the physical infrastructure of the entire cloud environment (e.g. servers and storage areas).
Effective monitoring should thus provide both very fine-grained measures and a synthetic outlook of the cloud, involving all the variables affecting the quality of service and other requirements.
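One way to picture the high-/low-level split (the layer tags and metric names here are illustrative) is to label each metric with the layer it comes from and expose only the high-level layers to the consumer, keeping physical-infrastructure metrics provider-only:

```python
# Layers whose metrics relate to the quality of service the consumer experiences.
HIGH_LEVEL_LAYERS = {"middleware", "application", "user"}

metrics = [
    ("application", "request_latency_ms", 120),
    ("user", "active_sessions", 340),
    ("physical", "rack_temperature_c", 27),   # low-level: provider only
    ("physical", "disk_smart_errors", 0),     # low-level: provider only
]

def consumer_view(metrics):
    """Filter out low-level (physical-infrastructure) metrics."""
    return [(layer, name, value) for layer, name, value in metrics
            if layer in HIGH_LEVEL_LAYERS]

print(consumer_view(metrics))  # only the application- and user-layer metrics remain
```

The provider, by contrast, would consume the full list, combining the fine-grained physical readings with the synthetic service-level view.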
Elements of effective cloud network monitoring
Effective cloud network monitoring needs to include certain key elements.
- Firstly, the monitoring system needs to be scalable, to handle the large number of parameters that must be monitored across a vast number of resources.
- Secondly, it needs to be elastic. This is necessary to cope with dynamic changes to all the monitored entities, especially in terms of the expansion and contraction of networks as virtual resources are created or removed.
- Thirdly, the monitoring system needs to be adaptable to cope with varying computational and network loads without impeding any of these activities.
- Fourthly, it needs to be autonomic. It needs to self-manage its distributed resources by automatically reacting to unpredictable changes.
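A toy monitoring loop (all names and policies below are invented for illustration) can show how two of these properties might look in code: resources register and deregister as virtual machines come and go (elasticity), and the polling interval backs off as load rises (adaptability), so monitoring never impedes the workload it observes:

```python
class AdaptiveMonitor:
    """Sketch of an elastic, adaptable monitor: resources can be added or
    removed at runtime, and the poll interval widens under heavy load."""

    def __init__(self, base_interval=10.0):
        self.base_interval = base_interval  # seconds between polls at idle
        self.resources = set()

    def register(self, resource_id):
        """Elastic: start monitoring a newly created virtual resource."""
        self.resources.add(resource_id)

    def deregister(self, resource_id):
        """Elastic: stop monitoring a resource that has been destroyed."""
        self.resources.discard(resource_id)

    def poll_interval(self, load_factor):
        """Adaptable: stretch the interval as load rises (load_factor >= 0)."""
        return self.base_interval * (1.0 + load_factor)

m = AdaptiveMonitor()
m.register("vm-1")
m.register("vm-2")
m.deregister("vm-1")          # vm-1 was torn down; only vm-2 remains monitored
print(sorted(m.resources), m.poll_interval(0.5))
```

An autonomic system would go one step further and apply such adjustments itself, reacting to unpredictable changes without operator intervention.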
Stuart Hardy has been in the ICT and telecommunications industries since 1997, intimately involved in product development, operations and product marketing roles. He has held executive-level positions at some of the largest operators in South Africa and has founded and driven two successful start-up companies in the mobile data and wireless networking spaces. Today, Stuart is a Divisional Director for EOH in their Telecommunications sector.