Proactive monitoring: 4 ways to avoid IT downtime

This is a contributed article from Sneha Paul, a product consultant at ManageEngine, a division of Zoho Corporation.

Information technology is a business enabler and the lifeblood of modern organisations. While effective IT boosts productivity and your bottom line, unplanned downtime puts your business’s profitability and reputation at stake, as the recent IT outage at British Airways shows. A power surge at one of BA’s data centres brought down communication across all its systems. The aftermath? Over £170 million wiped off the market value of British Airways, in addition to a compensation bill of up to £150 million.

According to one study, UK firms experience an average of 43 hours of downtime per year, costing roughly £12.3 billion in lost revenue. Search engines also downgrade a page's ranking if the site is slow or provides a poor end-user experience. That’s why it’s important to have resilient IT services and to consistently monitor the uptime of your applications, the availability of your data, and your website. The four methods below can help you establish that resilience and keep your IT up and running.


Ensure perpetual service with website performance analysis

A recent report states that the top 50 retailers in the UK incur £1 billion in lost revenue because of poor website experience. A website’s performance metrics directly correlate with a company's KPIs, such as conversion rate. Additionally, errors like “page not found” not only drive your customers elsewhere but also squander your investments in paid search and online marketing.

With a website monitoring tool, you can proactively audit performance benchmarks such as the breakdown of website response time (DNS lookup time, connection time, first byte download time, and more), individual page element load time, uptime rate, availability, and redirection time. You can even check the content of your site and receive alerts whenever specified keywords are missing or a hacking attempt is detected.
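To make the idea concrete, here is a minimal Python sketch of how one such check might be evaluated once the timings and page content have been collected. It is an illustration only, not how any particular monitoring product works: the CheckResult fields, the evaluate_check function, and the 500 ms threshold are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class CheckResult:
    """One synthetic check of a page. All field names are illustrative."""
    dns_ms: float         # DNS lookup time
    connect_ms: float     # connection time
    first_byte_ms: float  # first byte download time
    body: str             # downloaded page content


def evaluate_check(result, required_keywords, max_total_ms):
    """Return a list of human-readable problems found in one check."""
    problems = []
    total_ms = result.dns_ms + result.connect_ms + result.first_byte_ms
    if total_ms > max_total_ms:
        problems.append(f"slow response: {total_ms:.0f} ms > {max_total_ms:.0f} ms")
    # Case-insensitive keyword check: a missing keyword can point to a
    # broken template or to tampered content.
    lowered = result.body.lower()
    for keyword in required_keywords:
        if keyword.lower() not in lowered:
            problems.append(f"missing keyword: {keyword}")
    return problems


result = CheckResult(dns_ms=40, connect_ms=60, first_byte_ms=450,
                     body="<h1>Welcome</h1>")
print(evaluate_check(result, ["Welcome", "Checkout"], max_total_ms=500))
```

In this example the check reports both a slow response (550 ms against a 500 ms limit) and the missing keyword "Checkout".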

Perform these checks frequently, as often as once per minute, to ensure that you stay ahead of any performance bottlenecks. Perform each check from multiple geographic locations simultaneously, so you can be sure that end users in all geographies are getting the best possible user experience.
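As a rough illustration of why multiple locations matter, the Python sketch below combines the results of one round of checks into a single status. The location names, the 800 ms limit, and the summarize_locations function are hypothetical; a real tool would gather the timings itself on every polling interval.

```python
def summarize_locations(results, max_ms):
    """Combine one round of checks run from several locations.

    results maps a location name to a response time in milliseconds,
    or to None if the check from that location failed outright.
    """
    down = sorted(loc for loc, ms in results.items() if ms is None)
    slow = sorted(loc for loc, ms in results.items()
                  if ms is not None and ms > max_ms)
    if results and len(down) == len(results):
        status = "global outage"
    elif down or slow:
        status = "regional problem"
    else:
        status = "ok"
    return {"status": status, "down": down, "slow": slow}


# Hypothetical results from a single one-minute polling round.
round_results = {"London": 180, "Frankfurt": 220,
                 "Singapore": None, "New York": 950}
print(summarize_locations(round_results, max_ms=800))
```

Here a check from a single location would have missed either the Singapore failure or the slow New York response, depending on where it ran.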


Keep network outages at bay with round-the-clock vigilance 

Network infrastructure is crucial for any business. Around 72 percent of firms in the UK experience up to eight network outages annually, which equates to £521 per employee in lost productivity.

Network surveillance software helps you monitor the response time, packet loss, and other performance metrics of network devices such as physical and virtual servers, routers, firewalls, and switches. It keeps tabs on the bandwidth consumption of your users and applications. A single alarm console flags issues related to CPU usage, buffer hit rates, UPS battery status, network availability, wireless signal strength, and other critical performance metrics.

With round-the-clock vigilance, you are notified of any anomaly instantly by email or text. The software even gives you a root cause analysis report. This helps you troubleshoot the issue faster so you can avert a major disaster before it even rears its ugly head.
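The Python sketch below shows, under simplified assumptions, how response time and packet loss from a batch of probes might be turned into alert messages. The device name, the thresholds, and both function names are made up for the example, and delivering the messages by email or text is left out.

```python
def summarize_probes(rtts_ms):
    """Return (packet loss in percent, average round-trip time in ms).

    rtts_ms is a list of round-trip times; None marks a lost packet.
    The average is None when every probe was lost.
    """
    if not rtts_ms:
        raise ValueError("at least one probe is required")
    answered = [rtt for rtt in rtts_ms if rtt is not None]
    loss_pct = 100.0 * (len(rtts_ms) - len(answered)) / len(rtts_ms)
    avg_rtt = sum(answered) / len(answered) if answered else None
    return loss_pct, avg_rtt


def build_alerts(device, rtts_ms, max_loss_pct=5.0, max_rtt_ms=200.0):
    """Turn one batch of probes into alert messages (thresholds are assumed)."""
    loss_pct, avg_rtt = summarize_probes(rtts_ms)
    if avg_rtt is None:
        return [f"{device}: unreachable"]
    alerts = []
    if loss_pct > max_loss_pct:
        alerts.append(f"{device}: packet loss {loss_pct:.1f}% exceeds {max_loss_pct:.1f}%")
    if avg_rtt > max_rtt_ms:
        alerts.append(f"{device}: response time {avg_rtt:.0f} ms exceeds {max_rtt_ms:.0f} ms")
    return alerts


# One of four probes was lost, so packet loss is 25 percent.
print(build_alerts("core-router-1", [12.0, None, 15.0, 13.0]))
```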


Audit application performance KPIs to avoid blackouts

Applications, whether customer-facing or those supporting in-house operations, drive the functioning of an enterprise. End users have little tolerance for dysfunctional, slow, or broken applications. No wonder millions of users recently went into a frenzy when the popular instant messaging application WhatsApp suffered a global outage!

Using application performance monitoring (APM) software, you can assess the key performance indicators of your applications, such as memory utilisation, database connection time, resource availability, and throughput. You can also break down and analyse application performance based on browser, geography, ISP, and other parameters. This gives you a perspective based on the end-user experience, thereby helping you improve customer service.

APM also puts predictive analytics into practice by alerting you to any impending performance deterioration before it develops into a full-blown issue. And to top it all off, you can automate the process of resolving the issue right away through an automatic reboot or the deployment of corrective scripts or programs.
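Commercial APM tools do not generally publish their prediction models, so the Python sketch below substitutes the simplest possible stand-in: a straight-line trend fitted to recent samples and extrapolated to a threshold. The function name, the sample data, and the 90 percent threshold are assumptions for illustration.

```python
def minutes_until_threshold(samples, threshold):
    """Estimate how long until a metric reaches a threshold.

    samples is a list of (minute, value) pairs in time order. A
    least-squares line is fitted through them and extrapolated.
    Returns the minutes remaining after the last sample, or None
    if there is no upward trend to extrapolate.
    """
    n = len(samples)
    if n < 2:
        return None
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    denom = sum((t - mean_t) ** 2 for t, _ in samples)
    if denom == 0:
        return None
    slope = sum((t - mean_t) * (v - mean_v) for t, v in samples) / denom
    if slope <= 0:
        return None
    intercept = mean_v - slope * mean_t
    crossing_t = (threshold - intercept) / slope
    # Clamp to zero if the fitted line has already passed the threshold.
    return max(0.0, crossing_t - samples[-1][0])


# Hypothetical memory utilisation (%) sampled every 10 minutes.
memory = [(0, 60), (10, 65), (20, 70), (30, 75)]
print(minutes_until_threshold(memory, threshold=90))
```

With utilisation climbing half a percentage point per minute, the sketch estimates about 30 minutes until the 90 percent mark, which is the window in which a corrective script could be triggered.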


Extend greater reliability with cloud infrastructure monitoring

The adoption of cloud computing in the UK has increased by 83 percent since 2010, according to research by the Cloud Industry Forum. The most widely used cloud service is web hosting (providing access and storage space for websites), which is all the more reason to have unified visibility of your cloud infrastructure.

With cloud monitoring, you gain detailed insight into the health and performance of your cloud solutions. You can analyse your virtual machines' workload as well as audit the resource utilisation of your virtual data centre. You can also baseline this analysis data, so if there is a disk capacity shortage or a memory leak, you instantly receive alerts to help contain any disruptions.
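One simple way to picture baselining is shown in the Python sketch below: a baseline is built from historical samples, and a new reading is flagged when it strays too far from it. The three-standard-deviation tolerance and the function names are assumptions for illustration; real tools may use more sophisticated baselines.

```python
from statistics import mean, pstdev


def build_baseline(history):
    """Return (average, standard deviation) of historical samples."""
    return mean(history), pstdev(history)


def deviates_from_baseline(value, baseline, tolerance=3.0):
    """True if value is more than `tolerance` standard deviations from the average."""
    average, spread = baseline
    if spread == 0:
        # A perfectly flat history: any change at all is a deviation.
        return value != average
    return abs(value - average) > tolerance * spread


# Hypothetical disk usage (%) of a virtual machine over five days.
baseline = build_baseline([50, 52, 48, 51, 49])
print(deviates_from_baseline(53, baseline))  # within the normal range
print(deviates_from_baseline(70, baseline))  # far outside the baseline
```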

Additionally, leveraging the SLA management capabilities of cloud monitoring tools helps ensure that your service provider consistently delivers on its service commitments. Having consolidated visibility into your cloud network ensures greater reliability and optimal performance. 
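The arithmetic behind an SLA check is straightforward, as the Python sketch below suggests. The 99.9 percent target, the 30-day period, and the function names are example assumptions, not terms from any specific provider's agreement.

```python
def allowed_downtime_minutes(sla_percent, period_days=30):
    """Downtime budget, in minutes, implied by an uptime target."""
    period_minutes = period_days * 24 * 60
    return period_minutes * (100.0 - sla_percent) / 100.0


def sla_report(sla_percent, downtime_minutes, period_days=30):
    """Compare measured downtime against the SLA target for one period."""
    period_minutes = period_days * 24 * 60
    measured = 100.0 * (period_minutes - downtime_minutes) / period_minutes
    allowed = allowed_downtime_minutes(sla_percent, period_days)
    return {
        "measured_percent": measured,
        "allowed_minutes": allowed,
        "breached": downtime_minutes > allowed,
    }


# A 99.9% target over 30 days leaves a budget of roughly 43 minutes,
# so 60 minutes of recorded downtime counts as a breach.
print(sla_report(99.9, downtime_minutes=60))
```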


Embrace the smarter solution

In this era of digitalisation, accessibility is promised anywhere and anytime. That means downtime and outages can adversely affect your SLAs. Implementing a robust and efficient framework to keep an eye on your IT environment helps prevent the operational disruptions that put your brand value at stake. While deploying a software suite that implements the above functionalities can take months, you can achieve the same results in a shorter timeframe with the right tool. The key lies in finding a single comprehensive solution that realises your goal of ensuring business continuity, 24x7.


IDG Connect
