
All-in-one Incident Management Platform

Manage on-call, respond to incidents and communicate them via status pages using a single application.

Trusted by leading companies

Highlights

The features you need to operate always-on services

Every feature in ilert is built to help you respond to incidents faster and increase uptime.

Harness the power of generative AI

Enhance incident communication and streamline post-mortem creation with ilert AI. ilert AI helps your business respond to incidents faster.

Integrations

Deploy in minutes with 100+ ready-to-use integrations

ilert seamlessly connects with your tools using our pre-built integrations or via email. ilert integrates with monitoring, ticketing, chat, and collaboration tools.

Customers

See how industry leaders achieve 99.9% uptime with ilert

Organizations worldwide trust ilert to streamline incident management, enhance reliability, and minimize downtime. Read what our customers have to say about their experience with our platform.

Stay up to date

Expert insights from our blog

Engineering

Under the hood: Request coverage feature

Discover the process of developing one of the most frequently used features in ilert's mobile app.

Marko Simon
May 23, 2025 • 5 min read

The ilert mobile app is primarily used by responders to receive notifications about critical alerts, react to them on the go, and check their current on-call status. Its capabilities include critical notifications via push, quick actions for alerts, and critical alert settings. The app lets responders view their current on-call shifts and escalation policies, take on-call shifts from somebody else, and create coverage requests to ask a colleague to take over an on-call shift. The latter is a new ilert feature that has proven very useful as a communication tool between users. This post takes a deeper dive into how the feature was developed and the challenges we faced along the way.

Why were coverage requests introduced?

Since we introduced on-call schedules, users have been able to create overrides—special shifts that take priority over regular ones. An override lets you assign another user to take over on-call duty, either for a full shift or just part of it. Overrides don’t have to follow existing shifts—they can be created for any time period, even outside of configured shifts.

Later on, the "Take on-call" feature was introduced, which works the other way around: instead of assigning my shift to someone else, I take over another user's shift. Both methods create overrides, but neither ensures that the other user is notified of changes to their on-call shifts. Furthermore, creating overrides for other users gave them responsibilities they might not be aware of, which could be critical.

The solution to this problem was to introduce a flow for asking another user to take over specific on-call duties, resulting in a short communication stream around requesting coverage.

Designing the coverage request REST API

The general flow of a coverage request should be:

1. User A creates a coverage request, asking User B to (partially) take over one or multiple shifts

2. User B gets notified and either accepts or declines the coverage request

3. User A gets notified of User B's decision

The logic behind ilert request coverage feature

We needed to design the API around a coverage request entity, which had to have at least the following fields:

- sender

- receiver

- shifts

Additionally, we added a message field to give users an option to communicate additional details for their request. For the user interface, we also provided the current state and the createdAt date string, which are read-only properties. When the user declines the coverage request, some communication back may be useful too, handled by giving the user the ability to add a declineComment. Lastly, to show multiple coverage requests in a list view and apply meaningful filters, we used the state field in combination with an `expired` state calculated in the frontend. A coverage request is considered expired when the last shift it covers has ended.

Beyond the classic Create and Read operations on the coverage request entity, we needed specific endpoints to perform actions: accept, decline, and cancel. Update and Delete operations are not part of the flow right now and won't be implemented.
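As a rough illustration of the entity and its three actions, here is a minimal Python sketch. It is not ilert's implementation: the class and method names, the state values, and the rule that only the receiver may accept or decline while only the sender may cancel are assumptions made for this example. The expiry check mirrors the rule described above (a request is expired once the last shift it covers has ended).

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class State(Enum):
    # State names are assumptions; the post only names the actions.
    PENDING = "PENDING"
    ACCEPTED = "ACCEPTED"
    DECLINED = "DECLINED"
    CANCELLED = "CANCELLED"


@dataclass(frozen=True)
class Shift:
    start: datetime
    end: datetime


@dataclass
class CoverageRequest:
    sender: str
    receiver: str
    shifts: list[Shift]
    message: str = ""
    # state and created_at are read-only from the client's point of view.
    state: State = State.PENDING
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    decline_comment: Optional[str] = None

    def __post_init__(self) -> None:
        if not self.shifts:
            raise ValueError("a coverage request needs at least one shift")

    def _act(self, actor: str, allowed_actor: str, new_state: State) -> None:
        # Actions are only valid on pending requests (assumption).
        if self.state is not State.PENDING:
            raise ValueError(f"request is already {self.state.value}")
        if actor != allowed_actor:
            raise PermissionError(f"{actor} may not perform this action")
        self.state = new_state

    def accept(self, actor: str) -> None:
        self._act(actor, self.receiver, State.ACCEPTED)

    def decline(self, actor: str, comment: Optional[str] = None) -> None:
        self._act(actor, self.receiver, State.DECLINED)
        self.decline_comment = comment

    def cancel(self, actor: str) -> None:
        self._act(actor, self.sender, State.CANCELLED)

    def is_expired(self, now: datetime) -> bool:
        # Expired once the last covered shift has ended; computed by the
        # client rather than stored as a state.
        return max(shift.end for shift in self.shifts) <= now
```

Keeping accept, decline, and cancel as explicit actions instead of a generic update makes it easy to enforce who may change the state and to attach a notification to each transition.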

From mockup to polished UI

ilert Request coverage feature: mockups and final view

There are no significant differences between the mockup and the final version of the coverage request creation view. The styles have been adjusted, and an additional timezone information box has been included. The final versions of the list view and the detail view look like this:

Communication is key

A general goal of this feature is to motivate users to see and respond to coverage requests as early as possible, as on-call shifts are always bound to time and can sometimes be on short notice. Another goal is to let all relevant communication stay in the ilert mobile app, eliminating the need to switch between tools. To achieve this, several means of communication are introduced.

Push notifications

Whenever an action related to a coverage request is taken, a push notification is sent to the relevant person.

  • Coverage request created: receiver gets notified
  • Coverage request accepted/declined: sender gets notified
  • Coverage request cancelled: receiver gets notified

But what if the receiver doesn't have the mobile app installed?

Email

ilert checks if any of the relevant users don't have at least one registered push notification token (unique ID from a user on a device, used by ilert to route push notifications). If that is the case, ilert sends out an email to the user’s primary email, containing information about the coverage request.
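The routing rules above (who gets notified for which action, and the email fallback when no push token is registered) can be sketched as follows. This is a simplified illustration rather than ilert's actual code; the `User` type, the action names, and the `plan_notification` function are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    name: str
    email: str
    # Registered push notification tokens, one per device (may be empty).
    push_tokens: list[str] = field(default_factory=list)


# Which side of the coverage request is notified for each action.
RECIPIENT_ROLE = {
    "created": "receiver",
    "accepted": "sender",
    "declined": "sender",
    "cancelled": "receiver",
}


def plan_notification(action: str, sender: User, receiver: User) -> tuple[str, list[str]]:
    """Return the channel ("push" or "email") and the targets to deliver to."""
    role = RECIPIENT_ROLE[action]
    recipient = receiver if role == "receiver" else sender
    if recipient.push_tokens:
        return "push", list(recipient.push_tokens)
    # No registered push token: fall back to the user's primary email.
    return "email", [recipient.email]
```

For example, a request created for a receiver without any registered device would be planned as an email to that receiver, while the sender's later notification about the decision would still go out via push if the sender has the app installed.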

In-app badge

Sometimes push notifications get dismissed by accident, without recognising the content (and possibly swiping away a time-critical coverage request). To provide more presence in the app, a small red circle (badge) is added at the top left of the menu icon in each list view. It indicates whether one or more coverage requests are pending review. Additionally, the main menu item shows a count of all pending requests at any time.

Provide filters, but keep the UI clean

Giving the user the ability to filter coverage requests in the list view is necessary. An obvious filter is one for Received and Sent requests. Another important but tricky filter is one for relevant requests only, meaning that any expired or non-pending requests are filtered out by default. But as we already have the Received/Sent toggle, another toggle for Current/All would've cluttered the UI too much.

One idea was to introduce a filter toolbar (similar to the one implemented on the alert list), but the idea was discarded as it would've been the only filter at the time of release (which would've looked odd). Another idea was to choose the default: only show requests in state Pending, and let the user access all via a button click. Ultimately, we settled on this solution for its simplicity and ease of use.

Everyday usage reveals papercuts

After the release, the ilert team started using the feature internally as well and quickly noticed a flaw. When acting on a coverage request (accept, decline, or cancel), the request would instantly disappear from the list without any clear confirmation that its state had changed.

Two improvements were put in place:

  • Stay on the detail view after an action happens to see the updated state of the request
  • Keep relevant coverage requests in the list view for 24 more hours after performing an action

The latter wasn't the case before: the list was initially built on the state field, so a request would instantly disappear from the list once it was accepted, and a click on past requests was needed to view it. Therefore, an additional query parameter was added to the API, enabling the frontend to specify a past creation date. The response then also includes all coverage requests, no matter their state, from the given creation date up to now. Users can now see all pending coverage requests, plus recently accepted, declined, or cancelled ones (from the last 24 hours).
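The resulting list logic can be sketched in a few lines of Python. The parameter name `created_since`, the `ListItem` type, and the state strings are assumptions for illustration only; the post does not name the actual query parameter.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class ListItem:
    id: int
    state: str  # e.g. "PENDING", "ACCEPTED", "DECLINED", "CANCELLED"
    created_at: datetime


def default_created_since(now: datetime) -> datetime:
    # The frontend asks for everything created in the last 24 hours.
    return now - timedelta(hours=24)


def visible_requests(
    items: list[ListItem], created_since: Optional[datetime] = None
) -> list[ListItem]:
    """Pending requests, plus any request created on or after created_since."""
    return [
        item
        for item in items
        if item.state == "PENDING"
        or (created_since is not None and item.created_at >= created_since)
    ]
```

Without the parameter, the function behaves like the original state-based list; with it, a request that was just accepted or declined stays visible as long as it falls inside the requested window.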

Haven't installed the ilert app yet? Give it a try! Download the app for Android or iOS.

Product

Rollbar and ilert: Real-time error monitoring meets smart incident response

Rollbar is now part of the ilert integration catalog! Detect errors in real time and respond instantly with ilert’s alerting and incident management.

Daria Yankevich
May 22, 2025 • 1 min read

We’re excited to share that Rollbar is now part of the ilert integration catalog! This new technical partnership allows software teams to detect application errors in real time with Rollbar and instantly respond using ilert’s powerful alerting and incident management features.

What is Rollbar?

Rollbar is a comprehensive, real-time error monitoring and debugging platform designed to help development teams detect, diagnose, and resolve issues faster—before they impact users. By providing deep visibility into application errors across the entire software lifecycle, Rollbar empowers teams to ship higher-quality code with greater confidence. With clients like CircleCI, Twilio, Babbel, and Salesforce, Rollbar is trusted by teams that prioritize reliability and seamless user experience.

At its core, Rollbar gives you deep visibility into your application’s health by capturing errors as they happen. Whether it's a backend exception or a frontend crash, Rollbar collects detailed metadata—including stack traces, request parameters, and user data—so developers can fix bugs quickly and confidently.

Key highlights of Rollbar:

  • Real-time monitoring: Instantly detect and visualize errors.
  • Intelligent grouping: Reduce noise using machine learning-based error clustering.
  • Comprehensive context: Investigate errors with full context, from local variables to affected users.
  • Enterprise-ready: Scales with your infrastructure and offers strong security and integrations.

How you can benefit from the Rollbar and ilert integration

Send alerts from Rollbar to ilert to enable rapid incident response

With this new integration, Rollbar alerts can now trigger events in ilert automatically—enabling rapid, targeted incident response across your engineering and SRE teams.

Here’s how your team benefits:

  • Deploy with confidence: Get notified instantly when a new deployment introduces an error. You can roll back quickly or fix forward without waiting for user complaints.
  • Bridge the gap between Dev and Ops: SREs and developers can collaborate more efficiently, as ilert routes Rollbar alerts to the right on-call engineer using your preferred notification channels—phone, SMS, push, Microsoft Teams, and more.
  • Centralized approach for faster root cause analysis: ilert enables teams to receive alerts from various alert sources, which helps to expedite root cause analysis. ilert as a central dispatcher can correlate Rollbar alerts with alerts from other sources to help teams resolve incidents faster.

How to Connect Rollbar and ilert

Getting started is easy. To integrate Rollbar with ilert, follow our step-by-step guide. It takes just a few minutes to:

  1. Create an alert source in ilert.
  2. Generate an endpoint URL.
  3. Add the webhook in your Rollbar settings.

Once connected, Rollbar will send error notifications to ilert, where you can manage them using your existing on-call schedules, escalation policies, and alerting preferences.

Engineering

An ultimate step-by-step guide on Checkmk Cloud Monitoring

Explore Checkmk’s new fully managed, cloud-based monitoring solution for seamless incident management, and learn how to connect it with ilert.

Tim Nguyen Van
May 09, 2025 • 10 min read

Checkmk launched Checkmk Cloud (SaaS) in February 2025, which is a fully managed, cloud-based version of their monitoring technology. This solution, designed for ease of use, allows enterprises to start monitoring their IT infrastructure with no installation, maintenance, or manual upgrades required. The SaaS version is compatible with both cloud-based and on-premises systems, bringing them together under a single, straightforward platform. 

As Checkmk is one of the popular monitoring solutions chosen by ilert users, we decided to dive deep and test this new SaaS version to provide you with a helpful guide on connecting your Checkmk monitoring with the ilert incident management platform.

If you get stuck or anything is unclear, reach out to the ilert support team via the chat widget. We are happy to help!  

What this guide covers

This step-by-step guide will help you:

  • Set up and configure Checkmk Cloud Monitoring for cloud-based and on-premises infrastructure.
  • Create a dedicated IAM user in your AWS account with the necessary permissions to allow Checkmk to access and monitor your AWS resources.
  • Build an intuitive dashboard to monitor performance, detect anomalies, and gain real-time insights.
  • Receive critical Checkmk alerts via multiple channels, like SMS, phone calls, messengers, or push notifications with the help of ilert.

Prerequisites: What you will need to follow this guide

  • A registered account on Checkmk.
  • AWS Account with API Access.
  • IAM (Identity and Access Management) access.
  • A Checkmk Cloud instance deployed and accessible via a browser.
  • A machine running Windows.

Stage 1: Adding a Windows host via Checkmk agent

Adding hosts is now more straightforward with Checkmk Cloud, which supports deploying monitoring agents directly on target systems. This approach reduces the need for manual configuration and shortens setup time, helping administrators scale their monitoring infrastructure more efficiently.

  1. Select the Windows agent package.
  2. Open PowerShell on your Windows machine and follow the instructions.
Adding a Windows host via Checkmk agent 01

  3. Navigate to the newly added host.
Adding a Windows host via Checkmk agent 02

Stage 2: Creating an IAM User for Checkmk

  1. In AWS, open the IAM service and go to Users.
Creating an IAM User for Checkmk 01

  2. In the top right corner, click “Create User.”
Creating an IAM User for Checkmk 02

  3. Enter a User name.
Creating an IAM User for Checkmk 03

  4. In the next step, click Attach policy directly and select ReadOnlyAccess in the Permission policies.
Creating an IAM User for Checkmk 04

  5. Click Create user.
Creating an IAM User for Checkmk 05

  6. Select the newly created user.
Creating an IAM User for Checkmk 06

  7. Click Security credentials.
Creating an IAM User for Checkmk 07

  8. Now, click Create access key.
Creating an IAM User for Checkmk 08

  9. Select Third-party Service, accept the confirmation message, and click Next.
Creating an IAM User for Checkmk 09

  10. Optional: Set a description tag.
Creating an IAM User for Checkmk 10

  11. An Access key and Secret key will be generated. You will need these two keys when setting up cloud-based monitoring in Checkmk Cloud.
Creating an IAM User for Checkmk 11
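
If you prefer scripting over clicking through the console, the same steps can be approximated with the AWS SDK for Python (boto3). This is an optional sketch and not part of the console walkthrough above: the function name and the example user name are hypothetical, while `create_user`, `attach_user_policy`, and `create_access_key` are standard boto3 IAM client calls and `ReadOnlyAccess` is the AWS managed policy selected in step 4.

```python
# AWS managed policy that grants read-only access, as chosen in the console steps.
READ_ONLY_POLICY_ARN = "arn:aws:iam::aws:policy/ReadOnlyAccess"


def create_checkmk_user(iam, user_name: str) -> dict:
    """Create an IAM user with read-only access and return a new access key.

    `iam` is expected to be a boto3 IAM client, e.g. boto3.client("iam").
    """
    iam.create_user(UserName=user_name)
    iam.attach_user_policy(UserName=user_name, PolicyArn=READ_ONLY_POLICY_ARN)
    key = iam.create_access_key(UserName=user_name)["AccessKey"]
    # The secret is only returned once, at creation time; store it safely.
    return {
        "access_key_id": key["AccessKeyId"],
        "secret_access_key": key["SecretAccessKey"],
    }
```

With boto3 installed and AWS credentials configured, calling `create_checkmk_user(boto3.client("iam"), "checkmk-monitoring")` should return the two values you enter in Checkmk Cloud in the next stage. The calling identity needs permission to manage IAM users for this to work.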

Stage 3: Adding an AWS host

  1. In the sidebar, click Setup -> Hosts.
Adding an AWS host to Checkmk 01

  2. Click Add host.
Adding an AWS host to Checkmk 02

  3. Enter a Host name and set the IP address family to ‘No IP,’ then save the host.
Adding an AWS host to Checkmk 03

  4. In the sidebar, navigate to Setup -> VM, cloud, container.
Adding an AWS host to Checkmk 04

  5. Now, select Amazon Web Services (AWS).
Adding an AWS host to Checkmk 05

  6. Click Add rule.
Adding an AWS host to Checkmk 06

  7. Enter the access key ID and the secret access key from your AWS account into the corresponding fields.
  8. Select the options you want to monitor.
Adding an AWS host to Checkmk 08

  9. Now, in the sidebar, navigate to Setup -> Hosts.
Adding an AWS host to Checkmk 09

  10. Select the previously created host.
Adding an AWS host to Checkmk 10

  11. Click Save & run service discovery.
Adding an AWS host to Checkmk 11

  12. Navigate to Setup -> Dynamic host management.
Adding an AWS host to Checkmk 12

  13. Click Add connection.
Adding an AWS host to Checkmk 13

  14. Enter a Unique ID and Title in the corresponding fields.
  15. Enable ‘Automatically delete hosts without piggyback’ in the Delete vanished hosts settings.
  16. Save the connection.
Adding an AWS host to Checkmk 16

  17. To apply the changes made, click Changes in the top right corner.
Adding an AWS host to Checkmk 17

  18. Now, click Activate on selected sites.
Adding an AWS host to Checkmk 18

  19. In the sidebar, navigate to Monitor -> AWS EC2 instances.
Adding an AWS host to Checkmk 19

  20. This view is a general overview of your EC2 Instances.
Adding an AWS host to Checkmk 20

Stage 4: Build intuitive dashboards to monitor performance

Checkmk dashboards are customizable visual interfaces that provide real-time insights into the health and performance of your cloud and on-premises systems.

  1. In the sidebar, navigate to Customize -> Dashboards.
Build dashboards in Checkmk to monitor performance 01

  2. Click on Add dashboard.
Build dashboards in Checkmk to monitor performance 02

  3. Select specific objects to which the dashboard is restricted. In this guide, we will set the restriction to: ‘No restrictions to specific objects.’
  4. Click Continue.
Build dashboards in Checkmk to monitor performance 04

  5. Enter a Unique ID and Title and click Save & go to dashboard.
Build dashboards in Checkmk to monitor performance 05

  6. Now, navigate to Dashboard -> Enter layout mode.
Build dashboards in Checkmk to monitor performance 06

  7. Navigate to Add -> Combined graph.
Build dashboards in Checkmk to monitor performance 07

  8. Enter a Custom title for the graph.
Build dashboards in Checkmk to monitor performance 08

  9. Select the Service filter condition ‘Service (exact match)’ and enter ‘AWS/EC2 CPU utilization’ as the value.
Build dashboards in Checkmk to monitor performance 09

  10. Select ‘CPU utilization’ as Graph and select a desired Time range.
Build dashboards in Checkmk to monitor performance 10

  11. Click Save.
Build dashboards in Checkmk to monitor performance 11

  12. The Checkmk dashboard feature allows you to customize the size and layout of your dashboard fully.
Build dashboards in Checkmk to monitor performance 12

  13. To finalize the setup, navigate to Dashboard -> Leave layout mode.
Build dashboards in Checkmk to monitor performance 13

Stage 5: Connect Checkmk Cloud with ilert 

To connect Checkmk Cloud with ilert, add a new Notification rule of type ‘ilert’ and enter the Integration key of your Checkmk alert source in ilert.

 Connect Checkmk Cloud with ilert to send alerts via SMS, phone call, push notifications

For further information, please refer to ilert's Checkmk Integration Guide.
