Harness the power of generative AI

All-in-one Incident Management Platform

Manage on-call, respond to incidents and communicate them via status pages using a single application.

Start for free

Book a demo

Trusted by leading companies

Features

Empower your Ops teams with Enterprise-Grade Incident Management

ilert is the incident management platform designed from the ground up as a single application and covers the entire incident response lifecycle.

Reliable & actionable alerting

Reliable alerts via voice, SMS, and push notifications. Frictionless acknowledgement, with no need to log in anywhere.

On-call management

Always alert the right person and share on-call responsibility across your team with on-call schedules and automatic escalations.

Status pages

Build trust and communicate incidents in seconds with status pages that are connected with your infrastructure.

Incident communication

Effectively communicate IT incidents to stakeholders throughout the entire service chain in a matter of seconds.

Call routing

Directs incoming calls efficiently based on schedules and escalation paths, ensuring prompt incident response.

ChatOps

Integrates collaboration tools like Slack, streamlining incident communication and decision-making within chat channels.

Highlights

The features you need to operate always-on services

Every feature in ilert is built to help you respond to incidents faster and increase uptime.

Harness the power of generative AI

Enhance incident communication and streamline post-mortem creation with ilert AI. ilert AI helps your business respond to incidents faster.

Integrations

Deploy in minutes with 100+ ready-to-use integrations

ilert seamlessly connects with your tools using our pre-built integrations or via email. ilert integrates with monitoring, ticketing, chat, and collaboration tools.

View all integrations

Transform your Incident Response today - start free trial

Customers

See how industry leaders achieve 99.9% uptime with ilert

Organizations worldwide trust ilert to streamline incident management, enhance reliability, and minimize downtime. Read what our customers have to say about their experience with our platform.

“We are using ilert to fix our problems sooner than our customers are realizing them. ilert gives our engineering and operations teams the confidence that we will react in time.”

Dr. Robert Zores

Chief Technology Officer

“ilert has helped Ingka significantly reduce both MTTR & MTTA over the last 3 years, the collaboration with the team at ilert is what makes the difference.”

Karan Honavar

Engineering Manager at IKEA

“Other teams are now checking whether they would also use ilert.”

Thilo Maass

Manager at Adesso AG

Stay up to date

Expert insights from our blog

Engineering

Under the hood: Request coverage feature

Discover the process of developing one of the most frequently used features in ilert's mobile app.

Marko Simon

May 23, 2025 • 5 min read

The ilert mobile app is primarily used by responders to receive notifications about critical alerts, react to them on the go, and check their current on-call status. It has various capabilities, including critical notifications via push, quick actions for alerts, and critical alert settings. The app enables responders to view their current on-call shifts and escalation policies, take on-call shifts from somebody else, and create coverage requests to ask a colleague for an on-call shift handover. The latter is a new ilert feature that has proven very useful as a communication tool between users. This post takes a deeper dive into its development and the challenges we faced along the way.

Why were coverage requests introduced?

Since we introduced on-call schedules, users have been able to create overrides: special shifts that take priority over regular ones. An override lets you assign another user to take over on-call duty, either for a full shift or just part of it. Overrides don’t have to follow existing shifts; they can be created for any time period, even outside of configured shifts.

Later on, the "Take on-call" feature was introduced, which works the other way around: instead of handing my shift to someone else, I take over someone else's shift. Both methods create overrides, but neither ensures that the other user is notified of changes to their on-call shifts. Furthermore, creating an override for another user hands them a responsibility they might not be aware of, which can be critical.

The solution to this problem was to introduce a flow for asking another user to take over specific on-call duties, resulting in a short communication stream around requesting coverage.

Designing the coverage request REST API

The general flow of a coverage request should be:

1. User A creates a coverage request, asking User B to (partially) take over one or multiple shifts

2. User B gets notified and either accepts or declines the coverage request

3. User A gets notified of User B's decision

The logic behind ilert's request coverage feature

We needed to design the API around a coverage request entity, which had to have at least the following fields:

- sender

- receiver

- shifts

Additionally, we added a message field so users can communicate additional details with their request. For the user interface, we also provided the current state and the createdAt date string, both of which are read-only properties. When the receiver declines a coverage request, some feedback to the sender may be useful too, so the receiver can add a declineComment. Lastly, to show multiple coverage requests in a list view and apply meaningful filters, we used the state field in combination with an `expired` state calculated in the frontend. A coverage request is considered expired when the last shift it covers has ended.
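The entity described above could be modeled roughly as follows. This is a minimal sketch, not ilert's actual API schema: the field names come from this post, while the concrete types, the `Shift` shape, and the uppercase state values are assumptions made for illustration.

```typescript
// Hypothetical shape of a coverage request. Field names follow the post;
// the types and the state values are assumptions.
type CoverageRequestState = "PENDING" | "ACCEPTED" | "DECLINED" | "CANCELLED";

interface Shift {
  start: string; // ISO 8601 timestamp (assumed format)
  end: string;   // ISO 8601 timestamp (assumed format)
}

interface CoverageRequest {
  sender: string;            // user asking for coverage (ID type assumed)
  receiver: string;          // user asked to take over
  shifts: Shift[];           // shifts, or parts of shifts, to be covered
  message?: string;          // optional details from the sender
  declineComment?: string;   // optional feedback when declining
  readonly state: CoverageRequestState;
  readonly createdAt: string;
}

// Frontend-side check: a request is expired once the last shift it covers
// has ended. A request without shifts is treated as expired here.
function isExpired(request: CoverageRequest, now: Date = new Date()): boolean {
  const lastEnd = Math.max(...request.shifts.map((s) => Date.parse(s.end)));
  return lastEnd <= now.getTime();
}
```

Because `expired` is derived from the shifts rather than stored, it does not need to be a value of the `state` field itself.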

‍

Beyond the classic Create and Read operations on the coverage request entity, we needed specific endpoints to perform actions: accept, decline, and cancel. Update and Delete operations are not part of the flow right now and won't be implemented.
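The three actions can be thought of as transitions on the request's state. The sketch below illustrates that idea under some assumptions: that only pending requests can be acted on, that the receiver accepts or declines, and that the sender cancels. The function name `applyAction` and the state values are hypothetical, not part of the ilert API.

```typescript
// Assumed state values and action names, for illustration only.
type State = "PENDING" | "ACCEPTED" | "DECLINED" | "CANCELLED";
type Action = "accept" | "decline" | "cancel";

interface CoverageRequest {
  sender: string;
  receiver: string;
  state: State;
  declineComment?: string;
}

// Returns a new request with the updated state, or throws if the action is
// not allowed for this actor or in the current state.
function applyAction(
  req: CoverageRequest,
  action: Action,
  actor: string,
  declineComment?: string,
): CoverageRequest {
  if (req.state !== "PENDING") {
    throw new Error(`Cannot ${action} a request in state ${req.state}`);
  }
  // Assumption: the sender cancels; the receiver accepts or declines.
  const allowedActor = action === "cancel" ? req.sender : req.receiver;
  if (actor !== allowedActor) {
    throw new Error(`User ${actor} is not allowed to ${action} this request`);
  }
  switch (action) {
    case "accept":
      return { ...req, state: "ACCEPTED" };
    case "decline":
      return {
        ...req,
        state: "DECLINED",
        ...(declineComment !== undefined ? { declineComment } : {}),
      };
    case "cancel":
      return { ...req, state: "CANCELLED" };
  }
}
```

Modeling the actions as dedicated transitions, rather than a generic update, keeps invalid state changes out of the flow.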

From mockup to polished UI

ilert Request coverage feature: mockups and final view

There are no significant differences between the mockup and the final version of the coverage request creation view. The styles have been adjusted, and an additional timezone information box has been included. The final versions of the list view and the detail view look like this:

‍

Communication is key

A general goal of this feature is to motivate users to see and respond to coverage requests as early as possible, as on-call shifts are always bound to time and can sometimes be on short notice. Another goal is to let all relevant communication stay in the ilert mobile app, eliminating the need to switch between tools. To achieve this, several means of communication are introduced.

Push notifications

Whenever an action related to a coverage request is taken, a push notification is sent to the relevant person.

- Coverage request created: receiver gets notified
- Coverage request accepted/declined: sender gets notified
- Coverage request cancelled: receiver gets notified
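The routing rules listed above amount to a small lookup from event to recipient. A minimal sketch, in which the event names and the `recipientFor` helper are hypothetical:

```typescript
// Event names are assumptions that mirror the list above.
type CoverageEvent = "created" | "accepted" | "declined" | "cancelled";

interface Parties {
  sender: string;
  receiver: string;
}

// Picks who should be notified for a given coverage request event.
function recipientFor(event: CoverageEvent, parties: Parties): string {
  switch (event) {
    case "created":
    case "cancelled":
      return parties.receiver; // the sender acted, so the receiver is informed
    case "accepted":
    case "declined":
      return parties.sender; // the receiver acted, so the sender is informed
  }
}
```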

‍

But what if the receiver doesn't have the mobile app?

Email

ilert checks whether each relevant user has at least one registered push notification token (a unique ID for a user on a device, used by ilert to route push notifications). If a user has none, ilert sends an email to that user’s primary email address, containing information about the coverage request.
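The fallback rule can be sketched as a simple channel choice per user. The `User` shape and the `chooseDelivery` name below are assumptions for illustration; only the rule itself (no push token means email) comes from the text.

```typescript
// Hypothetical user record with the two pieces of data the rule needs.
interface User {
  primaryEmail: string;
  pushTokens: string[]; // registered push notification tokens, one per device
}

type Delivery =
  | { channel: "push"; tokens: string[] }
  | { channel: "email"; address: string };

// Uses push when at least one token is registered, otherwise falls back to
// the user's primary email address.
function chooseDelivery(user: User): Delivery {
  return user.pushTokens.length > 0
    ? { channel: "push", tokens: user.pushTokens }
    : { channel: "email", address: user.primaryEmail };
}
```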

‍

In-app badge

Sometimes push notifications get dismissed by accident before the user has registered their content, and a time-critical coverage request may be swiped away. To make pending requests more visible in the app, a small red circle (badge) is added at the top left of the menu icon in each list view. It indicates that there are one or more pending coverage requests for review. Additionally, the main menu item shows a count of all pending requests at any time.

Provide filters, but keep the UI clean

Giving the user the ability to filter coverage requests in the list view is necessary. An obvious one is a filter for Received and Sent requests. Another important but tricky filter is for relevant requests only, meaning that expired and non-pending requests are filtered out by default. But as we already have the Received/Sent toggle, another toggle for Current/All would've cluttered the UI too much.

One idea was to introduce a filter toolbar (similar to the one implemented on the alert list), but the idea was discarded as it would've contained only a single filter at the time of release, which would've looked odd. Another idea was to pick a sensible default: only show requests in state Pending, and let the user access all requests via a button click. Ultimately, we settled on this solution for its simplicity and ease of use.
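The chosen behavior, a Received/Sent toggle plus a pending-only default with an opt-in for everything else, could look roughly like this in the frontend. This is a sketch under assumptions: the `lastShiftEnd` field, the state values, and the `visibleRequests` helper are hypothetical.

```typescript
type State = "PENDING" | "ACCEPTED" | "DECLINED" | "CANCELLED";
type Direction = "received" | "sent";

interface ListItem {
  sender: string;
  receiver: string;
  state: State;
  lastShiftEnd: string; // ISO 8601 end of the last covered shift (hypothetical field)
}

// By default only pending, non-expired requests are shown; showAll lifts
// that restriction, as the "access all" button would.
function visibleRequests(
  items: ListItem[],
  currentUser: string,
  direction: Direction,
  showAll: boolean = false,
  now: Date = new Date(),
): ListItem[] {
  return items.filter((item) => {
    const matchesDirection =
      direction === "received"
        ? item.receiver === currentUser
        : item.sender === currentUser;
    if (!matchesDirection) return false;
    if (showAll) return true;
    const expired = Date.parse(item.lastShiftEnd) <= now.getTime();
    return item.state === "PENDING" && !expired;
  });
}
```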

‍

Everyday usage reveals papercuts

After the release, the ilert team started using the feature internally as well and quickly noticed a flaw. When acting on a coverage request (accept, decline, or cancel), the coverage request would instantly disappear from the list without giving a clear confirmation of its change of state.

Two improvements were put in place:

- Stay on the detail view after an action happens to see the updated state of the request
- Keep relevant coverage requests in the list view for 24 more hours after performing an action

The latter wasn't the case before: the list was initially built on the state field, so a request would disappear from the list as soon as it was accepted, and a click on past requests was needed to view it. Therefore, an additional query parameter was added to the API, enabling the frontend to specify a past creation date. The response then also includes all coverage requests, regardless of their state, from the given creation date up to now. Users can now see all pending coverage requests plus those accepted, declined, or cancelled in the last 24 hours.
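As a rough illustration of that selection rule: pending requests are always listed, and requests in any other state are listed if they were created on or after the date the frontend passes in. The names `createdSince` and `selectForList` are hypothetical; the post does not name the actual query parameter.

```typescript
type State = "PENDING" | "ACCEPTED" | "DECLINED" | "CANCELLED";

interface ListedRequest {
  id: string;
  state: State;
  createdAt: string; // ISO 8601 timestamp (assumed format)
}

const DAY_MS = 24 * 60 * 60 * 1000;

// The date the frontend would send: 24 hours before now.
function createdSince(now: Date): Date {
  return new Date(now.getTime() - DAY_MS);
}

// Keeps every pending request, plus requests in any state created on or
// after the given date.
function selectForList(all: ListedRequest[], since: Date): ListedRequest[] {
  return all.filter(
    (r) => r.state === "PENDING" || Date.parse(r.createdAt) >= since.getTime(),
  );
}
```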

‍

Haven't installed the ilert app yet? Give it a try! Download the app for Android or iOS.

Product

Rollbar and ilert: Real-time error monitoring meets smart incident response

Rollbar is now part of the ilert integration catalog! Detect errors in real time and respond instantly with ilert’s alerting and incident management.

Daria Yankevich

May 22, 2025 • 1 min read

We’re excited to share that Rollbar is now part of the ilert integration catalog! This new technical partnership allows software teams to detect application errors in real time with Rollbar and instantly respond using ilert’s powerful alerting and incident management features.

What is Rollbar?

Rollbar is a comprehensive, real-time error monitoring and debugging platform designed to help development teams detect, diagnose, and resolve issues faster—before they impact users. By providing deep visibility into application errors across the entire software lifecycle, Rollbar empowers teams to ship higher-quality code with greater confidence. With clients like CircleCI, Twilio, Babbel, and Salesforce, Rollbar is trusted by teams that prioritize reliability and seamless user experience.

‍

At its core, Rollbar gives you deep visibility into your application’s health by capturing errors as they happen. Whether it's a backend exception or a frontend crash, Rollbar collects detailed metadata—including stack traces, request parameters, and user data—so developers can fix bugs quickly and confidently.

‍

Key highlights of Rollbar:

- Real-time monitoring: Instantly detect and visualize errors.
- Intelligent grouping: Reduce noise using machine learning-based error clustering.
- Comprehensive context: Investigate errors with full context, from local variables to affected users.
- Enterprise-ready: Scales with your infrastructure and offers strong security and integrations.

How you can benefit from the Rollbar and ilert integration

Send alerts from Rollbar to ilert to enable rapid incident response

With this new integration, Rollbar alerts can now trigger events in ilert automatically—enabling rapid, targeted incident response across your engineering and SRE teams.

‍

Here’s how your team benefits:

- Deploy with confidence: Get notified instantly when a new deployment introduces an error. You can roll back quickly or fix forward without waiting for user complaints.
- Bridge the gap between Dev and Ops: SREs and developers can collaborate more efficiently, as ilert routes Rollbar alerts to the right on-call engineer using your preferred notification channels: phone, SMS, push, Microsoft Teams, and more.
- Centralized approach for faster root cause analysis: ilert enables teams to receive alerts from various alert sources, which helps to expedite root cause analysis. ilert as a central dispatcher can correlate Rollbar alerts with alerts from other sources to help teams resolve incidents faster.

How to Connect Rollbar and ilert

Getting started is easy. To integrate Rollbar with ilert, follow our step-by-step guide. It takes just a few minutes to:

1. Create an alert source in ilert.
2. Generate an endpoint URL.
3. Add the webhook in your Rollbar settings.

‍

Once connected, Rollbar will send error notifications to ilert, where you can manage them using your existing on-call schedules, escalation policies, and alerting preferences.

Engineering

An ultimate step-by-step guide on Checkmk Cloud Monitoring

Explore Checkmk’s new fully managed, cloud-based monitoring solution for seamless incident management, and learn how to connect it with ilert.

Tim Nguyen Van

May 09, 2025 • 10 min read

Checkmk launched Checkmk Cloud (SaaS) in February 2025, which is a fully managed, cloud-based version of their monitoring technology. This solution, designed for ease of use, allows enterprises to start monitoring their IT infrastructure with no installation, maintenance, or manual upgrades required. The SaaS version is compatible with both cloud-based and on-premises systems, bringing them together under a single, straightforward platform.

‍

As Checkmk is one of the popular monitoring solutions chosen by ilert users, we decided to dive deep and test this new SaaS version to provide you with a helpful guide on connecting your Checkmk monitoring with the ilert incident management platform.

‍

If you get stuck or anything is unclear, reach out to the ilert support team via the chat widget. We are happy to help!

What this guide covers

This step-by-step guide will help you:

- Set up and configure Checkmk Cloud Monitoring for cloud-based and on-premises infrastructure.
- Create a dedicated IAM user in your AWS account with the necessary permissions to allow Checkmk to access and monitor your AWS resources.
- Build an intuitive dashboard to monitor performance, detect anomalies, and gain real-time insights.
- Receive critical Checkmk alerts via multiple channels, like SMS, phone calls, messengers, or push notifications with the help of ilert.

Prerequisites: What you will need to follow this guide

- A registered account on Checkmk.
- An AWS account with API access.
- IAM (Identity and Access Management) access.
- A Checkmk Cloud instance deployed and accessible via a browser.
- A machine running Windows.

Stage 1: Adding a Windows host via Checkmk agent

Adding hosts is now more straightforward with Checkmk Cloud, which supports deploying monitoring agents directly on target systems. This approach reduces the need for manual configuration and shortens setup time, helping administrators scale their monitoring infrastructure more efficiently.

‍

Select the Windows agent package.
Open PowerShell on your Windows machine and follow the instructions.

Adding a Windows host via Checkmk agent 01

‍

Navigate to the newly added host.

Adding a Windows host via Checkmk agent 02

Stage 2: Creating an IAM User for Checkmk

1. In AWS, open the IAM service and go to Users.
2. In the top right corner, click “Create User.”
3. Enter a User name.
4. In the next step, click Attach policy directly and select ReadOnlyAccess under Permissions policies.
5. Click Create user.
6. Select the newly created user.
7. Click Security credentials.
8. Click Create access key.
9. Select Third-party Service, accept the confirmation message, and click Next.
10. Optional: Set a description tag.

An access key ID and a secret access key will be generated. You will need both keys when setting up AWS monitoring in Checkmk Cloud.

Stage 3: Adding an AWS host

1. On the sidebar, click Setup -> Hosts.
2. Click Add host.
3. Enter a Host name and set the IP address family to ‘No IP,’ then save the host.
4. In the sidebar, navigate to Setup -> VM, cloud, container.
5. Select Amazon Web Services (AWS).
6. Click Add rule.
7. Enter the access key ID and the secret access key from your AWS account into the corresponding fields.
8. Select the options you want to monitor.
9. In the sidebar, navigate to Setup -> Hosts.
10. Select the previously created host.
11. Click Save & run service discovery.
12. Navigate to Setup -> Dynamic host management.
13. Click Add connection.
14. Enter a Unique ID and Title in the corresponding fields.
15. Enable ‘Automatically delete hosts without piggyback’ in the Delete vanished hosts settings.
16. Save the connection.
17. To apply the changes, click Changes in the top right corner.
18. Click Activate on selected sites.
19. In the sidebar, navigate to Monitor -> AWS EC2 instances.

This view gives a general overview of your EC2 instances.

Stage 4: Build intuitive dashboards to monitor performance

Checkmk dashboards are customizable visual interfaces that provide real-time insights into the health and performance of your cloud and on-premises systems.

‍

In the sidebar, navigate to Customize -> Dashboards.

Build dashboards in Checkmk to monitor performance 01

‍

Click on Add dashboard.

Build dashboards in Checkmk to monitor performance 02

‍

Select specific objects to which the dashboard is restricted. In this guide, we will set the restriction to: ‘No restrictions to specific objects.’
Click Continue.