LLM-Based Cognitive AIOps Autonomous Cloud Incident Management





Create Date 27/03/2026
Last Updated 27/03/2026


Dr. KAVITHA K S
Computer Science & Engineering, Dayananda Sagar College of Engineering, Bengaluru, India
dr.kavitha-cse@dayanandasagar.edu

AYUSH GUPTA
Computer Science & Engineering, Dayananda Sagar College of Engineering, Bengaluru, India
Ayushgupta1312005@gmail.com

Dr. NAGRAJ M LUTIMATH
Computer Science & Engineering, Dayananda Sagar Academy of Technology and Management, Bengaluru, India
nagarajml-cse@dsatm.edu.cin

ADI HEMA AJAY CHARAN
Computer Science & Engineering, Dayananda Sagar College of Engineering, Bengaluru, India
adiajay12367@gmail.com

AKASH SINGH
Computer Science & Engineering, Dayananda Sagar College of Engineering, Bengaluru, India
akashsingh2710670@gmail.com

ANANYA GUPTA
Computer Science & Engineering, Dayananda Sagar College of Engineering, Bengaluru, India
ananyagupta8303@gmail.com

Abstract
Cloud-based services form the foundation of modern digital infrastructure, supporting everything from enterprise systems to immersive web and AI applications. Yet as organisations scale, incident management becomes sharply harder: distributed microservices, dynamic workloads, intricate dependencies and ever-changing configurations combine to create failure modes that defy traditional manual processes. In this context, relying on human-driven Troubleshooting Guides (TSGs) is increasingly unsatisfactory: the sheer volume of alerts, the interwoven nature of faults and the rapid pace of change make response slow, resolution accuracy inconsistent and operational costs unacceptable. Moreover, existing AIOps tools, while capable of anomaly detection or alert correlation, often fall short of end-to-end autonomous resolution, contextual reasoning and transparent decision-making. To bridge this gap, we propose DreamOps, a novel AI-driven centralised incident management framework tailored for cloud environments. Instead of treating alert generation and remediation as separate silos, DreamOps integrates intelligent perception, cognitive reasoning and deterministic execution into a cohesive pipeline. At its core, DreamOps transforms unstructured operational knowledge, such as incident logs, TSGs and domain expert playbooks, into executable workflows via large language model (LLM)-based reasoning agents. The system ingests telemetry from monitoring stacks (for example, metrics, logs and traces), applies machine-learning classifiers and anomaly detectors to classify and prioritise incidents, and then invokes LLM agents to generate context-aware mitigation plans. These plans are validated, converted into deterministic workflows and executed, while retaining human oversight for critical decisions.
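The pipeline outlined in the abstract (perception, reasoning, validation, execution with human oversight) can be illustrated with a minimal Python sketch. This is not the DreamOps implementation, which the abstract does not detail. Every name here (`Incident`, `Step`, `ALLOWED_ACTIONS`, `detect`, `plan_stub`, `validate`, `execute`) is hypothetical, a simple z-score check stands in for the machine-learning detectors, and a hard-coded function stands in for the LLM reasoning agent.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import Callable, List, Optional

# Hypothetical allowlist: action name -> whether a human must approve it first.
ALLOWED_ACTIONS = {"restart_service": False, "scale_out": False, "rollback_deploy": True}


@dataclass
class Incident:
    service: str
    metric: str
    value: float
    severity: str  # "high" or "low"


@dataclass
class Step:
    action: str
    target: str


def detect(service: str, metric: str, history: List[float], latest: float,
           z_threshold: float = 3.0) -> Optional[Incident]:
    """Perception: flag the latest sample if it is z_threshold sigmas from the mean."""
    mu, sigma = mean(history), pstdev(history)
    if sigma == 0:
        return None  # a flat history gives no scale to compare against
    z = abs(latest - mu) / sigma
    if z < z_threshold:
        return None
    severity = "high" if z >= 2 * z_threshold else "low"
    return Incident(service, metric, latest, severity)


def plan_stub(incident: Incident) -> List[Step]:
    """Reasoning: placeholder for an LLM agent that would draw on TSGs and playbooks."""
    if incident.severity == "high":
        return [Step("rollback_deploy", incident.service),
                Step("restart_service", incident.service)]
    return [Step("restart_service", incident.service)]


def validate(plan: List[Step]) -> List[Step]:
    """Validation: reject any plan that contains an action outside the allowlist."""
    unknown = [s.action for s in plan if s.action not in ALLOWED_ACTIONS]
    if unknown:
        raise ValueError(f"unsupported actions: {unknown}")
    return plan


def execute(plan: List[Step], approve: Callable[[Step], bool]) -> List[str]:
    """Execution: run steps in order and halt at a critical step a human declines."""
    log = []
    for step in plan:
        if ALLOWED_ACTIONS[step.action] and not approve(step):
            log.append(f"halted before {step.action} on {step.target}")
            break
        # A real executor would call an orchestration API here; this only records the step.
        log.append(f"ran {step.action} on {step.target}")
    return log


# Illustrative data only: a latency spike on a hypothetical "checkout" service.
history = [100.0, 102.0, 98.0, 101.0, 99.0]
incident = detect("checkout", "latency_ms", history, latest=450.0)
log = execute(validate(plan_stub(incident)), approve=lambda step: True)
print(log)
```

Keeping validation and the approval callback separate from plan generation means that whatever the reasoning stage proposes, only allowlisted actions can run, and critical ones run only after a human agrees. That is one plausible reading of "deterministic execution" with "human oversight" as described above.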

