About Me

Hello! I'm Dilhan, a results-driven Cloud/DevOps Architect and an Engineering Manager with experience in pre-sales and leading high-performing teams to innovate and implement scalable cloud solutions.

I am skilled in orchestrating the development and deployment of applications using modern DevOps methodologies and cloud architectures. I foster a culture of continuous integration and continuous deployment (CI/CD) while ensuring robust security and compliance standards, and I excel at automating workflows, optimizing infrastructure, and managing cross-functional teams to increase efficiency and deliver projects.

With a strong emphasis on collaboration and communication, I seamlessly bridge the gap between technical teams and business stakeholders to deliver results that align with organizational objectives and customer needs.

I thrive on solving complex technical challenges and building robust systems that empower development teams. I am proficient in tools like Kubernetes, Docker, Terraform, Ansible, Jenkins, and more. I believe in continuous learning and always staying ahead of the curve in the rapidly evolving cloud landscape.

Core Competencies: Cloud Architecture (AWS, Azure, GCP, Alibaba, Singapore Government Clouds) | DevOps Automation | Kubernetes | Terraform | CI/CD | FinOps | Infrastructure as Code | Pre-Sales Engineering | Solution Architecture | Governance & Compliance | Agile Delivery | Engineering Management | Stakeholder Management | Performance Optimization | Team Management

KPIs

Years of Experience

Team Members Led

Lead Time Reduction

Team Certification Growth

Cloud Cost Reduction

SGD Project Portfolio

Core Expertise

My career is built on a foundation of technical architecture, team leadership, and strategic pre-sales. This chart illustrates the synergy between these core areas, which enables me to deliver comprehensive, business-aligned solutions.

Skills

Cloud Platforms

AWS, Azure, Google Cloud Platform (GCP), Alibaba Cloud, G-Cloud (Singapore), GPC (Singapore), GCC (Singapore)

Containerization & Orchestration

Docker, Kubernetes, ECS, EKS, AKS, GKE, OpenShift, Karpenter

Infrastructure as Code (IaC)

Terraform, Bicep, ARM, AWS CDK, CloudFormation, OpenTofu, Pulumi, Ansible, Puppet, Chef

Databases & Distributed Systems

SQL (PostgreSQL, MySQL), NoSQL (CosmosDB, DynamoDB, MongoDB), Kafka, Azure Service Bus, SQS, Redis

Integration Tools

Dell Boomi, Pentaho, AWS Step Functions, Azure Logic Apps, REST, GraphQL

CI/CD & Automation

Jenkins, GitLab CI, GitHub Actions, Azure DevOps, CircleCI, ArgoCD

Security & Compliance

IAM, Network Security, Azure Policies, SonarQube, Checkmarx, Compliance Audits

Monitoring & Logging

Prometheus, Grafana, ELK Stack, CloudWatch, Azure Monitor, SumoLogic, Dynatrace, New Relic

Enterprise Architecture

SAFe, TOGAF, Domain-Driven Design, Microservice Architecture, Event-Driven Architecture

Latest Missions

SIT-CIT AWS Landing Zone Migration

Orchestrating an enterprise-grade cloud migration and governance framework for SIT-CIT using AWS Control Tower.

AWS Control Tower | Governance | CSI Guidelines

Secure IaC Deployment Engine

Architected a secure, Docker-based Terraform deployment host for isolated and auditable infrastructure delivery.

Dockerized IaC | Bastion | Security | Terraform | Terraform-Agents | Shell-Scripting

FIND - Evaluation Platform

Built a cloud-native platform for validating AI software for clinical evaluations in a secure AWS environment.

AWS | CloudFormation | DevOps | CI/CD

Gembaa - Cloud-Native SaaS Platform

Led the design and deployment of a multi-tenant, highly available Kubernetes-based SaaS solution on AWS.

AWS | Kubernetes | GitOps | Argo CD | EKS | Lambda | API Gateway | SumoLogic | Terraform | Azure Entra ID

Octais Platform – Enterprise Azure Cloud Platform

Led cloud infrastructure transformation for a large-scale multi-tenant platform on Azure.

Azure | AKS | Terraform | Pulumi | ArgoCD | Redis | FalkorDB

Governance.com (Consulting Engagement)

A process improvement initiative to enhance client support, operational efficiency, and development workflows.

Agile | Kanban | Stakeholder Management | Process Improvement | SRE

Direct Asia - Portal/CRM/Cloud & DevOps

A technology transformation mission to improve data pipelines and modernize cloud architecture in Azure.

Azure | Azure DevOps | Azure Synapse | CRM | Agile

PUB - 1BUP QP & LP Migration to GPC

Led the pre-sales and architectural migration of a critical government portal to GPC 2.0.

GPC | IM8 | Architecture | Pre-Sales | Cloud Migration

Resume

Download my full resume from LinkedIn for a detailed overview of my experience, skills, and certifications.

Summary

Highly accomplished Cloud and DevOps Architect with 18+ years of progressive experience in designing, deploying, and managing robust, scalable, and secure cloud infrastructure. Proven expertise in automating complex workflows, optimizing system performance, and implementing best practices for continuous integration and delivery.

Experience

Associate Director - Lead Cloud Consultant | Singapore Institute of Technology (SIT)

October 2025 – Present | Singapore

  • Direct the design and deployment of secure, scalable multi-cloud infrastructures across AWS and Azure, guiding teams in adopting IaC practices with Terraform and CloudFormation to deliver cost efficiency and reliability.
  • Partner with C-level stakeholders and IT Security leadership to establish cloud security frameworks, enforce compliance with regulatory standards, and implement proactive vulnerability management strategies.
  • Champion the adoption of Kubernetes, Docker, and serverless architectures, mentoring engineering teams to deliver cloud-native platforms that improve agility, resilience, and time-to-market.
  • Oversee the implementation of enterprise-grade CI/CD pipelines, disaster recovery frameworks, and API management platforms, ensuring business continuity, operational excellence, and seamless integration across complex ecosystems.

Cloud Solution Architect & Engineering Manager | Staizen

August 2020 – August 2025 | Singapore

  • Defined technical vision for a 30+ member team, leading cloud transformations.
  • Managed complex infrastructure and led pre-sales engagements, securing multi-million-dollar contracts.
  • Directed OKR-based goal setting for engineering teams, improving delivery predictability and alignment with business objectives.

Senior Consultant & Cloud Solution Architect | Mantu

June 2019 – August 2020 | Singapore

  • Acted as primary cloud architect for large-scale projects, translating business needs into secure, scalable solutions.
  • Provided technical leadership during pre-sales cycles and conducted cloud readiness assessments for enterprise clients.

Senior Solution Architect | Total eBiz Solution

March 2018 – June 2019 | Singapore

  • Owned architectural design for public sector projects with Singapore Government Agencies.
  • Developed response strategies for government tenders, ensuring compliance with IM8 and MAS standards.

Chief Technology Officer | Ascentic

May 2017 – March 2018 | Sri Lanka

  • Set the company's technology roadmap, led pre-sales for digital transformation, and oversaw end-to-end delivery.
  • Introduced DevOps practices, reducing delivery lead times by 35%.

Technical Architect | Inexis Consulting

December 2014 – April 2017 | Sri Lanka

  • Provided technical leadership during pre-sales cycles.
  • Designed cloud-first, API-driven solutions for healthcare and insurance domains, emphasizing performance and compliance.

Technical Lead | Virtusa

September 2008 – November 2014 | Sri Lanka

  • Directed full-stack development teams on enterprise-grade financial solutions.
  • Oversaw SharePoint administration and custom module development.

Education

Doctorate in Computer Science (in progress) | Aspen University, USA

September 2020 – Present

MSc in Enterprise Application Development (Distinction) | Sheffield Hallam University

August 2015 – December 2017

Certifications

  • AWS Certified Solutions Architect
  • AWS Certified Developer
  • AWS Data Warehousing
  • Microsoft Azure Administrator
  • Microsoft Azure Architect
  • Dell Boomi Associate Developer
  • Certified Scrum Master

Blog & Insights

Architecting the “Agentic Mesh” for the Autonomous Enterprise

Agentic Mesh

We have officially crossed the Rubicon. 2024 and 2025 were defined by the race to integrate Large Language Models (LLMs) into business workflows, predominantly through Retrieval-Augmented Generation (RAG) chatbots and deterministic tool calling. While valuable, these implementations remain largely reactive and human-in-the-loop, each triggering a single action. As we settle into 2026, the paradigm is shifting rapidly from single-model implementations to multi-agent autonomous systems. We are moving from building systems that answer questions to building systems that achieve goals.

Published: January 05, 2026

The Drumbeat of Change

Post-Nginx

As a Cloud & DevOps Architect, I’ve been hearing the whispers, then the murmurs, and now the increasingly loud conversations about the future of Nginx. “Retiring” might be too strong a word, since Nginx is still a powerful and widely used tool, but the shift in ownership and the evolving open-source landscape have many organizations pondering alternatives and future strategies. This isn’t a doomsday prediction, but rather a call to proactive planning.

Published: November 15, 2025

Beyond the Pair Programmer

GitHub Universe 2025

This post explores "Hybrid Teams," where humans and AI agents operate as a single unit. In this vision, human developers move away from repetitive implementation tasks to focus on high-level architecture and the "why" behind a product, while specialized agents handle the technical "how," including coding, security reviews with CodeQL, and maintenance.

Published: November 11, 2025

A Senior Architect’s Playbook for Migrating to KRaft

Platform Engineering

The general availability of Apache Kafka 4.0 officially removes the ZooKeeper dependency, mandating a one-way migration to the internal KRaft protocol for all self-managed clusters. This fundamental architectural shift delivers significant benefits, including operational simplicity, massive scalability to millions of partitions, and near-instant controller failover. The post outlines a zero-downtime, phased migration playbook, starting with a "bridge" release upgrade, followed by deploying a new KRaft controller quorum, and then performing a "dual-write" rolling restart to safely migrate metadata. This process concludes with an irreversible final cut-over, which necessitates critical pre-migration validation, implementing new KRaft-specific monitoring, and updating all admin tools to no longer connect to ZooKeeper.

Published: October 26, 2025
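
The phase ordering in the KRaft playbook summarized above can be sketched as a small state tracker. This is only an illustration of the sequencing and of the "irreversible cut-over" rule. The real migration is driven by Kafka configuration changes and rolling restarts, not by Python. The names `Phase` and `MigrationTracker` are hypothetical, and the assumption that every phase before the cut-over can be rolled back is inferred from the summary, not taken from the post itself.

```python
from enum import IntEnum


class Phase(IntEnum):
    """Migration phases, in the order the summary lists them."""
    ZOOKEEPER = 0       # starting point: ZooKeeper-managed cluster
    BRIDGE_UPGRADE = 1  # brokers upgraded to a "bridge" release
    KRAFT_QUORUM = 2    # new KRaft controller quorum deployed
    DUAL_WRITE = 3      # rolling restart in "dual-write" mode
    CUT_OVER = 4        # final switch to KRaft, treated as irreversible


class MigrationTracker:
    """Hypothetical helper that enforces the playbook's ordering."""

    def __init__(self) -> None:
        self.phase = Phase.ZOOKEEPER

    def advance(self, validated: bool = False) -> Phase:
        """Move to the next phase; the cut-over requires validation."""
        if self.phase is Phase.CUT_OVER:
            raise RuntimeError("migration already complete")
        next_phase = Phase(self.phase + 1)
        if next_phase is Phase.CUT_OVER and not validated:
            raise RuntimeError("cut-over is irreversible; validate first")
        self.phase = next_phase
        return self.phase

    def roll_back(self) -> Phase:
        """Step back one phase (assumed possible before the cut-over)."""
        if self.phase is Phase.CUT_OVER:
            raise RuntimeError("cannot roll back after cut-over")
        if self.phase is Phase.ZOOKEEPER:
            raise RuntimeError("nothing to roll back")
        self.phase = Phase(self.phase - 1)
        return self.phase
```

A caller would invoke `advance()` once per completed phase and pass `validated=True` only after the pre-migration checks have passed. Making the last step an explicit opt-in mirrors the summary's point that the final cut-over cannot be undone.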

The Strategic Imperative for Scalable Cloud Delivery and Optimized Developer Experience

Platform Engineering

Platform Engineering (PE) is the strategic architectural solution necessary to overcome the Cloud Complexity Crisis, which has resulted in unsustainable cognitive overload for application developers under decentralized DevOps models. PE addresses this by creating an Internal Developer Platform (IDP)—a productized, self-service interface that abstracts infrastructure complexity through standardized, automated Golden Paths. This approach centralizes high-skill infrastructure expertise and embeds security, governance, and cost efficiency into the architecture, enabling developers to focus exclusively on business logic. The article concludes that PE is a mandatory shift that significantly boosts organizational velocity, minimizes developer toil, and delivers measurable business value through improvements in key DORA metrics, fundamentally reshaping cloud delivery towards a highly automated and optimized future.

Published: October 23, 2025

Cloud Ops is Still Too Hard. LLMs are About to Fix It.

LLM-Driven DevOps

Let’s face it: our cloud environments are an absolute monster. We’re all juggling sprawling microservices, endless multi-cloud configurations, and a thousand tiny details. Yeah, Infrastructure as Code (IaC) and our CI/CD pipelines have helped, but they haven't solved the core problem. So much of DevOps is still just repetitive, exhausting toil: debugging incidents that take hours, endlessly tuning policies, and making sure documentation isn't instantly outdated. It’s draining. Here’s the game changer: Large Language Models (LLMs). They're injecting genuine contextual reasoning and flexibility into operations. Instead of us having to write every single script, the LLMs can now generate configs, intelligently analyze logs, and even propose the fix.

Published: October 02, 2025

A Deep Dive into Declarative Side-Effects

Terraform Actions

Terraform has always excelled at managing infrastructure, but handling side-effects like cache invalidations or notifications often meant hacks and external scripts. With the introduction of Terraform Actions in v1.14, these tasks now have a native, declarative home. Actions let you trigger operations either directly from the CLI or bound to resource lifecycle events, with support for scaling, conditions, and long-running workflows. Real-world examples include automatically invalidating a CDN when content changes or stopping EC2 instances immediately after creation to save costs. While provider support is still limited and action results cannot yet be dependencies, the approach significantly reduces complexity and brings more operational workflows into Terraform itself. This is a major step forward in unifying infrastructure and side-effects under Infrastructure as Code.

Published: September 28, 2025

Azure AKS Automatic

AKS Automatic

Azure Kubernetes Service (AKS) Automatic is a major leap forward, offering a fully managed, "production-ready" Kubernetes experience that eliminates much of the traditional operational burden. By baking in best practices for security and automation, the service streamlines everything from initial setup to day-two operations. It intelligently scales workloads with pre-configured tools like HPA, VPA, and Karpenter, and provides built-in safeguards for security and reliability. This approach makes Kubernetes far more accessible to lean teams and startups, while also providing a standardized, efficient platform for larger enterprises, ultimately allowing all teams to accelerate application delivery and focus on innovation.

Published: September 21, 2025

AWS Launches 8th-Gen Memory-Optimized EC2

AWS R8i and R8i-flex

The new Amazon EC2 R8i and R8i-flex instances, powered by custom Intel Xeon 6 processors, are now generally available and offer significant performance and price-performance improvements over the previous generation. Designed for memory-intensive workloads, these instances provide up to 20% higher overall performance and feature the latest AWS Nitro Cards to double network and EBS bandwidth. The R8i-flex is a great option for workloads that don't need full compute utilization, while the R8i is built for more demanding applications.

Published: September 19, 2025

What’s really inside Terraform Enterprise v1.0.1?

Terraform Release 1.0.1

Scheduled for release next week, Terraform Enterprise v1.0.1 signals a new era of maturity for the platform, moving beyond flashy features to solidify its core enterprise capabilities. This foundational update is expected to enhance governance with more granular, context-aware policy controls, shift cost management left with proactive budget and drift detection, and streamline Day-2 operations by introducing intelligent, automated drift remediation workflows. While seemingly an incremental update, v1.0.1 represents a deliberate step toward embedding smarter, more autonomous controls for security, cost, and compliance directly into the infrastructure lifecycle, reinforcing TFE's role as a comprehensive solution for managing IaC at scale.

Published: September 10, 2025

Supercharging GuardDuty with Custom Entity Lists

Amazon GuardDuty

Amazon GuardDuty has long served as a crucial intelligent threat detection service within AWS, leveraging machine learning and global threat intelligence to identify malicious activity. However, a significant limitation was its inability to natively incorporate an organization's unique, context-specific threat intelligence, forcing teams to rely on separate, often complex, workarounds. The introduction of new custom entity lists fundamentally changes this dynamic.

Published: September 08, 2025

Azure App Service Premium v4 Release

Azure App Service v4

Microsoft quietly but significantly raised the bar for application hosting with the General Availability of Azure App Service Premium v4 on September 1, 2025. This update is not just another SKU refresh. It represents a step forward in performance, scalability, and flexibility for developers and organizations running critical workloads on Azure.

Published: September 04, 2025

Azure Landing Zone with Terraform

Azure Landing Zone

An Azure landing zone uses subscriptions to isolate and scale application resources and platform resources. Subscriptions for application resources are called application landing zones, and subscriptions for platform resources are called platform landing zones.

Published: September 02, 2025

AWS Landing Zone using Terraform and Sentinel

AWS Landing Zone

This post provides a comprehensive overview of the architectural principles, design patterns, and core components used in this AWS Landing Zone. The goal of this architecture is to establish a secure, scalable, cost-efficient, and operationally excellent foundation for deploying all workloads on AWS.

Published: September 01, 2025

Demystifying Kubernetes Networking

Kubernetes Networking

Kubernetes networking can feel like a black box until you know exactly how packets move inside your cluster. In this blog post, I break down how networking really works in Kubernetes and share practical insights for running production-grade clusters.

Published: August 15, 2025

Terraform best practices for large-scale deployments

Taming the Beast

Terraform is a fantastic tool for Infrastructure as Code (IaC). It makes provisioning and managing resources a breeze... until it doesn't. Managing Terraform at scale is very different from running a few `.tf` files for a single project.

Published: August 09, 2025

AWS Egress Cost Optimizer

In the dynamic world of cloud computing, managing and optimizing data transfer costs, especially egress (data leaving AWS), is a persistent challenge. These costs can be hazy, unpredictable, and often a significant portion of the AWS bill.

Published: August 01, 2025

Azure Governance Guardian

In today's dynamic cloud environments, maintaining control over Azure resources isn't just a "nice-to-have"; it's critical for security, cost optimization, and compliance. This solution provides a robust, automated framework to ensure Azure deployments consistently adhere to the organization's governance policies in real time.

Published: July 10, 2025

EFS Backup Consistency Verification

AWS Backup is fantastic, but what happens when your EFS is constantly being written to during the backup window? You might end up with backups that aren't truly consistent, leading to potential data integrity nightmares during a restore. 😱

Published: July 04, 2025

Azure Intelligent Data Governance Classification

Azure Intelligent Data Governance Classification

Automate sensitive data discovery, classification, and governance across your Azure Data Lake. This solution leverages AI/ML (Azure ML, OpenAI, Cognitive Services) to bring clarity, compliance, and control to vast datasets.

Published: In Progress

Get in Touch

Have a project in mind or just want to chat about Cloud/DevOps? Feel free to reach out!

GitHub: github.com/chathushka-dilhan