Struggling to balance support tickets and innovation? Discover how a small DevOps team leverages simple Lambda-powered workflows to empower 200+ developers and unlock massive efficiency.
Introduction
How do you manage endless support tickets while still focusing on innovation?
Not every task we handle is thrilling or exciting. Not every task is blog-post material. Sometimes, we deal with less glamorous missions, like saving money on CloudWatch log storage, offboarding a developer, securing access to a sensitive S3 bucket, disabling unused IAM roles, implementing a code freeze solution, and more.
And it doesn’t stop there — sometimes, we even get support tickets that seem endless: granting missing IAM permissions, creating MongoDB or RDS clusters and users, setting up AWS Personal Accounts, creating ECR repositories, secrets, and so much more.
How can just one DevOps engineer, or even a few, manage all of this in addition to daily tasks?
Here’s where it gets interesting: we’re a team of only five DevOps engineers, responsible for over 200 developers.
What if there were a simple way to automate these tasks or, even better, empower developers to handle them on their own?
Well, what if I told you there is a way? A simple way.
If I were to give this blog post another title, it would be The Power of Simplicity.
You won’t find a complex architecture with thousands of lines of code here. Instead, you’ll see the most basic and straightforward solutions — the kind that are often the most effective.
About Me
Before we dive in, let me introduce myself.
I’m Orel Bello, an AWS Community Builder and a passionate DevOps Platform Engineer with over 3.5 years of experience, including the past 2.5 years at Melio. My tech journey began during my military service as a Deputy Commander in the Technological Control Center for the Israel Police. After earning a B.Sc. in Computer Science, I started as a Storage and Virtualization Engineer before discovering my true calling in DevOps. Now AWS certified at the Professional level as both a DevOps Engineer and a Solutions Architect, I specialize in building scalable, efficient, and cost-effective cloud solutions.
One thing you should know about Melio is that our entire architecture is fully serverless. We run a large-scale environment of Lambda functions, and naturally, Lambda has become our go-to solution for nearly every challenge we need to address.
Understanding Lambda Functions
Let’s take a look at what Lambda functions are and how they can help us boost efficiency through automation.
We’re all familiar with AWS Lambda, the serverless compute service that lets you focus on writing code instead of managing servers.
Lambda integrates natively with many AWS services, making it the perfect tool for automation.
You can trigger Lambda functions on demand, or by a variety of AWS services like EventBridge, SNS, SQS, API Gateway, and many more.
And the best part? You don’t need to be an expert developer to write automations. All you need is a solid understanding of basic Python and the legendary boto3 library.
Boto3 is the AWS SDK for Python. It is built on botocore, the same library that powers the AWS CLI we all know and love, and it lets you perform actions on AWS with ease.
And here’s the kicker: it’s already included in Lambda’s Python runtimes, so no additional layer is required!
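As a small illustration of how little code a boto3 automation needs, here is a minimal sketch that finds CloudWatch log groups with no retention policy (the kind of cost-saving chore mentioned earlier). The function name and the returned structure are assumptions made for this example, not code from our environment.

```python
def find_log_groups_without_retention(logs_client):
    """Return the names of log groups that never expire."""
    names = []
    paginator = logs_client.get_paginator("describe_log_groups")
    for page in paginator.paginate():
        for group in page.get("logGroups", []):
            # Groups that never expire have no "retentionInDays" key.
            if "retentionInDays" not in group:
                names.append(group["logGroupName"])
    return names


def lambda_handler(event, context):
    # boto3 ships with the Lambda Python runtimes, so no layer is needed.
    # It is imported here so the helper above can be tested without AWS.
    import boto3

    logs_client = boto3.client("logs")
    return {"never_expiring": find_log_groups_without_retention(logs_client)}
```

Passing the client in as a parameter keeps the logic easy to test with a stub before it ever touches a real account.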
So, what can you do with it?
Basically — everything!
Let me show you just how simple it can be.
Use Case 1: Implementing a Code Freeze Solution
Let’s talk about the Code Freeze.
We always want our production environment to be stable and error-free. But there are certain critical periods, like when we’re presenting a live demo to partners, where we can’t afford the risk of a developer accidentally deploying to production and causing issues. During these times, we need to automatically block all deployments on a schedule and, most importantly, make it easy to enable or disable the block if a hotfix is needed in production.
Here’s the simplest solution for this, broken down into three parts:
- Scheduling: EventBridge lets us use cron expressions to trigger our Lambda function at specific times.
- Blocking: Since all of our services are deployed through CloudFormation stacks, blocking all deployments is as simple as denying CloudFormation actions (such as CreateStack and UpdateStack). We can achieve this using a Service Control Policy (SCP).
- Lambda: This is the bridge between EventBridge and the SCP.
In short, we write a simple Lambda function that attaches the SCP and trigger it with EventBridge (plus, of course, a second Lambda function and EventBridge rule to lift the code freeze). It’s as easy as that!
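The three parts above can be sketched roughly as follows. This is a minimal illustration, not our production code: the SCP body, the environment variable names (SCP_POLICY_ID, FREEZE_TARGET_ID), and the "freeze" event field are all assumptions. It also folds enabling and disabling into one handler for brevity, whereas the setup described above uses two Lambda functions.

```python
import json
import os

# Assumed SCP body: deny the CloudFormation actions named above.
# A real policy may need more actions, depending on how stacks are deployed.
CODE_FREEZE_SCP = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "CodeFreeze",
            "Effect": "Deny",
            "Action": ["cloudformation:CreateStack", "cloudformation:UpdateStack"],
            "Resource": "*",
        }
    ],
}


def set_code_freeze(org_client, policy_id, target_id, enabled):
    """Attach (enabled=True) or detach the SCP; safe to call repeatedly."""
    exceptions = org_client.exceptions
    try:
        if enabled:
            org_client.attach_policy(PolicyId=policy_id, TargetId=target_id)
        else:
            org_client.detach_policy(PolicyId=policy_id, TargetId=target_id)
    except (
        exceptions.DuplicatePolicyAttachmentException,
        exceptions.PolicyNotAttachedException,
    ):
        pass  # Already in the desired state.
    return "frozen" if enabled else "open"


def lambda_handler(event, context):
    # Imported here so set_code_freeze can be tested without AWS access.
    import boto3

    org_client = boto3.client("organizations")
    # Hypothetical configuration: the SCP is created once, ahead of time.
    policy_id = os.environ["SCP_POLICY_ID"]
    target_id = os.environ["FREEZE_TARGET_ID"]
    # The EventBridge rule is assumed to pass {"freeze": true} or {"freeze": false}.
    state = set_code_freeze(org_client, policy_id, target_id, bool(event.get("freeze", True)))
    return {"state": state, "policy": json.dumps(CODE_FREEZE_SCP)}
```

An EventBridge schedule such as cron(0 6 ? * MON *) could trigger the freeze, with a second schedule lifting it. Because attaching and detaching are idempotent here, invoking the function by hand for a hotfix is safe. Note that SCPs can only be attached from the Organizations management account or a delegated administrator.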
Automating the code freeze mechanism not only helps safeguard stability but also simplifies the process and reduces the chances of human error during those critical times.
Use Case 2: Developer Offboarding Automation
Alright, that was simple, but what about offboarding a developer?
At Melio, every developer has a Personal AWS Account and a Personal Atlas MongoDB cluster. When they leave the company, we need to delete these resources for two key reasons:
- Security: We want to make sure no backdoors are left open.
- Cost Optimization: Resources that are no longer in use should be terminated.
Don’t worry, it’s just as straightforward as before.
The first step is to use EventBridge integrated with CloudTrail to capture the DisableUser event, which tells us a developer has left the company.
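For illustration, an EventBridge rule for this trigger could look something like the sketch below. The rule name is hypothetical, and the pattern matches on the event name only. A real rule would most likely also pin the event source of whichever identity service emits DisableUser in your setup, which is left out here because it depends on the environment.

```python
import json

# Matches CloudTrail-recorded API calls named DisableUser.
OFFBOARDING_EVENT_PATTERN = {
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {"eventName": ["DisableUser"]},
}


def create_offboarding_rule(events_client, rule_name="developer-offboarding"):
    """Create (or update) the rule and return its ARN."""
    response = events_client.put_rule(
        Name=rule_name,  # hypothetical name
        EventPattern=json.dumps(OFFBOARDING_EVENT_PATTERN),
        State="ENABLED",
    )
    return response["RuleArn"]
```

The rule still needs a target (for example, the offboarding Lambda function or state machine), which can be added with the EventBridge put_targets call.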
Next, we need to clean up the AWS resources before closing the account.
Why not just close the account right away? We deploy third-party resources, like the Twingate connector, when creating a personal AWS account. We’ll need to run a terraform destroy before closing the account to terminate those external resources.
How do we do this?
We simply send an API request (using the requests library, so we’ll need a Lambda layer for that) to Env0 (our Terraform platform). Once the destroy operation is complete (we can implement a simple wait mechanism with a Step Functions state machine), we close the AWS account with a basic Boto3 command. Afterward, we make an API call to MongoDB Atlas to delete the cluster, and that’s it.
It’s an easy workflow, and aside from the additional Lambda layer for the Python requests library, everything else is native to AWS.
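To make the wait mechanism concrete, here is one way the workflow could be laid out as a Step Functions definition, together with the account-closing step. This is a sketch under assumptions: the state names, the destroy_status field and its "DONE" value, and the event shape are invented for illustration. The Env0 and MongoDB calls are left to their own Lambda functions because their request details are not covered here.

```python
def build_offboarding_definition(destroy_arn, status_arn, close_arn,
                                 delete_cluster_arn, wait_seconds=60):
    """Return an Amazon States Language definition as a Python dict."""
    return {
        "Comment": "Destroy Terraform resources, then close the account",
        "StartAt": "StartDestroy",
        "States": {
            "StartDestroy": {"Type": "Task", "Resource": destroy_arn, "Next": "Wait"},
            "Wait": {"Type": "Wait", "Seconds": wait_seconds, "Next": "CheckDestroy"},
            "CheckDestroy": {"Type": "Task", "Resource": status_arn, "Next": "IsDestroyed"},
            "IsDestroyed": {
                "Type": "Choice",
                "Choices": [
                    {
                        # Assumed field set by the status-check Lambda.
                        "Variable": "$.destroy_status",
                        "StringEquals": "DONE",
                        "Next": "CloseAccount",
                    }
                ],
                "Default": "Wait",  # Not finished yet: wait and check again.
            },
            "CloseAccount": {"Type": "Task", "Resource": close_arn, "Next": "DeleteMongoCluster"},
            "DeleteMongoCluster": {"Type": "Task", "Resource": delete_cluster_arn, "End": True},
        },
    }


def close_account_handler(event, context, org_client=None):
    """Close the developer's personal AWS account."""
    if org_client is None:
        import boto3  # imported lazily so the function can be tested with a stub

        org_client = boto3.client("organizations")
    org_client.close_account(AccountId=event["account_id"])
    # Return the input so later states still see account_id and friends.
    return event
```

Each task's output becomes the next state's input, so every Lambda function in this layout needs to pass the event through. A production version would also want a failure branch so a failed destroy does not loop forever.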
Use Case 3: CloudWatch Logs Cost Optimization
Let’s look at one more use case.
At Melio, we store log groups in CloudWatch to meet compliance requirements. However, CloudWatch can be expensive, so we came up with a more cost-effective solution: exporting log groups to S3, which is a much cheaper storage option.
The catch? CloudWatch has no native way to do this automatically, the way lifecycle rules work for S3 buckets, so we had to build our own solution.
Let’s break it down:
1. DynamoDB Table Creation:
Create a DynamoDB table containing the names of all log groups. This table acts as a registry for managing the export process.
2. Export Task Initialization:
Retrieve the last item from the DynamoDB table, initiating an export task for the corresponding log group. Subsequently, remove the item from the table.
3. Set Retention Policy:
Apply a retention policy of 3 months to the log group that was exported successfully, ensuring that only relevant data is retained in CloudWatch.
4. Task Status Monitoring:
Check if the DynamoDB table is empty. If it is, the export process is complete. If not, wait for 15 minutes and monitor the status of the ongoing export task.
5. Task Completion Check:
If the export task is marked as done, start the next export task. If not, wait for 15 minutes and recheck the status.
This gives us a systematic way to export log groups to S3. The process runs periodically, every three months, so only the necessary data stays in CloudWatch, which yields significant cost savings over time while still meeting our compliance requirements.
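The export steps above could be sketched with two small functions like these. The one-at-a-time design matters because CloudWatch Logs allows only one active export task per account at a time. The table key name (log_group_name), the S3 prefix scheme, and exporting from time zero are assumptions for illustration. The sketch also takes whichever item a scan returns first rather than strictly the last one.

```python
import time

RETENTION_DAYS = 90  # roughly the 3 months described above


def start_next_export(table, logs_client, bucket, now_ms=None):
    """Pop one log group from the registry and start exporting it.

    Returns the task id and log group, or None when the registry is empty.
    """
    items = table.scan(Limit=1).get("Items", [])
    if not items:
        return None
    log_group = items[0]["log_group_name"]  # assumed key name
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    task = logs_client.create_export_task(
        logGroupName=log_group,
        fromTime=0,  # export everything up to now
        to=now_ms,
        destination=bucket,
        destinationPrefix=log_group.strip("/"),
    )
    table.delete_item(Key={"log_group_name": log_group})
    return {"task_id": task["taskId"], "log_group": log_group}


def finish_export_if_done(logs_client, task_id, log_group):
    """Check the export task; shorten retention once it has completed."""
    tasks = logs_client.describe_export_tasks(taskId=task_id)["exportTasks"]
    status = tasks[0]["status"]["code"]
    if status == "COMPLETED":
        logs_client.put_retention_policy(
            logGroupName=log_group, retentionInDays=RETENTION_DAYS
        )
    return status
```

A Step Functions loop (or a scheduled Lambda function) can call finish_export_if_done every 15 minutes and move on to start_next_export once the status is COMPLETED. Here `table` is a boto3 DynamoDB Table resource.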
The Buffet: A Self-Service Solution
While Lambda saves time through automation, how can we address on-demand developer requests without creating bottlenecks?
That’s where The Buffet comes in — a self-service portal powered by Lambda functions.
The Buffet lets developers perform common tasks on their own instead of waiting for DevOps. It removes the bottleneck and makes their lives easier.
How It Works
We’ve set up an interface where developers can submit their requests (we use Slack, but you can use any tool you prefer).
Once a request is made, it’s sent via API Gateway into our AWS account. From there, we publish it to an SNS topic, which sends the request to multiple SQS queues, one for each runner (i.e., self-service action). The relevant Lambda function pulls from its SQS queue and performs the action.
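Here is a rough sketch of both ends of that pipeline: publishing a request to SNS and unpacking it inside a runner. The "runner" message attribute and the payload shape are assumptions. The idea is that each SQS subscription could carry an SNS filter policy on that attribute so a queue only receives its own runner's requests. The parsing assumes raw message delivery is off, so each SQS body is an SNS envelope.

```python
import json


def publish_request(sns_client, topic_arn, runner, payload):
    """Publish a self-service request tagged with the runner it is meant for."""
    return sns_client.publish(
        TopicArn=topic_arn,
        Message=json.dumps(payload),
        MessageAttributes={
            "runner": {"DataType": "String", "StringValue": runner},
        },
    )


def parse_requests(sqs_event):
    """Extract the request payloads from an SQS-triggered Lambda event."""
    payloads = []
    for record in sqs_event.get("Records", []):
        envelope = json.loads(record["body"])  # the SNS envelope
        payloads.append(json.loads(envelope["Message"]))
    return payloads


def lambda_handler(event, context):
    # A runner would act on each payload here; this sketch only reports them.
    return {"handled": parse_requests(event)}
```

Keeping one queue per runner means a slow or failing action only backs up its own queue, and adding a new runner is a matter of adding one more subscription.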
Implementation Details
That covers the infrastructure, but what about the logic for the runners?
It’s simpler than you might think.
We’ve identified the most frequently requested tasks and automated them. These are often day-one operations, like creating AWS personal accounts, ECR repositories, Secrets, RDS clusters, MongoDB clusters, and more.
What do all of these tasks have in common?
They all create resources using Terraform. And since the Terraform code is stored in a Git repository, we just fetch the relevant file, append the new resource, create a pull request, and after the merge, Env0 applies the changes.
This simple but powerful architecture allows us to automate the creation of resources and easily add new runners without hassle.
Lambda function → Modify the Terraform repo → Create a PR → Apply with Env0.
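The "append the new resource" step can be as small as the sketch below, shown for an ECR repository. It assumes the repository is declared as a plain aws_ecr_repository resource, which may differ from a real setup that uses modules or variables files. Fetching the file and opening the pull request are omitted because they depend on the Git host.

```python
def render_ecr_repository(name):
    """Render a Terraform block for an ECR repository called `name`."""
    # Terraform resource labels cannot contain "/" and conventionally use "_".
    label = name.replace("-", "_").replace("/", "_")
    return (
        f'resource "aws_ecr_repository" "{label}" {{\n'
        f'  name = "{name}"\n'
        "}\n"
    )


def append_resource(file_content, block):
    """Append `block` to the Terraform file unless it is already present."""
    if block.strip() in file_content:
        return file_content  # idempotent: a repeated request changes nothing
    if not file_content:
        return block
    return file_content.rstrip("\n") + "\n\n" + block
```

For example, append_resource(current_file, render_ecr_repository("payments-api")) returns the new file content to commit on a branch. The idempotency check keeps a repeated request from producing a duplicate resource.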
Benefits
Using The Buffet is a win-win for everyone.
Developers no longer need to wait on DevOps for support requests and can focus solely on development, free from bottlenecks. Meanwhile, DevOps can shift focus to more impactful tasks instead of handling repetitive support.
Creating a Self-Service portal can significantly ease the day-to-day load on DevOps and streamline workflows for everyone.
It does require effort: building new runners is simple, but creating the portal itself will take some time.
However, it can empower your team and skyrocket productivity. The impact can be so huge, it’s like adding a new DevOps engineer to your team, handling the heavy lifting!
Final Thoughts
From a simple code freeze mechanism to comprehensive workflows, Lambda functions empower DevOps engineers to streamline their processes. Whether it’s using EventBridge for triggers, Step Functions for orchestration, or Slack for user interfaces, these tools make balancing efficiency and simplicity feel effortless.
Ready to simplify your workflows? Start small — automate just one task and watch the impact it has. With every step forward, you’ll uncover the incredible power of simplicity.