1 October, 2024
As an engineer at Volante, I encountered an interesting challenge.
We worked with several banks, each needing customised software. Three things made the situation unique: the software couldn't be hosted on public or shared cloud infrastructure, several teams were making changes to the same codebase, and every change needed rigorous testing in a production environment so we could simulate specific bank interactions accurately.
Multiple times a day, my task was to spin up complex micro-service environments for periods ranging from 15 minutes to an hour. Once the testing was done, or after getting client approval, I'd shut them down. We relied on Kubernetes (with Helm), Chef to automate server configuration, Terraform, Fluentd for logs, Prometheus and Grafana for monitoring, and Traefik for ingress control. It was a pretty complex system to manage.
Naturally, I automated this process. I started with manually run Terraform scripts and, over time, moved them into a custom shell script library that could be triggered from a dashboard I built. What used to be a multi-step process became a single-click one. Anyone could operate it, and it cut deployment time by more than 50% during critical releases, allowing our team to deliver faster. More importantly, it allowed me to focus on coding rather than getting bogged down by routine operational tasks.
Automating for a startup budget
In 2022, I became CTO of a startup trying to become the Home Depot of India. We built a platform designed to scale to thousands of transactions per second. Our deployment stack involved splitting up our backend into micro-services, using Kubernetes, GitHub Actions for CI/CD, Docker BuildKit, Prometheus and Grafana. We spared no expense on production, but on a limited budget the cloud costs for staging and development, essential as those environments were, had become unsustainable.
Like before, I automated the DevOps pipeline. Once again, we used Terraform to spin up micro-services, databases, caches, queues and cron jobs on demand. The testing team, with no DevOps knowledge, could push code, deploy for 30 minutes to an hour, run tests, and then the environment would automatically shut down.
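The automatic shutdown described above can be pictured as a time-to-live (TTL) check that runs on a schedule. The snippet below is only a minimal sketch of that idea, not the pipeline we actually ran: the `Environment` class, the `environments_to_destroy` helper and the environment names are all hypothetical, and in a real setup each expired environment would then be torn down by something like `terraform destroy`.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Environment:
    """Hypothetical record of one on-demand test environment."""
    name: str
    created_at: datetime
    ttl: timedelta  # how long the environment is allowed to live

    def expired(self, now: datetime) -> bool:
        # An environment expires once its TTL has fully elapsed.
        return now >= self.created_at + self.ttl


def environments_to_destroy(envs, now):
    """Return the names of environments whose TTL has run out."""
    return [env.name for env in envs if env.expired(now)]


# Example: one environment is past its 30-minute TTL, the other is still live.
now = datetime(2024, 1, 1, 12, 0)
envs = [
    Environment("qa-checkout", now - timedelta(minutes=45), timedelta(minutes=30)),
    Environment("qa-search", now - timedelta(minutes=10), timedelta(minutes=60)),
]
print(environments_to_destroy(envs, now))  # prints ['qa-checkout']
```

Keeping the expiry decision separate from the teardown step makes it easy to test without touching any real infrastructure.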
The results were phenomenal. Our development and staging costs dropped from $1400 to just $100 a month, a reduction of roughly 93%. More importantly, our team moved faster through builds and could prototype features that the operations team could try before they went into production. It was a game-changer.
Building infrastructure for my own startup
Fast forward to 2024; I was brainstorming SaaS ideas with a small team, quickly prototyping on local machines. We generated interest from customers, and it was time to bring the app to them.
This task was daunting! I found myself starting from scratch (again), with a fresh AWS account, no DevOps team and no automation. I had to set up the servers, databases, CI/CD pipelines, security systems, and monitoring from the ground up. I was familiar with the process, but it was an enormous effort nevertheless.
I looked for PaaS offerings but found that none had the complete automation capabilities I needed. That's when the idea hit me: what if I built a product for other developers, following the mythical advice to “build something that you would use”?
The "Aha" moment
"What if we automated the entire DevOps pipeline, like a large enterprise would, but not just for our services but for anyone else's?"
Startups spend countless hours piecing together DevOps tools and SaaS applications to create the ideal environment for their apps. What if they could rent a pre-built, perfect setup and focus solely on their product?
We realised this was worth building. We also realised that it had to be more than just a CI/CD tool.
Beyond CI/CD
To make this viable, we needed to provide enterprise-grade infrastructure to startups in just a few clicks. The solution, though opinionated, had to be simple, powerful and scalable. It needed to have the following core attributes:
Minimal Learning Curve: No DevOps knowledge required.
Single-Click Deployment: Ships with CI/CD out of the box.
Auto-Scaling: Scale up when necessary, but scale to zero when idle to save costs.
Environments over Services: Automation affects entire environments, not just individual services.
Cost Transparency: Pay for actual usage, not for renting servers.
Framework Agnostic: Containerise any web app, or otherwise make it compatible.
Security: Implement every best practice in the book.
This would save time, effort and cost for every new app without sacrificing scalability or flexibility.
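To make the Auto-Scaling attribute above concrete, here is a minimal sketch of a scale-to-zero decision. It is an illustration under stated assumptions rather than a description of any real autoscaler: the function name, the parameters and the default values (a 300-second idle timeout, a cap of 10 replicas) are all hypothetical.

```python
import math


def desired_replicas(requests_per_second, capacity_per_replica,
                     idle_seconds, idle_timeout_seconds=300, max_replicas=10):
    """Pick a replica count: scale with load, drop to zero once idle long enough.

    All parameter names and defaults are illustrative assumptions.
    """
    if capacity_per_replica <= 0:
        raise ValueError("capacity_per_replica must be positive")

    if requests_per_second <= 0:
        # No traffic: keep one warm replica until the idle timeout passes,
        # then scale to zero so nothing is billed while idle.
        return 0 if idle_seconds >= idle_timeout_seconds else 1

    # Traffic present: enough replicas to cover the load, within the cap.
    needed = math.ceil(requests_per_second / capacity_per_replica)
    return max(1, min(needed, max_replicas))


print(desired_replicas(120, 50, idle_seconds=0))   # prints 3
print(desired_replicas(0, 50, idle_seconds=600))   # prints 0
```

The idle timeout is the main design choice here: a short timeout saves more money, while a longer one avoids cold starts for apps that see bursty traffic.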
An alternative to the cloud
When the cloud emerged in the 2000s, it was revolutionary. It allowed companies to bypass the complexities of networking and hardware setup, providing compute, storage, and networking on demand. It helped businesses move faster and cut upfront costs.
Today, we're at a similar inflection point. AI enables engineers to ship code faster than ever, focusing entirely on the product. Good DevOps engineers, with the right mix of talent and experience, are hard to come by. Setting up complex infrastructure on AWS or its equivalents is a repetitive task that can be automated, and automating it lets modern companies move even faster and spend even less on infrastructure.
Building ToyStack: The Neo-Cloud
That's why we built ToyStack: a cloud for the new age.
Our 8,000 (and growing) users simply bring their code and we:
Deploy it on a fully managed Kubernetes cluster.
Allow them to spin up databases and caches.
Integrate logging and monitoring tools that pipe information directly into their dashboards.
Continuously improve security systems and run vulnerability checks.
Deploy ephemeral development environments to reduce costs.
Automate everything from code to cloud.
More importantly, all of this happens with a single click, with no specialised knowledge (or certification courses) required.
I'm excited to introduce my first "build something you would use" product to the world. Building ToyStack was a real challenge — we pushed the boundaries of our knowledge and the tools we relied on. We had to dive deep, creating custom Terraform implementations, solving containerisation issues, combating bad-actor abuse, and facing challenges that most DevOps teams wouldn't encounter when setting up internal infrastructure.
But thanks to continuous iteration and feedback from our Discord community, we've made the product more stable. I'll be writing a technical blog post soon about how we tackled these obstacles, so stay tuned!
PS: If you're interested in trying our product, you can get started here. If you'd like to tell us what we can do better, speak with an engineer on our team.