Zeno Health's Journey to a Scalable, Cost-Effective Infrastructure


Opportunity

Zeno Health, Mumbai's leading pharmacy focused on generic medicines, operates 175+ stores across Mumbai and Pune and aims to provide holistic healthcare that is authentic, accessible, and affordable. As part of its growth strategy, Zeno wanted infrastructure that scales with demand, costs less to run, and recovers automatically from failures, with all services consolidated into one centralized, manageable environment. This required a modernized, efficient, and secure infrastructure capable of handling current demand and future growth.

Problem

Zeno's existing infrastructure faced several challenges:

Lack of Scalability: Services were running on EC2 instances without autoscaling, limiting the ability to handle spikes in demand.

No Centralized Logging: Without a unified logging system, debugging and performance monitoring were difficult.

Cost Overheads: Relying on New Relic for monitoring carried high monthly costs, and running on standalone EC2 instances left resources underutilized.

Complexity in Deployment: Manual setup and the lack of CI/CD pipelines hindered efficiency and consistency.

Limited Security: Access was managed through individual IAM users rather than a centralized tool such as IAM SSO, which posed security risks.

Fragmented Infrastructure: Services were split between DigitalOcean and AWS, which complicated cost tracking and resource management.

Solution

To address Zeno Health's challenges, we implemented a comprehensive transformation of their infrastructure:

Migration to Kubernetes: We migrated services from EC2 instances to Kubernetes clusters on Amazon Elastic Kubernetes Service (EKS). This provided autoscaling and automatic restarts of failed services, keeping downtime minimal and improving reliability. We used Karpenter with EKS to run development and pre-production environments on Spot Instances, significantly reducing costs.
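
As an illustration of the Spot setup described above, here is a minimal sketch of a Karpenter NodePool manifest that only provisions Spot capacity. It assumes Karpenter's v1 NodePool API (older releases use different API versions), and the names `dev-spot` and `default` are hypothetical; the case study does not publish the actual configuration. The manifest is built as a Python dictionary and printed as JSON, which `kubectl apply -f` accepts.

```python
import json


def spot_nodepool(name: str, node_class: str) -> dict:
    """Build a Karpenter NodePool manifest restricted to Spot capacity.

    `name` and `node_class` are hypothetical values; `node_class` must match
    an EC2NodeClass that already exists in the cluster.
    """
    return {
        "apiVersion": "karpenter.sh/v1",  # assumption: Karpenter v1 API
        "kind": "NodePool",
        "metadata": {"name": name},
        "spec": {
            "template": {
                "spec": {
                    "nodeClassRef": {
                        "group": "karpenter.k8s.aws",
                        "kind": "EC2NodeClass",
                        "name": node_class,
                    },
                    # Only Spot capacity may be launched for this pool.
                    "requirements": [
                        {
                            "key": "karpenter.sh/capacity-type",
                            "operator": "In",
                            "values": ["spot"],
                        }
                    ],
                }
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(spot_nodepool("dev-spot", "default"), indent=2))
```

Adding `"on-demand"` to the `values` list would let Karpenter fall back to On-Demand capacity when Spot is unavailable, which is a common choice for workloads that cannot tolerate interruption.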

Infrastructure Automation and Centralization: We used Terraform to provision the entire environment, ensuring consistent and replicable deployments across pre-production and production. We established centralized logging with Elasticsearch and Kibana, improving visibility and troubleshooting, and replaced manual deployments with CI/CD pipelines built on AWS CodePipeline and CodeBuild.
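
Centralized logging works best when every service emits logs in a consistent, machine-readable shape. The case study does not describe how logs were shipped to Elasticsearch, so the following is only a sketch of one common approach: each service writes one JSON object per line to stdout, and a log shipper forwards those lines. The class `JsonFormatter`, the helper `build_logger`, and the field names are illustrative assumptions, not Zeno's actual setup.

```python
import json
import logging
import sys
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single-line JSON object."""

    def __init__(self, service: str) -> None:
        super().__init__()
        self.service = service  # hypothetical service name added to every line

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "@timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "level": record.levelname,
            "service": self.service,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            # Keep the stack trace in the same JSON document.
            entry["exception"] = self.formatException(record.exc_info)
        return json.dumps(entry)


def build_logger(service: str) -> logging.Logger:
    """Return a logger that writes JSON lines to stdout."""
    logger = logging.getLogger(service)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    if not logger.handlers:  # avoid duplicate handlers on repeated calls
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(JsonFormatter(service))
        logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    build_logger("orders-api").info("order %s placed", "A1")
```

Writing to stdout keeps the application unaware of the logging backend, so the same container image works whether logs end up in Elasticsearch or elsewhere.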

Cost Optimization: We ran development, pre-production, and internal operations services (e.g., Prometheus, Grafana, Elasticsearch) on Spot Instances to minimize costs. We scaled clusters down during off-peak hours to cut costs further, and migrated from New Relic to Grafana and Prometheus, saving ₹50,000–₹60,000 per month.
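
The off-peak scale-down can be pictured as a simple time-of-day rule. The case study does not say how the schedule was implemented or which hours and sizes were used, so the function below is a hypothetical sketch: `desired_nodes` and all the times and node counts in the usage example are assumptions chosen for illustration.

```python
from datetime import time


def desired_nodes(now: time, peak_start: time, peak_end: time,
                  peak_nodes: int, off_peak_nodes: int) -> int:
    """Return the node count to run at the given time of day.

    The peak window is [peak_start, peak_end) and may wrap past midnight.
    """
    if peak_start <= peak_end:
        in_peak = peak_start <= now < peak_end
    else:
        # Window wraps midnight, e.g. 20:00 to 02:00.
        in_peak = now >= peak_start or now < peak_end
    return peak_nodes if in_peak else off_peak_nodes


if __name__ == "__main__":
    # Hypothetical schedule: 6 nodes from 08:00 to 22:00, 2 nodes otherwise.
    for hour in (7, 12, 23):
        print(hour, desired_nodes(time(hour), time(8), time(22), 6, 2))
```

A scheduled job could call a rule like this and apply the result to the cluster's node group or workload replica counts.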

Application Modernization: We developed Dockerfiles for all Python and PHP services, standardizing deployment and enhancing portability. We configured PHP services with php-fpm and NGINX for efficient resource usage.

Enhanced Monitoring and Security: We set up Grafana and Prometheus for monitoring and alerting, alongside Elastic APM for detailed application performance insights. We replaced IAM users with IAM SSO, improving security and simplifying user management, and integrated IRSA (IAM Roles for Service Accounts) so that each service receives its own appropriately scoped permissions.
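
IRSA works by letting an IAM role trust a specific Kubernetes service account through the cluster's OIDC identity provider. The sketch below builds such a trust policy as a Python dictionary. The function name `irsa_trust_policy` and the account ID, OIDC provider ID, namespace, and service account in the usage example are placeholders, not values from Zeno's environment.

```python
import json


def irsa_trust_policy(account_id: str, oidc_provider: str,
                      namespace: str, service_account: str) -> dict:
    """Build an IAM trust policy that one service account can assume.

    `oidc_provider` is the cluster's OIDC issuer URL without the
    `https://` prefix.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": (
                        f"arn:aws:iam::{account_id}:oidc-provider/{oidc_provider}"
                    )
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        # Restrict the role to exactly one service account.
                        f"{oidc_provider}:sub": (
                            f"system:serviceaccount:{namespace}:{service_account}"
                        ),
                        f"{oidc_provider}:aud": "sts.amazonaws.com",
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    policy = irsa_trust_policy(
        account_id="111122223333",  # placeholder account ID
        oidc_provider="oidc.eks.ap-south-1.amazonaws.com/id/EXAMPLE",  # placeholder
        namespace="payments",        # hypothetical namespace
        service_account="payments-api",  # hypothetical service account
    )
    print(json.dumps(policy, indent=2))
```

The matching Kubernetes service account is then annotated with `eks.amazonaws.com/role-arn` set to the role's ARN, so pods using that service account receive the role's credentials rather than the node's.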

Consolidation and Regional Optimization: We migrated development environments to a new AWS account to track their costs separately and to improve security. We moved services from DigitalOcean to AWS, consolidating infrastructure for easier management, and deployed development and pre-production environments in the us-east-1 region for cost savings. We set up MongoDB as a high-availability replica set and configured Pritunl VPN for secure database access.
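
A replica set stays available as long as a majority of its voting members can elect a primary. The case study does not state how many members Zeno's replica set has, so the sketch below uses hypothetical host names and a three-member example to show the majority arithmetic and how a client connection string names the replica set.

```python
from urllib.parse import urlencode


def majority(members: int) -> int:
    """Votes needed to elect a primary in a replica set of this size."""
    return members // 2 + 1


def tolerated_failures(members: int) -> int:
    """Members that can be lost while a primary can still be elected."""
    return members - majority(members)


def replica_set_uri(hosts: list[str], database: str, replica_set: str) -> str:
    """Build a MongoDB connection string listing every replica set member."""
    query = urlencode({"replicaSet": replica_set})
    return f"mongodb://{','.join(hosts)}/{database}?{query}"


if __name__ == "__main__":
    # Hypothetical hosts, database, and replica set name.
    hosts = ["mongo-0:27017", "mongo-1:27017", "mongo-2:27017"]
    print(replica_set_uri(hosts, "app", "rs0"))
    for n in (3, 4, 5):
        print(n, "members tolerate", tolerated_failures(n), "failure(s)")
```

The arithmetic shows why odd member counts are preferred: a fourth member adds cost but tolerates no more failures than three do.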

Results

Scalability and Reliability: Services now autoscale to handle thousands of users and restart automatically when they fail, keeping downtime minimal.

Significant Cost Savings: We used Spot Instances to reduce costs for development, pre-production, and internal operations services, and eliminated ₹50,000–₹60,000 in monthly costs by replacing New Relic with Grafana and Prometheus.

Improved Security and Manageability: We centralized user access management with IAM SSO, secured service credentials through IRSA, and provided encrypted database access via VPN.

Operational Efficiency: We fully automated environment setup with Terraform, making environments easier to replicate and manage. CI/CD pipelines enabled rapid, reliable deployments with zero downtime.

Infrastructure Consolidation: We unified infrastructure on AWS, simplifying resource management and cost tracking.

Zeno Health is now well-positioned to support its vision of creating a holistic healthcare ecosystem with a scalable, secure, and cost-efficient infrastructure.

Timeline

Duration: 4 months
Team Size: 2 engineers
Client Industry: Health Tech and Ecommerce