Cost Optimization: Caching Secrets and DynamoDB with Redis/Valkey
In applications that make heavy use of Secrets Manager and DynamoDB, operational costs can skyrocket quickly if no optimization strategy is applied. In this article, I share how we implemented a caching layer with Redis (and later Valkey) that cut the cost of these services from $300 to $45 per month, while also improving the latency of our applications.
The problem: Disproportionate costs from API calls
Our architecture consisted of various components that frequently accessed secrets and configuration data:
Affected components:
- Async workers: Processes that query secrets on every execution
- Web platform: REST API that accesses configurations stored in DynamoDB
- Microservices: Different services that need connection credentials
The main problem? Every time a worker or the web platform needed a secret or a configuration value, it made a direct request to AWS Secrets Manager or DynamoDB. This meant:
- Costs per API call: Thousands of daily requests that add up on the bill
- Additional latency: Each request requires a round-trip to AWS
- External dependency: Availability depended on AWS services
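To make this concrete, the uncached access pattern looked roughly like this (a Python/boto3 sketch; the secret ID, table name, and helpers are illustrative, not our actual code):

```python
import boto3

secrets = boto3.client("secretsmanager")
dynamodb = boto3.resource("dynamodb")

def get_db_credentials() -> str:
    # Every call hits Secrets Manager directly: billed per request,
    # plus a full round-trip to AWS on every worker execution.
    response = secrets.get_secret_value(SecretId="prod/db-credentials")
    return response["SecretString"]

def get_feature_config(feature: str):
    # Same problem for configuration reads: one DynamoDB request per lookup.
    table = dynamodb.Table("app-config")
    item = table.get_item(Key={"feature": feature})
    return item.get("Item")
```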
The solution: Caching layer with Redis
The strategy was simple but effective: interpose a caching layer between the application and AWS services.
Initial architecture: Redis in-cluster
The first implementation used Redis deployed inside the Kubernetes cluster:
Characteristics:
- Deployed as a StatefulSet inside the EKS cluster
- Persistence with EBS volumes
- TTL (Time To Live) configuration for keys
Immediate benefits:
- Drastic reduction in requests: The vast majority of queries were served from the cache
- Latency improvement: Local queries were significantly faster
- Lower cost: Running Redis inside the cluster cost a negligible amount compared to the savings
Implementation: Cache-Aside Pattern
For the implementation, we used the Cache-Aside pattern (also known as Lazy Loading):
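A minimal sketch of the pattern for Secrets Manager, assuming Python with redis-py and boto3 (the key naming and Redis endpoint are illustrative, not our production code):

```python
import json

import boto3
import redis

cache = redis.Redis(host="redis.cache.svc.cluster.local", port=6379)
secrets = boto3.client("secretsmanager")

TTL_SECONDS = 3600  # 1 hour, per our TTL strategy below

def get_secret(secret_id: str) -> dict:
    cache_key = f"secret:{secret_id}"

    # 1. Try the cache first.
    cached = cache.get(cache_key)
    if cached is not None:
        return json.loads(cached)

    # 2. On a miss, read from Secrets Manager (the source of truth)...
    response = secrets.get_secret_value(SecretId=secret_id)
    value = json.loads(response["SecretString"])

    # 3. ...and populate the cache with a TTL so stale entries expire.
    cache.setex(cache_key, TTL_SECONDS, json.dumps(value))
    return value
```

The same wrapper works for DynamoDB reads; only the miss path (step 2) changes.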
TTL strategy:
- Secrets Manager and DynamoDB: 1 hour (3600 seconds) - Balance between freshness and savings
Migration to Valkey (AWS)
Once we validated that the caching approach worked correctly, we decided to migrate to Valkey managed by AWS (ElastiCache).
Why Valkey?
Valkey is the open source fork that emerged after Redis changed its license. AWS adopted it as the default option for ElastiCache.
Migration advantages:
- Reduced management: No need to manage cache infrastructure
- High availability: Automatic replication and failover
- Scalability: Easy to scale vertically or horizontally
- Monitoring: Native integration with CloudWatch
- Security: Encryption in transit and at rest
Implemented configuration:
- Node type: `cache.t4g.small` (Graviton instances for cost savings)
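Because Valkey remains protocol-compatible with Redis, the application-side change was minimal: point the same client at the ElastiCache endpoint. A sketch, with an illustrative endpoint:

```python
import redis

# Valkey speaks the Redis protocol, so redis-py works unchanged;
# only the endpoint and TLS settings differ from the in-cluster setup.
cache = redis.Redis(
    host="my-valkey.xxxxxx.cache.amazonaws.com",  # illustrative endpoint
    port=6379,
    ssl=True,            # encryption in transit is enabled on the ElastiCache side
    socket_timeout=2,    # fail fast so the fallback path can kick in
)
```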
Quantifiable results
Cost savings
Before (without cache):
- Secrets Manager: $250/month
- DynamoDB: $50/month (mainly API calls)
- Total: $300/month
After (with Valkey):
- Secrets Manager: $25/month (90% reduction)
- DynamoDB: $10/month (80% reduction)
- Valkey (ElastiCache): $10/month
- Total: $45/month
Total savings: $255/month (85% reduction)
Performance improvement
Beyond the economic savings, the performance improvements have been significant:
- Query latency: Significantly lower for secret and configuration reads
- Throughput: Increased capacity to handle more requests per second
- User experience: APIs respond faster thanks to local cache
Operational benefits
- Fewer external dependencies: Most requests never have to reach Secrets Manager or DynamoDB
- Less throttling: A significant reduction in rate limiting from Secrets Manager and DynamoDB
- Scalability: The system can handle more load with the same resources
Lessons learned
1. Not everything needs to be cached
It’s important to analyze which data really benefits from cache:
- High read frequency, low write frequency: ✅ Perfect for cache
- Constantly changing data: ❌ Better to access directly
- Critical security secrets: Consider very short TTLs
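One way to encode these trade-offs is a per-category TTL policy that the cache wrapper consults; a sketch with illustrative values:

```python
# Illustrative TTLs per data category; tune these to your own freshness needs.
TTL_BY_CATEGORY = {
    "static_config": 3600,     # read-heavy, rarely written: cache aggressively
    "db_credentials": 300,     # security-sensitive: keep the exposure window short
    "realtime_counters": 0,    # changes constantly: 0 means skip the cache entirely
}

def ttl_for(category: str) -> int:
    return TTL_BY_CATEGORY.get(category, 60)  # conservative default
```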
2. Size matters
The Valkey instance size should be adjusted based on:
- Volume of data to cache
- Access pattern (read-heavy vs write-heavy)
- Latency requirements
In our case, a `cache.t4g.small` node has been sufficient, but continuous monitoring is necessary.
3. Monitoring is key
We have implemented alerts for:
- Cache hit rate: Should be >90%
- Evictions: If they increase, memory needs to be expanded
- Connection errors: Indicate problems in Valkey
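As an illustration of one of these alerts, here is a sketch that creates a CloudWatch alarm on the ElastiCache `Evictions` metric with boto3 (the cluster ID, threshold, and SNS topic are illustrative):

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

# Alert as soon as evictions start happening: a sustained nonzero
# eviction count means the node is running out of memory.
cloudwatch.put_metric_alarm(
    AlarmName="valkey-evictions",
    Namespace="AWS/ElastiCache",
    MetricName="Evictions",
    Dimensions=[{"Name": "CacheClusterId", "Value": "my-valkey-cluster"}],
    Statistic="Sum",
    Period=300,
    EvaluationPeriods=3,
    Threshold=0,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:eu-west-1:123456789012:cache-alerts"],
)
```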
4. Planning the fallback
We should always have a plan in case Valkey goes down:
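A minimal sketch of that fallback, building on the cache-aside wrapper above (again assuming redis-py and boto3; names are illustrative):

```python
import json
import logging

import redis

logger = logging.getLogger(__name__)

def get_secret_with_fallback(secret_id: str) -> dict:
    cache_key = f"secret:{secret_id}"
    try:
        cached = cache.get(cache_key)
        if cached is not None:
            return json.loads(cached)
    except redis.exceptions.RedisError:
        # Cache is down: log it and go straight to AWS.
        logger.warning("Valkey unavailable, falling back to Secrets Manager")

    response = secrets.get_secret_value(SecretId=secret_id)
    value = json.loads(response["SecretString"])

    try:
        cache.setex(cache_key, TTL_SECONDS, json.dumps(value))
    except redis.exceptions.RedisError:
        pass  # Best effort: never let a cache failure break the request
    return value
```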
This strategy ensures the system continues to function even if the cache fails.
Conclusion
Implementing a caching layer with Redis/Valkey between our application and AWS services (Secrets Manager and DynamoDB) has been one of the most impactful optimizations we’ve performed.
Final results:
- Economic savings: $255/month (85% reduction)
- Performance improvement: Significant latency reduction in queries
- Better user experience: Much more responsive APIs
- Greater capacity: Higher throughput with the same resources
The key to success has been:
- Analyze the usage pattern before implementing
- Start simple (in-cluster Redis) and then scale (AWS-managed Valkey)
- Implement appropriate TTL and invalidation strategies
- Monitor continuously to optimize
- Plan a fallback to guarantee availability
For any application that makes intensive use of Secrets Manager, DynamoDB, or any AWS service billed per API call, a cache like Valkey is almost certainly one of the best investments you can make. The return is immediate, in both cost and performance.
If your application has a high monthly bill in these services, I encourage you to try this strategy. The results, as we’ve seen, can be surprising.