
Cost Optimization: Caching Secrets and DynamoDB with Redis/Valkey

March 19, 2025

In applications that make intensive use of Secrets Manager and DynamoDB, operational costs can skyrocket quickly if proper optimization strategies are not applied. In this article, I share how we implemented a caching solution with Redis (and later Valkey) that allowed us to cut the cost of these services from $300 to $45 per month, while also improving the latency of our applications.

The problem: Disproportionate costs in API calls

Our architecture consisted of various components that frequently accessed secrets and configuration data:

Affected components:

  1. The background workers
  2. The web application

The main problem? Every time a worker or the web app needed a secret or a configuration value, it made a direct request to AWS Secrets Manager or DynamoDB. This meant:

  1. A billable API call for every single lookup
  2. Extra network latency added to every request
  3. A growing risk of throttling from Secrets Manager and DynamoDB as load increased
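
For illustration, this is roughly what every lookup looked like before the cache; the secret name here is a placeholder:

import boto3

secrets_manager = boto3.client("secretsmanager")

def get_db_password():
    # Every call here is a billable request to Secrets Manager
    return secrets_manager.get_secret_value(SecretId="db-password")["SecretString"]

Called from every worker on every job, requests like this add up fast.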

The solution: Caching layer with Redis

The strategy was simple but effective: interpose a caching layer between the application and AWS services.

Initial architecture: Redis in-cluster

The first implementation used Redis deployed inside the Kubernetes cluster:

Characteristics:

  1. Redis deployed as a service inside the same Kubernetes cluster as the workers and the web app
  2. Negligible infrastructure cost
  3. Cache lookups that never leave the cluster network

Immediate benefits:

  1. Drastic reduction of requests: The vast majority of queries were served from cache
  2. Latency improvement: Local queries were significantly faster
  3. Lower cost: The cost of running Redis inside the cluster was negligible compared to the savings

Implementation: Cache-Aside Pattern

For the implementation, we used the Cache-Aside pattern (also known as Lazy Loading):

# Cache-Aside pattern (connection details here are illustrative)
import boto3
import redis as redis_lib

redis = redis_lib.Redis(host="localhost", port=6379, decode_responses=True)
secrets_manager = boto3.client("secretsmanager")

def get_secret(secret_name):
    # 1. First, try to get the value from cache
    cached_value = redis.get(f"secret:{secret_name}")
    if cached_value is not None:
        return cached_value

    # 2. On a cache miss, go to the source
    response = secrets_manager.get_secret_value(SecretId=secret_name)
    secret_value = response["SecretString"]

    # 3. Store it in cache with an appropriate TTL (one hour here)
    redis.setex(f"secret:{secret_name}", 3600, secret_value)

    return secret_value
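
The same pattern covers the DynamoDB side. Here is a sketch assuming a hypothetical app-config table keyed by pk; the table name and key schema are illustrative:

import json
import boto3

# Reuses the `redis` client from the snippet above
dynamodb = boto3.resource("dynamodb")
config_table = dynamodb.Table("app-config")  # hypothetical table name

def get_config(config_key):
    # 1. Try the cache first
    cached = redis.get(f"config:{config_key}")
    if cached is not None:
        return json.loads(cached)

    # 2. Cache miss: one billable read against DynamoDB
    item = config_table.get_item(Key={"pk": config_key}).get("Item", {})

    # 3. Cache the serialized item (default=str handles DynamoDB's Decimal values)
    redis.setex(f"config:{config_key}", 300, json.dumps(item, default=str))
    return item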

TTL strategy: in the snippet above, secrets are cached for one hour (3600 seconds); the right TTL depends on how often each value changes at the source.
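
One way to vary the TTL by key type; the prefixes and values below are illustrative, not our exact settings:

# Illustrative TTLs: long for rarely-rotated secrets, short for volatile config
TTL_BY_PREFIX = {
    "secret:": 3600,
    "config:": 300,
}

def ttl_for(key):
    for prefix, ttl in TTL_BY_PREFIX.items():
        if key.startswith(prefix):
            return ttl
    return 600  # conservative default for anything else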

Migration to Valkey (AWS)

Once we validated that the caching approach worked correctly, we decided to migrate to Valkey managed by AWS (ElastiCache).

Why Valkey?

Valkey is the open source fork of Redis that emerged after Redis changed its license. AWS adopted it as the default option for ElastiCache.

Migration advantages:

  1. Reduced management: No need to manage cache infrastructure
  2. High availability: Automatic replication and failover
  3. Scalability: Easy to scale vertically or horizontally
  4. Monitoring: Native integration with CloudWatch
  5. Security: Encryption in transit and at rest

Implemented configuration: a single cache.t4g.small node, which (as discussed below) has been sufficient for our workload.
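
Because Valkey is protocol-compatible with Redis, the client code barely changes: conceptually, the switch is just pointing the existing client at the ElastiCache endpoint. A sketch with a placeholder endpoint, assuming encryption in transit (TLS) is enabled on the cluster:

import redis as redis_lib

# Placeholder ElastiCache (Valkey) endpoint; ssl=True matches encryption in transit
redis = redis_lib.Redis(
    host="my-valkey.xxxxxx.use1.cache.amazonaws.com",
    port=6379,
    ssl=True,
    decode_responses=True,
)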

Quantifiable results

Cost savings

Before (without cache): roughly $300/month in Secrets Manager and DynamoDB API calls

After (with Valkey): roughly $45/month

Total savings: $255/month (an 85% reduction)

Performance improvement

Beyond the economic savings, the performance improvements have been significant: queries served from cache resolve locally and are significantly faster than direct calls to AWS.

Operational benefits

  1. Reduction of external dependencies: Most requests never reach the AWS APIs
  2. Better user experience: APIs respond faster
  3. Fewer throttling issues: Significant reduction in throttling from Secrets Manager and DynamoDB
  4. Scalability: The system can handle more load with the same resources

Lessons learned

1. Not everything needs to be cached

It’s important to analyze which data really benefits from caching. Secrets and configuration values are ideal candidates: they are read constantly but change rarely. Data that changes on nearly every request gains little from a cache and adds staleness risk.

2. Size matters

The Valkey instance size should be adjusted based on:

  1. The volume of data to be cached
  2. The request throughput the cache must absorb
  3. The memory headroom needed to avoid evictions

In our case, a cache.t4g.small has been sufficient, but continuous monitoring is necessary.

3. Monitoring is key

We have implemented alerts for:

  1. Memory usage and evictions
  2. Cache hit rate dropping below expected levels
  3. CPU utilization and connection errors
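
As one illustration, an eviction alarm can be created with boto3 via the native CloudWatch integration; the alarm name, cluster ID, and threshold below are placeholders:

import boto3

cloudwatch = boto3.client("cloudwatch")

cloudwatch.put_metric_alarm(
    AlarmName="valkey-evictions",  # placeholder
    Namespace="AWS/ElastiCache",
    MetricName="Evictions",
    Dimensions=[{"Name": "CacheClusterId", "Value": "my-valkey-cluster"}],  # placeholder
    Statistic="Sum",
    Period=300,
    EvaluationPeriods=1,
    Threshold=0,
    ComparisonOperator="GreaterThanThreshold",
)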

4. Planning the fallback

We should always have a plan in case Valkey goes down:

from redis.exceptions import RedisError

def get_with_fallback(key, fetcher, ttl=3600):
    try:
        # Try to get the value from cache
        cached = redis.get(key)
        if cached is not None:
            return cached
    except RedisError:
        # If Redis/Valkey is unreachable, fall through to the source
        pass

    # Get the value from the source (AWS)
    value = fetcher()

    # Best-effort cache write: a cache failure must not fail the request
    try:
        redis.setex(key, ttl, value)
    except RedisError:
        pass  # Don't fail the operation if the cache fails

    return value

This strategy ensures the system continues to function even if the cache fails.
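
For example, the secret lookup from earlier can be routed through this wrapper, reusing the secrets_manager client defined above; the fetcher is any zero-argument callable, and the secret name is a placeholder:

password = get_with_fallback(
    "secret:db-password",
    lambda: secrets_manager.get_secret_value(SecretId="db-password")["SecretString"],
)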

Conclusion

Implementing a caching layer with Redis/Valkey between our application and AWS services (Secrets Manager and DynamoDB) has been one of the most impactful optimizations we’ve performed.

Final results:

  1. 85% cost reduction ($300/month down to $45/month)
  2. Significantly lower latency for cached lookups
  3. Far fewer throttling incidents with Secrets Manager and DynamoDB

The key to success has been:

  1. Analyze usage pattern before implementing
  2. Start simple (Redis in-cluster) and then scale (Valkey AWS)
  3. Implement appropriate TTL strategies and invalidation
  4. Monitor continuously to optimize
  5. Plan fallback to guarantee availability

For any application that makes intensive use of Secrets Manager, DynamoDB, or any AWS service with costs per API call, using a cache like Valkey is, almost certainly, one of the best investments you can make. The return is immediate, both in costs and performance.

If your application has a high monthly bill in these services, I encourage you to try this strategy. The results, as we’ve seen, can be surprising.