When building serverless web applications, many developers face a common challenge: "How can we efficiently manage real-time API responses alongside static file delivery?"
This becomes especially critical for features like search autocomplete or real-time search, where asynchronous patterns like SQS → Lambda simply won't work: the message is queued for later processing, so the caller never receives the result in the same request.
Today, we'll explore a practical architecture pattern that addresses these concerns. By connecting AWS WAF, CloudFront, API Gateway, and Lambda, we can achieve both real-time response capabilities and unified management in a serverless web application.
Why Do We Need This Pattern?
Traditional Serverless Web App Challenges
Existing serverless web applications often encounter these issues:
Real-Time Response Limitations: While asynchronous patterns like SQS → Lambda work well for batch processing, they're unusable for scenarios requiring immediate responses like search autocomplete or real-time search. When users type character by character, synchronous processing becomes essential.
Complex Routing Management: Separately managing static files and API endpoints often leads to domain separation and complicated CORS configurations. For example, serving the frontend from myapp.com and APIs from api.myapp.com requires CORS configuration because browsers treat the two hosts as different origins.
Distributed Security Management: Managing security settings across WAF, CloudFront, and API Gateway separately makes it difficult to maintain consistent security policies.
Scalability vs Performance Trade-offs: You often have to choose between implementing complex caching strategies for performance or sacrificing performance for scalability.
The Solution: Real-Time Response Serverless Web App Pattern
The pattern designed to solve these challenges is the WAF → CloudFront → API Gateway/S3 Synchronous Processing Architecture.
Architecture Overview
User Browser → WAF → CloudFront ─┬─► API Gateway → Lambda Functions
                                 │
                                 └─► S3 Bucket (Static Files)
Request Flow
- Static files (/*): User → WAF → CloudFront → S3
- API calls (/api/*): User → WAF → CloudFront → API Gateway → Lambda
The key strength of this architecture is serving both static files and APIs from a single domain while providing immediate synchronous API responses and optimizing static files through CDN.
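The request flow above can be modeled in a few lines of plain TypeScript. This is not CloudFront code, only a sketch of how a distribution would pick an origin, assuming a single `/api/*` behavior plus the default behavior. The names `resolveOrigin` and `Origin` are illustrative.

```typescript
// Illustrative model of CloudFront behavior matching for this pattern.
// Assumption: one additional behavior ("/api/*") plus the default behavior.
type Origin = "api-gateway" | "s3";

function resolveOrigin(path: string): Origin {
  // "/api/*" takes precedence over the default behavior whenever the
  // request path starts with "/api/"; everything else falls through to S3.
  return path.startsWith("/api/") ? "api-gateway" : "s3";
}

console.log(resolveOrigin("/api/lambda-1")); // api-gateway
console.log(resolveOrigin("/index.html"));   // s3
```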
Key Benefits of This Pattern
1. Static File Optimization
Static files (the default /* behavior) are cached by CloudFront and served directly from edge locations, so these requests never reach API Gateway or Lambda. Cache hits for images, CSS, and JS files typically return in roughly 10-50ms, and only API requests are forwarded to Lambda, which significantly reduces backend load.
2. Global Performance Optimization
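Through CloudFront's worldwide edge locations, static files are cached close to users. API requests enter the AWS network at the nearest POP (Point of Presence) and are forwarded to the regional API Gateway origin over the AWS backbone, which reduces latency compared with crossing the public internet end to end.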
3. Real-Time Synchronous Processing
Unlike SQS → Lambda patterns, API Gateway → Lambda is a synchronous request-response path: the caller waits for the Lambda result and receives it in the same HTTP response. This suits applications that need autocomplete results as users type, real-time search, or instant data retrieval. SQS patterns store messages in a queue for later processing, so the original caller never receives the result directly.
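As a rough illustration of the synchronous path, here is a minimal autocomplete handler sketch. It assumes an API Gateway proxy integration and a `q` query parameter; the `TERMS` list, the `suggest` helper, and the event and result shapes are hypothetical and trimmed to what the example needs.

```typescript
// Minimal shapes for an API Gateway proxy integration (only what is used here).
interface ProxyEvent {
  queryStringParameters?: Record<string, string | undefined> | null;
}
interface ProxyResult {
  statusCode: number;
  headers: Record<string, string>;
  body: string;
}

// Hypothetical in-memory term list; a real service would query a data store.
const TERMS = ["lambda", "lambda layers", "cloudfront", "cloudwatch", "api gateway"];

// Return up to `limit` terms that start with the typed prefix.
function suggest(prefix: string, limit = 5): string[] {
  const p = prefix.trim().toLowerCase();
  if (p === "") return [];
  return TERMS.filter((t) => t.startsWith(p)).slice(0, limit);
}

// The caller waits for this promise; its result becomes the HTTP response.
// In a real deployment this function would be exported as `handler`.
const handler = async (event: ProxyEvent): Promise<ProxyResult> => {
  const q = event.queryStringParameters?.q ?? "";
  return {
    statusCode: 200,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ suggestions: suggest(q) }),
  };
};

handler({ queryStringParameters: { q: "lam" } }).then((res) => console.log(res.body));
```

Because the response is produced inside the same invocation, there is no queue to poll and no second request to fetch the result.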
4. Single Domain Operation
Because static files and APIs are served from the same domain, browser requests to /api/* are same-origin. CORS configuration becomes largely unnecessary, and cookie/session management becomes easier.
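The difference between the single-domain and split-domain setups comes down to the browser's same-origin check. The sketch below reproduces that check with the standard `URL` class; the `/api/search` path and the `isSameOrigin` helper are made up for illustration.

```typescript
// Two URLs are same-origin when scheme, host, and port all match.
function isSameOrigin(pageUrl: string, requestUrl: string): boolean {
  // Relative request URLs (e.g. "/api/search") resolve against the page URL.
  return new URL(requestUrl, pageUrl).origin === new URL(pageUrl).origin;
}

console.log(isSameOrigin("https://myapp.com/", "/api/search"));                  // true
console.log(isSameOrigin("https://myapp.com/", "https://api.myapp.com/search")); // false
```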
5. Auto-Scaling
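Lambda scales automatically within the account's concurrent execution limit, and API Gateway can handle thousands of requests per second (the default account-level throttle is 10,000 requests per second per Region and can be raised on request).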
6. Unified Security
AWS WAF inspects all traffic at the CloudFront level and helps block DDoS-style request floods, SQL injection, XSS, and other attacks. Rate-based rules can limit requests per IP (the AWS console suggests examples like 2,000 requests per 5 minutes), and the threshold can be adjusted to suit the application.
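A rate-based rule like the one described could look roughly as follows. The object mirrors the rule shape accepted by WAFv2 web ACLs (for example through CDK's `wafv2.CfnWebACL`), but the rule name, priority, and metric name are placeholders rather than values taken from the repository.

```typescript
// Sketch of a WAFv2 rate-based rule. Name, priority, and metric name are hypothetical.
const rateLimitRule = {
  name: "RateLimitPerIp",
  priority: 1,
  action: { block: {} },        // block clients that exceed the limit
  statement: {
    rateBasedStatement: {
      limit: 2000,              // requests per evaluation window (5 minutes by default)
      aggregateKeyType: "IP",   // count requests per source IP
    },
  },
  visibilityConfig: {
    sampledRequestsEnabled: true,
    cloudWatchMetricsEnabled: true,
    metricName: "RateLimitPerIp",
  },
};

console.log(rateLimitRule.statement.rateBasedStatement.limit); // 2000
```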
Real Implementation Example
Let's examine a practical example from the GitHub repository:
Project Structure
cdk-serverless-web-app-pattern/
├── bin/
│   └── app.ts              # App entry point
├── lib/
│   └── stack.ts            # All stack definitions
├── lambda/
│   ├── sample-lambda-1/
│   │   └── index.js        # Lambda function handler
│   └── sample-lambda-2/
│       └── index.js        # Lambda function handler
├── web/
│   └── index.html          # Frontend application
└── README.md
Core Implementation Elements
1. Cross-Region Setup
# WAF is deployed in us-east-1, as required for CloudFront-scoped web ACLs
# Other resources are deployed in Sydney (ap-southeast-2)
cdk bootstrap aws://account-number/us-east-1
cdk bootstrap aws://account-number/ap-southeast-2
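On the CDK side, a two-region layout might be declared along these lines. This is a sketch that assumes `aws-cdk-lib` v2 is installed; the stack IDs are placeholders and the repository's actual stack classes may be organized differently.

```typescript
import * as cdk from "aws-cdk-lib";

const app = new cdk.App();

// Hypothetical stack IDs. CloudFront-scoped WAF web ACLs must live in us-east-1.
const wafStack = new cdk.Stack(app, "WafStack", {
  env: { region: "us-east-1" },
  crossRegionReferences: true, // allow stacks in other regions to consume this stack's values
});

const frontendStack = new cdk.Stack(app, "FrontendStack", {
  env: { region: "ap-southeast-2" },
  crossRegionReferences: true,
});

// Deploy the WAF stack first so its web ACL exists before CloudFront needs it.
frontendStack.addDependency(wafStack);

app.synth();
```

Setting `crossRegionReferences` on both stacks is what lets the web ACL ARN from us-east-1 be passed to a distribution defined in the Sydney stack.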
2. Stack Separation
- FrontendStack: Manages S3 and CloudFront deployment
- BackendStack: Manages API Gateway and Lambda functions
- WafStack: Security rules and WAF configuration
This separation allows frontend and backend teams to deploy independently.
3. Automatic Cache Invalidation
// Automatic CloudFront cache invalidation after S3 deployment
distributionPaths: ['/*']
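For context, `distributionPaths` is a property of the CDK `BucketDeployment` construct. The sketch below shows one way the surrounding code could look, assuming a recent `aws-cdk-lib` v2 release (for `S3BucketOrigin.withOriginAccessControl`) and a local `./web` directory; the construct IDs are placeholders, not the repository's.

```typescript
import * as cdk from "aws-cdk-lib";
import * as s3 from "aws-cdk-lib/aws-s3";
import * as s3deploy from "aws-cdk-lib/aws-s3-deployment";
import * as cloudfront from "aws-cdk-lib/aws-cloudfront";
import * as origins from "aws-cdk-lib/aws-cloudfront-origins";

const app = new cdk.App();
const stack = new cdk.Stack(app, "FrontendStack"); // hypothetical stack ID

const siteBucket = new s3.Bucket(stack, "SiteBucket");
const distribution = new cloudfront.Distribution(stack, "Distribution", {
  defaultBehavior: { origin: origins.S3BucketOrigin.withOriginAccessControl(siteBucket) },
  defaultRootObject: "index.html",
});

// Upload ./web to the bucket, then invalidate every cached path.
new s3deploy.BucketDeployment(stack, "DeployWeb", {
  sources: [s3deploy.Source.asset("./web")],
  destinationBucket: siteBucket,
  distribution,
  distributionPaths: ["/*"],
});
```

Invalidating `/*` on every deployment is simple but broad; narrower paths would work the same way if only some files change.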
Performance Optimization Strategies
Lambda Cold Start Minimization
Memory Allocation Optimization: Allocate Lambda function memory deliberately to reduce cold start times. Lambda assigns CPU in proportion to configured memory, so a higher memory setting shortens initialization time.
Deployment Package Size Minimization: Keep deployment packages small (well under the 50MB limit for zipped direct uploads) and remove unnecessary dependencies to reduce loading time.
Provisioned Concurrency: Maintain pre-warmed execution environments for critical functions.
CloudFront Caching Optimization
Static File Caching: Cache HTML, CSS, JS, and image files for extended periods while handling updates through version management strategies.
API Caching Disabled: Disable caching for /api/* paths to always return fresh data.
additionalBehaviors: {
'/api/*': {
origin: apiGatewayOrigin,
cachePolicy: cloudfront.CachePolicy.CACHING_DISABLED,
}
}
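One possible "version management strategy" for static files is to put a content hash in asset filenames and choose `Cache-Control` headers by path. The helper below is a sketch of that idea under assumed naming conventions (an 8+ character hex hash before the extension); it is not taken from the repository, and it deliberately revalidates unhashed files such as index.html so new deployments show up quickly.

```typescript
// Assumption: hashed assets look like "app.3f9a1c2b.js" and never change once published.
const HASHED_ASSET = /\.[0-9a-f]{8,}\.(js|css|png|jpg|svg|woff2)$/;

function cacheControlFor(path: string): string {
  if (path.startsWith("/api/")) return "no-store"; // never cache API responses
  if (HASHED_ASSET.test(path)) return "public, max-age=31536000, immutable";
  return "no-cache"; // e.g. index.html: cacheable, but always revalidated
}

console.log(cacheControlFor("/assets/app.3f9a1c2b.js")); // public, max-age=31536000, immutable
console.log(cacheControlFor("/index.html"));             // no-cache
console.log(cacheControlFor("/api/lambda-1"));           // no-store
```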
When to Use This Pattern
Suitable Use Cases
Real-time response applications: Search autocomplete, real-time search, instant data retrieval, and other interactive features requiring immediate user input response
SPA (Single Page Applications): Applications built with React, Vue, or Angular where static files are served via CDN and APIs through serverless
Media and portfolio sites: Services with predominantly static content (images/videos) but requiring interactive features like comments, view counts, and search with immediate responses
Global services: Services targeting worldwide users requiring regional performance optimization
Unified security management services: Enterprise applications requiring integrated security management through WAF
Constraints to Consider
API Response Delays: Lambda cold starts can add anywhere from a few hundred milliseconds to about 2 seconds to the first API call. JVM-based runtimes such as Java, heavy library initialization, or VPC configurations may extend this further, though Provisioned Concurrency or Lambda SnapStart (supported for Java) can largely mitigate the issue.
Complexity: Combining multiple AWS services makes initial setup and debugging more complex.
Cost: For small-scale services, costs may be higher than EC2 or container-based solutions.
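Response Size Limits: API Gateway caps payloads at 10MB and Lambda caps synchronous invocation payloads at 6MB, so the lower 6MB limit is the one that applies to responses returned through API Gateway → Lambda.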
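A handler can guard against the size constraint before returning. The check below is a simplified sketch: it measures only the serialized body and leaves no headroom for headers or the response envelope, so a real guard would probably use a lower threshold. The helper name is hypothetical.

```typescript
// Lambda's synchronous response payload limit (6 MB), the tighter of the two limits.
const LAMBDA_RESPONSE_LIMIT_BYTES = 6 * 1024 * 1024;

// Measure the UTF-8 size of the serialized body, not its character count.
function fitsInLambdaResponse(body: string): boolean {
  return new TextEncoder().encode(body).length <= LAMBDA_RESPONSE_LIMIT_BYTES;
}

console.log(fitsInLambdaResponse(JSON.stringify({ suggestions: ["lambda"] }))); // true
```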
Cost Efficiency Analysis
Looking at actual operational costs, this pattern shows significant cost savings under specific conditions:
Core Cost Reduction Benefits: Serving the many static file requests (JS, CSS, images, etc.) through inexpensive CloudFront instead of API Gateway and Lambda avoids API Gateway request charges ($3.50 per million requests) as well as Lambda invocation and execution charges. Data transfer is also slightly cheaper through CloudFront ($0.085/GB) than through API Gateway ($0.090/GB), and CloudFront includes a 1TB free tier.
Scalability: With serverless pay-per-use characteristics, you can operate almost free during low traffic periods.
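To make the savings concrete, here is a small worked calculation using only the prices quoted above. It is a rough sketch: it ignores CloudFront's own request charges, Lambda charges, and the free tier, and the prices should be checked against current AWS pricing.

```typescript
// Prices quoted in the text above; verify against current AWS pricing.
const API_GATEWAY_USD_PER_MILLION_REQUESTS = 3.5;
const CLOUDFRONT_USD_PER_GB = 0.085;
const API_GATEWAY_USD_PER_GB = 0.09;

// Request charges avoided by keeping static requests off API Gateway.
function apiGatewayRequestCost(requests: number): number {
  return (requests / 1000000) * API_GATEWAY_USD_PER_MILLION_REQUESTS;
}

// Data transfer saved by sending the same volume through CloudFront.
function transferSavings(gigabytes: number): number {
  return gigabytes * (API_GATEWAY_USD_PER_GB - CLOUDFRONT_USD_PER_GB);
}

console.log(apiGatewayRequestCost(10000000).toFixed(2)); // 35.00
console.log(transferSavings(1000).toFixed(2));           // 5.00
```

In other words, 10 million static requests would cost about $35 in API Gateway request charges alone, while the data transfer difference on 1,000GB is only about $5, so the request charges dominate the saving.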
Monitoring and Operations
Key Metrics
IntegrationLatency: Monitor the time between API Gateway forwarding a request to Lambda and receiving the response to identify backend bottlenecks.
CloudFront Cache Hit Rate: Measure performance and cost efficiency through cache hit rates.
Lambda Duration & Cold Start: Track function execution time and cold start frequency.
WAF Blocked Requests: Monitor security threat blocking status.
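Cold start frequency can be tracked from inside the function itself. The sketch below relies on the fact that module-scope code runs once per execution environment; `recordInvocation` and the log shape are hypothetical, and the log line is meant to be queried later in CloudWatch Logs.

```typescript
// Module scope runs once per execution environment, so this flag is true
// only for the first invocation after a cold start.
let isColdStart = true;

interface InvocationLog {
  coldStart: boolean;
  durationMs: number;
}

// Hypothetical helper: wrap the real work and emit one structured log line.
function recordInvocation(work: () => void): InvocationLog {
  const startedAt = Date.now();
  const coldStart = isColdStart;
  isColdStart = false;
  work();
  const log = { coldStart, durationMs: Date.now() - startedAt };
  console.log(JSON.stringify(log));
  return log;
}
```

Counting log lines with `coldStart: true` against total invocations gives a cold start rate for the function.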
Recommended Alarm Settings
For operational stability, we recommend setting up these additional CloudWatch alarms:
- API Gateway 5XX error rate > 1%: Detects Lambda function errors or timeouts
- Lambda error rate > 0.1%: Detects internal function logic errors
- CloudFront Origin 5XX error rate > 0.5%: Detects S3 or API Gateway origin errors
- Average response time > 2 seconds: Detects frequent cold starts or performance degradation
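The thresholds above can be written down as data, which is useful for reviewing them or for testing alerting logic locally before creating the actual CloudWatch alarms. The field names and the `breachedAlarms` helper are illustrative, not CloudWatch metric names.

```typescript
// Metric snapshot; rates are fractions (0.01 = 1%). Field names are illustrative.
interface Snapshot {
  apiGateway5xxRate: number;
  lambdaErrorRate: number;
  cloudFrontOrigin5xxRate: number;
  avgResponseSeconds: number;
}

// Thresholds copied from the list above.
const THRESHOLDS: Record<keyof Snapshot, number> = {
  apiGateway5xxRate: 0.01,
  lambdaErrorRate: 0.001,
  cloudFrontOrigin5xxRate: 0.005,
  avgResponseSeconds: 2,
};

// Return the names of every metric that exceeds its threshold.
function breachedAlarms(s: Snapshot): (keyof Snapshot)[] {
  return (Object.keys(THRESHOLDS) as (keyof Snapshot)[]).filter(
    (name) => s[name] > THRESHOLDS[name]
  );
}

console.log(
  breachedAlarms({
    apiGateway5xxRate: 0.02,
    lambdaErrorRate: 0,
    cloudFrontOrigin5xxRate: 0.001,
    avgResponseSeconds: 2.5,
  })
); // [ 'apiGateway5xxRate', 'avgResponseSeconds' ]
```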
Try It Yourself
To experience this pattern hands-on, you can use the following GitHub repository:
GitHub Repository: https://github.com/jaeneungsim/cdk-serverless-web-app-pattern
Quick Start
# Clone repository
git clone https://github.com/jaeneungsim/cdk-serverless-web-app-pattern.git
cd cdk-serverless-web-app-pattern
# Install dependencies
npm install
# Bootstrap both regions
cdk bootstrap aws://account-number/us-east-1
cdk bootstrap aws://account-number/ap-southeast-2
# Deploy
cdk deploy --all
After deployment, you can access the CloudFront domain and test these endpoints:
- /api/lambda-1 (GET): Returns greeting from Lambda 1
- /api/lambda-2 (GET): Returns greeting from Lambda 2
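A small smoke test for these endpoints could be sketched as follows. The domain `example.cloudfront.net` is a placeholder for the CloudFront domain printed after deployment, and the fetch function is injected so the example runs without network access; in Node 18+ the built-in `fetch` could be passed instead of the stub.

```typescript
// Minimal fetch-like signature so the sketch does not depend on a specific runtime.
type FetchLike = (url: string) => Promise<{ status: number }>;

const ENDPOINT_PATHS = ["/api/lambda-1", "/api/lambda-2"];

// Build absolute URLs from the CloudFront domain.
function endpointUrls(cloudFrontDomain: string): string[] {
  return ENDPOINT_PATHS.map((p) => `https://${cloudFrontDomain}${p}`);
}

// Resolve to true only if every endpoint answers with HTTP 200.
async function smokeTest(cloudFrontDomain: string, fetchFn: FetchLike): Promise<boolean> {
  const responses = await Promise.all(endpointUrls(cloudFrontDomain).map((u) => fetchFn(u)));
  return responses.every((r) => r.status === 200);
}

// Stubbed fetch for illustration; replace with a real fetch to hit the deployment.
smokeTest("example.cloudfront.net", async () => ({ status: 200 })).then((ok) => console.log(ok));
```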
Extensibility
Based on this foundation pattern, you can add the following features:
Database Integration: Connect DynamoDB or RDS for dynamic data processing
Authentication System: User authentication through Cognito or Auth0
Custom Domains: HTTPS custom domains through Route53 and ACM
CI/CD Pipelines: Automated deployment through CodePipeline or GitHub Actions
Environment Separation: Development, staging, and production environment configuration
Conclusion
The WAF → CloudFront → API Gateway → Lambda synchronous processing pattern is highly suitable for serverless web applications requiring real-time responses with unified management.
It particularly excels in projects meeting these conditions:
- Services requiring real-time responses (search autocomplete, real-time search, interactive APIs)
- Services with extensive static content (SPAs, media sites, etc.)
- Global user base (services targeting worldwide users)
- Security-critical enterprise applications
While Lambda cold start issues still exist along with initial setup complexity and other constraints, these problems can be effectively managed through Provisioned Concurrency, proper optimization strategies, and monitoring.
By leveraging serverless architecture's auto-scaling and cost efficiency alongside CloudFront's global performance optimization, this pattern is worth serious consideration for building scalable and high-performance web applications.