Why Rate Limiting Matters for Node.js SaaS
A SaaS API usually needs several types of limits at the same time:
- Abuse limits to prevent bots, scraping, brute-force login attempts, and broken clients.
- Infrastructure protection to keep Node.js workers, databases, queues, and third-party APIs from being overloaded.
- Product quotas to enforce free, pro, team, and enterprise plan boundaries.
- Cost controls to prevent expensive background jobs, AI calls, email sends, or webhook retries from exploding.
- Fairness controls so one tenant cannot consume shared capacity needed by others.
A simple public endpoint may only need an IP-based limit. A serious SaaS API usually needs per-user, per-team, per-token, per-route, and per-plan limits. That is where the choice of tool starts to matter.
Four Places to Enforce Rate Limits
There are four common enforcement layers, each with different trade-offs around cost, context, and operational complexity.
1. Application Middleware
This is the most direct option. You install middleware in Express, Fastify, NestJS, or another Node.js framework and evaluate each request before it reaches the handler.
Application middleware is good for:
- Login and signup protection.
- Password reset protection.
- User-aware limits.
- SaaS plan-aware quotas.
- Route-level business rules.
The downside is that basic in-memory middleware is not reliable once you run multiple Node.js processes, containers, or regions. The express-rate-limit documentation explicitly notes that the default memory store keeps hit counts in memory and produces inconsistent results when multiple servers or processes are used. For multi-instance APIs, an external store is recommended.
2. Redis-Backed Distributed Rate Limiting
Redis is a common state store for distributed rate limiting because it is fast, shared, and supports atomic operations. A Redis-backed limiter can keep quota state consistent across several Node.js instances, containers, or workers.
This approach is useful for:
- Multi-instance Node.js APIs.
- SaaS plan quotas.
- API key limits.
- Brute-force login protection.
- Webhook and third-party API cost protection.
- Distributed deployments where the same user may hit different app instances.
Tools such as Upstash Rate Limit and rate-limiter-flexible simplify this pattern. Upstash Rate Limit is designed for HTTP-based, connectionless environments such as serverless functions, Vercel Edge, Cloudflare Workers, Fastly Compute@Edge, and other environments where HTTP is preferred over TCP. Upstash also exposes useful features such as caching blocked requests, timeout behavior, analytics, traffic protection, custom rates, multi-region usage, multiple limits, and dynamic limits.
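To make the "atomic operations" point concrete, here is a minimal fixed-window sketch. It assumes a store object exposing `incr(key)` and `expire(key, seconds)`, which is the shape common Redis clients offer; the `InMemoryStore` class is a hypothetical stand-in so the sketch runs without a Redis server, and none of this reflects how any particular library implements its limiter.

```javascript
// Fixed-window check against an injected store with incr() and expire().
async function fixedWindowHit(store, key, limit, windowSec) {
  // INCR is atomic in Redis, so concurrent instances never lose a count.
  const count = await store.incr(key);
  if (count === 1) {
    // First hit in this window: start the expiry clock.
    await store.expire(key, windowSec);
  }
  return { allowed: count <= limit, remaining: Math.max(0, limit - count) };
}

// Hypothetical in-memory stand-in for local experiments (expiry is a no-op).
class InMemoryStore {
  constructor() {
    this.counts = new Map();
  }
  async incr(key) {
    const next = (this.counts.get(key) ?? 0) + 1;
    this.counts.set(key, next);
    return next;
  }
  async expire(_key, _seconds) {
    return 1;
  }
}
```

Note that the increment and the expiry are two separate commands here. If a process dies between them, the key may never expire, which is one reason production limiters usually wrap both steps in a Lua script or rely on a maintained library.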
3. Edge and WAF Rate Limiting
Edge rate limiting happens before traffic reaches your origin. Cloudflare WAF rate limiting rules allow teams to define matching expressions, request thresholds, mitigation durations, and actions when the configured rate is reached.
Edge limits are good for:
- Blocking obvious abuse before it reaches Node.js.
- Protecting login, search, and public API endpoints.
- Country, IP, user agent, path, method, and header-based filtering.
- Reducing origin load during spikes.
The trade-off is that edge limits may not understand your full application state. They are excellent for coarse protection, but not always enough for plan-aware SaaS quotas.
4. API Gateway Rate Limiting
API gateways enforce limits outside the application but closer to API management. Kong Gateway’s rate limiting plugin can apply limits per service or per route, send standard rate limit headers, and use Redis for distributed setups. Managed API gateway products such as Zuplo combine rate limiting with API keys, developer portals, monetization, routing, caching, and API governance. This can make sense when rate limiting is part of a broader platform requirement rather than a single middleware feature.
Comparison Table
| Option | Best For | Strengths | Watch Outs | Good Fit |
|---|---|---|---|---|
| express-rate-limit | Early Express apps, login endpoints, simple abuse prevention | Easy to install, framework-native, low cost | Memory store is not enough for multiple instances; limited product quota features | MVPs and single-instance APIs |
| rate-limiter-flexible + Redis | Distributed Node.js APIs | Redis-backed state, flexible patterns, login protection examples | You operate Redis and limiter logic | Growing SaaS APIs with multiple instances |
| Upstash Rate Limit | Serverless and edge-heavy Node.js apps | HTTP-based, serverless-friendly, analytics, multi-region support, dynamic limits | Cost depends on commands, traffic, and Redis usage | Vercel, Cloudflare Workers, edge and serverless apps |
| Arcjet | App-level security and quota controls | Code-level limits, bot protection, AI budget control, no Redis required for rate limit state | Depends on Arcjet platform; verify current pricing | SaaS apps that want security controls inside code |
| Cloudflare WAF Rate Limiting | Edge abuse control | Stops traffic before origin, dashboard/API/Terraform workflows, WAF integration | Less application-aware; advanced characteristics depend on plan | Public APIs and login endpoints behind Cloudflare |
| Kong Rate Limiting Plugin | API gateway teams | Per-service/per-route limits, headers, Redis policy for distributed gateways | Gateway operations and configuration complexity | Platform teams already using Kong |
| Zuplo | Managed API platform and developer portal | API keys, monetization, developer portal, rate limiting, GitOps, edge gateway model | Pricing and plan fit should be confirmed | API products, developer platforms, external SaaS APIs |
Option 1: Express Middleware
For a small Express API, express-rate-limit is the fastest path to a working limiter.
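```javascript
import express from "express";
import rateLimit from "express-rate-limit";

const app = express();

// Allow 20 login attempts per client (keyed by IP by default) every 15 minutes.
const loginLimiter = rateLimit({
  windowMs: 15 * 60 * 1000,
  limit: 20,
  standardHeaders: "draft-8", // send standardized RateLimit headers
  legacyHeaders: false, // omit the older X-RateLimit-* headers
});

app.post("/login", loginLimiter, async (req, res) => {
  res.json({ ok: true });
});
```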
This is useful for early-stage abuse prevention, but treat it as a starting point. For production SaaS, the key question is whether the store is shared across all instances. If it is not, the same client may be allowed more requests than expected because each process has its own independent counter.
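The independent-counter problem is easy to reproduce. The sketch below is a simplified simulation, not express-rate-limit's actual store: `MemoryCounter` is a hypothetical class, and the two "workers" stand in for two Node.js processes behind a round-robin load balancer.

```javascript
// Minimal in-memory counter, one instance per simulated process.
class MemoryCounter {
  constructor(limit) {
    this.limit = limit;
    this.hits = new Map();
  }

  // Returns true when the request is allowed by this process's own count.
  hit(key) {
    const count = (this.hits.get(key) ?? 0) + 1;
    this.hits.set(key, count);
    return count <= this.limit;
  }
}

// Two workers, each enforcing "20 per window" with its own private counter.
const workers = [new MemoryCounter(20), new MemoryCounter(20)];

function simulate(totalRequests, key) {
  let allowed = 0;
  for (let i = 0; i < totalRequests; i++) {
    const worker = workers[i % workers.length]; // round-robin routing
    if (worker.hit(key)) allowed++;
  }
  return allowed;
}

const allowed = simulate(60, "ip:203.0.113.10");
console.log(allowed); // 40: double the intended limit of 20
```

With a shared store, both workers would increment the same counter and the client would be stopped at 20.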
Use application middleware when the logic needs user context, such as:
- Free plan: 100 requests per minute.
- Pro plan: 1,000 requests per minute.
- Enterprise plan: custom quota.
- Login endpoint: stricter per-IP and per-account limits.
- AI endpoint: token-budget based limits.
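A plan-aware limit usually starts as a small lookup. The sketch below uses the example numbers from the list above; the `PLAN_LIMITS` table, the `tenant` shape, and the `customLimit` field are all hypothetical names chosen for illustration.

```javascript
// Hypothetical plan table using the example quotas listed above.
const PLAN_LIMITS = new Map([
  ["free", { points: 100, windowSec: 60 }],
  ["pro", { points: 1000, windowSec: 60 }],
]);

// Enterprise quotas are custom, so they come from a per-tenant override.
// Unknown or missing plans fall back to the most restrictive tier.
function resolveLimit(tenant) {
  if (tenant.customLimit) return tenant.customLimit;
  return PLAN_LIMITS.get(tenant.plan) ?? PLAN_LIMITS.get("free");
}
```

The resolved value can then be handed to whichever limiter you use, for example as the per-request limit in middleware. Falling back to the free tier for unknown plans is a design choice that fails closed rather than open.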
Option 2: Redis-Backed Distributed Rate Limiting
Redis-backed limiting is the practical middle ground for many Node.js SaaS apps. It keeps state outside a single process, so you can scale horizontally without losing quota consistency.
A Redis-backed limiter can track keys such as:
```text
rate:user:123:/api/search
rate:team:acme:/api/export
rate:ip:203.0.113.10:/login
rate:api_key:sk_live_xxx:/v1/events
```
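For SaaS products, key design is more important than the package name. A bad key causes unfair limits: keying only on IP address, for example, throttles every user behind the same corporate NAT together. A good key maps to your product model.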
Common identifiers include:
- IP address for anonymous abuse.
- User ID for logged-in products.
- Team ID for B2B SaaS.
- API key ID for developer APIs.
- Route group for expensive endpoints.
- Plan tier for commercial quotas.
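The identifiers above can be combined into keys shaped like the earlier examples. This is a minimal sketch: the function names, the `ctx` fields, and the priority order (API key, then team, then user, then IP) are assumptions to adapt to your own product model.

```javascript
const SCOPES = new Set(["ip", "user", "team", "api_key"]);

// Builds keys of the form rate:<scope>:<id>:<route>.
function buildRateLimitKey({ scope, id, route }) {
  if (!SCOPES.has(scope)) throw new Error(`Unknown scope: ${scope}`);
  if (id === undefined || id === null || id === "") {
    throw new Error("Missing identifier");
  }
  return `rate:${scope}:${id}:${route}`;
}

// Picks the most specific identity available, falling back to IP
// for anonymous traffic. The ordering here is one possible choice.
function keyForRequest(ctx) {
  const { route } = ctx;
  if (ctx.apiKeyId) return buildRateLimitKey({ scope: "api_key", id: ctx.apiKeyId, route });
  if (ctx.teamId) return buildRateLimitKey({ scope: "team", id: ctx.teamId, route });
  if (ctx.userId) return buildRateLimitKey({ scope: "user", id: ctx.userId, route });
  return buildRateLimitKey({ scope: "ip", id: ctx.ip, route });
}
```

Rejecting an empty identifier matters: a key such as `rate:user:undefined:/api/search` would silently pool every unauthenticated request into one bucket.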
Here is a practical example using rate-limiter-flexible with Redis:
```javascript
import Redis from "ioredis";
import { RateLimiterRedis } from "rate-limiter-flexible";

const redisClient = new Redis({
  host: process.env.REDIS_HOST,
  port: Number(process.env.REDIS_PORT),
  // Fail fast instead of queueing commands while Redis is unreachable.
  enableOfflineQueue: false,
});

// 100 points per 60 seconds for each key.
const limiter = new RateLimiterRedis({
  storeClient: redisClient,
  keyPrefix: "rl",
  points: 100,
  duration: 60,
});

// Assumes earlier auth middleware has set req.userId.
async function rateLimitMiddleware(req, res, next) {
  // req.route is an object in Express, so build the key from a path string.
  const routePath = req.route?.path ?? req.path;
  const key = `user:${req.userId}:${routePath}`;
  try {
    const result = await limiter.consume(key);
    res.set("X-RateLimit-Remaining", String(result.remainingPoints));
    next();
  } catch (err) {
    if (err instanceof Error) {
      // Redis or connection failure: hand off to the error handler.
      next(err);
    } else {
      // Limit exceeded: the rejection carries msBeforeNext.
      const retryAfter = Math.ceil(err.msBeforeNext / 1000);
      res.set("Retry-After", String(retryAfter));
      res.status(429).json({ error: "Too Many Requests", retryAfter });
    }
  }
}
```
Redis Hosting: Upstash vs Traditional Redis
Upstash is attractive when you want serverless-friendly Redis over HTTP and do not want to manage persistent TCP connections. Traditional Redis providers are attractive when your Node.js app already runs in a persistent server environment and can maintain TCP connections efficiently.
| Factor | Upstash | Traditional Redis |
|---|---|---|
| Connection model | HTTP-based | Persistent TCP |
| Serverless fit | Excellent | Requires connection pooling |
| Pricing model | Per-command or fixed plans | Memory and instance based |
| Multi-region | Built-in | Requires manual replication |
| Analytics dashboard | Included | Requires external tools |
| Free tier | 256 MB, 500K commands/month | Varies by provider |
Option 3: Arcjet for App-Level Security Limits
Arcjet is useful when rate limiting is part of a broader runtime security layer. Its documentation describes application-level limits configured inside code, with use cases including AI token spend control, login protection, API throttling, and SaaS plan quotas. Arcjet handles rate limit tracking through its cloud API, so you do not need separate Redis infrastructure for rate limit state.
That makes Arcjet a strong option when you want:
- Per-route and per-user limits in application code.
- Bot protection next to rate limiting.
- AI budget control.
- SaaS quota rules that change with product logic.
- Less Redis infrastructure to manage.
The trade-off is platform dependency. If you already have Redis and prefer full control, a Redis-backed limiter may be more portable. If you want security controls that ship with code, Arcjet can reduce operational work.
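```javascript
// Rule names follow the @arcjet/node SDK; confirm against the version you install.
import arcjet, { shield, tokenBucket } from "@arcjet/node";
import express from "express";

const app = express();

const aj = arcjet({
  key: process.env.ARCJET_KEY,
  rules: [
    // Token bucket: refill 10 tokens every 10 seconds, up to 100.
    tokenBucket({
      mode: "LIVE",
      refillRate: 10,
      interval: "10s",
      capacity: 100,
    }),
    shield({ mode: "LIVE" }),
  ],
});

app.post("/api/ai", async (req, res) => {
  // Each call to this endpoint spends 5 tokens from the bucket.
  const decision = await aj.protect(req, { requested: 5 });
  if (decision.isDenied()) {
    // Shield denials are not rate limits, so only those get a 429.
    if (decision.reason.isRateLimit()) {
      return res.status(429).json({ error: "Rate limit reached" });
    }
    return res.status(403).json({ error: "Forbidden" });
  }
  res.json({ result: "ok" });
});
```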
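For readers unfamiliar with the token bucket parameters used above, the following in-memory sketch shows what refill rate, interval, and capacity mean. It is illustrative only and is not how Arcjet implements or stores its buckets; the `TokenBucket` class and the assumption that a bucket starts full are mine.

```javascript
// In-memory token bucket: `refillRate` tokens are added every `intervalMs`,
// never exceeding `capacity`. The bucket is assumed to start full.
class TokenBucket {
  constructor({ refillRate, intervalMs, capacity, now = Date.now() }) {
    this.refillRate = refillRate;
    this.intervalMs = intervalMs;
    this.capacity = capacity;
    this.tokens = capacity;
    this.lastRefill = now;
  }

  // Tries to take `requested` tokens; returns true when the request is allowed.
  take(requested, now = Date.now()) {
    const intervals = Math.floor((now - this.lastRefill) / this.intervalMs);
    if (intervals > 0) {
      this.tokens = Math.min(this.capacity, this.tokens + intervals * this.refillRate);
      this.lastRefill += intervals * this.intervalMs;
    }
    if (requested > this.tokens) return false;
    this.tokens -= requested;
    return true;
  }
}

// Same numbers as the configuration above, with a fixed clock for clarity.
const bucket = new TokenBucket({ refillRate: 10, intervalMs: 10_000, capacity: 100, now: 0 });
const firstCall = bucket.take(5, 0); // true: 95 tokens remain
```

With these numbers a client can burst through 20 calls of 5 tokens each, then sustain roughly two such calls every 10 seconds.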
Option 4: Cloudflare WAF Rate Limiting
Cloudflare WAF rate limiting is a good first line of defense. It can block, challenge, or otherwise mitigate traffic before it reaches your Node.js origin.
Use Cloudflare rate limiting for:
- /login
- /signup
- /api/search
- /api/export
- Unauthenticated public endpoints.
- Endpoints that receive bot traffic.
- Endpoints where traffic should be filtered before origin.
However, Cloudflare also notes that rate limiting rules are not designed to guarantee a precise number of requests reaches your origin because of detection and counter timing. That is not a problem for coarse abuse protection, but it matters for exact billing quotas. For exact SaaS quota enforcement, pair edge limits with app-level user or API-key limits.
Edge limits are typically configured through Cloudflare’s dashboard, API, or Terraform. A typical rule might look like:
- Field: URI Path
- Operator: equals
- Value: /api/search
- Expression: (http.request.uri.path eq "/api/search")
- Rate: Block for 10 minutes when requests exceed 30 per minute from the same IP.
Option 5: Kong for API Gateway Rate Limiting
Kong is a strong fit when your company already uses an API gateway layer. The rate limiting plugin can emit headers such as RateLimit-Limit, RateLimit-Remaining, RateLimit-Reset, and Retry-After, and it can return 429 when limits are exceeded.
Kong is especially relevant when:
- You already route APIs through Kong.
- Multiple teams publish services behind the same gateway.
- You need per-service or per-route governance.
- You want API limits outside the Node.js codebase.
- You need Redis-backed consistency across gateway nodes.
For a solo Node.js SaaS, Kong may be heavier than necessary. For a platform team managing many APIs, it can centralize policy enforcement.
```yaml
# Kong declarative config example (kong.yml)
# Declarative files start with a format version.
_format_version: "3.0"

services:
  - name: search-api
    url: http://search-service:3000
    routes:
      - name: search-route
        paths:
          - /api/search
    plugins:
      - name: rate-limiting
        config:
          minute: 30
          policy: redis
          redis_host: redis.default.svc.cluster.local
          redis_port: 6379
          fault_tolerant: true
          hide_client_headers: false
```
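On the consuming side, the Retry-After header that a gateway returns with a 429 can drive client backoff. The sketch below is a hypothetical helper, not part of Kong; it handles both forms the HTTP specification allows for Retry-After, a number of seconds or an HTTP date.

```javascript
// Converts a Retry-After header value into a delay in milliseconds.
// Returns null when the header is missing or unparseable.
function retryDelayMs(retryAfter, now = Date.now()) {
  if (retryAfter == null || retryAfter === "") return null;
  const value = String(retryAfter).trim();
  if (/^\d+$/.test(value)) {
    return Number(value) * 1000; // delay-seconds form
  }
  const date = Date.parse(value); // HTTP-date form
  if (Number.isNaN(date)) return null;
  return Math.max(0, date - now);
}
```

A client would typically wait for the returned delay before retrying, and fall back to its own exponential backoff when the helper returns null.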
Option 6: Zuplo for Managed API Products
Zuplo is positioned more as a managed API gateway and developer platform than as a simple rate limiting library. Its pricing page highlights API gateway capabilities such as authentication, routing, governance, developer portal, monetization, AI gateway, caching, and cost tracking.
This can be useful if your Node.js SaaS exposes public APIs to developers and needs:
- API keys.
- Developer documentation.
- Usage metering.
- Request routing.
- Monetization.
- Gateway-level policies.
- GitOps workflow.
If your only need is login endpoint protection, Zuplo may be more than you need. If your API is a product, it can replace several custom tools.
Cost and Operational Trade-Offs
Rate limiting tools have very different cost models. Understanding these before committing saves refactoring later.
Middleware Cost
Basic middleware is almost free, but the real cost appears when you need distributed consistency. At that point, you need Redis, a database-backed store, or a managed service.
Redis Cost
Redis-backed limiting usually costs based on memory, commands, bandwidth, and high-availability requirements. Upstash lists per-request pricing for pay-as-you-go Redis and fixed plans for predictable usage. Its free tier includes 256 MB data size and 500K monthly commands, while pay-as-you-go is priced per 100K commands. Confirm exact prices before committing because cloud pricing changes.
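Because usage-based Redis pricing scales with command volume, a rough estimate is worth doing early. The sketch below is simple arithmetic under stated assumptions: `commandsPerCheck` depends on the limiter algorithm you choose, and `pricePer100k` is a placeholder you must fill in from the provider's current pricing page. Neither is a published figure.

```javascript
// Estimated Redis commands per month generated by rate limit checks.
function estimateMonthlyCommands({ requestsPerDay, commandsPerCheck, days = 30 }) {
  return requestsPerDay * commandsPerCheck * days;
}

// Usage-based cost for a given command volume.
// pricePer100k is a placeholder, not a quoted price.
function estimateUsageCost(commands, pricePer100k) {
  return (commands / 100_000) * pricePer100k;
}

// Example: 50,000 API requests per day, assuming 2 commands per check.
const commands = estimateMonthlyCommands({ requestsPerDay: 50_000, commandsPerCheck: 2 });
console.log(commands); // 3000000
console.log(commands / 100_000); // 30 blocks of 100K commands
```

At that volume the workload is well past a 500K-command free tier, so the per-block price becomes the number to watch.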
Edge WAF Cost
Edge WAF limits are usually tied to CDN or security platform plans. The advantage is origin protection. The limitation is weaker application context. Cloudflare's documentation lists rate limiting rules on all plans, including Free, but the number of rules and the available counting characteristics, periods, and mitigation timeouts vary by tier, so verify current plan details.
API Gateway Cost
Gateway tools may charge by requests, seats, workspaces, support level, custom domains, analytics, or enterprise security features. Zuplo’s public page lists a free tier at 100K requests per month and a Builder plan with included requests plus paid request blocks, but production pricing should be confirmed before committing.
Cost Comparison Summary
| Tool | Free Tier | Entry-Level Paid | Pricing Model | Main Cost Driver |
|---|---|---|---|---|
| express-rate-limit | Free (MIT) | N/A | Open source | Your infrastructure |
| rate-limiter-flexible + Redis | Free (ISC) | ~$5-50/mo Redis | Open source + Redis hosting | Redis instance size |
| Upstash Rate Limit | 256 MB, 500K commands | Per 100K commands | Usage-based | Command volume |
| Arcjet | Free tier available | Contact sales | Platform pricing | Request volume, features |
| Cloudflare WAF | Limited rule allowance (verify) | Pro plan ($20/mo+) | Plan-based | CDN/WAF plan tier |
| Kong | Free (OSS) | Enterprise pricing | OSS or subscription | Infrastructure + ops |
| Zuplo | 100K requests/mo | Builder plan | Request-based | API call volume |
Recommended Architecture by SaaS Stage
MVP or Solo Project
Use express-rate-limit for public endpoints and login abuse prevention. Keep the implementation simple. Add clear 429 responses and Retry-After headers. Do not over-engineer at this stage.
Early Production SaaS
Add Redis-backed limits for API keys, user accounts, and expensive endpoints. Keep edge protection enabled for obvious bot and brute-force traffic. Start tracking blocked request metrics.
Growing B2B SaaS
Use a layered model:
- Cloudflare or another WAF for coarse edge abuse protection.
- Application-level limits for user, team, and plan-aware quotas.
- Redis or a managed limiter for shared distributed state.
- Monitoring and alerts for blocked requests, near-limit tenants, and abnormal spikes.
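The "near-limit tenants" signal in the last item can be computed from whatever usage counters your limiter exposes. A minimal sketch, assuming a hypothetical `usage` array of `{ tenantId, consumed, limit }` records and an 80% threshold:

```javascript
// Returns tenants whose usage in the current window is at or above the threshold.
function findNearLimitTenants(usage, threshold = 0.8) {
  return usage
    .filter(({ consumed, limit }) => limit > 0 && consumed / limit >= threshold)
    .map(({ tenantId, consumed, limit }) => ({
      tenantId,
      ratio: consumed / limit,
    }));
}

const nearLimit = findNearLimitTenants([
  { tenantId: "acme", consumed: 850, limit: 1000 },
  { tenantId: "globex", consumed: 120, limit: 1000 },
]);
// nearLimit contains only the "acme" record
```

Feeding this list into alerts or a sales dashboard turns the limiter into an early signal for upgrades as well as for abuse.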
API Product or Developer Platform
Consider an API gateway such as Kong or Zuplo when rate limiting is part of a bigger API management story: API keys, developer portal, usage tracking, monetization, routing, transformations, and governance.
Production Checklist
Before shipping rate limiting in a Node.js SaaS app, check the following:
- Define limits per endpoint class, not globally.
- Use different limits for anonymous users, authenticated users, API keys, and teams.
- Keep limits consistent with SaaS pricing tiers.
- Use Redis or a managed service when running multiple instances.
- Return 429 Too Many Requests consistently.
- Include Retry-After when possible.
- Do not leak sensitive account information in blocked responses.
- Log blocked requests with route, tenant, user, API key, and reason.
- Alert on sudden spikes in blocked requests.
- Allow trusted internal jobs and webhooks through separate rules.
- Avoid blocking verified search engine bots accidentally at the edge.
- Document limit behavior in your API docs.
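Several checklist items (consistent 429 responses, Retry-After, no sensitive details, structured logging) can be centralized in two small helpers. This is a sketch with hypothetical function and field names; the X-RateLimit-* headers are a common convention rather than a requirement.

```javascript
// Builds a consistent blocked response. The body carries no account details.
function buildBlockedResponse({ msBeforeNext, limit }) {
  const retryAfterSec = Math.max(1, Math.ceil(msBeforeNext / 1000));
  return {
    status: 429,
    headers: {
      "Retry-After": String(retryAfterSec),
      "X-RateLimit-Limit": String(limit),
      "X-RateLimit-Remaining": "0",
    },
    body: { error: "Too Many Requests", retryAfter: retryAfterSec },
  };
}

// Builds the structured log entry for a blocked request.
function buildBlockedLog({ route, tenantId, userId, apiKeyId, reason }) {
  return {
    event: "rate_limit_blocked",
    route,
    tenantId,
    userId,
    apiKeyId, // log the key ID, never the secret itself
    reason,
    at: new Date().toISOString(),
  };
}
```

In an Express handler, the response object maps directly onto `res.status(...).set(...).json(...)`, and the log entry can go to whichever logger you already use.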
Practical Recommendation
For most Node.js SaaS teams, the best starting architecture is layered:
- Cloudflare WAF for edge-level abuse control.
- Application middleware for endpoint-specific logic.
- Redis-backed or managed limiter for distributed user and plan quotas.
- API gateway only when API management becomes a product or platform need.
Do not treat rate limiting as a single library decision. Treat it as part of your SaaS control plane. The right choice depends on where you need context, how precise your quotas must be, how many instances you run, and whether API governance is part of your product.
Conclusion
Rate limiting is not just a security feature. For a Node.js SaaS product, it is also a cost-control layer, a product packaging layer, and an operational safety layer.
Use middleware for fast adoption, Redis when you need distributed consistency, Cloudflare when you need edge protection, Kong when you already operate an API gateway, and Zuplo when your public API needs gateway, portal, monetization, and developer platform features.
The best long-term setup is usually not one tool. It is a layered architecture that separates coarse abuse protection from precise product quota enforcement. Start simple, monitor blocked requests, and evolve your rate limiting strategy as your SaaS grows.