
Rate Limiting in ConnectSoft Base Template

Purpose & Overview

Rate Limiting is a mechanism that controls the number of requests a client can make to an API within a specified time window. In the ConnectSoft Base Template, rate limiting is implemented using ASP.NET Core's built-in rate limiting middleware (via ConnectSoft.Extensions.RateLimiting), protecting the microservice from abuse, ensuring fair resource allocation, and maintaining system stability under load.

Target Framework

The Base Template targets .NET 10. Rate limiting uses ASP.NET Core's built-in middleware. See Rate Limiting in ASP.NET Core on Microsoft Learn.

For Base Template-specific configuration, path exclusions, pipeline order, and implementation details, see the Base Template docs/Rate Limiting.md in the ConnectSoft.BaseTemplate repository.

Rate limiting provides:

  • Protection Against Abuse: Prevents malicious or misconfigured clients from overwhelming the service
  • Fair Resource Allocation: Ensures resources are distributed fairly among clients
  • System Stability: Protects backend services from traffic spikes
  • Cost Control: Limits API usage to prevent excessive resource consumption
  • DDoS Mitigation: First line of defense against denial-of-service attacks
  • Compliance: Enables enforcement of usage quotas and service-level agreements

Rate Limiting Philosophy

Rate limiting is a critical security and performance feature that should be configured thoughtfully. The Base Template provides a global rate limiter with configurable limits, while allowing specific paths and endpoints (like health checks, Swagger, MCP) to bypass or use separate rate limiting when necessary. Rate limits should be tested under load and adjusted based on actual traffic patterns and system capacity.

Architecture Overview

Rate Limiting in the Request Pipeline

Incoming Request
Rate Limiting Middleware (UseRateLimiter)
    ├── Extract Partition Key (IP, User ID, etc.)
    ├── Check Rate Limit Policy
    ├── Acquire Permit
    │   ├── Success → Continue to endpoint
    │   └── Failure → Return 429 Too Many Requests
Routing Middleware
Controller/Endpoint
Response
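For orientation, the same pipeline can be wired directly with ASP.NET Core's built-in primitives. This is a minimal sketch, not the template's literal code (the template wraps the equivalent calls in AddMicroserviceRateLimiting and UseMicroserviceRateLimiter):

```csharp
using System.Threading.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(options =>
{
    // Reject with 429 instead of the middleware's default of 503.
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

    // Global fixed-window limiter, partitioned by client IP.
    options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(context =>
        RateLimitPartition.GetFixedWindowLimiter(
            context.Connection.RemoteIpAddress?.ToString() ?? "unknown",
            _ => new FixedWindowRateLimiterOptions
            {
                PermitLimit = 100,
                Window = TimeSpan.FromMinutes(1),
                AutoReplenishment = true,
                QueueLimit = 0
            }));
});

var app = builder.Build();

app.UseRouting();
app.UseRateLimiter();            // acquires a permit; returns 429 when exhausted
app.MapGet("/", () => "ok");

app.Run();
```

The template adds path exclusions and option binding on top of this shape, but the middleware order (routing, then rate limiter, then endpoints) is the same.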

Rate Limiting Components

RateLimitingExtensions.cs
├── AddMicroserviceRateLimiting(configuration) - Service Registration
│   ├── Options from OptionsExtensions.RateLimitingOptions (section: RateLimiting)
│   ├── Delegates to AddConnectSoftRateLimiting(rateLimitingOptions[, excludeFromRateLimiting])
│   ├── Path exclusions: /assets/, /swagger/, /hangfire, /dashboard, /scalar, /mcp, DevUI, health checks
│   ├── MCP policy from OptionsExtensions.McpRateLimitingOptions (section: McpRateLimiting)
│   └── RejectionStatusCode (429)
└── UseMicroserviceRateLimiter(configuration) - Middleware
    ├── Delegates to UseConnectSoftRateLimiter(rateLimitingOptions)
    └── Place after UseMicroserviceRequestTimeouts(), before UseEndpoints()

OptionsExtensions.cs
├── AddConnectSoftRateLimitingOptions(configuration) → RateLimiting section
└── AddConnectSoftMcpRateLimitingOptions(configuration) → McpRateLimiting section (when UseMCP)

Service Registration

OptionsExtensions Pattern

Options are registered in AddMicroserviceOptions() and consumed via static properties:

  • OptionsExtensions.RateLimitingOptions — from section RateLimiting via AddConnectSoftRateLimitingOptions(configuration)
  • OptionsExtensions.McpRateLimitingOptions — from section McpRateLimiting via AddConnectSoftMcpRateLimitingOptions(configuration) (when UseMCP and RateLimiting are enabled)

AddMicroserviceRateLimiting Extension

Rate limiting is registered via AddMicroserviceRateLimiting(configuration):

// MicroserviceRegistrationExtensions.cs
services.AddMicroserviceRateLimiting(configuration);

Implementation — delegates to ConnectSoft.Extensions.RateLimiting:

// RateLimitingExtensions.cs
var rateLimitingOptions = OptionsExtensions.RateLimitingOptions;

// With path exclusions (e.g., /assets/, /swagger/, /mcp, health checks, etc.)
var excludedPaths = new[] { "/assets/", "/swagger/", "/hangfire", "/dashboard", "/scalar", "/mcp" }; // plus DevUI and health check paths, when enabled
services.AddConnectSoftRateLimiting(rateLimitingOptions, excludeFromRateLimiting: context =>
{
    var path = context.Request.Path.Value ?? string.Empty;
    return excludedPaths.Any(p => path.StartsWith(p, StringComparison.OrdinalIgnoreCase));
});

// Or without exclusions
services.AddConnectSoftRateLimiting(rateLimitingOptions);

// MCP policy (when UseMCP) from McpRateLimiting section
var mcpRateLimitingOptions = OptionsExtensions.McpRateLimitingOptions;
// ... Configure "MCP" policy via options.AddFixedWindowLimiter("MCP", ...)

UseMicroserviceRateLimiter Middleware

Pipeline Position:

// MicroserviceRegistrationExtensions.cs
application.UseMicroserviceRateLimiter(configuration);

Placement (see Request Timeout for full middleware order):

// Middleware order:
application.UseRouting();                      // Before rate limiting
application.UseMicroserviceRequestTimeouts();  // Request Timeout runs before Rate Limiter
application.UseMicroserviceRateLimiter();      // After Request Timeout
application.UseEndpoints(...);                 // After rate limiting

Important: Rate limiting middleware must be placed:

  • After UseRouting() (to access route information)
  • After UseMicroserviceRequestTimeouts() (plan-recommended order)
  • Before UseEndpoints() (to intercept requests before endpoint execution)

Path Exclusions

The following paths are excluded from the global rate limiter (via predicate):

Path                When
/assets/            Always (static assets)
/swagger/           When Swagger is enabled
/hangfire           When Hangfire is enabled
/dashboard          When Orleans is enabled
/scalar             When Scalar is enabled
/mcp                When MCP is enabled (uses separate MCP policy)
DevUI path          When Microsoft Agent Framework DevUI is enabled
/v1/                When Microsoft Agent Framework DevUI API is enabled
Health checks path  When HealthChecks are enabled

Rate Limiting Algorithms

Fixed Window Rate Limiter

The template uses Fixed Window rate limiting by default:

How It Works:

  • Divides time into fixed windows (e.g., 1 minute)
  • Allows a fixed number of requests per window (e.g., 100 requests)
  • Resets the counter at the start of each new window
  • Simple and predictable behavior

Example:

Window: 1 minute
Permit Limit: 5 requests
Time: 00:00:00 - 00:01:00 → 5 requests allowed
Time: 00:01:00 - 00:02:00 → Counter resets, 5 requests allowed again

Advantages:

  • Simple to understand and implement
  • Predictable reset behavior
  • Low memory overhead
  • Easy to configure

Disadvantages:

  • Can allow bursts at window boundaries
  • May not provide smooth rate limiting
  • Less precise than sliding window
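The counting behavior described above can be observed directly with the BCL's FixedWindowRateLimiter (a standalone sketch; the permit limit of 5 is illustrative, not the template's default):

```csharp
using System.Threading.RateLimiting;

// Standalone illustration of fixed-window counting (not the middleware itself).
var limiter = new FixedWindowRateLimiter(new FixedWindowRateLimiterOptions
{
    PermitLimit = 5,
    Window = TimeSpan.FromMinutes(1),
    AutoReplenishment = true,
    QueueLimit = 0
});

for (int i = 1; i <= 6; i++)
{
    using RateLimitLease lease = limiter.AttemptAcquire();
    Console.WriteLine($"Request {i}: {(lease.IsAcquired ? "allowed" : "rejected")}");
}
// Requests 1-5 are allowed; request 6 is rejected until the window resets.
```

Note that a client can send 5 requests at the very end of one window and 5 more at the start of the next, which is the boundary-burst disadvantage listed above.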

Other Rate Limiting Algorithms (Available in ASP.NET Core)

Sliding Window:

  • Rolling window of time
  • Smoother rate limiting
  • More memory intensive

Token Bucket:

  • Allows bursts up to bucket size
  • Refills tokens at fixed rate
  • Good for bursty traffic patterns

Concurrency Limiter:

  • Limits concurrent requests
  • Not time-based
  • Useful for resource protection

Configuration

RateLimitingOptions (section: RateLimiting)

Options are registered via AddConnectSoftRateLimitingOptions(configuration) in OptionsExtensions.AddMicroserviceOptions(). The RateLimiting section contains:

  • EnableRateLimiting — Master switch for rate limiting
  • GlobalLimiter — Fixed-window policy (Window, AutoReplenishment, PermitLimit, QueueLimit)

GlobalLimiterOptions

Configuration Class:

// GlobalLimiterOptions.cs
public sealed class GlobalLimiterOptions
{
    /// <summary>
    /// Time window that takes in the requests.
    /// Must be greater than TimeSpan.Zero.
    /// </summary>
    [Required]
    [DataType(DataType.Duration)]
    required public TimeSpan Window { get; set; } = TimeSpan.FromSeconds(1);

    /// <summary>
    /// Whether the fixed window rate limiter automatically refreshes counters
    /// or if someone else will be calling externally to refresh counters.
    /// </summary>
    [Required]
    required public bool AutoReplenishment { get; set; } = true;

    /// <summary>
    /// Maximum number of permit counters that can be allowed in a window.
    /// Must be greater than 0.
    /// </summary>
    [Required]
    required public int PermitLimit { get; set; }

    /// <summary>
    /// Maximum cumulative permit count of queued acquisition requests.
    /// Must be greater than or equal to 0.
    /// </summary>
    [Required]
    required public int QueueLimit { get; set; }
}

appsettings.json Configuration

RateLimiting section (global rate limiting):

{
  "RateLimiting": {
    "EnableRateLimiting": true,
    "GlobalLimiter": {
      "Window": "00:01:00",
      "AutoReplenishment": true,
      "PermitLimit": 100,
      "QueueLimit": 0
    }
  }
}

McpRateLimiting section (MCP endpoint rate limiting; separate from RateLimiting; used when UseMCP and RateLimiting are enabled):

{
  "McpRateLimiting": {
    "McpLimiter": {
      "Window": "00:01:00",
      "AutoReplenishment": true,
      "PermitLimit": 100,
      "QueueLimit": 0
    }
  }
}

Note: MCP rate limiting uses the McpRateLimiting section (not RateLimiting.McpLimiter). When configured, a named policy "MCP" is created and applied to the /mcp endpoint.

Configuration Parameters:

Parameter           Type                  Description                              Default   Example
EnableRateLimiting  bool                  Enable or disable rate limiting          false     true
GlobalLimiter       GlobalLimiterOptions  Global rate limiter settings (required)  Required  See below

GlobalLimiter and McpLimiter Options:

Parameter          Type      Description                     Default   Example
Window             TimeSpan  Time window for rate limiting   00:00:01  00:01:00 (1 minute)
AutoReplenishment  bool      Automatically refresh counters  true      true
PermitLimit        int       Maximum requests per window     Required  100
QueueLimit         int       Maximum queued requests         0         0 (no queuing)
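The difference between PermitLimit and QueueLimit can be seen with the BCL limiter directly: with a QueueLimit greater than zero, an over-limit acquisition waits for the next window instead of failing immediately. A sketch with illustrative values (a short window keeps the wait visible):

```csharp
using System.Diagnostics;
using System.Threading.RateLimiting;

var limiter = new FixedWindowRateLimiter(new FixedWindowRateLimiterOptions
{
    PermitLimit = 1,
    Window = TimeSpan.FromMilliseconds(200),
    AutoReplenishment = true,
    QueueLimit = 1,                                       // allow one waiter
    QueueProcessingOrder = QueueProcessingOrder.OldestFirst
});

using RateLimitLease first = await limiter.AcquireAsync();   // acquired immediately
var stopwatch = Stopwatch.StartNew();
using RateLimitLease second = await limiter.AcquireAsync();  // queued until the window replenishes
Console.WriteLine($"Second permit acquired after ~{stopwatch.ElapsedMilliseconds} ms");
```

With QueueLimit set to 0, the same second acquisition would fail immediately, which is why the configuration above rejects with 429 rather than queuing.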

Environment-Specific Configuration

Development:

{
  "RateLimiting": {
    "EnableRateLimiting": true,
    "GlobalLimiter": {
      "Window": "00:01:00",
      "AutoReplenishment": true,
      "PermitLimit": 100,
      "QueueLimit": 0
    }
  }
}

Production:

{
  "RateLimiting": {
    "EnableRateLimiting": true,
    "GlobalLimiter": {
      "Window": "00:01:00",
      "AutoReplenishment": true,
      "PermitLimit": 1000,
      "QueueLimit": 0
    }
  }
}

Testing:

{
  "RateLimiting": {
    "EnableRateLimiting": true,
    "GlobalLimiter": {
      "Window": "00:01:00",
      "AutoReplenishment": true,
      "PermitLimit": 5,
      "QueueLimit": 0
    }
  }
}

Partitioning Strategy

Partition Key Selection

The rate limiter uses a partitioning strategy to group requests:

Current Implementation:

var key = context.Request.Headers.TryGetValue("X-Test-Id", out var v)
    ? v.ToString()
    : context.GetClientIp() ?? "unknown";

Partition Key Priority:

  1. X-Test-Id Header: Used for testing (if present)
  2. Client IP Address: Extracted from X-Forwarded-For or RemoteIpAddress
  3. "unknown": Fallback if IP cannot be determined

Client IP Extraction:

private static string? GetClientIp(this HttpContext httpContext)
{
    // Prefer the first (client-most) entry in X-Forwarded-For when behind a proxy.
    string? forwardedFor = httpContext.Request.Headers["X-Forwarded-For"].FirstOrDefault();
    if (!string.IsNullOrEmpty(forwardedFor))
    {
        return forwardedFor.Split(',')[0].Trim(); // Take the first IP in the list
    }

    return httpContext.Connection.RemoteIpAddress?.ToString();
}

Alternative Partitioning Strategies

By User Identity:

options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(context =>
{
    var key = context.User.Identity?.Name ?? context.GetClientIp() ?? "unknown";
    return RateLimitPartition.GetFixedWindowLimiter(key, _ => new FixedWindowRateLimiterOptions
    {
        PermitLimit = 100,
        Window = TimeSpan.FromMinutes(1),
        AutoReplenishment = true,
        QueueLimit = 0
    });
});

By API Key:

options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(context =>
{
    var key = context.Request.Headers["X-API-Key"].FirstOrDefault() 
        ?? context.GetClientIp() 
        ?? "unknown";
    return RateLimitPartition.GetFixedWindowLimiter(key, _ => new FixedWindowRateLimiterOptions
    {
        PermitLimit = 100,
        Window = TimeSpan.FromMinutes(1),
        AutoReplenishment = true,
        QueueLimit = 0
    });
});

By Tenant ID:

options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(context =>
{
    var tenantId = context.User.FindFirst("tenant_id")?.Value 
        ?? context.Request.Headers["X-Tenant-Id"].FirstOrDefault()
        ?? "unknown";
    return RateLimitPartition.GetFixedWindowLimiter(tenantId, _ => new FixedWindowRateLimiterOptions
    {
        PermitLimit = 100,
        Window = TimeSpan.FromMinutes(1),
        AutoReplenishment = true,
        QueueLimit = 0
    });
});

Response Headers

Rate Limit Headers

When rate limiting is enabled, the middleware adds standard rate limit headers:

Standard Headers:

  • X-RateLimit-Limit: Maximum number of requests allowed per window
  • X-RateLimit-Remaining: Number of requests remaining in current window
  • X-RateLimit-Reset: Unix timestamp when the rate limit resets
  • Retry-After: Seconds to wait before retrying (in 429 responses)

Example Response:

HTTP/1.1 200 OK
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 95
X-RateLimit-Reset: 1699920000

429 Too Many Requests Response:

HTTP/1.1 429 Too Many Requests
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1699920000
Retry-After: 60
Content-Type: application/json
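On the client side, a 429 can be handled by honoring Retry-After before retrying. The helper below is hypothetical (not part of the template) and falls back to a short default delay when the header is absent:

```csharp
using System.Net;

// Hypothetical client-side helper: retry once on 429, honoring Retry-After.
static async Task<HttpResponseMessage> PostWithRetryAsync(
    HttpClient client, string url, Func<HttpContent> contentFactory)
{
    HttpResponseMessage response = await client.PostAsync(url, contentFactory());
    if (response.StatusCode == HttpStatusCode.TooManyRequests)
    {
        // Use the server's Retry-After delta when present; otherwise a default.
        TimeSpan delay = response.Headers.RetryAfter?.Delta ?? TimeSpan.FromSeconds(1);
        response.Dispose();
        await Task.Delay(delay);
        response = await client.PostAsync(url, contentFactory());
    }

    return response;
}
```

A content factory is used rather than a single HttpContent instance so each attempt sends fresh request content.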

Endpoint Exemptions

Disabling Rate Limiting for Specific Endpoints

Certain endpoints should bypass rate limiting:

Health Checks:

// HealthChecksExtensions.cs
endpointsBuilder = endpoints.MapHealthChecks("/health");

#if RateLimiting
endpointsBuilder.DisableRateLimiting();
#endif

Swagger UI:

// SwaggerExtensions.cs
var endpointsBuilder = endpoints.MapSwagger();

#if RateLimiting
endpointsBuilder.DisableRateLimiting();
#endif

Scalar UI:

// ScalarExtensions.cs
endpointsBuilder.DisableRateLimiting();

SignalR Hubs:

// SignalRExtensions.cs
hubBuilder.DisableRateLimiting();

Why Exempt These Endpoints?:

  • Health Checks: Must be accessible for monitoring and orchestration
  • Swagger/Scalar: Documentation endpoints, not production traffic
  • SignalR: Real-time connections may require a different rate limiting strategy

Testing

Acceptance Tests

The template includes acceptance tests for rate limiting:

// RateLimitingAcceptanceTests.cs
[TestClass]
[DoNotParallelize]
public class RateLimitingAcceptanceTests
{
    [TestMethod]
    public async Task GlobalRateLimiterShouldReturn429OnSixthRequestWithinWindow()
    {
        using HttpClient? client = BeforeAfterTestRunHooks.ServerInstance?.CreateClient();
        Assert.IsNotNull(client, "TestServer client was not initialized.");

        // Set unique test ID for isolation
        client.DefaultRequestHeaders.Remove("X-Test-Id");
        client.DefaultRequestHeaders.Add("X-Test-Id", Guid.NewGuid().ToString("N"));

        const string endpoint = "api/FeatureA/FeatureAUseCaseA";
        using var body = new StringContent("{}", Encoding.UTF8, "application/json");

        // Send 5 requests (within limit)
        for (int i = 0; i < 5; i++)
        {
            using var ok = await client.PostAsync(endpoint, body);
            Assert.IsTrue(ok.IsSuccessStatusCode, 
                $"Expected success within limit on attempt #{i + 1}, got {(int)ok.StatusCode}");
        }

        // 6th request should be rate limited
        using var limited = await client.PostAsync(endpoint, body);
        Assert.AreEqual(HttpStatusCode.TooManyRequests, limited.StatusCode, 
            "Expected HTTP 429 on the 6th request within the window.");
    }
}

Test Configuration:

{
  "RateLimiting": {
    "EnableRateLimiting": true,
    "GlobalLimiter": {
      "Window": "00:01:00",
      "AutoReplenishment": true,
      "PermitLimit": 5,
      "QueueLimit": 0
    }
  }
}

Manual Testing

Test Rate Limiting:

# Send multiple requests rapidly
for i in {1..10}; do
  curl -X POST http://localhost:5000/api/FeatureA/FeatureAUseCaseA \
    -H "Content-Type: application/json" \
    -d '{}'
  echo ""
done

Expected Behavior:

  • First 100 requests (or configured limit): 200 OK
  • Subsequent requests: 429 Too Many Requests
  • After the window resets: 200 OK again

Best Practices

Do's

  1. Enable Rate Limiting in Production

    {
      "RateLimiting": {
        "EnableRateLimiting": true,
        "GlobalLimiter": {
          "PermitLimit": 1000,
          "Window": "00:01:00"
        }
      }
    }
    

  2. Set Appropriate Limits

    // ✅ GOOD - Based on capacity analysis
    {
      "PermitLimit": 1000,
      "Window": "00:01:00"
    }
    
    // ❌ BAD - Too restrictive
    {
      "PermitLimit": 10,
      "Window": "00:01:00"
    }

    // ❌ BAD - Too permissive
    {
      "PermitLimit": 1000000,
      "Window": "00:01:00"
    }
    

  3. Exempt Critical Endpoints

    // ✅ GOOD - Health checks exempt
    endpoints.MapHealthChecks("/health").DisableRateLimiting();
    

  4. Use IP-Based Partitioning for Public APIs

    // ✅ GOOD - IP-based for public APIs
    var key = context.GetClientIp() ?? "unknown";
    

  5. Set QueueLimit to 0 for Immediate Rejection

    {
      "QueueLimit": 0  // Reject immediately, don't queue
    }
    

  6. Monitor Rate Limit Metrics

     • Track 429 responses
     • Monitor rate limit usage
     • Alert on high rejection rates

Don'ts

  1. Don't Disable Rate Limiting in Production

    // ❌ BAD - No protection
    {
      "EnableRateLimiting": false
    }
    

  2. Don't Use Too Restrictive Limits

    // ❌ BAD - Will block legitimate users
    {
      "PermitLimit": 1,
      "Window": "00:01:00"
    }
    

  3. Don't Forget to Exempt Health Checks

    // ❌ BAD - Health checks may be rate limited
    endpoints.MapHealthChecks("/health");
    // No DisableRateLimiting()
    

  4. Don't Use QueueLimit for Rate Limiting

    // ❌ BAD - QueueLimit is for queuing, not rate limiting
    {
      "QueueLimit": 100  // This queues requests, doesn't limit them
    }
    

  5. Don't Ignore Load Testing

    // ❌ BAD - Rate limits not tested
    // Deploy without load testing
    
    // ✅ GOOD - Test rate limits under load
    // Load test with expected traffic patterns
    

Advanced Scenarios

Custom Rate Limiting Policies

Endpoint-Specific Policies:

services.AddRateLimiter(options =>
{
    // Global limiter
    options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(context =>
        RateLimitPartition.GetFixedWindowLimiter(
            context.GetClientIp() ?? "unknown",
            _ => new FixedWindowRateLimiterOptions
            {
                PermitLimit = 100,
                Window = TimeSpan.FromMinutes(1),
                AutoReplenishment = true,
                QueueLimit = 0
            }));

    // Endpoint-specific policy
    options.AddFixedWindowLimiter("api", limiterOptions =>
    {
        limiterOptions.PermitLimit = 50;
        limiterOptions.Window = TimeSpan.FromMinutes(1);
        limiterOptions.AutoReplenishment = true;
        limiterOptions.QueueLimit = 0;
    });

#if UseMCP
    // MCP endpoint rate limiting from McpRateLimiting section
    var mcpOptions = OptionsExtensions.McpRateLimitingOptions;
    if (mcpOptions?.McpLimiter is not null)
    {
        options.AddFixedWindowLimiter("MCP", limiterOptions =>
        {
            limiterOptions.PermitLimit = mcpOptions.McpLimiter.PermitLimit;
            limiterOptions.Window = mcpOptions.McpLimiter.Window;
            limiterOptions.AutoReplenishment = mcpOptions.McpLimiter.AutoReplenishment;
            limiterOptions.QueueLimit = mcpOptions.McpLimiter.QueueLimit;
        });
    }
#endif
});

Apply to Endpoint:

[EnableRateLimiting("api")]
[HttpPost("orders")]
public async Task<IActionResult> CreateOrder([FromBody] OrderRequest request)
{
    // ...
}

// MCP endpoint — use ConnectSoft mapping so policy + optional auth are applied consistently.
// In the template this is done inside MapMicroserviceMCPServer(); shown here for reference:
using ConnectSoft.Extensions.ModelContextProtocol;

endpoints.MapConnectSoftModelContextProtocol("/mcp", rateLimitingPolicy: "MCP");

MCP Endpoint Rate Limiting

When both UseMCP and RateLimiting template parameters are enabled, you can configure MCP-specific rate limiting that works independently from the global rate limiter. This allows you to set different rate limits for MCP endpoints (/mcp) compared to other endpoints.

Configuration in appsettings.json — use the McpRateLimiting section (separate from RateLimiting):

{
  "RateLimiting": {
    "EnableRateLimiting": true,
    "GlobalLimiter": {
      "Window": "00:01:00",
      "PermitLimit": 5,
      "AutoReplenishment": true,
      "QueueLimit": 0
    }
  },
  "McpRateLimiting": {
    "McpLimiter": {
      "Window": "00:01:00",
      "PermitLimit": 100,
      "AutoReplenishment": true,
      "QueueLimit": 0
    }
  }
}

How It Works:

  • MCP rate limiting uses the McpRateLimiting section with McpLimiter (not RateLimiting.McpLimiter)
  • When configured, a named rate limiting policy "MCP" is automatically created
  • The policy uses the same partitioning strategy as the global limiter (IP address or X-Test-Id header)
  • The /mcp path is excluded from global rate limiting and uses the MCP policy instead
  • MCP rate limiting is independent from global rate limiting - both can be active simultaneously

Implementation:

The MCP rate limiting policy is configured from OptionsExtensions.McpRateLimitingOptions in RateLimitingExtensions.cs:

#if UseMCP
var mcpRateLimitingOptions = OptionsExtensions.McpRateLimitingOptions;
if (mcpRateLimitingOptions?.McpLimiter is not null && rateLimitingOptions.EnableRateLimiting)
{
    services.Configure<RateLimiterOptions>(options =>
    {
        options.AddFixedWindowLimiter("MCP", limiterOptions =>
        {
            limiterOptions.PermitLimit = mcpRateLimitingOptions.McpLimiter.PermitLimit;
            limiterOptions.Window = mcpRateLimitingOptions.McpLimiter.Window;
            limiterOptions.AutoReplenishment = mcpRateLimitingOptions.McpLimiter.AutoReplenishment;
            limiterOptions.QueueLimit = mcpRateLimitingOptions.McpLimiter.QueueLimit;
        });
    });
}
#endif

The template applies the MCP policy when mapping the endpoint in ModelContextProtocolExtensions.cs by passing the policy name into MapConnectSoftModelContextProtocol (only when McpServerTransportType is Http):

using ConnectSoft.Extensions.ModelContextProtocol;

// ...
endpoints.MapConnectSoftModelContextProtocol(
    "/mcp",
    rateLimitingPolicy: "MCP",
    requireAuthorization: requireAuthorization);

Best Practices:

  • Set MCP rate limits higher than global limits to accommodate AI tool invocation patterns
  • Monitor MCP endpoint usage to adjust limits based on actual traffic
  • Consider per-user or per-session quotas for production environments
  • Test rate limiting under load to ensure it doesn't interfere with legitimate AI tool usage

Sliding Window Rate Limiter

Configuration:

options.AddSlidingWindowLimiter("sliding", limiterOptions =>
{
    limiterOptions.PermitLimit = 100;
    limiterOptions.Window = TimeSpan.FromMinutes(1);
    limiterOptions.SegmentsPerWindow = 4; // 4 segments = 15-second segments
    limiterOptions.AutoReplenishment = true;
    limiterOptions.QueueLimit = 0;
});

Token Bucket Rate Limiter

Configuration:

options.AddTokenBucketLimiter("token", limiterOptions =>
{
    limiterOptions.TokenLimit = 100;
    limiterOptions.ReplenishmentPeriod = TimeSpan.FromMinutes(1);
    limiterOptions.TokensPerPeriod = 10;
    limiterOptions.AutoReplenishment = true;
    limiterOptions.QueueLimit = 0;
});

Concurrency Limiter

Configuration:

options.AddConcurrencyLimiter("concurrency", limiterOptions =>
{
    limiterOptions.PermitLimit = 10; // Max 10 concurrent requests
    limiterOptions.QueueLimit = 0;
});

Troubleshooting

Issue: Rate Limiting Not Working

Symptoms: Requests not being rate limited, no 429 responses.

Solutions:

  1. Verify Rate Limiting is Enabled

{
  "RateLimiting": {
    "EnableRateLimiting": true
  }
}

  2. Check Middleware Order

    // ✅ GOOD - Correct order
    application.UseRouting();
    application.UseMicroserviceRateLimiter();
    application.UseEndpoints(...);
    

  3. Verify Configuration is Loaded

     • Check RateLimitingOptions is registered
     • Verify appsettings.json contains rate limiting section
     • Check options validation passes

Issue: Too Many 429 Responses

Symptoms: Legitimate users receiving 429 responses.

Solutions:

  1. Increase Permit Limit

{
  "GlobalLimiter": {
    "PermitLimit": 1000  // Increase from 100
  }
}

  2. Review Partitioning Strategy

    // Check if IP extraction is working correctly
    var ip = context.GetClientIp();
    // Ensure IP is not null or "unknown"
    

  3. Check Window Size

    {
      "GlobalLimiter": {
        "Window": "00:01:00"  // Ensure window is appropriate
      }
    }
    

Issue: Health Checks Being Rate Limited

Symptoms: Health checks returning 429 responses.

Solutions:

  1. Disable Rate Limiting for Health Checks

endpoints.MapHealthChecks("/health").DisableRateLimiting();

  2. Verify Exemption is Applied

     • Check DisableRateLimiting() is called
     • Verify conditional compilation (#if RateLimiting)

Issue: Rate Limits Not Resetting

Symptoms: Rate limits never reset, permanently blocked.

Solutions:

  1. Verify AutoReplenishment is Enabled

{
  "GlobalLimiter": {
    "AutoReplenishment": true
  }
}

  2. Check Window Configuration
    {
      "GlobalLimiter": {
        "Window": "00:01:00"  // Ensure window is valid
      }
    }
    

Code Standards

When implementing or extending rate limiting, follow the Coding Standards. Use consistent naming, XML documentation, and analyzer rules (StyleCop, AspNetCoreAnalyzers).

Summary

Rate limiting in the ConnectSoft Base Template provides:

  • Global Rate Limiting: Fixed window rate limiter with configurable limits
  • Partitioning Strategy: IP-based or custom partition key selection
  • Endpoint Exemptions: Health checks and documentation endpoints bypass rate limiting
  • Configurable Limits: Permit limit, window, and queue limit configuration
  • HTTP 429 Responses: Standard rate limit responses with headers
  • Testing Support: Acceptance tests verify rate limiting behavior
  • Production Ready: Configurable for different environments

By following these patterns, teams can:

  • Protect Services: Prevent abuse and overload
  • Ensure Fairness: Distribute resources fairly among clients
  • Maintain Stability: Keep services responsive under load
  • Enforce Quotas: Control API usage and costs
  • Monitor Usage: Track rate limit metrics and adjust limits

Rate limiting is an essential security and performance feature that protects microservices from abuse while ensuring fair resource allocation and system stability.