
Backend Library Generator Agent - HLD & Plan

Repo/Solution: ConnectSoft.Factory.BackendLibraryGeneratorAgent

The Backend Library Generator Agent (LGA) is a production-grade, autonomous microservice that turns a Backend Library Blueprint + DSLs into a complete .NET library solution (code, tests, docs, CI) and opens a PR in Azure DevOps using MCP-assisted GitOps.


🧱 What We Deliver

  1. An LGA microservice (anemic DDD + NHibernate + gRPC + MassTransit Saga) that:

    • Consumes Library Blueprints + DSLs
    • Orchestrates generation → validation → PR via MCP Filesystem/Git + ADO PR
    • Exposes gRPC and MCP tools for cross-agent invocation
  2. Factory-compliant library outputs using the official Library Template:

    • Library + MSTest project, README, Azure Pipelines YAML, multi-TFM net8/net9, DI/Options/Logging toggles from blueprint/CLI
  3. Operational runbook & use-cases for repeatable adoption and scale across teams.
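Item 2's toggle surface (multi-TFM net8/net9, DI/Options/Logging driven by blueprint/CLI) maps naturally onto MSBuild properties. The fragment below is a hedged sketch of what the generated library's csproj could contain, not the template's actual content; only the net8.0/net9.0 multi-targeting is stated by this plan.

```xml
<!-- Sketch only: standard MSBuild properties; the real template defines the exact set. -->
<PropertyGroup>
  <TargetFrameworks>net8.0;net9.0</TargetFrameworks>
  <Nullable>enable</Nullable>
  <ImplicitUsings>enable</ImplicitUsings>
  <GenerateDocumentationFile>true</GenerateDocumentationFile>
</PropertyGroup>
```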


🔐 Core Guarantees

  • Clean Architecture + DDD (anemic) boundaries, event-driven orchestration, idempotent runs.
  • Traceability via correlation/trace IDs across agent execution phases.
  • Deterministic templates with CI gates (build, tests, coverage).

📊 Benefits

  • 🚀 Speed: Blueprint-driven generation avoids hand-rolled scaffolding.
  • 🧱 Consistency: Single template and DSL control plane across projects.
  • 🛡️ Enterprise-Fit: ADO pipelines, NuGet packaging, governance hooks.

🗺️ Architecture (Context & Flow)

flowchart LR
  subgraph External
    CallerGRPC[[gRPC Clients]]
    CallerMCP[[MCP Clients - Agents]]
    CallerBus[[External Orchestrators - Bus]]
  end

  subgraph LGA["Backend Library Generator Agent (Microservice)"]
    API[Service Host: gRPC + MCP Facade]
    Saga[MassTransit Saga Orchestrator]
    Repo[(NHibernate + Outbox)]
    MCPFS[MCP Filesystem Client]
    MCPGit[MCP Git Client]
    ADO[Azure DevOps PR Adapter]
    AI[AI Services and Semantic Kernel]
  end

  CallerGRPC-->API
  CallerMCP-->API
  CallerBus-->Saga
  API-->Saga
  Saga<-->Repo
  Saga-->MCPFS
  Saga-->MCPGit
  Saga-->ADO
  Saga-->AI
  • Agent placement and collaboration follow the ConnectSoft Agent System and platform architecture.
  • Execution flow adheres to the standard agent lifecycle (assignment → reasoning → validation → handoff).

🔁 Happy-Path Sequence

sequenceDiagram
  participant C as Client (gRPC/MCP/Bus)
  participant API as LGA Host
  participant S as Saga
  participant NH as NHibernate Repo
  participant FS as MCP Filesystem
  participant T as dotnet template (library)
  participant Q as Quality Runner
  participant G as MCP Git
  participant A as Azure DevOps PR

  C->>API: StartGeneration(correlationId, blueprintJson)
  API->>S: StartLibraryGeneration
  S->>NH: Persist Pending
  S->>FS: Prepare workspace
  FS-->>S: ok → WorkspaceReady
  S->>T: dotnet new connectsoft-library (+flags)
  T-->>S: ok → Generated
  S->>Q: build, tests, coverage ≥ threshold
  Q-->>S: pass → QualityPassed
  S->>G: clone/branch/commit/push
  G-->>S: commitId → BranchPushed
  S->>A: create PR
  A-->>S: prUrl → PROpened → Completed
  • Template + pipeline behavior from Library Template and Runbook.

🧭 Vision & Scope — Backend Library Generator Agent (LGA)

📌 Purpose

The Backend Library Generator Agent (LGA) is a factory microservice responsible for turning Library Blueprints + DSLs into a complete, production-grade .NET library solution. It eliminates manual scaffolding and ensures every library generated is modular, testable, cloud-native, and observable.

“In ConnectSoft, libraries are not written — they are generated, validated, and delivered as governed modules.”


🎯 Core Responsibilities

  • Consume: Library Blueprints (library-blueprint.yaml) and DSLs (dsl-library, dsl-feature-flow, etc.).
  • Generate: Deterministic library projects from the official Library Template.
  • Validate: Run build, unit tests, integration tests, and coverage gates.
  • Deliver: Commit & push code into Git (via MCP Git), open a pull request in Azure DevOps, and attach traceable metadata.
  • Expose APIs: gRPC endpoints and MCP tools for cross-agent invocation.
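The plan references library-blueprint.yaml without fixing its schema. As a purely illustrative sketch (every field name here is an assumption; the real schema is defined by the Factory DSLs), a blueprint might carry:

```yaml
# Illustrative only: field names are assumptions, not the Factory DSL schema.
library:
  name: ConnectSoft.Shared.Caching        # generated library/solution name
  targetFrameworks: [net8.0, net9.0]      # multi-TFM per the Library Template
  features:
    dependencyInjection: true             # DI/Options/Logging toggles
    options: true
    logging: true
repo:
  project: ConnectSoft                    # ADO project
  name: shared-caching                    # target repository
```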

🗺️ Boundaries & Context

  • Bounded Context: LibraryFactory.LibraryGeneration
  • Domain Model: Anemic aggregate (LibraryGeneration) + value objects (CorrelationId, BlueprintSnapshot, RepoRef, etc.).
  • Persistence: NHibernate + Outbox pattern.
  • Orchestration: MassTransit Saga — event-driven, idempotent, and replayable.
  • External Ports: MCP Filesystem, MCP Git, Azure DevOps PR, Semantic Kernel AI services.

LGA is not responsible for:

  • Designing business features (done by upstream planning/architect agents).
  • Managing runtime execution of libraries (done by consuming services).

🔐 Alignment to Platform Principles

  • DDD + Clean Architecture: Keeps library boundaries clean and deterministic.
  • Event-Driven Mindset: Every transition (WorkspaceReady, QualityPassed, PROpened) is an event.
  • Cloud-Native Mindset: Generated libraries ship with Docker support, IaC, and elastic-ready pipelines.
  • Observability-First: Each run emits telemetry (traceId, agentId, skillId, coverage %).
  • Knowledge & Memory System: Every artifact (code, docs, tests) is indexed and retrievable for reuse.

📊 Success Metrics

  • Library generation success rate: ≥ 95%
  • CI coverage gate: ≥ 70%
  • PR creation latency: < 3 min from blueprint submission
  • Traceability compliance: 100% (every output tagged with traceId)
  • Reuse & adoption: ≥ 80% of new libraries generated via LGA

🚀 Strategic Value

  • Speed: Turns blueprint → validated PR in minutes.
  • Consistency: Enforces uniform structure and quality gates across all backend libraries.
  • Enterprise Fit: Integrates natively with Azure DevOps, governance hooks, and package registries.
  • Scale: Enables generation of thousands of reusable libraries across tenants, domains, and editions.

Solution Structure

This solution follows the ConnectSoft Microservice Template composition (Domain → Persistence → Messaging/Flow → ServiceModel → Application/Infrastructure → Testing/Architecture), adapted to the Backend Library Generator Agent (LGA). It keeps concerns in separate projects and aligns to DDD/Clean boundaries with optional actors, schedulers, and multiple API faces.

Project Breakdown (LGA)

  • ConnectSoft.Factory.BackendLibraryGeneratorAgent (Library): Base constants, errors, shared primitives.
  • ConnectSoft.Factory.BackendLibraryGeneratorAgent.EntityModel (Library): Aggregate/VO contracts for LibraryGeneration and lookups.
  • ConnectSoft.Factory.BackendLibraryGeneratorAgent.DomainModel (Library): Use-case interfaces (start generation, status queries).
  • ConnectSoft.Factory.BackendLibraryGeneratorAgent.DomainModel.Impl (Library): Orchestrators, validators (anemic domain services).
  • ConnectSoft.Factory.BackendLibraryGeneratorAgent.PersistenceModel (Library): Repos/specifications + query models.
  • ConnectSoft.Factory.BackendLibraryGeneratorAgent.PersistenceModel.NHibernate (Library): NH mappings, outbox, dedup.
  • ConnectSoft.Factory.BackendLibraryGeneratorAgent.DatabaseModel.Migrations (Library): FluentMigrator DDL (schema + lookups).
  • ConnectSoft.Factory.BackendLibraryGeneratorAgent.MessagingModel (Library): Command/event DTOs.
  • ConnectSoft.Factory.BackendLibraryGeneratorAgent.FlowModel.MassTransit (Library): Saga state machine + consumers.
  • ConnectSoft.Factory.BackendLibraryGeneratorAgent.ActorModel.Orleans (optional Library): GenerationGrain for high-fanout coordination.
  • ConnectSoft.Factory.BackendLibraryGeneratorAgent.SchedulerModel.Hangfire (optional Library): Cleanup/DLQ/TTL jobs.
  • ConnectSoft.Factory.BackendLibraryGeneratorAgent.ServiceModel.Grpc (Library): Contract-first gRPC endpoint for Start/Status.
  • ConnectSoft.Factory.BackendLibraryGeneratorAgent.ServiceModel.RestApi (optional Library): Read-only REST (status, previews).
  • ConnectSoft.Factory.BackendLibraryGeneratorAgent.ServiceModel.GraphQL/SignalR (optional Libraries): Query graphs / real-time status streaming.
  • ConnectSoft.Factory.BackendLibraryGeneratorAgent.SemanticKernel (Library): AI helpers (docs/PR text/commit summaries).
  • ConnectSoft.Factory.BackendLibraryGeneratorAgent.ModelContextProtocol (Library): MCP server tools (start/get-status/preview).
  • ConnectSoft.Factory.BackendLibraryGeneratorAgent.ApplicationModel (Library): DI, middleware, telemetry, feature flags.
  • ConnectSoft.Factory.BackendLibraryGeneratorAgent.Application (Host): Worker + gRPC host; orchestrator entrypoint.
  • ConnectSoft.Factory.BackendLibraryGeneratorAgent.InfrastructureModel (Library): Pulumi IaC (Container App, SB, KV, ACR).
  • ConnectSoft.Factory.BackendLibraryGeneratorAgent.DockerCompose (Host): Local multi-container orchestration.
  • ConnectSoft.Factory.BackendLibraryGeneratorAgent.UnitTests (Tests): Unit coverage.
  • ConnectSoft.Factory.BackendLibraryGeneratorAgent.AcceptanceTests (Tests): E2E (MassTransit harness + fake adapters).
  • ConnectSoft.Factory.BackendLibraryGeneratorAgent.ArchitectureTests (Tests): Dependency and layering rules.
  • ConnectSoft.Factory.BackendLibraryGeneratorAgent.ArchitectureModel / DiagramAsCodeModel (Library): Mermaid/C4 + diagram-as-code.

The table mirrors the template’s responsibility groups and optional modules (ServiceModel variants, ActorModel, Scheduler), ensuring modularity and clean composition.

Tree View

ConnectSoft.Factory.BackendLibraryGeneratorAgent/
├─ Common/
│  ├─ ConnectSoft.Factory.BackendLibraryGeneratorAgent/
│  ├─ ConnectSoft.Factory.BackendLibraryGeneratorAgent.Options/
│  └─ ConnectSoft.Factory.BackendLibraryGeneratorAgent.Metrics/
├─ Domain/
│  ├─ ConnectSoft.Factory.BackendLibraryGeneratorAgent.EntityModel/
│  ├─ ConnectSoft.Factory.BackendLibraryGeneratorAgent.DomainModel/
│  └─ ConnectSoft.Factory.BackendLibraryGeneratorAgent.DomainModel.Impl/
├─ Persistence/
│  ├─ ConnectSoft.Factory.BackendLibraryGeneratorAgent.PersistenceModel/
│  ├─ ConnectSoft.Factory.BackendLibraryGeneratorAgent.PersistenceModel.NHibernate/
│  └─ ConnectSoft.Factory.BackendLibraryGeneratorAgent.DatabaseModel.Migrations/
├─ Messaging/
│  ├─ ConnectSoft.Factory.BackendLibraryGeneratorAgent.MessagingModel/
│  └─ ConnectSoft.Factory.BackendLibraryGeneratorAgent.FlowModel.MassTransit/
├─ ActorModel/                 # optional
│  └─ ConnectSoft.Factory.BackendLibraryGeneratorAgent.ActorModel.Orleans/
├─ Scheduler/                  # optional
│  └─ ConnectSoft.Factory.BackendLibraryGeneratorAgent.SchedulerModel.Hangfire/
├─ ServiceModel/
│  ├─ ConnectSoft.Factory.BackendLibraryGeneratorAgent.ServiceModel.Grpc/
│  ├─ ConnectSoft.Factory.BackendLibraryGeneratorAgent.ServiceModel.RestApi/          # optional
│  ├─ ConnectSoft.Factory.BackendLibraryGeneratorAgent.ServiceModel.GraphQL/          # optional
│  └─ ConnectSoft.Factory.BackendLibraryGeneratorAgent.ServiceModel.SignalR/          # optional
├─ AI/
│  ├─ ConnectSoft.Factory.BackendLibraryGeneratorAgent.SemanticKernel/
│  └─ ConnectSoft.Factory.BackendLibraryGeneratorAgent.ModelContextProtocol/
├─ Application/
│  ├─ ConnectSoft.Factory.BackendLibraryGeneratorAgent.ApplicationModel/
│  └─ ConnectSoft.Factory.BackendLibraryGeneratorAgent.Application/
├─ Infrastructure/
│  ├─ ConnectSoft.Factory.BackendLibraryGeneratorAgent.DockerCompose/
│  └─ ConnectSoft.Factory.BackendLibraryGeneratorAgent.InfrastructureModel/
├─ Testing/
│  ├─ ConnectSoft.Factory.BackendLibraryGeneratorAgent.UnitTests/
│  ├─ ConnectSoft.Factory.BackendLibraryGeneratorAgent.AcceptanceTests/
│  └─ ConnectSoft.Factory.BackendLibraryGeneratorAgent.ArchitectureTests/
└─ Architecture/
   ├─ ConnectSoft.Factory.BackendLibraryGeneratorAgent.ArchitectureModel/
   └─ ConnectSoft.Factory.BackendLibraryGeneratorAgent.DiagramAsCodeModel/

This structure is a direct specialization of the template’s tree (Common/Domain/Persistence/Messaging/ActorModel/Scheduler/ServiceModel/AI/Application/Infrastructure/Testing/Architecture).

Diagram (Mermaid)

graph TB
  subgraph Host
    App[ConnectSoft.Factory.BackendLibraryGeneratorAgent.Application] --> AppM[ConnectSoft.Factory.BackendLibraryGeneratorAgent.ApplicationModel]
  end

  subgraph APIs
    GRPC[ServiceModel.Grpc]
    REST[ServiceModel.RestApi]
    GQL[ServiceModel.GraphQL]
    SR[ServiceModel.SignalR]
  end

  subgraph Flow
    Msg[MessagingModel]
    Saga[FlowModel.MassTransit]
  end

  subgraph Domain
    Ent[EntityModel]
    DM[DomainModel]
    DMI[DomainModel.Impl]
  end

  subgraph Persistence
    PM[PersistenceModel]
    NH[PersistenceModel.NHibernate]
    Mig[DatabaseModel.Migrations]
  end

  subgraph Optional
    Orleans[ActorModel.Orleans]
    Hangfire[SchedulerModel.Hangfire]
  end

  subgraph AI
    SK[SemanticKernel]
    MCP[ModelContextProtocol]
  end

  APIs --> Flow
  Flow --> Domain
  Domain --> Persistence
  Host --> APIs
  Host --> Flow
  Host --> Optional
  Host --> AI

The diagram mirrors the template’s layering and optional modules (multiple API faces, actors, schedulers).

Dependency Rules

  • Inbound only into Domain (EntityModel/DomainModel); no outward refs from Domain to infrastructure.
  • ServiceModel/Flow call use cases; Persistence implements repository ports; Host wires DI/telemetry.
  • Optional modules (Orleans/Hangfire/REST/GraphQL/SignalR) are toggleable without breaking core boundaries.
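The ArchitectureTests project can enforce these rules mechanically. The sketch below assumes the NetArchTest.Rules package and uses illustrative namespace and type names (SomeDomainType is a placeholder for any type in the Domain assembly), so treat it as a pattern rather than the actual test suite.

```csharp
using System;
using NetArchTest.Rules;

// Sketch: Domain must not reference infrastructure concerns (namespaces illustrative).
var result = Types.InAssembly(typeof(SomeDomainType).Assembly)
    .That().ResideInNamespace("ConnectSoft.Factory.BackendLibraryGeneratorAgent.DomainModel")
    .ShouldNot().HaveDependencyOnAny(
        "ConnectSoft.Factory.BackendLibraryGeneratorAgent.PersistenceModel.NHibernate",
        "ConnectSoft.Factory.BackendLibraryGeneratorAgent.FlowModel.MassTransit")
    .GetResult();

if (!result.IsSuccessful)
    throw new Exception("Domain leaked a dependency on infrastructure.");
```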

📦 Domain (Anemic) & Persistence

🧱 Aggregate Root: LibraryGeneration

The LibraryGeneration aggregate represents the lifecycle of a generated backend library. It is modeled as a state machine with owned components that encapsulate the key outputs of the generation flow.

🔁 Lifecycle States

  • Pending → Run initialized, awaiting workspace.
  • WorkspaceReady → Temporary workspace prepared.
  • Generated → Template applied, artifacts created.
  • QualityPassed / QualityFailed → Validation phase outcome.
  • BranchPushed → Code committed and pushed to Git.
  • PROpened → Pull Request created in Azure DevOps.
  • Completed / Failed → Terminal states with outcome recorded.
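These states form a small state machine. The following package-free sketch expresses the legal transitions as an explicit table; the failure edges are assumptions, and the production orchestrator is of course the MassTransit saga, not this class.

```csharp
using System;
using System.Collections.Generic;

// Sketch of the LGA lifecycle as an explicit transition table
// (failure edges are assumptions; the real orchestrator is the saga).
enum RunState { Pending, WorkspaceReady, Generated, QualityPassed, QualityFailed, BranchPushed, PROpened, Completed, Failed }

static class Lifecycle
{
    static readonly Dictionary<RunState, RunState[]> Allowed = new()
    {
        [RunState.Pending]        = new[] { RunState.WorkspaceReady, RunState.Failed },
        [RunState.WorkspaceReady] = new[] { RunState.Generated, RunState.Failed },
        [RunState.Generated]      = new[] { RunState.QualityPassed, RunState.QualityFailed },
        [RunState.QualityPassed]  = new[] { RunState.BranchPushed, RunState.Failed },
        [RunState.QualityFailed]  = new[] { RunState.Failed },
        [RunState.BranchPushed]   = new[] { RunState.PROpened, RunState.Failed },
        [RunState.PROpened]       = new[] { RunState.Completed, RunState.Failed },
    };

    public static bool CanMove(RunState from, RunState to) =>
        Allowed.TryGetValue(from, out var next) && Array.IndexOf(next, to) >= 0;
}
```

Guarding every saga transition with such a table makes an illegal move (e.g. Pending straight to Generated) detectable and auditable.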

🧩 Owned Components

  • Workspace → File system workspace reference.
  • Quality → Build/test/coverage validation results.
  • GitOutcome → Commit hash, branch, patch fingerprint.
  • PullRequest → PR metadata, link, status.

🔑 Value Objects (VOs)

  • CorrelationId: Ensures traceability across saga, observability, and cross-agent flows.
  • BlueprintSnapshot: Frozen copy of the input blueprint at run start.
  • RepoRef: Target repository information (ADO project, repo name, etc.).
  • BranchName: Deterministic branch identifier derived from blueprint + correlation ID.
  • WorkItemRef: Optional link to Azure DevOps work item for traceability.
  • CommitMessage: Standardized commit message format (blueprint + run ID + change note).
  • Coverage: Captures code coverage % from the quality gate run.
  • GitPatchFingerprint: SHA or hash of the patch set for idempotency + duplicate prevention.
  • Path: Workspace folder path reference.
  • HttpUrl: PR URL or repository URL for external consumers.
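BranchName is required only to be deterministic in blueprint + correlation ID; the exact format is not fixed by this plan. One possible derivation (slug rules and the lga/ prefix are assumptions):

```csharp
using System;

static class BranchNames
{
    // Sketch: deterministic branch name from blueprint name + correlation id.
    // Same inputs always yield the same branch, so reruns are idempotent.
    public static string Derive(string blueprintName, Guid correlationId)
    {
        var slug = blueprintName.Trim().ToLowerInvariant()
            .Replace(' ', '-')
            .Replace('.', '-');
        var shortId = correlationId.ToString("N").Substring(0, 8);
        return $"lga/{slug}/{shortId}";
    }
}
```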

🗄️ Persistence Design

ORM Strategy

  • NHibernate (mapping-by-code) is used for ORM to ensure flexible, template-driven evolution.
  • Mapping files follow anemic DDD style, keeping domain logic minimal and persistence explicit.

Integration Events

  • Outbox Pattern ensures event reliability and idempotency.
  • Events are emitted after state transitions (e.g., LibraryGenerated, QualityValidated, PullRequestOpened).
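The dispatch sweep behind the Outbox Pattern can be sketched with in-memory stand-ins; the class and member names below are illustrative, not the real PersistenceModel types, and a production dispatcher would run inside the database transaction boundary.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Minimal in-memory sketch of an outbox dispatch sweep (names illustrative).
class OutboxMessage
{
    public Guid MessageId = Guid.NewGuid();
    public string MessageType = "";
    public DateTimeOffset? ProcessedUtc;   // null = still pending
    public int Retries;
}

class OutboxDispatcher
{
    // Publishes every pending message; marks success, counts failures for retry.
    public static int Dispatch(List<OutboxMessage> store, Action<OutboxMessage> publish)
    {
        int sent = 0;
        foreach (var msg in store.Where(m => m.ProcessedUtc is null).ToList())
        {
            try
            {
                publish(msg);                             // hand off to the bus
                msg.ProcessedUtc = DateTimeOffset.UtcNow; // never re-sent on the next sweep
                sent++;
            }
            catch
            {
                msg.Retries++;                            // retried on the next sweep
            }
        }
        return sent;
    }
}
```

A second sweep over the same store publishes nothing, which is the idempotency property the pattern exists for.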

Database Schema

  • FluentMigrator used for schema evolution and migration consistency.
Example tables:
  • LibraryGenerations → Aggregate root, keyed by CorrelationId.
  • LibraryGenerationEvents → Outbox, storing pending/committed integration events.
  • LibraryGenerationValueObjects → Separate mapping for VOs requiring persistence.

📊 Rationale

  • Auditability → Every state transition and event emission is persisted.
  • Idempotency → Outbox + patch fingerprinting prevent duplicate operations.
  • Template-Driven Evolution → Database schema evolves alongside blueprint/template upgrades.
  • Isolation → Library generation history is scoped per CorrelationId and blueprint.
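GitPatchFingerprint is specified only as "SHA or hash" stored in an nvarchar(64) key; a SHA-256 over normalized patch text fits that width exactly (64 hex characters). A minimal sketch, with the line-ending normalization as an assumption:

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

static class PatchFingerprint
{
    // Sketch: SHA-256 of the normalized patch text, 64 lowercase hex chars.
    // Normalizing CRLF -> LF keeps the fingerprint stable across platforms.
    public static string Compute(string patchText)
    {
        var normalized = patchText.Replace("\r\n", "\n");
        var hash = SHA256.HashData(Encoding.UTF8.GetBytes(normalized));
        return Convert.ToHexString(hash).ToLowerInvariant();
    }
}
```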

✅ Acceptance Criteria

  • Aggregate + VOs modeled and documented.
  • NHibernate mappings defined (compilation-ready stubs).
  • Outbox schema created with FluentMigrator DDL.
  • Data model reviewed and aligned with DDD + Clean Architecture principles.

Database & Model Layer

Scope & Principles

Goal: Implement a relational model for the anemic aggregate LibraryGeneration with owned components (Workspace, Quality, GitOutcome, PullRequest) and value objects, optimized for traceability, idempotency, and event-driven orchestration.

Hard rules:

  • All timestamps use datetimeoffset and are stored in UTC (offset +00:00). Defaults and check constraints enforce UTC.
  • Enumerations/Statuses are realized as lookup tables (FKs), not .NET enums persisted as integers.
  • All textual data uses nvarchar (explicit lengths; URLs and JSON as nvarchar(max)).
  • Idempotency keys (e.g., blueprint SHA, patch SHA) are nvarchar(64); correlation as uniqueidentifier.
  • Outbox present for integration events; audit trail for traceability.
  • NHibernate mapping-by-code; FluentMigrator for versioned DDL.
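The UTC rule is enforced twice: by CHECK constraints in the schema (below) and, ideally, at the application edge before a value ever reaches NHibernate. A minimal sketch of the latter:

```csharp
using System;

static class UtcClock
{
    // Normalize any caller-supplied timestamp to offset +00:00 before it is
    // persisted, so the DATEPART(TZ, ...) = 0 check constraints always pass.
    public static DateTimeOffset ToStorage(DateTimeOffset value) => value.ToUniversalTime();
}
```

For example, 10:00 at +02:00 is stored as 08:00 with a zero offset.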

Relational Model (Overview)

erDiagram
  LGA_LibraryGeneration ||--o{ LGA_AuditEntry : has
  LGA_LibraryGeneration ||--o{ LGA_OutboxMessage : emits
  LGA_LibraryGeneration }o--|| LU_LibraryGenerationStatus : "statusId → id"
  LGA_LibraryGeneration }o--|| LU_QualityStatus : "qualityStatusId → id"
  LGA_LibraryGeneration }o--|| LU_PullRequestState : "prStateId → id"
  LGA_LibraryGeneration }o--|| LU_SourceControlProvider : "scmProviderId → id"

  LGA_LibraryGeneration {
    uniqueidentifier id PK
    uniqueidentifier correlationId UNIQUE
    nvarchar(64) blueprintSha
    nvarchar(64) patchSha
    int statusId FK
    -- Workspace (owned)
    nvarchar(400) workspaceRoot
    nvarchar(400) workspaceScratch
    -- Quality (owned)
    int qualityStatusId FK
    decimal(5,2) coveragePercent
    nvarchar(2000) qualityReportUrl
    -- GitOutcome (owned)
    int scmProviderId FK
    nvarchar(400) repoFullName
    nvarchar(200) branchName
    nvarchar(64) commitId
    nvarchar(64) gitPatchFingerprint
    -- PullRequest (owned)
    int prStateId FK
    nvarchar(200) prNumber
    nvarchar(2000) prUrl
    -- Metadata
    nvarchar(2000) lastError
    datetimeoffset(7) createdUtc
    datetimeoffset(7) updatedUtc
    datetimeoffset(7) completedUtc NULL
    binary(8) rowVersion
  }

  LGA_AuditEntry {
    bigint auditId PK
    uniqueidentifier entityId
    nvarchar(100) entityType
    nvarchar(100) action
    nvarchar(200) actor
    nvarchar(max) changesJson
    datetimeoffset(7) occurredUtc
  }

  LGA_OutboxMessage {
    uniqueidentifier messageId PK
    uniqueidentifier aggregateId
    nvarchar(200) messageType
    nvarchar(max) payloadJson
    datetimeoffset(7) occurredUtc
    datetimeoffset(7) processedUtc NULL
    int retries
    nvarchar(200) lastError
  }

  LU_LibraryGenerationStatus { int id PK  nvarchar(50) code nvarchar(200) name }
  LU_QualityStatus          { int id PK  nvarchar(50) code nvarchar(200) name }
  LU_PullRequestState       { int id PK  nvarchar(50) code nvarchar(200) name }
  LU_SourceControlProvider  { int id PK  nvarchar(50) code nvarchar(200) name }

Flattened owned components: Workspace, Quality, GitOutcome, PullRequest are embedded as prefixed columns in LGA_LibraryGeneration (simplifies query paths and keeps aggregate atomic). Lookups capture state machines and providers.


Naming, Schemas, and Conventions

  • Schema: lga for domain tables; lga_lu for lookups (or lga + LU_* if single schema preferred).
  • Primary Keys: id (GUID) for aggregate; auditId (BIGINT IDENTITY) for audit; GUID for outbox.
  • Concurrency: rowVersion binary(8) (SQL Server rowversion) for optimistic concurrency.
  • UTC enforcement: datetimeoffset(7) with default SYSUTCDATETIME() (converted to datetimeoffset) and CHECK constraints to ensure DATEPART(TZ, column) = 0.
  • Indexes: see below.

DDL (FluentMigrator-friendly SQL Sketch)

Use this SQL as a reference; the corresponding FluentMigrator migration code follows below.

-- Lookups
CREATE TABLE lga_lu.LibraryGenerationStatus(
  id int IDENTITY(1,1) PRIMARY KEY,
  code nvarchar(50) NOT NULL UNIQUE,
  name nvarchar(200) NOT NULL
);
INSERT INTO lga_lu.LibraryGenerationStatus(code,name) VALUES
  (N'Pending',N'Pending'),
  (N'WorkspaceReady',N'Workspace Ready'),
  (N'Generated',N'Generated'),
  (N'QualityPassed',N'Quality Passed'),
  (N'QualityFailed',N'Quality Failed'),
  (N'BranchPushed',N'Branch Pushed'),
  (N'PROpened',N'PR Opened'),
  (N'Completed',N'Completed'),
  (N'Failed',N'Failed');

CREATE TABLE lga_lu.QualityStatus(
  id int IDENTITY(1,1) PRIMARY KEY,
  code nvarchar(50) NOT NULL UNIQUE,
  name nvarchar(200) NOT NULL
);
INSERT INTO lga_lu.QualityStatus(code,name) VALUES
  (N'Unknown',N'Unknown'),
  (N'Pass',N'Pass'),
  (N'Fail',N'Fail');

CREATE TABLE lga_lu.PullRequestState(
  id int IDENTITY(1,1) PRIMARY KEY,
  code nvarchar(50) NOT NULL UNIQUE,
  name nvarchar(200) NOT NULL
);
INSERT INTO lga_lu.PullRequestState(code,name) VALUES
  (N'None',N'None'),
  (N'Open',N'Open'),
  (N'Merged',N'Merged'),
  (N'Closed',N'Closed'),
  (N'Abandoned',N'Abandoned');

CREATE TABLE lga_lu.SourceControlProvider(
  id int IDENTITY(1,1) PRIMARY KEY,
  code nvarchar(50) NOT NULL UNIQUE,
  name nvarchar(200) NOT NULL
);
INSERT INTO lga_lu.SourceControlProvider(code,name) VALUES
  (N'Ado',N'Azure DevOps'),
  (N'GitHub',N'GitHub');

-- Aggregate
CREATE TABLE lga.LibraryGeneration(
  id uniqueidentifier NOT NULL PRIMARY KEY,
  correlationId uniqueidentifier NOT NULL UNIQUE,
  blueprintSha nvarchar(64) NOT NULL,
  patchSha nvarchar(64) NULL,

  statusId int NOT NULL REFERENCES lga_lu.LibraryGenerationStatus(id),

  -- Workspace*
  workspaceRoot nvarchar(400) NOT NULL,
  workspaceScratch nvarchar(400) NULL,

  -- Quality*
  qualityStatusId int NOT NULL REFERENCES lga_lu.QualityStatus(id),
  coveragePercent decimal(5,2) NULL,
  qualityReportUrl nvarchar(2000) NULL,

  -- GitOutcome*
  scmProviderId int NOT NULL REFERENCES lga_lu.SourceControlProvider(id),
  repoFullName nvarchar(400) NOT NULL,            -- e.g. org/project or org/repo
  branchName nvarchar(200) NOT NULL,
  commitId nvarchar(64) NULL,
  gitPatchFingerprint nvarchar(64) NULL,

  -- PullRequest*
  prStateId int NOT NULL REFERENCES lga_lu.PullRequestState(id),
  prNumber nvarchar(200) NULL,
  prUrl nvarchar(2000) NULL,

  -- Metadata
  lastError nvarchar(2000) NULL,

  createdUtc datetimeoffset(7) NOT NULL
    CONSTRAINT DF_LGA_LG_CreatedUtc DEFAULT (SYSUTCDATETIME()),
  updatedUtc datetimeoffset(7) NOT NULL
    CONSTRAINT DF_LGA_LG_UpdatedUtc DEFAULT (SYSUTCDATETIME()),
  completedUtc datetimeoffset(7) NULL,

  rowVersion rowversion,

  -- Constraints
  CONSTRAINT CK_LGA_LG_Coverage CHECK (coveragePercent IS NULL OR (coveragePercent >= 0 AND coveragePercent <= 100)),
  CONSTRAINT CK_LGA_LG_CreatedUtc_UTC CHECK (DATEPART(TZ, createdUtc) = 0),
  CONSTRAINT CK_LGA_LG_UpdatedUtc_UTC CHECK (DATEPART(TZ, updatedUtc) = 0),
  CONSTRAINT CK_LGA_LG_CompletedUtc_UTC CHECK (completedUtc IS NULL OR DATEPART(TZ, completedUtc) = 0)
);

-- Outbox
CREATE TABLE lga.OutboxMessage(
  messageId uniqueidentifier NOT NULL PRIMARY KEY,
  aggregateId uniqueidentifier NOT NULL,
  messageType nvarchar(200) NOT NULL,
  payloadJson nvarchar(max) NOT NULL,
  occurredUtc datetimeoffset(7) NOT NULL
    CONSTRAINT DF_LGA_OB_Occurred DEFAULT (SYSUTCDATETIME()),
  processedUtc datetimeoffset(7) NULL,
  retries int NOT NULL CONSTRAINT DF_LGA_OB_Retries DEFAULT (0),
  lastError nvarchar(2000) NULL,
  CONSTRAINT FK_LGA_OB_Aggregate FOREIGN KEY (aggregateId) REFERENCES lga.LibraryGeneration(id),
  CONSTRAINT CK_LGA_OB_Occurred_UTC CHECK (DATEPART(TZ, occurredUtc) = 0),
  CONSTRAINT CK_LGA_OB_Processed_UTC CHECK (processedUtc IS NULL OR DATEPART(TZ, processedUtc) = 0)
);

-- Audit
CREATE TABLE lga.AuditEntry(
  auditId bigint IDENTITY(1,1) NOT NULL PRIMARY KEY,
  entityId uniqueidentifier NOT NULL,
  entityType nvarchar(100) NOT NULL,      -- e.g., "LibraryGeneration"
  action nvarchar(100) NOT NULL,          -- e.g., "UpdateStatus", "OpenPR"
  actor nvarchar(200) NOT NULL,           -- e.g., "lga-saga", "devops-pipeline"
  changesJson nvarchar(max) NULL,
  occurredUtc datetimeoffset(7) NOT NULL
    CONSTRAINT DF_LGA_AU_Occurred DEFAULT (SYSUTCDATETIME()),
  CONSTRAINT CK_LGA_AU_UTC CHECK (DATEPART(TZ, occurredUtc) = 0)
);

-- Indexes
CREATE INDEX IX_LGA_LG_Status ON lga.LibraryGeneration(statusId);
CREATE INDEX IX_LGA_LG_QualityStatus ON lga.LibraryGeneration(qualityStatusId);
CREATE INDEX IX_LGA_LG_CreatedUtc ON lga.LibraryGeneration(createdUtc);
CREATE INDEX IX_LGA_LG_RepoBranch ON lga.LibraryGeneration(repoFullName, branchName);
CREATE INDEX IX_LGA_LG_BlueprintPatch ON lga.LibraryGeneration(blueprintSha, patchSha);
CREATE INDEX IX_LGA_OB_Aggregate ON lga.OutboxMessage(aggregateId, occurredUtc);
CREATE INDEX IX_LGA_AU_Entity ON lga.AuditEntry(entityId, occurredUtc);

Why flattened owned types? We keep the aggregate’s persistence atomic and simple to query, mirroring the anemic model; owned components do not need separate identities.


FluentMigrator (C#) – Skeleton

using FluentMigrator;

namespace ConnectSoft.Factory.BackendLibraryGeneratorAgent.Persistence.Migrations;

[Migration(2025091901)]
public class CreateLgaSchema : Migration
{
    public override void Up()
    {
        // Schemas
        if (!Schema.Schema("lga").Exists()) Execute.Sql("CREATE SCHEMA lga;");
        if (!Schema.Schema("lga_lu").Exists()) Execute.Sql("CREATE SCHEMA lga_lu;");

        // Lookup tables (example)
        Create.Table("LibraryGenerationStatus").InSchema("lga_lu")
            .WithColumn("id").AsInt32().PrimaryKey().Identity()
            .WithColumn("code").AsString(50).NotNullable().Unique()
            .WithColumn("name").AsString(200).NotNullable();

        // ... (QualityStatus, PullRequestState, SourceControlProvider)

        // Seed (safe idempotent inserts)
        Insert.IntoTable("LibraryGenerationStatus").InSchema("lga_lu").Row(new { code = "Pending", name = "Pending" });
        Insert.IntoTable("LibraryGenerationStatus").InSchema("lga_lu").Row(new { code = "WorkspaceReady", name = "Workspace Ready" });
        // ... remaining statuses

        // Aggregate table
        Create.Table("LibraryGeneration").InSchema("lga")
            .WithColumn("id").AsGuid().PrimaryKey()
            .WithColumn("correlationId").AsGuid().NotNullable().Unique()
            .WithColumn("blueprintSha").AsString(64).NotNullable()
            .WithColumn("patchSha").AsString(64).Nullable()

            .WithColumn("statusId").AsInt32().NotNullable().ForeignKey("FK_LGA_LG_Status", "lga_lu", "LibraryGenerationStatus", "id")

            .WithColumn("workspaceRoot").AsString(400).NotNullable()
            .WithColumn("workspaceScratch").AsString(400).Nullable()

            .WithColumn("qualityStatusId").AsInt32().NotNullable().ForeignKey("FK_LGA_LG_QualityStatus", "lga_lu", "QualityStatus", "id")
            .WithColumn("coveragePercent").AsDecimal(5,2).Nullable()
            .WithColumn("qualityReportUrl").AsString(2000).Nullable()

            .WithColumn("scmProviderId").AsInt32().NotNullable().ForeignKey("FK_LGA_LG_ScmProvider", "lga_lu", "SourceControlProvider", "id")
            .WithColumn("repoFullName").AsString(400).NotNullable()
            .WithColumn("branchName").AsString(200).NotNullable()
            .WithColumn("commitId").AsString(64).Nullable()
            .WithColumn("gitPatchFingerprint").AsString(64).Nullable()

            .WithColumn("prStateId").AsInt32().NotNullable().ForeignKey("FK_LGA_LG_PrState", "lga_lu", "PullRequestState", "id")
            .WithColumn("prNumber").AsString(200).Nullable()
            .WithColumn("prUrl").AsString(2000).Nullable()

            .WithColumn("lastError").AsString(2000).Nullable()

            .WithColumn("createdUtc").AsDateTimeOffset().NotNullable().WithDefault(SystemMethods.CurrentUTCDateTime)
            .WithColumn("updatedUtc").AsDateTimeOffset().NotNullable().WithDefault(SystemMethods.CurrentUTCDateTime)
            .WithColumn("completedUtc").AsDateTimeOffset().Nullable()
            .WithColumn("rowVersion").AsCustom("rowversion").NotNullable();

        // CHECK constraints (UTC + coverage)
        Execute.Sql("ALTER TABLE lga.LibraryGeneration ADD CONSTRAINT CK_LGA_LG_Coverage CHECK (coveragePercent IS NULL OR (coveragePercent >= 0 AND coveragePercent <= 100));");
        Execute.Sql("ALTER TABLE lga.LibraryGeneration ADD CONSTRAINT CK_LGA_LG_CreatedUtc_UTC CHECK (DATEPART(TZ, createdUtc) = 0);");
        Execute.Sql("ALTER TABLE lga.LibraryGeneration ADD CONSTRAINT CK_LGA_LG_UpdatedUtc_UTC CHECK (DATEPART(TZ, updatedUtc) = 0);");
        Execute.Sql("ALTER TABLE lga.LibraryGeneration ADD CONSTRAINT CK_LGA_LG_CompletedUtc_UTC CHECK (completedUtc IS NULL OR DATEPART(TZ, completedUtc) = 0);");

        // Outbox
        Create.Table("OutboxMessage").InSchema("lga")
            .WithColumn("messageId").AsGuid().PrimaryKey()
            .WithColumn("aggregateId").AsGuid().NotNullable().ForeignKey("FK_LGA_OB_Aggregate", "lga", "LibraryGeneration", "id")
            .WithColumn("messageType").AsString(200).NotNullable()
            .WithColumn("payloadJson").AsString(int.MaxValue).NotNullable()
            .WithColumn("occurredUtc").AsDateTimeOffset().NotNullable().WithDefault(SystemMethods.CurrentUTCDateTime)
            .WithColumn("processedUtc").AsDateTimeOffset().Nullable()
            .WithColumn("retries").AsInt32().NotNullable().WithDefaultValue(0)
            .WithColumn("lastError").AsString(2000).Nullable();

        Execute.Sql("ALTER TABLE lga.OutboxMessage ADD CONSTRAINT CK_LGA_OB_Occurred_UTC CHECK (DATEPART(TZ, occurredUtc) = 0);");
        Execute.Sql("ALTER TABLE lga.OutboxMessage ADD CONSTRAINT CK_LGA_OB_Processed_UTC CHECK (processedUtc IS NULL OR DATEPART(TZ, processedUtc) = 0);");

        // Audit
        Create.Table("AuditEntry").InSchema("lga")
            .WithColumn("auditId").AsInt64().PrimaryKey().Identity()
            .WithColumn("entityId").AsGuid().NotNullable()
            .WithColumn("entityType").AsString(100).NotNullable()
            .WithColumn("action").AsString(100).NotNullable()
            .WithColumn("actor").AsString(200).NotNullable()
            .WithColumn("changesJson").AsString(int.MaxValue).Nullable()
            .WithColumn("occurredUtc").AsDateTimeOffset().NotNullable().WithDefault(SystemMethods.CurrentUTCDateTime);

        Execute.Sql("ALTER TABLE lga.AuditEntry ADD CONSTRAINT CK_LGA_AU_UTC CHECK (DATEPART(TZ, occurredUtc) = 0);");

        // Indexes
        Create.Index("IX_LGA_LG_Status").OnTable("LibraryGeneration").InSchema("lga")
            .OnColumn("statusId").Ascending();
        Create.Index("IX_LGA_LG_QualityStatus").OnTable("LibraryGeneration").InSchema("lga")
            .OnColumn("qualityStatusId").Ascending();
        Create.Index("IX_LGA_LG_CreatedUtc").OnTable("LibraryGeneration").InSchema("lga")
            .OnColumn("createdUtc").Descending();
        Create.Index("IX_LGA_LG_RepoBranch").OnTable("LibraryGeneration").InSchema("lga")
            .OnColumn("repoFullName").Ascending()
            .OnColumn("branchName").Ascending();
        Create.Index("IX_LGA_LG_BlueprintPatch").OnTable("LibraryGeneration").InSchema("lga")
            .OnColumn("blueprintSha").Ascending()
            .OnColumn("patchSha").Ascending();

        Create.Index("IX_LGA_OB_Aggregate").OnTable("OutboxMessage").InSchema("lga")
            .OnColumn("aggregateId").Ascending()
            .OnColumn("occurredUtc").Descending();

        Create.Index("IX_LGA_AU_Entity").OnTable("AuditEntry").InSchema("lga")
            .OnColumn("entityId").Ascending()
            .OnColumn("occurredUtc").Descending();
    }

    public override void Down()
    {
        Delete.Table("AuditEntry").InSchema("lga");
        Delete.Table("OutboxMessage").InSchema("lga");
        Delete.Table("LibraryGeneration").InSchema("lga");

        Delete.Table("SourceControlProvider").InSchema("lga_lu");
        Delete.Table("PullRequestState").InSchema("lga_lu");
        Delete.Table("QualityStatus").InSchema("lga_lu");
        Delete.Table("LibraryGenerationStatus").InSchema("lga_lu");
    }
}

NHibernate Mapping-by-Code (C#)

Mapping preserves owned component boundaries via prefixed columns, FKs to lookup tables, and UTC DateTimeOffset.

using NHibernate;
using NHibernate.Mapping.ByCode;
using NHibernate.Mapping.ByCode.Conformist;

public class LibraryGenerationMap : ClassMapping<LibraryGeneration>
{
    public LibraryGenerationMap()
    {
        Table("lga.LibraryGeneration");

        Id(x => x.Id, m => {
            m.Generator(Generators.GuidComb);
            m.Column("id");
        });

        Property(x => x.CorrelationId, m => {
            m.Column("correlationId");
            m.NotNullable(true);
            m.Unique(true);
        });

        Property(x => x.BlueprintSha, m => { m.Column("blueprintSha"); m.Length(64); m.NotNullable(true); });
        Property(x => x.PatchSha, m => { m.Column("patchSha"); m.Length(64); m.NotNullable(false); });

        ManyToOne(x => x.Status, m => {
            m.Column("statusId");
            m.Class(typeof(LibraryGenerationStatus));
            m.NotNullable(true);
            m.Fetch(FetchKind.Select);
        });

        // Workspace (owned component; prefixed columns)
        Component(x => x.Workspace, c =>
        {
            c.Property(w => w.Root, m => { m.Column("workspaceRoot"); m.Length(400); m.NotNullable(true); });
            c.Property(w => w.Scratch, m => { m.Column("workspaceScratch"); m.Length(400); m.NotNullable(false); });
        });

        // Quality (owned component)
        Component(x => x.Quality, c =>
        {
            c.ManyToOne(q => q.Status, m =>
            {
                m.Column("qualityStatusId");
                m.Class(typeof(QualityStatus));
                m.NotNullable(true);
            });
            c.Property(q => q.CoveragePercent, m => m.Column("coveragePercent"));
            c.Property(q => q.ReportUrl, m => { m.Column("qualityReportUrl"); m.Length(2000); });
        });

        // GitOutcome (owned component)
        Component(x => x.GitOutcome, c =>
        {
            c.ManyToOne(g => g.Provider, m =>
            {
                m.Column("scmProviderId"); m.Class(typeof(SourceControlProvider)); m.NotNullable(true);
            });
            c.Property(g => g.RepoFullName, m => { m.Column("repoFullName"); m.Length(400); m.NotNullable(true); });
            c.Property(g => g.BranchName, m => { m.Column("branchName"); m.Length(200); m.NotNullable(true); });
            c.Property(g => g.CommitId, m => { m.Column("commitId"); m.Length(64); });
            c.Property(g => g.PatchFingerprint, m => { m.Column("gitPatchFingerprint"); m.Length(64); });
        });

        // PullRequest (owned component)
        Component(x => x.PullRequest, c =>
        {
            c.ManyToOne(p => p.State, m =>
            {
                m.Column("prStateId"); m.Class(typeof(PullRequestState)); m.NotNullable(true);
            });
            c.Property(p => p.Number, m => { m.Column("prNumber"); m.Length(200); });
            c.Property(p => p.Url, m => { m.Column("prUrl"); m.Length(2000); });
        });

        // Metadata
        Property(x => x.LastError, m => { m.Column("lastError"); m.Length(2000); });

        // UTC times
        Property(x => x.CreatedUtc, m => { m.Column("createdUtc"); m.Type(NHibernateUtil.DateTimeOffset); m.NotNullable(true); });
        Property(x => x.UpdatedUtc, m => { m.Column("updatedUtc"); m.Type(NHibernateUtil.DateTimeOffset); m.NotNullable(true); });
        Property(x => x.CompletedUtc, m => { m.Column("completedUtc"); m.Type(NHibernateUtil.DateTimeOffset); });

        Version(x => x.RowVersion, m => {
            m.Column(c => c.Name("rowVersion"));
            m.UnsavedValue(null);
            // SQL Server rowversion columns are generated by the database on every write.
            m.Generated(VersionGeneration.Always);
            m.Type(NHibernateUtil.BinaryBlob);
        });
    }
}

// Lookup maps are simple Id/Code/Name entities mapped to lga_lu.*

UTC discipline (application level):

  • Use DateTimeOffset in the domain model.
  • Set CreatedUtc/UpdatedUtc via a clock service that returns DateTimeOffset.UtcNow.
  • Middleware ensures any incoming timestamps are converted to UTC prior to persistence.
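The clock service mentioned above can be a one-property abstraction; a minimal sketch (the `IClock`/`SystemClock` names are illustrative, not from the codebase):

```csharp
using System;

// Illustrative clock abstraction: entities never call DateTimeOffset.UtcNow
// directly, so tests can pin time and persistence stays UTC-only.
public interface IClock
{
    DateTimeOffset UtcNow { get; }
}

public sealed class SystemClock : IClock
{
    public DateTimeOffset UtcNow => DateTimeOffset.UtcNow;
}
```

Handlers that set `CreatedUtc`/`UpdatedUtc` take `IClock` as a dependency, which keeps timestamp logic deterministic under test.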

Indices & Query Patterns

  • Operational queries

    • Find by correlationId (unique).
    • Dashboard lists by statusId, createdUtc DESC.
    • Quality board by qualityStatusId, coveragePercent.
    • Repo-branch activity by (repoFullName, branchName).
    • Idempotency scans by (blueprintSha, patchSha).
  • Retention

    • Consider partitioning/archiving AuditEntry by occurredUtc and event type after N days.
    • Outbox delete after processed + retention window.

Outbox & Audit Semantics

  • OutboxMessage
    • Written in the same transaction as LibraryGeneration state change.
    • Saga/worker reads unprocessed messages, publishes events, marks processedUtc.
  • AuditEntry
    • Created on every externally visible state change:
      • Status transitions, PR opened/merged, quality result changes, retry/compensation.
    • changesJson contains before/after or patch of fields (owned components included).
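The worker loop described above can be sketched as follows; `IOutboxStore`, `IMessagePublisher`, and `OutboxRow` are hypothetical abstractions over the `lga.OutboxMessage` table and the bus, not actual types from the codebase:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical abstractions over the lga.OutboxMessage table and the bus.
public interface IOutboxStore
{
    Task<OutboxRow[]> ReadUnprocessedAsync(int batchSize, CancellationToken ct);
    Task MarkProcessedAsync(long outboxId, DateTimeOffset processedUtc, CancellationToken ct);
}

public interface IMessagePublisher
{
    Task PublishAsync(string eventType, string payloadJson, CancellationToken ct);
}

public sealed record OutboxRow(long OutboxId, string EventType, string PayloadJson);

public sealed class OutboxDispatcher
{
    private readonly IOutboxStore _store;
    private readonly IMessagePublisher _publisher;

    public OutboxDispatcher(IOutboxStore store, IMessagePublisher publisher)
        => (_store, _publisher) = (store, publisher);

    // Publish-then-mark yields at-least-once delivery; consumers must dedupe.
    public async Task PumpOnceAsync(CancellationToken ct)
    {
        foreach (var row in await _store.ReadUnprocessedAsync(batchSize: 100, ct))
        {
            await _publisher.PublishAsync(row.EventType, row.PayloadJson, ct);
            await _store.MarkProcessedAsync(row.OutboxId, DateTimeOffset.UtcNow, ct);
        }
    }
}
```

Because a crash between publish and mark re-publishes the row, the idempotent-event guarantee elsewhere in this document is what makes the loop safe.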

Acceptance

  • Schema diagrams produced: ERD above reflects aggregate + owned components + lookups + outbox + audit.
  • DDL aligns with aggregates: Flattened owned components with prefixed columns; all enums via lookup FKs; strings nvarchar; UTC datetimeoffset.
  • Migrations compile and run: FluentMigrator skeleton provided with UTC defaults and check constraints; lookup seeding included.

📡 Contracts (gRPC, Bus, MCP)

🎯 gRPC API

The LGA exposes a contract-first C# gRPC interface that external clients (agents, orchestrators, CLI tools) can call. This contract is domain-oriented and focuses only on the lifecycle of a library generation run.

Service Interface

using System.ServiceModel;
using System.Threading.Tasks;

namespace ConnectSoft.Factory.BackendLibraryGeneratorAgent.ServiceModel.Grpc.Contracts
{
    /// <summary>
    /// gRPC service contract for initiating and monitoring library generation runs.
    /// </summary>
    [ServiceContract]
    public interface ILibraryGeneratorService
    {
        /// <summary>
        /// Start a new library generation run.
        /// </summary>
        [OperationContract]
        Task<StartGenerationReply> StartGeneration(StartGenerationRequest request);

        /// <summary>
        /// Get the status of a previously started run.
        /// </summary>
        [OperationContract]
        Task<GetRunStatusReply> GetRunStatus(GetRunStatusRequest request);
    }
}

Request/Reply DTOs

namespace ConnectSoft.Factory.BackendLibraryGeneratorAgent.ServiceModel.Grpc.Contracts
{
    public sealed class StartGenerationRequest
    {
        public string CorrelationId { get; init; } = default!;
        public string BlueprintJson { get; init; } = default!;
    }

    public sealed class StartGenerationReply
    {
        public string RunId { get; init; } = default!;
        public string InitialStatus { get; init; } = default!;
    }

    public sealed class GetRunStatusRequest
    {
        public string RunId { get; init; } = default!;
    }

    public sealed class GetRunStatusReply
    {
        public string Status { get; init; } = default!;
        public string? PullRequestUrl { get; init; }
        public string? FailureReason { get; init; }
    }
}

Design Notes

  • Contract-first → DTOs live in a shared contract assembly (ConnectSoft.Factory.BackendLibraryGeneratorAgent.ServiceModel.Grpc.Contracts).
  • Traceability → All requests carry CorrelationId or RunId for observability.
  • Minimal API surface → Only start and query status; internal orchestration is hidden.
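Assuming the contract is hosted with ServiceModel.Grpc (suggested by the contract namespace), a caller might look like the sketch below; the address and payload are placeholders, and `ClientFactory` is ServiceModel.Grpc's client entry point:

```csharp
using System;
using System.Threading.Tasks;
using Grpc.Net.Client;
using ServiceModel.Grpc.Client;

public static class LgaClientExample
{
    // Hypothetical caller; "https://lga.internal:5001" is a placeholder address.
    public static async Task<string> StartAsync(string blueprintJson)
    {
        var channel = GrpcChannel.ForAddress("https://lga.internal:5001");
        var factory = new ClientFactory();
        var client = factory.CreateClient<ILibraryGeneratorService>(channel);

        StartGenerationReply reply = await client.StartGeneration(new StartGenerationRequest
        {
            CorrelationId = Guid.NewGuid().ToString(),
            BlueprintJson = blueprintJson
        });

        return reply.RunId;
    }
}
```

The same contract assembly serves both sides, so client and host cannot drift apart.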

🚌 MassTransit Bus Contracts

The orchestration inside the LGA is event-driven. We define commands and events as immutable records to coordinate the Saga.

Commands

public record StartLibraryGeneration(Guid CorrelationId, string BlueprintJson);
public record PrepareWorkspace(Guid CorrelationId);
public record RunTemplate(Guid CorrelationId);
public record RunQuality(Guid CorrelationId);
public record PushBranch(Guid CorrelationId);
public record OpenPullRequest(Guid CorrelationId);

Events

public record WorkspacePrepared(Guid CorrelationId, string Path);
public record LibraryGenerated(Guid CorrelationId);
public record QualityValidated(Guid CorrelationId, bool Passed, double Coverage);
public record BranchCommitted(Guid CorrelationId, string CommitId);
public record PullRequestOpened(Guid CorrelationId, string PullRequestUrl);
public record LibraryGenerationFailed(Guid CorrelationId, string Reason);

Contract Guarantees

  • All messages must carry CorrelationId for Saga correlation and traceability.
  • Events are idempotent → the same event replayed does not corrupt state.
  • Commands and events are versionable to allow evolution without breaking older runs.

🔧 MCP Server (Tools)

To integrate with the ConnectSoft Agent System, the LGA exposes MCP tools:

  • library.start_generation → Start a new run.

    • Inputs: correlationId, blueprintJson
    • Output: { runId, initialStatus }
  • library.get_status → Query run status.

    • Inputs: runId
    • Output: { status, pr_url?, failure_reason? }

Usage Scenario

  • A Solution Architect Agent triggers library.start_generation after producing a blueprint.
  • A QA Agent later polls library.get_status to decide if tests passed.
  • A DevOps Agent can wait for PullRequestOpened and attach additional pipeline steps.

📊 Rationale

  • Single Source of Truth: gRPC contracts are the public API, bus contracts drive orchestration, MCP tools are the agent-facing surface.
  • Consistency: All flows share the same identifiers (CorrelationId, RunId).
  • Extensibility: New commands/events can be added without changing the gRPC facade.
  • Agent-Friendly: MCP tools provide the simplest entrypoint for other agents, hiding infra details.

✅ Acceptance Criteria

  • C# contract classes (Requests, Replies, Commands, Events) compile.
  • gRPC service interface defined and aligned with clean architecture.
  • MassTransit contracts integrated into Saga with CorrelationId correlation.
  • MCP server exposes tools with the same naming as bus/gRPC methods.
  • Test harness demonstrates successful invocation via all three paths (gRPC, Bus, MCP).

🔄 Orchestration (Saga) & Policies

The Library Generation Saga is the single state machine that coordinates the end-to-end flow from blueprint intake to PR creation. It is message-driven, idempotent, and compensating on partial success. All transitions emit structured telemetry (traceId = CorrelationId) and outbox events.


🧭 State Machine (Happy Path)

States

Pending → WorkspaceReady → Generated → QualityPassed/Failed → BranchPushed → PROpened → Completed/Failed

Triggers (Commands/Events)

| From → To | Trigger (Command/Event) | Action (Side Effects) |
|---|---|---|
| Pending → WorkspaceReady | PrepareWorkspace / WorkspacePrepared | Create temp workspace, persist path, emit WorkspacePrepared |
| WorkspaceReady → Generated | RunTemplate / LibraryGenerated | Execute library template with flags, persist patch SHA, emit LibraryGenerated |
| Generated → QualityPassed | RunQuality [pass] / QualityValidated | Build & test; store coverage; emit QualityValidated(passed=true) |
| Generated → QualityFailed | RunQuality [fail] / QualityValidated | Persist failure diagnostics; emit QualityValidated(passed=false) |
| QualityPassed → BranchPushed | PushBranch / BranchCommitted | Commit & push; store commitId; emit BranchCommitted |
| BranchPushed → PROpened | OpenPullRequest / PullRequestOpened | Create PR; store PR URL; emit PullRequestOpened |
| PROpened → Completed | (terminal) | Mark success; emit LibraryGenerationCompleted |
| * → Failed | LibraryGenerationFailed | Persist failure + reason; compensate; emit LibraryGenerationFailed |

🧱 Saga State (MassTransit)

public sealed class LibraryGenerationState :
    SagaStateMachineInstance
{
    public Guid CorrelationId { get; set; }            // also traceId
    public string CurrentState { get; set; } = default!;
    public string? WorkspacePath { get; set; }
    public string? PatchSha { get; set; }
    public string? CommitId { get; set; }
    public string? PullRequestUrl { get; set; }
    public double? Coverage { get; set; }
    public string? FailureReason { get; set; }

    // Idempotency & dedupe
    public string? BlueprintSha { get; set; }
    public DateTime StartedUtc { get; set; }
    public DateTime? FinishedUtc { get; set; }
}

⚙️ Saga Definition (outline)

public sealed class LibraryGenerationStateMachine :
    MassTransitStateMachine<LibraryGenerationState>
{
    public State Pending { get; private set; }
    public State WorkspaceReady { get; private set; }
    public State Generated { get; private set; }
    public State QualityPassed { get; private set; }
    public State QualityFailed { get; private set; }
    public State BranchPushed { get; private set; }
    public State PROpened { get; private set; }
    public Event<StartLibraryGeneration> Start { get; private set; }
    public Event<WorkspacePrepared> WorkspacePrepared { get; private set; }
    public Event<LibraryGenerated> LibraryGenerated { get; private set; }
    public Event<QualityValidated> QualityValidated { get; private set; }
    public Event<BranchCommitted> BranchCommitted { get; private set; }
    public Event<PullRequestOpened> PullRequestOpened { get; private set; }
    public Event<LibraryGenerationFailed> Failed { get; private set; }

    public LibraryGenerationStateMachine()
    {
        InstanceState(x => x.CurrentState);

        Event(() => Start, x => x.CorrelateById(m => m.Message.CorrelationId));
        Event(() => WorkspacePrepared, x => x.CorrelateById(m => m.Message.CorrelationId));
        // ... other correlations

        Initially(
            When(Start)
                .Then(ctx =>
                {
                    ctx.Saga.BlueprintSha = ComputeSha(ctx.Message.BlueprintJson);
                    ctx.Saga.StartedUtc = DateTime.UtcNow;
                })
                .Publish(ctx => new PrepareWorkspace(ctx.Saga.CorrelationId))
                .TransitionTo(Pending)
        );

        During(Pending,
            When(WorkspacePrepared)
                .Then(ctx => ctx.Saga.WorkspacePath = ctx.Message.Path)
                .Publish(ctx => new RunTemplate(ctx.Saga.CorrelationId))
                .TransitionTo(WorkspaceReady)
        );

        During(WorkspaceReady,
            When(LibraryGenerated)
                .Publish(ctx => new RunQuality(ctx.Saga.CorrelationId))
                .TransitionTo(Generated)
        );

        During(Generated,
            When(QualityValidated, ctx => ctx.Message.Passed)
                .Then(ctx => ctx.Saga.Coverage = ctx.Message.Coverage)
                .Publish(ctx => new PushBranch(ctx.Saga.CorrelationId))
                .TransitionTo(QualityPassed),

            When(QualityValidated, ctx => !ctx.Message.Passed)
                .Then(ctx => ctx.Saga.Coverage = ctx.Message.Coverage)
                .TransitionTo(QualityFailed)
        );

        During(QualityPassed,
            When(BranchCommitted)
                .Then(ctx => ctx.Saga.CommitId = ctx.Message.CommitId)
                .Publish(ctx => new OpenPullRequest(ctx.Saga.CorrelationId))
                .TransitionTo(BranchPushed)
        );

        During(BranchPushed,
            When(PullRequestOpened)
                .Then(ctx => ctx.Saga.PullRequestUrl = ctx.Message.PullRequestUrl)
                .Then(ctx => ctx.Saga.FinishedUtc = DateTime.UtcNow)
                .TransitionTo(PROpened)
                .Finalize()
        );

        // Global failure
        DuringAny(
            When(Failed)
                .Then(ctx => ctx.Saga.FailureReason = ctx.Message.Reason)
                .ThenAsync(async ctx => await CompensateAsync(ctx.Saga))
                .Finalize()
        );

        SetCompletedWhenFinalized();
    }
}

Note: publishing commands (Publish(...)) assumes an outbox is configured to guarantee at-least-once delivery without duplicates.
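One possible registration, assuming MassTransit v8 with RabbitMQ; the transport and saga-repository choices here are assumptions, not mandated by this design:

```csharp
using System;
using MassTransit;
using Microsoft.Extensions.DependencyInjection;

public static class SagaBusRegistration
{
    // Sketch only: swap the transport/repository for your environment.
    public static IServiceCollection AddLgaSaga(this IServiceCollection services)
    {
        services.AddMassTransit(x =>
        {
            x.AddSagaStateMachine<LibraryGenerationStateMachine, LibraryGenerationState>();
            // In production, attach a persistent saga repository (e.g., NHibernate-backed).

            x.UsingRabbitMq((context, cfg) =>
            {
                // Matches the retry policy in this document: exponential backoff, max 5 attempts.
                cfg.UseMessageRetry(r => r.Exponential(5,
                    TimeSpan.FromSeconds(3), TimeSpan.FromSeconds(30), TimeSpan.FromSeconds(5)));

                // Holds Publish(...) calls until the saga state is persisted,
                // so no event escapes before its state change is durable.
                cfg.UseInMemoryOutbox();

                cfg.ConfigureEndpoints(context);
            });
        });

        return services;
    }
}
```

The in-memory outbox covers the publish-after-persist ordering; the database outbox table covers durability across process crashes.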


🔐 Policies

Idempotency Keys

  • Primary: CorrelationId
  • Content Keys: Blueprint SHA + Patch SHA
    • If a subsequent request arrives with the same (CorrelationId, BlueprintSha, PatchSha), the saga returns the existing result (no duplicate work).
    • If CorrelationId matches but content differs (e.g., new BlueprintSha), fork a new run (new CorrelationId required) or reject per policy.
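The content keys can be produced with SHA-256 over the canonicalized payload; a minimal sketch of the `ComputeSha` helper the saga outline references (the canonicalization step is assumed to happen upstream):

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class ContentSha
{
    // Assumes the blueprint JSON is already canonicalized (stable key order,
    // normalized whitespace) so identical content always hashes identically.
    public static string Compute(string canonicalJson)
    {
        byte[] hash = SHA256.HashData(Encoding.UTF8.GetBytes(canonicalJson));
        return Convert.ToHexString(hash).ToLowerInvariant(); // 64 hex chars → blueprintSha(64)
    }
}
```

The 64-character lowercase hex output matches the `blueprintSha`/`patchSha` column lengths in the schema.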

Retries & Redelivery

  • Transient operations (template run, git push, PR creation) use exponential backoff retry (e.g., 3, 10, 30 seconds; max 5 attempts).
  • Poison messages are moved to a DLQ with full context (CorrelationId, BlueprintSha, last error) and surfaced to Observability dashboards.
  • Redelivery is safe due to outbox + idempotent handlers (e.g., re-applying same patch yields same PatchSha → no double commits).

Compensation (Partial Success)

When a downstream step succeeds but later steps fail:

  • PR created but later failure → add a PR comment with failure reason and label (e.g., generation-failed), attach traceId, and leave PR open for human triage.
  • Branch pushed but PR creation failed → leave branch; post status check and create a work item (optional) with details.
  • Workspace created but generation failed → clean up workspace best-effort; retain artifacts for diagnostics when configured.

Concurrency & Ordering

  • One active transition per CorrelationId (saga instance) at a time.
  • Re-ordering tolerated: events include idempotent guards (e.g., ignore WorkspacePrepared if already WorkspaceReady).

Security & Secrets

  • Access to MCP FS/Git and ADO adapters scoped per run via ephemeral tokens from Key Vault.
  • All emitted logs are structured; secrets are redacted at the sink.

🧪 Failure Scenarios (Top 5)

  1. Template Execution Error

    • Where: RunTemplate
    • Handling: Retry with backoff. On final fail → emit LibraryGenerationFailed, attach logs; compensate (cleanup workspace), mark Failed.
  2. Quality Gate Fails (Coverage < threshold)

    • Where: RunQuality
    • Handling: Transition to QualityFailed; no automatic retry (non-transient). Leave artifacts + coverage report; label PR if one exists.
  3. Git Push Rejected (non-fast-forward / permission)

    • Where: PushBranch
    • Handling: Pull & rebase retry (limited). If still failing → create work item + PR comment (if PR exists), mark Failed.
  4. PR Creation Fails (rate limits / validation)

    • Where: OpenPullRequest
    • Handling: Retry with backoff. On final fail → keep branch, emit failure, add status check and create work item, mark Failed.
  5. Outbox Delivery Timeout / DLQ

    • Where: Any publish step
    • Handling: Message redelivery with dedupe; operator notified via DLQ alert. Saga remains waiting until event arrives or manual abort occurs.

Each scenario logs a machine-readable error (category, stage, correlationId, blueprintSha) to enable Studio analytics and auto-triage.


📐 Transition Table (Summary)

| State | On Event | Action | Next State |
|---|---|---|---|
| Pending | WorkspacePrepared | Save path; publish RunTemplate | WorkspaceReady |
| WorkspaceReady | LibraryGenerated | Publish RunQuality | Generated |
| Generated | QualityValidated(p) | Save coverage; publish PushBranch | QualityPassed |
| Generated | QualityValidated(!p) | Save coverage | QualityFailed |
| QualityPassed | BranchCommitted | Publish OpenPullRequest | BranchPushed |
| BranchPushed | PullRequestOpened | Save PR URL; finalize | PROpened/Done |
| * | LibraryGenerationFailed | Compensate; finalize | Failed |

✅ Acceptance

  • State machine and transition table documented and reviewed.
  • Five failure scenarios explicitly modeled with retries and compensation.
  • Idempotency policy implemented: CorrelationId + BlueprintSha + PatchSha.
  • Outbox + DLQ configured; redelivery verified in tests.
  • Telemetry: each transition emits structured events tied to CorrelationId for traceability.

🧰 Tooling Ports & Adapters

This section defines the clean architecture ports the Saga calls and their adapters (infra integrations). All ports are async, idempotent, trace-aware, and designed for fakeable testing.


🔌 Ports (Domain-Facing Interfaces)

namespace ConnectSoft.Factory.BackendLibraryGeneratorAgent.Ports;

public interface IMcpFilesystem
{
    Task<WorkspaceResult> PrepareWorkspaceAsync(PrepareWorkspaceRequest req, CancellationToken ct);
    Task WriteFilesAsync(WriteFilesRequest req, CancellationToken ct);               // for templated output
    Task<string> ComputePatchShaAsync(ComputePatchShaRequest req, CancellationToken ct);
    Task CleanupAsync(CleanupWorkspaceRequest req, CancellationToken ct);
}

public interface IMcpGit
{
    Task<GitCloneResult> CloneAsync(GitCloneRequest req, CancellationToken ct);
    Task<GitCommitResult> CommitAsync(GitCommitRequest req, CancellationToken ct);
    Task<GitPushResult> PushAsync(GitPushRequest req, CancellationToken ct);
}

public interface IAdoPullRequests
{
    Task<PullRequestResult> CreatePrAsync(CreatePrRequest req, CancellationToken ct);
    Task AddCommentAsync(AddPrCommentRequest req, CancellationToken ct);
    Task AddLabelAsync(AddPrLabelRequest req, CancellationToken ct);
    Task SetStatusCheckAsync(SetPrStatusCheckRequest req, CancellationToken ct);
}

public interface IQualityRunner
{
    Task<QualityResult> RunAsync(QualityRunRequest req, CancellationToken ct);       // build + tests + coverage
}

public interface IAiServices
{
    Task<AiDocResult> SummarizeBlueprintAsync(AiDocRequest req, CancellationToken ct);
    Task<AiCommitMsgResult> GenerateCommitMessageAsync(AiCommitMsgRequest req, CancellationToken ct);
    Task<AiPrBodyResult> GeneratePrBodyAsync(AiPrBodyRequest req, CancellationToken ct);
    Task<AiRemediationPlanResult> ProposeFixesAsync(AiRemediationPlanRequest req, CancellationToken ct);
}

Common request metadata (all ports receive):

public sealed record ExecutionContext(
    string CorrelationId,        // == traceId
    string TenantId,
    string? ProjectId,
    string AgentId,              // "lga"
    string SkillId,              // e.g., "RunTemplate"
    IDictionary<string,string> Tags);

🧩 Adapters (Infra Implementations)

1) MCP Filesystem Adapter (McpFilesystemAdapter)

  • Purpose: Workspace lifecycle and content I/O via MCP FS server.
  • Security: Allow-list of writable roots (e.g., /tmp/lga/*); path traversal blocked.
  • Idempotency: ComputePatchShaAsync returns stable SHA over sorted file list + contents.
  • Observability: Emits spans for prepare, write_batch, compute_sha, cleanup.
public sealed class McpFilesystemAdapter : IMcpFilesystem
{
    // ctor: IHttpClientFactory, IOptions<McpFsOptions>, ILogger, IMeter
    // maps to MCP /fs APIs: list, read, write, mkdir, stat … (tool calls)
}
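The stable-SHA rule can be sketched as follows (illustrative, not the adapter's actual code): hash paths and contents in ordinal path order so repeated runs over the same output always produce the same fingerprint.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

public static class PatchSha
{
    // Illustrative: ordinal sort makes the digest independent of enumeration order.
    public static string Compute(IEnumerable<(string Path, byte[] Content)> files)
    {
        using var sha = SHA256.Create();
        foreach (var (path, content) in files.OrderBy(f => f.Path, StringComparer.Ordinal))
        {
            byte[] pathBytes = Encoding.UTF8.GetBytes(path + "\n");
            sha.TransformBlock(pathBytes, 0, pathBytes.Length, null, 0);
            sha.TransformBlock(content, 0, content.Length, null, 0);
        }
        sha.TransformFinalBlock(Array.Empty<byte>(), 0, 0);
        return Convert.ToHexString(sha.Hash!).ToLowerInvariant();
    }
}
```

Re-applying the same template over the same blueprint therefore yields the same `patchSha`, which is what the idempotency policy relies on.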

Config – McpFsOptions:

| Key | Example | Notes |
|---|---|---|
| Mcp:Filesystem:Roots | /tmp/lga;/mnt/work/* | Allow-list of roots |
| Mcp:Filesystem:MaxWriteBytes | 10485760 | 10 MB per batch |
| Mcp:Filesystem:CleanupOnFailure | true | Best-effort cleanup |

2) MCP Git Adapter (McpGitAdapter)

  • Purpose: Clone/commit/push via MCP Git server (credential-less from LGA; tokens resolved by MCP).
  • Branching: feature/lga/{short-corr}-{blueprintSha6}.
  • Commit message: From IAiServices.GenerateCommitMessageAsync, fallback to conventional commit.
  • Retry: Push uses exponential backoff; on non-fast-forward → one rebase attempt.
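The branch naming policy can be sketched as below; the segment lengths (8 hex chars of the correlation id, 6 chars of the SHA) are assumptions inferred from the pattern:

```csharp
using System;

public static class BranchNaming
{
    // Assumed: short-corr = first 8 hex chars of the correlation id ("N" format),
    // blueprintSha6 = first 6 chars of the blueprint SHA.
    public static string For(Guid correlationId, string blueprintSha)
    {
        string shortCorr = correlationId.ToString("N")[..8];
        string sha6 = blueprintSha[..6];
        return $"feature/lga/{shortCorr}-{sha6}";
    }
}
```

Including the blueprint SHA in the branch name makes re-runs over identical content land on the same branch instead of multiplying branches.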

Config – McpGitOptions:

| Key | Example | Notes |
|---|---|---|
| Mcp:Git:RemoteUrl | https://dev.azure.com/org/_git/repo | Template; repo comes in req |
| Mcp:Git:UserNameClaim | svc-lga | For audit |
| Mcp:Git:MaxPushRetries | 5 | With jitter/backoff |

3) Azure DevOps PR Adapter (AdoPullRequestsAdapter)

  • Purpose: Create PR, add comments/labels/status checks via ADO REST.
  • Secrets: PAT retrieved per-run from Key Vault (by reference id), not stored in memory.
  • Compensation: On downstream failures, leaves PR open + posts comment + label generation-failed.

Config – AdoOptions:

| Key | Example | Notes |
|---|---|---|
| Ado:Organization | connectsoft | |
| Ado:Project | Factory | Default; override per request |
| Ado:Repo | LibraryTemplates | Default repo |
| Ado:PatSecretRef | kv://secrets/ado-pat-lga | Key Vault reference |
| Ado:RequiredReviewers | ["arch-bot","qa-bot"] | Optional |
| Ado:StatusContext | LGA/Quality | For status check gate |

4) Quality Runner Adapter (QualityRunnerAdapter)

  • Purpose: Execute dotnet build, test, and coverage tool (Coverlet) against the generated solution.
  • Environment: Can run in-container (prefer) or local runner with sandbox.
  • Outputs: CoveragePct, BuildLogUri, TestLogUri (artifact links).

Config – QualityOptions:

| Key | Example | Notes |
|---|---|---|
| Quality:MinCoveragePct | 70 | Gate enforced in Saga |
| Quality:RunInContainer | true | Use job container |
| Quality:ContainerImage | mcr.microsoft.com/dotnet/sdk:9.0 | .NET SDK job image |
| Quality:TimeoutSeconds | 1800 | 30 min |

5) AI Services Adapter (AiServicesAdapter via Semantic Kernel)

  • Purpose: Generate commit messages, PR body, docs summaries, and remediation plans from logs/blueprint.
  • Safety: Prompts include traceId, role and guardrails; never expose secrets/logs verbatim.
  • Fallbacks: If AI fails/timeouts → deterministic templates take over (no hard-block).
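The deterministic fallback can be a plain template; an illustrative sketch (the conventional-commit format is an assumption, not a documented contract):

```csharp
using System;

public static class CommitMessageFallback
{
    // Illustrative deterministic fallback used when the AI call fails or times out;
    // assumes conventional-commit style and an 8-char SHA prefix.
    public static string For(string packageId, string blueprintSha) =>
        $"feat({packageId}): generate library from blueprint {blueprintSha[..8]}";
}
```

Because the fallback depends only on inputs already carried by the run, retries produce identical commit messages.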

Config – AiOptions:

| Key | Example | Notes |
|---|---|---|
| Ai:Provider | AzureOpenAI | or OpenAI |
| Ai:Deployment | gpt-4.1 | Model name/deployment id |
| Ai:Temperature | 0.2 | Deterministic output |
| Ai:TimeoutSeconds | 20 | Fast responses |


🧪 Testability & Fakes

Provide in-memory fakes for all ports to enable E2E and unit testing without external systems:

public sealed class FakeMcpFilesystem : IMcpFilesystem { /* no-op or temp dirs */ }
public sealed class FakeMcpGit : IMcpGit { /* returns deterministic commit/push */ }
public sealed class FakeAdoPullRequests : IAdoPullRequests { /* returns dummy PR url */ }
public sealed class FakeQualityRunner : IQualityRunner { /* configurable Pass/Fail */ }
public sealed class FakeAiServices : IAiServices { /* template-based strings */ }

  • Fakes accept an IMutableBehavior knob (e.g., NextRunFailsOnce("Push")) to simulate failure scenarios for Saga tests.

🧷 DI Registration (Composition Root)

public static class LgaServiceCollectionExtensions
{
    public static IServiceCollection AddLgaAdapters(this IServiceCollection services, IConfiguration cfg)
    {
        services.Configure<McpFsOptions>(cfg.GetSection("Mcp:Filesystem"));
        services.Configure<McpGitOptions>(cfg.GetSection("Mcp:Git"));
        services.Configure<AdoOptions>(cfg.GetSection("Ado"));
        services.Configure<QualityOptions>(cfg.GetSection("Quality"));
        services.Configure<AiOptions>(cfg.GetSection("Ai"));

        services.AddHttpClient();  // adapters use typed clients

        services.AddTransient<IMcpFilesystem, McpFilesystemAdapter>();
        services.AddTransient<IMcpGit, McpGitAdapter>();
        services.AddTransient<IAdoPullRequests, AdoPullRequestsAdapter>();
        services.AddTransient<IQualityRunner, QualityRunnerAdapter>();
        services.AddTransient<IAiServices, AiServicesAdapter>();

        return services;
    }
}

In test projects, override with fakes: services.AddTransient<IMcpFilesystem, FakeMcpFilesystem>(); etc.


🔒 Security & Governance Notes

  • Principle of least privilege: adapters request scoped tokens (repo/PR specific).
  • Key Vault for PATs and AI keys; no secrets in logs or memory.
  • Allow-lists for FS + Git targets; deny-by-default enforcement.
  • Redaction middleware for structured logs (block Authorization, Set-Cookie, etc.).

📈 Observability

All adapters:

  • Include traceId/CorrelationId and agentId=lga in logs/spans.
  • Emit metrics: success rate, latency, payload sizes, retries.
  • Produce events used by the Saga (WorkspacePrepared, BranchCommitted, PullRequestOpened, QualityValidated).

📚 DSL & Blueprint Alignment

  • Blueprint → Adapter mapping:
    • repo: → IMcpGit.CloneAsync target + branch naming policy.
    • quality: → coverage threshold, test matrix presets.
    • docs: → AI summarization + PR body generation.
  • DSL control plane enforces allow-list of operations (e.g., which FS paths or repos are permitted), preventing ungoverned actions.
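The deny-by-default check behind the allow-lists can be a simple prefix/glob match; an illustrative sketch (only a trailing `*` wildcard is assumed):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AllowList
{
    // Illustrative: only a trailing '*' wildcard is supported; any candidate
    // that matches no allow entry is denied by default.
    public static bool IsAllowed(string candidate, IEnumerable<string> allowed) =>
        allowed.Any(pattern => pattern.EndsWith("*")
            ? candidate.StartsWith(pattern[..^1], StringComparison.OrdinalIgnoreCase)
            : candidate.Equals(pattern, StringComparison.OrdinalIgnoreCase));
}
```

The same predicate can guard both `allow.repos` and `allow.paths` before any FS or Git operation is dispatched.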

✅ Acceptance

  • Ports compiled with XML docs and nullability enabled.
  • Adapters registered via DI; fakes available for tests.
  • Config keys documented and validated (Options + IValidateOptions<>).
  • Security: Key Vault integration tested; allow-lists enforced.
  • Observability: spans, logs, and metrics emitted with CorrelationId.
  • E2E harness runs with all fakes to cover success + failure scenarios.

🧬 DSL & Blueprint Integration

This section defines how DSLs act as the control plane for the LGA and how a Library Blueprint is mapped into a deterministic output (code, tests, docs, CI, NuGet metadata). It also lists the template switches and the transformation rules the generator applies to the .csproj, solution layout, and pipeline.


🗺️ Control Plane (DSLs, Intents, Traces)

Purpose: DSLs encode intent, structure, contracts, and triggers that drive the generator. Every artifact carries trace metadata for audit and replay.

Control Concepts

  • Intent & Triggers
    • intent: generate-library
    • triggers: { onEvent, onCommand, onSchedule } (e.g., onEvent: VisionDocumentCreated)
  • Contracts
    • ports: references (fs, git, quality, ado, ai)
    • policies: idempotency keys, retries, coverage thresholds
  • Trace
    • traceId, sessionId, blueprintId, componentTag: lga
  • Guardrails / Allow Lists
    • allowedPaths, allowedRepos, allowedLicenses

Minimal DSL Block (embedded in blueprint)

control:
  intent: generate-library
  triggers:
    - onCommand: StartLibraryGeneration
  policies:
    idempotencyKeys: [ correlationId, blueprintSha, patchSha ]
    retries:
      pushBranch: { maxAttempts: 5, backoff: exponential }
      createPr: { maxAttempts: 5, backoff: exponential }
  trace:
    traceId: ${correlationId}
    componentTag: lga
  allow:
    repos: [ "https://dev.azure.com/connectsoft/Factory/_git/*" ]
    paths: [ "/tmp/lga/**" ]

🧩 Blueprint → Output Mapping

Blueprint Files

  • library-blueprint.yaml – identity, metadata, composition, quality
  • options.yaml – template flags and toggles (overrides default switches)
  • (Optional) docs/ – seed README fragments / API notes

Example: library-blueprint.yaml

library:
  packageId: ConnectSoft.Extensions.Mapping.Mapster
  name: ConnectSoft.Extensions.Mapping.Mapster
  description: Abstraction-first mapping provider using Mapster
  owners: [ "Platform/DevEx" ]
  repositoryUrl: https://dev.azure.com/connectsoft/Factory/_git/Extensions
  license: MIT
  tags: [ "mapping", "abstractions", "mapster", "connectsoft" ]
  authors: [ "ConnectSoft" ]
  versioning:
    scheme: SemVer
    initialVersion: 0.1.0
  tfm:
    multi: [ "net8.0", "net9.0" ]
  analyzers:
    enable: true
    ruleset: strict
quality:
  minCoveragePct: 75
  testMatrix:
    unit: true
    integration: true
docs:
  readme:
    title: "ConnectSoft Mapping (Mapster provider)"
    badges: [ "build", "coverage", "nuget" ]
  changelog: true
ports:
  git:
    targetRepo: Extensions
    defaultBranch: main
    reviewers: [ "arch-bot", "qa-bot" ]
  ado:
    project: Factory
  ai:
    commitMessages: true
    prBody: true

Example: options.yaml (template switches & overrides)

template:
  useDI: true
  useOptions: true
  useLogging: true
  nullable: enable
  analyzers: enable
  publishers:
    nuget: true
    artifacts: true
  includeSample: false
  strongName: false
  generateSourceLink: true
  repoSigning: false
  style:
    editorconfig: dotnet-sdk
  pipeline:
    packOnPR: true
    pushToArtifactsOnMerge: true

Deterministic Output (generator guarantees)

| Input Area | Output Artifact/Change |
|---|---|
| library.* | .csproj metadata (PackageId, Description, Authors, RepositoryUrl, PackageTags, License) |
| tfm.multi | Multi-TFM in .csproj (<TargetFrameworks>net8.0;net9.0</TargetFrameworks>) |
| quality.minCoveragePct | Pipeline gate + QualityOptions.MinCoveragePct override |
| docs.readme.* | README.md scaffold with badges + sections |
| ports.git.* | Branch naming + reviewers + PR target |
| ai.* | AI-generated commit message + PR body (with deterministic fallbacks) |
| analyzers.ruleset | Include Directory.Build.props with strict rules + warnings as errors (opt-in) |

⚙️ Template Switches & Transform Rules

CLI-equivalent switches (applied by the generator; these aren’t user-facing CLI flags but blueprint-driven toggles):

  • --useDI → Adds DI-friendly skeleton, ServiceCollectionExtensions, example registration tests.
  • --useOptions → Adds Options class + IValidateOptions<T> + example configuration docs.
  • --useLogging → Adds ILogger<T> samples and logging scopes.
  • --nullable {enable|disable} → Sets <Nullable> in .csproj.
  • --analyzers {enable|disable} → Adds analyzers package refs + ruleset wiring.
  • --multiTFM net8.0;net9.0 → Sets <TargetFrameworks>.
  • --includeSample → Optional SampleUsage project and tests (off by default).
  • --generateSourceLink → Adds SourceLink + ContinuousIntegrationBuild.
  • --strongName → Conditionally adds signing props & key reference.
  • --publishers.nuget|artifacts → Adds pack job and artifacts publishing in ADO pipeline.
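
The switch-to-property mapping above can be sketched as a pure function; the `switches_to_msbuild` name and the exact property set are illustrative assumptions, but the MSBuild property names themselves (Nullable, TargetFrameworks, SourceLink-related flags) are standard.

```python
# Illustrative (not the actual generator) mapping from blueprint-driven
# toggles to the MSBuild properties they control.
def switches_to_msbuild(switches: dict) -> dict:
    props = {"Nullable": switches.get("nullable", "enable")}
    tfms = switches.get("multiTFM")
    if tfms:
        props["TargetFrameworks"] = ";".join(tfms)
    if switches.get("generateSourceLink"):
        props["PublishRepositoryUrl"] = "true"
        props["ContinuousIntegrationBuild"] = "true"
        props["EmbedUntrackedSources"] = "true"
    if switches.get("strongName"):
        props["SignAssembly"] = "true"  # key file reference added elsewhere
    return props

props = switches_to_msbuild({
    "nullable": "enable",
    "multiTFM": ["net8.0", "net9.0"],
    "generateSourceLink": True,
})
print(props["TargetFrameworks"])  # net8.0;net9.0
```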

.csproj transformation sketch

<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <TargetFrameworks>net8.0;net9.0</TargetFrameworks>
    <Nullable>enable</Nullable>
    <GenerateDocumentationFile>true</GenerateDocumentationFile>
    <AssemblyName>ConnectSoft.Extensions.Mapping.Mapster</AssemblyName>
    <PackageId>ConnectSoft.Extensions.Mapping.Mapster</PackageId>
    <Description>Abstraction-first mapping provider using Mapster</Description>
    <Authors>ConnectSoft</Authors>
    <RepositoryUrl>https://dev.azure.com/connectsoft/Factory/_git/Extensions</RepositoryUrl>
    <PackageTags>mapping;abstractions;mapster;connectsoft</PackageTags>
    <PackageLicenseExpression>MIT</PackageLicenseExpression>
    <PublishRepositoryUrl>true</PublishRepositoryUrl>
    <ContinuousIntegrationBuild>true</ContinuousIntegrationBuild>
    <EmbedUntrackedSources>true</EmbedUntrackedSources>
  </PropertyGroup>

  <!-- Analyzers (optional) -->
  <ItemGroup Condition="'$(AnalyzersEnabled)'=='true'">
    <PackageReference Include="Microsoft.CodeAnalysis.NetAnalyzers" Version="9.*" PrivateAssets="All" />
    <PackageReference Include="Meziantou.Analyzer" Version="2.*" PrivateAssets="All" />
  </ItemGroup>
</Project>

Pipeline generation (Azure Pipelines YAML, excerpt)

trigger: none
pr:
  branches: { include: [ main, feature/* ] }

pool:
  vmImage: 'ubuntu-latest'

variables:
  BuildConfiguration: Release
  MinCoveragePct: ${{ parameters.minCoveragePct }}

steps:
- task: DotNetCoreCLI@2
  displayName: Build
  inputs: { command: 'build', projects: '**/*.csproj', arguments: '-c $(BuildConfiguration)' }

- task: DotNetCoreCLI@2
  displayName: Test with coverage
  inputs: { command: 'test', projects: '**/*Tests/*.csproj', arguments: '-c $(BuildConfiguration) /p:CollectCoverage=true /p:CoverletOutputFormat=cobertura' }

- task: PublishCodeCoverageResults@2
  inputs: { summaryFileLocation: '$(System.DefaultWorkingDirectory)/**/coverage.cobertura.xml' }

- script: |
    pct=$(python - <<'PY'
    # parse cobertura and print int coverage
    PY
    )
    if [ "$pct" -lt "$(MinCoveragePct)" ]; then
      echo "##vso[task.logissue type=error]Coverage gate failed: $pct < $(MinCoveragePct)"
      exit 1
    fi
  displayName: Enforce coverage gate
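
The Python heredoc in the gate step above is left as a placeholder. One possible stdlib-only body for "parse cobertura and print int coverage" reads the line-rate attribute on the report's root <coverage> element (this is a sketch, not the pipeline's actual script):

```python
import xml.etree.ElementTree as ET

def coverage_pct(cobertura_xml: str) -> int:
    """Overall line coverage as an integer percent, from the root element."""
    root = ET.fromstring(cobertura_xml)  # Cobertura root tag is <coverage>
    return round(float(root.attrib["line-rate"]) * 100)

# Tiny inline sample standing in for coverage.cobertura.xml
SAMPLE = '<coverage line-rate="0.783" branch-rate="0.61"></coverage>'
print(coverage_pct(SAMPLE))  # 78
```

Reading the root attribute avoids averaging the per-class line-rate values that also appear throughout the report.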

🔄 Generator Flow (Blueprint → Ports)

  1. Parse & Validate library-blueprint.yaml and options.yaml (schema below).
  2. Derive blueprintSha and branchName (policy).
  3. Prepare Workspace via IMcpFilesystem.PrepareWorkspaceAsync.
  4. Materialize Template applying toggles (files, .csproj, pipeline, README).
  5. Compute Patch and Patch SHA via IMcpFilesystem.ComputePatchShaAsync.
  6. Run Quality via IQualityRunner.RunAsync (respects quality.minCoveragePct).
  7. Commit/Push via IMcpGit.
  8. Open PR via IAdoPullRequests, using IAiServices for commit messages/PR body.
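
Step 2 can be sketched as two pure functions. The hashing of a canonical JSON rendering matches the determinism goals stated elsewhere in this document; the exact branch-naming policy (prefix, truncation length, sanitization rules) is an assumption here:

```python
import hashlib
import json
import re

def blueprint_sha(blueprint: dict) -> str:
    """Hash a canonical JSON rendering so semantically equal blueprints match."""
    canonical = json.dumps(blueprint, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def branch_name(package_id: str, sha: str, max_len: int = 60) -> str:
    """Hypothetical policy: feature/lga/<sanitized-id>-<sha8>, truncated."""
    slug = re.sub(r"[^a-z0-9.-]+", "-", package_id.lower()).strip("-")
    return f"feature/lga/{slug}-{sha[:8]}"[:max_len]

sha = blueprint_sha({"library": {"packageId": "ConnectSoft.Extensions.Strings"}})
print(branch_name("ConnectSoft.Extensions.Strings", sha))
```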

✅ Schema & Validation (excerpt)

library-blueprint.schema.yaml

type: object
required: [ library, quality, ports ]
properties:
  library:
    type: object
    required: [ packageId, name, description, repositoryUrl, tfm ]
    properties:
      packageId: { type: string, pattern: '^[A-Za-z0-9_.]+$' }
      name: { type: string, minLength: 3 }
      description: { type: string, minLength: 10 }
      authors: { type: array, items: { type: string } }
      license: { type: string, enum: [ MIT, Apache-2.0, Proprietary ] }
      tags: { type: array, items: { type: string } }
      tfm:
        type: object
        properties:
          multi: { type: array, items: { type: string, enum: [ 'net8.0', 'net9.0' ] } }
  quality:
    type: object
    required: [ minCoveragePct ]
    properties:
      minCoveragePct: { type: integer, minimum: 0, maximum: 100 }
  ports:
    type: object
    properties:
      git: { type: object, required: [ targetRepo, defaultBranch ] }
      ado: { type: object, required: [ project ] }

Validation Rules

  • packageId must be unique within the tenant/organization (checked via repo/package registry).
  • tfm.multi must intersect with platform-supported TFM set.
  • Disallow forbidden licenses if governance policy requires.
  • quality.minCoveragePct overrides the default gate but cannot go below platform minimum (e.g., 60%).
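
The validation rules above can be checked with a small pure function; the allowed-license and platform-TFM sets below are taken from the schema excerpt, while the function name and error strings are illustrative:

```python
import re

PLATFORM_TFMS = {"net8.0", "net9.0"}                   # platform-supported set
ALLOWED_LICENSES = {"MIT", "Apache-2.0", "Proprietary"}  # per schema enum
PLATFORM_MIN_COVERAGE = 60                             # platform floor (example)

def validate(blueprint: dict) -> list:
    """Return a list of rule violations (empty means valid)."""
    errors = []
    lib, quality = blueprint["library"], blueprint["quality"]
    if not re.fullmatch(r"[A-Za-z0-9_.]+", lib["packageId"]):
        errors.append("packageId: invalid characters")
    if not set(lib["tfm"]["multi"]) & PLATFORM_TFMS:
        errors.append("tfm.multi: no platform-supported TFM")
    if lib.get("license") not in ALLOWED_LICENSES:
        errors.append("license: forbidden by governance policy")
    if quality["minCoveragePct"] < PLATFORM_MIN_COVERAGE:
        errors.append("minCoveragePct: below platform minimum")
    return errors
```

Uniqueness of packageId requires a registry lookup, so it stays outside this pure check.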

🧪 Dry-Run Rendering

The LGA supports dry-run mode to materialize output without committing:

run:
  mode: dry-run
  outputPath: /tmp/lga/dryrun/${correlationId}

  • Produces full file tree under outputPath.
  • Emits a diff report (added/changed/removed) vs. template baseline.
  • Logs include PatchSha for idempotency checks.

Diff Rules (template guarantees)

  • Deterministic file ordering and normalized line endings.
  • Timestamps ignored; only content-based differences measured.
  • Generated GUIDs/numbers must be stable within a run (seed = correlationId).
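
A content-based PatchSha that honors the diff rules above (sorted file order, normalized line endings, timestamps ignored) can be sketched as follows; `compute_patch_sha` is a hypothetical stand-in for the MCP filesystem port's implementation:

```python
import hashlib
import tempfile
from pathlib import Path

def compute_patch_sha(workspace: Path) -> str:
    """Sorted relative paths + CRLF-normalized bytes -> one stable digest."""
    digest = hashlib.sha256()
    for path in sorted(p for p in workspace.rglob("*") if p.is_file()):
        digest.update(path.relative_to(workspace).as_posix().encode())
        digest.update(path.read_bytes().replace(b"\r\n", b"\n"))
    return digest.hexdigest()

# Same content, different line endings -> identical PatchSha
with tempfile.TemporaryDirectory() as a, tempfile.TemporaryDirectory() as b:
    (Path(a) / "README.md").write_bytes(b"# Lib\r\n")
    (Path(b) / "README.md").write_bytes(b"# Lib\n")
    sha_a, sha_b = compute_patch_sha(Path(a)), compute_patch_sha(Path(b))
print(sha_a == sha_b)  # True
```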

📚 README & NuGet Metadata Injection

  • README badges are driven by blueprint (build, coverage, nuget).
  • NuGet metadata in .csproj pulled from library.* and validated.
  • Optionally generate CHANGELOG.md (Keep a Changelog format) with initial entry 0.1.0.
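
Seeding the optional CHANGELOG.md with its 0.1.0 entry is a simple templating step; a sketch in Keep a Changelog style (the helper name and exact wording of the entry are assumptions):

```python
from datetime import date

def initial_changelog(version: str = "0.1.0") -> str:
    """Seed CHANGELOG.md in Keep a Changelog style with the first release."""
    today = date.today().isoformat()
    return (
        "# Changelog\n\n"
        "All notable changes to this project will be documented in this file.\n\n"
        f"## [{version}] - {today}\n\n"
        "### Added\n\n"
        "- Initial release generated from the library blueprint.\n"
    )

print(initial_changelog())
```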

📈 Traceability

  • Every rendered artifact includes a trace footer in comments (file headers) with:
    • traceId, blueprintSha, generatorVersion, timestampUtc.
  • PR description includes a trace block and links to coverage report and build logs.
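
Rendering the trace footer for a file header is a small formatting step; a sketch with the four fields listed above (`trace_footer` and the line-comment layout are illustrative assumptions):

```python
from datetime import datetime, timezone

def trace_footer(trace_id: str, blueprint_sha: str, generator_version: str,
                 comment_prefix: str = "//") -> str:
    """Render the per-file trace block as line comments."""
    ts = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    fields = {
        "traceId": trace_id,
        "blueprintSha": blueprint_sha,
        "generatorVersion": generator_version,
        "timestampUtc": ts,
    }
    return "\n".join(f"{comment_prefix} {k}: {v}" for k, v in fields.items())

print(trace_footer("corr-123", "a" * 64, "1.4.0"))
```

The comment prefix is parameterized so the same renderer serves .cs files ("//"), YAML ("#"), and Markdown footers.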

✅ Acceptance

  • Provided example blueprint + options render to a dry-run folder successfully.
  • Diff report shows only expected files and transformations (matches template guarantees).
  • .csproj, pipeline YAML, README, and tests are generated with correct metadata.
  • Validation fails gracefully on invalid license, unsupported TFM, or low coverage requirement.
  • Trace metadata present in file headers and PR body.

🧪 Testing Strategy (Unit, Integration, Generation Quality)

This section defines the test layers, tools, fixtures, and quality gates that guarantee the LGA remains deterministic, observable, and safe to evolve. All tests run headless in CI and can also run locally with containerized dependencies.


🧩 Test Layers & Scope

| Layer | Scope | Tooling / Harness | Success Signal |
| --- | --- | --- | --- |
| Unit | Consumers/handlers (pure logic), idempotency checks, retries/backoff, template symbol resolution, VO formatting, branch naming | MSTest (or xUnit), FluentAssertions, NSubstitute/Moq | Fast (<200ms), isolated, deterministic |
| Integration | NHibernate mappings & migrations, Outbox delivery, MassTransit saga transitions, fake adapters (MCP/ADO/Git/Quality/AI) | Testcontainers (optional), MassTransit TestHarness, SQLite/InMemory, Fakes (ports) | E2E Start → Completed with assertions on events |
| Generation QA | Build/test/coverage over rendered output; README/YAML shape checks; NuGet metadata validation | Dotnet CLI (build/test), Coverlet, Markdownlint/YAMLLint, custom verifiers for .csproj metadata | Coverage ≥ gate; artifacts structurally correct |

🧱 Unit Tests (What to Verify)

  1. Idempotency & Dedupe

    • Same (CorrelationId, BlueprintSha, PatchSha) must not re-run template or push changes.
    • Different BlueprintSha with same CorrelationId must be rejected or fork per policy.
  2. Retries & Backoff

    • Transient exceptions (push/PR) follow configured exponential backoff.
    • Verify max attempts and jitter logic (inject fake clock).
  3. Template Symbol Resolution

    • Options (useDI, useOptions, useLogging, multi-TFM) map to expected transformed files and .csproj elements.
    • Verify package metadata injection (PackageId, Description, RepositoryUrl, License, Tags).
  4. Value Objects & Policies

    • BranchName derivation (format, truncation, sanitization).
    • CommitMessage template (AI fallback → conventional commit).
    • Coverage rounding/formatting.
  5. Saga Guards

    • Ignore out-of-order events (e.g., duplicate WorkspacePrepared).
    • Transition table adherence (no “teleporting” across states).

Skeleton Example (MSTest)

[TestClass]
public class IdempotencyTests
{
    [TestMethod]
    public async Task SameCorrelationAndContent_MustNotRegenerateOrPush()
    {
        var ctx = Fixtures.NewExecution("corr-123", blueprintSha: "aaa", patchSha: "bbb");
        var fakes = Fixtures.Fakes()
            .WithFilesystemPatchSha("bbb")
            .TrackCalls();

        await Orchestrator.RunOnceAsync(ctx, fakes);
        await Orchestrator.RunOnceAsync(ctx, fakes); // simulate replay

        fakes.Filesystem.Writes.Should().Be(1);
        fakes.Git.Pushes.Should().Be(1);
        fakes.Ado.CreatedPrs.Should().Be(1);
    }
}
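
The retries-and-backoff checks in item 2 are easiest against a pure schedule function with an injected random source (the fake-clock idea above). A sketch under an assumed full-jitter policy — the function name and parameters are illustrative, not the agent's actual retry API:

```python
import random

def backoff_schedule(max_attempts, base_ms, cap_ms, rng):
    """Exponential backoff with full jitter: uniform(0, min(cap, base*2^n))."""
    delays = []
    for attempt in range(max_attempts):
        ceiling = min(cap_ms, base_ms * (2 ** attempt))
        delays.append(int(rng.uniform(0, ceiling)))
    return delays

# Seeded RNG plays the role of the injected fake clock: the schedule is
# deterministic in tests but jittered in production.
print(backoff_schedule(5, 100, 2000, random.Random(42)))
```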

🔗 Integration Tests (Harness Design)

Goals

  • Validate NHibernate mappings & migrations (FluentMigrator) against SQLite in-memory or Testcontainers for parity.
  • Exercise MassTransit StateMachine with the TestHarness to assert transitions and emitted events.
  • Replace infra with fakes:
    • IMcpFilesystem → temp dir
    • IMcpGit → no-op commit/push with deterministic commit IDs
    • IAdoPullRequests → fake PR URL + comment recording
    • IQualityRunner → configurable pass/fail + coverage
    • IAiServices → deterministic strings

Happy Path E2E: Start → Completed

[TestMethod]
public async Task Start_To_Completed_HappyPath()
{
    using var h = await HarnessFactory.StartAsync(fakes => fakes
        .WithQualityPass(coverage: 78.3)
        .WithFilesystemPatchSha("patch-123"));

    var corr = Guid.NewGuid().ToString("N");
    await h.Bus.Publish(new StartLibraryGeneration(corr, Fixtures.BlueprintJson()));

    (await h.Harness.Consumed.Any<WorkspacePrepared>()).Should().BeTrue();
    (await h.Harness.Consumed.Any<LibraryGenerated>()).Should().BeTrue();
    (await h.Harness.Consumed.Any<QualityValidated>()).Should().BeTrue();
    (await h.Harness.Consumed.Any<BranchCommitted>()).Should().BeTrue();
    (await h.Harness.Consumed.Any<PullRequestOpened>()).Should().BeTrue();

    var state = await h.StateStore.LoadAsync(corr);
    state.CurrentState.Should().Be("Final");
    state.Coverage.Should().BeApproximately(78.3, 0.01);
    state.PullRequestUrl.Should().StartWith("https://dev.azure.com/");
}

Failure Matrix (must test at least):

  1. Template execution error → Failed, workspace cleanup recorded.

  2. Coverage below threshold → QualityFailed, no PR; artifacts retained.
  3. Git push rejected → retry + work item/PR comment; Failed if exhausted.
  4. PR creation rate-limited → retries, branch kept, status check set; Failed if exhausted.
  5. Outbox delayed/DLQ → redelivery leads to eventual transition; or operator abort.

🧪 Generation Quality (Rendered Output)

What we verify on the **generated library**:

  • Build & Test succeed (dotnet CLI).
  • Coverage meets or exceeds the blueprint gate (or the platform-wide default).
  • README contains badges/sections from blueprint; YAML pipeline includes build/test/coverage steps.
  • .csproj NuGet metadata: PackageId, Description, RepositoryUrl, License, Tags, SourceLink.
  • Multi-TFM correct and compilable for net8.0 and net9.0.
  • Lint checks: Markdownlint, YAMLLint (optional).

Verifier Snippets

CsprojVerifier.HasProperty(csproj, "TargetFrameworks", "net8.0;net9.0");
CsprojVerifier.HasProperty(csproj, "PackageId", "ConnectSoft.Extensions.Mapping.Mapster");
ReadmeVerifier.HasBadges("build","coverage","nuget");
YamlVerifier.HasSteps(pipeline, "DotNetCoreCLI@2", "PublishCodeCoverageResults@2");
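
For intuition, a minimal Python analogue of the CsprojVerifier check — matching any <PropertyGroup> element by tag name and expected text (`has_property` is a sketch, not the actual verifier):

```python
import xml.etree.ElementTree as ET

def has_property(csproj_xml: str, name: str, expected: str) -> bool:
    """True if any element named `name` carries exactly the expected text."""
    root = ET.fromstring(csproj_xml)
    return any(el.text == expected for el in root.iter(name))

CSPROJ = """<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <TargetFrameworks>net8.0;net9.0</TargetFrameworks>
    <PackageId>ConnectSoft.Extensions.Mapping.Mapster</PackageId>
  </PropertyGroup>
</Project>"""

print(has_property(CSPROJ, "TargetFrameworks", "net8.0;net9.0"))  # True
```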

🧰 Fixtures, Data & Determinism

  • Seed all randomized operations from CorrelationId to guarantee reproducibility.
  • Temp dirs under /tmp/lga/tests/{correlationId}; cleaned automatically.
  • Common Fixture helpers:
    • Fixtures.BlueprintJson(overrides?)
    • Fixtures.OptionsYaml(overrides?)
    • Fixtures.NewExecution(correlationId, blueprintSha, patchSha)
    • Fixtures.Fakes().WithQualityPass(...) .WithPushRejectOnce() …
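
Seeding "all randomized operations from CorrelationId" can be made concrete for GUIDs by deriving each one from the seed plus a counter; `stable_guid` is an illustrative sketch of that determinism guarantee:

```python
import hashlib
import uuid

def stable_guid(correlation_id: str, counter: int) -> uuid.UUID:
    """Derive the n-th GUID of a run deterministically from the CorrelationId."""
    digest = hashlib.sha256(f"{correlation_id}:{counter}".encode()).digest()
    return uuid.UUID(bytes=digest[:16])

a = stable_guid("corr-123", 0)
b = stable_guid("corr-123", 0)
print(a == b)  # same seed and counter -> same GUID, so re-runs reproduce output
```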

📈 CI Configuration & Coverage Gate

  • Default gate: 70% line coverage (from Runbook); overridable per blueprint but not below platform minimum (e.g., 60%).
  • Coverlet produces Cobertura; PublishCodeCoverageResults uploads to CI.
  • Fail-fast when gate not met; post summary comment into PR via IAdoPullRequests.SetStatusCheckAsync.

Example CI Assertions (script step)

pct=$(grep -oP 'line-rate="\K[0-9.]+(?=")' coverage.cobertura.xml | awk '{ sum+=$1 } END { printf "%.0f", (sum/NR)*100 }')
test "$pct" -ge "$MIN_COVERAGE_PCT" || {
  echo "Coverage gate failed: $pct < $MIN_COVERAGE_PCT"
  exit 1
}

🔎 Diagnostics & Flakiness Control

  • Retries apply only to transient failure categories (network, IO, rate-limiting).
  • Log attachments on failure: build log, test log, coverage report, rendered diff.
  • Quarantine tag for known flaky tests ([TestCategory("Quarantine")]) with weekly validator sweep.
  • Observability IDs (CorrelationId, BlueprintSha) are included in every test log line.

📂 Test Project Layout

/tests
  ConnectSoft.Factory.BackendLibraryGeneratorAgent.Tests.Unit/
    IdempotencyTests.cs
    RetryPolicyTests.cs
    TemplateSymbolResolutionTests.cs
    BranchNamingPolicyTests.cs
  ConnectSoft.Factory.BackendLibraryGeneratorAgent.Tests.Integration/
    SagaHappyPathTests.cs
    SagaFailureMatrixTests.cs
    NhMappingsAndMigrationsTests.cs
    OutboxDeliveryTests.cs
    RenderedQualityVerificationTests.cs
  TestUtilities/
    Fixtures.cs
    Fakes/
      FakeMcpFilesystem.cs
      FakeMcpGit.cs
      FakeAdoPullRequests.cs
      FakeQualityRunner.cs
      FakeAiServices.cs
    Verifiers/
      CsprojVerifier.cs
      ReadmeVerifier.cs
      YamlVerifier.cs

✅ Acceptance

  • Unit & Integration test suites pass locally and in CI.
  • Coverage ≥ 70% (or blueprint override ≥ platform minimum).
  • Rendered artifacts validated: README, pipeline YAML, tests, .csproj metadata.
  • Failure matrix scenarios are implemented and produce expected states, logs, and compensations.
  • CI publishes coverage, logs, and diff report as artifacts; PR status reflects gate outcome.

🚢 Delivery Engineering (Containers, Pipelines, IaC)

This section defines how the LGA is built, packed, deployed, and smoke-tested using a multi-stage Docker image, Azure DevOps pipelines, and Pulumi (C#) to provision cloud resources (ACR, Container Apps, Key Vault, Service Bus). Design follows Clean Architecture, Cloud-Native, Observability-First, and Security-First principles.


🐳 Container (Multi-Stage Dockerfile)

Key goals: small attack surface, reproducible builds, deterministic versions, non-root runtime, health probes.

# ---------- Base build image ----------
FROM mcr.microsoft.com/dotnet/sdk:9.0 AS build
WORKDIR /src

# Copy solution
COPY ./src ./src
COPY ./tests ./tests
COPY ./Directory.Build.props ./Directory.Build.props
COPY ./NuGet.config ./NuGet.config

# Restore with locked versions
RUN dotnet restore ./src/ConnectSoft.Factory.BackendLibraryGeneratorAgent.App/ConnectSoft.Factory.BackendLibraryGeneratorAgent.App.csproj --locked-mode

# Build + publish (ready-to-run where possible)
RUN dotnet publish ./src/ConnectSoft.Factory.BackendLibraryGeneratorAgent.App/ConnectSoft.Factory.BackendLibraryGeneratorAgent.App.csproj \
    -c Release -o /out \
    /p:PublishSingleFile=true \
    /p:IncludeNativeLibrariesForSelfExtract=true \
    /p:PublishTrimmed=false

# ---------- Runtime image ----------
FROM mcr.microsoft.com/dotnet/aspnet:9.0-azurelinux3.0 AS runtime
# Minimal, patched runtime (.NET 9 images ship Azure Linux, not CBL-Mariner);
# run as the non-root 'app' user built into .NET 8+ images
USER $APP_UID
WORKDIR /app
COPY --from=build /out ./

# Health & diagnostics
ENV ASPNETCORE_URLS=http://0.0.0.0:8080
EXPOSE 8080
# Optional: health endpoint served by the host (e.g., /health/ready)
# Use container start command provided by the app
ENTRYPOINT ["./ConnectSoft.Factory.BackendLibraryGeneratorAgent.App"]

Notes

  • Restore with --locked-mode to enforce deterministic packages.
  • SingleFile keeps image tidy; trimming is disabled initially for safer reflection/emit libraries (enable later with tests).
  • Run as non-root.
  • App exposes /health/ready and /health/live (implemented in host).

🧪 Azure DevOps Pipelines

Two pipelines:

  1. PR (CI) → build, test, coverage, pack artifact.
  2. CD (Sandbox) → Pulumi up, image push, deploy Container App, run smoke via gRPC/MCP.

1) PR Pipeline (azure-pipelines.pr.yml)

trigger: none
pr:
  branches: { include: [ main, feature/* ] }

pool:
  vmImage: 'ubuntu-latest'

variables:
  BuildConfiguration: Release
  MinCoveragePct: 70

steps:
- task: DotNetCoreCLI@2
  displayName: Restore
  inputs: { command: 'restore', projects: 'src/**/**/*.csproj', feedsToUse: 'select' }

- task: DotNetCoreCLI@2
  displayName: Build
  inputs: { command: 'build', projects: 'src/**/**/*.csproj', arguments: '-c $(BuildConfiguration) /warnaserror' }

- task: DotNetCoreCLI@2
  displayName: Test (+coverage)
  inputs:
    command: 'test'
    projects: 'tests/**/**/*.csproj'
    arguments: >
      -c $(BuildConfiguration)
      /p:CollectCoverage=true
      /p:CoverletOutputFormat=cobertura
      /p:ExcludeByFile="**/Migrations/*"

- task: PublishCodeCoverageResults@2
  inputs:
    summaryFileLocation: '$(System.DefaultWorkingDirectory)/**/coverage.cobertura.xml'

- script: |
    shopt -s globstar   # allow ** recursion in bash
    # NOTE: averages every line-rate attribute in the report (rough approximation)
    pct=$(grep -oP 'line-rate="\K[0-9.]+(?=")' $(System.DefaultWorkingDirectory)/**/coverage.cobertura.xml | awk '{ sum+=$1 } END { printf "%.0f", (sum/NR)*100 }')
    echo "Coverage: $pct%"
    if [ "$pct" -lt "$(MinCoveragePct)" ]; then
      echo "##vso[task.logissue type=error]Coverage gate failed: $pct < $(MinCoveragePct)"
      exit 1
    fi
  displayName: Enforce coverage gate

- task: DotNetCoreCLI@2
  displayName: Publish (pack app)
  inputs:
    command: 'publish'
    publishWebProjects: false
    projects: 'src/ConnectSoft.Factory.BackendLibraryGeneratorAgent.App/ConnectSoft.Factory.BackendLibraryGeneratorAgent.App.csproj'
    arguments: '-c $(BuildConfiguration) -o $(Build.ArtifactStagingDirectory)/app'
    zipAfterPublish: true

- task: PublishBuildArtifacts@1
  inputs: { pathToPublish: '$(Build.ArtifactStagingDirectory)', artifactName: 'drop' }

2) CD to Sandbox (azure-pipelines.cd.yml)

trigger: none
resources:
  pipelines:
    - pipeline: pr
      source: 'lga-pr'           # name of the PR pipeline definition
      trigger: none

parameters:
  - name: env
    default: 'sandbox'

variables:
  ACR_NAME: 'csfactoryacr'
  ACR_LOGIN_SERVER: 'csfactoryacr.azurecr.io'
  CONTAINERAPP_ENV: 'csf-ca-env'
  APP_NAME: 'lga'
  IMAGE_TAG: '$(Build.BuildId)'
  DOTNET_ENV: 'Production'

stages:
- stage: BuildAndPush
  displayName: Build & Push Image
  jobs:
  - job: docker
    pool: { vmImage: 'ubuntu-latest' }
    steps:
    - download: pr
      artifact: drop
    - task: DockerInstaller@0
      inputs: { dockerVersion: 'latest' }
    - task: Docker@2
      displayName: Login ACR
      inputs:
        command: 'login'
        containerRegistry: 'svc-conn-acr'    # service connection
    - task: Docker@2
      displayName: Build image
      inputs:
        repository: '$(ACR_LOGIN_SERVER)/connectsoft/lga'
        command: 'build'
        Dockerfile: 'Dockerfile'
        tags: |
          $(IMAGE_TAG)
          latest
    - task: Docker@2
      displayName: Push image
      inputs:
        repository: '$(ACR_LOGIN_SERVER)/connectsoft/lga'
        command: 'push'
        tags: |
          $(IMAGE_TAG)
          latest

- stage: Provision
  displayName: Pulumi Up (Sandbox)
  dependsOn: BuildAndPush
  jobs:
  - job: pulumi
    pool: { vmImage: 'ubuntu-latest' }
    steps:
    - task: UseDotNet@2
      inputs: { packageType: 'sdk', version: '9.0.x' }
    - script: |
        curl -fsSL https://get.pulumi.com | sh
        export PATH="$PATH:$HOME/.pulumi/bin"
        echo "##vso[task.prependpath]$HOME/.pulumi/bin"
        export PULUMI_CONFIG_PASSPHRASE=""
        pulumi login azblob://pulumi-state # or Pulumi Cloud
      displayName: Pulumi init
    - script: |
        pushd iac/Pulumi.Lga.Stack
        pulumi stack select $(Build.SourceBranchName) || pulumi stack init $(Build.SourceBranchName)
        pulumi config set containerImage "$(ACR_LOGIN_SERVER)/connectsoft/lga:$(IMAGE_TAG)"
        pulumi up --yes --skip-preview
        popd
      displayName: Pulumi up

- stage: DeployAndSmoke
  displayName: Deploy & Smoke Test
  dependsOn: Provision
  jobs:
  - job: smoke
    pool: { vmImage: 'ubuntu-latest' }
    variables:
      LGA_URL: $(pulumi.lga.url) # populated by previous job via logging command or variable group
    steps:
    - script: |
        echo "Smoke: /health/ready"
        curl -fsS "$(LGA_URL)/health/ready"

        echo "Smoke: gRPC StartGeneration (contract-first stub)"
        # Example: use a tiny CLI helper built with Grpc.Net.Client in /tools
        dotnet run --project tools/SmokeGrpc -- \
          --url "$(LGA_URL)" \
          --start --correlation "smoke-$(Build.BuildId)" \
          --blueprint ./smoke/library-blueprint.yaml

        echo "Smoke: MCP tool call (optional HTTP facade)"
        curl -fsS -X POST "$(LGA_URL)/mcp/tools/library.start_generation" \
          -H "Content-Type: application/json" \
          -d '{"correlationId":"smoke-$(Build.BuildId)","blueprintJson":"{}"}'

        echo "Check status"
        curl -fsS "$(LGA_URL)/status?runId=smoke-$(Build.BuildId)" | tee status.json
        grep -E '"status"\s*:\s*"(PROpened|Completed)"' status.json
      displayName: Smoke tests

Pipeline considerations

  • Service connections: svc-conn-acr (ACR), svc-conn-azure (Azure subscription) used by Pulumi’s ARM provider.
  • Secrets: ADO PAT, AI keys, etc., are not stored in pipeline variables; Pulumi writes secret names (Key Vault references) and the app resolves them at runtime.
  • Artifacts: build logs, coverage, and smoke outputs published for triage.

🌩️ Pulumi (C#) – IaC Stack

Provision ACR, Container Apps Environment, Container App for LGA, Service Bus (for MassTransit), and Key Vault (for PAT/AI keys). Expose outputs to pipeline.

using Pulumi;
using Pulumi.AzureNative.ContainerRegistry;
using Pulumi.AzureNative.ContainerRegistry.Inputs;
using Pulumi.AzureNative.App;
using Pulumi.AzureNative.App.Inputs;
using Pulumi.AzureNative.Resources;
using Pulumi.AzureNative.KeyVault;
using Pulumi.AzureNative.KeyVault.Inputs;
using Pulumi.AzureNative.ServiceBus;
using Pulumi.AzureNative.ServiceBus.Inputs;

class LgaStack : Stack
{
    public LgaStack()
    {
        var cfg = new Config();
        var containerImage = cfg.Get("containerImage") ?? "csfactoryacr.azurecr.io/connectsoft/lga:latest";
        var location = "westeurope";

        var rg = new ResourceGroup("rg-lga", new ResourceGroupArgs { Location = location });

        // ACR
        var acr = new Registry("acr", new RegistryArgs
        {
            ResourceGroupName = rg.Name,
            Location = rg.Location,
            AdminUserEnabled = false,
            Sku = new SkuArgs { Name = "Standard" }
        });

        // Key Vault (PAT, AI keys)
        var kv = new Vault("kv-lga", new VaultArgs
        {
            ResourceGroupName = rg.Name,
            Location = rg.Location,
            Properties = new VaultPropertiesArgs
            {
                TenantId = Output.Create(GetTenantId()),
                Sku = new Pulumi.AzureNative.KeyVault.Inputs.SkuArgs { Family = "A", Name = "standard" },
                AccessPolicies = { },
                EnabledForDeployment = false
            }
        });

        // Service Bus (MassTransit)
        var sbNs = new Namespace("sb-lga", new NamespaceArgs
        {
            ResourceGroupName = rg.Name,
            Location = rg.Location,
            Sku = new SBSkuArgs { Name = "Standard", Tier = "Standard" }
        });

        var topic = new Topic("lga-topic", new TopicArgs
        {
            ResourceGroupName = rg.Name,
            NamespaceName = sbNs.Name,
            EnablePartitioning = true
        });

        // Container Apps Environment
        var env = new ManagedEnvironment("cae-lga", new ManagedEnvironmentArgs
        {
            ResourceGroupName = rg.Name,
            Location = rg.Location
        });

        // Container App (LGA)
        var app = new ContainerApp("lga-app", new ContainerAppArgs
        {
            ResourceGroupName = rg.Name,
            ManagedEnvironmentId = env.Id,
            Location = rg.Location,
            Configuration = new ConfigurationArgs
            {
                Ingress = new IngressArgs
                {
                    External = true,
                    TargetPort = 8080
                },
                Registries = new[]
                {
                    new RegistryCredentialsArgs
                    {
                        Server = acr.LoginServer,
                        Identity = "system" // use workload identity/managed identity to pull
                    }
                }
            },
            Template = new TemplateArgs
            {
                Containers = new[]
                {
                    new ContainerArgs
                    {
                        Name = "lga",
                        Image = containerImage,
                        Probes =
                        {
                            new ContainerAppProbeArgs
                            {
                                Type = "Liveness",
                                HttpGet = new ContainerAppProbeHttpGetArgs { Path = "/health/live" },
                                InitialDelaySeconds = 10, PeriodSeconds = 10
                            },
                            new ContainerAppProbeArgs
                            {
                                Type = "Readiness",
                                HttpGet = new ContainerAppProbeHttpGetArgs { Path = "/health/ready" },
                                InitialDelaySeconds = 5, PeriodSeconds = 10
                            }
                        },
                        Env =
                        {
                            new EnvironmentVarArgs { Name = "DOTNET_ENVIRONMENT", Value = "Production" },
                            new EnvironmentVarArgs { Name = "ServiceBus__Namespace", Value = sbNs.Name.Apply(n => $"{n}.servicebus.windows.net") },
                            // KeyVault references resolved by app at runtime (e.g., via Managed Identity)
                        }
                    }
                },
                Scale = new ScaleArgs
                {
                    MinReplicas = 1,
                    MaxReplicas = 3
                    // Add KEDA rules later (CPU, HTTP RPS, SB length)
                }
            }
        });

        // Useful outputs for pipeline (the ingress FQDN is assigned by the platform)
        this.Url = Output.Format($"https://{app.LatestRevisionFqdn}");
        this.Registry = acr.LoginServer;
        this.ServiceBusEndpoint = sbNs.Name.Apply(n => $"{n}.servicebus.windows.net");
    }

    [Output] public Output<string> Url { get; set; }
    [Output] public Output<string> Registry { get; set; }
    [Output] public Output<string> ServiceBusEndpoint { get; set; }

    private static Output<string> GetTenantId()
        => Output.Create(Pulumi.AzureNative.Authorization.GetClientConfig.InvokeAsync())
                 .Apply(c => c.TenantId);
}

Design choices

  • Managed Identity for ACR pull and Key Vault access (no secrets in env vars).
  • Probes use app’s /health endpoints.
  • Scale starts 1→3; later add KEDA rules for CPU/RPS/queue depth.

🔒 Secrets & Identity

  • Key Vault stores: ADO__PAT, AI__ApiKey, optional repo/PR tokens.
  • App uses Managed Identity to read Key Vault at runtime; configuration binding (e.g., Azure.Identity) loads secrets into options.
  • Pipelines never log secrets; masking and secret scanning configured.

📈 Observability

  • Container emits OpenTelemetry traces/metrics/logs with traceId=CorrelationId.
  • Health checks integrated with Container Apps; pipeline smoke checks readiness.
  • Optional: route logs to Azure Monitor / Log Analytics and wire Grafana dashboards.

🧪 Smoke Test (Post-Deploy)

What it does

  1. Hit /health/ready → 200 OK.
  2. Call StartGeneration(correlationId, blueprintJson) via small gRPC CLI helper.
  3. Poll GetRunStatus(runId) until PROpened or Completed.
  4. Fail stage if not reached within timeout (e.g., 8 minutes).
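
The poll-until-terminal loop in steps 3–4 can be sketched with an injectable status source and poll budget, so the loop itself is unit-testable without a live deployment (names and budget below are assumptions):

```python
# Sketch of the smoke stage's poll loop; fetch_status stands in for a
# GetRunStatus call over gRPC, and sleep is injectable for tests.
def wait_for_status(fetch_status, is_done, max_polls, sleep=lambda s: None):
    """Poll until a terminal status or the poll budget is exhausted."""
    for _ in range(max_polls):
        status = fetch_status()
        if is_done(status):
            return status
        sleep(5)  # seconds between polls
    raise TimeoutError("smoke run did not reach a terminal status in time")

# Fake status source simulating saga progression
responses = iter(["Generating", "QualityValidated", "PROpened"])
final = wait_for_status(lambda: next(responses),
                        lambda s: s in {"PROpened", "Completed"},
                        max_polls=96)  # ~8 minutes at 5-second intervals
print(final)  # PROpened
```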

Blueprint for smoke: a minimal blueprint that short-circuits quality step (e.g., minCoveragePct: 0 for sandbox) and targets a stub repo.


✅ Acceptance

  • CI (PR): green build, tests, and coverage ≥ 70% (or blueprint override ≥ platform minimum).
  • CD (Sandbox): Pulumi provisions/updates infra; image built & pushed; Container App updated.
  • Smoke: health endpoints OK; status reaches PROpened or Completed for the smoke run.
  • Security: no plaintext secrets in pipeline; Key Vault access via MI; ACR pull via MI.
  • Observability: traces & logs confirm each deployment step and smoke actions with CorrelationId.

Documentation & Diagrams

Docs

  • overview.md – Describes the Library Generator Agent’s role, upstream/downstream agents, and expected inputs/outputs.
  • architecture.md – Contains context diagrams (system boundaries, event flows), orchestration flow (saga + retries), and platform tie-ins (NHibernate, MassTransit, Pulumi, Semantic Kernel).
  • api/grpc.md – C#-based contract-first gRPC definitions with usage examples.
  • api/mcp.md – MCP tool interface specs (library.start_generation, library.get_status) with request/response examples.
  • runbook.md – Step-by-step developer workflow (from local build to CI/CD), coverage thresholds, retry knobs, and smoke test instructions.

Diagrams

  • Context Diagram – Agent in ecosystem: Vision/Planning → Library Generator → QA Agents.
  • Component Diagram – Domain (LibraryGeneration aggregate), Contracts, Ports/Adapters, Orchestration.
  • Sequence Diagram – StartGeneration → Workspace → Template → Quality → Push → PR.
  • Deployment Diagram – Container, ACR, Pulumi-provisioned Azure resources, Service Bus integration.

Acceptance

  • All documentation builds under the Docs pipeline (MkDocs).
  • A new teammate can follow Getting Started in overview.md and run an end-to-end workflow using fakes (in-memory persistence, fake MCP/ADO).
  • Diagrams render correctly (Mermaid/PlantUML).
  • API pages include at least one working C# snippet (gRPC client) and one MCP example.

🔁 Patterns, Use-Cases & Reuse

This section catalogs repeatable patterns for the LGA, with copy-pasteable recipes (blueprint snippets + expected outputs), and explains how to scale reuse via shared GitOps/Sandbox/QA services while preserving stable seam contracts.


🧩 Core Patterns

| Pattern | When to Use | Key Template Switches | Notable Outputs |
| --- | --- | --- | --- |
| Utility Library | Small cross-cutting helpers (e.g., string utils, date/time) | useDI:false, useOptions:false, useLogging:true, analyzers:enable, nullable:enable | Lean .csproj, README with badges, strict analyzers |
| Framework Component | Opinionated infra piece (e.g., ConnectSoft.Extensions.Http.OAuth2) | useDI:true, useOptions:true, useLogging:true, generateSourceLink:true, multiTFM | ServiceCollectionExtensions, IValidateOptions<T>, sample registration tests |
| API Client | SDK for a 3rd-party/internal API | useDI:true, useOptions:true, useLogging:true, includeSample:true | IHttpClientFactory integration, retry policy docs, sample console |
| Localization Package | Shared resources + pluralization rules | useDI:true, useOptions:false, useLogging:false | Resources/ folder, satellite assemblies notes, README i18n guide |
| Test Utilities | Shared test helpers/assertions | useDI:false, useOptions:false, useLogging:false, publishers.nuget:true | InternalsVisibleTo guidance, samples targeting MSTest/xUnit |

All patterns preserve idempotency and traceability (correlation IDs, blueprint SHA, patch SHA).


🍳 Cookbook — Copy/Paste Recipes

1) Utility Library (Strict Analyzers)

library-blueprint.yaml

library:
  packageId: ConnectSoft.Extensions.Strings
  name: ConnectSoft.Extensions.Strings
  description: String helpers for formatting, parsing, and normalization
  repositoryUrl: https://dev.azure.com/connectsoft/Platform/_git/Extensions
  tfm: { multi: [ "net8.0", "net9.0" ] }
  license: MIT
  tags: [ "strings", "helpers", "connectsoft" ]
quality: { minCoveragePct: 80 }
ports: { git: { targetRepo: Extensions, defaultBranch: main }, ado: { project: Platform } }

options.yaml

template:
  useDI: false
  useOptions: false
  useLogging: true
  analyzers: enable
  nullable: enable
  publishers: { nuget: true, artifacts: true }

Expected Outputs

  • .csproj with multi-TFM, Nullable=enable, analyzers wired.
  • Pipeline with 80% coverage gate.
  • README with build/coverage/NuGet badges.

2) Framework Component (Options + Validation)

library-blueprint.yaml

library:
  packageId: ConnectSoft.Extensions.Http.OAuth2
  name: ConnectSoft.Extensions.Http.OAuth2
  description: OAuth2 client credentials handler for named HttpClients
  repositoryUrl: https://dev.azure.com/connectsoft/Factory/_git/HttpExtensions
  tfm: { multi: [ "net8.0", "net9.0" ] }
quality: { minCoveragePct: 75 }
ports: { git: { targetRepo: HttpExtensions, defaultBranch: main }, ado: { project: Factory } }

options.yaml

template:
  useDI: true
  useOptions: true
  useLogging: true
  generateSourceLink: true
  analyzers: enable
  nullable: enable
  includeSample: true

Expected Outputs

  • ServiceCollectionExtensions with DI helpers.
  • OAuthHttpHandlerOptions + IValidateOptions<OAuthHttpHandlerOptions>.
  • Sample app + test demonstrating named HttpClient.

3) API Client (Resilient SDK)

library-blueprint.yaml

library:
  packageId: ConnectSoft.Clients.Greenhouse
  name: ConnectSoft.Clients.Greenhouse
  description: Typed client for Greenhouse API with retries and backoff
  repositoryUrl: https://dev.azure.com/connectsoft/Integrations/_git/Clients
  tfm: { multi: [ "net8.0", "net9.0" ] }
  tags: [ "sdk", "api", "greenhouse" ]
quality: { minCoveragePct: 70 }
ports: { git: { targetRepo: Clients, defaultBranch: main }, ado: { project: Integrations } }

options.yaml

template:
  useDI: true
  useOptions: true
  useLogging: true
  includeSample: true
  analyzers: enable

Expected Outputs

  • Typed client with IHttpClientFactory, Polly (or built-in handlers) retry guidance.
  • README: auth, rate-limits, pagination examples.
  • Sample console to fetch candidates for a date range.

4) Localization Package (Resources Only)

options.yaml

template:
  useDI: true
  useOptions: false
  useLogging: false
  publishers: { nuget: true }

Expected Outputs

  • Resources/*.resx, culture folders guidance.
  • README: adding locales, pluralization rules, satellite assemblies.

5) Test Utilities (Internal Test SDK)

library-blueprint.yaml

library:
  packageId: ConnectSoft.Testing.Verifiers
  name: ConnectSoft.Testing.Verifiers
  description: Common verifiers for .csproj/README/YAML across templates
  repositoryUrl: https://dev.azure.com/connectsoft/DevEx/_git/Testing
  tfm: { multi: [ "net8.0", "net9.0" ] }

options.yaml

template:
  useDI: false
  useOptions: false
  useLogging: false
  publishers: { nuget: true }

Expected Outputs

  • CsprojVerifier, ReadmeVerifier, YamlVerifier.
  • Guidance for InternalsVisibleTo and version pinning for test projects.

🔌 Mapping to Template Commands & Switches

| Use-Case | Switches | Additional Generation Rules |
| --- | --- | --- |
| Utility | useLogging, analyzers, nullable | No DI scaffolding; minimal surface API; strict warnings |
| Framework | useDI, useOptions, useLogging, sourceLink | Adds Options + validation; DI extensions; docs section “Configuration” |
| API Client | useDI, useOptions, useLogging, includeSample | Adds typed client, handlers; sample usage; error taxonomy in README |
| Localization | useDI | Creates Resources/ structure; docs for culture fallback |
| Test Utilities | publishers.nuget:true | Strong naming optional; mark package as DevelopmentDependency where applicable |

♻️ Reuse at Scale

Shared Services (turn knobs into platform services)

  • GitOps: central service that manages clone/branch/commit/push on behalf of agents (enforce branch naming, commit policies, reviewers).
  • Sandbox: preconfigured repos and smoke blueprints to validate generator changes across representative patterns.
  • QA Cluster: pool of Quality Runners with normalized images/tools; exposes a simple API (RunQuality) to any agent.

Maintain Stable Seams

  • Ports are contracts (IMcpFilesystem, IMcpGit, IAdoPullRequests, IQualityRunner, IAiServices).
    • Pin semantic versions for port DTOs.
    • Use additive changes; deprecate via [Obsolete] + deadline.
  • MCP Tools remain minimal: library.start_generation, library.get_status.
    • Evolve with new optional fields, never breaking required ones.
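The additive-evolution rule for seam contracts can be sketched with a plain DTO. This is an illustrative example, not the real contract: the member names (`TenantId`, `Tenant`) and the deprecation date are assumptions; the point is the shape — required members never change, new members are optional with safe defaults, and renames stay alive via `[Obsolete]` until a published deadline.

```csharp
using System;

// Hypothetical request DTO for the library.start_generation tool, shown only
// to illustrate the additive-evolution rule for port/MCP contracts.
public sealed record StartGenerationRequest
{
    // Required since v1 — never removed, renamed, or retyped.
    public string BlueprintJson { get; init; } = "";

    // Added later as optional with a safe default, so older callers keep working.
    public string? TenantId { get; init; }

    // Deprecated in favor of TenantId; kept alive until the announced deadline.
    [Obsolete("Use TenantId instead. Scheduled for removal; date is illustrative.")]
    public string? Tenant { get; init; }
}
```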

Governance & Evolution

  • Template Evolution via ADRs: version templates (v1, v2); LGA selects by blueprint or tenant policy.
  • Package Lifecycle: prerelease channels (-alpha, -beta) flow through separate feeds; promotion requires gates (coverage, consumer tests).
  • Deprecation Policy: announce in PR body + release notes; long-lived branches maintain LTS until EOL.

🚫 Anti-Patterns (and the fix)

| Anti-Pattern | Why It Hurts | Fix |
| --- | --- | --- |
| Monolithic “kitchen-sink” library | Hard to version; breaks consumers | Split into focused packages, each with its own blueprint |
| Hidden breaking changes in minor releases | Consumer pain, rollbacks | SemVer discipline, consumer canary, contract tests |
| Ad-hoc scripts in pipeline | Fragile, untraceable | Standardize through Quality Runner + YAML templates |
| Hardcoded repos/paths in blueprints | Vendor lock-in, security risk | Use allow-lists + ports; parametrize with policies |

🔍 Example: Reusing QA & Sandbox

  1. Developer updates LGA template logic (e.g., new useObservability toggle).
  2. Sandbox job renders five cookbook blueprints (one per pattern) in dry-run: produces diffs + patch SHAs.
  3. QA Cluster runs build/test/coverage for each dry-run output.
  4. If all pass, proceed to canary tenants; otherwise rollback and attach remediation to the PR.

✅ Acceptance

  • Cookbook examples render cleanly in dry-run (deterministic diffs) and pass E2E (Start → Completed).
  • Reuse flows validated with Shared GitOps/Sandbox/QA services; seam contracts unchanged.
  • Documentation for each pattern includes switch mapping, expected outputs, and failure modes.
  • At least one consumer project per pattern compiles against the generated package in CI (canary).

Jobs & Scheduling

Scope & Principles

Goal: Add optional background job processing for maintenance and operational tasks that keep LGA healthy at scale—without hard-coding schedules. Jobs must be config-driven, safe to re-run, UTC-scheduled, and observability-first.

Standards (enforced here too):

  • Time: store and operate in UTC; DB columns are datetimeoffset (UTC offset +00:00).
  • Strings: use nvarchar with explicit lengths.
  • Statuses/enums: lookup tables + FKs (no ad-hoc string status).
  • Idempotency: all jobs use correlation keys / guards; reentrancy-safe.
  • Security: least privilege; no secrets in logs; access to external systems via managed identity/KV.

Architecture (Jobs Runtime)

flowchart LR
  subgraph LGAApp["LGA App"]
    Host[ASP.NET Host]
    Hangfire[Hangfire Server]
    Jobs[Job Orchestrator]
    Ports[Ports & Adapters]
    NH[(NHibernate + Outbox)]
  end

  subgraph External
    SB[["Azure Service Bus (DLQ)"]]
    FS[[MCP Filesystem]]
    ADO[[Azure DevOps PR]]
  end

  Host-->Hangfire
  Hangfire-->Jobs
  Jobs-->NH
  Jobs-->FS
  Jobs-->SB
  Jobs-->ADO
  • Runtime: LGA process hosts Hangfire Server (optional) next to the HTTP/gRPC host.
  • Schedules: loaded from configuration (YAML/appsettings); infra alternative via Pulumi (Container Apps Jobs) is supported but not required.
  • Policies: persisted in DB (toggle/overrides at runtime, audit).
  • Observability: each job emits spans/metrics + writes execution logs to DB.

Job Catalog (first wave)

| Code | Purpose | Default Schedule (UTC) | Idempotency Key |
| --- | --- | --- | --- |
| outbox-dispatch | Drain Outbox table → publish integration events | */1 * * * * (every minute) | (messageId) + outbox state |
| dlq-replay | Replay messages from Service Bus DLQ back to the main queue (safe rules) | 0 */2 * * * (every 2 hours) | (lockToken, messageId) + DLQ replay stamp |
| quality-nightly-sweep | Re-run Quality on stale/failed runs; post status to PR | 0 1 * * * (01:00 daily) | (LibraryGeneration.Id, blueprintSha, patchSha) |
| workspace-cleanup | Clean up stale workspaces / temp artifacts via MCP FS allow-list | 0 3 * * * (03:00 daily) | (path, lastWriteTimeUtc) |
| pr-stale-check | Comment/label stale PRs; ensure status checks are present | 30 4 * * * (04:30 daily) | (prUrl) + last processed timestamp |

All CRON expressions are UTC and configurable.


Configuration (appsettings + YAML) — UTC by design

appsettings.Jobs.json

{
  "Jobs": {
    "Enabled": true,
    "Timezone": "UTC",
    "OutboxDispatch": { "Cron": "*/1 * * * *", "BatchSize": 200, "MaxDegreeOfParallelism": 4 },
    "DlqReplay": {
      "Cron": "0 */2 * * *",
      "MaxMessagesPerRun": 500,
      "PoisonMaxReplays": 3,
      "Queues": [ "lga.library-generation" ],
      "DeadLetterSuffix": "$DeadLetterQueue"
    },
    "QualityNightlySweep": { "Cron": "0 1 * * *", "MaxAgeHours": 24, "CoverageFloorPct": 70 },
    "WorkspaceCleanup": { "Cron": "0 3 * * *", "Roots": [ "/tmp/lga/**" ], "DeleteOlderThanDays": 2 },
    "PrStaleCheck": { "Cron": "30 4 * * *", "StaleAfterDays": 5 }
  }
}

Pulumi param override (optional):

pulumi config set lga:jobs.outbox.cron "*/2 * * * *"  # slower in sandbox
pulumi config set lga:jobs.enabled true

Persistence (Policies & Execution Logs)

New tables follow the established conventions: datetimeoffset UTC, nvarchar with explicit lengths, lookup tables for statuses.

Tables

lga.JobPolicy — runtime-tunable knobs

| Column | Type | Notes |
| --- | --- | --- |
| Id (PK) | int | IDENTITY |
| Code (AK) | nvarchar(100) | e.g., outbox-dispatch |
| Enabled | bit | |
| Cron | nvarchar(64) | UTC cron |
| MaxConcurrency | int | |
| MaxRetries | int | logical job-level guard |
| RetryBackoffSeconds | int | |
| UpdatedUtc | datetimeoffset(0) | UTC |

lga.JobExecutionLog

| Column | Type | Notes |
| --- | --- | --- |
| ExecutionId (PK) | uniqueidentifier | Correlates logs/telemetry |
| JobCode | nvarchar(100) | FK to JobPolicy.Code |
| StatusId (FK) | int | lga_lu.JobExecutionStatus(Id) |
| StartedUtc | datetimeoffset(0) | UTC |
| FinishedUtc | datetimeoffset(0) | UTC, nullable |
| ItemsProcessed | int | |
| ErrorsCount | int | |
| ErrorMessage | nvarchar(2000) | nullable |
| DetailsJson | nvarchar(max) | optional payload |

lga_lu.JobExecutionStatus

  • Seed: Succeeded, Failed, Partial, Skipped.

FluentMigrator: add UTC check constraints (DATEPART(TZ, StartedUtc)=0 etc.), and unique on JobPolicy.Code.


Hangfire Integration (Server + Dashboard)

Program.cs (composition)

builder.Services.AddHangfire((sp, cfg) =>
{
    var cs = sp.GetRequiredService<IConfiguration>().GetConnectionString("Sql");
    cfg.UseSimpleAssemblyNameTypeSerializer()
       .UseRecommendedSerializerSettings()
       .UseSqlServerStorage(cs); // or .UsePostgreSqlStorage(...) by env
});

builder.Services.AddHangfireServer(options =>
{
    options.WorkerCount = Math.Max(2, Environment.ProcessorCount / 2);
    options.Queues = new[] { "default", "maintenance" };
    options.ServerName = $"lga-jobs-{Environment.MachineName}";
});

Dashboard is disabled by default in production; expose it only behind SSO in admin environments.


Job Registration (config-driven, UTC)

public static class JobRegistration
{
    public static void Register(IConfiguration cfg)
    {
        var tz = TimeZoneInfo.Utc;
        var jobs = cfg.GetSection("Jobs");
        if (!jobs.GetValue<bool>("Enabled")) return;

        RecurringJob.AddOrUpdate<OutboxDispatchJob>(
            "outbox-dispatch",
            j => j.RunAsync(JobExecutionContext.New("outbox-dispatch"), CancellationToken.None),
            jobs["OutboxDispatch:Cron"], tz, queue: "maintenance");

        RecurringJob.AddOrUpdate<DlqReplayJob>(
            "dlq-replay",
            j => j.RunAsync(JobExecutionContext.New("dlq-replay"), CancellationToken.None),
            jobs["DlqReplay:Cron"], tz, queue: "maintenance");

        RecurringJob.AddOrUpdate<QualityNightlySweepJob>(
            "quality-nightly-sweep",
            j => j.RunAsync(JobExecutionContext.New("quality-nightly-sweep"), CancellationToken.None),
            jobs["QualityNightlySweep:Cron"], tz);

        RecurringJob.AddOrUpdate<WorkspaceCleanupJob>(
            "workspace-cleanup",
            j => j.RunAsync(JobExecutionContext.New("workspace-cleanup"), CancellationToken.None),
            jobs["WorkspaceCleanup:Cron"], tz);

        RecurringJob.AddOrUpdate<PrStaleCheckJob>(
            "pr-stale-check",
            j => j.RunAsync(JobExecutionContext.New("pr-stale-check"), CancellationToken.None),
            jobs["PrStaleCheck:Cron"], tz);
    }
}

All schedules use TimeZoneInfo.Utc. The jobs read policy overrides from lga.JobPolicy if present, else from appsettings.
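The override rule — DB policy wins when a row exists, appsettings is the fallback, and retry limits can only be tightened (see Retry & Backoff below: effective limits are the minimum of the Hangfire attribute and the DB policy) — can be sketched as pure logic. Names (`JobPolicy`, `PolicyResolver`) are illustrative, not the real repository API.

```csharp
using System;
using System.Collections.Generic;

// Illustrative in-memory stand-in for a row of lga.JobPolicy.
public sealed record JobPolicy(string Code, bool Enabled, string Cron, int MaxRetries);

public static class PolicyResolver
{
    // DB override if present, otherwise the appsettings value.
    public static string ResolveCron(
        string jobCode,
        IReadOnlyDictionary<string, JobPolicy> dbPolicies,
        string appSettingsCron)
        => dbPolicies.TryGetValue(jobCode, out var p) ? p.Cron : appSettingsCron;

    // Operators can tighten retries at runtime but never exceed the
    // compile-time Hangfire limit: take the minimum of the two.
    public static int ResolveMaxRetries(int hangfireAttempts, JobPolicy? dbPolicy)
        => dbPolicy is null ? hangfireAttempts : Math.Min(hangfireAttempts, dbPolicy.MaxRetries);
}
```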


Job Contracts & Implementations (ports-first)

Shared Context & Base

public sealed record JobExecutionContext(string JobCode, Guid ExecutionId, string CorrelationId)
{
    public static JobExecutionContext New(string code) =>
        new(code, Guid.NewGuid(), $"job-{code}-{Guid.NewGuid():N}");
}

public abstract class LgaJobBase
{
    protected readonly IJobPolicyRepository Policies;
    protected readonly IJobLogRepository Logs;
    protected readonly ILogger Logger;
    protected readonly IClock Clock; // returns DateTimeOffset.UtcNow

    protected LgaJobBase(IJobPolicyRepository policies, IJobLogRepository logs, ILogger logger, IClock clock)
    { Policies = policies; Logs = logs; Logger = logger; Clock = clock; }

    protected async Task<TResult> RunGuardedAsync<TResult>(
        JobExecutionContext ctx,
        Func<CancellationToken, Task<TResult>> work,
        CancellationToken ct)
    {
        var log = JobExecutionLog.Start(ctx.JobCode, ctx.ExecutionId, Clock.UtcNow);
        try
        {
            var res = await work(ct);
            await Logs.MarkSucceededAsync(log with { FinishedUtc = Clock.UtcNow, ItemsProcessed = res switch { ICount c => c.Count, _ => 0 } });
            return res;
        }
        catch (Exception ex)
        {
            await Logs.MarkFailedAsync(log with { FinishedUtc = Clock.UtcNow, ErrorMessage = ex.Message });
            Logger.LogError(ex, "{Job} failed, executionId={ExecutionId}", ctx.JobCode, ctx.ExecutionId);
            throw; // Hangfire will apply its retry policy (limited by JobPolicy)
        }
    }
}

Outbox Dispatch

public sealed class OutboxDispatchJob : LgaJobBase
{
    private readonly IOutboxDispatcher _dispatcher;
    public OutboxDispatchJob(IOutboxDispatcher dispatcher, IJobPolicyRepository p, IJobLogRepository l, ILogger<OutboxDispatchJob> g, IClock c)
        : base(p, l, g, c) => _dispatcher = dispatcher;

    [DisableConcurrentExecution(timeoutInSeconds: 300)]
    public Task RunAsync(JobExecutionContext ctx, CancellationToken ct) =>
        RunGuardedAsync(ctx, token => _dispatcher.DispatchBatchAsync(max: 200, degree: 4, token), ct);
}
  • Idempotency: outbox rows transition atomically (OccurredUtc → ProcessedUtc); retries pick only unprocessed rows.
  • DB UTC: OccurredUtc, ProcessedUtc are datetimeoffset(0) UTC (enforced by checks).
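The claim-then-dispatch guard above can be sketched in memory: a row is dispatchable only while ProcessedUtc is null, and stamping it is exactly what makes retries skip it. This is a minimal illustration (types `OutboxRow`/`OutboxBatch` are assumptions); the real implementation performs the claim transactionally in SQL.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Minimal in-memory model of an outbox row with the UTC columns from the doc.
public sealed class OutboxRow
{
    public Guid MessageId { get; init; } = Guid.NewGuid();
    public DateTimeOffset OccurredUtc { get; init; } = DateTimeOffset.UtcNow;
    public DateTimeOffset? ProcessedUtc { get; private set; }

    // Returns false for rows already dispatched, so retries never re-send them.
    public bool TryClaim(DateTimeOffset nowUtc)
    {
        if (ProcessedUtc is not null) return false;
        ProcessedUtc = nowUtc;
        return true;
    }
}

public static class OutboxBatch
{
    // Claims up to `max` unprocessed rows in occurrence order; Take() stops
    // enumeration, so rows beyond the batch stay unclaimed for the next run.
    public static IReadOnlyList<OutboxRow> ClaimBatch(IEnumerable<OutboxRow> rows, int max, DateTimeOffset nowUtc)
        => rows.OrderBy(r => r.OccurredUtc)
               .Where(r => r.TryClaim(nowUtc))
               .Take(max)
               .ToList();
}
```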

DLQ Replay (Azure Service Bus)

public sealed class DlqReplayJob : LgaJobBase
{
    private readonly IDlqReplayer _replayer;

    public DlqReplayJob(IDlqReplayer r, IJobPolicyRepository p, IJobLogRepository l, ILogger<DlqReplayJob> g, IClock c)
        : base(p, l, g, c) => _replayer = r;

    [DisableConcurrentExecution(900)]
    public Task RunAsync(JobExecutionContext ctx, CancellationToken ct) =>
        RunGuardedAsync(ctx, token => _replayer.ReplayAsync(new DlqReplayOptions { MaxMessages = 500, PoisonMaxReplays = 3 }, token), ct);
}

Infra adapter (IDlqReplayer) responsibilities:

  • Receive from DLQ subqueue (peek-lock), stamp a replay header (attempt count), and resubmit to the main queue.
  • Respect PoisonMaxReplays (move to quarantine topic after limit).
  • Emit metrics: replayed_count, quarantined_count, latency.
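The per-message decision the adapter applies can be isolated as pure logic: read the replay-attempt stamp, then resubmit or quarantine once PoisonMaxReplays is reached. A minimal sketch — the header name "lga-replay-attempt" and the type names are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;

public enum DlqAction { Resubmit, Quarantine }

public static class DlqReplayPolicy
{
    // Hypothetical application-property key used as the replay stamp.
    public const string AttemptHeader = "lga-replay-attempt";

    // Missing stamp means the message has never been replayed.
    public static int GetAttempt(IReadOnlyDictionary<string, object> props)
        => props.TryGetValue(AttemptHeader, out var v) ? Convert.ToInt32(v) : 0;

    // Quarantine once the poison limit is reached; otherwise resubmit
    // (the adapter increments the stamp when it resubmits).
    public static DlqAction Decide(IReadOnlyDictionary<string, object> props, int poisonMaxReplays)
        => GetAttempt(props) >= poisonMaxReplays ? DlqAction.Quarantine : DlqAction.Resubmit;
}
```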

Quality Nightly Sweep

public sealed class QualityNightlySweepJob : LgaJobBase
{
    private readonly ILibraryGenerationRepository _repo;
    private readonly IQualityRunner _quality;

    public QualityNightlySweepJob(ILibraryGenerationRepository r, IQualityRunner q, IJobPolicyRepository p, IJobLogRepository l, ILogger<QualityNightlySweepJob> g, IClock c)
        : base(p, l, g, c) { _repo = r; _quality = q; }

    [DisableConcurrentExecution(3600)]
    public Task RunAsync(JobExecutionContext ctx, CancellationToken ct) =>
        RunGuardedAsync(ctx, async token =>
        {
            var targets = await _repo.FindNeedingRecheckAsync(maxAge: TimeSpan.FromHours(24), token);
            var processed = 0;
            foreach (var lg in targets)
            {
                var result = await _quality.RunAsync(new QualityRunRequest(lg.Id, lg.WorkspacePath!), token);
                await _repo.UpdateCoverageAsync(lg.Id, result.CoveragePct, token);
                processed++;
            }
            return new Count(processed);
        }, ct);
}

Workspace Cleanup

public sealed class WorkspaceCleanupJob : LgaJobBase
{
    private readonly IMcpFilesystem _fs;
    public WorkspaceCleanupJob(IMcpFilesystem fs, IJobPolicyRepository p, IJobLogRepository l, ILogger<WorkspaceCleanupJob> g, IClock c)
        : base(p, l, g, c) => _fs = fs;

    [DisableConcurrentExecution(1800)]
    public Task RunAsync(JobExecutionContext ctx, CancellationToken ct) =>
        RunGuardedAsync(ctx, token => _fs.CleanupAsync(new CleanupWorkspaceRequest
        {
            Roots = new[] { "/tmp/lga/**" },
            DeleteOlderThan = TimeSpan.FromDays(2)
        }, token), ct);
}

Retry & Backoff (Policy)

  • Where: Hangfire (job-level retries) and inside job implementations (operation-level transient policies).
  • Persistence: lga.JobPolicy.MaxRetries + RetryBackoffSeconds; effective limits are minimum of Hangfire attribute and DB policy.
  • Poison Handling: DLQ replay respects PoisonMaxReplays; the Outbox never re-dispatches rows where ProcessedUtc is not NULL.
  • Concurrency: [DisableConcurrentExecution] prevents overlapping runs per job code; also enforce MaxConcurrency in policy (e.g., SemaphoreSlim).
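The policy-level MaxConcurrency guard mentioned above can be sketched with a SemaphoreSlim: if no slot frees up within the wait timeout, the run is skipped, which is safe because every job is idempotent and the next scheduled tick picks the work up. The `ConcurrencyGuard` type is illustrative, not part of the real codebase.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Caps concurrent executions per job code, independently of the scheduler.
public sealed class ConcurrencyGuard
{
    private readonly SemaphoreSlim _gate;

    public ConcurrencyGuard(int maxConcurrency)
        => _gate = new SemaphoreSlim(maxConcurrency, maxConcurrency);

    // Returns false (run skipped) when no slot frees within the timeout.
    public async Task<bool> TryRunAsync(Func<Task> work, TimeSpan waitTimeout, CancellationToken ct)
    {
        if (!await _gate.WaitAsync(waitTimeout, ct)) return false;
        try { await work(); return true; }
        finally { _gate.Release(); }
    }
}
```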

Observability

  • Spans: job.code, job.executionId, items_processed, errors_count, duration_ms.
  • Metrics: per job counters and histograms; success rate SLOs (e.g., >99% for outbox).
  • Logs: structured with CorrelationId = JobExecutionContext.CorrelationId; redacted secrets.
  • DB: JobExecutionLog retains audit; retention policy configurable (e.g., 14 days).

Security

  • No secrets in job args/logs; secrets for ADO/SB are resolved by Key Vault via managed identity.
  • Allow-lists: FS cleanup restricted to configured roots; DLQ replay restricted to configured queues.
  • Dashboard: off in prod; enable only in locked-down admin envs.

IaC & Pipelines Touchpoints

  • Pulumi: expose jobs.enabled and CRON overrides as stack config; no separate infra required if running Hangfire in the app.
  • CI/CD: smoke test can enqueue one-off executions using Hangfire’s background client in sandbox to validate wiring:
    • BackgroundJob.Enqueue<OutboxDispatchJob>(j => j.RunAsync(JobExecutionContext.New("outbox-dispatch"), CancellationToken.None));

Acceptance

  • Job definitions documented here and in /docs/lga/runbook.md (schedules, knobs, failure modes).
  • Migrations compile & run for lga.JobPolicy, lga.JobExecutionLog, and lga_lu.JobExecutionStatus.
  • Sandbox run succeeds:
    • Outbox drains new messages,
    • DLQ replays a synthetic message (round-trip visible in metrics/logs),
    • Nightly quality sweep finds zero or updates coverage accordingly,
    • Workspace cleanup removes stale temp content (logged),
    • PR stale check posts/updates labels as expected.
  • Observability verified: traces/metrics/logs correlate via executionId and CorrelationId.

Optional: FluentMigrator Snippets (Policy & Logs)

[Migration(2025091902, "Jobs policy & logs")]
public class M_20250919_02_Jobs : Migration
{
    public override void Up()
    {
        Create.Table("JobPolicy").InSchema("lga")
            .WithColumn("Id").AsInt32().PrimaryKey().Identity()
            .WithColumn("Code").AsString(100).NotNullable().Unique()
            .WithColumn("Enabled").AsBoolean().NotNullable().WithDefaultValue(true)
            .WithColumn("Cron").AsString(64).NotNullable()
            .WithColumn("MaxConcurrency").AsInt32().NotNullable().WithDefaultValue(1)
            .WithColumn("MaxRetries").AsInt32().NotNullable().WithDefaultValue(3)
            .WithColumn("RetryBackoffSeconds").AsInt32().NotNullable().WithDefaultValue(10)
            .WithColumn("UpdatedUtc").AsDateTimeOffset().NotNullable().WithDefault(SystemMethods.CurrentUTCDateTime);
        Execute.Sql("ALTER TABLE lga.JobPolicy ADD CONSTRAINT CK_LGA_JP_UTC CHECK (DATEPART(TZ, UpdatedUtc) = 0);");

        if (!Schema.Schema("lga_lu").Exists()) Execute.Sql("CREATE SCHEMA lga_lu;");
        Create.Table("JobExecutionStatus").InSchema("lga_lu")
            .WithColumn("Id").AsInt32().PrimaryKey().Identity()
            .WithColumn("Code").AsString(50).NotNullable().Unique()
            .WithColumn("Name").AsString(200).NotNullable();
        Insert.IntoTable("JobExecutionStatus").InSchema("lga_lu").Row(new { Code = "Succeeded", Name = "Succeeded" });
        Insert.IntoTable("JobExecutionStatus").InSchema("lga_lu").Row(new { Code = "Failed", Name = "Failed" });
        Insert.IntoTable("JobExecutionStatus").InSchema("lga_lu").Row(new { Code = "Partial", Name = "Partial" });
        Insert.IntoTable("JobExecutionStatus").InSchema("lga_lu").Row(new { Code = "Skipped", Name = "Skipped" });

        Create.Table("JobExecutionLog").InSchema("lga")
            .WithColumn("ExecutionId").AsGuid().PrimaryKey()
            .WithColumn("JobCode").AsString(100).NotNullable()
            .WithColumn("StatusId").AsInt32().NotNullable().ForeignKey("lga_lu", "JobExecutionStatus", "Id")
            .WithColumn("StartedUtc").AsDateTimeOffset().NotNullable().WithDefault(SystemMethods.CurrentUTCDateTime)
            .WithColumn("FinishedUtc").AsDateTimeOffset().Nullable()
            .WithColumn("ItemsProcessed").AsInt32().NotNullable().WithDefaultValue(0)
            .WithColumn("ErrorsCount").AsInt32().NotNullable().WithDefaultValue(0)
            .WithColumn("ErrorMessage").AsString(2000).Nullable()
            .WithColumn("DetailsJson").AsString(int.MaxValue).Nullable();

        Execute.Sql("ALTER TABLE lga.JobExecutionLog ADD CONSTRAINT CK_LGA_JEL_Started_UTC CHECK (DATEPART(TZ, StartedUtc) = 0);");
        Execute.Sql("ALTER TABLE lga.JobExecutionLog ADD CONSTRAINT CK_LGA_JEL_Finished_UTC CHECK (FinishedUtc IS NULL OR DATEPART(TZ, FinishedUtc) = 0);");

        Create.Index("IX_LGA_JEL_JobCode_Time").OnTable("JobExecutionLog").InSchema("lga")
            .OnColumn("JobCode").Ascending().OnColumn("StartedUtc").Descending();
    }

    public override void Down()
    {
        Delete.Table("JobExecutionLog").InSchema("lga");
        Delete.Table("JobExecutionStatus").InSchema("lga_lu");
        Delete.Table("JobPolicy").InSchema("lga");
    }
}

Actor/Grain Integration

Focus: High-fanout coordination with Microsoft Orleans, adding a lightweight virtual-actor façade per CorrelationId that complements the MassTransit Saga. Grains provide: idempotent start gates, duplicate suppression, ordered fan-out, and push-style status streams—while the Saga remains the source of truth for orchestration.


Architecture & Roles

  • Saga-primary, Grain-assist (default):
    • Grain (GenerationGrain) is the front door for StartGeneration and a status/event hub.
    • Saga executes the state machine and writes DB/Outbox; emits integration events.
    • Consumers (bus handlers) call back into the grain to notify state transitions → grain publishes status streams to UIs/agents.
  • Why: Orleans’ single-threaded grain execution gives in-key sequencing and natural dedupe at very high fan-out (1k+ concurrent runs).
flowchart LR
  C["Client (gRPC/MCP)"]-->G["GenerationGrain (CorrelationId)"]
  G-->B[MassTransit Bus]
  B-->S[Saga]
  S-->DB[(NHibernate + Outbox)]
  S--events-->G
  G--status-->Sub["Subscribers (SignalR/Agents/Streams)"]

Grain Model & State (UTC + lookups)

  • Key: CorrelationId (string).
  • State (persisted via Orleans storage provider):
    • AggregateId: Guid
    • CorrelationId: string
    • StatusId: int (FK to GenerationStatuses)
    • BlueprintSha: string?, PatchSha: string?
    • CoveragePct: double?, QualityStatusId: int?
    • CommitId: string?, PullRequestUrl: string?
    • StartedAtUtc: DateTimeOffset, FinishedAtUtc: DateTimeOffset?
    • LastSequence: long (dedupe/ordering fence)

Times are DateTimeOffset in UTC; strings are modeled as Unicode (maps to nvarchar downstream). Statuses are lookup FK codes, not free text.


Interfaces (contract-first)

public interface IGenerationGrain : IGrainWithStringKey
{
    // Idempotent entry point (creates or resumes)
    Task<StartResult> StartAsync(StartRequest request);

    // Saga -> Grain notifications (ordered by Saga)
    Task OnWorkspacePrepared(WorkspacePrepared evt);
    Task OnLibraryGenerated(LibraryGenerated evt);
    Task OnQualityValidated(QualityValidated evt);
    Task OnBranchCommitted(BranchCommitted evt);
    Task OnPullRequestOpened(PullRequestOpened evt);
    Task OnFailed(LibraryGenerationFailed evt);

    // Read model
    Task<GrainStatus> GetStatusAsync();

    // Optional: subscribe to push updates (Orleans Streams)
    Task<StreamSubscriptionHandle<StatusChanged>> SubscribeAsync(Guid subscriberId);
}
  • Idempotency: StartAsync must no-op if the run already exists (same CorrelationId / same content keys).
  • Ordering: Do not mark the grain [Reentrant]; keep single-thread turn-based execution.

Grain Behavior (dual-mode orchestration)

  1. Start path

    • Validate content keys (e.g., BlueprintSha, PatchSha) and check DB (LibraryGenerations by CorrelationId).
    • If new: publish StartLibraryGeneration command to bus and set StatusId = Pending.
    • If existing: surface current status (no duplicate start).
  2. Event path (Saga → Grain)

    • Each consumer (e.g., WorkspacePreparedConsumer) resolves the grain by CorrelationId and invokes the matching On… method.
    • Grain:
      • Updates its local state (cheap, fast).
      • Publishes a status change (Orleans Stream) to subscribers (UI/agents).
      • Optionally sets timeouts (reminders) to detect stalls.
  3. Timeouts / Reminders (watchdog)

    • Grain registers Orleans Reminders (e.g., after Generated, expect QualityValidated within N minutes).
    • On timeout: emit alert event, optionally retry publishing the next command (policy-guarded).

Saga remains authoritative; Grain ensures fan-out and status push without adding new orchestration logic.


Persistence & Providers

  • Clustering: Orleans AdoNet Clustering (SQL Server / Postgres) using the same DB as the app (separate schema/prefix).
  • Grain storage:
    • AdoNet Grain Storage named "lga" (persist grain state; UTC times).
    • Alternatively Memory in dev/sandbox (no durability required).
  • Streams:
    • Azure Service Bus streams (reuse the existing namespace), stream provider "lga-status".
    • Stream key: CorrelationId, namespace: "lga/status".
builder.Host.UseOrleans(silo =>
{
    silo.UseAdoNetClustering(o => { o.Invariant = "System.Data.SqlClient"; o.ConnectionString = cfg.GetConn("Sql"); });
    silo.AddAdoNetGrainStorage("lga", o => { o.Invariant = "System.Data.SqlClient"; o.ConnectionString = cfg.GetConn("Sql"); });
    silo.AddAzureServiceBusStreams("lga-status", (sp, opt) =>
    {
        opt.ConnectionString = cfg["ServiceBus:ConnectionString"]; // via Key Vault/MI
        opt.ConfigureCacheSize(8192);
    });
    // Dashboard optional; disable in prod by default
});

Secrets are fetched via Managed Identity + Key Vault; never log them. All timestamps are UTC.


Sample Grain (skeleton)

[GenerateSerializer]
public sealed class GenerationGrainState
{
    [Id(0)] public Guid AggregateId { get; set; }
    [Id(1)] public string CorrelationId { get; set; } = default!;
    [Id(2)] public int StatusId { get; set; }         // FK → GenerationStatuses
    [Id(3)] public string? BlueprintSha { get; set; }
    [Id(4)] public string? PatchSha { get; set; }
    [Id(5)] public double? CoveragePct { get; set; }
    [Id(6)] public int? QualityStatusId { get; set; } // FK → QualityStatus
    [Id(7)] public string? CommitId { get; set; }
    [Id(8)] public string? PullRequestUrl { get; set; }
    [Id(9)] public DateTimeOffset StartedAtUtc { get; set; }
    [Id(10)] public DateTimeOffset? FinishedAtUtc { get; set; }
    [Id(11)] public long LastSequence { get; set; }
}

public sealed class GenerationGrain : Grain<GenerationGrainState>, IGenerationGrain
{
    private readonly IPublishEndpoint _bus;           // MassTransit
    private readonly IStatusStream _stream;           // wraps Orleans stream provider
    private readonly IClock _clock;                   // returns UtcNow
    private readonly ILookupCache _lookups;           // FK codes → ids

    public GenerationGrain(IPublishEndpoint bus, IStatusStream stream, IClock clock, ILookupCache lookups)
    { _bus = bus; _stream = stream; _clock = clock; _lookups = lookups; }

    public async Task<StartResult> StartAsync(StartRequest request)
    {
        if (State.CorrelationId is null)
        {
            State.CorrelationId = this.GetPrimaryKeyString();
            State.StartedAtUtc  = _clock.UtcNow;
            State.StatusId      = await _lookups.GenerationStatusId("Pending");
            await _bus.Publish(new StartLibraryGeneration(State.CorrelationId, request.BlueprintJson));
            await WriteStateAsync();
            await _stream.EmitAsync(State.CorrelationId, StatusChanged.Pending(State.StartedAtUtc));
            return StartResult.Accepted(State.CorrelationId);
        }
        // Idempotent: return existing status
        return StartResult.AlreadyStarted(State.CorrelationId, State.StatusId);
    }

    public async Task OnLibraryGenerated(LibraryGenerated evt)
    {
        // simple sequence guard (if you add sequence numbers)
        State.StatusId = await _lookups.GenerationStatusId("Generated");
        await WriteStateAsync();
        await _stream.EmitAsync(State.CorrelationId, StatusChanged.Generated(_clock.UtcNow));
    }

    // ... other On* methods mirror this pattern

    public Task<GrainStatus> GetStatusAsync()
        => Task.FromResult(new GrainStatus(State.StatusId, State.PullRequestUrl, State.CoveragePct));
}

Ordering, Dedupe & Replays

  • Ordering: Orleans guarantees single-threaded execution per grain key → natural ordering of events.
  • Dedupe: Maintain LastSequence if Saga emits monotonic sequence numbers; ignore stale repeats.
  • Replays: If a consumer replays an event (DLQ), the grain idempotently ignores duplicates via FK status monotonicity and/or LastSequence.
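The LastSequence fence can be reduced to a few lines: an event is applied only when its sequence number is strictly greater than the last one accepted, so duplicate deliveries and DLQ replays become no-ops inside the grain's single-threaded turn. A minimal sketch; `SequenceFence` is an illustrative name, and in the grain the counter lives in persisted state.

```csharp
using System;

// Monotonic sequence fence used for dedupe inside a single grain activation.
public sealed class SequenceFence
{
    public long LastSequence { get; private set; }

    // Accepts only strictly increasing sequence numbers; stale or duplicate
    // events return false and leave state untouched.
    public bool TryAccept(long sequence)
    {
        if (sequence <= LastSequence) return false;
        LastSequence = sequence;
        return true;
    }
}
```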

Observability

  • Tracing: Include correlationId and grain.type=GenerationGrain in spans for StartAsync and each On* call.
  • Metrics: counters (generation.start.accepted, status.emitted), histograms (time between transitions).
  • Logs: structured; redact PII/tokens; use UTC timestamps.

Security

  • Grain methods are internal to the cluster; ingress is via gRPC/MCP APIs which enforce RBAC.
  • Streams are namespaced and permissioned; no public subscription without auth.
  • No secrets in grain state or status messages.

Configuration (excerpt, UTC)

{
  "Orleans": {
    "ClusterId": "lga-sandbox",
    "ServiceId": "connectsoft-lga",
    "Storage": { "Provider": "AdoNet", "ConnectionString": "Name=Sql" },
    "Streams": { "Status": { "Provider": "AzureServiceBus", "Name": "lga-status" } }
  }
}

Acceptance

  • Orleans host builds and starts alongside the LGA service (silo + clustering + grain storage).
  • Demo scenario: Fire 1000 concurrent StartAsync calls for distinct CorrelationIds:
    • No duplicate starts; all reach PROpened/Completed via Saga.
    • Status stream subscribers receive ordered updates per correlation.
  • Resilience test: Replay selected events via DLQ → grains remain idempotent and ordered.
  • Observability: traces show Grain↔Saga interplay tied by CorrelationId, all timestamps in UTC.

Security & Governance

Focus: Ensure the agent, its generated artifacts, and its pipelines are secure, auditable, and policy-compliant. Security is default-deny, secrets are centrally managed, telemetry is redacted, and all decisions are traceable.


Principles

  • Least privilege by default (Managed Identity first; scoped PATs only if unavoidable).
  • Secrets never persist in code, config, PR text, or logs; resolve just-in-time from Azure Key Vault.
  • RBAC everywhere: gRPC/MCP endpoints, bus access, repo actions, and job runners.
  • Policy as code: allow-lists (repos/paths/licenses), gates (coverage, reviewers), and audit.
  • End-to-end traceability: every action stamped with CorrelationId, TenantId, UTC timestamps.

Secrets & Key Management (Azure Key Vault)

  • Source of truth: Azure Key Vault for ADO PAT (if needed), AI keys, Service Bus SAS (when MI not available).
  • Identity: Managed Identity (workload identity) to read KV; no shared secrets in pipelines.
  • Rotation: short TTL caching; automatic retry on Forbidden to support mid-run rotation.
  • Prohibition: secret values are never written to:
    • PR descriptions/comments
    • Logs/metrics/traces (redacted at sink)
    • Blueprint options or artifacts

Options & retrieval (excerpt):

// IOptions pattern with KV resolution via Azure.Identity
builder.Services.AddSingleton<ISecretResolver, KeyVaultSecretResolver>();
// Usage in adapters (ADO, AI, SB):
var pat = await secrets.GetAsync("kv://secrets/ado-pat-lga", ct); // value held in memory only for the call
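One way `KeyVaultSecretResolver` could implement short-TTL caching plus a single retry on 403 to tolerate mid-run rotation. This is a sketch: the 5-minute TTL and the cache wiring are assumptions; the `SecretClient` comes from `Azure.Security.KeyVault.Secrets` authenticated via Managed Identity.

```csharp
public sealed class KeyVaultSecretResolver : ISecretResolver
{
    private readonly SecretClient _client;   // authed via DefaultAzureCredential / Managed Identity
    private readonly IMemoryCache _cache;
    private static readonly TimeSpan Ttl = TimeSpan.FromMinutes(5); // short TTL picks up rotation

    public KeyVaultSecretResolver(SecretClient client, IMemoryCache cache)
        { _client = client; _cache = cache; }

    public async Task<string> GetAsync(string name, CancellationToken ct)
    {
        if (_cache.TryGetValue(name, out string? cached) && cached is not null) return cached;
        try
        {
            return await FetchAndCacheAsync(name, ct);
        }
        catch (Azure.RequestFailedException ex) when (ex.Status == 403)
        {
            // Rotation in flight: drop any stale entry and retry once.
            _cache.Remove(name);
            return await FetchAndCacheAsync(name, ct);
        }
    }

    private async Task<string> FetchAndCacheAsync(string name, CancellationToken ct)
    {
        var secret = await _client.GetSecretAsync(name, cancellationToken: ct);
        _cache.Set(name, secret.Value.Value, Ttl);
        return secret.Value.Value;
    }
}
```

The value lives only in the in-process cache for the TTL window and is never logged, consistent with the prohibitions above.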

Endpoint Security (gRPC & MCP)

  • AuthN (default): Azure AD tokens (JWT) via AddMicrosoftIdentityWebApi.
  • AuthZ: policy-based with scopes/roles:
    • Lga.Run (start generations), Lga.Read (status), Lga.Admin (ops).
  • mTLS (optional): for intra-cluster calls where AAD is not feasible.
  • Rate limits & quotas: per TenantId (sliding window + concurrency caps).
  • Input hardening: size limits, JSON schema validation for blueprints/options, repo/path allow-lists.

ASP.NET Core policies:

builder.Services.AddAuthorization(o =>
{
    o.AddPolicy("Lga.Run",  p => p.RequireClaim("scp","lga.run").RequireRole("LGA.Runner","LGA.Admin"));
    o.AddPolicy("Lga.Read", p => p.RequireClaim("scp","lga.read"));
    o.AddPolicy("Lga.Admin",p => p.RequireRole("LGA.Admin"));
});
app.MapGrpcService<LgaGrpcService>().RequireAuthorization("Lga.Read"); // read-only
app.MapPost("/mcp/tools/library.start_generation", ...).RequireAuthorization("Lga.Run");

Service Bus (bus) security: namespace-level role assignments for the app’s Managed Identity (Send/Listen only to required topics/queues).
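The per-tenant sliding-window limits mentioned above can be wired with ASP.NET Core's built-in rate limiter, partitioned by a tenant claim. A sketch under assumptions: the `tid` claim as partition key and the concrete limits are illustrative.

```csharp
builder.Services.AddRateLimiter(o =>
{
    o.AddPolicy("per-tenant", httpContext =>
    {
        // Partition the limiter by tenant; unauthenticated callers share one bucket.
        var tenantId = httpContext.User.FindFirst("tid")?.Value ?? "anonymous";
        return RateLimitPartition.GetSlidingWindowLimiter(tenantId, _ => new SlidingWindowRateLimiterOptions
        {
            PermitLimit = 100,                 // requests per window per tenant (example value)
            Window = TimeSpan.FromMinutes(1),
            SegmentsPerWindow = 6,
            QueueLimit = 0                     // reject rather than queue under pressure
        });
    });
});
app.UseRateLimiter();
```

Endpoints opt in with `.RequireRateLimiting("per-tenant")` alongside the authorization policies shown below.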


Compliance & Governance Controls

  • Repo allow-list: only ADO repos matching https://dev.azure.com/connectsoft/{Project}/_git/*.
  • Path allow-list: FS operations restricted to /tmp/lga/** (enforced in MCP FS adapter).
  • License allow-list: MIT, Apache-2.0, Proprietary; blocked otherwise.
  • Branch policy enforcement: required reviewers (arch-bot,qa-bot), status checks (LGA/Quality), conventional commits.
  • Commit/PR provenance: PR body includes a trace footer (CorrelationId, BlueprintSha, PatchSha, GeneratorVersion) and no secrets.
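A hedged sketch of how the repo and path allow-lists might be enforced before any Git/FS operation. The regex is a literal translation of the allow-list above; `PolicyViolationException` is a hypothetical exception type.

```csharp
public sealed class AllowListGuard
{
    private static readonly Regex RepoPattern = new(
        @"^https://dev\.azure\.com/connectsoft/[^/]+/_git/[^/]+$", RegexOptions.Compiled);
    private const string AllowedRoot = "/tmp/lga/";

    public void AssertRepoAllowed(string repoUrl)
    {
        if (!RepoPattern.IsMatch(repoUrl))
            throw new PolicyViolationException($"Repo outside allow-list: {repoUrl}");
    }

    public void AssertPathAllowed(string path)
    {
        // Normalize first to defeat ../ traversal, then check the allowed root prefix.
        var full = Path.GetFullPath(path);
        if (!full.StartsWith(AllowedRoot, StringComparison.Ordinal))
            throw new PolicyViolationException($"Path outside allow-list: {full}");
    }
}
```

Violations should also emit the `ComplianceEvent` shown below so denied attempts remain auditable.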

Compliance event schema (stored + emitted):

public record ComplianceEvent(
  string CorrelationId,
  string TenantId,
  string Action,                 // e.g., "PR_OPENED", "BRANCH_PUSHED"
  string Repo, string Branch,
  string TemplateVersion,
  string License,                // from blueprint
  DateTimeOffset OccurredAtUtc,
  IDictionary<string,string> Tags);

  • Persist to DB (extend AuditEntry or new ComplianceEvents table, datetimeoffset UTC, all nvarchar).
  • Export to Log Analytics / SIEM; alert on violations (e.g., repo outside allow-list).

Supply Chain Security

  • Dependency locking: dotnet restore --locked-mode.
  • SBOM: CycloneDX for app and generated libraries; attach to build artifacts.
  • Image scanning: ACR Defender or Trivy on built images; fail on Critical vulns.
  • Base images pinned by digest; renovate automation for patch updates.
  • Package feeds: only approved NuGet sources; tamper-evident provenance (SLSA-style metadata).
  • Optional signing: NuGet package signing & container image signing (Notary v2) for release tiers.

CI (PR) enforcement (excerpt):

- script: cyclonedx dotnet --out-dir sbom --json
  displayName: Generate SBOM
- script: trivy image $(ACR_LOGIN_SERVER)/connectsoft/lga:$(IMAGE_TAG) --severity CRITICAL --exit-code 1
  displayName: Scan container image

Logging, Redaction & PII

  • Structured logging with automatic redaction of secrets (Authorization, Set-Cookie, PAT patterns).
  • Trace context: include CorrelationId, TenantId, RunId, UTC timestamp.
  • Sampling: full logs on failure paths; 10–20% sampling on success (configurable).
  • Retention: align with org policy (e.g., 30–90 days app logs; 365 days audit).

Redaction filter (concept):

builder.Services.AddSingleton<ILogRedactor, DefaultLogRedactor>();
// in Serilog/OTEL sink enrichers:
LogContext.PushProperty("tenantId", tenantId);
var safeMsg = _redactor.Redact(message); // masks tokens, secrets, emails if needed
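A `DefaultLogRedactor` could mask common token shapes with regexes. The patterns below are illustrative, not exhaustive; production redaction should cover the org's actual secret formats.

```csharp
public sealed class DefaultLogRedactor : ILogRedactor
{
    private static readonly Regex[] Patterns =
    {
        new(@"(?i)(authorization:\s*bearer\s+)\S+", RegexOptions.Compiled),           // bearer tokens
        new(@"(?i)(pat|token|secret|password)\s*[:=]\s*\S+", RegexOptions.Compiled),  // key=value secrets
        new(@"[A-Za-z0-9+/]{40,}={0,2}", RegexOptions.Compiled)                       // SAS/base64-like blobs
    };

    public string Redact(string message)
    {
        foreach (var p in Patterns)
            message = p.Replace(message, m =>
                m.Groups.Count > 1 && m.Groups[1].Success ? m.Groups[1].Value + "***" : "***");
        return message;
    }
}
```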

Policy as Code (centralized)

  • securityPolicy.yaml (checked-in):
    • allowedRepos, allowedPaths, allowedLicenses
    • requiredReviewers, coverageFloorPct, maxBranchLifetimeDays
  • Loaded at startup; changes are hot-reloaded (with signature/ETag to prevent tamper).

Example:

allowedRepos:
  - "https://dev.azure.com/connectsoft/*/_git/*"
allowedPaths: [ "/tmp/lga/**" ]
allowedLicenses: [ "MIT", "Apache-2.0", "Proprietary" ]
requiredReviewers: [ "arch-bot", "qa-bot" ]
coverageFloorPct: 70
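The file could be bound with the options pattern so changes hot-reload without restarts. A sketch assuming a YAML configuration provider (e.g., a community `AddYamlFile` extension — not built into ASP.NET Core) and a POCO mirroring the YAML above:

```csharp
public sealed class SecurityPolicy
{
    public string[] AllowedRepos { get; init; } = Array.Empty<string>();
    public string[] AllowedPaths { get; init; } = Array.Empty<string>();
    public string[] AllowedLicenses { get; init; } = Array.Empty<string>();
    public string[] RequiredReviewers { get; init; } = Array.Empty<string>();
    public int CoverageFloorPct { get; init; }
}

// reloadOnChange triggers IOptionsMonitor notifications on file edits.
builder.Configuration.AddYamlFile("securityPolicy.yaml", optional: false, reloadOnChange: true);
builder.Services.Configure<SecurityPolicy>(builder.Configuration);
// Consumers take IOptionsMonitor<SecurityPolicy> to observe hot reloads.
```

Signature/ETag verification of the file on reload (as noted above) would sit in a change-token callback before the new values are accepted.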

Threat Model (STRIDE snapshot)

| Threat | Mitigation |
|---|---|
| Spoofed caller | AAD + RBAC; mTLS optional; per-tenant quotas |
| Tampering | Git allow-lists; branch policies; commit provenance; audit trail |
| Repudiation | AuditEntry & ComplianceEvent with CorrelationId (UTC) |
| Information leak | Redaction; KV secrets; deny-by-default FS/Git operations |
| DoS | Rate limits; backpressure; KEDA scale; queue TTLs |
| Elevation | Least privilege MI; no inline secrets; scoped PAT; code reviews & checks |

Database Additions (optional, conforms to conventions)

  • lga.ComplianceEvents
    • Id (BIGINT IDENTITY PK), CorrelationId (UNIQUEIDENTIFIER), TenantId (NVARCHAR 64), Action (NVARCHAR 64), Repo (NVARCHAR 400), Branch (NVARCHAR 200), TemplateVersion (NVARCHAR 64), License (NVARCHAR 32), OccurredAtUtc (DATETIMEOFFSET(0), UTC check), DetailsJson (NVARCHAR(MAX)).
    • Index: (Action, OccurredAtUtc DESC), (Repo, Branch).
    • Seeded lookup table for ComplianceAction codes (FK) if you want strong typing.

Acceptance

  • Security scans pass in CI:
    • SAST (code), dependency scan (NuGet), SBOM generated, container scan → no Critical.

  • Secrets: validated KV access via MI; no plaintext secrets in repo, logs, or PRs (regex audit passes).
  • RBAC: gRPC/MCP endpoints enforce Lga.Run/Lga.Read/Lga.Admin; negative tests confirm denial.
  • Governance: allow-lists active; attempts to target non-approved repos/paths fail with audit entries.
  • Compliance logging: ComplianceEvent (or AuditEntry) created for branch pushes/PRs with UTC timestamps and CorrelationId.
  • Pipelines: SBOM + image scan artifacts published; failing gates block merge.

Observability Extensions

Focus: Deepen end-to-end visibility of LGA with OpenTelemetry traces/metrics/logs, structured quality events, and Grafana/Prometheus dashboards. Every span/log/metric is tenant-aware, UTC-stamped, and correlatable by CorrelationId across gRPC ⇄ Saga ⇄ Orleans ⇄ Adapters ⇄ Pipelines.


Telemetry Architecture

flowchart LR
  App[ LGA Service ] --> OTEL[OpenTelemetry SDK]
  OTEL --> Col[OTel Collector]
  Col --> Tempo[(Grafana Tempo - Traces)]
  Col --> Prom[(Prometheus - Metrics)]
  Col --> Loki[(Loki/Log Analytics - Logs)]
  • Propagation: W3C TraceContext (traceparent) across gRPC, MassTransit (headers), Orleans (grain calls), and HTTP adapters (ADO, MCP).
  • Identity fields (everywhere): correlation_id, tenant_id, run_id, blueprint_sha, patch_sha (when known), UTC timestamps.

Tracing — Spans per Saga Step

Create an ActivitySource("ConnectSoft.Factory.BackendLibraryGeneratorAgent") and instrument each saga transition and adapter call. Use server/client semantics per OpenTelemetry conventions.

Span plan (names & key attributes):

| Span Name | Kind | Key Attributes |
|---|---|---|
| lga.start_generation | SERVER | correlation_id, tenant_id, request.size, blueprint.sha |
| lga.prepare_workspace | INTERNAL | workspace.root, fs.op=mkdir, allowlist=true |
| lga.template.generate | INTERNAL | template.version, switches, file.count, patch.sha |
| lga.quality.run | INTERNAL | coverage.pct, test.total, test.failed, quality.min_pct |
| git.clone / git.push | CLIENT | repo, branch, commit.id, retry.attempt |
| ado.pr.create | CLIENT | pr.url, reviewers, status_check |
| lga.finalize | INTERNAL | status=Completed\|Failed, duration.ms |
| orleans.grain.call | INTERNAL | grain=GenerationGrain, method, correlation_id |
| mcp.tool.call | SERVER | tool=library.start_generation, caller |

MassTransit propagation: install a send/consume filter that injects/extracts context from message headers; include correlation_id as both MT header and span attribute.
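The send side of that filter might look like this (the consume side mirrors it with extraction); `IFilter<SendContext<T>>` is MassTransit's standard filter seam, while reading `correlation_id` off the current activity tag is an assumption of this sketch:

```csharp
public sealed class CorrelationPropagationFilter<T> : IFilter<SendContext<T>> where T : class
{
    public async Task Send(SendContext<T> context, IPipe<SendContext<T>> next)
    {
        // Mirror the current trace context into message headers for downstream consumers.
        var activity = Activity.Current;
        if (activity is not null)
        {
            context.Headers.Set("traceparent", activity.Id);
            context.Headers.Set("correlation_id", activity.GetTagItem("correlation_id")?.ToString());
        }
        await next.Send(context);
    }

    public void Probe(ProbeContext context) => context.CreateFilterScope("correlationPropagation");
}
```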

static readonly ActivitySource Lga = new("ConnectSoft.Factory.BackendLibraryGeneratorAgent");

using var act = Lga.StartActivity("lga.template.generate", ActivityKind.Internal);
act?.SetTag("correlation_id", corrId);
act?.SetTag("template.version", templateVersion);
act?.SetTag("switches", string.Join(",", switches));

Sampling strategy:

  • Sandbox/Dev: 100% head sampling.
  • Prod: Tail-sampling: keep all errors and a percentage of success (e.g., 10%), plus long-latency spans (p95+).

Metrics — RED/USE + Domain KPIs

Expose Prometheus-scrapable metrics via OTel Metrics.

  • RED (requests):

    • lga_requests_total{endpoint,code}
    • lga_request_duration_seconds_bucket{endpoint}
    • lga_errors_total{endpoint,reason}
  • USE (resources):

    • lga_bus_messages_inflight{queue}
    • lga_outbox_pending
    • lga_jobs_running{job}
  • Domain KPIs:

    • lga_generation_latency_seconds (Start→PR/Complete, histogram)
    • lga_quality_coverage_pct (gauge; labeled by packageId)
    • lga_generations_total{status}
    • lga_pr_open_rate{repo}

// Instruments hang off a named Meter matching the OTel AddMeter(...) registration below.
var meter = new Meter("ConnectSoft.Factory.BackendLibraryGeneratorAgent");
var generationLatency = meter.CreateHistogram<double>("lga_generation_latency_seconds");
var generationsTotal  = meter.CreateCounter<long>("lga_generations_total");
// Gauges are observable in System.Diagnostics.Metrics: report the last known coverage via callback.
meter.CreateObservableGauge("lga_quality_coverage_pct", () => lastCoveragePct);

Logs — Structured, Redacted, Correlatable

  • JSON logs with fields: timestamp_utc, level, message, trace_id, span_id, correlation_id, tenant_id.
  • Redaction filter masks secrets (PAT, tokens) and PII.
  • Error logs include failure_reason and first N lines of failing command output (never full secrets).

Structured Events — Coverage & Tests

Emit domain events to logs/OTel events and persist summary where needed (e.g., QualityRuns).

Event: quality.result.emitted

{
  "event": "quality.result.emitted",
  "timestamp_utc": "2025-09-19T12:34:56Z",
  "correlation_id": "abc123",
  "tenant_id": "org-01",
  "coverage_pct": 78.3,
  "min_required_pct": 70,
  "tests_total": 321,
  "tests_failed": 0,
  "report_uri": "https://ado/.../coverage"
}

Event: generation.lifecycle.changed

  • Attributes: from_status_id, to_status_id, elapsed_ms_since_start, patch_sha, pr_url?.

Both are attached to the current span via Activity.AddEvent(...) and written to logs.
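Attaching the lifecycle event is a single `Activity.AddEvent` call; the tag names follow the attribute list above, and the local variables (`fromStatusId`, `patchSha`, etc.) are assumed to be in scope at the transition point:

```csharp
Activity.Current?.AddEvent(new ActivityEvent(
    "generation.lifecycle.changed",
    tags: new ActivityTagsCollection
    {
        ["from_status_id"] = fromStatusId,
        ["to_status_id"]   = toStatusId,
        ["elapsed_ms_since_start"] = (long)(clock.UtcNow - startedAtUtc).TotalMilliseconds,
        ["patch_sha"] = patchSha,
        ["pr_url"]    = prUrl   // null until OpenPullRequest succeeds
    }));
```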


Dashboards — Grafana Composition

1) Executive Overview

  • Panels: “Generations by Status (stacked)”, “p50/p95 PR Open Latency”, “Error Rate by Endpoint”, “Coverage Distribution”.
  • Filters: tenant, repo, template.version, time.

2) Run Drill-down

  • Panels: “Run Timeline (Trace Waterfall via Tempo)”, “Saga Step Durations”, “Adapter Retries”, “DLQ Count”.
  • Links: panel → trace in Tempo (by correlation_id), → PR in ADO, → coverage report.

3) Quality & CI

  • Panels: “Coverage p50/p95 by Package”, “Test Failures by Suite”, “Gate Failures Over Time”, “SBOM/Scan Findings (counts)”.

4) Jobs & Queues

  • Panels: “Outbox Pending”, “DLQ Size”, “Job Success Rate”, “Job Duration”.

Golden signals & SLOs:

  • SLO: PR-open ≤ 5 min p95 (alert if > 10 min for 3 intervals).
  • Error budget for Failed runs per day.
  • Alert when coverage_pct < min_required_pct for canary tenants.

Wiring the Collector (example)

appsettings (export OTLP to collector):

{
  "OpenTelemetry": {
    "Exporter": { "OtlpEndpoint": "http://otel-collector:4317" },
    "Sampling": { "Head": "AlwaysOn", "Tail": "ErrorsAndSlow" }
  }
}

Program.cs:

builder.Services.AddOpenTelemetry()
    .WithTracing(t => t
        .AddSource("ConnectSoft.Factory.BackendLibraryGeneratorAgent")
        .AddSource("MassTransit")                      // MassTransit 8 emits via this ActivitySource
        .AddSource("Microsoft.Orleans.Runtime")        // Orleans distributed tracing sources
        .AddSource("Microsoft.Orleans.Application")
        .AddAspNetCoreInstrumentation()
        .AddGrpcClientInstrumentation()
        .AddHttpClientInstrumentation()
        .AddOtlpExporter())
    .WithMetrics(m => m
        .AddMeter("ConnectSoft.Factory.BackendLibraryGeneratorAgent")
        .AddRuntimeInstrumentation()
        .AddAspNetCoreInstrumentation()
        .AddOtlpExporter());

Cross-Agent & Pipeline Correlation

  • Include CorrelationId as:
    • gRPC metadata (x-correlation-id)
    • MassTransit headers (CorrelationId, TraceParent)
    • Orleans grain key
    • PR description trace block
    • CI job variables (CORRELATION_ID) for smoke tests
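For the gRPC leg, a client interceptor can stamp `x-correlation-id` on every outbound call; the server side reads the same header into its logging scope. A minimal sketch (sourcing the id from the current activity tag is an assumption):

```csharp
public sealed class CorrelationIdInterceptor : Interceptor
{
    public override AsyncUnaryCall<TResponse> AsyncUnaryCall<TRequest, TResponse>(
        TRequest request,
        ClientInterceptorContext<TRequest, TResponse> context,
        AsyncUnaryCallContinuation<TRequest, TResponse> continuation)
    {
        var headers = context.Options.Headers ?? new Metadata();
        // Reuse the ambient correlation id tagged on the current activity, if any.
        var corrId = Activity.Current?.GetTagItem("correlation_id")?.ToString();
        if (corrId is not null) headers.Add("x-correlation-id", corrId);

        var options = context.Options.WithHeaders(headers);
        return continuation(request, new ClientInterceptorContext<TRequest, TResponse>(
            context.Method, context.Host, options));
    }
}
```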

Result: Tempo trace shows end-to-end waterfall; clicking a span jumps to PR and coverage artifacts.


Tests & Guardrails

  • Span presence tests: invoke a happy path; assert spans for each saga step exist and are linked by correlation_id.
  • Metric shape tests: scrape /metrics locally; verify required metrics present with labels.
  • Redaction tests: log messages containing fake tokens → ensure masked.
  • Load tests: 1k concurrent starts → p95 latency within SLO; no missing spans.

Acceptance

  • Dashboards render a full generation lifecycle (Start → Workspace → Template → Quality → Push → PR) with click-through traces by CorrelationId.
  • Coverage/test events appear as structured entries and panels; failing gates raise alerts.
  • Trace/metric/log IDs consistently link across agents (gRPC, Saga, Orleans, adapters).
  • SLOs/alerts configured; deliberate failures trigger high-quality, actionable alerts.

Error Handling & Resilience

Focus: Define deterministic failure handling with compensations, judicious retries, and human-in-the-loop escalation—while preserving idempotency and traceability (UTC, CorrelationId, lookup-backed statuses).


Failure Taxonomy & Policy

| Class | Examples | Retry? | Notes |
|---|---|---|---|
| Transient | network timeouts, ADO 429, SB 503 | Yes (exponential backoff + jitter) | Cap attempts; persist counters in saga state. |
| Conflict | Git non-fast-forward, PR already exists | Conditional | Retry after rebase/stash or idempotent detect. |
| Quality Gate | coverage below floor | No (auto) | Mark QualityFailed; allow manual override via PR comment label. |
| Validation | invalid blueprint/options | No | Fail fast; record reason; suggest remediation. |
| Auth/Policy | repo not in allow-list, RBAC denied | No | Governance violation; escalate with audit. |

Retry budget: per adapter operation (Git push, PR create, FS ops) store AttemptCount and NextAttemptAtUtc (datetimeoffset, UTC) in saga state; envelope published via Outbox includes these hints to downstream consumers.


Compensation Matrix (by Saga Step)

| Step | Failure | Compensation | Final State |
|---|---|---|---|
| PrepareWorkspace | FS error | Clean temp dir; mark audit; suggest re-run | Failed |
| RunTemplate | template error | Attach generator logs as artifact; PR not created | Failed |
| RunQuality | coverage < floor | Add PR comment (if PR exists); label quality:failed | QualityFailed |
| PushBranch | non-ff / rejected | Fetch+rebase once; if still fails → create Work Item + PR comment | Failed |
| OpenPullRequest | 429 / transient | Retry w/ backoff; on permanent error → Work Item, post branch link via comment | Failed |
| Post-PR | later step fails | PR comment with error + remediation; keep branch for triage | Failed (terminal) |

All comments/labels are policy-driven and redact secrets. Strings are nvarchar, timestamps are datetimeoffset(UTC).


Retry Strategies (Adapters)

Common pattern: exponential backoff + jitter + circuit break; idempotency keys to make retries safe.

static async Task<T> RetryAsync<T>(Func<Task<T>> op, int max = 4, int baseMs = 400, CancellationToken ct = default)
{
    Exception? last = null;
    var rng = new Random();
    for (var i = 0; i < max; i++)
    {
        try { return await op(); }
        catch (TransientException ex) // adapter-specific classification
        {
            last = ex;
            if (i == max - 1) break; // final attempt failed: no pointless delay before throwing
            var delay = TimeSpan.FromMilliseconds(baseMs * Math.Pow(2, i) + rng.Next(0, 250));
            await Task.Delay(delay, ct);
        }
    }
    throw last ?? new InvalidOperationException("Retry budget exhausted");
}
  • Git push: include idempotent commit message token [#corr:{CorrelationId}]; on duplicates detect already pushed.
  • PR create: compute idempotency key from (repo, branch); if 409, fetch existing PR and continue.
  • ADO comments: use dedupe key in comment footer to avoid duplicates on retries.

Dual-Mode Orchestration Guards (Grain + Saga)

  • GenerationGrain is the front-door deduper (per CorrelationId), preventing duplicate StartGeneration.
  • Saga persists AttemptCount/NextAttemptAtUtc per step; timers/redelivery follow these values.
  • InboxDedup table ensures at-least-once message handlers are idempotent (MessageId,Consumer unique).
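An at-least-once handler consults `InboxDedup` before doing any work. A sketch, where `IInboxDedupStore` is a hypothetical repository over the table's `(MessageId, Consumer)` unique key:

```csharp
public sealed class LibraryGeneratedConsumer : IConsumer<LibraryGenerated>
{
    private readonly IInboxDedupStore _inbox;

    public LibraryGeneratedConsumer(IInboxDedupStore inbox) => _inbox = inbox;

    public async Task Consume(ConsumeContext<LibraryGenerated> ctx)
    {
        // TryRecord inserts (MessageId, Consumer); a unique-key violation means "already handled".
        var firstDelivery = await _inbox.TryRecordAsync(
            ctx.MessageId!.Value, nameof(LibraryGeneratedConsumer), ctx.CancellationToken);
        if (!firstDelivery) return; // duplicate redelivery: ack and skip side effects

        // ... idempotent handling here (status update, stream emit, etc.) ...
    }
}
```

The insert and the handler's side effects should share one transaction (via the Outbox) so the dedupe record and the work commit or roll back together.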

Human-in-the-Loop Escalation

  • PR Comment (structured):

❗ LGA failed at step: OpenPullRequest
CorrelationId: 9b1c…  | Tenant: contoso  | UTC: 2025-09-19T12:31:00Z
Reason: ADO 403 (policy)
Next steps: ensure repo allow-list & service connection. 
  • Labels: lga:failed, quality:failed, needs:review.
  • Work Item (ADO): auto-created with links to PR, coverage report, trace (Tempo).
  • Notification: optional Teams/Slack webhook with deep links; throttled (aggregate duplicates per CorrelationId).


Structured Alerts & Signals

  • Alert rules (Grafana/Prometheus):
    • lga_generations_total{status="Failed"} > X in 10m.
    • lga_generation_latency_seconds{quantile="0.95"} > 600.
    • lga_outbox_pending > Y for 15m (stuck dispatcher).
  • Payload includes: correlation_id, tenant_id, step, reason, trace_id (click-through).

Persistence Additions (optional)

  • lga.FailureLedger
    • Id BIGINT, CorrelationId UNIQUEIDENTIFIER, StepId INT (lookup), OccurredAtUtc DATETIMEOFFSET(0), Attempt INT, Reason NVARCHAR(2000), DetailsJson NVARCHAR(MAX).
  • Lookup lga_lu.SagaStep: PrepareWorkspace, RunTemplate, RunQuality, PushBranch, OpenPullRequest.

All UTC and NVARCHAR; joins power reports and triage dashboards.


Test Harness: Five Scenarios (must pass)

  1. Template Error

    • Inject failure in template adapter.
    • Expect state Failed, audit entry, no PR, alert fired, deterministic logs.
  2. Quality Gate Fail

    • coverage=62 < 70.
    • Expect QualityFailed, PR comment+label if PR existed; saga halts without push/PR (if ordered earlier).
  3. Git Push Rejected

    • Simulate upstream changes; first retry does fetch+rebase, second fails.
    • Expect Work Item, PR comment, final Failed, branch left for triage.
  4. PR Creation Rate-Limited (429)

    • 2 transient retries succeed; verify AttemptCount increments and final PROpened.
    • Metrics show retries; no duplicate PR.
  5. Outbox Dispatcher Stuck

    • Block dispatch once; redelivery unblocks.
    • Expect alert on outbox_pending, then recovery; no duplicate side effects (InboxDedup verified).

Assertions (per scenario):

  • Correct terminal statusId (lookup table),
  • UTC timestamps set (datetimeoffset),
  • Comments/labels/work items created as specified,
  • Traces include step span with failure_reason.

Acceptance

  • Harness executes the five scenarios above; expected compensations observed and alerts produced.
  • Retry/backoff limits and attempt counters persisted and respected across message redeliveries.
  • No secrets leak; PR comments and logs are redacted; events/audit rows are UTC and nvarchar.
  • Operators can triage via PR comment, Work Item, and trace link in < 5 minutes.

Knowledge & Memory Hooks

Focus: Persist and surface institutional memory of past generations so agents can reuse what worked and avoid regressions. Implements a pluggable embeddings + vector index pipeline and contract-first query APIs. All timestamps are datetimeoffset (UTC), strings are nvarchar, and enums use lookup tables.


Architecture

flowchart LR
  subgraph LGA Service
    Saga[Saga]
    Outbox[(Outbox)]
    ArtX[Artifact Extractor]
    KI[Knowledge Ingestor]
  end

  Saga -- events --> Outbox
  Outbox --> KI
  KI --> Emb[IAiEmbeddingService]
  KI --> VIdx[IVectorIndex]
  KI --> DB[(NHibernate: Knowledge* tables)]
  ArtX --> KI

  subgraph Consumers
    Agents[["Reasoning Agents (MCP/gRPC)"]]
    Studio[["DevEx Studio / Search UI"]]
  end

  Agents --> API[gRPC/MCP Memory API]
  Studio --> API
  API --> DB
  API --> VIdx

Trigger points: after Generated, QualityValidated, and PROpened/Completed, the Knowledge Ingestor assembles metadata + artifacts, embeds text, and upserts into the vector index with idempotent keys.


Data Model (DB)

New tables under schema lga (lookups in lga_lu). All UTC and NVARCHAR.

lga.KnowledgeRun

| Column | Type | Notes |
|---|---|---|
| Id (PK) | uniqueidentifier | Run memory id |
| LibraryGenerationId (FK) | uniqueidentifier | lga.LibraryGeneration(id) |
| CorrelationId (AK) | nvarchar(64) | Search key |
| TenantId | nvarchar(64) | Isolation |
| BlueprintSha | nvarchar(64) | Fingerprint |
| PatchSha | nvarchar(64) | Diff fingerprint |
| StatusId (FK) | int | lga_lu.LibraryGenerationStatus(id) |
| PackageId | nvarchar(200) | NuGet packageId |
| TemplateVersion | nvarchar(32) | Generator/template ver |
| Repo | nvarchar(400) | org/project or org/repo |
| Branch | nvarchar(200) | branch |
| PrUrl | nvarchar(2000) | PR link |
| CoveragePct | decimal(5,2) | quality outcome |
| StartedAtUtc | datetimeoffset(0) | UTC |
| FinishedAtUtc | datetimeoffset(0) | UTC, nullable |

Indexes:

  • UX_KnowledgeRun_CorrelationId (unique),
  • IX_KnowledgeRun_BlueprintSha_PatchSha,
  • IX_KnowledgeRun_Tenant_Status_FinishedAtUtc.

lga.KnowledgeArtifact

| Column | Type | Notes |
|---|---|---|
| Id (PK) | bigint | identity |
| KnowledgeRunId (FK) | uniqueidentifier | Parent |
| ArtifactTypeId (FK) | int | lga_lu.ArtifactType(id) |
| Title | nvarchar(200) | e.g., README |
| Uri | nvarchar(2000) | blob/fs/PR url |
| ContentSha | nvarchar(64) | idempotency |
| Excerpt | nvarchar(1000) | preview |
| CreatedAtUtc | datetimeoffset(0) | UTC |

lga.KnowledgeEmbedding

| Column | Type | Notes |
|---|---|---|
| Id (PK) | uniqueidentifier | |
| ArtifactId (FK) | bigint | KnowledgeArtifact |
| ProviderId (FK) | int | lga_lu.EmbeddingProvider(id) |
| Dim | int | embedding dimension |
| VectorRef | nvarchar(2000) | pointer in external index (pgvector/Qdrant/Azure AI Search) |
| InsertedAtUtc | datetimeoffset(0) | UTC |

Lookups (seeded):

  • lga_lu.ArtifactType: Readme, PipelineYaml, Csproj, PrComment, Blueprint, Options.
  • lga_lu.EmbeddingProvider: azure-aisearch, pgvector, qdrant, etc.

Keep artifact content out of the DB by default—store in blob/MCP FS and index the cleaned text. Always redact secrets.


Ports & Adapters

  • IAiEmbeddingService → returns float[] embeddings for text chunks.
    • Adapters: Azure OpenAI / OSS model; rate-limited; no secret logging.
  • IVectorIndex → upsert & KNN search.
    • Adapters: Azure AI Search (vector fields), pgvector, Qdrant.
  • IArtifactExtractor → pulls normalized text from:
    • Generated files (README, YAML, .csproj metadata),
    • PR (title, description, bot comments with coverage summaries),
    • Blueprint + options (minus PII/secrets).

Idempotency keys: artifactKey = $"{KnowledgeRunId}:{ArtifactTypeId}:{ContentSha}".


Ingestion Flow

  1. Saga emits LibraryGenerated / QualityValidated / PullRequestOpened (Outbox).
  2. Knowledge Ingestor (consumer) loads the run and artifacts; computes embeddings via IAiEmbeddingService.
  3. Chunking: 2–3k tokens/window with overlap; attach metadata:
    • correlation_id, tenant_id, repo, branch, package_id, template_version, blueprint_sha, patch_sha, artifact_type.
  4. Upsert to IVectorIndex and persist KnowledgeRun/KnowledgeArtifact/KnowledgeEmbedding rows (UTC).
  5. Observability: spans knowledge.embed, knowledge.index, metrics lga_memory_artifacts_total, lga_memory_index_latency_seconds.
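The chunking step above can be a simple sliding window over the normalized text. This sketch approximates token counts by characters (roughly 2–3k tokens ≈ 8–12k characters of English/code); a real implementation would count with the embedding model's tokenizer.

```csharp
public static IEnumerable<string> Chunk(string text, int windowChars = 8000, int overlapChars = 800)
{
    // Each window overlaps the previous one by overlapChars so no sentence is split
    // across a hard boundary without context.
    for (var start = 0; start < text.Length; start += windowChars - overlapChars)
    {
        var len = Math.Min(windowChars, text.Length - start);
        yield return text.Substring(start, len);
        if (start + len >= text.Length) yield break;
    }
}
```

Each yielded chunk is embedded separately and upserted with the shared metadata listed in step 3.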

Query API (contract-first)

gRPC (C# DTOs)

public sealed record SearchByBlueprintShaRequest(string TenantId, string BlueprintSha, string? PatchSha);
public sealed record SearchByBlueprintShaResponse(KnowledgeRunSummary? Run);

public sealed record SemanticSearchRequest(string TenantId, string Query, int TopK = 5, string? FilterRepo = null);
public sealed record SemanticSearchResponse(IReadOnlyList<ArtifactHit> Hits);

public sealed record KnowledgeRunSummary(
  Guid RunId, string CorrelationId, string PackageId, string TemplateVersion,
  string Repo, string Branch, string PrUrl, decimal? CoveragePct,
  int StatusId, DateTimeOffset StartedAtUtc, DateTimeOffset? FinishedAtUtc);

public sealed record ArtifactHit(
  long ArtifactId, string Title, string Uri, string ArtifactTypeCode,
  float Score, Guid RunId, string CorrelationId, string Repo, string Branch);

  • MCP tools:
    • memory.search_by_sha (params: tenantId, blueprintSha, patchSha?)
    • memory.search_semantic (params: tenantId, query, topK, filters)

Security: require Lga.Read scope; enforce tenant filter at the repository + vector layer.


Redaction & Compliance

  • Strip tokens/secrets/emails before embedding.
  • PR content: keep only bot-authored summaries + safe fields.
  • Respect repo/path allow-lists; drop artifacts outside allowed roots.
  • Retention: per-tenant TTL (e.g., 365 days) + hard delete endpoint for right-to-be-forgotten.

FluentMigrator Snippets (skeleton)

[Migration(2025092003, "Knowledge memory tables")]
public class M_20250920_03_Knowledge : Migration
{
    public override void Up()
    {
        // Lookups
        Create.Table("ArtifactType").InSchema("lga_lu")
          .WithColumn("Id").AsInt32().PrimaryKey().Identity()
          .WithColumn("Code").AsString(50).NotNullable().Unique()
          .WithColumn("Name").AsString(200).NotNullable();
        Insert.IntoTable("ArtifactType").InSchema("lga_lu").Row(new { Code="Readme", Name="README" });
        Insert.IntoTable("ArtifactType").InSchema("lga_lu").Row(new { Code="PipelineYaml", Name="Pipeline YAML" });
        Insert.IntoTable("ArtifactType").InSchema("lga_lu").Row(new { Code="Csproj", Name="CSPROJ" });
        Insert.IntoTable("ArtifactType").InSchema("lga_lu").Row(new { Code="PrComment", Name="PR Comment" });
        Insert.IntoTable("ArtifactType").InSchema("lga_lu").Row(new { Code="Blueprint", Name="Blueprint" });
        Insert.IntoTable("ArtifactType").InSchema("lga_lu").Row(new { Code="Options", Name="Options" });

        Create.Table("EmbeddingProvider").InSchema("lga_lu")
          .WithColumn("Id").AsInt32().PrimaryKey().Identity()
          .WithColumn("Code").AsString(50).NotNullable().Unique()
          .WithColumn("Name").AsString(200).NotNullable();
        Insert.IntoTable("EmbeddingProvider").InSchema("lga_lu").Row(new { Code="azure-aisearch", Name="Azure AI Search" });

        // KnowledgeRun
        Create.Table("KnowledgeRun").InSchema("lga")
          .WithColumn("Id").AsGuid().PrimaryKey()
          .WithColumn("LibraryGenerationId").AsGuid().NotNullable().ForeignKey("lga","LibraryGeneration","id")
          .WithColumn("CorrelationId").AsString(64).NotNullable().Unique()
          .WithColumn("TenantId").AsString(64).NotNullable()
          .WithColumn("BlueprintSha").AsString(64).NotNullable()
          .WithColumn("PatchSha").AsString(64).Nullable()
          .WithColumn("StatusId").AsInt32().NotNullable().ForeignKey("lga_lu","LibraryGenerationStatus","id")
          .WithColumn("PackageId").AsString(200).Nullable()
          .WithColumn("TemplateVersion").AsString(32).Nullable()
          .WithColumn("Repo").AsString(400).Nullable()
          .WithColumn("Branch").AsString(200).Nullable()
          .WithColumn("PrUrl").AsString(2000).Nullable()
          .WithColumn("CoveragePct").AsDecimal(5,2).Nullable()
          .WithColumn("StartedAtUtc").AsDateTimeOffset().NotNullable().WithDefault(SystemMethods.CurrentUTCDateTime)
          .WithColumn("FinishedAtUtc").AsDateTimeOffset().Nullable();

        // KnowledgeArtifact
        Create.Table("KnowledgeArtifact").InSchema("lga")
          .WithColumn("Id").AsInt64().PrimaryKey().Identity()
          .WithColumn("KnowledgeRunId").AsGuid().NotNullable().ForeignKey("lga","KnowledgeRun","Id")
          .WithColumn("ArtifactTypeId").AsInt32().NotNullable().ForeignKey("lga_lu","ArtifactType","Id")
          .WithColumn("Title").AsString(200).Nullable()
          .WithColumn("Uri").AsString(2000).Nullable()
          .WithColumn("ContentSha").AsString(64).NotNullable()
          .WithColumn("Excerpt").AsString(1000).Nullable()
          .WithColumn("CreatedAtUtc").AsDateTimeOffset().NotNullable().WithDefault(SystemMethods.CurrentUTCDateTime);

        // KnowledgeEmbedding (pointer to external index)
        Create.Table("KnowledgeEmbedding").InSchema("lga")
          .WithColumn("Id").AsGuid().PrimaryKey()
          .WithColumn("ArtifactId").AsInt64().NotNullable().ForeignKey("lga","KnowledgeArtifact","Id")
          .WithColumn("ProviderId").AsInt32().NotNullable().ForeignKey("lga_lu","EmbeddingProvider","Id")
          .WithColumn("Dim").AsInt32().NotNullable()
          .WithColumn("VectorRef").AsString(2000).NotNullable()
          .WithColumn("InsertedAtUtc").AsDateTimeOffset().NotNullable().WithDefault(SystemMethods.CurrentUTCDateTime);

        Create.Index("IX_KnowledgeRun_BlueprintPatch").OnTable("KnowledgeRun").InSchema("lga")
          .OnColumn("BlueprintSha").Ascending()
          .OnColumn("PatchSha").Ascending();
    }
    public override void Down()
    {
        Delete.Table("KnowledgeEmbedding").InSchema("lga");
        Delete.Table("KnowledgeArtifact").InSchema("lga");
        Delete.Table("KnowledgeRun").InSchema("lga");
        Delete.Table("EmbeddingProvider").InSchema("lga_lu");
        Delete.Table("ArtifactType").InSchema("lga_lu");
    }
}

Example Ingested Metadata (JSON)

{
  "correlation_id": "c0a8017e...",
  "tenant_id": "contoso",
  "package_id": "ConnectSoft.Extensions.Http.OAuth2",
  "template_version": "lib-tpl@v1.3.0",
  "blueprint_sha": "A1B2...",
  "patch_sha": "F00D...",
  "repo": "Platform/HttpExtensions",
  "branch": "feat/oauth2-handler",
  "pr_url": "https://dev.azure.com/...",
  "coverage_pct": 78.3,
  "artifacts": [
    { "type": "Readme", "uri": "mcp://fs/.../README.md", "content_sha": "..." },
    { "type": "PipelineYaml", "uri": "mcp://fs/.../azure-pipelines.yml", "content_sha": "..." }
  ]
}

Usage — Agent Queries

  • Exact: “Find run by blueprint SHA A1B2…” → SearchByBlueprintSha returns KnowledgeRunSummary (PR URL, coverage, status, timestamps).
  • Semantic: “Show examples of OAuth2 HttpClient with DI + options” → SemanticSearch returns artifact hits with links and similarity.

Acceptance

  • Exact search: querying by Blueprint SHA returns the correct run metadata (PR URL, coverage, status, UTC times).
  • Semantic search: returns relevant artifacts across tenants (scoped to caller’s tenant).
  • Ingestion: duplicate events are idempotently ignored (same ContentSha), embeddings created once per artifact.
  • Compliance: secrets/PII redacted; only allowed repos/paths indexed.
  • Observability: spans (knowledge.embed, knowledge.index) and metrics appear; indexing failures alert with context.

Extensibility & Variants

Focus: Design for forward-compatibility so LGA can expose new interfaces (REST/GraphQL/SignalR), talk to multiple Git providers (ADO today; GitHub/GitLab tomorrow), and eventually generate multi-language libraries (Java/Kotlin) — without breaking existing consumers or compromising Clean Architecture, DDD (anemic), and security-first constraints.


Principles

  • Seams first: Everything behind ports with additive, versioned DTOs.
  • Default-off: New faces/adapters ship disabled until toggled.
  • Policy-driven: Tenant- and environment-scoped feature flags; RBAC per interface.
  • Observability & security parity: New faces/adapters inherit tracing/redaction/RBAC.
  • Data types: All timestamps are DateTimeOffset (UTC); all strings nvarchar; enumerations through lookup tables.

Contract Surfaces (Hooks) — gRPC primary, optional REST/GraphQL/SignalR

| Interface | Status | Scope | Notes |
| --- | --- | --- | --- |
| gRPC | Primary | Read/Write | Contract-first C# DTOs (already defined). |
| REST | Optional | Read-only (default) | /api/v1/runs/{correlationId}, /api/v1/status (no mutations by default). |
| GraphQL | Optional | Read-only | Queries for runs/artifacts; no mutations by default. |
| SignalR | Optional | Push | Status stream per CorrelationId (bridge from Orleans stream). |

Toggle schema (appsettings):

{
  "Variants": {
    "Rest": { "Enabled": false, "Version": "v1" },
    "GraphQL": { "Enabled": false },
    "SignalR": { "Enabled": false }
  }
}

REST Controller sketch (read-only, RBAC = Lga.Read):

[Authorize(Policy = "Lga.Read")]
[ApiController, Route("api/v1/runs")]
public sealed class RunsController : ControllerBase
{
    [HttpGet("{correlationId}")]
    public async Task<ActionResult<RunDto>> GetByCorrelation(string correlationId, [FromServices] IRunQueries q)
        => (await q.GetAsync(correlationId)) is { } dto ? Ok(dto) : NotFound();
}

GraphQL (Hot Chocolate) sketch:

public sealed class Query
{
    [Authorize(Policy = "Lga.Read")]
    public Task<RunDto?> RunByCorrelation([Service] IRunQueries q, string correlationId) => q.GetAsync(correlationId);
}

SignalR Hub (status push):

[Authorize(Policy = "Lga.Read")]
public sealed class StatusHub : Hub
{
    public Task Subscribe(string correlationId) =>
        Groups.AddToGroupAsync(Context.ConnectionId, correlationId);
    // Orleans stream bridge pushes group messages: Clients.Group(correlationId).SendAsync("status", payload)
}

All endpoints emit OTel spans and carry CorrelationId/TenantId. CORS and rate limits are enforced per face.


Pluggable Git Providers

Unify Git workflows behind a single port:

public interface IGitProvider
{
    Task<BranchPushResult> PushAsync(GitPushRequest req, CancellationToken ct);
    Task<PullRequestResult> OpenPullRequestAsync(PrOpenRequest req, CancellationToken ct);
    Task EnsurePoliciesAsync(RepoPolicyRequest req, CancellationToken ct);
}

Adapters (first-class):

  • Azure DevOps (current): AdoGitProvider
  • GitHub: GitHubProvider (REST v3/GraphQL v4; checks API)
  • GitLab: GitLabProvider (v4 API; approvals & pipelines)

Capability matrix (excerpt):

| Capability | ADO | GitHub | GitLab |
| --- | --- | --- | --- |
| Branch push | ✅ | ✅ | ✅ |
| PR/MR open | ✅ | ✅ | ✅ |
| Required reviewers | ✅ (policies) | ✅ (CODEOWNERS + rules) | ✅ (approvals) |
| Status checks | ✅ (policy gates) | ✅ (checks API) | ✅ (statuses) |
| Comment dedupe keys | ✅ | ✅ | ✅ |

Provider selection (policy):

{ "Git": { "Provider": "Ado", "AllowedProviders": [ "Ado", "GitHub", "GitLab" ] } }

SourceControlProvider remains a lookup table; adapters map provider-specific concepts to our seam DTOs. Retries, idempotency keys, and redaction are consistent across adapters.


Multi-Language Library Generation (future-ready)

Abstract the template engine behind ITemplateRenderer:

public interface ITemplateRenderer
{
    string LanguageCode { get; } // "dotnet" | "java" | "kotlin"
    Task<RenderResult> RenderAsync(RenderRequest req, CancellationToken ct);
}

Renderers:

  • DotNetTemplateRenderer (current): dotnet new + symbol switches.
  • MavenArchetypeRenderer (Java): mvn archetype:generate with properties.
  • GradleInitRenderer (Kotlin/Java): gradle init with module layout.

Blueprint extension (control plane):

library:
  language: dotnet   # dotnet|java|kotlin
  packageId: ConnectSoft.Extensions.Http.OAuth2
  tfm: { multi: [ "net8.0", "net9.0" ] }   # for dotnet
java:
  groupId: com.connectsoft
  artifactId: http-oauth2
  javaVersion: "21"
kotlin:
  dsl: kts

Quality runners per language (strategy):

public interface IQualityRunner { Task<QualityResult> RunAsync(QualityRunRequest req, CancellationToken ct); }

public sealed class DotNetQualityRunner : IQualityRunner { /* build/test/coverlet */ }
public sealed class MavenQualityRunner : IQualityRunner { /* mvn verify + jacoco */ }
public sealed class GradleQualityRunner : IQualityRunner { /* gradle test + kover */ }

Toggle & guardrails:

{
  "Templates": {
    "DotNet": { "Enabled": true },
    "Java":   { "Enabled": false },
    "Kotlin": { "Enabled": false }
  }
}
  • Default policy rejects non-dotnet blueprints until enabled per-tenant.
  • Coverage floors and pipeline YAML templates are language-scoped.
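The default-off language policy above can be sketched as a small selection guard. This is an illustrative, language-neutral sketch in Python, not the agent's actual C# DI wiring; the renderer names come from the document, while the lookup-table shapes are assumptions:

```python
# Map blueprint language codes to the Templates config keys and renderer names
# described in the document; the dictionaries themselves are illustrative.
_CONFIG_KEY = {"dotnet": "DotNet", "java": "Java", "kotlin": "Kotlin"}
_RENDERER = {
    "dotnet": "DotNetTemplateRenderer",
    "java": "MavenArchetypeRenderer",
    "kotlin": "GradleInitRenderer",
}

def select_renderer(language: str, templates_cfg: dict) -> str:
    """Reject blueprints whose language is unknown or not enabled for the tenant."""
    key = _CONFIG_KEY.get(language)
    if key is None or not templates_cfg.get(key, {}).get("Enabled", False):
        raise ValueError(f"blueprint language '{language}' rejected: not enabled")
    return _RENDERER[language]

# With the default toggle set, only dotnet blueprints pass.
cfg = {"DotNet": {"Enabled": True}, "Java": {"Enabled": False}, "Kotlin": {"Enabled": False}}
assert select_renderer("dotnet", cfg) == "DotNetTemplateRenderer"
try:
    select_renderer("java", cfg)
    raise AssertionError("java should be rejected while disabled")
except ValueError:
    pass
```

Enabling Java for a tenant is then purely a config change; no renderer code paths change.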

Feature Flags & Versioning

Feature flags (compile-time & runtime):

public interface IFeatureFlags
{
    bool RestEnabled { get; }
    bool GraphQlEnabled { get; }
    bool SignalREnabled { get; }
    bool JavaEnabled { get; }
    bool KotlinEnabled { get; }
    string GitProvider { get; } // "Ado"|"GitHub"|"GitLab"
}

Resolve from configuration with tenant overrides (e.g., Tenants:contoso:Variants:Rest:Enabled=true).
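The tenant-override precedence (tenant-scoped key wins over the global key, falling back to a default) can be sketched as follows. This is an illustrative Python sketch of the lookup order only; the real service would use the .NET configuration system, and the `resolve_flag` helper is hypothetical:

```python
def resolve_flag(config: dict, tenant_id: str, path: str, default=False):
    """Look up Tenants:{tenant}:{path} first, then the global {path}, then default."""
    def walk(node, keys):
        # Descend through nested dicts; None signals "key absent".
        for key in keys:
            if not isinstance(node, dict) or key not in node:
                return None
            node = node[key]
        return node

    keys = path.split(":")
    tenant_value = walk(config, ["Tenants", tenant_id, *keys])
    if tenant_value is not None:  # an explicit tenant False still wins
        return tenant_value
    global_value = walk(config, keys)
    return default if global_value is None else global_value

config = {
    "Variants": {"Rest": {"Enabled": False}},
    "Tenants": {"contoso": {"Variants": {"Rest": {"Enabled": True}}}},
}
# contoso's override flips REST on; other tenants keep the global default.
assert resolve_flag(config, "contoso", "Variants:Rest:Enabled") is True
assert resolve_flag(config, "fabrikam", "Variants:Rest:Enabled") is False
```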

API versioning:

  • REST: /api/v1/*; add /api/v2/* when breaking DTOs are unavoidable.
  • GraphQL: schema version in description; additive changes only; deprecate via @deprecated with sunset date.
  • gRPC: additive fields (C# DTOs); never repurpose required fields.

Template versioning:

  • Blueprint pin: templateVersion: lib-tpl@1.4.0.
  • ADRs for breaking template changes; canary tenants via flags.

Observability & Security Parity

  • Spans: rest.request, graphql.query, signalr.bcast, git.provider.*, template.render(java|kotlin).
  • Metrics: per-face RPS/latency, per-provider success/error rates, per-language generation latency & coverage.
  • RBAC:
    • REST/GraphQL/SignalR endpoints require Lga.Read (and Lga.Run where write ends are later enabled).
    • Rate limits (per-tenant) and CORS (allow-lists) on REST/SignalR.
  • Redaction: same logging filters; no tokens in responses or PR text.

DI & Composition

services.AddScoped<IGitProvider>(sp =>
{
    var ff = sp.GetRequiredService<IFeatureFlags>();
    return ff.GitProvider switch
    {
        "GitHub" => sp.GetRequiredService<GitHubProvider>(),
        "GitLab" => sp.GetRequiredService<GitLabProvider>(),
        _ => sp.GetRequiredService<AdoGitProvider>()
    };
});

services.TryAddEnumerable(ServiceDescriptor.Transient<ITemplateRenderer, DotNetTemplateRenderer>());
if (flags.JavaEnabled)   services.TryAddEnumerable(ServiceDescriptor.Transient<ITemplateRenderer, MavenArchetypeRenderer>());
if (flags.KotlinEnabled) services.TryAddEnumerable(ServiceDescriptor.Transient<ITemplateRenderer, GradleInitRenderer>());

Unit tests: assert DI resolves exactly one active IGitProvider and N ITemplateRenderer based on flags; verify invalid combinations are rejected.


Documentation & ADRs

  • /docs/lga/extensibility.md — how to enable REST/GraphQL/SignalR, switch Git providers, and pilot Java/Kotlin.
  • ADRs
    • ADR-00X: “Adopt IGitProvider seam; default ADO.”
    • ADR-00Y: “Introduce REST/GraphQL read-only facades behind flags.”
    • ADR-00Z: “Multi-language templating behind ITemplateRenderer.”

Acceptance

  • Extensibility doc produced with steps, flags, and rollback guidance.
  • Toggles compile: build passes with each combination:
    • Rest/GraphQL/SignalR off/on (read-only),
    • GitProvider = Ado|GitHub|GitLab (adapters compile; smoke with fakes),
    • Templates = dotnet only; then dotnet+java/kotlin (fakes for quality runners).
  • Security parity verified: new faces enforce RBAC, rate limits, and redaction; no secrets in logs.
  • Observability parity: traces/metrics for each new face/provider appear in dashboards; correlation by CorrelationId works end-to-end.

Sandbox & Preview Environments

Focus: Enable trial runs without PRs by pushing to a sandbox repo/branch, executing preview build/test, and publishing results to Studio. Sandboxes are ephemeral, governed by TTL, with automatic cleanup jobs. All timestamps are UTC (datetimeoffset), strings are nvarchar, and enumerations use lookup tables.


Architecture

flowchart LR
  Caller[gRPC/MCP: preview] --> API
  API --> Saga
  Saga --> FS[MCP FS]
  Saga --> Tpl[Template Renderer]
  Saga --> Q[Quality Runner (preview)]
  Saga --> Git[IGitProvider (sandbox)]
  Git --> SandboxRepo[(Sandbox Repo)]
  SandboxRepo --> Pipe[Sandbox Pipeline]
  Pipe --> Artifacts[(Preview Artifacts)]
  Saga --> Pub[IPreviewPublisher -> Studio]
  Jobs[Hangfire: sandbox-gc] --> Git
  • No PR is opened. We push to a sandbox branch (isolated repo) → CI runs → Studio receives a summary (coverage, test counts, diff/patch hash, links).
  • TTL cleanup removes old sandbox branches & artifacts.

Modes & Contracts

  • Preview mode is explicit:
    • gRPC: StartGenerationPreview(correlationId, blueprintJson, options?)
    • MCP: library.preview tool
  • Idempotency keys: BlueprintSha + PatchSha + TenantId → same preview branch reused (unless forceNew flag).

Branch format (configurable): preview/{tenant}/{packageId}/{short-blueprint-sha}/{yyyyMMddHHmmssZ}

Commit message footer: [preview] corr:{CorrelationId} tpl:{TemplateVersion} sha:{PatchSha}


Configuration (appsettings)

{
  "Sandbox": {
    "Enabled": true,
    "Repo": "https://dev.azure.com/connectsoft/Sandbox/_git/LgaPreview",
    "BranchPrefix": "preview",
    "TtlDays": 7,
    "RunPipeline": true,
    "PipelineYamlPath": "azure-pipelines.preview.yml",
    "PublishToStudio": true,
    "CoverageFloorPct": 70,
    "Allowlist": {
      "Repos": [ "https://dev.azure.com/connectsoft/Sandbox/_git/*" ],
      "Paths": [ "/tmp/lga/**" ]
    }
  }
}
  • RBAC: only callers with Lga.Run + tenant access can trigger previews.
  • Governance: sandbox repo allow-list; no external repos.

Data Model Additions (DB)

Under schema lga (UTC + nvarchar):

SandboxPreview

| Column | Type | Notes |
| --- | --- | --- |
| Id (PK) | uniqueidentifier | |
| CorrelationId (AK) | nvarchar(64) | links to run |
| TenantId | nvarchar(64) | isolation |
| BlueprintSha | nvarchar(64) | |
| PatchSha | nvarchar(64) | |
| Repo | nvarchar(400) | sandbox repo |
| Branch | nvarchar(200) | preview branch |
| PipelineRunUrl | nvarchar(2000) | CI run |
| CoveragePct | decimal(5,2) | preview result |
| TestsTotal | int | |
| TestsFailed | int | |
| CreatedAtUtc | datetimeoffset(0) | |
| ExpiresAtUtc | datetimeoffset(0) | TTL |
| StatusId (FK) | int | lookup: PreviewStatus → Pending/Running/Succeeded/Failed/Expired |

Indexes:

  • UX_SandboxPreview_CorrelationId (unique)
  • IX_SandboxPreview_BlueprintPatch (BlueprintSha,PatchSha)
  • IX_SandboxPreview_ExpiresAtUtc (for GC)

Lookups: lga_lu.PreviewStatus seeded with codes above.


Flow (Preview)

  1. Start (Preview)

    • Validate blueprint, compute BlueprintSha/PatchSha.
    • Enforce allow-lists (repo/path/license).
    • Create SandboxPreview row (Status=Pending, ExpiresAtUtc = CreatedAtUtc + TtlDays).
  2. Workspace & Template

    • Prepare workspace under /tmp/lga/{corr}/.
    • Render library with requested switches (no version bumping, no publish).
  3. Quality (Preview)

    • Run preview quality via IQualityRunner (same tools, but with --preview profile).
    • Gate only by sandbox floor (CoverageFloorPct).
  4. Sandbox Push

    • Use IGitProvider (sandbox provider) to push to configured repo/branch.
    • No PR creation; commit contains preview footer.
  5. Pipeline (Optional)

    • If RunPipeline, trigger the pipeline; poll or consume the run webhook to collect:
      • coveragePct, testsTotal, testsFailed, artifact links.
    • Update SandboxPreview and publish a Studio card (see below).
  6. Publish to Studio

    • IPreviewPublisher posts a Preview card:
      • Blueprint/package info, coverage/tests, diff/patch SHA, links (branch, pipeline, artifacts), aging badge based on TTL.
    • Also upsert into Knowledge & Memory as ArtifactType=PreviewReport (optional).
  7. Done

    • Mark SandboxPreview.Status=Succeeded/Failed; leave branch until TTL.

TTL Cleanup (Jobs)

Job: sandbox-gc (UTC, daily)

  • Find ExpiresAtUtc < UtcNow with Status in (Succeeded, Failed, Expired).
  • Delete sandbox branch (via IGitProvider.DeleteBranchAsync), artifacts (if stored), mark Status=Expired.
  • Log audit entry + emit compliance.event (SANDBOX_CLEANUP).
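The GC selection criteria above (TTL elapsed, terminal status, batched) reduce to a small predicate. An illustrative Python sketch, with dicts standing in for SandboxPreview rows:

```python
from datetime import datetime, timedelta, timezone

# Statuses eligible for cleanup, per the job description; Running previews are skipped.
TERMINAL_STATUSES = {"Succeeded", "Failed", "Expired"}

def select_expired(previews: list[dict], now_utc: datetime, batch_size: int = 200) -> list[dict]:
    """Pick previews whose ExpiresAtUtc has passed and whose status is terminal."""
    due = [p for p in previews
           if p["ExpiresAtUtc"] < now_utc and p["Status"] in TERMINAL_STATUSES]
    return due[:batch_size]  # honor the Jobs:SandboxGc:BatchSize cap

now = datetime(2025, 9, 26, 2, 0, tzinfo=timezone.utc)
previews = [
    {"Branch": "preview/a", "ExpiresAtUtc": now - timedelta(days=1), "Status": "Succeeded"},
    {"Branch": "preview/b", "ExpiresAtUtc": now + timedelta(days=3), "Status": "Succeeded"},
    {"Branch": "preview/c", "ExpiresAtUtc": now - timedelta(days=2), "Status": "Running"},
]
# Only the expired, terminal preview is collected; the in-flight one is left alone.
assert [p["Branch"] for p in select_expired(previews, now)] == ["preview/a"]
```

The IX_SandboxPreview_ExpiresAtUtc index makes the first filter an index range scan rather than a table scan.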

Config (extends Jobs you already have):

{
  "Jobs": {
    "SandboxGc": { "Cron": "0 2 * * *", "BatchSize": 200 }
  }
}

Ports & Adapters

  • IPreviewPublisher → Studio (DevEx) API to render preview cards.
  • ISandboxPipeline → trigger and fetch results from CI (ADO).
  • IGitProvider (sandbox mode) → push & delete branches in sandbox repo.
  • IRunQueries → expose preview status for UI.

All ports follow redaction, retry, and UTC policies.


Studio Preview Card (payload)

{
  "type": "lga.preview",
  "correlation_id": "c9f2…",
  "tenant_id": "contoso",
  "package_id": "ConnectSoft.Extensions.Strings",
  "blueprint_sha": "A1B2…",
  "patch_sha": "F00D…",
  "repo": "Sandbox/LgaPreview",
  "branch": "preview/contoso/ConnectSoft.Extensions.Strings/A1B2/20250919T102233Z",
  "pipeline_url": "https://dev.azure.com/…",
  "coverage_pct": 78.3,
  "tests_total": 321,
  "tests_failed": 0,
  "expires_at_utc": "2025-09-26T10:22:33Z"
}

Security & Governance

  • RBAC: Lga.Run required to create previews; Lga.Read to view.
  • Isolation: per-tenant naming + branch prefixes; sandbox repo separate from production repos.
  • No secrets in artifacts or Studio payloads; redaction enforced.
  • Allow-lists: only the configured sandbox repo is permitted in preview mode.
  • Rate limits: per-tenant concurrency caps for preview starts.

Observability

  • Spans: preview.start, preview.template.generate, preview.quality.run, preview.git.push, preview.pipeline.trigger, preview.publish.
  • Metrics:
    • lga_preview_runs_total{status}
    • lga_preview_latency_seconds (Start→Publish)
    • lga_sandbox_gc_deleted_branches_total
  • Logs: structured; include correlation_id, tenant_id, expires_at_utc.

API Examples

  • gRPC: StartGenerationPreview → returns { run_id, branch, pipeline_url? }
  • MCP: library.preview → returns Studio card URL + branch string.
  • Queries: GetPreviewStatus(correlationId) → { status, coveragePct?, testsTotal?, expiresAtUtc }

Acceptance

  • Sandbox dry-run visible in Studio: triggering preview produces a Studio card with branch & pipeline links, coverage/tests, and TTL.
  • No PR opened, and production repos remain untouched.
  • TTL cleanup works: sandbox-gc deletes expired preview branches & marks records Expired; audit entries created.
  • Observability present: trace waterfall for preview flow; metrics populated; alerts fire on persistent failures.

Reuse & Factory-Wide Patterns (Extended)

Focus: Generalize LGA’s proven patterns into shared, reusable services and contracts that every generator agent (libraries, services, APIs, clients, infra modules) can adopt uniformly—reducing duplication, improving governance, and accelerating scale.


Objectives

  • One way to PR: central GitOps + PR Orchestration Service for branch/push/PR/labels/comments across providers.
  • One way to test: a Shared Quality Runner Contract with language-specific adapters (DotNet, Java, Kotlin, JS).
  • One way to diff: a Blueprint → Artifact Diff Engine that yields deterministic patch/fingerprint for idempotency, governance, and PR signal.

Global conventions: All timestamps are DateTimeOffset (UTC), strings are nvarchar with explicit lengths, and enumerations use lookup tables.


Factory Reuse Architecture

flowchart LR
  subgraph Generator Agents
    A1[Backend Library Generator]
    A2[API Service Generator]
    A3[SDK Client Generator]
  end

  A1 --> GO[GitOps + PR Orchestrator (Shared)]
  A2 --> GO
  A3 --> GO

  A1 --> QR[Quality Runner (Shared Contract)]
  A2 --> QR
  A3 --> QR

  A1 --> DE[Blueprint→Artifact Diff Engine (Shared)]
  A2 --> DE
  A3 --> DE

  GO <--> Providers[(ADO/GitHub/GitLab)]
  QR <--> Toolchains[(dotnet/maven/gradle/npm)]
  DE <--> FS[(MCP FS / Blob)]

1) GitOps + PR Orchestration Service (Shared)

A multi-tenant microservice offering a uniform API for Git workflow, abstracting ADO/GitHub/GitLab and enforcing policy-as-code.

Port (contract-first, C# DTOs)

public interface IGitOpsOrchestrator
{
    Task<BranchPushResult> PushAsync(BranchPushRequest req, CancellationToken ct);
    Task<PrOpenResult> OpenPullRequestAsync(PrOpenRequest req, CancellationToken ct);
    Task<CommentResult> UpsertCommentAsync(PrCommentRequest req, CancellationToken ct);
    Task LabelAsync(PrLabelRequest req, CancellationToken ct);
    Task EnsurePoliciesAsync(RepoPolicyRequest req, CancellationToken ct);
}
Core DTOs
public sealed record BranchPushRequest(
  string TenantId, string ProviderCode, string Repo, string Branch,
  string CommitMessage, IReadOnlyList<FileChange> Changes, string IdempotencyKey);

public sealed record PrOpenRequest(
  string TenantId, string ProviderCode, string Repo, string SourceBranch, string TargetBranch,
  string Title, string Description, IReadOnlyList<string> RequiredReviewers,
  string IdempotencyKey, IDictionary<string,string>? Metadata = null);
  • Idempotency: IdempotencyKey = sha256(repo|branch|patch) ensures safe retries.
  • Compliance footer automatically appended to PR descriptions (CorrelationId, BlueprintSha, PatchSha, TemplateVersion).
  • Comment dedupe: comments carry a hidden <!-- key:xyz --> marker to upsert instead of duplicate.
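The idempotency-key and comment-dedupe rules above are easy to make concrete. A minimal Python sketch, with a list of strings standing in for provider comments (the real orchestrator would call the provider API):

```python
import hashlib

def idempotency_key(repo: str, branch: str, patch_sha: str) -> str:
    """IdempotencyKey = sha256(repo|branch|patch): identical retries map to one PR."""
    return hashlib.sha256(f"{repo}|{branch}|{patch_sha}".encode("utf-8")).hexdigest()

def dedupe_marker(key: str) -> str:
    """Hidden HTML comment marker that tags a PR comment with its dedupe key."""
    return f"<!-- key:{key} -->"

def upsert_comment(comments: list[str], key: str, body: str) -> list[str]:
    """Replace the comment carrying this key if present; otherwise append."""
    marker = dedupe_marker(key)
    tagged = f"{marker}\n{body}"
    for i, existing in enumerate(comments):
        if marker in existing:
            comments[i] = tagged
            return comments
    comments.append(tagged)
    return comments

key = idempotency_key("Platform/HttpExtensions", "feat/oauth2-handler", "F00D")
assert key == idempotency_key("Platform/HttpExtensions", "feat/oauth2-handler", "F00D")

comments: list[str] = []
upsert_comment(comments, key, "coverage 78.3%")
upsert_comment(comments, key, "coverage 80.1%")  # a retry updates in place
assert len(comments) == 1 and "80.1%" in comments[0]
```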

Persistence (shared schema factory_gitops)

  • WorkLedger (operations, UTC), Provider lookup (Ado,GitHub,GitLab), PrMap (maps idempotency → PR URL).
  • Strings nvarchar, times datetimeoffset(0), statuses via lookup tables.

Policies

  • Repo allow-list & branch prefixes per tenant.
  • Required reviewers & status checks configured centrally.
  • RBAC via Azure AD: scopes GitOps.Execute, GitOps.Read.

2) Shared Quality Runner Contract

A single interface + result schema adopted by all generators; language/toolchain-specific adapters implement it.

Port

public interface IQualityRunner
{
    Task<QualityResult> RunAsync(QualityRunRequest request, CancellationToken ct);
}

public sealed record QualityRunRequest(
  string TenantId, string Language, string WorkspacePath, string? Profile, double CoverageFloorPct);

public sealed record QualityResult(
  bool Passed, double CoveragePct, int TestsTotal, int TestsFailed,
  string? ReportUri, IReadOnlyList<QualityIssue> Issues);

Adapters (first-class)

  • DotNetQualityRunner (build/test/coverlet, trx → summary)
  • MavenQualityRunner (mvn verify + JaCoCo)
  • GradleQualityRunner (Gradle + Kover/JaCoCo)
  • NodeQualityRunner (npm/yarn + jest/nyc)

Uniform output: Coverage/test counts normalized; threshold enforcement is performed outside adapters (or via a small helper) so policy stays central.
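Keeping threshold enforcement outside the adapters means one policy check works for every language. An illustrative Python sketch of that central helper, with a dict standing in for the normalized QualityResult:

```python
def enforce_quality_policy(result: dict, coverage_floor_pct: float) -> tuple[bool, list[str]]:
    """Central pass/fail decision over a normalized quality result.

    Adapters only report CoveragePct/TestsFailed; the floor lives in policy.
    """
    issues: list[str] = []
    if result["TestsFailed"] > 0:
        issues.append(f"{result['TestsFailed']} failing tests")
    if result["CoveragePct"] < coverage_floor_pct:
        issues.append(
            f"coverage {result['CoveragePct']}% below floor {coverage_floor_pct}%")
    return (len(issues) == 0, issues)

# A dotnet run and a maven run are judged by the same rule.
ok, _ = enforce_quality_policy({"CoveragePct": 78.3, "TestsTotal": 321, "TestsFailed": 0}, 70)
assert ok
ok, issues = enforce_quality_policy({"CoveragePct": 61.0, "TestsTotal": 10, "TestsFailed": 2}, 70)
assert not ok and len(issues) == 2
```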

Shared Storage (schema factory_quality)

  • QualityRun (per execution; UTC), QualityIssue (normalized fields), RunnerProvider lookup (dotnet,maven,gradle,node).
  • Optional artifact blobs (reports) stored via MCP FS/Blob; DB keeps URIs only.

3) Standard Blueprint → Artifact Diff Engine

A deterministic pipeline that takes (blueprint, template switches) → rendered artifact set, computes diff against a baseline, and emits a patch + fingerprint.

Port

public interface IDiffEngine
{
    Task<ArtifactDiffResult> DiffAsync(ArtifactDiffRequest request, CancellationToken ct);
}

public sealed record ArtifactDiffRequest(
  string TenantId, string TemplateVersion, string WorkspacePath, IReadOnlyList<string> IncludeGlobs);

public sealed record ArtifactDiffResult(
  string PatchSha, int FilesChanged, int Insertions, int Deletions, string UnifiedDiffPath);

Behavior

  • Normalization: newline, encoding, license headers, pinned package versions for stable diffs.
  • Ignore set: .git/**, bin/obj, lockfiles (configurable), secrets.
  • Fingerprint: PatchSha = sha256(sorted(filePath + sha256(content))).
  • Outputs: unified diff file (stored via MCP FS), summary stats, per-file hashes.
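The fingerprint rule above can be sketched directly. This illustrative Python version applies only newline normalization (the full normalization set also covers encodings, headers, and pinned versions) and shows why the result is order- and platform-insensitive:

```python
import hashlib

def _normalize(content: bytes) -> bytes:
    # Newline normalization keeps fingerprints stable across Windows/Linux checkouts.
    return content.replace(b"\r\n", b"\n")

def patch_sha(files: dict[str, bytes]) -> str:
    """PatchSha = sha256 over sorted(filePath + sha256(content))."""
    parts = []
    for path in sorted(files):  # sorting makes enumeration order irrelevant
        content_sha = hashlib.sha256(_normalize(files[path])).hexdigest()
        parts.append(path + content_sha)
    return hashlib.sha256("".join(parts).encode("utf-8")).hexdigest()

a = patch_sha({"src/A.cs": b"class A {}\n", "README.md": b"# lib\r\n"})
b = patch_sha({"README.md": b"# lib\n", "src/A.cs": b"class A {}\n"})
assert a == b  # same content: same fingerprint, regardless of order or newlines
assert a != patch_sha({"src/A.cs": b"class A2 {}\n", "README.md": b"# lib\n"})
```

This is the same PatchSha later reused as the GitOps IdempotencyKey, which is what makes "same change never creates multiple PRs" hold end to end.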

Persistence (schema factory_diff)

  • DiffLedger with PatchSha, TemplateVersion, UTC times, sizes/stats.
  • Lookup: TemplateFamily (library, service, client), Language (dotnet/java/kotlin/js).

Shared SDKs & Packages (for all agents)

  • ConnectSoft.Factory.GitOps (client + DTOs)
  • ConnectSoft.Factory.Quality (port + result types)
  • ConnectSoft.Factory.Diffing (port + helpers)
  • ConnectSoft.Factory.ControlPlane (blueprint/DSL models, validation)

All packages ship analyzers and contract tests so adopters stay compliant.


Cross-Cutting Concerns

  • Security:
    • AAD RBAC (Factory.GitOps.Execute/Read, Factory.Quality.Run/Read).
    • Managed Identity to providers; secrets in Key Vault only.
  • Observability:
    • Traces: gitops.push, gitops.pr.open, quality.run, diff.compute with correlation_id, tenant_id.
    • Metrics: success/error counters, latency histograms, queue depth.
  • Governance:
    • Central policy files (repos, branches, reviewers, coverage floors, ignore globs).
    • ADRs control breaking contract changes; semver for SDKs.

Integration Pattern (Agent Side)

// Diff
var diff = await _diffEngine.DiffAsync(new(...));

// Quality
var quality = await _qualityRunner.RunAsync(new(..., CoverageFloorPct: policy.CoverageFloor));

// GitOps
var push = await _gitOps.PushAsync(new(..., IdempotencyKey: diff.PatchSha));
var pr   = await _gitOps.OpenPullRequestAsync(new(..., IdempotencyKey: diff.PatchSha));
await _gitOps.UpsertCommentAsync(new(pr.PrNumber, Templates.PrSummary(diff, quality), DedupeKey: diff.PatchSha));
  • Idempotency key = PatchSha ensures the same change never creates multiple PRs.
  • Comment upsert avoids duplication on retries.
  • Coverage floor enforced uniformly by policy.

Cookbook (Cross-Agent)

  • “Create repo PR with shared services” — end-to-end example using all three shared services (fake adapters).
  • “Swap Git provider (ADO → GitHub)” — one-line config change + smoke.
  • “Run quality in Java agent” — same result schema, different adapter.
  • “Compute patch for governance” — diff-only validation with no push/PR.

Each recipe includes expected telemetry, policy snippets, and troubleshooting notes.


Acceptance

  • Cross-agent cookbook updated with working samples for Library, API, and Client generator agents.
  • QA agents confirm:
    • PRs created via GitOps Orchestrator across providers with policy applied.
    • Quality Runner yields normalized results for at least dotnet and one additional language (fake/real).
    • Diff Engine produces stable PatchSha and unified diff; idempotent PR flow verified.
  • Toggles compile: all shared packages referenced build clean; agents run with shared services enabled/disabled.
  • Observability parity: traces/metrics/logs present for shared paths and correlate by CorrelationId across agents.

Executive Summary

The Backend Library Generator Agent (LGA) is a production-grade microservice that converts a Backend Library Blueprint and DSLs into a complete, factory-compliant .NET library (code, tests, docs, CI) and opens a governed PR in Azure DevOps via MCP-assisted GitOps. It aligns to Clean Architecture with an anemic domain, NHibernate persistence, a MassTransit saga for orchestration, and gRPC/MCP entry points for cross-agent use.

What it does (Inputs → Outputs)

  • Inputs: a standardized Backend Library Blueprint describing reusable, DI-friendly libraries (plus options/DSL).
  • Outputs: a deterministic library repo including the .csproj, DI/Options scaffolding, MSTest project, README, and Azure Pipelines YAML ready for NuGet packaging and CI gates.

How it works (Architecture & Flow)

  • Host & Ports: gRPC API and MCP tools front a Saga that coordinates workspace prep, template rendering, quality checks, Git operations, and PR creation. Adapters include MCP FS/Git, ADO PR, and AI helpers.
  • Orchestration: the Saga persists state via NHibernate + Outbox, ensuring idempotent, traceable runs across every step.
  • Contracts & Guarantees: clean boundaries, correlation/trace IDs, deterministic templates, and CI gates are first-class.

Platform Alignment

  • Clean Architecture & DDD: blueprints map to layers and agent roles, enabling safe, partial, and traceable generation across domain/application/infrastructure/tests.
  • Solution Composition: projects are organized by responsibility (Domain, Persistence, Messaging/Flow, ServiceModel, Application/Infrastructure, Testing), with optional Actor/Scheduler/API variants.

Delivery & Operations

  • Pipelines: build-test-pack CI and optional CD are templated, test-aware, and agent-integrated; artifacts (coverage, SBOM, packages) are published and trace-linked.
  • Infrastructure: Pulumi/IaC patterns and DevOps integration are supported for secure, auditable delivery when hosting is needed.

Extensibility & Reuse

  • The structure retains seams for additional faces (REST/GraphQL/SignalR) and optional Actor/Scheduler modules, without breaking core contracts.
  • Factory-wide patterns (GitOps/PR orchestration, quality runner, diffing) are designed for reuse by other generator agents.

Why it matters (Outcomes)

LGA provides speed (scaffolding from blueprints), consistency (single control plane and template), and enterprise-fit (ADO pipelines, NuGet, governance). It becomes a repeatable lane on the factory line, producing high-quality internal libraries with full traceability from blueprint to PR.

Conclusion

We’ve specified a secure, observable, and deterministic generator microservice that turns intent (blueprints) into governed code and pipelines. Its clean seams, auditability, and CI quality gates let teams scale library production confidently while preserving platform standards and traceability. In short: LGA is a durable building block that accelerates the ConnectSoft AI Software Factory without sacrificing control.


📚 Appendix — Source References

Canonical docs that shaped the Backend Library Generator Agent (LGA). Paths reflect the repo’s /docs/ or /design/ area.

Core LGA Docs

  • /lga-plan.md — End-to-end plan & scope for LGA (cycles, outcomes, acceptance).
  • /overview.md — What LGA does, inputs/outputs, guarantees.
  • /runbook.md — Developer & ops flow; CI knobs; coverage gates.
  • /use-cases.md — Library types and adoption patterns across teams.
  • /features.md — Template switches, multi-TFM, CI behaviors.
  • /Solution Structure.md — Project map aligned to Clean Architecture.
  • /Solution Structure Graph.mmd — Mermaid graph of the solution topology.

Platform Foundations (principles & patterns)

  • /clean-architecture-and-ddd.md — Layering, anemic domain, boundary rules.
  • /modularization.md — Modules, seams, and scaling to thousands of components.
  • /event-driven-mindset.md — Outbox, inbox, retries, eventual consistency.
  • /cloud-native-mindset.md — Containerization, configuration, resiliency.
  • /observability-driven-design.md — Tracing, metrics, logs, dashboards.
  • /technology-stack.md — Chosen runtimes, infra, and tooling.
  • /strategic-goals.md — Factory north-star & capacity targets.

Agent System References

  • /agent-system-overview.md — Roles, collaboration, orchestration layers.
  • /agent-execution-flow.md — Standard lifecycle, resilience handrails.
  • /agent-collaboration-patterns.md — Cross-agent contracts and handoffs.
  • /architect-agents-overview.md — Responsibilities of architect-class agents.
  • /engineering-agents-overview.md — Build/test/release responsibilities.
  • /vision-and-planning-agents-overview.md — Strategy capture → delivery lines.
  • /qa-agents-overview.md — Quality gates and gatekeeper behaviors.

Blueprints, Templates & DSLs

  • /backend-library-blueprint.md — Blueprint schema & mapping to outputs.
  • /dsls.md — Control-plane DSLs (intent/structure/contracts/triggers).
  • /microservice-template.md — Base microservice template & conventions.
  • /templates.md — Template catalog & usage model.
  • /libraries.md — Library template specifics (flags, structure, examples).

Architecture (context & system)

  • /overall-architecture.md — C4 views; platform context for LGA.
  • /agentic-system-design.md — Agent mesh, reasoning surfaces, and MCP.

Knowledge & Memory

  • /knowledge-and-memory-system.md — Embeddings, vector index, and retrieval patterns for past runs.

References (quick index)

  • LGA Concept & Delivery: lga-plan.md, overview.md, runbook.md, use-cases.md, features.md
  • Structure & Architecture: Solution Structure.md, Solution Structure Graph.mmd, overall-architecture.md, agentic-system-design.md
  • Agent Operating Model: agent-system-overview.md, agent-execution-flow.md, agent-collaboration-patterns.md
  • Blueprint & Templates: backend-library-blueprint.md, dsls.md, microservice-template.md, templates.md, libraries.md
  • Pillars & Practices: clean-architecture-and-ddd.md, event-driven-mindset.md, cloud-native-mindset.md, observability-driven-design.md, technology-stack.md, strategic-goals.md
  • Quality & Governance: qa-agents-overview.md, knowledge-and-memory-system.md