TenantZero AI Architecture Spec

This file captures the implementation target used in this repository.

Goal

Deploy a private AI foundation inside a customer Azure tenant using Terraform and Azure-native services only.

Solution Scope

  • Infrastructure Blueprint — Reusable Terraform modules for secure Azure AI primitives
  • AI Landing Zone Accelerator — Private networking, identity, observability, and policy baseline
  • AI Governance Platform — Policy enforcement, RBAC, diagnostics, cost controls, and governed extension points

Non-goals

  • No SaaS control plane
  • No UI or app code
  • No workflow/agent runtime
  • No vendor-hosted shared tenant model

Core Principles

  • Private endpoints for platform services
  • Managed identity + Entra ID authentication (no shared keys)
  • Customer-owned infra, state, keys, and data
  • Environment isolation across dev/staging/prod with separate state
  • Contract-driven modules with explicit input/output contracts
  • Composition over conditionals — root stacks compose, modules don't assume context
  • Dependency inversion — modules never reference siblings

Maturity Tiers

Foundation Tier

  • VNet, private endpoints
  • Azure OpenAI, AI Search, data profile (Cosmos or PostgreSQL)
  • Key Vault with RBAC and purge protection
  • Log Analytics and diagnostics

Enterprise Tier (adds)

  • Managed identity with role assignments
  • Diagnostics fan-out to Log Analytics
  • Azure Monitor alerts
  • Azure Policy baseline
  • Defender for Cloud

Regulated Tier (adds)

  • Customer-managed keys everywhere
  • Private AKS
  • APIM
  • Firewall routing and egress control
  • Region restriction
  • Role segmentation

Deployed Building Blocks

  • Resource groups: core, net, sec, obs
  • Networking: VNet, subnets, NSGs, private DNS zones/links
  • Identity/secrets: Key Vault private mode, RBAC, purge protection
  • Model access: Azure OpenAI private endpoint + model deployment list
  • Retrieval/data: AI Search + Cosmos or PostgreSQL profile
  • Observability: Log Analytics + diagnostics fan-out
  • Policy baseline: allowed locations, private posture controls
  • Optional compute: private AKS
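The private-posture building blocks above (Key Vault in private mode with RBAC and purge protection, Azure OpenAI behind a private endpoint, managed identity instead of shared keys, explicit outputs) come together in the following Terraform fragment. It is an added illustration, not a module from this repository; the variable names `name_prefix`, `resource_group_name`, `private_endpoint_subnet_id` and `private_dns_zone_ids` are assumptions about what a root stack would pass in, and the attribute names follow the azurerm 3.x provider schema (4.x renames `enable_rbac_authorization` to `rbac_authorization_enabled`).

```hcl
# Assumed: the root stack configures provider "azurerm" (3.x attribute names).
data "azurerm_client_config" "current" {}

# Assumed inputs, passed in by the root stack; the module never looks up siblings.
variable "name_prefix" { type = string }          # short, lowercase, e.g. client + env
variable "location" { type = string }
variable "resource_group_name" { type = string }  # the "sec" resource group
variable "private_endpoint_subnet_id" { type = string }
variable "private_dns_zone_ids" {                 # privatelink.vaultcore.azure.net / privatelink.openai.azure.com
  type = object({ vault = string, openai = string })
}

# Key Vault in private mode: Entra ID RBAC (no access policies), purge
# protection on, public access off, firewall default-deny.
resource "azurerm_key_vault" "this" {
  name                          = "${var.name_prefix}-kv" # keep within 3-24 chars, globally unique
  location                      = var.location
  resource_group_name           = var.resource_group_name
  tenant_id                     = data.azurerm_client_config.current.tenant_id
  sku_name                      = "standard"
  enable_rbac_authorization     = true
  purge_protection_enabled      = true
  soft_delete_retention_days    = 90
  public_network_access_enabled = false

  network_acls {
    default_action = "Deny"
    bypass         = "AzureServices"
  }
}

# Azure OpenAI with API keys disabled: callers must present an Entra ID token.
resource "azurerm_cognitive_account" "openai" {
  name                          = "${var.name_prefix}-aoai"
  location                      = var.location
  resource_group_name           = var.resource_group_name
  kind                          = "OpenAI"
  sku_name                      = "S0"
  custom_subdomain_name         = "${var.name_prefix}-aoai" # required for private endpoints and token auth
  public_network_access_enabled = false
  local_auth_enabled            = false
}

# One private endpoint per service, each linked to its privatelink DNS zone so
# the public hostname resolves to the private IP from inside the VNet.
locals {
  private_endpoints = {
    vault  = { id = azurerm_key_vault.this.id, sub = "vault", zone = var.private_dns_zone_ids.vault }
    openai = { id = azurerm_cognitive_account.openai.id, sub = "account", zone = var.private_dns_zone_ids.openai }
  }
}

resource "azurerm_private_endpoint" "this" {
  for_each            = local.private_endpoints
  name                = "${var.name_prefix}-${each.key}-pe"
  location            = var.location
  resource_group_name = var.resource_group_name
  subnet_id           = var.private_endpoint_subnet_id

  private_service_connection {
    name                           = "${var.name_prefix}-${each.key}-psc"
    private_connection_resource_id = each.value.id
    subresource_names              = [each.value.sub]
    is_manual_connection           = false
  }

  private_dns_zone_group {
    name                 = "default"
    private_dns_zone_ids = [each.value.zone]
  }
}

# Workload identity: a user-assigned managed identity granted least-privilege
# data-plane roles instead of being handed a vault key or an OpenAI API key.
resource "azurerm_user_assigned_identity" "workload" {
  name                = "${var.name_prefix}-workload-id"
  location            = var.location
  resource_group_name = var.resource_group_name
}

resource "azurerm_role_assignment" "workload_roles" {
  for_each = {
    kv     = { scope = azurerm_key_vault.this.id, role = "Key Vault Secrets User" }
    openai = { scope = azurerm_cognitive_account.openai.id, role = "Cognitive Services OpenAI User" }
  }
  scope                = each.value.scope
  role_definition_name = each.value.role
  principal_id         = azurerm_user_assigned_identity.workload.principal_id
}

# Explicit output contract: the root stack wires these into sibling modules.
output "key_vault_id" { value = azurerm_key_vault.this.id }
output "openai_endpoint" { value = azurerm_cognitive_account.openai.endpoint }
output "workload_identity_principal_id" { value = azurerm_user_assigned_identity.workload.principal_id }
```

Two operational consequences are worth checking before the first apply: with `public_network_access_enabled = false`, the Terraform runner and anyone seeding secrets must reach the vault through the private endpoint (a self-hosted agent in the VNet, not a hosted runner on the public internet), and role assignments can take a few minutes to propagate, so a workload that starts immediately after apply may see transient 403s. The `Cognitive Services OpenAI User` role grants inference only; deployment management stays with the platform team's role segmentation.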

Input Surface

  • client_name
  • env
  • location
  • vnet_cidr
  • subnets
  • data_profile
  • models
  • log_retention_days
  • enable_cmk
  • enable_aks
  • enable_policy_baseline
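The input surface is the root stack's contract, so it is the natural place to reject bad values before anything is planned. The `variables.tf` below is an added example rather than this repository's own file: the variable names come from the list above, while the object shapes for `subnets` and `models`, the defaults, and the accepted value sets (`dev`/`staging`/`prod`, `cosmos`/`postgresql`) are assumptions chosen to match the spec.

```hcl
variable "client_name" {
  description = "Short customer identifier embedded in resource names."
  type        = string
  validation {
    condition     = can(regex("^[a-z0-9]{3,12}$", var.client_name))
    error_message = "Expected 3-12 lowercase alphanumeric characters; the value ends up in globally unique names."
  }
}

variable "env" {
  description = "Environment; each value is deployed from its own root stack with separate state."
  type        = string
  validation {
    condition     = contains(["dev", "staging", "prod"], var.env)
    error_message = "Expected one of: dev, staging, prod."
  }
}

variable "location" {
  description = "Azure region for all resources; the policy baseline denies other locations."
  type        = string
  validation {
    condition     = can(regex("^[a-z0-9]+$", var.location))
    error_message = "Expected a region short name such as westeurope (no spaces or capitals)."
  }
}

variable "vnet_cidr" {
  type = string
  validation {
    condition     = can(cidrnetmask(var.vnet_cidr))
    error_message = "Expected a valid IPv4 CIDR block, e.g. 10.40.0.0/16."
  }
}

variable "subnets" {
  description = "Subnet name => address prefix (assumed shape)."
  type        = map(string)
  validation {
    condition     = alltrue([for cidr in values(var.subnets) : can(cidrnetmask(cidr))])
    error_message = "Every subnet value must be a valid IPv4 CIDR block."
  }
}

variable "data_profile" {
  type = string
  validation {
    condition     = contains(["cosmos", "postgresql"], var.data_profile)
    error_message = "Expected cosmos or postgresql."
  }
}

variable "models" {
  description = "Azure OpenAI model deployments to create (assumed shape)."
  type = list(object({
    name     = string
    version  = string
    capacity = number
  }))
  validation {
    condition     = length(var.models) == length(distinct([for m in var.models : m.name]))
    error_message = "Model deployment names must be unique."
  }
}

variable "log_retention_days" {
  type    = number
  default = 90
  validation {
    condition     = var.log_retention_days >= 30 && var.log_retention_days <= 730
    error_message = "Expected a value between 30 and 730, the Log Analytics workspace retention range."
  }
}

# Tier toggles; secure-by-default where the spec makes the control a baseline.
variable "enable_cmk" {
  type    = bool
  default = false
}

variable "enable_aks" {
  type    = bool
  default = false
}

variable "enable_policy_baseline" {
  type    = bool
  default = true
}
```

Validation blocks run at `terraform plan`, so a wrong `env` or an overlapping CIDR typo fails before any state is touched; keeping one `.tfvars` file per environment (dev, staging, prod) alongside a separate backend for each keeps the environment-isolation principle visible in the inputs as well as in the state.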

See docs/architecture.md for module map and data flow details.