Managed in AWS
AWS deployment is under active development and is offered as an early-access program. The architecture described here is the target design; contact us to discuss timelines and join the program.
Deploy DocAI Fabric entirely inside your AWS account: application, storage, and AI services. You keep full control of your data and infrastructure; we handle the application lifecycle (provisioning, deployments, and updates) through a deployment role you grant to our CI/CD pipeline.
Unlike our Azure offering, no second cloud is involved: documents are processed by Amazon Textract (OCR) and Anthropic Claude on Amazon Bedrock (classification and extraction), all billed to your AWS account.
How It Works
Your AWS Account
├── DocAI Fabric resources
│   ├── ECS Fargate service          ─┐
│   ├── ECR repository                │
│   ├── ElastiCache (Redis)           │
│   ├── Document storage (EFS/S3)     │  Provisioned and managed
│   ├── Application Load Balancer     ├─ by our CI/CD pipeline
│   ├── Secrets Manager               │
│   ├── CloudWatch Logs               │
│   └── VPC + IAM roles              ─┘
└── AWS AI services (serverless, pay-per-use)
    ├── Amazon Bedrock - Anthropic Claude models
    └── Amazon Textract - OCR
You own the infrastructure and data. We deploy and update the application.
Our pipeline authenticates with a short-lived OIDC token against an IAM role you create and control: no long-lived AWS credentials ever leave your account, and every action the pipeline performs is visible in your CloudTrail.
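As an illustration of that CloudTrail visibility, the following Python sketch filters CloudTrail records down to the calls made under the deployment role. It assumes records in the standard CloudTrail event shape (`userIdentity.sessionContext.sessionIssuer.arn` for assumed-role sessions). The function name, the example role ARN, and the two sample events are invented for illustration.

```python
# Example ARN only; substitute the deployment role ARN from your own account.
ROLE_ARN = "arn:aws:iam::123456789012:role/docaifabric-deploy"


def actions_by_role(records, role_arn):
    """Return (eventTime, eventSource, eventName) for calls made by sessions of role_arn."""
    hits = []
    for rec in records:
        identity = rec.get("userIdentity", {})
        issuer = identity.get("sessionContext", {}).get("sessionIssuer", {})
        # Calls made with assumed-role credentials carry the role ARN as session issuer.
        if identity.get("type") == "AssumedRole" and issuer.get("arn") == role_arn:
            hits.append((rec.get("eventTime"), rec.get("eventSource"), rec.get("eventName")))
    return hits


# Invented sample events in CloudTrail's record shape.
SAMPLE_RECORDS = [
    {
        "eventTime": "2025-01-15T10:02:11Z",
        "eventSource": "ecs.amazonaws.com",
        "eventName": "UpdateService",
        "userIdentity": {
            "type": "AssumedRole",
            "sessionContext": {"sessionIssuer": {"type": "Role", "arn": ROLE_ARN}},
        },
    },
    {
        "eventTime": "2025-01-15T10:05:40Z",
        "eventSource": "s3.amazonaws.com",
        "eventName": "ListBuckets",
        "userIdentity": {"type": "IAMUser", "arn": "arn:aws:iam::123456789012:user/alice"},
    },
]

for when, source, name in actions_by_role(SAMPLE_RECORDS, ROLE_ARN):
    print(when, source, name)
```

The same filter can be applied to the `Records` list of any CloudTrail log file delivered to your account.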
What Gets Deployed
| Resource | Purpose | Sizing |
|---|---|---|
| ECS Fargate service | Application hosting | 0.5 vCPU / 1 GB, autoscaling 1-10 tasks |
| ECR repository | Docker images | N/A |
| ElastiCache (Redis) | Job queue & caching | cache.t4g.small, TLS + AUTH, private subnet |
| Document storage | EFS file system (S3 support in development) | Encrypted at rest, grows with usage |
| Application Load Balancer + ACM | HTTPS ingress & TLS certificate | N/A |
| Secrets Manager | Application secrets | N/A |
| CloudWatch Logs | Centralized logging | 30-day retention |
| VPC | Network isolation | 2 availability zones |
| IAM roles | Task execution & task roles | Least-privilege, created by the pipeline |
AI services require no provisioning. Bedrock and Textract are serverless and billed per use:
| AI service | Purpose | Model |
|---|---|---|
| Amazon Bedrock | Classification & field extraction | Anthropic Claude Haiku 4.5 |
| Amazon Bedrock | AI Copilot reasoning | Anthropic Claude Sonnet 4.6 |
| Amazon Textract | OCR | DetectDocumentText |
Estimated cost: roughly $80-120/month for infrastructure, plus pay-per-use AI (Textract ~$1.50 per 1,000 pages; Bedrock billed per token). Everything appears on your single AWS bill.
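For a rough sense of how these figures combine, here is a small Python sketch of the arithmetic. The infrastructure range and the Textract rate are the estimates quoted above. The Bedrock figure depends on your token volume and is passed in as an input, and the `bedrock_usd=25.0` used in the example is an arbitrary placeholder, not a quote.

```python
def estimate_monthly_cost(pages_per_month, bedrock_usd=0.0,
                          infra_range=(80.0, 120.0), textract_per_1k=1.50):
    """Return a (low, high) monthly estimate in USD.

    infra_range and textract_per_1k default to the approximate figures above;
    bedrock_usd is whatever you expect to spend on per-token Bedrock usage.
    """
    textract = pages_per_month / 1000 * textract_per_1k
    low, high = infra_range
    return (round(low + textract + bedrock_usd, 2),
            round(high + textract + bedrock_usd, 2))


# Example: 20,000 pages per month and a placeholder Bedrock spend of $25.
low, high = estimate_monthly_cost(20_000, bedrock_usd=25.0)
print(f"${low:.2f} - ${high:.2f} per month")
```

At 20,000 pages, Textract contributes 20 × $1.50 = $30, so the range works out to $135-175 per month under these assumptions.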
Amazon Textract reads printed text in English, Spanish, French, German, Italian, and Portuguese (handwriting: English). If your documents are in other languages, talk to us: the application can alternatively connect to Azure Document Intelligence (150+ languages) while keeping everything else in AWS.
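A quick way to see whether your document mix falls inside that coverage is sketched below in Python. The language sets mirror the list above. Representing languages as ISO 639-1 codes, and the function names, are assumptions made for this example.

```python
# Languages from the note above, written as ISO 639-1 codes (an assumption of this sketch).
TEXTRACT_PRINTED = {"en", "es", "fr", "de", "it", "pt"}
TEXTRACT_HANDWRITING = {"en"}


def textract_covers(language, handwritten=False):
    """True if the note above lists this language for printed or handwritten text."""
    supported = TEXTRACT_HANDWRITING if handwritten else TEXTRACT_PRINTED
    return language.lower() in supported


def languages_to_raise(languages, handwritten=False):
    """Return the languages worth mentioning to us during onboarding."""
    return [lang for lang in languages if not textract_covers(lang, handwritten)]


print(languages_to_raise(["en", "de", "pl", "ja"]))
print(languages_to_raise(["en", "fr"], handwritten=True))
```

Anything the second function returns is a language to mention when you share your document languages with us.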
Prerequisites
Before starting, make sure you have:
- OIDC subject value from us (we will provide the exact value for Step 3, e.g., repo:docaifabric/docaifabric:environment:customer-<YOUR_ID>)
- AWS CLI installed (install guide) and authenticated (aws configure or SSO)
- Administrator access to the target AWS account (you need to create IAM identity providers and roles)
We strongly recommend a dedicated member account in your AWS Organization, created just for DocAI Fabric. A dedicated account contains nothing else, so granting our pipeline administrative access to it is low-risk by design, and it aligns with the AWS best practice of one account per workload. Create one via AWS Organizations → Add an AWS account.
If you must deploy into a shared account, tell us: we will provide a scoped IAM policy and permissions boundary to use in Step 3 instead of AdministratorAccess.
Setup Guide
Step 1: Choose the Account and Region
Create (or pick) the AWS account and choose a region where both Amazon Bedrock (Anthropic Claude) and Amazon Textract are available, for example:
| Region | Location |
|---|---|
| us-east-1 | N. Virginia |
| us-west-2 | Oregon |
| eu-central-1 | Frankfurt |
| eu-west-1 | Ireland |
| ap-southeast-2 | Sydney |
Other regions may work via Bedrock cross-region inference: tell us your preferred region and we will confirm availability during onboarding.
Note your AWS Account ID:
aws sts get-caller-identity --query Account --output tsv
Step 2: Create the GitHub OIDC Identity Provider
Register GitHub Actions as an OIDC identity provider in your account (once per account; you may already have it if you use GitHub Actions yourself):
aws iam create-open-id-connect-provider \
  --url https://token.actions.githubusercontent.com \
  --client-id-list sts.amazonaws.com \
  --thumbprint-list 6938fd4d98bab03faadb97b34396831e3780aea1
If the command reports that the provider already exists, that's fine: continue to Step 3. (For identity providers with publicly trusted certificates, such as GitHub, AWS validates against its own trusted root CAs rather than the thumbprint; the value is included here because older AWS CLI versions require the flag.)
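If you would rather check up front whether the provider is already registered, the JSON output of aws iam list-open-id-connect-providers can be inspected. The Python sketch below does that for a captured output string. The helper name and the sample output are made up for the example, while the `OpenIDConnectProviderList` / `Arn` keys follow the CLI's output format.

```python
import json

GITHUB_OIDC_HOST = "token.actions.githubusercontent.com"


def has_github_oidc_provider(cli_output):
    """Return True if the list-open-id-connect-providers output includes GitHub's provider."""
    providers = json.loads(cli_output).get("OpenIDConnectProviderList", [])
    suffix = f":oidc-provider/{GITHUB_OIDC_HOST}"
    return any(p.get("Arn", "").endswith(suffix) for p in providers)


# Invented sample of what the CLI prints with --output json.
SAMPLE_OUTPUT = """
{
  "OpenIDConnectProviderList": [
    {"Arn": "arn:aws:iam::123456789012:oidc-provider/token.actions.githubusercontent.com"}
  ]
}
"""

print(has_github_oidc_provider(SAMPLE_OUTPUT))
```

If the check returns True, skip the create command and continue with Step 3.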
Step 3: Create the Deployment Role
Create an IAM role our pipeline assumes. The trust policy restricts it to our repository and your specific environment, so no other GitHub workflow can use it.
Save this as trust-policy.json, replacing <ACCOUNT_ID> and the subject value we provide:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Federated": "arn:aws:iam::<ACCOUNT_ID>:oidc-provider/token.actions.githubusercontent.com"
},
"Action": "sts:AssumeRoleWithWebIdentity",
"Condition": {
"StringEquals": {
"token.actions.githubusercontent.com:aud": "sts.amazonaws.com",
"token.actions.githubusercontent.com:sub": "<SUBJECT_VALUE_WE_PROVIDE>"
}
}
}
]
}
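To avoid hand-editing mistakes, the same document can be generated. The Python sketch below builds the trust policy from an account ID and subject value and rejects leftover placeholders. The function name is illustrative, and the subject used in the example (`customer-example`) is a stand-in for the value we provide.

```python
import json
import re

OIDC_HOST = "token.actions.githubusercontent.com"


def build_trust_policy(account_id, subject):
    """Build the trust policy shown above for a given account ID and OIDC subject."""
    if not re.fullmatch(r"[0-9]{12}", account_id):
        raise ValueError("AWS account IDs are exactly 12 digits")
    if not subject or "<" in subject or ">" in subject:
        raise ValueError("replace the subject placeholder with the value we provide")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{OIDC_HOST}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        f"{OIDC_HOST}:aud": "sts.amazonaws.com",
                        f"{OIDC_HOST}:sub": subject,
                    }
                },
            }
        ],
    }


# Example values only: use your own account ID and the subject we send you.
policy = build_trust_policy(
    "123456789012",
    "repo:docaifabric/docaifabric:environment:customer-example",
)
print(json.dumps(policy, indent=2))
```

Redirecting the printed output to trust-policy.json gives the file used by the create-role command.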
Then create the role:
aws iam create-role \
  --role-name docaifabric-deploy \
  --assume-role-policy-document file://trust-policy.json

aws iam attach-role-policy \
  --role-name docaifabric-deploy \
  --policy-arn arn:aws:iam::aws:policy/AdministratorAccess
Note the role ARN (arn:aws:iam::<ACCOUNT_ID>:role/docaifabric-deploy); you'll share it with us.
In a dedicated account, AdministratorAccess on this role is the simple and safe default: the only resources that will ever exist there are the ones the pipeline creates. In a shared account, ask us for the scoped policy document (ECS, ECR, ElastiCache, EFS/S3, EC2/VPC, ELB, ACM, Secrets Manager, CloudWatch, Bedrock, Textract, and bounded IAM) and a permissions boundary instead.
Step 4: Enable Bedrock Model Access
Anthropic Claude models on Amazon Bedrock require a one-time access activation per account:
- Open the Amazon Bedrock console in your chosen region
- Go to Model access and request access for Anthropic Claude models (a short use-case form may be required; approval is typically immediate)
Amazon Textract requires no activation. The application itself authenticates to both services through its IAM task role, so there are no AI API keys to create or share.
Step 5: Decide on Networking
By default we create a new VPC (two availability zones, public subnets for the load balancer, private subnets for the application, Redis, and storage). This is the recommended path and requires no action from you.
If you need the application inside an existing VPC (e.g., to reach internal systems or comply with network policy), share instead:
- VPC ID
- Two or more private subnet IDs (application, Redis, storage) with outbound access to AWS service endpoints (NAT or VPC endpoints for Bedrock, Textract, and S3)
- Two or more public subnet IDs (load balancer), or tell us if the application should be internal-only
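Before sending these values, a quick sanity check can catch copy-paste errors. The Python sketch below checks the ID formats and the "two or more" counts from the list above. The function and parameter names are illustrative, and it cannot verify outbound access or availability-zone placement, which still need to be confirmed in your account.

```python
import re

# VPC and subnet IDs are a prefix plus 8 (older) or 17 hex characters.
VPC_ID = re.compile(r"vpc-(?:[0-9a-f]{8}|[0-9a-f]{17})")
SUBNET_ID = re.compile(r"subnet-(?:[0-9a-f]{8}|[0-9a-f]{17})")


def check_network_inputs(vpc_id, private_subnets, public_subnets=(), internal_only=False):
    """Return a list of problems with the networking details; empty means they look plausible."""
    problems = []
    if not VPC_ID.fullmatch(vpc_id):
        problems.append(f"malformed VPC ID: {vpc_id}")
    for subnet in [*private_subnets, *public_subnets]:
        if not SUBNET_ID.fullmatch(subnet):
            problems.append(f"malformed subnet ID: {subnet}")
    if len(set(private_subnets)) < 2:
        problems.append("need at least two distinct private subnets")
    if not internal_only and len(set(public_subnets)) < 2:
        problems.append("need at least two distinct public subnets, or mark the app internal-only")
    return problems


# Made-up IDs for illustration.
print(check_network_inputs(
    "vpc-0a1b2c3d4e5f67890",
    ["subnet-0123456789abcdef0", "subnet-0fedcba9876543210"],
    internal_only=True,
))
```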
Step 6: Choose Your Application Hostname
Decide the hostname users will open, e.g. docai.yourcompany.com. During deployment we'll send you two DNS records to create at your DNS provider:
- A validation record for the TLS certificate (AWS Certificate Manager)
- A CNAME from your hostname to the load balancer
Alternatively, if you delegate a subdomain to a Route 53 hosted zone in the account, we manage both records automatically.
Information to Share With Us
| Item | Example |
|---|---|
| AWS Account ID | 123456789012 |
| AWS Region | eu-central-1 |
| Deployment role ARN | arn:aws:iam::123456789012:role/docaifabric-deploy |
| Bedrock model access | Confirmation that Anthropic Claude access is enabled (Step 4) |
| Networking | "create a new VPC" (default), or VPC + subnet IDs from Step 5 |
| Application hostname | docai.yourcompany.com |
| DNS preference | You create the two records we send, or we manage them in a Route 53 hosted zone in the account |
| Document languages | So we can confirm Textract coverage (see the OCR language note above) |
No API keys or secrets need to be shared: the application reaches Bedrock and Textract through its IAM task role, and all application secrets live in Secrets Manager in your AWS account.
Audit and Observability
Everything runs and logs inside your account, so you control access and retention.
| What you get | Where it lives | What it captures |
|---|---|---|
| CloudWatch Logs | Your account | Application container logs, queryable with Logs Insights |
| CloudWatch metrics & alarms | Your account | CPU/memory, request counts, queue health |
| CloudTrail | Account-level (built in) | Every API action, including each deployment performed by our pipeline, visible to you in real time |
Default log retention is 30 days. If you have compliance requirements for longer retention, let us know: we can configure extended retention or export to an archive bucket when we deploy.
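CloudWatch Logs accepts only specific retention periods, so a compliance requirement usually gets rounded up to the next allowed value. The Python sketch below shows that rounding. The list reflects the values documented for the PutRetentionPolicy API at the time of writing and should be verified against the current AWS reference; the function name is illustrative.

```python
import bisect

# Retention periods (days) accepted by CloudWatch Logs; verify against the AWS docs.
ALLOWED_RETENTION_DAYS = [
    1, 3, 5, 7, 14, 30, 60, 90, 120, 150, 180, 365, 400, 545,
    731, 1096, 1827, 2192, 2557, 2922, 3288, 3653,
]


def retention_for(required_days):
    """Smallest allowed retention covering required_days, or None if none is long enough."""
    if required_days < 1:
        raise ValueError("required_days must be at least 1")
    i = bisect.bisect_left(ALLOWED_RETENTION_DAYS, required_days)
    if i == len(ALLOWED_RETENTION_DAYS):
        return None  # beyond the longest period: consider an archive export instead
    return ALLOWED_RETENTION_DAYS[i]


print(retention_for(366))       # a little over one year
print(retention_for(7 * 365))   # seven years
```

A result of None means the requirement exceeds the longest fixed period, in which case the archive-bucket export mentioned above is the likelier fit.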
We recommend granting our team read-only log access via a cross-account IAM role scoped to CloudWatch Logs (the AWS analogue of Azure Lighthouse), so we can diagnose issues quickly without holding credentials or asking you to forward log excerpts. The role is limited to logs and metrics (no access to your documents, secrets, or other resources) and is revocable at any time. If your policy forbids cross-account access, we fall back to log excerpts you share manually, with slower support turnaround.
Next Steps
- After Deployment: application URL, custom domain, email invitations, updates
- Application Logs: monitoring and log access