Model portability: swapping Bedrock for the Mistral API
Prerequisites + catch-up download
Tooling and AWS access common to every post in this series.
Tooling
- Terraform 1.x (install). Every post provisions infrastructure with Terraform.
- uv for Python project management (install). Each post ships a runnable script you can invoke with uv run.
- direnv (install) so terraform, uv run, and aws pick up AWS credentials automatically on cd. The project scaffold ships an .envrc that sources a gitignored .envrc.local.
- (Optional) A coding agent such as Claude Code, Cursor, Codex, or Gemini CLI to consume the AgentPrompt blocks throughout the series. Not required (each prompt has a manual equivalent shown alongside it), but it skips the boilerplate.
Agent prompt: Check and install missing tooling
You are helping set up tooling for a tutorial project.
For each of `terraform`, `uv`, and `direnv`, run `command -v` to
check whether it is installed. If present, print the version and
continue.
For missing tools, detect the system package manager in this order:
`command -v brew`, `command -v dnf`, `command -v apt-get`. Use the
first one available:
- Terraform: `brew tap hashicorp/tap && brew install hashicorp/tap/terraform`,
dnf via the HashiCorp RPM repo, or apt via the HashiCorp deb repo.
- uv: `brew install uv`, or the official installer
`curl -LsSf https://astral.sh/uv/install.sh | sh`.
- direnv: `brew install direnv`, `dnf install direnv`, or
`apt-get install direnv`.
If no package manager is available or the install fails, stop and
link the manual install page so the developer can finish by hand:
- Terraform: https://developer.hashicorp.com/terraform/install
- uv: https://docs.astral.sh/uv/getting-started/installation/
- direnv: https://direnv.net/docs/installation.html
After installing direnv, do not modify any shell rc files. Print the
hook line for the developer's shell (bash, zsh, or fish) and the path
to the relevant rc file, then wait for them to apply it themselves.
Report which tools were already present, which you installed, and
which need manual follow-up.
AWS access
- A sandbox, test, or personal AWS account with permission to create, modify, and delete the resources discussed in each post. If you don’t have one, follow the official Create Your AWS Account walkthrough (about ten minutes; requires a credit card and a phone number for verification). Treat it as disposable - you can close it from the billing console after the series.
- AWS credentials available locally via aws configure sso, aws configure, or whichever method matches your setup. You wire them into the project through .envrc.local in the next section, not your shell rc.
Anthropic First Time Use
Bedrock requires a one-time use-case form per account (or per AWS Organization management account) before Anthropic models can be invoked. Easiest path: open any Claude model in the Bedrock console playground and submit the form. Auto-subscription on first invoke can take up to 15 minutes to settle, so it is worth clearing this before post 1.
CLI alternative and verification
Programmatic equivalent (requires AWS CLI 2.27.42 or later):
    aws bedrock put-use-case-for-model-access \
      --form-data "$(printf '{"companyName":"...","companyWebsite":"...","intendedUsers":"1","industryOption":"...","otherIndustryOption":"","useCases":"..."}' | base64)"

Verify:

    aws bedrock get-foundation-model-availability \
      --model-id anthropic.claude-haiku-4-5-20251001-v1:0 \
      --region eu-west-1

Look for agreementAvailability.status: AVAILABLE. Expected output:

    {
      "modelId": "anthropic.claude-haiku-4-5-20251001-v1",
      "agreementAvailability": {
        "status": "AVAILABLE"
      },
      "authorizationStatus": "AUTHORIZED",
      "entitlementAvailability": "AVAILABLE",
      "regionAvailability": "AVAILABLE"
    }

If the form has not been submitted, only agreementAvailability.status flips to NOT_AVAILABLE. The other three fields stay green even when invocation would fail, so do not rely on them.
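If you would rather script that check than eyeball the JSON, here is a minimal sketch. The helper name use_case_form_accepted is my own invention, and the sample payload is simply the expected output from above; in practice you would feed it whatever the CLI printed.

```python
import json


def use_case_form_accepted(availability: dict) -> bool:
    """Return True only when agreementAvailability.status is AVAILABLE.

    The other fields are deliberately ignored: they can look healthy even
    when the use-case form was never submitted.
    """
    agreement = availability.get("agreementAvailability") or {}
    return agreement.get("status") == "AVAILABLE"


# Sample payload mirroring the expected CLI output.
SAMPLE = json.loads(
    """
    {
      "modelId": "anthropic.claude-haiku-4-5-20251001-v1",
      "agreementAvailability": {"status": "AVAILABLE"},
      "authorizationStatus": "AUTHORIZED",
      "entitlementAvailability": "AVAILABLE",
      "regionAvailability": "AVAILABLE"
    }
    """
)

if __name__ == "__main__":
    print(use_case_form_accepted(SAMPLE))
```

The function looks at exactly one field on purpose, since that is the only one that changes when the form is missing.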
Project scaffold
Download the cumulative checkpoint that matches the state at the start of this post:
    mkdir -p ~/projects
    cd ~/projects
    curl -fsSL https://andreaslang.dev/terraform-pr-agent/terraform-pr-agent-02.tar.gz | tar xz

This contains everything through post 2. If you followed the previous post, your tree should already match; the curl above is for joining mid-series or recovering from drift. Tooling and AWS access from the sections above still apply.
The posts build on each other, so you may need artifacts created by previous posts to be able to run the examples.
What this post covers
Recently the US government decided to put export controls in place for Anthropic's Mythos and Fable models. See here for details. While this only affects the recently released Fable and Mythos models, it got me thinking about the growing risk of relying on US-only foundation models. I am obviously aware that this series still runs on AWS, but I wanted to at least move to a European foundation model.
Admittedly, there is not a great deal of choice, and it also meant moving away from AWS Bedrock. Bedrock does have a few Mistral models, but region availability is very limited, and the specific one I wanted to use (Mistral Large 3) was not available in the EU at all (the model card says it is, but it is not). Losing Bedrock also meant losing the direct integration with CloudWatch. Luckily, the decision to go with OTLP for audit meant I already had the code in place to extract these metrics from the trace. Combined with EMF (Embedded Metric Format), that let me send them to CloudWatch as custom metrics without a great deal of code changes.
Originally I had planned to add the ability to switch between models later, when we get to evaluation, but recent events made me change the order. The new code still supports Haiku via Bedrock and adds the ability to use Mistral models via Mistral’s API.
The final tree. + is new in post 3, ~ extends a post 2 file, blank carries unchanged.
The download below fast-forwards to this state.
terraform-pr-agent/
infra/
audit-bucket.tf
firehose.tf
kms.tf
~ lambda.tf
logfire.tf
~ variables.tf
~ alerts.tf
bedrock.tf
~ cloudwatch.tf
iam.tf
main.tf
+ models.tf
agent/
~ handler.py
__init__.py
scripts/
build-lambda.sh
queries.sql
traces.sql
chat.py
+ tests/
+ conftest.py
+ test_handler.py
~ pyproject.toml
Fast-forward to the final code of this post
Download the cumulative checkpoint that matches the state at the end of this post. Useful for landing on the finished tree without working through every step.
    mkdir -p ~/projects
    cd ~/projects
    curl -fsSL https://andreaslang.dev/terraform-pr-agent/terraform-pr-agent-03.tar.gz | tar xz

To use Mistral models, you will need to create an API key and configure it in your .envrc.local file.
Sign up here and create an API key here. For this
post’s usage the free tier is fine, but you may as well load 10 Euros on it and switch to the “Scale” plan of the API.
Otherwise you will very quickly receive 429 errors.
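If you do run into 429s, the usual remedy is to retry with a growing delay. The handler in the download has its own retrying HTTP client, which is not what is shown here; the following is only a stdlib sketch of the idea. RateLimited, backoff_delay, and call_with_retry are hypothetical names, and the exception is a stand-in for however your HTTP client surfaces a 429.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical stand-in for an HTTP 429 response."""

    def __init__(self, retry_after: Optional[float] = None) -> None:
        super().__init__("429 Too Many Requests")
        self.retry_after = retry_after


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Exponential delay in seconds for a zero-based attempt, capped at ``cap``."""
    return min(cap, base * (2**attempt))


def call_with_retry(
    call: Callable[[], T],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run ``call``, retrying on RateLimited until ``max_attempts`` is used up."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimited as exc:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last 429
            # Prefer the server's hint when it sent one.
            hinted = exc.retry_after
            sleep(hinted if hinted is not None else backoff_delay(attempt))
    raise AssertionError("unreachable")


if __name__ == "__main__":
    waits: list = []
    calls = {"n": 0}

    def flaky() -> str:
        calls["n"] += 1
        if calls["n"] < 3:
            raise RateLimited()
        return "ok"

    # Record the delays instead of really sleeping.
    print(call_with_retry(flaky, sleep=waits.append), waits)
```

Passing sleep as a parameter keeps the demo (and any test) instant; in real use the default time.sleep applies.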
Architecture
Post 2 ran a single Bedrock model behind the Lambda and shipped spans to Logfire and the S3 audit copy. Post 3 keeps that intact and turns the model into a runtime choice: Terraform renders a model registry into SSM Parameter Store, the handler builds the pydantic-ai model on first invoke by reading that registry, and a Mistral API entry sits alongside the Bedrock one (with the Mistral key fetched from SSM the same way as the Logfire token). Metrics move to EMF, so a Bedrock model and a Mistral-API model land in the same CloudWatch namespace and one dashboard covers both.
The model registry
To support both providers, I pass a simple config into the Lambda via AWS SSM Parameter Store. Each entry defines the provider, the model id, and, for Bedrock entries, the inference profile to use.
    # The model registry: Terraform owns it, renders it to JSON, and parks it in
    # an SSM String parameter the handler reads at startup. Each entry names a
    # provider and a model id; Bedrock entries also carry the inference-profile
    # ARN. DEFAULT_MODEL (set on the Lambda) selects the active one, so switching
    # the agent's model is a parameter change, not a code change.
    locals {
      metrics_namespace = "TerraformPrAgent/Models"

      models = {
        haiku = {
          provider              = "bedrock"
          model_id              = local.bedrock_model_id
          inference_profile_arn = aws_bedrock_inference_profile.agent.arn
        }
        "mistral-large" = {
          provider = "mistral"
          model_id = "mistral-large-latest"
        }
        "devstral-small" = {
          provider = "mistral"
          model_id = "devstral-small-2507"
        }
      }

      mistral_key_wired = var.mistral_api_key != ""
    }

    resource "aws_ssm_parameter" "models" {
      name        = "/terraform-pr-agent/models"
      description = "Model registry for the terraform-pr-agent Lambda (provider + model id per entry)."
      type        = "String"
      value       = jsonencode(local.models)
    }

In addition we need a Mistral API key wired and retrieved the same way as the Logfire key via SSM Parameter Store (encrypted).
    # The Mistral API key, SecureString, fetched by the handler through the same
    # Parameters and Secrets extension path as the Logfire token. Only created
    # when TF_VAR_mistral_api_key is set, mirroring the Logfire token wiring; with
    # it unset the Mistral providers are simply unreachable and a Bedrock default
    # still works.
    resource "aws_ssm_parameter" "mistral_api_key" {
      count = local.mistral_key_wired ? 1 : 0

      name        = "/terraform-pr-agent/mistral-api-key"
      description = "Mistral API key. Consumed by the terraform-pr-agent Lambda."
      type        = "SecureString"
      value       = var.mistral_api_key
    }

Building the model at invoke time
Now that we support Bedrock and Mistral models, we just need to create the right pydantic-ai model object with the matching configuration. The handler has also been modified so the model to be used can be provided via the event payload. The default is Mistral Large 3 if nothing is provided.
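The step that picks the registry key is not part of the excerpt in this post, so here is a rough sketch of how it could look. It assumes the event carries an optional model field and that DEFAULT_MODEL holds the fallback key; both names appear in the handler's error message. The helper resolve_model_name and the placeholder Bedrock values are mine.

```python
import os
from typing import Mapping

# Registry shaped like the Terraform map. The Bedrock values are placeholders,
# because the real ones are only known after terraform apply.
REGISTRY = {
    "haiku": {
        "provider": "bedrock",
        "model_id": "<bedrock-model-id>",
        "inference_profile_arn": "<inference-profile-arn>",
    },
    "mistral-large": {"provider": "mistral", "model_id": "mistral-large-latest"},
    "devstral-small": {"provider": "mistral", "model_id": "devstral-small-2507"},
}


def resolve_model_name(
    event: Mapping[str, object],
    registry: Mapping[str, object],
    env: Mapping[str, str] = os.environ,
) -> str:
    """Pick the registry key: the event's model field wins, else DEFAULT_MODEL."""
    name = event.get("model") or env.get("DEFAULT_MODEL")
    if not name:
        raise ValueError("no model in the event and DEFAULT_MODEL is not set")
    if name not in registry:
        known = ", ".join(sorted(registry))
        raise ValueError(f"unknown model {name!r}; registry has: {known}")
    return str(name)


if __name__ == "__main__":
    env = {"DEFAULT_MODEL": "mistral-large"}
    print(resolve_model_name({}, REGISTRY, env))
    print(resolve_model_name({"model": "haiku"}, REGISTRY, env))
```

Failing early on an unknown key gives a readable error that lists the valid names, instead of a bare KeyError from the registry lookup.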
    @cache
    def _build_model(name: str) -> Model:
        """Build the pydantic-ai model registered under ``name``.

        The registry lives in an SSM String parameter, so this runs on the first
        INVOKE (the extension is not ready during INIT) and is memoised per model
        name for warm invocations. Bedrock models authenticate via the Lambda
        role; Mistral models read an API key from a SecureString parameter,
        fetched the same way as the Logfire token.
        """
        registry = json.loads(_fetch_ssm_parameter(os.environ["MODELS_PARAMETER"]))
        config = registry[name]
        provider = config["provider"]
        if provider == "bedrock":
            return BedrockConverseModel(
                config["model_id"],
                settings={"bedrock_inference_profile": config["inference_profile_arn"]},
            )
        if provider == "mistral":
            key_param = os.environ.get("MISTRAL_API_KEY_PARAMETER")
            if not key_param:
                raise RuntimeError(
                    f"model {name!r} uses the Mistral API, but MISTRAL_API_KEY_PARAMETER "
                    "is not set. Set MISTRAL_API_KEY and re-apply so the key is wired, or "
                    "select a Bedrock model via DEFAULT_MODEL or the event's model field."
                )
            return MistralModel(
                config["model_id"],
                provider=MistralProvider(
                    api_key=_fetch_ssm_parameter(key_param),
                    http_client=_retrying_http_client(),
                ),
            )
        raise ValueError(f"unknown provider {provider!r} for model {name!r}")

Provider-agnostic metrics with EMF
Rather than collecting metrics for the Bedrock model through its inference profile and for the Mistral models through a different mechanism, all models now report their metrics via EMF log lines. That gives us one clean dashboard for both (you can find it in the code download above).
    def _emit_emf(spans: Sequence[ReadableSpan]) -> None:
        """Emit one EMF metric line for the trace, read off the root span.

        pydantic-ai records gen_ai.usage.* on the root agent span as the run total
        (the sum of its child chat spans), so a single read is the correct total,
        not a sum across every span. The model dimension is the registry key the
        handler passed as run metadata; pydantic-ai serialises that to the root
        span's `metadata` attribute (even on a failed run), so it is read back
        here rather than carried in module state. That key is exactly what the
        dashboard iterates, so a Bedrock run and a Mistral run share one set of
        widgets. Logging the _aws envelope to stdout is enough; CloudWatch Logs
        extracts the metrics from the structured line.
        """
        root = next((span for span in spans if span.parent is None), None)
        if root is None:
            return
        attributes = root.attributes or {}
        model = json.loads(attributes.get("metadata", "{}")).get("model", "unknown")
        errored = root.status.status_code is StatusCode.ERROR
        record = {
            "_aws": {
                "Timestamp": root.end_time // 1_000_000,
                "CloudWatchMetrics": [
                    {
                        "Namespace": os.environ["METRICS_NAMESPACE"],
                        "Dimensions": [["Model"]],
                        "Metrics": [
                            {"Name": "InputTokens", "Unit": "Count"},
                            {"Name": "OutputTokens", "Unit": "Count"},
                            {"Name": "CacheReadTokens", "Unit": "Count"},
                            {"Name": "CacheWriteTokens", "Unit": "Count"},
                            {"Name": "Latency", "Unit": "Milliseconds"},
                            {"Name": "Invocations", "Unit": "Count"},
                            {"Name": "Errors", "Unit": "Count"},
                        ],
                    }
                ],
            },
            "Model": model,
            "InputTokens": attributes.get("gen_ai.usage.input_tokens", 0),
            "OutputTokens": attributes.get("gen_ai.usage.output_tokens", 0),
            # pydantic-ai sets these only when non-zero, so default to 0. Providers
            # without prompt caching (e.g. the Mistral API) simply never report them.
            "CacheReadTokens": attributes.get("gen_ai.usage.cache_read.input_tokens", 0),
            "CacheWriteTokens": attributes.get("gen_ai.usage.cache_creation.input_tokens", 0),
            "Latency": (root.end_time - root.start_time) / 1_000_000,
            "Invocations": 1,
            "Errors": 1 if errored else 0,
        }
        log.info("trace_metrics", **record)


    def _on_trace_complete(spans: Sequence[ReadableSpan]) -> None:
        """Ship the audit copy, then emit metrics: one hook, two sinks."""
        _ship_trace(spans)
        _emit_emf(spans)

You might also wonder about log.info("trace_metrics", **record) and how this logs in the right format for EMF.
Well, the answer is that I sneaked in structlog. It is an amazing Python logging library that offers the features and ease of use the standard logging library lacks.
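Why is that enough for EMF? A JSON renderer writes the keyword arguments of the log call as top-level keys of a single JSON object, which is where CloudWatch expects the _aws envelope and the metric values to sit. Here is a stdlib-only approximation of that idea; render_log_line is a hypothetical helper, not structlog itself, and the real line also carries the level and timestamp keys.

```python
import json


def render_log_line(event: str, **fields: object) -> str:
    """Approximate a JSON log renderer: event name plus fields, flat, in one object."""
    return json.dumps({"message": event, **fields})


# A cut-down record with the same shape as the one built in _emit_emf.
record = {
    "_aws": {
        "Timestamp": 1_700_000_000_000,  # milliseconds since the epoch
        "CloudWatchMetrics": [
            {
                "Namespace": "TerraformPrAgent/Models",
                "Dimensions": [["Model"]],
                "Metrics": [{"Name": "Invocations", "Unit": "Count"}],
            }
        ],
    },
    "Model": "haiku",
    "Invocations": 1,
}

line = render_log_line("trace_metrics", **record)
parsed = json.loads(line)

if __name__ == "__main__":
    print(line)
```

Every metric named in the envelope and every dimension has a matching top-level key in the same object, which is what ties the values to the metric definitions.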
    # JSON logs to stdout, which CloudWatch Logs ingests as-is. The same stream also
    # carries the EMF metric envelope (see _emit_emf), so one structured sink covers
    # both application logs and metrics. Logging has no extension dependency, so it
    # is configured at import rather than on the first INVOKE.
    structlog.configure(
        processors=[
            structlog.processors.add_log_level,
            structlog.processors.TimeStamper(fmt="iso"),
            structlog.processors.EventRenamer("message"),
            structlog.processors.JSONRenderer(),
        ],
        logger_factory=structlog.PrintLoggerFactory(),
        cache_logger_on_first_use=True,
    )
    log = structlog.get_logger()

End State
We can now switch between models easily, EMF logging and monitoring are configured, and the agent can run a (good) European foundation model 🇪🇺!
Coming next: a workspace and a small toolkit so the agent can get to work.