Open-source AI testing tool

Alignmenter
tests your AI’s voice and safety

Check if your AI sounds like your brand, stays safe, and behaves consistently. Bring your Custom GPT voice, hosted APIs, and local models. Get detailed reports in minutes, not days.

Open source
Apache 2.0 licensed
Privacy-first

Why Alignmenter?

Testing AI behavior is hard. Here’s the problem we’re solving.

The challenge

You can’t see AI behavior problems until users do

You ship a new model version. Within hours, users notice the tone feels wrong. Support gets complaints about inappropriate responses. Your brand voice has changed.

Standard tests check if answers are correct, but miss tone and personality shifts. You need a way to measure brand voice, safety, and consistency before shipping.

  • Generic evals don’t measure brand alignment
  • Manual review doesn’t scale across versions
  • Behavior drift goes undetected until production

The solution

Test every release before your users see it

Alignmenter measures how your AI behaves. Run tests in minutes, compare different models side-by-side, and catch problems before shipping.

Works with OpenAI, Custom GPTs, Anthropic, and local models. Your data stays on your computer. Everything runs locally with shareable HTML reports.

  • Brand voice matching checks if responses sound like you
  • Safety checks catch harmful or off-brand responses
  • Consistency tracking spots when behavior changes unexpectedly

Three ways to measure AI behavior

Consistent, repeatable scores that show what’s actually happening

01

Authenticity

Does it sound like your brand?

Checks if AI responses match your brand’s voice and personality. Compares writing style, tone, and word choices against examples you provide. Shows confidence scores so you know when something’s off.

FORMULA
0.6 × style_sim + 0.25 × traits + 0.15 × lexicon
Key features
Compares writing style to your brand examples
Checks personality traits match your tone
Flags words and phrases that feel off-brand
Syncs instructions from your Custom GPTs
Shows confidence ranges for each score
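
The blend behind that formula is a plain weighted sum. Here is a minimal sketch in Python; the names and example inputs are illustrative, not Alignmenter's actual internals.

python
# Illustrative sketch of the authenticity blend shown above.
# style_sim, trait_score, and lexicon_score are assumed to already be
# computed values in [0, 1]; these names are not Alignmenter's API.
def authenticity_score(style_sim: float, trait_score: float, lexicon_score: float) -> float:
    """Weighted blend: 60% style similarity, 25% trait match, 15% lexicon fit."""
    return 0.6 * style_sim + 0.25 * trait_score + 0.15 * lexicon_score
# Example: strong style match, decent traits, a few off-brand word choices.
print(authenticity_score(0.90, 0.80, 0.60))  # -> 0.83
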
02

Safety

Catch harmful responses early

Combines keyword filters with AI judges to find safety issues. Set spending limits for the AI judges and fall back to offline checks when you need them. Tracks how well the different checks agree.

FORMULA
min(1 - violation_rate, judge_score)
Key features
Pattern matching catches obvious problems fast
AI judges review complex cases within your budget
Tracks agreement between different safety checks
Works offline with local safety models
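
The formula takes whichever signal is more pessimistic: the pattern-filter violation rate or the AI judge's score. A minimal sketch with hypothetical inputs; the names are illustrative, not Alignmenter's API.

python
# Illustrative sketch of the safety formula shown above.
# violation_rate = fraction of responses flagged by keyword/pattern filters;
# judge_score = the AI judge's rating in [0, 1]. Names are illustrative.
def safety_score(violation_rate: float, judge_score: float) -> float:
    """Take the more pessimistic of the two safety signals."""
    return min(1.0 - violation_rate, judge_score)
# Example: 2% of responses trip a pattern filter and the judge rates 0.95.
print(safety_score(0.02, 0.95))  # -> 0.95, the judge is the binding signal
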
03

Stability

Spot unexpected behavior changes

Measures whether your AI stays consistent. Flags when responses vary wildly within a single session. Compares versions to catch unintended changes between releases.

FORMULA
1 - normalized_variance(embeddings)
Key features
Finds inconsistent responses within conversations
Compares old and new model versions automatically
Set custom thresholds for when to warn you
Visual charts show where behavior shifted
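
The formula scores how tightly response embeddings cluster. A rough sketch of the shape of that computation; Alignmenter's embedding model and exact normalization aren't specified here, so treat the scaling below as an assumption.

python
# Illustrative sketch of the stability formula shown above; the
# normalization (dividing by the mean squared embedding norm) is an
# assumption, not Alignmenter's documented implementation.
import numpy as np
def stability_score(embeddings: np.ndarray) -> float:
    """1 - normalized variance of response embeddings (rows = responses)."""
    centered = embeddings - embeddings.mean(axis=0)
    variance = (centered ** 2).sum(axis=1).mean()   # spread around the centroid
    scale = (embeddings ** 2).sum(axis=1).mean()    # scale term for normalization
    return 1.0 - variance / scale if scale > 0 else 1.0
# Example: three near-identical response embeddings score close to 1.0.
emb = np.array([[0.90, 0.10], [0.88, 0.12], [0.91, 0.09]])
print(round(stability_score(emb), 3))  # -> ~1.0
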
Simple, powerful CLI

From install to results in 60 seconds

Install with one command, run your first test, and see a full report

terminal
$ pip install alignmenter
$ alignmenter init
# optional: alignmenter import gpt --instructions brand.txt --out alignmenter/configs/persona/brand.yaml
$ alignmenter run --model openai-gpt:brand-voice --config configs/brand.yaml
Loading dataset: 60 turns across 10 sessions
✓ Brand voice score: 0.83 (range: 0.79-0.87)
✓ Safety score: 0.95
✓ Consistency score: 0.88
Report written to: reports/2025-10-31_14-23/index.html
$ alignmenter report --last
Opening report in browser...
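
To use the same check as a release gate, shell out to the CLI from a pipeline step and propagate its exit code. This sketch reuses only the commands from the demo above; whether Alignmenter exits non-zero when a score falls below a threshold is something to confirm in its docs.

python
# Minimal sketch of running the CLI from a build-pipeline step.
# Only the commands shown in the terminal demo are used here; exit-code
# behaviour on threshold breaches is an assumption worth confirming.
import subprocess
import sys
result = subprocess.run([
    "alignmenter", "run",
    "--model", "openai-gpt:brand-voice",
    "--config", "configs/brand.yaml",
])
sys.exit(result.returncode)  # fail the pipeline step if the run fails
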
< 5 min
Test runtime on your laptop
Custom GPT ready
OpenAI, Anthropic, local, GPT Builder
100% local
No data upload required

Built for your workflow

Whether you’re validating releases, monitoring brand voice, or conducting research, Alignmenter integrates into your process.

ML Engineer

Test before you ship

Run brand voice, safety, and consistency checks before each release. Compare GPT-4o vs Claude on real conversations. Catch problems automatically in your build pipeline.

Stop regressions · Compare models · Automate testing
Product Manager

Keep your brand voice consistent

Sync your Custom GPT instructions into Alignmenter, make sure every release stays on-brand, and track voice consistency over time. Share easy-to-read HTML reports with your team.

Protect brand voice · Sync GPT instructions · Share with stakeholders
AI Safety Team

Safety and compliance checks

Use keyword filters plus AI judges to catch safety issues. Control spending with budget limits. Export complete audit trails for compliance reviews.

Reduce risk · Control costs · Audit documentation
Researcher

Run repeatable experiments

Test how well different models match specific personalities. Runs are reproducible, with every output saved for later comparison. Build custom tests and share them with others.

Repeatable results · Custom metrics · Share findings

Built on trust and transparency

Open source means you can see exactly how it works. Your data never leaves your computer. You can extend and customize everything to fit your needs.

Privacy by default

We never upload your data anywhere. Everything runs on your computer or your servers. Optional tools help remove sensitive information.

Apache 2.0 licensed

Free for commercial use. Copy it, modify it, use it in your products. Community contributions welcome.

Fully extensible

Plug in new AI providers, create custom tests, and build your own scoring methods. Designed to grow with your needs.

Ready to test your AI?

Join developers building better AI testing tools. Install the free CLI and run your first test in minutes.

Apache 2.0
Open source license
3 metrics
Voice, safety, consistency
Custom GPT voices
Bring your GPT Builder instructions