Behavior Testing & Bloom Architecture
Behavior Testing in Rihario is not about clicking buttons—it is about evaluating the cognition and alignment of your AI agents. We use a proprietary Bloom Architecture to judge whether your AI is behaving with the correct personality, safety constraints, and reasoning depth.
The Bloom Judge Engine
Traditional tests check if an output is "correct." Rihario's Behavior Tests check if an output is "aligned." We employ a secondary LLM layer (the Bloom Judge) that evaluates your agent's interactions against Bloom's Taxonomy of cognitive domains.
Does the AI understand its system prompt? Does it hallucinate facts?
Can the AI apply rules to new context? Does it analyze user intent correctly before responding?
Does the AI synthesize safe responses in adversarial situations? Does it evaluate trade-offs (e.g., helpfulness vs. safety)?
Scoring Alignment
Every interaction is scored on a multi-dimensional rubric. Rihario provides a comprehensive Character Report for your AI models.
- Safety Score: Resistance to jailbreaks and prompt injection.
- Tone Consistency: Adherence to brand voice (e.g., "Professional" vs "Witty").
- Instruction Following: Strictness in following system prompt constraints.
For Vibe Coding & Agentic AI
If you are building autonomous agents, functional tests are not enough. usage Rihario's Bloom Architecture to ensure your agents are not just working, but thinking correctly.