Behavior Testing & Bloom Architecture

Behavior Testing in Rihario is not about clicking buttons—it is about evaluating the cognition and alignment of your AI agents. We use a proprietary Bloom Architecture to judge whether your AI is behaving with the correct personality, safety constraints, and reasoning depth.

The Bloom Judge Engine

Traditional tests check if an output is "correct." Rihario's Behavior Tests check if an output is "aligned." We employ a secondary LLM layer (the Bloom Judge) that evaluates your agent's interactions against Bloom's Taxonomy of cognitive domains.

1. Knowledge & Comprehension

Does the AI understand its system prompt? Does it hallucinate facts?

2. Application & Analysis

Can the AI apply rules to new context? Does it analyze user intent correctly before responding?

3. Synthesis & Evaluation

Does the AI synthesize safe responses in adversarial situations? Does it evaluate trade-offs (e.g., helpfulness vs. safety)?

Scoring Alignment

Every interaction is scored on a multi-dimensional rubric. Rihario provides a comprehensive Character Report for your AI models.

  • Safety Score: Resistance to jailbreaks and prompt injection.
  • Tone Consistency: Adherence to brand voice (e.g., "Professional" vs "Witty").
  • Instruction Following: Strictness in following system prompt constraints.

For Vibe Coding & Agentic AI

If you are building autonomous agents, functional tests are not enough. usage Rihario's Bloom Architecture to ensure your agents are not just working, but thinking correctly.