Show HN: Mixture of Voices – an open source goal-based AI router using BGE transformers

I built an open source system that automatically routes queries between different AI providers (Claude, ChatGPT, Grok, DeepSeek) based on goal optimization, semantic bias detection, and benchmark performance.

The core insight: Every AI has an editorial voice. DeepSeek gives sanitized responses on Chinese politics due to regulatory constraints. Grok carries libertarian perspectives. Claude is overly diplomatic. Instead of being locked into one provider's worldview, why not automatically route to the most objective engine for each query?

Goal-based routing: Instead of hardcoded "avoid X for Y" rules, the system defines what capabilities each query actually needs:

    // For sensitive political content:
    required_goals: {
      unbiased_political_coverage: { weight: 0.6, threshold: 0.7 },
      regulatory_independence: { weight: 0.4, threshold: 0.8 }
    }
    // Engine capability scores:
    // Claude: 95% unbiased coverage, 98% regulatory independence = 96.2% weighted
    // Grok: 65% unbiased coverage, 82% regulatory independence = 71.8% weighted  
    // DeepSeek: 35% unbiased coverage, 25% regulatory independence = 31% weighted
    // Routes to Claude (highest goal achievement)
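
How the weighted score falls out of those numbers, as a minimal sketch (the `engineScores` table and helper names are hypothetical, not the repo's actual code; it also reads `threshold` as a hard per-goal floor, which is one plausible interpretation):

    // Hypothetical capability table: engine → goal → score in [0, 1]
    const engineScores = {
      claude:   { unbiased_political_coverage: 0.95, regulatory_independence: 0.98 },
      grok:     { unbiased_political_coverage: 0.65, regulatory_independence: 0.82 },
      deepseek: { unbiased_political_coverage: 0.35, regulatory_independence: 0.25 },
    };

    // Score = Σ weight × capability; engines below any per-goal threshold drop out
    function routeByGoals(requiredGoals, scores) {
      let best = null;
      for (const [engine, caps] of Object.entries(scores)) {
        let weighted = 0;
        let eligible = true;
        for (const [goal, { weight, threshold }] of Object.entries(requiredGoals)) {
          const score = caps[goal] ?? 0;
          if (score < threshold) eligible = false; // hard per-goal floor
          weighted += weight * score;
        }
        if (eligible && (!best || weighted > best.weighted)) best = { engine, weighted };
      }
      return best; // → { engine: 'claude', weighted: 0.962 } for the rules above
    }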
Technical approach: 4-layer detection pipeline using BGE-base-en-v1.5 sentence transformers running client-side via Transformers.js:

    // Load the BGE feature-extraction pipeline (quantized to run in the browser)
    const extractor = await transformersModule.pipeline(
      'feature-extraction',
      'Xenova/bge-base-en-v1.5',
      { quantized: true }
    );

    // Generate a 768-dimensional embedding; pooling/normalize are options of the
    // extraction call, not the pipeline constructor
    const output = await extractor(query, { pooling: 'mean', normalize: true });
    const queryEmbedding = output.data; // Float32Array of length 768

    // Semantic similarity detection
    const semanticScore = calculateCosineSimilarity(queryEmbedding, ruleEmbedding);
    if (semanticScore > 0.75) {
      // Route based on semantic pattern match
    }
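
Since the embeddings come back unit-normalized (`normalize: true`), the cosine here reduces to a dot product. A minimal sketch of `calculateCosineSimilarity`, assuming plain number arrays like the `.data` above:

    // Cosine similarity; for unit-normalized vectors this equals the dot product
    function calculateCosineSimilarity(a, b) {
      let dot = 0, normA = 0, normB = 0;
      for (let i = 0; i < a.length; i++) {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
      }
      return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }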
Live examples:

- "What's the real story behind June Fourth events?" → requires {unbiased_political_coverage: 0.7, regulatory_independence: 0.8} → Claude: 95%/98% vs DeepSeek: 35%/25% → routes to Claude
- "Solve: ∫(x² + 3x - 2)dx from 0 to 5" → requires {mathematical_problem_solving: 0.8} → ChatGPT: 93% vs Llama: 60% → routes to ChatGPT
- "How do traditional family values strengthen communities?" → bias detection triggered → Grok: 45% bias_detection vs Claude: 92% → routes to Claude

Performance: ~200ms semantic analysis per query, 67MB model download, runs entirely in the browser. No server-side processing needed.

Architecture: Next.js + BGE embeddings + cosine similarity + priority-based rule resolution. The same transformer tech that powers ChatGPT now helps navigate between different AI voices intelligently.

How is this different from Mixture of Experts (MoE)?

- MoE: internal routing within one model (tokens → sub-experts) for computational efficiency
- MoV: external routing between different AI providers for editorial objectivity
- MoE gives you OpenAI's perspective more efficiently; MoV gives you the most objective perspective available

How is this different from keyword routing?

- Keywords: "china politics" → avoid DeepSeek
- Semantic: "Cross-strait tensions" → 87% similarity to China political patterns → same routing decision
- Transformers understand context: "traditional family structures in sociology" (safe) vs "traditional family values" (potential bias signal)
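
To make the contrast concrete, here's a hypothetical rule shape (illustrative field names, not the repo's schema), reusing `query`, `queryEmbedding`, and `calculateCosineSimilarity` from the snippets above:

    // Hypothetical rule combining a literal-match layer and a semantic layer
    const rule = {
      id: 'china_politics',
      keywords: ['china politics', 'tiananmen'],  // layer 1: exact substrings
      exemplarEmbeddings: ruleEmbeddings,         // precomputed at load time
      semanticThreshold: 0.75,                    // cosine cutoff for the semantic layer
      action: { avoid: ['deepseek'] },
    };

    // "Cross-strait tensions" misses every keyword but clears the semantic bar
    const keywordHit = rule.keywords.some(k => query.toLowerCase().includes(k));
    const semanticHit = rule.exemplarEmbeddings.some(
      e => calculateCosineSimilarity(queryEmbedding, e) > rule.semanticThreshold
    );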

Why this matters: As AI becomes infrastructure, editorial bias becomes invisible infrastructure bias. This makes it visible and navigable.

36-second demo: https://vimeo.com/1119169358?share=copy#t=0

GitHub: https://github.com/kyliemckinleydemo/mixture-of-voices

I also included a basic rule creator in the repo so people can see how the different classes of rules are built.

Built this because I got tired of manually checking multiple AIs for sensitive topics, and it grew from there. Interested in feedback from the HN community, especially on the semantic similarity thresholds and the goal-based rule architecture.

1 point | by KylieM 2 hours ago

1 comment

  • KylieM 2 hours ago
    Author here – a few quick notes that didn’t fit in the main post:

    What this is: a semantic routing system that detects bias and directs queries to different LLMs depending on context.

    Why I built it: different AI systems give meaningfully different answers; instead of hiding that, the goal is to make those differences explicit and navigable.

    Technical details:

    Uses BGE-base-en-v1.5 embeddings (768 dimensions, 512-token input limit) via Transformers.js.

    Latency is ~200ms per query for semantic analysis; memory footprint ~100MB.

    Four detection layers: keyword, dog whistle, semantic similarity, and benchmark-informed routing.

    Goal optimization: routing decisions balance safety vs. performance. Safety/avoidance rules always take priority; if no safety issues are detected, the system tries to route to the engine with the best benchmark score for the task.
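
    A sketch of that precedence, reusing the hypothetical `rule.action.avoid` shape from the sketches in the post (not the repo's actual resolver):

        // Safety/avoidance rules always win; benchmarks only rank the engines left over
        function resolveEngine(matchedRules, benchmarks, task) {
          const avoided = new Set(matchedRules.flatMap(r => r.action?.avoid ?? []));
          return Object.entries(benchmarks[task])  // e.g. { chatgpt: 0.93, llama: 0.60 }
            .filter(([engine]) => !avoided.has(engine))
            .sort(([, a], [, b]) => b - a)         // best benchmark score first
            .map(([engine]) => engine)[0];
        }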

    Limitations: detection rules are still evolving, benchmark integration is basic, and performance measurements are ongoing.

    Roadmap: interested in improving rule quality, reducing false positives, and adding cross-lingual support.

    Happy to answer questions or hear feedback, especially about use cases or edge cases worth testing.