<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>OvertimeLabs.ai — Articles</title>
    <link>https://overtimelabs.ai/articles</link>
    <atom:link href="https://overtimelabs.ai/articles/rss.xml" rel="self" type="application/rss+xml" />
    <description>Field notes from building production AI: RAG, agentic systems, performance, and enterprise Claude Code.</description>
    <language>en-GB</language>
    <item>
      <title>Stop your RAG system hallucinating</title>
      <link>https://overtimelabs.ai/articles/stop-rag-hallucinating</link>
      <guid isPermaLink="true">https://overtimelabs.ai/articles/stop-rag-hallucinating</guid>
      <pubDate>Sat, 30 May 2026 00:00:00 GMT</pubDate>
      <category>RAG</category>
      <description>Most RAG hallucinations are retrieval failures, not generation failures. Diagnose which kind you have, then ground answers in cited context, make the model abstain, and track faithfulness.</description>
    </item>
    <item>
      <title>Self-hosting open models vs an API: where the cost actually crosses over</title>
      <link>https://overtimelabs.ai/articles/self-host-vs-api-cost</link>
      <guid isPermaLink="true">https://overtimelabs.ai/articles/self-host-vs-api-cost</guid>
      <pubDate>Sat, 30 May 2026 00:00:00 GMT</pubDate>
      <category>AI architecture</category>
      <description>Self-hosting open-weight models beats API pricing at high steady throughput or under data-residency rules; APIs win for spiky, low-volume, or frontier-quality work.</description>
    </item>
    <item>
      <title>Model routing: stop sending every request to your biggest model</title>
      <link>https://overtimelabs.ai/articles/model-routing-cost</link>
      <guid isPermaLink="true">https://overtimelabs.ai/articles/model-routing-cost</guid>
      <pubDate>Sat, 30 May 2026 00:00:00 GMT</pubDate>
      <category>AI architecture</category>
      <description>Most LLM traffic doesn't need a frontier model. Route by rules, a classifier, or a cascade to cut spend several-fold without silently degrading quality.</description>
    </item>
    <item>
      <title>LLM-as-a-judge: evaluating LLM systems that actually scale</title>
      <link>https://overtimelabs.ai/articles/llm-as-a-judge-evaluation</link>
      <guid isPermaLink="true">https://overtimelabs.ai/articles/llm-as-a-judge-evaluation</guid>
      <pubDate>Sat, 30 May 2026 00:00:00 GMT</pubDate>
      <category>AI architecture</category>
      <description>How to use LLM-as-a-judge to evaluate generative systems at scale: rubrics, golden sets, bias mitigation, human calibration with Cohen's kappa, and CI gates.</description>
    </item>
    <item>
      <title>Migrating off tag proliferation to branch/environment CI/CD in GitLab Ultimate</title>
      <link>https://overtimelabs.ai/articles/gitlab-branch-environment-cicd</link>
      <guid isPermaLink="true">https://overtimelabs.ai/articles/gitlab-branch-environment-cicd</guid>
      <pubDate>Sat, 30 May 2026 00:00:00 GMT</pubDate>
      <category>Enterprise AI</category>
      <description>Replace tag-driven releases with a branch/environment promotion model in GitLab Ultimate: protected branches, approval gates, LDAP/AD identity, and build-once-promote pipelines.</description>
    </item>
    <item>
      <title>Enterprise Claude Code without leaking your code</title>
      <link>https://overtimelabs.ai/articles/enterprise-claude-code-without-leaking-code</link>
      <guid isPermaLink="true">https://overtimelabs.ai/articles/enterprise-claude-code-without-leaking-code</guid>
      <pubDate>Sat, 30 May 2026 00:00:00 GMT</pubDate>
      <category>Enterprise AI</category>
      <description>Run Claude Code via Amazon Bedrock with VPC endpoints so prompts and code stay in your AWS account: IAM-scoped, no public egress, no training on your data.</description>
    </item>
    <item>
      <title>Cutting vision-LLM cost 70-90% with motion-gating</title>
      <link>https://overtimelabs.ai/articles/cutting-vision-llm-cost-motion-gating</link>
      <guid isPermaLink="true">https://overtimelabs.ai/articles/cutting-vision-llm-cost-motion-gating</guid>
      <pubDate>Sat, 30 May 2026 00:00:00 GMT</pubDate>
      <category>Computer vision</category>
      <description>A cheap OpenCV motion-gating layer in front of a vision-language model cuts 24/7 surveillance API cost 70-90%, as built in my dvr_ai project.</description>
    </item>
    <item>
      <title>Your first dbt tests</title>
      <link>https://overtimelabs.ai/articles/your-first-dbt-tests</link>
      <guid isPermaLink="true">https://overtimelabs.ai/articles/your-first-dbt-tests</guid>
      <pubDate>Tue, 29 Jul 2025 00:00:00 GMT</pubDate>
      <category>Data engineering</category>
      <description>Three low-effort dbt tests catch roughly 80% of warehouse errors. Add not_null, unique and accepted_values on your keys and enums, wire them into CI, and bad data stops reaching dashboards.</description>
    </item>
    <item>
      <title>Cutting p95 latency without new hardware</title>
      <link>https://overtimelabs.ai/articles/cutting-p95-latency</link>
      <guid isPermaLink="true">https://overtimelabs.ai/articles/cutting-p95-latency</guid>
      <pubDate>Tue, 29 Jul 2025 00:00:00 GMT</pubDate>
      <category>Performance</category>
      <description>A metric-driven playbook that routinely trims 40%+ off tail latency before you reach for a bigger instance: measure, relieve the DB, control concurrency, shed load.</description>
    </item>
  </channel>
</rss>