Services
Five things, done to production standard
Not a generic dev shop. Deep, demonstrable work in the parts of AI that are genuinely hard to get right once real users — and real data — hit them.
AI systems architecture & LLM integration
The end-to-end design that turns a model demo into a system that survives production.
- Model + provider selection against real cost/latency budgets
- Tool-calling & structured output with validation
- Eval harness + regression suite (golden Q/A)
- Guardrails: input filters, rate limits, retries, circuit breakers
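To make the guardrails item concrete, here is a minimal sketch of retries with exponential backoff wrapped around a circuit breaker. It is an illustration under stated assumptions, not a description of any specific client library: `CircuitBreaker`, `CircuitOpenError`, `with_retries`, and the thresholds are all hypothetical names and values.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the breaker is open."""


class CircuitBreaker:
    """Hypothetical breaker: opens after N consecutive failures, then cools down."""

    def __init__(self, failure_threshold: int = 3, reset_after_s: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.reset_after_s = reset_after_s
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def call(self, fn: Callable[[], T]) -> T:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after_s:
                raise CircuitOpenError("circuit open; skipping provider call")
            # Cool-down elapsed: allow one trial call; a single failure reopens.
            self._opened_at = None
            self._failures = self.failure_threshold - 1
        try:
            result = fn()
        except Exception:
            self._failures += 1
            if self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        self._failures = 0
        return result


def with_retries(fn: Callable[[], T], attempts: int = 3, base_delay_s: float = 0.5,
                 sleep: Callable[[float], None] = time.sleep) -> T:
    """Retry fn with exponential backoff; never retry while the circuit is open."""
    for attempt in range(attempts):
        try:
            return fn()
        except CircuitOpenError:
            raise
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay_s * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

A typical composition would be `with_retries(lambda: breaker.call(call_provider))`, where `call_provider` stands in for whatever function issues the model request. Which exceptions count as retryable is a per-provider decision that this sketch leaves out.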
RAG systems: multilingual-ready
Retrieval that answers from your data — grounded in citations, evaluated, and safe to put in front of users.
- pgvector schema, chunking & retrieval policy tuned to your corpus
- Hybrid search + re-ranking, citations & source tracking
- Self-hosted embeddings option so your data stays in your VPC
- Multilingual retrieval, incl. Hebrew (RTL, morphology, tokenisation)
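As one illustration of the hybrid search item, the sketch below merges a keyword ranking and a vector ranking with reciprocal rank fusion (RRF). RRF is a common fusion technique and is used here as an assumption; the page does not say which fusion method is applied. The function name and the document IDs are hypothetical, and `k=60` is a conventional default rather than a tuned value.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document scores 1 / (k + rank) for every list it appears in,
    so documents ranked well by several retrievers rise to the top.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


keyword_hits = ["a", "b", "c"]  # e.g. from full-text search
vector_hits = ["b", "d", "a"]   # e.g. from an embedding similarity query
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
```

The fused list would then typically go to a re-ranker, with the source document ID carried alongside each chunk so the final answer can cite it.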
Computer vision & multimodal AI
Real-time video and image AI that's accurate — and cheap enough to run continuously.
- Real-time video / RTSP analysis pipelines
- Vision-model selection (Gemini, Groq vision) for accuracy vs cost
- Motion-gating that cuts inference cost 70–90% without missing events
- Multimodal builds on Vertex AI (virtual try-on, image/video generation)
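The motion-gating idea can be sketched as a cheap frame-difference check that decides whether a frame is worth sending to the vision model at all. This is a simplified illustration, not the production pipeline: frames are assumed to be flat grayscale byte sequences of equal length, and `motion_gate`, `mean_abs_diff`, and the threshold value are hypothetical. The actual saving depends on how static the scene is.

```python
from typing import Iterable, Iterator, Optional, Sequence, Tuple


def mean_abs_diff(a: Sequence[int], b: Sequence[int]) -> float:
    """Mean absolute per-pixel difference between two equal-size frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def motion_gate(frames: Iterable[Sequence[int]],
                threshold: float = 8.0) -> Iterator[Tuple[int, Sequence[int]]]:
    """Yield (index, frame) only for frames that differ enough from the
    last forwarded frame. Everything else is skipped before inference."""
    reference: Optional[Sequence[int]] = None
    for index, frame in enumerate(frames):
        if reference is None or mean_abs_diff(frame, reference) >= threshold:
            # Comparing against the last forwarded frame (not the previous
            # frame) means slow, gradual change still accumulates and triggers.
            reference = frame
            yield index, frame


frames = [
    bytes([10] * 16),   # first frame: always forwarded
    bytes([10] * 16),   # identical: skipped
    bytes([11] * 16),   # sensor noise: skipped
    bytes([200] * 16),  # large change: forwarded
    bytes([200] * 16),  # static again: skipped
]
forwarded = [index for index, _ in motion_gate(frames)]
```

In a real RTSP pipeline the frames would come from a decoder and the difference would usually be computed on downscaled, blurred images to suppress noise; those steps are left out here.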
Agentic systems
Tool-calling agents that are orchestrated, evaluated, and benchmarked — not vibes.
- Orchestration + tool-calling design for your use case
- Framework selection benchmarked against your constraints
- ReAct / multi-step flows with caching for fast responses
- Eval harness for task completion, not just token counts
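A task-completion eval can be as small as a list of cases, each with a check on the agent's final output. The sketch below is a hypothetical shape for such a harness: `GoldenCase`, `run_eval`, and the stub agent are illustrative names, and real checks would usually inspect tool calls and side effects as well as the returned text.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class GoldenCase:
    """One task plus a predicate that decides whether it was completed."""
    prompt: str
    check: Callable[[str], bool]


def run_eval(agent: Callable[[str], str],
             cases: Sequence[GoldenCase]) -> Tuple[float, List[str]]:
    """Run every case through the agent; return (pass rate, failed prompts)."""
    failures = [case.prompt for case in cases if not case.check(agent(case.prompt))]
    pass_rate = 1.0 - len(failures) / len(cases)
    return pass_rate, failures


def stub_agent(prompt: str) -> str:
    # Stand-in for a real agent call; it only knows one answer.
    return "4" if prompt == "What is 2 + 2?" else "unknown"


cases = [
    GoldenCase("What is 2 + 2?", lambda out: out.strip() == "4"),
    GoldenCase("Capital of France?", lambda out: "paris" in out.lower()),
]
pass_rate, failed = run_eval(stub_agent, cases)
```

Running the same cases on every change turns the suite into a regression gate: a drop in pass rate blocks the release, regardless of what happened to token counts.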
Enterprise AI-assisted development
Roll out Claude Code across your team without your source code leaving your boundary.
- Claude Code via Bedrock + VPC endpoints — code never leaves your AWS boundary
- No-training / data-residency posture for trade-secret-sensitive teams
- Protected branches + CI/CD approval gates for AI-written code
- Governance: review-gate ownership, repo risk tiers (RACI)
Not sure which of these you need?
That's what the call is for. Book 15 minutes and we'll work out the shape of the problem together.