Resume

Vitalii Serbyn

AI Systems Architect

serbyn.vitalii@gmail.com · Kyiv, Ukraine · Remote only · UK LTD for contracting · serbyn.pro

Professional Summary

Senior engineer with 12+ years shipping production systems in mobile, web, and cloud. I design and run reliable LLM services with clear SLOs, CI/CD, MLflow registries, cost controls, and real observability. I have led small teams, owned release trains, and I bring that discipline to building and operating AI platforms that are easy to deploy, measure, and roll back.

Technical Skills

LLM & GenAI (Current Focus)

LangChain/LangGraph

vLLM

OpenAI API

Ollama

BLIP-2

RAG

Hybrid Retrieval

MLOps & Platform (Learning)

MLflow

SLO-gated CI

A/B Evaluation

Experiment Tracking

Model Registry

Canary Deployment

Backend & Data (12+ years)

Python

FastAPI

Node.js/NestJS

Celery

PostgreSQL

Redis

RabbitMQ

Qdrant

Cloud & Infrastructure (Production)

Kubernetes

Docker

AWS ECS/EKS

GCP/GKE

GitHub Actions

Terraform basics

Observability & Ops (Expert)

Prometheus

Grafana

Jaeger

Sentry

CloudWatch

PagerDuty

Frontend & Mobile

Flutter

Android/Kotlin

React basics

Professional Experience

AI Systems Architect

Easelect LTD • Remote

2023 - Present

Designed Ascend, a 15-agent autonomous orchestration system with trust-gated L0-L4 execution, used across 4 production projects
Built a policy engine with YAML-based permission boundaries, cutting costs from $82 to $34/mo (58%)
Created Crest, a signal-driven AI platform with an 8-agent pipeline, Thompson Sampling, and multi-model routing ($500 → $50/mo)
Delivered 3 concurrent client projects (healthcare, Web3, content) across 9+ repos serving 10k+ users
Applied production discipline to AI systems: trust layers, cost controls, observability, fast rollback

Senior/Lead Engineer

EPAM, GlobalLogic, Startups • Remote/Hybrid

2013 - 2023

Led teams of 3–4 engineers and owned release trains with 99.5%+ uptime for mobile apps (5M+ MAU)
Optimized backend APIs serving 1M+ requests/day with p95 latency <200ms (Redis caching, query optimization)
Implemented A/B testing frameworks processing 100k+ events/day for user behavior analysis
Built event-driven architectures with message queues handling 50k+ messages/min peak load
Established monitoring with Prometheus/Grafana: 15+ custom metrics, alert rules, SLA tracking
Reduced deployment time from 45 min to 3 min using Docker, Kubernetes, and automated rollback

AI Relevance: These same patterns apply to LLM services, including p95 latency SLOs, request/response caching, event processing for model inputs, monitoring for drift detection, and fast rollback for model deployments.

Key Projects

Threads-Agent — GenAI Content Platform

LangGraph • MLflow • Kubernetes • Prometheus • 2024 • Personal R&D

Multi-service system: persona runtime, optimizer service, publishing adapters
MLflow model registry with evaluation jobs tied to CI for automated quality gates
SLO-gated deployments on p95 latency, error rate, and token-cost deltas
Full observability stack: Prometheus metrics, Grafana dashboards, Jaeger tracing

ROI-Agent — Media Buyer Automation

BLIP-2 • Ollama • Meta Ads API • Celery • 2024 • Personal R&D

Multimodal pipeline: BLIP-2 for creative captions, LLM analysis for copy variants
Hybrid inference: local Ollama (Mixtral) with OpenAI fallback and structured output
A/B testing with sequential decision rules, performance rollups, policy guardrails
Integrations: Facebook Graph API, Shopify webhooks, Slack-to-Linear automation

Achievement Collector — Portfolio Automation

GitHub API • Impact Analysis • Portfolio Generation • 2024 • Personal R&D

GitHub PR analysis extracting technical metrics and generating ImpactScores
Automated portfolio generation from development activity and MLflow experiments
CI integration for consistent reporting and resume bullet generation

High-Traffic Mobile Backend Infrastructure

Node.js • PostgreSQL • Redis • Kubernetes • 2018-2022 • EPAM/GlobalLogic

API gateway handling 1M+ requests/day with circuit breakers and rate limiting
Real-time event processing: 100k+ events/day through Redis Streams and message queues
Database optimization: query performance tuning that reduced p95 response time from 800 ms to 120 ms
Monitoring implementation: 20+ custom Prometheus metrics with Grafana dashboards and PagerDuty alerts

Education

M.Sc. in Computer Science

V. N. Karazin Kharkiv National University

2009 - 2011

Software Engineering Management

EPAM School

2019

Business Setup & Legal

Contracting

UK LTD available for contracting
Comfortable with W-8BEN-E for US clients
Standard business terms available

Payments

Wise Business or Revolut Business
Crypto: USDC/USDT (ERC-20, SOL, TRX)
Invoices in USD, EUR, or GBP
Standard contracting terms available

Languages

English: Fluent

Ukrainian: Native

Russian: Native