This proposal recommends evolving our Generative AI platform’s backend to enhance agility and long-term scalability. While our current system—built with a Hugging Face UI and liteLLM proxy—is effective, it introduces development friction. To address this, we propose migrating our core Retrieval-Augmented Generation (RAG) and future agentic services from TypeScript to Python. This strategic move aligns us with the broader AI ecosystem, enabling faster innovation through Python’s mature GenAI frameworks, better debugging and observability, and access to top-tier AI/ML talent.
The migration will be executed incrementally using the Strangler Fig pattern and canary deployments on our existing EKS and Terraform infrastructure to ensure zero downtime. Approval and resource allocation for Phase 1 will establish a solid Python-based foundation for future GenAI development.
## 1. Why Move to Python
### 1.1 Rich GenAI Ecosystem
Python is the de facto standard language for AI and ML. Virtually all leading AI libraries are Python-first:
- LLM frameworks: LangChain, LlamaIndex, Haystack
- Embeddings: SentenceTransformers, OpenAI, Cohere
- Inference: Hugging Face Transformers, vLLM, FastAPI integration
- Agents: AutoGPT, BabyAGI, CrewAI, LangChain Agents
In contrast, the TypeScript/Node ecosystem trails in features and documentation. Migrating RAG and agentic components to Python gives immediate access to mature, stable, and feature-rich libraries.
### 1.2 Better RAG & Agent Support
Most cutting-edge retrieval and agent frameworks debut in Python and may take months to reach TypeScript parity. Python also has direct support for Model Context Protocol (MCP) integration and advanced chaining frameworks, allowing us to adopt innovations early.
This ensures future readiness as GenAI frameworks evolve rapidly.
### 1.3 Easier Debugging & Iteration
RAG pipelines and agent flows are experimental and data-heavy. Python offers:
- Simple, interactive debugging (via IPython, pdb, or IDE breakpoints)
- No compile-step delay—ideal for rapid experimentation
- Dynamic typing with optional strict validation (Pydantic)
- Mature data exploration tools (Jupyter, Pandas, NumPy)
Debugging a multi-step RAG chain or model inference flow becomes direct and observable, unlike TypeScript’s verbose build-debug cycles.
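To make the validation point concrete, here is a minimal Pydantic sketch; the `RetrievedChunk` model is a hypothetical example, not part of our codebase:

```python
from pydantic import BaseModel, Field, ValidationError

class RetrievedChunk(BaseModel):
    """One retrieved passage plus its relevance score."""
    doc_id: str
    text: str = Field(min_length=1)
    score: float = Field(ge=0.0, le=1.0)

try:
    # A malformed chunk from an upstream retriever fails loudly here,
    # instead of silently propagating through the RAG pipeline.
    RetrievedChunk(doc_id="kb-42", text="", score=1.7)
except ValidationError as exc:
    print(exc)
```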
### 1.4 Talent & Ecosystem Alignment
Python dominates AI hiring and open-source contributions.
Most GenAI engineers, data scientists, and academic researchers are fluent in Python, enabling faster onboarding and broader collaboration.
Even AI coding assistants (e.g., Copilot, ChatGPT) produce stronger completions and debugging help for Python than for most other languages.
### 1.5 Infrastructure Compatibility
Python integrates seamlessly with AWS and Kubernetes:
- SDKs: `boto3`, `awscli`, `kubernetes`, `opensearch-py`
- Model serving: FastAPI, Flask, or BentoML
- Inference scaling: Uvicorn + Gunicorn with async I/O
- CI/CD: Easily containerized and deployable via Terraform-managed pipelines
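As a sketch of how little glue code this integration requires (the bucket, key, and namespace names below are placeholders):

```python
import boto3
from kubernetes import client, config

# Read a document from S3 using the standard AWS SDK.
s3 = boto3.client("s3")
body = s3.get_object(Bucket="my-genai-corpus", Key="docs/faq.md")["Body"].read()

# List pods in the service's namespace using the official Kubernetes client.
config.load_kube_config()  # or config.load_incluster_config() inside EKS
for pod in client.CoreV1Api().list_namespaced_pod("genai").items:
    print(pod.metadata.name, pod.status.phase)
```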
### 1.6 Performance
Python runs primarily on CPython, an interpreter implemented in C with a Global Interpreter Lock (GIL). In practice, Python orchestrates work while performance-critical parts execute in native extensions (C/C++ today, increasingly Rust). The ecosystem already leans this way: pydantic-core, Ruff, Polars, and the uv package manager use Rust for speed and safety. This trend will grow—Rust is the obvious choice for new high-throughput parsers, vector/IO pipelines, and CPU-bound utilities exposed to Python via PyO3/maturin or split into sidecar microservices.
Pragmatic roadmap:
- Keep Python for RAG orchestration, APIs (FastAPI), and SDK glue.
- Push hotspots to Rust when you hit latency/CPU ceilings (e.g., chunking, embedding pre/post-processing, reranker features, fast JSON/Parquet IO).
- Use Rust crates behind stable Python interfaces; ship manylinux/musllinux wheels for smooth CI/CD.
- Reserve pure Rust services for truly parallel, stateful workloads that benefit from no-GIL and tighter memory control.
Net: Python remains the center for developer velocity and ecosystem fit; Rust is the future-proof accelerator you bolt on where it measurably pays off.
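As one concrete instance of "Rust crates behind stable Python interfaces": `orjson`, a Rust-backed JSON library distributed as prebuilt wheels, is a near drop-in speedup for serializing chunk payloads, a common RAG hotspot. A minimal sketch:

```python
import json
import orjson  # Rust-backed JSON, installed from a prebuilt manylinux wheel

chunks = [{"doc_id": f"kb-{i}", "text": "lorem " * 20, "score": 0.5} for i in range(10_000)]

# Same logical operation, two implementations; orjson typically serializes
# several times faster because the hot loop runs in Rust, not the interpreter.
stdlib_payload = json.dumps(chunks).encode()
rust_payload = orjson.dumps(chunks)
assert orjson.loads(rust_payload) == json.loads(stdlib_payload)
```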
## 2. Migration & Implementation Plan
### Phase 1: Establish Python GenAI Service
- Create a new Python microservice using FastAPI (async, lightweight, scalable).
- Define clear API endpoints for RAG/agentic functions.
- Use `requirements.txt` for dependency tracking.
- Develop locally using Conda or `uv` (for fast installs and isolation).
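A minimal sketch of the Phase 1 service skeleton; the endpoint path, models, and stubbed answer are illustrative choices, not final API design:

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="GenAI RAG Service")

class QueryRequest(BaseModel):
    question: str
    top_k: int = 4

class QueryResponse(BaseModel):
    answer: str
    sources: list[str]

@app.post("/rag/query", response_model=QueryResponse)
async def rag_query(req: QueryRequest) -> QueryResponse:
    # Phase 2 replaces this stub with real retrieval + generation.
    return QueryResponse(answer=f"stub answer for: {req.question}", sources=[])
```

Run it locally with `uvicorn main:app --reload --port 8000`; FastAPI serves interactive docs at `/docs` automatically.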
### Phase 2: Reimplement RAG Logic
- Replace existing TypeScript RAG modules with Python equivalents using:
  - LangChain or LlamaIndex for orchestration
  - FAISS or Chroma for vector indexing
  - OpenAI and Hugging Face SDKs for model access
- Add unit and integration tests to validate accuracy vs. existing implementation.
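For example, the retrieval core might look like this sketch using SentenceTransformers and FAISS; the model name and toy corpus are illustrative:

```python
import faiss
from sentence_transformers import SentenceTransformer

corpus = ["LiteLLM proxies many model providers.", "FAISS does fast vector search."]
model = SentenceTransformer("all-MiniLM-L6-v2")

# Embed the corpus and build an in-memory L2 index over the vectors.
embeddings = model.encode(corpus, convert_to_numpy=True).astype("float32")
index = faiss.IndexFlatL2(embeddings.shape[1])
index.add(embeddings)

# Retrieve the top-2 chunks for a query.
query = model.encode(["Which library handles vector search?"], convert_to_numpy=True).astype("float32")
distances, ids = index.search(query, 2)
print([corpus[i] for i in ids[0]])
```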
### Phase 3: Local Integration
- Point the Svelte frontend to the local FastAPI service (`localhost:8000`).
- Validate the full conversational flow (proxy → RAG → LLM → UI).
- Debug interactively using FastAPI's built-in docs (`/docs`) and logs.
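The flow can also be validated from a script; here is a minimal probe using `requests`, matching the hypothetical endpoint shape from the Phase 1 sketch:

```python
import requests

resp = requests.post(
    "http://localhost:8000/rag/query",
    json={"question": "What does the proxy route to?", "top_k": 4},
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```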
### Phase 4: Containerization
- Create a minimal Dockerfile:

```dockerfile
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
```
- Build and test locally:

```bash
docker build -t genai-python:latest .
docker run -p 8000:8000 genai-python:latest
```
### Phase 5: Terraform-Driven EKS Deployment
Use Terraform to define:
- ECR repository for images
- EKS deployment (`kubernetes_deployment` + `kubernetes_service`)
- Secrets (API keys, environment variables)
- Network ingress routes
Example Terraform snippet:
resource "kubernetes_deployment" "genai_rag" {
metadata { name = "genai-rag" }
spec {
replicas = 2
selector { match_labels = { app = "genai-rag" } }
template {
metadata { labels = { app = "genai-rag" } }
spec {
container {
image = var.image_url
name = "genai-rag"
ports { container_port = 8000 }
}
}
}
}
}
### Phase 6: CI/CD Integration
GitHub Actions / GitLab CI pipeline:
- Lint and test Python code
- Build and push Docker image to ECR
- Run `terraform plan` and `terraform apply`
- Deploy to EKS
- Run post-deploy health checks
This achieves one-command deploys, fully reproducible via Terraform.
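For the post-deploy health check step, a minimal Python probe might look like this; the `/healthz` URL and retry budget are assumptions, not existing config:

```python
import sys
import time
import requests

URL = "https://genai.example.com/healthz"  # hypothetical ingress route

# Poll the service after deploy; fail the pipeline if it never turns healthy.
for _ in range(10):
    try:
        if requests.get(URL, timeout=5).status_code == 200:
            print("healthy")
            sys.exit(0)
    except requests.RequestException:
        pass
    time.sleep(6)
sys.exit("service did not become healthy in time")
```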
### Phase 7: Gradual Migration
- Deploy Python RAG alongside the TypeScript version.
- Route a portion of traffic (A/B test) for validation.
- Monitor logs, accuracy, and latency.
- Once validated, retire the TypeScript backend.
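During the A/B window, a simple parity harness can replay the same prompts against both backends and log latency side by side; both URLs below are hypothetical:

```python
import time
import requests

OLD = "https://api.internal/ts-rag/query"      # hypothetical TypeScript endpoint
NEW = "https://api.internal/py-rag/rag/query"  # hypothetical Python endpoint

for question in ["How do I rotate API keys?", "What models does the proxy expose?"]:
    payload = {"question": question, "top_k": 4}
    for name, url in [("ts", OLD), ("py", NEW)]:
        start = time.perf_counter()
        answer = requests.post(url, json=payload, timeout=30).json()
        print(f"{name} {time.perf_counter() - start:.2f}s {answer.get('answer', '')[:80]}")
```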
## 3. Developer Tooling Recommendations

| Purpose | Recommended Tool |
| --- | --- |
| IDE | VS Code + Python extension, or PyCharm |
| Testing | `pytest`, `unittest`, `pytest-asyncio` |
| Debugging | VS Code debugger, `pdb`, or `ipdb` |
| Linting & formatting | `ruff`, `black`, `isort`, `flake8` |
| Env management | `uv`, `venv`, or `conda` |
| API docs | Auto-generated Swagger UI via FastAPI |
| Monitoring | `prometheus_client`, AWS CloudWatch Logs |
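As an example of the testing row, an endpoint test with `pytest` and FastAPI's `TestClient`, importing the hypothetical Phase 1 app from `main.py`:

```python
from fastapi.testclient import TestClient
from main import app  # the Phase 1 FastAPI application

client = TestClient(app)

def test_rag_query_returns_answer_and_sources():
    resp = client.post("/rag/query", json={"question": "ping", "top_k": 1})
    assert resp.status_code == 200
    body = resp.json()
    assert "answer" in body and "sources" in body
```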
## 4. Expected Outcomes
- Faster prototyping and debugging cycles for RAG and agentic development.
- Reduced technical debt and fewer context switches between TypeScript and Python.
- Stronger GenAI ecosystem alignment, enabling easier integration with upcoming frameworks.
- CI/CD maturity, leveraging Terraform for full infrastructure-as-code deployment.
- Better team scalability, with easier onboarding for AI developers familiar with Python.
## 5. References
- Isaac H., Lessons from the Trenches: Building AI Agents (Dev.to) – Python dominates GenAI tooling and frameworks.
- Reddit Dev Discussion – Python vs TypeScript for LLM startups: Python has better AI library support and available talent.
- Gabriel O., CI/CD Pipeline for Python on AWS EKS (Medium, 2024) – Terraform-managed GitHub Actions workflow for EKS deployment.
## 6. Conclusion
Migrating our GenAI backend (RAG and agentic components) to Python ensures alignment with the global AI community, accelerates development velocity, simplifies debugging, and enhances our CI/CD efficiency through Terraform-managed automation. The proposed phased migration allows for zero-downtime adoption, maintaining stability while future-proofing our GenAI architecture for upcoming MCP and agent-based workflows.