Senior AI Engineer
VoiceFlip
Software Engineering, Data Science
Guayaquil, Guayas, Ecuador · Provins, France
Posted on Mar 14, 2026
Senior AI Engineer (RAG / MCP / AI Agents / Small Language Models)
Location: Argentina (Remote)
Compensation: Up to $2,000 USD/month
Language: Fluent English (required)
Overview

We are hiring a Senior AI Engineer with strong software engineering fundamentals and hands-on experience building production-grade AI systems.
This is a systems-level role. We are looking for someone who understands backend architecture, performance optimization, distributed systems, and clean engineering practices, and who can apply those principles to LLM-powered and small language model (SLM) applications.
Core Responsibilities

- Architect and implement scalable RAG systems (chunking strategies, embedding pipelines, retrieval tuning)
- Design service layers around LLM and SLM integrations
- Build and maintain MCP-compliant tool integrations
- Develop AI agents capable of multi-step reasoning and tool orchestration
- Implement and optimize small language model deployments (local inference, quantization, fine-tuning)
- Improve latency, cost efficiency, and reliability across AI systems
- Implement logging, evaluation, and guardrail frameworks for AI outputs
- Contribute to infrastructure decisions (cloud vs. local inference tradeoffs)

Requirements

- 5+ years of professional software engineering experience
- Strong backend development experience (Python preferred)
- Solid understanding of:
- Distributed systems fundamentals
- API design and integration
- Data modeling
- Performance optimization
- Secure system design
- Hands-on experience with:
- RAG architectures
- Vector databases
- LLM APIs (OpenAI, Anthropic, Azure OpenAI, etc.)
- Small language models (e.g., Llama variants, Mistral, Gemma, etc.)
- Model quantization or local inference setups
- Experience building or integrating AI agents
- Fluent English (must be comfortable in technical discussions)

Nice to Have

- Experience with model fine-tuning (LoRA, PEFT, etc.)
- Experience running models on local hardware (GPU nodes, on-prem inference, etc.)
- Familiarity with prompt injection mitigation and AI security
- Experience designing evaluation harnesses for LLM/SLM outputs
- Engineering maturity and architectural thinking
- Comfort balancing cost vs. performance trade-offs
- Ability to move from prototype to hardened production system
- Startup execution mindset