I build reliable AI systems using prompt engineering, structured workflows, and LLM evaluation frameworks.
Prompt Architecture
• structured prompting
• workflows for complex LLM behaviors
• agent prompts
LLM Systems
• prompt learning
• tool-using agents
• evaluation pipelines
AI / ML Consulting
• AI-native product design
• LLM reliability
• safety & guardrails
LLM Knowledge Systems
• Built a production-ready hallucination-mitigation agent that filtered out 97-98% of false outputs (one week before Google DeepMind)
Model Evaluations
• Developed a proprietary, in-house 100-unit model evaluation suite to stress-test a two-part system: generalized intelligence and enterprise/market outputs
AI Systems at Scale
• Created prompt architecture for AI discovery and routing systems, including system prompts & dynamic user-context injection, improving tool-routing reliability, reducing hallucination-risk scenarios, & strengthening “what should I do next?” user guidance
AI Translation/Translocation
• Devised a GenAI translation system spanning 42+ languages, enabling dynamic/formal equivalence, localization, & regional nuance to deliver near-native accuracy for global enterprise contexts
LLM Behavior
• Synthesized cross-surface prompt contracts defining global vs. local prompting responsibilities, improving system coherence, preventing duplication and prompt drift, and enabling consistent AI behavior across product experiences
Core Skills
• Prompt Engineering
• Prompt Architecture
• AI Agent Systems
• Evaluation Frameworks
• LLM Guardrails
Tools
• Anthropic Console
• OpenAI Playground
• Python
• Markdown
• ChatML
Languages
• Persian (native / fluent)
• Portuguese (fluent)
• Spanish (fluent)
• Italian (advanced)
• Modern Standard Arabic (advanced)
• Levantine Spoken Arabic (advanced)
• French (intermediate)
• Dakota (beginner)
• Latin (reading)
• Hebrew (limited)