Cosette Network

Conversational AI Agent Model Evaluation with AI

Bangalore, Gurgaon, Noida, Pune | Hybrid | 6.0 - 13.0 years


Job Description

Job Title: Senior Conversational AI Agent Model Evaluation with AI

Location: Noida/Gurgaon/Pune/Bangalore

Experience: 6-14 Years

Role Overview

We are seeking a highly skilled senior professional in Conversational AI agent model evaluation to lead end-to-end testing and quality assurance of Voice AI / Conversational AI solutions across industries such as Healthcare, Banking, Insurance, and Utilities, with a primary focus on US-based customer journeys.

This role combines domain understanding, AI-driven testing, simulation design, and insights generation to ensure enterprise-grade performance, compliance, and customer experience for voice AI deployments.

---

Key Responsibilities

1. Conversational AI & Domain Understanding

· Develop a deep understanding of Voice AI / Conversational AI architectures, including NLU/NLP pipelines, dialog management, and integrations.

· Analyze customer journeys across industries (Healthcare, BFSI, Insurance, Utilities, etc.) with strong awareness of US market nuances, compliance requirements, and user behavior patterns.

· Collaborate with product, design, and development teams to ensure test scenarios align with real-world customer journeys.

---

2. AI-driven Testing & Tool Utilization

· Lead end-to-end testing using AI-powered quality platforms such as:

o Cyara / Hammer / Bluejay / similar tools

· Design, execute, and manage:

o Functional testing

o Conversational flow validation

o Regression testing for NLP models

o Load and simulation testing for voice systems

· Ensure coverage across multi-turn conversations, edge cases, and failure scenarios.

---

3. Simulation, Evals & Custom Metrics Design

· Configure and run simulation frameworks to mimic real user interactions at scale.

· Define and implement:

o Evals (evaluation frameworks)

o Custom quality metrics (intent accuracy, containment, fallback rates, sentiment proxies, etc.)

· Align evaluation metrics with business KPIs and customer experience goals.

· Leverage the AI agent’s knowledge base and training data to design realistic test scenarios.

---

4. Transcript Analysis & Insight Generation

· Perform deep analysis of:

o Call transcripts

o Conversation logs

o AI-generated responses

· Identify:

o Intent misclassification

o Dialog breakdown points

o Knowledge gaps

o UX friction points

· Convert findings into structured recommendations for product, design, and engineering teams.

---

5. Dashboarding & Reporting

· Design and build insight-driven dashboards to:

o Highlight defects and performance gaps

o Quantify customer impact

o Track quality trends over time

· Present actionable insights to:

o Client stakeholders (business impact)

o Development teams (technical root cause)

· Enable data-driven prioritization of fixes and enhancements.

---

Preferred Experience & Qualifications

Experience

· 6–10 years of experience in:

o QA / Testing / Quality Engineering

o Conversational AI / Voice Bots / Contact Center Automation

· Hands-on experience with:

o AI testing platforms (Cyara, Hammer, Bluejay, or equivalent)

o Simulation frameworks and conversational testing tools

· Experience working with US clients or products serving US customers is highly preferred.

---

Educational Qualification

· B.Tech / BE (Computer Science, IT, Electronics, or related field)

---

Core Skills

Technical Skills

· Strong understanding of:

o NLP/NLU concepts (intent, entity recognition, confidence scores)

o Voice AI systems and telephony integrations

· Experience in:

o Test automation frameworks

o Data analysis (Excel, Python, or equivalent tools preferred)

o Dashboarding tools (Power BI, Tableau, Looker, etc.)

---

Functional & Analytical Skills

· Ability to connect technical defects with business/customer impact

· Strong analytical mindset with experience in transcript-driven insights

· Experience defining custom KPIs and evaluation metrics

---

Soft Skills

· Strong stakeholder communication (client + internal teams)

· Ability to translate insights into clear action plans

· Proactive problem-solving approach with attention to detail

Skills

Required

Voicebot Testing, Automation Framework, Excel, Python, Power BI