AI Engineer

6 days ago


Athens, Attica, Greece | Code | Full time | €45,000 - €75,000 per year

We are the 1st Hub for Developers. Our motto is "From Developers to Developers".
Our vision is to provide real career opportunities for candidates who want to take the next step in their career. Code.Career is the first hiring process in which you speak only with developers and tech (freak) experts.

Position Overview

Our Client is seeking a detail-oriented LLM Quality Analyst to join their AI team. You will be responsible for designing, implementing, and managing comprehensive testing and evaluation frameworks for their generative AI products. This role is critical to ensuring their AI systems meet the highest standards of quality, accuracy, safety, and ethical compliance before reaching production.

Key Responsibilities

Testing Framework Development

  • Design and implement comprehensive testing frameworks for LLM and generative AI applications
  • Create curated benchmark suites using industry-standard datasets (TruthfulQA, ARC, TriviaQA, MMLU)
  • Develop custom evaluation datasets tailored to specific use cases and domains
  • Build synthetic data generation pipelines for edge case testing
  • Define clear evaluation criteria, rubrics, and quality metrics
  • Establish testing protocols for different AI model types and applications

Quality Evaluation & Testing

  • Execute automated and manual evaluations of AI model outputs
  • Measure and track key quality metrics: relevance, factual consistency, coherence, hallucination rate
  • Assess model performance across dimensions: accuracy, latency, fairness, toxicity, bias
  • Perform functional correctness testing for code generation and structured outputs
  • Conduct A/B testing and shadow testing for model comparisons
  • Validate prompt engineering strategies and RAG pipeline effectiveness

Feedback Management & Issue Tracking

  • Manage intake of user-reported issues and feedback
  • Translate user feedback into reproducible test cases
  • Replicate and document bugs, edge cases, and failure modes
  • Log and track defects in issue-tracking systems (Jira, Linear, GitHub Issues)
  • Prioritize issues based on severity, frequency, and business impact
  • Collaborate with engineering teams on root cause analysis and resolution

Human Annotation & LLM-Based Evaluation

  • Coordinate human annotation efforts with clear guidelines and rubrics
  • Implement overlapping review processes to ensure annotation consistency
  • Integrate LLM-based evaluators (GPT-4, Claude) for automated quality assessment
  • Design evaluation prompts that provide structured scores and reasoning
  • Validate LLM evaluator outputs against human judgments
  • Continuously refine evaluation methodologies based on findings

Monitoring & Reporting

  • Monitor live-traffic metrics through observability dashboards
  • Track model performance trends and identify quality regressions
  • Generate comprehensive quality reports for product and engineering teams
  • Present findings and recommendations to stakeholders
  • Maintain documentation of testing procedures and evaluation results
  • Drive continuous improvement of AI system quality

Required Qualifications

Education

  • Bachelor's degree in Computer Science, Data Science, Linguistics, Cognitive Science, or related field
  • Master's degree in AI/ML, NLP, or related field (preferred)

Experience

  • 3+ years of experience in quality assurance, testing, or evaluation roles
  • 2+ years working with AI/ML systems, preferably LLMs or NLP applications
  • Experience designing and executing test plans for software or AI products
  • Proven track record of identifying and documenting complex technical issues

Technical Skills

  • Programming: Proficiency in Python for test automation and data analysis
  • LLM Knowledge: Understanding of how LLMs work, their capabilities and limitations
  • Evaluation Frameworks: Familiarity with LLM evaluation tools (Langfuse, DeepEval, RAGAS, Phoenix)
  • Data Analysis: Experience with pandas, numpy, and data visualization tools
  • Testing Tools: Knowledge of pytest, unittest, or similar testing frameworks
  • Issue Tracking: Proficiency with Jira, Linear, GitHub Issues, or similar platforms
  • APIs: Ability to work with REST APIs and LLM provider APIs (OpenAI, Anthropic)
  • SQL: Basic SQL skills for querying databases and analyzing results

Quality Assurance Expertise

  • Strong understanding of QA methodologies and best practices
  • Experience with test case design and test coverage analysis
  • Knowledge of different testing types: functional, regression, integration, performance
  • Familiarity with CI/CD pipelines and automated testing integration
  • Understanding of metrics and KPIs for quality measurement

AI/ML Evaluation Knowledge

  • Understanding of common LLM evaluation metrics (BLEU, ROUGE, BERTScore, perplexity)
  • Knowledge of bias detection and fairness evaluation techniques
  • Familiarity with hallucination detection and factual consistency checking
  • Understanding of prompt engineering and its impact on model outputs
  • Awareness of AI safety, ethics, and responsible AI principles

Core Competencies

  • Exceptional attention to detail and analytical thinking
  • Strong problem-solving and critical reasoning abilities
  • Excellent written and verbal communication skills
  • Ability to work independently and manage multiple priorities
  • Collaborative mindset for cross-functional teamwork
  • Curiosity and willingness to learn new AI technologies

Preferred Qualifications

  • Experience with specific LLM evaluation platforms (Langfuse, Weights & Biases, Arize)
  • Knowledge of human-in-the-loop evaluation workflows
  • Familiarity with RAG systems and vector database evaluation
  • Experience with adversarial testing and red-teaming for AI systems
  • Understanding of model fine-tuning and its quality implications
  • Background in linguistics, cognitive science, or human-computer interaction
  • Experience with statistical analysis and hypothesis testing
  • Knowledge of regulatory requirements (GDPR, AI Act) for AI systems
  • Contributions to AI evaluation research or open-source projects
  • Experience with multimodal AI evaluation (text, image, audio)

What They Offer

  • Competitive salary and benefits package
  • Comprehensive health, dental, and vision insurance
  • Professional development opportunities in AI/ML
  • Flexible work arrangements (remote/hybrid options)
  • Access to cutting-edge AI technologies and tools
  • Collaborative team environment with AI experts
  • Opportunity to shape quality standards for innovative AI products
  • Conference attendance and learning budget
