February 22, 2024

Opportunity: Data Scientist Consultant (LLM Evaluation)

We’re looking for an impact-minded Data Scientist to help advance the Lab’s work on generative AI and consumer-centric conversational experiences. In this consulting assignment, you will help us by developing evaluation frameworks to ensure the response quality of Consumer Reports’ first conversational AI product line.

The assignment will involve building out CR’s infrastructure for testing and refining a new conversational product that CR is bringing to market. This will be a three-month contract with the possibility of extension. You’ll learn a lot about product innovation at non-profit enterprises, and work with multi-disciplinary, cross-functional colleagues to harness technology in service of consumers.

What you will do:

Develop and refine evaluation frameworks tailored to assess the response quality of large language models (LLMs) in a conversational AI context.
Collaborate with the Innovation Lab engineers and product teams to integrate evaluation findings into the product development lifecycle, ensuring continuous improvement.
Contribute to the creation and maintenance of a comprehensive test suite and infrastructure to facilitate ongoing quality assurance and performance benchmarking.
Stay up to date on the latest advancements in LLM evaluation techniques and tools, recommending adaptations where beneficial.
Promote “evaluation-driven development” practices by sharing insights and best practices with internal teams and the broader community through reports, presentations, and publications.

About you:

You have experience building large-scale evaluation frameworks, preferably with natural language processing (NLP), ideally with LLMs.
You have at least two years of experience working as a Data Scientist, preferably with a focus on NLP or evaluations, ideally with LLMs.
You possess a strong foundation in data science principles, statistical analysis, and machine learning, with specialized expertise in one or more relevant domains.
You are a curious, fast learner, able to quickly incorporate the newest LLM evaluation techniques and tools.
You’re fascinated by potential applications of generative AI and thoughtful about how it can be developed responsibly.
You have a collaborative spirit and are eager to contribute your expertise alongside CR staff, consultants, vendors, and other stakeholders.

If this sounds like you, send a note and your resume to innovationlab@cr.consumer.org. In your email, let us know which conversational AI or generative AI experiences have your attention, and why you believe you’d be a good fit for this position.