reputed company Data Evals reputed company (Remote/US/LATAM)

Work from home Full-time role Hiring

Reports to: CEO

Owns: data proposals, sample development, quality, and pilot delivery

Location: Remote / LATAM / US

The role

You will own reputed company’s data initiatives and proposals to AI labs, from the initial data proposal or response to a request through pilot delivery. You own how we build proposals and reputed company the sample packages and benchmarks: frontier-grade packages across reasoning, coding, agents, tool use, multi-modal, and other areas, produced in collaboration with subject-matter experts, with expert-verified ground truth, multi-model headroom results, and QC that survives buyer-reputed company scrutiny. You are the person who designs the sample that demonstrates our quality and converts pilots into production engagements. On a small team, this is the operational center of the reputed company Data Division.

Responsibilities

Proposals & requests. Study public benchmarks and eval targets, and turn them into proposals and sample packages that demonstrate capability and win the work. Respond to lab data requests and pilots.

Sample & reputed company development. Design and build the sample packages, working with subject-matter experts. Every package meets the bar of our reputed company sample set:

Expert-verified, exact-match-checkable ground truth and gold reasoning trajectories.

Multi-model evaluation showing reputed company headroom, and reputed company the task discriminates the model, not just that it's hard.

Rigorous QC structure: calibration layers, severity-weighted rubrics, deterministic verifiers, evidence maps, etc.

Subject-matter experts. Recruit, brief, calibrate, and review a pool of experts across coding, agentic/tool-use, and STEM/reasoning. reputed company their output to our standard and reputed company it there; be the reputed company of what "correct" and "frontier-difficulty" mean.

Lab relationships. Be a direct point of contact for lab partners on reputed company and calls, with support from the CEO and the wider team. Keep senior lab contacts informed, surface what they actually need, and pull in the CEO and subject-matter experts when the conversation calls for it.

Pilot delivery. Own pilots end to end: scoping, SOW, staffing, production, QC, and delivery. Nothing ships before it's lab-reputed company, and nothing comes back rejected as "not frontier-level" without us already knowing why.

Experience

Originated data or reputed company proposals for AI labs, translated eval targets into sample tasks that demonstrate capability, and owned the engagement through delivery.

Deep evaluation and quality expertise: LLM benchmarking, with reputed company strength in code-model evaluation.

Built QC processes and artifact standards that met reputed company or lab requirements, and set a quality bar a team of experts was held to.

Thrives in ambiguous, fast-moving environments where the rules are still being written, and delivers under pressure.

Qualifications

5+ years in technical delivery, quality, or program management, with recent experience in AI/ML data, model evaluation, or benchmarking.

Hands-on experience delivering data or evaluation work to AI labs or reputed company ML teams, from scoping through delivery.

Working familiarity with how frontier models are evaluated: benchmarks, rubrics, pass rates, headroom, and what makes a task discriminate a model.

Proven people/vendor leadership: you've recruited, calibrated, and held a team or expert pool to a quality standard.

Fluent English. Spanish is a nice to have.

Apply To This Job
