About this role
Platform Validation & Reliability Engineer
Engagement Overview
We are seeking a highly skilled contractor to drive software validation, test reliability, and data-quality initiatives across our development vehicle platforms. This role is instrumental in ensuring the integrity of our development-embedded systems. You will work at the intersection of Hardware-in-the-Loop (HIL) testing, automated pipelines, and release-readiness evidence to improve confidence in our platform's deployment capabilities.
Primary Location Timezones: UTC+0 (UK) & UTC-8 (CA)
Key Responsibilities
Test Integrity & Triage
- Execute and support validation activities across diverse development platforms with different hardware and software configurations.
- Triage test and pipeline failures to distinguish genuine firmware defects from issues rooted in bench hardware, tooling, or infrastructure.
- Identify recurring failure themes to improve system-wide debugging efficiency.
- Collaborate with internal software and platform teams to expand test coverage and reduce manual investigation overhead.
Tooling & Automation
- Develop and refine automation scripts, dashboards, and operational runbooks to scale validation workflows.
- Enhance test reliability across Hardware-in-the-Loop, Software-in-the-Loop, and Model-in-the-Loop environments.
- Support data-quality validation checks, including logging, MCAP, offload, and ingest processes.
Professional Profile
Core Competencies
- Technical Foundations: Proven experience in software validation, test automation, or reliability engineering within complex systems.
- Systems Debugging: Comfortable navigating CI/CD pipelines, Linux-based environments, and distributed test infrastructure.
- Programming: Proficiency in Python; an understanding of C/C++ is a plus.
- Operational Mindset: Ability to pick up context quickly in ambiguous environments and document processes clearly.
Preferred Qualifications
- Background in HIL, embedded systems, robotics, or automotive validation.
- Familiarity with sensor data validation and observability/monitoring workflows.
- Experience building lightweight internal tools or triage dashboards.
Defining Success
Success in this engagement is measured by:
1. Efficiency: A measurable reduction in the time required for pipeline failure triage.
2. Clarity: Increased visibility into pipeline health and ownership through structured reporting.
3. Confidence: Enhanced validation evidence that directly supports platform and release-readiness milestones.
4. Scalability: A reduction in manual effort through the implementation of repeatable automation and runbooks.
Originally posted on Himalayas