Software Engineer, Supercomputing Scalability

OpenAI
San Francisco Full-day Full-time

Description:

About the Team

Supercomputers scale vertically. The workloads are synchronous and cluster-scale. These conditions demand a novel approach to cluster infrastructure, and it is the work of the Supercomputing Scalability Pillar to invent it. The focus is on scaling beyond k8s-supported node counts, deploying cluster-wide releases rapidly and atomically, providing comprehensive telemetry into the health and activity of the cluster, and rapidly onboarding new supercomputing systems with bleeding-edge hardware ...
Apr 26, 2024; from: dice.com

Similar jobs

Description: About the Team The Supercomputing Scheduling Pillar at OpenAI is dedicated to ensuring the reliability, scalability, and user-friendliness of job lifecycle management, with an emphasis on efficient and flexible job scheduling, quota ...
9 days ago
Description: About the Team We believe that increasing compute is a huge lever to AI progress. The Supercomputing team owns the entire process of building OpenAI's compute and infrastructure. This includes the deployment of huge clusters using Kubernetes ...
9 days ago
  • BCforward
  • San Francisco
$75 - $85 an hour
Description: Software Engineer BCforward is currently seeking a highly motivated Software Engineer for a San Francisco, CA - Remote opportunity. Position Title: [Software Engineer] Location: [San Francisco, CA] - Remote Anticipated Start Date: [Apr 29th, ...
26 days ago
  • Maxonic, Inc.
  • San Francisco
Description: Maxonic maintains a close and long-term relationship with our direct client. In support of their needs, we are looking for a Senior Network Software Engineer. Job Title: Senior Network Software Engineer. Job Location: San Francisco Bay Area ( ...
10 days ago