Diversified Services Network, Inc. (DSN) is seeking a full-time Sr. Server Administrator to join our team in Dallas, TX; Peoria, IL; Phoenix, AZ; Broomfield, CO; or Cary, NC.
We offer a hybrid work model, full benefits, PTO, 401(k), and more! If you're looking to grow your technical career within an extremely reputable, stable Fortune 500 company, let's talk!
W2 ONLY – Absolutely NO C2C (will NOT respond to vendors).
JOB RESPONSIBILITIES:
- Administer and maintain GPU-accelerated servers and clusters, including NVIDIA A100, H100, and other high-end GPUs.
- Manage and optimize NVIDIA software stack components such as CUDA, cuDNN, TensorRT, NCCL, and NGC containers.
- Monitor system performance, troubleshoot hardware/software issues, and ensure high availability of AI infrastructure.
- Collaborate with DevOps and AI teams to support containerized workflows (Docker, Kubernetes) and distributed training environments.
- Implement security best practices and ensure compliance with internal and external standards.
- Lead upgrades, patching, and lifecycle management of GPU servers and related infrastructure.
- Provide documentation, automation scripts, and training for internal teams.
EDUCATION:
- Bachelor’s degree with a minimum of 8 years of work experience, including 5+ years in server administration, with at least 3 years focused on NVIDIA GPU-based systems.
REQUIRED SKILLS:
- 5+ years of experience in server administration, with at least 3 years focused on NVIDIA GPU-based systems.
- Deep understanding of Linux system administration, especially in HPC or AI environments.
- Hands-on experience with NVIDIA GPU drivers, CUDA toolkit, and performance tuning.
- Familiarity with Slurm, Kubernetes, or other job scheduling and orchestration tools.
- Experience with monitoring tools (e.g., Prometheus, Grafana) and infrastructure automation (e.g., Ansible, Terraform).
- Strong scripting skills (Bash, Python, etc.).
- Excellent problem-solving and communication skills.
DESIRED SKILLS:
- NVIDIA Certified Professional or similar credentials.
- Experience with multi-GPU and multi-node training setups.
- Familiarity with AI/ML frameworks (e.g., PyTorch, TensorFlow) and their GPU dependencies.
- Exposure to cloud-based GPU infrastructure (AWS, Azure, GCP).
BENEFITS:
- 401(k)
- Dental insurance
- Vision insurance
- Disability insurance
- Employee assistance program
- Health insurance
- Health savings account
- Life insurance
- Paid time off
- Paid holidays
Please follow the link to our website for a list of job openings in Engineering, IT, Project Management, and more! https://www.dsnworldwide.com