As a member of the Technology team, the Dev Ops Engineer is responsible for ensuring the smooth operation and stability of Linux-based servers and workstations, applications, and services in a heterogeneous CentOS, Windows 7/10, and OS X environment. Working in concert with the core pipeline and pipeline tools teams, the Dev Ops Engineer will support a 24/7 facility with 700+ workstations and servers and a render farm accessing PixStor/NetApp/Cloud storage across multiple studios in a WAN environment.
The Dev Ops Engineer will work with the Lead Software Engineer and take direction from the Systems Manager and Director of Technology to implement solutions that streamline processes and improve production capabilities within the studio. They will bring forth innovative ideas to take Bardel Entertainment to the next level while maintaining the stability and security needed to ensure all production requirements are met.
What you will be doing:
- Implement systems that are highly available, scalable, and self-healing
- Liaise with core pipeline and pipeline tools departments to test and implement new initiatives
- Test and modify systems and services to ensure that they operate reliably
- Implement and manage continuous delivery systems and methodologies
- Understand, implement, and automate security controls and governance processes for compliance
- Define and deploy central monitoring, metrics, and logging systems for servers and render nodes
- Design, manage, and maintain tools to automate operational processes
- Create, update and deploy workstation, render node and server OS images
- Create and implement test plans and deployments for major OS version release upgrades
- Maintain and deploy operating system patches and updates
- Troubleshoot and optimize Linux servers and installed software
- Validate patches, applications, and new software version releases against existing systems
- Maintain and modify scripts and programs to support business processes.
- Document system installations, configurations, policies and procedures.
- Create and update documentation, including maintenance logs and end user training / onboarding
- Perform upgrades and version management of 3rd party applications
What you bring with you:
- Must have solid knowledge of and a minimum of 3 years’ experience in Linux/Unix administration. Linux Certification preferred (RHCE or equivalent)
- Experience using LDAP integration with AD in a heterogeneous Windows/Linux environment
- Experience using Kubernetes, Foreman, Puppet, Chef, Vagrant, Ansible for configuration management
- Experience with Nagios, Ganglia, Logstash, and Elasticsearch
- Familiarity with installing and maintaining SQL databases (MySQL/PostgreSQL)
- Experience managing and configuring workstations to be accessed via zero client technologies
- Experience with Red Hat Enterprise Virtualization / VMware hypervisors (ESX) and Docker deployments
- Familiarity with version control systems: Git, SVN, CVS, and Perforce
- Programming in Python, PHP, Perl, C++
- Must be an enthusiastic self-starter who works well in a team environment.
- Ability to multitask while managing and prioritizing multiple projects.
- Excellent verbal and written communication.
- Strong interpersonal skills required.
- Ability to quickly assess complex problems and make critical decisions to resolve them.
- Experience with Security Control Audit requirements and file system security is an asset
- Experience managing network switches is an asset
- Experience in an Animation/VFX production environment preferred.
- Occasional after-hours and weekend work should be expected in this role to facilitate systems maintenance
Extra points if you have:
- Experience with Security Control Audit requirements, file system security
- Juniper and Brocade management experience
- Experience managing large-scale render farms (hundreds of nodes) including rapid image deployments, monitoring, and HP iLO / IPMI management