Lead Data Engineering in building a modern data platform that supports better business decisions and modernizes and accelerates the discovery of new medicines.
- Writes ETL (Extract / Transform / Load) processes, designs database systems and develops tools for real-time and offline analytic processing.
- Troubleshoots software and processes for data consistency and integrity. Integrates complex and large-scale data from various sources for business partners to generate insight and make decisions.
- Translates business specifications into design specifications and code. Responsible for writing complex programs, ad hoc queries, and reports. Ensures that all code is well structured, includes sufficient documentation, and is easy to maintain and reuse.
- Partners with internal clients to gain an expert understanding of business functions and informational needs. Works closely with other technical and data analytics experts across the business to implement data solutions.
- Leads all phases of solution development. Explains technical considerations at related meetings, including those with internal clients and less experienced team members.
- Assesses data quality and tests code thoroughly for accuracy of intended purpose. Provides data analysis guidance and serves as a technical consultant for the client.
- Educates and develops junior data engineers on the team while applying quality control to their work. Develops data engineering standards and contributes expertise to other data expert teams across Vanguard.
- Tests and implements new software releases through regression testing. Identifies issues and engages with vendors to resolve them and elevate software into production.
- Participates in special projects and performs other duties as assigned.
- Works with data scientists to incorporate Client models in the data analytics platform to derive intelligent insights.
- Optimizes existing data infrastructure, code, and data pipelines to reduce cost, increase efficiency, and improve scalability.
Qualifications and Experience
- A bachelor’s degree is required. A master’s degree in computer science/engineering or a related field is highly preferred.
- 10+ years of work experience in data/software engineering/analytics using cloud services like Azure, AWS, or GCP is required.
- Experience with AWS technologies such as Amazon S3, ECS, Lambda, Step Functions, AWS Glue, SQS, DynamoDB, EventBridge/Kinesis/MSK, EC2, Elastic MapReduce, Redshift, and CloudFormation.
- 3+ years of technical leadership in leading and mentoring a team of data engineers with strong hands-on technical expertise is required.
- A deep understanding of ETL/ELT design, database design, data architecture, Databricks, data lakes, PySpark, Spark cluster management and optimization, and data warehouse design and management is required.
- Experience working with cloud-based data infrastructure and services such as AWS, Azure Data Factory, Web Apps, Synapse, or similar is required.
- Proficiency in programming languages like Python, Scala, SQL, C#, or Java for data analytics development is required.
- Good communication and influence skills are required.
- Appetite for continuously improving technical standards and methodologies and for leveraging new technologies.