Job Description
Job Title: AWS Python Spark Developer with Database Knowledge
Location: Richmond, VA (Onsite/Hybrid)
Client: Capital One
Duration: 12 Months
Employment Type: W2 Contract Only
Experience: 7-8 Years
We are seeking a highly skilled AWS Python Spark Developer with strong database expertise to join our team on a 12-month W2 contract with Capital One in Richmond, VA. The ideal candidate has hands-on experience with big data processing, AWS cloud services, and ETL pipeline development.
Key Responsibilities:
• Develop and optimize ETL pipelines using Python and Apache Spark on AWS.
• Work with large-scale datasets and implement distributed data processing solutions.
• Utilize AWS services such as S3, Lambda, Glue, EMR, Redshift, and RDS.
• Optimize Spark jobs for performance, scalability, and cost-efficiency.
• Implement data modeling, governance, and security best practices.
• Collaborate with data engineers, analysts, and stakeholders to meet business needs.
• Troubleshoot and resolve performance bottlenecks in data pipelines.
• Work with relational (PostgreSQL, Redshift) and NoSQL databases for efficient data storage and retrieval.
Required Skills & Experience:
• AWS Expertise: Hands-on experience with S3, Glue, EMR, Lambda, Redshift, IAM.
• Python & Spark: Strong programming skills in Python and experience with PySpark.
• Big Data & ETL: Experience in scalable data pipeline development using Spark.
• Databases: Strong knowledge of SQL, PostgreSQL, Redshift, or other relational and NoSQL databases.
• Data Processing: Expertise in data partitioning, compression, and performance tuning.
• CI/CD & DevOps: Familiarity with Git, Jenkins, Terraform, or CloudFormation is a plus.
• Cloud Security & Compliance: Understanding of AWS security best practices and data encryption.
Preferred Qualifications:
• Experience in the financial services sector is a plus.
• Knowledge of machine learning frameworks on AWS is a bonus.
• Certifications such as AWS Certified Data Analytics or AWS Solutions Architect are preferred.
Note: This position is W2 contract only. No C2C or third-party candidates.
If you're a highly experienced AWS Python Spark Developer looking for an exciting opportunity with Capital One in Richmond, VA, we encourage you to apply!