
Python Data Engineer (Middle) ID31594

AgileEngine • District of Columbia, US • On-site

Posted on: February 21, 2025
Employment Type: Full-time

Job Description

About the position

AgileEngine is one of the Inc. 5000 fastest-growing companies in the U.S. and a top-3 ranked dev shop according to Clutch. We create award-winning custom software solutions that help companies across 15+ industries change the lives of millions. If you like a challenging environment where you’re working with the best and are encouraged to learn and experiment every day, there’s no better place - guaranteed! :)

Responsibilities
• Design, develop, and maintain ETL pipelines to extract, transform, and load data across various data sources (cloud storage, databases, APIs)
• Use Apache Airflow to orchestrate workflows, schedule tasks, and manage pipeline dependencies (see the sketch after this list)
• Build and manage data pipelines on Azure and GCP
• Design and support a data lake
• Write Python scripts for data cleansing, transformation, and enrichment using libraries such as Pandas and PySpark
• Analyze logs and metrics from Airflow and cloud services to resolve pipeline failures and inefficiencies
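
For illustration, below is a minimal sketch of the kind of Airflow DAG this role describes, assuming Airflow 2.x. The DAG id, schedule, and the extract/transform/load callables are hypothetical placeholders, not details from the posting.

    # Minimal Airflow ETL DAG sketch; all names and bodies are illustrative.
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator


    def extract():
        # Placeholder: pull raw records from cloud storage, a database, or an API.
        print("extracting")


    def transform():
        # Placeholder: cleanse and enrich the extracted data (e.g. with Pandas).
        print("transforming")


    def load():
        # Placeholder: write the transformed data to the target store.
        print("loading")


    with DAG(
        dag_id="example_etl",             # hypothetical pipeline name
        start_date=datetime(2025, 1, 1),
        schedule="@daily",                # run once per day (Airflow 2.4+ keyword)
        catchup=False,                    # do not backfill past runs
    ) as dag:
        extract_task = PythonOperator(task_id="extract", python_callable=extract)
        transform_task = PythonOperator(task_id="transform", python_callable=transform)
        load_task = PythonOperator(task_id="load", python_callable=load)

        # Dependency chain: extract, then transform, then load.
        extract_task >> transform_task >> load_task

The chained >> operators at the end are how Airflow declares task dependencies, so the scheduler runs extract, then transform, then load.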

Requirements
• Experience (2+ years) writing efficient and scalable Python code, especially for data manipulation and ETL tasks (using libraries like Pandas, PySpark, Dask, etc.)
• Knowledge of Apache Airflow for orchestrating ETL workflows, managing task dependencies, scheduling, and error handling
• Experience building, optimizing, and maintaining ETL pipelines for large datasets, focusing on data extraction, transformation, and loading
• Familiarity with cloud-native storage solutions
• Understanding of and hands-on experience with different file formats
• Expertise in writing efficient SQL queries for data extraction, transformation, and analysis
• Familiarity with complex SQL operations (joins, aggregations, window functions, etc.; see the sketch after this list)
• Familiarity with IAM (Identity and Access Management), data encryption, and securing cloud resources and data storage on both Azure and GCP
• Upper-Intermediate English level
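
To illustrate the window-function requirement in the stack the posting names, here is a minimal PySpark sketch; the table, column names, and sample rows are hypothetical.

    # Windowed aggregation in PySpark; the schema and data are made up.
    from pyspark.sql import SparkSession, functions as F
    from pyspark.sql.window import Window

    spark = SparkSession.builder.appName("window_example").getOrCreate()

    # Hypothetical order data: customer, order date, order amount.
    orders = spark.createDataFrame(
        [
            ("alice", "2025-01-01", 120.0),
            ("alice", "2025-01-03", 80.0),
            ("bob",   "2025-01-02", 200.0),
        ],
        ["customer", "order_date", "amount"],
    )

    # Running total per customer, ordered by date: the DataFrame equivalent of
    # SUM(amount) OVER (PARTITION BY customer ORDER BY order_date) in SQL.
    w = Window.partitionBy("customer").orderBy("order_date")
    orders.withColumn("running_total", F.sum("amount").over(w)).show()

The same running total could be written directly in SQL as SUM(amount) OVER (PARTITION BY customer ORDER BY order_date).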

Nice-to-haves
• Experience using Java libraries to request data from APIs
• Knowledge of data governance practices, and of implementing data lineage and metadata management in cloud environments

Benefits
• Professional growth through mentorship, TechTalks, and personalized growth roadmaps
• Competitive USD-based compensation and budgets for education, fitness, and team activities
• Exciting projects with modern solutions development and top-tier clients, including Fortune 500 enterprises
• Flextime options for optimal work-life balance, including working from home or in the office

