Data Engineer – PySpark/Spark-SQL/AWS

Interested in healthcare? Startups? Philadelphia? Let’s talk! This venture-backed Series C startup, founded in Philadelphia, was recently named one of Philly’s best tech startups. The team applies innovative technologies to healthcare data to help organizations and end users gain new perspectives on patient activity, all while keeping privacy and HIPAA compliance a top priority.

Compensation: $80K-$130K + Bonus

Benefits & Perks:

  • Excellent medical, dental and vision coverage, starting on Day 1
  • Performance-based bonus
  • Equity plan
  • 401k
  • Life and long-term disability insurance
  • 20 days Paid Time Off
  • 8 paid holidays
  • Pre-tax commuter savings program
  • Company lunch once a week
  • On-site gym
  • Plus more…

Job Description

The Data Engineer will support the engineering team’s data endeavors, diving in to fix issues, optimize processes, and automate anything that is done more than once.

Additional Responsibilities:

  • Work with internal stakeholders to load data into the data warehouse
  • Troubleshoot and resolve issues relating to data integrity
  • Help establish procedures and best practices for transforming and storing data
  • Lead requirements gathering around data pipeline automation improvements
  • Work with open-source tools like Spark, Hadoop, Docker, Airflow, Zeppelin
  • Leverage distributed computing and serverless architecture, such as AWS EMR and AWS Lambda, to develop pipelines for transforming data
  • Research new technologies with a team of developers to execute strategies and implement solutions
  • Solve complex problems related to the real-time discovery of large data sets


Successful Data Engineers will have 5+ years of experience writing scalable applications on distributed architectures.

Additional Qualifications:

  • 3+ years of experience with Python
  • 3+ years of experience with PySpark and Spark-SQL (writing, testing, and debugging Spark routines)
  • 1+ years of experience with AWS EMR and AWS S3; comfortable using the AWS CLI and boto3
  • Comfortable using the *nix command line (shell scripting, awk, sed)
  • Experience with MySQL and Postgres
  • Experience with Apache Airflow preferred
  • Experience with Apache Zeppelin preferred
  • Experience with healthcare data preferred

Interview Process: Phone Interview with IT Pros (15 minutes) → Phone Interview with Hiring Team (15 minutes) → Take-Home Assessment (60 minutes) → In-Person Interview with Hiring Team (4 hours) → Decision

Next Steps: APPLY FOR JOB → Click the link to book a call with Brad → Select a time → Brad will call you at the number listed

Brought to you by IT Pros

To apply for this job please visit