Job Description
Our client is a leading player in the content-creation space, helping creators around the world reach new audiences and grow their brands.
What You Will Be Doing
- Design and manage scalable data workflows, including ETL pipelines in both single-node and multi-node configurations
- Establish data quality assurance protocols for both new and existing pipelines
- Generate enhanced datasets with additional attributes
- Process analytics-ready datasets to empower internal and creator-centric tools
- Address and resolve issues promptly, collaborating directly with internal data consumers
- Automate pipeline executions through scheduling and orchestration tools
- Handle extensive datasets and integrate various external APIs to enrich data
- Configure database tables to facilitate data consumption by analytics users
- Utilize big data technologies to enhance data availability and quality in the AWS cloud environment
What You Need for this Position
- Bachelor's degree, preferably in Computer Science or Computer Information Systems
- 4+ years of software engineering experience
- 3+ years of data engineering experience using Apache Spark or Apache Flink
- 3+ years of hands-on experience operating software and services in cloud environments
- Proficiency with DataFrame APIs (Pandas and Spark) for both single-node and parallel processing
- Advanced proficiency in languages such as Python or Scala, working with modern data-optimized file formats such as Parquet and Avro
- Strong SQL skills for RDBMS and data warehouse solutions, including platforms like Redshift
Benefits and Perks:
- Competitive Salary: $170,000 - $190,000 per year
- Full Health, Vision, and Dental Coverage
Applicants must be authorized to work in the United States on a full-time basis, now and in the future. This position does not offer sponsorship.
#LI-LJ1