Job ID: JOB_ID_2495
Role Overview and Strategic Impact
As we move into 2026, the Data Engineer Tech Lead role has evolved from straightforward pipeline construction to the architectural stewardship of complex data ecosystems. The position is fully remote but requires alignment with Pacific Standard Time (PST), and it is designed for a seasoned professional with more than 15 years of experience in the field. The successful candidate will lead the design and delivery of robust, scalable, and highly efficient data engineering and analytics solutions within the AWS cloud environment.

This is not just a technical role; it is a leadership position. You will translate intricate business logic, legacy stored procedures, and complex SQL triggers into modern, scalable PySpark implementations, and you will ensure that our infrastructure can handle the massive data volumes generated by modern enterprise applications while maintaining the highest standards of data quality and integrity.

In the context of 2026, AI-driven data quality checks and automated metadata harvesting have become standard practice, and you will be expected to implement these capabilities within the PySpark framework. Furthermore, as the organization moves toward a Data Mesh architecture, your role as a Tech Lead will involve defining domain-specific data products and ensuring their interoperability across the enterprise. This requires a deep understanding of data governance and the ability to enforce standards without stifling innovation.
Technical Excellence in PySpark and Python
The core of this role is deep expertise in PySpark and Python. We are looking for someone who can design and implement end-to-end data pipelines that are not only functional but also optimized for performance and cost in a cloud-native environment. This includes sophisticated job orchestration, workflow design, and comprehensive data mapping. Your experience with Spark Streaming will be vital as we move toward more real-time data processing, and the ability to design and implement robust APIs will be a key differentiator.

We are particularly interested in candidates with Palantir Foundry experience, specifically Ontology modeling, API configuration, and Foundry TypeScript. This knowledge will allow you to contribute to our most advanced data modeling initiatives and help us leverage the full power of the Foundry platform.

The Bay Area tech scene continues to demand the highest levels of scalability, and your work will directly impact our ability to process petabytes of data with sub-second latency. You will also drive the strategic migration of legacy on-premise workloads to the AWS cloud, using services such as AWS Lake Formation and Amazon Managed Streaming for Apache Kafka (MSK) to build a modern, event-driven data platform.
Leadership and Collaborative Innovation
- Translate business requirements into technical solutions using PySpark and Python frameworks.
- Lead data engineering initiatives addressing moderately complex to highly complex data and analytics challenges.
- Plan and execute tasks to meet shared objectives, maintain progress tracking, and document work following best practices.
- Identify and implement internal process improvements, including scalable infrastructure design, optimized data distribution, and automation of manual workflows.
- Participate actively in Agile/Scrum ceremonies such as standups, sprint planning, and retrospectives.
- Contribute to the evolution of data systems and architecture, recommending enhancements to pipelines and frameworks.
- Provide technical guidance to team members on complex challenges spanning multiple functional and technical domains.
- Build infrastructure that supports large-scale data access and analysis, ensuring data quality and proper metadata management.
- Collaborate with leadership to strengthen data-driven decision-making through demos, mentorship, and best-practice sharing.
As a Tech Lead, you will be responsible for more than just code. Beyond the responsibilities listed above, you will resolve complex challenges that span multiple functional domains, collaborate closely with leadership, and demonstrate the value of our data initiatives through demos and mentorship. You will ensure that data scientists and analysts have the infrastructure and tools they need to drive data-driven decision-making across the organization, maintaining rigorous documentation and following best practices for metadata management and data governance throughout. This role offers the opportunity to work on cutting-edge projects that will define the future of our data landscape, providing a platform for professional growth and the chance to make a lasting impact on our organization's success.
Special Requirements
- PST timezone alignment required
- 15+ years total experience
- Remote work mode
Compensation & Location
Salary: $185,000 – $245,000 per year
Location: Bay Area, CA
Recruiter / Company – Contact Information
Recruiter / Employer: Sight Spectrum LLC
Email: santhosh.s@sightspectrum.com
Recruiter Notice:
To remove this job posting, please send an email from santhosh.s@sightspectrum.com with the subject: DELETE_JOB_ID_2495