Posted 7 hours ago

Job ID: 3178840

Job Summary

Aditi LLC is seeking a skilled Data Scientist with a strong background in data pipeline development, cloud platforms, and machine learning to join our dynamic team. This role is primarily remote and requires a candidate with excellent problem-solving abilities and a collaborative spirit. You will be instrumental in designing, implementing, and optimizing data solutions that drive business insights and support analytical use cases.

Core Responsibilities

  • Implement robust ETL/ELT workflows for both structured and unstructured data, ensuring data integrity and accessibility.
  • Automate deployments using CI/CD tools, enhancing efficiency and reliability of data processes.
  • Collaborate effectively with cross-functional teams, including data scientists, analysts, and stakeholders, to understand and meet project requirements.
  • Design and maintain scalable data models, schemas, and database structures to support a wide range of analytical and operational use cases.
  • Evaluate and implement appropriate data storage solutions, including modern data lakes (e.g., Azure Data Lake Storage) and data warehouses.
  • Implement comprehensive data validation and quality checks to ensure the accuracy, consistency, and reliability of data.
  • Contribute to data governance initiatives, focusing on metadata management, data lineage tracking, and data cataloging to improve data understanding and usability.
  • Implement stringent data security measures, including encryption, access controls, and auditing, ensuring compliance with relevant regulations and best practices.
  • Develop and optimize scalable data pipelines using Apache Spark on Databricks, ensuring high performance and cost-efficiency.
  • Integrate Databricks solutions with other cloud services, such as Azure Data Factory, to create seamless data workflows.
  • Ensure data quality, governance, and security by leveraging tools like Unity Catalog or Delta Lake.
  • Troubleshoot and debug complex data issues, providing timely and effective solutions.

Technical Skills and Experience

  • Minimum 4 years of experience implementing ETL/ELT workflows, automating deployments with CI/CD, designing data models, evaluating storage solutions, implementing data validation and data security, contributing to data governance, and collaborating with cross-functional teams; proficiency in Python and R.
  • Minimum 4 years of experience with SQL querying and data manipulation, the Azure cloud platform, DevOps practices, CI/CD pipelines, and version control systems; experience working in agile, multicultural environments; strong troubleshooting skills.
  • Minimum 3 years of experience designing and developing scalable data pipelines with Apache Spark on Databricks, optimizing Spark jobs, integrating Databricks with cloud services such as Azure Data Factory, and ensuring data quality, governance, and security using Unity Catalog or Delta Lake.
  • Deep understanding of Apache Spark architecture, RDDs, DataFrames, and Spark SQL.
  • Hands-on experience with Databricks notebooks, clusters, jobs, and Delta Lake.

Preferred Qualifications

  • 1 year of experience with ML tools and libraries such as MLflow, scikit-learn, and TensorFlow.
  • Databricks Certified Associate Developer for Apache Spark certification.
  • Azure Data Engineer Associate certification.

Special Requirements

Remote. Candidates must be local to Texas and hold a local (Texas) driver's license.


Compensation & Location

Salary: $70 – $90 per year

Location: Remote


Recruiter / Company – Contact Information

Recruiter / Employer: Aditi LLC

Email: nikhilm@aditi-llc.com

