Posted 16 hours ago

Job ID: JOB_ID_3178860

Job Overview:

Aditi LLC is urgently seeking a skilled Data Scientist. The role is remote, but requires a local Dallas, TX presence for initial onboarding and occasional client needs. The position involves implementing robust data pipelines, automating deployments, and collaborating with cross-functional teams to drive data-driven insights. The ideal candidate will have a strong background in data modeling, ETL/ELT processes, and experience with cloud platforms such as Azure, particularly Databricks.

Key Responsibilities:

  • Implement and maintain ETL/ELT workflows for both structured and unstructured data sources.
  • Automate deployment processes using CI/CD tools to ensure efficient and reliable releases.
  • Collaborate effectively with data scientists, analysts, and business stakeholders to understand data needs and deliver solutions.
  • Design, develop, and maintain scalable data pipelines using Apache Spark on Databricks.
  • Optimize Spark jobs for enhanced performance and cost-efficiency.
  • Integrate Databricks solutions with cloud services, including Azure Data Factory.
  • Ensure data quality, governance, and security through tools like Unity Catalog or Delta Lake.
  • Design and maintain data models, schemas, and database structures to support analytical and operational use cases.
  • Evaluate and implement appropriate data storage solutions, such as Azure Data Lake Storage and data warehouses.
  • Implement data validation and quality checks to guarantee accuracy and consistency.
  • Contribute to data governance initiatives, including metadata management, data lineage, and data cataloging.
  • Implement robust data security measures, including encryption, access controls, and auditing, ensuring compliance with regulations and best practices.
  • Develop and maintain data pipelines using Python and R programming languages.
  • Write complex SQL queries and perform data manipulation tasks.
  • Leverage experience with the Azure cloud platform for data solutions.
  • Utilize DevOps practices, CI/CD pipelines, and version control systems (e.g., Git) for efficient workflow management.
  • Work effectively in agile, multicultural environments, adapting to changing project requirements.
  • Apply strong troubleshooting and debugging capabilities to resolve data-related issues.
  • Optionally, contribute to machine learning initiatives using libraries like MLflow, Scikit-learn, and TensorFlow.

Minimum Requirements:

  • 4 years of experience in implementing ETL/ELT workflows.
  • 4 years of experience in automating deployments using CI/CD tools.
  • 4 years of experience collaborating with cross-functional teams (data scientists, analysts, stakeholders).
  • 4 years of experience designing and maintaining data models, schemas, and database structures.
  • 4 years of experience evaluating and implementing data storage solutions (Azure Data Lake Storage, data warehouses).
  • 4 years of experience implementing data validation and quality checks.
  • 4 years of experience contributing to data governance initiatives (metadata management, data lineage, data cataloging).
  • 4 years of experience implementing data security measures (encryption, access controls, auditing).
  • 4 years of experience with Python and R programming languages.
  • 4 years of experience writing complex SQL queries and performing data manipulation.
  • 4 years of experience with the Azure cloud platform.
  • 4 years of experience with DevOps, CI/CD pipelines, and version control systems.
  • 4 years of experience working in agile, multicultural environments.
  • 4 years of experience troubleshooting and debugging data-related issues.
  • 3 years of experience designing and developing scalable data pipelines using Apache Spark on Databricks.
  • 3 years of experience optimizing Spark jobs for performance and cost-efficiency.
  • 3 years of experience integrating Databricks solutions with cloud services (Azure Data Factory).
  • 3 years of experience ensuring data quality, governance, and security using Unity Catalog or Delta Lake.
  • 3 years of in-depth experience with Apache Spark architecture, including RDDs, DataFrames, and Spark SQL.
  • 3 years of hands-on experience with Databricks notebooks, clusters, jobs, and Delta Lake.
  • Knowledge of ML libraries (MLflow, Scikit-learn, TensorFlow) is preferred (1+ year).
  • Databricks Certified Associate Developer for Apache Spark certification is preferred.
  • Azure Data Engineer Associate certification is preferred.

Keywords:

continuous integration, continuous deployment, machine learning, rlang, information technology, North Carolina, Texas, Data Scientist, ETL, ELT, Azure, Databricks, Spark, Python, R, SQL, DevOps, CI/CD


Special Requirements

Candidate must be local to TX with a local driver's license. The role is remote.


Compensation & Location

Salary: $60 – $75 per hour

Location: Dallas, TX


Recruiter / Company – Contact Information

Recruiter / Employer: Aditi LLC

Email: bala@ardoritsolutions.com


Interested in this position?
Apply via Email
