Job Description
The Opportunity
We are seeking a Lead Data Specialist with a strong hands-on background in data engineering to spearhead our advanced analytics and AI initiatives. This is a pivotal leadership role for an expert who is passionate about both building sophisticated machine learning models and shaping the data landscape they rely on. You will be the founding specialist of our data team, responsible for setting the technical vision, mentoring future members, and acting as the critical bridge between our data science and engineering efforts and our business-facing AI applications. You will play a vital role in designing, building, and maintaining our data architecture, ensuring it meets the needs of our data engineering and AI projects. In addition, you will develop scalable and reliable data pipelines, optimize data storage and retrieval, and ensure data quality and security to create impactful AI-driven solutions.
Day-to-Day Responsibilities
- AI/ML Strategy:
- Contribute to and execute the data science roadmap as part of a team of AI leads, aligning it with the business goal of delivering hyper-personalized customer experiences.
- Lead the full lifecycle of machine learning projects—from ideation and data discovery to model deployment, monitoring, and iteration.
- Mentor and guide other engineers and future developers, establishing best practices for modeling, code quality, and project execution.
- Advanced Modeling & Analytics:
- Develop and implement a range of machine learning models (e.g., recommendation engines, customer segmentation, propensity models, transaction fraud detection).
- Perform complex exploratory data analysis to identify key patterns and features that will drive business value.
- Translate complex business challenges into precise data science problems and deliver scalable, statistically robust solutions.
- Data Analytics Development & Collaboration:
- Design, develop, and maintain scalable, high-performance data pipelines and ETL processes to ingest, process, and transform large data sets from various sources into usable formats that meet business needs.
- Collaborate with LLM engineers to understand their data requirements and provide efficient data solutions that support their models and meet business objectives.
- Work with a team of AI engineers, product developers, and network engineers to deliver end-to-end AI projects, from concept to production.
- Collaborate with product, engineering, and business teams to identify opportunities for AI integration and innovation.
- Data Infrastructure, Data Lakehouse & Deployment:
- Support the AI team in managing scalable AI infrastructure, ensuring efficient resource utilization and performance optimization.
- Jointly conduct capacity planning to support AI and data analytics workloads, both on-premises and in the cloud.
- Architect and implement robust cloud-based data solutions using technologies such as AWS (S3, Redshift, Glue), Azure (Data Lake, Synapse), or Google Cloud (BigQuery, Dataflow).
- Work with big data technologies such as Apache Hadoop, Spark, Kafka, and NoSQL databases (e.g., MongoDB) to efficiently handle large-scale data processing.
- Support the optimization of AI and data pipeline workloads using best practices (CI/CD, model versioning, monitoring, drift detection).
- Implement APIs and microservices for seamless integration into enterprise applications.
- Ensure that data infrastructure and data management tools and platforms align with security, compliance, and regulatory requirements (PCI-DSS, GDPR).
- Be familiar with IBM i and IBM Z, especially for managing data lakes and deploying data warehouses.
- Data Quality, Governance & Automation:
- Ensure data quality, integrity, and security throughout the data lifecycle by creating and enforcing data governance policies.
- Optimize ETL/ELT jobs and data pipelines for improved performance and scalability.
- Monitor, evaluate, and optimize pipelines, including through the use of AI agents, for continuous improvement and operational excellence.
- Implement data quality checks and validation processes.
- Develop and maintain data monitoring and alerting systems.
- Implement and maintain data governance.
- Establish the processes and tooling for deploying, monitoring, and maintaining machine learning models in a production environment to ensure reliability and performance.
- Technology & Innovation:
- Mentor and guide junior data scientists and engineers.
- Collaborate with AI engineers, business analysts, and other stakeholders to identify data needs and deliver solutions that meet business objectives.
- Stay up to date with industry trends and emerging technologies to enhance data service practices within the organization.