Senior Data Engineer (2-year contract)
StarHub
Date: 10 hours ago
City: Petaling Jaya
Contract type: Contractor

Job Description
Senior Data Engineer
You will design, develop, and deploy AI-powered, cloud-based products. As a Data Engineer, you’ll work with large-scale, heterogeneous datasets and hybrid cloud architectures to support analytics and AI solutions. You will collaborate with data scientists, infra engineers, sales specialists, and stakeholders to ensure data quality, build scalable pipelines, and optimize performance. Your work will integrate telco data with other verticals (retail, healthcare), automate DataOps/MLOps/LLMOps workflows, and deliver production-grade systems.
As a Data Engineer, you will:
- Ensure Data Quality & Consistency
  - Validate, clean, and standardize data (e.g., geolocation attributes) to maintain integrity.
  - Define and implement data quality metrics (completeness, uniqueness, accuracy) with automated checks and reporting.
- Build & Maintain Data Pipelines
  - Develop ETL/ELT workflows (PySpark, Airflow) to ingest, transform, and load data into warehouses (S3, Postgres, Redshift, MongoDB).
  - Automate DataOps/MLOps/LLMOps pipelines with CI/CD (Airflow, GitLab CI/CD, Jenkins), including model training, deployment, and monitoring.
- Design Data Models & Schemas
  - Translate requirements into normalized/denormalized structures, star/snowflake schemas, or data vaults.
  - Optimize storage (tables, indexes, partitions, materialized views, columnar encodings) and tune queries (sort/distribution keys, vacuum).
- Integrate & Enrich Telco Data
  - Map 4G/5G infrastructure metadata to geospatial context, augment 5G metrics with legacy 4G, and create unified time-series datasets.
  - Consume analytics/ML endpoints and real-time streams (Kafka, Kinesis), designing aggregated-data APIs with proper versioning (Swagger/OpenAPI).
- Manage Cloud Infrastructure
  - Provision and configure resources (AWS S3, EMR, Redshift, RDS) using IaC (Terraform, CloudFormation), ensuring security (IAM, VPC, encryption).
  - Monitor performance (CloudWatch, Prometheus, Grafana), define SLAs for data freshness and system uptime, and automate backups/DR processes.
- Collaborate Cross-Functionally & Document
  - Clarify objectives with data owners, data scientists, and stakeholders; partner with infra and security teams to maintain compliance (PDPA, GDPR).
  - Document schemas, ETL procedures, and runbooks; enforce version control and mentor junior engineers on best practices.
Qualifications
- Bachelor’s or Master’s in Computer Science, Software Engineering, Data Science, or equivalent experience
- 4+ years in data engineering, analytics, or a related AI/ML role
- Proficient in Python for ETL/data engineering and Spark (PySpark) for large-scale pipelines
- Experience with Big Data frameworks and SQL engines (Spark SQL, Redshift, PostgreSQL) for data marts and analytics
- Hands-on with Airflow (or equivalent) to orchestrate ETL workflows and GitLab CI/CD or Jenkins for pipeline automation
- Familiar with relational (PostgreSQL, Redshift) and NoSQL (MongoDB) stores: data modeling, indexing, partitioning, and schema evolution
- Proven ability to implement scalable storage solutions: tables, indexes, partitions, materialized views, columnar encodings
- Skilled in query optimization: execution plans, sort/distribution keys, vacuum maintenance, and cost-optimization strategies (cluster resizing, Redshift Spectrum)
- Experience with cloud platforms (AWS: S3, EMR, Glue, Redshift) and containerization (Docker, Kubernetes)
- Infrastructure as Code using Terraform or CloudFormation for provisioning and drift detection
- Knowledge of MLOps/LLMOps: auto-scaling ML systems, model registry management, and CI/CD for model deployment
- Strong problem-solving, attention to detail, and the ability to collaborate with cross-functional teams
- Exposure to serverless architectures (AWS Lambda) for event-driven pipelines
- Familiarity with vector databases, data mesh, or lakehouse architectures
- Experience using BI/visualization tools (Tableau, QuickSight, Grafana) for data quality dashboards
- Hands-on with data quality frameworks (Deequ) or LLM-based data applications (NL-to-SQL generation)
- Participation in GenAI POCs (RAG pipelines, Agentic AI demos, geomobility analytics)
- Client-facing or stakeholder-management experience in data-driven/AI projects