Data Engineer

Salary: Negotiable

Experience: 3 years

Application deadline: 17/12/2025

Job details

Job description

Build and maintain large-scale data platforms processing tens to hundreds of millions of events per day.
Design and operate real-time and batch pipelines that handle 100k+ events/second (including image, audio, and video streams) using Kafka, Spark Streaming, and/or Flink.
Develop and scale multimodal pipelines (image/object detection, OCR, face recognition, audio transcription, video processing).
Work with vector databases (Milvus, Weaviate, or similar) to store and query millions of embeddings with metadata filtering and hybrid search.
Integrate LLMs and multimodal models (GPT-4, Llama-3, Claude, etc.) into data pipelines for entity extraction, classification, enrichment, and summarization.
Participate in solving Entity Resolution challenges: deduplicate and merge entities from multiple news/sources using fuzzy matching, simple graph techniques, and contextual signals.
Build and maintain a modern Lakehouse using Apache Iceberg or Delta Lake on S3/MinIO with schema evolution and time-travel capabilities.
Ensure good data quality, observability, and monitoring (lineage, basic data quality checks, dashboards, alerting).
Optimize cost and performance of Spark/Flink jobs running on Kubernetes (autoscaling, resource management, basic spot-instance usage).
Collaborate closely with AI Engineers and Business Analysts to translate business needs into reliable data pipelines.

THE CHALLENGES YOU WILL LOVE

Keeping real-time pipelines stable and accurate under high throughput (10k–100k+ events/sec).
Handling noisy, conflicting, and rapidly changing data from many sources with dozens of entities, thousands of attributes, and hundreds of relationships.
Achieving low-latency enrichment with LLMs and vector search in streaming workflows.
Maintaining vector indexes with millions of new embeddings daily.
Building image/video analytics that work reliably on real-world media.
Performing schema migrations and backfills on large datasets with minimal downtime.
Ensuring good observability so the team can quickly spot and fix issues.

Job requirements

3+ years of hands-on Data Engineering experience.
Bachelor’s degree in Computer Science, Engineering, Mathematics, or equivalent practical experience.
Strong hands-on experience with at least 3 of the following: Spark (Structured Streaming or DataFrame API), Kafka, Flink, Airflow, Iceberg/Delta Lake, vector DBs (Milvus, Weaviate, Pinecone, Qdrant…).
Proven experience shipping production data systems that process tens of millions+ events/day.
Practical experience with any form of Entity Resolution / deduplication / record linkage (even at moderate scale).
Hands-on work with image or multimodal pipelines (OCR, object detection, face recognition, transcription) is a plus.
Experience calling LLMs or embedding models from data pipelines (LangChain, LlamaIndex, direct API, etc.).
Good knowledge of vector search concepts and at least one vector database.
Proficiency in Python (primary) and SQL; Java/Scala is a plus.
Solid experience with Docker + Kubernetes in production (Helm, basic manifests, or ArgoCD is enough).
Strong focus on testing, data quality, monitoring, and automation.
You enjoy solving messy real-world data problems and making systems reliable and maintainable.

Benefits

Salary that truly reflects your abilities and is competitive in the market
Additional allowances outside of salary: Breakfast and lunch provided at the company
Performance evaluated quarterly/annually with opportunities for job grade promotion; work and collaborate with a high-quality team
Provided with modern work equipment (MacBook, laptop, 24” LCD monitor, etc.)
5-day work week (Saturday and Sunday off)
Opportunities to attend professional training courses, enhance job-related skills and soft skills, and obtain IT certifications
Young, dynamic, and energetic working environment that encourages employees to maximize their potential, with many career advancement opportunities
Modern office with open-space design, located in a building exclusively occupied by the company
Annual leave and social insurance, health insurance, and unemployment insurance in full compliance with Vietnamese labor law
Comprehensive private health insurance for employees and their family members
Full support for business trip expenses
Participation in internal cultural activities, team building, and annual company trip
Wedding gifts, holiday/Tết bonuses and gifts, and other welfare benefits

Company information

Company name:	Công ty TNHH Athena AI
Size:	50-99 employees
Industry:	Information technology, other occupations
Address:	Số 2 Trương Quốc Dung, Phường Phú Nhuận, TP Hồ Chí Minh
