About Me

Zhike(Kyle) Chen | Email: zk.chen007@gmail.com | GitHub: kk17


Summary

Engineer with a background in backend, data, and machine learning engineering. Contributor to open source projects such as Hudi, Airflow, and Kyuubi.
  • Big Data Stack: Spark, Flink, Kafka, Airflow, Presto, Apache Hudi, Apache Kyuubi, DBT/SQLMesh, etc.
  • Cloud Platforms: AWS, GCP, and Alicloud.
  • DevOps: GitLab CI/CD, GitHub Actions, Docker, Kubernetes, Terraform/Terragrunt, etc.
  • MLOps: Ray, MLflow, JupyterHub, etc.
  • Programming Languages: Python, Java, Bash shell, Scala, Go, etc.

Work Experience

GoTo Financial, LENDING DATA TEAM

Data Engineer Manager Jan. 2023 - Present
  • Manage a streamlined, high-performing team of 10+ Data Engineers working across 5 main divisions: Data Ingestion, Stream Processing, Data Warehousing, Business Intelligence, and Machine Learning Engineering; foster an open and agile team culture and support team members' growth.
  • Lead cross-team collaboration, supporting multiple product lines and data requirements (synchronization, analytics, reporting) for over 10 business teams.
  • Plan and execute data platform migration from AWS to GCP and from GCP to Alicloud; design equivalent data architecture on new cloud platforms while ensuring timely and high-quality migration completion.
  • Identify potential system bottlenecks and implement optimization strategies to improve performance, reduce costs, and enhance efficiency; built an internal data portal and self-service data capabilities to improve team productivity.

BYBIT SINGAPORE, DATA TEAM

Principal Data Engineer Sep. 2020 - Dec. 2022 (2 years and 4 months)
As a founding member of the data team, helped design and build the company's data platform and machine learning platform from scratch, continuously evolving the tech stack to improve performance and stability.
  • Designed and maintained infrastructure services like Canal, Debezium, AWS EMR, Presto, and Airflow, leveraging GitOps and containerization to enhance productivity and standardization.
  • Built a large-scale, near real-time pipeline using CDC (Canal/Debezium) and data lake (Apache Hudi) technologies, enabling efficient data delivery and supporting update/delete operations.
  • Migrated Airflow from 1.x to 2.0, improving task scheduling for ~40,000 daily tasks, and developed custom Airflow Operators and APIs for seamless integration with internal tools.
  • Developed a unified SQL query middleware for engines like Kyuubi, Presto, and Hive, simplifying data access and enhancing user experience through integration with the data portal.
  • Designed and implemented a machine learning platform using JupyterHub, Ray, MLflow, and Kubernetes, addressing MLOps challenges and supporting model training, tracking, and deployment.

ATOME/ADVANCEAI SINGAPORE, DATA ENGINEER TEAM

Senior Data Engineer Apr. 2018 - Sep. 2020 (2 years and 6 months)
As a Senior Data Engineer in the Data Engineer Team for the finance business line, participated in designing, building, and maintaining the company's data platform.
  • Migrated the ETL pipelines from Jenkins to Airflow to improve maintainability, performance, and stability. Developed an ad-hoc task feature for Airflow.
  • Implemented ETL data pipelines to process data from different upstream servers and sources. Optimized the pipelines to improve efficiency and reduce overall execution time.
  • Deployed and maintained Spark Thrift Server and Kyuubi Server, implementing custom authentication and authorization, which made data analysis significantly easier without compromising security.
  • Analyzed data in the warehouse and created dashboards for business insights using PySpark, JupyterHub, and Superset.

NETEASE, HANGZHOU / GUANGZHOU, CHINA

Senior Backend Developer Jul. 2012 - Mar. 2018 (5 years and 9 months)
Initially contributed to NetEase Cloud Music before transitioning to the e-commerce department, where I developed and maintained backend microservices.
  • Designed and developed critical backend APIs for multi-platform applications, including user account management, payment processing, and integration with third-party systems like Sonos.
  • Built and maintained microservices infrastructure using the Spring Boot/Cloud ecosystem, implementing SKU management, vendor management, and event messaging services.
  • Developed an automated CI/CD pipeline with GitLab CI, Docker, and custom scripting, significantly reducing deployment time and improving release reliability.
  • Refactored legacy applications into a modern microservices architecture, improving maintainability and enabling rapid iteration.
  • Created custom testing and monitoring tools, including a Python-based API testing framework that improved testing efficiency and reliability.

Education Background

  • Singapore Management University SINGAPORE 2020 - 2022
    • Master in IT Business - AI track
  • Jinan University GUANGZHOU, CHINA 2008 - 2012
    • Bachelor of Computer Science and Technology