Anh Hoang Chu

Software Engineer & Data Engineer

Summary

I'm a Data Software Engineer who is passionate about working with data and bringing data insights closer to business users through technology. I have experience in data engineering, big data, data science, data warehousing, and back-end databases for web applications on GCP, Azure, and AWS. My tech stack is Python, SQL, Linux, PySpark, Kafka, Airflow, Tableau, Kubernetes, BigQuery, Redshift, and Azure Synapse Analytics.

Experience

Databricks

03/2023 - Present

Sr Specialist Solutions Engineer

Provide technical guidance to strategic customers in designing and implementing enterprise data modernization projects using Delta Lake, Big Data, Spark and SQL Optimization, and Data Engineering

Microsoft

02/2022 - 01/2023

Software Engineer II

Software Engineer building, configuring, and managing back-end infrastructure for a video-powered social-learning platform owned by Microsoft

  • Led the data warehouse migration from AWS Redshift to Synapse Data Lakehouse (DLH), from architecture design to production operation
  • Built and maintained batch and streaming pipelines from transactional databases and telemetry data to Data Lakehouse
  • Provided a fast, stable, and consistent data platform on Azure Cloud for downstream analytics
  • Performed data transformation and analytics with Python and Azure Synapse Spark, Change Data Capture with Debezium, and streaming with Kafka and Azure Event Hubs
  • Resolved performance issues by applying data loading and table design optimizations, resulting in 4-5x faster queries
  • Ensured data quality and data security through data validation, data management, and monitoring best practices
  • Built and maintained a more reliable and consistent downstream sync from DLH to the CRM system using a REST API
  • Ensured a highly available and performant application by maintaining a range of Azure cloud services including storage, CI/CD, database, data warehouse, and Kubernetes

Walmart Global Tech

01/2020 - 02/2022

Software Engineer II

Software Engineer building an end-to-end analytical Supply Chain web application to track inventory and transportation from Suppliers to Stores for international markets

  • Led a team of 4 developers in migrating an on-prem Data Warehouse (Teradata) to Google Cloud Platform for 10 markets using BigQuery, Dataproc, Python, PySpark, and Airflow
  • Continuously delivered new data features by analyzing and calculating supply chain metrics with SQL and Spark
  • Built and maintained ETL data pipelines that load analytical datasets into Microsoft SQL Server from multiple data sources, including Teradata, BigQuery, Oracle Database, and Informix Database
  • Performed data validation and unit testing to ensure data quality
  • Improved application performance by 70% by implementing caching, indexing, and data aggregation in the database instead of in the back-end web service, which reduced the volume of data flowing through the network
  • Reduced development time and codebase complexity by 80% with code refactoring, SQL reformatting, Git, and CI/CD pipelines

NTT Data Services

10/2017 - 01/2020

Tableau Developer

Gathered requirements and delivered analytical projects that brought data democratization to the healthcare account's IT service team

  • Designed and distributed ~50 operations and financial KPI reports to executives and leaders using Excel, Tableau, SQL Server, and Alteryx, resulting in a 70% reduction in outstanding IT tickets
  • Reduced the time to deliver data insights to the operations team by 90% with a new reporting process that converts existing ad-hoc Excel reports into interactive, dynamic Tableau dashboards

E2Open, Inc

03/2017 - 08/2017

Business Analyst Intern

Worked with the Director of Business Value Delivery to create business insights and a sales portfolio through operations and financial KPIs

  • Collected, cleaned, and prepared data from the financial reports of 200 companies to calculate 20 different business and supply chain KPIs ranging from profitability to efficiency indicators: Profit Margin, Cash Conversion Cycle, DIO (Days Inventory Outstanding), DSO (Days Sales Outstanding), DPO (Days Payable Outstanding), etc.
  • Developed financial and supply chain KPI dashboards using Excel (VBA), PowerPoint, and QlikView for sales consultants in the USA and EU to strengthen service quality and product offerings for potential and existing clients

Education

Harrisburg University of Science & Technology

08/2020 - 10/2023

Master's in Computer Science

  • Data Structure & Algorithms
  • Big Data Architecture
  • Scientific Computing
  • Software Architecture & Microservices

University of Texas at Dallas

08/2015 - 08/2017

Master's in Supply Chain Management

  • Business Data Warehouse
  • Advanced Analytics with SAS I/II
  • Operations Management
  • Statistics
  • Prescriptive Analytics

Awards

Excellence Award

Walmart

Consistent and exemplary demonstration of aspiration with significant contributions to the business

Graduation with Highest Distinction

UT Dallas

Summa cum laude

Dean's List

UT Dallas

Demonstrated academic achievements through GPA and competitions