Big Data & Analytics Professional
New York, New York
Data Engineering Contractor @ Oath
Extended and supported the AWS data pipeline for the video analytics team (Redshift, Lambda, EC2, S3, AWS Data Pipeline).
From March 2018 to May 2018 (3 months)
Greater New York City Area

Cloud Data Architect @ Blackboard Insurance
Blackboard Insurance is an AIG-backed insurtech startup.
- Architectural design and review of AWS-native data processing systems and web applications
- Building internal web applications and data pipelines in Python, Pandas, SQL, Flask, JS, and D3.js on AWS
- Driving critical data initiatives to reinforce data integrity and create a more robust data ecosystem
- Code review and reinforcement of coding and data best practices for the team
- Ensuring project delivery meets expectations and falls within the specified timeline
- Designing and building foundational data models to support analytical and data warehousing needs
- Building algorithmic and pipeline approaches to parse and visualize complex data structures

Data Engineer @ Capsule
Capsule is a retail pharmacy startup based in New York City.
As the company's first data engineering hire, my role centered on setting up and maintaining our data warehouse infrastructure, building and providing metrics and reporting for business users, undertaking data migration tasks to support new application releases and integrations, advising on SQL and database questions, and defining data pipelines and a data transformation strategy for tackling challenging business problems.
- Set up and managed the company's data warehouse on Redshift and built ETL pipelines to populate data into the source system
- Built BI reports for company-wide SLA metrics and operational reports using SQL and Metabase
- Triaged and optimized SQL queries and BI dashboards
- Built a semi-real-time ETL pipeline (using AWS Firehose streaming services) and a real-time communications dashboard
- Managed data infrastructure in AWS (migrated the EC2 Postgres production database to RDS, set up a VPN connection from local servers to the AWS VPC)
- Cleaned and migrated production data to enable integration between a third-party pharmacy system and the web application
- Implemented active data quality strategies to monitor data quality and to fix data inconsistencies using automated algorithms
From May 2017 to January 2018 (9 months)
Greater New York City Area

Data Engineer @ LiveIntent, Inc.
LiveIntent is an ad tech firm focused on email marketing that also serves as an ad exchange for emails. The Business Intelligence team is responsible for creating and maintaining a central data warehouse for all company data sources, as well as providing analysis and reporting on this data. My role, as the company's first Data Engineer, consisted of wearing many hats to architect, build, and maintain our initial data infrastructure and data web tools. I worked with "traditional" ETL-based pipelines, written in Python and SQL and leveraging parallel processing, and also scaled out Big Data solutions in Hadoop and Spark.
I also managed an AWS Redshift data warehouse, Elastic MapReduce, and EC2.
- Developed multiple end-to-end (UI, backend, deployment) internal reporting web applications: Key Value Pair Reporting Tool, Live Audience Match Rate Reporting Tool, Big Query Table Look Up Tool, and Replay Engine UI (Flask, data modeling, ETL, Python, SQL, jQuery, CSS, HTML)
- Built a semi-real-time data streaming prototype using Kinesis and Spark Streaming
- Built new data pipelines from scratch in Python for processing log data, using multi-core parallel processing and connecting multiple cloud endpoints across Google BigQuery, Amazon S3, and Amazon Redshift
- Set up the initial Hive platform for data science usage and data processing, as well as Spark code for batch processing
- Optimized and built data consumption code connecting to API endpoints, such as the Salesforce API and GoodData API
- Built and maintained relational database tables and data models
- Built and deployed new data programs and workflow monitoring systems using Python, with email alerting and RESTful APIs
- Managed, troubleshot, and deployed the production ETL pipeline
- Managed and administered cloud infrastructure (EC2 servers, Redshift, Kinesis, cron, etc.), databases, and Unix servers for the data team
From July 2015 to March 2016 (9 months)
Greater New York City Area

Technology Analyst (Big Data & BI Technology) @ Barclays Investment Bank
At Barclays Capital (Investment Bank), I worked on the Risk Reporting and Analytics team as well as on the Compliance Architecture/Financial Crime team. My roles involved investigating new technology and building proofs of concept, building and maintaining data and big data environments, and building data pipelines.
Risk and Analytics Team: I regularly built scripts in Unix, Bash, and Perl for environment stability and scalability. The "Big Data" technologies I used, maintained, and built pipelines for include Pig, Oozie, and Hive.
Specifically, I worked on projects such as configuring Oozie for JMS messaging capabilities, building HDFS utilities for data monitoring, data compression, and HDFS management, and maintaining ETL processes into our Hive risk data warehouse. The data was used for risk reporting under Volcker and other SEC/FINRA regulations.
Compliance Technology (& Financial Crime) Team: I worked on the Compliance Technology Architecture team while also partnering with and working directly for the Financial Crime technology team. In this role, I investigated functionality in Apache Spark. Specifically, I built a connection that feeds data from MongoDB into Spark (Java and Scala based), investigated Spark's capabilities, and built proof-of-concept applications and architecture diagrams to showcase its capabilities and usefulness. I also worked directly for the Financial Crime team in building new internal applications. These included a custom emailing service for data ingestion, Excel data converters, and a composite data virtualization connector. The data here was used for identifying rogue trading as well as supporting financial crime reporting.
From July 2014 to July 2015 (1 year 1 month)
Greater New York City Area

Visiting Research Associate- Computational Comparative Genomics @ Temple University
Computational biology research in comparative genomics as a graduate visiting researcher. I used tools such as Perl, Bash, Unix, high-performance computing, and Python, working with the FASTA, MAF, and Genome Stitch data formats as well as bioinformatics tools such as Galaxy and other custom and open-source applications. I worked with University of California Santa Cruz bioinformatics datasets for cross-species data analysis. The goal of the research was to process and analyze comparative genomics datasets from multiple animal species. The project involved designing and building new pipelines for preparing data for analysis.
I also built tools for preliminary data analysis on the processed genomic datasets. As a side note, the research was done part time from fall 2014 to spring 2014 and full time from January 2013 to May 2013.
From December 2012 to May 2014 (1 year 6 months)
Kulathinal Lab

Summer Technology Analyst @ Barclays Investment Bank
My role as a summer analyst at Barclays revolved around equities trading technology. I regularly met with managers to implement change management. I also built a custom tool for the equities communication technology team for parsing and analyzing FIX protocol messages, the message protocol for transactions coming in and out of the stock exchange.
- Worked on an equities technology team, building knowledge and skills in the equities technology landscape
- Built a Perl script for converting FIX protocol messages to a human-readable format
- Communicated regularly with various VPs to organize and update the technology schema and change management
From May 2013 to August 2013 (4 months)
Greater New York City Area

Data & Analytics Consultant @ Caserta
Caserta is a specialized data consulting firm working with major Fortune 500 clients on data warehousing, cloud, analytics, data science, and data governance strategies and implementation. As a consultant, I work with clients across multiple industry sectors to implement big data and data analytics solutions. I work with clients to clarify and draft business and technical requirements, communicate technical goals, host brainstorming sessions, build and improve technical architecture plans, and, most importantly, write code to build technology and analytics solutions. The current data stack for our clients includes AWS, Databricks, Spark, Airflow, Python, Redshift, etc. We operate and implement end-to-end, full-stack data platforms.
Therefore, our hands-on work spans from Cloud/Linux networking and parallel processing improvements to data warehouse modeling and creating analytical models to support machine learning and insights.
From May 2018 to February 2019 (10 months)
Greater New York City Area

Senior Architecture Specialist- Big Data & Analytics @ Cigna
Cigna is a major health insurance provider in the US. To gain a footing in and further understanding of health care, I joined the big data & analytics team, which leverages big data to enable advanced analytics and real-time predictive capabilities for the organization. In this role I worked on projects spanning different areas. For instance, we developed a fully functional D3.js data visualization application to visualize provider data and claims data. We also built end-to-end data pipelines to transfer and store massive datasets in our data lake for network security, enterprise-wide storage consolidation, and other purposes. During this time, I also served as technical lead for the enterprise-wide data governance technical team. Specifically, we were building out the data governance platform and ensuring integration and automation between it and the Hadoop ecosystem/data lake.
From April 2016 to March 2017 (1 year)
Greater New York City Area
Oath
Data Engineering Contractor
March 2018 to May 2018
Greater New York City Area
Blackboard Insurance
Cloud Data Architect
Capsule
Data Engineer
May 2017 to January 2018
Greater New York City Area
LiveIntent, Inc.
Data Engineer
July 2015 to March 2016
Greater New York City Area
Barclays Investment Bank
Technology Analyst (Big Data & BI Technology)
July 2014 to July 2015
Greater New York City Area
Temple University
Visiting Research Associate- Computational Comparative Genomics
December 2012 to May 2014
Kulathinal Lab
Barclays Investment Bank
Summer Technology Analyst
May 2013 to August 2013
Greater New York City Area
Caserta
Data & Analytics Consultant
May 2018 to February 2019
Greater New York City Area
Cigna
Senior Architecture Specialist- Big Data & Analytics
April 2016 to March 2017
Greater New York City Area
University of Pennsylvania
Masters of Biotechnology, Bioinformatics and Computational Biology
2012 to 2014
Temple University
Bachelor of Science (B.S.), Biology, Cum Laude
2008 to 2012
Fox School of Business at Temple University
Minor, Business
2008 to 2012
What company does Gary Cheung work for?
Gary Cheung works for Oath.
What is Gary Cheung's role at Oath?
Gary Cheung is a Data Engineering Contractor at Oath.
What industry does Gary Cheung work in?
Gary Cheung works in the Information Technology and Services industry.
Who are Gary Cheung's colleagues?
Gary Cheung's colleagues are Brandon Berment, Michael D'Angelo, Dante Urso, Anna Goldberg, Max Headley, Jessica Bowden, Sonia Patel, Sean Fuoco, John Hawley, and James Novino.