Welcome to my LinkedIn page.
Data is everywhere and is changing the world. I am interested in harnessing the power of big data and using various algorithms to answer interesting questions and drive business decisions.
My areas of expertise are Data Analysis, Machine Learning, Statistical Analysis, Data Mining, Text Mining, and Databases.
My professional skills include, but are not limited to:
• Developing data-driven products using predictive analytics techniques.
• Providing Big Data Analytics solutions for business use cases.
• Proficient in Statistical Analysis, Data Mining and Text Mining in Python and R.
• Advanced analytics and modeling skills in Python, R, and SAS.
• Proficient in Pivotal Greenplum Database, HAWQ, Apache Hive, Microsoft SQL Server, MySQL, PostgreSQL, MongoDB platforms, SQL Server Reporting Services (SSRS), and Microsoft Excel.
• Data visualization with Tableau, matplotlib in Python, ggplot2 and Shiny in R, Gephi, and Graphviz.
• Strong knowledge of Data Architecture and Data Engineering/ETL processes.
• Hands-on experience with the Linux command line.
• Experience working with the Hadoop environment and Amazon Web Services.
Data Scientist, Machine Learning @ From December 2015 to Present (1 month)
Data Analyst - Data Science @
• Providing end-to-end Machine Learning solutions to various GE businesses.
• Leveraging techniques including Machine Learning, Statistical Modeling, and Data Visualization to discover valuable insights and solve data-driven business problems.
• Understanding and defining analytics requirements for various use cases.
• Leading modeling efforts and delivering expected predictive analytics outcomes on time using R/Python/SQL.
• Participating in Big Data architecture discussions for analytics.
From July 2014 to December 2015 (1 year 6 months)
Business Intelligence Analyst (Intern) @
• Predicted customer conversion probabilities using a Logistic Regression model in R.
• Recommended marketing campaigns for customer segments based on the probability of conversion.
• Created a new metric to track and analyze customer churn over various subscription types using SQL to increase customer retention.
• Invented a new visualization of churn trends within various time periods using MS SQL Server Reporting Services.
• Analyzed the impact of raising prices on customer acquisition and churn in various markets using SQL.
• Aggregated customer profile and survey response data in a large-scale dataset (20,000,000+ users) with SQL.
• Parsed JSON data and inserted it into MS SQL Server tables using a Python script.
• Collaborated with product and engineering teams on new business initiatives and bug resolutions.
From October 2013 to June 2014 (9 months)
Internal Auditor @
• Detected fraudulent transactions by tracking expenditures from the University Activities Fund.
• Improved student recruitment by analyzing educational and financial data of 13 public universities in Ohio.
From July 2011 to December 2012 (1 year 6 months)
Supply Chain Associate @
• Installed and fully integrated an SAP ERP system for supply chain optimization.
From January 2009 to December 2009 (1 year) Xi'an, Shaanxi, China
Master of Science (M.S.), Analytics @ University of San Francisco From 2013 to 2014
Katherine Zhao is skilled in: Python, R, MySQL, SAS, Data Science, Machine Learning, Statistics, Big Data, Text Mining, Data Visualization, Hadoop, Hive, Data Analysis, Tableau, HDFS