I am a software engineer with experience building and shipping products based on machine learning while leveraging distributed systems.
I am passionate about building statistically sound products that leverage machine learning and data science to turn large, complex, high-dimensional datasets into actionable insights.
I enjoy working in high-energy teams, writing efficient, scalable, production-quality code as well as data ninja-ing, including working with incomplete datasets.
I like hard problems that allow me to draw upon my strong background in machine learning, statistical inference and prediction, optimization, matrix factorization and other mathematical underpinnings behind machine learning algorithms (Stanford MS Computational Engineering).
Software Engineer @
• Developed enterprise big-data real-time personalization software on top of the open-source project Kiji while leveraging math, statistics and machine learning.
• Developed the first commercial experimental framework enabled by machine learning in Hadoop, reducing the time to modify and implement personalization models from months to minutes (link: http://goo.gl/VUZpg3).
• Developed and implemented batch jobs for evaluating model performance within the experimental framework.
• Generated insights for improving recommendation models through statistical analysis of recommendation data to evaluate data quality and better understand data structure and patterns.
(Python, R, Scalding, Java, HBase, Hadoop, Impala, Hive)
From 2014 to 2015 (1 year) San Francisco Bay Area

Research Engineer @
• Developed and implemented machine learning algorithms for object detection.
• Accelerated large-scale earthquake simulations using GPUs to significantly reduce computing cost while maintaining accuracy (work presented at GPU Technology Conference 2014).
From 2013 to 2014 (1 year) San Francisco Bay Area

Analytics Experienced Associate @
• Developed software for optimizing constrained dynamic project portfolios for efficient allocation.
From 2012 to 2013 (1 year) San Francisco, CA

Course Assistant/Section Instructor @
Projects:
1) Machine learning:
• Predicted tourism trends using time-series regression with feature selection on query-specific web search volume histories from Google Insights.
• Applied k-fold cross-validation to determine model parameters.
2) Optimization:
• Developed and implemented a multivariate optimization routine to solve a penalized bound-constrained optimization problem.
• Developed a solver for the Lasso problem by reducing it to a linearly-constrained non-negative least squares problem.
3) Math modeling:
• Developed a mathematical model for the Dropbox referral program.
• Key result: increased profits by 4.6% by optimally identifying the extra storage space to award to users for referrals.
Teaching:
• Taught sections for Stanford undergraduate courses in 1) Scientific computing, 2) Probability, 3) Statistics, 4) Linear algebra, 5) Differential equations, and 6) Vector calculus.
From 2010 to 2012 (2 years) Stanford, CA

Graduate Research Assistant @
• Researched and developed novel techniques to synthesize graphene (a single sheet of graphite) for next-generation microprocessors.
• Published research in leading peer-reviewed journals and presented at conferences.
• Mentored two undergraduate students in research.
From 2007 to 2010 (3 years) Austin, TX
MS, Computational and Mathematical Engineering @ Stanford University
From 2010 to 2012

MS (Thesis), Electrical Engineering @ The University of Texas at Austin
From 2007 to 2010

BS, Electrical Engineering @ Punjab Engineering College
From 2003 to 2007

Shagandeep Kaur is skilled in: Scala, Statistics, Machine Learning, Algorithms, Data Mining, Python, Optimization, Large-scale Data..., Hadoop, HBase, Recommender Systems, R, GPU, Matlab, Statistical Learning, Hive, Distributed Systems, Scalding