Data Platform At Netflix
San Jose, California, United States
Ayasdi
Software Engineer
October 2014 to October 2015
Goldman Sachs
Application Engineer ||
May 2007 to October 2014
Rutgers University
Research Assistant
August 2004 to April 2007
Bloomberg LP
Intern
June 2006 to September 2006
Rutgers University
Instructor of C programming language
August 2002 to July 2004
📖 Summary
Profile:
- Strong background in distributed computational infrastructure development, computational cloud, large-scale scalable and fault-tolerant data processing, and distributed financial instrument pricing.
- YARN application development and deployment; distributed resource management and job scheduling.
- Strong in algorithms and data structures, OO methodologies, data-driven coordination, and content-based data discovery.
- Contributor to the open-source Tachyon project.
Skills:
- Big Data Analytics
- AWS cloud, S3
- Hadoop/MapReduce/YARN/Spark
- Java, Slang, C, C++, HTML, XML, SQL, Perl, Python
- Eclipse, CVS, DB2, SybaseIQ, MySQL, Git, PostgreSQL
- Mac, Linux, Unix, VMware, OS X
Personality:
- Results-oriented; gets things done to a high standard and on time.
- Fast learner and good team player, able to drive and lead project development.
- Interested in mentoring and training.
Software Engineer @ Ayasdi
Backend infrastructure development: building scalable distributed infrastructure that enables complex data analytics algorithms and job execution on top of YARN.
- Developed distributed infrastructure for high-performance batch and interactive data analytics processing on top of native YARN. Implemented an application master that interacts with the resource manager and node managers to schedule and launch machine learning data analysis jobs on a Hadoop cluster.
- Designed and implemented a highly scalable, flexible system architecture that lets the infrastructure adapt to physical clusters. The system employs a group of application masters to split the job scheduling load, dynamically discovers and monitors the liveness of the application masters, and balances the workload among them.
- Updated the cluster to CDH5 and used Cloudera Manager to configure HDFS and YARN parameters; implemented Kerberos-based authentication to support secure Hadoop clusters; repackaged the software to be Hadoop-cluster ready.
From October 2014 to October 2015 (1 year 1 month)
Application Engineer || @ Goldman Sachs
Distributed large-scale infrastructure for financial instrument pricing and position attribute calculation. Researched and implemented approaches for distributed process profiling and compute cost reduction. Edit tracking process, automatic report generation.
Selected projects:
- Large-scale position attribute calculation infrastructure: developed the infrastructure to calculate financial instrument attributes, such as denomination, in data centers (terabytes of data daily); employed timeouts and multilevel retry mechanisms (immediate, delayed, and final) to handle calculation failures caused by transient application exceptions or by machine, network, or other resource issues; provided tools for local ad hoc testing and issue investigation; developed task status checks and reruns to ensure calculation completeness, which directly affects the federal stress-test results.
- Compute cost reduction via task batching: independently researched compute cost reduction. Proposed and evaluated batching approaches (splitting inputs to create tasks) to reduce total compute cost. Implemented and deployed a lazy batching approach that reduced compute cost by an average of 5% by packing as many computation units as possible into each task to minimize task management overhead; creation of time-insensitive tasks is delayed until enough computation units accumulate or a time limit is reached.
From May 2007 to October 2014 (7 years 6 months)
Research Assistant @ Rutgers University
Focus on large-scale distributed computational infrastructure, tuple-space-based distributed coordination systems, and autonomic computing.
Selected projects:
- Developed Comet, a peer-to-peer cloud computing infrastructure for large-scale computations (e.g., Monte Carlo simulations), which implements a distributed hash scheme on top of a resilient, self-organizing overlay network.
The infrastructure supports scalable data distribution, storage, flexible queries, and run-time host join/leave.
- Developed a Grid-based asynchronous replica exchange engine on Comet for structural biology and drug design applications. Deployed the Comet infrastructure and its applications on hundreds of machines in an Internet-based cloud computing test bed. The experiment demonstrated Comet's ability to support replica exchange applications in unreliable and highly dynamic network environments.
From August 2004 to April 2007 (2 years 9 months)
Intern @ Bloomberg LP
Developed an automatic failure monitor and detector in Perl.
From June 2006 to September 2006 (4 months)
Instructor of C programming language @ Rutgers University
Taught the C programming language to undergraduates.
From August 2002 to July 2004 (2 years)
Doctor of Philosophy (PhD), Parallel and Distributed Computing @ Rutgers University-New Brunswick
From 2002 to 2007
Master of Engineering (ME), Intelligence Network @ Beijing University of Posts and Telecommunications
From 1999 to 2002
Zhen Li is skilled in: Software Development, Large scale distributed and parallel system, Distributed pricing infrastructure, Financial risk management system, Java, Algorithms, Data Structures, OOP, Scalability, Teamwork, Mentoring, Hadoop, Data Analysis
What company does Zhen Li work for?
Zhen Li works for Ayasdi
What is Zhen Li's role at Ayasdi?
Zhen Li is a Software Engineer
What industry does Zhen Li work in?
Zhen Li works in the Information Technology and Services industry.
Who are Zhen Li's colleagues?
Zhen Li's colleagues are Megan Neider, Shannon Supple, Chris Whiteley, Liz Yeomans, Katie Tooma, Marshall Upshur, Ville Tuulos, Audrey Ledoux, Jenny CPA, and Nabanita Ghosal
Extraversion (E), Intuition (N), Feeling (F), Judging (J)
There's a 97% chance that Zhen Li is seeking new opportunities