My team's mission is to help the data scientists and research engineers at LinkedIn be more productive and effective. To achieve this goal, we are taking a holistic view of how Machine Learning is applied at LinkedIn. Machine Learning in itself is an optimization problem, but we're taking a step back and thinking through how we can optimize the human processes around applications of Machine Learning. LinkedIn has many data products (e.g. advertising, job recommendations, search), so its success depends heavily on its data scientists and research engineers. Because my team improves their productivity and effectiveness, we have a disproportionately large impact within the company.
Currently, we are building tools to streamline model fitting and deployment to production. We touch the whole process: data collection, sampling algorithms, feature engineering, optimization algorithms, evaluation, and the deployment and use of models in production systems.
Previously, I applied my academic experience in information retrieval to build search and content ranking systems at LinkedIn and, before that, at mSpoke (acquired by LinkedIn). I was the relevance lead who built the initial content relevance systems that powered LinkedIn Today (now LinkedIn Pulse), demonstrating the potential of content to LinkedIn. This paved the way for LinkedIn Influencers and the acquisition of Pulse.
My academic background gives me the ability to shape and articulate long-term vision and applied research directions. My practical experience enables me to build systems that scale to hundreds of millions of users and even larger volumes of activity from LinkedIn and around the Internet.

Engineering Manager @
I am building a team that creates Machine Learning tools and infrastructure with the goal of improving data scientist and research engineer productivity and effectiveness.
From November 2013 to Present (2 years 2 months) San Francisco Bay Area

Staff Software Engineer @
In addition to software development, I helped manage a small team of applied research engineers to build and improve the ranking infrastructure and models used by LinkedIn Today. I also worked on a Machine Learning infrastructure project to help streamline model fitting in Hadoop.
From April 2012 to October 2013 (1 year 7 months) San Francisco Bay Area

Senior Software Engineer @
Responsible for article relevance infrastructure and algorithms on LinkedIn Today.
From August 2010 to April 2012 (1 year 9 months) San Francisco Bay Area

Principal Scientist @
In my job at mSpoke, I was responsible for improving the company's text processing architecture and the relevance of items delivered by our products. Some examples of the work I accomplished:
- Developed named entity recognition and disambiguation tools for efficient recognition of millions of known entities.
- Designed analytics infrastructure for monitoring key customer and research metrics such as click-through rate.
- Implemented near duplicate detection tools.
- Scaled text classification architecture to handle high throughput classification for thousands of categories.
From September 2007 to August 2010 (3 years) Greater Pittsburgh Area

Graduate Student / Research Assistant @
My research at the LTI focused on the use of document structure to improve the statistical language model and inference network approaches to information retrieval. I have applied these models to known-item finding of web documents, XML element retrieval, and the retrieval of linguistically annotated text.
In addition to my dissertation research, I have researched evaluation methodology for element retrieval and federated search. I have also contributed to the Lemur Toolkit, an open source toolkit designed for information retrieval researchers.
From August 2000 to June 2010 (9 years 11 months) Greater Pittsburgh Area

Graduate Research Intern @
At IBM Almaden, I worked for IBM's Project Avatar on developing an algebra for rule-based entity and relationship extraction from text. An algebra for extraction enables interesting opportunities for optimization.
From June 2006 to August 2006 (3 months) San Francisco Bay Area

Research Intern @
During my time at AOL I worked with the search group to incorporate text annotators into their document processing pipeline.
From June 2000 to August 2000 (3 months) Washington D.C. Metro Area

Senior Undergraduate Research Programmer @
My work at CIIR included the design and development of information retrieval document relationship visualizations, an acronym and definition extraction system, and a search engine.
From January 1998 to May 2000 (2 years 5 months) Springfield, Massachusetts Area

PhD, Language and Information Technologies @ Carnegie Mellon University
From 2000 to 2010

BS, Computer Science and Mathematics @ University of Massachusetts, Amherst
From 1996 to 2000

Paul Ogilvie is skilled in: Information Retrieval, Text Classification, Information Extraction, Text Analytics, Natural Language Processing, Named Entity Recognition, Recommender Systems, Content Aggregation, Personalization, Search Engine Technology, Search Algorithms, Text Mining, Search, Data Mining, Scalability, Machine Learning, Distributed Systems