Computational Linguist / Linguistic Data Developer
Greater Boston Area
I have a Ph.D. in Linguistics from Indiana University, where I specialized in Arabic phonetics and phonology. I have published more than 20 articles in refereed journals, alone and with colleagues, in this field. I have 13 years of experience in lexicon and corpus development, and I have worked in text analytics, voice recognition, corpus development, and information extraction. I manage data, and I have hired, trained, and managed linguists and native speakers of more than 30 languages for tasks such as annotation, vocalization (Arabic script languages), part of speech tagging, translation, transcription, lexicon development, regular expression development, and grammar creation. I have done extensive manual quality assurance on various products, written test plans, and found and reported bugs following Scrum methodology. I also have working experience with Middle Eastern languages: Dari, Farsi, Pashto, and Urdu. Finally, I am a Certified ScrumMaster and have experience working in an agile Scrum development cycle.

Project Supervisor/ Contracting Linguist
From November 2013 to Present (2 years 2 months)

Computational Linguist
From November 2012 to November 2013 (1 year 1 month)
Linguistic development work for machine translation of Arabic and Urdu into English:
* Developed lexicons for Arabic and Urdu.
* Wrote conjugation rules for Arabic verbs that generate all the verb forms of different templates.
* Wrote tagging rules that identify different groups of Arabic nouns, adjectives, and verbs.
* Set up the verb morphology by creating rules for runtime analysis.
* General QA of the software: reporting and verifying bugs.

Contracting Linguist
From August 2012 to 2013 (1 year)
Created reading and listening multiple-choice test items for testing Arabic proficiency, following the ACTFL and ILR Proficiency Guidelines.

Senior Linguist
From 2005 to August 2012 (7 years)
• Developed corpora for more than 16 languages in various projects, such as named entity annotation for news, social media articles, and product articles; segmentation for Asian languages; vocalization of names for Arabic script languages; and reverse transcription of the Romanized Arabic used in chatrooms.
• Designed and developed detailed manual quality assurance test plans and functional test plans for testing search in Arabic (including issues such as normalization, tokenization, and lemmatization).
• Performed extensive manual quality assurance testing of Basis products using virtual machines. Suggested ways to improve the products. Found, reported, and verified bugs following the Scrum framework.
• Maintained, refined, and localized guidelines for named entity tagging (using DocBook), part of speech tagging, regular expressions, titles of people, etc.
• Located, hired, trained, and managed foreign language temporary workers.
• Researched and presented talks at the Government Users Conference on various Middle Eastern language issues, such as Afghani languages (Pashto and Dari), linguistic challenges of Arabic chat, the structure of Arabic nicknames, and orthographic variations in Arabic corpora.
• Ran Python scripts, regular expressions, and Unix commands to analyze large quantities of data.
• Knowledge of various transcription standards, such as the Intelligence Community (IC) and Board on Geographic Names (BGN) standards.
• Worked closely and directly with NLP engineers to plan, design, develop, debug, and enhance the quality of the core technologies and the internal processes.
• Experience working with teams using the Agile/Scrum development framework.

Contracting Linguist
From May 2002 to December 2005 (3 years 8 months)
Managed and developed all of the Arabic linguistic data that was needed for building a limited two-way speech translator (Arabic ~ English), which uses Arabic speech recognition and information extraction modules.

Researcher
From 2003 to 2005 (2 years)
• Located native speakers of Arabic.
• Wrote transcription guidelines for colloquial Arabic speech.
• Transcribed Levantine Arabic for the Fisher Levantine Arabic Conversational Telephone Speech corpus.
• Transcribed Iraqi Arabic telephone speech.
• Part of speech tagging of Arabic.

Lexicon Developer Lead
From 2001 to 2002 (1 year)
Developed lexical lists and a database to be used in a product for information extraction from medical reports.

Lexicon and Corpus Developer
From 1999 to 2001 (2 years)
• Responsible for the maintenance of the US English dictionary, which was used for all speech products.
• Developed and delivered the legal context for Voice Xpress.
• Developed a lexical database.
• Wrote functional specifications for phonetic transcription.
• Managed one transcriber, who produced phonetic transcriptions of English words.
• Evaluated the output of AppTek’s machine translation engine from different linguistic perspectives: morphologically, syntactically, phonetically, and orthographically.
• Researched the phonological and phonetic rules that apply to Modern Standard Arabic. The rules were used by the speech synthesis engine.

Indiana University Doctoral Fellow
From 1998 to 1999 (1 year)
Completed independent articulatory and acoustic research on Arabic phonetics and phonology.

Resident Assistant
From 1995 to 1999 (4 years)
Duties included advising and assisting students. Developed strong and effective leadership skills.

Treasurer of the Indiana University Linguistics Club
From 1995 to 1999 (4 years)
* Process all payments made to the IULC. Enter all the information in the IULC's accounting tracking system.
* Ensure that all club members have paid their membership dues.
* Prepare ledger sheets for recording all income and expenses for Student Organization Accounts (SOA).
* Deposit funds (dues, fundraising income, etc.) in the SOA account.

Associate Instructor
From 1996 to 1998 (2 years)
Taught undergraduate-level Introductory Linguistics classes and a graduate-level phonetics lab.

Ph.D., Linguistics (major), Speech and Hearing (minor) @ Indiana University Bloomington
From 1994 to 1999

MA, Linguistics @ The University of New Mexico
From 1993 to 1994

B.A., English Literature, minor in Psychology @ University of Jordan
From 1988 to 1991

Bushra CSM is skilled in: Computational Linguistics, Natural Language Processing, Linguistics, Information Retrieval, Teaching, Translation, Data Mining, Machine Translation, English, Text Analytics, Machine Learning, Perl, Research, Arabic, Modern Standard Arabic
Appen Butler Hill, Inc
Project Supervisor/ Contracting Linguist
November 2013 to Present
LinguaSys
Computational Linguist
November 2012 to November 2013
The American Council on the Teaching of Foreign Languages (ACTFL)
Contracting Linguist
August 2012 to 2013
Basis Technology
Senior Linguist
2005 to August 2012
BBN Technologies, Cambridge MA
Contracting Linguist
May 2002 to December 2005
Linguistic Data Consortium
Researcher
2003 to 2005
Dictaphone Corporation, Burlington, MA
Lexicon Developer Lead
2001 to 2002
Lernout and Hauspie Speech Products
Lexicon and Corpus Developer
1999 to 2001
Indiana University Bloomington
Indiana University Doctoral Fellow.
1998 to 1999
Indiana University Bloomington
Resident Assistant
1995 to 1999
Indiana University Bloomington
Treasurer of the Indiana University Linguistics Club
1995 to 1999
Indiana University Bloomington
Associate Instructor
1996 to 1998
What company does Bushra CSM work for?
Bushra CSM works for Appen Butler Hill, Inc
What is Bushra CSM's role at Appen Butler Hill, Inc?
Bushra CSM is Project Supervisor/ Contracting Linguist
What industry does Bushra CSM work in?
Bushra CSM works in the Computer Software industry.
Who are Bushra CSM's colleagues?
Bushra CSM's colleagues are Niklas Kotter, Robin Low, Yena Han, Tong Liu, mine nazik, Joseph Franklin, Dominador Dominic Simonking, Gregory Dunham, Annabelle Adrales-Barrios, and Cedric Lokula
Issued by College of Arts & Sciences, Indiana University. · 1998
Issued by College of Arts & Sciences, Indiana University · 1998
Issued by Linguistic Institute, University of New Mexico · 1995