Senior Architect/Technical Implementation Manager @
• Architected and implemented a predictive model to analyze client behavior using Hadoop, Flume, Java, Pentaho, Hive, and MongoDB.
• Designed the roadmap for adopting Hadoop/HDFS with MapReduce and created strategies for optimizing cluster utilization and implementation.
• Installed and configured an Apache Hadoop, Sqoop, Flume, and Hive environment on the prototype server; loaded 20 TB of highly unstructured and semi-structured data (replication factor of 3) onto HDFS.
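A minimal sketch of the kind of HDFS load described in the bullet above; the directory and file paths are hypothetical:

```sh
# Create a landing directory on HDFS and load the raw files into it
hdfs dfs -mkdir -p /data/raw/clickstream
hdfs dfs -put /staging/clickstream/*.log /data/raw/clickstream/

# Enforce a replication factor of 3 on the landed data and verify its size
hdfs dfs -setrep -w 3 /data/raw/clickstream
hdfs dfs -du -h /data/raw/clickstream
```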
• Loaded unstructured data into the Hadoop Distributed File System (HDFS), moved data between HDFS and RDBMS using Sqoop, and designed and implemented MapReduce jobs using Java and the Pentaho Kettle tool.
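A minimal sketch of the kind of Sqoop transfers described above; the JDBC URL, credentials, tables, and HDFS paths are hypothetical:

```sh
# RDBMS -> HDFS: import a table with 4 parallel mappers (-P prompts for the password)
sqoop import \
  --connect jdbc:mysql://dbhost:3306/sales \
  --username etl_user -P \
  --table transactions \
  --target-dir /data/raw/transactions \
  -m 4

# HDFS -> RDBMS: export aggregated results back to a reporting table
sqoop export \
  --connect jdbc:mysql://dbhost:3306/sales \
  --username etl_user -P \
  --table transactions_summary \
  --export-dir /data/out/transactions_summary
```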
• Architected the design for collecting and aggregating large amounts of log data using Apache Flume and staging the data in HDFS for further analysis.
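A minimal sketch of a Flume agent of the sort described above, tailing an application log into date-partitioned HDFS directories; the agent name, log path, and HDFS URI are hypothetical:

```sh
cat > agent.conf <<'EOF'
a1.sources  = r1
a1.channels = c1
a1.sinks    = k1

# Tail the application log as the event source
a1.sources.r1.type     = exec
a1.sources.r1.command  = tail -F /var/log/app/app.log
a1.sources.r1.channels = c1

# Buffer events in memory between source and sink
a1.channels.c1.type     = memory
a1.channels.c1.capacity = 10000

# Stage raw events in date-partitioned HDFS directories for later analysis
a1.sinks.k1.type                   = hdfs
a1.sinks.k1.channel                = c1
a1.sinks.k1.hdfs.path              = hdfs://namenode:8020/flume/logs/%Y-%m-%d
a1.sinks.k1.hdfs.fileType          = DataStream
a1.sinks.k1.hdfs.useLocalTimeStamp = true
EOF

flume-ng agent --conf conf --conf-file agent.conf --name a1
```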
• Managed a team of senior engineers implementing the key components of the product to specification; defined and enforced Big Data/Kettle best practices, an organization-wide SCM strategy, and agile methodology across the team.
• Responsible for conceptualizing and designing the end-to-end solution architecture blueprint for current infrastructure upgrades and future-state technical capabilities for Charles River Development.
• Articulated service visions, aligned information technologies with enterprise strategy, and shared common solutions and best practices.
• Presented technology and infrastructure decision options for the transformation from legacy systems to automated predictive data analytics, situational intelligence, and cloud services.
• Acted as a mentor for the development team and assisted developers in all aspects of the software life cycle, including definition, design, implementation, testing, and delivery; assigned and prioritized project-related tasks.
• Authored complex multi-year statements of work and implementation plans for each client to engage the professional services team in building out solutions.

From April 2014 to Present (1 year 9 months), Greater Boston Area

Architect/Project Manager/Principal Software Engineer @
• Integrated Hadoop into traditional ETL, accelerating the extraction, transformation, and loading of massive structured and unstructured data. Installed and configured Cloudera- and MapR-based Hadoop systems; developed a centralized service setup to start, stop, and expand data nodes.
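A minimal sketch of the kind of per-node service control described in the bullet above, assuming the stock Hadoop daemon scripts and slaves file of that era; host names are hypothetical:

```sh
# Start or stop the DataNode daemon on the local worker
hadoop-daemon.sh --config $HADOOP_CONF_DIR start datanode
hadoop-daemon.sh --config $HADOOP_CONF_DIR stop datanode

# Expand the cluster: register the new host, then start its DataNode remotely
echo "worker-09.example.com" >> $HADOOP_CONF_DIR/slaves
ssh worker-09.example.com "hadoop-daemon.sh start datanode"
```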
• Designed MapReduce processes using Informatica and Java/Python to load data into HDFS; architected ETL jobs to load JSON data into HBase and transported HBase data into the data warehouse.
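A minimal sketch of the HBase side of such a load; the table and column-family names, and the job jar and driver class, are hypothetical:

```sh
# Create the target table with one column family for the JSON attributes
echo "create 'customer_events', 'attrs'" | hbase shell

# Launch the custom MapReduce load job (hypothetical jar and driver class)
hadoop jar etl-jobs.jar com.example.etl.JsonToHBaseJob \
  /data/raw/events_json customer_events
```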
• Designed a real-time computation engine using Flume, Kafka, Storm/Spark, and a complex event processing engine to offer credit card customers credit line increases.
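A minimal sketch of the Kafka leg of such a pipeline, using the console tools of that era; the topic name, hosts, and partition counts are hypothetical:

```sh
# Create the transaction topic the Storm/Spark topology consumes from
kafka-topics.sh --create --zookeeper zk1:2181 \
  --replication-factor 3 --partitions 8 --topic card-transactions

# Smoke-test the topic with a console producer and consumer
kafka-console-producer.sh --broker-list broker1:9092 --topic card-transactions
kafka-console-consumer.sh --zookeeper zk1:2181 --topic card-transactions --from-beginning
```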
• Created the road map for Big Data, ETL, and SCM/release engineering technology across the organization. Led and managed the infrastructure and database development teams; helped formulate estimates and timelines for project activities and set related goals.
• Defined and enforced project processes, policies, and Informatica best practices across the organization. Introduced advanced software development techniques such as ETL mapping design, code reviews, and unit and integration test plans.
• Designed, architected, and delivered a multi-year, complex data warehousing key initiative using Java, shell and Perl scripts, XML database schemas, SQL/PL/SQL, and T-SQL.
• Transformed complex manual efforts into simple, automated, data-driven user experiences for better decision-making and competitive advantage.
• Shared deep knowledge of the content and data management lifecycle and of technology solutions to grow internal understanding of what is possible with new and future technologies; brought in technology vendors to demo leading-edge content and data management systems for the core team.
• Presented decision points and the pros and cons of various approaches and technologies, including the differences between custom development and commercial off-the-shelf product vendors.

From November 2006 to March 2014 (7 years 5 months)

Lecturer/Mentor @
• Mentored students on university Java, C++, database, SQL, and PL/SQL research projects.

From July 2001 to December 2004 (3 years 6 months), Gujarat, India
Master of Science (M.S.), Computer Engineering @ California State University-Chico, from 2005 to 2006
Bachelor of Science (B.S.), Electronics and Communications Engineering @ Sardar Patel University, from 1997 to 2001

Macdonald Macwan is skilled in: Hadoop, Big Data, ETL, NoSQL, MapReduce, HBase, Amazon Web Services (AWS), Windows Azure, Cloud Computing IaaS, Hive, Apache Pig, Informatica, Pentaho, Data Warehouse Architecture, MongoDB