Windsor Mill, MD
- Ph.D. or Master's degree in a highly analytic field such as Mathematics, Statistics, Computer Science, Physics or Data Science.
- A documented minimum of 3 years’ hands-on, expert-level experience working with real-world data sets in predictive modelling using a scripting language such as Python, R and/or MATLAB. Preference for Python, then R, then MATLAB.
- Specifically, experience performing classification and regression on real-world data with methods such as support vector machines, random forests, and linear and non-linear regression.
- Expert knowledge in conducting data analysis and applying advanced statistical concepts and machine learning methods to build, train, test, and evaluate a variety of supervised and unsupervised analytic models.
- Ability to clean and process large amounts of real-world data.
- Experience retrieving and manipulating data from a variety of data sources, including Db2, Oracle, SQL Server, Hadoop and flat files.
- Experience with SQL and database management systems, e.g. MySQL and SQLite.
- Experience with, or the ability and willingness to learn, distributed processing via the Hadoop ecosystem, e.g. Spark, Impala and Hive.
- Experience working in a Linux environment via terminal commands.
- Experience presenting data and results to both technical and non-technical audiences in a clear and effective manner.
- Ability to obtain a Public Trust clearance.
- An in-depth knowledge of Social Security Administration (SSA) protocols concerning death processing, debt management, data reporting, etc.
- Knowledge of and experience handling SSA master files and related data.
- Natural language processing experience.
- VBA scripting experience.
- Experience using markup languages such as LaTeX, HTML, etc.
- Experience working in Agile development teams.
- Experience teaching and mentoring in any or all of the above.