A bioinformatics start-up is looking for a data scientist in New York.
Your primary focus will be applying data mining techniques, performing statistical analysis, and building high-quality prediction systems integrated with the company's Core Portfolio.
We are looking for candidates with experience in analyzing information and generating technical input for clinical trials and research.
Responsibilities
- Selecting features, building and optimizing classifiers using machine learning techniques
- Data mining using state-of-the-art methods
- Extending the company's data with third-party sources of information when needed
- Enhancing data collection procedures to include information that is relevant for building analytic systems
- Processing, cleansing, and verifying the integrity of data used for analysis
- Doing ad-hoc analysis and presenting results in a clear manner
- Creating automated anomaly detection systems and continuously tracking their performance
Skills and Qualifications
- Excellent understanding of machine learning techniques and algorithms, such as k-NN, Naive Bayes, SVM, Decision Forests, etc.
- Experience with common data science toolkits, such as R, Weka, NumPy, MATLAB, etc. Excellence in at least one of these is highly desirable
- Experience with dairy herd management systems, such as DC305, PCDart, DHI Plus, is desirable
- Dairy operation management experience is desirable
- Great communication skills
- Experience with data visualization tools, such as D3.js, ggplot, etc.
- Proficiency in using query languages such as SQL, Hive, Pig
- Experience with NoSQL databases, such as MongoDB, Cassandra, HBase
- Good applied statistics skills, such as distributions, statistical testing, regression, etc.
- Good scripting and programming skills
- Data-oriented personality
- Experience in designing and implementing large-scale machine learning algorithms and models, with solid programming skills in languages such as C++, Java, Python, Scala, or R, and with machine learning tools such as TensorFlow, Caffe, scikit-learn, or others.
- Experience in optimization models and methodologies is strongly preferred.
- Outstanding research track record in machine learning, data mining, or operations research, evidenced by publications in renowned journals or conferences, or by open-source contributions.