Open Position: Data Scientist for DSI

Send CV to dsi.jobs@biu.ac.il

Job Description: We are seeking a skilled and versatile Data Scientist to join our team. As a Data Scientist, you will play a critical role in leveraging data-driven insights to solve complex problems. You will collaborate with cross-functional teams, engage with academic researchers, and work closely with industry clients to deliver impactful data science solutions. Our initial focus will be Natural Language Processing (NLP).

Responsibilities:

  • Apply advanced data analysis, statistical modeling, and machine learning techniques to extract insights from large and complex datasets.
  • Develop predictive models and algorithms to address business challenges and support decision-making.
  • Collaborate with software engineers to implement and deploy data science solutions into production.
  • Collaborate with academic researchers to stay up-to-date with the latest advancements in data science and contribute to cutting-edge research.
  • Engage with industry clients to understand their data-related needs and design customized solutions.
  • Communicate findings and insights to both technical and non-technical stakeholders through visualizations, reports, and presentations.

Qualifications:

  • Master’s or Ph.D. degree in a quantitative field such as Computer Science, Statistics, Mathematics, or a related field.
  • Strong proficiency in programming languages like Python and R.
  • Hands-on experience with data preprocessing, feature engineering, and model development.
  • Familiarity with machine learning libraries (e.g., scikit-learn, TensorFlow, PyTorch) and data manipulation libraries (e.g., pandas).
  • Knowledge of and experience with techniques and skills that showcase your expertise in the NLP domain, including:
      • Text Preprocessing
      • Contextual Embeddings, e.g., BERT, GPT, Transformer-based models
      • Text Classification and Sentiment Analysis
      • Libraries and frameworks: NLTK, spaCy, Gensim, Transformers (Hugging Face)
      • LLMs and generative LLMs: Text Generation using Recurrent Neural Networks (RNNs) and Transformers
      • Autoencoder-Decoder Techniques
  • Excellent problem-solving skills and the ability to think critically about complex issues.
  • Effective communication skills and the ability to work collaboratively in cross-functional teams.
  • Prior experience in working on data science projects with industry partners.

Data scientist position (climate and atmospheric physics)

The climate and atmospheric physics group at Bar-Ilan University is seeking a data scientist to join the group. The candidate should have advanced programming skills and experience in machine learning and big data analysis. These skills will be applied to satellite data and climate models to study interactions between clouds, climate, and atmospheric circulation. A background in physics or environmental science is an advantage. Master's or Ph.D. graduates in computer science or physics who are interested in academic research would be ideal candidates, although anyone with the required skills is welcome to apply.

For more information, contact Tom Goren: tom.goren@mail.huji.ac.il

Seeking a Part-time Research Assistant for a project on motion capture of sign languages

We are looking for a part-time Research Assistant for an exciting project involving the analysis of sign languages through motion capture. Interested students should enquire only if they have a Computer Science/programming background and an interest in Computer Vision. Please email Dr. Rose Stamp: rose.stamp@biu.ac.il

Open M.S. and Ph.D. Positions at the Neural Interfaces lab

There are a couple of open graduate student (PhD or MSc) positions in the Neural Interfaces lab led by Prof. Izhar Bar-Gad:

  1. Identification of involuntary movements from video: In a joint project with Schneider Children’s Hospital, we collected videos of Tourette syndrome patients displaying involuntary facial movements (motor tics).

Research definition: Unsupervised learning algorithms for identifying repeating behavioral motifs from the video.

Required background: Computer vision, Machine learning.

  2. Identification of animal behavior from a kinematic time series: We record kinematic signals (accelerometer, gyroscope, magnetometer) from animals during natural behavior. This information is subsequently used to decode the behavior and link it to brain activity.

Research definition: Unsupervised learning algorithms for identifying repeating behavioral motifs from the kinematic signals.

Required background: Machine learning, Time series analysis.

For more details, please reach out to:

Izhar Bar-Gad, Ph.D.

Professor, Neural Interfaces lab

Gonda Brain Research Center

Bar Ilan University

Email: izhar.bar-gad@biu.ac.il

WWW: http://www.ibglab.org/

Open Master Thesis Position in Networks and NLP

We are seeking a CS or Math Master's student to do his/her thesis as part of a funded project.

The Network Properties of Word Embeddings project is funded by the Data Science Institute.

Participating Faculty members:

Prof. Reuven Cohen, Department of Mathematics, reuven@math.biu.ac.il

Prof. Yoav Goldberg, Department of Computer Science, yogo@cs.biu.ac.il

Dr. Simcha (Simi) Haber, Department of Mathematics, simi@biu.ac.il

Description of research work:

This work combines two strength areas of Bar-Ilan University – Network Science and Natural Language Processing – to study the network properties of word embeddings.

Word embedding, the mapping of words into numerical vector spaces, has seen tremendous success in numerous NLP tasks in recent years ([1]). Multiple methods for learning word embeddings from textual corpora have also been proposed. The resulting representations typically preserve semantic, syntactic and other properties of words. Once words are represented as vectors in a Euclidean space of dimension n (typically n is in the range of 50 to 500), it is natural to consider the implied graph or network of the words. In such a graph, two words are adjacent based on their distance as vectors or on another similarity metric. Graphs and networks, including word graphs, are widely used in Natural Language Processing (e.g. [2], [3]), and it has been shown that word graphs share statistical features with other complex networks ([5]). However, the network properties of word embedding graphs have not been investigated until now.
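One possible way to build such a word graph is sketched below. It is only an illustration: the k-nearest-neighbor rule, the use of cosine similarity, the function names and the three-dimensional toy vectors are all assumptions made for the example, not the construction chosen by the project.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def knn_word_graph(embeddings, k):
    """Connect each word to its k most similar words (undirected edges).

    `embeddings` maps a word to its vector; the result maps a word
    to the set of words adjacent to it.
    """
    words = list(embeddings)
    graph = {w: set() for w in words}
    for w in words:
        others = [x for x in words if x != w]
        others.sort(
            key=lambda x: cosine_similarity(embeddings[w], embeddings[x]),
            reverse=True,
        )
        for x in others[:k]:
            graph[w].add(x)
            graph[x].add(w)
    return graph


# Made-up toy vectors, for illustration only (real embeddings have 50-500 dimensions).
toy_embeddings = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.1, 0.9, 0.2],
    "bus": [0.0, 0.8, 0.3],
}

word_graph = knn_word_graph(toy_embeddings, k=1)
```

With k=1 the toy graph links "cat" with "dog" and "car" with "bus". A distance threshold instead of a fixed k would be an equally plausible adjacency rule and would give a different graph.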

The work will involve analyzing and comparing network properties of various word-embedding-based networks across word embedding algorithms and parameters, corpora, and languages, and studying the relationship between these network properties and linguistic properties. When analyzing and comparing the word networks, we will look at common graph properties and algorithms, such as various centrality measures, clustering algorithms, and PageRank ([4]). We will also investigate properties of individual nodes (words), edges (relationships between words) and components (groups of words).
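As a rough illustration of the kinds of measures mentioned above, the sketch below computes degree centrality, the local clustering coefficient and PageRank on a tiny undirected word graph. The toy graph, the function names and the parameter defaults are assumptions made for the example; the PageRank routine is a plain power iteration that assumes every node has at least one neighbor.

```python
def degree_centrality(graph):
    """Fraction of the other nodes that each node is adjacent to."""
    n = len(graph)
    return {node: len(nbrs) / (n - 1) for node, nbrs in graph.items()}


def clustering_coefficient(graph, node):
    """Share of pairs of a node's neighbors that are themselves adjacent."""
    nbrs = list(graph[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if nbrs[j] in graph[nbrs[i]]
    )
    return 2.0 * links / (k * (k - 1))


def pagerank(graph, damping=0.85, iterations=100):
    """PageRank by power iteration on an undirected graph.

    Assumes no isolated nodes, so every neighbor has a nonzero degree.
    """
    n = len(graph)
    rank = {node: 1.0 / n for node in graph}
    for _ in range(iterations):
        new_rank = {}
        for node in graph:
            incoming = sum(rank[nbr] / len(graph[nbr]) for nbr in graph[node])
            new_rank[node] = (1.0 - damping) / n + damping * incoming
        rank = new_rank
    return rank


# Hypothetical word graph: node -> set of adjacent words.
toy_graph = {
    "king": {"queen", "prince"},
    "queen": {"king", "prince", "woman"},
    "prince": {"king", "queen"},
    "woman": {"queen"},
}

centrality = degree_centrality(toy_graph)
ranks = pagerank(toy_graph)
most_central_word = max(ranks, key=ranks.get)
```

In this toy graph "queen" is adjacent to every other word, so it has the highest degree centrality and the highest PageRank. On real embedding graphs, a graph library would be the more practical choice for these computations.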

There are multiple word embedding techniques commonly used for NLP applications – e.g. Word2Vec or GloVe. The techniques vary in the underlying algorithm, the objective function and the context used for words in the text. Other parameters such as window size or embedding size also have an effect on the resulting word vectors. We will compare the networks resulting from the different algorithms and parameters. 

Different textual datasets (corpora) result in different embedding vectors, as words have different usage patterns in, say, a general and fairly formal corpus such as Wikipedia than in Twitter tweets. Naturally, we will compare network properties across languages as well. In all cases (algorithm, corpus and language), we will search for universal network properties (e.g., whether certain centrality or other measures are indifferent to language or algorithm). For measures or properties that do differ across embedding graphs, we will investigate whether the differences can be tied to linguistic properties (e.g., does a certain measure correlate with the morphological richness of a language, or do languages from the same linguistic family have similar network properties?).

This interdisciplinary work brings together experts from two different data science fields (NLP and Network Science) who will co-advise an MSc student. We expect to publish the results of this project in top venues of both fields of study.

[1] Speech and Language Processing (3rd ed.). Dan Jurafsky and James H. Martin. 2019. Chapter 6: Vector Semantics and Embeddings.

[2] Graph-Based Methods for Natural Language Processing and Understanding – A Survey and Analysis. Michael T. Mills and Nikolaos G. Bourbakis. IEEE Transactions on Systems, Man, and Cybernetics: Systems. 2014.

[3] A survey of graphs in natural language processing. V. Nastase, R. Mihalcea, and D. Radev. Natural Language Engineering, 21(5), 665-698. 2015.

[4] Graph Theory. Adrian Bondy and U.S.R. Murty. Springer, 2017.

[5] The small world of human language. Ferrer i Cancho and Solé. Proceedings of the Royal Society B: Biological Sciences, 268(1482), 2261-2265. 2001.

If interested, please contact Dr. Simcha (Simi) Haber: simi@biu.ac.il

MA Student Required for Research Project about Sign Languages

Seeking to recruit an MA student with Python/data-science skills for work on a funded project led by Dr. Rose Stamp in the English Literature & Linguistics department. The project will include analysis of sign language and motion capture data using machine learning and Python scripts.

50% position. Work may result in academic publications.

For details please contact Dr. Rose Stamp – rose.stamp@biu.ac.il

Research Assistant – Open Position

Position closed

Seeking a research assistant for a project by Dr. Gabrielle Gayer, Prof. Offer Lieberman and Prof. Itzhak Gilboa, titled “Estimating Rules: Tree-Based Techniques as Cognitive Models”.

The project includes analysis of a large database of real-estate prices in Australia and the application of prediction models to estimate prices via machine learning techniques such as decision trees and random forests.

65%-75% position for up to three years.

Proper background in programming and data science required.

If interested please contact Gabi at gabi.gayer@gmail.com.

Student Position for Research Project on Bots in Social Networks

Position closed!

Seeking a student with Python/data-science skills for work on a funded bot project led by Dr. Alon Sela.
50% flexible position. Hourly rate – 90 NIS (a Milga scholarship may be an option). Work may result in academic publications.

For details please contact Dr. Alon Sela – alonse2012@gmail.com