PhD position in Distributed Systems (1.0 FTE)
Distributed server clusters are often used effectively to perform data analysis on voluminous collections of data. These clusters substantially speed up large-scale data analysis by dividing data collections among the available machines, where they can be processed in parallel. For instance, the distributed data processing platform Spark has become a de facto standard in the world of large-scale data processing. The data processing pipelines for such platforms are composed at design time and then submitted to the central “master” component, which then distributes the code among several worker nodes.
In many practical situations, the analysis application is not static and evolves over time: developers add new processing steps, data scientists adjust the parameters of their algorithms, and quality assurance discovers new bugs. Currently, an update of a pipeline looks as follows: the developers patch their code, re-submit the updated version, and finally restart the entire pipeline.
However, restarting a processing pipeline safely is difficult: the intermediate state is lost and needs to be recomputed, some data need to be reprocessed, and the cost of restarting may not be trivial, especially for real-time streaming components that require 24/7 availability.
In this project we develop a platform to support evolving data-intensive applications without the need for restarting them when the requirements change (e.g. new data sources or algorithms become available). We apply our developed tools and techniques and evaluate their effectiveness in the context of three different industrial use cases from three top sectors: water treatment, life sciences, and High-Tech Systems and Materials / Smart Industry.
Do you want to be part of our team?
- you will have obtained a Master's degree before the end of June
- you have experience in Computer Science or related fields, with a strong background in formal methods, service-oriented computing, software engineering, concurrency and distributed systems, and especially practical software tool development
- you have experience in the field of machine learning, data analysis / statistics, and big data analysis platforms.
CONDITIONS OF EMPLOYMENT
Fixed-term contract: 12 months.
We offer you in accordance with the Collective Labour Agreement for Dutch Universities:
- a salary of € 2,325 gross per month in the first year, up to a maximum of € 2,972 gross per month in the fourth and final year
- a full-time position (1.0 FTE)
- a holiday allowance of 8% of gross annual income
- an 8.3% end-of-year allowance.
The position is limited to a period of 4 years. A PhD training programme is part of the agreement and you will be enrolled in the Graduate School of Science and Engineering.
You get a temporary position of one year with the option of renewal for another three years. Prolongation of the contract is contingent on sufficient progress in the first year to indicate that a successful completion of the PhD thesis within the next three years is to be expected.
Faculty of Science and Engineering
Founded in 1614, the University of Groningen enjoys an international reputation as a dynamic and innovative institution of higher education offering high-quality teaching and research. Flexible study programmes and academic career opportunities in a wide variety of disciplines encourage the 31,000 students and researchers alike to develop their own individual talents. As one of the best research universities in Europe, we join forces with other top universities and networks worldwide to become a truly global centre of knowledge.
In the Distributed Systems group of the Faculty of Science and Engineering (FSE), we have a four-year PhD position available. You will work in the project Evolutionary changes in Distributed Analysis (ECiDA). This project involves the development of dynamic data analysis pipelines on distributed data clusters.
Our group performs fundamental research and delivers education at the frontiers of dynamic, complex distributed systems using formal engineering tools, and seeks applications with societal impact. Over the last decade, the main research interests have covered AI planning and discrete optimization in highly distributed environments, with the Internet of Things, building automation, large-scale data analytics, business process management, and distributed energy infrastructures as the main application domains. The research results have been field-tested in collaboration with industry. One of these applications eventually led to the founding of the Sustainable Buildings company, which applies the optimization algorithms in practice.
Prof. A. Lazovik