Example Projects for TU Delft/EEMCS Master Students
Finding train and track defects based on spectrograms of currents at electricity substations
David M.J. Tax
DEKRA is an internationally active service-providing company. DEKRA Rail is the European rail division of DEKRA for Testing, Inspection, Certification and Research and has about 120 professionals. The main part of DEKRA Rail used to be NS Technical Research (part of Dutch Railways) with over 100 years of experience.
The Netherlands has a dense railway network. Most of the network is electrified, and electricity substations power the overhead lines. While a train is operating, DC current flows from the overhead line to the train and then back through the track (the return current). Depending on the state of the train, this return current contains different frequencies and amplitudes.
DEKRA Rail has installed a monitoring system in several electricity substations in the Netherlands to measure the return current of trains continuously. Trains can have failures in their systems, which introduce an AC component in the return current. When certain criteria are met, this can indicate a defect within the train or track. At the time (t0) that one of these criteria is met, a measurement spanning 30 s before t0 and 30 s after t0 is saved and sent to the DEKRA Rail servers. The measurement is transformed to a spectrogram as seen in Fig 1.
In this spectrogram it is possible to recognize patterns which correspond to defects of trains or tracks. We receive over 100 measurements each day, and only a few are really related to a defect. We would like to classify the measurements automatically, to make it easy to find trains and tracks with defects.
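To make the spectrogram step concrete, the following is a minimal sketch of how a current measurement could be turned into a spectrogram and how an AC component shows up in it. It is not DEKRA Rail's pipeline: the sample rate, window size, test signal and the function names `spectrogram` and `dominant_frequency` are all assumptions made for illustration, and a naive DFT is used only to keep the example dependency-free.

```python
import cmath
import math


def spectrogram(signal, window_size, hop):
    """Return a list of frames; each frame holds one magnitude per frequency bin."""
    # Periodic Hann window to limit leakage between neighbouring bins.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / window_size)
              for n in range(window_size)]
    frames = []
    for start in range(0, len(signal) - window_size + 1, hop):
        chunk = signal[start:start + window_size]
        # Subtract the mean so the large DC level does not mask the AC part.
        mean = sum(chunk) / window_size
        segment = [(chunk[n] - mean) * window[n] for n in range(window_size)]
        magnitudes = []
        # Naive DFT over the non-negative frequency bins 0 .. window_size / 2.
        for k in range(window_size // 2 + 1):
            coeff = sum(segment[n] * cmath.exp(-2j * math.pi * k * n / window_size)
                        for n in range(window_size))
            magnitudes.append(abs(coeff))
        frames.append(magnitudes)
    return frames


def dominant_frequency(frame, sample_rate, window_size):
    """Frequency in Hz of the strongest non-DC bin of one frame."""
    k = max(range(1, len(frame)), key=lambda i: frame[i])
    return k * sample_rate / window_size


# Hypothetical measurement: a DC level plus a 50 Hz AC component, 1 s at 400 Hz.
SAMPLE_RATE = 400
WINDOW = 64
signal = [10.0 + 2.0 * math.sin(2 * math.pi * 50 * t / SAMPLE_RATE)
          for t in range(SAMPLE_RATE)]
frames = spectrogram(signal, window_size=WINDOW, hop=32)
peak_hz = dominant_frequency(frames[0], SAMPLE_RATE, WINDOW)
```

Each frame is one column of the spectrogram; a classifier for defect patterns would take such frames, or features derived from them, as input.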
For more information, please contact Léanneke Loeve at firstname.lastname@example.org
Feature extraction for an educational program recommender system
Partners: StudyPortals (http://www.studyportals.com/) is an international study choice platform that offers web-based services to students and academic institutions. For students, it provides portals that categorize and summarize Bachelor, Master, and PhD programs. For institutions, it promotes their educational programs and analyzes their marketing and student recruitment efforts.
Introduction: This project explores ways to offer students the best-fitting studies by enriching the study page descriptions. At the moment, the StudyPortals educational program search engine relies on descriptions provided by the universities. For instance, a user enters a query on the website and, based on the keywords in that query, the search engine presents a list of programs at different universities.
Task: The descriptions provided by the universities are sometimes limited, which leads to unsatisfactory search results. This project will propose a content framework to the institutions based on a data-driven investigation, such as feature extraction from the study pages. The proposed framework will be tested using A/B testing, and its performance will be measured by metrics such as the click-through rate. During the project, the student is free either to focus on exploratory data analysis with some machine learning components or to focus fully on developing new machine learning techniques for better information retrieval and feature extraction.
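As an illustration of how an A/B test on click-through rate might be evaluated, here is a minimal sketch using a pooled two-proportion z-test. The test choice, the function names and the click counts are assumptions for illustration; the project description does not prescribe a particular statistical test.

```python
import math


def click_through_rate(clicks, impressions):
    """Fraction of page impressions that led to a click."""
    return clicks / impressions


def two_proportion_z(clicks_a, n_a, clicks_b, n_b):
    """z-statistic for the CTR difference between variant B and variant A."""
    p_a = clicks_a / n_a
    p_b = clicks_b / n_b
    # Pooled CTR under the null hypothesis that both variants perform equally.
    pooled = (clicks_a + clicks_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / std_err


def two_sided_p_value(z):
    """Two-sided p-value under a standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


# Made-up counts: A is the current page layout, B the proposed framework.
ctr_a = click_through_rate(200, 5000)
ctr_b = click_through_rate(260, 5000)
z = two_proportion_z(200, 5000, 260, 5000)
p_value = two_sided_p_value(z)
```

With these made-up counts the difference would be significant at the 1% level; with real traffic, the sample size per variant should be fixed before the test starts.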
For more information, contact M.Loog@tudelft.nl
Energy disaggregation using non-intrusive load monitoring
Introduction: Greeniant is a small company based in Rijswijk working on creating consumer focused energy awareness using data from smart meters. Specifically, we are developing a technology for energy disaggregation, also known as non-intrusive load monitoring.
Energy disaggregation is the practice of estimating the energy consumed by each appliance in a household by analyzing only the total household energy consumption measured at the smart meter, which avoids the need to install separate hardware (plugs/sensors) on the individual appliances. For example, it extracts the energy consumed by washing machines, dryers, dishwashers, ovens, etc. from the aggregate electricity meter reading of a household. Energy disaggregation gives the consumer a better understanding of how the individual appliances contribute to the total energy consumption and bills. This increased awareness helps consumers take effective measures to improve their energy efficiency.
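The idea can be sketched with a very simple event-based heuristic: detect step changes in the aggregate power and match each step to a known appliance rating. This is an illustrative simplification, not Greeniant's algorithm; the appliance names, rated powers, thresholds and meter readings below are made up.

```python
def detect_events(power, threshold):
    """Return (index, delta) for every sample-to-sample jump of at least threshold watts."""
    events = []
    for i in range(1, len(power)):
        delta = power[i] - power[i - 1]
        if abs(delta) >= threshold:
            events.append((i, delta))
    return events


def label_events(events, signatures, tolerance):
    """Match each step to the appliance with the closest rated power, within tolerance."""
    labeled = []
    for index, delta in events:
        name, rated = min(signatures.items(),
                          key=lambda kv: abs(abs(delta) - kv[1]))
        if abs(abs(delta) - rated) <= tolerance:
            labeled.append((index, name, "on" if delta > 0 else "off"))
    return labeled


def energy_per_appliance(labeled, signatures, seconds_per_sample):
    """Estimate energy in watt-hours from matched on/off pairs."""
    energy_wh = {name: 0.0 for name in signatures}
    switched_on_at = {}
    for index, name, state in labeled:
        if state == "on":
            switched_on_at[name] = index
        elif name in switched_on_at:
            duration_s = (index - switched_on_at.pop(name)) * seconds_per_sample
            energy_wh[name] += signatures[name] * duration_s / 3600
    return energy_wh


# Hypothetical rated powers in watts and one aggregate reading per minute.
signatures = {"kettle": 2000, "fridge": 150}
aggregate = [100, 100, 250, 250, 2250, 2250, 250, 250, 100, 100]
events = detect_events(aggregate, threshold=50)
labeled = label_events(events, signatures, tolerance=30)
energy = energy_per_appliance(labeled, signatures, seconds_per_sample=60)
```

Real appliances have multi-state and overlapping load patterns, which is why probabilistic sequence models are a natural next step beyond this heuristic.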
Project: Currently, we are looking for an intelligent student to join our science team to work on developing our non-intrusive load monitoring algorithms. The task of the student is to work on developing algorithms to learn the energy consumption patterns of different appliances, using techniques from machine learning and pattern recognition.
Requirements: We are looking for a student with the following background:
- Working on an MSc degree in computer science, mathematics or a related field.
- Has knowledge/experience in machine learning/pattern recognition, particularly using Bayesian Networks (Hidden Markov Models, Hidden semi-Markov Models).
- Has experience in programming languages such as Java or C++; experience with Matlab and/or Python + Numpy/Scipy is a plus.
- Creative and motivated, capable of working in teams.
- Good command of either Dutch or English.
Mobility and traffic flow
D.M.J. Tax
Team Smart Mobility of the Mobility department at TNO is looking for smart and enthusiastic students who are interested in doing state-of-the-art research in Pattern Recognition and Machine Learning applied to mobility and traffic flow in the Netherlands. The Dutch highway network is densely monitored by Induction Loop Detectors (ILDs). Roughly every 500 meters there is an ILD that measures the average speed (km/h) and intensity (veh/h) each minute. Furthermore, TNO maps and fuses different data sources in the vicinity of the ILD locations, such as rainfall and incident information. We are looking for an automated way to classify and understand traffic jams in general. We are, for example, interested in distinguishing traffic jams that recur every day from those that do not.
For more detailed information, see here.
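As a starting point for separating daily recurring jams from incidental ones, the sketch below flags the minutes at which one detector reports low speed and counts on how many days each minute is congested. The 50 km/h threshold, the 60% recurrence cut-off and the speed values are assumptions for illustration, not TNO definitions.

```python
def jam_minutes(speeds_kmh, threshold_kmh=50):
    """Set of minute indices at which the average speed is below the threshold."""
    return {minute for minute, speed in enumerate(speeds_kmh)
            if speed < threshold_kmh}


def classify_jams(days, threshold_kmh=50, min_fraction=0.6):
    """Split the congested minutes of one detector into recurring and incidental."""
    counts = {}
    for speeds in days:
        for minute in jam_minutes(speeds, threshold_kmh):
            counts[minute] = counts.get(minute, 0) + 1
    # A minute is recurring if it is congested on at least min_fraction of days.
    recurring = {m for m, c in counts.items() if c / len(days) >= min_fraction}
    incidental = set(counts) - recurring
    return recurring, incidental


# Made-up per-minute speeds (km/h) of one detector on five days.
typical_day = [100, 40, 30, 100, 100, 100]
unusual_day = [100, 45, 100, 100, 20, 100]
days = [typical_day] * 4 + [unusual_day]
recurring, incidental = classify_jams(days)
```

A fuller approach would also use the intensity measurements, neighbouring detectors, and the fused rainfall and incident data to explain why a jam occurred.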
Social Signal Processing
In the European Union project Social Signal Processing the goal is to automatically extract, segment and classify social signals that appear in the interaction between people. The pattern recognition group supports the social scientists in analyzing the behavioral patterns and signs that are used in normal social interaction. Questions like "Who is opposing who in a discussion? Who is agreeing? Who is nodding at what time?" have to be answered.
The pattern recognition research focuses on the more fundamental problems in the processing of (huge) video sequence data. Is it possible to segment the audio and the video without expert interaction? How should the image and audio data be represented to allow for this? Is it possible to automatically extract how many people are present in a video? What do they look like, and what do they sound like? Which speaker is speaking at what time? How can we classify sequences of variable length? Can we find out when something unexpected or exciting happens?
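One classical answer to the question of classifying sequences of variable length is to compare them with dynamic time warping (DTW) and assign the label of the nearest labeled sequence. The sketch below is only one possible approach, not the project's method; the one-dimensional feature sequences and the labels "nod" and "turn" are hypothetical.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D sequences."""
    inf = float("inf")
    # cost[i][j] is the cheapest alignment of the first i items of a
    # with the first j items of b.
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # repeat b[j - 1]
                                    cost[i][j - 1],      # repeat a[i - 1]
                                    cost[i - 1][j - 1])  # advance both
    return cost[len(a)][len(b)]


def nearest_neighbor_label(query, labeled_sequences):
    """Label of the training sequence with the smallest DTW distance to query."""
    best = min(labeled_sequences, key=lambda item: dtw_distance(query, item[0]))
    return best[1]


# Hypothetical per-frame feature values for two labeled gestures.
training = [([0, 0, 1, 1, 0, 0], "nod"), ([0, 1, 2, 3, 4], "turn")]
query = [0, 1, 1, 1, 0]
predicted = nearest_neighbor_label(query, training)
```

The query has a different length than either training sequence, yet it aligns with the "nod" example at zero cost because DTW may stretch or compress the time axis.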
Active Learning
Active learning does not allude to the activity, or absence of it, of the person interested in this project. It refers to active in the sense of being effective. More precisely, the adjective refers to the learning phase of a classifier and how it can be made more effective.
Suppose you have trained a classifier that should support a medical expert in coming to diagnoses, but you are not really satisfied with the classifier's performance. One way to potentially improve the accuracy is to provide the classifier with additional labeled examples to train on. Getting labeled examples, however, is rather expensive, as you need the medical expert to provide the correct label, so you would like to achieve as much gain in classification performance with as few additional labeled examples as possible. This is the challenge active learning tries to solve, by enabling the current classifier to give its user active feedback on which unlabeled samples would be most informative to have labeled.
Active learning is a rather novel research direction and a valuable approach in general, not only for developing medical expert systems such as the one featured above. Applications like image segmentation and tasks like speech tagging, for instance, can also benefit from it. Active learning is broadly applicable, and both more applied and more fundamental master's projects are possible in this area of pattern recognition.
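One common way for a classifier to indicate which samples it would most like to have labeled is uncertainty sampling: query the samples whose predicted class probabilities are closest to a coin flip. The sketch below assumes a binary classifier that outputs a positive-class probability per unlabeled sample; the probabilities and the function names are made up, and other query strategies exist.

```python
import math


def entropy(p):
    """Binary entropy (in bits) of a predicted positive-class probability."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def select_queries(probabilities, budget):
    """Indices of the `budget` unlabeled samples the classifier is least sure about."""
    ranked = sorted(range(len(probabilities)),
                    key=lambda i: entropy(probabilities[i]),
                    reverse=True)
    return ranked[:budget]


# Made-up classifier outputs for five unlabeled patient records.
predicted_probabilities = [0.95, 0.52, 0.10, 0.40, 0.99]
to_label = select_queries(predicted_probabilities, budget=2)
```

Here the expert would be asked to label samples 1 and 3 first, after which the classifier is retrained and the selection is repeated.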
Real-time classification of rodent behavior
D.M.J. Tax, Elsbeth van Dam (Noldus, Wageningen)
Observation of rodent behavior is important to many fields of research in the life sciences. Rats and mice are used as models for human diseases and their behavior is studied in labs around the world in order to find new drugs that cure psychiatric and neurological disorders. The automation of these measurements is crucial to advances in pharmaceutical research as well as animal welfare. In this assignment Noldus Information Technology BV (www.noldus.com) tries to create state of the art advances in computer vision and behavior recognition on the cutting edge between science and application.
The state of the art in behavior recognition is the detection of human behavior, mostly based on spatial measurements like pose and speed. Unfortunately, these techniques are not suitable for detecting subtle rodent behaviors like grooming, sniffing and eating. A few systems have been described in the literature that can recognize rodent behavior from a side view. In a recent study, features are generated based on a computational model of motion processing in the human brain. Classification of these features and their temporal context is done using advanced event recognition techniques (HMMSVM).
We aim to apply these techniques from the literature to top-view recordings of rats in infrared light. The capability of the trained classification modules will be evaluated in an automated recognition system for real-time operation in a noisy environment. For this research we have a large, manually labeled, high-quality dataset available.
We are looking for students who are interested in computer vision and pattern recognition applied to behavioral research. Knowledge of Matlab is preferred.
Multiple Instance Learning
David Tax and Marco Loog