Data Science Batch #7 projects - Some highlights

Nitin Kumar

Student projects Data Science class #7

This is the first of many project recaps that we’re writing for our future students and corporate partners to give insight into the kind of projects Constructor Learning's Data Science students get to work on during their Capstone Project. 

Constructor Learning’s batch #7 (May 13, 2019 - July 31, 2019) of Data Science students worked on five projects, several of which were provided by our industry partners, such as Swiss International Airlines, Qard and PriceHubble. All projects involved Machine Learning and two included Deep Learning. Together they covered a broad range of Data Science applications. Here is the list with some details.

Predicting Bise wind at Zurich airport

for Swiss International Airlines

Zurich airport faces ~30% delays when the Bise, a cold, dry wind blowing from northeast to southwest, hits. This project focused on predicting Bise events as well as their duration. The two students who worked on this task reached a precision of ~80%. Although this is a strong result, precision needs to reach the 95% level before the model can be used in a real-life warning system. We’re happy with these results and even happier to know that the project will be continued in the upcoming Data Science batch.
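To make the precision figure concrete, here is a minimal sketch of how precision is computed for a binary "Bise / no Bise" prediction. The labels below are invented for illustration and are not the students' data; the function and variable names are hypothetical.

```python
def precision(y_true, y_pred):
    """Share of predicted Bise events that actually occurred: TP / (TP + FP)."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    false_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if true_pos + false_pos == 0:
        return 0.0  # no positive predictions were made
    return true_pos / (true_pos + false_pos)


# Hypothetical labels: 1 = Bise event, 0 = no Bise event
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]

bise_precision = precision(y_true, y_pred)
print(f"precision = {bise_precision:.2f}")
```

In this toy example, four of the five predicted events are real, so one in five warnings would be a false alarm. That is the kind of rate a warning system would need to reduce.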
 

Default Prediction on E-Commerce based on Public Data

for qardfinance.com

As a FinTech startup, Qard analyzes e-commerce businesses applying for loans and uses a data-driven approach to identify those with a substantial risk of defaulting on their loan payments. Qard would like to extend this system to use non-financial data. For this project, Constructor Learning's students worked on extracting e-commerce-specific non-financial data from around 400GB of structured and unstructured data that Qard has collected over the years. The students reached an accuracy of around 70% in identifying default cases using non-financial data. The development of such a system would help all loan providers because they would not need to ask borrowers for specific details about their finances.

PhenoCAT: (Un)supervised classification of microscopy images with Deep Learning

Personal student project

This was an independent project brought by one of the students with PhDs in similar fields. Image-based genetic perturbation screens are used extensively in research labs to identify markers of cancer-causing genes. Such screens generate petabytes of data (millions of images) and require automatic systems to analyze these images. The two students wanted to test whether they could use Deep Learning, primarily Convolutional Neural Networks and Variational Auto-Encoders, to automatically classify images into their categories of interest. Since no labelled data was available, the students had to use active learning to build their train and test sets step by step. The supervised approach produced an accuracy of >90%. A second approach using unsupervised learning based on auto-encoders needs further exploration but was already able to create real-looking computer-generated images...
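The active learning loop mentioned above can be illustrated with a small sketch. The post does not say which query strategy the students used; the example below assumes uncertainty sampling for a binary classifier, and all names and probabilities are hypothetical.

```python
def uncertainty(prob):
    """Uncertainty of a binary prediction: 1.0 at p = 0.5, 0.0 at p = 0 or 1."""
    return 1.0 - 2.0 * abs(prob - 0.5)


def select_for_labelling(probs, batch_size):
    """Return indices of the images the model is least sure about."""
    ranked = sorted(range(len(probs)),
                    key=lambda i: uncertainty(probs[i]),
                    reverse=True)
    return ranked[:batch_size]


# Hypothetical model outputs: predicted probability of the class of interest
predicted_probs = [0.97, 0.52, 0.10, 0.45, 0.80, 0.03]

# Ask a human to label the two most ambiguous images, then retrain and repeat
to_label = select_for_labelling(predicted_probs, batch_size=2)
print(to_label)
```

Each round adds the newly labelled images to the training set, so labelling effort goes to the examples the current model finds hardest.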
 

Real estate image classification (quality of houses)

for PriceHubble

This project applied Active Learning with Convolutional Neural Networks to automatically classify property images into different price categories. Multiple pre-trained networks (e.g., ResNet and VGG16) were used as a starting point and then trained further on our data. Using pre-trained networks is standard practice in Deep Learning for image analysis. The student who worked on the challenge achieved an accuracy of ~93% on this data.
 

Skill-gap analysis and course recommendations for your best-fit job

for Constructor Learning

As an EdTech startup, Constructor Learning often looks for ways to use data to support our students' learning needs. The central aim of this project was to identify the skill set required for technology-related jobs in Switzerland, match it with the job seeker’s own skills and background (as extracted from LinkedIn profiles), and finally offer the job seeker suitable positions or training programs. To this end, the students employed NLP techniques to measure the semantic similarity between job ads and candidate skills, something most job recommendation services lack. Constructor Learning is now developing this project into an online tool to help not only our students but also the general Swiss public.
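As a rough illustration of matching job ads to candidate skills, the sketch below scores text pairs with cosine similarity on word counts. This is a simpler stand-in and not the students' method: it only captures shared words, whereas a semantic approach would typically compare learned text embeddings. All texts and names are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the word-count vectors of two texts."""
    counts_a = Counter(text_a.lower().split())
    counts_b = Counter(text_b.lower().split())
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[w] * counts_b[w] for w in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text matches nothing
    return dot / (norm_a * norm_b)


# Hypothetical job ad and candidate skill summaries
job_ad = "python machine learning sql"
candidate_a = "python sql statistics"
candidate_b = "marketing sales"

score_a = cosine_similarity(job_ad, candidate_a)
score_b = cosine_similarity(job_ad, candidate_b)
print(f"candidate A: {score_a:.2f}, candidate B: {score_b:.2f}")
```

Ranking candidates (or training courses) by such a score gives a simple recommendation list. Swapping the word counts for embedding vectors keeps the same cosine step while also matching related terms that do not share words.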

Interested in reading more about Constructor Learning and tech-related topics? Then check out our other blog posts.
