Menno Gijsman

Smartnotation's diarization
audioanalysis, Machine Learning, speaker diarization

Movie reviews sentiment
Sentiment, Text analysis, deep learning

Crime in London
Data visualization, Geographical WebApp

Crime in London

During my second year of school I was assigned to build a geographical web application. I chose to map crime in London, UK. This was my first time building a geographical web app. I visualized the data as a choropleth and built in two filters: one by month and one by crime type. The data was available as open data on the Metropolitan Police's website. I used Leaflet to connect the data to the correct projection. The result was a fairly basic geographical web app.
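To give an idea of what the two filters feed into the choropleth, here is a minimal Python sketch. It is not the code of the web app itself: the records, the field names (`area`, `month`, `crime`) and the class breaks are made up for illustration. It filters the records, counts them per area, and assigns each area a colour class.

```python
from collections import Counter

# Hypothetical records: the field names and values are assumptions,
# not the schema of the real Metropolitan Police data.
records = [
    {"area": "Camden", "month": "2019-01", "crime": "Burglary"},
    {"area": "Camden", "month": "2019-01", "crime": "Robbery"},
    {"area": "Camden", "month": "2019-02", "crime": "Burglary"},
    {"area": "Hackney", "month": "2019-01", "crime": "Burglary"},
    {"area": "Hackney", "month": "2019-01", "crime": "Burglary"},
    {"area": "Hackney", "month": "2019-01", "crime": "Burglary"},
]


def count_by_area(rows, month=None, crime=None):
    """Count records per area, optionally filtered by month and crime type."""
    counts = Counter()
    for row in rows:
        if month is not None and row["month"] != month:
            continue
        if crime is not None and row["crime"] != crime:
            continue
        counts[row["area"]] += 1
    return dict(counts)


def colour_class(count, breaks):
    """Return the index of the colour class for a count.

    `breaks` holds the ascending upper bounds of each class; counts above
    the last bound fall into one extra, darkest class.
    """
    for index, upper in enumerate(breaks):
        if count <= upper:
            return index
    return len(breaks)


counts = count_by_area(records, month="2019-01", crime="Burglary")
classes = {area: colour_class(n, breaks=[1, 2]) for area, n in counts.items()}
print(counts)
print(classes)
```

In a web app the class index would be mapped to a fill colour for each area's polygon on the map.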

Screenshot of GEO WebApp

Smartnotation's diarization

Smartnotation is a project I worked on during my HBO internship at Emerald-IT. Smartnotation is an easy-to-use, voice-enabled meeting minutes solution. A new feature they wanted to see implemented was automatic note-taking. Building part of this feature was my job as an intern.

A system that automatically takes notes consists of two steps. The first is partitioning an input audio stream into homogeneous segments according to speaker identity, a task known as speaker diarization. The second step is filtering out the useful information for each speaker and turning it into notes. My task was to build the speaker diarization system.
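As an illustration of what "homogeneous segments" means, the sketch below merges per-frame speaker labels into segments with a start and end time. It is a simplified example and not the Smartnotation algorithm: the labels, the frame length and the function name `frames_to_segments` are assumptions, and it presumes some earlier step has already assigned a speaker to each frame.

```python
def frames_to_segments(labels, frame_seconds):
    """Merge consecutive frames with the same speaker into segments.

    Returns a list of (speaker, start_seconds, end_seconds) tuples.
    """
    segments = []
    for index, speaker in enumerate(labels):
        start = index * frame_seconds
        end = start + frame_seconds
        if segments and segments[-1][0] == speaker:
            # Same speaker as the previous frame: extend the open segment.
            segments[-1] = (speaker, segments[-1][1], end)
        else:
            # Speaker changed: start a new segment.
            segments.append((speaker, start, end))
    return segments


# Hypothetical per-frame labels, one label per 0.5 seconds of audio.
labels = ["A", "A", "B", "B", "B", "A"]
segments = frames_to_segments(labels, frame_seconds=0.5)
for speaker, start, end in segments:
    print(f"{speaker}: {start:.1f}s - {end:.1f}s")
```

Each resulting segment contains a single speaker, which is the form the second step (note extraction per speaker) would need.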

Eventually I created an algorithm based on the direction of arrival (DOA) of the audio at 4 microphones and 34 features extracted from the input audio stream. The algorithm reached an accuracy between 70% and 95% (tested with up to 5 speakers, due to COVID-19 regulations). Besides the algorithm, I also wrote scripts for post-processing, visualizations for improving the algorithm, sending data to the Smartnotation API, etc. Finally, I tied the system together with a GUI I built in Electron.

The final flow of the project:

project flow

Movie reviews sentiment prediction

During my third year of HBO, I worked with another student on determining the sentiment of movie reviews. We collected over 150,000 movie reviews from Rotten Tomatoes through Kaggle. Some of these reviews already had a sentiment grade (a grade between 0 and 4 indicating how positive or negative the review was). These rows served as our training set.

We started by exploring and cleaning our dataset. We visualized the data and cleaned it by removing rows containing empty cells, removing rows containing only non-alphabetic characters, removing certain non-alphabetic characters in general, etc. After that we tried out multiple supervised algorithms and compared them against each other. We also experimented with deep learning using an LSTM model. Overall this was a basic, introductory project on the concepts of machine learning and deep learning. We ended up with an accuracy of around 67%.
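The cleaning steps described above could look roughly like the sketch below. It is not our original code: the example rows, the function name `clean_reviews` and the exact set of characters that is kept are assumptions made for illustration, and "empty cell" is interpreted here as an empty review text.

```python
import re


def clean_reviews(rows):
    """Clean a list of (text, grade) rows.

    Drops rows with empty text and rows without any alphabetic character,
    then strips characters outside a small allowed set from the rest.
    """
    cleaned = []
    for text, grade in rows:
        if text is None or not text.strip():
            continue  # empty cell
        if not any(ch.isalpha() for ch in text):
            continue  # only non-alphabetic characters
        # Keep letters, digits, whitespace and basic punctuation (an assumed choice).
        text = re.sub(r"[^A-Za-z0-9\s.,!?']", " ", text)
        # Collapse the whitespace left behind by the removed characters.
        text = re.sub(r"\s+", " ", text).strip()
        cleaned.append((text, grade))
    return cleaned


# Hypothetical rows: (review text, sentiment grade from 0 to 4).
rows = [
    ("A gripping, well-acted film!", 4),
    ("", 2),
    ("   ", 1),
    ("*** 10/10 ***", 3),
    ("Dull -- and far too long...", 1),
]
cleaned = clean_reviews(rows)
for text, grade in cleaned:
    print(grade, text)
```

Only two of the five example rows survive, which shows why the cleaning has to happen before the rows are used for training.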