Beta's projects for final year students 2021-2022

Related courses

General

Cloud classifier

Project info

Description

Given a photo of clouds, classify the species of clouds in it.

The student is expected to learn about cloud species and their classification, and build a system that can do the classification.

The program can be implemented on a desktop computer or a mobile phone.

Requirements

Deliverables

References

Some resources that may make experimentation and implementation easier

Typhoon track predictor

Project info

Description

The Best Track Archive of the Joint Typhoon Warning Center (JTWC) (link, another link, yet another link; these links don't work all the time, so you may have to search for the archive) contains decades of track data for tropical cyclones all over the world. The Severe Weather Information Centre, Meteoalarm, and Pacific Disaster Center provide current data for weather warnings. These data are publicly available and can be analyzed statistically for various applications.

The project is to use these data to answer some questions the student is going to set, such as

The student is expected to come up with questions similar to those above that make sense meteorologically, and answer them by analysing the data.
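As one illustration of the kind of analysis track data allows, the sketch below estimates how fast a cyclone moves between consecutive position fixes. It is only a sketch: it assumes each fix has already been parsed into a (time, latitude, longitude) tuple, which is not the archive's file format, and the function names and sample fixes are made up for illustration.

```python
import math
from datetime import datetime

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def forward_speeds_kmh(fixes):
    """Speed between consecutive fixes.

    fixes: list of (datetime, lat, lon), sorted by strictly increasing time.
    """
    speeds = []
    for (t0, la0, lo0), (t1, la1, lo1) in zip(fixes, fixes[1:]):
        hours = (t1 - t0).total_seconds() / 3600.0
        speeds.append(haversine_km(la0, lo0, la1, lo1) / hours)
    return speeds


# Made-up fixes six hours apart, for illustration only (not real track data).
example_fixes = [
    (datetime(2021, 1, 1, 0), 15.0, 130.0),
    (datetime(2021, 1, 1, 6), 15.5, 129.0),
    (datetime(2021, 1, 1, 12), 16.2, 128.1),
]
print(forward_speeds_kmh(example_fixes))
```

A question such as "do storms slow down before they recurve?" could then be studied by computing these speeds over many tracks.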

Expectations on students

References

Although some resources here are Python libraries, there is no restriction on the languages and tools you use. Indeed, a good data analysis and machine learning project like this one often requires the use of multiple languages.

Analysing ECG patterns

Project info

Description

Electrocardiogram (ECG) signals are useful for gaining insight into a person's health.

They can be recorded using specialised equipment, or just an Apple Watch.

The app on the watch analyses whether the recording shows a normal sinus rhythm or signs of atrial fibrillation (AFib).

Yet there are more types of rhythm that signal cardiac dysfunction, such as premature atrial contraction (APB), atrial flutter (AFL), supraventricular tachycardia (SVTA), premature ventricular contraction (PVC), bigeminy, and trigeminy.

This project is about classifying ECGs into different types.
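A possible first step towards such a classifier is to locate the heartbeats and look at the intervals between them, since heart rate and beat-to-beat regularity are among the simplest features that differ between rhythms. The sketch below is a toy version of that idea, not a clinical algorithm: it assumes a clean, single-lead signal stored as a list of samples, and the threshold, refractory period, and function names are assumptions made for illustration.

```python
import statistics


def detect_r_peaks(signal, fs, threshold, refractory_s=0.2):
    """Indices of local maxima at or above threshold, at least refractory_s apart.

    signal: list of samples, fs: sampling rate in Hz.
    """
    peaks = []
    min_gap = int(refractory_s * fs)
    for i in range(1, len(signal) - 1):
        is_local_max = signal[i] > signal[i - 1] and signal[i] >= signal[i + 1]
        if signal[i] >= threshold and is_local_max:
            if not peaks or i - peaks[-1] >= min_gap:
                peaks.append(i)
    return peaks


def rr_features(peaks, fs):
    """Mean heart rate and interval variability; needs at least two peaks."""
    rr = [(b - a) / fs for a, b in zip(peaks, peaks[1:])]  # seconds per beat
    mean_rr = statistics.mean(rr)
    return {
        "heart_rate_bpm": 60.0 / mean_rr,
        # Coefficient of variation: 0 for a perfectly regular rhythm.
        "rr_cv": statistics.pstdev(rr) / mean_rr,
    }


# Synthetic, perfectly regular "beats" every 0.8 s at 250 Hz (illustration only).
fs = 250
signal = [0.0] * 800
for beat in (100, 300, 500, 700):
    signal[beat] = 1.0

peaks = detect_r_peaks(signal, fs, threshold=0.5)
print(peaks, rr_features(peaks, fs))
```

Real recordings need filtering and a more robust peak detector, and features like these would feed a classifier rather than replace one.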

Sample ECGs

[Figures: Sinus rhythm; Atrial fibrillation; Supraventricular tachycardia. Graphs generated from data from Pawel Plawiak (CC BY 4.0).]

Requirements

Deliverables

References

Some resources that may make experimentation and implementation easier

Visualising network intrusions

Project info

Description

Intrusion detection systems (IDSes) monitor network traffic and flag suspicious activities for network managers to take action.

Suspicious activities include port scans, repeated unsuccessful logins, man-in-the-middle (MitM) attacks, botnet attacks, and denial-of-service (DoS) attacks, among others.

Suspicious activities are visualised and sent to a Security Information and Event Management (SIEM) system for logging.

This project is to build or enhance an IDS and/or SIEM system, with emphasis on the visualisation component that shows what is happening in real time.
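To make one of the suspicious activities above concrete, here is a minimal sketch of a port-scan heuristic: flag a source address that touches many distinct destination ports within a short time window. It assumes connection events have already been extracted from traffic as (timestamp, source IP, destination port); the class name, window length, and port limit are hypothetical choices, not taken from any particular IDS.

```python
from collections import defaultdict, deque


class PortScanDetector:
    """Flags a source that contacts too many distinct ports within a time window."""

    def __init__(self, window_s=10.0, max_ports=20):
        self.window_s = window_s
        self.max_ports = max_ports
        # src_ip -> deque of (timestamp, dst_port), oldest first
        self.events = defaultdict(deque)

    def observe(self, timestamp, src_ip, dst_port):
        """Record one event; return True if src_ip now looks like a scanner."""
        recent = self.events[src_ip]
        recent.append((timestamp, dst_port))
        # Drop events that fell out of the sliding window.
        while recent and timestamp - recent[0][0] > self.window_s:
            recent.popleft()
        distinct_ports = len({port for _, port in recent})
        return distinct_ports > self.max_ports


# Illustration with addresses from the documentation range 192.0.2.0/24.
detector = PortScanDetector(window_s=10.0, max_ports=3)
for t, port in [(0.0, 22), (1.0, 23), (2.0, 80), (3.0, 443)]:
    print(t, port, detector.observe(t, "192.0.2.1", port))
```

Each True result is the kind of event a visualisation front end could plot on a timeline as it arrives.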

Requirements

Deliverables

References

Some related resources on the web

Real-time Speaker Recognizer

Project info

Description

Build a system that recognizes and labels the speakers in a recorded radio talk show or phone-in show, without the need for prior training.

Note that the system only needs to decide whether the current speaker is someone who spoke earlier or a new speaker; there is no need to recognise the person from a database.

That is, the input is an audio stream of people speaking, and the output is a list of time segments showing which speaker (e.g., labelled Speaker 1, 2, 3) is speaking when.

The system should be natural language-independent. It should be able to take live speech, generating output as the input is analyzed, and possibly correcting earlier outputs when necessary.
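The "same speaker as before, or a new one" decision can be sketched as online clustering: each incoming frame is assigned to the nearest known speaker if it is close enough, and otherwise starts a new speaker. The sketch below assumes a feature vector per frame has already been computed (for instance from MFCCs, which is an assumption, as are the class name and distance threshold), and it does not attempt the correction of earlier outputs mentioned above.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class OnlineSpeakerLabeler:
    """Assigns each feature vector to the nearest known speaker, or to a new one."""

    def __init__(self, new_speaker_distance):
        self.threshold = new_speaker_distance
        self.centroids = []  # one running-mean vector per speaker
        self.counts = []     # frames seen per speaker

    def label(self, features):
        """Return a 1-based speaker number for this frame."""
        if self.centroids:
            dists = [euclidean(features, c) for c in self.centroids]
            best = min(range(len(dists)), key=dists.__getitem__)
            if dists[best] <= self.threshold:
                n = self.counts[best] + 1
                # Update the running mean of the matched speaker.
                self.centroids[best] = [
                    c + (f - c) / n for c, f in zip(self.centroids[best], features)
                ]
                self.counts[best] = n
                return best + 1
        self.centroids.append(list(features))
        self.counts.append(1)
        return len(self.centroids)


def to_segments(labels, frame_s):
    """Merge runs of equal labels into (start_s, end_s, speaker) tuples."""
    segments = []
    for i, speaker in enumerate(labels):
        if segments and segments[-1][2] == speaker:
            segments[-1] = (segments[-1][0], (i + 1) * frame_s, speaker)
        else:
            segments.append((i * frame_s, (i + 1) * frame_s, speaker))
    return segments


# Made-up 2-D feature vectors, one per half-second frame.
frames = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.0, 5.1], [0.0, 0.1]]
labeler = OnlineSpeakerLabeler(new_speaker_distance=1.0)
labels = [labeler.label(f) for f in frames]
print(to_segments(labels, frame_s=0.5))
```

Because labels are produced frame by frame, this shape fits live input; a fuller system would also smooth over short mislabelled frames.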

Note that the student is expected to find or build their own collection of testing data. A collection of movie voiceover tracks, radio broadcasts, or webcasts is sufficient.

Experimentation using systems such as Audacity, PureData, Octave, Mathematica, or Matlab is encouraged.

The system should be implemented in an operating-system-independent way.

Application: some biometrics systems authenticate the user by speaker recognition. Your study may shed light on the usability of such a system.

Keywords: cepstrum, formants, MFCC, speaker diarisation

Requirements

Keywords

Speaker diarisation, Audio signal processing

References

Voice changer

Project info

Description

Why can you recognize the voice of your girlfriend/boyfriend on the other side of the phone?

Why can you still recognize a person when he or she deliberately speaks in a high-pitched voice?

Can you trick Siri into listening to you on a device trained to recognise the voice of your friend?

There are acoustic features that can be extracted to identify the speaker.

Given a recording of speech by person A and some reference speech by person B, the project is to develop software that produces a sound clip of A's speech that sounds as if it were spoken by person B, without requiring an extensive training phase.

Requirements

References

Some resources that may make experimentation and implementation easier

Swift application design and development

Project info

Students intending to take this project should email me one or more outlines of their app design before selection, and pitch the idea in a (virtual?) meeting. Expect to refine the idea a few times before it is finalised.

Description

The open-source language Swift has evolved to version 5.4, and is gaining popularity among macOS/iOS/watchOS/tvOS developers as its library integration improves. The project is to develop a macOS/iOS/watchOS/tvOS application by looking into the details of the language and taking advantage of its features.

It is up to the student to design and implement a Swift-based application or application suite to be run on macOS/iOS/watchOS/tvOS; that is, it is the student who proposes what the app should be like. Your supervisor's job is to make sure its scale is that of a final year project, and to oversee its progress.

Expect to spend quite some time on forums and writing little programs to experiment, since some features (like SwiftUI) are quite new and not yet well-documented.

Expectations on students

References

What the text

Project info

Description

Given a scanned image of text, identify the list of possible languages the text is in, in order of decreasing confidence.

Also, recognise the text and output it to a file for verification.

The implementation should be as platform-independent as possible so that it can be run standalone on mobile devices or desktop devices.

References

Form reader

Project info

Description

Though electronic forms are gaining widespread use, paper forms are still used often. For example, the Government's Care and Share Scheme in 2019 required paper form submission, citing the time needed to tender for and build a new reliable system.

This project is about building a generic system that, once it has read a blank form through a webcam or mobile device, can extract and process the content of filled-in copies of the same form.

While a generic system powerful for processing general forms would involve written character recognition that can be quite involved, we can start with paper survey forms in which the respondent should put a mark on some of the boxes, with minimal optional free-form writing which can be collected as images.
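For the tick-box case, one simple idea is to compare each box region in the filled form against the same region in the blank form and treat a clear increase in dark pixels as a mark. The sketch below assumes both images are greyscale (0 = black, 255 = white), stored as lists of rows, already aligned to each other, and that the box coordinates are known; the function names and thresholds are assumptions for illustration.

```python
def dark_fraction(image, box, dark_below=128):
    """Fraction of pixels darker than dark_below inside box.

    box = (top, left, bottom, right), with bottom and right exclusive.
    """
    top, left, bottom, right = box
    pixels = [image[r][c] for r in range(top, bottom) for c in range(left, right)]
    return sum(p < dark_below for p in pixels) / len(pixels)


def marked_boxes(blank, filled, boxes, min_increase=0.15):
    """Names of boxes whose dark fraction rose by at least min_increase."""
    return [
        name
        for name, box in boxes.items()
        if dark_fraction(filled, box) - dark_fraction(blank, box) >= min_increase
    ]


# Toy 4x8 "form" with two 4x4 boxes; a cross is drawn in the left box.
blank = [[255] * 8 for _ in range(4)]
filled = [row[:] for row in blank]
for i in range(4):
    filled[i][i] = 0
    filled[i][3 - i] = 0

boxes = {"yes": (0, 0, 4, 4), "no": (0, 4, 4, 8)}
print(marked_boxes(blank, filled, boxes))
```

Comparing against the blank form, rather than using an absolute darkness threshold, keeps printed box outlines from being counted as marks; aligning a camera image to the template is a separate and harder step.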

Ideally, the system should recognise the content in a second or two so autofeeding the forms under the camera is feasible.

A more powerful system would have features such as recognition of corrections in the forms, extraction of out-of-template elements, and recognition of written characters... your mileage may vary.

Requirements

References

Financial data forecaster

Project info

Description

Collect time series of historical financial data (cryptocurrency prices, stock prices, futures prices, market indices) at different points in time, and design an algorithm that predicts their future values.

Some factors the algorithm can take into consideration include the day of the month, day of the week, time of day, various financial indicators, and correlations between data from different time series.

Time series of non-numerical data such as news articles, Twitter feeds, or Facebook posts can be analysed to improve the accuracy of the prediction. Indeed, some prior studies have reported this to be quite effective.

Note that the student is expected to build their own collection of training and testing data.

When you survey the literature on how well existing systems perform, be very careful about accuracy claims of better than 70%, especially when the system uses only historical numerical data or financial indicators.

It is a difficult project; otherwise everyone who can program would already be a billionaire.

Do not take this project if your plan is just to apply machine learning algorithms to some standard data collected from financial data sites. A good project involves much more than this.
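One way to put any accuracy figure in context, whether from a paper or from your own system, is to compare it with trivial baselines on the same data. The sketch below computes the directional (up/down) accuracy of two such baselines; the function names and the price series are made up for illustration, and a rigorous evaluation would also need out-of-sample testing and transaction costs.

```python
def directions(prices):
    """+1 if the price rose from one step to the next, else -1 (ties count as -1)."""
    return [1 if b > a else -1 for a, b in zip(prices, prices[1:])]


def accuracy(predicted, actual):
    """Fraction of positions where the prediction matches the actual direction."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


def baseline_accuracies(prices):
    """Directional accuracy of two trivial predictors on a price series."""
    actual = directions(prices)
    always_up = [1] * len(actual)
    # Persistence: predict that each move repeats the previous move,
    # so it can only be scored from the second move onwards.
    persistence = actual[:-1]
    return {
        "always_up": accuracy(always_up, actual),
        "persistence": accuracy(persistence, actual[1:]),
    }


# Made-up prices, for illustration only.
print(baseline_accuracies([1, 2, 3, 4, 3, 4]))
```

On a series that mostly trends one way, "always up" alone can score well, so a model is only interesting to the extent that it beats baselines like these.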

Expectations on students

References

Although some resources here are Python libraries, there is no restriction on the languages and tools you use. Indeed, a good data analysis and machine learning project like this one often requires the use of multiple languages.