Beta's projects for final year students 2020—2021

General

Real-time Speaker Recognizer

Project info

Description

Build a system that recognizes and labels the speakers in a recorded radio talk show or phone-in show, without the need for prior training.

A more advanced version of the system should be language-independent. It should be able to take live speech, generating output as the input is analyzed, and possibly correcting earlier outputs when necessary.

Note that the student is expected to find or build their own collection of training and testing data.

Experimentation using systems such as Audacity, PureData, Octave, Mathematica, or Matlab is encouraged.

The system should be implemented in an operating-system independent way.

Application: some biometrics systems authenticate the user by speaker recognition. Your study may shed light on the usability of such a system.

Keywords: cepstrum, formants, MFCC, speaker diarisation
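As a starting point for the first keyword, the real cepstrum of a short frame can be sketched as the inverse DFT of the log magnitude spectrum. This is only an illustrative sketch using the standard library: the function names, the naive O(N^2) DFT, and the synthetic "impulse plus echo" frame are assumptions made for the example, not part of the project specification. A real system would more likely use an FFT library on windowed speech frames.

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) discrete Fourier transform; adequate for short frames."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def real_cepstrum(frame, floor=1e-12):
    """Real cepstrum: inverse DFT of the log magnitude spectrum."""
    n = len(frame)
    # The floor avoids log(0) for spectral bins with zero magnitude.
    log_mag = [math.log(max(abs(c), floor)) for c in dft(frame)]
    return [
        sum(log_mag[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


# Synthetic frame (an assumption, not real speech):
# a unit impulse followed by a half-amplitude echo 8 samples later.
frame = [0.0] * 64
frame[0] = 1.0
frame[8] = 0.5

cep = real_cepstrum(frame)
# Look for the strongest cepstral component, skipping quefrency 0.
peak = max(range(1, 33), key=lambda q: cep[q])
print("strongest quefrency (samples):", peak)
```

The periodic ripple that the echo adds to the log spectrum shows up as a peak at the echo delay. The same idea underlies cepstral pitch estimation, where the peak corresponds to the pitch period.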

Requirements

Voice changer

Project info

Description

Why can you recognize the voice of your girlfriend/boyfriend on the other side of the phone?

Why can you still recognize a person when he or she deliberately raises the pitch of his or her voice?

There are acoustic features that can be extracted to identify the speaker.

Given recorded speech of person A and some reference speech of person B, the project is to develop software that produces a sound clip of A's speech that sounds as if it were spoken by person B.

Requirements

References

Some resources that may make experimentation and implementation easier

Typhoon track predictor

Project info

Description

The Best Track Archive of the Joint Typhoon Warning Center (JTWC) (link, another link, yet another link; these links don't work all the time) contains decades of track data for tropical cyclones all over the world. The Severe Weather Information Centre, Meteoalarm, and Pacific Disaster Center provide current weather-warning data. These data are publicly available and can be analyzed for applications based on statistics.

The project is to use these data to answer some questions, such as

The student is expected to come up with questions similar to those above that make sense meteorologically, and to answer them by analysing data.
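Many track questions start from the distance between consecutive position fixes. The sketch below computes great-circle distances with the haversine formula and derives a forward speed for each track segment. The function names, the made-up positions, and the 6-hour spacing between fixes are assumptions for illustration; check the actual record format and time stamps of whichever archive you use.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def forward_speeds_kmh(track, hours_between_fixes=6.0):
    """Average forward speed over each pair of consecutive fixes."""
    return [
        haversine_km(*p, *q) / hours_between_fixes
        for p, q in zip(track, track[1:])
    ]


# Made-up (latitude, longitude) fixes, not taken from any archive.
track = [(15.0, 130.0), (15.5, 129.0), (16.2, 128.1)]
speeds = forward_speeds_kmh(track)
print([round(s, 1) for s in speeds])
```

With speeds and headings per segment in hand, questions about typical movement in a given region or month become simple aggregations.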

Expectations on students

References

Some resources that may make experimentation and implementation easier

Although some resources here are Python libraries, there is no restriction on the languages and tools you use. Indeed, a good data analysis and machine learning project like this one often requires the use of multiple languages.

Swift application design and development

Project info

Students intending to take this project should email me one or more outlines of their app design before selection, and pitch their idea in a meeting. Expect to refine the idea a few times before it is finalised.

Description

The open-source language Swift has evolved to version 5.1 and is gaining popularity among macOS/iOS/watchOS/tvOS developers as its library integration improves. The project is to develop a macOS/iOS/watchOS/tvOS application by looking into the details of the language and taking advantage of its features.

The student is free to design and implement a Swift-based application or application suite to run on macOS/iOS/watchOS/tvOS; that is, it is the student who proposes what the app should be like. Your supervisor's job is to make sure its scale is that of a final year project, and to oversee its progress.

Expectations on students

References

Some resources that may make experimentation and implementation easier

What the text

Project info

Description

Given a scanned image of text, identify the list of possible languages the text is in, in order of decreasing confidence.

Also, recognise the text and output it to a file for verification.

The implementation should be as platform-independent as possible so that it can be run standalone on mobile devices or desktop devices.

References

Some resources that may make experimentation and implementation easier

Form reader

Project info

Description

Though electronic forms are gaining widespread use, paper forms are still used often. For example, the Government's Care and Share Scheme (CSS) requires paper form submission, citing latency in tendering and building a new reliable system.

This project is about building a generic system that first reads a blank form through a webcam or mobile device, and can then extract and process the content of filled-in copies of the same form.

While a generic system powerful enough to process the CSS forms would involve handwritten character recognition, which can be quite involved, we can start with paper survey forms in which the respondent marks some of the boxes, with minimal optional free-form writing that can be collected as images.
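For the tick-box case, one simple approach is to compare the amount of ink inside each box on the filled form against the same region on the blank template. The sketch below assumes that both images are grayscale (0 is black, 255 is white), already aligned to each other, and that the box coordinates are known from the template. The function names, thresholds, and toy images are illustrative assumptions only.

```python
def dark_ratio(image, box, threshold=128):
    """Fraction of pixels darker than `threshold` inside box = (top, left, bottom, right)."""
    top, left, bottom, right = box
    pixels = [image[r][c] for r in range(top, bottom) for c in range(left, right)]
    return sum(1 for p in pixels if p < threshold) / len(pixels)


def is_marked(filled_image, blank_image, box, min_extra_ink=0.10):
    """A box counts as marked if it holds noticeably more ink than on the blank form."""
    extra = dark_ratio(filled_image, box) - dark_ratio(blank_image, box)
    return extra >= min_extra_ink


# Toy 6x12 images as nested lists (made up for the example).
blank = [[255] * 12 for _ in range(6)]
filled = [row[:] for row in blank]
for i in range(1, 5):
    filled[i][i] = 0  # a diagonal pen stroke inside the first box

box_a = (1, 1, 5, 5)   # contains the stroke
box_b = (1, 7, 5, 11)  # left empty

print(is_marked(filled, blank, box_a), is_marked(filled, blank, box_b))
```

Subtracting the blank template's ink keeps printed box borders and labels from being counted as marks. In practice the alignment step (finding the form in the camera frame and warping it onto the template) is likely to be the harder part.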

Ideally, the system should recognise the content within a second or two, so that autofeeding the forms under the camera is feasible.

A more powerful system would have features such as recognition of corrections in the forms, extraction of out-of-template elements, and recognition of written characters. Your mileage may vary.

Requirements

References

Some resources that may make experimentation and implementation easier

Financial data forecaster

Project info

Description

Collect time series of historical financial data (cryptocurrency prices, stock prices, futures prices, market indices) at different points in time, and design an algorithm that predicts their future values.

Some factors the algorithm can take into consideration include the day of the month, day of the week, time of day, various financial indicators, and correlations between data from different time series.
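The calendar-based factors can be derived directly from each observation's timestamp. The following is a minimal sketch using the standard library; the feature names and the sample timestamp are assumptions for illustration, and a real pipeline would also need to handle time zones and market opening hours.

```python
from datetime import datetime


def calendar_features(ts):
    """Turn a timestamp into simple calendar features for a forecasting model."""
    return {
        "day_of_month": ts.day,
        "weekday": ts.weekday(),  # Monday is 0, Sunday is 6
        "hour": ts.hour,
        "minute_of_day": ts.hour * 60 + ts.minute,
    }


# Made-up timestamp of one price observation.
sample = datetime(2020, 9, 14, 9, 30)
features = calendar_features(sample)
print(features)
```

Features like these are usually joined with the price-derived indicators into one row per observation before training.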

Time series of non-numerical data such as news articles, Twitter feeds, or Facebook posts can be analysed to improve the accuracy of the prediction. Indeed, some prior studies have reported this to be quite effective.

Note that the student is expected to build their own collection of training and testing data.

When you review the literature, be very careful about claims of accuracy better than 70%, especially when the system uses only historical numerical data or financial indicators.

This is not an easy project; otherwise everyone who can program would already be a billionaire.

Expectations on students

References

Some resources that may make experimentation and implementation easier

Although some resources here are Python libraries, there is no restriction on the languages and tools you use. Indeed, a good data analysis and machine learning project like this one often requires the use of multiple languages.

Visualising network intrusions

Project info

Description

Intrusion detection systems (IDSes) monitor network traffic and flag suspicious activities for network managers to take action.

Suspicious activities include port scans, repeated unsuccessful logins, man-in-the-middle (MitM) attacks, botnet attacks, and denial-of-service (DoS) attacks, among others.
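To make one of these concrete, repeated unsuccessful logins can be flagged with a per-source sliding window over login events. This is only a sketch under stated assumptions: the event tuple layout, the function name, and the thresholds are hypothetical, and the addresses come from reserved documentation ranges. A real IDS would parse such events from logs or captured traffic.

```python
from collections import defaultdict, deque


def flag_repeated_failures(events, max_failures=3, window_seconds=60):
    """Return the sources with at least `max_failures` failed logins within the window.

    `events` is an iterable of (timestamp_seconds, source, success) tuples,
    assumed to be sorted by timestamp.
    """
    recent = defaultdict(deque)  # source -> timestamps of recent failures
    flagged = set()
    for ts, source, success in events:
        if success:
            continue
        failures = recent[source]
        failures.append(ts)
        # Drop failures that have fallen out of the time window.
        while failures and ts - failures[0] > window_seconds:
            failures.popleft()
        if len(failures) >= max_failures:
            flagged.add(source)
    return flagged


# Made-up events for illustration.
events = [
    (0, "192.0.2.10", False),
    (10, "192.0.2.10", False),
    (15, "198.51.100.7", False),
    (20, "192.0.2.10", False),
    (200, "198.51.100.7", False),
    (400, "198.51.100.7", False),
]
flagged = flag_repeated_failures(events)
print(flagged)
```

The per-source counts over time are also a natural input for a real-time visualisation, for example as a heat map of sources against time.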

Suspicious activities are visualised and sent to a Security Information and Event Management (SIEM) system for logging.

This project is to build or enhance an IDS and/or SIEM system, with emphasis on the visualisation component that shows what is happening in real time.

Requirements

Deliverables

References

Some related resources on the web

Analysing ECG patterns

Project info

Description

Electrocardiogram (ECG) signals are useful for gaining insight into a person's health.

ECGs can be recorded using specialised equipment, or just an Apple Watch.

The app on the watch analyses whether the recording shows a normal sinus rhythm or signs of atrial fibrillation (AFib).

Yet, there are more types of rhythm that signal cardiac dysfunction, such as premature atrial contraction (APB), atrial flutter (AFL), supraventricular tachycardia (SVTA), premature ventricular contraction (PVC), bigeminy, and trigeminy.

This project is about classifying ECGs into different types.
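One feature commonly used when telling rhythms apart is the regularity of the intervals between successive R peaks (RR intervals). The sketch below detects peaks with a naive threshold and measures RR variability. The function names, the threshold, the 100 Hz sampling rate, and the synthetic spike trains are all assumptions for illustration. It is a feature-extraction sketch, not a classifier, and real ECGs need filtering and a more robust peak detector.

```python
import statistics


def r_peaks(signal, threshold):
    """Indices of local maxima above `threshold` (a naive R-peak detector)."""
    return [
        i
        for i in range(1, len(signal) - 1)
        if signal[i] > threshold and signal[i] >= signal[i - 1] and signal[i] > signal[i + 1]
    ]


def rr_intervals(peaks, fs):
    """Time in seconds between consecutive peaks, given sampling rate `fs` in Hz."""
    return [(b - a) / fs for a, b in zip(peaks, peaks[1:])]


def rr_variability(rr):
    """Coefficient of variation of the RR intervals (0 for a perfectly regular rhythm)."""
    return statistics.pstdev(rr) / statistics.mean(rr)


def spike_train(positions, length):
    """Synthetic stand-in for an ECG: unit spikes at the given sample positions."""
    signal = [0.0] * length
    for p in positions:
        signal[p] = 1.0
    return signal


FS = 100  # assumed sampling rate in Hz
regular = spike_train([50, 150, 250, 350, 450], 500)
irregular = spike_train([50, 120, 260, 330, 470], 500)

cv_regular = rr_variability(rr_intervals(r_peaks(regular, 0.5), FS))
cv_irregular = rr_variability(rr_intervals(r_peaks(irregular, 0.5), FS))
print(cv_regular, cv_irregular)
```

Features like this one could be combined with waveform-shape features and fed to whatever classifier the project settles on.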

Sample ECGs

Graphs generated from data from Pawel Plawiak (CC BY 4.0)


Sinus rhythm

Atrial fibrillation

Supraventricular tachycardia

Requirements

Deliverables

References

Some resources that may make experimentation and implementation easier

Studies on cryptocurrencies

Project info

Description

There are many kinds of cryptocurrencies, many exchanges, and many transactions per day. There are many ways to analyse cryptocurrency prices, transactions, and discussions, for users, traders, and analysts alike.

Students taking the project are to come up with a reasonable set of propositions for analysis of cryptocurrencies, collect the data, and come up with reasonable conclusions.

Another possible direction is to build a system for real-time or near real-time analysis of cryptocurrency data.

Requirements

Deliverables