Introduction

Email plays a crucial and ever-increasing role in our daily lives. As a primary means of communication and documentation in personal, educational, and work settings alike, email is a fast solution to our needs that is cheap or free. According to Statista, 281.1 billion emails were sent and received per day worldwide in 2018, and that number is projected to grow in the years to come. Meanwhile, phishing attempts have increased by 65% in the last year, leading to billions of dollars in losses. The importance of email security therefore cannot be overstated.

The greatest vulnerability in cybersecurity tends to be human error. According to a study conducted by the cybersecurity company Kaspersky, approximately 90% of data breaches affecting consumers were partially a result of social engineering, i.e. manipulating user behaviour. The best defence for reducing the risk of security breaches is to increase user awareness. Most popular email clients, such as Gmail, Outlook, and Apple Mail, have measures in place to alert users to suspicious emails that may contain fraudulent links or malicious attachments.

Our project aims to provide a level of security similar to that of these commonly used email clients through the use of malicious link databases and keyword analysis. In addition, our project will have a user interface tailor-made for HKU students and faculty members. We aim to build an email client that helps users increase their productivity and reduce the time wasted handling emails, while also protecting them from cybersecurity threats.

Objective

Our project will have the following objectives:

Create a system that can detect a high percentage of malicious links
Build an intuitive and visually appealing user interface
Create a client that is as lightweight as possible
Intelligently organise emails into folders to assist productivity

Background

How are malicious links distributed?

A malicious email often contains content that is hidden from the user. Attackers may pad the message with extra content in order to deceive the security measures that differentiate legitimate messages from illegitimate ones. The message may display text related to a particular company while the underlying links point to unrelated websites, which may be harmful. Such a mismatch usually indicates that the message is an attack rather than a legitimate link. Many users are not aware of this, which leads to losses such as stolen banking details, virus infections, spyware infections and so on.

Malicious links can also be embedded in attachments, especially compressed attachments. The contents of a compressed file are not known until it is extracted, so they are not detected by automated means. Some files can also be disguised as other file types, and such files have a high likelihood of being attacks. Compressed files, along with other types of files, can therefore hide an attack.
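As a partial mitigation, the names of the files inside a ZIP archive can be read from its directory listing without extracting anything. The following is a minimal Python sketch of that idea, written in Python for brevity (the project backend itself is planned in JavaScript and PHP). The function names, the sample archive, and the list of risky extensions are illustrative assumptions, not part of the project design.

```python
import io
import zipfile

# Hypothetical set of extensions treated as risky; a real list would need tuning.
RISKY_EXTENSIONS = (".exe", ".scr", ".js", ".vbs", ".bat")


def risky_members(archive_bytes):
    """Return the names of archive members that end in a risky extension.

    Only the archive's file listing is read; no member is extracted.
    """
    flagged = []
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        for name in archive.namelist():
            if name.lower().endswith(RISKY_EXTENSIONS):
                flagged.append(name)
    return flagged


def build_sample_archive():
    """Build a small in-memory ZIP used only to demonstrate the check."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("notes.txt", "hello")
        # A file disguised as a PDF by using a double extension.
        archive.writestr("invoice.pdf.exe", "not really a PDF")
    return buffer.getvalue()


flagged_names = risky_members(build_sample_archive())
```

This only inspects file names, so it would not catch a harmful file that carries an innocent-looking extension, nor archives nested inside other archives.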

How do email clients detect malicious links and malicious emails?

Since attackers constantly adapt and innovate to improve their chances of a successful attack, email clients usually incorporate several methods to detect malicious emails. Firstly, they use spam filtering to detect malicious links. The sender's email address is checked against a list of blacklisted addresses, and if there is a match, the email is rejected.
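The sender check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the blacklist contents and the function name `is_blacklisted` are hypothetical, and a real client would load the list from a maintained database rather than hard-code it.

```python
from email.utils import parseaddr

# Hypothetical blacklist; in practice this would come from a database.
BLACKLISTED_SENDERS = {"scammer@example.com"}


def is_blacklisted(from_header):
    """Return True if the address in a From header is on the blacklist."""
    # parseaddr splits 'Display Name <user@host>' into (name, address).
    _, address = parseaddr(from_header)
    # Compare in lower case so that capitalisation does not bypass the check.
    return address.lower() in BLACKLISTED_SENDERS
```

For example, `is_blacklisted("Bad Actor <Scammer@Example.com>")` returns `True`, while an address that is not on the list is let through.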

Attackers can create new email addresses to bypass filtering that relies on a database of blacklisted addresses. To counter this, incoming emails are usually analysed for suspicious keywords that often appear in malicious emails. Gmail, a state-of-the-art email client, features early phishing detection, a machine learning model that constantly analyses messages for signs of phishing in order to better protect user data. Gmail also uses artificial intelligence and machine learning frameworks to train its spam filters, so that tricks such as replacing letters with numbers can be identified and eliminated.
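Keyword analysis that tolerates simple letter-to-number substitutions can be illustrated with the rule-based Python sketch below. It is not Gmail's machine learning approach; it is a much simpler stand-in, and the keyword list, the substitution table, and the function name are all assumptions made for illustration.

```python
import re

# Hypothetical keyword list; a real one would be derived from known scam emails.
SUSPICIOUS_KEYWORDS = {"password", "verify", "urgent", "lottery"}

# Assumed mapping from common look-alike characters back to letters.
SUBSTITUTIONS = str.maketrans(
    {"0": "o", "1": "l", "3": "e", "4": "a", "5": "s", "7": "t", "@": "a", "$": "s"}
)


def suspicious_keywords(text):
    """Return the suspicious keywords found in text, in alphabetical order."""
    # Undo look-alike substitutions before matching, e.g. 'p4ssw0rd' -> 'password'.
    normalised = text.lower().translate(SUBSTITUTIONS)
    words = set(re.findall(r"[a-z]+", normalised))
    return sorted(words & SUSPICIOUS_KEYWORDS)
```

With this sketch, `suspicious_keywords("URGENT: v3rify your p4ssw0rd now")` returns `["password", "urgent", "verify"]`. Because genuine numbers are also rewritten during normalisation, such a filter should only contribute to a suspicion score rather than reject an email outright.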

Gmail also provides click-time warnings for malicious links as one of its security features. This detection model relies on Google's Safe Browsing service, which provides lists of URLs containing malicious content to Chrome, Firefox and Safari, as well as to ISPs. With the help of reputation and similarity analysis on those URLs, the email client raises a warning for any new URL that is deemed malicious. A warning is also raised if a link leads to an untrusted domain.
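The list-lookup part of this idea can be sketched in Python as follows. The sketch does not call the Safe Browsing service and performs no reputation or similarity analysis. It only checks a link's host, and each of its parent domains, against two local sets whose contents are illustrative assumptions, as are the function names.

```python
from urllib.parse import urlsplit

# Hypothetical local lists standing in for a real malicious-URL database.
BLOCKED_DOMAINS = {"malware.example"}
TRUSTED_DOMAINS = {"hku.hk", "example.org"}


def domain_candidates(host):
    """Return the host and its parent domains, e.g. a.b.c -> [a.b.c, b.c]."""
    labels = host.split(".")
    return [".".join(labels[i:]) for i in range(len(labels) - 1)]


def classify_url(url):
    """Classify a URL as 'blocked', 'trusted' or 'untrusted' by its host."""
    host = (urlsplit(url).hostname or "").lower()
    candidates = domain_candidates(host)
    if any(candidate in BLOCKED_DOMAINS for candidate in candidates):
        return "blocked"
    if any(candidate in TRUSTED_DOMAINS for candidate in candidates):
        return "trusted"
    # Unknown hosts are not rejected, but the client could warn on click.
    return "untrusted"
```

Checking parent domains means that a subdomain of a blocked domain is also caught, so `classify_url("http://login.malware.example/reset")` returns `"blocked"`.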

Open-source email clients such as Thunderbird use automated scam filtering, which looks for characteristics in an email that are typical of scam emails. Links with numerical server names are one characteristic often seen in scam emails. Thunderbird checks the link behind the displayed text and categorises the link as a scam if it does not match the displayed text. Image links are also checked to make sure that the link points to the image source. As in Gmail, a warning is raised if a link leads to an untrusted domain.
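Two of these checks, numerical server names and a mismatch between the displayed text and the real link target, can be sketched in Python with the standard library's HTML parser. This is not Thunderbird's actual implementation. It interprets "numerical server names" as raw IP addresses, which is an assumption, and all class and function names are hypothetical.

```python
import ipaddress
from html.parser import HTMLParser
from urllib.parse import urlsplit


class LinkCollector(HTMLParser):
    """Collect (href, displayed text) pairs for every <a> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def host_of(value):
    """Extract the lower-cased host from a URL or a bare host name."""
    candidate = value if "://" in value else "http://" + value
    return (urlsplit(candidate).hostname or "").lower()


def is_numeric_host(host):
    """Return True if the host is a raw IP address rather than a name."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def scam_reasons(html):
    """Return a list of reasons why links in the HTML body look suspicious."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    reasons = []
    for href, text in parser.links:
        target = host_of(href)
        if is_numeric_host(target):
            reasons.append(f"numeric host: {target}")
        # Only compare when the displayed text itself looks like a web address.
        text_is_url = "." in text and " " not in text
        if text_is_url and host_of(text) != target:
            reasons.append(f"text shows {host_of(text)} but link goes to {target}")
    return reasons
```

A link whose text reads `www.bank.example` but whose target is `http://192.0.2.7/login` would be flagged for both reasons, whereas a link whose text is an ordinary phrase is only subject to the numeric host check.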

Methodology

Our project will be a web-based email client that is compatible with modern web browsers. We will use HTML5 and CSS for the front-end appearance and user interface. The backend will be written in JavaScript and PHP.

We will use the Agile framework in development. Our development cycle will use vertical slicing to determine the tasks to be done in each iteration. Our first phase will build the basic UI functionality and malicious link detection. Subsequent iterations will continue to enhance the UI with added features and implement additional security measures. For the security measures, we plan to do more research to decide on the optimal methods. For now, we will use checking the sender's email address against a database of blacklisted addresses as a baseline.

Schedule

Period                     Task
01-09-2019 - 29-09-2019    Project plan
30-09-2019 - 31-10-2019    Research
01-11-2019 - 31-12-2019    Implementation of basic email client and security features
01-01-2020 - 31-01-2020    Improvement and addition of extra details
01-02-2020 - 28-02-2020    Improvement of user interface
01-03-2020 - 31-03-2020    User testing and Final touch up

Milestone

Date          Milestone                                         Status
29-09-2019    Submission of detailed project plan               Completed
31-10-2019    Completion of research                            Incomplete
31-12-2019    Completion of implementation of basic features    Incomplete
31-01-2020    Submission of Elaboration Deliverables            Incomplete
28-02-2020    Completion of Improvement of user interface       Incomplete
19-04-2020    Submission of Final Deliverables                  Incomplete