Smart Email Client to Detect Malicious URLs

This is the website for a final-year project hosted by the Department of Computer Science.


Background

We receive a plethora of emails every day, many of which contain URLs, and some of these emails are malicious. A malicious email is one that tries to lure us into visiting unexpected sites or downloading harmful software. For example, an email might claim to share special or secret information that will help the recipient with health issues, financial issues, or other common problems, such as a secret cure for high blood pressure and high cholesterol that the medical establishment does not want you to know about. Many of these emails do not provide details: you have to click through to get the secret. Once you click on the malicious URL, it can redirect you to websites designed to steal sensitive information such as credit card details and passwords. It is easy to look at an email and not see what is hidden behind the display, but behind it may be some complicated programming. With just a little knowledge, it is possible to understand how this works, and that knowledge can be leveraged to build a better alert system that avoids these potential problems. Our solution, therefore, is to build a new email client (or modify an existing open-source one).

Objective

Most people would not suspect that complicated programming might be lurking behind a very simple-looking email. Hence, the objective of our project is to build a new email client or modify the source of an existing one. The main goal of the client is to read through the contents of emails and analyse the URLs they contain. The analysis can be based on an existing online URL checker or on our own AI algorithm. Warning messages can then be issued to the user, indicating whether the URLs or attachments in the email are safe to open or harmful.

Number one: email as a vehicle for malware
> 50% of people: use email
13.3%: average click-through rate
54.6%: spam email

Methodology

In order to detect and combat malicious URLs, it is paramount to understand the goals and intentions of the sender.

Targeting

A malicious email may target one specific person (Targeted Malicious Email, TME, also known as spear phishing) or be sent to multiple recipients (UnTargeted Malicious Email, UTME). An email sent to multiple recipients is more easily identified as malicious than one aimed at a specific person. Because a targeted email costs the sender more to prepare, it is usually used only when high-value information is to be extracted. For example, a high-technology enterprise may be vulnerable to this attack because the targeted person may have access to databases holding marketing plans, client details or other sensitive data.

Addressing

To camouflage their identities, senders use techniques that mask the source of the email, such as impersonating someone on the recipient’s contact list or a celebrity.
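As a rough illustration of how a client might notice a masked source, the sketch below inspects the From and Reply-To headers with Python's standard email package. The two checks and the flag names are our own assumptions for illustration, not the checks implemented in the project.

```python
from email import message_from_string
from email.utils import parseaddr


def domain_of(address: str) -> str:
    """Return the lower-cased text after the last '@', or '' if there is none."""
    return address.rpartition("@")[2].lower() if "@" in address else ""


def addressing_flags(raw_message: str) -> list[str]:
    """Return hypothetical warning flags based on the sender headers."""
    msg = message_from_string(raw_message)
    display_name, from_addr = parseaddr(msg.get("From", ""))
    _, reply_addr = parseaddr(msg.get("Reply-To", ""))

    flags = []
    # Assumed check 1: the display name shows an address from another domain.
    if "@" in display_name and domain_of(display_name) != domain_of(from_addr):
        flags.append("display-name-mismatch")
    # Assumed check 2: replies are routed to a different domain than the sender's.
    if reply_addr and domain_of(reply_addr) != domain_of(from_addr):
        flags.append("reply-to-mismatch")
    return flags
```

A message that passes both checks is not necessarily safe; these flags would only be one input among several.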

Content

The content of emails holding malicious URLs may vary. The message body can include all kinds of hidden features that are not visible when the email is viewed in a normal viewing window.
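One hidden feature of this kind is a link whose visible text shows one address while the underlying href points somewhere else. The sketch below, which uses only the Python standard library, lists such links in an HTML body. The class and function names are hypothetical, and this is just one possible check rather than the project's actual analysis.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collect (href, visible text) pairs for every <a href=...> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only record text while we are inside a link that has an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def host_of(url: str) -> str:
    """Return the lower-cased host name of a URL, or '' if it has none."""
    return (urlparse(url).hostname or "").lower()


def mismatched_links(html_body: str) -> list[tuple[str, str]]:
    """Return links whose visible text is a URL on a different host than the href."""
    parser = LinkCollector()
    parser.feed(html_body)
    parser.close()
    suspicious = []
    for href, text in parser.links:
        looks_like_url = text.lower().startswith(("http://", "https://"))
        if looks_like_url and host_of(text) != host_of(href):
            suspicious.append((href, text))
    return suspicious
```

Links whose visible text is ordinary wording (for example "click here") are not flagged by this check, so it would need to be combined with analysis of the href itself.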

Given below is a simple flow chart summarising our methodology.



The first step in our methodology was to obtain a dataset containing URLs and then perform feature check analysis on them.
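To give an idea of what a feature check on a URL can look like, here is a minimal sketch that turns a URL into a few lexical features using the Python standard library. The feature set below is an assumption chosen for illustration; the features actually used in the project are not listed on this page.

```python
import ipaddress
from urllib.parse import urlparse


def url_features(url: str) -> dict:
    """Compute a small, illustrative set of lexical features for one URL."""
    parsed = urlparse(url)
    host = parsed.hostname or ""

    # A raw IP address in place of a domain name is a commonly cited warning sign.
    try:
        ipaddress.ip_address(host)
        host_is_ip = True
    except ValueError:
        host_is_ip = False

    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots": host.count("."),
        "num_hyphens": host.count("-"),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_at_symbol": "@" in url,
        "uses_https": parsed.scheme == "https",
        "host_is_ip": host_is_ip,
        "path_depth": len([p for p in parsed.path.split("/") if p]),
    }
```

Each URL in the dataset would be mapped to such a dictionary, and the resulting table of features is what a model is trained on.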



After that, it was important to perform data modelling on the dataset. Three models were compared on their accuracy in detecting malicious URLs, and Random Forest had the highest accuracy.
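Comparing models by accuracy can be sketched as follows. The model names and the labels are toy values made up purely to show the calculation; they are not the project's models or results.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def best_model(y_true, predictions):
    """Return the name of the most accurate model and all accuracy scores."""
    scores = {name: accuracy(y_true, preds) for name, preds in predictions.items()}
    winner = max(scores, key=scores.get)
    return winner, scores


# Toy held-out labels (1 = malicious, 0 = benign) and made-up predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
predictions = {
    "model_a": [1, 0, 0, 1, 0, 1, 1, 0],
    "model_b": [0, 0, 1, 1, 1, 0, 0, 0],
    "model_c": [1, 0, 1, 1, 0, 0, 1, 1],
}
winner, scores = best_model(y_true, predictions)
```

In practice the predictions would come from trained classifiers evaluated on URLs that were held out from training, and accuracy is often reported alongside other measures when the classes are imbalanced.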



Finally, we selected our model for classification and implemented email extraction, which pulls the URLs out of an email and then runs the model on them to detect malicious URLs.
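The extraction step could look roughly like the sketch below, which walks the text parts of a raw email, collects the URLs with a simple regular expression, and passes each one to a classifier. The regular expression, the function names and the classify callable are all assumptions made for illustration; in the project, classify would be replaced by the trained model.

```python
import re
from email import message_from_string

# Deliberately simple pattern: it will miss some URL forms.
URL_PATTERN = re.compile(r"https?://[^\s<>\"')]+", re.IGNORECASE)


def extract_urls(raw_message: str) -> list[str]:
    """Return the distinct URLs found in the text parts of a raw email."""
    msg = message_from_string(raw_message)
    urls = []
    for part in msg.walk():
        if part.get_content_maintype() != "text":
            continue
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        charset = part.get_content_charset() or "utf-8"
        text = payload.decode(charset, errors="replace")
        for url in URL_PATTERN.findall(text):
            url = url.rstrip(".,;:!?")  # drop trailing sentence punctuation
            if url not in urls:
                urls.append(url)
    return urls


def scan_email(raw_message: str, classify) -> list[tuple[str, str]]:
    """Label each extracted URL using classify(url), which returns True if malicious."""
    return [
        (url, "malicious" if classify(url) else "benign")
        for url in extract_urls(raw_message)
    ]
```

The list returned by scan_email is what the email client would use to decide which warning messages to show next to each link.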

Testing

After running our code, we get the following output. As can be seen, our model is able to classify both malicious and benign URLs correctly.



About Us

Shreya Palit

Developer

Dr. S.M. Yiu

Supervisor

Trisha Gupta

Developer
