Everyday, we receive a plethora of emails every day which may contain some URLs. These emails may be malicious. A malicious email is one which may probe us to redirect to unexpected sites or cause some harmful software to be downloaded. To give an example of a malicious email, an email might contain some special or secret information with the recipient that will help with his health issues, financial issues, or other common problems, like a secret cure for high blood pressure and high cholesterol that the medical establishment does not want you to know about. Many of these emails do not provide details: you have to click through to get the secret. Once you click on this malicious URL, it can redirect you to websites in order to steal your sensitive information like Credit Card details, passwords etc. It is easy to look at an email and not see what is hidden behind the display, but what is behind it is some complicated programming. Just a little bit of knowledge, it can be understood how this works, and leveraging this knowledge a better alert system can be built to avoid these potential problems. Our solution hence is to build a new (or modify an existing open-source) email client.
It is easy to look at an email and not see what is hidden behind the display. In fact, most people would not even suspect that behind a very simple looking email might be lurking some complicated programming. Hence the objective of our project is to build a new email client or modify an existing email client source. The main goal of the email client is that it can read through the contents of emails and perform analysis on the URLs that are included in the email. The analysis can be based on any existing online URL checker or our own AI algorithm. Warning messages can then be issued to the user accordingly as to whether the URL or attachments in the email are safe to open or whether they are harmful.
In order to detect and combat malicious URLs it is paramount to understand the goals and intentions of the sender.
The target of a malicious email may either be to one specific person (TME, also known as spear fishing) , or may be sent to multiple recipients (UnTargeted Malicious Email (UTME). The one targeting multiple recipients can be more easily identified as malicious as opposed to the one targeting one specific person. This is because of the cost trade-off to sender and is usually only used when a high level of information is to be extracted. For example, a high technology enterprise may be vulnerable to this attack as the person may hold database access such as plans of marketing, client details or other sensitive data.
In order for senders to camouflage their identities, they use certain hiding techniques in order to mask the email source, such as copying someone on the recipient’s contact list, or a celebrity.
The content of emails holding malicious URLs may vary. The message body can include all kinds of hidden features that you may not be able to see when you view it in your normal viewing window.
After running our code, we get the following successful output. As can be seen, our model is able to predict malicious as well as benign URLs accurately.