In this project, techniques for profiling cybercriminals will be developed. The major
technical issues are described as follows.
1) Criminal data consolidation
Multiple sources of data about cybercrimes and cybercriminals will be consulted. First,
a list of attributes will be devised based on the specific nature of cybercrime and
the online environment, in addition to the areas of concern in conventional criminal
profiling, such as personal background and particulars. A rough metric of the elements of
the target cybercrimes will be established. This metric will assist in developing the key
concepts in the process of cybercrime profiling. We will then develop a crawler module
that collects relevant digital data from various online platforms such as Internet
discussion forums and auction sites. We will apply psycholinguistic analysis to the textual
content and examine the structural relationships among the crawled data to further refine
the metric established earlier.
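As an illustration of the consolidation step, the following is a minimal sketch, assuming that crawled items have already been reduced to a source label, an online handle, and a text body. The `Record` type, the `CATEGORIES` word lists, and the feature names are hypothetical placeholders introduced here; an actual study would rely on a validated psycholinguistic lexicon and the attribute list devised for the project.

```python
import re
from collections import Counter
from dataclasses import dataclass

# Hypothetical, deliberately tiny word categories (assumption for illustration only).
CATEGORIES = {
    "first_person": {"i", "me", "my", "mine", "we", "our", "us"},
    "negation": {"no", "not", "never", "none"},
}

@dataclass
class Record:
    source: str  # assumed label, e.g. "forum" or "auction"
    handle: str  # online identity as it appears on the platform
    text: str    # textual content of the post or listing

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def linguistic_features(text):
    """Rate of each word category per token, plus average word length."""
    tokens = tokenize(text)
    total = len(tokens) or 1  # avoid division by zero on empty text
    counts = Counter(tokens)
    feats = {
        name: sum(counts[w] for w in words) / total
        for name, words in CATEGORIES.items()
    }
    feats["avg_word_len"] = sum(len(t) for t in tokens) / total
    return feats

def consolidate(records):
    """Group records by lowercased handle and compute features over all their text."""
    grouped = {}
    for r in records:
        entry = grouped.setdefault(r.handle.lower(), {"sources": set(), "texts": []})
        entry["sources"].add(r.source)
        entry["texts"].append(r.text)
    return {
        handle: {
            "sources": sorted(e["sources"]),
            "features": linguistic_features(" ".join(e["texts"])),
        }
        for handle, e in grouped.items()
    }
```

Grouping by lowercased handle is only a first approximation of consolidation; linking genuinely different handles to one person is the subject of identity resolution.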
2) Online identity resolution
Because users generally do not disclose their real identities online, and cybercriminals in
particular often mask or change their online identities, we shall develop
techniques for online identity resolution, which help to correlate different online
identities and remove duplicate entries. A group of features will be defined for screening the
posts or information authored by potential cybercriminals. The collected data will be fed
to a data parser for page parsing and for identifying information of a personal nature.
The results will be organized by individual target and stored in the data repository.
The topics or interests and the personal attributes of each target will be extracted. These
attributes may include, but are not limited to, the languages or codes being used, the
social links with other users, the online patterns of an individual, temporal behavior, and
interaction behavior. An analysis module will be developed to aid the categorization of
cybercriminals and correlate the matching behavioral characteristics. Categorization will
be conducted through two approaches. The first approach uses the
statistical technique of clustering to extract certain attributes and online features. The
second approach is a psycholinguistics-based analysis of the digital data. By
extracting behavioral information and identifying the key online features, we can draw
the signature of a specific target and identify the individual's possible multiple accounts.
A signature, as defined in criminal profiling theory, is the unique detail which links
multiple criminal incidents together. In this project, the signature is
expected to connect multiple online user accounts to a single user identity.
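One way to picture the signature-based linking described above is the sketch below. It is an assumption-laden illustration, not the project's method: it represents each account's signature as character trigram counts (a common stylometric stand-in for writing style) plus an hour-of-day activity histogram (temporal behavior), compares signatures by cosine similarity, and groups accounts whose similarity exceeds a threshold using a simple single-linkage grouping. The function names, the `text_weight` of 0.7, and the threshold of 0.6 are all hypothetical choices.

```python
import math
from collections import Counter
from itertools import combinations

def char_ngrams(text, n=3):
    """Count character n-grams after lowercasing and collapsing whitespace."""
    text = " ".join(text.lower().split())
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(v * b[k] for k, v in a.items() if k in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def signature(posts):
    """posts is a list of (hour_of_day, text); returns text and temporal profiles."""
    text_profile, hour_profile = Counter(), Counter()
    for hour, text in posts:
        text_profile.update(char_ngrams(text))
        hour_profile[hour] += 1
    return {"text": text_profile, "hours": hour_profile}

def similarity(a, b, text_weight=0.7):
    """Weighted mix of writing-style and temporal similarity (weights are assumed)."""
    return (text_weight * cosine(a["text"], b["text"])
            + (1 - text_weight) * cosine(a["hours"], b["hours"]))

def resolve(accounts, threshold=0.6):
    """Group account names whose signatures are similar (single-linkage via union-find)."""
    sigs = {name: signature(posts) for name, posts in accounts.items()}
    parent = {name: name for name in accounts}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in combinations(accounts, 2):
        if similarity(sigs[a], sigs[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in accounts:
        groups.setdefault(find(name), set()).add(name)
    return sorted(sorted(g) for g in groups.values())
```

Calling `resolve` on a mapping from account name to a list of `(hour, text)` posts returns groups of account names that may belong to the same individual. Any such grouping would be a lead for further examination rather than a confirmed identity.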
3) Categorization of cybercriminals and performing inference and prediction
After filtering, the personal characteristics extracted from the data will be
matched against the metric. The results will be used to reveal a set of key features that
represent the modus operandi of each type of cybercriminal. An analysis module will
be developed to aid the categorization of cybercriminals and correlate the matching
behavioral characteristics. With this, we will be able to establish a database of
cybercriminal profiles with accurate categorization, which may provide investigative leads
for cybercrime investigation and analysis.
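The matching of extracted characteristics against the metric could, for example, take the form of a nearest-reference comparison. The sketch below is a minimal illustration under stated assumptions: the category names, the feature names, and the reference values in `METRIC` are hypothetical placeholders, and Euclidean distance with a cut-off of 0.5 is only one of many possible matching rules.

```python
import math

# Hypothetical metric: each category lists expected feature values in [0, 1].
# These names and numbers are placeholders, not findings.
METRIC = {
    "fraud_seller": {
        "posts_in_marketplace": 0.9,
        "night_activity": 0.4,
        "uses_code_words": 0.7,
    },
    "malware_author": {
        "posts_in_marketplace": 0.2,
        "night_activity": 0.8,
        "uses_code_words": 0.3,
    },
}

def distance(profile, reference):
    """Euclidean distance over the features named in the reference entry."""
    return math.sqrt(
        sum((profile.get(k, 0.0) - v) ** 2 for k, v in reference.items())
    )

def categorize(profile, metric=METRIC, max_distance=0.5):
    """Return (category, distance); fall back to 'uncategorized' if nothing is close."""
    best, best_d = min(
        ((name, distance(profile, ref)) for name, ref in metric.items()),
        key=lambda pair: pair[1],
    )
    if best_d > max_distance:
        return "uncategorized", best_d
    return best, best_d
```

Keeping an explicit "uncategorized" outcome avoids forcing every target into a category when no reference entry fits well, which matters if the resulting profiles are to serve as investigative leads.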