BigData

Building Secure Big Data Applications

The cloud computing paradigm now enables various big data applications (e.g., Spark) to be deployed in public clouds. As a result, the sensitive data these applications process is exposed to severe security threats. Some of these applications, including Apache Storm, are written in Java.

Although Java has many security features, including type safety, the JVM runtime alone is insufficient to defend against privileged attacks, because adversaries may control the entire cloud software stack, including the OS kernel. This final year project applies Intel SGX to a popular big data streaming system, Apache Storm.

Methodology

Detailed Partition Design

Implementing Intel SGX function annotation in Apache Storm requires a detailed partitioning of functions, because the source code is written compactly: many components share the same functions, including the functions that need to be annotated.
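As an illustration of what function-level annotation could look like, the following is a minimal Java sketch. The `@EnclaveFunction` annotation, the `TupleProcessor` class, and the `enclaveMethods` helper are all hypothetical names introduced here; they are not part of Apache Storm or the Intel SGX SDK, and the project's actual annotation mechanism may differ. The sketch only shows how annotated functions in a shared component could be discovered so that the partition between enclave and non-enclave code can be listed.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class PartitionSketch {

    /** Hypothetical marker: the method is intended to run inside an SGX enclave. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    public @interface EnclaveFunction {}

    /** Hypothetical component that mixes sensitive and non-sensitive logic. */
    public static class TupleProcessor {
        @EnclaveFunction
        public int sumSensitive(int[] values) {
            int total = 0;
            for (int v : values) {
                total += v;
            }
            return total;
        }

        // Not annotated: assumed to stay outside the enclave.
        public String describe() {
            return "TupleProcessor";
        }
    }

    /** Returns the sorted names of methods declared on `type` that carry @EnclaveFunction. */
    public static List<String> enclaveMethods(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Method m : type.getDeclaredMethods()) {
            if (m.isAnnotationPresent(EnclaveFunction.class)) {
                names.add(m.getName());
            }
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        // Lists the methods of the hypothetical component marked for the enclave.
        System.out.println(enclaveMethods(TupleProcessor.class));
    }
}
```

Listing the annotated methods per class is one way to check a partition design when several components share the same function, since a shared function appears once in the list regardless of how many callers it has.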

Engineering Work

Engineering work, including but not limited to VM setup, server configuration, and Java and Intel SGX programming, is conducted on the computers provided by the supervisor and a PhD student. Changes are expected to be made at a low level in the Apache Storm source code, following the detailed partition design.

Evaluation

Once the engineering work is complete, the secure big data application will be evaluated against the original big data application or other existing secure big data systems running the same computations. The performance metrics tentatively include, but are not limited to, tuples processed per second per node.
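The throughput metric named above can be sketched as a small Java helper. The class and method names are hypothetical, and the numbers in `main` are made-up inputs chosen only to show the arithmetic; they are not measurements from this project. The relative-overhead calculation is an assumed way of comparing the secure and original systems, not a metric stated by the project.

```java
public class ThroughputSketch {

    /** Tuples processed per second per node over one measurement window. */
    public static double tuplesPerSecondPerNode(long tuples, double elapsedSeconds, int nodes) {
        if (elapsedSeconds <= 0 || nodes <= 0) {
            throw new IllegalArgumentException("elapsedSeconds and nodes must be positive");
        }
        return tuples / elapsedSeconds / nodes;
    }

    /** Fraction of baseline throughput lost by the secured system (0.2 means 20% slower). */
    public static double relativeOverhead(double baselineThroughput, double securedThroughput) {
        if (baselineThroughput <= 0) {
            throw new IllegalArgumentException("baselineThroughput must be positive");
        }
        return (baselineThroughput - securedThroughput) / baselineThroughput;
    }

    public static void main(String[] args) {
        // Illustrative inputs only: 1,200,000 tuples in 60 s on 4 nodes.
        double baseline = tuplesPerSecondPerNode(1_200_000L, 60.0, 4);
        // Illustrative inputs only: 960,000 tuples in 60 s on 4 nodes.
        double secured = tuplesPerSecondPerNode(960_000L, 60.0, 4);
        System.out.println("baseline tuples/s/node: " + baseline);
        System.out.println("secured tuples/s/node:  " + secured);
        System.out.println("relative overhead:      " + relativeOverhead(baseline, secured));
    }
}
```

Normalizing by node count keeps results comparable when the secure and original systems are run on clusters of different sizes.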

Team

CHEN Xing Quan Hansen
HKU Final Year Student

Dr. Heming Cui
PhD Columbia, HKU Assistant Professor

Documentation

  • ALL WORK

Timeline