Phase 1 - Inception

Completion of the necessary background information

Posted on September 29, 2019

Background

Red Tides - Causes and Effects

Red tides are discolorations of water caused by the rapid growth of algal colonies. They occur naturally, but human activities seem to increase their frequency and intensity. From the discharge of nutrient-rich fertilizers to global warming, factors of nutrient supply, salinity and temperature all come into play and can be traced back to human activities.

Most red tides are harmless, but some are toxic and harm people, fish, shellfish, marine animals and birds. These are known as Harmful Algal Blooms (HABs) [14] and are caused by harmful algal species. Though red tides, or algal blooms, are a common occurrence, HABs occur far less frequently [14]. Nonetheless, when HABs do occur they can cause severe losses to the aquaculture industry, both locally and throughout the world [15].

Current detection systems

Traditionally, lab-based methods were used to discriminate between algal species and to detect red tides [16]. With advancing technology, however, these off-site, time-consuming methods are being replaced by real-time, on-site techniques that use miniature sensors and cameras. Classification of algal species is done mainly by on-board image processors, which can store their predictions and be fine-tuned later. Detection of red tides is done on a large scale by remote sensing of surface pigment, reflectance or temperature. Satellite imaging techniques such as CZCS [16] and SeaWiFS [16] are also used.

Motivations

Previous early detection systems rely on indirect indicators of the root cause of the problem, the rapid growth of algae. This project instead aims to analyse the growth of particular algal species directly, using microscopic images. Given the high socio-economic impact of red tides, an early detection system that directly analyses algal growth should give the earliest warning of a potential HAB.

Objectives

Classification

The first objective of this project is to make a computer vision system which will detect and classify harmful algal species in microscopic images.

Testing new techniques to tackle class imbalance

Additionally, this project aims to test new techniques to tackle class imbalance. Algae databases typically have a large class imbalance: in one such database, 3 classes make up approximately 85% of all 3 million images.
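
To make the imbalance problem concrete, the sketch below computes inverse-frequency class weights, a common baseline for imbalanced classification. It is not one of the new techniques this project intends to test, only a reference point. The function name and the toy labels are hypothetical and are not taken from any plankton database.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return one weight per class, proportional to 1 / class frequency.

    Weights are scaled so that a perfectly balanced dataset
    would give a weight of 1.0 to every class.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical toy labels: three dominant classes and one rare class.
labels = ["a"] * 40 + ["b"] * 30 + ["c"] * 25 + ["rare"] * 5
weights = inverse_frequency_weights(labels)

for cls, weight in sorted(weights.items()):
    print(f"{cls}: {weight:.3f}")
```

Such weights are usually passed to the loss function so that errors on rare classes cost more than errors on dominant ones.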

Time-Series Analysis and Forecasting

Once classification reaches a reasonable accuracy, the project will begin analysing the growth of particular HAB-causing species. First, trends will be extracted as a feasibility study to see whether forecasting is possible. Then a system will be created to model and forecast possible HAB outbreaks from the growth of particular algal species.
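
As an illustration of what extracting a trend could look like, the sketch below smooths a series of per-day cell counts with a trailing moving average and checks whether the smoothed series keeps rising. The counts are made-up numbers and the window length is an arbitrary assumption; the project may well use a different trend estimator.

```python
def moving_average(series, window):
    """Trailing moving average; returns len(series) - window + 1 values."""
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    running = sum(series[:window])
    averages = [running / window]
    for i in range(window, len(series)):
        # Slide the window: add the newest value, drop the oldest.
        running += series[i] - series[i - window]
        averages.append(running / window)
    return averages


# Hypothetical daily counts of one species (not real data).
daily_counts = [2, 4, 3, 8, 10, 9, 20]
trend = moving_average(daily_counts, window=3)
is_rising = all(later > earlier for earlier, later in zip(trend, trend[1:]))

print(trend)      # [3.0, 5.0, 7.0, 9.0, 13.0]
print(is_rising)  # True
```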

Methodology

Dataset

Finding the right dataset is a challenge because two kinds of data are needed: a list of harmful algal species, and an image dataset that contains those species among others. The resources consulted for harmful species were:

  • IOC UNESCO Taxonomic Reference List [5]
  • AFCD’s HK Red Tide Information Network [13]

Numerous public imaging datasets provide annotated plankton images. The ones considered were:

  • EcoTaxa [1]
  • COPEPOD Database [2]
  • ICES Plankton Database [3]
  • PlanktonNet [4]
  • WHOI-Plankton Database [6]

The WHOI-Plankton Database was selected because it contains 14 of the harmful algal species listed in the two resources above.

Classification

Convolutional Neural Network

Convolutional Neural Networks (CNNs) [7] combine the benefits of convolution in image processing with the predictive power and optimization techniques of neural networks. A CNN learns and optimizes its own kernels for the problem at hand, which removes the need for manual feature engineering and greatly simplifies the process. Furthermore, downsampling between layers makes the network easier to train because it discards much of the irrelevant detail in the image.
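
The two operations described above can be sketched without any framework. The example below applies one kernel to a tiny image and then downsamples the result with max pooling. In a real CNN the kernel values are learned during training; here the kernel is set by hand as a vertical edge detector, and the image is a made-up 5x5 array, both purely for illustration. Max pooling is assumed as the downsampling step.

```python
def conv2d(image, kernel):
    """Valid (no padding) 2D cross-correlation, as used in CNN layers."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling; incomplete trailing windows are dropped."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Toy image: dark on the left, bright on the right (edge between columns 1 and 2).
image = [[0, 0, 1, 1, 1] for _ in range(5)]
kernel = [[-1, 1], [-1, 1]]  # hand-set vertical edge detector (learned in a real CNN)

features = conv2d(image, kernel)  # 4x4 map that responds only at the edge
pooled = max_pool2d(features)     # 2x2 map that keeps the strongest responses

print(features)  # [[0, 2, 0, 0], [0, 2, 0, 0], [0, 2, 0, 0], [0, 2, 0, 0]]
print(pooled)    # [[2, 0], [2, 0]]
```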

Transfer Learning

Transfer learning has been widely used in multiple domains such as NLP, sentiment analysis and image classification [11]. Current state-of-the-art plankton image classification models use transfer learning with a slight change in architecture, along with feature engineering [12]. This project will make use of transfer learning from multiple plankton domains to classify microscopic plankton images.
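
The core idea of transfer learning, reusing a feature extractor trained elsewhere and fitting only a small new classifier on top of it, can be shown with a framework-free toy. Everything in the sketch below is an assumption made for illustration: the "pretrained" base is a pair of hand-written weight rows rather than a real network, the four samples are invented, and the new head is plain logistic regression. It is not the architecture this project will use.

```python
import math

# Stand-in for a pretrained base. These weights are hypothetical and are
# never updated during training, which is what "frozen" means here.
FROZEN_BASE = [[1.0, -1.0], [0.5, 0.5]]


def extract_features(x):
    """Frozen feature extractor: one linear layer followed by ReLU."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in FROZEN_BASE]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_head(samples, labels, epochs=200, lr=0.5):
    """Fit only a logistic-regression head on top of the frozen features."""
    weights, bias = [0.0, 0.0], 0.0
    features = [extract_features(x) for x in samples]
    for _ in range(epochs):
        for f, y in zip(features, labels):
            p = sigmoid(sum(w * fi for w, fi in zip(weights, f)) + bias)
            error = p - y  # gradient of the log loss with respect to the logit
            weights = [w - lr * error * fi for w, fi in zip(weights, f)]
            bias -= lr * error
    return weights, bias


def predict(x, weights, bias):
    f = extract_features(x)
    score = sum(w * fi for w, fi in zip(weights, f)) + bias
    return 1 if sigmoid(score) >= 0.5 else 0


# Invented two-class toy data for the new task.
samples = [[2, 0], [3, 1], [0, 2], [1, 3]]
labels = [1, 1, 0, 0]

head_w, head_b = train_head(samples, labels)
predictions = [predict(x, head_w, head_b) for x in samples]
print(predictions)
```

With a real pretrained CNN the same split applies: the convolutional layers play the role of `extract_features`, and only the final classification layers are retrained on the plankton images.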

Time-Series Analysis and Forecasting

Long Short Term Memory Recurrent Neural Network

Recurrent Neural Networks (RNNs) are widely used on time-series data [9] to make predictions about the future from an observed trend. Long Short-Term Memory (LSTM) networks have improved the accuracy and training time of RNNs [8]. LSTMs have also been applied to time-series data, where they have been reported to outperform other techniques [10]. This project will therefore use LSTMs to forecast red tides.
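
For readers unfamiliar with LSTMs, the sketch below runs a single-unit LSTM cell with a forget gate over a short sequence, using only the standard library. The parameter values are arbitrary, untrained numbers and the input sequence is invented; a real forecasting model would use many units and learn its parameters from data.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One time step of a single-unit LSTM cell with a forget gate.

    params maps each gate name to a tuple (input weight, recurrent weight, bias).
    """
    def gate(name, activation):
        w_x, w_h, b = params[name]
        return activation(w_x * x + w_h * h_prev + b)

    f = gate("forget", sigmoid)        # how much of the old cell state to keep
    i = gate("input", sigmoid)         # how much of the new candidate to admit
    o = gate("output", sigmoid)        # how much of the cell state to expose
    g = gate("candidate", math.tanh)   # proposed new content

    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c


# Arbitrary, untrained parameters chosen only so the example runs.
params = {
    "forget": (0.5, 0.1, 1.0),
    "input": (0.8, -0.2, 0.0),
    "output": (0.3, 0.4, 0.0),
    "candidate": (1.0, 0.5, 0.0),
}

sequence = [0.1, 0.2, 0.4, 0.8]  # hypothetical normalised cell counts
h, c = 0.0, 0.0
hidden_states = []
for x in sequence:
    h, c = lstm_step(x, h, c, params)
    hidden_states.append(h)

print(hidden_states)
```

The cell state `c` is what lets the network carry information across many time steps, while the forget gate decides how quickly older observations fade.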

Schedule and Milestones

Time Milestone
2019
September
  • Research on project - background
  • Research on existing solutions
  • 29th – Deliverables of Phase 1
October
  • Research on other datasets
  • Analysis of possibilities of time-series data
November
  • Improving accuracy of classification
  • Forecasting using Time-Series
December
  • Evaluation of implementation
  • Iteration and improvement of results
2020
January
  • Exploration of alternate approaches
  • 7-11 – Project presentation
February
  • Implementation of alternate approaches
  • 2 – Deliverables of Phase 2
March
  • Evaluation of alternate implementation
  • Testing and experimentation
April
  • Final cleanup
  • Report writing
  • 19 – Deliverables of Phase 3
  • 20-24 – Final Presentation
May
  • 5 – Project exhibition

References

  1. Picheral M, Colin S, Irisson J-O (2017). EcoTaxa, a tool for the taxonomic classification of images. http://ecotaxa.obs-vlfr.fr.
  2. https://www.st.nmfs.noaa.gov/copepod/about/databases.html
  3. http://www.ices.dk/marine-data/dataset-collections/Pages/Plankton.aspx
  4. https://planktonnet.awi.de/#search
  5. https://www.whoi.edu/website/redtide/species/by-name/
  6. Orenstein, E. C., Beijbom, O., Peacock, E. E., & Sosik, H. M. (2015). WHOI-Plankton: A large scale fine grained visual recognition benchmark dataset for plankton classification. arXiv preprint arXiv:1510.00745.
  7. Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems (pp. 1097-1105).
  8. Gers, F. A., Schmidhuber, J., & Cummins, F. (1999). Learning to forget: Continual prediction with LSTM.
  9. Hüsken, M., & Stagge, P. (2003). Recurrent neural networks for time series classification. Neurocomputing, 50, 223-235.
  10. Karim, F., Majumdar, S., Darabi, H., & Chen, S. (2018). LSTM fully convolutional networks for time series classification. IEEE Access, 6, 1662-1669. doi: 10.1109/ACCESS.2017.2779939
  11. Pan, S. J., & Yang, Q. (2009). A survey on transfer learning. IEEE Transactions on knowledge and data engineering, 22(10), 1345-1359.
  12. Dai, J., Yu, Z., Zheng, H., Zheng, B., & Wang, N. (2016, November). A hybrid convolutional neural network for plankton classification. In Asian Conference on Computer Vision (pp. 102-114). Springer, Cham.
  13. https://www.afcd.gov.hk/english/fisheries/hkredtide/database/database.html
  14. https://www.afcd.gov.hk/english/fisheries/hkredtide/redtide.html
  15. https://www.afcd.gov.hk/english/fisheries/hkredtide/redtide/red04.html
  16. Sellner, K. G., Doucette, G. J., & Kirkpatrick, G. J. (2003). Harmful algal blooms: causes, impacts and detection. Journal of Industrial Microbiology and Biotechnology, 30(7), 383-406.