Covid-19 Detection App with Tensorflow: Intro

Estimated reading time: 6 minutes

In this series of articles, we are going to build a production-ready Covid-19 detection system prototype using Tensorflow. We are using X-ray image data to predict whether the given sample is Covid-19 Positive. We are planning to deploy this model on the Google Cloud Platform.

Introduction
Covid-19 Data Collection, Exploration & Processing
Covid-19 Model Training and Evaluation
App Development Using Streamlit
Model Deployment On Google Cloud Run
Summary

Introduction

Covid-19 has affected everyone across the globe. It is very important to detect a Covid-19 positive person as early as possible, because early detection helps limit the spread of the virus and allows early treatment. Doctors across the globe widely use the RT-PCR test to detect Covid-19 infection. This test requires throat and nose samples, and it takes around 24 hours to get results. It also requires special equipment, and if a result is not accurate, another test is needed.

As Covid-19 is a respiratory disease, it is possible to use machine learning to evaluate X-ray images. There are also other possibilities, such as using cough sounds, pulse rate, and other patient symptom data to predict Covid-19. In this work, we explore the possibility of using X-ray data to detect Covid-19 infection.

We have written this article to help researchers and engineers get started with the problem domain. The project is not a final product that can be used in a real-life environment. Improving it would require more data and a more thorough model evaluation.

The tools we are going to use to build this app are:

Tensorflow: For model training
Google Cloud: For hosting our application
Streamlit: For the interactive data app

Covid-19 Data Collection, Exploration & Processing

The dataset we are using for this project is the COVID-19 Radiography Database from Kaggle. The dataset contains four categories: Normal, Covid-19, Lung Opacity, and Viral Pneumonia images. For this project we are using only the Normal and Covid-19 images. We need to move these two folders into one common train data folder, whose path we pass to the TensorFlow data reader.
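
The folder preparation described above could be scripted roughly as follows. This is only a sketch: the class folder names in `SELECTED_CLASSES` and the helper `build_train_folder` are hypothetical and need to be adapted to the layout of the extracted dataset. The TensorFlow data reader infers the labels from the sub-folder names, so the common folder should contain exactly one sub-folder per class.

```python
import shutil
import tempfile
from pathlib import Path

# Hypothetical class folder names; adjust them to match the extracted dataset.
SELECTED_CLASSES = ("Normal", "COVID")


def build_train_folder(source_root, target_root, classes=SELECTED_CLASSES):
    """Copy the selected class folders into one common folder and count files."""
    source_root, target_root = Path(source_root), Path(target_root)
    counts = {}
    for name in classes:
        destination = target_root / name
        # dirs_exist_ok lets the function be re-run without failing (Python 3.8+).
        shutil.copytree(source_root / name, destination, dirs_exist_ok=True)
        counts[name] = sum(1 for p in destination.iterdir() if p.is_file())
    return counts


# Usage on a throwaway directory filled with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    raw = Path(tmp) / "raw"
    for name, n in (("Normal", 3), ("COVID", 2), ("Viral Pneumonia", 4)):
        (raw / name).mkdir(parents=True)
        for i in range(n):
            (raw / name / f"img_{i}.png").touch()
    demo_counts = build_train_folder(raw, Path(tmp) / "train")
    demo_folders = sorted(p.name for p in (Path(tmp) / "train").iterdir())

print(demo_counts)
print(demo_folders)
```

The returned counts also give a quick first look at the class balance before training.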

The Covid-19 images are 299×299 pixels; we resize them to 256×256 throughout the project. The following code reads the image data from the folder and divides it into training and validation sets with an 80:20 split.

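import tensorflow as tf

# Images are resized to 256×256 while loading
img_height, img_width = 256, 256

# folder_path (the common train data folder) and batch_size must be set beforehand.
# Training subset: 80% of the images
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
        folder_path,
        validation_split=0.2,
        subset="training",
        seed=123,
        image_size=(img_height, img_width),
        batch_size=batch_size)

# Validation subset: the remaining 20%; using the same seed keeps the two subsets disjoint
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
        folder_path,
        validation_split=0.2,
        subset="validation",
        seed=123,
        image_size=(img_height, img_width),
        batch_size=batch_size)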
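
To get a feeling for what the 80:20 split means in terms of batches, the arithmetic can be sketched in plain Python. The numbers below are hypothetical and only illustrate the calculation; the helper `split_sizes` is not part of TensorFlow, and it assumes that the validation share is rounded down to a whole number of images. The number of training batches is what appears as the step count per epoch in a Keras training log.

```python
import math


def split_sizes(total_images, validation_split, batch_size):
    """Return image and batch counts for a training/validation split."""
    # Assumption: the validation share is rounded down to a whole image count.
    n_val = int(total_images * validation_split)
    n_train = total_images - n_val
    return {
        "train_images": n_train,
        "val_images": n_val,
        # The last batch may be smaller than batch_size, hence the ceiling.
        "train_steps": math.ceil(n_train / batch_size),
        "val_steps": math.ceil(n_val / batch_size),
    }


# Hypothetical numbers for illustration only.
print(split_sizes(total_images=1000, validation_split=0.2, batch_size=32))
```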

import matplotlib.pyplot as plt

# Class names are inferred from the sub-folder names
class_names = train_ds.class_names

# Display the first nine images of one batch in a 3×3 grid with their labels
def show_images(ds):
    plt.figure(figsize=(10, 10))
    for images, labels in ds.take(1):
        for i in range(9):
            ax = plt.subplot(3, 3, i + 1)
            plt.imshow(images[i].numpy().astype("uint8"))
            plt.title(class_names[labels[i]])
            plt.axis("off")

Fig 1. Training data samples

We can perform data augmentation to generate more training data and avoid overfitting.

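from tensorflow import keras
from tensorflow.keras import layers

# Random horizontal flip followed by a small random rotation
data_augmentation_layer = keras.Sequential(
    [
        layers.experimental.preprocessing.RandomFlip("horizontal"),
        layers.experimental.preprocessing.RandomRotation(0.1),
    ]
)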
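
To make the two operations more concrete, here is a small plain-Python sketch of what they do; it is an illustration, not how the Keras layers are implemented. A horizontal flip mirrors each pixel row from left to right. For the rotation, the Keras documentation describes the factor as a fraction of a full turn (2π), so 0.1 should correspond to random rotations of up to 36 degrees in either direction. The helper names below are our own.

```python
import math


def horizontal_flip(image):
    """Mirror an image, given as a list of pixel rows, left to right."""
    return [list(reversed(row)) for row in image]


def max_rotation_degrees(factor):
    """Largest rotation for a factor expressed as a fraction of a full turn."""
    return math.degrees(factor * 2 * math.pi)


tiny_image = [
    [1, 2, 3],
    [4, 5, 6],
]
print(horizontal_flip(tiny_image))          # [[3, 2, 1], [6, 5, 4]]
print(round(max_rotation_degrees(0.1), 6))  # 36.0
```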

Let’s check images after performing this operation.

Fig 2. Images after data augmentation

We need to normalize the images to reduce processing overhead and improve training performance. We are using the rescaling layer provided by Tensorflow, which maps pixel values from the range 0 to 255 into the range 0 to 1. Because this layer is part of the model, we don't need any separate preprocessing at the inference stage.

normalization_layer = layers.experimental.preprocessing.Rescaling(1. / 255)

Covid-19 Model Training and Evaluation

In this series of articles, we are going to solve this task using two methods:

Custom CNN Model
Transfer Learning with performance tuning

# Return the custom CNN with the augmentation and normalization layers built in
def get_custom_model(data_augmentation_layer, normalize_layer):
    return tf.keras.Sequential([
        data_augmentation_layer,
        normalize_layer,
        tf.keras.layers.Conv2D(16, 3, padding='same', activation='relu'),
        tf.keras.layers.MaxPooling2D(),

        tf.keras.layers.Conv2D(32, 3, padding='same', activation='relu'),
        tf.keras.layers.MaxPooling2D(),

        tf.keras.layers.Conv2D(64, 3, padding='same', activation='relu'),
        tf.keras.layers.MaxPooling2D(),

        tf.keras.layers.Conv2D(128, 3, padding='same', activation='relu'),
        tf.keras.layers.MaxPooling2D(),

        tf.keras.layers.Conv2D(256, 3, padding='same', activation='relu'),
        tf.keras.layers.MaxPooling2D(),

        tf.keras.layers.Conv2D(512, 3, padding='same', activation='relu'),
        tf.keras.layers.MaxPooling2D(),

        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation='relu'),
        tf.keras.layers.Dense(1, activation='sigmoid')
    ])

The custom CNN model contains six convolutional layers, followed by two dense layers.
We use the ReLU activation function in all convolutional layers.
The output layer has a single unit because the problem is binary classification.
The sigmoid activation function keeps the output values between 0 and 1.
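
The size of this network can be traced by hand. The sketch below assumes 256×256 RGB inputs (three channels), 'same' padding in every convolution, and the default 2×2 max pooling; the helper `summarize_custom_cnn` is our own and only mirrors the layer list above. The augmentation and rescaling layers are left out because they have no trainable weights.

```python
def summarize_custom_cnn(height=256, width=256, channels=3,
                         filters=(16, 32, 64, 128, 256, 512),
                         kernel=3, dense_units=128):
    """Trace feature-map sizes and count trainable parameters by hand."""
    total_params = 0
    in_channels = channels
    for out_channels in filters:
        # 'same' padding keeps height and width; weights plus one bias per filter.
        total_params += (kernel * kernel * in_channels + 1) * out_channels
        # Default 2x2 max pooling halves each spatial dimension (rounded down).
        height, width = height // 2, width // 2
        in_channels = out_channels
    flattened = height * width * in_channels
    total_params += (flattened + 1) * dense_units   # hidden dense layer
    total_params += dense_units + 1                 # single sigmoid output unit
    return (height, width, in_channels), flattened, total_params


final_shape, flattened, total_params = summarize_custom_cnn()
print(final_shape, flattened, total_params)  # (4, 4, 512) 8192 2621601
```

Under these assumptions, roughly 40% of the parameters sit in the first dense layer, so the input resolution has a direct effect on model size.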

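# Build the model from the augmentation and normalization layers defined earlier
model_custom = get_custom_model(data_augmentation_layer, normalization_layer)

# Stop training once val_loss has not improved for 3 epochs
es_callback = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=3)
# 'prc' is the area under the precision-recall curve
model_custom.compile(optimizer='adam',
                     loss=tf.keras.losses.binary_crossentropy,
                     metrics=['accuracy', keras.metrics.AUC(name='prc', curve='PR')])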

Early stopping is used to avoid overfitting: training stops once the validation loss has not improved for three consecutive epochs.
The model is trained with the Adam optimizer, which can handle sparse gradients and noisy problems.
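
The compile step above pairs the sigmoid output with a binary cross-entropy loss. As a rough illustration of how the two fit together, the following plain-Python sketch turns raw scores into probabilities, thresholds them, and computes the loss. The scores, labels, and the 0.5 cut-off are made up for the example, and the functions are simplified stand-ins rather than the Keras implementations.

```python
import math


def sigmoid(z):
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy; eps keeps log() away from zero."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


# Hypothetical raw scores and labels, for illustration only.
scores = [-2.0, 0.0, 3.0]
labels = [0, 1, 1]
probs = [sigmoid(z) for z in scores]
predicted = [int(p >= 0.5) for p in probs]  # 0.5 is an assumed cut-off

print([round(p, 3) for p in probs])
print(predicted)
print(round(binary_cross_entropy(labels, probs), 4))
```

The loss is small when confident predictions are correct and grows quickly when a confident prediction is wrong, which is why it suits a single probability output.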

history = model_custom.fit(
      train_ds,
      validation_data=val_ds,
      epochs=50,
      callbacks=[es_callback]
    )

Epoch 1/50
346/346 [==============================] - 83s 232ms/step - loss: 0.5890 - accuracy: 0.7346 - prc: 0.7763 - val_loss: 0.4866 - val_accuracy: 0.7339 - val_prc: 0.8995
Epoch 2/50
346/346 [==============================] - 71s 205ms/step - loss: 0.4412 - accuracy: 0.7846 - prc: 0.9257 - val_loss: 0.3181 - val_accuracy: 0.8476 - val_prc: 0.9689
Epoch 3/50
346/346 [==============================] - 67s 195ms/step - loss: 0.3381 - accuracy: 0.8439 - prc: 0.9626 - val_loss: 0.2937 - val_accuracy: 0.8820 - val_prc: 0.9776
Epoch 4/50
346/346 [==============================] - 89s 259ms/step - loss: 0.2664 - accuracy: 0.8898 - prc: 0.9765 - val_loss: 0.2201 - val_accuracy: 0.9200 - val_prc: 0.9809
Epoch 5/50
346/346 [==============================] - 74s 213ms/step - loss: 0.2488 - accuracy: 0.9004 - prc: 0.9794 - val_loss: 0.1895 - val_accuracy: 0.9345 - val_prc: 0.9843
Epoch 6/50
346/346 [==============================] - 69s 200ms/step - loss: 0.2282 - accuracy: 0.9087 - prc: 0.9813 - val_loss: 0.1882 - val_accuracy: 0.9305 - val_prc: 0.9879
Epoch 7/50
346/346 [==============================] - 69s 200ms/step - loss: 0.1993 - accuracy: 0.9234 - prc: 0.9862 - val_loss: 0.1316 - val_accuracy: 0.9515 - val_prc: 0.9947
Epoch 8/50
346/346 [==============================] - 99s 286ms/step - loss: 0.1838 - accuracy: 0.9236 - prc: 0.9894 - val_loss: 0.1359 - val_accuracy: 0.9439 - val_prc: 0.9946
Epoch 9/50
346/346 [==============================] - 79s 227ms/step - loss: 0.1561 - accuracy: 0.9397 - prc: 0.9923 - val_loss: 0.1281 - val_accuracy: 0.9490 - val_prc: 0.9939
Epoch 10/50
346/346 [==============================] - 74s 214ms/step - loss: 0.1636 - accuracy: 0.9330 - prc: 0.9909 - val_loss: 0.1271 - val_accuracy: 0.9529 - val_prc: 0.9945
Epoch 11/50
346/346 [==============================] - 74s 215ms/step - loss: 0.1533 - accuracy: 0.9395 - prc: 0.9918 - val_loss: 0.1186 - val_accuracy: 0.9540 - val_prc: 0.9936
Epoch 12/50
346/346 [==============================] - 73s 210ms/step - loss: 0.1455 - accuracy: 0.9428 - prc: 0.9927 - val_loss: 0.1291 - val_accuracy: 0.9453 - val_prc: 0.9951
Epoch 13/50
346/346 [==============================] - 71s 207ms/step - loss: 0.1276 - accuracy: 0.9476 - prc: 0.9953 - val_loss: 0.0826 - val_accuracy: 0.9692 - val_prc: 0.9971
Epoch 14/50
346/346 [==============================] - 70s 202ms/step - loss: 0.1203 - accuracy: 0.9538 - prc: 0.9951 - val_loss: 0.0844 - val_accuracy: 0.9707 - val_prc: 0.9968
Epoch 15/50
346/346 [==============================] - 70s 203ms/step - loss: 0.1187 - accuracy: 0.9510 - prc: 0.9944 - val_loss: 0.1142 - val_accuracy: 0.9551 - val_prc: 0.9973
Epoch 16/50
346/346 [==============================] - 77s 222ms/step - loss: 0.1281 - accuracy: 0.9512 - prc: 0.9937 - val_loss: 0.1176 - val_accuracy: 0.9613 - val_prc: 0.9943
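
Training stops after epoch 16 although 50 epochs were requested. The plain-Python sketch below replays the val_loss values from the log to show why. It is a simplified model of patience-based early stopping (no minimum delta, no baseline), and the function `early_stopping_epoch` is our own helper, not part of Keras.

```python
def early_stopping_epoch(val_losses, patience):
    """Return the 1-based epoch at which patience-based stopping triggers.

    Returns None if training would run through every recorded epoch.
    """
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best, wait = loss, 0      # improvement: reset the counter
        else:
            wait += 1                 # no improvement this epoch
            if wait >= patience:
                return epoch
    return None


# val_loss values copied from the training log in this article.
val_losses = [0.4866, 0.3181, 0.2937, 0.2201, 0.1895, 0.1882, 0.1316, 0.1359,
              0.1281, 0.1271, 0.1186, 0.1291, 0.0826, 0.0844, 0.1142, 0.1176]

print(early_stopping_epoch(val_losses, patience=3))  # 16
print(val_losses.index(min(val_losses)) + 1)         # 13
```

The lowest validation loss is reached at epoch 13, three epochs before training ends. By default the callback keeps the weights of the last epoch; passing `restore_best_weights=True` to `EarlyStopping` is one way to keep the weights from the best epoch instead.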

After training our neural network, we achieve ~96% validation accuracy.
The next step is to save the model. We need to use the saved model in our interactive app.
In the next tutorial, we will use transfer learning to train the model.

App Development Using Streamlit

Streamlit is a tool for transforming data scripts into interactive web apps. If you are new to this tool, give it a try with these simple commands:

pip install streamlit
streamlit hello

Model Deployment On Google Cloud Run

Google Cloud Run is a managed serverless platform. It scales from zero to many instances without any downtime. The platform provides metrics and logs, which help track model performance and issues with ease.

In the upcoming part of the series, we are going to build a Covid-19 detection app using Streamlit and deploy it on Google Cloud Run.

Summary

In this article, we started by understanding the problem domain and solved it using deep learning. The code is available on GitHub.
In the next part of this series, we will train this model using transfer learning and deploy our data app on Google Cloud Run.
You can check out the app.
Feel free to share your thoughts and opinions.
