Can you fight cancer with the help of Big Data and Machine Learning? How can these technologies help with diagnosis, treatment and drug discovery?

Cancer is a disease with a long-tail distribution: it has many different causes, and no single cure addresses them all. Contrast this with diseases that affect a huge number of people but have a single cause. Take cholera, for instance. Cholera is caused by food or water contaminated with Vibrio cholerae, and by nothing else. When we discover the single source of an illness, it is comparatively simple to overcome it.

In our arsenal for waging and winning this war on cancer, Big Data and Machine Learning are among the most powerful weapons.

Data Explosion and Gene Sequencing

One area that produces a massive amount of data is gene sequencing. How much data, you may ask? Human gene sequencing produces data equivalent to a quarter of YouTube’s yearly data production. In terms of scale, if this data, combined with the additional information that comes with genome sequencing, were burned onto 4GB DVDs, you would be looking at a stack half a mile high.
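To get a feel for that comparison, here is a rough back-of-the-envelope sketch of how much data a half-mile stack of 4GB DVDs would hold. The disc thickness of about 1.2 mm is an assumption brought in for the estimate, not a figure from the article, so treat the result as an order-of-magnitude check only.

```python
# Rough estimate of the data held by a stack of DVDs of a given height.
# Assumption (not from the article): one DVD is about 1.2 mm thick.
DVD_THICKNESS_M = 1.2e-3
DVD_CAPACITY_GB = 4            # capacity quoted in the article
METERS_PER_MILE = 1609.344


def stack_capacity_gb(height_miles: float) -> float:
    """Return the total capacity in GB of a DVD stack of the given height."""
    discs = height_miles * METERS_PER_MILE / DVD_THICKNESS_M
    return discs * DVD_CAPACITY_GB


half_mile_gb = stack_capacity_gb(0.5)
# 1 PB = 1,000,000 GB in decimal units
print(f"{half_mile_gb / 1e6:.2f} PB")  # prints "2.68 PB"
```

Under these assumptions the stack works out to a little under 3 petabytes.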

The techniques for gene sequencing have improved over the years, and the cost has plunged. In 2008, the cost of gene sequencing was 10 million dollars. Today, it works out to just 1,000 dollars, and it is expected to fall further. By 2025, an estimated 1 billion individuals will have had their genes sequenced. By 2030, genomics is expected to generate somewhere between 2 and 40 exabytes of data a year.
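The size of that price drop is easier to appreciate as a ratio. The short sketch below uses only the two dollar figures quoted above; expressing the drop as a number of "halvings" is just one convenient way to read it, not a claim about how the price actually fell year by year.

```python
import math

COST_2008_USD = 10_000_000   # figure quoted in the article
COST_TODAY_USD = 1_000       # figure quoted in the article

fold_reduction = COST_2008_USD / COST_TODAY_USD
halvings = math.log2(fold_reduction)   # how many times the cost halved

print(f"{fold_reduction:,.0f}-fold cheaper, about {halvings:.1f} halvings")
# prints "10,000-fold cheaper, about 13.3 halvings"
```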

Fighting Cancer with Big Data and Machine Learning

The fight against cancer can be advanced on many fronts if the large amount of data being generated is combined with Machine Learning algorithms. Machine Learning can assist with diagnosis, treatment and prognosis. It also makes customized therapy possible, which is how the long-tail distribution can be dealt with.

Labelled data can be used to diagnose cancer. This is possible because of the vast number of Electronic Medical Records (EMR) and other data records available from hospitals. Natural Language Processing is used to make sense of doctors’ prescriptions, and Deep Learning neural networks are used to analyze CT and MRI scans. Machine Learning algorithms sift through the EMR database and find hidden patterns, which helps with diagnosis.

For example, a student in the US designed an Artificial Neural Network at home and developed a model that was able to diagnose breast cancer with remarkable accuracy.

Diagnosing with Big Data and Machine Learning

A fine example of how cancer diagnosis can be improved is the case of 16-year-old Brittany Wenger. She took it upon herself to improve diagnostics when her older cousin was diagnosed with breast cancer. Fine Needle Aspiration (FNA) is a less invasive method of detecting cancer, but doctors thought it was not reliable. Brittany wanted to make this method better and decided to put her coding abilities to use. If FNA could be made reliable, women would have access to a less invasive diagnostic method.

The first step was to obtain public domain data from the University of Wisconsin, which included FNA results. She then coded an artificial neural network, used cloud technologies for data processing, and trained the network to detect similarities. It was a massive process of trial and error, but in the end her network could detect breast cancer from FNA test data with 99.1% sensitivity to malignancy. This method isn’t restricted to breast cancer and is being used to detect other cancers as well.
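It helps to be precise about what a figure like 99.1% sensitivity to malignancy measures: of all the cases that really are malignant, it is the fraction the model flags as malignant. The sketch below shows the calculation on a handful of made-up labels. The data and the function name are hypothetical and have nothing to do with Brittany's actual model or results.

```python
def sensitivity(y_true, y_pred, positive="malignant"):
    """Fraction of truly positive cases that were predicted positive."""
    pairs = list(zip(y_true, y_pred))
    true_pos = sum(1 for t, p in pairs if t == positive and p == positive)
    false_neg = sum(1 for t, p in pairs if t == positive and p != positive)
    if true_pos + false_neg == 0:
        raise ValueError("y_true contains no positive cases")
    return true_pos / (true_pos + false_neg)


# Hypothetical labels, for illustration only.
y_true = ["malignant", "malignant", "benign", "malignant", "benign", "malignant"]
y_pred = ["malignant", "malignant", "benign", "benign", "malignant", "malignant"]

# 3 of the 4 malignant cases are caught, so this prints 0.75
print(sensitivity(y_true, y_pred))
```

Note that the benign case wrongly flagged as malignant does not affect sensitivity at all, which is why sensitivity is usually reported alongside other measures.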

The amount and quality of data determine the accuracy of the diagnosis. The more data that is available, the more records the algorithms can query, and the better they become at finding similarities and producing valuable models.

Treating Cancer with Big Data and Machine Learning

Moving on from diagnosis, Big Data and Machine Learning play a huge role in the treatment of cancer as well. Let’s take another case, in which 49-year-old Kathy was diagnosed with stage III breast cancer. Kathy’s husband John was the CIO of a hospital in Boston, and he planned Kathy’s treatment with the help of Big Data tools that he had designed.

A powerful search tool called SHRINE (Shared Health Research Information Network) was created in 2008 with the help of Harvard-affiliated hospitals, which shared their databases. By the time Kathy was diagnosed, the doctors treating her could sift through close to 6 million records. SHRINE could be queried with questions like “Stage 3 breast cancer and treatment for 50-year-old woman”. This information allowed doctors to avoid surgery and treat her with chemotherapy drugs customized for her tumor cells.
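To make the idea of such a query concrete, here is a minimal sketch of filtering patient records for a similar cohort and counting which treatments that cohort received. It is not SHRINE's real interface or data model: the `PatientRecord` fields, the `find_similar` helper, the five-year age window and all of the records are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class PatientRecord:
    """Hypothetical, simplified record; real EMR data is far richer."""
    age: int
    sex: str
    diagnosis: str
    stage: int
    treatment: str


def find_similar(records, diagnosis, stage, age, sex="F", age_window=5):
    """Return records with the same diagnosis, stage and sex, and a nearby age."""
    return [
        r for r in records
        if r.diagnosis == diagnosis
        and r.stage == stage
        and r.sex == sex
        and abs(r.age - age) <= age_window
    ]


# Made-up records, for illustration only.
records = [
    PatientRecord(49, "F", "breast cancer", 3, "chemotherapy"),
    PatientRecord(52, "F", "breast cancer", 3, "chemotherapy"),
    PatientRecord(51, "F", "breast cancer", 3, "surgery"),
    PatientRecord(67, "F", "breast cancer", 3, "surgery"),
    PatientRecord(50, "F", "breast cancer", 1, "surgery"),
    PatientRecord(48, "M", "lung cancer", 3, "radiation"),
]

cohort = find_similar(records, "breast cancer", stage=3, age=50)
treatment_counts = Counter(r.treatment for r in cohort).most_common()
print(treatment_counts)  # prints [('chemotherapy', 2), ('surgery', 1)]
```

With millions of real records instead of six invented ones, the same kind of filter is what lets doctors see how patients like Kathy were treated.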

Once the chemotherapy was completed, the radiologists couldn’t find any more tumor cells in Kathy’s body. This is an example of how Big Data tools allow a treatment plan to be customized to the patient’s needs.

The one-size-fits-all treatment process does not work for cancer because of its long-tail distribution. For customized treatment plans to work, the following are key: diagnostic test results, gene sequence, gene mutation data, and Big Data and Machine Learning tools.

Drug Discovery with Big Data and Machine Learning

Moving further from diagnosis and treatment, Big Data and Machine Learning can help revolutionize drug discovery. Researchers use open data and computational resources to discover new uses for existing drugs that have already been approved by agencies like the FDA. For example, a group of students at the University of California, San Francisco used Big Data technologies and Machine Learning algorithms to find that a drug used to treat pinworms could shrink a carcinoma, a type of liver cancer, in mice. This particular carcinoma is the second largest contributor to cancer deaths in the world.
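One way this kind of search can work is to look for drugs whose effect on gene activity runs opposite to the disease's effect. That technique is an assumption used here for illustration; the article does not describe the method the students used. In the sketch below, every drug name and number is invented, and each list holds the change in activity of the same four hypothetical genes.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical gene activity changes in tumor tissue versus normal tissue.
disease_signature = [2.0, 1.5, -1.0, -2.0]

# Hypothetical gene activity changes caused by each drug.
drug_signatures = {
    "drug_a": [-1.8, -1.2, 0.9, 2.1],   # roughly reverses the disease pattern
    "drug_b": [1.9, 1.4, -0.8, -2.2],   # mimics the disease pattern
    "drug_c": [0.1, -0.3, 0.2, -0.1],   # little relation to the disease
}

# The most negative correlation comes first: the strongest "reversal" candidate.
ranked = sorted(
    drug_signatures,
    key=lambda name: pearson(disease_signature, drug_signatures[name]),
)
print(ranked)  # prints ['drug_a', 'drug_c', 'drug_b']
```

A ranking like this only produces candidates. Any real candidate would still have to be tested in the lab, as the pinworm drug was in mice.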

Apart from finding new uses for existing drugs, new drugs can also be discovered. Using data on different drugs, their properties, chemical composition, disease symptoms, side effects and so on, new drugs can be devised to treat various types of cancer. This makes devising new medicines easier and helps save millions of dollars in the process.

In summary, cancer is dangerous and comes in many different forms. In Big Data and Machine Learning, we now possess a stronger arsenal to combat it. From diagnosis, to treatment plans, to drug repurposing and discovery, we have the ability to beat cancer at every stage.
