Archive for November, 2020

The way we train AI is fundamentally flawed – MIT Technology Review

For example, they trained 50 versions of an image recognition model on ImageNet, a dataset of images of everyday objects. The only difference between training runs was the random values assigned to the neural network at the start. Yet despite all 50 models scoring more or less the same in the training test, suggesting that they were equally accurate, their performance varied wildly in the stress test.

The stress test used ImageNet-C, a dataset of images from ImageNet that have been pixelated or had their brightness and contrast altered, and ObjectNet, a dataset of images of everyday objects in unusual poses, such as chairs on their backs, upside-down teapots, and T-shirts hanging from hooks. Some of the 50 models did well with pixelated images, some did well with the unusual poses; some did much better overall than others. But as far as the standard training process was concerned, they were all the same.

The researchers carried out similar experiments with two different NLP systems, and three medical AIs for predicting eye disease from retinal scans, cancer from skin lesions, and kidney failure from patient records. Every system had the same problem: models that should have been equally accurate performed differently when tested with real-world data, such as different retinal scans or skin types.

"We might need to rethink how we evaluate neural networks," says Rohrer. "It pokes some significant holes in the fundamental assumptions we've been making."

D'Amour agrees. "The biggest, immediate takeaway is that we need to be doing a lot more testing," he says. That won't be easy, however. The stress tests were tailored specifically to each task, using data taken from the real world or data that mimicked the real world. This is not always available.

Some stress tests are also at odds with each other: models that were good at recognizing pixelated images were often bad at recognizing images with high contrast, for example. It might not always be possible to train a single model that passes all stress tests.

One option is to design an additional stage to the training and testing process, in which many models are produced at once instead of just one. These competing models can then be tested again on specific real-world tasks to select the best one for the job.
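The selection stage described above can be sketched in a few lines of Python. This is only an illustration, not the researchers' procedure: the candidate names, the per-condition scores, and the choice to rank by the mean stress-test score are all assumptions made for the example.

```python
from statistics import mean

def select_best_model(candidates, stress_tests):
    """Return the name of the candidate with the highest mean stress-test score."""
    def score(name):
        model = candidates[name]
        return mean(test(model) for test in stress_tests)
    return max(candidates, key=score)

# Hypothetical stand-ins: each "model" is just a dict of accuracies
# measured under different stress conditions.
candidates = {
    "seed_0": {"pixelated": 0.71, "unusual_pose": 0.40},
    "seed_1": {"pixelated": 0.55, "unusual_pose": 0.62},
    "seed_2": {"pixelated": 0.66, "unusual_pose": 0.58},
}

# Each stress test maps a model to a score; here it simply looks one up.
stress_tests = [
    lambda m: m["pixelated"],
    lambda m: m["unusual_pose"],
]

best = select_best_model(candidates, stress_tests)
print(best)
```

A task that cares about only one condition would pass a single stress test instead, which may pick a different candidate.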

That's a lot of work. But for a company like Google, which builds and deploys big models, it could be worth it, says Yannic Kilcher, a machine-learning researcher at ETH Zurich. Google could offer 50 different versions of an NLP model and application developers could pick the one that worked best for them, he says.

D'Amour and his colleagues don't yet have a fix but are exploring ways to improve the training process. "We need to get better at specifying exactly what our requirements are for our models," he says. "Because often what ends up happening is that we discover these requirements only after the model has failed out in the world."

Getting a fix is vital if AI is to have as much impact outside the lab as it is having inside. When AI underperforms in the real world it makes people less willing to want to use it, says co-author Katherine Heller, who works at Google on AI for healthcare: "We've lost a lot of trust when it comes to the killer applications, that's important trust that we want to regain."

Machine Learning Predicts How Cancer Patients Will Respond to Therapy – HealthITAnalytics.com

November 18, 2020 - A machine learning algorithm accurately determined how well skin cancer patients would respond to tumor-suppressing drugs in four out of five cases, according to research conducted by a team from NYU Grossman School of Medicine and Perlmutter Cancer Center.

The study focused on metastatic melanoma, a disease that kills nearly 6,800 Americans each year. Immune checkpoint inhibitors, which keep tumors from shutting down the immune system's attack on them, have been shown to be more effective than traditional chemotherapies for many patients with melanoma.

However, half of patients don't respond to these immunotherapies, and these drugs are expensive and often cause side effects in patients.

"While immune checkpoint inhibitors have profoundly changed the treatment landscape in melanoma, many tumors do not respond to treatment, and many patients experience treatment-related toxicity," said corresponding study author Iman Osman, medical oncologist in the Departments of Dermatology and Medicine (Oncology) at New York University (NYU) Grossman School of Medicine and director of the Interdisciplinary Melanoma Program at NYU Langone's Perlmutter Cancer Center.

"An unmet need is the ability to accurately predict which tumors will respond to which therapy. This would enable personalized treatment strategies that maximize the potential for clinical benefit and minimize exposure to unnecessary toxicity."

Researchers set out to develop a machine learning model that could help predict a melanoma patient's response to immune checkpoint inhibitors. The team collected 302 images of tumor tissue samples from 121 men and women treated for metastatic melanoma with immune checkpoint inhibitors at NYU Langone hospitals.

They then divided these slides into 1.2 million portions of pixels, the small bits of data that make up images. These were fed into the machine learning algorithm along with other factors, such as the severity of the disease, which kind of immunotherapy regimen was used, and whether a patient responded to the treatment.

The results showed that the machine learning model achieved an AUC of 0.8 in both the training and validation cohorts, and was able to predict which patients with a specific type of skin cancer would respond well to immunotherapies in four out of five cases.
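For readers unfamiliar with the metric, AUC can be read as the probability that a randomly chosen responder receives a higher score than a randomly chosen non-responder, so it measures ranking quality rather than the share of correct predictions. The sketch below computes it by pairwise comparison; the labels and scores are invented for illustration and are not the study's data.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Made-up example: 1 = responded to treatment, 0 = did not respond.
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.15, 0.65, 0.5, 0.4, 0.2, 0.1]

auc = roc_auc(labels, scores)
print(auc)
```

In this toy data 20 of the 25 responder/non-responder pairs are ordered correctly, which gives an AUC of 0.8.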

"Our findings reveal that artificial intelligence is a quick and easy method of predicting how well a melanoma patient will respond to immunotherapy," said study first author Paul Johannet, MD, a postdoctoral fellow at NYU Langone Health and its Perlmutter Cancer Center.

Researchers repeated this process with 40 slides from 30 similar patients at Vanderbilt University to determine whether the results would be similar at a different hospital system that used different equipment and sampling techniques.

"A key advantage of our artificial intelligence program over other approaches such as genetic or blood analysis is that it does not require any special equipment," said study co-author Aristotelis Tsirigos, PhD, director of applied bioinformatics laboratories and clinical informatics at the Molecular Pathology Lab at NYU Langone.

The team noted that aside from the computer needed to run the program, all materials and information used in the Perlmutter technique are a standard part of cancer management that most, if not all, clinics use.

"Even the smallest cancer center could potentially send the data off to a lab with this program for swift analysis," said Osman.

The machine learning method used in the study is also more streamlined than current predictive tools, such as analyzing stool samples or genetic information, which promises to reduce treatment costs and speed up patient wait times.

"Several recent attempts to predict immunotherapy responses do so with robust accuracy but use technologies, such as RNA sequencing, that are not readily generalizable to the clinical setting," said corresponding study author Aristotelis Tsirigos, PhD, professor in the Institute for Computational Medicine at NYU Grossman School of Medicine and member of NYU Langone's Perlmutter Cancer Center.

"Our approach shows that responses can be predicted using standard-of-care clinical information such as pre-treatment histology images and other clinical variables."

However, researchers also noted that the algorithm will not be ready for clinical use until they can boost the accuracy from 80 percent to 90 percent and test the algorithm at more institutions. The research team plans to collect more data to improve the performance of the model.

Even at its current level of accuracy, the model could be used as a screening method to determine which patients across populations would benefit from more in-depth tests before treatment.

"There is potential for using computer algorithms to analyze histology images and predict treatment response, but more work needs to be done using larger training and testing datasets, along with additional validation parameters, in order to determine whether an algorithm can be developed that achieves clinical-grade performance and is broadly generalizable," said Tsirigos.

"There is data to suggest that thousands of images might be needed to train models that achieve clinical-grade performance."

DIY Camera Uses Machine Learning to Audibly Tell You What it Sees – PetaPixel

Adafruit Industries has created a machine learning camera built with the Raspberry Pi that can identify objects extremely quickly and audibly tell you what it sees. The group has listed all the necessary parts you need to build the device at home.

The camera is based on Adafruit's BrainCraft HAT add-on for the Raspberry Pi 4, and uses TensorFlow Lite object recognition software to recognize what it is seeing. According to Adafruit's website, it's compatible with both the 8-megapixel Pi camera and the 12.3-megapixel interchangeable lens version of the module.

While interesting on its own, DIY Photography makes a solid point by explaining a more practical use case for photographers:

"You could connect a DSLR or mirrorless camera from its trigger port into the Pi's GPIO pins, or even use a USB connection with something like gPhoto, to have it shoot a photo or start recording video when it detects a specific thing enter the frame."

A camera that is capable of recognizing what it is looking at could be used to only take a photo when a specific object, animal, or even a person comes into the frame. That would mean it could have security system or wildlife monitoring applications. Whenever you might wish your camera knew what it was looking at, this kind of technology would make that a reality.
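The decision step behind such a trigger is small. The sketch below assumes the recognition software hands back a list of (label, confidence) pairs for each frame; that format, the function name, and the 0.6 threshold are hypothetical and are not taken from Adafruit's guide or from TensorFlow Lite.

```python
def should_trigger(detections, targets, min_confidence=0.6):
    """Return True if any wanted label is detected with enough confidence.

    detections: iterable of (label, confidence) pairs for one frame
                (a hypothetical format chosen for this example).
    targets:    set of labels that should fire the shutter.
    """
    return any(
        label in targets and confidence >= min_confidence
        for label, confidence in detections
    )

# One made-up frame: a low-confidence cat and a high-confidence bird.
frame = [("cat", 0.42), ("bird", 0.91)]

print(should_trigger(frame, {"bird"}))  # True
print(should_trigger(frame, {"cat"}))   # False
```

In a real build, a True result would be the point at which the Pi pulses a GPIO pin or asks the attached camera to shoot.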

You can find all the parts you will need to build your own version of this device on Adafruit's website. They have also published an easy machine learning guide for the Raspberry Pi as well as a guide on running TensorFlow Lite.

(via DPReview and DIY Photography)

How machine learning was used to decode an ancient Chinese cave – Times of India

The name of the cell, a prosaic Cave 465, does not quite convey the cornucopia of imagery it contains: angry Tantric deities in frenzied sexual union with their consorts. For decades, researchers have tried figuring out how old the Buddhist cave temple at the Mogao site along the ancient Silk Road in China is. Estimates range from the 9th century to the 14th. But now, the discovery of hidden Sanskrit inscriptions on pieces of paper stuck to its ceiling has helped narrow down its origins.

On the edge of the Gobi desert, by the Dachuan river, the Mogao Caves have baffled researchers, who have settled on a thousand-year window for when all the 492 caves were carved out of cliffs, one at a time, starting in the 4th century CE. Each cell, at first, appears isolated in its own history, linked to others through a grid of associations established by identifying pigments, painting styles or plain old radiocarbon dating. But Cave 465, to the north of the site, is unique.

SVG Tech Insight: Increasing Value of Sports Content Machine Learning for Up-Conversion HD to UHD – Sports Video Group

This fall SVG will be presenting a series of White Papers covering the latest advancements and trends in sports-production technology. The full series of SVG's Tech Insight White Papers can be found in the SVG Fall SportsTech Journal.

Following the height of the 2020 global pandemic, live sports are starting to re-emerge worldwide, albeit predominantly behind closed doors. For the majority of sports fans, video is the only way they can watch and engage with their favorite teams or players. This means the quality of the viewing experience itself has become even more critical.

With UHD being adopted by both households and broadcasters around the world, there is a marked expectation around visual quality. To realize these expectations in the immediate term, it will be necessary for some years to up-convert from HD to UHD when creating 4K UHD sports channels and content.

This is not so different from the early days of HD, where SD sports-related content had to be up-converted to HD. In the intervening years, however, machine learning as a technology has progressed sufficiently to be a serious contender for performing better up-conversions than more conventional techniques, when specifically designed to work for TV content.

Ideally, we want to process HD content into UHD with a simple black box arrangement.

The problem with conventional up-conversion, though, is that it does not offer an improved resolution, so does not fully meet the expectations of the viewer at home watching on a UHD TV. The question, therefore, becomes: can we do better for the sports fan? If so, how?

UHD is a progressive scan format, with the native TV formats being 3840×2160, known as 2160p59.94 (usually abbreviated to 2160p60) or 2160p50. The corresponding HD formats, with the frame/field rates set by region, are either progressive 1280×720 (720p60 or 720p50) or interlaced 1920×1080 (1080i30 or 1080i25).

Conversion from HD to UHD for progressive images at the same rate is fairly simple. It can be achieved using spatial processing only. Traditionally, this might use a bi-cubic interpolation filter (a 2-dimensional interpolation commonly used for photographic image scaling). This uses a grid of 4×4 source pixels and interpolates intermediate locations in the center of the grid. The conversion from 1280×720 to 3840×2160 requires a 3x scaling factor in each dimension and is almost the ideal case for an upsampling filter.
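As a rough illustration of this kind of filter, the sketch below up-scales a tiny image by a factor of 3 using a 4×4 neighbourhood of source pixels for every output pixel. The specific cubic kernel (the common cubic convolution kernel with a = -0.5), the edge clamping, and the pixel-centre alignment are assumptions made for the example; the white paper does not say which variant was used.

```python
import math

def cubic_weight(x, a=-0.5):
    """Cubic convolution kernel; a = -0.5 is an assumed, commonly used value."""
    x = abs(x)
    if x <= 1:
        return (a + 2) * x**3 - (a + 3) * x**2 + 1
    if x < 2:
        return a * x**3 - 5 * a * x**2 + 8 * a * x - 4 * a
    return 0.0

def bicubic_sample(img, y, x):
    """Interpolate img (a list of rows) at fractional position (y, x)
    from the surrounding 4x4 source pixels, clamping at the borders."""
    h, w = len(img), len(img[0])
    y0, x0 = math.floor(y), math.floor(x)
    total = 0.0
    for j in range(y0 - 1, y0 + 3):
        wy = cubic_weight(y - j)
        jj = min(max(j, 0), h - 1)
        for i in range(x0 - 1, x0 + 3):
            wx = cubic_weight(x - i)
            ii = min(max(i, 0), w - 1)
            total += wy * wx * img[jj][ii]
    return total

def upscale(img, factor=3):
    """Up-scale by an integer factor, aligning pixel centres."""
    h, w = len(img), len(img[0])
    out = []
    for yy in range(h * factor):
        y = (yy + 0.5) / factor - 0.5
        out.append([bicubic_sample(img, y, (xx + 0.5) / factor - 0.5)
                    for xx in range(w * factor)])
    return out

small = [[10.0, 10.0], [10.0, 10.0]]
big = upscale(small, factor=3)   # 2x2 -> 6x6, the same 3x ratio as 720p -> 2160p
print(len(big), len(big[0]))
```

Because the kernel weights always sum to one, a flat area stays flat after scaling; the filter can smooth between existing samples but cannot add detail.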

These types of filters can only interpolate, producing an image that is better than the result of nearest-neighbor or bi-linear interpolation but that does not appear to be of higher resolution.

Machine Learning (ML) is a technique whereby a neural network learns patterns from a set of training data. Images are large, and it becomes unfeasible to create neural networks that process this data as a complete set. So, a different structure is used for image processing, known as Convolutional Neural Networks (CNNs). CNNs are structured to extract features from the images by successively processing subsets from the source image and then processing the features rather than the raw pixels.

Up-conversion process with neural network processing

The inbuilt non-linearity, in combination with feature-based processing, means CNNs can invent data not in the original image. In the case of up-conversion, we are interested in the ability to create plausible new content that was not present in the original image, but that doesn't modify the nature of the image too much. The CNN used to create the UHD data from the HD source is known as the Generator CNN.

When input source data needs to be propagated through the whole chain, possibly with scaling involved, then a specific variant of a CNN known as a Residual Network (ResNet) is used. A ResNet has a number of stages, each of which includes a contribution from a bypass path that carries the input data. For this study, a ResNet with scaling stages towards the end of the chain was used as the Generator CNN.
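The bypass idea can be shown with plain lists standing in for image data. In the sketch below each stage adds its own contribution to an untouched copy of its input; the transforms are hypothetical stand-ins for the learned convolutional layers a real ResNet would use, and the scaling stages are left out.

```python
def residual_stage(x, transform):
    """One ResNet-style stage: the stage's own output plus the bypassed input."""
    return [xi + ti for xi, ti in zip(x, transform(x))]

def run_chain(x, transforms):
    """Pass the data through a sequence of residual stages."""
    for transform in transforms:
        x = residual_stage(x, transform)
    return x

# Hypothetical stand-ins for learned layers.
def contributes_nothing(v):
    return [0.0 for _ in v]

def small_boost(v):
    return [0.1 * vi for vi in v]

signal = [1.0, 2.0, 3.0]
print(run_chain(signal, [contributes_nothing, contributes_nothing]))
print(run_chain(signal, [small_boost]))
```

A stage that contributes nothing passes its input through unchanged, which is how source data can survive a long chain of stages.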

For the Generator CNN to do its job, it must be trained with a set of known data (patches of reference images), and a comparison is made between the output and the original. For training, the originals are a set of high-resolution UHD images, down-sampled to produce HD source images, then up-converted and finally compared to the originals.
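Building one training pair from a UHD patch might look like the sketch below. The box-average down-sampling and the factor of 3 (matching 2160p to 720p) are assumptions for illustration; the white paper does not state which down-sampling filter was used.

```python
def downsample(img, factor=3):
    """Shrink img (a list of rows) by averaging each factor x factor block."""
    h, w = len(img) // factor, len(img[0]) // factor
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            block = [img[y * factor + dy][x * factor + dx]
                     for dy in range(factor) for dx in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out

def make_training_pair(uhd_patch, factor=3):
    """Return (HD source, UHD target): the generator up-converts the first
    element and its output is compared against the second."""
    return downsample(uhd_patch, factor), uhd_patch

# A made-up 6x6 "UHD" patch with a simple gradient.
uhd_patch = [[float(y * 6 + x) for x in range(6)] for y in range(6)]
hd_source, target = make_training_pair(uhd_patch)
print(hd_source)
```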

The difference between the original and synthesized UHD images is calculated by the compare function with the error signal fed back to the Generator CNN. Progressively, the Generator CNN learns to create an image with features more similar to original UHD images.

The training process is dependent on the data set used for training, and the neural network tries to fit the characteristics seen during training onto the current image. This is intriguingly illustrated in Google's AI Blog [1], where a neural network presented with a random noise pattern introduces shapes like the ones used during training. It is important that a diverse, representative content set is used for training. Patches from about 800 different images were used for training during the process of MediaKind's research.

The compare function affects the way the Generator CNN learns to process the HD source data. It is easy to calculate a sum of absolute differences between the original and synthesized images. However, this causes an issue due to training set imbalance: real pictures have large proportions with relatively little fine detail, so training is biased towards regenerating a result very similar to that of a bicubic interpolation filter.

This doesn't really achieve the objective of creating plausible fine detail.

Generative Adversarial Neural Networks (GANs) are a relatively new concept [2], where a second neural network, known as the Discriminator CNN, is used and is itself trained during the training process of the Generator CNN. The Discriminator CNN learns to detect the difference between features that are characteristic of original UHD images and synthesized UHD images. During training, the Discriminator CNN sees either an original UHD image or a synthesized UHD image, with the detection correctness fed back to the discriminator and, if the image was a synthesized one, also fed back to the Generator CNN.

Each CNN is attempting to beat the other: the Generator by creating images that have characteristics more like originals, while the Discriminator becomes better at detecting synthesized images.
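The contest can be written down as two loss functions, one per network. The sketch below uses the binary cross-entropy formulation that is standard for GANs, with the widely used "non-saturating" generator term; the white paper does not state which exact losses were used, so treat this as an assumed, textbook form.

```python
import math

def bce(prob_real, is_real, eps=1e-12):
    """Binary cross-entropy for one discriminator output.

    prob_real: the discriminator's estimate that the image is an original.
    is_real:   the label the output is being scored against.
    """
    p = min(max(prob_real, eps), 1 - eps)
    return -math.log(p) if is_real else -math.log(1 - p)

def discriminator_loss(d_on_original, d_on_synth):
    """Low when originals are scored as real and synthesized images as fake."""
    return bce(d_on_original, True) + bce(d_on_synth, False)

def generator_adversarial_loss(d_on_synth):
    """Low when the discriminator is fooled into scoring a synthesized image as real."""
    return bce(d_on_synth, True)

# A confident, correct discriminator versus one that is only guessing.
print(discriminator_loss(0.9, 0.1), discriminator_loss(0.5, 0.5))
# The generator is rewarded when its output is scored as "real".
print(generator_adversarial_loss(0.9), generator_adversarial_loss(0.1))
```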

The result is the synthesis of feature details that are characteristic of original UHD images.

With a GAN approach, there is no real constraint to the ability of the Generator CNN to create new detail everywhere. This means the Generator CNN can create images that diverge from the original image in more general ways. A combination of both compare functions can offer a better balance, retaining the detail regeneration, but also limiting divergence. This produces results that are subjectively better than conventional up-conversion.
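One plausible way to combine the two compare functions is a weighted sum, sketched below. The weight `adv_weight`, its value, and the use of a mean rather than a raw sum of absolute differences are illustrative assumptions, not figures from the white paper.

```python
import math

def mean_abs_diff(original, synthesized):
    """Average absolute pixel difference between two equally sized images."""
    diffs = [abs(o - s)
             for row_o, row_s in zip(original, synthesized)
             for o, s in zip(row_o, row_s)]
    return sum(diffs) / len(diffs)

def hybrid_generator_loss(original, synthesized, d_on_synth, adv_weight=0.01):
    """Pixel-difference term (limits divergence from the original) plus a
    weighted adversarial term (rewards detail the discriminator finds plausible)."""
    content = mean_abs_diff(original, synthesized)
    adversarial = -math.log(max(d_on_synth, 1e-12))
    return content + adv_weight * adversarial

original = [[0.0, 1.0], [1.0, 0.0]]
synthesized = [[0.0, 0.5], [1.0, 0.5]]

# The same synthesized image costs more when the discriminator spots it as fake.
print(hybrid_generator_loss(original, synthesized, d_on_synth=1.0))
print(hybrid_generator_loss(original, synthesized, d_on_synth=0.5))
```

Raising the adversarial weight favors invented detail; lowering it pulls the result back towards the smooth, interpolation-like output.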

Conversion from 1080i60 to 2160p60 is necessarily more complex than from 720p60. Starting from 1080i, there are three basic approaches to up-conversion:

Training data is required here, which must come from 2160p video sequences. This enables a set of fields to be created, which are then downsampled, with each field coming from one frame in the original 2160p sequence, so the fields are not temporally co-located.
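The way fields end up at different points in time can be seen in a small sketch. Here each progressive frame contributes only one field, with the line parity alternating from frame to frame; the frames are tiny made-up lists of rows, and the down-sampling step mentioned above is omitted for brevity.

```python
def frames_to_fields(frames):
    """Build an interlaced field sequence from progressive frames.

    Even-numbered frames supply their even lines (top field) and
    odd-numbered frames supply their odd lines (bottom field), so
    consecutive fields come from different moments in time.
    """
    fields = []
    for n, frame in enumerate(frames):
        parity = n % 2
        fields.append(frame[parity::2])
    return fields

# Two made-up frames of four lines each (one value per line).
frame_0 = [[0], [1], [2], [3]]
frame_1 = [[10], [11], [12], [13]]

fields = frames_to_fields([frame_0, frame_1])
print(fields)
```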

Surprisingly, results from field-based up-conversion tended to be better than using de-interlaced frame conversion, despite using sophisticated motion-compensated de-interlacing: the frame-based conversion was dominated by the artifacts from the de-interlacing process. However, it is clear that potentially useful data from the opposite fields did not contribute to the result, and the field-based approach missed data that could produce a better result.

A solution to this is to feed data from multiple fields directly into a modified Generator CNN as the source data, letting the GAN learn how best to perform the de-interlacing function. This approach was adopted and re-trained with a new set of video-based data, where adjacent fields were also provided.

This led to both high visual spatial resolution and good temporal stability. These are, of course, best viewed as a video sequence; however, an example of one frame from a test sequence shows the comparison:

Comparison of a sample frame from different up-conversion techniques against original UHD

Up-conversion using a hybrid GAN with multiple fields was effective across a range of content, but is especially relevant for the consumer's visual sports experience. This offers a realistic means by which content that has more of the appearance of UHD can be created from both progressive and interlaced HD sources, which in turn can enable an improved experience for the fan at home when watching a sports UHD channel.

[1] A. Mordvintsev, C. Olah and M. Tyka, "Inceptionism: Going Deeper into Neural Networks," 2015. [Online]. Available: https://ai.googleblog.com/2015/06/inceptionism-going-deeper-into-neural.html

[2] I. Goodfellow et al., "Generative Adversarial Nets," Neural Information Processing Systems Proceedings, vol. 27, 2014.
