Archive for the ‘Machine Learning’ Category

COVID-19 showed why the military must do more to accelerate machine learning for its toughest challenges – C4ISRNet

As recent events have shown, military decision-making is one of the highest-stakes challenges in the world: Diplomatic relations are at stake; billions of dollars of tax-funded budgets are in the balance; the safety and well-being of thousands of military and civilian personnel around the globe are on the line; and above all, the freedom and liberty of the United States and its more than 330 million citizens must be protected. But with such immense stakes comes an almost unfathomably large amount of related data that must be taken into account. Whether it is managing population health in an increasingly complex and connected world, or managing decisions on the network-centric battlefield, standalone humans are proving insufficient to harness the data, analyze it, and make timely and correct decisions.

With six branches and upward of 1.3 million active-duty personnel spread across all seven continents, how can every data point, from dictates from the commander-in-chief to handwritten notes on the deck of an aircraft carrier, be taken into account? In matters of national security, making fast, reliable decisions and avoiding technological surprises or being caught off guard by the nation's political rivals require massive real-time analysis and first- and second-order thinking that accounts for the complexities of human behavior.

Consider all of the stakes and moving parts facing the leadership of a large domestic military base during the recent COVID-19 pandemic. COVID-19 concerns extended beyond base personnel to the behavior of civilians in the surrounding counties, as people from throughout the region, military and civilian contractors alike, were coming and going daily. The information to consider starts with infection and hospitalization rates, but it also includes behavior monitoring (and influencing), as well as staying up to date with the steps local, regional and state officials were taking to track the virus and limit its spread. With so many moving parts, it is very difficult to stay current on everything and to make the right decision with any degree of certainty.

The answer to this guesswork and analysis paralysis lies in artificial intelligence and machine learning. If the military keeps spending human hours on analysis that machines could handle, the delay could lead to danger and even death for military personnel or civilians. At the heart of complex systems such as the U.S. military, there is a critical tipping point beyond which the systems are so complex that humans can no longer track them. AI solutions, by contrast, can deliver up-to-the-minute data modeling that weighs every factor at play, along with second- and third-order consequences, and present tangible, data-driven intelligence that goes far beyond the limits of linear human minds. Perhaps the biggest benefit is confidence at the podium, when leaders are asked to justify their decisions. Decision-makers can move beyond hunches and instead point to data built from sub-indexes, models from experts, and simulations specific to that day and to the circumstances at each facility.

When President Biden was recently called on the carpet to explain the rapid fall of Afghanistan in nine days, he should have had an AI that could at least explain the data, the models and the weights that fed the analysis, conclusions and decisions based on the belief that the 300,000-strong Afghan army would be able to hold off the 60,000 Taliban fighters long enough for an orderly withdrawal. Journalists would then be free to question the data sources, the models or the weightings, but not the president, who would be relying on these systems for his judgment. More importantly, such a system would almost certainly have included this rapid fall in its Monte Carlo distribution of potential outcomes and would have generated countermeasures and cautions.

Without a deeper commitment to AI, the military risks missing out on intelligence that transcends classified, siloed and otherwise restricted information without compromising security. One of the biggest challenges to high-stakes decision-making in the military is silos of classified information, making it difficult or impossible for every party to know every factor that is shaping the situation.

Using AI and machine learning solves this challenge safely. Rather than dumping disparate data from the various branches of the military and every clearance level into one gigantic data lake, it is possible to leave all the data safely and securely where it is and train a machine to know that it exists and to inform human decision-makers accordingly. AI can process all of the information in the corpus while also tracking which parties do and do not have clearance for each individual piece of data. For classified information, it can tell the relevant personnel that the information exists and direct them to the authority qualified to disclose it.
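
As a purely illustrative sketch of that idea, the snippet below models a clearance-aware catalog that indexes where data lives and who may disclose it, without copying the data itself; every name, field and clearance level here is hypothetical and not drawn from any real military system.

```python
from dataclasses import dataclass
from typing import Iterator

@dataclass
class RecordPointer:
    """Metadata about a document that stays in its home system."""
    record_id: str
    topic: str
    source_system: str         # where the data physically lives
    clearance_required: int    # minimum clearance level needed to view contents
    disclosing_authority: str  # who can authorize disclosure

# The catalog indexes what exists; it never copies the underlying data.
CATALOG = [
    RecordPointer("r-001", "base health status", "medical-db", 2, "base medical officer"),
    RecordPointer("r-002", "supply readiness", "logistics-db", 4, "J4 logistics directorate"),
]

def search(topic: str, user_clearance: int) -> Iterator[str]:
    """Tell the user what exists; reveal locations only to cleared users."""
    for rec in CATALOG:
        if topic.lower() not in rec.topic.lower():
            continue
        if user_clearance >= rec.clearance_required:
            yield f"{rec.record_id}: accessible via {rec.source_system}"
        else:
            # The user learns only that the record exists and whom to ask.
            yield f"{rec.record_id}: exists; request access from {rec.disclosing_authority}"

for line in search("supply", user_clearance=2):
    print(line)
```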

Capabilities like these can be readily applied to large, complex military undertakings with many processes, decisions and volumes of information. For instance, when a new aircraft carrier is being built, management receives much of its information in handwritten reports. Because of the heavy reliance on human judgment, it is difficult to tell whether the project is on time or on budget, and if any human assessment is even a fraction off, it can massively affect the whole project.

Recent challenges that hinge on the vagaries of human behavior, illustrated starkly by COVID-19 and the withdrawal from Afghanistan, call for the rapid analysis and creative input of machine learning systems. From digesting and quantifying countless data points, to absorbing and cataloging the knowledge of experts who will not always be around, to predictive modeling of circumstances with dozens of variables, this amplified intelligence is the key to better outcomes.

Richard Boyd is CEO at Tanjo, a machine learning company.

Go here to read the rest:
COVID-19 showed why the military must do more to accelerate machine learning for its toughest challenges - C4ISRNet

Avalo uses machine learning to accelerate the adaptation of crops to climate change – TechCrunch

Climate change is affecting farming all over the world, and solutions are seldom simple. But if you could plant crops that resisted the heat, cold or drought instead of moving a thousand miles away, wouldn't you? Avalo helps make plants like these a reality using AI-powered genome analysis that can reduce the time and money it takes to breed hardier plants for this hot century.

Founded by two friends who thought they'd take a shot at a startup before committing to a life of academia, Avalo has a very direct value proposition, but it takes a bit of science to understand it.

Big seed and agriculture companies put a lot of work into creating better versions of major crops. By making corn or rice ever so slightly more resistant to heat, insects, drought or flooding, they can make huge improvements to yields and profits for farmers, or alternatively make a plant viable to grow somewhere it couldn't before.

"There are big decreases to yields in equatorial areas, and it's not that corn kernels are getting smaller," said co-founder and CEO Brendan Collins. "Farmers move upland because salt water intrusion is disrupting fields, but they run into early spring frosts that kill their seedlings. Or they need rust-resistant wheat to survive fungal outbreaks in humid, wet summers. We need to create new varieties if we want to adapt to this new environmental reality."

To make those improvements in a systematic way, researchers emphasize existing traits in the plant; this isn't about splicing in a new gene but bringing out qualities that are already there. This used to be done by the simple method of growing several plants, comparing them, and planting the seeds of the one that best exemplifies the trait, like Mendel in Genetics 101.

Nowadays, however, we have sequenced the genomes of these plants and can be a little more direct. By finding out which genes are active in the plants with a desired trait, better expression of those genes can be targeted for future generations. The problem is that doing this still takes a long time, as in a decade.

The difficult part of the modern process stems (so to speak) from the issue that traits, like survival in the face of a drought, aren't just single genes. They may be any number of genes interacting in a complex way. Just as there's no single gene for becoming an Olympic gymnast, there isn't one for becoming drought-resistant rice. So when the companies do what are called genome-wide association studies, they end up with hundreds of candidates for genes that contribute to the trait, and then must laboriously test various combinations of these in living plants, which even at industrial rates and scales takes years to do.

Numbered, genetically differentiated rice plants being raised for testing purposes. Image Credits: Avalo

"The ability to just find genes and then do something with them is actually pretty limited as these traits become more complicated," said Mariano Alvarez, co-founder and CSO of Avalo. "Trying to increase the efficiency of an enzyme is easy; you just go in with CRISPR and edit it. But increasing yield in corn, there are thousands, maybe millions of genes contributing to that. If you're a big strategic [e.g., Monsanto] trying to make drought-tolerant rice, you're looking at 15 years, 200 million dollars. It's a long play."

This is where Avalo steps in. The company has built a model for simulating the effects of changes to a plant's genome, which it claims can reduce that 15-year lead time to two or three years, and the cost by a similar ratio.

"The idea was to create a much more realistic model for the genome that's more evolutionarily aware," said Collins. That is, a system that models the genome and the genes on it while including more context from biology and evolution. With a better model, you get far fewer false positives on genes associated with a trait, because it rules out far more as noise, unrelated genes, minor contributors and so on.

He gave the example of a cold-tolerant rice strain that one company was working on. A genome-wide association study found 566 genes of interest, and investigating each one costs somewhere in the neighborhood of $40,000 due to the time, staff and materials required. That means investigating this one trait might run up a $20 million tab over several years, which naturally limits both the parties who can even attempt such an operation and the crops they will invest the time and money in. If you expect a return on investment, you can't spend that kind of cash improving a niche crop for an outlier market.

"We're here to democratize that process," said Collins. In that same body of data relating to cold-tolerant rice, "We found 32 genes of interest, and based on our simulations and retrospective studies, we know that all of those are truly causal. And we were able to grow 10 knockouts to validate them, three in a three-month period."

In each graph, dots represent confidence levels in genes that must be tested. The Avalo model clears up the data and selects only the most promising ones. Image Credits: Avalo

To unpack the jargon a little there, from the start Avalo's system ruled out more than 90% of the genes that would have had to be individually investigated. The team had high confidence that these 32 genes were not just related but causal, having a real effect on the trait. And this was borne out with brief knockout studies, in which a particular gene is blocked and the effect of that is studied. Avalo calls its method gene discovery via informationless perturbations, or GDIP.

Part of it is the inherent facility of machine learning algorithms when it comes to pulling signal out of noise, but Collins noted that they needed to come at the problem with a fresh approach, letting the model learn the structures and relationships on its own. And it was also important to them that the model be explainable; that is, that its results don't just appear out of a black box but have some kind of justification.

This latter issue is a tough one, but they achieved it by systematically swapping out genes of interest in repeated simulations with what amount to dummy versions, which don't disrupt the trait but do help the model learn what each gene is contributing.
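
The article does not publish the details of GDIP, but the approach it describes, swapping a gene of interest for an uninformative stand-in and measuring how the model's predictions change, is closely related to permutation importance. A rough sketch of that general idea on toy data, not Avalo's actual pipeline:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)

# Toy stand-in data: rows are plants, columns are gene expression levels,
# y is the measured trait (e.g., cold tolerance). Real data would be far larger.
X = rng.normal(size=(200, 50))
y = 2.0 * X[:, 3] - 1.5 * X[:, 17] + rng.normal(scale=0.5, size=200)

model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
baseline = r2_score(y, model.predict(X))

# "Informationless perturbation": replace one gene's values with shuffled,
# uninformative values and see how much predictive power is lost.
scores = []
for gene in range(X.shape[1]):
    X_perturbed = X.copy()
    X_perturbed[:, gene] = rng.permutation(X_perturbed[:, gene])
    drop = baseline - r2_score(y, model.predict(X_perturbed))
    scores.append((gene, drop))

# Genes whose removal costs the most predictive power are the strongest candidates.
top = sorted(scores, key=lambda t: t[1], reverse=True)[:5]
print(top)
```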

Avalo co-founders Mariano Alvarez (left) and Brendan Collins by a greenhouse. Image Credits: Avalo

"Using our tech, we can come up with a minimal predictive breeding set for traits of interest. You can design the perfect genotype in silico [i.e., in simulation] and then do intensive breeding and watch for that genotype," said Collins. And the cost is low enough that it can be done by smaller outfits, with less popular crops, or for traits that are currently outside possibilities; since climate change is so unpredictable, who can say whether heat- or cold-tolerant wheat would be better 20 years from now?

"By reducing the capital cost of undertaking this exercise, we sort of unlock this space where it's economically viable to work on a climate-tolerant trait," said Alvarez.

Avalo is partnering with several universities to accelerate the creation of other resilient and sustainable plants that might never have seen the light of day otherwise. These research groups have tons of data but not a lot of resources, making them excellent candidates to demonstrate the company's capabilities.

The university partnerships will also establish that the system works for fairly undomesticated plants that need some work before they can be used at scale. For instance, it might be better to supersize a wild grain that's naturally resistant to drought instead of trying to add drought resistance to a naturally large grain species, but no one was willing to spend $20 million to find out.

On the commercial side, they plan to offer the data handling service first, one of many startups offering big cost and time savings to slower, more established companies in spaces like agriculture and pharmaceuticals. With luck Avalo will be able to help bring a few of these plants into agriculture and become a seed provider as well.

The company just emerged from the IndieBio accelerator a few weeks ago and has already secured $3 million in seed funding to continue their work at greater scale. The round was co-led by Better Ventures and Giant Ventures, with At One Ventures, Climate Capital, David Rowan and of course IndieBio parent SOSV participating.

"Brendan convinced me that starting a startup would be way more fun and interesting than applying for faculty jobs," said Alvarez. "And he was totally right."

Read more:
Avalo uses machine learning to accelerate the adaptation of crops to climate change - TechCrunch

Machine learning links material composition and performance in catalysts – University of Michigan News

From left to right, diagrams show an oxygen atom bonding with a metal, a metal oxide, and a perovskite. The new model could help chemical engineers design these three types of catalysts to improve the sustainability of fuel and fertilizer production as well as the manufacturing of household chemicals. Credit: Jacques Esterhuizen, Linic Lab, University of Michigan.

In a finding that could help pave the way toward cleaner fuels and a more sustainable chemical industry, researchers at the University of Michigan have used machine learning to predict how the compositions of metal alloys and metal oxides affect their electronic structures.

The electronic structure is key to understanding how the material will perform as a mediator, or catalyst, of chemical reactions.

"We're learning to identify the fingerprints of materials and connect them with the materials' performance," said Bryan Goldsmith, the Dow Corning Assistant Professor of Chemical Engineering.

A better ability to predict which metal and metal oxide compositions are best for guiding which reactions could improve large-scale chemical processes such as hydrogen production, production of other fuels and fertilizers, and manufacturing of household chemicals such as dish soap.

"The objective of our research is to develop predictive models that will connect the geometry of a catalyst to its performance. Such models are central for the design of new catalysts for critical chemical transformations," said Suljo Linic, the Martin Lewis Perl Collegiate Professor of Chemical Engineering.

One of the main approaches to predicting how a material will behave as a potential mediator of a chemical reaction is to analyze its electronic structure, specifically the density of states. This describes how many quantum states are available to the electrons in the reacting molecules and the energies of those states.

Usually, the electronic density of states is described with summary statistics: an average energy, or a skew that reveals whether more electronic states are above or below the average, and so on.

"That's OK, but those are just simple statistics. You might miss something. With principal component analysis, you just take in everything and find what's important. You're not just throwing away information," Goldsmith said.

Principal component analysis is a classic machine learning method, taught in introductory data science courses. The researchers used the electronic density of states as input for the model, as the density of states is a good predictor of how a catalyst's surface will adsorb, or bond with, the atoms and molecules that serve as reactants. The model links the density of states with the composition of the material.

Unlike conventional machine learning, which is essentially a black box that inputs data and offers predictions in return, the team made an algorithm that they could understand.

"We can see systematically what is changing in the density of states and correlate that with geometric properties of the material," said Jacques Esterhuizen, a doctoral student in chemical engineering and first author on the paper in Chem Catalysis.

This information helps chemical engineers design metal alloys to get the density of states that they want for mediating a chemical reaction. The model accurately reflected correlations already observed between a material's composition and its density of states, as well as turning up new potential trends to be explored.

The model simplifies the density of states into two pieces, or principal components. One piece essentially covers how the atoms of the metal fit together. In a layered metal alloy, this includes whether the subsurface metal is pulling the surface atoms apart or squeezing them together, and the number of electrons that the subsurface metal contributes to bonding. The other piece is just the number of electrons that the surface metal atoms can contribute to bonding. From these two principal components, they can reconstruct the density of states in the material.
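
The study's exact pipeline is not reproduced here, but the core step it describes, compressing each density-of-states curve into two principal components and reconstructing the curve from them, can be sketched with off-the-shelf PCA; the Gaussian spectra below are synthetic placeholders for real computed densities of states.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)

# Placeholder data: each row is a density-of-states curve sampled on an energy
# grid; real inputs would come from electronic-structure calculations.
energies = np.linspace(-10, 5, 300)
n_materials = 120
centers = rng.uniform(-6, 0, size=n_materials)
widths = rng.uniform(0.5, 2.0, size=n_materials)
dos = np.exp(-((energies - centers[:, None]) ** 2) / (2 * widths[:, None] ** 2))

# Two components, echoing the simplified two-piece description in the article.
pca = PCA(n_components=2)
components = pca.fit_transform(dos)          # two numbers per material
dos_reconstructed = pca.inverse_transform(components)

print("explained variance:", pca.explained_variance_ratio_.sum())
print("reconstruction error:", np.mean((dos - dos_reconstructed) ** 2))
```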

This concept also works for the reactivity of metal oxides. In this case, the concern is the ability of oxygen to interact with atoms and molecules, which is related to how stable the surface oxygen is. Stable surface oxygens are less likely to react, whereas unstable surface oxygens are more reactive. The model accurately captured the oxygen stability in metal oxides and perovskites, a class of metal oxides.

The study was supported by the Department of Energy and the University of Michigan.


Go here to read the rest:
Machine learning links material composition and performance in catalysts - University of Michigan News

Frontier Development Lab Transforms Space and Earth Science for NASA with Google Cloud Artificial Intelligence and Machine Learning Technology – SETI…

August 26, 2021, Mountain View, Calif. – Frontier Development Lab (FDL), in partnership with the SETI Institute, NASA and private sector partners including Google Cloud, is transforming space and Earth science through the application of industry-leading artificial intelligence (AI) and machine learning (ML) tools.

FDL tackles knowledge gaps in space science by pairing ML experts with researchers in physics, astronomy, astrobiology, planetary science, space medicine and Earth science. These researchers have utilized Google Cloud compute resources and expertise since 2018, specifically AI / ML technology, to address research challenges in areas like astronaut health, lunar exploration, exoplanets, heliophysics, climate change and disaster response.

With access to compute resources provided by Google Cloud, FDL has been able to accelerate the typical ML pipeline by more than 700 times over the last five years, facilitating new discoveries and improved understanding of our planet, solar system and the universe. Throughout this period, Google Cloud's Office of the CTO (OCTO) has provided ongoing strategic guidance to FDL researchers on how to optimize AI / ML and how to use compute resources most efficiently.

With Google Cloud's investment, recent FDL achievements include:

"Unfettered on-demand access to massive super-compute resources has transformed the FDL program, enabling researchers to address highly complex challenges across a wide range of science domains, advancing new knowledge, new discoveries and improved understandings in previously unimaginable timeframes, said Bill Diamond, president and CEO, SETI Institute.This program, and the extraordinary results it achieves, would not be possible without the resources generously provided by Google Cloud.

"When I first met Bill Diamond and James Parr in 2017, they asked me a simple question: What could happen if we marry the best of Silicon Valley and the minds of NASA?" said Scott Penberthy, director of Applied AI at Google Cloud. "That was an irresistible challenge. We at Google Cloud simply shared some of our AI tricks and tools, one engineer to another, and they ran with it. I'm delighted to see what we've been able to accomplish together, and I am inspired by what we can achieve in the future. The possibilities are endless."

FDL leverages AI technologies to push the frontiers of science research and develop new tools to help solve some of humanity's biggest challenges. FDL teams are composed of doctoral and post-doctoral researchers who use AI / ML to tackle groundbreaking challenges. Cloud-based supercomputing resources mean that FDL teams achieve results in eight-week research sprints that would not be possible in even year-long programs with conventional compute capabilities.

"High-performance computing is normally constrained due to the large amount of time, limited availability and cost of running AI experiments," said James Parr, director of FDL. "You're always in a queue. Having a common platform to integrate unstructured data and train neural networks in the cloud allows our FDL researchers from different backgrounds to work together on hugely complex problems with enormous data requirements, no matter where they are located."

Better integrating science and ML is the founding rationale and future north star of FDL's partnership with Google Cloud. ML is particularly powerful for space science when paired with a physical understanding of a problem space. The gap between what we know so far and what we collect as data is an exciting frontier for discovery, and something AI / ML and cloud technology are poised to transform.

You can learn more about FDL's 2021 program here.

The FDL 2021 showcase presentations can be watched as follows:

In addition to Google Cloud, FDL is supported by partners including Lockheed Martin, Intel, Luxembourg Space Agency, MIT Portugal, Lawrence Berkeley National Lab, USGS, Microsoft, NVIDIA, Mayo Clinic, Planet and IBM.

About the SETI Institute: Founded in 1984, the SETI Institute is a non-profit, multidisciplinary research and education organization whose mission is to lead humanity's quest to understand the origins and prevalence of life and intelligence in the universe and share that knowledge with the world. Our research encompasses the physical and biological sciences and leverages expertise in data analytics, machine learning and advanced signal detection technologies. The SETI Institute is a distinguished research partner for industry, academia and government agencies, including NASA and NSF.

Contact Information: Rebecca McDonald, Director of Communications, SETI Institute, rmcdonald@SETI.org

DOWNLOAD FULL PRESS RELEASE HERE.

The rest is here:
Frontier Development Lab Transforms Space and Earth Science for NASA with Google Cloud Artificial Intelligence and Machine Learning Technology - SETI...

The dos and don'ts of machine learning research – read it, nerds – The Next Web


Machine learning is becoming an important tool in many industries and fields of science. But ML research and product development present several challenges that, if not addressed, can steer your project in the wrong direction.

In a paper recently published on the arXiv preprint server, Michael Lones, Associate Professor in the School of Mathematical and Computer Sciences, Heriot-Watt University, Edinburgh, provides a list of dos and don'ts for machine learning research.

The paper, which Lones describes as "lessons that were learnt whilst doing ML research in academia, and whilst supervising students doing ML research," covers the challenges of different stages of the machine learning research lifecycle. Although aimed at academic researchers, the paper's guidelines are also useful for developers who are creating machine learning models for real-world applications.

Here are my takeaways from the paper, though I recommend that anyone involved in machine learning research and development read it in full.

Machine learning models live and thrive on data. Accordingly, across the paper, Lones reiterates the importance of paying extra attention to data across all stages of the machine learning lifecycle. You must be careful of how you gather and prepare your data and how you use it to train and test your machine learning models.

No amount of computation power and advanced technology can help you if your data doesn't come from a reliable source and hasn't been gathered in a reliable manner. And you should also use your own due diligence to check the provenance and quality of your data. "Do not assume that, because a data set has been used by a number of papers, it is of good quality," Lones writes.

Your dataset might have various problems that can lead to your model learning the wrong thing.

For example, if you're working on a classification problem and your dataset contains too many examples of one class and too few of another, then the trained machine learning model might end up learning to predict every input as belonging to the majority class. In this case, your dataset suffers from class imbalance.
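
A quick way to catch this before training is simply to count labels; a minimal sketch, with a made-up label column standing in for your own data:

```python
import pandas as pd

# Hypothetical labels from a training set; in practice these would come from
# your own data, e.g. df = pd.read_csv("training_data.csv")["label"].
labels = pd.Series([0] * 950 + [1] * 50, name="label")

counts = labels.value_counts(normalize=True)
print(counts)

# If one class dominates, consider resampling or class weights before training.
if counts.max() > 0.9:
    print("Warning: severe class imbalance; accuracy alone will be misleading.")
```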

While class imbalance can be spotted quickly with data exploration practices, finding other problems needs extra care and experience. For example, if all the pictures in your dataset were taken in daylight, then your machine learning model will perform poorly on dark photos. A more subtle example is the equipment used to capture the data. For instance, if you've taken all your training photos with the same camera, your model might end up learning to detect the unique visual footprint of your camera and will perform poorly on images taken with other equipment. Machine learning datasets can have all kinds of such biases.

The quantity of data is also an important issue. Make sure your data is available in enough abundance. "If the signal is strong, then you can get away with less data; if it's weak, then you need more data," Lones writes.

In some fields, the lack of data can be compensated for with techniques such as cross-validation and data augmentation. But in general, you should know that the more complex your machine learning model, the more training data you'll need. For example, a few hundred training examples might be enough to train a simple regression model with a few parameters. But if you want to develop a deep neural network with millions of parameters, you'll need much more training data.
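
As an illustrative sketch of the cross-validation technique mentioned above, using a bundled scikit-learn dataset rather than any particular field's data:

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

X, y = load_diabetes(return_X_y=True)

# 5-fold cross-validation: every sample is used for both training and
# evaluation across folds, which helps when data is scarce.
scores = cross_val_score(Ridge(alpha=1.0), X, y, cv=5, scoring="r2")
print(scores.mean(), scores.std())
```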

Another important point Lones makes in the paper is the need to have a strong separation between training and test data. Machine learning engineers usually put aside part of their data to test the trained model. But sometimes, the test data leaks into the training process, which can lead to machine learning models that don't generalize to data gathered from the real world.

"Don't allow test data to leak into the training process," he warns. "The best thing you can do to prevent these issues is to partition off a subset of your data right at the start of your project, and only use this independent test set once to measure the generality of a single model at the end of the project."

In more complicated scenarios, you'll need a validation set, a second held-out set that puts the machine learning model through a final evaluation process. For example, if you're doing cross-validation or ensemble learning, the original test set might not provide a precise evaluation of your models. In this case, a validation set can be useful.

"If you have enough data, it's better to keep some aside and only use it once to provide an unbiased estimate of the final selected model instance," Lones writes.
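
One common way to follow that advice is to carve off the test set once, at the very start, and keep a separate validation set for tuning; a minimal sketch on synthetic placeholder data:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Placeholder data standing in for your real dataset.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# First split: lock away a test set right at the start and use it only once,
# for the final, unbiased estimate of the chosen model.
X_trainval, X_test, y_trainval, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Second split: a validation set for model selection and tuning along the way.
X_train, X_val, y_train, y_val = train_test_split(
    X_trainval, y_trainval, test_size=0.25, random_state=42)
```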

Today, deep learning is all the rage. But not every problem needs deep learning. In fact, not every problem even needs machine learning. Sometimes, simple pattern-matching and rules will perform on par with the most complex machine learning models at a fraction of the data and computation costs.

But when it comes to problems that are specific to machine learning models, you should always have a roster of candidate algorithms to evaluate. "Generally speaking, there's no such thing as a single best ML model," Lones writes. "In fact, there's a proof of this, in the form of the No Free Lunch theorem, which shows that no ML approach is any better than any other when considered over every possible problem."

The first thing you should check is whether your model matches your problem type. For example, based on whether your intended output is categorical or continuous, you'll need to choose the right machine learning algorithm along with the right structure. Data types (e.g., tabular data, images, unstructured text, etc.) can also be a defining factor in the class of model you use.

One important point Lones makes in his paper is the need to avoid excessive complexity. For example, if your problem can be solved with a simple decision tree or regression model, there's no point in using deep learning.

Lones also warns against trying to reinvent the wheel. With machine learning being one of the hottest areas of research, there's always a solid chance that someone else has solved a problem that is similar to yours. In such cases, the wise thing to do would be to examine their work. This can save you a lot of time because other researchers have already faced and solved challenges that you will likely meet down the road.

"To ignore previous studies is to potentially miss out on valuable information," Lones writes.

Examining papers and work by other researchers might also provide you with machine learning models that you can use and repurpose for your own problem. In fact, machine learning researchers often use each others models to save time and computational resources and start with a baseline trusted by the ML community.

"It's important to avoid 'not invented here syndrome,' i.e., only using models that have been invented at your own institution, since this may cause you to omit the best model for a particular problem," Lones warns.

Having a solid idea of what your machine learning model will be used for can greatly impact its development. If you're doing machine learning purely for academic purposes and to push the boundaries of science, then there might be no limits to the type of data or machine learning algorithms you can use. But not all academic work will remain confined to research labs.

"[For] many academic studies, the eventual goal is to produce an ML model that can be deployed in a real world situation. If this is the case, then it's worth thinking early on about how it is going to be deployed," Lones writes.

For example, if your model will be used in an application that runs on user devices and not on large server clusters, then you can't use large neural networks that require large amounts of memory and storage space. You must design machine learning models that can work in resource-constrained environments.

Another problem you might face is the need for explainability. In some domains, such as finance and healthcare, application developers are legally required to provide explanations of algorithmic decisions in case a user demands it. In such cases, using a black-box model might be impossible. For example, even though a deep neural network might give you a performance advantage, its lack of interpretability might make it useless. Instead, a more transparent model such as a decision tree might be a better choice even if it results in a performance hit. Alternatively, if deep learning is an absolute requirement for your application, then you'll need to investigate techniques that can provide reliable interpretations of activations in the neural network.
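
As an illustrative sketch of that trade-off (not a recommendation for any specific domain), a shallow decision tree can be printed and inspected rule by rule, which is often enough to support an explanation requirement:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(data.data, data.target, random_state=0)

# A shallow tree trades some accuracy for a model whose decisions can be
# read and justified line by line.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)
print("test accuracy:", tree.score(X_test, y_test))
print(export_text(tree, feature_names=list(data.feature_names)))
```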

As a machine learning engineer, you might not have precise knowledge of the requirements of your model. Therefore, it is important to talk to domain experts because they can help to steer you in the right direction and determine whether you're solving a relevant problem or not.

"Failing to consider the opinion of domain experts can lead to projects which don't solve useful problems, or which solve useful problems in inappropriate ways," Lones writes.

For example, if you create a neural network that flags fraudulent banking transactions with very high accuracy but provides no explanation of its decision, then financial institutions wont be able to use it.

There are various ways to measure the performance of machine learning models, but not all of them are relevant to the problem you're solving.

For example, many ML engineers use the accuracy test to rate their models. The accuracy test measures the percent of correct predictions the model makes. This number can be misleading in some cases.

For example, consider a dataset of x-ray scans used to train a machine learning model for cancer detection. Your data is imbalanced, with 90 percent of the training examples flagged as benign and only a small number classified as malignant. If your trained model scores 90 percent on the accuracy test, it might have just learned to label everything as benign. If used in a real-world application, this model could lead to missed cases with disastrous outcomes. In such a case, the ML team must use tests that are insensitive to class imbalance or use a confusion matrix to check other metrics. More recent techniques can provide a detailed measure of a model's performance in various areas.

Based on the application, ML developers might also want to measure several metrics. To return to the cancer detection example, in such a model it might be important to reduce false negatives as much as possible, even if doing so comes at the cost of lower accuracy or a slight increase in false positives. It is better to send a few healthy people to the hospital for further diagnosis than to miss critical cancer patients.
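
A brief sketch of looking past raw accuracy, using synthetic labels that mimic the imbalanced cancer-detection example above:

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix, recall_score

# Synthetic, imbalanced example: 90% benign (0), 10% malignant (1),
# and a lazy model that predicts "benign" for everything.
y_true = np.array([0] * 90 + [1] * 10)
y_pred = np.zeros_like(y_true)

print("accuracy:", accuracy_score(y_true, y_pred))            # 0.90, looks fine
print("recall (malignant):", recall_score(y_true, y_pred))    # 0.0, misses every case
print(confusion_matrix(y_true, y_pred))
```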

In his paper, Lones warns that when comparing several machine learning models for a problem, bigger numbers do not necessarily mean better models. For example, performance differences might be due to your model being trained and tested on different partitions of your dataset or on entirely different datasets.

"To really be sure of a fair comparison between two approaches, you should freshly implement all the models you're comparing, optimise each one to the same degree, carry out multiple evaluations and then use statistical tests to determine whether the differences in performance are significant," Lones writes.
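
One hedged way to put that advice into practice is to evaluate both candidates on identical cross-validation folds and apply a paired significance test; the dataset and models below are placeholders:

```python
from scipy.stats import wilcoxon
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
cv = KFold(n_splits=10, shuffle=True, random_state=0)

# Evaluate both candidates on identical folds so the comparison is paired.
scores_a = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=cv)
scores_b = cross_val_score(RandomForestClassifier(random_state=0), X, y, cv=cv)

# A paired, non-parametric test of whether the per-fold differences are real.
stat, p_value = wilcoxon(scores_a, scores_b)
print(scores_a.mean(), scores_b.mean(), p_value)
```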

Lones also warns not to overestimate the capabilities of your models in your reports. "A common mistake is to make general statements that are not supported by the data used to train and evaluate models," he writes.

Therefore, any report of your model's performance must also include the kind of data it was trained and tested on. Validating your model on multiple datasets can provide a more realistic picture of its capabilities, but you should still be wary of the kind of data errors we discussed earlier.

Transparency can also contribute greatly to other ML research. If you fully describe the architecture of your models as well as the training and validation process, other researchers that read your findings can use them in future work or even help point out potential flaws in your methodology.

Finally, aim for reproducibility. If you publish your source code and model implementations, you can provide the machine learning community with great tools for future work.

Interestingly, almost everything Lones wrote in his paper is also applicable to applied machine learning, the branch of ML that is concerned with integrating models into real products. However, I would like to add a few points that go beyond academic research and are important in real-world applications.

When it comes to data, machine learning engineers must weigh an extra set of concerns before integrating models into products. These include data privacy and security, user consent, and regulatory constraints. Many a company has gotten into trouble for mining user data without consent.

Another important matter that ML engineers often forget in applied settings is model decay. Unlike academic research, machine learning models used in real-world applications must be retrained and updated regularly. As everyday data changes, machine learning models decay and their performance deteriorates. For example, as life habits changed in the wake of the COVID-19 lockdowns, ML systems that had been trained on old data started to fail and needed retraining. Likewise, language models need to be constantly updated as new trends appear and our speaking and writing habits change. These changes require the ML product team to devise a strategy for continued collection of fresh data and periodic retraining of their models.
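
A minimal, hypothetical sketch of that kind of monitoring: periodically compare the distribution of an incoming feature with its training-time distribution, and flag drift that should trigger fresh data collection and retraining. The threshold and feature here are illustrative only.

```python
import numpy as np
from scipy.stats import ks_2samp

def needs_retraining(train_feature: np.ndarray,
                     live_feature: np.ndarray,
                     p_threshold: float = 0.01) -> bool:
    """Flag drift when the live distribution differs significantly from training."""
    _, p_value = ks_2samp(train_feature, live_feature)
    return p_value < p_threshold

# Toy example: the live data has shifted relative to what the model saw in training.
rng = np.random.default_rng(0)
train = rng.normal(0, 1, size=5000)
live = rng.normal(0.5, 1, size=500)
print(needs_retraining(train, live))  # True: schedule retraining and new data collection
```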

Finally, integration challenges will be an important part of every applied machine learning project. How will your machine learning system interact with other applications currently running in your organization? Is your data infrastructure ready to be plugged into the machine learning pipeline? Does your cloud or server infrastructure support the deployment and scaling of your model? These kinds of questions can make or break the deployment of an ML product.

For example, AI research lab OpenAI recently launched a test version of its Codex API model for public appraisal. But the launch failed because the servers couldn't scale to user demand.

Hopefully, this brief post will help you better assess your machine learning project and avoid mistakes. Read Lones's full paper, titled "How to avoid machine learning pitfalls: a guide for academic researchers," for more details about common mistakes in the ML research and development process.

This article was originally published by Ben Dickson on TechTalks, a publication that examines trends in technology, how they affect the way we live and do business, and the problems they solve. But we also discuss the evil side of technology, the darker implications of new tech, and what we need to look out for. You can read the original article here.

See the original post here:
The dos and donts of machine learning research read it, nerds - The Next Web