Archive for the ‘Machine Learning’ Category

Using Deep Learning to Find Genetic Causes of Mental Health Disorders in an Understudied Population – Neuroscience News

Summary: A new deep learning algorithm that looks for the burden of genomic variants is 70% accurate at identifying specific mental health disorders within the African-American community.

Source: CHOP

Minority populations have been historically under-represented in existing studies addressing how genetic variations may contribute to a variety of disorders. A new study from researchers at Children's Hospital of Philadelphia (CHOP) shows that a deep learning model has promising accuracy when helping to diagnose a variety of common mental health disorders in African American patients.

This tool could help distinguish between disorders as well as identify multiple disorders, fostering early intervention with better precision and allowing patients to receive a more personalized approach to their condition.

The study was recently published in the journal Molecular Psychiatry.

Properly diagnosing mental disorders can be challenging, especially for young toddlers who are unable to complete questionnaires or rating scales. This challenge has been particularly acute in understudied minority populations. Past genomic research has found several genomic signals for a variety of mental disorders, with some serving as potential therapeutic drug targets.

Deep learning algorithms have also been used to successfully diagnose complex diseases like attention deficit hyperactivity disorder (ADHD). However, these tools have rarely been applied in large populations of African American patients.

In a unique study, the researchers generated whole genome sequencing data from blood samples of 4,179 African American patients, including 1,384 patients who had been diagnosed with at least one mental disorder. This study focused on eight common mental disorders: ADHD, depression, anxiety, autism spectrum disorder, intellectual disabilities, speech/language disorder, developmental delays, and oppositional defiant disorder (ODD).

The long-term goal of this work is to learn more about specific risks for developing certain diseases in African American populations and how to potentially improve health outcomes by focusing on more personalized approaches to treatment.

"Most studies focus only on one disease, and minority populations have been very under-represented in existing studies that utilize machine learning to study mental disorders," said senior author Hakon Hakonarson, MD, Ph.D., Director of the Center for Applied Genomics at CHOP.

"We wanted to test this deep learning model in an African American population to see whether it could accurately differentiate mental disorder patients from healthy controls, and whether we could correctly label the types of disorders, especially in patients with multiple disorders."

The deep learning algorithm looked for the burden of genomic variants in coding and non-coding regions of the genome. The model demonstrated over 70% accuracy in distinguishing patients with mental disorders from the control group. The deep learning algorithm was equally effective in diagnosing patients with multiple disorders, with the model providing exact diagnostic matches in approximately 10% of cases.

The model also successfully identified multiple genomic regions that were highly enriched for mental disorders, meaning they were more likely to be involved in the development of these medical disorders. The biological pathways involved included ones associated with immune responses, antigen and nucleic acid binding, a chemokine signaling pathway, and guanine nucleotide-binding protein receptors.

However, the researchers also found that variants in regions that did not code for proteins seemed to be implicated in these disorders at higher frequency, which means they may serve as alternative markers.

"By identifying genetic variants and associated pathways, future research aimed at characterizing their function may provide mechanistic insight as to how these disorders develop," Hakonarson said.

Author: Press Office
Source: CHOP
Contact: Press Office, CHOP
Image: The image is in the public domain

Original Research: Open access. "Application of deep learning algorithm on whole genome sequencing data uncovers structural variants associated with multiple mental disorders in African American patients" by Yichuan Liu et al. Molecular Psychiatry

Abstract

Application of deep learning algorithm on whole genome sequencing data uncovers structural variants associated with multiple mental disorders in African American patients

Mental disorders present a global health concern, while the diagnosis of mental disorders can be challenging. The diagnosis is even harder for patients who have more than one type of mental disorder, especially for young toddlers who are not able to complete questionnaires or standardized rating scales for diagnosis. In the past decade, multiple genomic association signals have been reported for mental disorders, some of which present attractive drug targets.

Concurrently, machine learning algorithms, especially deep learning algorithms, have been successful in the diagnosis and/or labeling of complex diseases, such as attention deficit hyperactivity disorder (ADHD) or cancer. In this study, we focused on eight common mental disorders, including ADHD, depression, anxiety, autism, intellectual disabilities, speech/language disorder, delays in developments, and oppositional defiant disorder in the ethnic minority of African Americans.

Blood-derived whole genome sequencing data from 4179 individuals were generated, including 1384 patients with the diagnosis of at least one mental disorder. The burden of genomic variants in coding/non-coding regions was applied as feature vectors in the deep learning algorithm. Our model showed ~65% accuracy in differentiating patients from controls. Ability to label patients with multiple disorders was similarly successful, with a Hamming loss score less than 0.3, while exact diagnostic matches are around 10%. Genes in genomic regions with the highest weights showed enrichment of biological pathways involved in immune responses, antigen/nucleic acid binding, chemokine signaling pathway, and G-protein receptor activities.
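The two multi-label figures quoted in the abstract, Hamming loss and the rate of exact diagnostic matches, can be made concrete with a small sketch. Everything below is illustrative: the label matrices are made-up toy data (not study results), the helper names hamming_loss and exact_match_ratio are hypothetical, and the code is not the authors' evaluation pipeline. Each row stands for one patient and each column for one of the eight disorders.

```python
from typing import List

Labels = List[List[int]]  # rows = patients, columns = disorders (1 = diagnosed)


def hamming_loss(y_true: Labels, y_pred: Labels) -> float:
    """Fraction of individual patient/disorder labels predicted incorrectly."""
    wrong = 0
    total = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            total += 1
            wrong += int(t != p)
    return wrong / total


def exact_match_ratio(y_true: Labels, y_pred: Labels) -> float:
    """Fraction of patients whose full set of labels is predicted exactly."""
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)


# Toy data only (assumption): 4 patients, 8 disorder labels each.
y_true = [
    [1, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 1],
]
y_pred = [
    [1, 0, 0, 0, 0, 0, 0, 0],  # every label correct
    [1, 0, 0, 0, 0, 0, 0, 0],  # one disorder missed
    [0, 0, 0, 1, 1, 0, 0, 0],  # one extra disorder predicted
    [0, 0, 0, 0, 0, 0, 0, 0],  # both disorders missed
]

print(hamming_loss(y_true, y_pred))       # 4 wrong labels out of 32 -> 0.125
print(exact_match_ratio(y_true, y_pred))  # 1 of 4 patients fully correct -> 0.25
```

The toy numbers show why a low Hamming loss and a low exact-match rate can coexist: an exact match requires all eight labels for a patient to be right at once, while Hamming loss gives credit for every individual label that is correct.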

A noticeable fact is that variants in non-coding regions (e.g., ncRNA, intronic, and intergenic) performed equally well as variants in coding regions; however, unlike coding region variants, variants in non-coding regions do not express genomic hotspots whereas they carry much more narrow standard deviations, indicating they probably serve as alternative markers.


Founded by Ex-Uber Data Architect and Apache Hudi Creator, Onehouse Supercharges Data Lakes for AI and Machine Learning With $8 Million in Seed…

Onehouse Combines the Ease-of-Use of a Data Warehouse With the Scale of a Data Lake Into a Fully-Managed Service on Top of the Popular Apache Hudi Open Source Project

MENLO PARK, Calif., Feb. 02, 2022 (GLOBE NEWSWIRE) -- Today Onehouse, the first managed lakehouse company, emerged from stealth with its cloud-native managed service based on Apache Hudi that makes data lakes easier, faster and cheaper.

Data has become the driving force of innovation across nearly every industry in the world. Yet organizations still struggle to build and maintain data architectures that can scale economically with the fast-paced growth of their data. As data volumes and AI and machine learning (ML) workloads grow, costs rise exponentially and organizations start to outgrow their data warehouses. To scale any further, they turn to a data lake, where they face a whole new set of complex challenges like constantly tuning data layouts, large-scale concurrency controls, fast data ingestion, data deletions and more.

Onehouse founder Vinoth Chandar faced these very challenges as he was building one of the largest data lakes in the world at Uber. A rapidly growing Uber needed the performance of a warehouse and the scale of a data lake, in near real-time to power AI/ML driven features like predicting ETAs, recommending eats and ensuring ride safety. He created Apache Hudi to implement a new path-breaking architecture where the core warehouse and database functionality was directly added to the data lake, today known as the lakehouse. Apache Hudi brings a state-of-the-art data lakehouse to life with advanced indexes, streaming ingestion services and data clustering/optimization techniques.

Apache Hudi is now widely adopted across the industry, used by organizations from startups to large enterprises, including Amazon, Walmart, Disney+ Hotstar, GE Aviation, Robinhood and TikTok, to build exabyte-scale data lakes in near real-time at vastly improved price/performance. The broad adoption of Hudi has battle-tested and proven the foundational benefits of this open source project. Thousands of organizations from across the world have contributed to Hudi and the project has grown 7x in less than two years to nearly one million monthly downloads. At Uber, Hudi continues to ingest more than 500 billion records every day.


In the "Cost-Efficient Open Source Big Data Platform at Uber" blog post (https://eng.uber.com/cost-efficient-big-data-platform/), Zheng Shao and Mohammad Islam from Uber shared: "We started the Hudi project in 2016, and submitted it to Apache Incubator Project in 2019. Apache Hudi is now a Top-Level Project, with the majority of our Big Data on HDFS in Hudi format. This has dramatically reduced the computing capacity needs at Uber."

Even with transformative technology like Apache Hudi, building a high quality data lake requires months of investment and scarce talent, without which there is a high risk that data is not fresh enough or that the lake is unreliable or performs poorly.

Onehouse founder and CEO Vinoth Chandar said: "While a warehouse can just be used, a lakehouse still needs to be built. Having worked with many organizations on that journey for four years in the Apache Hudi community, we believe Onehouse will enable easy adoption of data lakes and future-proof the data architecture for machine learning/data science down the line."

Onehouse streamlines the adoption of the lakehouse architecture by offering a fully-managed cloud-native service that quickly ingests, self-manages and auto-optimizes data. Instead of creating yet another vertically integrated data and query stack, it provides one interoperable and truly open data layer that accelerates workloads across all popular data lake query engines like Apache Spark, Trino and Presto, and even cloud warehouses, which can read the data as external tables.

Leveraging unique capabilities of Apache Hudi, Onehouse opens the door for incremental data processing that is typically orders of magnitude faster than old-school batch processing. By combining a breakthrough technology and a fully-managed easy-to-use service, organizations can build data lakes in minutes, not months, realize large cost savings and still own their data in open formats, not locked into any individual vendors.

Industry Analysts on Onehouse

"The complexity of building a data lake today is prohibitive for many organizations who want to quickly unlock analytics and AI from their data," said Paul Nashawaty, Senior Analyst at Enterprise Strategy Group. "The team at Onehouse is building a fully-managed lakehouse infrastructure that automates away tedious data engineering chores and complex performance tuning. Built on an industry proven open source project, Apache Hudi, Onehouse ensures your data foundation is open and future proof."

"Data is the new oil and the driving force behind data economy and innovation. But it is very hard, and expensive to build real-time data lakes that can serve AI/ML model creation and model serving in real-time," said Andy Thurai, Vice President and Principal Analyst at Constellation Research. "A good data lakehouse solution should consider using a hybrid model as well as look into using a combination of commercial and open-source options (such as Apache Hudi) to strike a balance between cost vs ease of use."

"To unlock the power of machine learning, enterprises should invest in an open standards data lake that makes all enterprise data available for relevant models," said Hyoun Park, Chief Analyst at Amalgam Insights. "Onehouse tackles this challenge head-on by providing a fully-managed lakehouse that will greatly accelerate the ability to translate massive and varied data sources into AI-guided insight."

$8 Million in Seed Funding

Onehouse raised $8 million in seed funding co-led by Greylock and Addition. Onehouse plans to use the money for its managed lakehouse product and to further the research and development on Apache Hudi.

Greylock Partner Jerry Chen said: "The data lake house is the future of data lakes, providing customers the ease of use of a data warehouse with the cost and scale advantages of a data lake. Apache Hudi is already the de facto starting point for modern data lakes and today Onehouse makes data lakes easily accessible and usable by all customers."

Addition Investor Aaron Schildkrout said: "Onehouse is ushering in the next generation of data infrastructure, replacing expensive data ingestion and data warehousing solutions with a single lakehouse that's dramatically less costly, faster, more open and - now - also easier to use. Onehouse is going to make broadly accessible what has to-date been a tightly held secret used by only the most advanced data teams."


About Onehouse

Onehouse provides a cloud-native managed lakehouse service that makes data lakes easier, faster and cheaper. Onehouse blends the ease of use of a warehouse with the scale of a data lake into a fully managed product. Engineers can build data lakes in minutes, process data in seconds and own data in open source formats, not locked away to individual vendors. Onehouse is founded by a former Uber data architect and the creator of Apache Hudi who pioneered the fundamental technology of the lakehouse. For more information, please visit https://onehouse.ai or follow @Onehousehq.

Media and Analyst Contact:
Amber Rowland
amber@therowlandagency.com
+1-650-814-4560

A photo accompanying this announcement is available at https://www.globenewswire.com/NewsRoom/AttachmentNg/aedd9404-e43b-49fb-9091-a4b0e57e7f39


Artificial Intelligence Creeps on to the African Battlefield – Brookings Institution

Even as the world's leading militaries race to adopt artificial intelligence in anticipation of future great power war, security forces in one of the world's most conflict-prone regions are opting for a more measured approach. In Africa, AI is gradually making its way into technologies such as advanced surveillance systems and combat drones, which are being deployed to fight organized crime, extremist groups, and violent insurgencies. Though the long-term potential for AI to impact military operations in Africa is undeniable, AI's impact on organized violence has so far been limited. These limits reflect both the novelty and constraints of existing AI-enabled technology.

Artificial intelligence and armed conflict in Africa

Artificial intelligence (AI), at its most basic, leverages computing power to simulate human behavior that requires intelligence. Artificial intelligence is not a military technology like a gun or a tank. It is rather, as the University of Pennsylvania's Mark Horowitz argues, a general-purpose technology with a multitude of applications, like the internal combustion engine, electricity, or the internet. And as AI applications proliferate to military uses, it threatens to change the nature of warfare. According to the ICRC, "AI and machine-learning systems could have profound implications for the role of humans in armed conflict, especially in relation to: increasing autonomy of weapon systems and other unmanned systems; new forms of cyber and information warfare; and, more broadly, the nature of decision-making."

In at least two respects, AI is already affecting the dynamics of armed conflict and violence in Africa. First, AI-driven surveillance and smart policing platforms are being used to respond to attacks by violent extremist groups and organized criminal networks. Second, the development of AI-powered drones is beginning to influence combat operations and battlefield tactics.

AI is perhaps most widely used in Africa in areas with high levels of violence to increase the capabilities and coordination of law enforcement and domestic security services. For instance, fourteen African countries deploy AI-driven surveillance and smart-policing platforms, which typically rely on deep neural networks for image classification and a range of machine learning models for predictive analytics. In Nairobi, Chinese tech giant Huawei has helped build an advanced surveillance system, and in Johannesburg automated license plate readers have enabled authorities to track violent, organized criminals with suspected ties to the Islamic State. Although such systems have significant limitations (more on this below), they are proliferating across Africa.

AI-driven systems are also being deployed to fight organized crime. At Liwonde National Park in Malawi, park rangers use EarthRanger software, developed by the late Microsoft co-founder, Paul Allen, to combat poaching using artificial intelligence and predictive analytics. The software detects patterns in poaching that the rangers might overlook, such as upticks in poaching during holidays and government paydays. A small, motion-activated poacher cam relies on an algorithm to distinguish between humans and animals and has contributed to at least one arrest. It's not difficult to imagine how such a system might be repurposed for counterinsurgency or armed conflict, with AI-enabled surveillance and monitoring systems deployed to detect and deter armed insurgents.

In addition to the growing use of AI within surveillance systems across Africa, AI has also been integrated into weapon systems. Most prominently, lethal autonomous weapons systems use real-time sensor data coupled with AI and machine learning algorithms to select and engage targets without further intervention by a human operator. Depending on how that definition is interpreted, the first use of a lethal autonomous weapon system in combat may have taken place on African soil in March 2020. That month, logistics units belonging to the armed forces of the Libyan warlord Khalifa Haftar came under attack by Turkish-made STM Kargu-2 drones as they fled Tripoli. According to a United Nations report, the Kargu-2 represented a lethal autonomous weapons system because it had been programmed to attack targets without requiring data connectivity between the operator and munition. Although other experts have instead classified the Kargu-2 as a loitering munition, its use in combat in northern Africa nonetheless points to a future where AI-enabled weapons are increasingly deployed in armed conflicts in the region.

Indeed, despite global calls for a ban on similar weapons, the proliferation of systems like the Kargu-2 is likely only beginning. Relatively low costs, tactical advantages, and the emergence of multiple suppliers have led to a booming market for low- and mid-tier combat drones currently being dominated by players including Israel, China, Turkey, and South Africa. Such drones, particularly Turkey's Bayraktar TB2, have been acquired and used by well over a dozen African countries.

While the current generation of drones by and large do not have AI-driven autonomous capabilities that are publicly acknowledged, the same cannot be said for the next generation, which are even less costly, more attritable, and use AI-assisted swarming technology to make themselves harder to defend against. In February, the South Africa-based Paramount Group announced the launch of its N-RAVEN UAV system, which it bills as a family of autonomous, multi-mission aerial vehicles featuring next-generation swarm technologies. The N-RAVEN will be able to swarm in units of up to twenty and is designed for technology transfer and portable manufacture within partner countries. These features are likely to be attractive to African militaries.

AI's limits, downsides, and risks

Though AI may continue to play an increasing role in the organizational strategies, intelligence-gathering capabilities, and battlefield tactics of armed actors in Africa and elsewhere, it is important to put these contributions in a broader perspective. AI cannot address the fundamental drivers of armed conflict, particularly the complex insurgencies common in Africa. African states and militaries may overinvest in AI, neglecting its risks and externalities, as well as the ways in which AI-driven capabilities may be mitigated or exploited by armed non-state actors.

AI is unlikely to have a transformative impact on the outbreak, duration, or mitigation of armed conflict in Africa, whose incidence has doubled over the past decade. Despite claims by its makers, there is little hard evidence linking the deployment of AI-powered smart cities with decreases in violence, including in Nairobi, where crime incidents have remained virtually unchanged since 2014, when the city's AI-driven systems first went online. The same is true of poaching. During the COVID-19 pandemic, fewer tourists and struggling local economies have fueled significant increases, overwhelming any progress that has resulted from governments adopting cutting-edge technology.

This is because, in the first place, armed conflict is a human endeavor, with many factors that influence its outcomes. Even the staunchest defenders of AI-driven solutions, such as Huawei Southern Africa Public Affairs Director David Lane, admit that they cannot address the underlying causes of insecurity such as unemployment or inequality: "Ultimately, preventing crime requires addressing these causes in a very local way." No AI algorithm can prevent poverty or political exclusion, disputes over land or national resources, or political leaders from making chauvinistic appeals to group identity. Likewise, the central problems with Africa's militaries (endemic corruption, human rights abuses, loyalties to specific leaders and groups rather than institutions and citizens, and a proclivity for ill-timed seizures of power) are not problems that artificial intelligence alone can solve.

In the second place, the aspects of armed conflict that AI seems most likely to disrupt (remote intelligence-gathering capabilities and air power) are technologies that enable armies to keep enemies at arm's length and win in conventional, pitched battles. AI's utility in fighting insurgencies, in which non-state armed actors conduct guerilla attacks and seek to blend in and draw support from the population, is more questionable. To win in insurgencies requires a sustained on-the-ground presence to maintain order and govern contested territory. States cannot hope to prevail in such conflicts by relying on technology that effectively removes them from the fight.

Finally, the use of AI to fight modern armed conflict remains at a nascent stage. To date, the prevailing available evidence has documented how state actors are adopting AI to fight conflict, and not how armed non-state actors are responding. Nevertheless, states will not be alone in seeking to leverage autonomous weapons. Former African service members speculate that it is only a matter of time before the deployment of swarms or clusters of offensive drones by non-state actors in Africa, given their accessibility, low costs, and existing use in surveillance and smuggling. Rights activists have raised the alarm about the potential for small, cheap, swarming slaughterbots that use freely available AI and facial recognition systems to commit mass acts of terror. This particular scenario is controversial, but according to American University's Audrey Kurth Cronin, it is both technologically feasible and consistent with classic patterns of diffusion.

The AI armed conflict evolution

These downsides and risks suggest the continued diffusion of AI is unlikely to result in the revolutionary changes to armed conflict suggested by some of its more ardent proponents and backers. Rather, modern AI is perhaps best viewed as continuing and perhaps accelerating long-standing technological trends that have enhanced sensing capabilities and digitized and automated the operations and tactics of armed actors everywhere.

For all its complexity, AI is first and foremost a digital technology, its impact dependent on and difficult to disentangle from a technical triad of data, algorithms, and computing power. The impact of AI-powered surveillance platforms, from the EarthRanger software used at Liwonde to Huawei-supplied smart policing platforms, isn't just a result of machine-learning algorithms that enable human-like reasoning capabilities, but also of the ability to collect, store, process, collate, and manage vast quantities of data. Likewise, as pointed out by analysts such as Kelsey Atherton, the Kargu-2 used in Libya can be classified as an autonomous loitering munition such as Israel's Harpy drone. The main difference between the Kargu-2 and the Harpy, which was first manufactured in 1989, is that where the former uses AI-driven image recognition, the latter uses electro-optical sensors to detect and home in on enemy radar emissions.

The diffusion of AI across Africa, like the broader diffusion of digital technology, is likely to be diverse and uneven. Africa remains the world's least digitized region. Internet penetration rates are low and likely to remain so in many of the most conflict-prone countries. In Somalia, South Sudan, Ethiopia, the Democratic Republic of Congo, and much of the Lake Chad Basin, internet penetration is below 20%. AI is unlikely to have much of an impact on conflict in regions where citizens leave little in the way of a digital footprint, and non-state armed groups control territory beyond the easy reach of the state.

Taken together, these developments suggest that AI will cause a steady evolution in armed conflict in Africa and elsewhere, rather than revolutionize it. Digitization and the widespread adoption of autonomous weapons platforms may extend the eyes and lengthen the fists of state armies. Non-state actors will adopt these technologies themselves and come up with clever ways to exploit or negate them. Artificial intelligence will be used in combination with equally influential but less flashy inventions such as the AK-47, the nonstandard tactical vehicle, and the IED to enable new tactics that take advantage of or exploit trends towards better sensing capabilities and increased mobility.

Incrementally and in concert with other emerging technologies, AI is transforming the tools and tactics of warfare. Nevertheless, experience from Africa suggests that humans will remain the main actors in the drama of modern armed conflict.

Nathaniel Allen is an assistant professor with the Africa Center for Strategic Studies at National Defense University and a Council on Foreign Relations term member. Marian Ify Okpali is a researcher on cyber policy and an academic specialist at the Africa Center for Strategic Studies at National Defense University. The opinions expressed in this article are those of the authors.

Microsoft provides financial support to the Brookings Institution, a nonprofit organization devoted to rigorous, independent, in-depth public policy research.


CEO of Alberta-based company says it’s time for Alberta, companies to invest in AI and machine learning – Edmonton Journal


Now is the time for Alberta-based companies and the province to invest more in AI and machine learning technology, said the CEO of an Edmonton company.


Cam Linke, CEO of Alberta Machine Intelligence Institute (Amii), said it's a special time in AI and machine learning, with lots of advancements being made.

"This isn't just an academic thing, there is the ability and tools to be able to apply machine learning to a myriad of business problems," said Linke. "Right now, businesses don't have to make enormous investments upfront, they can make reasoned investments around a business plan that can have a meaningful business impact right now."

However, Linke said at the same time, the field is growing rapidly.

"It's kind of a special time where it's sitting right at the intersection of engineering, where it can be applied right now, and science, where the field's continuing to learn, grow and do new things," he said.


Linke said there is a carrot and a stick for regions and companies when it comes to machine learning. The carrot is that it creates a lot of opportunity, business value and the ability to build a competitive advantage in your industry.

"The stick of it is that if you're not, your competitor is," he said. "You kind of have to, not just because there's great opportunity there, but someone in your industry and one of your competitors is going to take advantage of this technology and they will have a competitive edge over you if you're not making that investment."

Linke added that Alberta is ahead of many provinces because the province has been investing in machine learning since 2002, and because of the federal government's Pan-Canadian AI Strategy, announced five years ago.


Amii is a non-profit that supports and invests in world-leading research and training primarily done at the University of Alberta. Linke said the company has partnered with more than 100 companies, from small start-ups to multi-nationals like Shell, to help in the AI and machine learning fields.

Linke said Amii has worked with companies on implementing things such as predictive maintenance, which predicts when a machine may fail so a company can get in front of repairs before a more expensive incident occurs. Another example is the machine learning and reinforcement learning used at a water treatment plant to optimize the amount of water that can be treated while trying to reduce the amount of energy used.

Linke said Alberta is already seeing the impacts and work of more AI and machine learning being introduced.

"We're seeing it by the amount of investment by large companies in the area, the amount of investment in start-ups and the growth of start-ups in the area and we're seeing it with the number of jobs and the number of people hired in the area," said Linke.

ktaniguchi@postmedia.com

twitter.com/kellentaniguchi


How to build healthcare predictive models using PyHealth? – Analytics India Magazine

Machine learning has been applied to many health-related tasks, such as the development of new medical treatments, the management of patient data and records, and the treatment of chronic diseases. Success in these state-of-the-art applications depends on the time-consuming work of building and evaluating models. To lighten this load, Yue Zhao et al. have proposed PyHealth, a Python-based toolbox. As the name implies, this toolbox contains a variety of ML models and architectures for working with medical data. In this article, we will go through the toolbox to understand how it works and how it is applied. We cover the use cases of machine learning in healthcare, the design of PyHealth, and model building with PyHealth.

Let's first discuss the use cases of machine learning in the healthcare industry.

Machine learning is being used in a variety of healthcare settings, from case management of common chronic conditions to leveraging patient health data in conjunction with environmental factors such as pollution exposure and weather.

By crunching enormous amounts of data, machine learning can help healthcare practitioners develop accurate treatments tailored to individual patient characteristics. The following are some examples of applications in this area:

The ability to swiftly and properly diagnose diseases is one of the most critical aspects of a successful healthcare organization. In high-need areas like cancer diagnosis and therapy, where hundreds of drugs are now in clinical trials, scientists and computationalists are entering the mix. One method combines cognitive computing with genetic tumour sequencing, while another makes use of machine learning to provide diagnosis and treatment in a range of fields, including oncology.

Medical imaging, with its ability to provide a complete picture of an illness, is another important part of diagnosis. As data sources become more diverse, deep learning is becoming more accessible and increasingly important in the diagnostic process. Although these machine learning applications are frequently correct, they have a key limitation: they cannot explain how they reached their conclusions.

ML has the potential to identify new medications with significant economic benefits for pharmaceutical companies, hospitals, and patients. Some of the world's largest technology companies, like IBM and Google, have developed ML systems to help patients find new treatment options. Precision medicine is a key term in this area, since it entails understanding the mechanisms underlying complex disorders and developing alternative therapeutic pathways.

Because of the high-risk nature of surgery, human involvement will always be needed, but machine learning has proved extremely helpful in robotic surgery. The da Vinci robot, which lets surgeons control robotic arms to operate with fine detail in confined areas, is one of the best-known breakthroughs in the field.

These robotic arms are generally more accurate and steady than human hands. Other instruments use computer vision and machine learning to measure the distances between body parts so that surgery can be performed precisely.

Health data is typically noisy, complicated, and heterogeneous, resulting in a diverse set of healthcare modelling problems. For instance, health risk prediction is based on sequential patient data, disease diagnosis on medical images, and risk detection on continuous physiological signals, such as an electroencephalogram (EEG) or electrocardiogram (ECG), and on multimodal clinical notes (e.g., text and images).

Despite the importance of these tasks in healthcare research and clinical decision making, the complexity and variability of health data and tasks call for the long-overdue development of a specialized ML system for benchmarking predictive health models.

PyHealth is made up of three modules: data preprocessing, predictive modelling, and evaluation. PyHealth's target users are both computer scientists and healthcare data scientists. With PyHealth, they can run complicated machine learning pipelines on healthcare datasets in fewer than 10 lines of code.

The data preprocessing module converts complicated healthcare datasets such as longitudinal electronic health records, medical images, continuous signals (e.g., electrocardiograms), and clinical notes into machine learning-friendly formats.
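As a rough illustration of what a "machine learning-friendly format" can mean for longitudinal records, the sketch below pads variable-length visit sequences to a fixed length and builds a mask that marks the real visits. This is not PyHealth's preprocessing code; the function name `pad_visits` and the toy records are assumptions made for this example.

```python
from typing import List, Tuple


def pad_visits(
    patients: List[List[List[float]]],
    max_visits: int,
) -> Tuple[List[List[List[float]]], List[List[int]]]:
    """Pad or truncate each patient's visit sequence to max_visits.

    Each patient is a list of visits; each visit is a feature vector.
    Returns the padded sequences and a 0/1 mask marking real visits.
    """
    if max_visits < 1:
        raise ValueError("max_visits must be at least 1")
    n_features = len(patients[0][0])
    padded, masks = [], []
    for visits in patients:
        kept = visits[-max_visits:]  # keep the most recent visits
        n_pad = max_visits - len(kept)
        padded.append(kept + [[0.0] * n_features for _ in range(n_pad)])
        masks.append([1] * len(kept) + [0] * n_pad)
    return padded, masks


# Hypothetical toy data: two patients, two features per visit.
records = [
    [[0.5, 1.0], [0.7, 1.2], [0.9, 1.1]],  # three visits
    [[0.2, 0.8]],                          # one visit
]
X, mask = pad_visits(records, max_visits=2)
```

After padding, every patient has the same number of time steps, which is the shape most sequence models expect; the mask lets a model ignore the padded steps.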

The predictive modelling module offers over 30 machine learning models, including known ensemble trees and deep neural network-based approaches, using a uniform yet flexible API geared for both researchers and practitioners.

The evaluation module includes a number of evaluation methodologies (for example, cross-validation and train-validation-test split) as well as prediction model metrics.
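The two evaluation strategies named above can be sketched with the standard library alone. The helpers below are illustrative assumptions (`train_valid_test_split` and `k_fold_indices` are names chosen for this example), not functions from PyHealth's evaluation module.

```python
import random
from typing import Iterator, List, Tuple


def train_valid_test_split(
    n_samples: int, valid_frac: float = 0.1, test_frac: float = 0.2, seed: int = 0
) -> Tuple[List[int], List[int], List[int]]:
    """Shuffle sample indices and cut them into three disjoint parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_frac)
    n_valid = round(n_samples * valid_frac)
    test = indices[:n_test]
    valid = indices[n_test:n_test + n_valid]
    train = indices[n_test + n_valid:]
    return train, valid, test


def k_fold_indices(
    n_samples: int, k: int = 5, seed: int = 0
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, held_out) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k near-equal folds
    for i in range(k):
        held_out = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, held_out


train, valid, test = train_valid_test_split(100)
folds = list(k_fold_indices(100, k=5))
```

A single train-validation-test split is cheaper, while cross-validation uses every sample for testing exactly once and gives a less noisy estimate on small datasets.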

There are five distinct advantages to using PyHealth. First, it contains more than 30 cutting-edge predictive health algorithms, including both traditional techniques like XGBoost and more recent deep learning architectures like autoencoders, convolution-based models, and adversarial models.

Second, PyHealth has a broad scope and includes models for a variety of data types, including sequence, image, physiological signal, and unstructured text data. Third, for clarity and ease of use, PyHealth includes a unified API, detailed documentation, and interactive examples for all algorithms; complex deep learning models can be implemented in fewer than ten lines of code.

Fourth, most models in PyHealth are covered by unit tests with cross-platform continuous integration, code coverage, and code maintainability checks. Finally, for efficiency and scalability, parallelization is enabled in select modules (data preprocessing), along with fast GPU computation for deep learning models via PyTorch.

PyHealth is a Python 3 library built on NumPy, SciPy, scikit-learn, and PyTorch. It consists of three major modules. First, the data preprocessing module validates user input and converts it into a format that learning models can understand.

Second, the predictive modelling module is made up of a collection of models organized by input data type into sequences, images, EEG, and text; for each data type, a set of dedicated learning models has been implemented. Third, the evaluation module automatically infers the task type, such as multi-class classification, and conducts a comprehensive evaluation by task type.

Most learning models share the same interface, which is inspired by the scikit-learn API design and general deep learning design: (i) fit learns the weights and saves the necessary statistics from the train and validation data; (ii) load_model chooses the model with the best validation accuracy; and (iii) inference makes predictions on the incoming test data.
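To make the three-step interface concrete, here is a minimal sketch of a toy one-feature perceptron that follows the same fit, load_model, and inference pattern. The class `ToyPerceptron`, its parameters, and the data are assumptions for illustration only; this is not a PyHealth model, and PyHealth's real signatures may differ.

```python
from typing import List, Optional, Sequence, Tuple

Dataset = Tuple[Sequence[float], Sequence[int]]  # (features, labels)


class ToyPerceptron:
    """Toy 1-D classifier exposing fit / load_model / inference."""

    def __init__(self, n_epoch: int = 10, lr: float = 1.0):
        self.n_epoch = n_epoch
        self.lr = lr
        # One checkpoint per epoch: (validation accuracy, weight, bias).
        self.checkpoints: List[Tuple[float, float, float]] = []
        self.params: Optional[Tuple[float, float]] = None

    @staticmethod
    def _predict(w: float, b: float, xs: Sequence[float]) -> List[int]:
        return [1 if w * x + b > 0 else 0 for x in xs]

    def fit(self, train: Dataset, valid: Dataset) -> "ToyPerceptron":
        """Learn weights on train; score and store a checkpoint each epoch."""
        train_x, train_y = train
        valid_x, valid_y = valid
        w, b = 0.0, 0.0
        self.checkpoints = []
        for _ in range(self.n_epoch):
            for x, y in zip(train_x, train_y):
                err = y - self._predict(w, b, [x])[0]
                w += self.lr * err * x  # perceptron update rule
                b += self.lr * err
            preds = self._predict(w, b, valid_x)
            acc = sum(p == y for p, y in zip(preds, valid_y)) / len(valid_y)
            self.checkpoints.append((acc, w, b))
        return self

    def load_model(self) -> "ToyPerceptron":
        """Select the checkpoint with the best validation accuracy."""
        if not self.checkpoints:
            raise RuntimeError("call fit() before load_model()")
        _, w, b = max(self.checkpoints, key=lambda c: c[0])
        self.params = (w, b)
        return self

    def inference(self, xs: Sequence[float]) -> List[int]:
        """Predict labels for unseen data with the selected checkpoint."""
        if self.params is None:
            raise RuntimeError("call load_model() before inference()")
        return self._predict(*self.params, xs)


model = ToyPerceptron(n_epoch=10)
model.fit(train=([1.0, 2.0, 3.0, 4.0], [0, 0, 1, 1]), valid=([1.5, 3.5], [0, 1]))
model.load_model()
predictions = model.inference([0.5, 5.0])
```

Separating load_model from fit means the weights used for prediction come from the best validation checkpoint rather than simply the last epoch, which guards against late-epoch overfitting.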

For quick data and model exploration, the framework includes a library of helper and utility functions (check_parameter, label_check, and partition_estimators). For example, label_check can inspect the data labels and automatically infer the task type, such as binary classification or multi-class classification.
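The idea of inferring the task type from the labels can be sketched in a few lines. The helper below, `infer_task_type`, is a hypothetical stand-in written for this example and makes no claim about how PyHealth's own label check is implemented.

```python
from typing import Hashable, Sequence


def infer_task_type(labels: Sequence[Hashable]) -> str:
    """Guess the classification task type from a flat label vector."""
    if len(labels) == 0:
        raise ValueError("labels must not be empty")
    n_classes = len(set(labels))
    if n_classes < 2:
        raise ValueError("need at least two distinct labels")
    return "binary" if n_classes == 2 else "multiclass"


binary_task = infer_task_type([0, 1, 1, 0])
multiclass_task = infer_task_type(["sepsis", "stroke", "none", "stroke"])
```

Knowing the task type up front lets an evaluation step pick suitable metrics without the user specifying them.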

PyHealth for model building

Now we'll discuss how to use the API of this framework. First, we need to install the package using pip.

! pip install pyhealth

Next, we can load the data from the repository itself. To do that, we need to clone the repository. Inside the datasets folder of the cloned repository there is a variety of datasets, including sequence-based and image-based ones. We are using the MIMIC dataset, which ships as a zip archive, so we need to unzip it first.

The unzipped files are saved in the current working directory in a folder named mimic. To use this dataset, we then load the sequence data generator function, which prepares the dataset for experimentation.
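The unzip step can be done from Python with the standard library. The sketch below assumes the archive has already been obtained by cloning the repository; the helper name `unzip_dataset` and the path in the usage comment are placeholders, not paths confirmed by the article.

```python
import zipfile
from pathlib import Path
from typing import List, Union


def unzip_dataset(
    zip_path: Union[str, Path], dest_dir: Union[str, Path]
) -> List[str]:
    """Extract zip_path into dest_dir and return the sorted member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return sorted(archive.namelist())


# Usage (hypothetical path; adjust to where the archive actually lives):
# members = unzip_dataset("path/to/mimic.zip", ".")
```

Returning the member names makes it easy to confirm that the expected folder was extracted before moving on to data loading.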

Now that the dataset is loaded, we can move on to modelling.

Here is the result of fitting the model.

In this article, we have discussed how machine learning can be used in the healthcare industry by looking at several applications. Because this domain is vast and has a large number of applications, we focused on a Python-based toolbox designed for building predictive models with deep learning techniques such as LSTM and GRU for sequence data and CNN for image data.

Read the original here:
How to build healthcare predictive models using PyHealth? - Analytics India Magazine