Archive for the ‘Machine Learning’ Category

Experts Talk Machine Learning Best Practices for Database Management – Database Trends and Applications

Machine learning is becoming the go-to solution for greater automation and intelligence. A recent study fielded among DBTA subscribers found that 48% currently have machine learning initiatives underway, with another 20% considering adoption. At the same time, most projects are still in the early phases.

DBTA recently held a roundtable webinar with Gaurav Deshpande, VP of marketing, TigerGraph; Santiago Giraldo, director of product marketing, data engineering and machine learning, Cloudera; and Paige Roberts, open source relations manager, Vertica, who discussed key technologies and strategies for maximizing machine learning's impact.

Advanced analytics and machine learning on connected data allow organizations to connect all data sets and pipelines, analyze that connected data, and learn from that connected data, Deshpande explained.

TigerGraph is a scalable graph database for the enterprise that is foundational for AI and ML solutions, he said. It offers flexible schema, high performance for complex transactions, and high performance for deep analytics.

The success of machine learning adoption is intertwined, and collaboration is critical, said Giraldo. It requires an enterprise data platform that streamlines the full data lifecycle.

Machine learning with Cloudera provides customers with a hybrid platform across multiple clouds and data centers. Cloudera is one of the only offerings with integrated experiences and SDX-backed security and governance, said Giraldo. It enables collaborative and integrated BI and augmentation from expert data scientists to data analysts.

Applications and services that enable our data-driven world use both BI and data science, according to Roberts.

When choosing the best platform that includes machine learning, she suggests not committing to only open source, only proprietary, or only one brand.

Don't lock yourself in to only one deployment option: a solution that only works on-prem, only works in the cloud, or only works on one particular cloud, Roberts said.

Users should not tightly couple components; everything should be interchangeable, Roberts said. Switching out one component shouldn't break everything. And plan for the future; don't get locked in, she said.

An archived on-demand replay of this webinar is available here.

Read the original:
Experts Talk Machine Learning Best Practices for Database Management - Database Trends and Applications

Intel works with Deci to speed up machine learning on its chips – VentureBeat


Intel today announced a strategic business and technology collaboration with Deci to optimize machine learning on the former's processors. Deci says that in the coming weeks, it will work with Intel to deploy innovative AI technologies to the companies' mutual customers.

Machine learning deployments have historically been constrained by the size and speed of algorithms and the need for costly hardware. In fact, a report from MIT found that machine learning might be approaching computational limits. A separate Synced study estimated that the University of Washington's Grover fake news detection model cost $25,000 to train in about two weeks. OpenAI reportedly racked up a whopping $12 million to train its GPT-3 language model, and Google spent an estimated $6,912 training BERT, a bidirectional transformer model that redefined the state of the art for 11 natural language processing tasks.

Intel and Deci say the partnership will enable machine learning at scale on Intel chips, potentially enabling new applications of inference through reductions in costs and latency. Already, Deci has worked to accelerate the inference speed of the well-known ResNet-50 neural network on Intel processors, achieving a reduction in the model's latency by a factor of 11.8 and increasing throughput by up to 11 times.
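For a sense of the baseline such optimizations are measured against, latency and throughput for a stock ResNet-50 can be timed in a few lines of PyTorch. The sketch below is illustrative only; the batch size, iteration counts, and use of torchvision's off-the-shelf model are assumptions, not Deci's or Intel's benchmark setup.

```python
# Illustrative sketch only, not Deci's tooling: timing stock ResNet-50
# CPU inference with PyTorch to get baseline latency and throughput
# numbers of the kind such optimizations are measured against.
import time

import torch
import torchvision.models as models

model = models.resnet50(weights=None).eval()   # untrained weights suffice for timing (torchvision >= 0.13)
batch = torch.randn(8, 3, 224, 224)            # batch size 8 is an arbitrary choice

with torch.no_grad():
    for _ in range(5):                         # warm-up iterations
        model(batch)

    runs = 20
    start = time.perf_counter()
    for _ in range(runs):
        model(batch)
    elapsed = time.perf_counter() - start

print(f"latency per batch: {elapsed / runs * 1000:.1f} ms")
print(f"throughput: {runs * batch.shape[0] / elapsed:.1f} images/s")
```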

By optimizing the AI models that run on Intel's hardware, Deci enables customers to get even more speed and will allow for cost-effective and more general deep learning use cases on Intel CPUs, Deci CEO and cofounder Yonatan Geifman said. We are delighted to collaborate with Intel to deliver even greater value to our mutual customers and look forward to a successful partnership.

Deci achieves runtime acceleration through a combination of data preprocessing and loading, selection of model architectures and hyperparameters (i.e., the variables that influence a model's predictions), and datasets optimized for inference. It also takes care of steps like deployment, serving, monitoring, and explainability. Deci's accelerator redesigns models to create new models with several computation routes, all optimized for a given inference device.

Deci's router component ensures that each data input is directed via the proper route. (Each route is specialized for a prediction task.) As for the company's accelerator, it works in synergy with other compression techniques like pruning and quantization. The accelerator can even act as a multiplier for complementary acceleration solutions such as AI compilers and specialized hardware, according to the company.
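The router-plus-routes design described above can be pictured with a toy module. The sketch below is not Deci's code; the layer sizes, the hard argmax routing, and the two-layer routes are invented purely for illustration.

```python
# Toy sketch of the multi-route idea described above, not Deci's code:
# a small learned router picks one of several specialized sub-networks
# ("routes") for each input.
import torch
import torch.nn as nn

class MultiRouteModel(nn.Module):
    def __init__(self, in_dim=128, n_classes=10, n_routes=3):
        super().__init__()
        self.n_classes = n_classes
        self.router = nn.Linear(in_dim, n_routes)       # scores each route for each input
        self.routes = nn.ModuleList(
            nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, n_classes))
            for _ in range(n_routes)
        )

    def forward(self, x):
        route_idx = self.router(x).argmax(dim=1)        # hard routing, for illustration
        out = torch.zeros(x.size(0), self.n_classes)
        for i, route in enumerate(self.routes):
            mask = route_idx == i
            if mask.any():                              # run each route only on its own inputs
                out[mask] = route(x[mask])
        return out

with torch.no_grad():
    print(MultiRouteModel()(torch.randn(4, 128)).shape)  # torch.Size([4, 10])
```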

Deci was cofounded by Geifman, entrepreneur Jonathan Elial, and Ran El-Yaniv, a computer science professor at Technion in Haifa, Israel. Geifman and El-Yaniv met at Technion, where Geifman is a PhD candidate in the university's computer science department. To date, the Tel Aviv-based company, a participant in Intel's Ignite startup accelerator, has raised $9.1 million from investors including Square Peg.

The rest is here:
Intel works with Deci to speed up machine learning on its chips - VentureBeat

NVIDIA and Harvard University Researchers Introduce AtacWorks: A Machine Learning Toolkit to Revolutionize Genome Sequencing – MarkTechPost

Researchers from NVIDIA and Harvard University have introduced a machine learning-driven toolkit called AtacWorks that has the potential to bring about remarkable advancements in genome sequencing.

What is genome sequencing?

Genome sequencing was introduced by British biochemist Frederick Sanger and his team in 1977. The world was fascinated by how this new technology could uncover human similarity and genetic diversity in new ways.

A genome is a map of all the nucleotides in our body, and genome sequencing is a technique used to generate this nucleotide map to decode our DNA. The human genome consists of over 3 billion nucleotides.

Genome sequencing has helped scientists figure out the location of various genes and how they work together to ensure the growth and maintenance of organisms. It has served as an essential tool in the study of hereditary diseases and genetic abnormalities.

Current challenges and limitations of genome sequencing techniques

The traditional technique, ATAC-seq, measures the intensity of signals across the genome and plots the data in a graph. However, ATAC-seq works efficiently only when it has access to many cells; a large number (on the order of thousands) is needed for reasonably efficient genome sequencing. The fewer cells available, the noisier the data, and the more challenging it is to analyze rare cell types.

In addition, the traditional process is time-consuming. This poses a significant challenge to studying genetic mutations in organisms, like viruses, that rapidly mutate.

Introducing AtacWorks: the latest game-changer in genome sequencing

A machine learning-driven toolkit called AtacWorks was created by researchers from NVIDIA and Harvard University to help address some of these challenges in genome sequencing.

AtacWorks is a PyTorch-based convolutional neural network (CNN) trained to differentiate between signal and noise and pick out peaks in a noisy dataset. Applied to ATAC-seq data, it can deliver the same quality of results from fewer data points (in this case, fewer cells). The researchers found that AtacWorks can produce the same quality of data from 1 million data points as was previously achieved with 50 million.
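To make the idea concrete, here is a minimal sketch of a 1D convolutional network with separate denoising and peak-calling heads, the general shape of the task AtacWorks performs. The layer counts, kernel sizes, and input dimensions are assumptions for illustration and do not reproduce the published AtacWorks architecture.

```python
# Minimal illustrative sketch, not the published AtacWorks model:
# a 1D CNN that maps a noisy coverage track to a denoised track plus
# per-position peak probabilities.
import torch
import torch.nn as nn

class DenoisingCNN(nn.Module):
    def __init__(self, channels=32, kernel_size=51):
        super().__init__()
        pad = kernel_size // 2                        # keep output length equal to input length
        self.body = nn.Sequential(
            nn.Conv1d(1, channels, kernel_size, padding=pad), nn.ReLU(),
            nn.Conv1d(channels, channels, kernel_size, padding=pad), nn.ReLU(),
        )
        self.signal_head = nn.Conv1d(channels, 1, 1)  # regression: denoised signal
        self.peak_head = nn.Conv1d(channels, 1, 1)    # classification: peak probability

    def forward(self, noisy_track):                   # shape: (batch, 1, positions)
        features = self.body(noisy_track)
        return self.signal_head(features), torch.sigmoid(self.peak_head(features))

model = DenoisingCNN()
denoised, peaks = model(torch.randn(2, 1, 1000))      # 2 toy tracks of 1,000 positions
print(denoised.shape, peaks.shape)                    # both torch.Size([2, 1, 1000])
```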

In addition, AtacWorks helps speed up analysis by using tensor core GPUs. This makes it possible to complete a full genome analysis in just 30 minutes, a radical difference compared to the traditional 15-hour time frame.

In the research paper published in Nature Communications, Harvard researchers applied AtacWorks to a dataset of stem cells that produce red and white blood cells. Stem cells are rare cell types, present in very small numbers in the human body at any given time.

Using a sample of just 50 stem cells, the researchers were able to identify distinct regions of a stem cell's DNA that cause it to develop into a red blood cell or a white blood cell. They were also able to isolate DNA sequences that correspond to red blood cells.

This remarkable machine learning breakthrough in genome sequencing has the potential to lead to the discovery of new drugs and to the exploration of evolution through the study of new mutations.

Source: https://ngc.nvidia.com/catalog/resources/nvidia:atacworks

Paper: https://www.nature.com/articles/s41467-021-21765-5

Github: https://github.com/clara-parabricks/AtacWorks


The rest is here:
NVIDIA and Harvard University Researchers Introduce AtacWorks: A Machine Learning Toolkit to Revolutionize Genome Sequencing - MarkTechPost

Construction and validation of a machine learning-based nomogram: A tool to predict the risk of getting severe coronavirus disease 2019 (COVID-19) -…

This article was originally published here

Immun Inflamm Dis. 2021 Mar 13. doi: 10.1002/iid3.421. Online ahead of print.

ABSTRACT

BACKGROUND: Identifying patients who may develop severe coronavirus disease 2019 (COVID-19) will facilitate personalized treatment and optimize the distribution of medical resources.

METHODS: In this study, 590 hospitalized COVID-19 patients were enrolled (training set: n = 285; internal validation set: n = 127; prospective set: n = 178). After filtering by two machine learning methods in the training set, 5 of 31 clinical features were selected for model building to predict the risk of developing severe COVID-19. Multivariate logistic regression was applied to build the prediction nomogram, which was validated in two different sets. Receiver operating characteristic (ROC) analysis and decision curve analysis (DCA) were used to evaluate its performance.

RESULTS: From 31 potential predictors in the training set, 5 independent predictive factors were identified and included in the risk score: C-reactive protein (CRP), lactate dehydrogenase (LDH), age, Charlson/Deyo comorbidity score (CDCS), and erythrocyte sedimentation rate (ESR). Subsequently, we generated the nomogram based on the above features for predicting severe COVID-19. In the training cohort, the area under the curve (AUC) was 0.822 (95% CI, 0.765-0.875), and in the internal validation cohort it was 0.762 (95% CI, 0.768-0.844). We further validated the nomogram in a prospective cohort, with an AUC of 0.705 (95% CI, 0.627-0.778). The internally bootstrapped calibration curve showed favorable consistency between the nomogram's predictions and actual outcomes, and the DCA also indicated a high clinical net benefit.

CONCLUSION: In this study, our prediction model based on five clinical characteristics of COVID-19 patients will enable clinicians to predict the potential risk of developing critical illness and thus optimize medical management.

PMID:33713584 | DOI:10.1002/iid3.421
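For readers who want a concrete picture of the modelling step described in the abstract, the sketch below fits a five-feature logistic regression risk model and reports a validation ROC AUC. The data are synthetic stand-ins generated for illustration, not the study's cohort, and the effect sizes are invented assumptions.

```python
# Illustrative sketch only, not the authors' code or data: fitting a
# five-feature logistic regression risk model and reporting a validation
# ROC AUC, mirroring the modelling step described in the abstract.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 590                                          # cohort size from the abstract
X = rng.normal(size=(n, 5))                      # synthetic stand-ins for CRP, LDH, age, CDCS, ESR
weights = np.array([0.8, 0.6, 0.5, 0.4, 0.3])    # invented effect sizes, illustration only
y = (X @ weights + rng.normal(scale=1.5, size=n) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
model = LogisticRegression().fit(X_train, y_train)
auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(f"validation AUC: {auc:.3f}")
```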

See original here:
Construction and validation of a machine learning-based nomogram: A tool to predict the risk of getting severe coronavirus disease 2019 (COVID-19) -...

Machine Learning: Long way to go for AI bias-correction; some hurl abuses, others see abuse where there's none – The Financial Express


While the focus on checking human biases from getting coded into artificial intelligence (AI) is desirable, there is also a need to develop AI that is intelligent about biases and contexts. The Indian Express reports that the reason YouTube's AI banned Agadmator, a popular chess channel on the platform, last year could be the use of the words "white", "black" and "attack", which mean different things in chess and in race relations.

While more companies are warming up to AI, AI platforms are being taught to screen for specific cue words to detect bias or abuse. So, in this case, with the use of those particular words, YouTube's AI read racism where there was none. How poorly human understanding is being translated for machines is evident not just from this case, but also from that of Microsoft's Tay-bot, which all too quickly picked up anti-Semitic and hateful content from the internet when it should have been designed to filter this out contextually.
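The failure mode is easy to reproduce with a toy filter. The cue-word list and co-occurrence threshold below are assumptions for illustration and do not describe any platform's actual moderation system.

```python
# Hedged illustration of the failure mode described above, not any
# platform's real moderation system: a naive cue-word filter flags
# ordinary chess commentary because it ignores context.
import re

CUE_WORDS = {"white", "black", "attack"}          # assumed cue list, for illustration only

def naive_flag(text: str) -> bool:
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    return len(tokens & CUE_WORDS) >= 2           # flag when cue words co-occur, regardless of context

print(naive_flag("White sacrifices a knight to attack Black's king"))  # True: a false positive
print(naive_flag("A quiet positional game decided in the endgame"))    # False
```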

While there will be a need to continuously go back to the AI drawing board, human control of AI's learning, and of other machine learning, will be important to set the context for the machines.

AI ethics is surely a minefield: business interests, as various analyses of the recent episode at Google involving the termination of two senior ethics experts at the company suggest, can sometimes come into conflict with the larger good. But as research translates human understanding for machines more effectively, chances are that both the Tay-bot and YouTube's reported AI gaffe, at the other extreme, will become rarer.


Continue reading here:
Machine Learning: Long way to go for AI bias-correction; some hurl abuses, others see abuse where theres none - The Financial Express