Archive for the ‘Machine Learning’ Category

New Machine Learning Features, Data Integrations, and Upgraded Classification Engines Available in Grooper Version 2.9 – PRNewswire

OKLAHOMA CITY, July 28, 2020 /PRNewswire/ -- Grooper, the leading intelligent document processing and digital data integration platform, announces the release of version 2.9. Included are fourteen new capabilities that enhance machine learning, classification, separation, data integration, and reporting.

New Machine Learning Features

Machine learning is easier and more powerful. The new Rebuild Training features provide tuning and A/B testing using identical training sets and document training decisions.

Integration with Box

Built-in integration with Box.com enables file import and export, metadata mapping, data lookups, and more.

Advanced Document Classification

Tackle complicated document sets with advanced classification strategies. Target documents within or across groups that are lexically dissimilar or similar with high accuracy.

New Document Viewer

Users can choose from multiple document renditions to build better data extractions.

Improved Document Separation

Document separation is now more robust and accurate due to new auto-separation logic. New page extractors separate unstructured documents.

Enhanced Database Export

Define multiple exports on a single export step within a single database or spanning multiple tables. Multipart database exports are simplified and SQL server-generated identity columns are supported.

CMIS Data Lookups

Populate and validate data fields based on queryable metadata located on CMIS objects.

New Data Annotation Option in Data Review

Extracted document data is now displayed at the extraction location on the document. This speeds up human data review and includes multiple configurable properties.

Content Type Filtering

Users can now enable classification, extraction, and review to proceed in stages for larger, more complicated projects.

Compile Stats Feature

The Compile Stats feature provides comprehensive statistics on classification and extraction activities to assist administrators in developing and troubleshooting advanced content models.

To learn more about Grooper, visit http://www.grooper.com.

About Grooper

Grooper was built from the ground up by BIS, a company with 35 years of continuous experience developing and delivering new technology. Grooper is an intelligent document processing and digital data integration solution that empowers organizations to extract meaningful information from paper/electronic documents and other forms of unstructured data.

The platform combines patented and sophisticated image processing, capture technology, machine learning, natural language processing, and optical character recognition to enrich and embed human comprehension into data. By tackling tough challenges that other systems cannot resolve, Grooper has become the foundation for many industry-first solutions in healthcare, financial services, oil and gas, education, and government.

SOURCE Grooper

http://www.grooper.com

Top Machine Learning Algorithms, Frameworks, Tools and Products Used by Data Scientists – Customer Think

A recent survey by Kaggle revealed that data professionals used a variety of algorithms, tools, frameworks and products to extract insights. Top algorithms were linear/logistic regression, decision trees/random forests and Gradient Boosting Machines. Top frameworks were Scikit-learn and TensorFlow. Top tools for automation were related to model selection and data augmentation. While half of the respondents did not use ML products, the top products used were Google Cloud ML Engine, Azure ML Studio and Amazon SageMaker.

Machine learning is employed by data scientists to find patterns and predict important outcomes. The application of machine learning reaches across industries (e.g., healthcare, education) and professions (e.g., marketing, content management), and data professionals have many different tools, methods and products they can use to extract useful insights. Kaggle conducted a survey in October 2019 of nearly 20,000 data professionals (2019 Kaggle Machine Learning and Data Science Survey) that reveals the variety of ways they solve their machine learning problems. Today's post is about the machine learning methods and tools data professionals used in 2019.

Figure 1. Top Machine Learning Algorithms Used in 2019.

The survey included a question for data professionals: "Which of the following machine learning algorithms do you use on a regular basis? Select all that apply." On average, data professionals used 3 (median) machine learning algorithms. The top 10 machine learning algorithms used were (see Figure 1):

Adoption rates for the top two algorithms were highest among data professionals who self-identified as statisticians or data scientists. Adoption rates were around 10 percentage points higher for these data pros (e.g., ~80% for linear/logistic regression, ~70% for decision trees and random forests).

A recent poll by KDNuggets found results similar to those of the current study. In that poll, the top machine learning methods also included regression (56%), decision trees/rules (48%), random forests (45%) and Gradient Boosting Machines (23%).
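The median counts and adoption rates reported for these select-all-that-apply questions come down to a simple tally. The sketch below shows that arithmetic on five made-up responses; the response data and variable names are illustrative assumptions, not figures from the Kaggle survey.

```python
from collections import Counter
from statistics import median

# Hypothetical select-all-that-apply answers, one set per respondent.
# These five responses are invented for illustration only.
responses = [
    {"Linear/logistic regression", "Decision trees/random forests", "Gradient Boosting Machines"},
    {"Linear/logistic regression"},
    {"Linear/logistic regression", "Decision trees/random forests"},
    {"Decision trees/random forests", "Gradient Boosting Machines"},
    {"Linear/logistic regression", "Decision trees/random forests", "Gradient Boosting Machines"},
]

# Median number of algorithms selected per respondent.
median_selected = median(len(r) for r in responses)

# Adoption rate: percentage of respondents who selected each algorithm.
counts = Counter(algorithm for r in responses for algorithm in r)
adoption_rate = {a: 100 * c / len(responses) for a, c in counts.items()}

for algorithm, rate in sorted(adoption_rate.items(), key=lambda kv: -kv[1]):
    print(f"{algorithm}: {rate:.0f}%")
print("median algorithms per respondent:", median_selected)
```

Because respondents can pick several options, the adoption rates need not sum to 100%, which is why the survey reports each algorithm's rate separately.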

Figure 2. Machine Learning Frameworks Used.

The survey included a question: "Which of the following machine learning frameworks do you use on a regular basis? Select all that apply." On average, data professionals used 2 (median) machine learning frameworks. The top 10 machine learning frameworks used were (see Figure 2):

Figure 3. Machine Learning Tools Used.

The survey also asked all data professionals about the machine learning tools they used. A little over half of the respondents (53%) indicated that they did not use any automated machine learning tools. The most used automated machine learning tools were (see Figure 3):

Figure 4. Machine Learning Products Used.

The survey also asked all data professionals about the machine learning products they used. A little over a third of the respondents (38%) indicated that they did not use any machine learning products. The most used machine learning products were (see Figure 4):

I conducted a principal components analysis of all the various machine learning utilities to identify groupings of these machine learning methods. I found a fairly clear 9-component solution:

Azure Machine Learning Studio stood out as the lone product as it did not load on any of the 9 components.

The pattern of results shows that the various machine learning methods tend to be used together. For example, when ML automation tools are used, data professionals tend to use all of them. Similarly, data professionals tend to use either all Google products or none of them. Data professionals who employ evolutionary approaches also tend to use generative adversarial networks.
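A principal components analysis groups items that are highly correlated with one another, so the "used together" pattern can be seen directly in pairwise correlations of usage indicators. The sketch below is not the author's analysis: it computes only a correlation matrix (not the full PCA), and the tool names and responses are invented for illustration.

```python
from math import sqrt

# Hypothetical usage indicators (1 = uses the tool), one row per respondent.
# Tool names and data are assumptions made up for this example.
tools = ["automl_tool_a", "automl_tool_b", "google_product_x", "google_product_y"]
usage = [
    [1, 1, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
]


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# One column per tool, then every pairwise correlation.
columns = list(zip(*usage))
correlation = [[pearson(ci, cj) for cj in columns] for ci in columns]

for name, row in zip(tools, correlation):
    print(f"{name:18s}", " ".join(f"{r:5.2f}" for r in row))
```

In this toy data the two automation tools are always used together, as are the two Google products, while the two groups are unrelated to each other. A component analysis of such a matrix would place each pair on its own component.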

The results of the Kaggle survey of nearly 20,000 data professionals reveal the most popular machine learning algorithms, products, tools and frameworks.

While machine learning is still a hot and growing field of data science, over a third of the respondents do not use any ML products. Top algorithms used were linear/logistic regression, decision trees/random forests and Gradient Boosting Machines. The most used machine learning frameworks were Scikit-learn and TensorFlow. Top tools for machine learning automation were related to model selection and data augmentation. The top products used were Google Cloud ML Engine, Azure ML Studio and Amazon SageMaker.

Putting AI and Machine Learning to Work in Cloud-Based BI and Analytics – AiThority

Machine learning (ML) in the cloud is powering a whole new generation of intelligent and predictive cloud analytics solutions like Azure Databricks and Azure Synapse. The benefits of cloud economics, tooling and flexibility, along with next-level insights to drive real-time business decisions, are the primary drivers behind the growing trend of on-premises data lake migrations to the cloud.

Cloud analytics services like Synapse are designed to collect and analyze current, actionable data, delivering insights into processes and workflows that can impact business operations. But what if you need those insights immediately, always accurate and up to date, and in the hands of employees and experts who are working simultaneously across the globe? IT stakeholders are turning to the cloud for faster, more accurate and timelier business insights, especially in the face of Covid-19, when companies are looking to operate as economically as possible and millions are forced into remote working locations.

Even before the pandemic, a 2019 survey by TechTarget found that 27% of respondents planned to deploy cloud analytics in 2020. That same study points to an increase in cloud technology as the number two activity that companies are employing to improve employee experience and productivity, and notes that 38% of companies planned to bolster their cloud technology in 2020. According to the experts we have spoken to at AWS and Azure, that number is higher today. Hindsight is also 20/20!

There are multiple reasons that organizations are moving their data lakes and analytics capabilities to the cloud. First among them is cost: the move streamlines a workforce, so even though there are start-up costs involved in the migration process, the long-term cost-benefit analysis plays out in their favor. Companies are also able to run faster and lighter with cloud analytics: there is no need to run dedicated client-side applications, and IT teams are freed from coordinating upgrades across an entire infrastructure. In our experience across our customer base at WANdisco and in working with CSPs like Azure and AWS, we have found that, on average, the total cost of ownership of a 1PB Hadoop data lake managed on premises over a three-year period is $2M. Managing that same 1PB in AWS S3 or Azure ADLS Gen 2 storage costs $900,000 over three years.
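The size of that gap is easy to work out from the two quoted figures. The short sketch below uses only the $2M and $900,000 three-year totals from the text; the variable names are illustrative, and the one-time migration cost mentioned above is not included because the text gives no figure for it.

```python
# Three-year total cost of ownership for 1PB, as quoted in the text (USD).
on_prem_tco = 2_000_000   # Hadoop data lake managed on premises
cloud_tco = 900_000       # same 1PB in cloud object storage
years = 3

# Absolute and relative savings over the three-year period.
savings = on_prem_tco - cloud_tco
savings_pct = 100 * savings / on_prem_tco

print(f"savings over {years} years: ${savings:,}")
print(f"reduction: {savings_pct:.0f}%")
print(f"per year: ${on_prem_tco / years:,.0f} on premises vs ${cloud_tco / years:,.0f} in the cloud")
```

On these figures the cloud option costs $1.1M less over three years, a 55% reduction before migration costs.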

The question is how to migrate that 1PB data lake most rapidly (time to value), with zero downtime, while ensuring the data stays consistent on premises and in the cloud during migration, because business-critical data is always changing. The architects and data teams have two choices.

They can use various flavors of open source DistCP tools and scripts, which is the manual approach to a data lake migration. Don't be fooled by fancy names from the Hadoop or cloud vendors. It's all DistCP under the covers. What's wrong with this approach? It's an IT project. And 61% of IT projects either fail or suffer cost and SLA overruns. Here's what you have to do in this scenario:

How long can this take? We have seen teams struggle for months and even years, depending on data volume and business requirements around acceptable application downtime, data availability and data consistency. We've seen companies put 8-10 people on projects, fail after 6 months, then pay $1M to a systems integrator and fail after another 9 months. OUCH.

There is a better way. And forward-looking companies like AMD, Daimler, and many others have figured it out. How?

By leveraging modern technology to automate data lake migration and replication to the cloud with WANdisco LiveData Cloud Services through its patented Distributed Coordination Engine platform.

This innovation is founded on fundamental IP which is based around forming consensus in a distributed network. This is an extremely hard problem to solve and to this day some people believe that it cannot be solved. So what is this problem at a high level? If you have a network of nodes, distributed across the world with little to no knowledge of the distance and bandwidth between the nodes, how can you get the nodes to coordinate between each other without worrying about any failure scenarios?

The solution is the application of a consensus algorithm, and the gold standard in consensus is an algorithm called Paxos. Our Chief Scientist, Dr. Yeturu Aahlad, an expert in distributed systems, devised the first, and even now only, commercialised version of Paxos. By doing so, he solved a problem that had been puzzling computer scientists for years.
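To give a feel for how a consensus algorithm lets nodes agree, here is a minimal in-memory sketch of textbook single-decree Paxos. It is not WANdisco's engine or its patented implementation: there is no networking, persistence or failure handling, and the class names, function names and proposed values are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Acceptor:
    promised: int = -1                    # highest proposal number promised so far
    accepted_n: int = -1                  # number of the proposal accepted, if any
    accepted_value: Optional[str] = None  # value of the proposal accepted, if any

    def prepare(self, n: int) -> Optional[Tuple[int, Optional[str]]]:
        """Phase 1: promise to ignore proposals numbered below n."""
        if n > self.promised:
            self.promised = n
            return (self.accepted_n, self.accepted_value)
        return None  # already promised to an equal or higher number

    def accept(self, n: int, value: str) -> bool:
        """Phase 2: accept unless a higher-numbered promise was made."""
        if n >= self.promised:
            self.promised = n
            self.accepted_n = n
            self.accepted_value = value
            return True
        return False


def propose(acceptors: List[Acceptor], n: int, value: str) -> Optional[str]:
    """Run one proposal round; return the chosen value or None on failure."""
    majority = len(acceptors) // 2 + 1
    promises = [p for p in (a.prepare(n) for a in acceptors) if p is not None]
    if len(promises) < majority:
        return None
    # If any acceptor already accepted a value, adopt the one with the
    # highest proposal number instead of our own value.
    prior_n, prior_value = max(promises, key=lambda p: p[0])
    if prior_value is not None:
        value = prior_value
    acks = sum(a.accept(n, value) for a in acceptors)
    return value if acks >= majority else None


acceptors = [Acceptor() for _ in range(5)]
first = propose(acceptors, n=1, value="apply change A")   # hypothetical values
second = propose(acceptors, n=2, value="apply change B")
print(first, "|", second)
```

The second proposer ends up confirming the first proposer's value rather than its own: once a majority has accepted a value, later rounds can only re-propose it, which is how the nodes stay consistent.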

WANdisco's LiveData Cloud Services are based on this core IP, including our products focused on analytical data and the challenge of migrating this data to the cloud and keeping the data consistent in multiple locations.

As businesses ask for data to be available in an increasingly decentralized environment, the old mechanisms to provide and manage data are no longer sufficient. Moreover, the amount of data is rising exponentially, which leads to a phenomenon called data gravity. The larger the volume of data, the harder it is to provide it in a distributed environment, allow changes to the data in any environment, and ensure it remains consistent across all environments. Additionally, regulation and compliance requirements make it even more challenging for data managers to fulfil business needs.

As enterprises look to leverage the scale and economics of the cloud, WANdisco offers a fundamentally different approach to managing these large volumes of data, accelerating the ability of enterprises to undergo digital transformation.

Here's what Merv Adrian, Research VP of Data and Analytics at Gartner, had to say: "WANdisco's ability to move petabytes of data without interrupting production and without risk of losing the data midflight is something no other vendor does and, until now, has been virtually impossible to accomplish."

The Bottom Line

Cloud computing has completely transformed entire industries, computing paradigms and enterprises, and has become the ideal for storing and accessing big data sets. The Covid-19 pandemic has only accelerated this move, given the need to operate as economically as possible with more employees working remotely. Cloud computing saves both money and time, which makes it immediately attractive to businesses, while also increasing access for global companies, providing a synergic platform for coordination and cooperation between far-flung employees. 85% of the Fortune 500 have moved to the cloud and continue to do so. The migration of static data has been easy. The challenge now is how to quickly migrate and replicate large on-premises data lakes and applications to the cloud when the data is business critical and application downtime, data loss and inconsistencies cannot be tolerated. The good news is that there is now a better way, via automated migration and replication that delivers 10X faster time to value and is 100% safer, while ensuring zero downtime during migration.

IDTechEx Report Suggests Machine Learning will be Accessible across Chemical and Materials Companies in the Future – CIO Applications

FREMONT, CA: Machine learning has rapidly become an essential part of every industry. Material scientists and chemists will all have access to machine learning tools to enhance their Research & Development in the future. Seamlessly integrating these underlying operations will not happen quickly, but overlooking the developments in materials informatics will lead to a loss of competitive advantage.

Material Informatics (MI) is a data-centric approach applicable to specific material science and chemistry R&D. Without a doubt, this will become a standard method in a research scientist's toolkit. Instead of just grabbing headlines, some form of MI will be assumed in all developments. The key to MI lies in the integration, implementation, and manipulation of data infrastructures, as well as machine learning approaches designed for chemical and materials datasets.

There is a significant amount of evidence to support this. However, the best backing is how the industries are responding to the technology. There has been a large amount of activity over recent years, including partnerships, investments, and announcements from some of the most notable chemical and materials companies.

Machine learning, by itself, can be used in various kinds of projects: finding new structure-property relationships, proposing new candidates or process conditions, reducing the number of expensive and time-consuming computer simulations, and more. Machine learning approaches can take numerous forms, including supervised and unsupervised learning methods. Generative methods can be effective at screening for optimized outputs across organic compounds. At the same time, even simple modified random forest models can be useful for proposing follow-on reactions to meet a desired set of criteria.

However, this is still at an early stage and requires a lot more development. There is a lot to be leveraged from existing developments in AI, but doing so will first require integrating specialist domain knowledge and coping with the unique challenges of a materials dataset. The application space is broad, and studies have shown success in areas ranging from organometallics, thermoelectrics and nanomaterials to ceramics and many more.

inPowered Selected by ANA as Winner of ‘Best Use of AI/Machine Learning’ Category at 2020 B2 Awards – Yahoo Finance

Content Marketing Has an ROI Problem & AI Can Fix That

SAN FRANCISCO, July 28, 2020 /PRNewswire/ -- inPowered, the AI platform delivering business outcomes with content marketing, was awarded top honors in the "Best Use of AI/Machine Learning" category at the 2020 Association of National Advertisers' B2 Awards. This marks the first time that inPowered has received this accolade from the ANA, one of the most highly regarded organizations within the advertising and marketing space.

The entry, titled "Content Marketing has an ROI Problem & AI Can Solve That," discussed the current pain point surrounding measurement and ROI that continues to frustrate marketers. inPowered has challenged the industry standard of evaluating success based on "CPC" or "CPM" by inventing a new content economy to measure KPIs, one that concentrates on consumer engagement versus clicks and impressions. Powered by an artificial intelligence (AI) engine, inPowered's proprietary technology doesn't optimize for clicks but instead for interactions that last a minimum of 15 seconds with each piece of content. This focus on authentically engaged users allows data collected from the technology to guide consumers towards post-click engagement and next-action business outcomes, resulting in a digital funnel entirely optimized for achieving real results and establishing concrete key performance indicators at the lowest cost per engagement.

"Since inception our mission has been to deliver real business outcomes with content marketing, as opposed to the vanity metrics like clicks and impressions that come from display advertising," said Peyman Nilforoush, CEO and Co-Founder at inPowered. "This award from the ANA highlights the enormous opportunity for brands to achieve real ROI with content marketing by utilizing AI-powered content distribution, instead of DSPs or ad-network buys that result in expensive costs per visit, low times on-site and high bounce rates from un-engaged users."

The Association of National Advertisers had its biggest year yet for submissions to the 2020 B2 Awards, receiving hundreds of entries across more than three dozen categories. As the largest and oldest marketing organization in the United States, the ANA has a mission to drive growth for marketing professionals, brands and businesses, and for the industry as a whole. "B2B marketing is a cornerstone of our industry, and these awards honor the best and the brightest in the business," said Bob Liodice, Chief Executive Officer at the ANA.

ABOUT INPOWERED:

inPowered is the AI platform built to deliver business outcomes with content marketing. Using inPowered's artificial intelligence-powered technology, brands are able to increase the ROI of their content marketing initiatives by optimizing advertising spend towards the lowest cost across channels, as well as by placing calls to action at optimized times to convert already-engaged audiences into tangible business outcomes. The company was founded in 2014 by Peyman Nilforoush and Pirouz Nilforoush after selling their previous company to Ziff Davis. http://www.inpwrd.com

MEDIA CONTACT:

Chelsea Waite, Director of Communications
(415) 968-9859
chelsea.waite@inpwrd.com

View original content: http://www.prnewswire.com/news-releases/inpowered-selected-by-ana-as-winner-of-best-use-of-aimachine-learning-category-at-2020-b2-awards-301101420.html

SOURCE inPowered
