Archive for the ‘Machine Learning’ Category

Unlocking the potential of IoT systems: The role of Deep Learning … – Innovation News Network

The Internet of Things (IoT), a network of interconnected devices equipped with sensors and software, has revolutionised how we interact with the world around us, empowering us to collect and analyse data like never before.

As technology advances and becomes more accessible, more objects are equipped with connectivity and sensor capabilities, making them part of the IoT ecosystem. The number of active IoT devices is expected to reach 29.7 billion by 2027, a significant surge from the 3.6 billion devices recorded in 2015. This exponential growth creates a tremendous demand for solutions to mitigate the safety and computational challenges of IoT applications. In particular, industrial IoT, automotive, and smart homes are three main areas with specific requirements, but they share a common need for efficient IoT systems to enable optimal functionality and performance.

Increasing the efficiency of IoT systems and unlocking their potential can be achieved through Artificial Intelligence (AI), creating AIoT architectures. By utilising sophisticated algorithms and Machine Learning techniques, AI empowers IoT systems to make intelligent decisions, process vast amounts of data, and extract valuable insights. For instance, this integration drives operational optimisation in industrial IoT, facilitates advanced autonomous vehicles, and offers intelligent energy management and personalised experiences in smart homes.

Among the different AI algorithms, Deep Learning, which leverages artificial neural networks, is particularly well suited to IoT systems for several reasons. One of the primary reasons is its ability to learn and extract features automatically from raw sensor data. This is particularly valuable in IoT applications where the data can be unstructured, noisy, or contain complex relationships. Additionally, Deep Learning enables IoT applications to handle real-time and streaming data efficiently. This ability allows for continuous analysis and decision-making, which is crucial in time-sensitive applications such as real-time monitoring, predictive maintenance, or autonomous control systems.

Despite the numerous advantages of Deep Learning for IoT systems, its implementation has inherent challenges, such as efficiency and safety, that must be addressed to fully leverage its potential. The Very Efficient Deep Learning in IoT (VEDLIoT) project aims to solve these challenges.

A high-level overview of the different VEDLIoT components is given in Fig. 1. The VEDLIoT project integrates Deep Learning with IoT to accelerate applications and optimise the energy efficiency of IoT systems. VEDLIoT achieves these objectives through the utilisation of several key components:

VEDLIoT concentrates on several use cases, such as demand-oriented interaction methods in smart homes (see Fig. 2), industrial IoT applications like Motor Condition Classification and Arc Detection, and the Pedestrian Automatic Emergency Braking (PAEB) system in the automotive sector (see Fig. 3). VEDLIoT systematically optimises such use cases through a bottom-up approach by employing requirement engineering and verification techniques, as shown in Fig. 1. The project combines expert-level knowledge from diverse domains to create a robust middleware that facilitates development through testing, benchmarking, and deployment frameworks, ultimately ensuring the optimisation and effectiveness of Deep Learning algorithms within IoT systems. In the following sections, we briefly present each component of the VEDLIoT project.

Various accelerators are available for a wide range of applications, from small embedded systems with power budgets in the milliwatt range to high-power cloud platforms. These accelerators are categorised into three main groups based on their peak performance values, as shown in Fig. 4.

The first group is the ultra-low power category (< 3 W), which consists of energy-efficient microcontroller-style cores combined with compact accelerators for specific Deep Learning functions. These accelerators are designed for IoT applications and offer simple interfaces for easy integration. Some accelerators in this category provide camera or audio interfaces, enabling efficient vision or sound processing tasks. They may offer a generic USB interface, allowing them to function as accelerator devices attached to a host processor. These ultra-low power accelerators are ideal for IoT applications where energy efficiency and compactness are key considerations, providing optimised performance for Deep Learning tasks without excessive power.

The VEDLIoT predictive maintenance use case is a good example and makes use of an ultra-low power accelerator. One of the most important design criteria is low power consumption, as the device is a small battery-powered box that can be installed externally on any electric motor and should monitor the motor for at least three years without a battery change.

The next category is the low-power group (3 W to 35 W), which targets a broad range of automation and automotive applications. These accelerators feature high-speed interfaces for external memories and peripherals, as well as efficient interfaces such as PCIe for communication with other processing devices or host systems. They support modular and microserver-based approaches and provide compatibility with various platforms. Additionally, many accelerators in this category incorporate powerful application processors capable of running full Linux operating systems, allowing for flexible software development and integration. Some devices in this category include dedicated application-specific integrated circuits (ASICs), while others feature NVIDIA's embedded graphics processing units (GPUs). These accelerators balance power efficiency and processing capabilities, making them well-suited for various compute-intensive tasks in the automation and automotive domains.

The high-performance category (> 35 W) of accelerators is designed for demanding inference and training scenarios in edge and cloud servers. These accelerators offer exceptional processing power, making them suitable for computationally-intensive tasks. They are commonly deployed as PCIe extension cards and provide high-speed interfaces for efficient data transfer. The devices in this category have high thermal design powers (TDPs), indicating their ability to handle significant workloads. These accelerators include dedicated ASICs, known for their specialised performance in Deep Learning tasks. They deliver accelerated processing capabilities, enabling faster inference and training times. Some consumer-class GPUs may also be included in benchmarking comparisons to provide a broader perspective.

Selecting the proper accelerator from the wide range of options mentioned above is not straightforward. However, VEDLIoT takes on this crucial responsibility by conducting thorough assessments and evaluations of various architectures, including GPUs, field-programmable gate arrays (FPGAs), and ASICs. The project carefully examines these accelerators' performance and energy consumption to ensure their suitability for specific use cases. By leveraging its expertise and comprehensive evaluation process, VEDLIoT guides the selection of Deep Learning accelerators within the project and in the broader landscape of IoT and Deep Learning applications.

Trained Deep Learning models contain redundancy and can sometimes be compressed by a factor of up to 49 with negligible accuracy loss. Although many works address such compression, most report theoretical speed-ups that do not always translate into more efficient hardware execution because they do not consider the target hardware. Moreover, deploying Deep Learning models on edge devices involves several steps, such as training, optimisation, compilation, and runtime. Although various frameworks are available for these steps, their interoperability can vary, resulting in different outcomes and performance levels. VEDLIoT addresses these challenges through hardware-aware model optimisation using ONNX, an open format for representing Machine Learning models, ensuring compatibility with the current open ecosystem. Additionally, Renode, an open-source simulation framework, serves as a functional simulator for complex heterogeneous systems, allowing for the simulation of complete System-on-Chips (SoCs) and the execution of the same software used on hardware.
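As a rough illustration of this kind of workflow (a minimal sketch rather than VEDLIoT's own tooling, with a toy model and hypothetical file and tensor names), a trained PyTorch model can be exported to the ONNX format so that downstream, hardware-specific compilers and runtimes can consume it:

import torch
import torch.nn as nn

# Stand-in for a trained model operating on windows of raw sensor data.
model = nn.Sequential(
    nn.Linear(64, 32),
    nn.ReLU(),
    nn.Linear(32, 4),
)
model.eval()

# Export to ONNX; the dummy input only fixes the expected input shape.
dummy_input = torch.randn(1, 64)
torch.onnx.export(
    model,
    dummy_input,
    "sensor_classifier.onnx",      # hypothetical output file name
    input_names=["sensor_window"],
    output_names=["class_scores"],
    opset_version=13,
)

The resulting .onnx file can then be passed to hardware-aware optimisation and benchmarking steps, or executed inside a functional simulation of the target system.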

Furthermore, VEDLIoT uses the EmbeDL toolkit to optimise Deep Learning models. The EmbeDL toolkit offers comprehensive tools and techniques to optimise Deep Learning models for efficient deployment on resource-constrained devices. By considering hardware-specific constraints and characteristics, the toolkit enables developers to compress, quantise, prune, and optimise models while minimising resource utilisation and maintaining high inference accuracy. EmbeDL focuses on hardware-aware optimisation and ensures that Deep Learning models can be effectively deployed on edge devices and IoT devices, unlocking the potential for intelligent applications in various domains. With EmbeDL, developers can achieve superior performance, faster inference, and improved energy efficiency, making it an essential resource for those seeking to maximise the potential of Deep Learning in real-world applications.
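EmbeDL's own API is not shown here, but the kinds of optimisations it automates can be sketched with generic PyTorch utilities. The sketch below (illustrative only, on a toy model) prunes half of the smallest-magnitude weights in each linear layer and then applies 8-bit dynamic quantisation for inference:

import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# Toy model standing in for a trained network destined for an edge device.
model = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 4))

# Pruning: zero out the 50% smallest-magnitude weights in each Linear layer.
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.5)
        prune.remove(module, "weight")   # make the pruning permanent

# Quantisation: run Linear layers with 8-bit weights at inference time.
quantised_model = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)
print(quantised_model)

In a hardware-aware workflow, choices such as the pruning ratio and the quantisation scheme would be driven by what the target accelerator actually supports.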

Since VEDLIoT aims to combine Deep Learning with IoT systems, ensuring security and safety becomes crucial. To address these aspects at its core, the project leverages trusted execution environments (TEEs), such as Intel SGX and ARM TrustZone, along with open-source runtimes like WebAssembly. TEEs provide secure environments that isolate critical software components and protect against unauthorised access and tampering. By using WebAssembly, VEDLIoT offers a common execution environment throughout the entire continuum, from IoT, through the edge, and into the cloud.

In the context of TEEs, VEDLIoT introduces Twine and WaTZ as trusted runtimes for Intel's SGX and ARM's TrustZone, respectively. These runtimes simplify software creation within secure environments by leveraging WebAssembly and its modular interface. This integration bridges the gap between trusted execution environments and AIoT, helping to seamlessly integrate Deep Learning frameworks. Within TEEs using WebAssembly, VEDLIoT achieves hardware-independent, robust protection against malicious interference, preserving the confidentiality of both data and Deep Learning models. This integration highlights VEDLIoT's commitment to securing critical software components, enabling secure development, and facilitating privacy-enhanced AIoT applications in cloud-edge environments.

Additionally, VEDLIoT employs a specialised architectural framework, as shown in Fig. 5, that helps to define, synchronise and co-ordinate requirements and specifications of AI components and traditional IoT system elements. This framework consists of various architectural views that address the system's specific design concerns and quality aspects, including security and ethical considerations. By using these architecture views as templates and filling them out, correspondences and dependencies can be identified between the quality-defining architecture views and other design decisions, such as AI model construction, data selection, and communication architecture. This holistic approach ensures that security and ethical aspects are seamlessly integrated into the overall system design, reinforcing VEDLIoT's commitment to robustness and addressing emerging challenges in AI-enabled IoT systems.

Traditional hardware platforms support only homogeneous IoT systems. In contrast, RECS, an AI-enabled microserver hardware platform, allows for the seamless integration of diverse technologies. This enables fine-tuning of the platform towards specific applications, providing a comprehensive cloud-to-edge platform. All RECS variants share the same design paradigm: a densely coupled, highly integrated communication infrastructure. The different RECS variants use different microserver sizes, from credit card size to tablet size, allowing customers to choose the best variant for each use case and scenario. Fig. 6 gives an overview of the RECS variants.

The three different RECS platforms are suitable for cloud/data centre (RECS|Box), edge (t.RECS) and IoT usage (u.RECS). All RECS servers use industry-standard microservers, which are exchangeable and allow for use of the latest technology just by changing a microserver. Hardware providers of these microservers offer a wide spectrum of different computing architectures like Intel, AMD and ARM CPUs, FPGAs and combinations of a CPU with an embedded GPU or AI accelerator.

VEDLIoT addresses the challenge of bringing Deep Learning to IoT devices with limited computing performance and low-power budgets. The VEDLIoT AIoT hardware platform provides optimised hardware components and additional accelerators for IoT applications covering the entire spectrum, from embedded via edge to the cloud. On the other hand, a powerful middleware is employed to ease the programming, testing, and deployment of neural networks in heterogeneous hardware. New methodologies for requirement engineering, coupled with safety and security concepts, are incorporated throughout the complete framework. The concepts are tested and driven by challenging use cases in key industry sectors like automotive, automation, and smart homes.

Please note, this article will also appear in the fifteenth edition of our quarterly publication.

See the rest here:
Unlocking the potential of IoT systems: The role of Deep Learning ... - Innovation News Network

This AI Research Addresses the Problem of ‘Loss of Plasticity’ in … – MarkTechPost

Modern deep-learning algorithms are focused on problem settings where training occurs just once on a sizable data collection and never again; all of the early triumphs of deep learning in speech recognition and image classification employed such train-once settings. Replay buffers and batching were later added when deep learning was applied to reinforcement learning, keeping it very close to a train-once setting. A large batch of data was also used to train recent deep learning systems such as GPT-3 and DALL-E. The most popular approach in these situations has been to gather data continuously and then occasionally train a new network from scratch. Of course, in many applications the data distribution varies over time and training must continue in some manner; nevertheless, modern deep-learning techniques were developed with the train-once setting in mind.

In contrast, the continual learning setting focuses on learning continuously from fresh data. It is ideal for problems where the learning system must deal with a dynamic data stream. For instance, think of a robot that has to find its way around a house. Under the train-once setting, the robot would have to be retrained from scratch, or risk being rendered useless, every time the house's layout changed. Under the continual learning scenario, by contrast, the robot could simply learn from the new information and continuously adjust to the changes in the house. The importance of lifelong learning has grown in recent years, and specialized conferences are being held to address it, such as the Conference on Lifelong Learning Agents (CoLLAs).

The authors emphasize the continual learning setting in their paper. When exposed to fresh data, deep learning systems frequently lose most of what they have previously learned, a condition known as catastrophic forgetting. In other words, deep learning techniques do not retain stability in continual learning problems. Early neural networks were the first to demonstrate this behavior, in work dating back to the late twentieth century. Catastrophic forgetting has recently received fresh interest due to the rise of deep learning, and several articles have been written about preserving stability in deep continual learning.

The capacity to continue learning from fresh material is distinct from catastrophic forgetting and perhaps even more essential to continual learning. The authors call this capacity plasticity. Continual learning systems must maintain plasticity because it enables them to adjust to changes in their data streams; systems that lose plasticity may become useless if their data stream changes. The paper emphasizes this problem of plasticity loss. Earlier studies employed a configuration in which the network was first shown a collection of examples for a predetermined number of epochs, after which the training set was enlarged with new examples and training was repeated for an additional number of epochs. After accounting for the number of epochs, they discovered that the error for the examples in the first training set was lower than for the later-added examples. These publications offered evidence that loss of plasticity in deep learning, and in the backpropagation algorithm upon which it is based, is a common occurrence.
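The protocol is easy to picture in code. The sketch below (a minimal illustration on synthetic data with scikit-learn, not the paper's actual setup, and not expected to reproduce its measurements) trains a small network on an initial set of examples, then enlarges the training set, continues training, and finally compares the error on the original versus the later-added examples:

import numpy as np
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_old, y_old = X[:1000], y[:1000]   # examples shown in the first phase
X_new, y_new = X[1000:], y[1000:]   # examples added later

# warm_start=True makes each call to fit() continue from the current weights,
# so each call acts roughly like one additional epoch of training.
net = MLPClassifier(hidden_layer_sizes=(64,), max_iter=1, warm_start=True)

for epoch in range(50):             # phase 1: original training set only
    net.fit(X_old, y_old)

X_all = np.vstack([X_old, X_new])
y_all = np.concatenate([y_old, y_new])
for epoch in range(50):             # phase 2: enlarged training set
    net.fit(X_all, y_all)

# The paper additionally controls for the number of epochs each subset receives;
# this sketch only illustrates the two-phase procedure itself.
print("error on original examples:   ", 1 - net.score(X_old, y_old))
print("error on later-added examples:", 1 - net.score(X_new, y_new))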

In the configuration used by those earlier studies, new outputs, known as heads, were added to the network whenever a new task was presented, so the number of outputs grew as more tasks were encountered. Thus, the effects of interference from old heads were mixed up with the consequences of plasticity loss. According to Chaudhry et al., the loss of plasticity was modest when old heads were removed at the beginning of a new task, indicating that the major cause of the loss they observed was interference from old heads. Moreover, because previous researchers only employed around ten tasks, they could not measure the loss of plasticity that occurs when deep learning techniques are presented with a long sequence of tasks.

Although the findings in these publications suggest that deep learning systems lose some of their essential adaptability, none has yet conclusively demonstrated loss of plasticity in continual learning. There is more evidence for the loss of plasticity in contemporary deep learning in the reinforcement learning field, where recent works have demonstrated a significant loss of plasticity. By demonstrating that early learning in reinforcement learning problems can have a negative impact on later learning, Nikishin et al. coined the term primacy bias.

Given that reinforcement learning is fundamentally continual as a consequence of changes in the policy, this result may be attributable to deep learning networks losing their plasticity in settings where learning is ongoing. Additionally, Lyle et al. demonstrated that some deep reinforcement learning agents may eventually lose their capacity to pick up new skills. These are significant data points, but because of the intricacy of contemporary deep reinforcement learning, it is not easy to draw firm conclusions. These studies, ranging from the psychology literature around the turn of the century to more contemporary work in machine learning and reinforcement learning, show that deep learning systems lose plasticity but fall short of providing a complete explanation of the phenomenon. In this study, researchers from the Department of Computing Science at the University of Alberta and the CIFAR AI Chair at the Alberta Machine Intelligence Institute provide a more conclusive answer on plasticity loss in contemporary deep learning.

They demonstrate that persistent supervised learning problems cause deep learning approaches to lose plasticity and that this loss can be severe. In a continual supervised learning problem using the ImageNet dataset and spanning hundreds of learning trials, they first show that deep learning suffers from loss of plasticity. Using supervised learning tasks eliminates the complexity and confusion that inevitably arise in reinforcement learning, and the hundreds of tasks make it possible to determine the full extent of the loss of plasticity. They then establish the generality of deep learning's loss of plasticity across a wide range of hyperparameters, optimizers, network sizes, and activation functions using two computationally less expensive problems (a variation of MNIST and the slowly changing regression problem). After demonstrating the severity and generality of the loss of plasticity in deep learning, they seek a deeper understanding of its origins.

Aneesh Tickoo is a consulting intern at MarktechPost. He is currently pursuing his undergraduate degree in Data Science and Artificial Intelligence at the Indian Institute of Technology (IIT), Bhilai. He spends most of his time working on projects aimed at harnessing the power of machine learning. His research interest is image processing, and he is passionate about building solutions around it. He loves to connect with people and collaborate on interesting projects.

Read more:
This AI Research Addresses the Problem of 'Loss of Plasticity' in ... - MarkTechPost

What Is Kernel In Machine Learning And How To Use It? – Dataconomy

The concept of a kernel in machine learning might initially sound perplexing, but it's a fundamental idea that underlies many powerful algorithms. Mathematical foundations like this one support the working principles of the automated systems that make up a large part of our daily lives.

Kernels in machine learning serve as a bridge between linear and nonlinear transformations. They enable algorithms to work with data that doesn't exhibit linear separability in its original form. Think of kernels as mathematical functions that take in data points and output their relationships in a higher-dimensional space. This allows algorithms to uncover intricate patterns that would otherwise be overlooked.

So how can you use kernels in machine learning for your own algorithm? Which type should you prefer? What do these choices change in your machine learning algorithm? Let's take a closer look.

At its core, a kernel is a function that computes the similarity between two data points. It quantifies how closely related these points are in the feature space. By applying a kernel function, we implicitly transform the data into a higher-dimensional space where it might become linearly separable, even if it wasn't in the original space.

There are several types of kernels, each tailored to specific scenarios:

The linear kernel is the simplest form of kernel in machine learning. It operates by calculating the dot product between two data points. In essence, it measures how aligned these points are in the feature space. This might sound straightforward, but its implications are powerful.

Imagine you have data points in a two-dimensional space. The linear kernel calculates the dot product of the feature values of these points. If the result is high, it signifies that the two points have similar feature values and are likely to belong to the same class. If the result is low, it suggests dissimilarity between the points.

The linear kernel's magic lies in its ability to establish a linear decision boundary in the original feature space. It's effective when your data can be separated by a straight line. However, when data isn't linearly separable, that's where other kernels come into play.
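To make this concrete, here is a minimal sketch (plain NumPy, with illustrative values) of the linear kernel as a dot product:

import numpy as np

def linear_kernel(x, y):
    # The linear kernel is simply the dot product of the two feature vectors.
    return np.dot(x, y)

a = np.array([2.0, 1.0])
b = np.array([1.5, 0.8])    # similar feature values
c = np.array([-2.0, -1.0])  # points in the opposite direction

print(linear_kernel(a, b))  # 3.8  -> high similarity
print(linear_kernel(a, c))  # -5.0 -> dissimilar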

The polynomial kernel in machine learning introduces a layer of complexity by applying polynomial transformations to the data points. It's designed to handle situations where a simple linear separation isn't sufficient.

Imagine you have a scatter plot of data points that cant be separated by a straight line. Applying a polynomial kernel might transform these points into a higher-dimensional space, introducing curvature. This transformation can create intricate decision boundaries that fit the data better.

For example, in a two-dimensional space, a polynomial kernel of degree 2 would generate new features like x^2, y^2, and xy. These new features can capture relationships that weren't evident in the original space. As a result, the algorithm can find a curved boundary that separates classes effectively.
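A short sketch (plain NumPy, hypothetical points) shows the idea: a degree-2 polynomial kernel gives exactly the same value as a dot product in the expanded feature space containing the squared terms and the cross term, without ever building those features explicitly:

import numpy as np

def poly2_kernel(x, y):
    # Homogeneous polynomial kernel of degree 2.
    return np.dot(x, y) ** 2

def phi(v):
    # Explicit degree-2 feature map; the sqrt(2) on the cross term makes
    # the dot product of mapped vectors match the kernel value exactly.
    x1, x2 = v
    return np.array([x1 ** 2, x2 ** 2, np.sqrt(2) * x1 * x2])

x = np.array([1.0, 2.0])
y = np.array([3.0, 0.5])

print(poly2_kernel(x, y))      # 16.0 -- computed directly from the kernel
print(np.dot(phi(x), phi(y)))  # 16.0 -- identical, via explicit features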

The Radial Basis Function (RBF) kernel in machine learning is one of the most widely used kernels in the training of algorithms. It capitalizes on the concept of similarity by creating a measure based on Gaussian distributions.

Imagine data points scattered in space. The RBF kernel computes the similarity between two points by treating them as centers of Gaussian distributions. If two points are close, their Gaussian distributions will overlap significantly, indicating high similarity. If they are far apart, the overlap will be minimal.

This notion of similarity is powerful in capturing complex patterns in data. In cases where data points are related but not linearly separable, the usage of RBF kernel in machine learning can transform them into a space where they become more distinguishable.
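A minimal sketch (plain NumPy, with an arbitrary gamma value) of this similarity measure:

import numpy as np

def rbf_kernel(x, y, gamma=0.5):
    # Similarity decays with the squared distance between the points.
    return np.exp(-gamma * np.sum((x - y) ** 2))

a = np.array([0.0, 0.0])
b = np.array([0.2, 0.1])   # nearby point
c = np.array([4.0, 3.0])   # distant point

print(rbf_kernel(a, b))    # ~0.975   -> strong overlap, high similarity
print(rbf_kernel(a, c))    # ~3.7e-06 -> almost no overlap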

The sigmoid kernel in machine learning serves a unique purpose: it's used for transforming data into a space where linear separation becomes feasible. This is particularly handy when you're dealing with data that can't be separated by a straight line in its original form.

Imagine data points that can't be divided into classes using a linear boundary. The sigmoid kernel comes to the rescue by mapping these points into a higher-dimensional space using a sigmoid function. In this transformed space, a linear boundary might be sufficient to separate the classes effectively.

The sigmoid kernel's transformation can be thought of as bending and shaping the data in a way that simplifies classification. However, it's important to note that while the sigmoid kernel in machine learning can be useful, it might not be as commonly employed as the linear, polynomial, or RBF kernels.
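A minimal sketch (plain NumPy, with arbitrary parameter values) of the sigmoid kernel, which passes a scaled dot product through a tanh function:

import numpy as np

def sigmoid_kernel(x, y, gamma=1.0, coef0=0.0):
    # Scaled dot product squashed into (-1, 1) by tanh.
    return np.tanh(gamma * np.dot(x, y) + coef0)

a = np.array([0.5, 1.0])
b = np.array([1.0, 0.5])
c = np.array([-1.0, -0.5])

print(sigmoid_kernel(a, b))  # ~0.76  (aligned points)
print(sigmoid_kernel(a, c))  # ~-0.76 (opposed points)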

Kernels are the heart of many machine learning algorithms, allowing them to work with nonlinear and complex data. The linear kernel suits cases where a straight line can separate classes. The polynomial kernel adds complexity by introducing polynomial transformations. The RBF kernel measures similarity based on Gaussian distributions, excelling in capturing intricate patterns. Lastly, the sigmoid kernel transforms data to enable linear separation when it wasn't feasible before. By understanding these kernels, data scientists can choose the right tool to unlock patterns hidden within data, enhancing the accuracy and performance of their models.

Kernels, the unsung heroes of AI and machine learning, wield their transformative magic through algorithms like Support Vector Machines (SVM). This article takes you on a journey through the intricate dance of kernels and SVMs, revealing how they collaboratively tackle the conundrum of nonlinear data separation.

Support Vector Machines, a category of supervised learning algorithms, have garnered immense popularity for their prowess in classification and regression tasks. At their core, SVMs aim to find the optimal decision boundary that maximizes the margin between different classes in the data.

Traditionally, SVMs are employed in a linear setting, where a straight line can cleanly separate the data points into distinct classes. However, the real world isn't always so obliging, and data often exhibits complexities that defy a simple linear separation.

This is where kernels come into play, ushering SVMs into the realm of nonlinear data. Kernels provide SVMs with the ability to project the data into a higher-dimensional space where nonlinear relationships become more evident.

The transformation accomplished by kernels extends SVMs' capabilities beyond linear boundaries, allowing them to navigate complex data landscapes.

Let's walk through the process of using kernels with SVMs to harness their full potential.

Imagine you're working with data points on a two-dimensional plane. In a linearly separable scenario, a straight line can effectively divide the data into different classes. Here, a standard linear SVM suffices, and no kernel is needed.

However, not all data is amenable to linear separation. Consider a scenario where the data points are intertwined, making a linear boundary inadequate. This is where kernels in machine learning step in to save the day.

You have a variety of kernels at your disposal, each suited for specific situations. Let's take the Radial Basis Function (RBF) kernel as an example. This kernel calculates the similarity between data points based on Gaussian distributions.

By applying the RBF kernel, you transform the data into a higher-dimensional space where previously hidden relationships are revealed.

In this higher-dimensional space, SVMs can now establish a linear decision boundary that effectively separates the classes. What's remarkable is that this linear boundary in the transformed space corresponds to a nonlinear boundary in the original data space. It's like bending and molding reality to fit your needs.
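The whole workflow takes only a few lines with scikit-learn. The sketch below (an illustrative example on the synthetic two-moons dataset, not tied to any particular real-world problem) compares a linear SVM with an RBF-kernel SVM on data that a straight line cannot separate:

from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Two interleaving half-circles: not separable by a straight line.
X, y = make_moons(n_samples=500, noise=0.15, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

linear_svm = SVC(kernel="linear").fit(X_train, y_train)
rbf_svm = SVC(kernel="rbf", gamma="scale").fit(X_train, y_train)

print("linear kernel accuracy:", linear_svm.score(X_test, y_test))
print("RBF kernel accuracy:   ", rbf_svm.score(X_test, y_test))

On data like this, the RBF-kernel SVM typically achieves noticeably higher accuracy than the linear SVM, because its decision boundary in the original space can curve around the interleaving classes.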

Kernels bring more than just visual elegance to the table. They enhance SVMs in several crucial ways:

Handling complexity: Kernels in machine learning enable SVMs to handle data that defies linear separation. This is invaluable in real-world scenarios where data rarely conforms to simplistic structures.

Unleashing insights: By projecting data into higher-dimensional spaces, kernels can unveil intricate relationships and patterns that were previously hidden. This leads to more accurate and robust models.

Flexible decision boundaries: Kernels in machine learning grant the flexibility to create complex decision boundaries, accommodating the nuances of the data distribution. This flexibility allows for capturing even the most intricate class divisions.

Kernels in machine learning are like hidden gems. They unveil the latent potential of data by revealing intricate relationships that may not be apparent in its original form. By enabling algorithms to perform nonlinear transformations effortlessly, kernels elevate the capabilities of machine learning models.

Understanding kernels empowers data scientists to tackle complex problems across domains, driving innovation and progress in the field. As we journey further into machine learning, lets remember that kernels are the key to unlocking hidden patterns and unraveling the mysteries within data.

Featured image credit: rawpixel.com/Freepik.

Originally posted here:
What Is Kernel In Machine Learning And How To Use It? - Dataconomy

Reddit Expands Machine Learning Tools To Help Advertisers Find … – B&T

Reddit has introduced Keyword Suggestions, a tool for advertisers that applies machine learning to help expand their keyword lists, recommending relevant and targetable keywords while filtering out keywords that aren't brand suitable.

The new system is available via the Reddit Ads Manager and ranks each suggestion by monthly views, and opens up an expanded list of relevant targeting possibilities to increase the reach and efficiency of campaigns.

The tool is powered by advanced machine learning and natural language processing to find the most relevant terms.

This technology takes the original context of each keyword into consideration so that only those existing in a brand-safe and suitable environment are served to advertisers.

In practice, this means machine learning is doing the heavy lifting, pulling from the Reddit posts and conversations that best match each advertiser's specific needs. Most importantly, this allows advertisers to show the most relevant ads to the Reddit users who will be most interested in them.

"The promise and potential of artificial intelligence, while exciting, has also elevated the value of real, human interactions and interests for both consumers and marketers. As we enter a new chapter in our industry and evolve beyond traditional signals, interest-based, contextually relevant targeting will be the most effective way to reach people where they're most engaged," said Jim Squires, Reddit's EVP of business marketing and growth.

"Powered by Reddit's vast community of communities, which are segmented by interest and populated with highly engaged discussions, Keyword Suggestions leverages the richness of conversation on Reddit and provides advertisers with recommendations to easily and effectively target relevant audiences on our platform."

The platform has also boosted its interest-based targeting tools with twice the number of categories available for targeting.

"Reddit's continued focus on enhancing their targeting products via machine learning will certainly help advertisers reach more of their target audience and discover new audiences on the platform. Additionally, implementing negative keyword targeting strategies overall increases relevancy and improves performance," said GroupM vice president and global head of social, Amanda Grant.

"Given the rich nature of conversations on the Reddit platform, we expect improved business outcomes as we tap into these tools to refine our focus on the right audience."

Read the original here:
Reddit Expands Machine Learning Tools To Help Advertisers Find ... - B&T

Seattle startup that helps companies protect their AI and machine learning code raises $35M – GeekWire

From left: Protect AI CEO Ian Swanson, CTO Badar Ahmed, and president Daryan Dehghanpisheh. (Protect AI Photo)

Seattle cybersecurity startup Protect AI landed $35 million to boost the rollout of its platform that helps enterprises shore up their machine learning code.

Protect AI sells software that allows companies to monitor the various layers and components of machine learning systems, detecting potential violations and logging information on those attacks. It primarily sells to large enterprises in regulated industries including finance, healthcare, life sciences, energy, government, and tech.

The fresh funding comes as AI has become a focal point for many enterprise-level executives, who are mandated to deploy the tech alongside their product suites, CEO Ian Swanson told GeekWire. This rapid adoption comes with elevated risks, he said.

"[AI] is flying down the highway right now," he said. "For a lot of organizations, that can't be stopped. So we need to make sure that we can maintain and understand it."

A KPMG survey found that only 6% of organizations have a dedicated team in place for evaluating risk and implementing risk mitigation strategies as part of their overall generative AI strategy.

At the same time, companies of all sizes are facing an increasing number of cyber threats, pressuring execs to invest heavily in their security systems. McKinsey and Co. predicts businesses will spend more than $100 billion on related services by 2025.

Protect AI's flagship product, AI Radar, creates a machine learning bill of materials to track a company's software supply chain components: operations tools, platforms, models, data, services, and cloud infrastructure. Swanson compares it to regular automotive maintenance and inspection, where tires and brakes need constant checks, along with ensuring the right fuel is used.

"We really have to understand the ingredients and the recipe of all this," he said.

A hacker gaining access to a company's machine learning system can steal intellectual property or inject malicious code, Swanson said. For instance, Protect AI found a vulnerability in MLflow, a popular machine learning lifecycle platform used by Walmart, Time Warner, Prudential, and other large companies.

The startup presented its findings in March, pressuring MLflow to update its platform within a few weeks. The flaw, left unpatched, would have allowed unauthenticated hackers to read any file accessible on a user's MLflow server and potentially inject code.

Protect AI's first product was NB Defense, an open-source app that works to address vulnerabilities in the development platform Jupyter Notebooks. Protect AI's tools work in Google Cloud, Oracle Cloud, Microsoft Azure, and Amazon Web Services.

In the AI cybersecurity space, there are several well-funded startups.

Swanson said Protect AI tracks the entire machine learning supply chain, from the original inputted training sets to the ongoing use of the model.

This is Swanson's third startup. His first company was Sometrics, a virtual currency platform and in-game payments provider. It was acquired by American Express in 2011. After that, he founded DataScience.com, a cloud workspace platform that was acquired by Oracle in 2018. Swanson also held AI leadership roles at AWS and Oracle.

Swanson is joined by Badar Ahmed, a former engineering leader at Oracle and DataScience, and Daryan Dehghanpisheh, a former leader at AWS. The company has 25 employees, up from 15 when the company raised its $13.5 million seed round in December.

The Series A round was led by Evolution Equity Partners, with participation from Salesforce Ventures and existing investors Acrew Capital, Boldstart Ventures, Knollwood Capital, and Pelion Ventures. The startup has raised a total of $48.5 million to date.

See original here:
Seattle startup that helps companies protect their AI and machine learning code raises $35M - GeekWire