Archive for the ‘Machine Learning’ Category

New cybersecurity system protects networks with LIDAR, no not that LiDAR – C4ISRNet

When it comes to identifying early cyber threats, it's important to have laser-like precision. Mapping out a threat environment can be done with a range of approaches, and a team of researchers from Purdue University has created a new system for just such applications. They are calling the approach LIDAR, which stands for lifelong, intelligent, diverse, agile and robust.

This is not to be confused with LiDAR, short for Light Detection and Ranging, a kind of remote sensing system that uses laser pulses to measure distances from the sensor. The light-specific LiDAR, sometimes also written LIDAR, is a valuable tool for remote sensing and mapping, and features prominently in the awareness tools of self-driving vehicles.

Purdue's LIDAR, by contrast, is a kind of architecture for network security. It can adapt to threats, thanks in part to its ability to learn in three ways. These include supervised machine learning, in which an algorithm looks at unusual features in the system and compares them to known attacks. An unsupervised machine learning component looks through the whole system for anything unusual, not just unusual features that resemble attacks. These two machine-learning components are mediated by a rules-based supervisor.
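The article does not include Purdue's implementation, but the division of labour it describes can be sketched in a few lines: a supervised classifier trained on labelled attack traffic, an unsupervised anomaly detector that flags anything unusual, and a rule-based supervisor that reconciles the two. Everything below (class names, features, thresholds) is an illustrative assumption, not the LIDAR code.

```python
# Illustrative sketch only: hybrid supervised + unsupervised detection
# mediated by a rule-based supervisor. Not Purdue's LIDAR implementation.
import numpy as np
from sklearn.ensemble import IsolationForest, RandomForestClassifier

class RuleBasedSupervisor:
    """Reconciles the supervised and unsupervised detectors with simple rules."""
    def __init__(self, supervised, unsupervised, attack_threshold=0.8):
        self.supervised = supervised        # trained on labelled attack traffic
        self.unsupervised = unsupervised    # trained on normal traffic only
        self.attack_threshold = attack_threshold

    def decide(self, features):
        p_attack = self.supervised.predict_proba(features)[:, 1]
        is_outlier = self.unsupervised.predict(features) == -1   # -1 means anomaly
        decisions = []
        for p, outlier in zip(p_attack, is_outlier):
            if p >= self.attack_threshold:
                decisions.append("block: resembles a known attack")
            elif outlier:
                decisions.append("quarantine: unknown anomaly, inspect")
            else:
                decisions.append("allow")
        return decisions

# Hypothetical training data: rows of network-flow features, label 1 = known attack.
rng = np.random.default_rng(0)
X_train = rng.random((1000, 8))
y_train = rng.integers(0, 2, 1000)
supervised = RandomForestClassifier(n_estimators=50).fit(X_train, y_train)
unsupervised = IsolationForest().fit(X_train[y_train == 0])   # learn "normal" traffic
supervisor = RuleBasedSupervisor(supervised, unsupervised)
print(supervisor.decide(rng.random((3, 8))))
```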

"One of the fascinating things about LIDAR is that the rule-based learning component really serves as the brain for the operation," said Aly El Gamal, an assistant professor of electrical and computer engineering in Purdue's College of Engineering. "That component takes the information from the other two parts and decides the validity of a potential attack and the necessary steps to move forward."

By knowing existing attacks, matching to detected threats, and learning from experience, this LIDAR system can potentially offer a long-term solution based on how the machines themselves become more capable over time.

Aiding the security approach, said the researchers, is the use of a novel curiosity-driven honeypot, which can, like a carnivorous pitcher plant, lure attackers and then trap them where they will do no harm. Once attackers are trapped, the learning algorithm can potentially incorporate new information about the threat and adapt to prevent future attacks from making it through.

The research team behind this LIDAR approach is looking to patent the technology for commercialization. In the process, they may also want to settle on a less confusing moniker. Otherwise, we may stumble into a future where users securing a network of LiDAR sensors with LIDAR have to enact an entire "Who's on First?" routine every time they update their cybersecurity.


New York Institute of Finance and Google Cloud launch a Machine Learning for Trading Specialisation on Coursera – HedgeWeek

The New York Institute of Finance (NYIF) and Google Cloud have launched a new Machine Learning for Trading Specialisation available exclusively on the Coursera platform.

The Specialisation helps learners leverage the latest AI and machine learning techniques for financial trading.

Amid the Fourth Industrial Revolution, nearly 80 per cent of financial institutions cite machine learning as a core component of business strategy and 75 per cent of financial services firms report investing significantly in machine learning. The Machine Learning for Trading Specialisation equips professionals with key technical skills increasingly needed in the financial industry today.

Composed of three courses in financial trading, machine learning, and artificial intelligence, the Specialisation features a blend of theoretical and applied learning. Topics include analysing market data sets, building financial models for quantitative and algorithmic trading, and applying machine learning in quantitative finance.
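The article does not reproduce any course material, but the kind of task it lists ("analysing market data sets, building financial models") can be illustrated with a hedged, self-contained sketch: derive simple features from a synthetic price series and fit a classifier to predict next-day direction. All data and parameter choices below are assumptions for illustration only.

```python
# Toy example of applying machine learning to market data; not course material.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Synthetic price series standing in for real market data.
rng = np.random.default_rng(0)
prices = pd.Series(100 * np.exp(np.cumsum(rng.normal(0, 0.01, 500))))

df = pd.DataFrame({"price": prices})
df["return_1d"] = df["price"].pct_change()                   # daily return feature
df["ma_5"] = df["price"].rolling(5).mean()
df["ma_20"] = df["price"].rolling(20).mean()
df["momentum"] = (df["ma_5"] > df["ma_20"]).astype(int)      # crossover signal
df["next_return"] = df["return_1d"].shift(-1)                # tomorrow's return
df = df.dropna().copy()
df["target"] = (df["next_return"] > 0).astype(int)           # 1 = price goes up

X, y = df[["return_1d", "momentum"]], df["target"]
split = int(len(df) * 0.8)                                    # simple time-based split
model = LogisticRegression().fit(X.iloc[:split], y.iloc[:split])
print("out-of-sample accuracy:", model.score(X.iloc[split:], y.iloc[split:]))
```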

"As we enter an era of unprecedented technological change within our sector, we're proud to offer up-skilling opportunities for hedge fund traders and managers, risk analysts, and other financial professionals to remain competitive through Coursera," says Michael Lee, Managing Director of Corporate Development at NYIF. "The past ten years have demonstrated the staying power of AI tools in the finance world, further proving the importance for both new and seasoned professionals to hone relevant tech skills."

The Specialisation is particularly suited for hedge fund traders, analysts, day traders, those involved in investment management or portfolio management, and anyone interested in constructing effective trading strategies using machine learning. Prerequisites include basic competency with Python, familiarity with pertinent libraries for machine learning, a background in statistics, and foundational knowledge of financial markets.

"Cutting-edge technologies, such as machine and reinforcement learning, have become increasingly commonplace in finance," says Rochana Golani, Director, Google Cloud Learning Services. "We're excited for learners on Coursera to explore the potential of machine learning within trading. Looking beyond traditional finance roles, we're also excited for the Specialisation to support machine learning professionals seeking to apply their craft to quantitative trading strategies."


Iguazio pulls in $24m from investors, shows off storage-integrated parallelised, real-time AI/machine learning workflows – Blocks and Files

Workflow-integrated storage supplier Iguazio has received $24m in C-round funding and announced its Data Science Platform. The platform is deeply integrated into AI and machine learning processes, and accelerates them to real-time speeds through parallel access to multi-protocol views of a single storage silo, using data container technology.

The firm said digital payment platform provider Payoneer is using it for proactive fraud prevention with real-time machine learning and predictive analytics.

Yaron Weiss, VP of Corporate Security and Global IT Operations (CISO) at Payoneer, said of Iguazio's Data Science Platform: "We've tackled one of our most elusive challenges with real-time predictive models, making fraud attacks almost impossible on Payoneer."

He said Payoneer had built a system which adapts to new threats and enables it to prevent fraud with minimal false positives. The system's predictive machine learning models continuously identify suspicious fraud and money-laundering patterns.

Weiss said fraud had previously been detected retroactively with offline machine learning models; customers could only be blocked after damage had already been done. Now Payoneer can take the same models and serve them in real time against fresh data.

The Iguazio system uses a low-latency serverless framework, a real-time multi-model data engine and a Python ecosystem running over Kubernetes. Iguazio claims an estimated 87 per cent of data science models that show promise in the lab never make it to production because of difficulties in making them operational and able to scale.

It is based on so-called data containers that store normalised data from multiple sources: incoming stream records, files, binary objects, and table items. The data is indexed and encoded by a parallel processing engine, and stored in the most efficient way to reduce data footprint while maximising search and scan performance for each data type.

Data containers are accessed through a V3IO API and can be read as any type regardless of how the data was ingested. Applications can read, update, search, and manipulate data objects, while the data service ensures data consistency, durability, and availability.
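Iguazio's actual SDK is not shown in the article, so the following is only a hypothetical sketch of the idea being described: a single normalised record that can be read back through several views (table item, binary object, stream). The class and method names are invented for illustration and are not Iguazio's API.

```python
# Hypothetical multi-view "data container"; invented names, not the real product.
import json
from collections import defaultdict, deque

class DataContainer:
    """Toy stand-in for a store that exposes one record through several views."""
    def __init__(self):
        self._items = {}                     # key-value / table view
        self._streams = defaultdict(deque)   # stream view, in arrival order

    def put(self, key, record, stream="events"):
        self._items[key] = record                       # indexed for key lookups
        self._streams[stream].append((key, record))     # retained as a stream record

    def get_item(self, key):
        return self._items[key]                         # read as a table item

    def get_object(self, key):
        return json.dumps(self._items[key]).encode()    # read the same data as a blob

    def read_stream(self, stream="events"):
        return list(self._streams[stream])              # read as an ordered stream

c = DataContainer()
c.put("tx:42", {"user": "alice", "amount": 12.5})
print(c.get_item("tx:42"))
print(c.get_object("tx:42"))
print(c.read_stream())
```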

Customers can submit SQL or API queries for file metadata, to identify or manipulate specific objects without long and resource-consuming directory traversals, eliminating any need for separate and non-synchronised file-metadata databases.

So-called API engines use offload techniques for common transactions, analytics queries, real-time streaming, time series, and machine-learning logic. They accept data and metadata queries, distribute them across all CPUs, and leverage data encoding and indexing schemes to eliminate I/O operations. Iguazio claims this provides orders-of-magnitude faster analytics and eliminates network chatter.

The Iguazio software is claimed to be able to accelerate the performance of tools such as Apache Hadoop and Spark by up to 100 times without requiring any software changes.

The Data Science Platform can run on-premises or in the public cloud. The Iguazio website contains much detail about its components and organisation.

Iguazio will use the $24m to fund product innovation and support global expansion into new and existing markets. The round was led by INCapital Ventures, with participation from existing and new investors, including Samsung SDS, Kensington Capital Partners, Plaza Ventures and Silverton Capital Ventures.


Federated machine learning is coming – here’s the questions we should be asking – Diginomica

A few years ago, I wondered how edge data would ever be useful given the enormous cost of transmitting all the data to either the centralized data center or some variant of cloud infrastructure. (It is said that 5G will solve that problem).

Consider applications involving vast sensor networks that stream a great deal of data at short intervals. Vehicles on the move are a good example.

There is telemetry from cameras, radar, sonar, GPS and LIDAR, the last of these generating about 70MB/sec. This could quickly amount to four terabytes per day (per vehicle). How much of this data needs to be retained? Answers I heard a few years ago were along two lines:

My counterarguments at the time were:

Introducing TensorFlow Federated, via the TensorFlow Blog:

This centralized approach can be problematic if the data is sensitive or expensive to centralize. Wouldn't it be better if we could run the data analysis and machine learning right on the devices where that data is generated, and still be able to aggregate together what's been learned?

Since I looked at this a few years ago, the distinction between an edge device and a sensor has more or less disappeared. Sensors can transmit via wifi (though there is an issue of battery life, and if they're remote, that's a problem); the definition of the edge has widened quite a bit.

Decentralized data collection and processing have become more powerful and able to do an impressive amount of computing. A case in point is Intel's Neural Compute Stick 2, a computer vision and deep learning accelerator powered by the Intel Movidius Myriad X VPU, which can plug into a Raspberry Pi for less than $70.

But for truly distributed processing, the Apple A13 chipset in the iPhone 11 has a few features that boggle the mind. From Inside Apple's A13 Bionic system-on-chip: the Neural Engine is a custom block of silicon, separate from the CPU and GPU, focused on accelerating machine learning computations. The CPU also has a set of "machine learning accelerators" that perform matrix multiplication operations up to six times faster than the CPU alone. It's not clear how exactly this hardware is accessed, but for tasks like machine learning (ML) that use lots of matrix operations, these accelerators make the CPU a powerhouse. Note that this matrix multiplication hardware is part of the CPU cores and separate from the Neural Engine hardware.

This raises the question: why would a smartphone have neural net and machine learning capabilities, and does that have anything to do with the data transmission problem for the edge? A few years ago, I thought the idea wasn't feasible, but the capability of distributed devices has accelerated. How far-fetched is this?

Let's roll the clock back thirty years. The finance department of a large diversified organization would prepare, in the fall, a package of spreadsheets for every part of the organization that had budget authority. The sheets would start with low-level detail, official assumptions, and so on, until they all rolled up to a small number of summary sheets that were submitted to headquarters. This was a terrible, cumbersome way of doing things, but it does, in a way, presage the concept of federated learning.

Another idea that vanished is push technology, which imposed the same network load as centralizing sensor data, just in the opposite direction. About twenty-five years ago, when everyone had a networked PC on their desk, the PointCast Network used push technology. Still, it did not perform as well as expected, often believed to be because its traffic burdened corporate networks with excessive bandwidth use, and it was banned in many places. If federated learning is to work, those problems have to be addressed.

Though this estimate changes every day, there are 3 billion smartphones in the world and 7 billion connected devices. You can almost hear the buzz in the air of all of that data that is always flying around. The canonical image of ML is that all of that data needs to find a home somewhere so that algorithms can crunch through it to yield insights. There are a few problems with this, especially if the data is coming from personal devices, such as smartphones, Fitbits, even smart homes.

Moving highly personal data across the network raises privacy issues. It is also costly to centralize this data at scale. Storage in the cloud is asymptotically approaching zero in cost, but the transmission costs are not. That includes both local WiFi from the devices (or even cellular) and the long-distance transmission from the local collectors to the central repository. This is all very expensive at this scale.

Suppose large-scale AI training could be done on each device, bringing the algorithm to the data rather than vice versa? It would then be possible for each device to contribute to a broader application while not having to send its data over the network. This idea has become respectable enough that it has a name: federated learning.

Jumping ahead, there is no controversy here: training a network in a way that compromises device performance and user experience, or compressing a model and settling for lower accuracy, are not acceptable alternatives. In Federated Learning: The Future of Distributed Machine Learning:

To train a machine learning model, traditional machine learning adopts a centralized approach that requires the training data to be aggregated on a single machine or in a datacenter. This is practically what giant AI companies such as Google, Facebook, and Amazon have been doing over the years. This centralized training approach, however, is privacy-intrusive, especially for mobile phone users. To train or obtain a better machine learning model under such a centralized training approach, mobile phone users have to trade their privacy by sending their personal data stored inside phones to the clouds owned by the AI companies.

The federated learning approach decentralizes training across mobile phones dispersed across geography. The presumption is that they collaboratively develop a machine learning model while keeping their personal data on their phones; think, for example, of building a general-purpose recommendation engine for music listeners. While the personal data and personal information are retained on the phone, I am not at all comfortable that the data contained in the result sent to the collector cannot be reverse-engineered, and I haven't heard a convincing argument to the contrary.

Here is how it works. A computing group, for example, is a collection of mobile devices that have opted to be part of a large-scale AI program. The device is "pushed" a model, executes it locally, and learns as the model processes the data. There are some alternatives to this. Homogeneous models imply that every device is working with the same schema of data. Alternatively, there are heterogeneous models where harmonization of the data happens in the cloud.
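As a rough illustration of the "push a model, train locally" step, here is a minimal sketch in plain numpy, assuming a simple linear model trained by gradient descent on data that stays on the device. The function names and the model are assumptions for illustration, not any specific federated learning framework.

```python
# Illustrative client-side step in federated learning: the server pushes model
# weights, the device trains on local data that never leaves it, and only the
# weight update (not the data) is returned.
import numpy as np

def local_update(global_weights, local_X, local_y, lr=0.1, epochs=5):
    """One device's contribution: gradient steps on a linear model, run locally."""
    w = global_weights.copy()
    for _ in range(epochs):
        preds = local_X @ w
        grad = local_X.T @ (preds - local_y) / len(local_y)   # mean-squared-error gradient
        w -= lr * grad
    # Only the updated weights and the sample count are sent back; the data is not.
    return w, len(local_y)

rng = np.random.default_rng(1)
global_w = np.zeros(3)                               # model "pushed" by the server
device_X = rng.normal(size=(50, 3))                  # data that never leaves the phone
device_y = rng.normal(size=50)
new_w, n_samples = local_update(global_w, device_X, device_y)
print("update to send:", new_w, "samples used:", n_samples)
```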

Here are some questions in my mind.

Here is the fuzzy part: federated learning sends the results of the learning, as well as some operational detail such as model parameters and corresponding weights, back to the cloud. How does it do that while preserving your privacy and not clogging up your network? The answer is that the results are a fraction of the data, and since the data itself is no more than a few GB, that seems plausible. The results sent to the cloud can be encrypted with, for example, homomorphic encryption (HE). An alternative is to send the data as a tensor, which is not encrypted because it is not understandable by anything but the algorithm. The update is then aggregated with other user updates to improve the shared model. Most importantly, all the training data remains on the user's devices.
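The aggregation step described above corresponds, in the common federated averaging formulation, to a weighted average of the client updates. A minimal sketch, with invented example numbers:

```python
# Sketch of the server-side aggregation step (federated averaging): each
# device's weight update is combined, weighted by how much data it trained on.
# The raw data itself never reaches the server. Values are illustrative.
import numpy as np

def federated_average(client_updates):
    """client_updates: list of (weights, sample_count) pairs returned by devices."""
    total = sum(n for _, n in client_updates)
    return sum(w * (n / total) for w, n in client_updates)

updates = [
    (np.array([0.9, 1.1, 0.2]), 50),    # phone A trained on 50 local samples
    (np.array([1.0, 0.8, 0.4]), 150),   # phone B trained on 150 local samples
]
new_global_weights = federated_average(updates)
print(new_global_weights)   # weighted toward the client with more data
```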

In CDO Review, The Future of AI May Be In Federated Learning:

Federated Learning allows for faster deployment and testing of smarter models, lower latency, and less power consumption, all while ensuring privacy. Also, in addition to providing an update to the shared model, the improved (local) model on your phone can be used immediately, powering experiences personalized by the way you use your phone.

There is a lot more to say about this. The privacy claims are a little hard to believe. When an algorithm is pushed to your phone, it is easy to imagine how this can backfire. Even the tensor representation can create a problem. Indirect reference to real data may be secure, but patterns across an extensive collection can surely emerge.


Clean data, AI advances, and provider/payer collaboration will be key in 2020 – Healthcare IT News

In 2020, the importance of clean data, advancements in AI and machine learning, and increased cooperation between providers and payers will rise to the fore among important healthcare and health IT trends, predicts Don Woodlock, vice president of HealthShare at InterSystems.

All of these trends are good news for healthcare provider organizations, which are looking to improve the delivery of care, enhance the patient and provider experiences, achieve optimal outcomes, and trim costs.

The importance of clean data will become clear in 2020, Woodlock said.

"Data is becoming an increasingly strategic asset for healthcare organizations as they work toward a true value-based care model," he explained. "With the power of advanced machine learning models, caregivers can not only prescribe more personalized treatment, but they can even predict and hopefully prevent issues from manifesting."

However, there is no machine learning without clean data, meaning the data needs to be aggregated, normalized and deduplicated, he added.
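As a rough illustration of what "aggregated, normalized and deduplicated" means in practice, here is a small, hypothetical pandas sketch over invented patient records; it is not InterSystems' tooling, and the column names and values are assumptions.

```python
# Toy data-cleaning sketch: deduplicate, normalise, then aggregate.
import pandas as pd

# Invented, messy "records from multiple sources".
raw = pd.DataFrame({
    "patient_id": ["P1", "P1", "P2", "P2"],
    "name": ["Ann Lee", "ann lee ", "Bo Chen", "Bo Chen"],
    "weight": [70.0, 70.0, 181.0, 82.0],
    "weight_unit": ["kg", "kg", "lb", "kg"],     # mixed units
})

raw["name"] = raw["name"].str.strip().str.title()                 # normalise text fields
lb_rows = raw["weight_unit"] == "lb"
raw.loc[lb_rows, "weight"] = raw.loc[lb_rows, "weight"] * 0.4536  # normalise units to kg
raw["weight_unit"] = "kg"

clean = raw.drop_duplicates()                                     # deduplicate records
summary = clean.groupby("patient_id")["weight"].mean()            # aggregate per patient
print(summary)
```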


"Data science teams spend a significant part of their day cleaning and sorting data to make it ready for machine learning algorithms, and as a result, the rate of innovation slows considerably as more time is spent on prep than experimentation," he said. "In 2020, healthcare leaders will better see the need for clean data as a strategic asset to help their organization move forward smartly."

This year, AI and machine learning will move from "if and when" to "how and where," Woodlock predicted.

"AI certainly is at the top of the hype cycle, but the use in practice currently is very low in healthcare," he noted. "This is not such a bad thing, as we need to spend time perfecting the technology and finding the areas where it really works. In 2020, I foresee the industry moving toward useful, practical use-cases that work well, demonstrate value, fit into workflows, and are explainable and bias-free."

"Well-developed areas like image recognition and conversational user experiences will find their foothold in healthcare, along with administrative use-cases in billing, scheduling, staffing and population management, where the patient risks are lower," he added.

In 2020, there will be increased collaboration between payers and providers, Woodlock contended.

"The healthcare industry needs to be smarter and more inclusive of all players, from patient to health system to payer, in order to truly achieve a high-value health system," he said.

"Payers and providers will begin to collaborate more closely in order to redesign healthcare as a platform, not as a series of disconnected events," he concluded. "They will begin to align all efforts on a common goal: positive patient and population outcomes. Technology will help accelerate this transformation by enabling seamless and secure data sharing, from the patient to the provider to the payer."

InterSystems will be at booth 3301 at HIMSS20.

