AI – Friday, August 09, 2024: Notable and Interesting News, Articles, and Papers

A selection of the most important recent news, articles, and papers about AI.

News, Articles, and Analyses

NVIDIA Announces Generative AI Models and NIM Microservices for OpenUSD Language, Geometry, Physics and Materials | NVIDIA Newsroom

https://nvidianews.nvidia.com/news/nvidia-announces-generative-ai-models-and-nim-microservices-for-openusd

(Monday, July 29, 2024) “SIGGRAPH—NVIDIA today announced major advancements to Universal Scene Description, or OpenUSD, that will expand adoption of the universal 3D data interchange framework to robotics, industrial design and engineering, and accelerate developers’ abilities to build highly accurate virtual worlds for the next evolution of AI.”

Players, creators, and AI collaborate to build and expand rich game narratives – Microsoft Research

https://www.microsoft.com/en-us/research/blog/players-creators-and-ai-collaborate-to-build-and-expand-rich-game-narratives/

Author: Brenda Potts

(Monday, August 05, 2024) “Integrating LLMs in video game development can create dynamic and interactive narratives. By involving players in the narrative design process, LLMs can generate unique player-driven strategies and provide valuable feedback for game designers:”

Cannibalizing generative AI models could ‘go mad’ over time

https://readwrite.com/cannibalizing-generative-ai-models-could-go-mad-over-time/

Author: Rachael Davies

(Tuesday, August 06, 2024) “Generative AI models that just feed off one another could end up ‘going mad’ over time, affecting the quality of the output.”

Intel’s Q2 2024 Earnings: Navigating Challenges & Strategic Shifts – The Futurum Group

https://futurumgroup.com/insights/intels-q2-2024-earnings-release-navigating-challenges-and-strategic-shifts/

Author: Ron Westfall

“Intel’s Q2 2024 earnings usher in a $10B cost reduction plan to bolster near-term competitiveness in key segments and land long game strategy.”

What is Google Cloud’s generative AI evaluation service? | InfoWorld

https://www.infoworld.com/article/3483406/what-is-google-clouds-generative-ai-evaluation-service.html

“The service is targeted at enterprise users and is designed to help businesses understand how well a large language model works for a particular use case.”

Technical Papers, Articles, and Preprints

[2408.02946] Scaling Laws for Data Poisoning in LLMs

https://arxiv.org/abs/2408.02946

Authors: Bowen, Dillon; Murphy, Brendan; Cai, Will; Khachaturov, David; Gleave, Adam; Pelrine, Kellin

(Tuesday, August 06, 2024) “Recent work shows that LLMs are vulnerable to data poisoning, in which they are trained on partially corrupted or harmful data. Poisoned data is hard to detect, breaks guardrails, and leads to undesirable and harmful behavior. Given the intense efforts by leading labs to train and deploy increasingly larger and more capable LLMs, it is critical to ask if the risk of data poisoning will be naturally mitigated by scale, or if it is an increasing threat. We consider three threat models by which data poisoning can occur: malicious fine-tuning, imperfect data curation, and intentional data contamination. Our experiments evaluate the effects of data poisoning on 23 frontier LLMs ranging from 1.5-72 billion parameters on three datasets which speak to each of our threat models. We find that larger LLMs are increasingly vulnerable, learning harmful behavior — including sleeper agent behavior — significantly more quickly than smaller LLMs with even minimal data poisoning. These results underscore the need for robust safeguards against data poisoning in larger LLMs.”

[2408.03827] Automated Code Fix Suggestions for Accessibility Issues in Mobile Apps

https://arxiv.org/abs/2408.03827

Authors: Mehralian, Forough; Barik, Titus; Nichols, Jeff; Swearngin, Amanda

(Wednesday, August 07, 2024) “Accessibility is crucial for inclusive app usability, yet developers often struggle to identify and fix app accessibility issues due to a lack of awareness, expertise, and inadequate tools. Current accessibility testing tools can identify accessibility issues but may not always provide guidance on how to address them. We introduce FixAlly, an automated tool designed to suggest source code fixes for accessibility issues detected by automated accessibility scanners. FixAlly employs a multi-agent LLM architecture to generate fix strategies, localize issues within the source code, and propose code modification suggestions to fix the accessibility issue. Our empirical study demonstrates FixAlly’s capability in suggesting fixes that resolve issues found by accessibility scanners — with an effectiveness of 77% in generating plausible fix suggestions — and our survey of 12 iOS developers finds they would be willing to accept 69.4% of evaluated fix suggestions.”

[2408.03588] Facing the Music: Tackling Singing Voice Separation in Cinematic Audio Source Separation

https://arxiv.org/abs/2408.03588

Authors: Watcharasupat, Karn N.; Wu, Chih-Wei; Orife, Iroro

(Wednesday, August 07, 2024) “Cinematic audio source separation (CASS) is a fairly new subtask of audio source separation. A typical setup of CASS is a three-stem problem, with the aim of separating the mixture into the dialogue stem (DX), music stem (MX), and effects stem (FX). In practice, however, several edge cases exist as some sound sources do not fit neatly in either of these three stems, necessitating the use of additional auxiliary stems in production. One very common edge case is the singing voice in film audio, which may belong in either the DX or MX, depending heavily on the cinematic context. In this work, we demonstrate a very straightforward extension of the dedicated-decoder Bandit and query-based single-decoder Banquet models to a four-stem problem, treating non-musical dialogue, instrumental music, singing voice, and effects as separate stems. Interestingly, the query-based Banquet model outperformed the dedicated-decoder Bandit model. We hypothesized that this is due to a better feature alignment at the bottleneck as enforced by the band-agnostic FiLM layer. Dataset and model implementation will be made available at https://github.com/kwatcharasupat/source-separation-landing.”