AI – Monday, December 30, 2024: Commentary with Notable and Interesting News, Articles, and Papers

Commentary and a selection of the most important recent news, articles, and papers about AI.

Today’s Brief Commentary

In my last AI newsletter, I said it would be the last one before the new year. I should have expected that my links buffer would not stay empty for that long.

I want to draw your attention to two topics within the links.

The first concerns AI and the open-source integrated development environment Visual Studio Code (VSC). On one hand, Microsoft’s GitHub is creating a free access tier so programmers can use AI to generate code within applications. On the other hand, Anysphere seems to have raised $100M for its commercial Cursor editor. Cursor is an extension of VSC and provides … AI so programmers can generate code within applications.

In my experience, if you have a commercial product for software development, coders will eventually create an open-source app that pretty much does the same thing for free. The clock starts ticking as soon as your product is out there and demonstrates value. You can keep adding to the software, making it monolithic and over-featured, or extend your product portfolio with other salable offerings. The history of computer software development is littered with failed tools companies that could not stand up to the influx of open-source alternatives.

The second topic is about generative AI and techniques like retrieval-augmented generation (RAG) and the newer “deliberative alignment.” It’s starting to seem like AI developers are saying, “Oh, that’s not working well, so let’s try this other thing. Whoops, we need to patch it for that other case.” I know that academic and industry researchers are now thinking about AI algorithms and processes that we may not see for several years. Meanwhile, are we witnessing hack upon hack, kludge upon kludge?

Coding and Software Engineering


GitHub makes AI Copilot free for VS Code developers — with limits | VentureBeat

https://venturebeat.com/programming-development/github-is-making-its-ai-programming-copilot-free-for-vs-code-developers-with-limits/

Author: Carl Franzen

Date: Wednesday, December 18, 2024

Commentary: For me, AI generation of code snippets is by far the most useful application of the genre that I have used. It saves a lot of time compared to looking up the names of Python functions and methods and their parameters.

Excerpt: Microsoft code repository subsidiary GitHub has announced the launch of GitHub Copilot Free, an accessible version of its popular AI-powered coding assistant, now integrated directly into the Visual Studio Code (VS Code) integrated developer environment (IDE).

Anysphere reportedly raises $100M for its AI-driven Cursor code editor | SiliconANGLE

https://siliconangle.com/2024/12/20/anysphere-reportedly-raises-100m-ai-driven-cursor-code-editor/

Author: Maria Deutscher

Date: Friday, December 20, 2024

Commentary: An interesting raise and valuation, given that Cursor is built on the open-source Visual Studio Code and that, as noted above, Microsoft and GitHub now offer a free GenAI coding assistant for the same editor.

Excerpt: Anysphere’s code editor, Cursor, is based on the popular VS Code software development tool from Microsoft Corp. The startup augmented the core feature set with an AI assistant that helps programmers carry out their work faster. Additionally, it doubles as a search tool for jumping to specific sections of a code file.

The assistant is accessible through a ChatGPT-like chatbot interface. According to Anysphere, developers can ask the AI for help with a programming task and have it generate upwards of a dozen lines of code at once. If the initial output doesn’t fully meet project requirements, users can ask the chatbot to modify it.

Games


Game industry predictions for 2025 | The DeanBeat | VentureBeat

https://venturebeat.com/games/game-industry-predictions-for-2025-the-deanbeat/

Author: Dean Takahashi

Date: Friday, December 27, 2024

Excerpt: Based on all of the pronouncements from big companies and statements from game startups, you can bet that game companies are kicking the tires on AI tools from game startups aimed at making game development faster, cheaper and easier. They may, like EA, be making their own or they may be betting on startups like Inworld AI that have talent from both AI and games in-house. AI can make our worlds more dynamic and personalized to exactly what we want.

Generative AI and Models


OpenAI announces new o3 models | TechCrunch

https://techcrunch.com/2024/12/20/openai-announces-new-o3-model/

Author: Kyle Wiggers

Date: Friday, December 20, 2024

Excerpt: And there are risks. AI safety testers have found that o1’s reasoning abilities make it try to deceive human users at a higher rate than conventional, “non-reasoning” models — or, for that matter, leading AI models from Meta, Anthropic, and Google. It’s possible that o3 attempts to deceive at an even higher rate than its predecessor; we’ll find out once OpenAI’s red-team partners release their testing results.

For what it’s worth, OpenAI says that it’s using a new technique, “deliberative alignment,” to align models like o3 with its safety principles. (o1 was aligned the same way.) The company has detailed its work in a new study.

Deliberative alignment: reasoning enables safer language models | OpenAI

https://openai.com/index/deliberative-alignment/

Date: Friday, December 20, 2024

Excerpt: We introduce deliberative alignment, a training paradigm that directly teaches reasoning LLMs the text of human-written and interpretable safety specifications, and trains them to reason explicitly about these specifications before answering. We used deliberative alignment to align OpenAI’s o-series models, enabling them to use chain-of-thought (CoT) reasoning to reflect on user prompts, identify relevant text from OpenAI’s internal policies, and draft safer responses. Our approach achieves highly precise adherence to OpenAI’s safety policies, and without requiring human-labeled CoTs or answers. We find that o1 dramatically outperforms GPT-4o and other state-of-the-art LLMs across a range of internal and external safety benchmarks, and saturates performance on many challenging datasets. We believe this presents an exciting new path to improve safety, and we find this to be an encouraging example of how improvements in capabilities can be leveraged to improve safety as well.

New law will require [New York] state agencies to monitor use of generative AI | Times Union

https://www.timesunion.com/capitol/article/new-law-require-state-agencies-monitor-use-19999749.php

Author: Raga Justin

Date: Tuesday, December 24, 2024

Excerpt: Hochul has previously indicated a reluctance to commit New York to regulating usage of generative AI and said the federal government should ultimately be responsible for broad oversight. But she has taken more steps in recent months to shape the direction of the technology’s growth in New York.

Her administration earlier this year announced the Empire AI Consortium, a well-funded state and private partnership dedicated to researching and directing the artificial intelligence industry.

Newsletters


Deep Learning Weekly | Substack

https://www.deeplearningweekly.com/

Commentary: This is a great newsletter for keeping abreast of innovations in deep learning, including and beyond generative AI.

Excerpt: Bringing you everything new and exciting in the world of deep learning from academia to the grubby depths of industry every week right to your inbox.

Research and Technical


Breaking up is hard to do: Chunking in RAG applications | Stack Overflow Blog

https://stackoverflow.blog/2024/12/27/breaking-up-is-hard-to-do-chunking-in-rag-applications/

Date: Friday, December 27, 2024

Commentary: Do others agree that parts of the generative AI workflow are starting to look like hacks on top of hacks? I think we are due for a complete rethink of these processes because they are appearing less and less elegant.

Excerpt: There are a lot of possible chunking strategies, so figuring out the optimal one for your use case takes a little work. Some say that chunking strategies need to be custom for every document that you process. You can use multiple strategies at the same time. You can apply them recursively over a document. But ultimately, the goal is to store the semantic meaning of a document and its constituent parts in a way that an LLM can retrieve based on query strings.
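To make the excerpt concrete, here is a minimal sketch of one of the simplest chunking strategies mentioned: fixed-size chunks with overlap, so that semantic context spanning a chunk boundary is not lost. All names and parameter values below are illustrative, not taken from the article; real pipelines often layer smarter strategies (sentence-aware or recursive splitting) on top of this idea.

```python
# Minimal sketch of fixed-size chunking with overlap for a RAG pipeline.
# Overlapping the chunks keeps text near a boundary present in two chunks,
# which helps retrieval when a relevant passage straddles a split point.

def chunk_text(text: str, chunk_size: int = 120, overlap: int = 30) -> list[str]:
    """Split text into overlapping chunks of at most chunk_size characters."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    step = chunk_size - overlap  # advance by chunk_size minus the overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the final chunk reached the end of the document
    return chunks

# Toy document standing in for a real file; each chunk would then be
# embedded and stored in a vector index for retrieval.
document = "RAG pipelines split documents into chunks, embed each chunk, " * 10
chunks = chunk_text(document, chunk_size=120, overlap=30)
print(len(chunks), "chunks; first chunk starts:", chunks[0][:40])
```

Even this toy version shows why the article calls chunking hard: the "right" chunk size and overlap depend on the document and the queries, which is why practitioners end up mixing multiple strategies.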

Sovereign Initiatives


Building the Future Federation | The Rising Calls for Sovereign AI | CIRSD

https://www.cirsd.org/en/horizons/horizons-summer-2024--issue-no-27/building-the-future-federation---the-rising-calls-for-sovereign-ai

Authors: David L. Shrier and ChatGPT

Date: Monday, July 1, 2024

Commentary: CIRSD = Center for International Relations and Sustainable Development. I offer this link for consideration, not as an endorsement.

Excerpt: Government focus on the subject is driven by a variety of strategic and ethical considerations. One of the primary concerns is the potential for the values embedded in AI systems, particularly those developed by major technology companies, to reflect the interests and ideologies of a few powerful entities rather than the diverse ethical frameworks of different nations. This has led to a desire among many governments to ensure that their national values and ethics are integrated into AI systems like Generative Pretrained Transformers (GPTs) and large language models (LLMs).