Anthropic’s Claude AI can control computers and write JavaScript code: the new features

Credit: Anthropic.

Anthropic, the artificial intelligence startup founded in 2021 by siblings Daniela and Dario Amodei together with five other former OpenAI employees, recently introduced a feature called “Computer Use” that lets its model operate a computer by imitating the behavior of a human user. The feature is built into the Claude 3.5 Sonnet model: Claude can now “see” the screen through screenshots, move the cursor, click, and type text, emulating the way a real person interacts with a computer. For now the feature is experimental and available only as a public beta, but Anthropic’s goal is to let the model autonomously carry out repetitive tasks that today require human intervention. Anthropic also recently announced an analysis tool that allows the model to write and execute JavaScript code.

How Claude 3.5 Sonnet controls the PC: advantages and limits of “Computer Use”

Claude’s ability to use a computer relies on a “flipbook” approach to vision: the model captures and analyzes a series of screenshots to interpret and respond to what appears on the screen, rather than working from a real-time video stream. As a result, the model may for now miss short-lived actions or notifications, since its view of the screen consists of static images taken at regular intervals.
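The limitation described above can be illustrated with a minimal sketch. It is a simulation only, not Anthropic’s implementation: the `ScreenEvent` type, the function names, and the two-second capture interval are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreenEvent:
    """Something visible on screen during [start_ms, end_ms). Hypothetical type."""
    name: str
    start_ms: int
    end_ms: int


def capture_times(duration_ms: int, interval_ms: int) -> list[int]:
    """Timestamps at which a screenshot is taken, starting at t=0."""
    return list(range(0, duration_ms + 1, interval_ms))


def seen_in_screenshots(events: list[ScreenEvent], times: list[int]) -> set[str]:
    """Names of the events that are visible in at least one screenshot."""
    return {e.name for e in events for t in times if e.start_ms <= t < e.end_ms}


# Assumed scenario: a dialog that stays open for 6 s and a 1 s notification.
events = [
    ScreenEvent("save dialog", start_ms=1000, end_ms=7000),
    ScreenEvent("toast notification", start_ms=2500, end_ms=3500),
]
# Assumed capture rate: one screenshot every 2 s over a 10 s session.
times = capture_times(duration_ms=10_000, interval_ms=2_000)

seen = seen_in_screenshots(events, times)
missed = sorted(e.name for e in events if e.name not in seen)

print("seen:", sorted(seen))  # seen: ['save dialog']
print("missed:", missed)      # missed: ['toast notification']
```

The notification appears and disappears between two consecutive screenshots, so no captured frame contains it. A shorter capture interval would narrow that gap, at the cost of more images to analyze.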

Anthropic’s AI is trained to recognize the positions of on-screen elements such as buttons and icons, and to measure distances in pixels so that it can place the cursor precisely and carry out relatively complex actions automatically. Explaining how “Computer Use” works, Anthropic said:

When a developer tasks Claude with using a piece of computer software and gives it the necessary access, Claude looks at screenshots of what is visible to the user, then counts how many pixels vertically or horizontally it needs to move a cursor in order to click in the correct place. Teaching Claude to count pixels accurately was crucial. Without this ability, the model has difficulty issuing mouse commands, similar to how models often have difficulty with seemingly simple questions like “how many A’s are in the word ‘banana’?”.
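The pixel counting described in the quote comes down to simple coordinate arithmetic once the position of the target is known. The sketch below is only an illustration under assumed values: the `Box` type, the `cursor_offset` function, and all coordinates are hypothetical and are not part of Anthropic’s API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Bounding box of an on-screen element, in screenshot pixels (hypothetical)."""
    left: int
    top: int
    width: int
    height: int

    def center(self) -> tuple[int, int]:
        """Pixel at the middle of the box, a reasonable place to click."""
        return (self.left + self.width // 2, self.top + self.height // 2)


def cursor_offset(cursor: tuple[int, int], target: Box) -> tuple[int, int]:
    """Horizontal and vertical pixels the cursor must travel to reach the target.

    Positive dx means right, positive dy means down (screen coordinates).
    """
    cx, cy = target.center()
    return (cx - cursor[0], cy - cursor[1])


# Assumed values: cursor at (100, 200), a 120x40 "Submit" button at (400, 300).
submit_button = Box(left=400, top=300, width=120, height=40)
offset = cursor_offset(cursor=(100, 200), target=submit_button)

print(offset)  # (360, 120): move 360 px right and 120 px down, then click
```

The arithmetic is trivial for a program. The hard part the quote refers to is estimating the box coordinates reliably from a screenshot, since a miscount of a few dozen pixels would land the click on the wrong element.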

However, the feature is not yet perfect: many common actions, such as dragging files or using keyboard shortcuts, are not yet fully supported. The company itself warns that “Computer Use” can be “cumbersome and error-prone”, adding that it released the feature in public beta “to receive feedback from developers” and that it expects “the capability to improve rapidly over time”.

In a demonstration video that Anthropic published on YouTube, Claude 3.5 Sonnet uses “Computer Use” to fill out a company’s contact form, retrieving the required information from various places on the disk of the Mac used for the test, and completes the task successfully. What the demo shows will, of course, have to be verified by the testers who decide to try the feature in preview.

Is Anthropic’s computer control feature safe?

The idea that an artificial intelligence can take full control of a computer naturally raises some safety concerns.

For this reason, Anthropic has implemented strict security policies that limit Claude’s access to certain types of content. For example, the model is programmed to avoid interacting with social media and with sensitive content, such as government websites or election-related activity, in order to minimize the risk of abuse or manipulation. Measures are also in place against possible “prompt injection” attacks, a type of cyber attack in which an attacker feeds malicious instructions to an AI model to force it into unwanted actions that deviate from the user’s original intent. In addition, monitoring mechanisms have been set up to detect improper use by users during this testing phase.

New JavaScript analysis tool

In addition to “Computer Use”, Anthropic also announced an analysis tool for Claude.ai that allows the model to write and execute JavaScript code, much as ChatGPT’s Code Interpreter does (which uses Python rather than JavaScript). Thanks to this capability, Claude can now perform in-depth data analysis in real time, process information, and offer more precise results. Anthropic described the feature in these terms:

Think of the analysis tool as a built-in code sandbox, where Claude can do complex calculations, analyze data, and iterate on different ideas before sharing an answer. The ability to process information and execute code means you get more accurate answers, drawing on Claude 3.5 Sonnet’s cutting-edge data and coding skills. (…) With the analysis tool, you get answers that are not only well-reasoned, but are mathematically precise and reproducible.

According to the company, the tool will be useful in a variety of contexts and for different professionals, including marketers, engineers and finance teams.