Clevrr Computer

Computer use but with OpenAI and Gemini models

Open Source
Artificial Intelligence
GitHub
Tech

An open-source implementation of Anthropic's Computer Use to perform basic tasks using AI agents. Currently supports LangChain, Azure OpenAI models, and Gemini models. Contributions and support are more than welcome to improve the functionality.

Top comment

Hi PHers 👋🏻,

Anthropic has taken the world by storm with their latest Computer Use model. It can perform arbitrary actions on the system using text prompts. This is one big feat in AI development 🤯

❓ Why Clevrr Computer?
With recent developments in such models, it got me thinking about whether other models would be capable of doing the same. Clevrr Computer is a LangChain ReAct agent that uses multimodal models like Google's Gemini 1.5 Pro and OpenAI's GPT-4o to perform actions on the computer, such as mouse movements, web search, native app controls, etc. The possibilities are endless with such a technology, and this is just the beginning.

💻 What can you do with it?
You can use Clevrr Computer to reply to your Mom on WhatsApp, spin up a fun AI-to-AI conversation with ChatGPT on the web, or do research and analysis with basic web search.

⚙️ How does it work?
A multimodal AI agent runs in the background with a constant screenshot-capturing mechanism to learn what is on the screen, and it directs the main action agent to act accordingly, using Python to perform the actions automagically.

⛔ Does it have limitations?
Of course it does (and it should, honestly). It has limitations in terms of screen access, code complexity, performance on some tasks, etc.

🧪 Where can I try it?
Check out the full steps to download and use the agent on your computer at https://github.com/Clevrr-AI/Cle...

I would love for you to contribute to make this project perfect and fast-forward the era of AI agents! Contributions are more than welcome. 💖
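For anyone curious what the screenshot-then-act loop described above might look like, here is a minimal sketch. It is not taken from the Clevrr Computer repository: the `Action` class, the `run_agent_loop` function, and all four callables are hypothetical names chosen for illustration, and the real project may structure this quite differently.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Action:
    """One step for the action agent (hypothetical shape, e.g. kind="click")."""
    kind: str
    detail: str = ""


def run_agent_loop(
    task: str,
    capture_screen: Callable[[], bytes],
    describe_screen: Callable[[bytes], str],
    choose_action: Callable[[str, str, List[Action]], Optional[Action]],
    perform_action: Callable[[Action], None],
    max_steps: int = 10,
) -> List[Action]:
    """Observe the screen, pick an action, perform it, and repeat.

    All four callables are assumptions supplied by the caller:
      capture_screen  -> returns the current screenshot as raw bytes
      describe_screen -> a multimodal model call that turns it into text
      choose_action   -> the main agent; returns None when the task is done
      perform_action  -> executes the action (mouse, keyboard, and so on)
    """
    history: List[Action] = []
    for _ in range(max_steps):  # hard cap so the agent cannot run forever
        screenshot = capture_screen()
        observation = describe_screen(screenshot)
        action = choose_action(task, observation, history)
        if action is None:  # the agent decided the task is complete
            break
        perform_action(action)
        history.append(action)
    return history
```

In a real setup, `capture_screen` and `perform_action` would presumably wrap a desktop automation library, while `describe_screen` and `choose_action` would call the multimodal model. Passing them in as functions keeps the loop testable with simple stubs, and the `max_steps` cap is one simple way to bound what the agent can do.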

Comment highlights

Huge congrats to the Clevrr Computer team on today's launch! I love how you've democratized access to AI-powered task automation with open-source goodness. Here's a curious question: What's the most creative/basic task (e.g., email sorting, content gen?) you've seen users automate so far with Clevrr Computer's Langchain, Azure OpenAI, or Gemini models?

This is awesome! Computer use is definitely the next step with AI. Quick question: can you command the AI with voice prompts, or text prompts only? Congratulations on your launch, team!

This has the potential to change how I use artificial intelligence in everyday work.

Impressive work. The integration with various AI models could really enhance my experience.

Clevrr Computer can really save a lot of effort in dealing with basic tasks! Congrats on the launch and good luck to Clevrr Computer.

Check out the whole functionality, how it works, and use cases on the X thread https://x.com/yuvrajdagur12/stat...

Whoa, Clevrr Computer sounds like it's taking AI interaction to a whole new level! The idea of an AI agent that can actually control your computer is mind-blowing. I'm both excited and a bit nervous about the possibilities here. Being able to automate tasks like replying to messages or doing web research could be a huge time-saver. But I'm glad they've put some limitations in place - that's definitely important. I'm really curious to see how this performs in real-world scenarios and how it compares to Anthropic's model. Might have to check out that GitHub repo and give it a spin. Anyone else thinking of trying this out?