AI systems and smart assistants are set to disrupt the way we work and get things done. In the future, these systems will not just be able to answer questions or draft emails, but also take action and execute tasks on our behalf.
In this talk, I will present UI-Act, a novel AI model that takes a step in this direction. UI-Act is a Transformer model, much like an LLM, but it takes pixel-based screen input and outputs mouse actions (clicks) instead of text tokens. In principle, this lets the model learn and execute any task that can be performed through a computer's graphical interface. I will showcase this in a demo and argue why I think this will become the next big AI breakthrough in the coming years.
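To make the pixels-in, clicks-out idea concrete, here is a minimal sketch of the observe-act loop such an agent might run. It is not UI-Act's actual code: the names `Screen`, `Click`, `run_agent`, `brightest_pixel_policy` and `ToyScreen` are all hypothetical, and the hand-written policy is only a stand-in for a trained Transformer.

```python
from dataclasses import dataclass
from typing import Callable, List

# Assumption: a screen is a grayscale pixel grid (rows of 0-255 intensities).
Screen = List[List[int]]


@dataclass(frozen=True)
class Click:
    """A mouse click at pixel column x, row y (hypothetical action type)."""
    x: int
    y: int


# A policy maps raw pixels to a click; a trained model would fill this role.
Policy = Callable[[Screen], Click]


def brightest_pixel_policy(screen: Screen) -> Click:
    """Stand-in policy: click the brightest pixel on the screen."""
    best, best_value = Click(0, 0), -1
    for y, row in enumerate(screen):
        for x, value in enumerate(row):
            if value > best_value:
                best, best_value = Click(x, y), value
    return best


def run_agent(policy: Policy,
              observe: Callable[[], Screen],
              act: Callable[[Click], None],
              max_steps: int) -> List[Click]:
    """Repeatedly observe the screen, choose a click, and execute it."""
    clicks = []
    for _ in range(max_steps):
        click = policy(observe())
        act(click)
        clicks.append(click)
    return clicks


class ToyScreen:
    """Toy environment: one bright 'button' that goes dark when clicked."""

    def __init__(self) -> None:
        self.pixels = [[0, 0, 0, 0],
                       [0, 0, 255, 0],
                       [0, 0, 0, 0]]

    def observe(self) -> Screen:
        # Return a copy so the policy cannot mutate the environment.
        return [row[:] for row in self.pixels]

    def click(self, click: Click) -> None:
        self.pixels[click.y][click.x] = 0


env = ToyScreen()
history = run_agent(brightest_pixel_policy, env.observe, env.click, max_steps=1)
```

The loop itself never inspects application internals; it only sees pixels and emits clicks, which is what would make the same interface reusable across arbitrary applications.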
What is an AI agent?
How to design agents?
Designing agents towards generality
UI-based agents for assisting knowledge work
UI-based AI agents - Why and How?
Tobias Norlund
Research Scientist
About the speaker
Tobias Norlund is a Research Scientist at AI Sweden, currently working on developing next-generation Swedish foundation models. He holds a Licentiate degree in NLP from Chalmers University and has 8 years of experience in AI-based product development at Schibsted and Recorded Future.