Sunday was Expo day, featuring a mixture of talks, workshops, and demonstrations.
One recurring theme throughout the Expo was efficiency: (how) can we do more with less?
Less power, fewer model parameters, smaller chips, …
The most popular session, based on attendee numbers in the conference app, was a talk by Amazon titled “Smaller models can pack a punch in the era of large language models”. So many people were unable to get into the room that the talk was repeated later in a larger one. The talk explored how relatively small 7-billion-parameter models like Mistral 7B can outperform 13B models and be competitive in some areas against models with as many as 34B parameters.
Other sessions around efficiency in generative applications included Qualcomm’s demonstration of running Stable Diffusion on their Snapdragon mobile processor; Positron AI’s introduction of their new inference chips, which aim to deliver significantly better performance/watt and performance/$ than the current state of the art; and AMD’s talk on improving energy efficiency through quantization.
This will remain a theme throughout the rest of the week, for example in the Efficient Learning Oral Session (Tuesday 3:40-4:40pm, including QLoRA at 3:55pm) and the LLM Efficiency Challenge (Friday 1:30-4:30pm). The latter explores what teams can achieve by fine-tuning on a single GPU for 24 hours.
Some brief highlights from other sessions…
Amazon’s AutoML talk did a great job of introducing version 1.0 of their open-source AutoGluon library. The workshop has an accompanying website with various useful resources, including cheatsheets and some Colab notebooks to get started.
Many of the talks were very well attended, and not just the ones related to generative models. Other notably popular talks were MathWorks’ Reinforcement Learning talk, Google’s Graph Learning talk, and Microsoft’s Autonomous Agents talk.
The main conference starts tomorrow, beginning with the Affinity Workshops, followed by the Tutorials, the first of the Invited Talks at 5:25pm, a welcome reception, and the first of the Creative AI Performances at 6:30pm.