Hugging Face built a family of vision-language models small enough to run on a phone or laptop without an internet connection. SmolVLM comes in three sizes: 256 million, 500 million, and roughly 2 billion parameters. Despite their small size, the models can process multiple images at once, understand video, perform OCR on documents, and answer questions about what they see. The Apache 2.0 license lets anyone use them in commercial products for free, which could accelerate adoption in IoT and mobile applications.
Companies
Hugging Face
Tools
SmolVLM, SmolVLM-256M, SmolVLM-500M, SmolVLM-2B
Tags
Edge AI, VLM, Small Model, Open Source, Multi-Image
Sources