Google Releases Gemma 3n, an Open-Source AI Model That Can Run Locally on 2GB of RAM

Google on Thursday released the full version of Gemma 3n, the latest open-source model in the Gemma 3 family of artificial intelligence (AI) models. First announced in May, the new model has been designed and optimized for on-device use cases and brings several new architecture-based improvements. Notably, the large language model (LLM) can run locally on just 2GB of RAM. This means the model can also be deployed and operated on a smartphone, provided it comes with AI-capable processing power.

Gemma 3n is a multimodal AI model

In a blog post, the Mountain View-based tech giant announced the release of the full version of Gemma 3n. The model follows the launch of the Gemma 3 and SignGemma models, and is part of the Gemmaverse. Since it is an open-source model, the company has provided the community with its model weights as well as a cookbook. The model itself is available for use under the permissive Gemma license, which allows both academic and commercial usage.

Gemma 3n is a multimodal AI model. It natively supports image, audio, video, and text inputs, but it can only generate text outputs. It is also a multilingual model, supporting text in 140 languages and multimodal understanding of content in 35 languages.
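To illustrate how such a model is typically driven, the snippet below is a minimal sketch using the Hugging Face transformers pipeline API. It assumes a recent transformers release with Gemma 3n support; the google/gemma-3n-E2B-it checkpoint ID and the image URL are assumptions to be checked against the actual listing.

```python
# Minimal sketch: multimodal (image + text) input, text-only output.
# Assumes a recent transformers release with Gemma 3n support; the
# checkpoint ID and image URL below are illustrative assumptions.
from transformers import pipeline

pipe = pipeline(
    "image-text-to-text",
    model="google/gemma-3n-E2B-it",  # assumed Hugging Face checkpoint ID
)

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://example.com/photo.jpg"},
            {"type": "text", "text": "Describe this image in one sentence."},
        ],
    }
]

# Whatever the input mix, the model only ever generates text.
output = pipe(text=messages, max_new_tokens=64)
print(output[0]["generated_text"][-1]["content"])
```

Swapping in the larger E4B checkpoint would trade speed for quality, per the sizing described below.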

Google says Gemma 3n features a "mobile-first architecture", built on the Matryoshka Transformer, or MatFormer, architecture. It is a nested transformer, named after Russian matryoshka nesting dolls, where smaller dolls fit inside larger ones. This architecture offers a unique way to train an AI model at various parameter sizes.

Gemma 3n comes in two sizes, E2B and E4B, with the "E" standing for effective parameters. This means that despite being five billion and eight billion parameters in size, respectively, the models have just two billion and four billion active parameters.

This is achieved using a technique called per-layer embeddings (PLE), where only the most essential parameters need to be loaded into fast accelerator memory. The remaining per-layer embedding parameters can stay in ordinary system memory and be handled by the CPU.
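Some back-of-the-envelope arithmetic shows why the effective-parameter count matters for memory. The figures in this sketch are illustrative assumptions, not official Gemma 3n measurements.

```python
# Back-of-the-envelope weight-memory math for the effective-parameter idea.
# All figures are illustrative assumptions, not official Gemma 3n numbers.
def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Approximate memory needed to hold `params` weights at a given precision."""
    return params * bytes_per_param / 1024**3

TOTAL_E2B = 5e9   # ~5 billion total parameters reported for E2B
ACTIVE_E2B = 2e9  # ~2 billion parameters kept in fast memory thanks to PLE

for label, params in [("all weights", TOTAL_E2B), ("active weights", ACTIVE_E2B)]:
    fp16 = weight_memory_gb(params, 2.0)   # 16-bit weights
    int4 = weight_memory_gb(params, 0.5)   # 4-bit quantized weights
    print(f"{label}: {fp16:.1f} GB at fp16, {int4:.1f} GB at int4")

# Active weights at int4 come to roughly 0.9 GB, which is how a model with
# 5 billion total parameters can plausibly fit within a ~2GB RAM budget.
```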

Because of the MatFormer design, the E4B version nests the E2B model inside it, so while the larger model is being trained, the smaller model is trained simultaneously. This allows users to choose E4B for more advanced operations or E2B for faster output, without any noticeable difference in processing or output quality.
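To make the nesting concrete, below is a toy PyTorch sketch of a MatFormer-style feed-forward block, where a smaller sub-model runs on a prefix slice of the full model's weights. The class name and dimensions are made up for illustration; this is a conceptual sketch of the nesting idea, not Gemma 3n's actual implementation.

```python
import torch
import torch.nn as nn

class NestedFFN(nn.Module):
    """Toy MatFormer-style feed-forward block: a smaller sub-model uses a
    prefix slice of the hidden dimension, so the small model's weights are
    literally contained inside the large one. Illustrative only."""
    def __init__(self, d_model: int, d_hidden_full: int):
        super().__init__()
        self.up = nn.Linear(d_model, d_hidden_full)
        self.down = nn.Linear(d_hidden_full, d_model)

    def forward(self, x: torch.Tensor, d_hidden: int) -> torch.Tensor:
        # Use only the first `d_hidden` hidden units (the nested sub-network).
        h = torch.relu(x @ self.up.weight[:d_hidden].T + self.up.bias[:d_hidden])
        return h @ self.down.weight[:, :d_hidden].T + self.down.bias

ffn = NestedFFN(d_model=64, d_hidden_full=256)
x = torch.randn(1, 64)
y_small = ffn(x, d_hidden=64)   # "E2B-like" slice: fewer active parameters
y_full = ffn(x, d_hidden=256)   # full "E4B-like" capacity
```

Because the small slice is a subset of the large model's weights, only the full model needs to be stored to serve either size.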

Google is also allowing users to mix and match internal components of the two sizes to create custom-sized models. For this, the company is releasing a tool called MatFormer Lab that will allow developers to test various combinations to help find custom model sizes.

Currently, Gemma 3n is available to download via Google's Hugging Face listing and Kaggle entry. Users can also visit Google AI Studio to try Gemma 3n. Notably, the Gemma model can also be deployed directly from AI Studio to Cloud Run.
