Complete guide to running Gemma 4 locally with Ollama and connecting it…
INEMA
But with the openai client:
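    from openai import OpenAI

    # Point the OpenAI client at the local Ollama server.
    client = OpenAI(
        base_url="http://localhost:11434/v1",
        api_key="ollama",  # any value works
    )
    response = client.chat.completions.create(
        model="gemma4:e4b",
        messages=[{"role": "user", "content": "oi"}],
    )
    # Print the text of the model's reply.
    print(response.choices[0].message.content)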
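For readers who prefer not to install the `openai` package, the same request can be sketched with only the standard library. This is a minimal sketch, assuming Ollama is listening on its default port and serves the OpenAI-compatible `/v1/chat/completions` route; `build_chat_request`, `extract_reply`, and `chat` are hypothetical helper names, not part of any library.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_BASE_URL = "http://localhost:11434/v1"


def build_chat_request(model, prompt, base_url=OLLAMA_BASE_URL):
    """Build a POST request for the OpenAI-compatible chat completions route."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Placeholder token, mirroring api_key="ollama" in the snippet above.
            "Authorization": "Bearer ollama",
        },
        method="POST",
    )


def extract_reply(response_body):
    """Pull the assistant text out of a chat completions response body."""
    return response_body["choices"][0]["message"]["content"]


def chat(model, prompt):
    """Send the prompt to a running Ollama server and return the reply text."""
    request = build_chat_request(model, prompt)
    with urllib.request.urlopen(request, timeout=120) as resp:
        return extract_reply(json.load(resp))


# Usage (requires a running server with the model pulled):
# print(chat("gemma4:e4b", "oi"))
```

Keeping the request builder separate from the network call makes it easy to inspect the exact URL and JSON body before anything is sent.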
HOW TO INSTALL ON A SERVER FROM SCRATCH
- Update the system:

      sudo apt update && sudo apt upgrade -y

- Install Docker:

      curl -fsSL https://get.docker.com | sh
      sudo systemctl enable --now docker
      sudo usermod -aG docker $USER
      newgrp docker

- Install the Nvidia drivers (if you have a GPU):

      sudo apt install -y ubuntu-drivers-common
      sudo ubuntu-drivers autoinstall
      sudo reboot

- Install the Nvidia Container Toolkit (so Docker can use the GPU):

      curl -fsSL nvidia.github.io | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
      curl -s -L nvidia.github.io | \
        sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
        sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
      sudo apt update && sudo apt install -y nvidia-container-toolkit
      sudo nvidia-ctk runtime configure --runtime=docker
      sudo systemctl restart docker
- Bring up Ollama + Open WebUI. Create the folder:

      mkdir ~/ollama && cd ~/ollama

  Create the docker-compose.yml:

      services:
        ollama:
          image: ollama/ollama:latest
          container_name: ollama
          ports:
            - "11434:11434"
          volumes:
            - ollama-data:/root/.ollama
          environment:
            - OLLAMA_HOST=0.0.0.0
            - OLLAMA_KEEP_ALIVE=24h
          restart: unless-stopped
          deploy:
            resources:
              reservations:
                devices:
                  - driver: nvidia
                    count: all
                    capabilities: [gpu]

        open-webui:
          image: ghcr.io/open-webui/open-webui:main
          container_name: open-webui
          ports:
            - "3000:8080"
          extra_hosts:
            - "host.docker.internal:host-gateway"
          environment:
            - OLLAMA_BASE_URL=http://host.docker.internal:11434
          volumes:
            - open-webui-data:/app/backend/data
          restart: unless-stopped

      volumes:
        ollama-data:
          name: ollama-data
        open-webui-data:
          name: open-webui-data

  Start it:

      docker compose up -d

- Download a model:

      docker exec ollama ollama pull gemma4:e4b

Then open http://SERVER-IP:3000 in a browser.
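Once the containers are up and the model is pulled, it can help to confirm from another machine that the server answers and that the model is actually present. The sketch below assumes Ollama's `/api/tags` route, which lists locally available models as JSON, and the published port 11434 from the compose file; `fetch_tags`, `model_names`, and `is_model_pulled` are hypothetical helper names, and the server address is whatever you use in place of the server IP.

```python
import json
import urllib.request


def model_names(tags_response):
    """Return the model names found in an /api/tags response body."""
    return [entry["name"] for entry in tags_response.get("models", [])]


def is_model_pulled(tags_response, wanted):
    """True when the wanted tag (e.g. "gemma4:e4b") is in the model list."""
    return wanted in model_names(tags_response)


def fetch_tags(host, port=11434):
    """Query a running Ollama server for its local models (needs network access)."""
    with urllib.request.urlopen(f"http://{host}:{port}/api/tags", timeout=10) as resp:
        return json.load(resp)


# Usage (replace the host with your server's address):
# tags = fetch_tags("192.0.2.10")
# print("gemma4:e4b ready:", is_model_pulled(tags, "gemma4:e4b"))
```

If `fetch_tags` raises a connection error, the usual suspects are a firewall blocking port 11434 or the `ollama` container not running.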
🚀 Ollama + Gemma 4: Complete Guide
Run Claude Code with Google's new Gemma 4: 100% free, private, and local.

📌 Overview
Set up Gemma 4 locally with Ollama and connect it to Claude Code:
→ No API keys
→ No costs
→ No internet
⚡ Key Advantages

| Feature | Benefit |
| --- | --- |
| 💰 Cost | $0 forever |
| 🔒 Privacy | Everything runs locally |
| ⚡ Speed | No network latency |
| 🧠 Capability | Great for everyday tasks |

🧰 Requirements
- macOS / Linux / Windows
- Ollama ≥ 0.20.0
- 8GB–20GB of RAM
- Claude Code (`npm install -g @anthropic-ai/claude-code`)

🛠️ Installation Guide

1. Install Ollama

```
brew install ollama
# or
curl -fsSL https://ollama.com/install.sh | sh
```
2. Start the server: `ollama serve`
3. Download the model

| Model | RAM | Use |
| --- | --- | --- |
| e2b | ~4GB | Low-end machines |
| e4b | ~8GB | Default |
| 26b | ~14GB | Workstation |
| 31b | ~19GB | 🚀 Best |

`ollama pull gemma4:31b`
4. Test the model: `ollama run gemma4:31b`
5. Connect to Claude Code: `ollama launch claude`
⚡ Quick Start (copy and paste)

    brew install ollama && ollama serve &
    ollama pull gemma4:31b
    ANTHROPIC_BASE_URL=http://localhost:11434 ANTHROPIC_AUTH_TOKEN=ollama ANTHROPIC_MODEL=gemma4:31b claude
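🧪 Useful Commands

    ollama list              # list downloaded models
    ollama ps                # show models currently loaded
    ollama stop gemma4:31b   # unload a running model
    ollama rm gemma4:31b     # delete a downloaded model
    ollama --version         # show the installed version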
🧠 Usage Strategy

| Use Gemma 4 for | Use Claude (cloud) for |
| --- | --- |
| Simple code | Complex systems |
| Bug fixes | Architecture |
| Boilerplate | Critical code |

🎯 What You Get
✅ Free AI-assisted coding
✅ 100% offline
✅ Unlimited usage
✅ Commercial license (Apache 2.0)

🔗 Resources
- https://ollama.com
- https://ollama.com/library/gemma4
- https://docs.anthropic.com/claude/docs/claude-code

✨ Tips
- Use 31b if you have enough RAM
- Use e4b on laptops
- Keep Ollama up to date

🏁 Done
You are now running local AI for coding at no cost 🚀
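The RAM figures listed for each model, together with the tips, amount to a simple selection rule: take the largest variant that fits in memory. A minimal sketch of that rule follows. The RAM values are the approximate ones from the model table; the `gemma4:e2b` and `gemma4:26b` tags are assumed to follow the same naming as the `gemma4:e4b` and `gemma4:31b` tags used elsewhere in this guide, and `pick_model` is a hypothetical helper name.

```python
from typing import Optional

# Approximate RAM needs in GB, taken from the model table in this guide.
MODEL_RAM_GB = {"e2b": 4, "e4b": 8, "26b": 14, "31b": 19}


def pick_model(available_ram_gb: float) -> Optional[str]:
    """Return the largest gemma4 tag whose listed RAM need fits, or None."""
    fitting = [(ram, tag) for tag, ram in MODEL_RAM_GB.items() if ram <= available_ram_gb]
    if not fitting:
        return None
    # max() compares the RAM figure first, so the biggest fitting model wins.
    _, tag = max(fitting)
    return f"gemma4:{tag}"


print(pick_model(16))
```

The table values are rough, and the operating system and other programs need memory too, so in practice it may be safer to pass a number somewhat below the machine's total RAM.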
Ollama + Gemma 4: Ultimate Cheat Sheet
> **Run Claude Code with Google's brand-new Gemma 4 — 100% free, 100% private, 100% local.**
---
✨ What This Is
This cheat sheet walks you through everything you need to set up Gemma 4 (Google's new open-weight AI model) running locally on your machine through Ollama, then connect it to Claude Code so you can code with AI assistance for $0/month — forever.
No API keys. No usage bills. No rate limits. No internet required.
---
🎯 Why You Want This
💰 Cost — Code without spending a penny
🔒 Privacy — Every line of code stays on your machine
⚡ Speed — No network round-trips, instant local inference
🧠 Performance — Good enough for 80% of everyday coding tasks
---
📋 Prerequisites
Before you start, make sure you have:
A Mac, Linux box, or Windows PC
Ollama v0.20.0 or later
At least 8GB RAM (e4b) or 20GB RAM (31b)
Claude Code installed
About 10 minutes
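The tool and version prerequisites can be checked with a short script before starting. This is a sketch under stated assumptions: it relies only on `ollama --version` (listed under Commands in this guide) and on both CLIs being on `PATH`. It makes no assumption about the exact wording of the version output and simply takes the first `x.y.z` it finds. `parse_version`, `meets_minimum`, and `check_prerequisites` are hypothetical helper names.

```python
import re
import shutil
import subprocess
from typing import Tuple

MIN_OLLAMA_VERSION = (0, 20, 0)  # from the prerequisites list


def parse_version(text: str) -> Tuple[int, int, int]:
    """Extract the first x.y.z version number found in the text."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no x.y.z version found in {text!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


def meets_minimum(version_text: str, minimum: Tuple[int, int, int] = MIN_OLLAMA_VERSION) -> bool:
    """Compare numerically, so that 0.3.12 is correctly older than 0.20.0."""
    return parse_version(version_text) >= minimum


def check_prerequisites() -> dict:
    """Report whether the ollama and claude CLIs are installed and recent enough."""
    report = {
        "ollama_installed": shutil.which("ollama") is not None,
        "claude_installed": shutil.which("claude") is not None,
    }
    if report["ollama_installed"]:
        result = subprocess.run(["ollama", "--version"], capture_output=True, text=True)
        report["ollama_version_ok"] = meets_minimum(result.stdout + result.stderr)
    return report


# Usage: print(check_prerequisites())
```

Comparing tuples of integers rather than raw strings matters here, because a plain string comparison would rank "0.3.12" above "0.20.0".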
---
1️⃣ Install Ollama
Mac
```
brew install ollama
```
Linux
```
curl -fsSL https://ollama.com/install.sh | sh
```
Windows
Download from https://ollama.com/download
---
2️⃣ Start Server
```
ollama serve
```
---
3️⃣ Pull Model
```
ollama pull gemma4:31b
```
---
4️⃣ Test
```
ollama run gemma4:31b
```
---
5️⃣ Connect Claude Code
```
ollama launch claude
```
---
⚡ Quick Start
```
brew install ollama && ollama serve &
ollama pull gemma4:31b
ANTHROPIC_BASE_URL=http://localhost:11434 ANTHROPIC_AUTH_TOKEN=ollama ANTHROPIC_API_KEY= ANTHROPIC_MODEL=gemma4:31b claude
```
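The last Quick Start line sets four environment variables for a single `claude` invocation. The same launch can be sketched from Python, which is convenient for switching models without retyping the variables. This is only an illustration: the variable names and values are copied from the Quick Start line, while `claude_env` and `launch_claude` are hypothetical helper names.

```python
import os
import shutil
import subprocess
from typing import Dict


def claude_env(model: str, base_url: str = "http://localhost:11434") -> Dict[str, str]:
    """Return a copy of the current environment with the Quick Start variables set."""
    env = dict(os.environ)
    env.update(
        {
            "ANTHROPIC_BASE_URL": base_url,
            "ANTHROPIC_AUTH_TOKEN": "ollama",
            "ANTHROPIC_API_KEY": "",  # left empty, as in the Quick Start line
            "ANTHROPIC_MODEL": model,
        }
    )
    return env


def launch_claude(model: str = "gemma4:31b") -> int:
    """Start the claude CLI against the local server and return its exit code."""
    if shutil.which("claude") is None:
        raise FileNotFoundError("claude CLI not found on PATH")
    return subprocess.run(["claude"], env=claude_env(model)).returncode


# Usage (requires Claude Code installed and `ollama serve` running):
# launch_claude("gemma4:e4b")
```

Because the variables are applied only to the child process, the surrounding shell environment stays untouched, just as with the one-line form.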
🛠️ Commands
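    ollama list              # list downloaded models
    ollama ps                # show models currently loaded
    ollama stop gemma4:31b   # unload a running model
    ollama rm gemma4:31b     # delete a downloaded model
    ollama --version         # show the installed version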
🎁 What You Get
Free AI coding
100% local & private
Offline usage
No limits
Apache 2.0 license
📚 Links
https://ollama.com
https://ollama.com/library/gemma4
https://docs.anthropic.com/claude/docs/claude-code