- What Does “Running a Model” Mean in Ollama?
- Pull (Download) Your First AI Model
- Run the Model and Start Chatting
- Understanding Model Sizes
- Example Prompts to Try
- Run Another Model
- Common Beginner Errors & Fixes
- FAQ
- Conclusion
Now that you have installed Ollama on your system, the next exciting step is actually running your first AI model.
This is where people are usually surprised: it feels just like talking to ChatGPT, except everything runs offline, directly on your computer. By the end, you'll have a local AI assistant to chat with.
What Does “Running a Model” Mean in Ollama?
When you run a model, you’re basically loading an AI brain into your system so it can answer your questions. It’s like a smaller ChatGPT on your local machine: not as powerful, but able to handle most everyday tasks.
Examples of models:
- Llama 3.1 (general-purpose assistant)
- Mistral (fast and lightweight)
- Gemma (Google’s open model)
- Phi (super small and efficient)
Think of them like different personalities or skill-sets you can plug in.
Pull (Download) Your First AI Model
Before you run a model, you need to download it. The Ollama CLI provides the pull command for exactly this.
Open your terminal (Command Prompt on Windows, Terminal on Mac/Linux) and type:
ollama pull llama3.1
What this does:
- Downloads the Llama 3.1 model
- Saves it on your computer
- Makes it available for offline use
Depending on your internet speed, this may take a few minutes.
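Before moving on, you can confirm the download worked. The list command shows every model stored locally along with its size:
# Show all locally stored models and their sizes
ollama list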
Run the Model and Start Chatting
Once the model is downloaded, run:
ollama run llama3.1
You’ll immediately see a prompt like:
>>>
Now type anything, for example:
>>> Write a short motivational message for my morning.
The model will reply instantly, just like ChatGPT, but running locally.
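A handy shortcut: if you only want a single answer rather than an interactive session, you can pass the prompt directly to the run command and it will print the reply and exit:
# One-shot mode: answer once, then return to the shell
ollama run llama3.1 "Write a short motivational message for my morning."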
To exit the chat:
/bye
Understanding Model Sizes
When you download a model, you might see versions like:
- 7B
- 8B
- 13B
- 70B
These numbers represent the size of the model, measured in “billions of parameters.” As a rough rule of thumb, the 4-bit quantized builds Ollama serves by default need about half a gigabyte of memory per billion parameters, so a 7B model takes roughly 4–5 GB.
Smaller models (3B–7B)
- Faster
- Less memory
- Good for simple tasks
Medium models (8B–13B)
- Better reasoning
- Still fast on modern laptops
Large models (30B–70B)
- Advanced reasoning
- Needs strong hardware
- Not recommended for beginners
For most people, 7B or 8B models are more than enough.
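Many models in the Ollama library are published in several of these sizes, and you pick one by adding a tag after the model name (the exact tags vary per model, so check its page on ollama.com). For Llama 3.1, for example:
# Pull a specific size by appending a tag to the model name
ollama pull llama3.1:8b
ollama pull llama3.1:70b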
Example Prompts to Try
You can ask the model anything you like. Here are a few prompts to get you started.
Writing a poem
Write a 2 line poem about morning coffee.
Coding help
Explain how a Python function works with an example.
Summary
Summarize this paragraph in simple words: [paste text]
Brainstorming
Give me 5 blog ideas about AI that are easy to write.
Run Another Model
To see the difference between models, try:
Mistral
ollama pull mistral
ollama run mistral
Gemma
ollama pull gemma2
ollama run gemma2
Phi
ollama pull phi3
ollama run phi3
Common Beginner Errors & Fixes
Error: “Model not found”
This usually means the model hasn’t been downloaded yet, or its name is misspelled. Pull it first:
ollama pull llama3.1
Error: “Port already in use”
This means an Ollama server is already running (it listens on port 11434 by default). Restart Ollama and try again.
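If you want to see what is actually holding the port before restarting, a quick check on Mac/Linux (assuming the common lsof tool is installed) looks like this:
# Show the process bound to Ollama's default port (11434)
lsof -i :11434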
FAQ
Q: Do I need internet to use the model?
Only for downloading. After that, everything works offline.
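Under the hood, the ollama command talks to a server running on your own machine (port 11434 by default), which is exactly why no internet is needed. You can even query that local server directly over its REST API:
# Ask the local Ollama server for a one-off completion (no internet involved)
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.1",
  "prompt": "Say hello in five words.",
  "stream": false
}'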
Q: Can I run multiple models?
Yes. Models stay on disk once downloaded, so you can keep several and switch between them freely with ollama run, as shown below.
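A couple of housekeeping commands help here (ollama ps requires a fairly recent Ollama version):
# See which model is loaded in memory right now
ollama ps
# Delete a model you no longer need to free disk space
ollama rm mistral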
Q: Do all models work on all laptops?
Most 7B–8B models run well on an ordinary laptop with about 8 GB of RAM; bigger models need proportionally more.
Conclusion
Running your first model in Ollama is easy and fun. Once you download a model and start chatting with it, you’ll understand the power of offline AI.