Now that you have installed Ollama on your system, the next exciting step is actually running your first AI model.
This is where people usually get surprised: it feels just like talking to ChatGPT, except everything runs offline, directly on your computer. By the end of this section, you’ll have a local AI assistant to chat with.

What Does “Running a Model” Mean in Ollama?

When you run a model, you’re basically loading an AI brain into your system so it can answer your questions. It’s like running a smaller version of ChatGPT on your own machine: not quite as powerful as the cloud version, but capable of handling most everyday tasks.

Examples of models:

  • Llama 3.1 (general-purpose assistant)
  • Mistral (fast and lightweight)
  • Gemma (Google’s open model)
  • Phi (super small and efficient)

Think of them like different personalities or skill-sets you can plug in.

Pull (Download) Your First AI Model

Before you run a model, you need to download it. The Ollama client provides the pull command for this.

Open your terminal (Command Prompt on Windows, Terminal on Mac/Linux) and type:

Bash
ollama pull llama3.1

What this does:

  • Downloads the Llama 3.1 model
  • Saves it on your computer
  • Makes it available for offline use

Depending on your internet speed, this may take a few minutes.
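
To confirm the download worked, you can list the models stored on your machine:

Bash
# Show every downloaded model with its tag and size
ollama list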

Run the Model and Start Chatting

Once the model is downloaded, run:

Bash
ollama run llama3.1

You’ll immediately see a prompt like:

Bash
>>>

Now type anything, for example:

Bash
>>> Write a short motivational message for my morning.

The model will reply instantly — just like ChatGPT, but running locally.

To exit the chat:

Bash
/bye
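
You don’t always need the interactive chat. If you pass the prompt directly on the command line, Ollama answers once and returns to your shell, which is handy for quick questions:

Bash
# Ask a single question without opening the chat prompt
ollama run llama3.1 "Write a short motivational message for my morning."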

Understanding Model Sizes

When you download a model, you might see versions like:

  • 7B
  • 8B
  • 13B
  • 70B

These numbers represent the size of the model, measured in “billions of parameters.”

Smaller models (3B–7B)

  • Faster
  • Less memory
  • Good for simple tasks

Medium models (8B–13B)

  • Better reasoning
  • Still fast on modern laptops

Large models (30B–70B)

  • Advanced reasoning
  • Needs strong hardware
  • Not recommended for beginners

For most people, 7B or 8B models are more than enough.
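
Sizes also map directly to download tags. A rough rule of thumb: a quantized model needs somewhere around half its parameter count in gigabytes of RAM, so an 8B model is comfortable on a machine with about 8 GB free. The tags below are real Llama 3.1 tags; the download sizes are approximate:

Bash
# Pull a specific size by tag instead of the default
ollama pull llama3.1:8b     # roughly a 5 GB download; fine for most laptops
ollama pull llama3.1:70b    # roughly a 40 GB download; needs powerful hardware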

Example Prompts to Try

You can ask the model anything you like. Here are a few prompts to get you started; a trick for feeding whole files into a prompt follows the examples.

Writing a poem

Bash
Write a 2 line poem about morning coffee.

Coding help

Bash
Explain how a Python function works with an example.

Summary

Bash
Summarize this paragraph in simple words: [paste text]

Brainstorming

Bash
Give me 5 blog ideas about AI that are easy to write.
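
Prompts don’t have to be typed by hand, either. On Mac/Linux you can splice a file’s contents into the prompt with shell command substitution; notes.txt below is just a hypothetical placeholder for your own file:

Bash
# Summarize a local file (notes.txt is a hypothetical example)
ollama run llama3.1 "Summarize this paragraph in simple words: $(cat notes.txt)"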

Run Another Model

To see the difference between models, try:

Mistral

Bash
ollama pull mistral
ollama run mistral

Gemma

Bash
ollama pull gemma2
ollama run gemma2

Phi

Bash
ollama pull phi3
ollama run phi3
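
Each model takes up several gigabytes of disk space, so after experimenting you may want to remove the ones you don’t use:

Bash
# Delete a downloaded model to reclaim disk space
ollama rm mistral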

Common Beginner Errors & Fixes

Error: “Model not found”

This usually means the model hasn’t been downloaded yet, or the name is misspelled (for example, llama3 instead of llama3.1). Pull it first:

Bash
ollama pull llama3.1

Error: “Port already in use”

This means another process is already using the port Ollama’s server listens on. Restarting Ollama usually fixes it.
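
On Mac/Linux you can also check what is occupying Ollama’s default port (11434) before restarting. This is a general troubleshooting sketch, not an official fix; the exact process you find may vary:

Bash
# See which process is listening on Ollama's default port
lsof -i :11434

# Once the old process is stopped, start the server again
ollama serve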

FAQ

Q: Do I need internet to use the model?

Only for downloading. After that, everything works offline.

Q: Can I run multiple models?

Yes. You can download as many models as your disk space allows and switch between them with ollama run <model-name>.

Q: Do all models work on all laptops?

Most 7B–8B models run well on a typical laptop with 8–16 GB of RAM. Larger models (30B+) need much more memory.

Conclusion

Running your first model in Ollama is easy and fun. Once you download a model and start chatting with it, you’ll understand the power of offline AI.
