How to Run GLM-5.2 Inside Hermes Agent in Five Minutes

I have 319,000 YouTube subscribers and 3,600+ members in my AI Profit Boardroom. Most of them run Hermes Agent for content and code work. The question I get most this month is simple. Can I swap the model behind Hermes to GLM-5.2? Yes. And it takes about five minutes once you know the exact keys. This post shows every command I ran to make it live. Nothing here is theory. I typed each line, hit enter, and watched it work.

Why GLM-5.2 in Hermes

GLM-5.2 is Z.ai's flagship coding model. It ships with a 1 million token context window. That window is the whole reason to do this swap. One million tokens holds an entire codebase at once. It holds a full Obsidian vault. It holds every file in a small repo with room to spare. Hermes Agent is provider-agnostic by design. You do not fork it. You do not patch it. You add a profile, point it at an endpoint, and run.

What you need first

You need Hermes installed and working on your machine. If you have not installed it yet, run the one-line installer.

curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash

You need a GLM-5.2 API key from the Z.ai platform. Have that key string in hand before you touch any config. You also need to pick how you want to reach GLM-5.2. There are two supported paths and I cover both below.

The two ways to wire GLM-5.2 into Hermes

Hermes speaks to models through providers. A provider is just a name plus a base URL plus a key. GLM-5.2 has two clean paths into Hermes.

Path A: the local Ollama daemon

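This path routes GLM-5.2 through your local Ollama install. It is best if you already run Ollama and want one toolchain for every model. The glm-5.2:cloud tag is a cloud model, so your local daemon forwards the request and the inference runs on Ollama's servers, not on your machine. You need an Ollama cloud subscription for the glm-5.2:cloud tag to pull.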

The provider block looks like this.

model:
  default: "glm-5.2:cloud"
  provider: "ollama"
  base_url: "http://localhost:11434/v1"

The model name is glm-5.2:cloud. The provider name is ollama. The base URL points at your local Ollama daemon on port 11434.

Path B: the Z.ai Coding Plan

This path goes straight to Z.ai over the network. It is the simplest route if you do not want to run a local daemon. There is nothing to install besides Hermes.

The provider block looks like this.

model:
  default: "glm-5.2"
  provider: "zai"
  base_url: "https://api.z.ai/api/coding/paas/v4"

The provider name is zai. The base URL is the Z.ai coding endpoint. That endpoint is built for the coding plan tier.

I use Path B in this post because it has the fewest moving parts. The steps are identical for Path A once you swap those three values.
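Here is a minimal sketch of that swap. The helper name print_glm_setup is made up for this post and is not part of Hermes. It only prints the hermes config set lines used in Step 2 below, filled in for whichever path you pick, so you can read them before you run anything. It assumes Path A takes the same context_length setting as Path B.

```shell
# Hypothetical helper (not part of Hermes): print the four config lines
# for one path. Values are copied from the two provider blocks above.
print_glm_setup() {
  local route="$1" profile="${2:-glm}" model base_url
  case "$route" in
    ollama) model="glm-5.2:cloud"; base_url="http://localhost:11434/v1" ;;
    zai)    model="glm-5.2";       base_url="https://api.z.ai/api/coding/paas/v4" ;;
    *) echo "unknown route: $route (expected ollama or zai)" >&2; return 1 ;;
  esac
  # The provider name is the same string as the route name.
  printf 'hermes -p %s config set model.default "%s"\n' "$profile" "$model"
  printf 'hermes -p %s config set model.provider "%s"\n' "$profile" "$route"
  printf 'hermes -p %s config set model.base_url "%s"\n' "$profile" "$base_url"
  printf 'hermes -p %s config set model.context_length 1000000\n' "$profile"
}

# Print the Path A lines for the default profile name "glm".
print_glm_setup ollama
```

Printing instead of running is deliberate. You see exactly what will be written before it touches a profile.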

Step 1: create a fresh Hermes profile

Do not edit your default profile. Make a new one so the GLM-5.2 setup stays isolated. Profiles are cheap. They are just separate config trees.

hermes profile create glm

That one command creates a profile named glm. It gets its own config file, its own sessions, its own skills. You now have a clean slate to configure.

Step 2: set the model and provider

Hermes stores model settings in config.yaml under the model block. You set them from the terminal with hermes config set. Run these four lines, one per setting.

hermes -p glm config set model.default "glm-5.2"
hermes -p glm config set model.provider "zai"
hermes -p glm config set model.base_url "https://api.z.ai/api/coding/paas/v4"
hermes -p glm config set model.context_length 1000000

That last line is the important one. GLM-5.2 supports a 1,000,000 token window. Telling Hermes that number lets it plan context use correctly. Without it, Hermes falls back to a smaller window and wastes the model.

You can check what landed in the file.

hermes -p glm config show

Look for the model section in the output. You want to see default, provider, base_url, and context_length all set. If any are blank, rerun that one config set line.
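If you would rather not eyeball it, the check can be sketched in a few lines of shell. This assumes the text you feed it is shaped like the YAML provider block shown earlier, with indented key: value lines. If config show prints a different shape on your version, feed it the profile's config.yaml instead. The function name missing_model_keys is made up for this post.

```shell
# Hypothetical helper: read YAML-shaped text on stdin and print every
# model setting that is absent or set to an empty value.
# Limitation: it matches key names anywhere in the input, not only
# inside the model block.
missing_model_keys() {
  local text key
  text=$(cat)
  for key in default provider base_url context_length; do
    # A key counts as set when it is followed by a non-empty value,
    # with or without surrounding quotes.
    if ! printf '%s\n' "$text" |
        grep -Eq "^[[:space:]]*${key}:[[:space:]]*[\"']?[^[:space:]\"']"; then
      echo "$key"
    fi
  done
}

# Usage (assumes config show prints YAML-shaped output):
#   hermes -p glm config show | missing_model_keys
```

No output means all four settings are present. Any name it prints is the config set line to rerun.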

Step 3: add your GLM API key

The key does not go in config.yaml. Keys go in the .env file so they stay out of version control. Hermes reads the GLM key from one environment variable: GLM_API_KEY.

First, find your .env path for this profile.

hermes -p glm config env-path

That prints the exact file to edit. Open it and add this line at the bottom.

GLM_API_KEY=your_real_key_here

Save the file. Now export it in your current shell so the next command can see it.

export GLM_API_KEY=your_real_key_here

If you skip the export, the first real request in this shell can fail with an auth error or an empty response, and nothing points you at the missing key. That is the number one thing people hit. Do both. Write the file for next time and export for right now.
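To avoid typing the key twice, you can load it from the .env file into the current shell. This is a minimal sketch, not a Hermes feature. The function name load_glm_key is made up, and it assumes the file holds a plain GLM_API_KEY=value line, optionally wrapped in double quotes.

```shell
# Hypothetical helper: export GLM_API_KEY from a .env file into the
# current shell. Call it directly, not inside $(...), or the export
# is lost when the subshell exits.
load_glm_key() {
  local env_file="$1" line key
  if [ ! -f "$env_file" ]; then
    echo "no such file: $env_file" >&2
    return 1
  fi
  # If the key appears more than once, the last line wins.
  line=$(grep -E '^GLM_API_KEY=' "$env_file" | tail -n 1)
  if [ -z "$line" ]; then
    echo "GLM_API_KEY not found in $env_file" >&2
    return 1
  fi
  key="${line#GLM_API_KEY=}"
  # Strip one pair of surrounding double quotes, if present.
  key="${key%\"}"
  key="${key#\"}"
  export GLM_API_KEY="$key"
}

# Usage (assumes env-path prints only the file path, as described above):
#   load_glm_key "$(hermes -p glm config env-path)"
```

The file stays the single source of truth, and the key never lands in your shell history.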

Step 4: confirm the model is live

Before you trust the setup, prove it. Run the model picker to see what Hermes thinks the active model is.

hermes -p glm model

You should see glm-5.2 listed as the current model with the zai provider. If it shows something else, your config set lines did not land. Go back to Step 2 and rerun them.

Step 5: the one-line smoke test

Now send a real request. Run a single non-interactive query through the new profile.

hermes -p glm -z "Say the word ready and nothing else."

The -z flag takes a prompt and runs it once, no chat loop. If everything is wired right, you get a short reply from GLM-5.2. You should see the word ready come back. If you see that, you are live.
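If you want the smoke test in a script, the pass condition can be sketched like this. It assumes the -z reply is printed to standard output, and the helper name reply_is_ready is made up for this post.

```shell
# Hypothetical helper: succeed when the reply contains the standalone
# word "ready", in any letter case. "already" does not count.
reply_is_ready() {
  printf '%s\n' "$1" | grep -qiw 'ready'
}

# Usage (needs the glm profile from the steps above):
#   reply=$(hermes -p glm -z "Say the word ready and nothing else.")
#   if reply_is_ready "$reply"; then echo "live"; else echo "not live: $reply"; fi
```

The check is loose on purpose. Models often add a capital letter or a full stop, and that should still count as a pass.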

Run a slightly bigger test that exercises tool use on real files.

hermes -p glm -z "Read this repo and list the top-level directories." -t terminal

That hands Hermes the terminal toolset and asks GLM-5.2 to look at your files. If it runs, your setup handles real work.

The win: 1 million tokens at once

This is where the 1,000,000 token window pays off. It holds a small or mid-sized codebase in one request. It holds a full notes vault. It holds a long chat log plus the files plus the prompt, all at once. You stop chunking. You stop summarising to fit limits. You point Hermes at the project and let GLM-5.2 see all of it.
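Whether your own project fits is easy to estimate. The sketch below uses the common rule of thumb of roughly four bytes of text per token. That ratio is an assumption, not GLM-5.2's real tokenizer, so treat the result as a ballpark. Both function names are made up for this post.

```shell
# Hypothetical helper: estimate the token count of a directory as
# total file bytes divided by 4, skipping anything under .git.
estimate_tokens() {
  local dir="${1:-.}" bytes
  bytes=$(find "$dir" -type f ! -path '*/.git/*' -exec cat {} + 2>/dev/null |
    wc -c | tr -d ' ')
  echo $(( bytes / 4 ))
}

# Succeed when the estimate is at or under the limit (default 1,000,000).
fits_in_window() {
  local limit="${2:-1000000}"
  [ "$(estimate_tokens "$1")" -le "$limit" ]
}

# Usage:
#   estimate_tokens ~/code/my-project
#   fits_in_window ~/code/my-project && echo "fits in one request"
```

Binary files and build output inflate the number, so run it on the source directories you actually plan to hand to the model.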

When each path wins

Use Path A, the local Ollama daemon, when you already run Ollama for other models. You keep one toolchain and one billing relationship. Use Path B, the Z.ai Coding Plan, when you want the fewest moving parts. No local daemon, no port to manage, just an endpoint and a key. Both paths give you the same model and the same context window. The only difference is where the request goes first.

Troubleshooting: the three ways it breaks

The setup is small, but it fails in three specific ways. I hit all three while writing this post. Here is each one and the fix.

Failure 1: wrong base_url

Symptom: every request times out or returns a connection error.

Cause: the base_url in config does not match the provider. The Z.ai coding endpoint ends in /v4. The Ollama endpoint ends in /v1 and runs on localhost port 11434.

Fix: run hermes -p glm config show and read the base_url line. If it is wrong, set it again with hermes -p glm config set model.base_url "...".
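The pairing rule is simple enough to check in code. This is a minimal sketch with a made-up function name, check_base_url. It only knows the two provider names and URL shapes described in this post.

```shell
# Hypothetical helper: succeed when the base_url has the shape this
# post describes for the given provider name.
check_base_url() {
  local provider="$1" base_url="$2"
  case "$provider:$base_url" in
    zai:https://*/v4)         return 0 ;;  # Z.ai coding endpoint ends in /v4
    ollama:http://*:11434/v1) return 0 ;;  # Ollama daemon, port 11434, /v1
  esac
  case "$provider" in
    zai)    echo "zai base_url should be https and end in /v4, got: $base_url" >&2 ;;
    ollama) echo "ollama base_url should use port 11434 and end in /v1, got: $base_url" >&2 ;;
    *)      echo "unknown provider: $provider (expected zai or ollama)" >&2 ;;
  esac
  return 1
}

# Usage:
#   check_base_url zai "https://api.z.ai/api/coding/paas/v4" && echo "ok"
```

It also rejects a mistyped provider name, which covers the quieter mismatch described under Failure 3.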

Failure 2: key not exported

Symptom: you get an auth error or an empty response.

Cause: the key is in the .env file but not in your current shell. Hermes reads .env at startup, but the very first run in a fresh shell can miss it.

Fix: run export GLM_API_KEY=your_real_key_here in the terminal you are using. Then rerun your query.

Failure 3: provider name mismatch

Symptom: Hermes falls back to a different model without telling you.

Cause: the provider value in config does not match a name Hermes knows. The Z.ai path needs the exact string zai. The Ollama path needs the exact string ollama. A typo here is silent. Hermes tries to resolve the provider, fails quietly, and uses a fallback.

Fix: run hermes -p glm config show and read the model.provider line. If it says anything other than zai or ollama, set it again.

hermes -p glm config set model.provider "zai"

Then rerun the smoke test from Step 5.

Make it stick

Once the smoke test passes, save the export line to your shell profile. Add it to .bashrc, .zshrc, or whatever your shell reads on start.

export GLM_API_KEY=your_real_key_here
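If you would rather not paste that line by hand, a sketch like this appends it only once. The function name persist_glm_key is made up for this post. Pass it whichever startup file your shell reads.

```shell
# Hypothetical helper: append the export line to a shell startup file,
# unless a GLM_API_KEY export is already there.
persist_glm_key() {
  local rc_file="$1" key="$2"
  if grep -q '^export GLM_API_KEY=' "$rc_file" 2>/dev/null; then
    echo "GLM_API_KEY already exported in $rc_file, leaving it unchanged" >&2
    return 0
  fi
  printf '\nexport GLM_API_KEY=%s\n' "$key" >> "$rc_file"
}

# Usage:
#   persist_glm_key ~/.zshrc "$GLM_API_KEY"
```

Running it twice is safe. The second call sees the existing line and does nothing.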

That way every new terminal has the key. You never hit the missing-key failure from Step 3 again. You can also set the profile as your default so you skip the -p glm flag.

hermes profile use glm

After that, plain hermes runs the GLM-5.2 profile. No flags needed.

Wrap-up

You now have a second profile that runs GLM-5.2 inside Hermes. The setup is five steps and one export. The payoff is a 1 million token context window on every task. That window changes how you work. You stop splitting files to fit limits. You stop losing context between turns. You point Hermes at the whole project and go. If it breaks, check the three failures above. One of them is the cause. Fix it, rerun the smoke test, and you are back live.