tqdm — Progress Bars

Add auto-updating progress bars to any Python loop or CLI pipeline with tqdm. Covers iterables, manual updates, pandas integration, nested bars, async, Jupyter, and byte-piping.

What it is#

tqdm wraps any Python iterable and displays an auto-updating progress bar with ETA, rate, and elapsed time in the terminal or a Jupyter notebook. It adds roughly 60 ns of overhead per iteration and needs no configuration beyond installation: wrap an iterable and the bar appears. It is the de facto standard for loop progress in Python data pipelines.

Install#

pip install tqdm

Output: pip's usual download and install log; exits 0 on success.

Quick example#

from tqdm import tqdm
import time

for _ in tqdm(range(100)):
    time.sleep(0.02)

Output:

100%|████████████████████████████| 100/100 [00:02<00:00, 49.8it/s]

When / why to use it#

  • Long-running loops over files, API pages, dataset rows, or model batches where ETA matters.
  • CLI tools where a bar communicates liveness to the user.
  • Data pipelines with pandas (.progress_apply) or PyTorch data loaders.
  • Notebooks where it renders as an HTML widget.
  • Shell pipelines where it acts as a byte-throughput meter (like pv).

Common pitfalls#

[!WARNING] Wrapping a generator with unknown length: if total= is omitted and tqdm cannot infer the length from len(), the bar, percentage, and ETA are dropped and only the running count, elapsed time, and rate are shown. A generator has no len(), so pass total=n explicitly whenever the item count is known in advance.

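[!WARNING] Nested bars without position=: tqdm assigns positions automatically for simple nested loops, but bars created from threads or processes, or in some terminals, can overwrite each other on the same line. Setting position=0 on the outer bar and position=1, leave=False on the inner bar makes the layout explicit.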

[!WARNING] print() inside a tqdm loop — ordinary print interleaves with bar output and creates visual noise. Replace with tqdm.write("msg"), which prints above the bar without disturbing it.

[!TIP] tqdm(iterable, desc="Loading") is the most readable one-liner. The desc= prefix appears left of the bar and doubles as a log label when redirected.

[!TIP] from tqdm.auto import tqdm auto-selects the notebook HTML widget when running in Jupyter and falls back to the terminal bar everywhere else — the cleanest single import for code that runs in both environments.
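
The generator and print() pitfalls above can be sketched together. This is a minimal illustration, assuming the record count is known before iteration starts; read_records and n_records are hypothetical names, not part of tqdm.

```python
from tqdm import tqdm

def read_records(n):
    """Hypothetical generator: it has no len(), so tqdm cannot infer a total."""
    for i in range(n):
        yield {"id": i}

n_records = 25  # assumed to be known up front, e.g. from a file header
checkpoints = []

# Passing total= restores the percentage and ETA that a bare generator would lose.
for rec in tqdm(read_records(n_records), total=n_records, desc="Records"):
    if rec["id"] % 10 == 0:
        checkpoints.append(rec["id"])
        # tqdm.write prints the message above the bar instead of through it
        tqdm.write(f"checkpoint at record {rec['id']}")

print(checkpoints)
```

If the count is only an estimate, the bar still works; it simply runs past or stops short of 100%.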

Richer example — file hashing pipeline#

from tqdm import tqdm
import pathlib, hashlib

files = list(pathlib.Path(".").glob("**/*.py"))

results = {}
with tqdm(files, desc="Hashing", unit="file", colour="green") as pbar:
    for path in pbar:
        pbar.set_postfix(file=path.name, refresh=False)
        results[str(path)] = hashlib.md5(path.read_bytes()).hexdigest()

print(f"Hashed {len(results)} files")

Output:

Hashing: 100%|██████████| 42/42 [00:00<00:00, 312.4file/s, file=utils.py]
Hashed 42 files

trange — range shorthand#

trange(n) is identical to tqdm(range(n)) and is the idiomatic way to wrap a counter loop. It accepts all the same keyword arguments as tqdm.

from tqdm import trange
import time

total = 0
for i in trange(50, desc="Summing", unit="step"):
    total += i
    time.sleep(0.01)

print(f"Total: {total}")

Output:

Summing: 100%|████████████████████| 50/50 [00:00<00:00, 87.3step/s]
Total: 1225

Manual updates with update() and set_postfix()#

When the loop body controls progress (streaming downloads, chunked reads, batch training), open tqdm as a context manager with total= and call pbar.update(n) to advance by n units. set_postfix attaches live key-value metadata to the right of the bar without interrupting the display.

from tqdm import tqdm
import time

with tqdm(total=1000, desc="Download", unit="KB") as pbar:
    downloaded = 0
    while downloaded < 1000:
        chunk = min(64, 1000 - downloaded)  # clamp the last chunk so the bar stops at total
        time.sleep(0.005)                   # simulate network latency
        downloaded += chunk
        pbar.update(chunk)
        pbar.set_postfix(chunk_kb=chunk, refresh=False)

Output:

Download: 100%|█████████| 1000/1000 [00:00<00:00, 11376.54KB/s, chunk_kb=40]
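
The same manual-update pattern applies to real chunked reads. The sketch below is one way to do it, assuming a throwaway temporary file stands in for the download; the file size and chunk size are arbitrary choices for illustration.

```python
import os
import tempfile
from tqdm import tqdm

# Create a throwaway 256 KiB file so the sketch is self-contained.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(os.urandom(256 * 1024))
    path = tmp.name

size = os.path.getsize(path)
bytes_read = 0

# unit_scale=True with unit_divisor=1024 renders counts with k/M/G suffixes
# in powers of 1024 instead of raw byte counts.
with open(path, "rb") as fh, tqdm(
    total=size, desc="Reading", unit="B", unit_scale=True, unit_divisor=1024
) as pbar:
    while True:
        chunk = fh.read(64 * 1024)
        if not chunk:
            break
        bytes_read += len(chunk)
        pbar.update(len(chunk))  # advance by the bytes actually read, not the requested size

os.remove(path)
print(bytes_read == size)
```

Updating by len(chunk) rather than the requested chunk size keeps the bar from overshooting total on the final short read.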

pandas integration — progress_apply#

tqdm patches pandas Series, DataFrame, and GroupBy objects with a progress_apply method. Call tqdm.pandas() once at module level to activate the patch; then replace .apply() with .progress_apply() throughout.

import pandas as pd
from tqdm import tqdm

tqdm.pandas(desc="Transforming")

df = pd.DataFrame({"value": range(200)})
df["doubled"] = df["value"].progress_apply(lambda x: x * 2)
print(df.tail(3))

Output:

Transforming: 100%|████████| 200/200 [00:00<00:00, 4123.7it/s]
   value  doubled
197   197      394
198   198      396
199   199      398

For groupby, use progress_apply on the grouped object the same way:

df = pd.DataFrame({"group": ["a", "b"] * 100, "val": range(200)})
result = df.groupby("group")["val"].progress_apply(sum)
print(result)

Output:

Transforming: 100%|████| 2/2 [00:00<00:00, 312.4it/s]
group
a     9900
b    10000
Name: val, dtype: int64

Nested bars#

Use position= (a 0-indexed line offset counted from the top of the bar block) and leave=False on inner bars so they erase themselves when complete. The outermost bar uses position=0 (what tqdm assigns automatically to the first bar) and leave=True (default).

from tqdm import tqdm
import time

epochs = 3
batches = 5

for epoch in tqdm(range(epochs), desc="Epoch", position=0):
    for batch in tqdm(range(batches), desc="  Batch", position=1, leave=False):
        time.sleep(0.04)

Output (mid-run):

Epoch:  67%|██████████████       | 2/3 [00:00<00:00,  4.92it/s]
  Batch:  60%|████████████       | 3/5 [00:00<00:00, 24.61it/s]

Async — asyncio and concurrent.futures#

For asyncio, use tqdm.asyncio.tqdm which provides gather() and as_completed() drop-ins that track coroutine completion. For concurrent.futures, wrap the iterator returned by pool.map with a standard tqdm.

import asyncio
from tqdm.asyncio import tqdm as atqdm

async def fetch(i):
    await asyncio.sleep(0.05)
    return i * i

async def main():
    tasks = [fetch(i) for i in range(20)]
    results = await atqdm.gather(*tasks, desc="Fetching")
    print(results[:5])

asyncio.run(main())

Output:

Fetching: 100%|████████████████████| 20/20 [00:00<00:00, 385.20it/s]
[0, 1, 4, 9, 16]

With concurrent.futures, wrap the iterator from pool.map and pass total= explicitly:

from concurrent.futures import ThreadPoolExecutor
from tqdm import tqdm
import time

def work(x):
    time.sleep(0.02)
    return x ** 2

with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(tqdm(pool.map(work, range(20)), total=20, desc="Threads"))
print(results[:5])

Output:

Threads: 100%|████████████████████| 20/20 [00:00<00:00, 194.80it/s]
[0, 1, 4, 9, 16]
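
The as_completed() drop-in mentioned at the top of this section is not shown above. A minimal sketch follows, assuming results can be handled in completion order rather than submission order; the staggered sleep times are made up to make that ordering visible.

```python
import asyncio
from tqdm.asyncio import tqdm as atqdm

async def fetch(i):
    # Later items sleep less, so completion order differs from submission order.
    await asyncio.sleep(0.01 * (5 - i))
    return i

async def main():
    coros = [fetch(i) for i in range(5)]
    finished = []
    # as_completed yields awaitables as they finish; await each one to get its result.
    for fut in atqdm.as_completed(coros, desc="Fetching"):
        finished.append(await fut)
    return finished

finished = asyncio.run(main())
print(sorted(finished))
```

Unlike gather(), this does not preserve input order, so carry an identifier in each result when the pairing matters.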

Jupyter / notebook mode#

In Jupyter, from tqdm.notebook import tqdm renders an ipywidgets-based progress widget (the ipywidgets package must be installed) that turns green on completion and red on error. The API is identical to the terminal version. from tqdm.auto import tqdm is the recommended import because it picks notebook mode automatically.

from tqdm.auto import tqdm   # widget in Jupyter, terminal bar elsewhere
import time

for _ in tqdm(range(50), desc="Training epoch"):
    time.sleep(0.02)

Custom format string and bar characters#

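bar_format= controls every token in the rendered string. ascii= accepts True (use the plain-ASCII fallback characters) or a custom string of fill characters ordered from empty to full. Despite the name, the characters need not be ASCII: the first character pads the unfilled portion, the last draws the filled portion, and any in between draw the partially filled cell.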

from tqdm import tqdm
import time

fmt = "{desc}: {percentage:3.0f}%|{bar}| {n_fmt}/{total_fmt} [{elapsed}<{remaining}, {rate_fmt}]"

for _ in tqdm(range(60), bar_format=fmt, ascii="░▒█", desc="Custom"):
    time.sleep(0.02)

Output:

Custom:  85%|█████████████████████████▒░░░░| 51/60 [00:01<00:00, 42.10it/s]

Available format tokens: {l_bar}, {bar}, {r_bar}, {n}, {n_fmt}, {total}, {total_fmt}, {percentage}, {elapsed}, {elapsed_s}, {remaining}, {remaining_s}, {rate}, {rate_fmt}, {rate_noinv}, {rate_noinv_fmt}, {postfix}, {desc}, {unit}.
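
To see how the tokens expand without running a loop, the format string can be rendered directly. This sketch uses tqdm.format_meter, the static method tqdm uses to build its display string; the n, total, and elapsed values are illustrative, not measured.

```python
from tqdm import tqdm

fmt = "{desc}: {percentage:3.0f}%|{bar}| {n_fmt}/{total_fmt} [{elapsed}<{remaining}, {rate_fmt}]"

# Render one frame of the bar as a plain string.
line = tqdm.format_meter(
    n=51, total=60, elapsed=1.2,  # illustrative numbers
    prefix="Custom",              # fills the {desc} token
    ascii="░▒█",
    bar_format=fmt,
)
print(line)
```

This is handy for checking a bar_format in a unit test or a REPL before wiring it into a long-running job.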

Dynamic description and colour#

from tqdm import tqdm
import time

stages = ["Loading", "Preprocessing", "Fitting", "Evaluating"]
with tqdm(total=len(stages), colour="cyan") as pbar:
    for stage in stages:
        pbar.set_description(stage)
        time.sleep(0.3)
        pbar.update(1)

Output:

Evaluating: 100%|███████████████████████| 4/4 [00:01<00:00,  3.33it/s]

CLI piping — byte-count filter#

Invoked as python -m tqdm, tqdm reads stdin, forwards it to stdout, and displays throughput. It is a portable drop-in for pv on systems where pv is unavailable.

cat large_file.bin | python -m tqdm --bytes > /dev/null

Output:

 512MB [00:04, 121MB/s]

Count lines instead of bytes:

cat records.jsonl | python -m tqdm --unit=line --unit-scale > output.jsonl

Output: a running line count and rate on stderr; the records pass through unchanged to output.jsonl.

Disabling bars for non-interactive contexts#

import sys
from tqdm import tqdm

verbose = True  # set via CLI arg or environment variable

# Explicit disable flag
for item in tqdm(range(100), disable=not verbose):
    pass

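# Auto-detect TTY: tqdm draws on stderr by default, so test that stream.
# Suppresses the bar in CI, cron, and redirected stderr; disable=None performs the same check.
for item in tqdm(range(100), disable=not sys.stderr.isatty()):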
    pass
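
One way to centralise this decision is a small wrapper, sketched below. The progress helper and the MYAPP_NO_PROGRESS variable are hypothetical names invented for this example; only disable= and file= are tqdm parameters.

```python
import io
import os
from tqdm import tqdm

def progress(iterable, **kwargs):
    """Hypothetical wrapper: honour an opt-out variable, else defer to tqdm's TTY check."""
    # MYAPP_NO_PROGRESS is a made-up variable name for this sketch.
    if os.environ.get("MYAPP_NO_PROGRESS") == "1":
        kwargs["disable"] = True
    else:
        # disable=None asks tqdm to disable itself when its output stream is not a TTY
        kwargs.setdefault("disable", None)
    return tqdm(iterable, **kwargs)

os.environ["MYAPP_NO_PROGRESS"] = "1"  # simulate a user opting out
sink = io.StringIO()                   # capture anything the bar would draw
total = sum(progress(range(100), file=sink))
print(total, repr(sink.getvalue()))
```

A disabled bar still iterates normally, so the loop body needs no changes either way.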

Quick reference#

| Feature | Code |
| --- | --- |
| Basic wrap | tqdm(iterable) |
| Counter loop | trange(n) |
| With label | tqdm(it, desc="Step") |
| Manual progress | tqdm(total=n) then pbar.update(k) |
| Postfix metadata | pbar.set_postfix(loss=0.42) |
| Print above bar | tqdm.write("message") |
| pandas apply | tqdm.pandas() then .progress_apply(fn) |
| Nested bars | outer position=0, inner position=1, leave=False |
| Asyncio | from tqdm.asyncio import tqdm |
| Thread pool | tqdm(pool.map(fn, items), total=n) |
| Notebook auto | from tqdm.auto import tqdm |
| Disable | disable=not verbose |
| Custom chars | ascii="░▒█" |
| CLI pipe | python -m tqdm --bytes |