Architecture

This page explains the core abstractions and data flow in MiniGrid so you can use and extend it without reading through the source.

Class hierarchy

        graph TD
    GYM[gymnasium.Env]
    MGE[MiniGridEnv]
    RG[RoomGrid]
    SR[Single-room envs]
    MR[Multi-room and BabyAI envs]

    GYM --> MGE
    MGE --> SR
    MGE --> RG
    RG --> MR

MiniGridEnv (minigrid/minigrid_env.py) owns the episode loop, rendering, and observation generation. RoomGrid (minigrid/core/roomgrid.py) extends it with a structured room-and-door layout. Concrete environments subclass one of these two and only need to implement _gen_grid().

Episode lifecycle

        flowchart TD
    A[env.reset] --> B[_gen_grid]
    B --> C[place_agent]
    C --> D[return obs dict]
    D --> E[env.step action]
    E --> F{result}
    F -->|goal reached| G[terminated, positive reward]
    F -->|lava| H[terminated, zero reward]
    F -->|max_steps hit| I[truncated, zero reward]
    F -->|otherwise| E

_gen_grid is the only method a subclass must implement. Everything else — observation building, rendering, FOV masking — is handled by the base class.
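The three outcomes in the flowchart above can be sketched in plain Python. This is not MiniGrid's code: `goal_reward` and `resolve_step` are hypothetical helpers, and the reward formula is an assumption based on my understanding of the default `MiniGridEnv._reward()`, which discounts the goal reward by the fraction of the step budget already used.

```python
def goal_reward(step_count, max_steps):
    """Assumed default goal reward: 1 minus a penalty that grows with elapsed steps."""
    return 1 - 0.9 * (step_count / max_steps)


def resolve_step(cell_type, step_count, max_steps):
    """Return (reward, terminated, truncated) for the cell the agent just entered.

    cell_type is a WorldObj type string such as "goal" or "lava",
    or None for an empty cell.
    """
    reward, terminated = 0.0, False
    if cell_type == "goal":
        reward, terminated = goal_reward(step_count, max_steps), True
    elif cell_type == "lava":
        terminated = True
    # Truncation depends only on the step budget, not on the cell.
    truncated = step_count >= max_steps
    return reward, terminated, truncated


goal_outcome = resolve_step("goal", 10, 100)
lava_outcome = resolve_step("lava", 10, 100)
timeout_outcome = resolve_step(None, 100, 100)
```

Note that under this sketch an agent reaching the goal on its very last allowed step is both terminated and truncated.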

The grid

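Grid (minigrid/core/grid.py) stores its cells in a flat, row-major list of WorldObj | None: cell (x, y) lives at index x + y*width of the internal list. Rather than indexing that list directly, access cells via grid.get(x, y) and grid.set(x, y, obj).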

Coordinate origin is the top-left corner. X increases rightward, Y increases downward. The outermost cells are conventionally walls, so the usable interior is [1, width-2] × [1, height-2].
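The indexing scheme and the interior range can be illustrated with a small standalone sketch. `flat_index` and `interior_cells` are hypothetical helpers written for this page, not part of the `Grid` API.

```python
def flat_index(x, y, width):
    """Row-major index of cell (x, y): rows are stored one after another."""
    return y * width + x


def interior_cells(width, height):
    """All (x, y) positions inside the outer wall ring, row by row."""
    return [(x, y) for y in range(1, height - 1) for x in range(1, width - 1)]


width, height = 5, 4
cells = [None] * (width * height)            # an empty 5x4 grid
cells[flat_index(3, 2, width)] = "goal"      # place something at (3, 2)
usable = interior_cells(width, height)       # x in [1, 3], y in [1, 2]
```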

World objects

        classDiagram
    class WorldObj {
        +type str
        +color str
        +can_overlap() bool
        +can_pickup() bool
        +see_behind() bool
        +toggle(env, pos) bool
        +encode() tuple
    }
    WorldObj <|-- Wall
    WorldObj <|-- Floor
    WorldObj <|-- Door
    WorldObj <|-- Key
    WorldObj <|-- Ball
    WorldObj <|-- Box
    WorldObj <|-- Goal
    WorldObj <|-- Lava

Every cell holds a WorldObj subclass or None (empty). Each object encodes to a 3-integer tuple (object_idx, color_idx, state).

Class	Walk through	Pick up	Blocks sight	Notes
`Wall`	✗	✗	✓	Always blocks vision and movement
`Floor`	✓	✗	✗	Decorative walkable tile
`Door`	if open	✗	if closed	Locked requires matching `Key` by color
`Key`	✗	✓	✗	Matches `Door` by color
`Ball`	✗	✓	✗
`Box`	✗	✓	✗	Toggle opens it; can contain another object
`Goal`	✓	✗	✗	Stepping on it ends episode with reward
`Lava`	✓	✗	✗	Stepping on it ends episode with reward 0

Door state encodes as: 0=open, 1=closed, 2=locked.
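To make the 3-integer encoding concrete, the sketch below decodes a few tuples. The lookup tables are local copies that I believe match `minigrid.core.constants`. Treat the exact index values as an assumption and import the real tables in actual code. `describe_cell` is a hypothetical helper.

```python
# Local copies for illustration; the authoritative tables live in minigrid.core.constants.
OBJECT_TO_IDX = {
    "unseen": 0, "empty": 1, "wall": 2, "floor": 3, "door": 4, "key": 5,
    "ball": 6, "box": 7, "goal": 8, "lava": 9, "agent": 10,
}
COLOR_TO_IDX = {"red": 0, "green": 1, "blue": 2, "purple": 3, "yellow": 4, "grey": 5}
DOOR_STATES = {0: "open", 1: "closed", 2: "locked"}

IDX_TO_OBJECT = {idx: name for name, idx in OBJECT_TO_IDX.items()}
IDX_TO_COLOR = {idx: name for name, idx in COLOR_TO_IDX.items()}


def describe_cell(cell):
    """Turn an (object_idx, color_idx, state) tuple into a readable string."""
    obj_idx, color_idx, state = cell
    obj = IDX_TO_OBJECT[obj_idx]
    if obj in ("unseen", "empty"):
        return obj  # color and state carry no meaning for these
    description = f"{IDX_TO_COLOR[color_idx]} {obj}"
    if obj == "door":
        description += f" ({DOOR_STATES[state]})"
    return description


locked_door = describe_cell((4, 4, 2))
goal = describe_cell((8, 1, 0))
```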

Observation format

        graph LR
    OBS[obs dict]
    IMG[image]
    DIR[direction]
    MSN[mission]
    CELL[each cell]
    OBJ[OBJECT_TO_IDX]
    COL[COLOR_TO_IDX]
    STA[state]

    OBS --> IMG
    OBS --> DIR
    OBS --> MSN
    IMG --> CELL
    CELL --> OBJ
    CELL --> COL
    CELL --> STA

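The image key is a uint8 array of shape (7, 7, 3). It holds integer-encoded cells, not pixel values: each cell is [object_idx, color_idx, state]. Decode with IDX_TO_OBJECT and IDX_TO_COLOR from minigrid.core.constants. The array is indexed image[x, y] in view coordinates: the agent itself is always at image[3, 6] (which is also where a carried object shows up), and the cell directly in front of it is image[3, 5]. Cells hidden behind walls or closed doors encode as object_idx=0 (“unseen”).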

direction is an int 0–3 (right/down/left/up). mission is the natural-language task string.
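The direction integer maps onto grid offsets in the obvious way once you recall that Y grows downward. The sketch below uses local tables that I believe mirror `DIR_TO_VEC` in `minigrid.core.constants`. `front_pos` is a hypothetical helper for computing the world cell the agent is facing.

```python
DIR_NAMES = ("right", "down", "left", "up")
# (dx, dy) per direction; "up" is negative Y because the origin is top-left.
DIR_TO_VEC = ((1, 0), (0, 1), (-1, 0), (0, -1))


def front_pos(agent_pos, direction):
    """World coordinates of the cell directly in front of the agent."""
    dx, dy = DIR_TO_VEC[direction]
    return (agent_pos[0] + dx, agent_pos[1] + dy)


facing_right = front_pos((1, 1), 0)
facing_up = front_pos((1, 1), 3)
```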

Getting pixel images instead

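import gymnasium as gym
from minigrid.wrappers import RGBImgPartialObsWrapper, RGBImgObsWrapper

env = gym.make("MiniGrid-Empty-5x5-v0")
env = RGBImgPartialObsWrapper(env)   # option 1: agent POV, pixel image
# env = RGBImgObsWrapper(env)        # option 2 (use instead): full grid, pixel image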

Agent field of view

        flowchart LR
    GRID[Full grid] --> FOV[7x7 egocentric window]
    FOV --> VIS[Visible cells encoded normally]
    FOV --> HID[Cells behind walls encoded as unseen]

The agent sees a 7×7 region, rotated so that the agent always sits at the bottom-center cell of the view and “forward” points toward the top. Walls and closed or locked doors block sight.

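import gymnasium as gym
from minigrid.wrappers import ViewSizeWrapper, FullyObsWrapper

env = gym.make("MiniGrid-Empty-5x5-v0")
env = ViewSizeWrapper(env, agent_view_size=11)  # option 1: larger FOV, size must be odd
# env = FullyObsWrapper(env)                    # option 2 (use instead): observe the full grid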
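Which world cells fall inside the egocentric window depends on the agent's position and heading. The sketch below is my reading of how the base class computes the window bounds (compare `get_view_exts` in your installed version). `view_extents` is a hypothetical standalone helper, and the returned upper bounds are exclusive.

```python
def view_extents(agent_pos, agent_dir, view_size=7):
    """Return (top_x, top_y, bot_x, bot_y) of the view window in world coordinates.

    The agent sits in the middle of the window edge behind it, so the window
    extends view_size cells forward and view_size // 2 cells to each side.
    """
    x, y = agent_pos
    half = view_size // 2
    if agent_dir == 0:      # facing right
        top_x, top_y = x, y - half
    elif agent_dir == 1:    # facing down
        top_x, top_y = x - half, y
    elif agent_dir == 2:    # facing left
        top_x, top_y = x - view_size + 1, y - half
    elif agent_dir == 3:    # facing up
        top_x, top_y = x - half, y - view_size + 1
    else:
        raise ValueError(f"invalid direction: {agent_dir}")
    return top_x, top_y, top_x + view_size, top_y + view_size


window_right = view_extents((5, 5), 0)
window_up = view_extents((5, 5), 3)
```

The window may extend past the grid edge (negative or too-large coordinates); those cells simply have nothing to show.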

Multi-room layout

        graph TD
    subgraph RoomGrid
        R00[Room 0,0] -->|door| R10[Room 1,0]
        R10 -->|door| R20[Room 2,0]
        R00 -->|door| R01[Room 0,1]
        R10 -->|door| R11[Room 1,1]
        R20 -->|door| R21[Room 2,1]
        R01 -->|door| R11
        R11 -->|door| R21
    end

RoomGrid subdivides the grid into a num_cols × num_rows array of equal rooms. Each Room tracks its neighbours, doors, and objects. Use connect_all() to guarantee every room is reachable.
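The subdivision can be sketched with a little arithmetic. This assumes, as I understand `RoomGrid` to work, that adjacent rooms share their boundary wall, so each extra room adds `room_size - 1` cells rather than `room_size`. `room_layout` is a hypothetical helper, not a `RoomGrid` method.

```python
def room_layout(room_size, num_cols, num_rows):
    """Return (grid_width, grid_height, tops) for a grid of equal rooms.

    tops maps each room index (col, row) to the (x, y) of its top-left wall cell.
    Neighbouring rooms overlap by one cell: the wall they share.
    """
    stride = room_size - 1
    width = stride * num_cols + 1
    height = stride * num_rows + 1
    tops = {
        (col, row): (col * stride, row * stride)
        for row in range(num_rows)
        for col in range(num_cols)
    }
    return width, height, tops


# The 3x2 arrangement from the diagram above, with 7x7 rooms.
width, height, tops = room_layout(room_size=7, num_cols=3, num_rows=2)
```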

Directory map

Path	What lives here
`minigrid/minigrid_env.py`	`MiniGridEnv` base class
`minigrid/core/grid.py`	`Grid`
`minigrid/core/world_object.py`	`WorldObj`, `Wall`, `Door`, `Key`, …
`minigrid/core/roomgrid.py`	`RoomGrid`, `Room`
`minigrid/core/actions.py`	`Actions` enum
`minigrid/core/constants.py`	Index↔name mappings, `TILE_PIXELS`
`minigrid/core/mission.py`	`MissionSpace`
`minigrid/envs/`	Concrete environments (single-room and multi-room)
`minigrid/envs/babyai/`	Language-grounded BabyAI environments
`minigrid/wrappers.py`	All wrappers
`minigrid/utils/rendering.py`	Low-level tile drawing
`minigrid/utils/baby_ai_bot.py`	Instruction-following bot