An HDF5 group is a container inside an HDF5 file that holds datasets and other groups. It plays the role that a directory plays in a Linux filesystem — a way to organize the contents hierarchically.

Every HDF5 file has a root group, written /, which contains everything else. Subgroups are created with h5py’s create_group method, which accepts path-like names with / separators:

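import h5py
import numpy as np

# Placeholder data (an assumption for this snippet); any NumPy arrays will do
matrix_1, matrix_3, matrix_4 = (np.random.random((4, 4)) for _ in range(3))

with h5py.File('./hdf5_groups.h5', 'w') as hdf:
    G1  = hdf.create_group('/Group1')
    G1.create_dataset('dataset1', data=matrix_1)
    G21 = hdf.create_group('Group2/Friends')   # creates both Group2 and Friends
    G21.create_dataset('dataset3', data=matrix_3)
    G22 = hdf.create_group('Group2/Office')    # Group2 already exists; only Office is new
    G22.create_dataset('dataset4', data=matrix_4)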

A path like 'Group2/Friends' says: create Group2 if it doesn’t already exist, then create Friends inside it. Datasets with the same name can coexist as long as they’re in different groups. A dataset named dataset4 added to Group1 would be a different object from Group2/Office/dataset4, just as /home/Alice/notes.txt and /home/Bob/notes.txt are different files on Linux.
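The naming rules above can be checked with a short sketch. It assumes an in-memory file (h5py's core driver with backing_store=False) so nothing is written to disk; the file name scratch.h5 and the array contents are placeholders, not part of the example above.

```python
import h5py
import numpy as np

# In-memory file: driver='core' with backing_store=False never touches disk.
# The name 'scratch.h5' and the array values are illustrative placeholders.
with h5py.File('scratch.h5', 'w', driver='core', backing_store=False) as hdf:
    # The intermediate group Group2 is created automatically.
    office = hdf.create_group('Group2/Office')
    g1 = hdf.create_group('Group1')

    # Same dataset name in two different groups: two independent objects.
    g1.create_dataset('dataset4', data=np.zeros((2, 2)))
    office.create_dataset('dataset4', data=np.ones((3, 3)))

    # Collect full paths, shapes, and the children of Group2 while the file is open.
    names = (g1['dataset4'].name, office['dataset4'].name)
    shapes = (g1['dataset4'].shape, office['dataset4'].shape)
    group2_children = sorted(hdf['Group2'].keys())

print(names)
print(shapes)
print(group2_children)
```

If a group might already exist, h5py's require_group opens it instead of failing, which can be more convenient than create_group in scripts that run more than once.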

There are two equivalent ways to read objects back from the file:

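with h5py.File('./hdf5_groups.h5', 'r') as hdf:   # handles are valid only while the file is open
    G1 = hdf.get('/Group1')                   # step into the group
    d1 = G1.get('dataset1')                   # name is relative to G1
    d1_prime = hdf.get('Group1/dataset1')     # or address directly by full path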

Stepping in is cleaner when reading several datasets from the same group; full paths are cleaner when reading just one dataset and the path is known. The two styles compose freely.
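As a sketch of how the two styles compose, the snippet below steps into Group2 and then uses a relative path from there. It builds a small in-memory file first so it runs on its own; the file name, the array values, and the missing name 'Group2/Nope' are placeholders. It also shows that get returns None for a name that does not exist, whereas indexing with square brackets would raise KeyError.

```python
import h5py
import numpy as np

# In-memory scratch file (placeholder name); nothing is written to disk.
with h5py.File('scratch_read.h5', 'w', driver='core', backing_store=False) as hdf:
    # create_dataset also creates the intermediate groups in the path.
    hdf.create_dataset('Group2/Office/dataset4', data=np.arange(6).reshape(2, 3))

    g2 = hdf.get('Group2')                          # step into a group
    via_group = g2.get('Office/dataset4')           # relative path from that group
    via_root = hdf.get('/Group2/Office/dataset4')   # full path from the root

    # Both handles resolve to the same full path and the same data.
    same_name = via_group.name == via_root.name
    values = via_group[()]                          # read the whole array into memory

    missing = hdf.get('Group2/Nope')                # get() returns None if absent

print(same_name, values.tolist(), missing)
```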