Can anybody please explain to me the relationship between libibverbs and librxe?
I am struggling to understand the relationship between libibverbs, librxe, and the low-level kernel driver for the HCA.
Specifically, I have the following doubts:
When a packet arrives on the HCA, the low-level kernel driver passes the packet to the userspace application. Is there a memory copy involved here? Where in this picture do libibverbs and librxe sit? Similarly, for a send, the command issued by the user must be able to reach the hardware directly via the low-level driver. Why do we need the userspace libraries in that case?
The InfiniBand verbs implementation consists of 4 components:
- a vendor-specific kernel module (e.g. ib_mthca for Mellanox devices)
- a kernel module that allows verbs access from userspace (ib_uverbs)
- a userspace vendor driver library (e.g. libmthca)
- a glue component between the previous two (libibverbs)
InfiniBand in general supports two semantics: packet-based (send/receive) operation and remote DMA. No matter the mode of operation, both implement zero-copy by reading and writing the application buffer(s) directly. This is done (as explained by haggai_e) by fixing the buffer in physical memory, also called registering it, which prevents the virtual memory manager from swapping it out to disk or moving it around in physical RAM. A nice feature of InfiniBand is that each HCA has its own virtual-to-physical address translation engine, which allows one to pass userspace pointers straight to the hardware.
The reason for having a user-level driver is that verbs exposes the HCA's hardware registers directly to userspace, and each HCA has a different set of registers, hence the need for an intermediate vendor-specific userspace layer. Of course, this could have been implemented exclusively in the kernel, with a single vendor-independent userspace library on top, but InfiniBand tries hard to provide the lowest latency possible, and going through the kernel on every operation is expensive. The fact that RDMA devices can translate virtual addresses on their own means that the userspace library does not have to go through the kernel to obtain the physical address of a buffer when creating entries in the work queues (part of the mechanism verbs uses to send and receive data).
Note that there are two vendor libraries: one in the kernel and one in userspace. The former provides verbs functionality to other kernel modules such as file systems (e.g. Lustre) or network protocol drivers (e.g. IP-over-InfiniBand), while the latter provides that functionality in userspace. Some operations cannot be done exclusively in userspace, e.g. registering memory or opening/closing device contexts, and those are transparently passed on to the kernel module by libibverbs.
Although RDMA over Converged Ethernet (RoCE, implemented in userspace by librxe) is technically not InfiniBand at the hardware level, the OpenFabrics stack is designed in such a way that it supports RDMA-capable hardware other than InfiniBand HCAs, including RoCE and iWARP adapters.
See this summary from Intel on the topic of accessing InfiniBand on Linux for more details.
hpc infiniband