Dataset Cards - Alberdice

small-2ag

small-2ag - Download

Metadata

| Environment name | Version | Agents | Action type | Observation size | Reward type |
| --- | --- | --- | --- | --- | --- |
| RWARE | Code included in Alberdice repository | 2 | Discrete | [71] | Dense |

Generation procedure for each dataset

Converted from alberdice format to a Vault.

Summary statistics

| Uid | Episode return mean | Min return | Max return | Transitions | Trajectories | Joint SACo |
| --- | --- | --- | --- | --- | --- | --- |
| Expert | 7.12 ± 2.07 | 1.13 | 12.37 | 500000 | 1000 | 0.99 |
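
The columns of the summary statistics table can be illustrated with a minimal sketch. It is not the code used to produce these cards. It assumes each trajectory is available as a list of hypothetical `(joint_observation, joint_action, reward)` tuples, that the ± value is a population standard deviation over episode returns, and that Joint SACo is the number of unique joint observation-action pairs divided by the total number of transitions. All three are assumptions made for illustration, and `summarize` is a hypothetical helper name.

```python
from statistics import mean, pstdev


def summarize(episodes):
    """Compute card-style summary statistics for a list of trajectories.

    Each trajectory is a list of (joint_obs, joint_action, reward) tuples,
    where joint_obs and joint_action are hashable (e.g. tuples).
    This layout is an assumption for illustration, not the Vault schema.
    """
    # Episode return = sum of rewards within one trajectory.
    returns = [sum(reward for _, _, reward in ep) for ep in episodes]
    transitions = sum(len(ep) for ep in episodes)
    # Assumed SACo definition: unique (obs, action) pairs / total pairs.
    unique_pairs = {(obs, act) for ep in episodes for obs, act, _ in ep}
    return {
        "return_mean": mean(returns),
        "return_std": pstdev(returns),  # assumption: population std
        "min_return": min(returns),
        "max_return": max(returns),
        "transitions": transitions,
        "trajectories": len(episodes),
        "joint_saco": len(unique_pairs) / transitions,
    }


# Toy data: two trajectories of two transitions each.
toy_episodes = [
    [((0, 0), (1, 1), 1.0), ((0, 1), (1, 0), 2.0)],
    [((0, 0), (1, 1), 0.5), ((1, 1), (0, 0), 0.5)],
]
stats = summarize(toy_episodes)
```

Under this reading, 500000 transitions over 1000 trajectories is an average of 500 transitions per trajectory, and a Joint SACo close to 1.00 would mean that almost no joint observation-action pair is repeated in the dataset.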

small-4ag

small-4ag - Download

Metadata

| Environment name | Version | Agents | Action type | Observation size | Reward type |
| --- | --- | --- | --- | --- | --- |
| RWARE | Code included in Alberdice repository | 4 | Discrete | [71] | Dense |

Generation procedure for each dataset

Converted from alberdice format to a Vault.

Summary statistics

| Uid | Episode return mean | Min return | Max return | Transitions | Trajectories | Joint SACo |
| --- | --- | --- | --- | --- | --- | --- |
| Expert | 9.49 ± 0.84 | 3.93 | 12.08 | 500000 | 1000 | 1.00 |

small-6ag

small-6ag - Download

Metadata

| Environment name | Version | Agents | Action type | Observation size | Reward type |
| --- | --- | --- | --- | --- | --- |
| RWARE | Code included in Alberdice repository | 6 | Discrete | [71] | Dense |

Generation procedure for each dataset

Converted from alberdice format to a Vault.

Summary statistics

| Uid | Episode return mean | Min return | Max return | Transitions | Trajectories | Joint SACo |
| --- | --- | --- | --- | --- | --- | --- |
| Expert | 10.76 ± 0.68 | 7.59 | 12.69 | 500000 | 1000 | 1.00 |

tiny-2ag

tiny-2ag - Download

Metadata

| Environment name | Version | Agents | Action type | Observation size | Reward type |
| --- | --- | --- | --- | --- | --- |
| RWARE | Code included in Alberdice repository | 2 | Discrete | [71] | Dense |

Generation procedure for each dataset

Converted from alberdice format to a Vault.

Summary statistics

| Uid | Episode return mean | Min return | Max return | Transitions | Trajectories | Joint SACo |
| --- | --- | --- | --- | --- | --- | --- |
| Expert | 12.77 ± 1.56 | 1.97 | 16.81 | 500000 | 1000 | 1.00 |

tiny-4ag

tiny-4ag - Download

Metadata

| Environment name | Version | Agents | Action type | Observation size | Reward type |
| --- | --- | --- | --- | --- | --- |
| RWARE | Code included in Alberdice repository | 4 | Discrete | [71] | Dense |

Generation procedure for each dataset

Converted from alberdice format to a Vault.

Summary statistics

| Uid | Episode return mean | Min return | Max return | Transitions | Trajectories | Joint SACo |
| --- | --- | --- | --- | --- | --- | --- |
| Expert | 15.67 ± 1.20 | 10.40 | 18.63 | 500000 | 1000 | 1.00 |

tiny-6ag

tiny-6ag - Download

Metadata

| Environment name | Version | Agents | Action type | Observation size | Reward type |
| --- | --- | --- | --- | --- | --- |
| RWARE | Code included in Alberdice repository | 6 | Discrete | [71] | Dense |

Generation procedure for each dataset

Converted from alberdice format to a Vault.

Summary statistics

| Uid | Episode return mean | Min return | Max return | Transitions | Trajectories | Joint SACo |
| --- | --- | --- | --- | --- | --- | --- |
| Expert | 17.45 ± 1.01 | 11.88 | 19.97 | 500000 | 1000 | 1.00 |