recbole.data.kg_dataset¶
- class recbole.data.dataset.kg_dataset.KnowledgeBasedDataset(config)[source]¶
Bases:
Dataset
KnowledgeBasedDataset is based on Dataset, and additionally loads .kg and .link files.
Entities are remapped jointly with item_id, so that all entities fall into three consecutive ID sections:
1. virtual entities that only exist in interaction data.
2. entities that exist both in interaction data and kg triplets.
3. entities that only exist in kg triplets.
It also provides several interfaces to convert .kg features into a coo sparse matrix, a csr sparse matrix, a DGL.Graph or a PyG.Data.
- head_entity_field¶
The same as config['HEAD_ENTITY_ID_FIELD'].
- Type:
str
- tail_entity_field¶
The same as config['TAIL_ENTITY_ID_FIELD'].
- Type:
str
- relation_field¶
The same as config['RELATION_ID_FIELD'].
- Type:
str
- entity_field¶
The same as config['ENTITY_ID_FIELD'].
- Type:
str
- kg_feat¶
Internal data structure that stores the kg triplets. It is loaded from the .kg file.
- Type:
pandas.DataFrame
- item2entity¶
Dict that maps item_id to entity, loaded from the .link file.
- Type:
dict
- entity2item¶
Dict that maps entity to item_id, loaded from the .link file.
- Type:
dict
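The three-section remapping and the two link dicts described above can be illustrated with a small sketch. It is not RecBole's implementation: the function remap_entities, the toy tokens and the section names are hypothetical, and details such as padding tokens are not modeled.

```python
def remap_entities(inter_items, item2entity, kg_entities):
    """Toy remapping into three consecutive ID sections (illustrative only)."""
    # entity2item is the inverse of item2entity (assumes a one-to-one link).
    entity2item = {entity: item for item, entity in item2entity.items()}

    # Deduplicate while preserving first-seen order.
    items = list(dict.fromkeys(inter_items))
    entities = list(dict.fromkeys(kg_entities))

    # Section 1: items with no linked entity ("virtual entities").
    virtual = [item for item in items if item not in item2entity]
    # Section 2: entities linked to an item seen in the interactions.
    shared = [item2entity[item] for item in items if item in item2entity]
    # Section 3: entities that appear only in the kg triplets.
    shared_set = set(shared)
    kg_only = [entity for entity in entities if entity not in shared_set]

    ordered = virtual + shared + kg_only
    token2id = {token: idx for idx, token in enumerate(ordered)}
    sections = {
        "virtual": (0, len(virtual)),
        "shared": (len(virtual), len(virtual) + len(shared)),
        "kg_only": (len(virtual) + len(shared), len(ordered)),
    }
    return token2id, sections, entity2item


# Hypothetical .link content: item token -> entity token.
item2entity = {"i2": "e_a", "i3": "e_b"}
token2id, sections, entity2item = remap_entities(
    inter_items=["i1", "i2", "i3", "i2"],
    item2entity=item2entity,
    kg_entities=["e_a", "e_c", "e_b", "e_c", "e_d"],
)
```

In this toy run, the unlinked item i1 takes the first id, the linked entities e_a and e_b follow, and the kg-only entities e_c and e_d come last; each section is a half-open id range.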
Note
entity_field does not actually exist as a field in the data. It is only a symbol that represents entity features.
[UI-Relation] is a special relation token.
- ckg_graph(form='coo', value_field=None)[source]¶
Get the graph or sparse matrix that describes the relations of the CKG, which combines interactions and kg triplets into the same graph.
Item ids and entity ids are temporarily increased by user_num.
For an edge of <src, tgt>, graph[src, tgt] = 1 if value_field is None, else graph[src, tgt] = self.kg_feat[self.relation_field][src, tgt] or graph[src, tgt] = [UI-Relation].
Currently, we support graphs in DGL and PyG, and two types of sparse matrices, coo and csr.
- Parameters:
form (str, optional) – Format of the sparse matrix, or library of the graph data structure. Defaults to coo.
value_field (str, optional) – self.relation_field or None. Defaults to None.
- Returns:
Graph / Sparse matrix of the CKG (interactions combined with kg triplets).
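The id offset and the edge values described for ckg_graph can be sketched with SciPy on toy arrays. This is a minimal sketch under stated assumptions, not the library's code: the function build_ckg_coo and all of its parameters are hypothetical, and adding each interaction as an edge in both directions is an assumption of the sketch rather than something stated above.

```python
import numpy as np
from scipy.sparse import coo_matrix


def build_ckg_coo(user_ids, item_ids, heads, tails, kg_relations,
                  user_num, entity_num, ui_relation_id, with_relation=False):
    """Combine interactions and kg triplets into one coo matrix (toy sketch)."""
    user_ids = np.asarray(user_ids)
    kg_relations = np.asarray(kg_relations)

    # Shift item and entity ids by user_num so they do not collide with user ids.
    shifted_items = np.asarray(item_ids) + user_num
    shifted_heads = np.asarray(heads) + user_num
    shifted_tails = np.asarray(tails) + user_num

    # Assumption: interactions are added in both directions, triplets in one.
    src = np.concatenate([user_ids, shifted_items, shifted_heads])
    tgt = np.concatenate([shifted_items, user_ids, shifted_tails])

    if with_relation:
        # Interaction edges carry the id standing in for [UI-Relation].
        ui_rel = np.full(2 * len(user_ids), ui_relation_id, dtype=kg_relations.dtype)
        data = np.concatenate([ui_rel, kg_relations])
    else:
        data = np.ones(len(src))

    node_num = user_num + entity_num
    return coo_matrix((data, (src, tgt)), shape=(node_num, node_num))


# Toy data: 2 users, 3 entities, 2 interactions, 2 triplets.
toy = dict(
    user_ids=[0, 1], item_ids=[0, 1],
    heads=[0, 1], tails=[2, 2], kg_relations=[1, 2],
    user_num=2, entity_num=3, ui_relation_id=3,
)
ckg = build_ckg_coo(**toy)
ckg_rel = build_ckg_coo(**toy, with_relation=True)
```

Here ckg holds ones on every edge, while ckg_rel stores a relation id per edge, with the hypothetical id 3 marking user-item edges.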
- property entities¶
- Returns:
List of entity ids, including virtual entities.
- Return type:
numpy.ndarray
- property entity_num¶
Get the number of different tokens of entities, including virtual entities.
- Returns:
Number of different tokens of entities, including virtual entities.
- Return type:
int
- property head_entities¶
- Returns:
List of head entities of kg triplets.
- Return type:
numpy.ndarray
- kg_graph(form='coo', value_field=None)[source]¶
Get the graph or sparse matrix that describes the relations between entities.
For an edge of <src, tgt>, graph[src, tgt] = 1 if value_field is None, else graph[src, tgt] = self.kg_feat[value_field][src, tgt].
Currently, we support graphs in DGL and PyG, and two types of sparse matrices, coo and csr.
- Parameters:
form (str, optional) – Format of the sparse matrix, or library of the graph data structure. Defaults to coo.
value_field (str, optional) – Edge attributes of the graph, or data of the sparse matrix. Defaults to None.
- Returns:
Graph / Sparse matrix of kg triplets.
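The rule graph[src, tgt] = 1 or graph[src, tgt] = value can be sketched for the two sparse formats as follows. This is an illustrative sketch on toy arrays, assuming SciPy is available; build_kg_matrix and its parameters are hypothetical names, not part of RecBole.

```python
import numpy as np
from scipy.sparse import coo_matrix


def build_kg_matrix(heads, tails, entity_num, values=None, form="coo"):
    """Build an entity-by-entity sparse matrix from triplets (toy sketch)."""
    heads = np.asarray(heads)
    tails = np.asarray(tails)
    # Without values, every edge gets 1; otherwise it gets its own value.
    data = np.ones(len(heads)) if values is None else np.asarray(values)

    mat = coo_matrix((data, (heads, tails)), shape=(entity_num, entity_num))
    if form == "coo":
        return mat
    if form == "csr":
        return mat.tocsr()
    raise NotImplementedError(f"Sparse matrix format [{form}] is not covered by this sketch.")


# Toy triplets: (head, relation, tail).
heads, relations, tails = [0, 1, 3], [1, 2, 1], [2, 2, 0]
kg_coo = build_kg_matrix(heads, tails, entity_num=4)
kg_csr = build_kg_matrix(heads, tails, entity_num=4, values=relations, form="csr")
```

The coo form is convenient for building and for converting to edge lists, while the csr form supports efficient row slicing and indexing such as kg_csr[1, 2].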
- property relation_num¶
Get the number of different tokens of self.relation_field.
- Returns:
Number of different tokens of self.relation_field.
- Return type:
int
- property relations¶
- Returns:
List of relations of kg triplets.
- Return type:
numpy.ndarray
- property tail_entities¶
- Returns:
List of tail entities of kg triplets.
- Return type:
numpy.ndarray