Knowledge Graph Features and Explanation
Although latent feature models such as TransE achieve state-of-the-art performance on link prediction in knowledge graphs, they act as black boxes: they rely on latent feature representations that are difficult for humans to interpret. In this post, I present a family of knowledge graph completion models, known as graph-based feature models, that have an intrinsic means of explaining their predictions. Graph-based feature models provide interpretable predictions through the observable features they use (sub-graphs, connecting paths, neighborhood information, etc.).
1. On the importance of graph features
If you were asked to tell whether a city is the capital of a country, a common approach would consist of:
- Finding the characteristics (also called features) of existing capital cities (and, similarly, of existing countries).
- Figuring out some unique characteristics that discriminate a capital from other cities.
Let’s consider the knowledge graph below, which illustrates a positive instance (in dark green) of the relation “capital of”, namely “Paris is the capital city of France”, and a negative instance (in dark red): “Lille is not the capital of France”. Each relation or relation path around a node carries particular information about the node itself and can be used as a feature.
Following a manual inspection of the knowledge graph, one can make the following observations:
- Shared features between positive and negative examples: both “Lille”, which is not the capital of “France”, and the capital “Paris” have a “mayor” and are “located in” a region that is itself part of “France”. Although these characteristics describe a city well, they are not specific to a capital.
- Discriminant features for capital: comparing the relations around a capital with those around an ordinary city, we can learn that “EmbassyIn” is a discriminant feature indicating that the city is likely to be a capital. Similarly, the fact that a city hosts the residence of the country’s president, or that it has an airport, can be a useful clue that it could be a capital. Interestingly, knowing that a capital does not have consulates is useful information as well.
- Discriminant features for country: the same reasoning applies to the country entity “France”. While having a “flag” may not be very distinctive (cities and regions have flags too), the fact that an entity has a “border country” is a good indicator that it is a country.
Concretely, we just used local and quasi-local patterns from the graph to learn the features of a relationship type.
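The manual inspection above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the triples, the relation names (`hasMayor`, `embassyIn`, `residesIn`, and so on), and the `node_features` helper are all made up for this post and are not taken from a real knowledge graph or library. The sketch collects the relations and short outgoing relation paths around a node, then compares the feature sets of a positive and a negative example.

```python
from collections import defaultdict

# Toy triples (subject, relation, object); entity and relation names are
# hypothetical and only loosely mirror the example graph in this post.
TRIPLES = [
    ("Paris", "hasMayor", "MayorOfParis"),
    ("Paris", "locatedIn", "Ile-de-France"),
    ("Ile-de-France", "partOf", "France"),
    ("Paris", "hasAirport", "ParisAirport"),
    ("SomeEmbassy", "embassyIn", "Paris"),
    ("PresidentOfFrance", "residesIn", "Paris"),
    ("Lille", "hasMayor", "MayorOfLille"),
    ("Lille", "locatedIn", "Hauts-de-France"),
    ("Hauts-de-France", "partOf", "France"),
    ("SomeConsulate", "consulateIn", "Lille"),
]


def node_features(triples, node, max_hops=2):
    """Return incoming relations and outgoing relation paths around a node."""
    out_edges = defaultdict(list)
    in_relations = defaultdict(set)
    for subject, relation, obj in triples:
        out_edges[subject].append((relation, obj))
        in_relations[obj].add(relation)

    # Local features: relations pointing at the node.
    features = {f"in:{relation}" for relation in in_relations[node]}

    # Local and quasi-local features: outgoing paths of up to max_hops relations.
    frontier = [(node, ())]
    for _ in range(max_hops):
        next_frontier = []
        for current, path in frontier:
            for relation, obj in out_edges[current]:
                new_path = path + (relation,)
                features.add("out:" + "/".join(new_path))
                next_frontier.append((obj, new_path))
        frontier = next_frontier
    return features


paris = node_features(TRIPLES, "Paris")
lille = node_features(TRIPLES, "Lille")

shared = paris & lille          # describe a city, but not specifically a capital
only_capital = paris - lille    # candidate discriminant features for a capital
only_non_capital = lille - paris

print(sorted(shared))
print(sorted(only_capital))
print(sorted(only_non_capital))
```

On this toy graph, the shared set contains the “mayor” and “located in … part of” features, while the embassy, residence, and airport features appear only around the capital. A single pair of examples is of course far too little evidence; the set difference here only stands in for what a classifier would learn from many instances.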
2. Graph-based feature models
Graph-based feature models follow the exact same process as our manual inspection. They start by extracting features from the relationships and relationship paths around positive and negative instances of a relation of interest (in this example, “capital of”). They then use a classification model to learn which relationships are discriminant. Applied to our example, this leads to the following table:
| Characteristic/Feature | Indicator of a “capital of” relationship |
| City: has LocatedIn / is Residence / headOf | Positive |
When evaluating a new instance of the relation “capital of”, the model will assess whether the country has a border country and whether the city has embassies, has an airport, is the residence of the head of the country, and does not have consulates.
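To make the “classification model” step more concrete, here is a minimal sketch that trains a logistic regression by plain gradient descent on binary graph features. Everything in it is an assumption made for illustration: the feature names, the six training pairs, and the choice of logistic regression itself, which is just one simple classifier among many and not necessarily the one used by any particular graph-based feature model.

```python
import math

# Hypothetical binary features extracted around (city, country) pairs.
FEATURES = [
    "city:embassyIn",
    "city:residenceOfHeadOfState",
    "city:hasAirport",
    "city:consulateIn",
    "city:hasMayor",
    "country:hasBorderCountry",
]

CAPITAL_LIKE = {
    "city:embassyIn", "city:residenceOfHeadOfState", "city:hasAirport",
    "city:hasMayor", "country:hasBorderCountry",
}
LARGE_CITY_LIKE = {
    "city:consulateIn", "city:hasAirport",
    "city:hasMayor", "country:hasBorderCountry",
}
SMALL_CITY_LIKE = {
    "city:consulateIn", "city:hasMayor", "country:hasBorderCountry",
}

# Made-up training set: (features present, 1 if "capital of" holds else 0).
TRAINING = [
    (CAPITAL_LIKE, 1), (CAPITAL_LIKE, 1), (CAPITAL_LIKE, 1),
    (LARGE_CITY_LIKE, 0), (LARGE_CITY_LIKE, 0), (SMALL_CITY_LIKE, 0),
]


def vectorize(present):
    """Encode a set of present features as a 0/1 vector."""
    return [1.0 if feature in present else 0.0 for feature in FEATURES]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, present):
    """Probability that the 'capital of' relation holds."""
    x = vectorize(present)
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)))


def train(examples, epochs=300, learning_rate=0.1):
    """Batch gradient ascent on the logistic log-likelihood (no bias term)."""
    weights = [0.0] * len(FEATURES)
    for _ in range(epochs):
        gradient = [0.0] * len(FEATURES)
        for present, label in examples:
            error = label - predict(weights, present)
            for i, xi in enumerate(vectorize(present)):
                gradient[i] += error * xi
        weights = [w + learning_rate * g for w, g in zip(weights, gradient)]
    return weights


weights = train(TRAINING)
learned = dict(zip(FEATURES, weights))
for feature, weight in learned.items():
    print(f"{feature:32s} {weight:+.2f}")
```

On this toy data, the embassy and residence features end up with positive weights and the consulate feature with a negative one, which mirrors the table above. The features shared by every example (“mayor”, “border country”) cannot separate the two classes, so they end up playing the role of a bias.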
With the rise of complex machine learning algorithms that act as black boxes, such as neural networks, comes the problem of explaining their predictions. Although an explanation is not crucial when recommending new songs you may like, it is necessary in applications such as biomedicine, where lives are at stake. Recent research initiatives have put the focus on Explainable Artificial Intelligence (XAI).
Because they use observable graph features, graph-based models have an intrinsic means of explanation. When a graph-based model predicts a new relationship instance, it knows which relation paths contributed the most (positively or negatively) to that prediction. The path features can therefore be used as a means of explanation that is easy for a human to interpret. In the example below, our graph-based feature model correctly predicts that Canberra is the most likely capital of Australia using the features learned previously. Beyond a simple prediction, the model can use these path features to provide an intelligible explanation.
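As a rough sketch of how such an explanation could be produced, assume a linear model over binary path features. The weights, the bias, and the feature sets for Canberra and Sydney below are hand-picked for illustration; they are not values learned by an actual model or read from a real knowledge graph. Each present feature contributes its weight to the score, so ranking the contributions by magnitude gives a human-readable justification.

```python
import math

# Illustrative, hand-picked weights (hypothetical, not learned from data).
WEIGHTS = {
    "city:embassyIn": 2.0,
    "city:residenceOfHeadOfState": 1.5,
    "city:hasAirport": 0.5,
    "city:consulateIn": -2.0,
}
BIAS = -1.5


def explain(present_features, weights=WEIGHTS, bias=BIAS):
    """Return the predicted probability and the ranked feature contributions."""
    contributions = {
        feature: weights[feature]
        for feature in present_features
        if feature in weights
    }
    score = bias + sum(contributions.values())
    probability = 1.0 / (1.0 + math.exp(-score))
    # Largest absolute contribution first, whether positive or negative.
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return probability, ranked


# Toy feature sets for two candidate capitals of Australia (assumed).
canberra = {"city:embassyIn", "city:residenceOfHeadOfState", "city:hasAirport"}
sydney = {"city:consulateIn", "city:hasAirport"}

for name, features in [("Canberra", canberra), ("Sydney", sydney)]:
    probability, ranked = explain(features)
    print(f"{name}: P(capital of Australia) = {probability:.2f}")
    for feature, contribution in ranked:
        direction = "supports" if contribution > 0 else "contradicts"
        print(f"  {feature} {direction} the prediction ({contribution:+.1f})")
```

With these assumed weights, the embassy feature is the strongest argument in favor of Canberra and the consulate feature is the strongest argument against Sydney. Note that in this 0/1 encoding an absent feature contributes nothing; to let the model use “does not have consulates” as positive evidence, the absence would need to be encoded as a feature of its own.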