
Hyperbolic embeddings have captured the attention of the machine learning community through proposals from Nickel & Kiela (2017) and Chamberlain et al. (2017). The motivation is to embed structured, discrete objects such as knowledge graphs into a continuous representation that can be used with modern machine learning methods. Hyperbolic embeddings can preserve graph distances and complex relationships in very few dimensions, particularly for hierarchical graphs. On this website, we release hyperbolic embeddings that can be integrated into applications related to knowledge base completion or supplied as features to NLP tasks such as question answering. This blog post gives a brief introduction both to hyperbolic geometry and to our work. In our paper, we present new approaches for solving the underlying optimization problems and show the tradeoffs of each approach.
We provide hyperbolic embeddings for chosen hierarchies from WordNet, Wikidata, and MusicBrainz, and evaluate the intrinsic quality of the embeddings through hyperbolic analogues of "similarity" and "analogy". All of the embeddings were produced with our combinatorial construction algorithm and our optimization-based approach implemented in PyTorch. Preliminary code for the embedding algorithms is publicly available here. Our selected 10-dimensional entity embeddings are available to download in GloVe format below.

Download
You can download our 10-dimensional entity embeddings with 64-bit precision for selected relationships from WordNet, Wikidata, and MusicBrainz below in GloVe format. Since hyperbolic embeddings, unlike Euclidean embeddings, are not scale invariant, we also include the scale factor for each embedding in the top line. Input graph distances between entities can be approximately recovered by dividing the hyperbolic distance between their corresponding embeddings by the provided scale factor. You can also find the preprocessing scripts to create edge lists for input graphs here, and get started with our demo notebook.
WordNet: hyperE.wn.10d.zip
Wikidata: hyperE.Wikidata.10d.zip
MusicBrainz: hyperE.MusicBrainz.10d.zip
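As a sketch of how the files above can be consumed, the loader below reads a GloVe-style text file whose first line carries the scale factor. The exact first-line layout (scale factor as the last token) and the function name are our assumptions; check the file you download.

```python
import numpy as np

def load_hyperbolic_glove(path):
    """Load hyperbolic embeddings stored in GloVe-style text format.

    Assumes the first line holds the scale factor as its last token and
    every following line is: <entity> <x_1> ... <x_d> (this layout is an
    assumption; multi-word entity names are expected to be joined, e.g.
    with underscores, as in standard GloVe files).
    """
    with open(path) as f:
        tau = float(f.readline().split()[-1])
        embeddings = {}
        for line in f:
            parts = line.rstrip("\n").split(" ")
            embeddings[parts[0]] = np.array([float(v) for v in parts[1:]])
    return tau, embeddings
```

Keeping the scale factor alongside the vectors matters because every downstream distance computation must divide by it to recover graph distances.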

Intrinsic Results
Analogy
State-of-the-art word embeddings used in many natural language processing applications successfully encode the semantic relationships between words in Euclidean space. One of the main intrinsic evaluation tasks for word embeddings is "word analogy", which consists of questions like "Athens is to Greece as Berlin is to what?" The GloVe model answers the question "a is to b as c is to what?" by finding the word d whose embedding e_{d} is closest to the vector e_{b} - e_{a} + e_{c} according to cosine similarity.
We define the analogous linear operations in hyperbolic space and use the hyperbolic distance as the similarity metric. In other words, we output the vector e_{d} that has the smallest hyperbolic distance to the hyperbolic analogue of e_{b} - e_{a} + e_{c}. We extend this analogy experiment to the entity hierarchies in Wikidata and include samples for different types of relationships below. For reference, these entity embeddings were produced using our combinatorial approach with 10 dimensions and 64 bits of precision.
| Knowledge Graph | Relationship | e_{a} | e_{b} | e_{c} | Top Result |
| --- | --- | --- | --- | --- | --- |
| WordNet | Hypernym | guitarist | musician | novelist | writer |
| WordNet | Domain Topic | fluorocarbon | chemistry | angular velocity | physics |
| Wikidata | Student of | Eubilides | Euclides | Dicaearchus | Aristotle |
| Wikidata | Part of | Malaysia | Asia | Amazon Rainforest | South America |
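The nearest-neighbor query behind these examples can be sketched concretely. Below, the Poincaré-ball distance is computed in closed form, and the Euclidean combination e_{b} - e_{a} + e_{c}, projected back into the unit ball if needed, serves as an illustrative stand-in for the paper's hyperbolic analogue; the function names and the projection step are our assumptions.

```python
import numpy as np

def poincare_dist(x, y):
    """Hyperbolic distance between points x, y inside the Poincare ball."""
    sq = np.sum((x - y) ** 2)
    denom = (1.0 - np.sum(x ** 2)) * (1.0 - np.sum(y ** 2))
    return np.arccosh(1.0 + 2.0 * sq / denom)

def analogy(emb, a, b, c):
    """Answer "a is to b as c is to what?" over a dict of embeddings.

    Illustrative stand-in for the paper's hyperbolic analogue: form
    e_b - e_a + e_c, pull it back inside the unit ball if necessary,
    and return the closest other entity under hyperbolic distance.
    """
    target = emb[b] - emb[a] + emb[c]
    norm = np.linalg.norm(target)
    if norm >= 1.0:                      # keep the query point inside the ball
        target = target / (norm + 1e-5)
    candidates = {k: v for k, v in emb.items() if k not in (a, b, c)}
    return min(candidates, key=lambda k: poincare_dist(candidates[k], target))
```

Unlike cosine similarity in the Euclidean setting, the ranking here is by hyperbolic distance, which grows rapidly as points approach the boundary of the ball.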
Detecting Relations
In our music hierarchy embedding, song-album edges have weight 2, and album-artist edges have weight 1. We can use these weights and the embedded hierarchy to detect relations simply by computing hyperbolic distances between embedded points. For example, the first two songs below are from the Beatles' "Sgt. Pepper's Lonely Hearts Club Band" album, and indeed the embedding expresses this relation. The next two songs are by the Beatles, but from different albums. Lastly, songs by the Rolling Stones and AC/DC are shown as unrelated (note the much larger distance).
| Song 1 | Song 2 | Distance | Relation |
| --- | --- | --- | --- |
| With a Little Help From My Friends | Lucy in the Sky With Diamonds | 3.761 | Same album and artist |
| A Hard Day's Night | While My Guitar Gently Weeps | 4.823 | Same artist, different album |
| (I Can't Get No) Satisfaction | Back in Black | 23.430 | Unrelated |
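Since song-album-song paths have graph length 4 and song-album-artist-album-song paths have length 6, a toy detector only needs the recovered graph distance and two cut points. The sketch below assumes a Poincaré-ball representation; the cut points and function names are illustrative assumptions, not values from the paper.

```python
import numpy as np

def graph_distance(x, y, tau):
    """Approximate input-graph distance between two embedded entities:
    the hyperbolic (Poincare-ball) distance divided by the scale factor tau."""
    sq = np.sum((x - y) ** 2)
    denom = (1.0 - np.sum(x ** 2)) * (1.0 - np.sum(y ** 2))
    return np.arccosh(1.0 + 2.0 * sq / denom) / tau

def classify_pair(x, y, tau, cuts=(5.0, 8.0)):
    """Toy relation detector for the music hierarchy. With song-album
    edges of weight 2 and album-artist edges of weight 1, two songs on
    the same album are 4 apart in the graph, and two songs by the same
    artist on different albums are 6 apart; the cut points separating
    the three cases are illustrative assumptions."""
    d = graph_distance(x, y, tau)
    if d < cuts[0]:
        return "same album and artist"
    if d < cuts[1]:
        return "same artist, different album"
    return "unrelated"
```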
Preserving Relationships
Hyperbolic space lets us compress the hierarchical relationship information in a knowledge graph into far fewer dimensions than is possible in Euclidean space. To show how well graph distances are preserved when we embed a multi-layer hierarchy in hyperbolic space, we report precision and recall for preserving relationships between entity pairs for several input relationship graphs. In this task, "correct" means the following: if an entity pair in the input graph is one edge apart, the hyperbolic distance between their final embeddings, divided by the scale factor calculated from the input graph, stays within 1 ± 0.01. For reference, each of these embeddings was produced using our combinatorial approach with 64 bits of precision and 10 dimensions, and the largest connected component of each relationship in WordNet was taken.
| Knowledge Graph | Relationship | Precision | Recall |
| --- | --- | --- | --- |
| WordNet | Hypernym | 98.11% | 98.04% |
| WordNet | Domain Topic | 97.92% | 98.42% |
| WordNet | Member Holonym | 99.43% | 99.75% |

Visualizations
In Euclidean space, circle circumference and disc area grow linearly and quadratically with radius, respectively. However, in hyperbolic space, they both grow exponentially with respect to radius, which allows particularly efficient embeddings for hierarchical structures like trees. MAP refers to mean average precision, which is a metric that captures how well each vertex's neighborhoods are preserved.
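MAP as just described can be sketched directly: rank every other vertex by embedding distance and average the precision at each true neighbor's rank. The function name and the distance-matrix input format are our choices, not the paper's API.

```python
import numpy as np

def mean_average_precision(dist, neighbors):
    """MAP over vertices: for each vertex a with graph neighbors N(a),
    rank every other vertex by embedding distance to a and average the
    precision at the rank of each true neighbor.

    dist: (n, n) matrix of embedding distances (assumed input format);
    neighbors: dict mapping each vertex to its set of graph neighbors.
    """
    average_precisions = []
    n = dist.shape[0]
    for a in range(n):
        if not neighbors.get(a):
            continue                      # skip isolated vertices
        order = sorted((b for b in range(n) if b != a), key=lambda b: dist[a, b])
        hits, precisions = 0, []
        for rank, b in enumerate(order, start=1):
            if b in neighbors[a]:
                hits += 1
                precisions.append(hits / rank)
        average_precisions.append(np.mean(precisions))
    return float(np.mean(average_precisions))
```

A MAP of 1.0 means every vertex's graph neighbors are exactly its nearest points in the embedding.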
We include some visualizations of our different embedding approaches below. The left-hand side embeds a simple tree using our stochastic gradient descent-based optimization approach, which minimizes a loss function based on hyperbolic distance. For the same input graph, the right-hand side uses our combinatorial construction, a deterministic algorithm that places tree vertices exactly.
The combinatorial construction first embeds the input graph into a tree, and then embeds this tree into the Poincaré ball. The second step of the construction builds on Sarkar's approach for two-dimensional hyperbolic space. The combinatorial construction can embed tree-like input graphs in hyperbolic space with very low distortion in linear time, while the optimization-based approach better handles incomplete information and is more robust to different kinds of input graphs; input graphs that aren't necessarily tree-like can sometimes still be embedded in hyperbolic space with good MAP and distortion. You can also warm-start our optimization-based approach implemented in PyTorch with the embedding output by our combinatorial construction. We include a detailed analysis of these algorithms in our paper.
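The two-dimensional step can be illustrated in a few lines. The sketch below is a simplified version under our assumptions: complex coordinates for the Poincaré disk, uniform angular spacing of children, and a fixed edge length; the paper's construction additionally scales edges for precision and generalizes to higher dimensions.

```python
import cmath
import math

def sarkar_embed(children, root=0, edge_length=1.0):
    """Minimal 2-D sketch of Sarkar's construction: embed a rooted tree
    in the Poincare disk so every tree edge gets hyperbolic length
    edge_length. `children` maps each node to its list of children."""
    r = math.tanh(edge_length / 2.0)      # Euclidean radius of that hyp. length
    pos = {root: 0j}

    def mobius(a, z):
        # Disk isometry sending point a to the origin.
        return (z - a) / (1.0 - a.conjugate() * z)

    def place(v, parent):
        kids = children.get(v, [])
        if not kids:
            return
        a = pos[v]
        # Reserve one angular slot for the direction back to the parent.
        slots = len(kids) + (0 if parent is None else 1)
        base = 0.0 if parent is None else cmath.phase(mobius(a, pos[parent]))
        for i, c in enumerate(kids, start=1):
            z = r * cmath.exp(1j * (base + 2.0 * math.pi * i / slots))
            pos[c] = mobius(-a, z)        # inverse isometry: origin back to v
            place(c, v)

    place(root, None)
    return pos
```

Recentering each node at the origin with a Möbius isometry before placing its children is what keeps every edge the same hyperbolic length regardless of depth.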
Optimization-based construction

Combinatorial construction

Related Work
Our study of hyperbolic embeddings was motivated by recent works showing that hyperbolic representations suit many hierarchies, with applications including the question answering (QA) system HyperQA of Tay et al. (2018), vertex classifiers in Chamberlain et al. (2017), and link prediction in Nickel & Kiela (2017).
Ganea et al. (2018) proposed embedding entailment relations, i.e., directed acyclic graphs, using hyperbolic cones as a heuristic. Hyperbolic word and sentence embeddings are studied by Dhingra et al. (2018). Many of these works use the Poincaré model of hyperbolic geometry; our work and Nickel & Kiela (2018) suggest that other models, such as the hyperboloid model of hyperbolic space, may be simpler for certain tasks. Of course, it is always possible to translate between models.
Two recent approaches seek to add hyperbolic operations to neural networks. Gulcehre et al. (2018) introduce a hyperbolic version of the attention mechanism using the hyperboloid model. In Ganea et al. (2018a), building blocks of certain networks are generalized to operate on Riemannian manifolds.
Hyperbolic operations can also be applied to Support Vector Machines, as shown by Cho et al. (2018). Finally, descent methods suitable for hyperbolic optimization have gained renewed research interest; such works include Enokida et al. (2018) for geodesic updates, Zhang and Sra (2018) for accelerated Riemannian gradient methods, and Wilson & Leimeister (2018) for gradient descent in hyperbolic space using the hyperboloid model.

Email us!
This project is maintained by HazyResearch, as part of the Stanford DAWN project.
Until we release our engine, send us an email here about new applications or datasets you would like to see embedded with HyperE.
