Manifolds¶
hype.manifold¶
-
class
hype.manifold.Manifold(*args, **kwargs)[source]¶ Base class for all manifolds.
-
static
dim(dim)[source]¶ Add any additional dimensions necessary for the manifold
Parameters: dim (int) – dimension specified by user
Returns: int
-
distance(u, v)[source]¶ Compute the distance between u and v
Parameters: - u (Tensor) – first tensor
- v (Tensor) – second tensor
Returns: Distance between embeddings u and v
Return type: Tensor
-
expm(p, d_p, lr=None, out=None)[source]¶ Exponential map for manifold. Takes a point d_p in the tangent space of p and maps it onto the manifold
Parameters: - p (Tensor) – reference point defining the tangent space
- d_p (Tensor) – point in p’s tangent space to be mapped onto the manifold
Returns: d_p mapped onto the manifold
Return type: Tensor
-
init_weights(w, scale=0.0001)[source]¶ Initialize the weights of a Tensor
Parameters: - w (Tensor) – Parameter to initialize
- scale (float) – Initialize uniformly in the range [-scale, scale]
Returns: initialization is done in place
Return type: None
-
logm(x, y)[source]¶ Logarithmic map for manifold. Takes a point y located on the manifold and projects it into the tangent space of x
Parameters: - x (Tensor) – reference point defining the tangent space
- y (Tensor) – point to be mapped into x’s tangent space
-
normalize(u)[source]¶ Perform any type of normalization to a Tensor. Examples include fixing a vector to the hyperboloid (Lorentz model) or restricting the norm of a vector
Parameters: u (Tensor) – vectors to normalize
Returns: Normalized tensor
Return type: Tensor
-
ptransp(x, y, v, ix=None, out=None)[source]¶ Parallel transport for manifold. Assuming v is in the tangent space of x, ptransp will perform parallel transport into the tangent space of y
Parameters: - x (Tensor) – starting point
- y (Tensor) – end point
- v (Tensor) – point in tangent space
Returns: embedding parallel transported from the tangent space of x to the tangent space of y
Return type: Tensor
hype.lorentz¶
-
class
hype.lorentz.LorentzManifold(eps=1e-12, _eps=1e-05, norm_clip=1, max_norm=1000000.0, debug=False, **kwargs)[source]¶ Lorentz model of hyperbolic geometry. This is the manifold used in “Learning Continuous Hierarchies in the Lorentz Model of Hyperbolic Geometry” (Nickel et al., 2018)
-
distance(u, v)[source]¶ See distance()
\(d(u, v) = \text{acosh}(-\langle u, v \rangle_L)\)
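The distance formula can be checked by hand with a small sketch. The code below is an illustration in plain Python lists and is not the library's Tensor implementation; the helper `lift` and the clamping of the `acosh` argument are assumptions made for this example.

```python
import math

def ldot(u, v):
    """Lorentzian scalar product: -u_0 * v_0 + sum over i >= 1 of u_i * v_i."""
    return -u[0] * v[0] + sum(a * b for a, b in zip(u[1:], v[1:]))

def lift(x):
    """Hypothetical helper: place spatial coordinates x on the hyperboloid <p, p>_L = -1."""
    return [math.sqrt(1.0 + sum(a * a for a in x))] + list(x)

def lorentz_distance(u, v):
    # Clamp to 1 so rounding error cannot push the argument outside acosh's domain.
    return math.acosh(max(1.0, -ldot(u, v)))

origin = lift([0.0, 0.0])            # [1, 0, 0]
point = lift([math.sinh(1.0), 0.0])  # [cosh(1), sinh(1), 0] up to rounding
print(lorentz_distance(origin, point))  # close to 1, the hyperbolic distance from the origin
```

Because `point` is built as (cosh 1, sinh 1, 0), its Lorentzian product with the origin is -cosh 1, so the distance comes out at approximately 1.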
-
expm(p, d_p, lr=None, out=None, normalize=False)[source]¶ See expm()
\(\exp_p(d_p) = \text{cosh}(||d_p||_L)p + \text{sinh}(||d_p||_L) \frac{d_p}{||d_p||_L}\)
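As a rough illustration of the exponential map formula, the sketch below evaluates it on plain Python lists. It ignores the `lr`, `out`, and `normalize` arguments of the documented method, and the small-norm guard is an assumption of this example rather than documented behaviour.

```python
import math

def ldot(u, v):
    """Lorentzian scalar product: -u_0 * v_0 + sum over i >= 1 of u_i * v_i."""
    return -u[0] * v[0] + sum(a * b for a, b in zip(u[1:], v[1:]))

def lorentz_expm(p, d_p):
    """exp_p(d_p) for a tangent vector d_p at p, i.e. <p, d_p>_L = 0."""
    norm = math.sqrt(max(ldot(d_p, d_p), 0.0))  # Lorentzian norm of d_p
    if norm < 1e-12:
        # Assumption: a (near) zero tangent vector leaves p unchanged.
        return list(p)
    c = math.cosh(norm)
    s = math.sinh(norm) / norm
    return [c * p_i + s * d_i for p_i, d_i in zip(p, d_p)]

p = [1.0, 0.0, 0.0]    # origin of the hyperboloid
d_p = [0.0, 0.3, 0.4]  # tangent at p, Lorentzian norm 0.5
q = lorentz_expm(p, d_p)

print(ldot(q, q))               # close to -1: q stays on the hyperboloid
print(math.acosh(-ldot(p, q)))  # close to 0.5: q lies at distance ||d_p||_L from p
```

The two printed values check the defining properties of the map: the result remains on the manifold, and it lies at a geodesic distance equal to the norm of the tangent vector.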
-
init_weights(w, irange=1e-05)[source]¶ Same as init_weights(), but also fixes the normalized embeddings to the hyperboloid
-
static
ldot(u, v, keepdim=False)[source]¶ Computes the Lorentzian scalar product between u and v
\(\langle u, v \rangle_L = -u_0 * v_0 + \sum_{i=1}^n u_i * v_i\)
Parameters: - u (Tensor) – embedding
- v (Tensor) – embedding
Returns: Tensor
-
normalize(w)[source]¶ See normalize()
-
hype.poincare¶
-
class
hype.poincare.PoincareManifold(eps=1e-05, **kwargs)[source]¶ Poincaré Ball model of hyperbolic geometry. This is the manifold used in “Poincaré Embeddings for Learning Hierarchical Representations” (Nickel et al., 2017)
Parameters: eps (float) – \(\epsilon\) value to restrict the radius of the ball. This is used to prevent numerical overflow/underflow
-
static
dim(dim)¶ Add any additional dimensions necessary for the manifold
Parameters: dim (int) – dimension specified by user
Returns: int
-
distance(u, v)[source]¶ See distance()
\(d(u, v) = \text{arcosh}\left( 1 + 2 \frac{||u - v||^2}{(1 - ||u||^2)(1 - ||v||^2)} \right)\)
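A minimal sketch of the Poincaré distance on plain Python lists follows. Clamping the squared norms to at most 1 - eps is an assumption about how the `eps` parameter might be applied; the library operates on Tensors and may differ in detail.

```python
import math

def poincare_distance(u, v, eps=1e-5):
    """Distance in the Poincaré ball between points u and v (lists of floats)."""
    # Assumption: squared norms are clamped below 1 so the denominator stays positive.
    sq_u = min(sum(a * a for a in u), 1.0 - eps)
    sq_v = min(sum(b * b for b in v), 1.0 - eps)
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    x = 1.0 + 2.0 * sq_diff / ((1.0 - sq_u) * (1.0 - sq_v))
    return math.acosh(x)

origin = [0.0, 0.0]
point = [0.5, 0.0]
print(poincare_distance(origin, point))  # close to log(3) = 2 * atanh(0.5)
```

For a point at Euclidean radius r from the origin, the formula reduces to 2 atanh(r), which gives log 3 for r = 0.5.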
-
init_weights(w, scale=0.0001)¶ Initialize the weights of a Tensor
Parameters: - w (Tensor) – Parameter to initialize
- scale (float) – Initialize uniformly in the range [-scale, scale]
Returns: initialization is done in place
Return type: None
-
normalize(u)¶ See normalize()
hype.euclidean¶
-
class
hype.euclidean.EuclideanManifold(max_norm=1, **kwargs)[source]¶
-
static
dim(dim)¶ Add any additional dimensions necessary for the manifold
Parameters: dim (int) – dimension specified by user
Returns: int
-
distance(u, v)[source]¶ See distance()
\(d(u, v) = \sum_{i=1}^{n} (u_i - v_i)^2\)
-
init_weights(w, scale=0.0001)¶ Initialize the weights of a Tensor
Parameters: - w (Tensor) – Parameter to initialize
- scale (float) – Initialize uniformly in the range [-scale, scale]
Returns: initialization is done in place
Return type: None
-
normalize(u)[source]¶ See normalize()
Dataloaders¶
hype.graph_dataset¶
-
class
hype.graph_dataset.BatchedDataset¶ Create a dataset for training hyperbolic embeddings. Rather than allocating many tensors for individual dataset items, we produce a single batch in each iteration. This allows a single Tensor allocation for the entire batch, which is then filled in place.
Parameters: - idx (ndarray[ndims=2]) – Indexes of objects corresponding to co-occurrence. I.e., if idx[0, :] == [4, 19], then item 4 co-occurs with item 19
- weights (ndarray[ndims=1]) – Weights for each co-occurrence. Corresponds to the number of times a pair co-occurred. (Equal length to idx)
- nnegs (int) – Number of negative samples to produce with each positive
- objects (list[str]) – Mapping from integer ID to hashtag string
- batch_size (int) – Size of each minibatch
- num_workers (int) – Number of threads to use to produce each batch
- burnin (bool) – perform frequency based negative sampling
-
next()¶ Get the next minibatch
Returns: inputs, targets
Return type: Tuple[Tensor, Tensor]
-
nnegatives()¶ Number of negative samples to use.
Returns: int
Models¶
hype.graph¶
-
hype.graph.eval_reconstruction(adj, lt, distfn, workers=1, progress=False)[source]¶ Reconstruction evaluation. Evaluate how well the embedding is able to reconstruct the original input graph. Specifically, for each node, we compute all of its nearest neighbors in the embedding space and rank them amongst its non-neighbors.
Parameters: - adj (dict[int, set[int]]) – Adjacency list mapping each object to its neighbors
- lt (torch.Tensor[N, dim]) – Embedding table with N embeddings and dim dimensionality
- distfn (Callable[[Tensor, Tensor], Tensor]) – distance function to use for computing nearest neighbors in embedding space
- workers (int) – number of workers to use
- progress (bool) – display progress bar
Returns: mean_rank, map_rank
Return type: Tuple[float, float]
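To make the two returned metrics concrete, here is a simplified, single-process sketch of a reconstruction evaluation. It is not the library's implementation: the function name `reconstruction_metrics` is hypothetical, embeddings are plain lists instead of a Tensor table, and ties in distance are not handled specially.

```python
def reconstruction_metrics(adj, emb, distfn):
    """Return (mean_rank, mean_average_precision) for adjacency dict adj."""
    ranks, ap_scores = [], []
    for node, neighbors in adj.items():
        # Sort every other object by its distance to the current node.
        others = [o for o in range(len(emb)) if o != node]
        order = sorted(others, key=lambda o: distfn(emb[node], emb[o]))
        hits, precisions = 0, []
        for pos, other in enumerate(order, start=1):
            if other in neighbors:
                hits += 1
                # Rank among non-neighbors: discount true neighbors already seen.
                ranks.append(pos - hits + 1)
                precisions.append(hits / pos)
        if precisions:
            ap_scores.append(sum(precisions) / len(precisions))
    return sum(ranks) / len(ranks), sum(ap_scores) / len(ap_scores)

# Four objects embedded on a line; object 0 has one close and one distant neighbor.
emb = [[0.0], [1.0], [3.0], [7.0]]
adj = {0: {1, 3}, 1: {0}}
mean_rank, map_rank = reconstruction_metrics(adj, emb, lambda u, v: abs(u[0] - v[0]))
print(mean_rank, map_rank)
```

In this toy graph the distant neighbor of object 0 is ranked behind one non-neighbor, which raises the mean rank above 1 and lowers the mean average precision below 1.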
-
hype.graph.load_edge_list(path, symmetrize=False)[source]¶ Load an edgelist dataset in CSV format. The CSV file must have at least 3 columns: id1, id2, and weight. If the dataset is directed, then it is assumed that id2 is the parent of id1.
Parameters: - path (str) – path to the CSV file
- symmetrize (bool) – If set to True, then for every edge A -> B, we create a symmetric edge B -> A
Returns: A tuple containing: idx, an array of edges; objects, a list of the unique objects in the graph; and weights, an array of the same length as idx containing the weight of each edge
Return type: Tuple[np.ndarray[ndim=2], list[str], np.ndarray[ndim=1]]
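The expected CSV layout and the shape of the returned tuple can be illustrated with a standard-library sketch. The function below is hypothetical: it reads from a file-like object instead of a path, returns plain lists instead of NumPy arrays, and assigns integer IDs in sorted name order, which may not match the ordering the library uses.

```python
import csv
import io

def load_edge_list_sketch(handle, symmetrize=False):
    """Parse rows of id1,id2,weight into (idx, objects, weights)."""
    edges = []
    for row in csv.DictReader(handle):
        weight = float(row["weight"])
        edges.append((row["id1"], row["id2"], weight))
        if symmetrize:
            # For every edge A -> B, also add B -> A with the same weight.
            edges.append((row["id2"], row["id1"], weight))
    # Assumption: IDs follow sorted name order.
    objects = sorted({name for a, b, _ in edges for name in (a, b)})
    index = {name: i for i, name in enumerate(objects)}
    idx = [[index[a], index[b]] for a, b, _ in edges]
    weights = [w for _, _, w in edges]
    return idx, objects, weights

csv_text = "id1,id2,weight\ndog,mammal,1\ncat,mammal,1\nmammal,animal,2\n"
idx, objects, weights = load_edge_list_sketch(io.StringIO(csv_text))
print(objects)
print(idx)
print(weights)
```

Each row of `idx` pairs a child (`id1`) with its parent (`id2`), and `weights` has one entry per row of `idx`.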
hype.sn¶
Optimizers¶
hype.rsgd¶
-
class
hype.rsgd.RiemannianSGD(params, lr=required, rgrad=required, expm=required)[source]¶ Riemannian stochastic gradient descent.
Parameters: - rgrad (Function) – Function to compute the Riemannian gradient from the Euclidean gradient
- retraction (Function) – Function to update the retraction of the Riemannian gradient
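A single update step can be sketched as follows, assuming the optimizer first converts the Euclidean gradient with `rgrad` and then moves the parameter with `expm`. The Poincaré-ball helpers below (`poincare_rgrad`, `retract`) are hypothetical stand-ins chosen for illustration, operating on plain lists; they are not the library's functions.

```python
import math

def poincare_rgrad(p, grad):
    """Rescale a Euclidean gradient by (1 - ||p||^2)^2 / 4 (Poincaré ball metric)."""
    scale = (1.0 - sum(x * x for x in p)) ** 2 / 4.0
    return [scale * g for g in grad]

def retract(p, d_p, eps=1e-5):
    """Illustrative retraction: add the update, then pull the point back inside the ball."""
    q = [x + d for x, d in zip(p, d_p)]
    norm = math.sqrt(sum(x * x for x in q))
    max_norm = 1.0 - eps
    if norm > max_norm:
        q = [x * max_norm / norm for x in q]
    return q

def rsgd_step(p, grad, lr, rgrad, expm):
    """One Riemannian SGD step: Riemannian gradient, then a move along -lr * gradient."""
    d_p = rgrad(p, grad)
    return expm(p, [-lr * d for d in d_p])

p = [0.5, 0.0]
grad = [1.0, 0.0]
new_p = rsgd_step(p, grad, lr=0.1, rgrad=poincare_rgrad, expm=retract)
print(new_p)
```

Passing `rgrad` and `expm` as callables keeps the step itself independent of the manifold, which is the reason the optimizer takes them as arguments.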
Utils¶
hype.train¶
-
hype.train.train(device, model, data, optimizer, opt, log, rank=1, queue=None, ctrl=None, checkpointer=None, progress=False)[source]¶ Function to train embeddings
Parameters: - device (torch.device) – which device to train on
- model (torch.nn.Module) – model to train
- data (BatchedDataset or AdjacencyDataset) – dataloader
- optimizer (torch.optim.Optimizer) – optimizer
- opt (SimpleNamespace) – command line options
- log (logging.Logger) – log
- rank (int) – thread rank if using multiple training threads
- queue (multiprocessing.Queue) – Queue to put epoch stats into if using multiple threads/asynchronous control
- checkpointer (Callable) – checkpointing function
- progress (bool) – whether or not to display progress bar per epoch
hype.checkpoint¶
-
class
hype.checkpoint.LocalCheckpoint(path, include_in_all=None, start_fresh=False)[source]¶ Module for managing model checkpoints.
Parameters: - path (str) – path to save the checkpoint to
- include_in_all (dict) – a dictionary of objects to save in every call to save()
- start_fresh (bool) – If True, ignore any existing checkpoint; otherwise, initialize from the previous checkpoint
-
initialize(params)[source]¶ Initialize the checkpoint. If start_fresh is True, then params is returned. Otherwise, if a checkpoint at self.path exists, the checkpoint is loaded and returned
Parameters: params (dict) – checkpoint contents
Returns: Either params or the contents of the checkpoint stored at self.path
Return type: dict
-
save(params, tries=10)[source]¶ Save a checkpoint containing params merged with self.include_in_all
Parameters: - params (dict) – data to store in checkpoint. This is merged with anything supplied to include_in_all in the constructor
- tries (int) – number of attempts to save the checkpoint. If all attempts fail, no checkpoint is saved
Returns: None
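The documented behaviour of initialize() and save() can be mimicked with a small stand-in class. This is a sketch under stated assumptions, not the library's code: `CheckpointSketch` is a hypothetical name, it serializes with `pickle` where the library may use a different mechanism, and it lets `params` override `include_in_all` on key collisions, which the documentation does not specify.

```python
import os
import pickle
import tempfile

class CheckpointSketch:
    """Hypothetical stand-in mirroring the documented LocalCheckpoint interface."""

    def __init__(self, path, include_in_all=None, start_fresh=False):
        self.path = path
        self.include_in_all = include_in_all or {}
        self.start_fresh = start_fresh

    def initialize(self, params):
        # Load an existing checkpoint unless a fresh start was requested.
        if not self.start_fresh and os.path.isfile(self.path):
            with open(self.path, "rb") as f:
                return pickle.load(f)
        return params

    def save(self, params, tries=10):
        # Assumption: params take precedence over include_in_all on shared keys.
        payload = {**self.include_in_all, **params}
        for _ in range(tries):
            try:
                tmp = self.path + ".tmp"
                with open(tmp, "wb") as f:
                    pickle.dump(payload, f)
                os.replace(tmp, self.path)  # swap the finished file into place
                return
            except OSError:
                continue  # retry; after the last failure nothing is saved

with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "ckpt.pkl")
    ckpt = CheckpointSketch(path, include_in_all={"dim": 5})
    first = ckpt.initialize({"epoch": 0})  # no file yet, so params come back
    ckpt.save({"epoch": 3})
    resumed = CheckpointSketch(path).initialize({"epoch": 0})
    fresh = CheckpointSketch(path, start_fresh=True).initialize({"epoch": 0})

print(first, resumed, fresh)
```

Writing to a temporary file and renaming it is a design choice of this sketch so that an interrupted save never leaves a partially written checkpoint at `path`.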