# Liliana Model Set 125

With the ever-increasing growth of interconnected data, a large number of real-world scenarios and applications can profitably be modeled using complex networks. In this context, one key aspect is how to incorporate information about the structure of the graph into machine learning models. Graph representation learning approaches have gained increasing attention in recent years, since they are designed to overcome the limitations of traditional, hand-engineered feature extraction methods by learning a mapping that embeds nodes, or entire (sub)graphs, as points in a low-dimensional vector space. This mapping is optimized so that geometric relationships in the learned space reflect the structure of the original graph. Once the embedding space has been optimized, the learned embeddings can be used as input features for downstream machine/deep learning tasks for exploration and/or prediction (e.g., node classification, community detection and evolution, link prediction).


The approach integrates and exploits across-layer information effectively, with the option of assigning different weights to different layers or treating them equally, as needed. This also avoids a simplistic approach based on network flattening, so that dependencies between the layers can be retained, including both the links between the replicas of the nodes in different layers (pillar-edges) and any other inter-layer edges. Moreover, with respect to modeling the across-layer information related to pillar-edges, we also propose a variant of the main method, which will be referred to as Co-MLHAN-SA.

Our proposed Co-MLHAN is a self-supervised graph representation learning approach conceived for multilayer heterogeneous attributed networks. As previously discussed, a key novelty of Co-MLHAN is its higher expressiveness w.r.t. existing methods, since heterogeneity is assumed to hold at both node and edge levels, possibly for each layer of the network. The capability of handling graphs that are simultaneously multilayer, heterogeneous, and attributed enables Co-MLHAN to better model complex real-world scenarios, thus incorporating the most information when generating node embeddings.

The second stage models two graph views, named network schema view and meta-path view, able to encode the local and global structure surrounding nodes, respectively, while exploiting multilayer information.

Following Wang et al. (2021), given a target entity, the network schema view is used to capture the local structure, by modeling information from all the direct neighbors of the corresponding target nodes, whereas the meta-path view is used to capture the global structure, by modeling information from all the nodes connected to the corresponding target nodes through a meta-path and from the pillar-edges derived by the corresponding meta-path based graph.
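To make the distinction between the two views concrete, here is a minimal sketch on a single layer. The toy movie-actor graph, the Movie-Actor-Movie (MAM) meta-path, and all identifiers are hypothetical and not taken from the original work; pillar-edges are left out for simplicity.

```python
# Toy heterogeneous graph for one layer: movies (targets) linked to actors.
# All identifiers are hypothetical and only illustrate the two views.
movie_actors = {
    "m1": {"a1", "a2"},
    "m2": {"a2"},
    "m3": {"a3"},
}

def schema_neighbors(movie):
    """Network schema view: the direct neighbors of the target node."""
    return set(movie_actors[movie])

def metapath_neighbors(movie):
    """Meta-path view: other movies reachable via Movie-Actor-Movie (MAM)."""
    actors = movie_actors[movie]
    return {
        other
        for other, cast in movie_actors.items()
        if other != movie and cast & actors  # share at least one actor
    }

print(schema_neighbors("m1"))    # local structure: the actors of m1
print(metapath_neighbors("m1"))  # global structure: movies sharing an actor
```

Excluding the target node from its own meta-path neighborhood is a choice made for this sketch; whether self-connections are kept depends on the model.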

Note that we refer to relation type and not to node type, to remain consistent in the event that target nodes are connected to a certain node type through multiple relationships. We point out that, in accordance with the infomax principle, the network schema view does not model pillar-edges, since they are processed in the other view. We also specify that intra-layer edges in different layers are seen as different types of relations, reflecting the separation into layers according to a certain aspect. In practice, layers are an additional way of distinguishing the context of relations.

where \(\beta^{(l)}\) is the learned attention coefficient for layer \(G_l\), computed via the same attention model as in Eq. 6, except that in this case the learnable weights are shared by all layers.
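Since Eq. 6 is not reproduced here, the following is only a sketch of how per-layer coefficients could be computed with weights shared by all layers. It assumes a HAN-style attention model (a `tanh` projection scored against a query vector, averaged over nodes, then a softmax over layers) and target nodes that are aligned across layers; the names `layer_attention`, `W`, `b`, and `q` are illustrative.

```python
import numpy as np

def layer_attention(layer_embeddings, W, b, q):
    """Score each layer with shared weights and fuse the layer embeddings.

    layer_embeddings: list of (n, d) arrays, one per layer (rows aligned).
    W (d, d), b (d,), q (d,): parameters shared by all layers (assumption).
    Returns the softmax coefficients (one per layer) and the fused (n, d) array.
    """
    # One scalar score per layer: mean over nodes of q^T tanh(W z + b).
    scores = np.array(
        [np.mean(np.tanh(Z @ W + b) @ q) for Z in layer_embeddings]
    )
    scores = scores - scores.max()  # numerical stability for the softmax
    beta = np.exp(scores) / np.exp(scores).sum()
    # Weighted sum of the per-layer embeddings.
    fused = sum(coef * Z for coef, Z in zip(beta, layer_embeddings))
    return beta, fused

rng = np.random.default_rng(0)
Z_layers = [rng.normal(size=(4, 3)) for _ in range(2)]
beta, fused = layer_attention(
    Z_layers, rng.normal(size=(3, 3)), np.zeros(3), rng.normal(size=3)
)
```

The coefficients sum to one, so the fused embedding is a convex combination of the per-layer embeddings.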

In the meta-path view, across-layer dependencies are modeled as particular types of meta-paths, i.e., across-layer meta-paths. They refer to the same composite relation, with the additional constraint that the terminal nodes belong to different layers, and that the intermediate node matches a pillar-edge, i.e., it corresponds to an entity (of type different from the target one) with both instances involved in the composite relation. An example is illustrated in Fig. 7. We define the set of across-layer meta-paths, \(\mathcal{M}^{\Updownarrow}\), as the union of all meta-paths of any type and defined over all layer-pairs.
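As an illustration of the constraint above, the sketch below finds across-layer Movie-Actor-Movie neighbors in a toy two-layer network: the terminal movies live in different layers, and the intermediate actor must have an instance in both layers. The data and the function name `across_layer_neighbors` are hypothetical.

```python
# Toy two-layer network: in each layer, movies (targets) are linked to actors.
# An actor appearing in both layers has two instances joined by a pillar-edge.
layers = {
    1: {"m1": {"a1", "a2"}, "m2": {"a2"}},
    2: {"m3": {"a2", "a3"}, "m4": {"a4"}},
}

def across_layer_neighbors(movie, layer, other_layer):
    """Movies in `other_layer` reachable from `movie` through a shared actor."""
    actors_here = layers[layer][movie]
    return {
        other
        for other, cast in layers[other_layer].items()
        if cast & actors_here  # some actor has an instance in both layers
    }

print(across_layer_neighbors("m1", 1, 2))  # m1 and m3 share actor a2
```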

where \(\sigma(\cdot)\) is a non-linear activation function (default is \(\mathrm{ReLU}(\cdot) = \max(0,\cdot)\)), \(\mathbf{W}^{(k,l)}\) is the trainable weight matrix for the m-th meta-path in the k-th convolutional layer of shape (d, d), and \(\widetilde{\mathbf{D}}^{l}_{ii}=\sum_j \widetilde{\mathbf{A}}^{l}_{ij}\) is the degree matrix derived from \(\widetilde{\mathbf{A}}_l = \mathbf{A}_l + \mathbf{I}^{l}_{n}\), with \(\mathbf{I}^{l}_{n}\) as the identity matrix of size \(n_l\), and \(n_l\) the number of nodes of layer \(G_l\). The GCN model for across-layer meta-paths is built similarly, considering \(N^{\Updownarrow}(\cdot)\) instead of \(N^{\Leftrightarrow}(\cdot)\) and \(\pi\) instead of \(l\).
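The quantities defined above (self-loops, degree matrix, weight matrix, ReLU) can be put together in a minimal NumPy sketch of one convolutional step. Because the equation itself is not reproduced here, the symmetric normalization \(\widetilde{\mathbf{D}}^{-1/2}\widetilde{\mathbf{A}}\widetilde{\mathbf{D}}^{-1/2}\) of the standard GCN is an assumption, and `gcn_layer` is an illustrative name.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One GCN step on a meta-path based graph of a single layer.

    A: (n, n) adjacency matrix, H: (n, d) node features, W: (d, d) weights.
    Assumes the standard symmetric normalization of the adjacency matrix.
    """
    A = np.asarray(A, dtype=float)
    A_tilde = A + np.eye(A.shape[0])        # add self-loops: A + I
    deg = A_tilde.sum(axis=1)               # D_ii = sum_j A_tilde_ij
    D_inv_sqrt = np.diag(deg ** -0.5)
    A_hat = D_inv_sqrt @ A_tilde @ D_inv_sqrt
    return np.maximum(0.0, A_hat @ H @ W)   # ReLU(x) = max(0, x)

A = np.array([[0, 1], [1, 0]])
out = gcn_layer(A, np.eye(2), np.eye(2))
```

With self-loops every node has degree at least one, so the inverse square root of the degree is always defined.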

To give an intuition, we model across-layer information downstream of semantic attention, by accounting for another level of attention, i.e., across-layer attention (by analogy with the network schema view).

Aggregating information of different instances of the same type (MPVE-SA-1). We still use the notation \(N^\Leftrightarrow (\cdot )\) and \(N^\Updownarrow (\cdot )\) to indicate the set of within-layer and across-layer neighbors, respectively. While the definition of \(N^\Leftrightarrow (\cdot )\) does not change w.r.t. Eq. 8, the definition of \(N^\Updownarrow (\cdot )\) of the Co-MLHAN-SA approach is modified in the modeling of pillar-edges, by directly considering all the instances of the same target entities in other layers, as shown in Eq. 15:

Unlike MPVE-1, the inter-layer dependencies are taken into account by the GNN, employing a modified version of the propagation rule that can handle the supra-adjacency matrix as input. We thus build for each meta-path its corresponding meta-path based supra-graph, i.e., a graph where pillar-edges exist between every node and its counterpart in other coupled layers. In our setting, we instantiate \(f_m\) with a multi-layer GCN model (Zangari et al. 2021), as shown in Eq. 17:
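The supra-graph construction can be sketched as a block matrix: each diagonal block holds the meta-path based adjacency of one layer, and each off-diagonal block links every node to its counterpart in another layer. The sketch assumes that all layers share the same ordered node set and that every pair of layers is coupled; the `coupling` weight and the function name are hypothetical, and the modified propagation rule itself is not shown.

```python
import numpy as np

def supra_adjacency(layer_adjs, coupling=1.0):
    """Build the supra-adjacency matrix of a meta-path based supra-graph.

    layer_adjs: list of (n, n) adjacency matrices, one per layer, with the
    same node ordering in every layer (assumption of this sketch).
    coupling: weight of the pillar-edges (hypothetical parameter).
    """
    num_layers = len(layer_adjs)
    n = layer_adjs[0].shape[0]
    S = np.zeros((num_layers * n, num_layers * n))
    for i, A in enumerate(layer_adjs):
        rows = slice(i * n, (i + 1) * n)
        S[rows, rows] = A  # intra-layer edges on the diagonal block
        for j in range(num_layers):
            if j != i:
                cols = slice(j * n, (j + 1) * n)
                S[rows, cols] = coupling * np.eye(n)  # pillar-edges
    return S

A1 = np.array([[0.0, 1.0], [1.0, 0.0]])
A2 = np.zeros((2, 2))
S = supra_adjacency([A1, A2])
```

The resulting matrix can then be passed to a GCN-style model in place of a single-layer adjacency matrix.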

We found the optimal hyperparameters for the representation learning process via grid search. Specifically, we trained the model using the Adam optimization algorithm (Kingma and Ba 2017) in full-batch mode for up to 10,000 epochs, with early stopping based on the contrastive loss value and patience set to 30 (i.e., training stops if the loss value does not decrease for 30 consecutive epochs), and with \(\lambda=0.5\) for the convex combination of the two contrastive losses. The learning rate was set to 0.0001, and dropout regularization with \(p=0.3\) was applied to the transformed features \(\mathbf{h}\).
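The stopping criterion and the loss combination described above can be sketched in a framework-agnostic way. The names `combined_loss`, `train`, and `step_fn` are hypothetical; `step_fn` stands in for one full-batch optimization step that returns the current loss. Which of the two contrastive losses is weighted by \(\lambda\) is an assumption of this sketch (with \(\lambda=0.5\) the choice makes no difference).

```python
LAMBDA = 0.5         # weight of the convex combination
PATIENCE = 30        # epochs without improvement before stopping
MAX_EPOCHS = 10_000  # upper bound on the number of epochs

def combined_loss(loss_schema, loss_metapath, lam=LAMBDA):
    """Convex combination of the two view-specific contrastive losses."""
    return lam * loss_schema + (1.0 - lam) * loss_metapath

def train(step_fn, max_epochs=MAX_EPOCHS, patience=PATIENCE):
    """Run step_fn(epoch) until the loss stops decreasing for `patience` epochs.

    Returns the last epoch that was run and the best loss observed.
    """
    best = float("inf")
    stale = 0
    for epoch in range(1, max_epochs + 1):
        loss = step_fn(epoch)
        if loss < best:
            best, stale = loss, 0
        else:
            stale += 1
            if stale >= patience:
                break
    return epoch, best

# Toy loss curve: improves for 5 epochs, then plateaus at a worse value.
last_epoch, best_loss = train(lambda e: 1.0 / e if e <= 5 else 1.0)
```

In this toy run the best loss is reached at epoch 5, so training stops 30 epochs later.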

In this section, we summarize the main findings of the empirical evaluation of our framework. We evaluated it on two novel network datasets derived from IMDb (cf. Appendix 2), which are simultaneously multilayer, heterogeneous, and attributed. Specifically, we modeled IMDb as a temporal network with two layers, where each layer is heterogeneous and corresponds to years of movie releases. The first network dataset, named IMDb-MLH, was conceived for the comparative evaluation of our framework, since it fulfills the requirements of our competitors. The second network dataset, named IMDb-MLH-mb, was designed to reduce class imbalance and is not applicable to the competitors. Thus, we used it to investigate different input settings of our methods, i.e., Co-MLHAN and Co-MLHAN-SA.

We discuss below the most relevant GNN-based approaches that are designed for different aspects of complex networks and are closely related to our approach. In recent years, several works have focused on extending popular GNN models such as GCN (Kipf and Welling 2017) and GAT (Velickovic et al. 2018) to the heterogeneous or multilayer case; this extension is still an open research problem. In this section, we explore both semi-supervised and unsupervised learning paradigms, with emphasis on contrastive learning approaches in unsupervised contexts.

HetGNN (Zhang et al. 2019) introduces a random walk with restart strategy to sample a fixed-size set of strongly correlated heterogeneous neighbors for each node, and groups them on the basis of their type. It employs two recurrent neural network modules, which encode deep feature interactions of heterogeneous contents and content embeddings of different neighboring groups, respectively, and whose outputs are further combined by an attention mechanism. Co-MLHAN shares with HetGNN the modeling approach to external content encoding.

Other models leverage meta-path based neighbors and differ in the information captured along the meta-paths. HAN (Wang et al. 2019) focuses only on the information associated with the endpoint nodes of meta-paths. It employs both node-level and semantic-level attention. Based on the learned attention values, the model generates node embeddings by aggregating features from meta-path based neighbors in a hierarchical manner. In addition to the information of the terminal nodes in meta-paths, MAGNN (Fu et al. 2020) also incorporates information from intermediate nodes along the meta-paths. It uses intra-meta-path aggregation to incorporate intermediate nodes, and inter-meta-path aggregation to combine messages from multiple meta-paths. DHGCN (Manchanda et al. 2021) incorporates both the information of the nodes along the meta-paths and the information in the ego-network of the endpoint nodes, i.e., the information coming from the direct neighbors of the terminal nodes. It utilizes a two-step schema-aware hierarchical approach, performing attentio