Document Details

Document Type : Thesis 
Document Title :
GRAPH-STRUCTURED DATA LEARNING USING NEURAL NETWORKS
تعلم البيانات المهيكلة ببيان باستخدام الشبكات العصبية
 
Subject : Faculty of Engineering 
Document Language : Arabic 
Abstract : The technological advancements of our age are accompanied by large amounts of varied data. Data taken from various aspects of life and from physical phenomena are considered raw. Raw data must be processed to extract useful information that facilitates decision making. Data pertaining to different phenomena come in different types, and the way data is described mathematically affects the choice of data processing tools. Machine learning algorithms, and artificial neural networks in particular, have proved to be excellent data processing tools. They enable the user to learn from the given data and make predictions that would traditionally require a human operator. The literature is rich with efforts to design neural network models with high prediction accuracy for many real-life challenges. Generic neural network models can be used for any type of data with varying degrees of success. Research on neural network models strives to fully understand the properties of the data at hand. This understanding leads to specialized learning models that can exploit the given data and process it most efficiently. For example, Recurrent Neural Networks (RNNs) are known to be suitable for time-domain data, and Convolutional Neural Networks (CNNs) are excellent for learning from image data. The graph domain is an abstract mathematical representation that can describe many kinds of real-life data. Other types of data, such as time series, 2D images and 3D meshes, can also be naturally adapted to the graph domain. Therefore, there is a need to develop specialized neural network models capable of learning from graph-structured data as efficiently as possible. The aim of this thesis is to design a neural network model that is able to learn from graph data and make predictions with relatively high accuracy. This work focuses on semi-supervised learning of graph data, a setting in which only some of the data points are labeled in the learning phase. 
The model should be able to predict the labels of the data points whose labels were not visible in the training phase. The thesis provides an overview of five prominent neural network models: Graph Neural Network (GNN), Graph Convolutional Networks (GCNs), Network of GCNs (N-GCN), Graph Attention Networks (GATs), and Attention-Based Graph Neural Networks (AGNNs). The thesis then details three graph diffusion processes. Diffusion processes are used to propagate the data through the underlying graph structure. The first two diffusion processes are formulated as minimization problems that aim to fulfill the smoothness assumption, which is the expectation that graph nodes close to each other should exhibit similar features. The smoothness assumption is balanced by a term that restrains the data points from deviating too much from their original values after the diffusion process. The third diffusion process is based on the idea of a graph random walk. We propose using a combination of the three diffusion processes as a new process. The new combined diffusion process is used in designing a neural network model for graph-structured data. The proposed model is named Combined Graph Diffusion Embedding Network (CGDEN). The model was tested on two benchmark citation datasets: Cora and Citeseer. The classification accuracy of the model was compared against a number of baselines from the literature. The model achieved a classification accuracy of 84.9% on the Cora dataset and 73.4% on the Citeseer dataset. The Cora result is 1.9% higher than the previous best performance, while the Citeseer result is a 1.3% improvement on the previous best result. The state-of-the-art results of the model on these standard benchmark datasets demonstrate its ability to learn efficiently from, and make predictions on, graph-structured data. The citation datasets are modeled as undirected and unweighted graphs. 
For future work, the CGDEN model could be modified to accept directed and weighted graphs as inputs. There is also a need for a systematic way to find the optimum values of the hyperparameters in the model. 
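The abstract describes two kinds of graph diffusion: a minimization that trades smoothness over the graph against fidelity to the original values, and a random walk. The sketch below illustrates both ideas on a toy graph. It is not the CGDEN formulation, which the abstract does not give. It assumes a common textbook form of the smoothness objective, tr(FᵀLF) + mu·‖F − X‖², with L the symmetric normalized Laplacian, and a plain row-normalized random walk. The function names, the parameter `mu`, and the number of `steps` are all hypothetical choices made for illustration.

```python
import numpy as np


def normalized_laplacian(adj):
    """Symmetric normalized Laplacian L = I - D^{-1/2} A D^{-1/2}.

    Assumes an undirected, unweighted graph with no isolated nodes.
    """
    deg = adj.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(deg)
    return np.eye(len(adj)) - inv_sqrt[:, None] * adj * inv_sqrt[None, :]


def smooth_diffusion(adj, x, mu=1.0):
    """Minimize tr(F^T L F) + mu * ||F - X||_F^2 (an assumed objective).

    Setting the gradient to zero gives (L + mu I) F = mu X.
    """
    lap = normalized_laplacian(adj)
    return np.linalg.solve(lap + mu * np.eye(len(adj)), mu * x)


def random_walk_diffusion(adj, x, steps=2):
    """Propagate features with P = D^{-1} A for a fixed number of hops."""
    p = adj / adj.sum(axis=1, keepdims=True)
    out = x
    for _ in range(steps):
        out = p @ out
    return out


# Toy path graph 0 - 1 - 2 - 3 with one scalar feature per node.
adj = np.array(
    [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]],
    dtype=float,
)
x = np.array([[1.0], [0.0], [0.0], [0.0]])

smoothed = smooth_diffusion(adj, x, mu=1.0)
walked = random_walk_diffusion(adj, x, steps=2)
```

A larger `mu` keeps the diffused features closer to their original values, while a smaller `mu` favors smoothness across neighboring nodes. How the thesis weights and combines its three processes is not stated in the abstract.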
Supervisor : Dr. Mohammad Moinuddin 
Thesis Type : Master Thesis 
Publishing Year : 1441 AH / 2019 AD
 
Co-Supervisor : Prof. Ubaid M. Al. Saggaf 
Added Date : Thursday, September 19, 2019 

Researchers

Researcher Name (Arabic) : عبدالله محسن الجفري
Researcher Name (English) : Al-Gafri, Abdullah Mohsen
Researcher Type : Researcher
Dr Grade : Master
Email :

Files

File Name : 45027.pdf
Type : pdf
Description :
