Ricgraph as a server on Linux
This page describes how to install and run Ricgraph in a multi-user environment on Linux. A multi-user environment means that you install Ricgraph on a Linux (virtual) machine, and that several people can log on to that machine, each with their own user id and password. Each person can use Ricgraph through a web link in their web browser. For other Ricgraph install options, start reading at Install and configure Ricgraph for a single user.
A Linux multi-user environment for Ricgraph differs from installing and using Ricgraph under your own user id, because you need to run the graph database backend and Ricgraph Explorer as a system user instead of under your own user id. If you run Ricgraph under your own user id, you will be the only user able to use it, and other people on the same machine who want to use Ricgraph have to install it for themselves. By installing Ricgraph as a server, as described on this page, Ricgraph is started automatically when your machine boots, and any user on that machine can use it.
To install and run Ricgraph in a multi-user environment, read Fast and recommended way to install Ricgraph as a server.
On this page, you can find:
- Fast and recommended way to install Ricgraph as a server
- Run Ricgraph scripts from the command line or as a cronjob
- Use a service unit file to run Ricgraph Explorer and the Ricgraph REST API
- Use Apache, WSGI, and ASGI to make Ricgraph Explorer and the Ricgraph REST API accessible from outside your virtual machine
- Use Nginx, WSGI, and ASGI to make Ricgraph Explorer and the Ricgraph REST API accessible from outside your virtual machine
- How to install Ricgraph and Ricgraph Explorer on SURF Research Cloud
- Steps to take to install Ricgraph as a server by hand
1 Fast and recommended way to install Ricgraph as a server
To follow this procedure, you need to be able to change to user root.
Login as user root.
sudo bash
Get the most recent Ricgraph Makefile. Type:
cd
wget https://raw.githubusercontent.com/UtrechtUniversity/ricgraph/main/Makefile
Read more at Ricgraph Makefile.
Install Neo4j Community Edition. Type:
make install_enable_neo4j_community
On success, the Makefile will print installed successfully.
Download and install Ricgraph in system directory /opt.
make install_ricgraph_server
On success, the Makefile will print installed successfully.
Harvest two source systems in Ricgraph:
cd /opt/ricgraph_venv
make run_bash_script
This will harvest two source systems, the data repository Yoda and the Research Software Directory. It will print a lot of output, and it will take a few minutes. When ready, it will print Done.
To read more about harvesting data, see Ricgraph harvest scripts. To read more about writing harvesting scripts, see Ricgraph script writing.
Start Ricgraph Explorer to browse the information harvested:
cd /opt/ricgraph_venv
make run_ricgraph_explorer
The Makefile will tell you to go to your web browser, and go to http://127.0.0.1:3030. Read more at Ricgraph Explorer. For the Ricgraph REST API, read more on the Ricgraph REST API page.
Optional: use a service unit file to run Ricgraph Explorer and the Ricgraph REST API. Type:
make install_enable_ricgraphexplorer_restapi
On success, the Makefile will print installed successfully. Read more at Use a service unit file to run Ricgraph Explorer and the Ricgraph REST API.
Optional and possibly dangerous: use Apache or Nginx webserver, WSGI, and ASGI to make Ricgraph Explorer and the Ricgraph REST API accessible from outside your virtual machine. Read more at Use Apache…. or at Use Nginx…. On success, the Makefile will print installed successfully.
Exit from user root.
If everything succeeded, you are done installing Ricgraph as a server. If not, sections Steps to take to install Ricgraph as a server by hand or Install and start Neo4j Community Edition graph database backend may help in finding solutions.
2 Run Ricgraph scripts from the command line or as a cronjob
2.1 Using the Makefile
The Ricgraph Makefile can also be used to execute a Python script or a bash script. Such a script can be used to harvest the sources specific to your organization. The Makefile provides a command make run_python_script to run any Ricgraph Python script. You will need to add the name of the script to run using the Makefile command line parameter python_script, e.g.
make run_python_script python_script=[path]/[python script]
There is a similar command for running bash scripts:
make run_bash_script bash_script=[path]/[bash script]
Both make commands execute the script and the output will appear both on your screen and in a file. The Makefile will tell you the name of this log file.
Examples of commands you can use are:
harvest from the Research Software Directory:
make run_python_script python_script=harvest/harvest_rsd_to_ricgraph.py
harvest two sources without needing any keys or configuration:
make run_bash_script
run Ricgraph Explorer:
make run_ricgraph_explorer
2.2 In case you have installed Ricgraph as a server
After following the steps in Create a Python virtual environment and install Ricgraph in it, it is possible to run Ricgraph from the command line or as a cronjob. To be able to run these scripts you need to be user ricgraph and group ricgraph. You can do this by using user ricgraph in your crontab file (e.g. in /etc/crontab), or by using the command
sudo su - ricgraph
If you are finished with these commands, exit from user ricgraph.
Examples of commands you can use are:
harvest from the Research Software Directory:
cd /opt/ricgraph_venv/harvest; ../bin/python harvest_rsd_to_ricgraph.py
harvest two sources without needing any keys or configuration:
cd /opt/ricgraph_venv/harvest_multiple_sources; ./multiple_harvest_demo.sh
run Ricgraph Explorer:
cd /opt/ricgraph_venv/ricgraph_explorer; ../bin/python ricgraph_explorer.py
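As a hedged sketch of running one of these commands from cron, the wrapper below changes to the harvest directory, runs a script with the Python of the virtual environment, and appends the output to a log file. The function name run_harvest, the environment variable RICGRAPH_VENV, the script location run_harvest.sh, and the log file path are assumptions made for illustration; they are not part of Ricgraph.

```shell
#!/bin/sh
# Hypothetical cron wrapper, saved for example as /opt/ricgraph_venv/run_harvest.sh.
# A matching /etc/crontab line (fields: minute hour day month weekday user command):
# 0 2 * * * ricgraph /opt/ricgraph_venv/run_harvest.sh harvest_rsd_to_ricgraph.py /opt/ricgraph_venv/harvest_cron.log

run_harvest() {
    # RICGRAPH_VENV is an assumed override; the default is the path used on this page.
    venv="${RICGRAPH_VENV:-/opt/ricgraph_venv}"
    script="$1"
    logfile="$2"
    {
        echo "=== $(date) starting $script ==="
        # Run in a subshell so the caller's working directory is not changed.
        ( cd "$venv/harvest" && ../bin/python "$script" )
        status=$?
        echo "=== exit status $status ==="
    } >> "$logfile" 2>&1
    return "$status"
}

# Only run when called with a script name and a log file.
if [ "$#" -eq 2 ]; then
    run_harvest "$1" "$2"
fi
```

The wrapper returns the exit status of the harvest script, so a failed harvest can be detected by whatever calls it.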
2.3 In case you have installed Ricgraph for a single user
After following the steps in Create a Python virtual environment and install Ricgraph in it, it is possible to run Ricgraph from the command line. You do not need to be user ricgraph and group ricgraph. The following assumes your Python virtual environment is in your Linux home directory $HOME.
Examples of commands you can use are:
harvest from the Research Software Directory:
cd $HOME/ricgraph_venv/harvest; ../bin/python harvest_rsd_to_ricgraph.py
harvest all your favorite sources:
cd $HOME/ricgraph_venv/harvest_multiple_sources; ./multiple_harvest_demo.sh
run Ricgraph Explorer:
cd $HOME/ricgraph_venv/ricgraph_explorer; ../bin/python ricgraph_explorer.py
3 Use a service unit file to run Ricgraph Explorer and the Ricgraph REST API
Using a service unit file to run Ricgraph Explorer is very useful if you would like to set up a virtual machine that you want to use as a demo server, or if you would like to use the Ricgraph REST API. After the steps in this section, Ricgraph Explorer and the Ricgraph REST API are run automatically at the start of the virtual machine, so you can immediately start giving the demo.
For comparison, if you had installed the graph database backend and Ricgraph for a single user, as described in the documentation describing the installation and configuration of Ricgraph for a single user, after the start of the virtual machine, you would need to start the graph database backend, the virtual environment, and ricgraph_explorer.py by hand.
To use a service unit file, you can either use the Ricgraph Makefile and execute command:
make install_enable_ricgraphexplorer_restapi
or follow the steps below.
Using a service unit file will not expose Ricgraph Explorer, the Ricgraph REST API, and Ricgraph data to the outside world. All data will only be accessible in the virtual machine.
Follow the steps in Create a Python virtual environment and install Ricgraph in it.
Login as user root.
Install the Ricgraph Explorer service unit file: copy file ricgraph_server_config/ricgraph_explorer_gunicorn.service to /etc/systemd/system, type:
cp /opt/ricgraph_venv/ricgraph_server_config/ricgraph_explorer_gunicorn.service /etc/systemd/system
Make it run by typing:
systemctl enable ricgraph_explorer_gunicorn.service
systemctl start ricgraph_explorer_gunicorn.service
Check the log for any errors, use one of:
systemctl -l status ricgraph_explorer_gunicorn.service
journalctl -u ricgraph_explorer_gunicorn.service
Exit from user root.
Now you can use Ricgraph Explorer by typing http://localhost:3030 in your web browser (i.e., the web browser of the virtual machine). You can use the Ricgraph REST API by using the path http://localhost:3030/api followed by a REST API endpoint.
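If the virtual machine has no desktop browser at hand, a quick command line check is possible. The sketch below assumes that curl is installed and that the start page of Ricgraph Explorer answers with a success status; the function name explorer_is_up is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given URL answers with an HTTP success status.
explorer_is_up() {
    url="${1:-http://localhost:3030}"
    # --fail makes curl return non-zero on HTTP errors (status 400 and up);
    # --max-time avoids hanging if nothing is listening.
    curl --silent --fail --output /dev/null --max-time 5 "$url"
}
```

For example, `explorer_is_up && echo "Ricgraph Explorer is reachable"` prints the message only when the service responds.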
4 Use Apache, WSGI, and ASGI to make Ricgraph Explorer and the Ricgraph REST API accessible from outside your virtual machine
4.1 Introduction Apache webserver
Ricgraph Explorer is written in Flask, a framework for Python to build web interfaces. Flask contains a development web server, and if you start Ricgraph Explorer by typing ricgraph_explorer.py, it will be started using that development web server. Although this development web server is sufficient for development and demoing, it is not sufficient for exposing Ricgraph data to the outside world (that is, to users outside your own virtual machine). The same holds for the Ricgraph REST API.
For this, you will need a web server and a WSGI environment. For the REST API, you will need an ASGI environment. This section describes how to do that with Apache and gunicorn. Note that the example configuration file for Apache exposes Ricgraph Explorer to the outside world on an http (unencrypted) connection, without any form of authentication. This is not how a production setup should be run. At the very least, you should expose Ricgraph Explorer and the REST API using an https (encrypted) connection, possibly with additional authentication.
Therefore, the configuration file provided is an example for further development. There is no example code for a https connection, nor for authentication, nor for automatically obtaining and renewing SSL certificates, because these are specific to a certain situation (such as your external IP address, hostname, web server, domain name, SSL certificate provider, authentication source, etc.). So only expose Ricgraph Explorer, the Ricgraph REST API, and the data in Ricgraph to the outside world if you have considered these subjects, and have made an informed decision what is best for your situation.
To prevent accidental exposure of Ricgraph Explorer, the REST API, and the data in Ricgraph to the outside world, you will have to modify the Apache configuration file. You need to make a small modification to make it work. How to do this is described in the comments at the start of the configuration file.
Note that it is also possible to use Nginx as a webserver. If you are using SURF Research Cloud, you will need to use Nginx.
To use Apache with WSGI and ASGI, you can either use the Ricgraph Makefile and execute command:
make prepare_webserver_apache
or follow the steps below.
4.2 Installation Apache
Note that different Linux editions use different package names and paths. In the steps below, path names from OpenSUSE Leap are used. Please adapt them to your own Linux edition:
- OpenSUSE Leap: apache2 and /etc/apache2/vhosts.d
- Ubuntu: apache2 and /etc/apache2/sites-available
- Fedora: httpd and /etc/httpd/conf.d
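The mapping above can be sketched as a small shell helper, for instance to avoid hard-coding a path in your own install notes. The function names apache_conf_dir and distro_id are hypothetical, and the sketch assumes that /etc/os-release reports the ID values opensuse-leap, ubuntu, and fedora for these three distributions.

```shell
#!/bin/sh
# Hypothetical helper: print the Apache configuration directory for a distribution ID.
apache_conf_dir() {
    case "$1" in
        opensuse-leap) echo "/etc/apache2/vhosts.d" ;;
        ubuntu)        echo "/etc/apache2/sites-available" ;;
        fedora)        echo "/etc/httpd/conf.d" ;;
        *)             echo "unknown distribution: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: print the ID field of /etc/os-release, if the file exists.
distro_id() {
    # The subshell keeps the variables of os-release out of the calling shell.
    ( . /etc/os-release 2>/dev/null && printf '%s\n' "$ID" )
}
```

For example, `apache_conf_dir "$(distro_id)"` prints the directory for the machine it runs on, or an error message for a distribution that is not in the list.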
Using Apache, WSGI, and ASGI will expose Ricgraph Explorer, the Ricgraph REST API, and Ricgraph data to the outside world.
Follow the steps in Create a Python virtual environment and install Ricgraph in it.
Login as user root.
Make sure Apache has been installed.
Gunicorn has already been installed when you installed the Python requirements.
Enable two Apache modules (they have already been installed when you installed Apache):
a2enmod mod_proxy
a2enmod mod_proxy_http
Install the Apache Ricgraph Explorer configuration file: copy file ricgraph_server_config/ricgraph_explorer.conf-apache to /etc/apache2/vhosts.d, type:
cp /opt/ricgraph_venv/ricgraph_server_config/ricgraph_explorer.conf-apache /etc/apache2/vhosts.d
chmod 600 /etc/apache2/vhosts.d/ricgraph_explorer.conf-apache
4.3 Post-install steps Apache
Login as user root.
Move the Apache Ricgraph Explorer configuration file to its final location:
mv /etc/apache2/vhosts.d/ricgraph_explorer.conf-apache /etc/apache2/vhosts.d/ricgraph_explorer.conf
However, for Ubuntu do:
mv /etc/apache2/sites-available/ricgraph_explorer.conf-apache /etc/apache2/sites-available/ricgraph_explorer.conf
ln -s /etc/apache2/sites-available/ricgraph_explorer.conf /etc/apache2/sites-enabled/ricgraph_explorer.conf
Change ricgraph_explorer.conf in such a way it fits your situation. Make the modification to ricgraph_explorer.conf as described in the comments at the start of ricgraph_explorer.conf. Test the result.
Make it run by typing:
systemctl enable apache2.service
systemctl start apache2.service
Check the log for any errors, use one of:
systemctl -l status apache2.service
journalctl -u apache2.service
Exit from user root.
Now you can use Ricgraph Explorer from inside your virtual machine by typing http://localhost in your web browser in the virtual machine, or from outside your virtual machine by going to http://[your IP address] or http://[your hostname].
You can use the Ricgraph REST API from inside your virtual machine by using the path http://localhost:3030/api followed by a REST API endpoint, or from outside your virtual machine by using the path http://[your IP address]/api or http://[your hostname]/api, both followed by a REST API endpoint.
5 Use Nginx, WSGI, and ASGI to make Ricgraph Explorer and the Ricgraph REST API accessible from outside your virtual machine
5.1 Introduction Nginx webserver
Please read the introduction of section Use Apache, WSGI, and ASGI to make Ricgraph Explorer and the Ricgraph REST API accessible from outside your virtual machine. The same explanation and words of caution as for using Apache as a webserver hold for Nginx as a webserver.
To prevent accidental exposure of Ricgraph Explorer, the REST API, and the data in Ricgraph to the outside world, you will have to modify the Nginx configuration file. You need to make a small modification to make it work. How to do this is described in the comments at the start of the configuration file.
Note that it is also possible to use Apache as a webserver.
To use Nginx with WSGI and ASGI, you can either use the Ricgraph Makefile and execute command:
make prepare_webserver_nginx
or follow the steps below.
5.2 Installation Nginx
Note that different Linux editions use different paths. In the steps below, path names from OpenSUSE Leap are used. Please adapt them to your own Linux edition:
- OpenSUSE Leap: /etc/nginx/vhosts.d
- Ubuntu: /etc/nginx/sites-available
- Fedora: /etc/nginx/conf.d
Using Nginx, WSGI, and ASGI will expose Ricgraph Explorer, the Ricgraph REST API, and Ricgraph data to the outside world.
Follow the steps in Create a Python virtual environment and install Ricgraph in it.
Login as user root.
Make sure Nginx has been installed.
Gunicorn has already been installed when you installed the Python requirements.
Install the Nginx Ricgraph Explorer configuration file: copy file ricgraph_server_config/ricgraph_explorer.conf-nginx to /etc/nginx/vhosts.d, type:
cp /opt/ricgraph_venv/ricgraph_server_config/ricgraph_explorer.conf-nginx /etc/nginx/vhosts.d
chmod 600 /etc/nginx/vhosts.d/ricgraph_explorer.conf-nginx
5.3 Post-install steps Nginx
Login as user root.
Move the Nginx Ricgraph Explorer configuration file to its final location:
mv /etc/nginx/vhosts.d/ricgraph_explorer.conf-nginx /etc/nginx/vhosts.d/ricgraph_explorer.conf
However, for Ubuntu do:
mv /etc/nginx/sites-available/ricgraph_explorer.conf-nginx /etc/nginx/sites-available/ricgraph_explorer.conf
ln -s /etc/nginx/sites-available/ricgraph_explorer.conf /etc/nginx/sites-enabled/ricgraph_explorer.conf
Change ricgraph_explorer.conf in such a way it fits your situation. Make the modification to ricgraph_explorer.conf as described in the comments at the start of ricgraph_explorer.conf. Test the result.
Make it run by typing:
systemctl enable nginx.service
systemctl start nginx.service
Check the log for any errors, use one of:
systemctl -l status nginx.service
journalctl -u nginx.service
Exit from user root.
Now you can use Ricgraph Explorer from inside your virtual machine by typing http://localhost in your web browser in the virtual machine, or from outside your virtual machine by going to http://[your IP address] or http://[your hostname].
You can use the Ricgraph REST API from inside your virtual machine by using the path http://localhost:3030/api followed by a REST API endpoint, or from outside your virtual machine by using the path http://[your IP address]/api or http://[your hostname]/api, both followed by a REST API endpoint.
6 How to install Ricgraph and Ricgraph Explorer on SURF Research Cloud
SURF Research Cloud is a portal where you can easily build a virtual research environment. You can use preconfigured workspaces, or you can add them yourself. A virtual research environment or workspace is a virtual machine that you can use to install Ricgraph. Please follow these steps if you would like to install Ricgraph and Ricgraph Explorer on SURF Research Cloud.
6.1 Preliminaries
- Make sure you have access to SURF Research Cloud and that you have a wallet available. A wallet is a budget. A wallet has credits, and these credits are used to pay for the SURF computing resources. The more resources you use, the more you have to pay. For resources, think of disk space, the number of CPUs, the amount of memory, and the time the virtual machine is running.
- If you do not have access to SURF Research Cloud or you do not have a wallet, please contact the SURF Research Cloud contact person at your organization. These persons may be at the Research Data Management Support desk, service desk, or help desk of your organization, or they might be persons like research engineers, data stewards, data managers, or data consultants.
6.2 Create a SURF Research Cloud workspace
To create a SURF Research Cloud workspace, follow these steps:
- Go to the SURF Research Cloud portal and log in.
- Optional: Allocate storage. This step is only required if you expect to install a lot of programs on the virtual research environment and expect to create or use a lot of data. For Ricgraph, that means more than 100M nodes and edges. This is for advanced use only, since this storage will be attached to /data in the virtual research environment, and not to /var/lib, where the Neo4j Community Edition graph database lives.
  - Click on “Create new storage”.
  - Select the collaborative organization that you want to use for running Ricgraph. If you have only one, it will be preselected.
  - Select your wallet. If you have only one, it will be preselected.
  - Select the cloud provider. We use “SURF HPC Cloud volume”.
  - Choose the size of your storage. In the video below we use “100GB”. The larger, the more credits it will cost.
  - Enter a name and a description.
  - After a few moments your storage will be created and available.
- Create a workspace (that is, a virtual machine to run Ricgraph in):
  - Click on “Create new workspace”.
  - Select the collaborative organization that you want to use for running Ricgraph (as above). If you have only one, it will be preselected.
  - Select your wallet (as above). If you have only one, it will be preselected.
  - Now select a “catalogue item”, that is, a pre-installed virtual machine. Choose “Ubuntu Desktop”.
  - Select the cloud provider. We use “SURF HPC Cloud”.
  - Select which version of Ubuntu you want to use. Choose “Ubuntu 22.04 Desktop”.
  - Select a configuration. In the video below we use “1 Core - 8 GB RAM”. The larger, the more credits it will cost.
  - By default, the workspace has ~95GB of storage on the system and home partition.
  - Optionally, you can add more storage; how to allocate it is explained above. If you have done this, select this additional storage.
  - Rename your workspace.
  - After some minutes your workspace will be created and available. It will be started up automatically.
  - Note that your workspace has a “will be removed” date. You might want to set it to a suitable date.
- Done.
You might want to watch the video how to install Ricgraph and Ricgraph Explorer on SURF Research Cloud (2m14s) (click to view or download). Note that in the video, we use an old version of Ubuntu. Please use Ubuntu 22.04 as described above.
6.3 Install Ricgraph in a SURF Research Cloud workspace
The next steps in your workspace are to install the graph database backend and Ricgraph. You can install Ricgraph for a single user or Ricgraph as a server. Note that if you would like to use a webserver, you will need to use Nginx.
6.4 Pause and resume a SURF Research Cloud workspace
On the SURF Research Cloud portal, you can pause and resume your workspace. Pausing means that the workspace will not run, and of course then it will not be accessible. If you have paused your workspace, it does not cost credits (money). If you resume your workspace, you can use it again.
6.5 Access a SURF Research Cloud workspace
On the workspace window, you will find the name of the workspace. It will be a https link that ends with .src.surf-hosted.nl. SURF Research Cloud uses guacamole, which provides you with a desktop window in your browser. There are two ways to access your workspace and authenticate:
- Use port 443; you will then log in to your workspace using SURFconext. In this case, the link will look like: https://[name of your workspace].src.surf-hosted.nl.
- Use port 3389; you will then log in to your workspace using a one-time password. In this case, the link will look like: https://[name of your workspace].src.surf-hosted.nl:3389.
7 Steps to take to install Ricgraph as a server by hand
Skip this section if you have done the Fast and recommended way to install Ricgraph as a server and there were no errors.
- Install your graph database backend.
- Create a ricgraph user and group.
- Create a Python virtual environment and install Ricgraph in it.
- Create and update the Ricgraph initialization file. This is also the place where you specify which graph database backend you use.
- Start harvesting data, see Ricgraph harvest scripts, or writing scripts, see Ricgraph script writing.
- Start browsing using Ricgraph Explorer.
7.1 Install your graph database backend
Install your graph database backend (choose one of these):
7.2 Create a ricgraph user and group
Follow these steps:
Login as user root.
Create group and user ricgraph. First check if they exist:
grep ricgraph /etc/group
grep ricgraph /etc/passwd
If you get output, they already exist, and you don’t need to do this step. If you get no output, you will need to create the group and user:
groupadd --system ricgraph
useradd --system --comment "Ricgraph user" --no-create-home --gid ricgraph ricgraph
Exit from user root.
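The check-then-create logic of this section can be sketched as two shell functions. This is a minimal sketch, not the way the Ricgraph Makefile does it: it swaps the grep checks for getent lookups (which match the exact name), and the function names account_exists and ensure_ricgraph_account are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed if an entry exists.
# $1 is the database ("group" or "passwd"), $2 is the name to look up.
account_exists() {
    getent "$1" "$2" > /dev/null 2>&1
}

# Hypothetical helper: create group and user ricgraph only when missing.
# Must be run as user root.
ensure_ricgraph_account() {
    account_exists group ricgraph || groupadd --system ricgraph
    account_exists passwd ricgraph || \
        useradd --system --comment "Ricgraph user" --no-create-home --gid ricgraph ricgraph
}
```

Calling `ensure_ricgraph_account` as root a second time does nothing, because both lookups then succeed.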
7.3 Create a Python virtual environment and install Ricgraph in it
Follow these steps:
Suppose you are a user with login alice and you are in Linux group users.
Login as user root.
For Debian/Ubuntu: type:
apt install python3-venv
Go to directory /opt, type:
cd /opt
Create a Python virtual environment: in /opt, type:
python3 -m venv ricgraph_venv
Change the owner and group to your own user alice and group users, in /opt, type:
chown -R alice:users /opt/ricgraph_venv
The path /opt/ricgraph_venv is hardwired in the configuration files ricgraph_server_config/ricgraph_explorer_gunicorn.service and ricgraph_server_config/ricgraph_explorer.conf-apache. This is done for security reasons. If you change the path, also change it in these files.
Exit from user root. Do the following steps as your own user.
Download the latest release of Ricgraph from the Ricgraph downloads page to directory /opt/ricgraph_venv. Get the tar.gz version.
Install Ricgraph: go to /opt/ricgraph_venv, type:
tar xf /opt/ricgraph_venv/ricgraph-X.YY.tar.gz
(X.YY is the version number you downloaded). You will get a directory /opt/ricgraph_venv/ricgraph-X.YY.
Merge the Ricgraph you have extracted with tar with the virtual environment, and do some cleanup: in /opt/ricgraph_venv, type:
mv ricgraph-X.YY/* /opt/ricgraph_venv
rm -r /opt/ricgraph_venv/ricgraph-X.YY
rm /opt/ricgraph_venv/ricgraph-X.YY.tar.gz
Activate the Python virtual environment: in /opt/ricgraph_venv, type:
source bin/activate
Install the standard Python requirements: in /opt/ricgraph_venv, type:
pip install setuptools pip wheel
Install the Python requirements for Ricgraph: in /opt/ricgraph_venv, type:
pip install -r requirements.txt
If you get an error message
ERROR: Could not find a version that satisfies the requirement neo4j>=5.8
then your Python version is too old. Please read How to solve an AttributeError: Neo4jDriver object has no attribute executequery.
Create a Ricgraph initialization file, read Ricgraph initialization file. This is also the place where you specify which graph database backend you use. You can find these settings in section GraphDB.
- For Neo4j, enter the new password for Neo4j from section Install and start Neo4j Community Edition at the parameter graphdb_password.
Deactivate the Python virtual environment: type
deactivate
Login as user root.
Change the owner and group to ricgraph of directory /opt/ricgraph_venv. In /opt, type
chown -R ricgraph:ricgraph /opt/ricgraph_venv
Exit from user root.