Keywords

Research in context graph, Ricgraph, Ricgraph Explorer, Ricgraph REST API, Data enrichment, Data harvesting, Data linking, Enrichment, Graph, Graph database, Harvest, Harvest data, Harvester, Knowledge graph, Linked data, Metadata, Utrecht University, Visualization

Ricgraph as a server on Linux

This page describes how to install and run Ricgraph in a multi-user environment on Linux. A multi-user environment means that you install Ricgraph on a Linux (virtual) machine, and that various persons can log on to that machine, each with their own user id and password. Each person will be able to use Ricgraph by using a web link in their web browser. For other Ricgraph install options, start reading at Install and configure Ricgraph for a single user.

A Linux multi-user environment for Ricgraph differs from installing and using Ricgraph under your own user id in that the graph database backend and Ricgraph Explorer run as a system user instead of under your own user id. If you run Ricgraph with your own user id, you will be the only user able to use it, and other persons on that same machine who would like to use Ricgraph have to install it for themselves. By installing Ricgraph as a server, as described on this page, Ricgraph will be started automatically when your machine boots, and it can be used by any user on that machine.

To install and run Ricgraph in a multi-user environment, read Fast and recommended way to install Ricgraph as a server.

2 Run Ricgraph scripts from the command line or as a cronjob

2.1 Using the Makefile

The Ricgraph Makefile can also be used to execute a Python script or a bash script. Such a script can be used to harvest the sources specific to your organization. The Makefile provides a command make run_python_script, to run any Ricgraph Python script. You will need to add the name of the script to run using the Makefile command line parameter python_script, e.g.

make run_python_script python_script=[path]/[python script]

There is a similar command for running bash scripts:

make run_bash_script bash_script=[path]/[bash script]

Both make commands execute the script and the output will appear both on your screen and in a file. The Makefile will tell you the name of this log file.
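
The screen-plus-log behaviour described above is what the standard tee utility provides. The following is a minimal sketch of that pattern, not the Makefile's actual recipe; the function name run_and_log and the temporary log file are assumptions made for illustration.

```shell
# Hypothetical helper: run a command, show its output, and keep a copy in a log file.
# $1 = log file, remaining arguments = the command to run
run_and_log() {
    logfile="$1"
    shift
    # Merge stderr into stdout so error messages also end up in the log.
    "$@" 2>&1 | tee "$logfile"
}

# Example call with a harmless command; the log file is a throwaway temporary file.
demo_log=$(mktemp)
run_and_log "$demo_log" echo "harvest finished"
```

After the call, the same text that appeared on the screen can be read back from the log file, for example with cat "$demo_log".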

Examples of commands you can use are:

  • harvest from the Research Software Directory:

    make run_python_script python_script=harvest/harvest_rsd_to_ricgraph.py
  • harvest two sources without needing any keys or configuration:

    make run_bash_script
  • run Ricgraph Explorer:

    make run_ricgraph_explorer

2.2 In case you have installed Ricgraph as a server

After following the steps in Create a Python virtual environment and install Ricgraph in it, it is possible to run Ricgraph from the command line or as a cronjob. To be able to run these scripts you need to be user ricgraph and group ricgraph. You can do this by using user ricgraph in your crontab file (e.g. in /etc/crontab), or by using the command

sudo su - ricgraph

When you are finished with these commands, exit from user ricgraph.

Examples of commands you can use are:

  • harvest from the Research Software Directory:

    cd /opt/ricgraph_venv/harvest; ../bin/python harvest_rsd_to_ricgraph.py
  • harvest two sources without needing any keys or configuration:

    cd /opt/ricgraph_venv/harvest_multiple_sources; ./multiple_harvest_demo.sh
  • run Ricgraph Explorer:

    cd /opt/ricgraph_venv/ricgraph_explorer; ../bin/python ricgraph_explorer.py
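
To run one of the commands above as a cronjob, an entry in /etc/crontab names the user to run as in its sixth field. The sketch below shows, as a comment, what such an entry might look like, and wraps the same harvest command in a small function for interactive use. The schedule, the function name run_harvest, and its optional second argument are assumptions, not part of Ricgraph.

```shell
# A possible /etc/crontab entry
# (fields: minute hour day-of-month month day-of-week user command).
# The schedule (Sundays at 02:30) is only an example:
#   30 2 * * 0  ricgraph  cd /opt/ricgraph_venv/harvest && ../bin/python harvest_rsd_to_ricgraph.py

# Hypothetical helper doing the same from an interactive shell as user ricgraph.
# $1 = harvest script name
# $2 = virtual environment path (optional, defaults to /opt/ricgraph_venv)
run_harvest() {
    venv="${2:-/opt/ricgraph_venv}"
    # Use a subshell so the caller's working directory is left unchanged.
    ( cd "$venv/harvest" && ../bin/python "$1" )
}
```

For example, run_harvest harvest_rsd_to_ricgraph.py runs the Research Software Directory harvest from the default location.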

2.3 In case you have installed Ricgraph for a single user

After following the steps in Create a Python virtual environment and install Ricgraph in it, it is possible to run Ricgraph from the command line. You do not need to be user ricgraph and group ricgraph. The following assumes your Python virtual environment is in your Linux home directory $HOME.

Examples of commands you can use are:

  • harvest from the Research Software Directory:

    cd $HOME/ricgraph_venv/harvest; ../bin/python harvest_rsd_to_ricgraph.py
  • harvest all your favorite sources:

    cd $HOME/ricgraph_venv/harvest_multiple_sources; ./multiple_harvest_demo.sh
  • run Ricgraph Explorer:

    cd $HOME/ricgraph_venv/ricgraph_explorer; ../bin/python ricgraph_explorer.py

3 Use a service unit file to run Ricgraph Explorer and the Ricgraph REST API

Using a service unit file to run Ricgraph Explorer is very useful if you would like to set up a virtual machine that you want to use as a demo server, or if you would like to use the Ricgraph REST API. After the steps in this section, Ricgraph Explorer and the Ricgraph REST API are run automatically at the start of the virtual machine, so you can immediately start giving the demo.

For comparison, if you had installed the graph database backend and Ricgraph for a single user, as described in the documentation on installing and configuring Ricgraph for a single user, you would need to start the graph database backend, the virtual environment, and ricgraph_explorer.py by hand each time the virtual machine starts.

To use a service unit file, you can either use the Ricgraph Makefile and execute command:

make install_enable_ricgraphexplorer_restapi

or follow the steps below.

Using a service unit file will not expose Ricgraph Explorer, the Ricgraph REST API, and Ricgraph data to the outside world. All data will only be accessible in the virtual machine.

  • Follow the steps in Create a Python virtual environment and install Ricgraph in it.

  • Log in as user root.

  • Install the Ricgraph Explorer service unit file: copy file ricgraph_server_config/ricgraph_explorer_gunicorn.service to /etc/systemd/system, type:

    cp /opt/ricgraph_venv/ricgraph_server_config/ricgraph_explorer_gunicorn.service /etc/systemd/system

    Make it run by typing:

    systemctl enable ricgraph_explorer_gunicorn.service
    systemctl start ricgraph_explorer_gunicorn.service

    Check the log for any errors, use one of:

    systemctl -l status ricgraph_explorer_gunicorn.service
    journalctl -u ricgraph_explorer_gunicorn.service
  • Exit from user root.

  • Now you can use Ricgraph Explorer by typing http://localhost:3030 in your web browser (i.e., the web browser of the virtual machine). You can use the Ricgraph REST API by using the path http://localhost:3030/api followed by a REST API endpoint.
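
A quick way to confirm that the service answers on port 3030 is to request the page from the command line inside the virtual machine. This is a minimal sketch that assumes curl is installed; the function names http_status and check_ricgraph_explorer are hypothetical.

```shell
# Print the HTTP status code returned for a URL.
# curl returns a non-zero exit status if no connection could be made.
http_status() {
    curl --silent --output /dev/null --max-time 5 --write-out '%{http_code}' "$1"
}

# Hypothetical wrapper: report whether Ricgraph Explorer answers at a base URL.
# $1 = base URL (optional, defaults to http://localhost:3030)
check_ricgraph_explorer() {
    url="${1:-http://localhost:3030}"
    if status=$(http_status "$url"); then
        echo "$url answered with HTTP status $status"
    else
        echo "$url is not reachable; check the service log" >&2
        return 1
    fi
}
```

Calling check_ricgraph_explorer without arguments probes http://localhost:3030. If it reports that the URL is not reachable, the systemctl and journalctl commands above are the place to look next.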

4 Use Apache, WSGI, and ASGI to make Ricgraph Explorer and the Ricgraph REST API accessible from outside your virtual machine

4.1 Introduction Apache webserver

Ricgraph Explorer is written in Flask, a framework for Python to build web interfaces. Flask contains a development web server, and if you start Ricgraph Explorer by typing ricgraph_explorer.py, it will be started using that development web server. Although this development web server is sufficient for development and demoing, it is certainly not sufficient for exposing Ricgraph data to the outside world (that is, to users outside your own virtual machine). The same holds for the Ricgraph REST API.

For this, you will need a web server and a WSGI environment. For the REST API, you will need an ASGI environment. This section describes how to do that with Apache and gunicorn. Note that the example configuration file for Apache exposes Ricgraph Explorer to the outside world on an http (unencrypted) connection, without any form of authentication. This is certainly not how it should be done in production. At the very least, you should expose Ricgraph Explorer and the REST API using an https (encrypted) connection, possibly with additional authentication.

Therefore, the configuration file provided is an example for further development. There is no example code for an https connection, nor for authentication, nor for automatically obtaining and renewing SSL certificates, because these are specific to a certain situation (such as your external IP address, hostname, web server, domain name, SSL certificate provider, authentication source, etc.). So only expose Ricgraph Explorer, the Ricgraph REST API, and the data in Ricgraph to the outside world if you have considered these subjects, and have made an informed decision about what is best for your situation.

To prevent accidental exposure of Ricgraph Explorer, the REST API, and the data in Ricgraph to the outside world, you will have to modify the Apache configuration file. You need to make a small modification to make it work. How to do this is described in the comments at the start of the configuration file.

Note that it is also possible to use Nginx as a webserver. If you are using SURF Research Cloud, you will need to use Nginx.

To use Apache and its related components, you can either use the Ricgraph Makefile and execute command:

make prepare_webserver_apache

or follow the steps below.

4.2 Installation Apache

Note that different Linux editions use different paths. In the steps below, path names from OpenSUSE Leap are used. Please adapt them to your own Linux edition:

  • OpenSUSE Leap: apache2 and /etc/apache2/vhosts.d
  • Ubuntu: apache2 and /etc/apache2/sites-available
  • Fedora: httpd and /etc/httpd/conf.d

Using Apache, WSGI, and ASGI will expose Ricgraph Explorer, the Ricgraph REST API, and Ricgraph data to the outside world.

  • Follow the steps in Create a Python virtual environment and install Ricgraph in it.

  • Log in as user root.

  • Make sure Apache has been installed.

  • Gunicorn has already been installed when you installed the Python requirements.

  • Enable two Apache modules (they have already been installed when you installed Apache):

    a2enmod mod_proxy
    a2enmod mod_proxy_http
  • Install the Apache Ricgraph Explorer configuration file: copy file ricgraph_server_config/ricgraph_explorer.conf-apache to /etc/apache2/vhosts.d, type:

    cp /opt/ricgraph_venv/ricgraph_server_config/ricgraph_explorer.conf-apache /etc/apache2/vhosts.d
    chmod 600 /etc/apache2/vhosts.d/ricgraph_explorer.conf-apache

4.3 Post-install steps Apache

  • Log in as user root.

  • Move the Apache Ricgraph Explorer configuration file to its final location:

    mv /etc/apache2/vhosts.d/ricgraph_explorer.conf-apache /etc/apache2/vhosts.d/ricgraph_explorer.conf

    However, for Ubuntu do:

    mv /etc/apache2/sites-available/ricgraph_explorer.conf-apache /etc/apache2/sites-available/ricgraph_explorer.conf
    ln -s /etc/apache2/sites-available/ricgraph_explorer.conf /etc/apache2/sites-enabled/ricgraph_explorer.conf

    Change ricgraph_explorer.conf so that it fits your situation. Make the modification to ricgraph_explorer.conf as described in the comments at the start of ricgraph_explorer.conf. Test the result.

  • Make it run by typing:

    systemctl enable apache2.service
    systemctl start apache2.service

    Check the log for any errors, use one of:

    systemctl -l status apache2.service
    journalctl -u apache2.service
  • Exit from user root.

  • Now you can use Ricgraph Explorer from inside your virtual machine by typing http://localhost in your web browser in the virtual machine, or from outside your virtual machine by going to http://[your IP address] or http://[your hostname].

  • You can use the Ricgraph REST API from inside your virtual machine by using the path http://localhost:3030/api followed by a REST API endpoint, or from outside your virtual machine by using the path http://[your IP address]/api or http://[your hostname]/api, both followed by a REST API endpoint.
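
On Ubuntu-style layouts, a site is enabled by a symbolic link in sites-enabled that points to the file in sites-available. Because the argument order of ln -s is easy to get wrong, the sketch below rehearses the step in a scratch directory so that nothing under /etc is touched. The function name enable_site is hypothetical.

```shell
# Hypothetical helper: enable a site by linking it into sites-enabled.
# $1 = sites-available directory
# $2 = sites-enabled directory
# $3 = configuration file name
enable_site() {
    # ln -s TARGET LINK_NAME: the existing file comes first, the new link second.
    ln -s "$1/$3" "$2/$3"
}

# Rehearsal in a scratch directory.
scratch=$(mktemp -d)
mkdir "$scratch/sites-available" "$scratch/sites-enabled"
: > "$scratch/sites-available/ricgraph_explorer.conf"
enable_site "$scratch/sites-available" "$scratch/sites-enabled" ricgraph_explorer.conf

# Show where the new link points, then clean up.
readlink "$scratch/sites-enabled/ricgraph_explorer.conf"
rm -r "$scratch"
```

On Debian and Ubuntu, Apache also ships the a2ensite helper, which creates the same kind of link for a file in sites-available.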

5 Use Nginx, WSGI, and ASGI to make Ricgraph Explorer and the Ricgraph REST API accessible from outside your virtual machine

5.1 Introduction Nginx webserver

Please read the introduction of section Use Apache, WSGI, and ASGI to make Ricgraph Explorer and the Ricgraph REST API accessible from outside your virtual machine. The same explanation and words of caution as for using Apache as a webserver hold for Nginx as a webserver.

To prevent accidental exposure of Ricgraph Explorer, the REST API, and the data in Ricgraph to the outside world, you will have to modify the Nginx configuration file. You need to make a small modification to make it work. How to do this is described in the comments at the start of the configuration file.

Note that it is also possible to use Apache as a webserver.

To use Nginx and its related components, you can either use the Ricgraph Makefile and execute command:

make prepare_webserver_nginx

or follow the steps below.

5.2 Installation Nginx

Note that different Linux editions use different paths. In the steps below, path names from OpenSUSE Leap are used. Please adapt them to your own Linux edition:

  • OpenSUSE Leap: /etc/nginx/vhosts.d
  • Ubuntu: /etc/nginx/sites-available
  • Fedora: /etc/nginx/conf.d

Using Nginx, WSGI, and ASGI will expose Ricgraph Explorer, the Ricgraph REST API, and Ricgraph data to the outside world.

  • Follow the steps in Create a Python virtual environment and install Ricgraph in it.

  • Log in as user root.

  • Make sure Nginx has been installed.

  • Gunicorn has already been installed when you installed the Python requirements.

  • Install the Nginx Ricgraph Explorer configuration file: copy file ricgraph_server_config/ricgraph_explorer.conf-nginx to /etc/nginx/vhosts.d, type:

    cp /opt/ricgraph_venv/ricgraph_server_config/ricgraph_explorer.conf-nginx /etc/nginx/vhosts.d
    chmod 600 /etc/nginx/vhosts.d/ricgraph_explorer.conf-nginx

5.3 Post-install steps Nginx

  • Log in as user root.

  • Move the Nginx Ricgraph Explorer configuration file to its final location:

    mv /etc/nginx/vhosts.d/ricgraph_explorer.conf-nginx /etc/nginx/vhosts.d/ricgraph_explorer.conf

    However, for Ubuntu do:

    mv /etc/nginx/sites-available/ricgraph_explorer.conf-nginx /etc/nginx/sites-available/ricgraph_explorer.conf
    ln -s /etc/nginx/sites-available/ricgraph_explorer.conf /etc/nginx/sites-enabled/ricgraph_explorer.conf

    Change ricgraph_explorer.conf so that it fits your situation. Make the modification to ricgraph_explorer.conf as described in the comments at the start of ricgraph_explorer.conf. Test the result.

  • Make it run by typing:

    systemctl enable nginx.service
    systemctl start nginx.service

    Check the log for any errors, use one of:

    systemctl -l status nginx.service
    journalctl -u nginx.service
  • Exit from user root.

  • Now you can use Ricgraph Explorer from inside your virtual machine by typing http://localhost in your web browser in the virtual machine, or from outside your virtual machine by going to http://[your IP address] or http://[your hostname].

  • You can use the Ricgraph REST API from inside your virtual machine by using the path http://localhost:3030/api followed by a REST API endpoint, or from outside your virtual machine by using the path http://[your IP address]/api or http://[your hostname]/api, both followed by a REST API endpoint.

6 How to install Ricgraph and Ricgraph Explorer on SURF Research Cloud

SURF Research Cloud is a portal where you can easily build a virtual research environment. You can use preconfigured workspaces, or you can add them yourself. A virtual research environment or workspace is a virtual machine that you can use to install Ricgraph. Please follow these steps if you would like to install Ricgraph and Ricgraph Explorer on SURF Research Cloud.

6.1 Preliminaries

  • Make sure you have access to SURF Research Cloud and that you have a wallet available. A wallet is a budget. A wallet has credits, and these credits are used to pay for the SURF computing resources. The more resources you use, the more you have to pay. For resources, think of disk space, the number of CPUs, the amount of memory, and the time the virtual machine is running.
  • If you do not have access to SURF Research Cloud or you do not have a wallet, please contact the SURF Research Cloud contact person at your organization. These persons may be at the Research Data Management Support desk, service desk, or help desk of your organization, or they might be persons like research engineers, data stewards, data managers, or data consultants.

6.2 Create a SURF Research Cloud workspace

To create a SURF Research Cloud workspace, follow these steps:

  • Go to the SURF Research Cloud portal and log in.
  • Optional: Allocate storage. This step is only required if you expect to install a lot of programs on the virtual research environment and expect to create or use a lot of data. In the case of Ricgraph: > 100M nodes and edges. This is for advanced use only, since this storage will be attached to /data in the virtual research environment, and not to /var/lib, where the Neo4j Community Edition graph database lives.
    • Click on “Create new storage”.
    • Select the collaborative organization that you want to use for running Ricgraph. If you have only one, it will be preselected.
    • Select your wallet. If you have only one, it will be preselected.
    • Select the cloud provider. We use “SURF HPC Cloud volume”.
    • Choose the size of your storage. In the video below we use “100GB”. The larger, the more credits it will cost.
    • Enter a name and a description.
    • After a few moments your storage will be created and available.
  • Create a workspace (that is, a virtual machine to run Ricgraph in):
    • Click on “Create new workspace”.
    • Select the collaborative organization that you want to use for running Ricgraph (as above). If you have only one, it will be preselected.
    • Select your wallet (as above). If you have only one, it will be preselected.
    • Now select a “catalogue item”, that is, a pre-installed virtual machine. Choose “Ubuntu Desktop”.
    • Select the cloud provider. We use “SURF HPC Cloud”.
    • Select which version of Ubuntu you want to use. Choose “Ubuntu 22.04 Desktop”.
    • Select a configuration. In the video below we use “1 Core - 8 GB RAM”. The larger, the more credits it will cost.
    • By default, the workspace has ~95GB of storage on the system and home partition.
    • Optionally, you can add more storage; how to allocate it is explained above. If you have done this, select this additional storage.
    • Rename your workspace.
    • After some minutes your workspace will be created and available. It will be started up automatically.
    • Note that your workspace has a “will be removed” date. You might want to set it to a suitable date.
  • Done.

You might want to watch the video on how to install Ricgraph and Ricgraph Explorer on SURF Research Cloud (2m14s) (click to view or download). Note that in the video, we use an old version of Ubuntu. Please use Ubuntu 22.04 as described above.

6.3 Install Ricgraph in a SURF Research Cloud workspace

The next steps in your workspace are to install the graph database backend and Ricgraph. You can install Ricgraph for a single user or Ricgraph as a server. Note that if you would like to use a webserver, you will need to use Nginx.

6.4 Pause and resume a SURF Research Cloud workspace

On the SURF Research Cloud portal, you can pause and resume your workspace. Pausing means that the workspace will not run, and of course then it will not be accessible. If you have paused your workspace, it does not cost credits (money). If you resume your workspace, you can use it again.

6.5 Access a SURF Research Cloud workspace

On the workspace window, you will find the name of the workspace. It will be a https link that ends with .src.surf-hosted.nl. SURF Research Cloud uses guacamole, which provides you with a desktop window in your browser. There are two ways to access your workspace and authenticate:

7 Steps to take to install Ricgraph as a server by hand

Skip this section if you have done the Fast and recommended way to install Ricgraph as a server and there were no errors.

  1. Install your graph database backend.
  2. Create a ricgraph user and group.
  3. Create a Python virtual environment and install Ricgraph in it.
  4. Create and update the Ricgraph initialization file. This is also the place where you specify which graph database backend you use.
  5. Start harvesting data, see Ricgraph harvest scripts, or writing scripts, see Ricgraph script writing.
  6. Start browsing using Ricgraph Explorer.

7.1 Install your graph database backend

Install your graph database backend (choose one of these):

7.2 Create a ricgraph user and group

Follow these steps:

  • Log in as user root.

  • Create group and user ricgraph. First check if they exist:

    grep ricgraph /etc/group
    grep ricgraph /etc/passwd

    If you get output, they already exist, and you don’t need to do this step. If you get no output, you will need to create the group and user:

    groupadd --system ricgraph
    useradd --system --comment "Ricgraph user" --no-create-home --gid ricgraph ricgraph
  • Exit from user root.
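
Note that grep ricgraph also matches lines that merely contain the string, for example in a comment field or in a longer name. As an alternative sketch, getent looks up an exact name in the group and passwd databases. The helper name entry_exists is an assumption.

```shell
# Hypothetical helper: succeed only if the exact name exists in the given database.
# $1 = database (group or passwd)
# $2 = name to look up
entry_exists() {
    getent "$1" "$2" > /dev/null
}

# Report for both databases whether ricgraph is present.
for db in group passwd; do
    if entry_exists "$db" ricgraph; then
        echo "ricgraph found in $db"
    else
        echo "ricgraph not found in $db"
    fi
done
```

Only when ricgraph is reported as not found do the groupadd and useradd commands above need to be run.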

7.3 Create a Python virtual environment and install Ricgraph in it

Follow these steps:

  • Suppose you are a user with login alice and you are in Linux group users.

  • Log in as user root.

  • For Debian/Ubuntu: type:

    apt install python3-venv
  • Go to directory /opt, type:

    cd /opt
  • Create a Python virtual environment: in /opt, type:

    python3 -m venv ricgraph_venv
  • Change the owner and group to your own user alice and group users, in /opt, type:

    chown -R alice:users /opt/ricgraph_venv
  • The path /opt/ricgraph_venv is hardwired in the configuration files ricgraph_server_config/ricgraph_explorer_gunicorn.service and ricgraph_server_config/ricgraph_explorer.conf-apache. This is done for security reasons. If you change the path, also change it in these files.

  • Exit from user root. Do the following steps as your own user.

  • Download the latest release of Ricgraph from the Ricgraph downloads page to directory /opt/ricgraph_venv. Get the tar.gz version.

  • Install Ricgraph: go to /opt/ricgraph_venv, type:

    tar xf /opt/ricgraph_venv/ricgraph-X.YY.tar.gz

    (X.YY is the version number you downloaded). You will get a directory /opt/ricgraph_venv/ricgraph-X.YY.

  • Merge the Ricgraph you have extracted with tar with the virtual environment, and do some cleanup: in /opt/ricgraph_venv, type:

    mv ricgraph-X.YY/* /opt/ricgraph_venv
    rm -r /opt/ricgraph_venv/ricgraph-X.YY
    rm /opt/ricgraph_venv/ricgraph-X.YY.tar.gz
  • Activate the Python virtual environment: in /opt/ricgraph_venv, type:

    source bin/activate
  • Install the standard Python requirements: in /opt/ricgraph_venv, type:

    pip install setuptools pip wheel
  • Install the Python requirements for Ricgraph: in /opt/ricgraph_venv, type:

    pip install -r requirements.txt

    If you get an error message

    ERROR: Could not find a version that satisfies the requirement neo4j>=5.8

    then your Python version is too old. Please read How to solve an AttributeError: Neo4jDriver object has no attribute executequery.

  • Create a Ricgraph initialization file, read Ricgraph initialization file. This is also the place where you specify which graph database backend you use. You can find these settings in section GraphDB.

  • Deactivate the Python virtual environment: type

    deactivate
  • Log in as user root.

  • Change the owner and group to ricgraph of directory /opt/ricgraph_venv. In /opt, type

    chown -R ricgraph:ricgraph /opt/ricgraph_venv
  • Exit from user root.
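
After the final chown, everything below /opt/ricgraph_venv should belong to user and group ricgraph. The following is a minimal sketch of a check for that, using find; the function name not_owned_by is hypothetical.

```shell
# Hypothetical helper: list entries below a directory that are not owned
# by the given user and group. Prints nothing if ownership is as expected.
# $1 = directory
# $2 = expected user
# $3 = expected group
not_owned_by() {
    find "$1" \( ! -user "$2" -o ! -group "$3" \) -print
}
```

For example, not_owned_by /opt/ricgraph_venv ricgraph ricgraph should print nothing once the steps above have been completed. Any path it does print still has a different owner or group.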
