...

  1. Log on to your SAIL desktop.

  2. First, create a folder on your P Drive where you will save your conda environments. 

    Go to P drive > yourusername

    Create a new folder, e.g. conda-envs.

    This folder will be the home for all your conda environments.

  3. Open Anaconda Prompt, a command line window which you will use to control your conda environments.
    Double click on the Anaconda icon on your SAIL desktop.

    This will open a command line window with a line of text like:
    1. (base) C:\Users\<your username will be here>


  4. Now we need to make an environment to use.

    1. Option 1: create a new (empty) environment and install packages/libraries you want to use

      In the Anaconda Prompt window, type the following and then hit 'Enter':
      1. conda create -p P:\<your username here>\<name of new folder>\<name of your new environment> --channel=anaconda --channel=conda-forge nb_conda_kernels pandas numpy jupyterlab

        For example, if my username is 'leal' and I want to create an environment called 'mynewenv' in a folder called 'conda-envs', the command I would run would be:

        conda create -p P:\leal\conda-envs\mynewenv --channel=anaconda --channel=conda-forge nb_conda_kernels pandas numpy jupyterlab



      2. Wait a little while until the window asks you whether to proceed - hit 'y' on your keyboard and then press 'Enter'.



      3. Wait while your new environment is created and all requested packages are installed.



      4. You might get a pop-up saying "this app has been blocked by your system administrator". This is fine; the installation still worked. Just click 'Close' on the message.
         



    2. Option 2: clone the base environment and copy all the installed packages into your own environment

      In the Anaconda Prompt window, type the following and then hit 'Enter':
      1. conda create --prefix P:\<your username here>\<name of new folder>\<name of your new environment> --clone base

        For example, if my username is 'leal' and I want to create an environment called 'ranch' in a folder called 'myconda', the command I would run would be:

      2. conda create -p P:\leal\myconda\ranch --clone base


      3. You will see a few lines of code followed by the number of packages and files that are going to be copied (see screenshot below)

      4. Wait while your new environment is created and all requested packages are installed.



      5. This process may take a few hours to run, so we advise that you go and do something else and come back later!

      6. You might get a pop-up saying "this app has been blocked by your system administrator". This is fine; the installation still worked. Just click 'Close' on the message.

      7. When it has finished you will see the message 'done' (see screenshot below), followed by some instructions for how to activate the environment you have just created (see Step 5).




  5. Next you need to activate your environment

    Type the following in the command line window:
    1. conda activate P:\<your username here>\<name of new folder>\<name of your new environment>
    2. So, if my username is 'leal' and I created an environment called 'mynewenv' in a folder called 'conda-envs', I would use the command:
      1. conda activate P:\leal\conda-envs\mynewenv


    3. You'll know when the environment is activated because the environment name and path will be in brackets as in the image above.

      You can also type
      conda env list
      and you will see an asterisk next to the active environment.


  6. Congratulations, you can now move on to the next part of the guide which will instruct you how to install the specific packages/libraries you want to use.
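
The asterisk check from Step 5 can also be done programmatically. The following is a minimal sketch, assuming the output of conda env list has been copied into a string. The sample text (including the base path) and the function name find_active_env are illustrative assumptions, not part of the SAIL setup.

```python
from typing import Optional

# Illustrative output only; the base path below is an assumption, not a SAIL fact.
SAMPLE_ENV_LIST = r"""
# conda environments:
#
base                     C:\ProgramData\Anaconda3
                      *  P:\leal\conda-envs\mynewenv
"""


def find_active_env(env_list_output: str) -> Optional[str]:
    """Return the path on the line marked with '*', or None if no line is marked."""
    for line in env_list_output.splitlines():
        # Skip blank lines and the '#' comment header.
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        tokens = line.split()
        if "*" in tokens:
            # Everything after the asterisk is treated as the environment path.
            return " ".join(tokens[tokens.index("*") + 1:])
    return None


print(find_active_env(SAMPLE_ENV_LIST))
```

If the function returns None, no environment in the listing is marked as active, which suggests the activate command in Step 5 has not been run in that window.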

...

  1. Log on to your SAIL desktop.
  2. Double click on the Anaconda icon on your SAIL desktop.
    This will open a command line window with a line of text like:
    1. (base) C:\Users\<your username will be here>
  3. Activate your conda environment by typing the following in the same command line window:
    1. conda activate P:\<your username here>\<name of your new environment>
    2. So, if my username is 'leal' and I created an environment called 'mynewenv' in a folder called 'conda-envs', I would use the command:
      1. conda activate P:\leal\conda-envs\mynewenv
    3. You will know when the environment is activated because the window will show a line of text like:
      1. (P:\<your username>\<your environment name>) C:\Users\<your username>

  4. VERY IMPORTANT: Before starting Jupyter, you will need to change the working directory so that you can either i) access previously saved notebooks, including notebooks created by your colleagues, or ii) create new notebooks that are saved in the desired place, e.g. in the SAIL project folder so you can easily share them with your collaborators.

    You will see the current working directory in the command prompt window after your conda environment name; it is likely to be a C Drive file path. If you launch Jupyter from the C Drive, you will be unable to see any other files (such as images or documents) within the Jupyter directory.

    Furthermore, the notebooks will almost certainly contain outputs that are project specific, so we must adhere to SAIL policy and use the S Drive folder for project-specific outputs.
    Unlike pure SQL scripts or R scripts, which aren't likely to contain results/data/outputs, Jupyter notebooks contain both code and outputs, and therefore MUST NOT BE SAVED ON THE P DRIVE.

    1. To change the file path to a different drive, type the following:

      1. S: (Hit enter)
        or

        P: (Hit enter) 

        (Use S: for project-specific work, in line with SAIL policy.)

    2. You may also want to change your directory to a specific folder (though you can access any sub-directories within the drive you have specified above). Type the following and press Enter:
      1. cd <your username here>
        So if my username is 'leal', I would type:
        cd leal

    3. Optional: You might want to navigate to the specific folder in which you'll be working/saving this work, but that is out of the scope of this simple guide.

  5. We are now ready to start Jupyter. In the command line window, type either of the following commands and hit Enter:
    1. jupyter lab
      1. This gives you a more modern Jupyter interface.
    2. jupyter notebook
      1. This will give you the 'classic' Jupyter interface.


  6. Jupyter will automatically open in a Microsoft Edge tab. You can navigate wherever you want to save your notebooks, create folders, make your notebooks, etc.
    1. You must leave the Anaconda Prompt window open while using Jupyter. You may not need to look at it, but it is helpful for checking the logs should you have any errors or issues.

  7. To ensure that you're using the correct environment kernel in Jupyter, you need to pay attention when creating notebooks.

    1. In the modern interface:
      1. Under the 'Notebook' heading in the launcher tab, select the one with the name
        Python [conda env: <name of your conda env here>]*


    2. In the 'classic' interface:
      1. Click on 'New' in the top right.
      2. In the drop-down window that opens, make sure you choose the option called 
        Python [conda env: <name of your conda env here>]*

  8. When you're done and want to exit Jupyter, click on the 'Anaconda Prompt' window on the taskbar, click somewhere in the window, and press Ctrl+C twice.
    Please wait a few seconds; Jupyter should shut down, making it safe to close your notebook.
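
The drive check in Step 4 can be expressed as a small helper. This is a rough sketch only: the function names and the example folder S:\someproject are hypothetical, and the default of "S:" simply mirrors the policy described above.

```python
from pathlib import PureWindowsPath


def drive_of(path: str) -> str:
    """Return the upper-cased drive (e.g. 'S:') of a Windows-style path."""
    return PureWindowsPath(path).drive.upper()


def is_on_project_drive(path: str, project_drive: str = "S:") -> bool:
    """True if the path sits on the drive used for project-specific outputs."""
    return drive_of(path) == project_drive.upper()


# Hypothetical example paths.
print(is_on_project_drive(r"S:\someproject"))
print(is_on_project_drive(r"C:\Users\leal"))
```

Inside a notebook, passing os.getcwd() to is_on_project_drive would show whether Jupyter was started from the intended drive.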

...

  1. Log on to your SAIL desktop.
  2. Double click on the Anaconda icon on your SAIL desktop.

    Opening Anaconda from the desktop icon, rather than from the Start menu (the little Windows icon on the bottom left) → Anaconda3 (64-bit) → Anaconda Prompt (Anaconda3), resets R_LIBS_USER to be null, which is required if you use both Anaconda and RStudio.

    This will open a command line window with a line of text like:
    1. (base) C:\Users\<your username will be here>
  3. Now we need to make an environment to use. In this command line window, type the following and then hit 'Enter':
    1. conda create -p P:\<your username here>\<name of your new environment> --channel=anaconda --channel=conda-forge nb_conda_kernels pandas numpy jupyterlab
      -p is short for --prefix, which tells conda the full file path at which to create your environment.

    2. For example, if my username is 'leal' and I want to create an environment called 'mynewenv' in a folder called 'conda-envs', the command I would run would be:
      1. conda create -p P:\leal\conda-envs\mynewenv --channel=anaconda --channel=conda-forge nb_conda_kernels pandas numpy jupyterlab
    3. There is already a conda environment available to SAIL users, but this base environment is saved on the C Drive (C:). Since your user account doesn't have permission to write to the C Drive, you cannot install new libraries and packages into the base environment.
      We therefore need to create an environment on our P Drive (P:).

    4. We use the '-p' flag to tell Anaconda to put all your environment files and folders in a specific location and not in its default location on the C:.
    5. The environment path has to be somewhere that won't get cleared during the maintenance window, which rules out the C:, and also somewhere you have permission to write to. 
      This is why we use our P:\<username> folder.
    6. The --channel flags tell Anaconda where to look for the libraries that we want it to install in our new environment
    7. The libraries we want to be installed by default are:
      1. nb_conda_kernels - This extension enables a Jupyter Notebook or JupyterLab application in one conda environment to access kernels for Python, R, and other languages found in other environments. When a kernel from an external environment is selected, the kernel conda environment is automatically activated before the kernel is launched. This allows you to utilize different versions of Python, R, and other languages from a single Jupyter installation.
      2. pandas and NumPy - are very useful Python libraries for dealing with data.
      3. jupyterlab - the complete Jupyter distribution is needed so we can use Jupyter.
  4. Wait while your new environment is created and all requested packages are installed.
  5. You might get a pop-up saying "this app has been blocked by your system administrator". This is fine; the installation still worked. Just click 'Close' on the message.
    1. This is related to python.exe permissions; it is a known issue but shouldn't cause any problems.
  6. After your environment is created, you need to activate it by typing the following in the same command line window:
    1. conda activate P:\<your username here>\<name of your new environment>
    2. So, if my username is 'leal' and I created an environment called 'mynewenv' in a folder called 'conda-envs', I would use the command:
      1. conda activate P:\leal\conda-envs\mynewenv
    3. You need to activate an environment every time you open a fresh Anaconda Prompt window; otherwise, Anaconda will try and use the base environment.
  7. Congratulations, you can now move on to the next part of the guide.
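
To make the pieces of the create command in Step 3 easier to see, here is a small sketch that assembles it from its parts. The helper names env_prefix and conda_create_command are hypothetical; the packages, channels and example path are the ones used in this guide. The sketch only builds the text of the command and does not run conda.

```python
from pathlib import PureWindowsPath
from typing import Sequence


def env_prefix(username: str, folder: str, env_name: str, drive: str = "P:") -> str:
    """Build the environment path, e.g. P:\\leal\\conda-envs\\mynewenv."""
    return str(PureWindowsPath(drive + "\\", username, folder, env_name))


def conda_create_command(prefix: str, packages: Sequence[str], channels: Sequence[str]) -> str:
    """Assemble the 'conda create' command line as a single string."""
    parts = ["conda", "create", "-p", prefix]
    parts += [f"--channel={channel}" for channel in channels]
    parts += list(packages)
    return " ".join(parts)


prefix = env_prefix("leal", "conda-envs", "mynewenv")
command = conda_create_command(
    prefix,
    packages=["nb_conda_kernels", "pandas", "numpy", "jupyterlab"],
    channels=["anaconda", "conda-forge"],
)
print(command)
```

The printed text can then be pasted into the Anaconda Prompt window, which avoids retyping the long path by hand.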

...

  1. Log on to your SAIL desktop.
  2. Double click on the Anaconda icon on your SAIL desktop.
    Opening Anaconda from the desktop icon, rather than from the Start menu (the little Windows icon on the bottom left) → Anaconda3 (64-bit) → Anaconda Prompt (Anaconda3), resets R_LIBS_USER to be null, which is required if you use both Anaconda and RStudio.

    This will open a command line window with a line of text like:
    1. (base) C:\Users\<your username will be here>
  3. Activate your conda environment by typing the following in the same command line window:
    1. conda activate P:\<your username here>\<name of your new environment>
    2. So, if my username is 'leal' and I created an environment called 'mynewenv' in a folder called 'conda-envs', I would use the command:
      1. conda activate P:\leal\conda-envs\mynewenv
    3. You will know when the environment is activated because the window will show a line of text like:
      1. (P:\<your username>\<your environment name>) C:\Users\<your username>
    4. You need to activate an environment every time you open a fresh Anaconda Prompt window; otherwise, Anaconda will try and use the base environment.
  4. VERY IMPORTANT: Before starting Jupyter, we must ensure we're on the P: in the command line window. To do this, type the following into the window and press Enter:
    1. P:
    2. You cannot change drive letters from within Jupyter. If you start Jupyter when you're on the C: it will only be able to save your notebooks on the C:. The C: gets cleared whenever your desktop restarts, so you'd lose all your notebooks if you saved them here.
  5. Then type the following and press Enter:
    1. cd <your username here>
    2. So if my username is 'leal', I would type:
      1. cd leal
    3. Jupyter needs to be able to write to wherever it starts, and making sure it starts in your user folder avoids several common "permission denied" problems.
  6. (Optional) You might want to navigate to the specific folder in which you'll be working/saving this work, but that is out of the scope of this simple guide.
  7. We are now ready to start Jupyter. In the command line window, type either of the following commands and hit Enter:
    1. jupyter notebook
      1. This will give you the 'classic' Jupyter interface.
    2. jupyter lab
      1. This gives you a more modern Jupyter interface.
  8. Jupyter will automatically open in a Microsoft Edge tab. You can navigate wherever you want to save your notebooks, create folders, make your notebooks, etc.
    1. You need to leave the Anaconda Prompt window open while you're using Jupyter.
      1. If you close the Anaconda Prompt window, it shuts down the local server that Jupyter is using, crashing Jupyter.
  9. To ensure that you're using the correct environment kernel in Jupyter, you need to pay attention when creating notebooks.
    1. In the 'classic' interface:
      1. Click on 'New' in the top right.
      2. In the drop-down window that opens, make sure you choose the option called 
        1. Python [conda env: <name of your conda env here>]*
    2. In the modern interface:
      1. Under the 'Notebook' heading in the launcher tab, select the one with the name
        1. Python [conda env: <name of your conda env here>]*
    3. We want any notebook that we use to use the specific version of everything that we installed in our Anaconda Environment.
      Doing this means that you can use packages installed in your Anaconda Environment inside Jupyter too.
  10. When you're done and want to exit Jupyter, click on the 'Anaconda Prompt' window on the taskbar, click somewhere in the window, and press Ctrl+C twice.
  11. Please wait a few seconds; Jupyter should shut down, making it safe to close your notebook.
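
One way to confirm from inside a notebook that the kernel chosen in Step 9 really belongs to your environment is to compare sys.prefix (the folder of the running Python installation) with the environment path. This is a sketch under that assumption; the function name same_env and the example paths are illustrative.

```python
import sys
from pathlib import PureWindowsPath


def same_env(actual_prefix: str, expected_prefix: str) -> bool:
    """Compare two Windows paths, ignoring letter case and slash direction."""
    return PureWindowsPath(actual_prefix) == PureWindowsPath(expected_prefix)


# Replace the second argument with the path of your own environment.
print(same_env(sys.prefix, r"P:\leal\conda-envs\mynewenv"))
```

If this prints False in a notebook, the notebook is probably running on a different kernel (for example the base environment), and the kernel can be changed from the Jupyter menu.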

...

  1. Log on to your SAIL desktop.
  2. Click on the Start menu (the little Windows icon on the bottom left) → Anaconda3 (64-bit) → Anaconda Prompt (Anaconda3). This will open a command line window with a line of text like:
    1. (base) C:\Users\<your username will be here>
  3. Activate your conda environment by typing the following in the same command line window:
    1. conda activate P:\<your username here>\<name of your new environment>
    2. So, if my username is 'leal' and I created an environment called 'mynewenv' in a folder called 'conda-envs', I would use the command:
      1. conda activate P:\leal\conda-envs\mynewenv
    3. You will know when the environment is activated because the window will show a line of text like:
      1. (P:\<your username>\<your environment name>) C:\Users\<your username>
  4. Outside your SAIL desktop, go to Google and search '<name of the package you want to install> anaconda'.
    1. For example, if I want to install the package 'recordlinkage', I would search on Google for 'recordlinkage anaconda'.
  5. Select the Google result from anaconda.org; this should take you directly to the Anaconda page for the package.
  6. On the page, there will be a command that tells you how to install it. Sticking with the recordlinkage example, the webpage shows me that the command to install is:
    1. conda install -c conda-forge recordlinkage
  7. Go back to your SAIL desktop and type this installation command into your Anaconda Prompt window, hitting 'Enter'.
  8. Wait for Anaconda to ask you if you want to proceed - hit 'y' on your keyboard and then press 'Enter'.
  9. Your package is installed! 
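
To double-check that a package is visible to Python after installation, a short check like the one below can be run in a notebook or at a Python prompt. It is a minimal sketch: the helper name is_installed is hypothetical, and it assumes the import name matches the package name (as it does for recordlinkage), which is not true of every package.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """True if a top-level module with this name can be found by Python."""
    return importlib.util.find_spec(module_name) is not None


# 'json' ships with Python, so it should always be found.
print(is_installed("json"))
# Reports whether the example package from this guide is available.
print(is_installed("recordlinkage"))
```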


...


Connecting to DB2 from a Python version > 3.7

There is a known problem with the ibm_db package compiled for Python versions above 3.7, which means we need to take some extra steps before we can connect to DB2 from our notebooks.

As a prerequisite, your Anaconda environment will need the sqlalchemy, ibm_db, and ibm_db_sa packages installed.

You can then run this script (replacing the username and password variables with your own) to connect to DB2.

import os
import sys

# Locate this environment's site-packages folder, which holds the DB2 CLI driver.
rootpath = [x for x in sys.path if x.split(os.sep)[-1] == "site-packages"][0]
dllpath = os.path.join(rootpath, "clidriver", "bin")
# Register the driver's DLL folder so that ibm_db can find it when imported below.
os.add_dll_directory(dllpath)

import ibm_db
import ibm_db_dbi
import ibm_db_sa
import pandas as pd

from sqlalchemy import create_engine, text

uid = "YOUR USERNAME HERE"
pwd = "YOUR PASSWORD HERE"

# Connection details for the database.
database = "pr_sail"
hostname = "db2.database.ukserp.ac.uk"
port = "60070"
security = "ssl"
ssl_client_keystoredb = r"R:\UKSeRP\DB2_SSL\chi.kdb"
ssl_client_keystash = r"R:\UKSeRP\DB2_SSL\chi.sth"
conn_str = (
    f"db2+ibm_db://{uid}:{pwd}@{hostname}:{port}/{database}"
    f"?Security={security}"
    f"&SSLClientKeystoredb={ssl_client_keystoredb}"
    f"&SSLClientKeystash={ssl_client_keystash}"
)

engine = create_engine(conn_str)

# Simple test query: fetch one row from the system catalogue.
sql = "SELECT * FROM syscat.tables LIMIT 1"

with engine.connect() as conn:
    result = conn.execute(text(sql))
    rows = result.fetchall()
    df = pd.DataFrame(rows, columns=result.keys())

print(df)

The reason for this problem is explained in this GitHub issue: https://github.com/ibmdb/python-ibmdb/issues/887
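
The script above stores the password in the notebook and places it directly in the connection string. As an optional variation, the sketch below prompts for the password instead and URL-escapes the credentials, which SQLAlchemy expects when they contain characters such as '@' or '/'. The function names build_db2_url and prompt_and_build are hypothetical, and the SSL parameters from the script above would still need to be appended to the result.

```python
from getpass import getpass
from urllib.parse import quote_plus


def build_db2_url(uid: str, pwd: str, hostname: str, port: str, database: str) -> str:
    """Build the base connection URL with URL-escaped credentials."""
    return (
        f"db2+ibm_db://{quote_plus(uid)}:{quote_plus(pwd)}"
        f"@{hostname}:{port}/{database}"
    )


def prompt_and_build(uid: str) -> str:
    """Ask for the password interactively so it is never saved in the notebook."""
    pwd = getpass("DB2 password: ")
    return build_db2_url(uid, pwd, "db2.database.ukserp.ac.uk", "60070", "pr_sail")
```

Calling prompt_and_build("YOUR USERNAME HERE") in a notebook cell would ask for the password and return the URL without the password ever appearing in the saved notebook code.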


...

FAQs


Can I install R packages from CRAN?

...