Getting Started
===============

Installation
------------

.. TODO: Update the link to the final PyPI package

THORR is available on `PyPI `_ and can be installed using pip:

.. TODO: Update the link to the final PyPI package

.. code-block:: bash
   :linenos:

   pip install thorr

.. pip install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ thorr

Initial Setup
-------------

Use THORR's command-line interface to create a new project. The ``new-project`` command creates a THORR project with the specified name in the specified directory, using the following syntax:

.. code-block:: bash
   :linenos:

   python -m thorr new-project [OPTIONS] NAME [DIR]

Replace ``NAME`` with the name of your project and ``DIR`` with the directory where you want to create the project. If ``DIR`` is not provided, the project is created in the current directory.

When creating a new project, you can also download THORR's data files. See :ref:`new-project` for more information on the available options.

To create a new THORR project called *my_project* in the current directory and download the data files, run the following command:

.. code-block:: bash
   :linenos:

   python -m thorr new-project --get-data my_project

This command creates a new directory named after the project, with the following structure:

.. code-block:: text

   my_project/
   ├── .env/
   └── data/

The ``.env`` directory contains the configuration files for the project, and the ``data`` directory contains the data files. By default, a template configuration file is created in the ``.env`` directory. You can customize the configuration file to suit your project's needs. See :ref:`configuration` for more information on configuring your project.

To create a new THORR project without downloading the data files or creating a template configuration file, run the following command:

.. code-block:: bash
   :linenos:

   python -m thorr new-project my_project

.. _configuration:

Configuration
-------------

THORR uses a configuration file to manage project settings and parameters. The configuration file is an INI (``.ini``) file located in the ``.env`` directory of the project. Here is an example of a THORR configuration file:

.. code-block:: ini

   [project]
   title = my_project
   project_dir = my_project
   region = global
   description =
   start_date =
   end_date =

   [database]
   type = postgresql
   user = my_username
   password = my_password
   host = localhost
   port = 1234
   database = name_of_database
   schema = name_of_schema

   [data]
   gis_geopackage = data/gis/thorr_gis.gpkg
   ml_model = data/ml/global_ml.joblib

   [data.geopackage_layers]
   basins = Basins
   rivers = Rivers
   dams = Dams
   reservoirs = Reservoirs
   reaches = Reaches
   buffered_reaches = BufferedReaches

   [ee]
   private_key_path = /path/to/earth/engine/private/key.json
   service_account = service_account_email

The configuration file contains the following sections: :ref:`config-project`, :ref:`config-database`, :ref:`data`, and :ref:`config-gee`. Each section contains key-value pairs that define the settings and parameters for the project.

.. _config-project:

``[project]``
~~~~~~~~~~~~~~

The ``[project]`` section contains the project settings, such as the project name, region, description, and time frame.

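
As a quick check that these settings parse as expected, the file can be read with Python's standard ``configparser`` module. The sketch below is illustrative only and is not part of THORR: ``parse_config`` and ``missing_sections`` are hypothetical helper names, and the inline ``SAMPLE`` text stands in for a real configuration file. Interpolation is disabled so that a literal ``%`` in a value is read as-is.

.. code-block:: python

   import configparser

   # Sections shown in the example configuration file above.
   EXPECTED_SECTIONS = ("project", "database", "data", "data.geopackage_layers", "ee")

   # Inline stand-in for a real configuration file (hypothetical values).
   SAMPLE = """
   [project]
   region = crb
   description =
   start_date =
   end_date =
   """


   def parse_config(text):
       """Parse INI text without interpolation and return the parser."""
       parser = configparser.ConfigParser(interpolation=None)
       parser.read_string(text)
       return parser


   def missing_sections(parser):
       """Return the expected section names that are absent from the file."""
       return [name for name in EXPECTED_SECTIONS if not parser.has_section(name)]


   config = parse_config(SAMPLE)
   project = config["project"]
   print(project["region"])           # value of the region key
   print(project["start_date"] == "")  # keys left blank are read as empty strings
   print(missing_sections(config))     # sections still to be filled in

For a file on disk, ``parser.read_file(open(path))`` can replace ``read_string``; unlike ``parser.read``, it raises an error when the file does not exist instead of silently returning an empty configuration.
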
The following keys are available in the ``[project]`` section:

+-------------+-----------------------------------------------------------------------+
| Key         | Value                                                                 |
+=============+=======================================================================+
| name        | The name or title of the project                                      |
+-------------+-----------------------------------------------------------------------+
| project_dir | Path to the project directory                                         |
+-------------+-----------------------------------------------------------------------+
| region      | Abbreviation of the region for the project (See :ref:`thorr-regions`) |
+-------------+-----------------------------------------------------------------------+
| description | Brief description of the project                                      |
+-------------+-----------------------------------------------------------------------+
| start_date  | Start date for THORR water temperature estimates                      |
+-------------+-----------------------------------------------------------------------+
| end_date    | End date for THORR water temperature estimates                        |
+-------------+-----------------------------------------------------------------------+

.. _thorr-regions:

Regions
^^^^^^^

The ``region`` key in the ``[project]`` section specifies the region for the project. THORR has separately trained models for different regions. The following regions are currently available in THORR:

+----------------------+--------------+
| Region               | Abbreviation |
+======================+==============+
| Columbia River Basin | crb          |
+----------------------+--------------+

.. _config-database:

``[database]``
~~~~~~~~~~~~~~~

The ``[database]`` section contains the database connection settings.

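
To illustrate how these settings fit together, the sketch below assembles them into a connection URL of the form ``type://user:password@host:port/database``, which many database libraries accept. This is an assumption for illustration and not THORR's own connection code: ``database_url`` is a hypothetical helper, the password is a made-up value chosen to show percent-encoding, and the ``schema`` key is not part of the URL.

.. code-block:: python

   import configparser
   from urllib.parse import quote

   # Inline stand-in for the [database] section (hypothetical values).
   SAMPLE = """
   [database]
   type = postgresql
   user = my_username
   password = p@ss/word
   host = localhost
   port = 1234
   database = name_of_database
   schema = name_of_schema
   """


   def database_url(section):
       """Build a connection URL, percent-encoding the user name and password."""
       user = quote(section["user"], safe="")
       password = quote(section["password"], safe="")
       return (
           f"{section['type']}://{user}:{password}"
           f"@{section['host']}:{section['port']}/{section['database']}"
       )


   parser = configparser.ConfigParser(interpolation=None)
   parser.read_string(SAMPLE)
   url = database_url(parser["database"])
   print(url)

Encoding the credentials keeps characters such as ``@`` and ``/`` from being mistaken for URL delimiters.
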
The following keys are available in the ``[database]`` section:

+----------+-------------------------------------------------+
| Key      | Value                                           |
+==========+=================================================+
| type     | Type of database: postgresql or mysql           |
+----------+-------------------------------------------------+
| user     | Username                                        |
+----------+-------------------------------------------------+
| password | Password                                        |
+----------+-------------------------------------------------+
| host     | Host address                                    |
+----------+-------------------------------------------------+
| port     | Port number                                     |
+----------+-------------------------------------------------+
| database | Name of the database where the schema is stored |
+----------+-------------------------------------------------+
| schema   | Name of the schema                              |
+----------+-------------------------------------------------+

See :doc:`database` for more information on setting up the database.

.. _data:

``[data]``
~~~~~~~~~~

The ``[data]`` section contains the paths to the GIS and machine learning data files. An additional :ref:`data.geopackage_layers` sub-section is dedicated to the GIS geopackage layers.

The following keys are available in the ``[data]`` section:

+----------------+---------------------------------------------------------------------------------+
| Key            | Value                                                                           |
+================+=================================================================================+
| gis_geopackage | Path to the GIS geopackage file containing all vector files used by THORR       |
+----------------+---------------------------------------------------------------------------------+
| ml_model       | Path to the machine learning model used to generate water temperature estimates |
+----------------+---------------------------------------------------------------------------------+

.. _data.geopackage_layers:

``[data.geopackage_layers]``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The ``[data.geopackage_layers]`` sub-section contains the names of the layers in the GIS geopackage file.

The following keys are available in the ``[data.geopackage_layers]`` sub-section:

+------------------+---------------------------------+
| Key              | Value                           |
+==================+=================================+
| basins           | Layer name for basins           |
+------------------+---------------------------------+
| rivers           | Layer name for rivers           |
+------------------+---------------------------------+
| dams             | Layer name for dams             |
+------------------+---------------------------------+
| reservoirs       | Layer name for reservoirs       |
+------------------+---------------------------------+
| reaches          | Layer name for reaches          |
+------------------+---------------------------------+
| buffered_reaches | Layer name for buffered reaches |
+------------------+---------------------------------+

See :doc:`gis` for more information on how THORR's GIS data is structured.

.. _config-gee:

``[ee]``
~~~~~~~~

The ``[ee]`` section contains the configuration settings for Google Earth Engine (GEE). THORR obtains satellite data from the GEE platform, so a `GEE service account and private key `_ are required.

The following keys are available in the ``[ee]`` section:

+------------------+------------------------------------------+
| Key              | Value                                    |
+==================+==========================================+
| private_key_path | Path to the GEE private key file (JSON)  |
+------------------+------------------------------------------+
| service_account  | Email address of the GEE service account |
+------------------+------------------------------------------+

Workflow and Cronjob
--------------------

Once the project is set up and configured, you can start using THORR to generate water temperature estimates. THORR's workflow consists of four main steps:

1. Read and process GIS information from the database
2. Retrieve and process satellite data from Google Earth Engine (See :ref:`retrieve-data`)
3. Generate water temperature estimates using machine learning models
4. Save the results to the database

This workflow can be automated to run at regular intervals using a cron job.

.. TODO: Add instructions on setting up a cronjob to run the THORR workflow at regular intervals.