
Episode 3: Dependencies & Environments

Learning Objectives

By the end of this episode, you will:

  • Understand how to specify package dependencies
  • Use optional dependencies with "extras"
  • Work with virtual environments effectively
  • Handle version constraints properly
  • Understand the difference between package and development dependencies

🎬 Adding New Features

Dr. Sarah's kir-pydemo package is getting popular! Now she wants to add some new features:

  • Read FASTA files (needs biopython)
  • Create plots of GC content distributions (needs matplotlib)
  • Statistical analysis of sequences (needs numpy, scipy)

But she has concerns:

"Not everyone needs all these features. Do I force all users to install matplotlib even if they just want basic sequence analysis? What if someone's using an old version of numpy that conflicts with what I need?"

The solution? Proper dependency management with pyproject.toml!

📦 Understanding Dependencies

Dependencies are other Python packages that your package needs to work. There are different types:

1. Core Dependencies

Required for basic functionality - installed automatically with your package:

[project]
dependencies = [
    "biopython>=1.80",
    "numpy>=1.20.0",
]

2. Optional Dependencies

Needed for extra features - installed only when requested:

[project.optional-dependencies]
plotting = ["matplotlib>=3.5.0"]
dev = ["pytest>=7.0", "black>=22.0"]

Install with: pip install "kir-pydemo[plotting]"

3. Development Dependencies

Tools for development - not needed by users:

  • Testing frameworks (pytest)
  • Code formatters (black, ruff)
  • Documentation builders (sphinx)
  • Type checkers (mypy)

🔨 Hands-On: Adding Dependencies

Step 1: Decide What's Core vs. Optional

For kir-pydemo, let's say we want to add:

  • Core: None (our basic functions use only stdlib!)
  • Optional - bio: biopython for FASTA file support
  • Optional - plotting: matplotlib for visualization
  • Optional - dev: Testing and code quality tools

Step 2: Update pyproject.toml

[build-system]
requires = ["setuptools>=61.0"]
build-backend = "setuptools.build_meta"

[project]
name = "kir-pydemo"
version = "0.2.0"  # 🆕 Bumped version
description = "A demonstration package for DNA sequence analysis"
readme = "README.md"
requires-python = ">=3.9"
license = {text = "MIT"}
authors = [
    {name = "BMRC Training", email = "training@example.com"}
]
keywords = ["bioinformatics", "DNA", "sequence analysis", "tutorial"]
classifiers = [
    "Development Status :: 3 - Alpha",
    "Intended Audience :: Science/Research",
    "Topic :: Scientific/Engineering :: Bio-Informatics",
    "License :: OSI Approved :: MIT License",
    "Programming Language :: Python :: 3",
    "Programming Language :: Python :: 3.9",
    "Programming Language :: Python :: 3.10",
    "Programming Language :: Python :: 3.11",
    "Programming Language :: Python :: 3.12",
]

# 🆕 NEW: Optional dependencies
[project.optional-dependencies]
bio = [
    "biopython>=1.80",
]
plotting = [
    "matplotlib>=3.5.0",
    "numpy>=1.20.0",
]
dev = [
    "pytest>=7.4.0",
    "pytest-cov>=4.1.0",
    "black>=23.0.0",
    "ruff>=0.1.0",
    "mypy>=1.5.0",
]
# Convenience: Install everything
all = [
    "kir-pydemo[bio,plotting]",
]

[project.scripts]
kir-pydemo = "kir_pydemo.cli:main"

[project.urls]
Homepage = "https://github.com/bmrc/kir-pydemo"
Documentation = "https://kir-pydemo.readthedocs.io"
Repository = "https://github.com/bmrc/kir-pydemo"
Issues = "https://github.com/bmrc/kir-pydemo/issues"
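
The self-referential all extra can look surprising at first. As a rough illustration of what it resolves to, the sketch below expands it by hand. This is a conceptual model only, not how pip or any other installer is implemented; EXTRAS, SELF_REFERENCE and expand_extra are names made up for this example.

```python
import re

# Extras copied from the pyproject.toml above (dev omitted for brevity).
EXTRAS = {
    "bio": ["biopython>=1.80"],
    "plotting": ["matplotlib>=3.5.0", "numpy>=1.20.0"],
    "all": ["kir-pydemo[bio,plotting]"],
}

# Matches a requirement that points back at this package's own extras.
SELF_REFERENCE = re.compile(r"^kir-pydemo\[([^\]]+)\]$")


def expand_extra(name, extras=EXTRAS, _seen=None):
    """Return the concrete requirements an extra resolves to, in order."""
    seen = set() if _seen is None else _seen
    if name in seen:  # guard against extras that reference each other
        return []
    seen.add(name)

    resolved = []
    for requirement in extras[name]:
        match = SELF_REFERENCE.match(requirement)
        if match:
            inner = [e.strip() for e in match.group(1).split(",")]
            found = [r for e in inner for r in expand_extra(e, extras, seen)]
        else:
            found = [requirement]
        for item in found:
            if item not in resolved:
                resolved.append(item)
    return resolved


print(expand_extra("all"))
# ['biopython>=1.80', 'matplotlib>=3.5.0', 'numpy>=1.20.0']
```

Defining all this way means a new extra only has to be added to one list instead of repeating every requirement.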

Step 3: Version Constraints

Let's understand version specifiers:

dependencies = [
    "numpy",                    # Any version (not recommended!)
    "numpy>=1.20.0",           # Minimum version
    "numpy>=1.20.0,<2.0.0",    # Version range
    "numpy~=1.20.0",           # Compatible release (>=1.20.0, <1.21.0)
    "numpy==1.20.0",           # Exact version (too restrictive!)
]
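
The compatible release operator (~=) is the least obvious of these. The sketch below shows how it expands into a lower and an upper bound for plain numeric versions. It is a simplified illustration (pre-releases, epochs and other parts of the version specification are ignored), and compatible_release_bounds is a name invented for this example.

```python
def compatible_release_bounds(version: str) -> tuple:
    """Expand ~=version into (lower bound, upper bound) strings."""
    parts = [int(p) for p in version.split(".")]
    if len(parts) < 2:
        raise ValueError("~= needs at least two components, e.g. 1.20")

    prefix = parts[:-1]   # drop the last component
    prefix[-1] += 1       # bump what is now the last component
    upper = ".".join(str(p) for p in prefix)
    return f">={version}", f"<{upper}"


print(compatible_release_bounds("1.20.0"))  # ('>=1.20.0', '<1.21')
print(compatible_release_bounds("1.20"))    # ('>=1.20', '<2')
```

Note how the number of components matters: ~=1.20.0 only allows patch releases, while ~=1.20 allows any later 1.x release.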

Best practices:

  • ✅ Use minimum versions: package>=1.0.0
  • ✅ Exclude known broken versions: package>=1.0.0,!=1.2.0
  • ✅ Use upper bounds cautiously: package>=1.0.0,<2.0.0
  • ❌ Avoid pinning exact versions in libraries: package==1.0.0

Pinning vs. Constraints

Libraries (packages imported by others) should use loose constraints:

dependencies = ["requests>=2.28.0"]

Applications (final products) can pin exact versions:

dependencies = ["requests==2.31.0"]

kir-pydemo is a library, so we use minimum version constraints.
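
To see why exact pins in libraries cause trouble, consider two libraries that depend on the same package. The toy model below is only an illustration, not a real resolver; the list of available versions and the helper names are hypothetical.

```python
def parse(version: str) -> tuple:
    """Turn '2.28.0' into (2, 28, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


# Hypothetical set of released versions of a shared dependency.
AVAILABLE = ["2.28.0", "2.29.0", "2.31.0"]


def candidates(*constraints):
    """Return the available versions that satisfy every constraint."""
    return [v for v in AVAILABLE if all(check(parse(v)) for check in constraints)]


# Two libraries that each pin a different exact version: nothing fits both.
pinned = candidates(
    lambda v: v == parse("2.28.0"),
    lambda v: v == parse("2.31.0"),
)

# The same two libraries using minimum versions: the installer has choices.
loose = candidates(
    lambda v: v >= parse("2.28.0"),
    lambda v: v >= parse("2.29.0"),
)

print(pinned)  # []
print(loose)   # ['2.29.0', '2.31.0']
```

An application sits at the top of the dependency tree, so nothing else has to agree with its pins. That is why exact versions are acceptable there.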

Step 4: Add FASTA Support

Create a new module src/kir_pydemo/io.py that uses biopython:

"""File I/O utilities for sequence data."""

from pathlib import Path
from typing import List, Tuple

try:
    from Bio import SeqIO
    HAS_BIOPYTHON = True
except ImportError:
    HAS_BIOPYTHON = False


def read_fasta(filepath: Path) -> List[Tuple[str, str]]:
    """
    Read sequences from a FASTA file.

    Parameters
    ----------
    filepath : Path
        Path to the FASTA file

    Returns
    -------
    List[Tuple[str, str]]
        List of (name, sequence) tuples

    Raises
    ------
    ImportError
        If biopython is not installed
    FileNotFoundError
        If the file doesn't exist

    Examples
    --------
    >>> sequences = read_fasta(Path("sequences.fasta"))
    >>> for name, seq in sequences:
    ...     print(f"{name}: {len(seq)} bp")
    """
    if not HAS_BIOPYTHON:
        raise ImportError(
            "biopython is required for FASTA support. "
            "Install with: pip install kir-pydemo[bio]"
        )

    if not filepath.exists():
        raise FileNotFoundError(f"File not found: {filepath}")

    sequences = []
    for record in SeqIO.parse(filepath, "fasta"):
        sequences.append((record.id, str(record.seq)))

    return sequences

Graceful Degradation

Notice the pattern:

  1. Try to import optional dependency
  2. Set a HAS_* flag
  3. Check the flag before using the feature
  4. Raise helpful error if not installed

This allows users to install only what they need!
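
If several modules need the same check, the four steps can be folded into one small helper. The version below is a minimal sketch using only the standard library; require_extra is a hypothetical helper, not part of kir-pydemo, and it assumes a top-level module name (for example Bio, which is the import name of the biopython distribution).

```python
import importlib
import importlib.util
from types import ModuleType


def require_extra(module_name: str, extra: str, package: str = "kir-pydemo") -> ModuleType:
    """Import an optional module, or explain which extra provides it."""
    # find_spec returns None when a top-level module cannot be located.
    if importlib.util.find_spec(module_name) is None:
        raise ImportError(
            f"{module_name} is required for this feature. "
            f'Install with: pip install "{package}[{extra}]"'
        )
    return importlib.import_module(module_name)


# A module that exists (json stands in for an installed optional dependency).
json_module = require_extra("json", extra="bio")
print(json_module.__name__)

# A module that is missing produces the helpful error message.
try:
    require_extra("module_that_is_not_installed", extra="bio")
except ImportError as error:
    print(error)
```

The trade-off compared with a module-level HAS_* flag is that the check happens at call time, so nothing is imported until the feature is actually used.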

Step 5: Update CLI for FASTA Support

Update src/kir_pydemo/cli.py to support FASTA files:

# Add to the gc-content subcommand
gc_parser.add_argument(
    "--fasta",
    type=Path,
    help="read sequences from FASTA file (requires: pip install kir-pydemo[bio])",
)

# In cmd_gc_content function
def cmd_gc_content(args: argparse.Namespace) -> int:
    """Handle the gc-content command."""
    sequences = []

    if args.fasta:
        try:
            # Imported lazily so the CLI still works without the bio extra
            from kir_pydemo.io import read_fasta
            fasta_sequences = read_fasta(args.fasta)
        except (ImportError, FileNotFoundError) as e:
            # Missing biopython or a missing file: report it and exit non-zero
            print(f"Error: {e}", file=sys.stderr)
            return 1
        for name, seq in fasta_sequences:
            result = gc_content(seq)
            print(f"{name}: GC content = {result:.{args.precision}f}%")
        return 0

    # ... rest of the function
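
For readers who want to try the new option in isolation, here is a self-contained sketch of how such a parser could be wired up. The positional sequence argument, the --precision default and build_parser itself are assumptions for this example; the real cli.py from the earlier episodes may differ.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Build a minimal parser with a gc-content subcommand."""
    parser = argparse.ArgumentParser(prog="kir-pydemo")
    subparsers = parser.add_subparsers(dest="command", required=True)

    gc_parser = subparsers.add_parser("gc-content", help="calculate GC content")
    gc_parser.add_argument("sequence", nargs="?", help="DNA sequence (assumed)")
    gc_parser.add_argument("--precision", type=int, default=2, help="decimal places (assumed)")
    gc_parser.add_argument(
        "--fasta",
        type=Path,
        help="read sequences from FASTA file (requires: pip install kir-pydemo[bio])",
    )
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["gc-content", "--fasta", "reads.fasta"])
print(args.command, args.fasta, args.precision)
```

Because type=Path is given, args.fasta is already a Path object by the time the command function receives it.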

🌍 Virtual Environments

Virtual environments isolate your project's dependencies from the system Python.

Why Use Virtual Environments?

Without virtual environments:

System Python
├── numpy==1.19.0  (old project needs this)
├── pandas==1.3.0
└── kir-pydemo attempts to install numpy>=1.20.0  ❌ CONFLICT!

With virtual environments:

System Python
└── virtualenv installed

Project A (venv-a/)
├── numpy==1.19.0
└── pandas==1.3.0

Project B (venv-b/)
├── numpy==1.23.0  ✅ No conflict!
└── kir-pydemo

Creating Virtual Environments

Using venv (built-in)

# Create a virtual environment
python -m venv venv

# Activate it
source venv/bin/activate      # Linux/Mac
venv\Scripts\activate         # Windows

# Your prompt changes: (venv) user@host:~$

# Install packages in this environment
pip install -e ".[dev]"

# Deactivate when done
deactivate
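
If you are ever unsure whether the environment is active, you can ask Python itself. This small check is a sketch that applies to environments created with venv; in_virtual_environment is just a name chosen for this example.

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the interpreter runs inside a venv environment."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("Interpreter:", sys.executable)
print("Inside a virtual environment:", in_virtual_environment())
```

Run it once before and once after activating the environment to see the interpreter path change.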

Installing with Extras

# Install just the package
uv pip install kir-pydemo

# Install with bio support
uv pip install kir-pydemo[bio]

# Install with multiple extras
uv pip install kir-pydemo[bio,plotting]

# Install everything
uv pip install kir-pydemo[all]

# For development (editable install with dev tools)
uv pip install -e ".[dev]"
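
Once a package is installed, its metadata records which extras it declares. The sketch below reads that information with the standard library's importlib.metadata; declared_extras is a hypothetical helper, and it returns an empty list when the distribution is not installed in the current environment.

```python
from importlib import metadata


def declared_extras(distribution: str) -> list:
    """List the extras an installed distribution declares."""
    try:
        info = metadata.metadata(distribution)
    except metadata.PackageNotFoundError:
        return []  # not installed in this environment
    # Each extra appears as one "Provides-Extra" metadata field.
    return sorted(info.get_all("Provides-Extra") or [])


# Prints the declared extras, or [] if kir-pydemo is not installed here.
print(declared_extras("kir-pydemo"))
```

This only shows which extras exist, not which ones were requested at install time.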

Quote the Extras

On some shells (especially zsh), you need quotes:

uv pip install "kir-pydemo[bio]"   # Quoted
uv pip install -e ".[dev]"         # Quoted

📝 requirements.txt vs pyproject.toml

People often ask: "Should I use requirements.txt or pyproject.toml?"

pyproject.toml (for libraries)

[project]
dependencies = [
    "numpy>=1.20.0",  # Loose constraints
]

Use when:

  • Building a package to distribute
  • Want to specify minimum requirements
  • Need flexibility for users

requirements.txt (for applications)

# requirements.txt
numpy==1.23.4      # Pinned versions
pandas==1.5.3
matplotlib==3.7.1

Use when:

  • Deploying an application
  • Need reproducible environments
  • Want exact versions

Both Together

For kir-pydemo development, you might have:

pyproject.toml - Loose constraints for users:

[project]
dependencies = ["numpy>=1.20.0"]

[project.optional-dependencies]
dev = ["pytest>=7.0"]

requirements-dev.txt - Pinned versions for development:

# Exact versions used in development
numpy==1.23.4
pytest==7.4.0
black==23.7.0

Generate from the current environment (this records every installed package, including transitive dependencies):

pip freeze > requirements-dev.txt
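
To get a feel for what such a pinned listing contains, the following sketch builds a similar one with importlib.metadata. It is an approximation for illustration only: real pip freeze output also handles editable installs and packages installed from URLs, which this version does not. frozen_requirements is a name invented here.

```python
from importlib import metadata


def frozen_requirements() -> list:
    """Return name==version lines for the distributions in this environment."""
    pins = set()
    for dist in metadata.distributions():
        try:
            name = dist.metadata["Name"]
            version = dist.version
        except Exception:
            continue  # skip distributions with unreadable metadata
        if name and version:
            pins.add(f"{name}=={version}")
    return sorted(pins, key=str.lower)


for line in frozen_requirements():
    print(line)
```

Every line is an exact pin, which is what makes the file reproducible and also why it belongs with the development setup rather than in the package's own dependency list.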

🔒 Dependency Lock Files

Modern tools provide lock files for reproducible installs:

Poetry (poetry.lock)

# Install poetry
uv pip install poetry

# Initialize
poetry init

# Add dependency
poetry add numpy

# Generates poetry.lock with exact versions

PDM (pdm.lock)

# Install pdm
uv pip install pdm

# Initialize
pdm init

# Add dependency
pdm add numpy

# Generates pdm.lock

pip-tools (requirements.txt + requirements.in)

# Install pip-tools
uv pip install pip-tools

# Create requirements.in (loose)
echo "numpy>=1.20.0" > requirements.in

# Generate requirements.txt (pinned)
pip-compile requirements.in

# Install exact versions
pip-sync requirements.txt

Lock Files in Practice

For kir-pydemo (a library), we don't commit lock files to the repository. For applications, lock files ensure everyone uses identical dependency versions.
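
What "identical dependency versions" means in practice can be sketched as a comparison between the pins in a lock file and what is actually installed. The example below is a simplified illustration: it only understands plain name==version lines and ignores hashes, environment markers and other things real lock files may contain. parse_pins, drift and the sample data are hypothetical.

```python
def parse_pins(text: str) -> dict:
    """Extract {name: version} from name==version lines, ignoring comments."""
    pins = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # strip trailing comments
        if "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def drift(locked: dict, installed: dict) -> dict:
    """Return {name: (locked, installed)} for every package that differs."""
    return {
        name: (want, installed.get(name))
        for name, want in locked.items()
        if installed.get(name) != want
    }


LOCK = """\
# requirements.txt
numpy==1.23.4      # Pinned versions
pandas==1.5.3
"""

locked = parse_pins(LOCK)
print(drift(locked, {"numpy": "1.23.4", "pandas": "2.0.0"}))
# {'pandas': ('1.5.3', '2.0.0')}
```

An empty result means the environment matches the lock file exactly.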

📋 Checkpoint: What Have We Achieved?

Verify you've successfully completed Episode 3:

  • Added optional dependencies to pyproject.toml
  • Created extras: [bio], [plotting], [dev], [all]
  • Implemented graceful dependency handling with try/except
  • Added FASTA file support with biopython
  • Created and activated a virtual environment
  • Installed package with extras: pip install -e ".[dev]"
  • Understand version constraints and when to use them

🎯 Key Takeaways

  1. Dependencies listed under dependencies in the [project] table are always installed
  2. Optional dependencies in [project.optional-dependencies] are installed only when requested through extras, e.g. pip install "kir-pydemo[bio]"
  3. Version constraints should be loose for libraries, strict for applications
  4. Virtual environments isolate project dependencies
  5. Graceful degradation provides helpful errors when optional deps are missing
  6. Lock files ensure reproducible environments (more important for apps than libraries)

🚀 What's Next?

In Episode 4, we'll add Testing & Quality tools to ensure kir-pydemo is reliable and maintainable:

  • Writing tests with pytest
  • Code formatting with black/ruff
  • Type checking with mypy
  • Pre-commit hooks for automation

This will make your package production-ready!

📚 Further Reading