16. IDEs & Development Tools#

1. IDE Built-In Features#

What is an IDE?#

  • Integrated Development Environment
  • All-in-one tool for writing, testing, debugging code
  • Popular for Data Science: VS Code, PyCharm, JupyterLab

Standard Built-In Features:#

| Feature | Description |
|---|---|
| Syntax highlighting | Color-codes different parts of code ✅ |
| Integrated debugger | Set breakpoints, step through code ✅ |
| Version control integration | Git commands inside IDE ✅ |
| Terminal access | Run commands without leaving IDE ✅ |
| Code completion | Auto-suggests code as you type ✅ |
| Linting | Real-time code error detection ✅ |
| Refactoring tools | Rename variables, extract functions ✅ |
| Search & Replace | Across entire project ✅ |
| Extensions/Plugins | Add extra functionality ✅ |
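
To make "linting" concrete, here is a minimal sketch of one check a linter performs: flagging imports that are never used. It relies only on the standard-library `ast` module. The function name `find_unused_imports` and the `SAMPLE` snippet are made up for illustration, and real linters apply many more rules than this.

```python
import ast


def find_unused_imports(source: str) -> list[str]:
    """Return the names that are imported but never referenced in source."""
    tree = ast.parse(source)
    imported = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # "import os.path" binds the name "os"
                imported.add((alias.asname or alias.name).split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                imported.add(alias.asname or alias.name)
    # Every bare identifier that appears anywhere in the code
    used = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
    return sorted(imported - used)


# Hypothetical snippet with one unused import
SAMPLE = """
import os
import json
import math as m

print(m.sqrt(16))
print(json.dumps({"a": 1}))
"""

print(find_unused_imports(SAMPLE))
```

An IDE runs checks of this kind continuously as you type and underlines the offending line instead of printing a list.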

2. NOT Built-In — Exam Answer#

NOT Typically Built-In:#

Capabilities that are NOT standard built-in IDE features (the correct exam answers):

  • ✅ Automated code review and security scanning (exam answer JAN_AN Q407)
       → needs external tools: SonarQube, Snyk, GitHub Actions
  • ✅ Native hardware virtualization (exam answer JAN_FN Q288)
       → needs separate tools: VMware or VirtualBox (hypervisors), Docker (containers)

Wrong options, because each of these IS built in:

  • ❌ Syntax highlighting
  • ❌ Integrated debugger
  • ❌ Version control integration
  • ❌ Terminal access
  • ❌ Code completion

3. External Tools for Code Review:#

SonarQube      → static code analysis
Snyk           → security vulnerability scanning
GitHub Actions → CI/CD, automated testing
Dependabot     → dependency vulnerability alerts
pre-commit     → git hooks for code checks

4. Popular IDEs#

| IDE | Best For | Key Features |
|---|---|---|
| VS Code | General purpose | Lightweight, huge extension library |
| PyCharm | Python development | Deep Python support, smart debugger |
| JupyterLab | Data exploration | Notebooks, inline visualization |
| Google Colab | Cloud notebooks | Free GPU, easy sharing |
| RStudio | R programming | R-specific tools |
| Spyder | Scientific Python | MATLAB-like interface |

5. Jupyter Notebooks — Best Practices#

Structure:#

Cell 1: Imports
Cell 2: Configuration (paths, constants)
Cell 3: Load data
Cell 4: Explore data
Cell 5: Clean data
Cell 6: Analyze
Cell 7: Visualize
Cell 8: Conclusions
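
The cell order above can be sketched as a single script that uses `# %%` cell markers, which VS Code's Python extension and Jupytext treat as cell boundaries. The inline CSV text, the column names, and the `PASS_MARK` threshold are invented for illustration. A real notebook would normally load a file and use pandas rather than the standard library.

```python
# %% Cell 1: Imports
import csv
import io
import statistics

# %% Cell 2: Configuration (paths, constants)
# Hypothetical inline data standing in for a real file path
RAW_CSV = "name,score\nasha,82\nben,\nchen,91\n"
PASS_MARK = 85  # assumed threshold, for illustration only

# %% Cell 3: Load data
rows = list(csv.DictReader(io.StringIO(RAW_CSV)))

# %% Cell 4: Explore data
print(f"{len(rows)} rows, columns: {list(rows[0].keys())}")

# %% Cell 5: Clean data (drop rows with a missing score)
clean = [{"name": r["name"], "score": int(r["score"])} for r in rows if r["score"]]

# %% Cell 6: Analyze
mean_score = statistics.mean(r["score"] for r in clean)
passed = [r["name"] for r in clean if r["score"] >= PASS_MARK]

# %% Cell 7: Visualize (a text bar chart keeps the sketch dependency-free)
for r in clean:
    print(f"{r['name']:<6}{'#' * (r['score'] // 10)}")

# %% Cell 8: Conclusions
print(f"Mean score {mean_score}, passed: {passed}")
```

Keeping imports and configuration at the top means a reader can change a path or constant in one place and rerun everything below it.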

Before Committing to Git:#

# Clear outputs before committing
jupyter nbconvert --clear-output notebook.ipynb

# OR use nbstripout
pip install nbstripout
nbstripout notebook.ipynb

Why Clear Outputs?#

  • Outputs can contain sensitive data
  • Large outputs bloat git repository
  • Avoids merge conflicts in output cells
  • Reproducibility: others should rerun the notebook and get the same outputs themselves
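
An `.ipynb` file is plain JSON, so clearing outputs amounts to emptying each code cell's `outputs` list and resetting its `execution_count`. The sketch below shows that idea on a hypothetical two-cell notebook. It is not the actual implementation of `nbconvert` or `nbstripout`, and the `clear_outputs` helper and `NOTEBOOK` dict are illustrative names.

```python
import json


def clear_outputs(notebook: dict) -> dict:
    """Return a copy of a notebook dict with code-cell outputs removed."""
    cleaned = json.loads(json.dumps(notebook))  # simple deep copy via JSON
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


# Hypothetical notebook, trimmed to the fields that matter here
NOTEBOOK = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Demo"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": 3,
            "source": ["print(api_key)"],
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["sk-secret\n"]}
            ],
        },
    ],
}

stripped = clear_outputs(NOTEBOOK)
print(len(json.dumps(NOTEBOOK)), "->", len(json.dumps(stripped)), "characters")
```

The printed secret lives only in the output, so it disappears from the stripped copy while the source code stays intact. Running `nbstripout --install` inside a repository sets up a git filter so this happens automatically on commit.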

Restart and Run All:#

Kernel → Restart & Run All
→ Ensures notebook runs top to bottom without hidden state
→ Always do this before sharing/committing
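
Hidden state is easiest to see in a small simulation. The sketch below treats each cell as a code string and a kernel as one shared namespace. This is a simplification of how Jupyter works, and `run_cells` plus the variable names are hypothetical. A cell that defined `shipping` was run and then deleted, so the saved notebook only works in the old kernel.

```python
def run_cells(cells, namespace=None):
    """Execute code strings in order in one shared namespace, like a kernel."""
    namespace = {} if namespace is None else namespace
    for source in cells:
        exec(source, namespace)
    return namespace


# Interactive session: this cell was run once and later deleted from the notebook
kernel = run_cells(["shipping = 5"])

# The cells that are actually saved in the notebook file
saved_cells = ["price = 50", "total = price + shipping"]

# Works in the old kernel because `shipping` still lingers in memory
run_cells(saved_cells, kernel)
print("old kernel total:", kernel["total"])

# Restart & Run All: fresh namespace, only the saved cells
try:
    run_cells(saved_cells)
    fresh_ok = True
except NameError as err:
    fresh_ok = False
    print("fresh kernel failed:", err)
```

The failure in the fresh run is exactly what a colleague would hit after cloning the notebook, which is why the check belongs before every commit.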

VS Code — Key Features for Data Science:#

Extensions useful for TDS:
├── Python           → Python language support
├── Pylance          → Fast type checking
├── Jupyter          → Notebook support in VS Code
├── GitLens          → Enhanced git visualization
├── Docker           → Dockerfile support
├── REST Client      → Test APIs directly in editor
└── Rainbow CSV      → Color-coded CSV viewing

Debugger — How to Use:#

Setting breakpoints:
1. Click left of line number → red dot appears
2. Run in debug mode (F5)
3. Code pauses at breakpoint

Debug controls (VS Code defaults):
F5        → Continue (run to next breakpoint)
F10       → Step Over (next line, don't go into function)
F11       → Step Into (go inside function)
Shift+F11 → Step Out (exit current function)

Watch panel → monitor variable values
Call stack → see function call chain
Variables panel → all current variables
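
A small script to practice on may help. The functions below are made up for illustration. Set a breakpoint on the `for` line, start with F5, and watch `current` change in the Variables panel while the Call stack shows `running_total` called from `summarize`.

```python
def running_total(values):
    """Return the cumulative sums of values."""
    totals = []
    current = 0
    for value in values:  # set a breakpoint here and watch `current` and `value`
        current += value
        totals.append(current)
    return totals


def summarize(values):
    # F11 on the next line steps into running_total; F10 steps over it
    totals = running_total(values)
    return {"last": totals[-1], "steps": len(totals)}


if __name__ == "__main__":
    # Uncomment to pause from code instead of clicking a red dot.
    # From a plain terminal this opens the pdb prompt.
    # breakpoint()
    print(summarize([3, 4, 5]))
```

Adding `current` to the Watch panel keeps it visible even when you step into a different function.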

Terminal Access in IDE:#

# VS Code integrated terminal
Ctrl + ` (backtick) → open terminal
Can run:
  python script.py
  pip install pandas
  git add . && git commit -m "msg"
  pytest tests/
  docker build -t app .

uv — Python Package Manager (Exam-Relevant)#

What is uv?#

  • Fast modern Python package and project manager
  • Written in Rust → extremely fast (10-100x faster than pip)
  • Replacement for: pip + venv + pyenv combined

Correct Command Sequence — Exam Answer:#

# ✅ Correct sequence (exam answer May_FN Q359)
uv init                      # Step 1: initialize project
uv python install 3.11       # Step 2: install specific Python version
uv add pandas numpy          # Step 3: add dependencies

# ❌ Wrong sequences:
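uv init python 3.11 → uv add pandas         # wrong: the version is not a positional argument to init
uv create --version 3.11 → uv install       # wrong: uv has no create or install subcommand
uv new --python=3.11 → uv sync pandas       # wrong: uv has no new subcommand; sync takes no package names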

Complete uv Reference:#

uv init                  # create new project
uv python install 3.11   # install Python version
uv add pandas            # add dependency
uv remove pandas         # remove dependency
uv sync                  # install from lockfile
uv run script.py         # run in project environment
uv venv                  # create virtual environment

uv vs Traditional:#

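| Task | Traditional | uv |
|---|---|---|
| Create venv | python -m venv .venv | uv venv |
| Install package | pip install pandas | uv add pandas |
| Install Python | pyenv install 3.11 | uv python install 3.11 |
| Install from requirements | pip install -r requirements.txt | uv pip install -r requirements.txt (uv sync installs from pyproject.toml / uv.lock instead) |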

Virtual Environments — Overview:#

# venv (built-in Python)
python -m venv .venv           # create
source .venv/bin/activate      # activate (Mac/Linux)
.venv\Scripts\activate         # activate (Windows)
deactivate                     # deactivate

# conda
conda create -n myenv python=3.11
conda activate myenv
conda deactivate

# uv (fastest)
uv venv
source .venv/bin/activate

Why Virtual Environments?#

  • Isolate project dependencies
  • Avoid conflicts between projects
  • Reproducible environments
  • Each project has its own Python packages
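
A quick way to confirm that isolation is active is to ask the interpreter where it is running from. This is a minimal sketch using only `sys`, and the helper name `in_virtual_environment` is made up. It detects environments created with `venv` or `uv venv`. A conda environment is a separate installation, so this check is expected to report `False` there.

```python
import sys


def in_virtual_environment() -> bool:
    """True when the interpreter runs from a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("interpreter:", sys.executable)
print("environment root:", sys.prefix)
print("inside a virtual environment:", in_virtual_environment())
```

If the last line prints `False` after you thought you had activated `.venv`, packages are being installed into the base Python instead of the project.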

pip — Package Management Reference:#

pip install pandas              # install
pip install pandas==2.1.0       # specific version
pip install -r requirements.txt # from file
pip uninstall pandas            # remove
pip list                        # list installed
pip freeze > requirements.txt   # save current env
pip show pandas                 # package info
pip install --upgrade pandas    # upgrade
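
The same information that `pip list` and `pip freeze` report can be read from Python through the standard-library `importlib.metadata` module. The sketch below produces `name==version` lines roughly like `pip freeze`. It is not pip's implementation: pip hides some packages by default and handles editable installs differently. The function name `freeze_like` is hypothetical.

```python
from importlib import metadata


def freeze_like() -> list[str]:
    """List installed distributions as name==version, similar to pip freeze."""
    lines = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions whose metadata has no name
            lines.add(f"{name}=={dist.version}")
    return sorted(lines, key=str.lower)


for line in freeze_like()[:5]:  # show only the first few entries
    print(line)
```

Run it inside and outside an activated environment to see how the installed set changes.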

Quick Reference#

IDE Built-In Features:
  ✅ Syntax highlighting
  ✅ Integrated debugger
  ✅ Version control integration
  ✅ Terminal access
  ✅ Code completion
  ✅ Linting, refactoring, search

NOT Built-In:
  ❌ Native hardware virtualization (exam answer JAN_FN Q288)
  ❌ Automated code review + security scanning (exam answer JAN_AN Q407)

Popular IDEs:
  VS Code      → general purpose, lightweight
  PyCharm      → deep Python support
  JupyterLab   → notebooks, data exploration

uv sequence:
  uv init → uv python install 3.11 → uv add pandas ✅

Jupyter best practices:
  ✅ Clear outputs before committing
  ✅ Restart & Run All before sharing
  ✅ Structure cells logically

Virtual environments:
  venv → built-in Python
  conda → data science focused
  uv → fastest, modern ✅