2. Shell & Command Line#

1. ls - List Directory Contents#

Basic Usage:#

ls              # list files in current directory
ls /path/to/dir # list files in specific directory

All Important Flags:#

ls -a           # list ALL files including hidden ✅ (exam answer)
ls -l           # long format (permissions, size, date, owner)
ls -h           # human-readable sizes (KB, MB, GB) - NOT hidden files ❌
ls -la          # long format + hidden files combined
ls -lh          # long format + human-readable sizes
ls -lt          # sort by modification time
ls -lr          # reverse order
ls -lS          # sort by file size
ls -R           # recursive (list subdirectories too)

Hidden Files:#

  • Start with . (dot)
  • Examples: .gitignore, .env, .bashrc, .ssh/
  • ls without -a → hidden files NOT shown
  • ls -a → ALL files shown including hidden ✅
  • ls -h → human-readable sizes (NOT hidden)
  • show -a → not a valid Unix command
  • dir /ah → Windows command only
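
The `-a` behavior in a quick self-contained demo (the directory name is illustrative; any scratch directory works):

```shell
# Hidden files only appear with -a
mkdir -p /tmp/ls_demo && cd /tmp/ls_demo
touch visible.txt .hidden.txt
ls        # shows only: visible.txt
ls -a     # also shows: .  ..  .hidden.txt
```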

Long Format (ls -la) Output:#

ls -la
# Output:
# drwxr-xr-x  5 user group 4096 Dec 14 16:45 .
# drwxr-xr-x 20 user group 4096 Dec 13 10:22 ..
# -rw-r--r--  1 user group  220 Dec 14 09:00 .bashrc     ← hidden
# -rw-r--r--  1 user group  156 Dec 14 09:00 .gitignore  ← hidden
# drwxr-xr-x  2 user group 4096 Dec 14 16:45 data/
# -rw-r--r--  1 user group 2048 Dec 14 16:45 script.py
#
# Columns: permissions | links | owner | group | size | date | name

2. find - Search for Files by NAME#

# Find files by name
find . -name "*.py"              # all Python files
find . -name "data.csv"          # specific file
find /home -name "*.log"         # in specific directory
find . -name "Pine Ridge"        # ❌ finds FILE named "Pine Ridge"
                                 # NOT content inside files

# Find by type
find . -type f -name "*.csv"     # files only
find . -type d -name "data"      # directories only

# Find by size
find . -size +10M                # files larger than 10MB
find . -size -1k                 # files smaller than 1KB

# Find and execute command
find . -name "*.log" -exec cat {} \;     # cat all log files
find . -name "*.pyc" -exec rm {} \;     # delete all .pyc files

# Find modified recently
find . -mtime -7                 # modified in last 7 days
find . -mtime +30                # modified more than 30 days ago
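
Predicates can also be combined (AND is implicit between them). A sketch; `-delete` is supported by GNU and BSD find:

```shell
# Large AND old log files (implicit AND between predicates)
find . -type f -name "*.log" -size +10M -mtime +30

# -delete is a safer bulk remove than -exec rm {} \;
find . -type f -name "*.pyc" -delete
```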

find vs grep:#

Command   Searches        Example
find      File NAMES      find . -name "*.log"
grep      File CONTENTS   grep "error" logfile.txt
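
The two combine naturally: find locates the files, grep searches inside them. A sketch (the `./logs` path is illustrative); `-print0`/`-0` keeps filenames with spaces intact:

```shell
# List log files whose CONTENT mentions ERROR
find ./logs -name "*.log" -print0 | xargs -0 grep -l "ERROR"
```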

3. cat - Display File Contents#

cat file.txt              # display entire file
cat file1.txt file2.txt   # display multiple files
cat -n file.txt           # display with line numbers
cat > newfile.txt         # create new file (type content, Ctrl+D to end)
cat >> file.txt           # append to file

  • ❌ Does NOT search content (use grep)
  • ❌ NOT suitable for very large files (use less or head)

4. grep - Search Text INSIDE Files#

Basic Usage:#

grep "search_term" file.txt        # search in file ✅
grep "Pine Ridge" evacuation.txt   # find all lines with "Pine Ridge"
grep "error" app.log               # find error lines in log

Important Flags:#

grep -i "term" file.txt     # case-insensitive search
grep -n "term" file.txt     # show line numbers
grep -v "term" file.txt     # invert - lines NOT containing term
grep -c "term" file.txt     # count matching lines
grep -l "term" *.txt        # list files containing term
grep -r "term" ./dir/       # recursive search in directory
grep -w "term" file.txt     # whole word match only
grep -A 2 "term" file.txt   # show 2 lines AFTER match
grep -B 2 "term" file.txt   # show 2 lines BEFORE match
grep -E "pattern" file.txt  # extended regex ✅ (exam relevant)

grep with Regex:#

grep -E "[0-9]+" file.txt         # lines with numbers
grep -E "^ERROR" file.txt         # lines starting with ERROR
grep -E "\.py$" file.txt          # lines ending with .py
grep -E "(GET|POST)" access.log   # lines with GET or POST
grep -E "1[0-5][0-9]" file.txt    # digit runs 100-159 (loose time-range match)
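
Note that an unanchored digit pattern also matches inside longer numbers; anchoring it to the line start (or the surrounding time format) avoids false hits. A self-contained check:

```shell
# Only the 10:00-15:59 line survives the anchored pattern
printf '10:30 ok\n21:59 late\n' | grep -E '^1[0-5]:[0-5][0-9]'
# → 10:30 ok
```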

5. awk - Field Extraction & Processing#

Basic Usage:#

awk '{print $1}' file.txt         # print first field
awk '{print $3, $7}' file.txt     # print 3rd and 7th fields ✅
awk -F',' '{print $2}' file.csv   # use comma as delimiter
awk '{print NR, $0}' file.txt     # print with line numbers

Common Patterns:#

# Print specific fields from log
awk '{print $1, $4, $7}' access.log

# Filter and print
awk '$9 == 200 {print $0}' access.log   # lines where 9th field = 200

# Calculate sum
awk '{sum += $5} END {print sum}' file.txt

# Print lines matching condition
awk '$7 ~ /\/checkout/' access.log      # 7th field contains /checkout

# Count occurrences
awk '{count[$7]++} END {for (url in count) print count[url], url}' log
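
The sum pattern extends to an average by also counting lines; a sketch assuming field 5 is numeric:

```shell
# Average of the 5th field (the if guards against empty input)
awk '{sum += $5; n++} END {if (n) printf "avg=%.2f\n", sum/n}' file.txt
```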

Special Variables:#

Variable   Meaning
$0         Entire line
$1, $2…    Field 1, 2, …
NR         Current line number
NF         Number of fields
FS         Field separator
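
The special variables in action on inline input ($NF is the last field):

```shell
# NR = line number, NF = field count, $NF = last field
printf 'one two\nthree four five\n' | awk '{print NR, NF, $NF}'
# → 1 2 two
# → 2 3 five
```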

6. sort - Sort Output#

sort file.txt              # alphabetical sort
sort -n file.txt           # numeric sort
sort -r file.txt           # reverse order
sort -nr file.txt          # numeric reverse ✅ (exam relevant - sort -nr)
sort -k2 file.txt          # sort by 2nd field
sort -k2 -n file.txt       # sort by 2nd field numerically
sort -u file.txt           # sort and remove duplicates
sort -t',' -k2 file.csv    # sort CSV by 2nd column
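
The CSV sort in a self-contained example (-t sets the delimiter, -k2 picks the key, -n compares numerically, -r reverses):

```shell
printf 'a,3\nb,10\nc,2\n' | sort -t',' -k2 -nr
# → b,10
# → a,3
# → c,2
```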

7. uniq - Count/Remove Duplicates#

uniq file.txt              # remove consecutive duplicates
uniq -c file.txt           # count occurrences ✅ (exam relevant)
uniq -d file.txt           # show only duplicates
uniq -u file.txt           # show only unique lines

# IMPORTANT: uniq only collapses ADJACENT duplicates,
# so always sort the input first:
sort file.txt | uniq -c

8. head and tail - View File Portions#

head file.txt              # first 10 lines (default)
head -n 20 file.txt        # first 20 lines
head -1000 file.txt        # first 1000 lines

tail file.txt              # last 10 lines
tail -n 20 file.txt        # last 20 lines
tail -f file.txt           # follow file in real-time (for logs) ✅

9. wc - Word/Line/Character Count#

wc file.txt                # lines, words, characters
wc -l file.txt             # count lines only ✅
wc -w file.txt             # count words only
wc -c file.txt             # count bytes
wc -m file.txt             # count characters

# Count lines matching pattern
grep "ERROR" app.log | wc -l

10. Shell Pipeline - Combining Commands#

The | (pipe) operator:#

  • Takes output of left command as input to right command
  • Can chain multiple commands together

Complete Log Analysis Pipeline:#

# ✅ Exam-relevant pipeline for large log file analysis:
grep -E "(0[89]|1[0-8]):[0-5][0-9]" traffic.log \
| awk '{print $3, $7}' \
| sort \
| uniq -c \
| sort -nr

# Breakdown:
# grep -E "..."  → filter lines matching time pattern
# awk '{...}'   → extract fields 3 and 7
# sort          → sort alphabetically (needed before uniq)
# uniq -c       → count unique occurrences
# sort -nr      → sort by count, highest first
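
The same pipeline run on inline sample data (fields 3 and 7 hold an illustrative IP and URL) shows the pair seen twice rising to the top:

```shell
printf '08:15 - 1.2.3.4 a b c /home\n09:30 - 1.2.3.4 a b c /home\n10:00 - 5.6.7.8 a b c /cart\n' \
| grep -E "(0[89]|1[0-8]):[0-5][0-9]" \
| awk '{print $3, $7}' | sort | uniq -c | sort -nr
# counts: 2 for "1.2.3.4 /home", then 1 for "5.6.7.8 /cart"
```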

Other Useful Pipelines:#

# Count HTTP status codes in log
awk '{print $9}' access.log | sort | uniq -c | sort -nr

# Find most common IP addresses
awk '{print $1}' access.log | sort | uniq -c | sort -nr | head -10

# Find all POST requests
grep "POST" access.log | awk '{print $7}' | sort | uniq -c | sort -nr

# Count errors by hour
grep "ERROR" app.log | awk '{print substr($2,1,2)}' | sort | uniq -c

# Search specific pattern and count
grep -c "404" access.log

# Extract unique URLs
awk '{print $7}' access.log | sort -u

11. dir /ah - Windows Command (NOT Unix)#

Windows: dir /ah          ← lists hidden files in Windows
Unix:    ls -a            ← lists hidden files in Unix/Linux/Mac

❌ dir /ah does NOT work on Unix/Linux/Mac
❌ show -a does not exist in any standard shell

12. Other Essential Shell Commands#

Directory Navigation:#

pwd                        # print working directory
cd /path/to/dir            # change directory
cd ..                      # go up one level
cd ~                       # go to home directory
cd -                       # go to previous directory
mkdir new_folder           # create directory
mkdir -p a/b/c             # create nested directories
rmdir empty_folder         # remove empty directory
rm -rf folder/             # remove directory and contents (careful!)

File Operations:#

cp source.txt dest.txt     # copy file
cp -r source/ dest/        # copy directory recursively
mv old.txt new.txt         # move/rename file
rm file.txt                # remove file
rm -f file.txt             # force remove (no confirmation)
touch newfile.txt          # create empty file or update timestamp

Viewing & Searching:#

less file.txt              # view file page by page (better than cat for large files)
more file.txt              # older pager, similar to less
nano file.txt              # simple text editor
vi file.txt                # powerful text editor

# Search in less:
# /search_term → forward search
# ?search_term → backward search
# n → next match
# q → quit

Process Management:#

ps aux                     # list all running processes
kill -9 PID                # force kill process
top                        # real-time process monitor
htop                       # improved process monitor

File Permissions:#

chmod 755 script.py        # set permissions (rwxr-xr-x)
chmod +x script.py         # make executable
chmod 644 file.txt         # rw-r--r--
chown user:group file.txt  # change owner

# Permission notation:
# r=4, w=2, x=1
# 755 → rwxr-xr-x (owner: read+write+execute; group+others: read+execute)
# 644 → rw-r--r-- (owner: read+write, others: read only)
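
Symbolic modes express the same permissions without octal arithmetic; a quick sketch:

```shell
chmod u=rwx,g=rx,o=rx script.py   # same as chmod 755
chmod go-w file.txt               # remove write from group and others
chmod u+x,g+x script.py           # add execute for owner and group
```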

Network & Downloads:#

curl https://api.example.com/data        # make HTTP request
curl -O https://example.com/file.zip     # download file
wget https://example.com/file.zip        # download file
curl -H "Authorization: Bearer TOKEN" url # with header
curl -X POST -d '{"key":"value"}' url    # POST request

Environment Variables:#

export API_KEY="your_key"    # set environment variable
echo $API_KEY                # print variable
env                          # list all environment variables
unset API_KEY                # remove variable
printenv API_KEY             # print specific variable
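
The difference between a plain shell variable and an exported one shows up in child processes; a self-contained demo (variable names are illustrative):

```shell
LOCAL_VAR="only here"              # visible in this shell only
export SHARED_VAR="passed down"    # visible to child processes too
sh -c 'echo "${SHARED_VAR:-unset} / ${LOCAL_VAR:-unset}"'
# → passed down / unset
```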

13. Shell Scripting Basics#

Shebang and Basic Script:#

#!/bin/bash
# This is a comment

# Variables
NAME="John"
echo "Hello, $NAME"

# User input
read -p "Enter name: " name
echo "Hello, $name"

# Conditionals
if [ -f "file.txt" ]; then
    echo "File exists"
elif [ -d "folder" ]; then
    echo "Folder exists"
else
    echo "Neither exists"
fi

# Loops
for i in {1..5}; do
    echo "Number: $i"
done

for file in *.csv; do
    echo "Processing: $file"
done

# While loop
while IFS= read -r line; do
    echo "$line"
done < file.txt
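
Functions and positional arguments round out the basics; a minimal sketch (names are illustrative):

```shell
#!/bin/bash
greet() {
    local name="${1:-world}"    # $1 = first argument, with a default
    echo "Hello, $name"
}

greet           # → Hello, world
greet "Ada"     # → Hello, Ada
echo "Arguments passed to the script: $# ($*)"
```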

ETL Script with Idempotency:#

#!/bin/bash
# Idempotent ETL script

TEMP_FILE="/output/final.csv.tmp"   # same filesystem as FINAL_FILE, so mv stays atomic
FINAL_FILE="/output/final.csv"
LOG_FILE="/logs/etl.log"

echo "$(date): Starting ETL" >> "$LOG_FILE"

# Write to a temp file first; branch directly on the command's exit status
if python process_data.py > "$TEMP_FILE"; then
    # Atomic rename to the final destination
    mv "$TEMP_FILE" "$FINAL_FILE"   # ✅ atomic operation
    echo "$(date): ETL completed" >> "$LOG_FILE"
else
    echo "$(date): ETL failed" >> "$LOG_FILE"
    rm -f "$TEMP_FILE"
    exit 1
fi
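
A trap makes explicit cleanup on every failure path unnecessary: the temp file is removed even if the script is interrupted. A sketch using an illustrative path (rm -f on an already-moved file is harmless):

```shell
#!/bin/bash
TEMP_FILE="/tmp/output_temp.csv"   # illustrative path
trap 'rm -f "$TEMP_FILE"' EXIT     # runs on any exit, incl. Ctrl+C
touch "$TEMP_FILE"
# ... do work with the temp file; the trap cleans up on exit ...
```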

14. grep for Log Analysis - Exam Scenarios#

# ✅ Correct: grep searches CONTENT
grep "Pine Ridge" evacuation-logs.txt

# ❌ Wrong: find searches FILE NAMES
find . -name "Pine Ridge"

# ❌ Wrong: ls lists directory contents
ls "Pine Ridge"

# ❌ Wrong: cat shows everything, no search
cat evacuation-logs.txt

Scenario 2: Search across multiple files#

grep -r "ERROR" ./logs/         # recursive search
grep "404" *.log                # all log files
grep -l "critical" ./logs/*     # list files containing "critical"

Scenario 3: Extract specific log entries#

# POST requests to /checkout/
grep "POST /checkout" access.log

# Requests between 12:00-15:59
grep -E "1[2-5]:[0-5][0-9]" access.log

# Combine with awk for field extraction
grep "POST" access.log | awk '{print $1, $4, $7, $9}'

Shell Commands - Quick Reference Card#

Navigation & Files:
  pwd       → current directory
  ls -a     → list all files (including hidden) ✅
  ls -la    → long format + hidden
  cd path   → change directory
  find      → search FILE NAMES

Viewing Content:
  cat       → display file (small files)
  less      → view file page by page (large files)
  head -n N → first N lines
  tail -n N → last N lines
  tail -f   → follow file in real-time

Searching:
  grep "term" file    → search CONTENT ✅
  grep -i             → case-insensitive
  grep -E "pattern"   → extended regex
  grep -r "term" dir  → recursive search

Processing:
  awk '{print $N}'    → extract field N
  sort                → sort output
  sort -nr            → numeric reverse sort
  uniq -c             → count unique ✅
  wc -l               → count lines

Pipeline:
  cmd1 | cmd2         → pipe output to next command
  grep | awk | sort | uniq -c | sort -nr ✅

Log Analysis Pattern:
  grep -E "pattern" log | awk '{print $field}' | sort | uniq -c | sort -nr